Re: [PATCH] PM / Domains: Fix integer overflows on u32 bit multiplies
On Sun 2021-02-07 22:46:48, Colin King wrote:
> From: Colin Ian King
>
> There are three occurrances of u32 variables being multiplied by
> 1000 using 32 bit multiplies and the result being assigned to a
> 64 bit signed integer. These can potentially lead to a 32 bit
> overflows, so fix this by casting 1000 to a UL first to force
> a 64 bit multiply hence avoiding the overflow.

Ummm. No?

a) Can you imagine any situation where they result in overflow?

b) How does casting to UL help on 32 bit system?

Best regards,
Pavel

> Addresses-Coverity: ("Unintentional integer overflow")
> Fixes: 30f604283e05 ("PM / Domains: Allow domain power states to be read from
> DT")
> Signed-off-by: Colin Ian King
> ---
> drivers/base/power/domain.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index aaf6c83b5cf6..ddeff69126ff 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -2831,10 +2831,10 @@ static int genpd_parse_state(struct genpd_power_state
> *genpd_state,
>
> 	err = of_property_read_u32(state_node, "min-residency-us", );
> 	if (!err)
> -		genpd_state->residency_ns = 1000 * residency;
> +		genpd_state->residency_ns = 1000UL * residency;
>
> -	genpd_state->power_on_latency_ns = 1000 * exit_latency;
> -	genpd_state->power_off_latency_ns = 1000 * entry_latency;
> +	genpd_state->power_on_latency_ns = 1000UL * exit_latency;
> +	genpd_state->power_off_latency_ns = 1000UL * entry_latency;
> 	genpd_state->fwnode = _node->fwnode;
>
> 	return 0;

-- 
http://www.livejournal.com/~pavelmachek
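To put numbers on the two questions in this reply: a 32-bit multiply by 1000 only wraps once the microsecond value exceeds 4294967 (roughly 4.29 s), and on a 32-bit target unsigned long is itself 32 bits wide, so a UL suffix would not widen the multiply there. What follows is a minimal user-space sketch, assuming a 32-bit unsigned int. The helper names us_to_ns_32bit() and us_to_ns_64bit() are made up for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: mirrors "s64 = 1000 * u32". Both operands are
 * 32 bits wide, so the product wraps modulo 2^32 before it is widened.
 */
static int64_t us_to_ns_32bit(uint32_t us)
{
	return 1000 * us;
}

/*
 * Hypothetical helper: widening one operand to 64 bits first forces a
 * 64-bit multiply, regardless of how wide "long" is on the target.
 */
static int64_t us_to_ns_64bit(uint32_t us)
{
	return (int64_t)us * 1000;
}
```

With an input of 5000000 us the first helper yields 705032704 instead of 5000000000, while both helpers agree for inputs up to 4294967 us.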
[PATCH v2] mtd: spi-nor: winbond: Add support for w25q512jvq
Add support for w25q512jvq. This chip is from the same series as
w25q256jv, which is already supported, but with double the size and a
different JEDEC ID. Tested on the Intel Whitley platform with dd from/to
the flash for read/write respectively, and flash_erase for erasing the
flash.

Signed-off-by: Shuhao Mai
---
 drivers/mtd/spi-nor/winbond.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/mtd/spi-nor/winbond.c b/drivers/mtd/spi-nor/winbond.c
index e5dfa786f190..b1d307fcdf9c 100644
--- a/drivers/mtd/spi-nor/winbond.c
+++ b/drivers/mtd/spi-nor/winbond.c
@@ -97,6 +97,8 @@ static const struct flash_info winbond_parts[] = {
 			    SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
 	{ "w25m512jv", INFO(0xef7119, 0, 64 * 1024, 1024,
 			    SECT_4K | SPI_NOR_QUAD_READ | SPI_NOR_DUAL_READ) },
+	{ "w25q512jvq", INFO(0xef4020, 0, 64 * 1024, 1024,
+			     SECT_4K | SPI_NOR_QUAD_READ | SPI_NOR_DUAL_READ) },
 };

 /**
-- 
2.20.1
RE: [PATCH v2 6/9] scsi: ufshpb: Add hpb dev reset response
> > +	if (hpb->is_hcm) {
> > +		struct ufshpb_lu *h;
> > +		struct scsi_device *sdev;
> > +
> > +		shost_for_each_device(sdev, hba->host) {
>
> I haven't test it yet, but this line shall cause recursive spin lock -
> in current code base, ufshpb_rsp_upiu() is called with host_lock held.

Yayks Ouch.
Will fix.

Thanks,
Avri
Re: [PATCH v13 3/4] phy: Add Sparx5 ethernet serdes PHY driver
Hi Vinod, On Thu, 2021-02-04 at 13:31 +0530, Vinod Koul wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you > know the content is safe > > On 29-01-21, 14:07, Steen Hegelund wrote: > > Add the Microchip Sparx5 ethernet serdes PHY driver for the 6G, 10G > > and 25G > > interfaces available in the Sparx5 SoC. > > > > Signed-off-by: Bjarni Jonasson > > Signed-off-by: Steen Hegelund > > Reviewed-by: Andrew Lunn > > Reviewed-by: Alexandre Belloni > > --- > > ... > > sdx5_rmw(SD25G_LANE_LANE_1E_LN_CFG_RXLB_EN_SET(params- > > >cfg_rxlb_en), > > + SD25G_LANE_LANE_1E_LN_CFG_RXLB_EN, > > + priv, > > + SD25G_LANE_LANE_1E(sd_index)); > > + > > + sdx5_rmw(SD25G_LANE_LANE_19_LN_CFG_TXLB_EN_SET(params- > > >cfg_txlb_en), > > + SD25G_LANE_LANE_19_LN_CFG_TXLB_EN, > > + priv, > > + SD25G_LANE_LANE_19(sd_index)); > > + > > + sdx5_rmw(SD25G_LANE_LANE_2E_LN_CFG_RSTN_DFEDIG_SET(0), > > + SD25G_LANE_LANE_2E_LN_CFG_RSTN_DFEDIG, > > + priv, > > + SD25G_LANE_LANE_2E(sd_index)); > > + > > + sdx5_rmw(SD25G_LANE_LANE_2E_LN_CFG_RSTN_DFEDIG_SET(1), > > + SD25G_LANE_LANE_2E_LN_CFG_RSTN_DFEDIG, > > + priv, > > + SD25G_LANE_LANE_2E(sd_index)); > > + > > + sdx5_rmw(SD_LANE_25G_SD_LANE_CFG_MACRO_RST_SET(0), > > + SD_LANE_25G_SD_LANE_CFG_MACRO_RST, > > + priv, > > + SD_LANE_25G_SD_LANE_CFG(sd_index)); > > + > > + sdx5_rmw(SD25G_LANE_LANE_1C_LN_CFG_CDR_RSTN_SET(0), > > + SD25G_LANE_LANE_1C_LN_CFG_CDR_RSTN, > > + priv, > > + SD25G_LANE_LANE_1C(sd_index)); > > This looks quite terrible :( > > Can we do a table here for these and then write the configuration > table, > that may look better and easy to maintain ? I will restructure this. 
> > > + > > + usleep_range(1000, 2000); > > + > > + sdx5_rmw(SD25G_LANE_LANE_1C_LN_CFG_CDR_RSTN_SET(1), > > + SD25G_LANE_LANE_1C_LN_CFG_CDR_RSTN, > > + priv, > > + SD25G_LANE_LANE_1C(sd_index)); > > + > > + usleep_range(1, 2); > > + > > + sdx5_rmw(SD25G_LANE_CMU_FF_REGISTER_TABLE_INDEX_SET(0xff), > > + SD25G_LANE_CMU_FF_REGISTER_TABLE_INDEX, > > + priv, > > + SD25G_LANE_CMU_FF(sd_index)); > > + > > + value = sdx5_rd(priv, SD25G_LANE_CMU_C0(sd_index)); > > + value = SD25G_LANE_CMU_C0_PLL_LOL_UDL_GET(value); > > + > > + if (value) { > > + dev_err(macro->priv->dev, "25G PLL Loss of Lock: > > 0x%x\n", value); > > + ret = -EINVAL; > > + } > > + > > + value = sdx5_rd(priv, SD_LANE_25G_SD_LANE_STAT(sd_index)); > > + value = SD_LANE_25G_SD_LANE_STAT_PMA_RST_DONE_GET(value); > > + > > + if (value != 0x1) { > > + dev_err(macro->priv->dev, "25G PMA Reset failed: > > 0x%x\n", value); > > + ret = -EINVAL; > > continue on error..? I will change that. > > > + } > > + > > + sdx5_rmw(SD25G_LANE_CMU_2A_R_DBG_LOL_STATUS_SET(0x1), > > + SD25G_LANE_CMU_2A_R_DBG_LOL_STATUS, > > + priv, > > + SD25G_LANE_CMU_2A(sd_index)); > > + > > + sdx5_rmw(SD_LANE_25G_SD_SER_RST_SER_RST_SET(0x0), > > + SD_LANE_25G_SD_SER_RST_SER_RST, > > + priv, > > ... 
> > sdx5_inst_rmw(SD10G_LANE_LANE_0E_CFG_RXLB_EN_SET(params- > > >cfg_rxlb_en) | > > + SD10G_LANE_LANE_0E_CFG_TXLB_EN_SET(params- > > >cfg_txlb_en), > > + SD10G_LANE_LANE_0E_CFG_RXLB_EN | > > + SD10G_LANE_LANE_0E_CFG_TXLB_EN, > > + sd_inst, > > + SD10G_LANE_LANE_0E(sd_index)); > > + > > + sdx5_rmw(SD_LANE_SD_LANE_CFG_MACRO_RST_SET(0), > > + SD_LANE_SD_LANE_CFG_MACRO_RST, > > + priv, > > + SD_LANE_SD_LANE_CFG(sd_lane_tgt)); > > + > > + sdx5_inst_rmw(SD10G_LANE_LANE_50_CFG_SSC_RESETB_SET(1), > > + SD10G_LANE_LANE_50_CFG_SSC_RESETB, > > + sd_inst, > > + SD10G_LANE_LANE_50(sd_index)); > > + > > + sdx5_rmw(SD10G_LANE_LANE_50_CFG_SSC_RESETB_SET(1), > > + SD10G_LANE_LANE_50_CFG_SSC_RESETB, > > + priv, > > + SD10G_LANE_LANE_50(sd_index)); > > + > > + sdx5_rmw(SD_LANE_MISC_SD_125_RST_DIS_SET(params->fx_100), > > + SD_LANE_MISC_SD_125_RST_DIS, > > + priv, > > + SD_LANE_MISC(sd_lane_tgt)); > > + > > + sdx5_rmw(SD_LANE_MISC_RX_ENA_SET(params->fx_100), > > + SD_LANE_MISC_RX_ENA, > > + priv, > > + SD_LANE_MISC(sd_lane_tgt)); > > + > > + sdx5_rmw(SD_LANE_MISC_MUX_ENA_SET(params->fx_100), > > + SD_LANE_MISC_MUX_ENA, > > + priv, > > + SD_LANE_MISC(sd_lane_tgt)); > > Table for this set as well as other places please
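The table-driven layout requested in this review could look roughly like the sketch below: each read-modify-write step becomes one table entry and a loop applies them in order. This is only an illustration under stated assumptions. struct reg_cfg, reg_rmw(), reg_apply_table() and demo_cfg are hypothetical names, the registers are modelled as a plain array rather than the driver's sdx5_rmw() accessors, and the delays that the real sequence needs between some steps are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One read-modify-write step: the bits selected by mask are set to value. */
struct reg_cfg {
	uint32_t value;
	uint32_t mask;
	size_t reg;	/* index into the hypothetical register array */
};

/* Apply a single entry: clear the masked bits, then OR in the new value. */
static void reg_rmw(uint32_t *regs, const struct reg_cfg *c)
{
	regs[c->reg] = (regs[c->reg] & ~c->mask) | (c->value & c->mask);
}

/* Apply all entries of a configuration table in order. */
static void reg_apply_table(uint32_t *regs, const struct reg_cfg *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++)
		reg_rmw(regs, &tbl[i]);
}

/* Example table with made-up values; a driver would list its fields here. */
static const struct reg_cfg demo_cfg[] = {
	{ .value = 0x01, .mask = 0x01, .reg = 0 },
	{ .value = 0x00, .mask = 0x02, .reg = 0 },
	{ .value = 0x30, .mask = 0xf0, .reg = 1 },
};
```

Keeping the sequence in data makes it easier to review the order of writes and to add or drop a step without touching the control flow.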
Re: [PATCH] optee: simplify i2c access
On 08/02/21, Jens Wiklander wrote:
> Hi Jorge,
>
> On Wed, Jan 27, 2021 at 11:41 AM Jens Wiklander
> wrote:
> >
> > Hi Arnd,
> >
> > On Mon, Jan 25, 2021 at 12:38 PM Arnd Bergmann wrote:
> > >
> > > From: Arnd Bergmann
> > >
> > > Storing a bogus i2c_client structure on the stack adds overhead and
> > > causes a compile-time warning:
> > >
> > > drivers/tee/optee/rpc.c:493:6: error: stack frame size of 1056 bytes in
> > > function 'optee_handle_rpc' [-Werror,-Wframe-larger-than=]
> > > void optee_handle_rpc(struct tee_context *ctx, struct optee_rpc_param
> > > *param,
> > >
> > > Change the implementation of handle_rpc_func_cmd_i2c_transfer() to
> > > open-code the i2c_transfer() call, which makes it easier to read
> > > and avoids the warning.
> > >
> > > Fixes: c05210ab9757 ("drivers: optee: allow op-tee to access devices on
> > > the i2c bus")
> > > Signed-off-by: Arnd Bergmann
> > > ---
> > > drivers/tee/optee/rpc.c | 31 ---
> > > 1 file changed, 16 insertions(+), 15 deletions(-)
> >
> > Looks good to me.
> > Reviewed-by: Jens Wiklander
>
> Would you mind testing this?

sure, doing it this morning.

btw what Arnd has done - removing the unnecessary level of indirection -
was pretty much my initial thought, but I thought it was easier to read
the way I wrote it (I guess I was wrong and I obviously missed the stack
size increase). But yes, will test.

>
> Thanks,
> Jens
[PATCH V2 4/4] Squashfs: add more sanity checks in xattr id lookup
Sysbot has reported a warning where a kmalloc() attempt exceeds the maximum limit. This has been identified as corruption of the xattr_ids count when reading the xattr id lookup table. This patch adds a number of additional sanity checks to detect this corruption and others. 1. It checks for a corrupted xattr index read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This would cause an out of bounds read. 2. It checks against corruption of the xattr_ids count. This can either lead to the above kmalloc failure, or a smaller than expected table to be read. 3. It checks the contents of the index table for corruption. Reported-by: syzbot+2ccea6339d3683608...@syzkaller.appspotmail.com Signed-off-by: Phillip Lougher Cc: sta...@vger.kernel.org --- fs/squashfs/xattr_id.c | 66 -- 1 file changed, 57 insertions(+), 9 deletions(-) diff --git a/fs/squashfs/xattr_id.c b/fs/squashfs/xattr_id.c index d99e08464554..52905ce2b6f7 100644 --- a/fs/squashfs/xattr_id.c +++ b/fs/squashfs/xattr_id.c @@ -31,10 +31,15 @@ int squashfs_xattr_lookup(struct super_block *sb, unsigned int index, struct squashfs_sb_info *msblk = sb->s_fs_info; int block = SQUASHFS_XATTR_BLOCK(index); int offset = SQUASHFS_XATTR_BLOCK_OFFSET(index); - u64 start_block = le64_to_cpu(msblk->xattr_id_table[block]); + u64 start_block; struct squashfs_xattr_id id; int err; + if (index >= msblk->xattr_ids) + return -EINVAL; + + start_block = le64_to_cpu(msblk->xattr_id_table[block]); + err = squashfs_read_metadata(sb, , _block, , sizeof(id)); if (err < 0) @@ -50,13 +55,17 @@ int squashfs_xattr_lookup(struct super_block *sb, unsigned int index, /* * Read uncompressed xattr id lookup table indexes from disk into memory */ -__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start, +__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start, u64 
*xattr_table_start, int *xattr_ids) { - unsigned int len; + struct squashfs_sb_info *msblk = sb->s_fs_info; + unsigned int len, indexes; struct squashfs_xattr_id_table *id_table; + __le64 *table; + u64 start, end; + int n; - id_table = squashfs_read_table(sb, start, sizeof(*id_table)); + id_table = squashfs_read_table(sb, table_start, sizeof(*id_table)); if (IS_ERR(id_table)) return (__le64 *) id_table; @@ -70,13 +79,52 @@ __le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start, if (*xattr_ids == 0) return ERR_PTR(-EINVAL); - /* xattr_table should be less than start */ - if (*xattr_table_start >= start) + len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids); + indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids); + + /* +* The computed size of the index table (len bytes) should exactly +* match the table start and end points +*/ + start = table_start + sizeof(*id_table); + end = msblk->bytes_used; + + if (len != (end - start)) return ERR_PTR(-EINVAL); - len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids); + table = squashfs_read_table(sb, start, len); + if (IS_ERR(table)) + return table; + + /* table[0], table[1], ... table[indexes - 1] store the locations +* of the compressed xattr id blocks. Each entry should be less than +* the next (i.e. table[0] < table[1]), and the difference between them +* should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1] +* should be less than table_start, and again the difference +* shouls be SQUASHFS_METADATA_SIZE or less. +* +* Finally xattr_table_start should be less than table[0]. 
+*/ + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } - TRACE("In read_xattr_index_table, length %d\n", len); + if (*xattr_table_start >= le64_to_cpu(table[0])) { + kfree(table); + return ERR_PTR(-EINVAL); + } - return squashfs_read_table(sb, start + sizeof(*id_table), len); + return table; } -- 2.20.1
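The index-table checks described in the comment of this patch can be restated as a small stand-alone predicate. The sketch below is a user-space model, not the kernel code: index_table_ok() is a hypothetical name, the entries are assumed to be already converted to CPU byte order, and METADATA_SIZE is set to 8192 on the assumption that it mirrors SQUASHFS_METADATA_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define METADATA_SIZE 8192u	/* assumed to mirror SQUASHFS_METADATA_SIZE */

/*
 * Hypothetical check: every entry must be strictly below the next one and
 * at most METADATA_SIZE away from it; the last entry must be strictly
 * below table_start and at most METADATA_SIZE away from it.
 */
static bool index_table_ok(const uint64_t *table, size_t indexes,
			   uint64_t table_start)
{
	if (indexes == 0)
		return false;

	for (size_t n = 0; n + 1 < indexes; n++) {
		uint64_t start = table[n];
		uint64_t end = table[n + 1];

		if (start >= end || end - start > METADATA_SIZE)
			return false;
	}

	uint64_t last = table[indexes - 1];

	return last < table_start && table_start - last <= METADATA_SIZE;
}
```

Checking the ordering before computing a difference keeps the unsigned subtraction from wrapping, which is the same order of tests the patch uses.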
Re: [PATCH] fcntl: make F_GETOWN(EX) return 0 on dead owner task
On Wed, Feb 03, 2021 at 03:41:56PM +0300, Pavel Tikhomirov wrote:
> Currently there is no way to differentiate the file with alive owner
> from the file with dead owner but pid of the owner reused. That's why
> CRIU can't actually know if it needs to restore file owner or not,
> because if it restores owner but actual owner was dead, this can
> introduce unexpected signals to the "false"-owner (which reused the
> pid).
>
> Let's change the api, so that F_GETOWN(EX) returns 0 in case actual
> owner is dead already.
>
> Cc: Jeff Layton
> Cc: "J. Bruce Fields"
> Cc: Alexander Viro
> Cc: linux-fsde...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Cyrill Gorcunov
> Cc: Andrei Vagin
> Signed-off-by: Pavel Tikhomirov

I can't imagine a scenario where we could break some backward
compatibility with this change, so

Reviewed-by: Cyrill Gorcunov
Re: [PATCH] kbuild: simplify access to the kernel's version
On Sun, Feb 07, 2021 at 11:13:52AM -0500, Sasha Levin wrote:
> Instead of storing the version in a single integer and having various
> kernel (and userspace) code how it's constructed, export individual
> (major, patchlevel, sublevel) components and simplify kernel code that
> uses it.
>
> This should also make it easier on userspace.
>
> Signed-off-by: Sasha Levin
> ---
> Makefile | 5 -
> drivers/net/ethernet/mellanox/mlx5/core/main.c | 4 ++--
> drivers/usb/core/hcd.c | 4 ++--
> drivers/usb/gadget/udc/aspeed-vhub/hub.c | 4 ++--
> include/linux/usb/composite.h | 4 ++--
> kernel/sys.c | 2 +-
> 6 files changed, 13 insertions(+), 10 deletions(-)

Reviewed-by: Greg Kroah-Hartman
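For context, the "single integer" form packs the three components with shifts, in the style of the KERNEL_VERSION() macro, so every user has to know the layout to take it apart again. The sketch below shows that packing and the matching decode. VERSION_CODE() and the version_*() helpers are local, hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Shift-based packing: 8 bits each for patchlevel and sublevel. */
#define VERSION_CODE(major, patchlevel, sublevel) \
	(((major) << 16) + ((patchlevel) << 8) + (sublevel))

/* Hypothetical decoders: each one has to repeat the layout knowledge. */
static unsigned int version_major(unsigned int code)
{
	return code >> 16;
}

static unsigned int version_patchlevel(unsigned int code)
{
	return (code >> 8) & 0xff;
}

static unsigned int version_sublevel(unsigned int code)
{
	return code & 0xff;
}
```

One consequence of the fixed 8-bit fields is that a sublevel of 256 is indistinguishable from the next patchlevel with sublevel 0, which separately exported components do not suffer from.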
[PATCH V2 3/4] Squashfs: add more sanity checks in inode lookup
Sysbot has reported an "slab-out-of-bounds read" error which has been identified as being caused by a corrupted "ino_num" value read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This patch adds additional sanity checks to detect this, and the following corruption. 1. It checks against corruption of the inodes count. This can either lead to a larger table to be read, or a smaller than expected table to be read. In the case of a too large inodes count, this would often have been trapped by the existing sanity checks, but this patch introduces a more exact check, which can identify too small values. 2. It checks the contents of the index table for corruption. Reported-by: syzbot+04419e3ff19d2970e...@syzkaller.appspotmail.com Signed-off-by: Phillip Lougher Cc: sta...@vger.kernel.org --- fs/squashfs/export.c | 41 + 1 file changed, 33 insertions(+), 8 deletions(-) diff --git a/fs/squashfs/export.c b/fs/squashfs/export.c index ae2c87bb0fbe..3f134ba86a45 100644 --- a/fs/squashfs/export.c +++ b/fs/squashfs/export.c @@ -41,12 +41,17 @@ static long long squashfs_inode_lookup(struct super_block *sb, int ino_num) struct squashfs_sb_info *msblk = sb->s_fs_info; int blk = SQUASHFS_LOOKUP_BLOCK(ino_num - 1); int offset = SQUASHFS_LOOKUP_BLOCK_OFFSET(ino_num - 1); - u64 start = le64_to_cpu(msblk->inode_lookup_table[blk]); + u64 start; __le64 ino; int err; TRACE("Entered squashfs_inode_lookup, inode_number = %d\n", ino_num); + if (ino_num == 0 || (ino_num - 1) >= msblk->inodes) + return -EINVAL; + + start = le64_to_cpu(msblk->inode_lookup_table[blk]); + err = squashfs_read_metadata(sb, , , , sizeof(ino)); if (err < 0) return err; @@ -111,7 +116,10 @@ __le64 *squashfs_read_inode_lookup_table(struct super_block *sb, u64 lookup_table_start, u64 next_table, unsigned int inodes) { unsigned int length = SQUASHFS_LOOKUP_BLOCK_BYTES(inodes); + unsigned 
int indexes = SQUASHFS_LOOKUP_BLOCKS(inodes); + int n; __le64 *table; + u64 start, end; TRACE("In read_inode_lookup_table, length %d\n", length); @@ -121,20 +129,37 @@ __le64 *squashfs_read_inode_lookup_table(struct super_block *sb, if (inodes == 0) return ERR_PTR(-EINVAL); - /* length bytes should not extend into the next table - this check -* also traps instances where lookup_table_start is incorrectly larger -* than the next table start + /* +* The computed size of the lookup table (length bytes) should exactly +* match the table start and end points */ - if (lookup_table_start + length > next_table) + if (length != (next_table - lookup_table_start)) return ERR_PTR(-EINVAL); table = squashfs_read_table(sb, lookup_table_start, length); + if (IS_ERR(table)) + return table; /* -* table[0] points to the first inode lookup table metadata block, -* this should be less than lookup_table_start +* table0], table[1], ... table[indexes - 1] store the locations +* of the compressed inode lookup blocks. Each entry should be +* less than the next (i.e. table[0] < table[1]), and the difference +* between them should be SQUASHFS_METADATA_SIZE or less. +* table[indexes - 1] should be less than lookup_table_start, and +* again the difference should be SQUASHFS_METADATA_SIZE or less */ - if (!IS_ERR(table) && le64_to_cpu(table[0]) >= lookup_table_start) { + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) { kfree(table); return ERR_PTR(-EINVAL); } -- 2.20.1
[PATCH v2 1/2] mei: bus: simplify mei_cl_device_remove()
The driver core only calls a bus' remove function when there is actually
a driver and a device. So drop the needless check and assign cldrv
earlier.

(Side note: The check for cldev being non-NULL is broken anyhow, because
to_mei_cl_device() is a wrapper around container_of() for a member that
is not the first one. So cldev only can become NULL if dev is
(void *)0xc (for archs with 32 bit pointers) or (void *)0x18 (for archs
with 64 bit pointers).)

Signed-off-by: Uwe Kleine-König
---
 drivers/misc/mei/bus.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 2907db260fba..50d617e7467e 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -878,13 +878,9 @@ static int mei_cl_device_probe(struct device *dev)
 static int mei_cl_device_remove(struct device *dev)
 {
 	struct mei_cl_device *cldev = to_mei_cl_device(dev);
-	struct mei_cl_driver *cldrv;
+	struct mei_cl_driver *cldrv = to_mei_cl_driver(dev->driver);
 	int ret = 0;

-	if (!cldev || !dev->driver)
-		return 0;
-
-	cldrv = to_mei_cl_driver(dev->driver);
 	if (cldrv->remove)
 		ret = cldrv->remove(cldev);

-- 
2.29.2
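The side note about container_of() can be checked with plain address arithmetic. The sketch below uses a made-up pair of structures (struct outer and struct inner are hypothetical, not the mei types) and works on integer addresses so that no invalid pointer is ever formed. It only models the subtraction that container_of() performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct inner {
	int dummy;
};

/* Hypothetical container: "member" is deliberately not the first field. */
struct outer {
	void *first;		/* occupies offset 0 */
	struct inner member;	/* lives at a non-zero offset */
};

/*
 * The arithmetic behind container_of(): subtract the member's offset from
 * the member's address to get the address of the enclosing structure.
 */
static uintptr_t outer_addr_from_member(uintptr_t member_addr)
{
	return member_addr - offsetof(struct outer, member);
}
```

A NULL member address therefore maps to a non-NULL container address, and the result is only NULL when the member address equals the member's offset, which matches the 0xc and 0x18 values quoted for the mei structures.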
[PATCH v2 2/2] mei: bus: change remove callback to return void
The driver core ignores the return value of mei_cl_device_remove() so passing an error value doesn't solve any problem. As most mei drivers' remove callbacks return 0 unconditionally and returning a different value doesn't have any effect, change this prototype to return void and return 0 unconditionally in mei_cl_device_remove(). The only driver that could return an error value is modified to emit an explicit warning in the error case. Acked-by: Guenter Roeck Signed-off-by: Uwe Kleine-König --- drivers/misc/mei/bus.c | 5 ++--- drivers/misc/mei/hdcp/mei_hdcp.c | 7 +-- drivers/nfc/microread/mei.c | 4 +--- drivers/nfc/pn544/mei.c | 4 +--- drivers/watchdog/mei_wdt.c | 4 +--- include/linux/mei_cl_bus.h | 2 +- 6 files changed, 11 insertions(+), 15 deletions(-) diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c index 50d617e7467e..54dddae46705 100644 --- a/drivers/misc/mei/bus.c +++ b/drivers/misc/mei/bus.c @@ -879,17 +879,16 @@ static int mei_cl_device_remove(struct device *dev) { struct mei_cl_device *cldev = to_mei_cl_device(dev); struct mei_cl_driver *cldrv = to_mei_cl_driver(dev->driver); - int ret = 0; if (cldrv->remove) - ret = cldrv->remove(cldev); + cldrv->remove(cldev); mei_cldev_unregister_callbacks(cldev); mei_cl_bus_module_put(cldev); module_put(THIS_MODULE); - return ret; + return 0; } static ssize_t name_show(struct device *dev, struct device_attribute *a, diff --git a/drivers/misc/mei/hdcp/mei_hdcp.c b/drivers/misc/mei/hdcp/mei_hdcp.c index 9ae9669e46ea..8447ad4b7d47 100644 --- a/drivers/misc/mei/hdcp/mei_hdcp.c +++ b/drivers/misc/mei/hdcp/mei_hdcp.c @@ -845,16 +845,19 @@ static int mei_hdcp_probe(struct mei_cl_device *cldev, return ret; } -static int mei_hdcp_remove(struct mei_cl_device *cldev) +static void mei_hdcp_remove(struct mei_cl_device *cldev) { struct i915_hdcp_comp_master *comp_master = mei_cldev_get_drvdata(cldev); + int ret; component_master_del(>dev, _component_master_ops); kfree(comp_master); mei_cldev_set_drvdata(cldev, NULL); - 
return mei_cldev_disable(cldev); + ret = mei_cldev_disable(cldev); + if (ret) + dev_warn(>dev, "mei_cldev_disable() failed\n"); } #define MEI_UUID_HDCP GUID_INIT(0xB638AB7E, 0x94E2, 0x4EA2, 0xA5, \ diff --git a/drivers/nfc/microread/mei.c b/drivers/nfc/microread/mei.c index 5dad8847a9b3..8fa7771085eb 100644 --- a/drivers/nfc/microread/mei.c +++ b/drivers/nfc/microread/mei.c @@ -44,15 +44,13 @@ static int microread_mei_probe(struct mei_cl_device *cldev, return 0; } -static int microread_mei_remove(struct mei_cl_device *cldev) +static void microread_mei_remove(struct mei_cl_device *cldev) { struct nfc_mei_phy *phy = mei_cldev_get_drvdata(cldev); microread_remove(phy->hdev); nfc_mei_phy_free(phy); - - return 0; } static struct mei_cl_device_id microread_mei_tbl[] = { diff --git a/drivers/nfc/pn544/mei.c b/drivers/nfc/pn544/mei.c index 579bc599f545..5c10aac085a4 100644 --- a/drivers/nfc/pn544/mei.c +++ b/drivers/nfc/pn544/mei.c @@ -42,7 +42,7 @@ static int pn544_mei_probe(struct mei_cl_device *cldev, return 0; } -static int pn544_mei_remove(struct mei_cl_device *cldev) +static void pn544_mei_remove(struct mei_cl_device *cldev) { struct nfc_mei_phy *phy = mei_cldev_get_drvdata(cldev); @@ -51,8 +51,6 @@ static int pn544_mei_remove(struct mei_cl_device *cldev) pn544_hci_remove(phy->hdev); nfc_mei_phy_free(phy); - - return 0; } static struct mei_cl_device_id pn544_mei_tbl[] = { diff --git a/drivers/watchdog/mei_wdt.c b/drivers/watchdog/mei_wdt.c index 5391bf3e6b11..53165e49c298 100644 --- a/drivers/watchdog/mei_wdt.c +++ b/drivers/watchdog/mei_wdt.c @@ -619,7 +619,7 @@ static int mei_wdt_probe(struct mei_cl_device *cldev, return ret; } -static int mei_wdt_remove(struct mei_cl_device *cldev) +static void mei_wdt_remove(struct mei_cl_device *cldev) { struct mei_wdt *wdt = mei_cldev_get_drvdata(cldev); @@ -636,8 +636,6 @@ static int mei_wdt_remove(struct mei_cl_device *cldev) dbgfs_unregister(wdt); kfree(wdt); - - return 0; } #define MEI_UUID_WD UUID_LE(0x05B79A6F, 0x4628, 
0x4D7F, \ diff --git a/include/linux/mei_cl_bus.h b/include/linux/mei_cl_bus.h index 959ad7d850b4..07f5ef8fc456 100644 --- a/include/linux/mei_cl_bus.h +++ b/include/linux/mei_cl_bus.h @@ -68,7 +68,7 @@ struct mei_cl_driver { int (*probe)(struct mei_cl_device *cldev, const struct mei_cl_device_id *id); - int (*remove)(struct mei_cl_device *cldev); + void (*remove)(struct mei_cl_device *cldev); }; int __mei_cldev_driver_register(struct mei_cl_driver *cldrv, -- 2.29.2
[PATCH 1/2] sched/features: Fix hrtick reprogramming
Hung tasks and RCU stall cases were reported on systems which were not
100% busy. Investigation of such unexpected cases (no sign of potential
starvation caused by tasks hogging the system) pointed out that the
periodic sched tick timer wasn't serviced anymore after a certain point
and that caused all machinery that depends on it (timers, RCU, etc.) to
stop working as well. This issue was however only reproducible if
HRTICK was enabled.

Looking at core dumps it was found that the rbtree of the hrtimer base
used also for the hrtick was corrupted (i.e. next as seen from the base
root and actual leftmost obtained by traversing the tree are different).
Same base is also used for periodic tick hrtimer, which might get "lost"
if the rbtree gets corrupted.

Much like what is described in commit 1f71addd34f4c ("tick/sched: Do not
mess with an enqueued hrtimer") there is a race window between
hrtimer_set_expires() in hrtick_start and hrtimer_start_expires() in
__hrtick_restart() in which the former might be operating on an already
queued hrtick hrtimer, which might lead to corruption of the base.

Use hrtimer_start() (which removes the timer before enqueuing it back) to
ensure hrtick hrtimer reprogramming is entirely guarded by the base
lock, so that no race conditions can occur.

Co-developed-by: Daniel Bristot de Oliveira
Signed-off-by: Daniel Bristot de Oliveira
Co-developed-by: Luis Claudio R. Goncalves
Signed-off-by: Luis Claudio R. Goncalves
Signed-off-by: Juri Lelli
---
 kernel/sched/core.c  | 8 +++-
 kernel/sched/sched.h | 1 +
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index be3a956c2d23..d2d79a2c30f5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -355,8 +355,9 @@ static enum hrtimer_restart hrtick(struct hrtimer *timer)
 static void __hrtick_restart(struct rq *rq)
 {
 	struct hrtimer *timer = >hrtick_timer;
+	ktime_t time = rq->hrtick_time;

-	hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED_HARD);
+	hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
 }

 /*
@@ -380,7 +381,6 @@ static void __hrtick_start(void *arg)
 void hrtick_start(struct rq *rq, u64 delay)
 {
 	struct hrtimer *timer = >hrtick_timer;
-	ktime_t time;
 	s64 delta;

 	/*
@@ -388,9 +388,7 @@ void hrtick_start(struct rq *rq, u64 delay)
 	 * doesn't make sense and can cause timer DoS.
 	 */
 	delta = max_t(s64, delay, 1LL);
-	time = ktime_add_ns(timer->base->get_time(), delta);
-
-	hrtimer_set_expires(timer, time);
+	rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);

 	if (rq == this_rq())
 		__hrtick_restart(rq);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 6edc67df3554..3e16dff206b3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1030,6 +1030,7 @@ struct rq {
 	call_single_data_t hrtick_csd;
 #endif
 	struct hrtimer hrtick_timer;
+	ktime_t hrtick_time;
 #endif

 #ifdef CONFIG_SCHEDSTATS
-- 
2.29.2
[PATCH v2 0/2] mei: bus: Some cleanups
Hello,

changes since v1:
 - Added a missing ; found by kernel test robot, thanks
 - Added an Ack for Guenter

rangediff can be found below.

Uwe Kleine-König (2):
  mei: bus: simplify mei_cl_device_remove()
  mei: bus: change remove callback to return void

 drivers/misc/mei/bus.c           | 11 +++
 drivers/misc/mei/hdcp/mei_hdcp.c | 7 +--
 drivers/nfc/microread/mei.c      | 4 +---
 drivers/nfc/pn544/mei.c          | 4 +---
 drivers/watchdog/mei_wdt.c       | 4 +---
 include/linux/mei_cl_bus.h       | 2 +-
 6 files changed, 12 insertions(+), 20 deletions(-)

Range-diff against v1:
-: > 1: 86b2bf521a84 mei: bus: simplify mei_cl_device_remove()
1: 10a3dfb49d4f ! 2: 807117116ccb mei: bus: change remove callback to return void
    @@ Commit message
         return an error value is modified to emit an explicit warning in the error
         case.

    +    Acked-by: Guenter Roeck
         Signed-off-by: Uwe Kleine-König

      ## drivers/misc/mei/bus.c ##
    @@ drivers/misc/mei/hdcp/mei_hdcp.c: static int mei_hdcp_probe(struct mei_cl_device
     -	return mei_cldev_disable(cldev);
     +	ret = mei_cldev_disable(cldev);
     +	if (ret)
    -+		dev_warn(>dev, "mei_cldev_disable() failed\n")
    ++		dev_warn(>dev, "mei_cldev_disable() failed\n");
      }

      #define MEI_UUID_HDCP GUID_INIT(0xB638AB7E, 0x94E2, 0x4EA2, 0xA5, \

base-commit: 5c8fe583cce542aa0b84adc939ce85293de36e5e
-- 
2.29.2
[PATCH 2/2] sched/features: Distinguish between NORMAL and DEADLINE hrtick
The HRTICK feature has traditionally been servicing configurations that need precise preemptions point for NORMAL tasks. More recently, the feature has been extended to also service DEADLINE tasks with stringent runtime enforcement needs (e.g., runtime < 1ms with HZ=1000). Enabling HRTICK sched feature currently enables the additional timer and task tick for both classes, which might introduced undesired overhead for no additional benefit if one needed it only for one of the cases. Separate HRTICK sched feature in two (and leave the traditional case name unmodified) so that it can be selectively enabled when needed. With $ echo HRTICK > /sys/kernel/debug/sched_features the NORMAL/fair hrtick gets enabled. With $ echo HRTICK_DL > /sys/kernel/debug/sched_features the DEADLINE hrtick gets enabled. Co-developed-by: Daniel Bristot de Oliveira Signed-off-by: Daniel Bristot de Oliveira Co-developed-by: Luis Claudio R. Goncalves Signed-off-by: Luis Claudio R. Goncalves Signed-off-by: Juri Lelli --- kernel/sched/core.c | 2 +- kernel/sched/deadline.c | 4 ++-- kernel/sched/fair.c | 4 ++-- kernel/sched/features.h | 1 + kernel/sched/sched.h| 26 -- 5 files changed, 30 insertions(+), 7 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index d2d79a2c30f5..15e2d7c1ac1a 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4955,7 +4955,7 @@ static void __sched notrace __schedule(bool preempt) schedule_debug(prev, preempt); - if (sched_feat(HRTICK)) + if (sched_feat(HRTICK) || sched_feat(HRTICK_DL)) hrtick_clear(rq); local_irq_disable(); diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 1508d126e88b..7e28777b652c 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1832,7 +1832,7 @@ static void set_next_task_dl(struct rq *rq, struct task_struct *p, bool first) if (!first) return; - if (hrtick_enabled(rq)) + if (hrtick_enabled_dl(rq)) start_hrtick_dl(rq, p); if (rq->curr->sched_class != _sched_class) @@ -1895,7 +1895,7 @@ 
static void task_tick_dl(struct rq *rq, struct task_struct *p, int queued) * not being the leftmost task anymore. In that case NEED_RESCHED will * be set and schedule() will start a new hrtick for the next task. */ - if (hrtick_enabled(rq) && queued && p->dl.runtime > 0 && + if (hrtick_enabled_dl(rq) && queued && p->dl.runtime > 0 && is_leftmost(p, >dl)) start_hrtick_dl(rq, p); } diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 59b645e3c4fd..8a8bd7b13634 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5429,7 +5429,7 @@ static void hrtick_update(struct rq *rq) { struct task_struct *curr = rq->curr; - if (!hrtick_enabled(rq) || curr->sched_class != _sched_class) + if (!hrtick_enabled_fair(rq) || curr->sched_class != _sched_class) return; if (cfs_rq_of(>se)->nr_running < sched_nr_latency) @@ -7116,7 +7116,7 @@ done: __maybe_unused; list_move(>se.group_node, >cfs_tasks); #endif - if (hrtick_enabled(rq)) + if (hrtick_enabled_fair(rq)) hrtick_start_fair(rq, p); update_misfit_status(p, rq); diff --git a/kernel/sched/features.h b/kernel/sched/features.h index e875eabb6600..1bc2b158fc51 100644 --- a/kernel/sched/features.h +++ b/kernel/sched/features.h @@ -38,6 +38,7 @@ SCHED_FEAT(CACHE_HOT_BUDDY, true) SCHED_FEAT(WAKEUP_PREEMPTION, true) SCHED_FEAT(HRTICK, false) +SCHED_FEAT(HRTICK_DL, false) SCHED_FEAT(DOUBLE_TICK, false) /* diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 3e16dff206b3..ed0f347ab2f9 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2104,17 +2104,39 @@ extern const_debug unsigned int sysctl_sched_migration_cost; */ static inline int hrtick_enabled(struct rq *rq) { - if (!sched_feat(HRTICK)) - return 0; if (!cpu_active(cpu_of(rq))) return 0; return hrtimer_is_hres_active(>hrtick_timer); } +static inline int hrtick_enabled_fair(struct rq *rq) +{ + if (!sched_feat(HRTICK)) + return 0; + return hrtick_enabled(rq); +} + +static inline int hrtick_enabled_dl(struct rq *rq) +{ + if (!sched_feat(HRTICK_DL)) + 
return 0; + return hrtick_enabled(rq); +} + void hrtick_start(struct rq *rq, u64 delay); #else +static inline int hrtick_enabled_fair(struct rq *rq) +{ + return 0; +} + +static inline int hrtick_enabled_dl(struct rq *rq) +{ + return 0; +} + static inline int hrtick_enabled(struct rq *rq) { return 0; -- 2.29.2
[PATCH 0/2] HRTICK reprogramming and optimization
Hi All,

Hung tasks and RCU stall cases were reported on systems which were not 100% busy. Investigation of such unexpected cases (no sign of potential starvation caused by tasks hogging the system) pointed out that the periodic sched tick timer wasn't serviced anymore after a certain point and that caused all machinery that depends on it (timers, RCU, etc.) to stop working as well. This issue was however only reproducible if HRTICK was enabled.

Looking at core dumps it was found that the rbtree of the hrtimer base used also for the hrtick was corrupted (i.e. next as seen from the base root and actual leftmost obtained by traversing the tree are different). The same base is also used for the periodic tick hrtimer, which might get "lost" if the rbtree gets corrupted.

Much like what is described in commit 1f71addd34f4c ("tick/sched: Do not mess with an enqueued hrtimer"), there is in fact a race window between hrtimer_set_expires() in hrtick_start() and hrtimer_start_expires() in __hrtick_restart() in which the former might be operating on an already queued hrtick hrtimer, which might lead to corruption of the base. Patch 01/02 fixes this case.

While at it, it might be desirable to avoid HRTICK overhead in cases where it is only actually used to service a specific subset of scheduling classes (currently it services both fair and deadline “at once”). Patch 02/02 proposes an optimization by making the HRTICK feature selectable on a per class basis, so one can, say, enable it only to service DEADLINE and leave NORMAL task preemption points less fine grained.

Series available at https://github.com/jlelli/linux.git sched/hrtick-fixes

Hope they both make sense. Comments, questions and suggestions are more than welcome.
Best,
Juri

Juri Lelli (2):
  sched/features: Fix hrtick reprogramming
  sched/features: Distinguish between NORMAL and DEADLINE hrtick

 kernel/sched/core.c     | 10 --
 kernel/sched/deadline.c |  4 ++--
 kernel/sched/fair.c     |  4 ++--
 kernel/sched/features.h |  1 +
 kernel/sched/sched.h    | 27 +--
 5 files changed, 34 insertions(+), 12 deletions(-)

--
2.29.2
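For readers who want to see the per-class gating from patch 02/02 in isolation, here is a minimal userspace sketch in C. It is not the kernel code: struct rq_model, struct feat_model and the *_model helpers are hypothetical stand-ins for the runqueue state, the sched_feat() bits and the hrtick_enabled*() helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the runqueue state the real helpers look at. */
struct rq_model {
	bool cpu_active;   /* stands in for cpu_active(cpu_of(rq)) */
	bool hres_active;  /* stands in for hrtimer_is_hres_active() */
};

/* Hypothetical model of the two scheduler feature bits. */
struct feat_model {
	bool hrtick;     /* stands in for sched_feat(HRTICK) */
	bool hrtick_dl;  /* stands in for sched_feat(HRTICK_DL) */
};

/* Class-independent part: the CPU is active and hrtimers run in hres mode. */
static bool hrtick_enabled_model(const struct rq_model *rq)
{
	assert(rq != NULL);
	return rq->cpu_active && rq->hres_active;
}

/* Fair class: gated on the HRTICK bit, then on the common check. */
static bool hrtick_enabled_fair_model(const struct feat_model *f,
				      const struct rq_model *rq)
{
	assert(f != NULL);
	return f->hrtick && hrtick_enabled_model(rq);
}

/* Deadline class: gated on the HRTICK_DL bit, then on the common check. */
static bool hrtick_enabled_dl_model(const struct feat_model *f,
				    const struct rq_model *rq)
{
	assert(f != NULL);
	return f->hrtick_dl && hrtick_enabled_model(rq);
}
```

With hrtick_dl set and hrtick clear, only the deadline check passes, which corresponds to the "enable it only to service DEADLINE" configuration mentioned in the cover letter.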
Re: [RESEND PATCH v3 3/5] misc: Add Synopsys DesignWare xData IP driver to Kconfig
On Tue, Feb 02, 2021 at 05:56:36PM +0100, Gustavo Pimentel wrote:
> Add Synopsys DesignWare xData IP driver to Kconfig.
>
> This driver enables/disables the PCIe traffic generator module
> pertaining to the Synopsys DesignWare prototype.
>
> Signed-off-by: Gustavo Pimentel
> ---
>  drivers/misc/Kconfig | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index fafa8b0..6d5783f 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -423,6 +423,17 @@ config SRAM
>  config SRAM_EXEC
>         bool
>
> +config DW_XDATA_PCIE
> +       depends on PCI
> +       tristate "Synopsys DesignWare xData PCIe driver"
> +       default n

"N" is the default and does not need to be stated explicitly.

Thanks
Re: [PATCH v2] memory: tegra186-emc: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE
On Sun, 7 Feb 2021 at 09:03, Jiapeng Chong wrote: > > Fix the following coccicheck warning: > > drivers/memory/tegra/tegra186-emc.c:158:0-23: WARNING: > tegra186_emc_debug_max_rate_fops should be defined with > DEFINE_DEBUGFS_ATTRIBUTE. > > drivers/memory/tegra/tegra186-emc.c:128:0-23: WARNING: > tegra186_emc_debug_min_rate_fops should be defined with > DEFINE_DEBUGFS_ATTRIBUTE. > > Reported-by: Abaci Robot Hi, My question from v1 is still valid because I did not receive any coccinelle report - where can we find it? Best regards, Krzysztof
[PATCH v1] kvm: x86: Revise guest_fpu xcomp_bv field
Bit 63 of the XCOMP_BV field indicates that the save area is in the compacted format and the remaining bits indicate the states that have space allocated in the save area, not only user states. Since fpstate_init() has initialized xcomp_bv, let's just use that. Signed-off-by: Jing Liu --- arch/x86/kvm/x86.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 1b404e4d7dd8..f115493f577d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4435,8 +4435,6 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu, return 0; } -#define XSTATE_COMPACTION_ENABLED (1ULL << 63) - static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu) { struct xregs_state *xsave = >arch.guest_fpu->state.xsave; @@ -4494,7 +4492,8 @@ static void load_xsave(struct kvm_vcpu *vcpu, u8 *src) /* Set XSTATE_BV and possibly XCOMP_BV. */ xsave->header.xfeatures = xstate_bv; if (boot_cpu_has(X86_FEATURE_XSAVES)) - xsave->header.xcomp_bv = host_xcr0 | XSTATE_COMPACTION_ENABLED; + xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | +xfeatures_mask_all; /* * Copy each region from the non-compacted offset to the @@ -9912,9 +9911,6 @@ static void fx_init(struct kvm_vcpu *vcpu) return; fpstate_init(>arch.guest_fpu->state); - if (boot_cpu_has(X86_FEATURE_XSAVES)) - vcpu->arch.guest_fpu->state.xsave.header.xcomp_bv = - host_xcr0 | XSTATE_COMPACTION_ENABLED; /* * Ensure guest xcr0 is valid for loading -- 2.18.4
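As a rough illustration of the XCOMP_BV layout described in the commit message (bit 63 is the compacted-format flag, the remaining bits are a state-component mask), here is a small standalone sketch in C. The macro and function names are hypothetical, and the mask values used with it are arbitrary examples rather than a statement about which components a real CPU reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the bit-63 "compacted format" flag. */
#define XCOMP_BV_COMPACTED_MODEL (UINT64_C(1) << 63)

/* Build an XCOMP_BV value from a mask of components that have space allocated. */
static uint64_t make_xcomp_bv(uint64_t component_mask)
{
	/* Bit 63 is the format flag, so it must not be part of the mask. */
	assert((component_mask & XCOMP_BV_COMPACTED_MODEL) == 0);
	return XCOMP_BV_COMPACTED_MODEL | component_mask;
}

/* True if the value says the save area is in the compacted format. */
static bool xcomp_bv_is_compacted(uint64_t xcomp_bv)
{
	return (xcomp_bv & XCOMP_BV_COMPACTED_MODEL) != 0;
}

/* Component mask with the format flag stripped. */
static uint64_t xcomp_bv_components(uint64_t xcomp_bv)
{
	return xcomp_bv & ~XCOMP_BV_COMPACTED_MODEL;
}
```

The point of the patch is which mask gets OR-ed in: one that covers every component with space in the save area, not only the user states.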
Re: [PATCH v19 2/3] scsi: ufs: L2P map management for HPB read
On 2021-01-29 13:30, Daejun Park wrote:

This is a patch for managing L2P map in HPB module.

The HPB divides logical addresses into several regions. A region consists of several sub-regions. The sub-region is a basic unit where L2P mapping is managed. The driver loads L2P mapping data of each sub-region. The loaded sub-region is called active-state. The HPB driver unloads L2P mapping data as region unit. The unloaded region is called inactive-state.

Sub-region/region candidates to be loaded and unloaded are delivered from the UFS device. The UFS device delivers the recommended active sub-region and inactive region to the driver using sensedata. The HPB module performs L2P mapping management on the host through the delivered information.

A pinned region is a pre-set region on the UFS device that is always in active-state.

The data structure for map data request and L2P map uses mempool API, minimizing allocation overhead while avoiding static allocation.

The minimum size of the memory pool used in the HPB is implemented as a module parameter, so that it can be configurable by the user.

To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096

The map_work manages active/inactive by 2 "to-do" lists. Each hpb lun maintains 2 "to-do" lists:
  hpb->lh_inact_rgn - regions to be inactivated, and
  hpb->lh_act_srgn - subregions to be activated
Those lists are maintained on IO completion.
Reviewed-by: Bart Van Assche Reviewed-by: Can Guo Acked-by: Avri Altman Tested-by: Bean Huo Signed-off-by: Daejun Park --- drivers/scsi/ufs/ufs.h| 36 ++ drivers/scsi/ufs/ufshcd.c | 4 + drivers/scsi/ufs/ufshpb.c | 993 +- drivers/scsi/ufs/ufshpb.h | 65 +++ 4 files changed, 1083 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index 65563635e20e..075c12e7de7e 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -472,6 +472,41 @@ struct utp_cmd_rsp { u8 sense_data[UFS_SENSE_SIZE]; }; +struct ufshpb_active_field { + __be16 active_rgn; + __be16 active_srgn; +}; +#define HPB_ACT_FIELD_SIZE 4 + +/** + * struct utp_hpb_rsp - Response UPIU structure + * @residual_transfer_count: Residual transfer count DW-3 + * @reserved1: Reserved double words DW-4 to DW-7 + * @sense_data_len: Sense data length DW-8 U16 + * @desc_type: Descriptor type of sense data + * @additional_len: Additional length of sense data + * @hpb_op: HPB operation type + * @reserved2: Reserved field + * @active_rgn_cnt: Active region count + * @inactive_rgn_cnt: Inactive region count + * @hpb_active_field: Recommended to read HPB region and subregion + * @hpb_inactive_field: To be inactivated HPB region and subregion + */ +struct utp_hpb_rsp { + __be32 residual_transfer_count; + __be32 reserved1[4]; + __be16 sense_data_len; + u8 desc_type; + u8 additional_len; + u8 hpb_op; + u8 reserved2; + u8 active_rgn_cnt; + u8 inactive_rgn_cnt; + struct ufshpb_active_field hpb_active_field[2]; + __be16 hpb_inactive_field[2]; +}; +#define UTP_HPB_RSP_SIZE 40 + /** * struct utp_upiu_rsp - general upiu response structure * @header: UPIU header structure DW-0 to DW-2 @@ -482,6 +517,7 @@ struct utp_upiu_rsp { struct utp_upiu_header header; union { struct utp_cmd_rsp sr; + struct utp_hpb_rsp hr; struct utp_upiu_query qr; }; }; diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index b8d6a52f5603..52e48de8d27c 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ 
b/drivers/scsi/ufs/ufshcd.c @@ -5018,6 +5018,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) */ pm_runtime_get_noresume(hba->dev); } + + if (scsi_status == SAM_STAT_GOOD) + ufshpb_rsp_upiu(hba, lrbp); break; case UPIU_TRANSACTION_REJECT_UPIU: /* TODO: handle Reject UPIU Response */ @@ -9228,6 +9231,7 @@ EXPORT_SYMBOL(ufshcd_shutdown); void ufshcd_remove(struct ufs_hba *hba) { ufs_bsg_remove(hba); + ufshpb_remove(hba); ufs_sysfs_remove_nodes(hba->dev); blk_cleanup_queue(hba->tmf_queue); blk_mq_free_tag_set(>tmf_tag_set); diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 1f84141ed384..48edfdd0f606 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -16,11 +16,73 @@ #include "ufshpb.h" #include "../sd.h" +/* memory management */ +static struct kmem_cache *ufshpb_mctx_cache; +static mempool_t *ufshpb_mctx_pool; +static mempool_t *ufshpb_page_pool; +/* A cache size of 2MB can cache ppn in the 1GB range. */ +static unsigned int ufshpb_host_map_kbytes = 2048; +static int tot_active_srgn_pages; + +static struct
[PATCH] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
THREAD_ALIGN_SHIFT = THREAD_SHIFT + 1 = PAGE_SHIFT + 1

Maximum PAGE_SHIFT is 18 for 256k pages so THREAD_ALIGN_SHIFT is 19 at the maximum.

No need to clobber cr1, it can be preserved when moving r1 into CR when we check stack overflow.

This reduces the number of instructions in Machine Check Exception prolog and fixes a build failure reported by the kernel test robot on v5.10 stable when building with RTAS + VMAP_STACK + KVM.

That build failure is due to too many instructions in the prolog hence not fitting between 0x200 and 0x300. Although the problem doesn't show up in mainline, it is still worth the change.

Reported-by: kernel test robot
Fixes: 98bf2d3f4970 ("powerpc/32s: Fix RTAS machine check with VMAP stack")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy
---
 arch/powerpc/kernel/head_32.h        | 2 +-
 arch/powerpc/kernel/head_book3s_32.S | 6 --
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index a2f72c966baf..abc7b603ab65 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -47,7 +47,7 @@
        lwz     r1,TASK_STACK-THREAD(r1)
        addi    r1, r1, THREAD_SIZE - INT_FRAME_SIZE
 1:
-       mtcrf   0x7f, r1
+       mtcrf   0x3f, r1
        bt      32 - THREAD_ALIGN_SHIFT, stack_overflow
 #else
        subi    r11, r1, INT_FRAME_SIZE /* use r1 if kernel */
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 54140f4927e5..10e6aa88b1ff 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -278,12 +278,6 @@ MachineCheck:
 7:     EXCEPTION_PROLOG_2
        addi    r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_CHRP
-#ifdef CONFIG_VMAP_STACK
-       mfspr   r4, SPRN_SPRG_THREAD
-       tovirt(r4, r4)
-       lwz     r4, RTAS_SP(r4)
-       cmpwi   cr1, r4, 0
-#endif
        beq     cr1, machine_check_tramp
        twi     31, 0, 0
 #else
--
2.25.0
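The arithmetic behind the mask change can be checked with a short standalone C sketch. It only models the bit numbering (CR bit 0 is the most significant bit of cr0, and the mtcrf field mask uses 0x80 for cr0 down to 0x01 for cr7); the helper names are hypothetical and nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* CR field (0..7) that holds a given CR bit (0..31, bit 0 is the MSB of cr0). */
static unsigned int cr_field_of_bit(unsigned int cr_bit)
{
	assert(cr_bit < 32);
	return cr_bit / 4;
}

/* True if an mtcrf with field mask fxm writes the given CR field. */
static bool mtcrf_writes_field(unsigned int fxm, unsigned int field)
{
	assert(fxm <= 0xff && field < 8);
	return (fxm & (0x80u >> field)) != 0;
}

/* CR bit tested by "bt 32 - THREAD_ALIGN_SHIFT, stack_overflow". */
static unsigned int stack_overflow_bit(unsigned int thread_align_shift)
{
	assert(thread_align_shift >= 1 && thread_align_shift <= 32);
	return 32 - thread_align_shift;
}
```

For the maximum THREAD_ALIGN_SHIFT of 19 the tested bit is 13, which lives in cr3. Mask 0x3f still writes cr3 but no longer writes cr1, whereas 0x7f wrote both.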
[PATCH] MAINTAINERS: rectify BROADCOM PMB (POWER MANAGEMENT BUS) DRIVER
Commit 8bcac4011ebe ("soc: bcm: add PM driver for Broadcom's PMB") includes a new MAINTAINERS section BROADCOM PMB (POWER MANAGEMENT BUS) DRIVER with 'drivers/soc/bcm/bcm-pmb.c', but the file was actually added at 'drivers/soc/bcm/bcm63xx/bcm-pmb.c'. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains: warning: no file matches F:drivers/soc/bcm/bcm-pmb.c Point the file entry to the right location. Signed-off-by: Lukas Bulwahn --- applies cleanly on next-20210205 Rafal, please ack. Florian, please pick this minor fixup patch for soc next tree. MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 6b507e8d7828..c23731c88dc2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3647,7 +3647,7 @@ M:bcm-kernel-feedback-l...@broadcom.com L: linux...@vger.kernel.org S: Maintained T: git git://github.com/broadcom/stblinux.git -F: drivers/soc/bcm/bcm-pmb.c +F: drivers/soc/bcm/bcm63xx/bcm-pmb.c F: include/dt-bindings/soc/bcm-pmb.h BROADCOM SPECIFIC AMBA DRIVER (BCMA) -- 2.17.1
TLS for 5.10
smime.p7m Description: S/MIME encrypted message
Re: [PATCH] staging: octeon: remove braces from single-line block
Hi! On 06/02/2021 21:17, Phillip Potter wrote: > This removes the braces from the if statement that checks the > physical node return value in cvm_oct_phy_setup_device, as this > block contains only one statement. Fixes a style warning. > > Signed-off-by: Phillip Potter Reviewed-by: Alexander Sverdlin > --- > drivers/staging/octeon/ethernet-mdio.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/staging/octeon/ethernet-mdio.c > b/drivers/staging/octeon/ethernet-mdio.c > index 0bf545849b11..b0fd083a5bf2 100644 > --- a/drivers/staging/octeon/ethernet-mdio.c > +++ b/drivers/staging/octeon/ethernet-mdio.c > @@ -146,9 +146,8 @@ int cvm_oct_phy_setup_device(struct net_device *dev) > goto no_phy; > > phy_node = of_parse_phandle(priv->of_node, "phy-handle", 0); > - if (!phy_node && of_phy_is_fixed_link(priv->of_node)) { > + if (!phy_node && of_phy_is_fixed_link(priv->of_node)) > phy_node = of_node_get(priv->of_node); > - } > if (!phy_node) > goto no_phy; > -- Best regards, Alexander Sverdlin.
RE: [PATCH] seccomp: Improve performance by optimizing memory barrier
> From: Leon Romanovsky [mailto:l...@kernel.org]
> Sent: Monday, February 8, 2021 2:44 PM
> To: Wanghongzhe (Hongzhe, EulerOS)
> Cc: keesc...@chromium.org; l...@amacapital.net; w...@chromium.org;
> a...@kernel.org; dan...@iogearbox.net; and...@kernel.org; ka...@fb.com;
> songliubrav...@fb.com; y...@fb.com; john.fastab...@gmail.com;
> kpsi...@kernel.org; linux-kernel@vger.kernel.org; net...@vger.kernel.org;
> b...@vger.kernel.org
> Subject: Re: [PATCH] seccomp: Improve performance by optimizing memory barrier
>
> On Mon, Feb 01, 2021 at 08:49:41PM +0800, wanghongzhe wrote:
> > If a thread(A)'s TSYNC flag is set from seccomp(), then it will
> > synchronize its seccomp filter to other threads(B) in the same thread
> > group. To avoid a race condition, seccomp puts rmb() between reading the
> > mode and filter in the seccomp check path (in thread B).
> > As a result, every syscall's seccomp check is slowed down by the
> > memory barrier.
> >
> > However, we can optimize it by calling rmb() only when filter is NULL
> > and reading it again after the barrier, which means the rmb() is
> > called only once in the thread lifetime.
> >
> > The 'filter is NULL' condition means that it is the first time a
> > filter is attached, and that it is attached by another thread(A) using
> > the TSYNC flag.
> > In this case, thread B may read the filter first and the mode later
> > due to CPU out-of-order execution. After this time, thread B's mode is
> > always set, and there will be no race condition with the filter/bitmap.
> >
> > In addition, we should put a write memory barrier between writing the
> > filter and mode in smp_mb__before_atomic(), to avoid the race
> > condition in the TSYNC case.
> >
> > Signed-off-by: wanghongzhe
> > ---
> >  kernel/seccomp.c | 31 ++-
> >  1 file changed, 22 insertions(+), 9 deletions(-)
> >
> > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > index 952dc1c90229..b944cb2b6b94 100644
> > --- a/kernel/seccomp.c
> > +++ b/kernel/seccomp.c
> > @@ -397,8 +397,20 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd,
> >                     READ_ONCE(current->seccomp.filter);
> >
> >     /* Ensure unexpected behavior doesn't result in failing open. */
> > -   if (WARN_ON(f == NULL))
> > -           return SECCOMP_RET_KILL_PROCESS;
> > +   if (WARN_ON(f == NULL)) {
> > +           /*
> > +            * Make sure the first filter addtion (from another
> > +            * thread using TSYNC flag) are seen.
> > +            */
> > +           rmb();
> > +
> > +           /* Read again */
> > +           f = READ_ONCE(current->seccomp.filter);
> > +
> > +           /* Ensure unexpected behavior doesn't result in failing open. */
> > +           if (WARN_ON(f == NULL))
> > +                   return SECCOMP_RET_KILL_PROCESS;
> > +   }
>
> IMHO, double WARN_ON() for the fallback flow is too much.
> Also according to the description, this "f == NULL" check is due to races and
> not programming error which WARN_ON() are intended to catch.
>
> Thanks

Maybe you are right. I think 'if (f == NULL)' is enough for this optimization.
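To make the pattern under discussion easier to follow, here is a userspace sketch in C11 of "read, and only fence and re-read when the first read saw NULL". It is an approximation under stated assumptions, not the seccomp code: struct task_model, struct filter_model and both functions are hypothetical, and C11 release/acquire fences stand in for the kernel's write and read barriers. It says nothing about whether the proposed patch is correct for the kernel memory model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for a seccomp filter and the per-task seccomp state. */
struct filter_model {
	int id;
};

struct task_model {
	_Atomic(struct filter_model *) filter;
	atomic_int mode; /* 0 = disabled, 1 = filter mode */
};

/* Writer (thread A, TSYNC): publish the filter, then the mode. */
static void attach_filter(struct task_model *t, struct filter_model *f)
{
	assert(t != NULL && f != NULL);
	atomic_store_explicit(&t->filter, f, memory_order_relaxed);
	/* Models the write barrier between the filter and mode stores. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&t->mode, 1, memory_order_relaxed);
}

/* Reader (thread B): pay for the fence only if the first load saw NULL. */
static struct filter_model *lookup_filter(struct task_model *t)
{
	struct filter_model *f;

	assert(t != NULL);
	if (atomic_load_explicit(&t->mode, memory_order_relaxed) == 0)
		return NULL; /* filtering not enabled for this task */

	f = atomic_load_explicit(&t->filter, memory_order_relaxed);
	if (f == NULL) {
		/* Models rmb(): order the mode load before the second filter load. */
		atomic_thread_fence(memory_order_acquire);
		f = atomic_load_explicit(&t->filter, memory_order_relaxed);
	}
	return f;
}
```

In this model the slow path can only be taken while the reader has observed the mode but not yet the filter; once a non-NULL filter has been seen, later lookups take the fast path.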
Re: [PATCH] optee: simplify i2c access
Hi Jorge, On Wed, Jan 27, 2021 at 11:41 AM Jens Wiklander wrote: > > Hi Arnd, > > On Mon, Jan 25, 2021 at 12:38 PM Arnd Bergmann wrote: > > > > From: Arnd Bergmann > > > > Storing a bogus i2c_client structure on the stack adds overhead and > > causes a compile-time warning: > > > > drivers/tee/optee/rpc.c:493:6: error: stack frame size of 1056 bytes in > > function 'optee_handle_rpc' [-Werror,-Wframe-larger-than=] > > void optee_handle_rpc(struct tee_context *ctx, struct optee_rpc_param > > *param, > > > > Change the implementation of handle_rpc_func_cmd_i2c_transfer() to > > open-code the i2c_transfer() call, which makes it easier to read > > and avoids the warning. > > > > Fixes: c05210ab9757 ("drivers: optee: allow op-tee to access devices on the > > i2c bus") > > Signed-off-by: Arnd Bergmann > > --- > > drivers/tee/optee/rpc.c | 31 --- > > 1 file changed, 16 insertions(+), 15 deletions(-) > > Looks good to me. > Reviewed-by: Jens Wiklander Would you mind testing this? Thanks, Jens
[RFC PATCH v3] MIPS: tlbex: Avoid access invalid address when pmd is modifying
From: wangrui

When modifying pmd through THP, an invalid address access may occur in the tlb handler, because the tlb handler loads the value of pmd twice: once for huge page testing and once to load the pte. These two values may be different:

  1: CPU 1 (khugepaged): scan hit: set pmd to invalid_pmd_table (pmd_clear)
  2: CPU 0 (app): tlb invalid: handle_tlbl, load pmd for huge page testing, is not a huge page
  3: CPU 1 (khugepaged): collapsed: set pmd to huge page
  4: CPU 0 (app): handle_tlbl: load pmd again to load the pte (as base address); the value of pmd is not an address, so an invalid address is accessed!

This patch avoids the inconsistency of two memory loads by reusing the result of one load.

Signed-off-by: hev
Signed-off-by: wangrui
---
 arch/mips/mm/tlbex.c | 23 ++-
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c index a7521b8f7658..5842074502ad 100644 --- a/arch/mips/mm/tlbex.c +++ b/arch/mips/mm/tlbex.c @@ -723,11 +723,10 @@ static void build_is_huge_pte(u32 **p, struct uasm_reloc **r, unsigned int tmp, unsigned int pmd, int lid) { - UASM_i_LW(p, tmp, 0, pmd); if (use_bbit_insns()) { - uasm_il_bbit1(p, r, tmp, ilog2(_PAGE_HUGE), lid); + uasm_il_bbit1(p, r, pmd, ilog2(_PAGE_HUGE), lid); } else { - uasm_i_andi(p, tmp, tmp, _PAGE_HUGE); + uasm_i_andi(p, tmp, pmd, _PAGE_HUGE); uasm_il_bnez(p, r, tmp, lid); } } @@ -1103,7 +1102,6 @@ EXPORT_SYMBOL_GPL(build_update_entries); struct mips_huge_tlb_info { int huge_pte; int restore_scratch; - bool need_reload_pte; }; static struct mips_huge_tlb_info @@ -1118,7 +1116,6 @@ build_fast_tlb_refill_handler (u32 **p, struct uasm_label **l, rv.huge_pte = scratch; rv.restore_scratch = 0; - rv.need_reload_pte = false; if (check_for_high_segbits) { UASM_i_MFC0(p, tmp, C0_BADVADDR); @@ -1323,7 +1320,6 @@ static void build_r4000_tlb_refill_handler(void) } else { htlb_info.huge_pte = K0; htlb_info.restore_scratch = 0; - htlb_info.need_reload_pte = true; vmalloc_mode = refill_noscratch; /* * create the plain
linear handler @@ -1348,11 +1344,14 @@ static void build_r4000_tlb_refill_handler(void) build_get_pgde32(, K0, K1); /* get pgd in K1 */ #endif + UASM_i_LW(, K0, 0, K1); #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT - build_is_huge_pte(, , K0, K1, label_tlb_huge_update); + build_is_huge_pte(, , K1, K0, label_tlb_huge_update); #endif - build_get_ptep(, K0, K1); + GET_CONTEXT(, K1); /* get context reg */ + build_adjust_context(, K1); + UASM_i_ADDU(, K1, K0, K1); /* add in offset */ build_update_entries(, K0, K1); build_tlb_write_entry(, , , tlb_random); uasm_l_leave(, p); @@ -1360,8 +1359,6 @@ static void build_r4000_tlb_refill_handler(void) } #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT uasm_l_tlb_huge_update(, p); - if (htlb_info.need_reload_pte) - UASM_i_LW(, htlb_info.huge_pte, 0, K1); build_huge_update_entries(, htlb_info.huge_pte, K1); build_huge_tlb_write_entry(, , , K0, tlb_random, htlb_info.restore_scratch); @@ -2059,20 +2056,20 @@ build_r4000_tlbchange_handler_head(u32 **p, struct uasm_label **l, build_get_pgde32(p, wr.r1, wr.r2); /* get pgd in ptr */ #endif + UASM_i_LW(p, wr.r3, 0, wr.r2); #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT /* * For huge tlb entries, pmd doesn't contain an address but * instead contains the tlb pte. Check the PAGE_HUGE bit and * see if we need to jump to huge tlb processing. */ - build_is_huge_pte(p, r, wr.r1, wr.r2, label_tlb_huge_update); + build_is_huge_pte(p, r, wr.r1, wr.r3, label_tlb_huge_update); #endif UASM_i_MFC0(p, wr.r1, C0_BADVADDR); - UASM_i_LW(p, wr.r2, 0, wr.r2); UASM_i_SRL(p, wr.r1, wr.r1, PAGE_SHIFT + PTE_ORDER - PTE_T_LOG2); uasm_i_andi(p, wr.r1, wr.r1, (PTRS_PER_PTE - 1) << PTE_T_LOG2); - UASM_i_ADDU(p, wr.r2, wr.r2, wr.r1); + UASM_i_ADDU(p, wr.r2, wr.r3, wr.r1); #ifdef CONFIG_SMP uasm_l_smp_pgtable_change(l, *p); -- 2.30.0
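The "load once, reuse the value" idea in this patch can be sketched in plain C as follows. This is a userspace model under loud assumptions: HUGE_BIT_MODEL, struct pmd_snapshot and both helpers are hypothetical, the huge bit position is arbitrary, and the real handler is generated MIPS assembly rather than C.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit marking a pmd entry as a huge page mapping. */
#define HUGE_BIT_MODEL ((uintptr_t)0x1)

/* One consistent view of a pmd entry, taken with a single load. */
struct pmd_snapshot {
	uintptr_t val;
	bool huge;
};

/* Load the pmd entry exactly once; both decisions below use this value. */
static struct pmd_snapshot pmd_read_once(_Atomic uintptr_t *pmdp)
{
	struct pmd_snapshot s;

	assert(pmdp != NULL);
	s.val = atomic_load_explicit(pmdp, memory_order_relaxed);
	s.huge = (s.val & HUGE_BIT_MODEL) != 0;
	return s;
}

/* Use the same snapshot as the pte table base; never re-read the pmd here. */
static uintptr_t pte_slot_addr(struct pmd_snapshot s, size_t index)
{
	assert(!s.huge); /* only valid when the snapshot is a table pointer */
	return s.val + index * sizeof(uintptr_t);
}
```

Because the huge-page test and the base address come from one snapshot, a concurrent update of the pmd between the two steps cannot make them disagree; the handler may act on a slightly stale value, but not on a mix of two values.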
Re: [96e8740] [PATCH 2/2] Staging: wimax: i2400m: some readability improvements.
On Sun, Feb 07, 2021 at 10:11:24PM +0300, dev.dra...@bk.ru wrote: > From: Dmitrii Wolf > > Hello, developers! > Sorry for the late answer. As you know - i am a newbie and it is my first > kernel patch. > After reading kernelnewbies.or, ./Documentation/process/ files and viewing > FOSDEM's videpo > "Write and Submit your first Linux kernel Patch", i took a decision to send > you some > changes. I understand that it is annoying to get this "style fixing" > patches. So, the > Joe Perches's idea to improve code readability was implemented in second > patch. Also, > some new readability improvements added to it. > Thanks in advance! > > Signed-off-by: Dmitrii Wolf > --- > drivers/staging/wimax/i2400m/netdev.c | 8 > drivers/staging/wimax/i2400m/rx.c | 25 + > 2 files changed, 17 insertions(+), 16 deletions(-) > > diff --git a/drivers/staging/wimax/i2400m/netdev.c > b/drivers/staging/wimax/i2400m/netdev.c > index 0895a2e441d3..5f79ccc87656 100644 > --- a/drivers/staging/wimax/i2400m/netdev.c > +++ b/drivers/staging/wimax/i2400m/netdev.c > @@ -366,13 +366,13 @@ netdev_tx_t i2400m_hard_start_xmit(struct sk_buff *skb, > result = i2400m_net_wake_tx(i2400m, net_dev, skb); > else > result = i2400m_net_tx(i2400m, net_dev, skb); > - if (result < 0) { > -drop: > - net_dev->stats.tx_dropped++; > - } else { > + if (result >= 0) { > net_dev->stats.tx_packets++; > net_dev->stats.tx_bytes += skb->len; > } > +drop: > + net_dev->stats.tx_dropped++; > + > dev_kfree_skb(skb); > d_fnend(3, dev, "(skb %p net_dev %p) = %d\n", skb, net_dev, result); > return NETDEV_TX_OK; > diff --git a/drivers/staging/wimax/i2400m/rx.c > b/drivers/staging/wimax/i2400m/rx.c > index 807bd3db69e9..fdc5da409683 100644 > --- a/drivers/staging/wimax/i2400m/rx.c > +++ b/drivers/staging/wimax/i2400m/rx.c > @@ -194,8 +194,8 @@ void i2400m_report_hook_work(struct work_struct *ws) > spin_unlock_irqrestore(>rx_lock, flags); > if (list_empty()) > break; > - else > - d_printf(1, dev, "processing queued reports\n"); > + > + 
d_printf(1, dev, "processing queued reports\n"); > list_for_each_entry_safe(args, args_next, , list_node) { > d_printf(2, dev, "processing queued report %p\n", args); > i2400m_report_hook(i2400m, args->l3l4_hdr, args->size); > @@ -756,16 +756,15 @@ unsigned __i2400m_roq_update_ws(struct i2400m *i2400m, > struct i2400m_roq *roq, > roq_data_itr = (struct i2400m_roq_data *) _itr->cb; > nsn_itr = __i2400m_roq_nsn(roq, roq_data_itr->sn); > /* NSN bounds assumed correct (checked when it was queued) */ > - if (nsn_itr < new_nws) { > - d_printf(2, dev, "ERX: roq %p - release skb %p " > - "(nsn %u/%u new nws %u)\n", > - roq, skb_itr, nsn_itr, roq_data_itr->sn, > - new_nws); > - __skb_unlink(skb_itr, >queue); > - i2400m_net_erx(i2400m, skb_itr, roq_data_itr->cs); > - } else { > + if (nsn_itr >= new_nws) { > break; /* rest of packets all nsn_itr > nws */ > } > + d_printf(2, dev, "ERX: roq %p - release skb %p " > + "(nsn %u/%u new nws %u)\n", > + roq, skb_itr, nsn_itr, roq_data_itr->sn, > + new_nws); > + __skb_unlink(skb_itr, >queue); > + i2400m_net_erx(i2400m, skb_itr, roq_data_itr->cs); > } > roq->ws = sn; > return new_nws; > @@ -904,8 +903,9 @@ void i2400m_roq_queue_update_ws(struct i2400m *i2400m, > struct i2400m_roq *roq, > struct i2400m_roq_data *roq_data; > roq_data = (struct i2400m_roq_data *) >cb; > i2400m_net_erx(i2400m, skb, roq_data->cs); > - } else > + } else { > __i2400m_roq_queue(i2400m, roq, skb, sn, nsn); > + } > > __i2400m_roq_update_ws(i2400m, roq, sn + 1); > i2400m_roq_log_add(i2400m, roq, I2400M_RO_TYPE_PACKET_WS, > @@ -1321,9 +1321,10 @@ void i2400m_unknown_barker(struct i2400m *i2400m, > 8, 4, buf, 64, 0); > printk(KERN_ERR "%s... (only first 64 bytes " > "dumped)\n", prefix); > - } else > + } else { > print_hex_dump(KERN_ERR, prefix, DUMP_PREFIX_OFFSET, > 8, 4, buf, size, 0); > + } > } > EXPORT_SYMBOL(i2400m_unknown_barker); > > -- > 2.25.1 > Hi, This is the friendly patch-bot of Greg Kroah-Hartman. 
You have sent him a patch that has triggered this response. He used to manually respond to these common problems, but in order to save
Re: [PATCH v7 4/7] crypto: add ecc curve and expose them
On Mon, 8 Feb 2021 at 07:37, Vitaly Chikunov wrote:
>
> Herbert,
>
> On Fri, Jan 29, 2021 at 02:00:04PM +1100, Herbert Xu wrote:
> > On Thu, Jan 28, 2021 at 09:49:41PM -0500, Stefan Berger wrote:
> > >
> > > In my patch series I initially had registered the akciphers under the names
> > > ecc-nist-p192 and ecc-nist-p256 but now, in V4, joined them together as
> > > 'ecdsa'. This may be too generic for a name. Maybe it should be called
> > > ecdsa-nist for the NIST family.
> >
> > What I'm proposing is specifying the curve in the name as well, i.e.,
> > ecdsa-nist-p192 instead of just ecdsa or ecdsa-nist.
> >
> > This simplifies the task of handling hardware that only supports a
> > subset of curves.
>
> So, if some implementation supports multiple curves (like EC-RDSA
> currently supports 5 curves), it should add 5 ecrdsa-{a,b,c,..}
> algorithms with actually the same top level implementation?
> Right?
>

Yes. The only difference will be the init() function, which can be used to set the TFM properties that define which curve is being used. The other routines can be generic, and refer to those properties if the behavior is curve-specific.

> > There is a parallel discussion of exactly what curves we should
> > support in the kernel. Personally if there is a user in the kernel
> > for it then I'm happy to see it added. In your specific case, as
> > long as your use of the algorithm in x509 is accepted then I don't
> > have any problems with adding support in the Crypto API.
> >
> > Cheers,
> > --
> > Email: Herbert Xu
> > Home Page: http://gondor.apana.org.au/~herbert/
> > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
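The arrangement described in the reply (one algorithm name per curve, a per-curve init() that records the curve, generic routines that consult it) could look roughly like this standalone C sketch. Everything in it is hypothetical: struct tfm_model, struct alg_model and the helpers are illustrative only and are not the Crypto API; only the two algorithm names come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-transform state; init() records which curve is in use. */
struct tfm_model {
	unsigned int curve_bits;
};

/* Hypothetical algorithm descriptor: only name and init() differ per curve. */
struct alg_model {
	const char *name;
	void (*init)(struct tfm_model *tfm);
};

static void init_nist_p192(struct tfm_model *tfm) { tfm->curve_bits = 192; }
static void init_nist_p256(struct tfm_model *tfm) { tfm->curve_bits = 256; }

/* Generic routine shared by all curves; it reads the property set by init(). */
static size_t coordinate_bytes(const struct tfm_model *tfm)
{
	assert(tfm != NULL);
	return (tfm->curve_bits + 7) / 8;
}

static const struct alg_model algs[] = {
	{ "ecdsa-nist-p192", init_nist_p192 },
	{ "ecdsa-nist-p256", init_nist_p256 },
};

/* Look an algorithm up by its full, curve-specific name. */
static const struct alg_model *find_alg(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(algs) / sizeof(algs[0]); i++)
		if (strcmp(algs[i].name, name) == 0)
			return &algs[i];
	return NULL;
}
```

A driver for hardware that supports only some curves would then register only the matching entries, which is the simplification Herbert refers to.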
Re: [PATCH v14 00/11] support reserving crashkernel above 4G on arm64 kdump
Hi all,

Friendly ping...

On 2021/1/30 15:10, Chen Zhou wrote:
> There are the following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is not enough low memory.
> 2. If reserving crashkernel above 4G, the crash dump
> kernel will fail to boot because there is no low memory available
> for allocation.
>
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in DMA zone and falls back to high
> allocation if it fails.
>
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
>
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices. So there may be two regions reserved for
> crash dump kernel.
> In order to distinguish it from the high region and to not affect the use
> of existing kexec-tools, rename the low region as "Crash kernel (low)",
> and pass the low region by reusing DT property
> "linux,usable-memory-range". We made the low memory region the last
> range of "linux,usable-memory-range" to keep compatibility with existing
> user-space and older kdump kernels.
>
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
>
> Another update is the documentation of DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
>
> This patchset contains the following eleven patches:
> 0001-x86-kdump-replace-the-hard-coded-alignment-with-macr.patch
> 0002-x86-kdump-make-the-lower-bound-of-crash-kernel-reser.patch
> 0003-x86-kdump-use-macro-CRASH_ADDR_LOW_MAX-in-functions-.patch
> 0004-x86-kdump-move-xen_pv_domain-check-and-insert_resour.patch
> 0005-x86-kdump-move-reserve_crashkernel-_low-into-crash_c.patch
> 0006-x86-elf-Move-vmcore_elf_check_arch_cross-to-arch-x86.patch
> 0007-arm64-kdump-introduce-some-macroes-for-crash-kernel-.patch
> 0008-arm64-kdump-reimplement-crashkernel-X.patch
> 0009-x86-arm64-Add-ARCH_WANT_RESERVE_CRASH_KERNEL-config.patch
> 0010-arm64-kdump-add-memory-for-devices-by-DT-property-li.patch
> 0011-kdump-update-Documentation-about-crashkernel.patch
>
> 0001-0004 are some x86 cleanups which prepare for making
> functions reserve_crashkernel[_low]() generic.
> 0005 makes functions reserve_crashkernel[_low]() generic.
> 0006 fixes a compile warning.
> 0007-0009 reimplement arm64 crashkernel=X.
> 0010 adds memory for devices by DT property linux,usable-memory-range.
> 0011 updates the doc.
>
> Changes since [v13]
> - Rebased on top of 5.11-rc5.
> - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> Since reserve_crashkernel[_low]() implementations are quite similar on
> other architectures, have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> arch/Kconfig and select this by X86 and ARM64.
> - Some minor cleanup.
>
> Changes since [v12]
> - Rebased on top of 5.10-rc1.
> - Keep CRASH_ALIGN as 16M suggested by Dave.
> - Drop patch "kdump: add threshold for the required memory".
> - Add Tested-by from John.
>
> Changes since [v11]
> - Rebased on top of 5.9-rc4.
> - Make the function reserve_crashkernel() of x86 generic.
> Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> and arm64 use the generic version to reimplement crashkernel=X.
>
> Changes since [v10]
> - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
>
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
>
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
>
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to
> Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
>
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
>
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve specified size low
> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
>
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
>
> Changes since [v3]
> - Add
Re: [PATCH 00/21] [Set 2] Rid W=1 warnings from Clock
On 26/01/2021 14:45, Lee Jones wrote: This set is part of a larger effort attempting to clean-up W=1 kernel builds, which are currently overwhelmingly riddled with niggly little warnings. This is the last set. Clock is clean after this. Lee Jones (21): clk: zynq: pll: Fix kernel-doc formatting in 'clk_register_zynq_pll's header clk: ti: clkt_dpll: Fix some kernel-doc misdemeanours clk: ti: dpll3xxx: Fix some kernel-doc headers and promote other worthy ones clk: qcom: clk-regmap: Provide missing description for 'devm_clk_register_regmap()'s dev param clk: sunxi: clk-sun9i-core: Demote non-conformant kernel-doc headers clk: sunxi: clk-usb: Demote obvious kernel-doc abuse clk: tegra: clk-tegra30: Remove unused variable 'reg' clk: clkdev: Ignore suggestion to use gnu_printf() as it's not appropriate here clk: tegra: cvb: Provide missing description for 'tegra_cvb_add_opp_table()'s align param clk: ti: dpll44xx: Fix some potential doc-rot clk: renesas: renesas-cpg-mssr: Fix formatting issues for 'smstpcr_saved's documentation clk: sunxi: clk-sun6i-ar100: Demote non-conformant kernel-doc header clk: qcom: gcc-ipq4019: Remove unused variable 'ret' clk: clk-fixed-mmio: Demote obvious kernel-doc abuse clk: clk-npcm7xx: Remove unused static const tables 'npcm7xx_gates' and 'npcm7xx_divs_fx' clk: qcom: mmcc-msm8974: Remove unused static const tables 'mmcc_xo_mmpll0_1_2_gpll0{map}' clk: clk-xgene: Add description for 'mask' and fix formatting for 'flags' clk: qcom: clk-rpm: Remove a bunch of superfluous code clk: spear: Move prototype to accessible header clk: imx: Move 'imx6sl_set_wait_clk()'s prototype out to accessible header clk: zynqmp: divider: Add missing description for 'max_div' arch/arm/mach-imx/common.h | 1 - arch/arm/mach-imx/cpuidle-imx6sl.c | 1 + arch/arm/mach-imx/pm-imx6.c| 1 + arch/arm/mach-spear/generic.h | 12 --- arch/arm/mach-spear/spear13xx.c| 1 + drivers/clk/clk-fixed-mmio.c | 2 +- drivers/clk/clk-npcm7xx.c | 108 - drivers/clk/clk-xgene.c| 5 +- 
drivers/clk/clkdev.c | 7 ++ drivers/clk/imx/clk-imx6sl.c | 1 + drivers/clk/qcom/clk-regmap.c | 1 + drivers/clk/qcom/clk-rpm.c | 63 --- drivers/clk/qcom/gcc-ipq4019.c | 7 +- drivers/clk/qcom/mmcc-msm8974.c| 16 drivers/clk/renesas/renesas-cpg-mssr.c | 4 +- drivers/clk/spear/spear1310_clock.c| 1 + drivers/clk/spear/spear1340_clock.c| 1 + drivers/clk/sunxi/clk-sun6i-ar100.c| 2 +- drivers/clk/sunxi/clk-sun9i-core.c | 8 +- drivers/clk/sunxi/clk-usb.c| 2 +- drivers/clk/tegra/clk-tegra30.c| 5 +- drivers/clk/tegra/cvb.c| 1 + drivers/clk/ti/clkt_dpll.c | 3 +- drivers/clk/ti/dpll3xxx.c | 20 ++--- drivers/clk/ti/dpll44xx.c | 6 +- For the TI portions: Reviewed-by: Tero Kristo drivers/clk/zynq/pll.c | 12 +-- drivers/clk/zynqmp/divider.c | 1 + include/linux/clk/imx.h| 15 include/linux/clk/spear.h | 23 ++ 29 files changed, 92 insertions(+), 238 deletions(-) create mode 100644 include/linux/clk/imx.h create mode 100644 include/linux/clk/spear.h Cc: Ahmad Fatoum Cc: Andy Gross Cc: Avi Fishman Cc: Benjamin Fair Cc: Bjorn Andersson Cc: Boris BREZILLON Cc: Chen-Yu Tsai Cc: "Emilio López" Cc: Fabio Estevam Cc: Geert Uytterhoeven Cc: Jan Kotas Cc: Jernej Skrabec Cc: Jonathan Hunter Cc: linux-arm-ker...@lists.infradead.org Cc: linux-arm-...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-o...@vger.kernel.org Cc: linux-renesas-...@vger.kernel.org Cc: linux-te...@vger.kernel.org Cc: Loc Ho Cc: Maxime Ripard Cc: Michael Turquette Cc: Michal Simek Cc: Nancy Yuen Cc: Nuvoton Technologies Cc: NXP Linux Team Cc: open...@lists.ozlabs.org Cc: Patrick Venture Cc: Pengutronix Kernel Team Cc: Peter De Schrijver Cc: Philipp Zabel Cc: Prashant Gaikwad Cc: Rajan Vaja Cc: Rajeev Kumar Cc: Richard Woodruff Cc: Russell King Cc: Sascha Hauer Cc: Shawn Guo Cc: Shiraz Hashim Cc: "Sören Brinkmann" Cc: Stephen Boyd Cc: Tali Perry Cc: Tero Kristo Cc: Thierry Reding Cc: Tomer Maimon Cc: Viresh Kumar
Re: [PATCH] printk: avoid prb_first_valid_seq() where possible
On (21/02/05 15:23), John Ogness wrote: > If message sizes average larger than expected (more than 32 > characters), the data_ring will wrap before the desc_ring. Once the > data_ring wraps, it will start invalidating descriptors. These > invalid descriptors hang around until they are eventually recycled > when the desc_ring wraps. Readers do not care about invalid > descriptors, but they still need to iterate past them. If the > average message size is much larger than 32 characters, then there > will be many invalid descriptors preceding the valid descriptors. > > The function prb_first_valid_seq() always begins at the oldest > descriptor and searches for the first valid descriptor. This can > be rather expensive for the above scenario. And, in fact, because > of its heavy usage in /dev/kmsg, there have been reports of long > delays and even RCU stalls. > > For code that does not need to search from the oldest record, > replace prb_first_valid_seq() usage with prb_read_valid_*() > functions, which provide a start sequence number to search from. > > Fixes: 896fbe20b4e2333fb55 ("printk: use the lockless ringbuffer") > Reported-by: kernel test robot Can we please also ask the kernel test robot to test this patch? -ss
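To illustrate the cost argument in the quoted commit message, here is a toy model in plain userspace C. It is only a sketch: the real printk ringbuffer tracks descriptor state very differently, and the array of flags, the function name first_valid_from() and the step counter are all hypothetical. It shows only that a search starting at the oldest slot must walk past every invalid entry, while a search given a start position close to the valid data does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model, not the printk ringbuffer: return the index of the first
 * valid slot at or after 'start', or 'n' if there is none. '*steps'
 * reports how many slots had to be inspected.
 */
static size_t first_valid_from(const bool *valid, size_t n, size_t start,
			       size_t *steps)
{
	size_t i;

	assert(valid != NULL && steps != NULL);
	*steps = 0;
	for (i = start; i < n; i++) {
		(*steps)++;
		if (valid[i])
			return i;
	}
	return n;
}
```

With five invalid slots ahead of the first valid one, a search from slot 0 inspects six slots, whereas a search started at an already known valid position inspects one.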
Re: [PATCH] seccomp: Improve performance by optimizing memory barrier
On Mon, Feb 01, 2021 at 08:49:41PM +0800, wanghongzhe wrote: > If a thread(A)'s TSYNC flag is set from seccomp(), then it will > synchronize its seccomp filter to other threads(B) in same thread > group. To avoid race condition, seccomp puts rmb() between > reading the mode and filter in seccomp check patch(in B thread). > As a result, every syscall's seccomp check is slowed down by the > memory barrier. > > However, we can optimize it by calling rmb() only when filter is > NULL and reading it again after the barrier, which means the rmb() > is called only once in thread lifetime. > > The 'filter is NULL' conditon means that it is the first time > attaching filter and is by other thread(A) using TSYNC flag. > In this case, thread B may read the filter first and mode later > in CPU out-of-order exection. After this time, the thread B's > mode is always be set, and there will no race condition with the > filter/bitmap. > > In addtion, we should puts a write memory barrier between writing > the filter and mode in smp_mb__before_atomic(), to avoid > the race condition in TSYNC case. > > Signed-off-by: wanghongzhe > --- > kernel/seccomp.c | 31 ++- > 1 file changed, 22 insertions(+), 9 deletions(-) > > diff --git a/kernel/seccomp.c b/kernel/seccomp.c > index 952dc1c90229..b944cb2b6b94 100644 > --- a/kernel/seccomp.c > +++ b/kernel/seccomp.c > @@ -397,8 +397,20 @@ static u32 seccomp_run_filters(const struct seccomp_data > *sd, > READ_ONCE(current->seccomp.filter); > > /* Ensure unexpected behavior doesn't result in failing open. */ > - if (WARN_ON(f == NULL)) > - return SECCOMP_RET_KILL_PROCESS; > + if (WARN_ON(f == NULL)) { > + /* > + * Make sure the first filter addtion (from another > + * thread using TSYNC flag) are seen. > + */ > + rmb(); > + > + /* Read again */ > + f = READ_ONCE(current->seccomp.filter); > + > + /* Ensure unexpected behavior doesn't result in failing open. 
*/ > + if (WARN_ON(f == NULL)) > + return SECCOMP_RET_KILL_PROCESS; > + } IMHO, a double WARN_ON() in the fallback flow is too much. Also, according to the description, this "f == NULL" case is caused by a race and not by a programming error, which is what WARN_ON() is intended to catch. Thanks
Re: [PATCH v2] printk: fix deadlock when kernel panic
On (21/02/06 13:41), Muchun Song wrote: > We found a deadlock bug on our server when the kernel panic. It can be > described in the following diagram. > > CPU0: CPU1: > panic rcu_dump_cpu_stacks > kdump_nmi_shootdown_cpus nmi_trigger_cpumask_backtrace > register_nmi_handler(crash_nmi_callback) printk_safe_flush > __printk_safe_flush > > raw_spin_lock_irqsave(_lock) > // send NMI to other processors > apic_send_IPI_allbutself(NMI_VECTOR) > // NMI interrupt, > dead loop > crash_nmi_callback At what point does this decrement num_online_cpus()? Any chance that panic CPU can apic_send_IPI_allbutself() and printk_safe_flush_on_panic() before num_online_cpus() becomes 1? > printk_safe_flush_on_panic > printk_safe_flush > __printk_safe_flush > // deadlock > raw_spin_lock_irqsave(_lock) -ss
Re: [PATCH v1] vdpa/mlx5: Restore the hardware used index after change map
On Mon, Feb 08, 2021 at 12:27:18PM +0800, Jason Wang wrote: > > On 2021/2/6 上午7:07, Si-Wei Liu wrote: > > > > > > On 2/3/2021 11:36 PM, Eli Cohen wrote: > > > When a change of memory map occurs, the hardware resources are destroyed > > > and then re-created again with the new memory map. In such case, we need > > > to restore the hardware available and used indices. The driver failed to > > > restore the used index which is added here. > > > > > > Also, since the driver also fails to reset the available and used > > > indices upon device reset, fix this here to avoid regression caused by > > > the fact that used index may not be zero upon device reset. > > > > > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 > > > devices") > > > Signed-off-by: Eli Cohen > > > --- > > > v0 -> v1: > > > Clear indices upon device reset > > > > > > drivers/vdpa/mlx5/net/mlx5_vnet.c | 18 ++ > > > 1 file changed, 18 insertions(+) > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > index 88dde3455bfd..b5fe6d2ad22f 100644 > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > @@ -87,6 +87,7 @@ struct mlx5_vq_restore_info { > > > u64 device_addr; > > > u64 driver_addr; > > > u16 avail_index; > > > + u16 used_index; > > > bool ready; > > > struct vdpa_callback cb; > > > bool restore; > > > @@ -121,6 +122,7 @@ struct mlx5_vdpa_virtqueue { > > > u32 virtq_id; > > > struct mlx5_vdpa_net *ndev; > > > u16 avail_idx; > > > + u16 used_idx; > > > int fw_state; > > > /* keep last in the struct */ > > > @@ -804,6 +806,7 @@ static int create_virtqueue(struct mlx5_vdpa_net > > > *ndev, struct mlx5_vdpa_virtque > > > obj_context = MLX5_ADDR_OF(create_virtio_net_q_in, in, > > > obj_context); > > > MLX5_SET(virtio_net_q_object, obj_context, hw_available_index, > > > mvq->avail_idx); > > > + MLX5_SET(virtio_net_q_object, obj_context, hw_used_index, > > > mvq->used_idx); > > > 
MLX5_SET(virtio_net_q_object, obj_context, > > > queue_feature_bit_mask_12_3, > > > get_features_12_3(ndev->mvdev.actual_features)); > > > vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, > > > virtio_q_context); > > > @@ -1022,6 +1025,7 @@ static int connect_qps(struct mlx5_vdpa_net > > > *ndev, struct mlx5_vdpa_virtqueue *m > > > struct mlx5_virtq_attr { > > > u8 state; > > > u16 available_index; > > > + u16 used_index; > > > }; > > > static int query_virtqueue(struct mlx5_vdpa_net *ndev, struct > > > mlx5_vdpa_virtqueue *mvq, > > > @@ -1052,6 +1056,7 @@ static int query_virtqueue(struct > > > mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueu > > > memset(attr, 0, sizeof(*attr)); > > > attr->state = MLX5_GET(virtio_net_q_object, obj_context, state); > > > attr->available_index = MLX5_GET(virtio_net_q_object, > > > obj_context, hw_available_index); > > > + attr->used_index = MLX5_GET(virtio_net_q_object, obj_context, > > > hw_used_index); > > > kfree(out); > > > return 0; > > > @@ -1535,6 +1540,16 @@ static void teardown_virtqueues(struct > > > mlx5_vdpa_net *ndev) > > > } > > > } > > > +static void clear_virtqueues(struct mlx5_vdpa_net *ndev) > > > +{ > > > + int i; > > > + > > > + for (i = ndev->mvdev.max_vqs - 1; i >= 0; i--) { > > > + ndev->vqs[i].avail_idx = 0; > > > + ndev->vqs[i].used_idx = 0; > > > + } > > > +} > > > + > > > /* TODO: cross-endian support */ > > > static inline bool mlx5_vdpa_is_little_endian(struct mlx5_vdpa_dev > > > *mvdev) > > > { > > > @@ -1610,6 +1625,7 @@ static int save_channel_info(struct > > > mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqu > > > return err; > > > ri->avail_index = attr.available_index; > > > + ri->used_index = attr.used_index; > > > ri->ready = mvq->ready; > > > ri->num_ent = mvq->num_ent; > > > ri->desc_addr = mvq->desc_addr; > > > @@ -1654,6 +1670,7 @@ static void restore_channels_info(struct > > > mlx5_vdpa_net *ndev) > > > continue; > > > mvq->avail_idx = ri->avail_index; > > > + mvq->used_idx = 
ri->used_index; > > > mvq->ready = ri->ready; > > > mvq->num_ent = ri->num_ent; > > > mvq->desc_addr = ri->desc_addr; > > > @@ -1768,6 +1785,7 @@ static void mlx5_vdpa_set_status(struct > > > vdpa_device *vdev, u8 status) > > > if (!status) { > > > mlx5_vdpa_info(mvdev, "performing device reset\n"); > > > teardown_driver(ndev); > > > + clear_virtqueues(ndev); > > The clearing looks fine at the first glance, as it aligns with the other > > state cleanups floating around at the same place. However, the thing is > > get_vq_state() is supposed to be called right after to get sync'ed with > > the latest internal avail_index from device while vq
Re: [PATCH] MAINTAINERS: repair file pattern in MEDIATEK IOMMU DRIVER
On Mon, 2021-02-08 at 07:10 +0100, Lukas Bulwahn wrote: > Commit 6af4873852c4 ("MAINTAINERS: Add entry for MediaTek IOMMU") mentions > the pattern 'drivers/iommu/mtk-iommu*', but the files are actually named > with an underscore, not with a hyphen. > > Hence, ./scripts/get_maintainer.pl --self-test=patterns complains: > > warning: no file matches F:drivers/iommu/mtk-iommu* > > Repair this minor typo in the file pattern. > > Signed-off-by: Lukas Bulwahn > --- > applies cleanly on next-20210205 > > Yong, please ack. +Joerg. sorry for the typo. Acked-by: Yong Wu > Will, please pick this minor fixup for your iommu-next tree. > > MAINTAINERS | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/MAINTAINERS b/MAINTAINERS > index 674f42375acf..6b507e8d7828 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -11200,7 +11200,7 @@ L:io...@lists.linux-foundation.org > L: linux-media...@lists.infradead.org (moderated for non-subscribers) > S: Supported > F: Documentation/devicetree/bindings/iommu/mediatek* > -F: drivers/iommu/mtk-iommu* > +F: drivers/iommu/mtk_iommu* > F: include/dt-bindings/memory/mt*-port.h > > MEDIATEK JPEG DRIVER
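The hyphen versus underscore mismatch can be reproduced with ordinary shell-style globbing. The sketch below uses POSIX fnmatch() as an approximation only: get_maintainer.pl is a Perl script with its own matching, and the helper name pattern_matches() and the file names used to exercise it are assumptions made for illustration.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* True when 'path' matches the shell-style glob 'pattern'. */
static bool pattern_matches(const char *pattern, const char *path)
{
	assert(pattern != NULL && path != NULL);
	/* fnmatch() returns 0 on a match and FNM_NOMATCH otherwise. */
	return fnmatch(pattern, path, 0) == 0;
}
```

Under these assumptions, "drivers/iommu/mtk-iommu*" does not match a file such as drivers/iommu/mtk_iommu.c, while "drivers/iommu/mtk_iommu*" does.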
Re: [PATCH v7 4/7] crypto: add ecc curve and expose them
Herbert, On Fri, Jan 29, 2021 at 02:00:04PM +1100, Herbert Xu wrote: > On Thu, Jan 28, 2021 at 09:49:41PM -0500, Stefan Berger wrote: > > > > In my patch series I initially had registered the akciphers under the names > > ecc-nist-p192 and ecc-nist-p256 but now, in V4, joined them together as > > 'ecdsa'. This may be too generic for a name. Maybe it should be called > > ecsda-nist for the NIST family. > > What I'm proposing is specifying the curve in the name as well, i.e., > ecdsa-nist-p192 instead of just ecdsa or ecdsa-nist. > > This simplifies the task of handling hardware that only supports a > subset of curves. So, if some implementation supports multiple curves (like EC-RDSA currently supports 5 curves), it should add 5 ecrdsa-{a,b,c,..} algorithms with actually the same top level implementation? Right? > There is a parallel discussion of exactly what curves we should > support in the kernel. Personally if there is a user in the kernel > for it then I'm happy to see it added. In your specific case, as > long as your use of the algorithm in x509 is accepted then I don't > have any problems with adding support in the Crypto API. > > Cheers, > -- > Email: Herbert Xu > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH v2 6/9] scsi: ufshpb: Add hpb dev reset response
On 2021-02-02 16:30, Avri Altman wrote: The spec does not define what is the host's recommended response when the device send hpb dev reset response (oper 0x2). We will update all active hpb regions: mark them and do that on the next read. Signed-off-by: Avri Altman --- drivers/scsi/ufs/ufshpb.c | 54 --- drivers/scsi/ufs/ufshpb.h | 1 + 2 files changed, 52 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 49c74de539b7..28e0025507a1 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -17,6 +17,7 @@ #include "../sd.h" #define WORK_PENDING 0 +#define RESET_PENDING 1 #define ACTIVATION_THRSHLD 4 /* 4 IOs */ #define EVICTION_THRSHLD (ACTIVATION_THRSHLD << 6) /* 256 IOs */ @@ -349,7 +350,8 @@ void ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) if (rgn->reads == ACTIVATION_THRSHLD) activate = true; spin_unlock_irqrestore(>rgn_lock, flags); - if (activate) { + if (activate || + test_and_clear_bit(RGN_FLAG_UPDATE, >rgn_flags)) { spin_lock_irqsave(>rsp_list_lock, flags); ufshpb_update_active_info(hpb, rgn_idx, srgn_idx); hpb->stats.rb_active_cnt++; @@ -1068,6 +1070,24 @@ void ufshpb_rsp_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) case HPB_RSP_DEV_RESET: dev_warn(>sdev_ufs_lu->sdev_dev, "UFS device lost HPB information during PM.\n"); + + if (hpb->is_hcm) { + struct ufshpb_lu *h; + struct scsi_device *sdev; + + shost_for_each_device(sdev, hba->host) { I haven't tested it yet, but this line should cause a recursive spin lock: in the current code base, ufshpb_rsp_upiu() is called with host_lock held. Regards, Can Guo.
+ h = sdev->hostdata; + if (!h) + continue; + + if (test_and_set_bit(RESET_PENDING, +>work_data_bits)) + continue; + + schedule_work(>ufshpb_lun_reset_work); + } + } + break; default: dev_notice(>sdev_ufs_lu->sdev_dev, @@ -1200,6 +1220,27 @@ static void ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb) spin_unlock_irqrestore(>rsp_list_lock, flags); } +static void ufshpb_reset_work_handler(struct work_struct *work) +{ + struct ufshpb_lu *hpb; + struct victim_select_info *lru_info; + struct ufshpb_region *rgn; + unsigned long flags; + + hpb = container_of(work, struct ufshpb_lu, ufshpb_lun_reset_work); + + lru_info = >lru_info; + + spin_lock_irqsave(>rgn_state_lock, flags); + + list_for_each_entry(rgn, _info->lh_lru_rgn, list_lru_rgn) + set_bit(RGN_FLAG_UPDATE, >rgn_flags); + + spin_unlock_irqrestore(>rgn_state_lock, flags); + + clear_bit(RESET_PENDING, >work_data_bits); +} + static void ufshpb_normalization_work_handler(struct work_struct *work) { struct ufshpb_lu *hpb; @@ -1392,6 +1433,8 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb) } else { rgn->rgn_state = HPB_RGN_INACTIVE; } + + rgn->rgn_flags = 0; } return 0; @@ -1502,9 +1545,12 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb) INIT_LIST_HEAD(>list_hpb_lu); INIT_WORK(>map_work, ufshpb_map_work_handler); - if (hpb->is_hcm) + if (hpb->is_hcm) { INIT_WORK(>ufshpb_normalization_work, ufshpb_normalization_work_handler); + INIT_WORK(>ufshpb_lun_reset_work, + ufshpb_reset_work_handler); + } hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache", sizeof(struct ufshpb_req), 0, 0, NULL); @@ -1591,8 +1637,10 @@ static void ufshpb_discard_rsp_lists(struct ufshpb_lu *hpb) static void ufshpb_cancel_jobs(struct ufshpb_lu *hpb) { - if (hpb->is_hcm) + if (hpb->is_hcm) { + cancel_work_sync(>ufshpb_lun_reset_work); cancel_work_sync(>ufshpb_normalization_work); + } cancel_work_sync(>map_work); } diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h 
index 71b082ee7876..e55892ceb3fc 100644 --- a/drivers/scsi/ufs/ufshpb.h +++ b/drivers/scsi/ufs/ufshpb.h @@ -184,6 +184,7 @@ struct ufshpb_lu { /* for selecting victim */ struct victim_select_info lru_info; struct work_struct ufshpb_normalization_work; + struct work_struct
Re: [RFC PATCH v4 00/17] virtio/vsock: introduce SOCK_SEQPACKET support
On 07.02.2021 19:20, Michael S. Tsirkin wrote: > On Sun, Feb 07, 2021 at 06:12:56PM +0300, Arseny Krasnov wrote: >> This patchset impelements support of SOCK_SEQPACKET for virtio >> transport. >> As SOCK_SEQPACKET guarantees to save record boundaries, so to >> do it, two new packet operations were added: first for start of record >> and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also, >> both operations carries metadata - to maintain boundaries and payload >> integrity. Metadata is introduced by adding special header with two >> fields - message count and message length: >> >> struct virtio_vsock_seq_hdr { >> __le32 msg_cnt; >> __le32 msg_len; >> } __attribute__((packed)); >> >> This header is transmitted as payload of SEQ_BEGIN and SEQ_END >> packets(buffer of second virtio descriptor in chain) in the same way as >> data transmitted in RW packets. Payload was chosen as buffer for this >> header to avoid touching first virtio buffer which carries header of >> packet, because someone could check that size of this buffer is equal >> to size of packet header. To send record, packet with start marker is >> sent first(it's header contains length of record and counter), then >> counter is incremented and all data is sent as usual 'RW' packets and >> finally SEQ_END is sent(it also carries counter of message, which is >> counter of SEQ_BEGIN + 1), also after sedning SEQ_END counter is >> incremented again. On receiver's side, length of record is known from >> packet with start record marker. To check that no packets were dropped >> by transport, counters of two sequential SEQ_BEGIN and SEQ_END are >> checked(counter of SEQ_END must be bigger that counter of SEQ_BEGIN by >> 1) and length of data between two markers is compared to length in >> SEQ_BEGIN header. >> Now as packets of one socket are not reordered neither on >> vsock nor on vhost transport layers, such markers allows to restore >> original record on receiver's side. 
If user's buffer is smaller that >> record length, when all out of size data is dropped. >> Maximum length of datagram is not limited as in stream socket, >> because same credit logic is used. Difference with stream socket is >> that user is not woken up until whole record is received or error >> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags. >> Tests also implemented. >> >> Arseny Krasnov (17): >> af_vsock: update functions for connectible socket >> af_vsock: separate wait data loop >> af_vsock: separate receive data loop >> af_vsock: implement SEQPACKET receive loop >> af_vsock: separate wait space loop >> af_vsock: implement send logic for SEQPACKET >> af_vsock: rest of SEQPACKET support >> af_vsock: update comments for stream sockets >> virtio/vsock: dequeue callback for SOCK_SEQPACKET >> virtio/vsock: fetch length for SEQPACKET record >> virtio/vsock: add SEQPACKET receive logic >> virtio/vsock: rest of SOCK_SEQPACKET support >> virtio/vsock: setup SEQPACKET ops for transport >> vhost/vsock: setup SEQPACKET ops for transport >> vsock_test: add SOCK_SEQPACKET tests >> loopback/vsock: setup SEQPACKET ops for transport >> virtio/vsock: simplify credit update function API >> >> drivers/vhost/vsock.c | 8 +- >> include/linux/virtio_vsock.h| 15 + >> include/net/af_vsock.h | 9 + >> include/uapi/linux/virtio_vsock.h | 16 + >> net/vmw_vsock/af_vsock.c| 588 +++--- >> net/vmw_vsock/virtio_transport.c| 5 + >> net/vmw_vsock/virtio_transport_common.c | 316 ++-- >> net/vmw_vsock/vsock_loopback.c | 5 + >> tools/testing/vsock/util.c | 32 +- >> tools/testing/vsock/util.h | 3 + >> tools/testing/vsock/vsock_test.c| 126 + >> 11 files changed, 895 insertions(+), 228 deletions(-) >> >> TODO: >> - What to do, when server doesn't support SOCK_SEQPACKET. In current >>implementation RST is replied in the same way when listening port >>is not found. 
I think that current RST is enough,because case when >>server doesn't support SEQ_PACKET is same when listener missed(e.g. >>no listener in both cases). >- virtio spec patch Ok > >> v3 -> v4: >> - callbacks for loopback transport >> - SEQPACKET specific metadata moved from packet header to payload >>and called 'virtio_vsock_seq_hdr' >> - record integrity check: >>1) SEQ_END operation was added, which marks end of record. >>2) Both SEQ_BEGIN and SEQ_END carries counter which is incremented >> on every marker send. >> - af_vsock.c: socket operations for STREAM and SEQPACKET call same >>functions instead of having own "gates" differs only by names: >>'vsock_seqpacket/stream_getsockopt()' now replaced with >>'vsock_connectible_getsockopt()'. >> - af_vsock.c: 'seqpacket_dequeue' callback returns error
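The record integrity check described in the quoted cover letter can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the code from the series: struct seq_hdr is a host-endian stand-in for the quoted virtio_vsock_seq_hdr, the function name seq_record_ok() is hypothetical, and accepting counter wrap-around is a choice made here rather than something the cover letter states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host-endian copy of the two fields named in the cover letter. */
struct seq_hdr {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/*
 * Returns true when a record delimited by 'begin' (SEQ_BEGIN payload)
 * and 'end' (SEQ_END payload) looks intact: the SEQ_END counter is
 * exactly one more than the SEQ_BEGIN counter, and the number of bytes
 * received between the two markers equals the length announced in
 * SEQ_BEGIN.
 */
static bool seq_record_ok(const struct seq_hdr *begin,
			  const struct seq_hdr *end,
			  uint32_t bytes_received)
{
	assert(begin != NULL && end != NULL);

	/* Assumption: the counter may wrap, so compare modulo 2^32. */
	if (end->msg_cnt != (uint32_t)(begin->msg_cnt + 1u))
		return false;

	return bytes_received == begin->msg_len;
}
```

A receiver along these lines would treat a false result as a dropped or corrupted record; how the series actually reports that condition is not shown here.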
Re: [PATCH] i2c: mv64xxx: Fix check for missing clock
On 2/8/21 12:28 AM, Samuel Holland wrote: > In commit e5c02cf54154 ("i2c: mv64xxx: Add runtime PM support"), error > pointers to optional clocks were replaced by NULL to simplify the resume > callback implementation. However, that commit missed that the IS_ERR > check in mv64xxx_of_config should be replaced with a NULL check. As a > result, the check always passes, even for an invalid device tree. Sorry, please ignore this unrelated patch. I accidentally copied it to the wrong directory before sending this series. Samuel
Re: linux-next: build warning after merge of the v4l-dvb tree
On Mon, 8 Feb 2021 11:32:08 +1100 Stephen Rothwell wrote: > Hi all, > > After merging the v4l-dvb tree, today's linux-next build (x86_64 > allmodconfig) produced this warning: > > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_set_serial_link' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_configure_i2c' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_set_high_threshold' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_configure_gmsl_link' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_set_gpios' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_clear_gpios' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_enable_gpios' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_disable_gpios' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_verify_id' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_set_address' exported twice. 
Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_set_deserializer_address' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > WARNING: modpost: drivers/media/i2c/rdacm21-camera_module: > 'max9271_set_translation' exported twice. Previous export was in > drivers/media/i2c/rdacm20-camera_module.ko > > Introduced by commit > > a59f853b3b4b ("media: i2c: Add driver for RDACM21 camera module") > It seems to be due to a Makefile mess: drivers/media/i2c/Makefile:rdacm20-camera_module-objs := rdacm20.o max9271.o drivers/media/i2c/Makefile:rdacm21-camera_module-objs := rdacm21.o max9271.o Neither driver should include max9271.o among its objects; instead, each should address the max9271 dependency via Kconfig. Thanks, Mauro
[PATCH] drivers: firmware: xilinx: Fix dereferencing freed memory
From: Tejas Patel Fix smatch warning: drivers/firmware/xilinx/zynqmp.c:1288 zynqmp_firmware_remove() error: dereferencing freed memory 'feature_data' Use hash_for_each_safe for safe removal of hash entry. Fixes: acfdd18591ea ("firmware: xilinx: Use hash-table for api feature check") Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: Tejas Patel Signed-off-by: Rajan Vaja --- drivers/firmware/xilinx/zynqmp.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/firmware/xilinx/zynqmp.c b/drivers/firmware/xilinx/zynqmp.c index 7eb9958..83082e2 100644 --- a/drivers/firmware/xilinx/zynqmp.c +++ b/drivers/firmware/xilinx/zynqmp.c @@ -2,7 +2,7 @@ /* * Xilinx Zynq MPSoC Firmware layer * - * Copyright (C) 2014-2020 Xilinx, Inc. + * Copyright (C) 2014-2021 Xilinx, Inc. * * Michal Simek * Davorin Mista @@ -1280,12 +1280,13 @@ static int zynqmp_firmware_probe(struct platform_device *pdev) static int zynqmp_firmware_remove(struct platform_device *pdev) { struct pm_api_feature_data *feature_data; + struct hlist_node *tmp; int i; mfd_remove_devices(&pdev->dev); zynqmp_pm_api_debugfs_exit(); - hash_for_each(pm_api_features_map, i, feature_data, hentry) { + hash_for_each_safe(pm_api_features_map, i, tmp, feature_data, hentry) { hash_del(&feature_data->hentry); kfree(feature_data); } -- 2.7.4
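The reason the "_safe" iterator is needed can be shown with a minimal userspace sketch. It does not use the kernel's hashtable API: struct node, push() and free_all() are hypothetical stand-ins, and a single linked list takes the place of a hash bucket. The point is only that the successor must be saved before the current entry is freed, which is the role of the extra 'tmp' cursor passed to hash_for_each_safe().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hash bucket entry; not the kernel's hlist_node. */
struct node {
	int value;
	struct node *next;
};

/* Push a new node at the head of the list; returns the new head. */
static struct node *push(struct node *head, int value)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->value = value;
	n->next = head;
	return n;
}

/*
 * Free every node and return how many were freed.
 *
 * The unsafe form would advance with "n = n->next" after free(n),
 * reading freed memory. Saving the successor in 'tmp' before freeing
 * avoids that.
 */
static size_t free_all(struct node *head)
{
	struct node *n, *tmp;
	size_t freed = 0;

	for (n = head; n != NULL; n = tmp) {
		tmp = n->next;	/* grab the successor while n is still valid */
		free(n);
		freed++;
	}
	return freed;
}
```

The same pattern applies to any traversal that removes the entry it is currently visiting.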
[PATCH net-next RESEND 5/5] net: stmmac: dwmac-sun8i: Add a shutdown callback
The Ethernet MAC and PHY are usually major consumers of power on boards which may not be able to fully power off (those with no PMIC). Powering down the MAC and internal PHY saves power while these boards are "off". Reviewed-by: Chen-Yu Tsai Signed-off-by: Samuel Holland --- drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c index 4638d4203af5..926e8d5e8963 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c @@ -1282,6 +1282,15 @@ static int sun8i_dwmac_remove(struct platform_device *pdev) return 0; } +static void sun8i_dwmac_shutdown(struct platform_device *pdev) +{ + struct net_device *ndev = platform_get_drvdata(pdev); + struct stmmac_priv *priv = netdev_priv(ndev); + struct sunxi_priv_data *gmac = priv->plat->bsp_priv; + + sun8i_dwmac_exit(pdev, gmac); +} + static const struct of_device_id sun8i_dwmac_match[] = { { .compatible = "allwinner,sun8i-h3-emac", .data = _variant_h3 }, @@ -1302,6 +1311,7 @@ MODULE_DEVICE_TABLE(of, sun8i_dwmac_match); static struct platform_driver sun8i_dwmac_driver = { .probe = sun8i_dwmac_probe, .remove = sun8i_dwmac_remove, + .shutdown = sun8i_dwmac_shutdown, .driver = { .name = "dwmac-sun8i", .pm = _pltfr_pm_ops, -- 2.26.2
[PATCH net-next RESEND 4/5] net: stmmac: dwmac-sun8i: Minor probe function cleanup
Adjust the spacing and use an explicit "return 0" in the success path to make the function easier to parse. Reviewed-by: Chen-Yu Tsai Signed-off-by: Samuel Holland --- drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c index 0e8d88417251..4638d4203af5 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c @@ -1227,6 +1227,7 @@ static int sun8i_dwmac_probe(struct platform_device *pdev) ndev = dev_get_drvdata(>dev); priv = netdev_priv(ndev); + /* The mux must be registered after parent MDIO * so after stmmac_dvr_probe() */ @@ -1245,7 +1246,8 @@ static int sun8i_dwmac_probe(struct platform_device *pdev) goto dwmac_remove; } - return ret; + return 0; + dwmac_mux: reset_control_put(gmac->rst_ephy); clk_put(gmac->ephy_clk); -- 2.26.2
[PATCH net-next RESEND 0/5] dwmac-sun8i cleanup and shutdown hook
These patches clean up some things I noticed while fixing suspend/resume behavior. The first four are minor code improvements. The last one adds a shutdown hook to minimize power consumption on boards without a PMIC. Now that the fixes series is merged, I'm resending this series rebased on top of net-next and with Chen-Yu's Reviewed-by tags. Samuel Holland (5): net: stmmac: dwmac-sun8i: Return void from PHY unpower net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check net: stmmac: dwmac-sun8i: Use reset_control_reset net: stmmac: dwmac-sun8i: Minor probe function cleanup net: stmmac: dwmac-sun8i: Add a shutdown callback .../net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 31 --- 1 file changed, 19 insertions(+), 12 deletions(-) -- 2.26.2
[PATCH net-next RESEND 2/5] net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check
sun8i_dwmac_unpower_internal_phy already checks if the PHY is powered, so there is no need to do it again here. Reviewed-by: Chen-Yu Tsai Signed-off-by: Samuel Holland --- drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c index 8e505019adf8..3c3d0b99d3e8 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c @@ -1018,10 +1018,8 @@ static void sun8i_dwmac_exit(struct platform_device *pdev, void *priv) { struct sunxi_priv_data *gmac = priv; - if (gmac->variant->soc_has_internal_phy) { - if (gmac->internal_phy_powered) - sun8i_dwmac_unpower_internal_phy(gmac); - } + if (gmac->variant->soc_has_internal_phy) + sun8i_dwmac_unpower_internal_phy(gmac); clk_disable_unprepare(gmac->tx_clk); -- 2.26.2
[PATCH net-next RESEND 1/5] net: stmmac: dwmac-sun8i: Return void from PHY unpower
This is a deinitialization function that always returned zero, and that return value was always ignored. Have it return void instead. Reviewed-by: Chen-Yu Tsai Signed-off-by: Samuel Holland --- drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c index a5e0eff4a387..8e505019adf8 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c @@ -820,15 +820,14 @@ static int sun8i_dwmac_power_internal_phy(struct stmmac_priv *priv) return 0; } -static int sun8i_dwmac_unpower_internal_phy(struct sunxi_priv_data *gmac) +static void sun8i_dwmac_unpower_internal_phy(struct sunxi_priv_data *gmac) { if (!gmac->internal_phy_powered) - return 0; + return; clk_disable_unprepare(gmac->ephy_clk); reset_control_assert(gmac->rst_ephy); gmac->internal_phy_powered = false; - return 0; } /* MDIO multiplexing switch function -- 2.26.2
[PATCH net-next RESEND 3/5] net: stmmac: dwmac-sun8i: Use reset_control_reset
Use the appropriate function instead of reimplementing it, and update the error message to match the code. Reviewed-by: Chen-Yu Tsai Signed-off-by: Samuel Holland --- drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c index 3c3d0b99d3e8..0e8d88417251 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c @@ -806,11 +806,9 @@ static int sun8i_dwmac_power_internal_phy(struct stmmac_priv *priv) /* Make sure the EPHY is properly reseted, as U-Boot may leave * it at deasserted state, and thus it may fail to reset EMAC. */ - reset_control_assert(gmac->rst_ephy); - - ret = reset_control_deassert(gmac->rst_ephy); + ret = reset_control_reset(gmac->rst_ephy); if (ret) { - dev_err(priv->device, "Cannot deassert internal phy\n"); + dev_err(priv->device, "Cannot reset internal PHY\n"); clk_disable_unprepare(gmac->ephy_clk); return ret; } -- 2.26.2
Re: [PATCH v2 1/1] riscv/kasan: add KASAN_VMALLOC support
Hi Nylon, On 1/22/21 at 10:56 PM, Palmer Dabbelt wrote: On Fri, 15 Jan 2021 21:58:35 PST (-0800), nyl...@andestech.com wrote: It references to x86/s390 architecture. >> So, it doesn't map the early shadow page to cover VMALLOC space. Prepopulate top level page table for the range that would otherwise be empty. lower levels are filled dynamically upon memory allocation while booting. I think we can improve the changelog a bit here with something like that: "KASAN vmalloc space used to be mapped using kasan early shadow page. KASAN_VMALLOC requires the top-level of the kernel page table to be properly populated, lower levels being filled dynamically upon memory allocation at runtime." Signed-off-by: Nylon Chen Signed-off-by: Nick Hu --- arch/riscv/Kconfig | 1 + arch/riscv/mm/kasan_init.c | 57 +- 2 files changed, 57 insertions(+), 1 deletion(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 81b76d44725d..15a2c8088bbe 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -57,6 +57,7 @@ config RISCV select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_JUMP_LABEL_RELATIVE select HAVE_ARCH_KASAN if MMU && 64BIT + select HAVE_ARCH_KASAN_VMALLOC if MMU && 64BIT select HAVE_ARCH_KGDB select HAVE_ARCH_KGDB_QXFER_PKT select HAVE_ARCH_MMAP_RND_BITS if MMU diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c index 12ddd1f6bf70..4b9149f963d3 100644 --- a/arch/riscv/mm/kasan_init.c +++ b/arch/riscv/mm/kasan_init.c @@ -9,6 +9,19 @@ #include #include #include +#include + +static __init void *early_alloc(size_t size, int node) +{ + void *ptr = memblock_alloc_try_nid(size, size, + __pa(MAX_DMA_ADDRESS), MEMBLOCK_ALLOC_ACCESSIBLE, node); + + if (!ptr) + panic("%pS: Failed to allocate %zu bytes align=%zx nid=%d from=%llx\n", + __func__, size, size, node, (u64)__pa(MAX_DMA_ADDRESS)); + + return ptr; +} extern pgd_t early_pg_dir[PTRS_PER_PGD]; asmlinkage void __init kasan_early_init(void) @@ -83,6 +96,40 @@ static void __init populate(void *start, void *end) 
memset(start, 0, end - start); } +void __init kasan_shallow_populate(void *start, void *end) +{ + unsigned long vaddr = (unsigned long)start & PAGE_MASK; + unsigned long vend = PAGE_ALIGN((unsigned long)end); + unsigned long pfn; + int index; + void *p; + pud_t *pud_dir, *pud_k; + pgd_t *pgd_dir, *pgd_k; + p4d_t *p4d_dir, *p4d_k; + + while (vaddr < vend) { + index = pgd_index(vaddr); + pfn = csr_read(CSR_SATP) & SATP_PPN; At this point in the boot process, we know that we use swapper_pg_dir so no need to read SATP. + pgd_dir = (pgd_t *)pfn_to_virt(pfn) + index; Here, this pgd_dir assignment is overwritten 2 lines below, so no need for it. + pgd_k = init_mm.pgd + index; + pgd_dir = pgd_offset_k(vaddr); pgd_offset_k(vaddr) = init_mm.pgd + pgd_index(vaddr) so pgd_k == pgd_dir. + set_pgd(pgd_dir, *pgd_k); + + p4d_dir = p4d_offset(pgd_dir, vaddr); + p4d_k = p4d_offset(pgd_k, vaddr); + + vaddr = (vaddr + PUD_SIZE) & PUD_MASK; Why do you increase vaddr *before* populating the first one ? And pud_addr_end does that properly: it returns the next pud address if it does not go beyond end address to map. + pud_dir = pud_offset(p4d_dir, vaddr); + pud_k = pud_offset(p4d_k, vaddr); + + if (pud_present(*pud_dir)) { + p = early_alloc(PAGE_SIZE, NUMA_NO_NODE); + pud_populate(_mm, pud_dir, p); init_mm is not needed here. + } + vaddr += PAGE_SIZE; Why do you need to add PAGE_SIZE ? vaddr already points to the next pud. It seems like this patch tries to populate userspace page table whereas at this point in the boot process, only swapper_pg_dir is used or am I missing something ? 
Thanks, Alex + } +} + void __init kasan_init(void) { phys_addr_t _start, _end; @@ -90,7 +137,15 @@ void __init kasan_init(void) kasan_populate_early_shadow((void *)KASAN_SHADOW_START, (void *)kasan_mem_to_shadow((void *) - VMALLOC_END)); + VMEMMAP_END)); + if (IS_ENABLED(CONFIG_KASAN_VMALLOC)) + kasan_shallow_populate( + (void *)kasan_mem_to_shadow((void *)VMALLOC_START), + (void *)kasan_mem_to_shadow((void *)VMALLOC_END)); + else + kasan_populate_early_shadow( + (void *)kasan_mem_to_shadow((void *)VMALLOC_START), + (void *)kasan_mem_to_shadow((void *)VMALLOC_END)); for_each_mem_range(i, &_start, &_end) { void *start = (void *)_start; > Thanks, this is on for-next.
[PATCH] i2c: mv64xxx: Fix check for missing clock
In commit e5c02cf54154 ("i2c: mv64xxx: Add runtime PM support"), error pointers to optional clocks were replaced by NULL to simplify the resume callback implementation. However, that commit missed that the IS_ERR check in mv64xxx_of_config should be replaced with a NULL check. As a result, the check always passes, even for an invalid device tree. Fixes: e5c02cf54154 ("i2c: mv64xxx: Add runtime PM support") Reported-by: Dan Carpenter Signed-off-by: Samuel Holland --- drivers/i2c/busses/i2c-mv64xxx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-mv64xxx.c b/drivers/i2c/busses/i2c-mv64xxx.c index b03c344323d1..c590d36b5fd1 100644 --- a/drivers/i2c/busses/i2c-mv64xxx.c +++ b/drivers/i2c/busses/i2c-mv64xxx.c @@ -813,7 +813,7 @@ mv64xxx_of_config(struct mv64xxx_i2c_data *drv_data, * need to know tclk in order to calculate bus clock * factors. */ - if (IS_ERR(drv_data->clk)) { + if (!drv_data->clk) { rc = -ENODEV; goto out; } -- 2.26.2
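To illustrate why the IS_ERR check described in this changelog can never fire once error pointers have been replaced by NULL, here is a minimal userspace sketch. The helpers err_ptr(), is_err(), normalize_optional_clk(), check_clk_old() and check_clk_new() are hypothetical stand-ins written for this example; they only mimic the kernel's ERR_PTR()/IS_ERR() convention and are not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer convention (assumed). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	/* Only small negative errno values are encoded as pointers. */
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static inline bool is_err(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses; NULL does not. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical: models "error pointers to optional clocks were replaced by NULL". */
static void *normalize_optional_clk(void *clk)
{
	return is_err(clk) ? NULL : clk;
}

/* Old-style check: never true for a normalized (NULL) clock. */
static int check_clk_old(const void *clk)
{
	return is_err(clk) ? -ENODEV : 0;
}

/* New-style check: a NULL clock is reported as missing. */
static int check_clk_new(const void *clk)
{
	return clk ? 0 : -ENODEV;
}
```

With a clock lookup that failed and was then normalized, check_clk_old() returns 0 while check_clk_new() returns -ENODEV, which is the behaviour difference the one-line fix relies on.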
Incorrect RSS page accounting of processes with multiple mapping pages
Hi, I believe there is an unexpected RES page accounting when doing multiple page mapping. The sample code was pasted below. In the sample code, The same 1g pages are mapped for three times. And it is expected that the process gets 1g RES instead of 3g RES pages(top command showed result). memfd.c #include #include #include #include #include #include #include "memfd.h" const size_t SIZE = 1024*1024*1024; // 1g int main() { long step=0; long UNITS = SIZE / 4; int fd = memfd_create("testmemfd", MFD_ALLOW_SEALING); // replacing the MFD_ALLOW_SEALING flag with 0 doesn't seem to change anything if (fd == -1) { perror("memfd_create"); } if (ftruncate(fd, SIZE) == -1) { perror("ftruncate"); } void * data1 = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); if (data1 == MAP_FAILED) { perror("mmap"); } void * data2 = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); if (data2 == MAP_FAILED) { perror("mmap"); } void * data3 = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); if (data3 == MAP_FAILED) { perror("mmap"); } //close(fd); // removing close(fd) or the mmap() code doesn't seem to change anything printf("%d\n", fd); while (1) { step = step % UNITS; ((int *)data1)[step] = 1; ((int *)data2)[step] = 2; ((int *)data3)[step] = 3; step++; } return 0; } memfd.h #ifndef _MEMFD_H #define _MEMFD_H /* * * SPDX-License-Identifier: Unlicense * * ** No glibc wrappers exist for memfd_create(2), so provide our own. * * * * Also define memfd fcntl sealing macros. While they are already * * defined in the kernel header file , that file as ** a whole conflicts with the original glibc header . 
* */ static inline int memfd_create(const char *name, unsigned int flags) { return syscall(__NR_memfd_create, name, flags); } #ifndef F_LINUX_SPECIFIC_BASE #define F_LINUX_SPECIFIC_BASE 1024 #endif #ifndef F_ADD_SEALS #define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9) #define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10) #define F_SEAL_SEAL 0x0001 /* prevent further seals from being set */ #define F_SEAL_SHRINK 0x0002 /* prevent file from shrinking */ #define F_SEAL_GROW 0x0004 /* prevent file from growing */ #define F_SEAL_WRITE0x0008 /* prevent writes */ #endif #endif /* _MEMFD_H */
Re: [PATCH] fs/buffer.c: Add checking buffer head stat before clear
Hi Andrew, On 2021/2/6 7:45, Andrew Morton wrote: > On Wed, 3 Feb 2021 14:14:50 +0800 Shaokun Zhang > wrote: > >> From: Yang Guo >> >> clear_buffer_new() is used to clear buffer new stat. When PAGE_SIZE >> is 64K, most buffer heads in the list are not needed to clear. >> clear_buffer_new() has an expensive atomic modification operation, >> Let's add checking buffer head before clear it as __block_write_begin_int >> does which is good for performance. > > Did this produce any measurable improvement? It has been tested on a Huawei Kunpeng 920, which is an ARM64 platform, and the test command is below: numactl --cpunodebind=0 --membind=0 fio -name=randwrite -numjobs=16 -filename=/mnt/test1 -rw=randwrite -ioengine=libaio -direct=0 -iodepth=64 -sync=0 -norandommap -group_reporting -runtime=60 -time_based -bs=4k -size=5G The test result before patch: WRITE: bw=930MiB/s (976MB/s), 930MiB/s-930MiB/s (976MB/s-976MB/s), io=54.5GiB (58.5GB), run=60001-60001msec The test result after patch: WRITE: bw=958MiB/s (1005MB/s), 958MiB/s-958MiB/s (1005MB/s-1005MB/s), io=56.1GiB (60.3GB), run=60001-60001msec > > Perhaps we should give clear_buffer_x() the same optimization as > set_buffer_x()? > Good catch, but we looked into it further: if we did the same as set_buffer_x(), much more code would need to be changed. For example, ext4_wait_block_bitmap already does a sanity check using buffer_new, and clear_buffer_new would then check it again. Thanks, Shaokun > > static __always_inline void set_buffer_##name(struct buffer_head *bh) \ > { \ > if (!test_bit(BH_##bit, &(bh)->b_state))\ > set_bit(BH_##bit, &(bh)->b_state); \ > } \ > static __always_inline void clear_buffer_##name(struct buffer_head *bh) > \ > { \ > clear_bit(BH_##bit, &(bh)->b_state);\ > } \ > > > . >
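The test-before-clear idea discussed in this thread can be sketched in userspace with C11 atomics. This is only an illustration under assumptions: struct flags, bit_mask(), flag_test(), flag_clear() and flag_clear_if_set() are hypothetical names, and the state word merely stands in for a buffer head's b_state. Note that the test and the clear are two separate steps, so the pair is not atomic as a whole; the sketch assumes, as the patch appears to, that this is acceptable for the caller.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag word standing in for a buffer head's state bits. */
struct flags {
	atomic_ulong state;
};

static unsigned long bit_mask(unsigned int bit)
{
	assert(bit < sizeof(unsigned long) * CHAR_BIT);
	return 1UL << bit;
}

/* Plain relaxed load: no read-modify-write, so no exclusive cache-line access. */
static bool flag_test(struct flags *f, unsigned int bit)
{
	return (atomic_load_explicit(&f->state, memory_order_relaxed) &
		bit_mask(bit)) != 0;
}

/* Unconditional clear: always issues an atomic read-modify-write. */
static void flag_clear(struct flags *f, unsigned int bit)
{
	atomic_fetch_and(&f->state, ~bit_mask(bit));
}

/* Test first; returns true only when the atomic clear was actually issued. */
static bool flag_clear_if_set(struct flags *f, unsigned int bit)
{
	if (!flag_test(f, bit))
		return false;
	flag_clear(f, bit);
	return true;
}
```

When most flags are already clear, as the changelog says is the case with 64K pages, flag_clear_if_set() mostly takes the cheap load-only path.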
[PATCH] MAINTAINERS: repair file pattern in MEDIATEK IOMMU DRIVER
Commit 6af4873852c4 ("MAINTAINERS: Add entry for MediaTek IOMMU") mentions the pattern 'drivers/iommu/mtk-iommu*', but the files are actually named with an underscore, not with a hyphen. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains: warning: no file matches F:drivers/iommu/mtk-iommu* Repair this minor typo in the file pattern. Signed-off-by: Lukas Bulwahn --- applies cleanly on next-20210205 Yong, please ack. Will, please pick this minor fixup for your iommu-next tree. MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 674f42375acf..6b507e8d7828 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11200,7 +11200,7 @@ L: io...@lists.linux-foundation.org L: linux-media...@lists.infradead.org (moderated for non-subscribers) S: Supported F: Documentation/devicetree/bindings/iommu/mediatek* -F: drivers/iommu/mtk-iommu* +F: drivers/iommu/mtk_iommu* F: include/dt-bindings/memory/mt*-port.h MEDIATEK JPEG DRIVER -- 2.17.1
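The mismatch between the hyphenated pattern and the underscore file names can be approximated with POSIX fnmatch(3). This is a rough sketch only: get_maintainer.pl does its own pattern handling, pattern_matches() is a hypothetical helper, and the file names passed to it below are used purely for illustration.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the shell-style pattern matches the given path. */
static bool pattern_matches(const char *pattern, const char *path)
{
	assert(pattern != NULL && path != NULL);
	return fnmatch(pattern, path, 0) == 0;
}
```

For example, pattern_matches("drivers/iommu/mtk-iommu*", "drivers/iommu/mtk_iommu.c") is false because '-' and '_' differ, while the repaired pattern "drivers/iommu/mtk_iommu*" matches the same path.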
[PATCH] nfc: st-nci: Remove unnecessary variable
From: wengjianfeng The variable r is initialized to 0 at the beginning of the function and is never reassigned before the function returns it. Therefore, there is no need to define r; just return 0 directly at the end of the function. Signed-off-by: wengjianfeng --- drivers/nfc/st-nci/se.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/nfc/st-nci/se.c b/drivers/nfc/st-nci/se.c index 807eae0..1cba8f6 100644 --- a/drivers/nfc/st-nci/se.c +++ b/drivers/nfc/st-nci/se.c @@ -276,7 +276,6 @@ static int st_nci_hci_apdu_reader_event_received(struct nci_dev *ndev, u8 event, struct sk_buff *skb) { - int r = 0; struct st_nci_info *info = nci_get_drvdata(ndev); pr_debug("apdu reader gate event: %x\n", event); @@ -298,7 +297,7 @@ static int st_nci_hci_apdu_reader_event_received(struct nci_dev *ndev, } kfree_skb(skb); - return r; + return 0; } /* -- 1.9.1
[RFC PATCH v2] MIPS: tlbex: Avoid access invalid address when pmd is modifying
From: wangrui When modifying a pmd through THP, an invalid address access may occur in the TLB handler, because the handler loads the value of the pmd twice: once for the huge page test and once as the base address for loading the pte. These two values may differ: CPU 0: (app) CPU 1: (khugepaged) 1: scan hit: set pmd to invalid_pmd_table (pmd_clear) 2: tlb invalid: handle_tlbl, load pmd for huge page testing, is not a huge page 3: collapsed: set pmd to huge page 4: handle_tlbl: load pmd again to load the pte (as base address); the value of pmd is not an address, so an invalid address is accessed! This patch avoids the inconsistency between the two memory loads by reusing the result of a single load. Signed-off-by: hev Signed-off-by: wangrui --- arch/mips/mm/tlbex.c | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c index a7521b8f7658..cfb98290ce06 100644 --- a/arch/mips/mm/tlbex.c +++ b/arch/mips/mm/tlbex.c @@ -720,14 +720,14 @@ static void build_huge_tlb_write_entry(u32 **p, struct uasm_label **l, * Check if Huge PTE is present, if so then jump to LABEL. 
*/ static void -build_is_huge_pte(u32 **p, struct uasm_reloc **r, unsigned int tmp, - unsigned int pmd, int lid) +build_is_huge_pte(u32 **p, struct uasm_reloc **r, unsigned int out, + unsigned int tmp, unsigned int pmd, int lid) { - UASM_i_LW(p, tmp, 0, pmd); + UASM_i_LW(p, out, 0, pmd); if (use_bbit_insns()) { - uasm_il_bbit1(p, r, tmp, ilog2(_PAGE_HUGE), lid); + uasm_il_bbit1(p, r, out, ilog2(_PAGE_HUGE), lid); } else { - uasm_i_andi(p, tmp, tmp, _PAGE_HUGE); + uasm_i_andi(p, tmp, out, _PAGE_HUGE); uasm_il_bnez(p, r, tmp, lid); } } @@ -1103,7 +1103,6 @@ EXPORT_SYMBOL_GPL(build_update_entries); struct mips_huge_tlb_info { int huge_pte; int restore_scratch; - bool need_reload_pte; }; static struct mips_huge_tlb_info @@ -1118,7 +1117,6 @@ build_fast_tlb_refill_handler (u32 **p, struct uasm_label **l, rv.huge_pte = scratch; rv.restore_scratch = 0; - rv.need_reload_pte = false; if (check_for_high_segbits) { UASM_i_MFC0(p, tmp, C0_BADVADDR); @@ -1323,7 +1321,6 @@ static void build_r4000_tlb_refill_handler(void) } else { htlb_info.huge_pte = K0; htlb_info.restore_scratch = 0; - htlb_info.need_reload_pte = true; vmalloc_mode = refill_noscratch; /* * create the plain linear handler @@ -1349,19 +1346,21 @@ static void build_r4000_tlb_refill_handler(void) #endif #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT - build_is_huge_pte(, , K0, K1, label_tlb_huge_update); + build_is_huge_pte(, , K0, K1, K1, label_tlb_huge_update); +#else + UASM_i_LW(, K0, 0, K1); #endif - build_get_ptep(, K0, K1); - build_update_entries(, K0, K1); + GET_CONTEXT(, K1); /* get context reg */ + build_adjust_context(, K1); + UASM_i_ADDU(, K0, K0, K1); /* add in offset */ + build_update_entries(, K1, K0); build_tlb_write_entry(, , , tlb_random); uasm_l_leave(, p); uasm_i_eret(); /* return from trap */ } #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT uasm_l_tlb_huge_update(, p); - if (htlb_info.need_reload_pte) - UASM_i_LW(, htlb_info.huge_pte, 0, K1); build_huge_update_entries(, htlb_info.huge_pte, K1); 
build_huge_tlb_write_entry(, , , K0, tlb_random, htlb_info.restore_scratch); @@ -2065,14 +2064,15 @@ build_r4000_tlbchange_handler_head(u32 **p, struct uasm_label **l, * instead contains the tlb pte. Check the PAGE_HUGE bit and * see if we need to jump to huge tlb processing. */ - build_is_huge_pte(p, r, wr.r1, wr.r2, label_tlb_huge_update); + build_is_huge_pte(p, r, wr.r3, wr.r1, wr.r2, label_tlb_huge_update); +#else + UASM_i_LW(p, wr.r3, 0, wr.r2); #endif UASM_i_MFC0(p, wr.r1, C0_BADVADDR); - UASM_i_LW(p, wr.r2, 0, wr.r2); UASM_i_SRL(p, wr.r1, wr.r1, PAGE_SHIFT + PTE_ORDER - PTE_T_LOG2); uasm_i_andi(p, wr.r1, wr.r1, (PTRS_PER_PTE - 1) << PTE_T_LOG2); - UASM_i_ADDU(p, wr.r2, wr.r2, wr.r1); + UASM_i_ADDU(p, wr.r2, wr.r3, wr.r1); #ifdef CONFIG_SMP uasm_l_smp_pgtable_change(l, *p); -- 2.30.0
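The double-load problem described in the changelog can be modelled deterministically in plain C. The sketch below is an assumption-laden illustration, not the generated TLB handler: HUGE_BIT, struct lookup, the racer callback and both lookup functions are hypothetical, and the callback simply simulates the other CPU rewriting the slot between the two loads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HUGE_BIT 0x1u	/* hypothetical stand-in for the huge-page flag */

/* Callback that simulates another CPU rewriting the slot mid-lookup. */
typedef void (*racer_fn)(volatile uintptr_t *slot);

struct lookup {
	bool huge;	/* outcome of the huge-page test */
	uintptr_t base;	/* value later used as the page-table base */
};

/* Racy variant: the slot is loaded twice and may change in between. */
static struct lookup lookup_two_loads(volatile uintptr_t *slot, racer_fn racer)
{
	struct lookup r;

	assert(slot != NULL && racer != NULL);
	r.huge = (*slot & HUGE_BIT) != 0;	/* first load: huge-page test */
	racer(slot);
	r.base = *slot;				/* second load: may differ */
	return r;
}

/* Fixed variant: load once, then derive both results from that value. */
static struct lookup lookup_one_load(volatile uintptr_t *slot, racer_fn racer)
{
	struct lookup r;
	uintptr_t val;

	assert(slot != NULL && racer != NULL);
	val = *slot;				/* single load, reused below */
	racer(slot);
	r.huge = (val & HUGE_BIT) != 0;
	r.base = val;
	return r;
}

/* Simulated collapse: the slot now holds a huge-page entry, not a table. */
static void collapse_to_huge(volatile uintptr_t *slot)
{
	*slot = (uintptr_t)0x200000 | HUGE_BIT;
}

/* A result is consistent when the test and the base come from one value. */
static bool lookup_is_consistent(struct lookup r)
{
	return r.huge == ((r.base & HUGE_BIT) != 0);
}
```

In this model the two-load variant can report "not huge" while handing back a huge-page entry as the base, whereas the one-load variant always returns a matching pair.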
[PATCH] kernel: exit.c: fix a spacing coding style
Add some spaces before and after the operator. Signed-off-by: jiahao --- kernel/exit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/exit.c b/kernel/exit.c index 04029e3..ffc507e 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -888,7 +888,7 @@ EXPORT_SYMBOL(complete_and_exit); SYSCALL_DEFINE1(exit, int, error_code) { - do_exit((error_code&0xff)<<8); + do_exit((error_code & 0xff) << 8); } /* -- 2.7.4
Re: [PATCH] gpiolib: cdev: convert stream-like files from
On Sun, Feb 7, 2021 at 10:00 AM Yang Li wrote: > > Eliminate the following coccicheck warning: > ./drivers/gpio/gpiolib-cdev.c:2307:7-23: WARNING: gpio_fileops: .read() > has stream semantic; safe to change nonseekable_open -> stream_open. > > Reported-by: Abaci Robot > Signed-off-by: Yang Li > --- > drivers/gpio/gpiolib-cdev.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c > index 1631727..bad68ef 100644 > --- a/drivers/gpio/gpiolib-cdev.c > +++ b/drivers/gpio/gpiolib-cdev.c > @@ -2304,7 +2304,7 @@ static int gpio_chrdev_open(struct inode *inode, struct > file *file) > get_device(>dev); > file->private_data = cdev; > > - ret = nonseekable_open(inode, file); > + ret = stream_open(inode, file); > if (ret) > goto out_unregister_notifier; > > -- > 1.8.3.1 > I think you have a false positive here - we don't even take the offset argument into account so I don't see how the line_watch_read callback could be interpreted as seekable. Bart
Re: [RFC PATCH v2] taskstats: add /proc/taskstats to fetch pid/tgid status
On Fri, Feb 05, 2021 at 10:43:02AM +0800, Weiping Zhang wrote: > On Fri, Feb 5, 2021 at 8:08 AM Balbir Singh wrote: > > > > On Thu, Feb 04, 2021 at 10:37:20PM +0800, Weiping Zhang wrote: > > > On Thu, Feb 4, 2021 at 6:20 PM Balbir Singh wrote: > > > > > > > > On Sun, Jan 31, 2021 at 05:16:47PM +0800, Weiping Zhang wrote: > > > > > On Wed, Jan 27, 2021 at 7:13 PM Balbir Singh > > > > > wrote: > > > > > > > > > > > > On Fri, Jan 22, 2021 at 10:07:50PM +0800, Weiping Zhang wrote: > > > > > > > Hello Balbir Singh, > > > > > > > > > > > > > > Could you help review this patch, thanks > > > > > > > > > > > > > > On Mon, Dec 28, 2020 at 10:10 PM Weiping Zhang > > > > > > > wrote: > > > > > > > > > > > > > > > > Hi David, > > > > > > > > > > > > > > > > Could you help review this patch ? > > > > > > > > > > > > > > > > thanks > > > > > > > > > > > > > > > > On Fri, Dec 18, 2020 at 1:24 AM Weiping Zhang > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > If a program needs monitor lots of process's status, it needs > > > > > > > > > two > > > > > > > > > syscalls for every process. The first one is telling kernel > > > > > > > > > which > > > > > > > > > pid/tgid should be monitored by send a command(write socket) > > > > > > > > > to kernel. > > > > > > > > > The second one is read the statistics by read socket. This > > > > > > > > > patch add > > > > > > > > > a new interface /proc/taskstats to reduce two syscalls to one > > > > > > > > > ioctl. > > > > > > > > > The user just set the target pid/tgid to the struct > > > > > > > > > taskstats.ac_pid, > > > > > > > > > then kernel will collect statistics for that pid/tgid. > > > > > > > > > > > > > > > > > > Signed-off-by: Weiping Zhang > > > > > > > > > > > > Could you elaborate on the overhead your seeing for the syscalls? I > > > > > > am not > > > > > > in favour of adding new IOCTL's. > > > > > > > > > > > > Balbir Singh. 
> > > > > > > > > > Hello Balbir Singh, > > > > > > > > > > Sorry for late reply, > > > > > > > > > > I do a performance test between netlink mode and ioctl mode, > > > > > monitor 1000 and 1 sleep processes, > > > > > the netlink mode cost more time than ioctl mode, that is to say > > > > > ioctl mode can save some cpu resource and has a quickly reponse > > > > > especially when monitor lot of process. > > > > > > > > > > proccess-countnetlink ioctl > > > > > - > > > > > 1000 0.004446851 0.001553733 > > > > > 1 0.047024986 0.023290664 > > > > > > > > > > you can get the test demo code from the following link > > > > > https://github.com/dublio/tools/tree/master/c/taskstat > > > > > > > > > > > > > Let me try it out, I am opposed to adding the new IOCTL interface > > > > you propose. How frequently do you monitor this data and how much > > > > time in spent in making decision on the data? I presume the data > > > > mentioned is the cost per call in seconds? > > > > > > > This program just read every process's taskstats from kernel and do not > > > any extra data calculation, that is to say it just test the time spend on > > > these syscalls. It read data every 1 second, the output is delta time > > > spend to > > > read all 1000 or 1 processes's taskstat. > > > > > > t1 = clock_gettime(); > > > for_each_pid /* 1000 or 1 */ > > > read_pid_taskstat > > > t2 = clock_gettime(); > > > > > > delta = t2 - t1. > > > > > > > > proccess-countnetlink ioctl > > > > > - > > > > > 1000 0.004446851 0.001553733 > > > > > 1 0.047024986 0.023290664 > > > > > > Since netlink mode needs two syscall and ioctl mode needs one syscall > > > the test result shows netlink cost double time compare to ioctl. > > > So I want to add this interface to reduce the time cost by syscall. 
> > > > > > You can get the test script from: > > > https://github.com/dublio/tools/tree/master/c/taskstat#test-the-performance-between-netlink-and-ioctl-mode > > > > > > Thanks > > > > > > > Have you looked at the listener interface in taskstats, where one > > can register to listen on a cpumask against all exiting processes? > > > > That provides a register once and listen and filter interface (based > > on pids/tgids returned) and lets the task be done on exit as opposed > > to polling for data. > > > That is a good feature to collect data async mode, now I want to collect > those long-time running process's data in a fixed frequency, like iotop. > So I try to reduce the overhead cost by these syscalls when I polling > a lot of long-time running processes. > > Thanks a ton Still not convinced about it, I played around with it. The reason we did not use ioctl in the first place is to get the benefits of TLA with netlink, which ioctl's miss. IMHO, the overhead is not very significant even for 10,000 processes in your experiment. I am open to considering enhancing
Re: [PATCH] dax: fix default return code of range_parse()
ping On 2021/1/26 10:13 AM, Shiyang Ruan wrote: The return value of range_parse() indicates the size when it is positive. The error code should be negative. Signed-off-by: Shiyang Ruan --- drivers/dax/bus.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 737b207c9e30..3003558c1a8b 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -1038,7 +1038,7 @@ static ssize_t range_parse(const char *opt, size_t len, struct range *range) { unsigned long long addr = 0; char *start, *end, *str; - ssize_t rc = EINVAL; + ssize_t rc = -EINVAL; str = kstrdup(opt, GFP_KERNEL); if (!str)
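Why the sign of the default return code matters can be shown with a small userspace sketch. parse_len() and caller_sees_error() are hypothetical and only follow the convention stated in the changelog (positive means a size, negative means an error); the buggy flag exists solely to reproduce the positive EINVAL default and is not part of any real interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical parser for "start-end" in hex. On success it returns the
 * size of the range; on failure it returns an errno value whose sign
 * depends on the 'buggy' flag.
 */
static ssize_t parse_len(const char *opt, int buggy)
{
	unsigned long long start, end;

	assert(opt != NULL);
	if (sscanf(opt, "%llx-%llx", &start, &end) != 2 || end < start)
		return buggy ? EINVAL : -EINVAL;
	return (ssize_t)(end - start + 1);
}

/* Callers following the convention treat only negative values as errors. */
static int caller_sees_error(ssize_t rc)
{
	return rc < 0;
}
```

With the positive default, a failed parse is indistinguishable from a small valid size, so caller_sees_error() misses it; with -EINVAL the failure is detected.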
Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
On Sat, Feb 06, 2021 at 04:29:24AM -0800, Si-Wei Liu wrote: > While virtq is stopped, get_vq_state() is supposed to > be called to get sync'ed with the latest internal > avail_index from device. The saved avail_index is used > to restate the virtq once device is started. Commit > b35ccebe3ef7 introduced the clear_virtqueues() routine > to reset the saved avail_index, however, the index > gets cleared a bit earlier before get_vq_state() tries > to read it. This would cause consistency problems when > virtq is restarted, e.g. through a series of link down > and link up events. We could defer the clearing of > avail_index to until the device is to be started, > i.e. until VIRTIO_CONFIG_S_DRIVER_OK is set again in > set_status(). Not sure I understand the scenario. You are talking about reset of the device followed by up/down events on the interface. How can you trigger this? > > Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change > map") > Signed-off-by: Si-Wei Liu > --- > drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c > b/drivers/vdpa/mlx5/net/mlx5_vnet.c > index aa6f8cd..444ab58 100644 > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device > *vdev, u8 status) > if (!status) { > mlx5_vdpa_info(mvdev, "performing device reset\n"); > teardown_driver(ndev); > - clear_virtqueues(ndev); > mlx5_vdpa_destroy_mr(>mvdev); > ndev->mvdev.status = 0; > ++mvdev->generation; > @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device > *vdev, u8 status) > > if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) { > if (status & VIRTIO_CONFIG_S_DRIVER_OK) { > + clear_virtqueues(ndev); > err = setup_driver(ndev); > if (err) { > mlx5_vdpa_warn(mvdev, "failed to setup > driver\n"); > -- > 1.8.3.1 >
Re: [PATCH V3 16/19] virtio-pci: introduce modern device module
On 2021/2/5 11:34 PM, Michael S. Tsirkin wrote: On Mon, Jan 04, 2021 at 02:55:00PM +0800, Jason Wang wrote: Signed-off-by: Jason Wang I don't exactly get why we need to split the modern driver out, and it can confuse people who are used to seeing virtio-pci. The virtio-pci module is still there. No user-visible changes. Just some code that could be shared with other drivers was split out. The vdpa thing so far looks like a development tool, why do we care that it depends on a bit of extra code? If I'm not misunderstanding, sharing the code was proposed by you here: https://lkml.org/lkml/2020/6/10/232 We also had a plan to convert IFCVF to use this library. Thanks
Re: [PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu
On Sat, Feb 06, 2021 at 04:29:22AM -0800, Si-Wei Liu wrote:
> When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa,
> 22 extra bytes worth of MTU length is shown in guest.
> This is because the mlx5_query_port_max_mtu API returns
> the "hardware" MTU value, which does not just contain the
> Ethernet payload, but includes extra lengths starting
> from the Ethernet header up to the FCS altogether.
>
> Fix the MTU so packets won't get dropped silently.
>
> Signed-off-by: Si-Wei Liu

Acked-by: Eli Cohen

> ---
>  drivers/vdpa/mlx5/core/mlx5_vdpa.h |  4
>  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 15 ++-
>  2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> index 08f742f..b6cc53b 100644
> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> @@ -4,9 +4,13 @@
>  #ifndef __MLX5_VDPA_H__
>  #define __MLX5_VDPA_H__
>
> +#include
> +#include
>  #include
>  #include
>
> +#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
> +
>  struct mlx5_vdpa_direct_mr {
>  	u64 start;
>  	u64 end;
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index dc88559..b8416c4 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1907,6 +1907,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx)
>  	.free = mlx5_vdpa_free,
>  };
>
> +static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
> +{
> +	u16 hw_mtu;
> +	int err;
> +
> +	err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu);
> +	if (err)
> +		return err;
> +
> +	*mtu = hw_mtu - MLX5V_ETH_HARD_MTU;
> +	return 0;
> +}
> +
>  static int alloc_resources(struct mlx5_vdpa_net *ndev)
>  {
>  	struct mlx5_vdpa_net_resources *res = &ndev->res;
> @@ -1992,7 +2005,7 @@ static int mlx5v_probe(struct auxiliary_device *adev,
>  	init_mvqs(ndev);
>  	mutex_init(&ndev->reslock);
>  	config = &ndev->config;
> -	err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu);
> +	err = query_mtu(mdev, &ndev->mtu);
>  	if (err)
>  		goto err_mtu;
>
> --
> 1.8.3.1
>
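For readers who want the 22-byte figure spelled out, here is a minimal userspace sketch of the same subtraction. It is not the driver code: the SKETCH_* constants only mirror the values of the kernel's ETH_HLEN (14), VLAN_HLEN (4) and ETH_FCS_LEN (4), and the helper name payload_mtu() with its underflow check is an assumption added for illustration (the patch itself subtracts unconditionally).

```c
#include <stdint.h>

/* Values mirror the kernel's ETH_HLEN, VLAN_HLEN and ETH_FCS_LEN. */
enum {
	SKETCH_ETH_HLEN     = 14,
	SKETCH_VLAN_HLEN    = 4,
	SKETCH_ETH_FCS_LEN  = 4,
	SKETCH_ETH_HARD_MTU = SKETCH_ETH_HLEN + SKETCH_VLAN_HLEN +
			      SKETCH_ETH_FCS_LEN,
};

/*
 * Hypothetical helper: turn a "hardware" MTU (header + VLAN tag + payload
 * + FCS) into the payload MTU a guest should see.
 * Returns 0 on success, -1 if hw_mtu is smaller than the overhead.
 * The underflow check is an addition of this sketch, not part of the patch.
 */
static int payload_mtu(uint16_t hw_mtu, uint16_t *mtu)
{
	if (hw_mtu < SKETCH_ETH_HARD_MTU)
		return -1;

	*mtu = (uint16_t)(hw_mtu - SKETCH_ETH_HARD_MTU);
	return 0;
}
```

With a hardware MTU of 1522 this yields the usual 1500-byte payload MTU.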
Re: [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
On Sat, Feb 06, 2021 at 04:29:23AM -0800, Si-Wei Liu wrote:
> The mlx_features denotes the capability for which
> set of virtio features is supported by device. In
> principle, this field needs not be cleared during
> virtio device reset, as this capability is static
> and does not change across reset.
>
> In fact, the current code may have the assumption
> that mlx_features can be reloaded from firmware
> via the .get_features ops after device is reset
> (via the .set_status ops), which is unfortunately
> not true. The userspace VMM might save a copy
> of backend capable features and won't call into
> kernel again to get it on reset. This causes all
> virtio features getting disabled on newly created
> virtqs after device reset, while guest would hold
> mismatched view of available features. For e.g.,
> the guest may still assume tx checksum offload
> is available after reset and feature negotiation,
> causing frames with bogus (incomplete) checksum
> transmitted on the wire.
>
> Signed-off-by: Si-Wei Liu
> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index b8416c4..aa6f8cd 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>  		clear_virtqueues(ndev);
>  		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>  		ndev->mvdev.status = 0;
> -		ndev->mvdev.mlx_features = 0;
>  		++mvdev->generation;
>  		return;
>  	}

Since we assume that device capabilities don't change, I think I would
get the features through a call done in mlx5v_probe after the netdev
object is created and change mlx5_vdpa_get_features() to just return
ndev->mvdev.mlx_features.

Did you actually see this issue in action? If you did, can you share
with us how you triggered this?

> --
> 1.8.3.1
>
RE: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory pin
> -----Original Message-----
> From: David Rientjes [mailto:rient...@google.com]
> Sent: Monday, February 8, 2021 3:18 PM
> To: Song Bao Hua (Barry Song)
> Cc: Matthew Wilcox ; Wangzhou (B) ; linux-kernel@vger.kernel.org;
> io...@lists.linux-foundation.org; linux...@kvack.org;
> linux-arm-ker...@lists.infradead.org; linux-...@vger.kernel.org;
> Andrew Morton ; Alexander Viro ; gre...@linuxfoundation.org;
> j...@ziepe.ca; kevin.t...@intel.com; jean-phili...@linaro.org;
> eric.au...@redhat.com; Liguozhu (Kenneth) ; zhangfei@linaro.org;
> chensihang (A)
> Subject: RE: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory pin
>
> On Sun, 7 Feb 2021, Song Bao Hua (Barry Song) wrote:
>
> > NUMA balancer is just one of many reasons for page migration. Even one
> > simple alloc_pages() can cause memory migration in just single NUMA
> > node or UMA system.
> >
> > The other reasons for page migration include but are not limited to:
> > * memory move due to CMA
> > * memory move due to huge pages creation
> >
> > Hardly we can ask users to disable the COMPACTION, CMA and Huge Page
> > in the whole system.
> >
>
> What about only for mlocked memory, i.e. disable
> vm.compact_unevictable_allowed?
>
> Adding syscalls is a big deal, we can make a reasonable inference that
> we'll have to support this forever if it's merged. I haven't seen mention
> of what other unevictable memory *should* be migratable that would be
> adversely affected if we disable that sysctl. Maybe that gets you part of
> the way there and there are some other deficiencies, but it seems like a
> good start would be to describe how CONFIG_NUMA_BALANCING=n +
> vm.compact_unevictable_allowed + mlock() doesn't get you mostly there and
> then look into what's missing.
>

I believe it can resolve the performance problem for the SVA
applications if we disable vm.compact_unevictable_allowed and
NUMA_BALANCE, and use mlock().

The problem is that it is unreasonable to ask users to disable
compact_unevictable_allowed or NUMA balancing for the whole system only
because there is one SVA application in the system.

SVA itself is a mechanism to let the CPU and devices share the same
address space. In a typical server system there are many processes; the
better way would be to change the behavior of only the specific process
rather than the whole system. It is hard to ask users to do that only
because there is an SVA monster. Plus, this might negatively affect
those applications not using SVA.

> If it's a very compelling case where there simply are no alternatives, it
> would make sense. Alternative is to find a more generic way, perhaps in
> combination with vm.compact_unevictable_allowed, to achieve what you're
> looking to do that can be useful even beyond your originally intended use
> case.

Sensible. Actually, pin is exactly the way to disable migration for
specific pages, i.e. disabling "vm.compact_unevictable_allowed" on just
those pages.

It is hard to tell which pages should not be migrated. Only apps know
that, as even SVA applications can allocate many non-IO pages which
should remain movable.

Thanks
Barry
linux-next: build failure after merge of the kvm tree
Hi all,

After merging the kvm tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

drivers/gpu/drm/i915/gvt/kvmgt.c: In function 'kvmgt_page_track_add':
drivers/gpu/drm/i915/gvt/kvmgt.c:1706:12: error: passing argument 1 of 'spin_lock' from incompatible pointer type [-Werror=incompatible-pointer-types]
 1706 |  spin_lock(&kvm->mmu_lock);
      |            ^~~~~~~~~~~~~~
      |            |
      |            rwlock_t *
In file included from include/linux/wait.h:9,
                 from include/linux/pid.h:6,
                 from include/linux/sched.h:14,
                 from include/linux/ratelimit.h:6,
                 from include/linux/dev_printk.h:16,
                 from include/linux/device.h:15,
                 from drivers/gpu/drm/i915/gvt/kvmgt.c:32:
include/linux/spinlock.h:352:51: note: expected 'spinlock_t *' {aka 'struct spinlock *'} but argument is of type 'rwlock_t *'
  352 | static __always_inline void spin_lock(spinlock_t *lock)
      |                                                   ^~~~
drivers/gpu/drm/i915/gvt/kvmgt.c:1715:14: error: passing argument 1 of 'spin_unlock' from incompatible pointer type [-Werror=incompatible-pointer-types]
 1715 |  spin_unlock(&kvm->mmu_lock);
      |              ^~~~~~~~~~~~~~
      |              |
      |              rwlock_t *
In file included from include/linux/wait.h:9,
                 from include/linux/pid.h:6,
                 from include/linux/sched.h:14,
                 from include/linux/ratelimit.h:6,
                 from include/linux/dev_printk.h:16,
                 from include/linux/device.h:15,
                 from drivers/gpu/drm/i915/gvt/kvmgt.c:32:
include/linux/spinlock.h:392:53: note: expected 'spinlock_t *' {aka 'struct spinlock *'} but argument is of type 'rwlock_t *'
  392 | static __always_inline void spin_unlock(spinlock_t *lock)
      |                                                     ^~~~
drivers/gpu/drm/i915/gvt/kvmgt.c: In function 'kvmgt_page_track_remove':
drivers/gpu/drm/i915/gvt/kvmgt.c:1740:12: error: passing argument 1 of 'spin_lock' from incompatible pointer type [-Werror=incompatible-pointer-types]
 1740 |  spin_lock(&kvm->mmu_lock);
      |            ^~~~~~~~~~~~~~
      |            |
      |            rwlock_t *
In file included from include/linux/wait.h:9,
                 from include/linux/pid.h:6,
                 from include/linux/sched.h:14,
                 from include/linux/ratelimit.h:6,
                 from include/linux/dev_printk.h:16,
                 from include/linux/device.h:15,
                 from drivers/gpu/drm/i915/gvt/kvmgt.c:32:
include/linux/spinlock.h:352:51: note: expected 'spinlock_t *' {aka 'struct spinlock *'} but argument is of type 'rwlock_t *'
  352 | static __always_inline void spin_lock(spinlock_t *lock)
      |                                                   ^~~~
drivers/gpu/drm/i915/gvt/kvmgt.c:1749:14: error: passing argument 1 of 'spin_unlock' from incompatible pointer type [-Werror=incompatible-pointer-types]
 1749 |  spin_unlock(&kvm->mmu_lock);
      |              ^~~~~~~~~~~~~~
      |              |
      |              rwlock_t *
In file included from include/linux/wait.h:9,
                 from include/linux/pid.h:6,
                 from include/linux/sched.h:14,
                 from include/linux/ratelimit.h:6,
                 from include/linux/dev_printk.h:16,
                 from include/linux/device.h:15,
                 from drivers/gpu/drm/i915/gvt/kvmgt.c:32:
include/linux/spinlock.h:392:53: note: expected 'spinlock_t *' {aka 'struct spinlock *'} but argument is of type 'rwlock_t *'
  392 | static __always_inline void spin_unlock(spinlock_t *lock)
      |                                                     ^~~~
drivers/gpu/drm/i915/gvt/kvmgt.c: In function 'kvmgt_page_track_flush_slot':
drivers/gpu/drm/i915/gvt/kvmgt.c:1775:12: error: passing argument 1 of 'spin_lock' from incompatible pointer type [-Werror=incompatible-pointer-types]
 1775 |  spin_lock(&kvm->mmu_lock);
      |            ^~~~~~~~~~~~~~
      |            |
      |            rwlock_t *
In file included from include/linux/wait.h:9,
                 from include/linux/pid.h:6,
                 from include/linux/sched.h:14,
                 from include/linux/ratelimit.h:6,
                 from include/linux/dev_printk.h:16,
                 from include/linux/device.h:15,
                 from drivers/gpu/drm/i915/gvt/kvmgt.c:32:
include/linux/spinlock.h:352:51: note: expected 'spinlock_t *' {aka 'struct spinlock *'} but argument is of type 'rwlock_t *'
  352 | static __always_inline void spin_lock(spinlock_t *lock)
      |                                                   ^~~~
drivers/gpu/drm/i915/gvt/kvmgt.c:1784:14: error: passing argument 1 of 'spin_unlock' from incompatible pointer type [-Werror=incompatible-pointer-types]
 1784 |  spin_unlock(&kvm->mmu_lock);
      |              ^~~~~~~~~~~~~~
      |              |
      |              rwlock_t *
[PATCHv2 2/3] media: uvcvideo: add ROI auto controls
From: Sergey Senozhatsky

This patch adds support for Region of Interest bmAutoControls.

ROI control is a compound data type:

  Control Selector     CT_REGION_OF_INTEREST_CONTROL
  Mandatory Requests   SET_CUR, GET_CUR, GET_MIN, GET_MAX, GET_DEF
  wLength              10

  Offset   Field            Size
  0        wROI_Top         2
  2        wROI_Left        2
  4        wROI_Bottom      2
  6        wROI_Right       2
  8        bmAutoControls   2 (Bitmap)

uvc_control_mapping, however, can handle only s32 data type at the
moment: ->get() returns s32 value, ->set() accepts s32 value; while
v4l2_ctrl maximum/minimum/default_value can hold only s64 values.

Hence ROI control handling is split into two patches:
a) bmAutoControls is handled via uvc_control_mapping as
   V4L2_CTRL_TYPE_MENU
b) ROI rectangle (SET_CUR, GET_CUR, GET_DEF) handling is implemented
   separately, by the means of selection API.

Signed-off-by: Sergey Senozhatsky
---
 .../media/v4l/ext-ctrls-camera.rst            | 25 +++
 drivers/media/usb/uvc/uvc_ctrl.c              | 19 ++
 include/uapi/linux/usb/video.h                |  1 +
 include/uapi/linux/v4l2-controls.h            |  9 +++
 4 files changed, 54 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-camera.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-camera.rst
index c05a2d2c675d..1593c999c8e2 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-camera.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-camera.rst
@@ -653,6 +653,31 @@ enum v4l2_scene_mode -
 || ++
+``V4L2_CID_REGION_OF_INTEREST_AUTO (bitmask)``
+    This determines which, if any, on board features should track to the
+    Region of Interest.
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+
+    * - ``V4L2_CID_REGION_OF_INTEREST_AUTO_EXPOSURE``
+      - Auto Exposure.
+    * - ``V4L2_CID_REGION_OF_INTEREST_AUTO_IRIS``
+      - Auto Iris.
+    * - ``V4L2_CID_REGION_OF_INTEREST_AUTO_WHITE_BALANCE``
+      - Auto White Balance.
+    * - ``V4L2_CID_REGION_OF_INTEREST_AUTO_FOCUS``
+      - Auto Focus.
+    * - ``V4L2_CID_REGION_OF_INTEREST_AUTO_FACE_DETECT``
+      - Auto Face Detect.
+    * - ``V4L2_CID_REGION_OF_INTEREST_AUTO_DETECT_AND_TRACK``
+      - Auto Detect and Track.
+    * - ``V4L2_CID_REGION_OF_INTEREST_AUTO_IMAGE_STABILIXATION``
+      - Image Stabilization.
+    * - ``V4L2_CID_REGION_OF_INTEREST_AUTO_HIGHER_QUALITY``
+      - Higher Quality.
+
 .. [#f1]
    This control may be changed to a menu control in the future, if more
diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
index b3dde98499f4..5502fe540519 100644
--- a/drivers/media/usb/uvc/uvc_ctrl.c
+++ b/drivers/media/usb/uvc/uvc_ctrl.c
@@ -355,6 +355,15 @@ static const struct uvc_control_info uvc_ctrls[] = {
 		.flags		= UVC_CTRL_FLAG_GET_CUR
 				| UVC_CTRL_FLAG_AUTO_UPDATE,
 	},
+	{
+		.entity		= UVC_GUID_UVC_CAMERA,
+		.selector	= UVC_CT_REGION_OF_INTEREST_CONTROL,
+		.index		= 21,
+		.size		= 10,
+		.flags		= UVC_CTRL_FLAG_SET_CUR | UVC_CTRL_FLAG_GET_CUR
+				| UVC_CTRL_FLAG_GET_MIN | UVC_CTRL_FLAG_GET_MAX
+				| UVC_CTRL_FLAG_GET_DEF
+	},
 };
 
 static const struct uvc_menu_info power_line_frequency_controls[] = {
@@ -753,6 +762,16 @@ static const struct uvc_control_mapping uvc_ctrl_mappings[] = {
 		.v4l2_type	= V4L2_CTRL_TYPE_BOOLEAN,
 		.data_type	= UVC_CTRL_DATA_TYPE_BOOLEAN,
 	},
+	{
+		.id		= V4L2_CID_REGION_OF_INTEREST_AUTO,
+		.name		= "Region of Interest (auto)",
+		.entity		= UVC_GUID_UVC_CAMERA,
+		.selector	= UVC_CT_REGION_OF_INTEREST_CONTROL,
+		.size		= 16,
+		.offset		= 64,
+		.v4l2_type	= V4L2_CTRL_TYPE_BITMASK,
+		.data_type	= UVC_CTRL_DATA_TYPE_BITMASK,
+	},
 };
 
 /*
diff --git a/include/uapi/linux/usb/video.h b/include/uapi/linux/usb/video.h
index d854cb19c42c..c87624962896 100644
--- a/include/uapi/linux/usb/video.h
+++ b/include/uapi/linux/usb/video.h
@@ -104,6 +104,7 @@
 #define UVC_CT_ROLL_ABSOLUTE_CONTROL			0x0f
 #define UVC_CT_ROLL_RELATIVE_CONTROL			0x10
 #define UVC_CT_PRIVACY_CONTROL				0x11
+#define UVC_CT_REGION_OF_INTEREST_CONTROL		0x14
 
 /* A.9.5. Processing Unit Control Selectors */
 #define UVC_PU_CONTROL_UNDEFINED			0x00
diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/linux/v4l2-controls.h
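As a reading aid for the 10-byte layout in the commit message, here is a small userspace sketch that decodes such a payload. It is not part of the patch: struct roi_payload, get_le16() and roi_decode() are hypothetical names, and the sketch assumes the fields are little-endian, as multi-byte USB fields are. Note that bmAutoControls sits at byte offset 8, i.e. bit offset 64, with a width of 16 bits, which is what .offset = 64 and .size = 16 in the mapping above express.

```c
#include <stddef.h>
#include <stdint.h>

#define ROI_PAYLOAD_LEN 10	/* wLength from the commit message */

/* Hypothetical decoded view of the ROI control payload. */
struct roi_payload {
	uint16_t top;		/* wROI_Top,       byte offset 0 */
	uint16_t left;		/* wROI_Left,      byte offset 2 */
	uint16_t bottom;	/* wROI_Bottom,    byte offset 4 */
	uint16_t right;		/* wROI_Right,     byte offset 6 */
	uint16_t auto_controls;	/* bmAutoControls, byte offset 8 */
};

/* Read one little-endian 16-bit field. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer is shorter than the payload. */
static int roi_decode(const uint8_t *buf, size_t len, struct roi_payload *out)
{
	if (len < ROI_PAYLOAD_LEN)
		return -1;

	out->top = get_le16(buf + 0);
	out->left = get_le16(buf + 2);
	out->bottom = get_le16(buf + 4);
	out->right = get_le16(buf + 6);
	out->auto_controls = get_le16(buf + 8);
	return 0;
}
```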
[PATCHv2 1/3] media: v4l UAPI docs: document ROI selection targets
From: Sergey Senozhatsky

Document new v4l2-selection target which will be used for the
Region of Interest v4l2 control.

Signed-off-by: Sergey Senozhatsky
---
 .../media/v4l/selection-api-configuration.rst | 23 +++
 .../media/v4l/v4l2-selection-targets.rst      | 21 +
 include/uapi/linux/v4l2-common.h              |  8 +++
 3 files changed, 52 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/selection-api-configuration.rst b/Documentation/userspace-api/media/v4l/selection-api-configuration.rst
index fee49bf1a1c0..9f69d71803f6 100644
--- a/Documentation/userspace-api/media/v4l/selection-api-configuration.rst
+++ b/Documentation/userspace-api/media/v4l/selection-api-configuration.rst
@@ -135,3 +135,26 @@ and the height of rectangles obtained using ``V4L2_SEL_TGT_CROP`` and
 ``V4L2_SEL_TGT_COMPOSE`` targets. If these are not equal then the scaling
 is applied. The application can compute the scaling ratios using these
 values.
+
+Configuration of Region of Interest (ROI)
+=========================================
+
+The range of coordinates of the top left corner, width and height of
+areas that can be ROI is given by the ``V4L2_SEL_TGT_ROI_BOUNDS`` target.
+It is recommended for the driver developers to put the top/left corner
+at position ``(0,0)``. The rectangle's coordinates are in global sensor
+coordinates. The units are in pixels and independent of the field of view.
+They are not impacted by any cropping or scaling that is currently being
+used.
+
+The top left corner, width and height of the Region of Interest area
+currently being employed by the device is given by the
+``V4L2_SEL_TGT_ROI_CURRENT`` target. It uses the same coordinate system
+as ``V4L2_SEL_TGT_ROI_BOUNDS``.
+
+In order to change active ROI top left, width and height coordinates
+use ``V4L2_SEL_TGT_ROI`` target.
+
+Each capture device has a default ROI rectangle, given by the
+``V4L2_SEL_TGT_ROI_DEFAULT`` target. Drivers shall set the ROI rectangle
+to the default when the driver is first loaded, but not later.
diff --git a/Documentation/userspace-api/media/v4l/v4l2-selection-targets.rst b/Documentation/userspace-api/media/v4l/v4l2-selection-targets.rst
index e877ebbdb32e..cb3809418fa6 100644
--- a/Documentation/userspace-api/media/v4l/v4l2-selection-targets.rst
+++ b/Documentation/userspace-api/media/v4l/v4l2-selection-targets.rst
@@ -69,3 +69,24 @@ of the two interfaces they are used.
 	modified by hardware.
       - Yes
       - No
+    * - ``V4L2_SEL_TGT_ROI_CURRENT``
+      - 0x0200
+      - Current Region of Interest rectangle.
+      - Yes
+      - No
+    * - ``V4L2_SEL_TGT_ROI_DEFAULT``
+      - 0x0201
+      - Suggested Region of Interest rectangle.
+      - Yes
+      - No
+    * - ``V4L2_SEL_TGT_ROI_BOUNDS``
+      - 0x0202
+      - Bounds of the Region of Interest rectangle. All valid ROI rectangles fit
+	inside the ROI bounds rectangle.
+      - Yes
+      - No
+    * - ``V4L2_SEL_TGT_ROI``
+      - 0x0203
+      - Sets the new Region of Interest rectangle.
+      - Yes
+      - No
diff --git a/include/uapi/linux/v4l2-common.h b/include/uapi/linux/v4l2-common.h
index 7d21c1634b4d..d0c108fba638 100644
--- a/include/uapi/linux/v4l2-common.h
+++ b/include/uapi/linux/v4l2-common.h
@@ -78,6 +78,14 @@
 #define V4L2_SEL_TGT_COMPOSE_BOUNDS	0x0102
 /* Current composing area plus all padding pixels */
 #define V4L2_SEL_TGT_COMPOSE_PADDED	0x0103
+/* Current Region of Interest area */
+#define V4L2_SEL_TGT_ROI_CURRENT	0x0200
+/* Default Region of Interest area */
+#define V4L2_SEL_TGT_ROI_DEFAULT	0x0201
+/* Region of Interest bounds */
+#define V4L2_SEL_TGT_ROI_BOUNDS		0x0202
+/* Set Region of Interest area */
+#define V4L2_SEL_TGT_ROI		0x0203
 
 /* Selection flags */
 #define V4L2_SEL_FLAG_GE		(1 << 0)
-- 
2.30.0
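The documented rule that "all valid ROI rectangles fit inside the ROI bounds rectangle" can be sketched as a plain containment check. This is an illustration only, not code from the series: struct sketch_rect merely imitates the left/top/width/height shape of struct v4l2_rect, and rect_within() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as struct v4l2_rect: origin plus size, in pixels. */
struct sketch_rect {
	int32_t left;
	int32_t top;
	uint32_t width;
	uint32_t height;
};

/*
 * Return true if r lies completely inside bounds.
 * Edges are computed in 64 bits so left + width cannot overflow.
 */
static bool rect_within(const struct sketch_rect *r,
			const struct sketch_rect *bounds)
{
	int64_t r_right = (int64_t)r->left + r->width;
	int64_t r_bottom = (int64_t)r->top + r->height;
	int64_t b_right = (int64_t)bounds->left + bounds->width;
	int64_t b_bottom = (int64_t)bounds->top + bounds->height;

	return r->left >= bounds->left && r->top >= bounds->top &&
	       r_right <= b_right && r_bottom <= b_bottom;
}
```

A driver or application could run such a check against the rectangle reported for V4L2_SEL_TGT_ROI_BOUNDS before submitting a new ROI; whether the driver clamps or rejects an out-of-bounds request is not specified by the text above.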
[PATCHv2 0/3] Add UVC 1.5 Region Of Interest control to uvcvideo
Hello,

	RFC

This patch set adds UVC 1.5 Region of Interest support.

v1->v2:
- Address Laurent's comments

Sergey Senozhatsky (3):
  media: v4l UAPI docs: document ROI selection targets
  media: uvcvideo: add ROI auto controls
  media: uvcvideo: add UVC 1.5 ROI control

 .../media/v4l/ext-ctrls-camera.rst            |  25 +++
 .../media/v4l/selection-api-configuration.rst |  23 +++
 .../media/v4l/v4l2-selection-targets.rst      |  21 +++
 drivers/media/usb/uvc/uvc_ctrl.c              |  19 +++
 drivers/media/usb/uvc/uvc_v4l2.c              | 143 +-
 include/uapi/linux/usb/video.h                |   1 +
 include/uapi/linux/v4l2-common.h              |   8 +
 include/uapi/linux/v4l2-controls.h            |   9 ++
 8 files changed, 246 insertions(+), 3 deletions(-)

-- 
2.30.0
[PATCHv2 3/3] media: uvcvideo: add UVC 1.5 ROI control
From: Sergey Senozhatsky

This patch implements parts of UVC 1.5 Region of Interest (ROI)
control, using the uvcvideo selection API.

There are several things to mention here.

First, UVC 1.5 defines CT_DIGITAL_WINDOW_CONTROL; and ROI rectangle
coordinates "must be within the current Digital Window as specified
by the CT_WINDOW control." (4.2.2.1.20 Digital Region of Interest
(ROI) Control.) This is not entirely clear if we need to implement
CT_DIGITAL_WINDOW_CONTROL. ROI is naturally limited by: ROI GET_MIN
and GET_MAX rectangles. Besides, the H/W that I'm playing with
implements ROI, but doesn't implement CT_DIGITAL_WINDOW_CONTROL,
so WINDOW_CONTROL is probably optional.

Second, the patch doesn't implement all of the ROI requests.
Namely, SEL_TGT_BOUNDS for ROI implements GET_MAX (that is maximal
ROI rectangle area). GET_MIN is not implemented (as of now) since
it's not very clear if user space would need such information.

Signed-off-by: Sergey Senozhatsky
---
 drivers/media/usb/uvc/uvc_v4l2.c | 143 ++-
 1 file changed, 140 insertions(+), 3 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c
index 252136cc885c..71b4577196e5 100644
--- a/drivers/media/usb/uvc/uvc_v4l2.c
+++ b/drivers/media/usb/uvc/uvc_v4l2.c
@@ -1139,14 +1139,60 @@ static int uvc_ioctl_querymenu(struct file *file, void *fh,
 	return uvc_query_v4l2_menu(chain, qm);
 }
 
-static int uvc_ioctl_g_selection(struct file *file, void *fh,
-				 struct v4l2_selection *sel)
+/* UVC 1.5 ROI rectangle is half the size of v4l2_rect */
+struct uvc_roi_rect {
+	__u16			top;
+	__u16			left;
+	__u16			bottom;
+	__u16			right;
+};
+
+static int uvc_ioctl_g_roi_target(struct file *file, void *fh,
+				  struct v4l2_selection *sel)
 {
 	struct uvc_fh *handle = fh;
 	struct uvc_streaming *stream = handle->stream;
+	struct uvc_roi_rect *roi;
+	u8 query;
+	int ret;
 
-	if (sel->type != stream->type)
+	switch (sel->target) {
+	case V4L2_SEL_TGT_ROI_DEFAULT:
+		query = UVC_GET_DEF;
+		break;
+	case V4L2_SEL_TGT_ROI_CURRENT:
+		query = UVC_GET_CUR;
+		break;
+	case V4L2_SEL_TGT_ROI_BOUNDS:
+		query = UVC_GET_MAX;
+		break;
+	default:
 		return -EINVAL;
+	}
+
+	roi = kzalloc(sizeof(struct uvc_roi_rect), GFP_KERNEL);
+	if (!roi)
+		return -ENOMEM;
+
+	ret = uvc_query_ctrl(stream->dev, query, 1, stream->dev->intfnum,
+			     UVC_CT_REGION_OF_INTEREST_CONTROL, roi,
+			     sizeof(struct uvc_roi_rect));
+	if (!ret) {
+		sel->r.left	= roi->left;
+		sel->r.top	= roi->top;
+		sel->r.width	= roi->right;
+		sel->r.height	= roi->bottom;
+	}
+
+	kfree(roi);
+	return ret;
+}
+
+static int uvc_ioctl_g_sel_target(struct file *file, void *fh,
+				  struct v4l2_selection *sel)
+{
+	struct uvc_fh *handle = fh;
+	struct uvc_streaming *stream = handle->stream;
 
 	switch (sel->target) {
 	case V4L2_SEL_TGT_CROP_DEFAULT:
@@ -1173,6 +1219,96 @@ static int uvc_ioctl_g_selection(struct file *file, void *fh,
 	return 0;
 }
 
+static int uvc_ioctl_g_selection(struct file *file, void *fh,
+				 struct v4l2_selection *sel)
+{
+	struct uvc_fh *handle = fh;
+	struct uvc_streaming *stream = handle->stream;
+
+	if (sel->type != stream->type)
+		return -EINVAL;
+
+	switch (sel->target) {
+	case V4L2_SEL_TGT_CROP_DEFAULT:
+	case V4L2_SEL_TGT_CROP_BOUNDS:
+	case V4L2_SEL_TGT_COMPOSE_DEFAULT:
+	case V4L2_SEL_TGT_COMPOSE_BOUNDS:
+		return uvc_ioctl_g_sel_target(file, fh, sel);
+	case V4L2_SEL_TGT_ROI_CURRENT:
+	case V4L2_SEL_TGT_ROI_DEFAULT:
+	case V4L2_SEL_TGT_ROI_BOUNDS:
+		return uvc_ioctl_g_roi_target(file, fh, sel);
+	}
+
+	return -EINVAL;
+}
+
+static bool validate_roi_bounds(struct uvc_streaming *stream,
+				struct v4l2_selection *sel)
+{
+	bool ok = true;
+
+	if (sel->r.left > USHRT_MAX || sel->r.top > USHRT_MAX ||
+	    sel->r.width > USHRT_MAX || sel->r.height > USHRT_MAX)
+		return false;
+
+	/* perhaps also can test against ROI GET_MAX */
+
+	mutex_lock(&stream->mutex);
+	if ((u16)sel->r.width > stream->cur_frame->wWidth)
+		ok = false;
+	if ((u16)sel->r.height > stream->cur_frame->wHeight)
+		ok = false;
+	mutex_unlock(&stream->mutex);
+
+	return ok;
+}
+
+static int uvc_ioctl_s_roi(struct file *file, void *fh,
+			   struct
Re: drivers/opp/of.c:842:12: warning: stack frame size of 2064 bytes in function '_of_add_opp_table_v2'
On 07-02-21, 04:09, kernel test robot wrote:
> f47b72a15a9679 drivers/base/power/opp/of.c Viresh Kumar 2016-05-05  841  /* Initializes OPP tables based on new bindings */
> 5ed4cecd75e902 drivers/opp/of.c            Viresh Kumar 2018-09-12 @842  static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table)
> f47b72a15a9679 drivers/base/power/opp/of.c Viresh Kumar 2016-05-05  843  {
> f47b72a15a9679 drivers/base/power/opp/of.c Viresh Kumar 2016-05-05  844  	struct device_node *np;
> 283d55e68d8a0f drivers/opp/of.c            Viresh Kumar 2018-09-07  845  	int ret, count = 0, pstate_count = 0;
> 3ba98324e81add drivers/opp/of.c            Viresh Kumar 2016-11-18  846  	struct dev_pm_opp *opp;

I can't figure out why the stack frame warning would trigger for this
routine: it uses just pointers, with no big allocations on the stack.

False positive?

-- 
viresh
Re: [RFC 0/3] mm/page_alloc: Fix pageblock_order with HUGETLB_PAGE_SIZE_VARIABLE
On 2/4/21 12:31 PM, Anshuman Khandual wrote:
> The following warning gets triggered while trying to boot a 64K page size
> without THP config kernel on arm64 platform.
>
> WARNING: CPU: 5 PID: 124 at mm/vmstat.c:1080 __fragmentation_index+0xa4/0xc0
> Modules linked in:
> CPU: 5 PID: 124 Comm: kswapd0 Not tainted 5.11.0-rc6-4-ga0ea7d62002 #159
> Hardware name: linux,dummy-virt (DT)
> [8.810673] pstate: 2045 (nzCv daif +PAN -UAO -TCO BTYPE=--)
> [8.811732] pc : __fragmentation_index+0xa4/0xc0
> [8.812555] lr : fragmentation_index+0xf8/0x138
> [8.813360] sp : 864079b0
> [8.813958] x29: 864079b0 x28: 0372
> [8.814901] x27: 7682 x26: 8000135b3948
> [8.815847] x25: 1fffe00010c80f48 x24:
> [8.816805] x23: x22: 000d
> [8.817764] x21: 0030 x20: 0005ffcb4d58
> [8.818712] x19: 000b x18:
> [8.819656] x17: x16:
> [8.820613] x15: x14: 8000114c6258
> [8.821560] x13: 6000bff969ba x12: 1fffe000bff969b9
> [8.822514] x11: 1fffe000bff969b9 x10: 6000bff969b9
> [8.823461] x9 : dfff8000 x8 : 0005ffcb4dcf
> [8.824415] x7 : 0001 x6 : 41b58ab3
> [8.825359] x5 : 600010c80f48 x4 : dfff8000
> [8.826313] x3 : 8000102be670 x2 : 0007
> [8.827259] x1 : 86407a60 x0 : 000d
> [8.828218] Call trace:
> [8.828667]  __fragmentation_index+0xa4/0xc0
> [8.829436]  fragmentation_index+0xf8/0x138
> [8.830194]  compaction_suitable+0x98/0xb8
> [8.830934]  wakeup_kcompactd+0xdc/0x128
> [8.831640]  balance_pgdat+0x71c/0x7a0
> [8.832327]  kswapd+0x31c/0x520
> [8.832902]  kthread+0x224/0x230
> [8.833491]  ret_from_fork+0x10/0x30
> [8.834150] ---[ end trace 472836f79c15516b ]---
>
> This warning comes from __fragmentation_index() when the requested order
> is greater than MAX_ORDER.
>
> static int __fragmentation_index(unsigned int order,
> 				 struct contig_page_info *info)
> {
> 	unsigned long requested = 1UL << order;
>
> 	if (WARN_ON_ONCE(order >= MAX_ORDER)) <= Triggered here
> 		return 0;
>
> Digging it further reveals that pageblock_order has been assigned a value
> which is greater than MAX_ORDER failing the above check. But why this
> happened ? Because HUGETLB_PAGE_ORDER for the given config on arm64 is
> greater than MAX_ORDER.
>
> The solution involves enabling HUGETLB_PAGE_SIZE_VARIABLE which would make
> pageblock_order a variable instead of constant HUGETLB_PAGE_ORDER. But that
> change alone also did not really work as pageblock_order still got assigned
> as HUGETLB_PAGE_ORDER in set_pageblock_order(). HUGETLB_PAGE_ORDER needs to
> be less than MAX_ORDER for its appropriateness as pageblock_order otherwise
> just fallback to MAX_ORDER - 1 as before. While here it also fixes a build
> problem via type casting MAX_ORDER in rmem_cma_setup().
>
> This series applies in v5.11-rc6 and has been slightly tested on arm64. But
> looking for some early feedbacks particularly with respect to concerns in
> subscribing HUGETLB_PAGE_SIZE_VARIABLE on a platform where the hugetlb page
> size is config dependent but not really a runtime variable. Even though it
> appears that HUGETLB_PAGE_SIZE_VARIABLE is used only while computing the
> pageblock_order, could there be other implications ?
>
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: Robin Murphy
> Cc: Marek Szyprowski
> Cc: Christoph Hellwig
> Cc: Andrew Morton
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: io...@lists.linux-foundation.org
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org

Probably missed some more folks, adding them here.

+ Michal Hocko
+ Vlastimil Babka
+ Mike Kravetz
+ Matthew Wilcox
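The fallback rule described in the quoted cover letter (use HUGETLB_PAGE_ORDER only when it is below MAX_ORDER, otherwise fall back to MAX_ORDER - 1) can be sketched as a tiny standalone function. This is only an illustration of the stated rule, not the code from the series: pick_pageblock_order() is a hypothetical name, and both orders are passed as parameters instead of being read from kernel configuration.

```c
/*
 * Hypothetical helper mirroring the described fallback:
 * a hugetlb order is only usable as pageblock_order if it is
 * strictly below max_order; otherwise use max_order - 1.
 */
static unsigned int pick_pageblock_order(unsigned int hugetlb_page_order,
					 unsigned int max_order)
{
	if (hugetlb_page_order < max_order)
		return hugetlb_page_order;

	return max_order - 1;
}
```

As an assumed example, a hugetlb order of 13 with a MAX_ORDER of 11 would be rejected and replaced by 10, which keeps the order >= MAX_ORDER check in __fragmentation_index() from firing.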
Re: [PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu
On 2021/2/6 8:29 PM, Si-Wei Liu wrote:
> When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa,
> 22 extra bytes worth of MTU length is shown in guest.
> This is because the mlx5_query_port_max_mtu API returns
> the "hardware" MTU value, which does not just contain the
> Ethernet payload, but includes extra lengths starting
> from the Ethernet header up to the FCS altogether.
>
> Fix the MTU so packets won't get dropped silently.
>
> Signed-off-by: Si-Wei Liu

Acked-by: Jason Wang

> ---
>  drivers/vdpa/mlx5/core/mlx5_vdpa.h |  4
>  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 15 ++-
>  2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> index 08f742f..b6cc53b 100644
> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> @@ -4,9 +4,13 @@
>  #ifndef __MLX5_VDPA_H__
>  #define __MLX5_VDPA_H__
>
> +#include
> +#include
>  #include
>  #include
>
> +#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
> +
>  struct mlx5_vdpa_direct_mr {
>  	u64 start;
>  	u64 end;
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index dc88559..b8416c4 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1907,6 +1907,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx)
>  	.free = mlx5_vdpa_free,
>  };
>
> +static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
> +{
> +	u16 hw_mtu;
> +	int err;
> +
> +	err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu);
> +	if (err)
> +		return err;
> +
> +	*mtu = hw_mtu - MLX5V_ETH_HARD_MTU;
> +	return 0;
> +}
> +
>  static int alloc_resources(struct mlx5_vdpa_net *ndev)
>  {
>  	struct mlx5_vdpa_net_resources *res = &ndev->res;
> @@ -1992,7 +2005,7 @@ static int mlx5v_probe(struct auxiliary_device *adev,
>  	init_mvqs(ndev);
>  	mutex_init(&ndev->reslock);
>  	config = &ndev->config;
> -	err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu);
> +	err = query_mtu(mdev, &ndev->mtu);
>  	if (err)
>  		goto err_mtu;
Re: [PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
On 2021/2/6 8:29 PM, Si-Wei Liu wrote:
> While virtq is stopped, get_vq_state() is supposed to
> be called to get sync'ed with the latest internal
> avail_index from device. The saved avail_index is used
> to restate the virtq once device is started. Commit
> b35ccebe3ef7 introduced the clear_virtqueues() routine
> to reset the saved avail_index, however, the index
> gets cleared a bit earlier before get_vq_state() tries
> to read it. This would cause consistency problems when
> virtq is restarted, e.g. through a series of link down
> and link up events. We could defer the clearing of
> avail_index to until the device is to be started,
> i.e. until VIRTIO_CONFIG_S_DRIVER_OK is set again in
> set_status().
>
> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
> Signed-off-by: Si-Wei Liu

Acked-by: Jason Wang

> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index aa6f8cd..444ab58 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>  	if (!status) {
>  		mlx5_vdpa_info(mvdev, "performing device reset\n");
>  		teardown_driver(ndev);
> -		clear_virtqueues(ndev);
>  		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>  		ndev->mvdev.status = 0;
>  		++mvdev->generation;
> @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>
>  	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>  		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +			clear_virtqueues(ndev);
>  			err = setup_driver(ndev);
>  			if (err) {
>  				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
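The XOR test in set_status() is what tells "DRIVER_OK just got set" apart from "DRIVER_OK just got cleared" or "no change". The following standalone sketch isolates that logic. It is not driver code: SKETCH_S_DRIVER_OK only mirrors the value 4 that the virtio specification assigns to the DRIVER_OK status bit, and classify_driver_ok() and the enum are hypothetical names.

```c
#include <stdint.h>

/* Mirrors the value of the DRIVER_OK device status bit (4). */
#define SKETCH_S_DRIVER_OK 4

enum driver_ok_edge {
	EDGE_NONE,	/* DRIVER_OK unchanged */
	EDGE_SET,	/* DRIVER_OK newly set: device is being started */
	EDGE_CLEARED,	/* DRIVER_OK newly cleared */
};

/*
 * XOR leaves only the bits that differ between the old and new status,
 * so masking with DRIVER_OK detects a transition of that single bit.
 */
static enum driver_ok_edge classify_driver_ok(uint8_t old_status,
					      uint8_t new_status)
{
	if (!((old_status ^ new_status) & SKETCH_S_DRIVER_OK))
		return EDGE_NONE;

	return (new_status & SKETCH_S_DRIVER_OK) ? EDGE_SET : EDGE_CLEARED;
}
```

In terms of the patch above, the saved indices are cleared on the EDGE_SET path, just before setup_driver(), rather than at reset time.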
Re: [PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
On 2021/2/6 8:29 PM, Si-Wei Liu wrote:
> The mlx_features denotes the capability for which
> set of virtio features is supported by device. In
> principle, this field needs not be cleared during
> virtio device reset, as this capability is static
> and does not change across reset.
>
> In fact, the current code may have the assumption
> that mlx_features can be reloaded from firmware
> via the .get_features ops after device is reset
> (via the .set_status ops), which is unfortunately
> not true. The userspace VMM might save a copy
> of backend capable features and won't call into
> kernel again to get it on reset.

This is not the behavior of Qemu but it's valid.

> This causes all
> virtio features getting disabled on newly created
> virtqs after device reset, while guest would hold
> mismatched view of available features. For e.g.,
> the guest may still assume tx checksum offload
> is available after reset and feature negotiation,
> causing frames with bogus (incomplete) checksum
> transmitted on the wire.
>
> Signed-off-by: Si-Wei Liu

Acked-by: Jason Wang

> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index b8416c4..aa6f8cd 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>  		clear_virtqueues(ndev);
>  		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>  		ndev->mvdev.status = 0;
> -		ndev->mvdev.mlx_features = 0;
>  		++mvdev->generation;
>  		return;
>  	}
Re: [PATCH 1/4] mm/highmem: Lift memcpy_[to|from]_page to core
On 2/7/21 19:13, Ira Weiny wrote:
>>> +static inline void memcpy_from_page(char *to, struct page *page, size_t offset, size_t len)
>> How about following ?
>> static inline void memcpy_from_page(char *to, struct page *page, size_t offset,
>>                                     size_t len)
> It is an easy change and It is up to Andrew.  But I thought we were making the
> line length limit longer now.
>
> Ira
>
True, I'm not sure what the right thing is going forward, especially
when new changes are mixed with the old ones. I'll leave it to the
maintainer to decide.
Re: [PATCH v1] vdpa/mlx5: Restore the hardware used index after change map
On 2021/2/6 上午7:07, Si-Wei Liu wrote: On 2/3/2021 11:36 PM, Eli Cohen wrote: When a change of memory map occurs, the hardware resources are destroyed and then re-created again with the new memory map. In such case, we need to restore the hardware available and used indices. The driver failed to restore the used index which is added here. Also, since the driver also fails to reset the available and used indices upon device reset, fix this here to avoid regression caused by the fact that used index may not be zero upon device reset. Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen --- v0 -> v1: Clear indices upon device reset drivers/vdpa/mlx5/net/mlx5_vnet.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 88dde3455bfd..b5fe6d2ad22f 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -87,6 +87,7 @@ struct mlx5_vq_restore_info { u64 device_addr; u64 driver_addr; u16 avail_index; + u16 used_index; bool ready; struct vdpa_callback cb; bool restore; @@ -121,6 +122,7 @@ struct mlx5_vdpa_virtqueue { u32 virtq_id; struct mlx5_vdpa_net *ndev; u16 avail_idx; + u16 used_idx; int fw_state; /* keep last in the struct */ @@ -804,6 +806,7 @@ static int create_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque obj_context = MLX5_ADDR_OF(create_virtio_net_q_in, in, obj_context); MLX5_SET(virtio_net_q_object, obj_context, hw_available_index, mvq->avail_idx); + MLX5_SET(virtio_net_q_object, obj_context, hw_used_index, mvq->used_idx); MLX5_SET(virtio_net_q_object, obj_context, queue_feature_bit_mask_12_3, get_features_12_3(ndev->mvdev.actual_features)); vq_ctx = MLX5_ADDR_OF(virtio_net_q_object, obj_context, virtio_q_context); @@ -1022,6 +1025,7 @@ static int connect_qps(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *m struct mlx5_virtq_attr { u8 state; u16 available_index; + u16 used_index; }; 
static int query_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq, @@ -1052,6 +1056,7 @@ static int query_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueu memset(attr, 0, sizeof(*attr)); attr->state = MLX5_GET(virtio_net_q_object, obj_context, state); attr->available_index = MLX5_GET(virtio_net_q_object, obj_context, hw_available_index); + attr->used_index = MLX5_GET(virtio_net_q_object, obj_context, hw_used_index); kfree(out); return 0; @@ -1535,6 +1540,16 @@ static void teardown_virtqueues(struct mlx5_vdpa_net *ndev) } } +static void clear_virtqueues(struct mlx5_vdpa_net *ndev) +{ + int i; + + for (i = ndev->mvdev.max_vqs - 1; i >= 0; i--) { + ndev->vqs[i].avail_idx = 0; + ndev->vqs[i].used_idx = 0; + } +} + /* TODO: cross-endian support */ static inline bool mlx5_vdpa_is_little_endian(struct mlx5_vdpa_dev *mvdev) { @@ -1610,6 +1625,7 @@ static int save_channel_info(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqu return err; ri->avail_index = attr.available_index; + ri->used_index = attr.used_index; ri->ready = mvq->ready; ri->num_ent = mvq->num_ent; ri->desc_addr = mvq->desc_addr; @@ -1654,6 +1670,7 @@ static void restore_channels_info(struct mlx5_vdpa_net *ndev) continue; mvq->avail_idx = ri->avail_index; + mvq->used_idx = ri->used_index; mvq->ready = ri->ready; mvq->num_ent = ri->num_ent; mvq->desc_addr = ri->desc_addr; @@ -1768,6 +1785,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status) if (!status) { mlx5_vdpa_info(mvdev, "performing device reset\n"); teardown_driver(ndev); + clear_virtqueues(ndev); The clearing looks fine at the first glance, as it aligns with the other state cleanups floating around at the same place. However, the thing is get_vq_state() is supposed to be called right after to get sync'ed with the latest internal avail_index from device while vq is stopped. The index was saved in the driver software at vq suspension, but before the virtq object is destroyed. 
We shouldn't clear the avail_index too early.

Good point. There's a limitation in the virtio spec and the vDPA framework: we cannot simply distinguish device suspend from device reset. Need to think about that. I suggest a new state in [1]; the issue is that people don't like the asynchronous API that it introduces.

Possibly it can be postponed to where VIRTIO_CONFIG_S_DRIVER_OK gets set again, i.e. right before the setup_driver() in mlx5_vdpa_set_status()?

Looks like a good workaround.

Thanks

-Siwei

[1]
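The workaround discussed above (keep the indices readable after a reset and clear them only when DRIVER_OK is set again) can be sketched in plain C. This is a minimal userspace model under stated assumptions, not the mlx5_vdpa code: `toy_vq`, `toy_dev`, `clear_pending` and `toy_set_status()` are hypothetical names introduced only for illustration. The only value taken from virtio is the DRIVER_OK status bit (4).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STATUS_DRIVER_OK 0x04 /* same value as VIRTIO_CONFIG_S_DRIVER_OK */

/* Hypothetical, simplified device model; not the mlx5_vdpa structures. */
struct toy_vq {
	uint16_t avail_idx;
	uint16_t used_idx;
};

struct toy_dev {
	uint8_t status;
	bool clear_pending;	/* a reset happened, indices not cleared yet */
	struct toy_vq vq;
};

static void toy_set_status(struct toy_dev *dev, uint8_t status)
{
	if (!status) {
		/* Reset: tear down, but keep the indices readable for now. */
		dev->clear_pending = true;
		dev->status = 0;
		return;
	}

	/* DRIVER_OK goes from clear to set: the point right before setup. */
	if ((status & STATUS_DRIVER_OK) && !(dev->status & STATUS_DRIVER_OK)) {
		if (dev->clear_pending) {
			dev->vq.avail_idx = 0;
			dev->vq.used_idx = 0;
			dev->clear_pending = false;
		}
		/* A real driver would run its setup routine here. */
	}
	dev->status = status;
}
```

In this model a caller that queries the queue state between the reset and the next DRIVER_OK still sees the last saved indices, which is the property the review comment asks for.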
Re: [PATCH V3 11/14] coresight: sink: Add TRBE driver
On 2/5/21 11:23 PM, Mathieu Poirier wrote: > On Wed, Jan 27, 2021 at 02:25:35PM +0530, Anshuman Khandual wrote: >> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is >> accessible via the system registers. The TRBE supports different addressing >> modes including CPU virtual address and buffer modes including the circular >> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1), >> an write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the >> access to the trace buffer could be prohibited by a higher exception level >> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU >> private interrupt (PPI) on address translation errors and when the buffer >> is full. Overall implementation here is inspired from the Arm SPE driver. >> > > I got this message when applying the patch: > > Applying: coresight: sink: Add TRBE driver > .git/rebase-apply/patch:76: new blank line at EOF. > + > warning: 1 line adds whitespace errors. It could be the additional blank line at the end of documentation file i.e Documentation/trace/coresight/coresight-trbe.rst, will drop it. 
> >> Cc: Mathieu Poirier >> Cc: Mike Leach >> Cc: Suzuki K Poulose >> Signed-off-by: Anshuman Khandual >> --- >> Changes in V3: >> >> - Added new DT bindings document TRBE.yaml >> - Changed TRBLIMITR_TRIG_MODE_SHIFT from 2 to 3 >> - Dropped isb() from trbe_reset_local() >> - Dropped gap between (void *) and buf->trbe_base >> - Changed 'int' to 'unsigned int' in is_trbe_available() >> - Dropped unused function set_trbe_running(), set_trbe_virtual_mode(), >> set_trbe_enabled() and set_trbe_limit_pointer() >> - Changed get_trbe_flag_update(), is_trbe_programmable() and >> get_trbe_address_align() to accept TRBIDR value >> - Changed is_trbe_running(), is_trbe_abort(), is_trbe_wrap(), is_trbe_trg(), >> is_trbe_irq(), get_trbe_bsc() and get_trbe_ec() to accept TRBSR value >> - Dropped snapshot mode condition in arm_trbe_alloc_buffer() >> - Exit arm_trbe_init() when arm64_kernel_unmapped_at_el0() is enabled >> - Compute trbe_limit before trbe_write to get the updated handle >> - Added trbe_stop_and_truncate_event() >> - Dropped trbe_handle_fatal() >> >> Documentation/trace/coresight/coresight-trbe.rst | 39 + >> arch/arm64/include/asm/sysreg.h |1 + >> drivers/hwtracing/coresight/Kconfig | 11 + >> drivers/hwtracing/coresight/Makefile |1 + >> drivers/hwtracing/coresight/coresight-trbe.c | 1023 >> ++ >> drivers/hwtracing/coresight/coresight-trbe.h | 160 >> 6 files changed, 1235 insertions(+) >> create mode 100644 Documentation/trace/coresight/coresight-trbe.rst >> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c >> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h >> >> diff --git a/Documentation/trace/coresight/coresight-trbe.rst >> b/Documentation/trace/coresight/coresight-trbe.rst >> new file mode 100644 >> index 000..1cbb819 >> --- /dev/null >> +++ b/Documentation/trace/coresight/coresight-trbe.rst >> @@ -0,0 +1,39 @@ >> +.. SPDX-License-Identifier: GPL-2.0 >> + >> +== >> +Trace Buffer Extension (TRBE). 
>> +== >> + >> +:Author: Anshuman Khandual >> +:Date: November 2020 >> + >> +Hardware Description >> + >> + >> +Trace Buffer Extension (TRBE) is a percpu hardware which captures in system >> +memory, CPU traces generated from a corresponding percpu tracing unit. This >> +gets plugged in as a coresight sink device because the corresponding trace >> +genarators (ETE), are plugged in as source device. >> + >> +The TRBE is not compliant to CoreSight architecture specifications, but is >> +driven via the CoreSight driver framework to support the ETE (which is >> +CoreSight compliant) integration. >> + >> +Sysfs files and directories >> +--- >> + >> +The TRBE devices appear on the existing coresight bus alongside the other >> +coresight devices:: >> + >> +>$ ls /sys/bus/coresight/devices >> +trbe0 trbe1 trbe2 trbe3 >> + >> +The ``trbe`` named TRBEs are associated with a CPU.:: >> + >> +>$ ls /sys/bus/coresight/devices/trbe0/ >> +align dbm >> + >> +*Key file items are:-* >> + * ``align``: TRBE write pointer alignment >> + * ``dbm``: TRBE updates memory with access and dirty flags >> + > > Please add documentation for these, the same way it was done for all the > other CS > components [1]. > > [1]. https://elixir.bootlin.com/linux/latest/source/Documentation/ABI/testing > (sysfs-bus-coresight-device-xyz) Sure, will add the following new sysfs doc file in this regard. Marked the KernelVersion as 5.12, will change if required. new file mode 100644 index 000..5cb090f --- /dev/null +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe @@ -0,0 +1,14 @@ +What:
Re: [PATCH V3 17/19] vdpa: set the virtqueue num during register
On 2021/2/5 下午11:27, Michael S. Tsirkin wrote: On Mon, Jan 04, 2021 at 02:55:01PM +0800, Jason Wang wrote: This patch delay the queue number setting to vDPA device registering. This allows us to probe the virtqueue numbers between device allocation and registering. Reviewed-by: Stefano Garzarella Signed-off-by: Jason Wang Conflicts with other patches in the vhost tree. Can you rebase please? Will do. Thanks --- drivers/vdpa/ifcvf/ifcvf_main.c | 5 ++--- drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 ++--- drivers/vdpa/vdpa.c | 8 drivers/vdpa/vdpa_sim/vdpa_sim.c | 4 ++-- include/linux/vdpa.h | 7 +++ 5 files changed, 13 insertions(+), 16 deletions(-) diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c index 8b4028556cb6..d65f3221d8ed 100644 --- a/drivers/vdpa/ifcvf/ifcvf_main.c +++ b/drivers/vdpa/ifcvf/ifcvf_main.c @@ -438,8 +438,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id) } adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa, - dev, _vdpa_ops, - IFCVF_MAX_QUEUE_PAIRS * 2); + dev, _vdpa_ops); if (adapter == NULL) { IFCVF_ERR(pdev, "Failed to allocate vDPA structure"); return -ENOMEM; @@ -463,7 +462,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id) for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) vf->vring[i].irq = -EINVAL; - ret = vdpa_register_device(>vdpa); + ret = vdpa_register_device(>vdpa, IFCVF_MAX_QUEUE_PAIRS * 2); if (ret) { IFCVF_ERR(pdev, "Failed to register ifcvf to vdpa bus"); goto err; diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index f1d54814db97..a1b9260bf04d 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -1958,8 +1958,7 @@ static int mlx5v_probe(struct auxiliary_device *adev, max_vqs = MLX5_CAP_DEV_VDPA_EMULATION(mdev, max_num_virtio_queues); max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS); - ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, _vdpa_ops, -2 
* mlx5_vdpa_max_qps(max_vqs)); + ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, _vdpa_ops); if (IS_ERR(ndev)) return PTR_ERR(ndev); @@ -1986,7 +1985,7 @@ static int mlx5v_probe(struct auxiliary_device *adev, if (err) goto err_res; - err = vdpa_register_device(>vdev); + err = vdpa_register_device(>vdev, 2 * mlx5_vdpa_max_qps(max_vqs)); if (err) goto err_reg; diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index a69ffc991e13..ba89238f9898 100644 --- a/drivers/vdpa/vdpa.c +++ b/drivers/vdpa/vdpa.c @@ -61,7 +61,6 @@ static void vdpa_release_dev(struct device *d) * initialized but before registered. * @parent: the parent device * @config: the bus operations that is supported by this device - * @nvqs: number of virtqueues supported by this device * @size: size of the parent structure that contains private data * * Driver should use vdpa_alloc_device() wrapper macro instead of @@ -72,7 +71,6 @@ static void vdpa_release_dev(struct device *d) */ struct vdpa_device *__vdpa_alloc_device(struct device *parent, const struct vdpa_config_ops *config, - int nvqs, size_t size) { struct vdpa_device *vdev; @@ -99,7 +97,6 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent, vdev->index = err; vdev->config = config; vdev->features_valid = false; - vdev->nvqs = nvqs; err = dev_set_name(>dev, "vdpa%u", vdev->index); if (err) @@ -122,11 +119,14 @@ EXPORT_SYMBOL_GPL(__vdpa_alloc_device); * vdpa_register_device - register a vDPA device * Callers must have a succeed call of vdpa_alloc_device() before. 
* @vdev: the vdpa device to be registered to vDPA bus + * @nvqs: number of virtqueues supported by this device * * Returns an error when fail to add to vDPA bus */ -int vdpa_register_device(struct vdpa_device *vdev) +int vdpa_register_device(struct vdpa_device *vdev, int nvqs) { + vdev->nvqs = nvqs; + return device_add(>dev); } EXPORT_SYMBOL_GPL(vdpa_register_device); diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c index 6a90fdb9cbfc..b129cb4dd013 100644 --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c @@ -357,7 +357,7 @@ static struct vdpasim *vdpasim_create(void) else ops = _net_config_ops; - vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
Re: [PATCH V3 18/19] virtio_vdpa: don't warn when fail to disable vq
On 2021/2/5 11:24 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 04, 2021 at 02:55:02PM +0800, Jason Wang wrote:
>> There's no guarantee that the device can disable a specific virtqueue
>> through set_vq_ready(). One example is the modern virtio-pci
>> device. So this patch removes the warning.
>>
>> Signed-off-by: Jason Wang
>
> Do we need the read as a kind of flush though?

The problem is that PCI forbids writing 0 to queue_enable, so I'm not sure what kind of flush we need here.

Thanks

>> ---
>>  drivers/virtio/virtio_vdpa.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
>> index 4a9ddb44b2a7..e28acf482e0c 100644
>> --- a/drivers/virtio/virtio_vdpa.c
>> +++ b/drivers/virtio/virtio_vdpa.c
>> @@ -225,9 +225,8 @@ static void virtio_vdpa_del_vq(struct virtqueue *vq)
>>  	list_del(&info->node);
>>  	spin_unlock_irqrestore(&vd_dev->lock, flags);
>>
>> -	/* Select and deactivate the queue */
>> +	/* Select and deactivate the queue (best effort) */
>>  	ops->set_vq_ready(vdpa, index, 0);
>> -	WARN_ON(ops->get_vq_ready(vdpa, index));
>>
>>  	vring_del_virtqueue(vq);
>>
>> --
>> 2.25.1
[PATCH RESEND] rsi: remove redundant assignment
From: wengjianfeng

INVALID_QUEUE has been used as a return value, so it is not necessary to
assign it to q_num first; just return INVALID_QUEUE.

Signed-off-by: wengjianfeng
---
 drivers/net/wireless/rsi/rsi_91x_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/rsi/rsi_91x_core.c b/drivers/net/wireless/rsi/rsi_91x_core.c
index 2d49c5b..a48e616 100644
--- a/drivers/net/wireless/rsi/rsi_91x_core.c
+++ b/drivers/net/wireless/rsi/rsi_91x_core.c
@@ -193,8 +193,7 @@ static u8 rsi_core_determine_hal_queue(struct rsi_common *common)
 		if (recontend_queue)
 			goto get_queue_num;

-		q_num = INVALID_QUEUE;
-		return q_num;
+		return INVALID_QUEUE;
 	}

 	common->selected_qnum = q_num;
--
1.9.1
Re: [PATCH v3 09/13] vhost/vdpa: remove vhost_vdpa_config_validate()
On 2021/2/5 下午10:17, Stefano Garzarella wrote: On Fri, Feb 05, 2021 at 08:32:37AM -0500, Michael S. Tsirkin wrote: On Fri, Feb 05, 2021 at 10:16:51AM +0100, Stefano Garzarella wrote: On Fri, Feb 05, 2021 at 11:27:32AM +0800, Jason Wang wrote: > > On 2021/2/5 上午1:22, Stefano Garzarella wrote: > > get_config() and set_config() callbacks in the 'struct vdpa_config_ops' > > usually already validated the inputs. Also now they can return an error, > > so we don't need to validate them here anymore. > > > > Let's use the return value of these callbacks and return it in case of > > error in vhost_vdpa_get_config() and vhost_vdpa_set_config(). > > > > Originally-by: Xie Yongji > > Signed-off-by: Stefano Garzarella > > --- > > drivers/vhost/vdpa.c | 41 + > > 1 file changed, 13 insertions(+), 28 deletions(-) > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > > index ef688c8c0e0e..d61e779000a8 100644 > > --- a/drivers/vhost/vdpa.c > > +++ b/drivers/vhost/vdpa.c > > @@ -185,51 +185,35 @@ static long vhost_vdpa_set_status(struct vhost_vdpa *v, u8 __user *statusp) > > return 0; > > } > > -static int vhost_vdpa_config_validate(struct vhost_vdpa *v, > > - struct vhost_vdpa_config *c) > > -{ > > - long size = 0; > > - > > - switch (v->virtio_id) { > > - case VIRTIO_ID_NET: > > - size = sizeof(struct virtio_net_config); > > - break; > > - } > > - > > - if (c->len == 0) > > - return -EINVAL; > > - > > - if (c->len > size - c->off) > > - return -E2BIG; > > - > > - return 0; > > -} > > - > > static long vhost_vdpa_get_config(struct vhost_vdpa *v, > > struct vhost_vdpa_config __user *c) > > { > > struct vdpa_device *vdpa = v->vdpa; > > struct vhost_vdpa_config config; > > unsigned long size = offsetof(struct vhost_vdpa_config, buf); > > + long ret; > > u8 *buf; > > if (copy_from_user(, c, size)) > > return -EFAULT; > > - if (vhost_vdpa_config_validate(v, )) > > + if (config.len == 0) > > return -EINVAL; > > buf = kvzalloc(config.len, GFP_KERNEL); > > > Then it means 
userspace can allocate a very large amount of memory.

Good point.

> Rethinking this, we should limit the size here (e.g. PAGE_SIZE) or
> fetch the config size first (either through a config op as you
> suggested or a variable in the vdpa device that is initialized during
> device creation).

Maybe PAGE_SIZE is okay as a limit. If instead we want to fetch the config size, then a config op is better in my opinion, to avoid adding a new parameter to __vdpa_alloc_device(). I vote for PAGE_SIZE, but it isn't a strong opinion. What do you and @Michael suggest?

Thanks,
Stefano

Devices know what the config size is. Just have them provide it.

Okay, I'll add a get_config_size() callback to vdpa_config_ops and keep vhost_vdpa_config_validate(), which will use that callback instead of 'virtio_id' to get the config size from the device. At this point I think I can remove the "vdpa: add return value to get_config/set_config callbacks" patch and leave the get_config/set_config callbacks returning void. Does this make sense?

Thanks,
Stefano

Yes I think so.

Thanks
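To make the outcome of this thread concrete, here is a minimal sketch of a length and offset check against a device-provided config size. It is an assumption-laden illustration, not the vhost-vdpa code: `config_validate()` is a hypothetical helper, and `config_size` stands for whatever a callback such as the proposed get_config_size() would report. The checks are ordered so the subtraction cannot underflow, and a request can never ask for more than the device's config space.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate a user-supplied (off, len) window against
 * the config size reported by the device.
 */
static int config_validate(uint32_t off, uint32_t len, size_t config_size)
{
	if (len == 0)
		return -EINVAL;

	/* Check 'off' first so 'config_size - off' cannot underflow. */
	if (off > config_size || len > config_size - off)
		return -E2BIG;

	return 0;
}
```

With a check like this in front of the allocation, the buffer size is bounded by the device's config size rather than by a value that userspace picks.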
Re: [PATCH v16 11/12] powerpc: Use OF alloc and free for FDT
Rob Herring writes: > On Thu, Feb 4, 2021 at 10:42 AM Lakshmi Ramasubramanian > wrote: ... >> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c >> index d0e459bb2f05..51d2d8eb6c1b 100644 >> --- a/arch/powerpc/kexec/elf_64.c >> +++ b/arch/powerpc/kexec/elf_64.c >> @@ -19,6 +19,7 @@ >> #include >> #include >> #include >> +#include >> #include >> #include >> #include >> @@ -32,7 +33,7 @@ static void *elf64_load(struct kimage *image, char >> *kernel_buf, >> unsigned int fdt_size; >> unsigned long kernel_load_addr; >> unsigned long initrd_load_addr = 0, fdt_load_addr; >> - void *fdt; >> + void *fdt = NULL; >> const void *slave_code; >> struct elfhdr ehdr; >> char *modified_cmdline = NULL; >> @@ -103,18 +104,12 @@ static void *elf64_load(struct kimage *image, char >> *kernel_buf, >> } >> >> fdt_size = fdt_totalsize(initial_boot_params) * 2; >> - fdt = kmalloc(fdt_size, GFP_KERNEL); >> + fdt = of_alloc_and_init_fdt(fdt_size); >> if (!fdt) { >> pr_err("Not enough memory for the device tree.\n"); >> ret = -ENOMEM; >> goto out; >> } >> - ret = fdt_open_into(initial_boot_params, fdt, fdt_size); >> - if (ret < 0) { >> - pr_err("Error setting up the new device tree.\n"); >> - ret = -EINVAL; >> - goto out; >> - } >> >> ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr, > > The first thing this function does is call setup_new_fdt() which first > calls of_kexec_setup_new_fdt(). (Note, I really don't understand the > PPC code split. It looks like there's a 32-bit and 64-bit split, but > 32-bit looks broken to me. Nothing ever calls setup_new_fdt() except > setup_new_fdt_ppc64()). I think that's because 32-bit doesn't support kexec_file_load(). cheers
Re: [PATCH v3 1/7] seqnum_ops: Introduce Sequence Number Ops
Hi-- Comments are inline. On 2/3/21 10:11 AM, Shuah Khan wrote: > Sequence Number api provides interfaces for unsigned atomic up counters. > > There are a number of atomic_t usages in the kernel where atomic_t api > is used for counting sequence numbers and other statistical counters. > Several of these usages, convert atomic_read() and atomic_inc_return() > return values to unsigned. Introducing sequence number ops supports > these use-cases with a standard core-api. > > Sequence Number ops provide interfaces to initialize, increment and get > the sequence number. These ops also check for overflow and log message to > indicate when overflow occurs. > > Signed-off-by: Shuah Khan > --- > Documentation/core-api/index.rst | 1 + > Documentation/core-api/seqnum_ops.rst | 53 ++ > MAINTAINERS | 7 ++ > include/linux/seqnum_ops.h| 129 + > lib/Kconfig | 9 ++ > lib/Makefile | 1 + > lib/test_seqnum_ops.c | 133 ++ > 7 files changed, 333 insertions(+) > create mode 100644 Documentation/core-api/seqnum_ops.rst > create mode 100644 include/linux/seqnum_ops.h > create mode 100644 lib/test_seqnum_ops.c > diff --git a/Documentation/core-api/seqnum_ops.rst > b/Documentation/core-api/seqnum_ops.rst > new file mode 100644 > index ..ed4eba394799 > --- /dev/null > +++ b/Documentation/core-api/seqnum_ops.rst > @@ -0,0 +1,53 @@ > +.. SPDX-License-Identifier: GPL-2.0 > + > +.. include:: > + > +.. _seqnum_ops: > + > +== > +Sequence Number Operations > +== > + > +:Author: Shuah Khan > +:Copyright: |copy| 2021, The Linux Foundation > +:Copyright: |copy| 2021, Shuah Khan > + > +Sequence Number api provides interfaces for unsigned up counters. API > + > +Sequence Number Ops > +=== > + > +seqnum32 and seqnum64 types support implementing unsigned up counters. :: > + > +struct seqnum32 { u32 seqnum; }; > +struct seqnum64 { u64 seqnum; }; > + > +Initializers > + > + > +Interfaces for initializing sequence numbers. 
:: > + > +#define SEQNUM_INIT(i){ .seqnum = i } > +seqnum32_init(seqnum, val) > +seqnum64_init(seqnum, val) > + > +Increment interface > +--- > + > +Increments sequence number and returns the new value. Checks for overflow > +conditions and logs message when overflow occurs. This check is intended > +to help catch cases where overflow could lead to problems. :: > + > +seqnum32_inc(seqnum): Calls atomic_inc_return(seqnum). > +seqnum64_inc(seqnum): Calls atomic64_inc_return(seqnum). > + > +Return/get value interface > +-- > + > +Returns sequence number value. :: > + > +seqnum32_get() - return seqnum value. > +seqnum64_get() - return seqnum value. > + > +.. warning:: > +seqnum32 wraps around to INT_MIN when it overflows. > diff --git a/MAINTAINERS b/MAINTAINERS > index cc1e6a5ee6e6..f9fe1438a8cd 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -16235,6 +16235,13 @@ S: Maintained > F: Documentation/fb/sm712fb.rst > F: drivers/video/fbdev/sm712* > > +SEQNUM OPS > +M: Shuah Khan > +L: linux-kernel@vger.kernel.org > +S: Maintained > +F: include/linux/seqnum_ops.h > +F: lib/test_seqnum_ops.c > + > SIMPLE FIRMWARE INTERFACE (SFI) > S: Obsolete > W: http://simplefirmware.org/ > diff --git a/include/linux/seqnum_ops.h b/include/linux/seqnum_ops.h > new file mode 100644 > index ..e8d8481445d3 > --- /dev/null > +++ b/include/linux/seqnum_ops.h > @@ -0,0 +1,129 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * seqnum_ops.h - Interfaces for unsigned atomic sequential up counters. > + * > + * Copyright (c) 2021 Shuah Khan > + * Copyright (c) 2021 The Linux Foundation > + * > + * Sequence Number functions provide support for unsgined atomic up unsigned > + * counters. > + * > + * The interface provides: > + * seqnumu32 & seqnumu64 functions: > + * initialization > + * increment and return > + * > + * seqnumu32 and seqnumu64 functions leverage/use atomic*_t ops to > + * implement support for unsigned atomic up counters. 
> + * > + * Reference and API guide: > + * Documentation/core-api/seqnum_ops.rst for more information. > + */ > + > +#ifndef __LINUX_SEQNUM_OPS_H > +#define __LINUX_SEQNUM_OPS_H > + > +#include > + > +/** > + * struct seqnum32 - Sequence number atomic counter > + * @seqnum: atomic_t > + * > + **/ > +struct seqnum32 { > + u32 seqnum; > +}; > + > +#define SEQNUM_INIT(i) { .seqnum = i } > + > +/* > + * seqnum32_init() - initialize seqnum value > + * @seq: struct seqnum32 pointer > + * > + */ > +static inline void seqnum32_init(struct seqnum32 *seq, u32 val) > +{ > + seq->seqnum =
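For readers who want to experiment with the idea outside the kernel, the init/increment/get interface described in the quoted documentation can be approximated with C11 atomics. This is only a userspace sketch under stated assumptions: `seq32`, `seq32_init()`, `seq32_inc()` and `seq32_get()` are hypothetical names, and the code does not reproduce the patch's implementation or its overflow logging. Because the counter in this sketch is an unsigned 32-bit atomic, it wraps to 0 after UINT32_MAX.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace analogue of an unsigned atomic up counter. */
struct seq32 {
	_Atomic uint32_t value;
};

static void seq32_init(struct seq32 *s, uint32_t v)
{
	atomic_init(&s->value, v);
}

/* Increment and return the new value; unsigned arithmetic wraps to 0. */
static uint32_t seq32_inc(struct seq32 *s)
{
	uint32_t old = atomic_fetch_add(&s->value, 1u);

	return (uint32_t)(old + 1u);
}

static uint32_t seq32_get(struct seq32 *s)
{
	return atomic_load(&s->value);
}
```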
[RFC PATCH v1] MIPS: tlbex: Avoid access invalid address when pmd is modifying
From: wangrui

When a pmd is modified through THP, an invalid address access may occur in
the TLB handler. The TLB handler loads the value of the pmd twice: once for
the huge page test and once to load the pte. These two values may differ:

  CPU 0 (app)                          CPU 1 (khugepaged)
                                       1: scan hit: set pmd to
                                          invalid_pmd_table (pmd_clear)
  2: tlb invalid: handle_tlbl loads
     pmd for huge page testing; it is
     not a huge page
                                       3: collapsed: set pmd to huge page
  4: handle_tlbl loads pmd again to
     load the pte (as base address);
     the value of pmd is not an
     address, so an invalid address
     is accessed!

This patch avoids the inconsistency of the two memory loads by reusing the
result of one load.

Signed-off-by: hev
Signed-off-by: wangrui
---
 arch/mips/mm/tlbex.c | 28
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index a7521b8f7658..66ca219b4457 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -720,14 +720,14 @@ static void build_huge_tlb_write_entry(u32 **p, struct uasm_label **l,
  * Check if Huge PTE is present, if so then jump to LABEL.
*/ static void -build_is_huge_pte(u32 **p, struct uasm_reloc **r, unsigned int tmp, - unsigned int pmd, int lid) +build_is_huge_pte(u32 **p, struct uasm_reloc **r, unsigned int out, + unsigned int tmp, unsigned int pmd, int lid) { - UASM_i_LW(p, tmp, 0, pmd); + UASM_i_LW(p, out, 0, pmd); if (use_bbit_insns()) { - uasm_il_bbit1(p, r, tmp, ilog2(_PAGE_HUGE), lid); + uasm_il_bbit1(p, r, out, ilog2(_PAGE_HUGE), lid); } else { - uasm_i_andi(p, tmp, tmp, _PAGE_HUGE); + uasm_i_andi(p, tmp, out, _PAGE_HUGE); uasm_il_bnez(p, r, tmp, lid); } } @@ -1103,7 +1103,6 @@ EXPORT_SYMBOL_GPL(build_update_entries); struct mips_huge_tlb_info { int huge_pte; int restore_scratch; - bool need_reload_pte; }; static struct mips_huge_tlb_info @@ -1118,7 +1117,6 @@ build_fast_tlb_refill_handler (u32 **p, struct uasm_label **l, rv.huge_pte = scratch; rv.restore_scratch = 0; - rv.need_reload_pte = false; if (check_for_high_segbits) { UASM_i_MFC0(p, tmp, C0_BADVADDR); @@ -1323,7 +1321,6 @@ static void build_r4000_tlb_refill_handler(void) } else { htlb_info.huge_pte = K0; htlb_info.restore_scratch = 0; - htlb_info.need_reload_pte = true; vmalloc_mode = refill_noscratch; /* * create the plain linear handler @@ -1349,19 +1346,19 @@ static void build_r4000_tlb_refill_handler(void) #endif #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT - build_is_huge_pte(, , K0, K1, label_tlb_huge_update); + build_is_huge_pte(, , K0, K1, K1, label_tlb_huge_update); #endif - build_get_ptep(, K0, K1); - build_update_entries(, K0, K1); + GET_CONTEXT(, K1); /* get context reg */ + build_adjust_context(, K1); + UASM_i_ADDU(, K0, K0, K1); /* add in offset */ + build_update_entries(, K1, K0); build_tlb_write_entry(, , , tlb_random); uasm_l_leave(, p); uasm_i_eret(); /* return from trap */ } #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT uasm_l_tlb_huge_update(, p); - if (htlb_info.need_reload_pte) - UASM_i_LW(, htlb_info.huge_pte, 0, K1); build_huge_update_entries(, htlb_info.huge_pte, K1); build_huge_tlb_write_entry(, , , K0, tlb_random, 
htlb_info.restore_scratch); @@ -2065,14 +2062,13 @@ build_r4000_tlbchange_handler_head(u32 **p, struct uasm_label **l, * instead contains the tlb pte. Check the PAGE_HUGE bit and * see if we need to jump to huge tlb processing. */ - build_is_huge_pte(p, r, wr.r1, wr.r2, label_tlb_huge_update); + build_is_huge_pte(p, r, wr.r3, wr.r1, wr.r2, label_tlb_huge_update); #endif UASM_i_MFC0(p, wr.r1, C0_BADVADDR); - UASM_i_LW(p, wr.r2, 0, wr.r2); UASM_i_SRL(p, wr.r1, wr.r1, PAGE_SHIFT + PTE_ORDER - PTE_T_LOG2); uasm_i_andi(p, wr.r1, wr.r1, (PTRS_PER_PTE - 1) << PTE_T_LOG2); - UASM_i_ADDU(p, wr.r2, wr.r2, wr.r1); + UASM_i_ADDU(p, wr.r2, wr.r3, wr.r1); #ifdef CONFIG_SMP uasm_l_smp_pgtable_change(l, *p); -- 2.30.0
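The idea behind this patch, expressed outside of generated MIPS assembly, is the usual "load once, then decide and use the same snapshot" pattern. The following C sketch is only an analogy under stated assumptions: `TOY_HUGE_BIT`, `toy_lookup` and `toy_walk()` are hypothetical names, the flag bit is not the real _PAGE_HUGE value, and the test is single-threaded, so it shows the structure of the fix rather than the race itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_HUGE_BIT ((uintptr_t)1) /* hypothetical flag bit, not _PAGE_HUGE */

struct toy_lookup {
	bool huge;	/* result of the huge page test */
	uintptr_t value;	/* the very same word the test was made on */
};

/*
 * Read the entry exactly once. Both the huge page test and the later use
 * (huge pte or base address of the pte table) come from one snapshot, so a
 * concurrent writer cannot make the two disagree.
 */
static struct toy_lookup toy_walk(_Atomic uintptr_t *pmd)
{
	uintptr_t snap = atomic_load_explicit(pmd, memory_order_relaxed);
	struct toy_lookup r;

	r.huge = (snap & TOY_HUGE_BIT) != 0;
	r.value = snap;
	return r;
}
```

A version that called the load a second time after the test would reintroduce the window shown in the CPU 0 / CPU 1 sequence of the commit message.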
Re: [PATCH RFC 3/7] kvm: x86: XSAVE state and XFD MSRs context switch
On 2/7/2021 7:49 PM, Borislav Petkov wrote: On Sun, Feb 07, 2021 at 10:42:52AM -0500, Jing Liu wrote: diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c index 7e0c68043ce3..fbb761fc13ec 100644 --- a/arch/x86/kernel/fpu/init.c +++ b/arch/x86/kernel/fpu/init.c @@ -145,6 +145,7 @@ EXPORT_SYMBOL_GPL(fpu_kernel_xstate_min_size); * can be dynamically expanded to include some states up to this size. */ unsigned int fpu_kernel_xstate_max_size; +EXPORT_SYMBOL_GPL(fpu_kernel_xstate_max_size); /* Get alignment of the TYPE. */ #define TYPE_ALIGN(TYPE) offsetof(struct { char x; TYPE test; }, test) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 080f3be9a5e6..9c471a0364e2 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -77,12 +77,14 @@ static struct xfeature_capflag_info xfeature_capflags[] __initdata = { * XSAVE buffer, both supervisor and user xstates. */ u64 xfeatures_mask_all __read_mostly; +EXPORT_SYMBOL_GPL(xfeatures_mask_all); /* * This represents user xstates, a subset of xfeatures_mask_all, saved in a * dynamic kernel XSAVE buffer. */ u64 xfeatures_mask_user_dynamic __read_mostly; +EXPORT_SYMBOL_GPL(xfeatures_mask_user_dynamic); static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1}; static unsigned int xstate_sizes[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1}; Make sure you Cc x...@kernel.org when touching code outside of kvm. There's this script called scripts/get_maintainer.pl which will tell you who to Cc. Use it before you send next time please. Thx. Thank you for the reminder. I'll cc that next time. BRs, Jing
[RFC v4 PATCH] usb: xhci-mtk: improve bandwidth scheduling with TT
When the USB headset is plug into an external hub, sometimes can't set config due to not enough bandwidth, so need improve LS/FS INT/ISOC bandwidth scheduling with TT. Fixes: 08e469de87a2 ("usb: xhci-mtk: supports bandwidth scheduling with multi-TT") Signed-off-by: Yaqii Wu Signed-off-by: Chunfeng Yun --- drivers/usb/host/xhci-mtk-sch.c | 270 +++- drivers/usb/host/xhci-mtk.h | 8 +- 2 files changed, 201 insertions(+), 77 deletions(-) diff --git a/drivers/usb/host/xhci-mtk-sch.c b/drivers/usb/host/xhci-mtk-sch.c index b45e5bf08997..f3cdfcf4e5bf 100644 --- a/drivers/usb/host/xhci-mtk-sch.c +++ b/drivers/usb/host/xhci-mtk-sch.c @@ -32,6 +32,35 @@ #define EP_BOFFSET(p) ((p) & 0x3fff) #define EP_BREPEAT(p) (((p) & 0x7fff) << 16) +enum mtk_sch_err_type { + SCH_SUCCESS = 0, + SCH_ERR_Y6, + SCH_SS_OVERLAP, + SCH_CS_OVERFLOW, + SCH_BW_OVERFLOW, + SCH_FIXME, +}; + +static char *sch_error_string(enum mtk_sch_err_type error) +{ + switch (error) { + case SCH_SUCCESS: + return "Success"; + case SCH_ERR_Y6: + return "Can't schedule Start-Split in Y6"; + case SCH_SS_OVERLAP: + return "Can't find a suitable Start-Split location"; + case SCH_CS_OVERFLOW: + return "The last Complete-Split is greater than 7"; + case SCH_BW_OVERFLOW: + return "Bandwidth exceeds the max limit"; + case SCH_FIXME: + return "FIXME, to be resolved"; + default: + return "Unknown error type"; + } +} + static int is_fs_or_ls(enum usb_device_speed speed) { return speed == USB_SPEED_FULL || speed == USB_SPEED_LOW; @@ -81,11 +110,22 @@ static u32 get_esit(struct xhci_ep_ctx *ep_ctx) return esit; } +static u32 get_bw_boundary(enum usb_device_speed speed) +{ + switch (speed) { + case USB_SPEED_SUPER_PLUS: + return SSP_BW_BOUNDARY; + case USB_SPEED_SUPER: + return SS_BW_BOUNDARY; + default: + return HS_BW_BOUNDARY; + } +} + static struct mu3h_sch_tt *find_tt(struct usb_device *udev) { struct usb_tt *utt = udev->tt; struct mu3h_sch_tt *tt, **tt_index, **ptt; - unsigned int port; bool allocated_index = false; if (!utt) 
@@ -107,10 +147,9 @@ static struct mu3h_sch_tt *find_tt(struct usb_device *udev) utt->hcpriv = tt_index; allocated_index = true; } - port = udev->ttport - 1; - ptt = _index[port]; + + ptt = _index[udev->ttport - 1]; } else { - port = 0; ptt = (struct mu3h_sch_tt **) >hcpriv; } @@ -125,8 +164,7 @@ static struct mu3h_sch_tt *find_tt(struct usb_device *udev) return ERR_PTR(-ENOMEM); } INIT_LIST_HEAD(>ep_list); - tt->usb_tt = utt; - tt->tt_port = port; + *ptt = tt; } @@ -206,6 +244,15 @@ static struct mu3h_sch_ep_info *create_sch_ep(struct usb_device *udev, return sch_ep; } +static void delete_sch_ep(struct usb_device *udev, struct mu3h_sch_ep_info *sch_ep) +{ + if (sch_ep->sch_tt) + drop_tt(udev); + + list_del(_ep->endpoint); + kfree(sch_ep); +} + static void setup_sch_info(struct usb_device *udev, struct xhci_ep_ctx *ep_ctx, struct mu3h_sch_ep_info *sch_ep) { @@ -375,21 +422,55 @@ static void update_bus_bw(struct mu3h_sch_bw_info *sch_bw, sch_ep->bw_budget_table[j]; } } - sch_ep->allocated = used; } -static int check_sch_tt(struct usb_device *udev, - struct mu3h_sch_ep_info *sch_ep, u32 offset) +static int check_fs_bus_bw(struct mu3h_sch_ep_info *sch_ep, int offset) +{ + struct mu3h_sch_tt *tt = sch_ep->sch_tt; + u32 num_esit, base; + u32 i, j; + u32 tmp; + + num_esit = XHCI_MTK_MAX_ESIT / sch_ep->esit; + + for (i = 0; i < num_esit; i++) { + base = offset + i * sch_ep->esit; + + /* +* Compared with hs bus, no matter what ep type +* The hub will always delay one uframe to send +* data for us. As described in the figure below. +*/ + if (sch_ep->ep_type == ISOC_OUT_EP) { + for (j = 0; j < sch_ep->num_budget_microframes; j++) { + tmp = tt->fs_bus_bw[base + 1 + j] + + sch_ep->bw_cost_per_microframe; + + if (tmp > FS_PAYLOAD_MAX) + return SCH_BW_OVERFLOW; + } + } else { + for (j = 0; j < sch_ep->cs_count; j++) { +
Re: [PATCH] mm/memtest: Add ARCH_USE_MEMTEST
On 2/5/21 2:50 PM, Vladimir Murzin wrote:
> Hi Anshuman,
>
> On 2/5/21 4:10 AM, Anshuman Khandual wrote:
>> early_memtest() does not get called from all architectures. Hence enabling
>> CONFIG_MEMTEST and providing a valid memtest=[1..N] kernel command line
>> option might not trigger the memory pattern tests as would be expected in
>> normal circumstances. This situation is misleading.
>
> Documentation already mentions which architectures support that:
>
> memtest=[KNL,X86,ARM,PPC] Enable memtest
>
> yet I admit that not all reflected there

But there is nothing that prevents CONFIG_MEMTEST from being set on other
platforms, where it has no effect, which is not optimal.

>
>>
>> The change here prevents the above mentioned problem after introducing a
>> new config option ARCH_USE_MEMTEST that should be subscribed on platforms
>> that call early_memtest(), in order to enable the config CONFIG_MEMTEST.
>> Conversely CONFIG_MEMTEST cannot be enabled on platforms where it would
>> not be tested anyway.
>>
>
> Is that generic pattern? What about other cross arch parameters? Do they already
> use similar subscription or they rely on documentation?

Depending solely on the documentation should not be sufficient.

>
> I'm not against the patch just want to check if things are consistent...

Not sure about other similar situations, but those, if present, should get
fixed as well.
Re: [PATCH 1/4] mm/highmem: Lift memcpy_[to|from]_page to core
On Sun, Feb 07, 2021 at 01:46:47AM +, Chaitanya Kulkarni wrote:
> On 2/5/21 18:35, ira.we...@intel.com wrote:
> > +static inline void memmove_page(struct page *dst_page, size_t dst_off,
> > +			       struct page *src_page, size_t src_off,
> > +			       size_t len)
> > +{
> > +	char *dst = kmap_local_page(dst_page);
> > +	char *src = kmap_local_page(src_page);
> > +
> > +	BUG_ON(dst_off + len > PAGE_SIZE || src_off + len > PAGE_SIZE);
> > +	memmove(dst + dst_off, src + src_off, len);
> > +	kunmap_local(src);
> > +	kunmap_local(dst);
> > +}
> > +
> > +static inline void memcpy_from_page(char *to, struct page *page, size_t offset, size_t len)
> How about following ?
> static inline void memcpy_from_page(char *to, struct page *page, size_t offset,
>                                     size_t len)

It is an easy change and it is up to Andrew. But I thought we were making the
line length limit longer now.

Ira

> > +{
> > +	char *from = kmap_local_page(page);
> > +
> > +	BUG_ON(offset + len > PAGE_SIZE);
> > +	memcpy(to, from + offset, len);
> > +	kunmap_local(from);
> > +}
> > +
> > +static inline void memcpy_to_page(struct page *page, size_t offset, const char *from, size_t len)
> How about following ?
> static inline void memcpy_to_page(struct page *page, size_t offset,
>                                   const char *from, size_t len)
> > +{
> > +	char *to = kmap_local_page(page);
> > +
> > +	BUG_ON(offset + len > PAGE_SIZE);
> > +	memcpy(to + offset, from, len);
> > +	kunmap_local(to);
> > +}
> > +
> > +static inline void memset_page(struct page *page, size_t offset, int val, size_t len)
> How about following ?
> static inline void memset_page(struct page *page, size_t offset, int val,
>                                size_t len)
> > +{
> > +	char *addr = kmap_local_page(page);
> > +
> > +	BUG_ON(offset + len > PAGE_SIZE);
> > +	memset(addr + offset, val, len);
> > +	kunmap_local(addr);
> > +}
> > +
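
For readers following the quoted helpers, here is a minimal userspace sketch of the same pattern: check that the range stays inside one page, then copy. It is not the kernel code. struct sketch_page, SKETCH_PAGE_SIZE and the sketch_* functions are hypothetical names made up for this illustration, assert() stands in for BUG_ON(), and there is no kmap_local_page()/kunmap_local() step because ordinary memory needs no mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical stand-in for struct page: one page of ordinary memory. */
struct sketch_page {
	char data[SKETCH_PAGE_SIZE];
};

/* True if [offset, offset + len) stays inside one page; cannot overflow. */
static bool sketch_range_ok(size_t offset, size_t len)
{
	return offset <= SKETCH_PAGE_SIZE && len <= SKETCH_PAGE_SIZE - offset;
}

static void sketch_memcpy_from_page(char *to, const struct sketch_page *page,
				    size_t offset, size_t len)
{
	assert(sketch_range_ok(offset, len));	/* stands in for BUG_ON() */
	memcpy(to, page->data + offset, len);
}

static void sketch_memcpy_to_page(struct sketch_page *page, size_t offset,
				  const char *from, size_t len)
{
	assert(sketch_range_ok(offset, len));	/* stands in for BUG_ON() */
	memcpy(page->data + offset, from, len);
}

static void sketch_memset_page(struct sketch_page *page, size_t offset,
			       int val, size_t len)
{
	assert(sketch_range_ok(offset, len));	/* stands in for BUG_ON() */
	memset(page->data + offset, val, len);
}
```

The range check is written as two comparisons rather than offset + len so that the sketch itself cannot wrap around; that is a choice made for this illustration only.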
[PATCH] cpufreq: schedutil: Don't use the limits_changed flag any more
From: Yue Hu

The limits_changed flag was introduced by commit 600f5badb78c ("cpufreq:
schedutil: Don't skip freq update when limits change") because of a race
condition in which need_freq_update was cleared in get_next_freq(), which
made reducing the CPU frequency ineffective while busy.

But now the race condition above is gone, because get_next_freq() doesn't
clear the flag any more after commit 23a881852f3e ("cpufreq: schedutil:
Don't skip freq update if need_freq_update is set").

Moreover, need_freq_update is currently set to true only in
sugov_should_update_freq() if CPUFREQ_NEED_UPDATE_LIMITS is not set for
the driver. However, limits may change at any time, and the subsequent
frequency update depends on need_freq_update, so we may skip this update.

Hence, let's remove limits_changed to avoid the issue above and make the
code simpler.

Signed-off-by: Yue Hu
---
 kernel/sched/cpufreq_schedutil.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 41e498b..7dd85fb 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -40,7 +40,6 @@ struct sugov_policy {
 	struct task_struct	*thread;
 	bool			work_in_progress;
 
-	bool			limits_changed;
 	bool			need_freq_update;
 };
 
@@ -89,11 +88,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
 	if (!cpufreq_this_cpu_can_update(sg_policy->policy))
 		return false;
 
-	if (unlikely(sg_policy->limits_changed)) {
-		sg_policy->limits_changed = false;
-		sg_policy->need_freq_update = true;
+	if (unlikely(sg_policy->need_freq_update))
 		return true;
-	}
 
 	delta_ns = time - sg_policy->last_freq_update_time;
 
@@ -323,7 +319,7 @@ static bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu)
 static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu, struct sugov_policy *sg_policy)
 {
 	if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_dl)
-		sg_policy->limits_changed = true;
+		sg_policy->need_freq_update = true;
 }
 
 static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
@@ -759,7 +755,6 @@ static int sugov_start(struct cpufreq_policy *policy)
 	sg_policy->last_freq_update_time	= 0;
 	sg_policy->next_freq			= 0;
 	sg_policy->work_in_progress		= false;
-	sg_policy->limits_changed		= false;
 	sg_policy->cached_raw_freq		= 0;
 
 	sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
@@ -813,7 +808,7 @@ static void sugov_limits(struct cpufreq_policy *policy)
 		mutex_unlock(&sg_policy->work_lock);
 	}
 
-	sg_policy->limits_changed = true;
+	sg_policy->need_freq_update = true;
 }
 
 struct cpufreq_governor schedutil_gov = {
-- 
1.9.1
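
The behaviour the patch aims for can be sketched in userspace as follows. This is only a model under stated assumptions, not the governor code: struct sketch_policy and the sketch_* functions are hypothetical names, the cpufreq_this_cpu_can_update() check is left out, and the only point shown is that a single need_freq_update flag, set from the limits path, bypasses the rate limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced model of the fields the patch touches. */
struct sketch_policy {
	uint64_t last_freq_update_time;	/* ns */
	int64_t freq_update_delay_ns;	/* rate limit between updates */
	bool need_freq_update;		/* set when limits change */
};

/* Models the check after the patch: one flag, then the rate limit. */
static bool sketch_should_update_freq(const struct sketch_policy *p,
				      uint64_t time)
{
	int64_t delta_ns;

	assert(p != NULL);

	if (p->need_freq_update)
		return true;

	delta_ns = (int64_t)(time - p->last_freq_update_time);
	return delta_ns >= p->freq_update_delay_ns;
}

/* Models the limits callback: it now sets need_freq_update directly. */
static void sketch_limits(struct sketch_policy *p)
{
	assert(p != NULL);
	p->need_freq_update = true;
}
```

With a 500 ns delay and the last update at t=1000, a request at t=1200 is rate limited until sketch_limits() has been called, after which it goes through.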
Re: [PATCH] mm/memtest: Add ARCH_USE_MEMTEST
On 2/5/21 1:05 PM, Max Filippov wrote:
> On Thu, Feb 4, 2021 at 8:10 PM Anshuman Khandual
> wrote:
>>
>> early_memtest() does not get called from all architectures. Hence enabling
>> CONFIG_MEMTEST and providing a valid memtest=[1..N] kernel command line
>> option might not trigger the memory pattern tests as would be expected in
>> normal circumstances. This situation is misleading.
>>
>> The change here prevents the above mentioned problem after introducing a
>> new config option ARCH_USE_MEMTEST that should be subscribed on platforms
>> that call early_memtest(), in order to enable the config CONFIG_MEMTEST.
>> Conversely CONFIG_MEMTEST cannot be enabled on platforms where it would
>> not be tested anyway.
>>
>> Cc: Russell King
>> Cc: Catalin Marinas
>> Cc: Will Deacon
>> Cc: Thomas Bogendoerfer
>> Cc: Michael Ellerman
>> Cc: Benjamin Herrenschmidt
>> Cc: Paul Mackerras
>> Cc: Thomas Gleixner
>> Cc: Ingo Molnar
>> Cc: Chris Zankel
>> Cc: Max Filippov
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: linux-m...@vger.kernel.org
>> Cc: linuxppc-...@lists.ozlabs.org
>> Cc: linux-xte...@linux-xtensa.org
>> Cc: linux...@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual
>> ---
>> This patch applies on v5.11-rc6 and has been tested on arm64 platform. But
>> it has been just build tested on all other platforms.
>>
>>  arch/arm/Kconfig     | 1 +
>>  arch/arm64/Kconfig   | 1 +
>>  arch/mips/Kconfig    | 1 +
>>  arch/powerpc/Kconfig | 1 +
>>  arch/x86/Kconfig     | 1 +
>>  arch/xtensa/Kconfig  | 1 +
>>  lib/Kconfig.debug    | 9 -
>>  7 files changed, 14 insertions(+), 1 deletion(-)
>
> Anshuman, entries in arch/*/Kconfig files are sorted in alphabetical order,
> please keep them that way.

Sure, will fix up and resend.

>
> Reviewed-by: Max Filippov
>
Re: Linux 5.11-rc7
On 2/7/21 2:32 PM, Linus Torvalds wrote:
> So it's the biggest sporting day of the year here in the US, when
> everybody is getting ready to watch the yearly top TV commercials,
> occasionally interrupted by some odd handegg carrying competition that
> I still haven't figured out the rules for after twenty-odd years here.
> It's kind of a more violent and hands-on team-oriented version of the
> traditional egg-and-spoon race, and involves a lot of standing around,
> apparently waiting for the next commercial to come on.
>
> Go forth and test. Unless you're glued to the TV, of course.

I should have watched the Puppy Bowl!

-- 
~Randy
[PATCH v2] staging: gasket: fix indentation and lines ending with open parenthesis
This patch fixes warnings reported by 'checkpatch.pl'. According to the
Linux coding guidelines, code should be aligned to match the open
parenthesis, and lines should not end with an open parenthesis.

Signed-off-by: Mahak Gupta
---
Changes since v1:
- Use temporary variables to shorten long lines. This variable was used
  multiple times.
---
 drivers/staging/gasket/gasket_ioctl.c | 42 ++-
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/gasket/gasket_ioctl.c b/drivers/staging/gasket/gasket_ioctl.c
index e3047d36d8db..aa65f4fbf860 100644
--- a/drivers/staging/gasket/gasket_ioctl.c
+++ b/drivers/staging/gasket/gasket_ioctl.c
@@ -40,10 +40,11 @@ static int gasket_set_event_fd(struct gasket_dev *gasket_dev,
 
 /* Read the size of the page table. */
 static int gasket_read_page_table_size(struct gasket_dev *gasket_dev,
-	struct gasket_page_table_ioctl __user *argp)
+				       struct gasket_page_table_ioctl __user *argp)
 {
 	int ret = 0;
 	struct gasket_page_table_ioctl ibuf;
+	struct gasket_page_table *table;
 
 	if (copy_from_user(&ibuf, argp, sizeof(struct gasket_page_table_ioctl)))
 		return -EFAULT;
@@ -51,8 +52,8 @@ static int gasket_read_page_table_size(struct gasket_dev *gasket_dev,
 	if (ibuf.page_table_index >= gasket_dev->num_page_tables)
 		return -EFAULT;
 
-	ibuf.size = gasket_page_table_num_entries(
-		gasket_dev->page_table[ibuf.page_table_index]);
+	table = gasket_dev->page_table[ibuf.page_table_index];
+	ibuf.size = gasket_page_table_num_entries(table);
 
 	trace_gasket_ioctl_page_table_data(ibuf.page_table_index, ibuf.size,
 					   ibuf.host_address,
@@ -66,10 +67,11 @@ static int gasket_read_page_table_size(struct gasket_dev *gasket_dev,
 
 /* Read the size of the simple page table. */
 static int gasket_read_simple_page_table_size(struct gasket_dev *gasket_dev,
-	struct gasket_page_table_ioctl __user *argp)
+					      struct gasket_page_table_ioctl __user *argp)
 {
 	int ret = 0;
 	struct gasket_page_table_ioctl ibuf;
+	struct gasket_page_table *table;
 
 	if (copy_from_user(&ibuf, argp, sizeof(struct gasket_page_table_ioctl)))
 		return -EFAULT;
@@ -77,8 +79,8 @@ static int gasket_read_simple_page_table_size(struct gasket_dev *gasket_dev,
 	if (ibuf.page_table_index >= gasket_dev->num_page_tables)
 		return -EFAULT;
 
-	ibuf.size =
-		gasket_page_table_num_simple_entries(gasket_dev->page_table[ibuf.page_table_index]);
+	table = gasket_dev->page_table[ibuf.page_table_index];
+	ibuf.size = gasket_page_table_num_simple_entries(table);
 
 	trace_gasket_ioctl_page_table_data(ibuf.page_table_index, ibuf.size,
 					   ibuf.host_address,
@@ -92,11 +94,12 @@ static int gasket_read_simple_page_table_size(struct gasket_dev *gasket_dev,
 
 /* Set the boundary between the simple and extended page tables. */
 static int gasket_partition_page_table(struct gasket_dev *gasket_dev,
-	struct gasket_page_table_ioctl __user *argp)
+				       struct gasket_page_table_ioctl __user *argp)
 {
 	int ret;
 	struct gasket_page_table_ioctl ibuf;
 	uint max_page_table_size;
+	struct gasket_page_table *table;
 
 	if (copy_from_user(&ibuf, argp, sizeof(struct gasket_page_table_ioctl)))
 		return -EFAULT;
@@ -107,8 +110,8 @@ static int gasket_partition_page_table(struct gasket_dev *gasket_dev,
 	if (ibuf.page_table_index >= gasket_dev->num_page_tables)
 		return -EFAULT;
 
-	max_page_table_size = gasket_page_table_max_size(
-		gasket_dev->page_table[ibuf.page_table_index]);
+	table = gasket_dev->page_table[ibuf.page_table_index];
+	max_page_table_size = gasket_page_table_max_size(table);
 
 	if (ibuf.size > max_page_table_size) {
 		dev_dbg(gasket_dev->dev,
@@ -119,8 +122,7 @@ static int gasket_partition_page_table(struct gasket_dev *gasket_dev,
 
 	mutex_lock(&gasket_dev->mutex);
 
-	ret = gasket_page_table_partition(
-		gasket_dev->page_table[ibuf.page_table_index], ibuf.size);
+	ret = gasket_page_table_partition(table, ibuf.size);
 	mutex_unlock(&gasket_dev->mutex);
 
 	return ret;
@@ -131,6 +133,7 @@ static int gasket_map_buffers(struct gasket_dev *gasket_dev,
 			      struct gasket_page_table_ioctl __user *argp)
 {
 	struct gasket_page_table_ioctl ibuf;
+	struct gasket_page_table *table;
 
 	if (copy_from_user(&ibuf, argp, sizeof(struct gasket_page_table_ioctl)))
 		return -EFAULT;
@@ -142,13 +145,12 @@ static int gasket_map_buffers(struct gasket_dev *gasket_dev,
 	if (ibuf.page_table_index >=
Re: ANNOUNCE: pahole v1.20 (gcc11 DWARF5's default, lots of ELF sections, BTF)
On Thu, Feb 4, 2021 at 11:07 PM Arnaldo Carvalho de Melo wrote:
>
> Hi,
>
>         The v1.20 release of pahole and its friends is out, mostly
> addressing problems related to gcc 11 defaulting to DWARF5 for -g,
> available at the usual places:
>
> Main git repo:
>
>    git://git.kernel.org/pub/scm/devel/pahole/pahole.git
>
> Mirror git repo:
>
>    https://github.com/acmel/dwarves.git
>
> tarball + gpg signature:
>
>    https://fedorapeople.org/~acme/dwarves/dwarves-1.20.tar.xz
>    https://fedorapeople.org/~acme/dwarves/dwarves-1.20.tar.bz2
>    https://fedorapeople.org/~acme/dwarves/dwarves-1.20.tar.sign
>

FYI: Debian now ships dwarves package version 1.20-1 in unstable.

Just a small nit to this release and its tagging:

You did:
commit 0d415f68c468b77c5bf8e71965cd08c6efd25fc4
("pahole: Prep 1.20")

Is this new?

The release before:
commit dd15aa4b0a6421295cbb7c3913429142fef8abe0
("dwarves: Prep v1.19")

- Sedat -

> Best Regards,
>
> - Arnaldo
>
> v1.20:
>
> BTF encoder:
>
>   - Improve ELF error reporting using elf_errmsg(elf_errno()).
>
>   - Improve objcopy error handling.
>
>   - Fix handling of 'restrict' qualifier, that was being treated as a 'const'.
>
>   - Support SHN_XINDEX in st_shndx symbol indexes, to handle ELF objects with
>     more than 65534 sections, for instance, which happens with kernels built
>     with 'KCFLAGS="-ffunction-sections -fdata-sections", Other cases may
>     include when using FG-ASLR, LTO.
>
>   - Cope with functions without a name, as seen sometimes when building kernel
>     images with some versions of clang, when a SEGFAULT was taking place.
>
>   - Fix BTF variable generation for kernel modules, not skipping variables at
>     offset zero.
>
>   - Fix address size to match what is in the ELF file being processed, to fix
>     using a 64-bit pahole binary to generate BTF for a 32-bit vmlinux image.
>
>   - Use kernel module ftrace addresses when finding which functions to encode,
>     which increases the number of functions encoded.
>
> libbpf:
>
>   - Allow use of packaged version, for distros wanting to dynamically link with
>     the system's libbpf package instead of using the libbpf git submodule
>     shipped in pahole's source code.
>
> DWARF loader:
>
>   - Support DW_AT_data_bit_offset
>
>     This appeared in DWARF4 but is supported only in gcc's -gdwarf-5,
>     support it in a way that makes the output be the same for both cases.
>
>       $ gcc -gdwarf-5 -c examples/dwarf5/bf.c
>       $ pahole bf.o
>       struct pea {
>             long int                   a:1;                  /*     0: 0  8 */
>             long int                   b:1;                  /*     0: 1  8 */
>             long int                   c:1;                  /*     0: 2  8 */
>
>             /* XXX 29 bits hole, try to pack */
>             /* Bitfield combined with next fields */
>
>             int                        after_bitfield;       /*     4     4 */
>
>             /* size: 8, cachelines: 1, members: 4 */
>             /* sum members: 4 */
>             /* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */
>             /* last cacheline: 8 bytes */
>       };
>
>   - DW_FORM_implicit_const in attr_numeric() and attr_offset()
>
>   - Support DW_TAG_GNU_call_site, its the standardized rename of the
>     previously supported DW_TAG_GNU_call_site.
>
> build:
>
>   - Fix compilation on 32-bit architectures.
>
> Signed-off-by: Arnaldo Carvalho de Melo
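
For reference, a C declaration along the following lines would be consistent with the struct pea layout quoted above. The real contents of examples/dwarf5/bf.c are not shown in the announcement, so this is a guess reconstructed from the pahole output; PEA_BITFIELD_BITS and pea_hole_bits() are hypothetical helpers added only to show where the "29 bits hole" figure comes from. The numbers assume a typical LP64 Linux target with GCC, where after_bitfield lands at byte 4.

```c
#include <assert.h>
#include <stddef.h>

#define PEA_BITFIELD_BITS 3	/* a, b and c are one bit each */

/* Guessed source for the quoted pahole output; not taken from bf.c. */
struct pea {
	long a:1;
	long b:1;
	long c:1;
	int after_bitfield;
};

/* Bits between the end of the bitfields and the start of after_bitfield. */
static size_t pea_hole_bits(void)
{
	size_t next_bit = offsetof(struct pea, after_bitfield) * 8;

	assert(next_bit >= PEA_BITFIELD_BITS);
	return next_bit - PEA_BITFIELD_BITS;
}
```

On such a target the three bitfields occupy bits 0 to 2 and after_bitfield starts at bit 32, which leaves the hole of 32 - 3 = 29 bits that pahole reports.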
[PATCH RESEND] mwl8k: assign value when defining variables
From: wengjianfeng

The variable refilled is defined and then assigned a value in a separate
statement; do both at the same time.

Signed-off-by: wengjianfeng
---
 drivers/net/wireless/marvell/mwl8k.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwl8k.c b/drivers/net/wireless/marvell/mwl8k.c
index abf3b02..435ef77 100644
--- a/drivers/net/wireless/marvell/mwl8k.c
+++ b/drivers/net/wireless/marvell/mwl8k.c
@@ -1208,9 +1208,8 @@ static int rxq_refill(struct ieee80211_hw *hw, int index, int limit)
 {
 	struct mwl8k_priv *priv = hw->priv;
 	struct mwl8k_rx_queue *rxq = priv->rxq + index;
-	int refilled;
+	int refilled = 0;
 
-	refilled = 0;
 	while (rxq->rxd_count < MWL8K_RX_DESCS && limit--) {
 		struct sk_buff *skb;
 		dma_addr_t addr;
-- 
1.9.1
Re: [PATCH 7/8] mm: memcontrol: consolidate lruvec stat flushing
On Fri, Feb 5, 2021 at 10:28 AM Johannes Weiner wrote:
>
> There are two functions to flush the per-cpu data of an lruvec into
> the rest of the cgroup tree: when the cgroup is being freed, and when
> a CPU disappears during hotplug. The difference is whether all CPUs or
> just one is being collected, but the rest of the flushing code is the
> same. Merge them into one function and share the common code.
>
> Signed-off-by: Johannes Weiner

Reviewed-by: Shakeel Butt

BTW what about the lruvec stats? Why not convert them to rstat as well?
RE: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory pin
> -----Original Message-----
> From: owner-linux...@kvack.org [mailto:owner-linux...@kvack.org] On Behalf Of
> Matthew Wilcox
> Sent: Monday, February 8, 2021 2:31 PM
> To: Song Bao Hua (Barry Song)
> Cc: Wangzhou (B) ; linux-kernel@vger.kernel.org;
> io...@lists.linux-foundation.org; linux...@kvack.org;
> linux-arm-ker...@lists.infradead.org; linux-...@vger.kernel.org; Andrew
> Morton ; Alexander Viro ;
> gre...@linuxfoundation.org; j...@ziepe.ca; kevin.t...@intel.com;
> jean-phili...@linaro.org; eric.au...@redhat.com; Liguozhu (Kenneth)
> ; zhangfei@linaro.org; chensihang (A)
>
> Subject: Re: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory
> pin
>
> On Sun, Feb 07, 2021 at 10:24:28PM +, Song Bao Hua (Barry Song) wrote:
> > > > In high-performance I/O cases, accelerators might want to perform
> > > > I/O on a memory without IO page faults which can result in dramatically
> > > > increased latency. Current memory related APIs could not achieve this
> > > > requirement, e.g. mlock can only avoid memory to swap to backup device,
> > > > page migration can still trigger IO page fault.
> > >
> > > Well ... we have two requirements. The application wants to not take
> > > page faults. The system wants to move the application to a different
> > > NUMA node in order to optimise overall performance. Why should the
> > > application's desires take precedence over the kernel's desires? And why
> > > should it be done this way rather than by the sysadmin using numactl to
> > > lock the application to a particular node?
> >
> > NUMA balancer is just one of many reasons for page migration. Even one
> > simple alloc_pages() can cause memory migration in just single NUMA
> > node or UMA system.
> >
> > The other reasons for page migration include but are not limited to:
> > * memory move due to CMA
> > * memory move due to huge pages creation
> >
> > Hardly we can ask users to disable the COMPACTION, CMA and Huge Page
> > in the whole system.
>
> You're dodging the question. Should the CMA allocation fail because
> another application is using SVA?
>
> I would say no.

I would say no as well.

While the IOMMU is enabled, CMA has almost only one user: the IOMMU
driver, since other drivers depend on the IOMMU to use non-contiguous
memory even though they still call dma_alloc_coherent().

In the IOMMU driver, dma_alloc_coherent() is called during initialization
and there is no new allocation afterwards, so it wouldn't have a runtime
impact on SVA performance. Even if there are new allocations, CMA will
fall back to general alloc_pages(), and IOMMU drivers mostly allocate
small amounts of memory for command queues.

So I would say general compound pages, huge pages, and especially
transparent huge pages would be bigger concerns than CMA for internal
page migration within one NUMA node.

Unlike CMA, general alloc_pages() can get memory by moving pages other
than those pinned.

And there is no guarantee we can always bind the memory of SVA
applications to a single NUMA node, so NUMA balancing is still a concern.

But I agree we need a way to make CMA succeed while the userspace pages
are pinned. Since pinning is widespread in many drivers, I assume there
is a way to handle this. Otherwise, APIs like V4L2_MEMORY_USERPTR[1]
will possibly make CMA fail, as there is no guarantee that userspace
will allocate unmovable memory and there is no guarantee that the
fallback path, alloc_pages(), can succeed when allocating big memory.

Will investigate more.

> The application using SVA should take the one-time
> performance hit from having its memory moved around.

Sometimes I also feel SVA is doomed to suffer a performance impact
from page migration. But we are still trying to extend its use
cases to high-performance I/O.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/media/v4l2-core/videobuf-dma-sg.c

Thanks
Barry
[PATCH RESEND] wl1251: cmd: remove redundant assignment
From: wengjianfeng

There is no need to assign -ENOMEM to ret before returning it, and
nothing needs to be freed when kzalloc() fails, so just return -ENOMEM
directly in that case.

Signed-off-by: wengjianfeng
---
 drivers/net/wireless/ti/wl1251/cmd.c | 36 
 1 file changed, 12 insertions(+), 24 deletions(-)

diff --git a/drivers/net/wireless/ti/wl1251/cmd.c b/drivers/net/wireless/ti/wl1251/cmd.c
index e1095b8..498c8db 100644
--- a/drivers/net/wireless/ti/wl1251/cmd.c
+++ b/drivers/net/wireless/ti/wl1251/cmd.c
@@ -175,10 +175,8 @@ int wl1251_cmd_vbm(struct wl1251 *wl, u8 identity,
 	wl1251_debug(DEBUG_CMD, "cmd vbm");
 
 	vbm = kzalloc(sizeof(*vbm), GFP_KERNEL);
-	if (!vbm) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!vbm)
+		return -ENOMEM;
 
 	/* Count and period will be filled by the target */
 	vbm->tim.bitmap_ctrl = bitmap_control;
@@ -213,10 +211,8 @@ int wl1251_cmd_data_path_rx(struct wl1251 *wl, u8 channel, bool enable)
 	wl1251_debug(DEBUG_CMD, "cmd data path");
 
 	cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
-	if (!cmd) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!cmd)
+		return -ENOMEM;
 
 	cmd->channel = channel;
 
@@ -279,10 +275,8 @@ int wl1251_cmd_join(struct wl1251 *wl, u8 bss_type, u8 channel,
 	u8 *bssid;
 
 	join = kzalloc(sizeof(*join), GFP_KERNEL);
-	if (!join) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!join)
+		return -ENOMEM;
 
 	wl1251_debug(DEBUG_CMD, "cmd join%s ch %d %d/%d",
 		     bss_type == BSS_TYPE_IBSS ? " ibss" : "",
@@ -324,10 +318,8 @@ int wl1251_cmd_ps_mode(struct wl1251 *wl, u8 ps_mode)
 	wl1251_debug(DEBUG_CMD, "cmd set ps mode");
 
 	ps_params = kzalloc(sizeof(*ps_params), GFP_KERNEL);
-	if (!ps_params) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!ps_params)
+		return -ENOMEM;
 
 	ps_params->ps_mode = ps_mode;
 	ps_params->send_null_data = 1;
@@ -356,10 +348,8 @@ int wl1251_cmd_read_memory(struct wl1251 *wl, u32 addr, void *answer,
 	wl1251_debug(DEBUG_CMD, "cmd read memory");
 
 	cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
-	if (!cmd) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!cmd)
+		return -ENOMEM;
 
 	WARN_ON(len > MAX_READ_SIZE);
 	len = min_t(size_t, len, MAX_READ_SIZE);
@@ -401,10 +391,8 @@ int wl1251_cmd_template_set(struct wl1251 *wl, u16 cmd_id,
 	cmd_len = ALIGN(sizeof(*cmd) + buf_len, 4);
 
 	cmd = kzalloc(cmd_len, GFP_KERNEL);
-	if (!cmd) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!cmd)
+		return -ENOMEM;
 
 	cmd->size = cpu_to_le16(buf_len);
 
-- 
1.9.1
drivers/soundwire/stream.c:260:12: warning: stack frame size of 2832 bytes in function 'sdw_program_port_params'
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   61556703b610a104de324e4f061dc6cf7b218b46
commit: 41ff91741c25d4987bf0405fa219b9eb339f24ee soundwire: stream: use FIELD_{GET|PREP}
date:   5 months ago
config: powerpc64-randconfig-r014-20210207 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c9439ca36342fb6013187d0a69aef92736951476)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc64 cross compiling tool for clang build
        # apt-get install binutils-powerpc64-linux-gnu
        # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=41ff91741c25d4987bf0405fa219b9eb339f24ee
        git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
        git fetch --no-tags linus master
        git checkout 41ff91741c25d4987bf0405fa219b9eb339f24ee
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All warnings (new ones prefixed by >>):

   In file included from drivers/soundwire/stream.c:16:
   In file included from include/sound/soc.h:18:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al; \
                   ^~
   <scratch space>:194:1: note: expanded from here
   __do_insw
   ^
   arch/powerpc/include/asm/io.h:542:56: note: expanded from macro '__do_insw'
   #define __do_insw(p, b, n) readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
   ~^
   In file included from drivers/soundwire/stream.c:16:
   In file included from include/sound/soc.h:18:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al; \
                   ^~
   <scratch space>:199:1: note: expanded from here
   __do_insl
   ^
   arch/powerpc/include/asm/io.h:543:56: note: expanded from macro '__do_insl'
   #define __do_insl(p, b, n) readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
   ~^
   In file included from drivers/soundwire/stream.c:16:
   In file included from include/sound/soc.h:18:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
   ^~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al; \
                   ^~
   <scratch space>:204:1: note: expanded from here
   __do_outsb
   ^
   arch/powerpc/include/asm/io.h:544:58: note: expanded from macro '__do_outsb'
   #define __do_outsb(p, b, n) writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
   ~^
   In file included from drivers/soundwire/stream.c:16:
   In file in