Re: [PATCH 3/5] exec: Remove recursion from search_binary_handler
On 5/14/2020 7:56 AM, Eric W. Biederman wrote: > Kees Cook writes: > >> On Tue, May 12, 2020 at 04:47:14PM -0700, Kees Cook wrote: >>> And now I wonder if qemu actually uses the resulting AT_EXECFD ... >> It does, though I'm not sure if this is to support crossing mount points, >> dropping privileges, or something else, since it does fall back to just >> trying to open the file. >> >> execfd = qemu_getauxval(AT_EXECFD); >> if (execfd == 0) { >> execfd = open(filename, O_RDONLY); >> if (execfd < 0) { >> printf("Error while loading %s: %s\n", filename, >> strerror(errno)); >> _exit(EXIT_FAILURE); >> } >> } > My hunch is that the fallback exists from a time when the kernel did not > implement AT_EXECFD, or so that qemu can run on kernels that don't > implement AT_EXECFD. It doesn't really matter unless the executable is > suid, or otherwise changes privileges. > > > I looked into this a bit to remind myself why exec works the way it > works, with changing privileges. > > The classic attack is pointing a symlink at a #! script that is suid or > otherwise changes privileges. The kernel will open the script and set > the privileges, read the interpreter from the first line, and proceed to > exec the interpreter. The interpreter will then open the script using > the pathname supplied by the kernel. The name of the symlink. > Before the interpreter reopens the script the attack would replace > the symlink with a script that does something else, but gets to run > with the privileges of the script. > > > Defending against that time of check vs time of use attack is why > bprm_fill_uid, and cap_bprm_set_creds use the credentials derived from > the interpreter instead of the credentials derived from the script. > > > The other defense is to replace the pathname of the executable that the > intepreter will open with /dev/fd/N. > > All of this predates Linux entirely. I do remember this was fixed at > some point in Linux but I don't remember the details. I can just read > the solution that was picked in the code. > > > > All of this makes me wonder how are the LSMs protected against this > attack. > > Let's see the following LSMS implement brpm_set_creds: > tomoyo - Abuses bprm_set_creds to call tomoyo_load_policy [ safe ] > smack- Requires CAP_MAC_ADMIN to smack setxattrs[ vulnerable? ] >Uses those xattrs in smack_bprm_set_creds What is the concern? If the xattrs change after the check, the behavior should still be consistent. > apparmor - Everything is based on names so the symlink [ safe? ] >attack won't work as it has the wrong name. >As long as the trusted names can't be renamed >apparmor appears good. > selinux - Appears to let anyone set selinux xattrs [ safe? ] >Requires permission for a sid transfer >As the attack appears not to allow anything that >would not be allowed anyway it looks like selinux >is safe. > > LSM folks, especially Casey am I reading this correctly? Did I > correctly infer how your LSMs deal with the time of check to time of use > attack on the script name? > > Eric >
[PATCH] keys: Move permissions checking decisions into the checking code
How about this then? David --- commit fa37b6c7e2f86d16ede1e0e3cb73857152d51825 Author: David Howells Date: Thu May 14 17:48:55 2020 +0100 keys: Move permissions checking decisions into the checking code Overhaul the permissions checking, moving the decisions of which permits to request for what operation and what overrides to allow into the permissions checking functions in keyrings, SELinux and Smack. To this end, the KEY_NEED_* constants are turned into an enum and expanded in number to cover operation types individually. Note that some more tweaking is probably needed to separate kernel uses, e.g. AFS using rxrpc keys, from direct userspace users. Some overrides are available and this needs to be handled specially: (1) KEY_FLAG_KEEP in key->flags - The key may not be deleted and/or things may not be removed from the keyring. (2) KEY_FLAG_ROOT_CAN_CLEAR in key->flags - The keyring can be cleared by CAP_SYS_ADMIN. (3) KEY_FLAG_ROOT_CAN_INVAL in key->flags - The key can be invalidated by CAP_SYS_ADMIN. (4) An appropriate auth token being set in cred->request_key_auth that gives a process transient permission to view and instantiate a key. This is used by the kernel to delegate instantiation to userspace. Note that this requires some tweaks to the testsuite as some of the error codes change. Signed-off-by: David Howells Reported-by: Stephen Smalley cc: Jarkko Sakkinen cc: Paul Moore cc: Stephen Smalley cc: Casey Schaufler cc: keyri...@vger.kernel.org cc: seli...@vger.kernel.org diff --git a/include/linux/key.h b/include/linux/key.h index b99b40db08fc..7fb00128c5ba 100644 --- a/include/linux/key.h +++ b/include/linux/key.h @@ -71,6 +71,34 @@ struct net; #define KEY_PERM_UNDEF 0x +/* + * The permissions required on a key that we're looking up. + */ +enum key_need_perm { + /* 0 is left undefined */ + KEY_NEED_ASSUME_AUTHORITY = 1, /* Want to assume instantiation authority */ + KEY_NEED_CHOWN, /* Want to change key's ownership/group */ + KEY_NEED_DESCRIBE, /* Want to get a key's attributes */ + KEY_NEED_GET_SECURITY, /* Want to get a key's security label */ + KEY_NEED_INSTANTIATE, /* Want to instantiate a key */ + KEY_NEED_INVALIDATE,/* Want to invalidate key */ + KEY_NEED_JOIN, /* Want to set a keyring as the session keyring */ + KEY_NEED_KEYRING_ADD, /* Want to add a link to a keyring */ + KEY_NEED_KEYRING_CLEAR, /* Want to clear a keyring */ + KEY_NEED_KEYRING_DELETE,/* Want to remove a link from a keyring */ + KEY_NEED_LINK, /* Want to create a link to a key */ + KEY_NEED_READ, /* Want to read content to userspace */ + KEY_NEED_REVOKE,/* Want to revoke a key */ + KEY_NEED_SEARCH,/* Want to find a key in a search */ + KEY_NEED_SETPERM, /* Want to set the permissions mask */ + KEY_NEED_SET_RESTRICTION, /* Want to set a restriction on a keyring */ + KEY_NEED_SET_TIMEOUT, /* Want to set the expiration time on a key */ + KEY_NEED_UNLINK,/* Want to remove a link from a key */ + KEY_NEED_UPDATE,/* Want to update a key's payload */ + KEY_NEED_USE, /* Want to use a key (in kernel) */ + KEY_NEED_WATCH, /* Want to watch a key for events */ +}; + struct seq_file; struct user_struct; struct signal_struct; @@ -420,20 +448,9 @@ static inline key_serial_t key_serial(const struct key *key) extern void key_set_timeout(struct key *, unsigned); extern key_ref_t lookup_user_key(key_serial_t id, unsigned long flags, -key_perm_t perm); +enum key_need_perm need_perm); extern void key_free_user_ns(struct user_namespace *); -/* - * The permissions required on a key that we're looking up. - */ -#defineKEY_NEED_VIEW 0x01/* Require permission to view attributes */ -#defineKEY_NEED_READ 0x02/* Require permission to read content */ -#defineKEY_NEED_WRITE 0x04/* Require permission to update / modify */ -#defineKEY_NEED_SEARCH 0x08/* Require permission to search (keyring) or find (key) */ -#defineKEY_NEED_LINK 0x10/* Require permission to link */ -#defineKEY_NEED_SETATTR 0x20 /* Require permission to change attributes */ -#defineKEY_NEED_ALL0x3f/* All the above permissions */ - static inline short key_read_state(const struct key *key) { /* Barrier versus mark_key_instantiated(). */ diff --git a/include/linux/lsm_hook_defs.h
[PATCH] net: phy: mdio-moxart: remove unneeded include
From: Bartosz Golaszewski mdio-moxart doesn't use regulators in the driver code. We can remove the regulator include. Signed-off-by: Bartosz Golaszewski --- drivers/net/phy/mdio-moxart.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/net/phy/mdio-moxart.c b/drivers/net/phy/mdio-moxart.c index 2d16fc4173c1..b72c6d185175 100644 --- a/drivers/net/phy/mdio-moxart.c +++ b/drivers/net/phy/mdio-moxart.c @@ -12,7 +12,6 @@ #include #include #include -#include #define REG_PHY_CTRL0 #define REG_PHY_WRITE_DATA 4 -- 2.25.0
Re: [PATCH v1 4/4] scsi: ufs: Fix WriteBooster flush during runtime suspend
On 5/14/2020 7:49 AM, Stanley Chu wrote: Hi Asutosh, On Thu, 2020-05-14 at 10:23 +0800, Stanley Chu wrote: Hi Asutosh, On Wed, 2020-05-13 at 12:31 -0700, Asutosh Das (asd) wrote: On 5/12/2020 3:47 AM, Stanley Chu wrote: Currently UFS host driver promises VCC supply if UFS device needs to do WriteBooster flush during runtime suspend. However the UFS specification mentions, "While the flushing operation is in progress, the device is in Active power mode." Therefore UFS host driver needs to promise more: Keep UFS device as "Active power mode", otherwise UFS device shall not do any flush if device enters Sleep or PowerDown power mode. Fix this by not changing device power mode if WriteBooster flush is required in ufshcd_suspend(). Signed-off-by: Stanley Chu --- drivers/scsi/ufs/ufs.h| 1 - drivers/scsi/ufs/ufshcd.c | 39 +++ 2 files changed, 19 insertions(+), 21 deletions(-) diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index b3135344ab3f..9e4bc2e97ada 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -577,7 +577,6 @@ struct ufs_dev_info { u32 d_ext_ufs_feature_sup; u8 b_wb_buffer_type; u32 d_wb_alloc_units; - bool keep_vcc_on; u8 b_presrv_uspc_en; }; diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 169a3379e468..2d0aff8ac260 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -8101,8 +8101,7 @@ static void ufshcd_vreg_set_lpm(struct ufs_hba *hba) !hba->dev_info.is_lu_power_on_wp) { ufshcd_setup_vreg(hba, false); } else if (!ufshcd_is_ufs_dev_active(hba)) { - if (!hba->dev_info.keep_vcc_on) - ufshcd_toggle_vreg(hba->dev, hba->vreg_info.vcc, false); + ufshcd_toggle_vreg(hba->dev, hba->vreg_info.vcc, false); if (!ufshcd_is_link_active(hba)) { ufshcd_config_vreg_lpm(hba, hba->vreg_info.vccq); ufshcd_config_vreg_lpm(hba, hba->vreg_info.vccq2); @@ -8172,6 +8171,7 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op) enum ufs_pm_level pm_lvl; enum ufs_dev_pwr_mode req_dev_pwr_mode; enum uic_link_state req_link_state; + bool keep_curr_dev_pwr_mode = false; hba->pm_op_in_progress = 1; if (!ufshcd_is_shutdown_pm(pm_op)) { @@ -8226,28 +8226,27 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op) /* make sure that auto bkops is disabled */ ufshcd_disable_auto_bkops(hba); } + Unnecessary newline, perhaps? Yap, I will remove it in next version. /* -* With wb enabled, if the bkops is enabled or if the -* configured WB type is 70% full, keep vcc ON -* for the device to flush the wb buffer +* If device needs to do BKOP or WB buffer flush, keep device +* power mode as "active power mode" and its VCC supply. */ - if ((hba->auto_bkops_enabled && ufshcd_is_wb_allowed(hba)) || - ufshcd_wb_keep_vcc_on(hba)) - hba->dev_info.keep_vcc_on = true; - else - hba->dev_info.keep_vcc_on = false; - } else { - hba->dev_info.keep_vcc_on = false; + keep_curr_dev_pwr_mode = hba->auto_bkops_enabled || + ufshcd_wb_keep_vcc_on(hba); Should the device be in UFS_ACTIVE_PWR_MODE to perform auto-bkops? Also, is it needed to keep the device in UFS_ACTIVE_PWR_MODE , if flush on hibern8 is enabled and the link is being put to hibern8 mode during runtime-suspend? Perhaps that should also be factored in here? Both auto-bkops and WriteBooster flush during Hibern8 need device power mode to be "Active Power Mode". For auto-bkops, the spec mentions, "If the background operations enable bit is set and the device is in Active power mode or Idle power mode, then the device is allowed to execute any internal operations." For WriteBooster flush during Hibern8, the spec mentions, "While the flushing operation is in progress, the device is in Active power mode." Therefore here we can use an unified "keep_curr_dev_pwr_mode" to indicate the same requirements of above both features. Besides, both operations may access flash array inside UFS device thus VCC supply shall be also kept. Before this patch, the original code will keep device power mode (stay in Active Power Mode) if hba->auto_bkops_enabled is set as true during runtime-suspend with UFSHCD_CAP_AUTO_BKOPS_SUSPEND capability is enabled. This patch will not change this decision, just add "WriteBooster flush during Hibern8" feature as another condition to do so. Thank you so much to remind me that "Link shall be put in Hibern8" is a necessary condition for
[PATCH] net: phy: Fix c45 no phy detected logic
The commit "disregard Clause 22 registers present bit..." clears the low bit of the devices_in_package data which is being used in get_phy_c45_ids() to determine if a phy/register is responding correctly. That check is against 0x1FFF, but since the low bit is always cleared, the check can never be true. This leads to detecting c45 phy devices where none exist. Lets fix this by also clearing the low bit in the mask to 0x1FFE. This allows us to continue to autoprobe standards compliant devices without also gaining a large number of bogus ones. Fixes: 3b5e74e0afe3 ("net: phy: disregard "Clause 22 registers present" bit in get_phy_c45_devs_in_pkg") Cc: Heiner Kallweit Signed-off-by: Jeremy Linton --- drivers/net/phy/phy_device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index ac2784192472..b93d984d35cc 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -723,7 +723,7 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id, if (phy_reg < 0) return -EIO; - if ((*devs & 0x1fff) == 0x1fff) { + if ((*devs & 0x1ffe) == 0x1ffe) { /* If mostly Fs, there is no device there, * then let's continue to probe more, as some * 10G PHYs have zero Devices In package, @@ -733,7 +733,7 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id, if (phy_reg < 0) return -EIO; /* no device there, let's get out of here */ - if ((*devs & 0x1fff) == 0x1fff) { + if ((*devs & 0x1ffe) == 0x1ffe) { *phy_id = 0x; return 0; } else { -- 2.24.1
Re: [PATCH 3/4] hwspinlock: qcom: Allow mmio usage in addition to syscon
On Thu 14 May 07:15 PDT 2020, Vinod Koul wrote: > On 12-05-20, 17:54, Bjorn Andersson wrote: > > In all modern Qualcomm platforms the mutex region of the TCSR is forked > > off into its own block, all with a offset of 0 and stride of 4096. So > > add support for directly memory mapping this register space, to avoid > > the need to represent this block using a syscon. > > > > Signed-off-by: Bjorn Andersson > > --- > > drivers/hwspinlock/qcom_hwspinlock.c | 72 +--- > > 1 file changed, 56 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/hwspinlock/qcom_hwspinlock.c > > b/drivers/hwspinlock/qcom_hwspinlock.c > > index f0da544b14d2..d8d4d729816c 100644 > > --- a/drivers/hwspinlock/qcom_hwspinlock.c > > +++ b/drivers/hwspinlock/qcom_hwspinlock.c > > @@ -70,41 +70,81 @@ static const struct of_device_id > > qcom_hwspinlock_of_match[] = { > > }; > > MODULE_DEVICE_TABLE(of, qcom_hwspinlock_of_match); > > > > -static int qcom_hwspinlock_probe(struct platform_device *pdev) > > +static struct regmap *qcom_hwspinlock_probe_syscon(struct platform_device > > *pdev, > > + u32 *base, u32 *stride) > > { > > - struct hwspinlock_device *bank; > > struct device_node *syscon; > > - struct reg_field field; > > struct regmap *regmap; > > - size_t array_size; > > - u32 stride; > > - u32 base; > > int ret; > > - int i; > > > > syscon = of_parse_phandle(pdev->dev.of_node, "syscon", 0); > > - if (!syscon) { > > - dev_err(>dev, "no syscon property\n"); > > any reason to drop the log? > Given that we first check for the syscon and then fall back to trying to use the reg, keeping this line would cause this log line to always show up on targets putting this under /soc. So I think it's better to drop the line and then require the presence of either syscon or reg using the DT schema. > > - return -ENODEV; > > - } > > + if (!syscon) > > + return ERR_PTR(-ENODEV); > > > > regmap = syscon_node_to_regmap(syscon); > > of_node_put(syscon); > > if (IS_ERR(regmap)) > > - return PTR_ERR(regmap); > > + return regmap; > > > > - ret = of_property_read_u32_index(pdev->dev.of_node, "syscon", 1, ); > > + ret = of_property_read_u32_index(pdev->dev.of_node, "syscon", 1, base); > > if (ret < 0) { > > dev_err(>dev, "no offset in syscon\n"); > > - return -EINVAL; > > + return ERR_PTR(-EINVAL); > > } > > > > - ret = of_property_read_u32_index(pdev->dev.of_node, "syscon", 2, > > ); > > + ret = of_property_read_u32_index(pdev->dev.of_node, "syscon", 2, > > stride); > > if (ret < 0) { > > dev_err(>dev, "no stride syscon\n"); > > - return -EINVAL; > > + return ERR_PTR(-EINVAL); > > } > > > > + return regmap; > > +} > > + > > +static const struct regmap_config tcsr_mutex_config = { > > + .reg_bits = 32, > > + .reg_stride = 4, > > + .val_bits = 32, > > + .max_register = 0x4, > > + .fast_io= true, > > +}; > > + > > +static struct regmap *qcom_hwspinlock_probe_mmio(struct platform_device > > *pdev, > > +u32 *offset, u32 *stride) > > +{ > > + struct device *dev = >dev; > > + struct resource *res; > > + void __iomem *base; > > + > > + /* All modern platform has offset 0 and stride of 4k */ > > + *offset = 0; > > + *stride = 0x1000; > > Wouldn't it make sense to read this from DT rather than code in kernel? > I did consider this as well as platform specific compatibles, but realized that pretty much all 64-bit targets have these values. So given that we still can represent this using the syscon approach I don't think we need to add yet another mechanism to specify these. Thanks, Bjorn
Re: [patch V4 part 1 29/36] x86/mce: Send #MC singal from task work
- On May 14, 2020, at 12:39 PM, Borislav Petkov b...@alien8.de wrote: > On Thu, May 14, 2020 at 12:03:30PM -0400, Mathieu Desnoyers wrote: >> - #MC triggered, queuing task work, >> - unrelated signal happens to be delivered to task, >> - exit to usermode loop handles do_signal first, >> - then it runs task work. > > How can that even happen? > > exit_to_usermode_loop->do_signal->get_signal and that does: > >if (unlikely(current->task_works)) >task_work_run(); > > at the top. > > So the task work will always run before the signal handler. OK yes, nevermind. I focused on its invocation from tracehook_notify_resume and missed this invocation in do_signal. My bad. Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
[PATCH] arm64: dts: qcom: msm8916-samsung-a3u: add nodes for display panel
From: Michael Srba This patch wires up display support on Samsung Galaxy A3 2015. Signed-off-by: Michael Srba --- .../qcom/msm8916-samsung-a2015-common.dtsi| 44 +++ .../boot/dts/qcom/msm8916-samsung-a3u-eur.dts | 54 +++ 2 files changed, 98 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi b/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi index af812f76e8be..2a64aa269f52 100644 --- a/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi +++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi @@ -72,6 +72,24 @@ phy { }; }; + mdss@1a0 { + dsi@1a98000 { + #address-cells = <1>; + #size-cells = <0>; + + vdda-supply = <_l2>; + vddio-supply = <_l6>; + + pinctrl-names = "default", "sleep"; + pinctrl-0 = <_default>; + pinctrl-1 = <_sleep>; + }; + + dsi-phy@1a98300 { + vddio-supply = <_l6>; + }; + }; + wcnss@a21b000 { status = "okay"; }; @@ -172,6 +190,32 @@ pinconf { bias-disable; }; }; + + pmx-mdss { + mdss_default: mdss-default { + pinmux { + function = "gpio"; + pins = "gpio25"; + }; + pinconf { + pins = "gpio25"; + drive-strength = <8>; + bias-disable; + }; + }; + + mdss_sleep: mdss-sleep { + pinmux { + function = "gpio"; + pins = "gpio25"; + }; + pinconf { + pins = "gpio25"; + drive-strength = <2>; + bias-pull-down; + }; + }; + }; }; _rpm_regulators { diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-a3u-eur.dts b/arch/arm64/boot/dts/qcom/msm8916-samsung-a3u-eur.dts index d10f7ac5089f..b46c87289033 100644 --- a/arch/arm64/boot/dts/qcom/msm8916-samsung-a3u-eur.dts +++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-a3u-eur.dts @@ -7,4 +7,58 @@ / { model = "Samsung Galaxy A3U (EUR)"; compatible = "samsung,a3u-eur", "qcom,msm8916"; + + reg_panel_vdd3: regulator-panel-vdd3 { + compatible = "regulator-fixed"; + regulator-name = "panel_vdd3"; + regulator-min-microvolt = <180>; + regulator-max-microvolt = <180>; + + gpio = < 9 GPIO_ACTIVE_HIGH>; + enable-active-high; + + pinctrl-names = "default"; + pinctrl-0 = <_vdd3_default>; + }; +}; + + { + panel@0 { + reg = <0>; + + compatible = "samsung,s6e88a0-ams452ef01"; + + vdd3-supply = <_panel_vdd3>; + vci-supply = <_l17>; + reset-gpios = < 25 GPIO_ACTIVE_HIGH>; + + port { + panel_in: endpoint { + remote-endpoint = <_out>; + }; + }; + }; + + ports { + port@1 { + dsi0_out: endpoint { + remote-endpoint = <_in>; + data-lanes = <0 1>; + }; + }; + }; +}; + + { + panel_vdd3_default: panel-vdd3-default { + pinmux { + function = "gpio"; + pins = "gpio9"; + }; + pinconf { + pins = "gpio9"; + drive-strength = <2>; + bias-disable; + }; + }; }; -- 2.24.0
[PATCHv1] ARM: dts/imx6q-bx50v3: Set display interface clock parents
From: Robert Beckett Avoid LDB and IPU DI clocks both using the same parent. LDB requires pasthrough clock to avoid breaking timing while IPU DI does not. Force IPU DI clocks to use IMX6QDL_CLK_PLL2_PFD0_352M as parent and LDB to use IMX6QDL_CLK_PLL5_VIDEO_DIV. This fixes an issue where attempting atomic modeset while using HDMI and display port at the same time causes LDB clock programming to destroy the programming of HDMI that was done during the same modeset. Cc: sta...@vger.kernel.org Signed-off-by: Robert Beckett [Use IMX6QDL_CLK_PLL2_PFD0_352M instead of IMX6QDL_CLK_PLL2_PFD2_396M originally chosen by Robert Beckett to avoid affecting eMMC clock by DRM atomic updates] Signed-off-by: Ian Ray [Squash Robert's and Ian's commits for bisectability, update patch description and add stable tag] Signed-off-by: Sebastian Reichel --- arch/arm/boot/dts/imx6q-b450v3.dts | 7 --- arch/arm/boot/dts/imx6q-b650v3.dts | 7 --- arch/arm/boot/dts/imx6q-b850v3.dts | 11 --- arch/arm/boot/dts/imx6q-bx50v3.dtsi | 15 +++ 4 files changed, 15 insertions(+), 25 deletions(-) diff --git a/arch/arm/boot/dts/imx6q-b450v3.dts b/arch/arm/boot/dts/imx6q-b450v3.dts index 95b8f2d71821..fb0980190aa0 100644 --- a/arch/arm/boot/dts/imx6q-b450v3.dts +++ b/arch/arm/boot/dts/imx6q-b450v3.dts @@ -65,13 +65,6 @@ panel_in_lvds0: endpoint { }; }; - { - assigned-clocks = < IMX6QDL_CLK_LDB_DI0_SEL>, - < IMX6QDL_CLK_LDB_DI1_SEL>; - assigned-clock-parents = < IMX6QDL_CLK_PLL3_USB_OTG>, -< IMX6QDL_CLK_PLL3_USB_OTG>; -}; - { status = "okay"; diff --git a/arch/arm/boot/dts/imx6q-b650v3.dts b/arch/arm/boot/dts/imx6q-b650v3.dts index 611cb7ae7e55..8f762d9c5ae9 100644 --- a/arch/arm/boot/dts/imx6q-b650v3.dts +++ b/arch/arm/boot/dts/imx6q-b650v3.dts @@ -65,13 +65,6 @@ panel_in_lvds0: endpoint { }; }; - { - assigned-clocks = < IMX6QDL_CLK_LDB_DI0_SEL>, - < IMX6QDL_CLK_LDB_DI1_SEL>; - assigned-clock-parents = < IMX6QDL_CLK_PLL3_USB_OTG>, -< IMX6QDL_CLK_PLL3_USB_OTG>; -}; - { status = "okay"; diff --git a/arch/arm/boot/dts/imx6q-b850v3.dts b/arch/arm/boot/dts/imx6q-b850v3.dts index e4cb118f88c6..1ea64ecf4291 100644 --- a/arch/arm/boot/dts/imx6q-b850v3.dts +++ b/arch/arm/boot/dts/imx6q-b850v3.dts @@ -53,17 +53,6 @@ chosen { }; }; - { - assigned-clocks = < IMX6QDL_CLK_LDB_DI0_SEL>, - < IMX6QDL_CLK_LDB_DI1_SEL>, - < IMX6QDL_CLK_IPU1_DI0_PRE_SEL>, - < IMX6QDL_CLK_IPU2_DI0_PRE_SEL>; - assigned-clock-parents = < IMX6QDL_CLK_PLL5_VIDEO_DIV>, -< IMX6QDL_CLK_PLL5_VIDEO_DIV>, -< IMX6QDL_CLK_PLL2_PFD2_396M>, -< IMX6QDL_CLK_PLL2_PFD2_396M>; -}; - { fsl,dual-channel; status = "okay"; diff --git a/arch/arm/boot/dts/imx6q-bx50v3.dtsi b/arch/arm/boot/dts/imx6q-bx50v3.dtsi index fa27dcdf06f1..1938b04199c4 100644 --- a/arch/arm/boot/dts/imx6q-bx50v3.dtsi +++ b/arch/arm/boot/dts/imx6q-bx50v3.dtsi @@ -377,3 +377,18 @@ pci_root: root@0,0 { #interrupt-cells = <1>; }; }; + + { + assigned-clocks = < IMX6QDL_CLK_LDB_DI0_SEL>, + < IMX6QDL_CLK_LDB_DI1_SEL>, + < IMX6QDL_CLK_IPU1_DI0_PRE_SEL>, + < IMX6QDL_CLK_IPU1_DI1_PRE_SEL>, + < IMX6QDL_CLK_IPU2_DI0_PRE_SEL>, + < IMX6QDL_CLK_IPU2_DI1_PRE_SEL>; + assigned-clock-parents = < IMX6QDL_CLK_PLL5_VIDEO_DIV>, +< IMX6QDL_CLK_PLL5_VIDEO_DIV>, +< IMX6QDL_CLK_PLL2_PFD0_352M>, +< IMX6QDL_CLK_PLL2_PFD0_352M>, +< IMX6QDL_CLK_PLL2_PFD0_352M>, +< IMX6QDL_CLK_PLL2_PFD0_352M>; +}; -- 2.26.2
[GIT PULL] thread fixes v5.7-rc5
Hey Linus, /* Summary */ This contains a single fix for all exported legacy fork helpers to block accidental access to clone3() features in the upper 32 bits of their respective flags arguments. I got Cced on a glibc issue where someone reported consistent failures for the legacy clone() syscall on ppc64le when sign extension was performed (Since the clone() syscall in glibc exposes the flags argument as an int whereas the kernel uses unsigned long.). The legacy clone() syscall is odd in a bunch of ways and here two things interact. First, legacy clone's flag argument is word-size dependent, i.e. it's an unsigned long whereas most system calls with flag arguments use int or unsigned int. Second, legacy clone() ignores unknown and deprecated flags. The two of them taken together means that users on 64bit systems can pass garbage for the upper 32bit of the clone() syscall since forever and things would just work fine. The following program compiled on a 64bit kernel prior to v5.7-rc1 will succeed and will fail post v5.7-rc1 with EBADF: int main(int argc, char *argv[]) { pid_t pid; /* Note that legacy clone() has different argument ordering on * different architectures so this won't work everywhere. * * Only set the upper 32 bits. */ pid = syscall(__NR_clone, 0x | SIGCHLD, NULL, NULL, NULL, NULL); if (pid < 0) exit(EXIT_FAILURE); if (pid == 0) exit(EXIT_SUCCESS); if (wait(NULL) != pid) exit(EXIT_FAILURE); exit(EXIT_SUCCESS); } Since legacy clone() couldn't be extended this was not a problem so far and nobody really noticed or cared since nothing in the kernel ever bothered to look at the upper 32 bits. But once we introduced clone3() and expanded the flag argument in struct clone_args to 64 bit we opened this can of worms. With the first flag-based extension to clone3() making use of the upper 32 bits of the flag argument we've effectively made it possible for the legacy clone() syscall to reach clone3() only flags on accident. The sign extension scenario is just the odd corner-case that we needed to figure this out. The reason we just realized this now and not already when we introduced CLONE_CLEAR_SIGHAND was that CLONE_INTO_CGROUP assumes that a valid cgroup file descriptor has been given whereas CLONE_CLEAR_SIGHAND doesn't need to verify anything. It just silently resets the signal handlers to SIG_DFL. So the sign extension (or the user accidently passing garbage for the upper 32 bits) caused the CLONE_INTO_CGROUP bit to be raised and the kernel to error out when it didn't find a valid cgroup file descriptor. /* A final note */ Note, I'm also capping kernel_thread()'s flag argument mainly because none of the new features make sense for kernel_thread() and we shouldn't risk them being accidently activated however unlikely. If we wanted to, we could even make kernel_thread() yell when an unknown flag has been set which it doesn't do right now. But it's not worth risking this in a bugfix imho. /* Testing */ All patches have seen exposure in linux-next and are based on v5.6-rc5. /* Conflicts */ At the time of creating this pr no merge conflicts were reported. The following changes since commit 0e698dfa282211e414076f9dc7e83c1c288314fd: Linux 5.7-rc4 (2020-05-03 14:56:04 -0700) are available in the Git repository at: g...@gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux tags/for-linus-2020-05-13 for you to fetch changes up to 3f2c788a13143620c5471ac96ac4f033fc9ac3f3: fork: prevent accidental access to clone3 features (2020-05-08 17:31:50 +0200) Please consider pulling these changes from the signed for-linus-2020-05-13 tag. Thanks! Christian for-linus-2020-05-13 Christian Brauner (1): fork: prevent accidental access to clone3 features kernel/fork.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-)
Re: [PATCH 3/5] exec: Remove recursion from search_binary_handler
Casey Schaufler writes: > On 5/14/2020 7:56 AM, Eric W. Biederman wrote: >> Kees Cook writes: >> >>> On Tue, May 12, 2020 at 04:47:14PM -0700, Kees Cook wrote: And now I wonder if qemu actually uses the resulting AT_EXECFD ... >>> It does, though I'm not sure if this is to support crossing mount points, >>> dropping privileges, or something else, since it does fall back to just >>> trying to open the file. >>> >>> execfd = qemu_getauxval(AT_EXECFD); >>> if (execfd == 0) { >>> execfd = open(filename, O_RDONLY); >>> if (execfd < 0) { >>> printf("Error while loading %s: %s\n", filename, >>> strerror(errno)); >>> _exit(EXIT_FAILURE); >>> } >>> } >> My hunch is that the fallback exists from a time when the kernel did not >> implement AT_EXECFD, or so that qemu can run on kernels that don't >> implement AT_EXECFD. It doesn't really matter unless the executable is >> suid, or otherwise changes privileges. >> >> >> I looked into this a bit to remind myself why exec works the way it >> works, with changing privileges. >> >> The classic attack is pointing a symlink at a #! script that is suid or >> otherwise changes privileges. The kernel will open the script and set >> the privileges, read the interpreter from the first line, and proceed to >> exec the interpreter. The interpreter will then open the script using >> the pathname supplied by the kernel. The name of the symlink. >> Before the interpreter reopens the script the attack would replace >> the symlink with a script that does something else, but gets to run >> with the privileges of the script. >> >> >> Defending against that time of check vs time of use attack is why >> bprm_fill_uid, and cap_bprm_set_creds use the credentials derived from >> the interpreter instead of the credentials derived from the script. >> >> >> The other defense is to replace the pathname of the executable that the >> intepreter will open with /dev/fd/N. >> >> All of this predates Linux entirely. I do remember this was fixed at >> some point in Linux but I don't remember the details. I can just read >> the solution that was picked in the code. >> >> >> >> All of this makes me wonder how are the LSMs protected against this >> attack. >> >> Let's see the following LSMS implement brpm_set_creds: >> tomoyo - Abuses bprm_set_creds to call tomoyo_load_policy [ safe ] >> smack- Requires CAP_MAC_ADMIN to smack setxattrs[ vulnerable? ] >>Uses those xattrs in smack_bprm_set_creds > > What is the concern? If the xattrs change after the check, > the behavior should still be consistent. The concern is that there are xattrs set on a #! script. Someone replaces the script after smack reads the xattr and sets bprm->cred but before the interpreter reopens the script. In short if there is one script with xattrs set. I can run any script as if those xattrs were set on it. I don't know the smack security model well enough to know if that is a problem or not. It looks like it may be a concern because smack limits who can mess with it's security xattrs. Eric
Re: [PATCH v2 1/3] ASoC: tlv320adcx140: Add controls for PDM clk
On Wed, 13 May 2020 15:05:47 -0500, Dan Murphy wrote: > Add ALSA controls to configure the PDM clocks. > The clocks need to be configurable to accommodate various microphones > that use clocks for low power/low resolution modes to high power/high > resolution modes. Applied to https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-5.8 Thanks! [1/3] ASoC: tlv320adcx140: Add controls for PDM clk commit: 7cfa610205d95357f9eface292dc70fce7571f65 [2/3] ASoC: tlv320adcx140: Add device tree property for PDM edges commit: 75b0adbb0806a141b0b5f074cd6bd58bb9870c0d [3/3] ASoC: tlv320adcx140: Configure PDM sampling edge commit: 79fc48e41e39d7a98c5f8ae37f613d7ff9953c86 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark
Re: [PATCH v3 1/3] ASoC: tlv320adcx140: Add controls for PDM clk
On Thu, 14 May 2020 07:33:36 -0500, Dan Murphy wrote: > Add ALSA controls to configure the PDM clocks. > The clocks need to be configurable to accommodate various microphones > that use clocks for low power/low resolution modes to high power/high > resolution modes. Applied to https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-5.8 Thanks! [1/3] ASoC: tlv320adcx140: Add controls for PDM clk commit: 7cfa610205d95357f9eface292dc70fce7571f65 [2/3] ASoC: tlv320adcx140: Add device tree property for PDM edges commit: 75b0adbb0806a141b0b5f074cd6bd58bb9870c0d [3/3] ASoC: tlv320adcx140: Configure PDM sampling edge commit: 79fc48e41e39d7a98c5f8ae37f613d7ff9953c86 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark
Re: [PATCH] net: phy: mdio-moxart: remove unneeded include
On Thu, May 14, 2020 at 06:59:38PM +0200, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski > > mdio-moxart doesn't use regulators in the driver code. We can remove > the regulator include. > > Signed-off-by: Bartosz Golaszewski Reviewed-by: Andrew Lunn Andrew
Re: [PATCH v2 4/4] scsi: ufs: Fix WriteBooster flush during runtime suspend
On 5/14/2020 8:01 AM, Stanley Chu wrote: Currently UFS host driver promises VCC supply if UFS device needs to do WriteBooster flush during runtime suspend. However the UFS specification mentions, "While the flushing operation is in progress, the device is in Active power mode." Therefore UFS host driver needs to promise more: Keep UFS device as "Active power mode", otherwise UFS device shall not do any flush if device enters Sleep or PowerDown power mode. Fix this by not changing device power mode if WriteBooster flush is required in ufshcd_suspend(). Signed-off-by: Stanley Chu --- drivers/scsi/ufs/ufs.h| 1 - drivers/scsi/ufs/ufshcd.c | 42 --- 2 files changed, 22 insertions(+), 21 deletions(-) diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index b3135344ab3f..9e4bc2e97ada 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -577,7 +577,6 @@ struct ufs_dev_info { u32 d_ext_ufs_feature_sup; u8 b_wb_buffer_type; u32 d_wb_alloc_units; - bool keep_vcc_on; u8 b_presrv_uspc_en; }; diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 169a3379e468..b9f7744ca2b4 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -8101,8 +8101,7 @@ static void ufshcd_vreg_set_lpm(struct ufs_hba *hba) !hba->dev_info.is_lu_power_on_wp) { ufshcd_setup_vreg(hba, false); } else if (!ufshcd_is_ufs_dev_active(hba)) { - if (!hba->dev_info.keep_vcc_on) - ufshcd_toggle_vreg(hba->dev, hba->vreg_info.vcc, false); + ufshcd_toggle_vreg(hba->dev, hba->vreg_info.vcc, false); if (!ufshcd_is_link_active(hba)) { ufshcd_config_vreg_lpm(hba, hba->vreg_info.vccq); ufshcd_config_vreg_lpm(hba, hba->vreg_info.vccq2); @@ -8172,6 +8171,7 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op) enum ufs_pm_level pm_lvl; enum ufs_dev_pwr_mode req_dev_pwr_mode; enum uic_link_state req_link_state; + bool keep_curr_dev_pwr_mode = false; hba->pm_op_in_progress = 1; if (!ufshcd_is_shutdown_pm(pm_op)) { @@ -8227,27 +8227,29 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op) ufshcd_disable_auto_bkops(hba); } /* -* With wb enabled, if the bkops is enabled or if the -* configured WB type is 70% full, keep vcc ON -* for the device to flush the wb buffer +* If device needs to do BKOP or WB buffer flush during +* Hibern8, keep device power mode as "active power mode" +* and VCC supply. */ - if ((hba->auto_bkops_enabled && ufshcd_is_wb_allowed(hba)) || - ufshcd_wb_keep_vcc_on(hba)) - hba->dev_info.keep_vcc_on = true; - else - hba->dev_info.keep_vcc_on = false; - } else { - hba->dev_info.keep_vcc_on = false; + keep_curr_dev_pwr_mode = hba->auto_bkops_enabled || + (((req_link_state == UIC_LINK_HIBERN8_STATE) || + ((req_link_state == UIC_LINK_ACTIVE_STATE) && + ufshcd_is_auto_hibern8_enabled(hba))) && + ufshcd_wb_keep_vcc_on(hba)); } This looks fine. But I still think the delayed check of flush status should be done to turn-off Vcc when flush is complete. - if ((req_dev_pwr_mode != hba->curr_dev_pwr_mode) && - ((ufshcd_is_runtime_pm(pm_op) && !hba->auto_bkops_enabled) || - !ufshcd_is_runtime_pm(pm_op))) { - /* ensure that bkops is disabled */ - ufshcd_disable_auto_bkops(hba); - ret = ufshcd_set_dev_pwr_mode(hba, req_dev_pwr_mode); - if (ret) - goto enable_gating; + if (req_dev_pwr_mode != hba->curr_dev_pwr_mode) { + if ((ufshcd_is_runtime_pm(pm_op) && !hba->auto_bkops_enabled) || + !ufshcd_is_runtime_pm(pm_op)) { + /* ensure that bkops is disabled */ + ufshcd_disable_auto_bkops(hba); + } + + if (!keep_curr_dev_pwr_mode) { + ret = ufshcd_set_dev_pwr_mode(hba, req_dev_pwr_mode); + if (ret) + goto enable_gating; + } } flush_work(>eeh_work); -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, Linux Foundation Collaborative Project
Re: [PATCH] keys: Move permissions checking decisions into the checking code
On 5/14/2020 9:58 AM, David Howells wrote: > How about this then? > > David > --- > commit fa37b6c7e2f86d16ede1e0e3cb73857152d51825 > Author: David Howells > Date: Thu May 14 17:48:55 2020 +0100 > > keys: Move permissions checking decisions into the checking code > > Overhaul the permissions checking, moving the decisions of which permits > to > request for what operation and what overrides to allow into the > permissions > checking functions in keyrings, SELinux and Smack. > > To this end, the KEY_NEED_* constants are turned into an enum and expanded > in number to cover operation types individually. > > Note that some more tweaking is probably needed to separate kernel uses, > e.g. AFS using rxrpc keys, from direct userspace users. > > Some overrides are available and this needs to be handled specially: > > (1) KEY_FLAG_KEEP in key->flags - The key may not be deleted and/or > things > may not be removed from the keyring. > > (2) KEY_FLAG_ROOT_CAN_CLEAR in key->flags - The keyring can be cleared by > CAP_SYS_ADMIN. > > (3) KEY_FLAG_ROOT_CAN_INVAL in key->flags - The key can be invalidated by > CAP_SYS_ADMIN. > > (4) An appropriate auth token being set in cred->request_key_auth that > gives a process transient permission to view and instantiate a key. > This is used by the kernel to delegate instantiation to userspace. > > Note that this requires some tweaks to the testsuite as some of the error > codes change. > > Signed-off-by: David Howells > Reported-by: Stephen Smalley > cc: Jarkko Sakkinen > cc: Paul Moore > cc: Stephen Smalley > cc: Casey Schaufler > cc: keyri...@vger.kernel.org > cc: seli...@vger.kernel.org > ... > diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c > index 8c61d175e195..ac9c93c48097 100644 > --- a/security/smack/smack_lsm.c > +++ b/security/smack/smack_lsm.c > @@ -4230,13 +4230,15 @@ static void smack_key_free(struct key *key) > * smack_key_permission - Smack access on a key > * @key_ref: gets to the object > * @cred: the credentials to use > - * @perm: requested key permissions > + * @need_perm: requested key permission > * > * Return 0 if the task has read and write to the object, > * an error code otherwise > */ > static int smack_key_permission(key_ref_t key_ref, > - const struct cred *cred, unsigned perm) > + const struct cred *cred, > + enum key_need_perm need_perm, > + unsigned int flags) > { > struct key *keyp; > struct smk_audit_info ad; > @@ -4244,12 +4246,6 @@ static int smack_key_permission(key_ref_t key_ref, > int request = 0; > int rc; > > - /* > - * Validate requested permissions > - */ > - if (perm & ~KEY_NEED_ALL) > - return -EINVAL; > - > keyp = key_ref_to_ptr(key_ref); > if (keyp == NULL) > return -EINVAL; > @@ -4265,6 +4261,71 @@ static int smack_key_permission(key_ref_t key_ref, > if (tkp == NULL) > return -EACCES; > > + /* > + * Validate requested permissions > + */ > + switch (need_perm) { > + case KEY_NEED_ASSUME_AUTHORITY: > + return 0; > + > + case KEY_NEED_DESCRIBE: > + case KEY_NEED_GET_SECURITY: > + request |= MAY_READ; > + auth_can_override = true; > + break; > + > + case KEY_NEED_CHOWN: > + case KEY_NEED_INVALIDATE: > + case KEY_NEED_JOIN: > + case KEY_NEED_LINK: > + case KEY_NEED_KEYRING_ADD: > + case KEY_NEED_KEYRING_CLEAR: > + case KEY_NEED_KEYRING_DELETE: > + case KEY_NEED_REVOKE: > + case KEY_NEED_SETPERM: > + case KEY_NEED_SET_RESTRICTION: > + case KEY_NEED_UPDATE: > + request |= MAY_WRITE; > + break; > + > + case KEY_NEED_INSTANTIATE: > + auth_can_override = true; > + break; > + > + case KEY_NEED_READ: > + case KEY_NEED_SEARCH: > + case KEY_NEED_USE: > + case KEY_NEED_WATCH: > + request |= MAY_READ; > + break; > + > + case KEY_NEED_SET_TIMEOUT: > + request |= MAY_WRITE; > + auth_can_override = true; > + break; > + > + case KEY_NEED_UNLINK: > + return 0; /* Mustn't prevent this; KEY_FLAG_KEEP is already > +* dealt with. */ > + > + default: > + WARN_ON(1); > + return -EINVAL; > + } > + > + /* Just allow the operation if the process has an authorisation token. > + * The presence of the token means that the kernel delegated > + * instantiation of a key to the process - which is problematic if we > + * then say that the process isn't allowed to get the description
[PATCH] of/platform: Avoid compilation warning
The of_find_device_by_node() helper has its dummy counterpart for when CONFIG_OF is not enabled. However, it is clearly stated in the kernel documentation that it "takes a reference to the embedded struct device which needs to be dropped after use". Which means the of_dev_put() helper might have to be called afterwards. Unfortunately, there is no of_dev_put() dummy function if OF_CONFIG is not enabled which seems odd in this case. The of_dev_put() helper is pretty simple, it just checks the validity of the single argument and calls put_device() on it. One can just call put_device() directly to avoid any build issue but I find much more accurate in this case to create the dummy helper. With this helper, a file using of_find_device_by_node() can also call of_dev_put() without triggering the following: error: implicit declaration of function ‘of_dev_put’; did you mean ‘of_node_put’? [-Werror=implicit-function-declaration] Signed-off-by: Miquel Raynal --- include/linux/of_platform.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/linux/of_platform.h b/include/linux/of_platform.h index 84a966623e78..84d9e60e517e 100644 --- a/include/linux/of_platform.h +++ b/include/linux/of_platform.h @@ -54,11 +54,13 @@ extern struct platform_device *of_device_alloc(struct device_node *np, struct device *parent); #ifdef CONFIG_OF extern struct platform_device *of_find_device_by_node(struct device_node *np); +extern void of_dev_put(struct platform_device *dev); #else static inline struct platform_device *of_find_device_by_node(struct device_node *np) { return NULL; } +static inline void of_dev_put(struct platform_device *dev) { } #endif /* Platform devices and busses creation */ -- 2.20.1
[PATCH] asm-generic: Update kernel documentation in io.h
The kernel documentation of: * bus_to_virt() * memcpy_fromio() * memcpy_toio() refers to older parameters. Update it to fit the actual names. Signed-off-by: Miquel Raynal --- include/asm-generic/io.h | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h index d39ac997dda8..cb617baf8d47 100644 --- a/include/asm-generic/io.h +++ b/include/asm-generic/io.h @@ -1051,8 +1051,8 @@ static inline void *bus_to_virt(unsigned long address) /** * memset_io Set a range of I/O memory to a constant value * @addr: The beginning of the I/O-memory range to set - * @val: The value to set the memory to - * @count: The number of bytes to set + * @value: The value to set the memory to + * @size: The number of bytes to set * * Set a range of I/O memory to a given value. */ @@ -1067,9 +1067,9 @@ static inline void memset_io(volatile void __iomem *addr, int value, #define memcpy_fromio memcpy_fromio /** * memcpy_fromio Copy a block of data from I/O memory - * @dst: The (RAM) destination for the copy - * @src: The (I/O memory) source for the data - * @count: The number of bytes to copy + * @buffer:The (RAM) destination for the copy + * @addr: The (I/O memory) source for the data + * @size: The number of bytes to copy * * Copy a block of data from I/O memory. */ @@ -1085,9 +1085,9 @@ static inline void memcpy_fromio(void *buffer, #define memcpy_toio memcpy_toio /** * memcpy_toio Copy a block of data into I/O memory - * @dst: The (I/O memory) destination for the copy - * @src: The (RAM) source for the data - * @count: The number of bytes to copy + * @addr: The (I/O memory) destination for the copy + * @buffer:The (RAM) source for the data + * @size: The number of bytes to copy * * Copy a block of data to I/O memory. */ -- 2.20.1
Re: [PATCH 3/4] hwspinlock: qcom: Allow mmio usage in addition to syscon
On Tue 12 May 20:57 PDT 2020, Baolin Wang wrote: > On Wed, May 13, 2020 at 8:55 AM Bjorn Andersson > wrote: > > > > In all modern Qualcomm platforms the mutex region of the TCSR is forked > > off into its own block, all with a offset of 0 and stride of 4096. So > > add support for directly memory mapping this register space, to avoid > > the need to represent this block using a syscon. > > > > Signed-off-by: Bjorn Andersson > > --- > > drivers/hwspinlock/qcom_hwspinlock.c | 72 +--- > > 1 file changed, 56 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/hwspinlock/qcom_hwspinlock.c > > b/drivers/hwspinlock/qcom_hwspinlock.c [..] > > +static struct regmap *qcom_hwspinlock_probe_mmio(struct platform_device > > *pdev, > > +u32 *offset, u32 *stride) > > +{ > > + struct device *dev = >dev; > > + struct resource *res; > > + void __iomem *base; > > + > > + /* All modern platform has offset 0 and stride of 4k */ > > + *offset = 0; > > + *stride = 0x1000; > > + > > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); > > + base = devm_ioremap_resource(>dev, res); > > I think you can use devm_platform_ioremap_resource(pdev, 0) to > simplify your code, otherwise looks good to me. You're right, I better fix this before someone with Coccinelle get the chance ;) > Reviewed-by: Baolin Wang > Thanks, Bjorn
Re: [PATCH v2 0/4] clk: qcom: gdsc: Handle supply regulators
On 17-04-20, 00:00, Bjorn Andersson wrote: > Handle supply regulators for GDSCs to allow probe deferral when regulators are > not yet present/enabled and to allow the GDSC to enable/disable dependencies > as > needed. Reviewed-by: Vinod Koul -- ~Vinod
Re: [PATCH] arm64: dts: qcom: c630: Specify UFS device reset
On 05-04-20, 23:00, Bjorn Andersson wrote: > On some device the reset line for the UFS memory needs to be tickled in > order for UFS to initialize properly, add this to the ufs_mem_hc node. Reviewed-by: Vinod Koul -- ~Vinod
Re: [PATCH v4 7/7] drm/mediatek: mtk_dsi: Create connector for bridges
Hi Chun-Kuang, On 14/5/20 18:44, Chun-Kuang Hu wrote: > Hi, Enric: > > Enric Balletbo i Serra 於 2020年5月14日 週四 > 下午11:42寫道: >> >> Hi Chun-Kuang, >> >> On 14/5/20 16:28, Chun-Kuang Hu wrote: >>> Hi, Enric: >>> >>> Enric Balletbo Serra 於 2020年5月14日 週四 上午12:41寫道: Hi Chun-Kuang, Missatge de Enric Balletbo i Serra del dia dv., 1 de maig 2020 a les 17:25: > > Use the drm_bridge_connector helper to create a connector for pipelines > that use drm_bridge. This allows splitting connector operations across > multiple bridges when necessary, instead of having the last bridge in > the chain creating the connector and handling all connector operations > internally. > > Signed-off-by: Enric Balletbo i Serra > Acked-by: Sam Ravnborg A gentle ping on this, I think that this one is the only one that still needs a review in the series. >>> >>> This is what I reply in patch v3: >>> >> >> Sorry for missing this. >> >>> I think the panel is wrapped into next_bridge here, >>> >> >> Yes, you can have for example: >> >> 1. drm_bridge (mtk_dsi) -> drm_bridge (ps8640 - dsi-to-edp) -> >> drm_panel_bridge >> (edp panel) >> >> or a >> >> 2. drm_bridge (mtk_dsi)-> drm_panel_bridge (dsi panel) >> >> The _first_ one is my use case >> >>> if (panel) { >> >> This handles the second case, where you attach a dsi panel. >> >>> dsi->next_bridge = devm_drm_panel_bridge_add(dev, panel); >>> >>> so the next_bridge is a panel_bridge, in its attach function >>> panel_bridge_attach(), >>> according to the flag DRM_BRIDGE_ATTACH_NO_CONNECTOR, if not exist, >>> it would create connector and attach connector to panel. >>> >>> I'm not sure this flag would exist or not, but for both case, it's strange. >>> If exist, you create connector in this patch but no where to attach >>> connector to panel. >> >> Yes, in fact, this is transitional patch needed, as once I converted mtk_dpi, >> mtk_dsi and mtk_hdmi to the new drm_bridge API the >> drm_bridge_connector_init() >> will be done in mtk_drm_drv. We will need to call drm_bridge_connector_init >> for >> dpi and dsi pipes and remove that call from mtk_dsi and mtk_dpi drivers. The >> graphic controller driver should create connectors and CRTCs, as example you >> can >> take a look at drivers/gpu/drm/omapdrm/omap_drv.c >> > > I have such question because I've reviewed omap's driver. In omap's > driver, after it call drm_bridge_connector_init(), it does this: > > if (pipe->output->panel) { > ret = drm_panel_attach(pipe->output->panel, > pipe->connector); > if (ret < 0) > return ret; > } > > In this patch, you does not do this. > I see, so yes, I am probably missing call drm_panel_attach in case there is a direct panel attached. Thanks for pointing it. I'll send a new version adding the drm_panel_attach call. >>> If not exist, the next_brige would create one connector and this brige >>> would create another connector. >>> >>> I think in your case, mtk_dsi does not directly connect to a panel, so >> >> Exactly >> >>> I need a exact explain. Or someone could test this on a >>> directly-connect-panel platform. >> >> I don't think I am breaking this use case but AFAICS there is no users in >> mainline that directly connect a panel using the mediatek driver. As I said >> my >> use case is the other so I can't really test. Do you know anyone that can >> test this? > > I'm not sure who can test this, but [1], which is sent by YT Shen in a > series, is a patch to support dsi command mode so dsi could directly > connect to panel. > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/mediatek?h=v5.7-rc5=21898816831fc60c92dd634ab4316a24da7eb4af > > It's better that someone could test this case, but if no one would > test this, I could also accept a good-look patch. > > Regards, > Chun-Kuang. > >> >> Thanks, >> Enric >> >>> >>> Regards, >>> Chun-Kuang. >>> Thanks, Enric > --- > > Changes in v4: None > Changes in v3: > - Move the bridge.type line to the patch that adds drm_bridge support. > (Laurent Pinchart) > > Changes in v2: None > > drivers/gpu/drm/mediatek/mtk_dsi.c | 13 - > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c > b/drivers/gpu/drm/mediatek/mtk_dsi.c > index 4f3bd095c1ee..471fcafdf348 100644 > --- a/drivers/gpu/drm/mediatek/mtk_dsi.c > +++ b/drivers/gpu/drm/mediatek/mtk_dsi.c > @@ -17,6 +17,7 @@ > > #include > #include > +#include > #include > #include > #include > @@ -183,6 +184,7 @@ struct mtk_dsi { > struct drm_encoder encoder; > struct drm_bridge bridge; > struct drm_bridge *next_bridge; > + struct drm_connector *connector; > struct phy *phy; > > void __iomem *regs; > @@
Re: [PATCH] net: phy: mdio-moxart: remove unneeded include
On 5/14/2020 9:59 AM, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski > > mdio-moxart doesn't use regulators in the driver code. We can remove > the regulator include. > > Signed-off-by: Bartosz Golaszewski Reviewed-by: Florian Fainelli -- Florian
Re: [PATCH] arm64: dts: qcom: c630: Specify UFS device reset
On 05-04-20, 23:00, Bjorn Andersson wrote: > On some device the reset line for the UFS memory needs to be tickled in > order for UFS to initialize properly, add this to the ufs_mem_hc node. Reviewed-by: Vinod Koul -- ~Vinod
Re: [PATCH v7 2/4] usb: dwc3: qcom: Add interconnect support in dwc3 driver
On Thu, May 14, 2020 at 02:30:28PM +0300, Felipe Balbi wrote: > Felipe Balbi writes: > > > Hi, > > > > Sandeep Maheswaram writes: > >> +static int dwc3_qcom_interconnect_init(struct dwc3_qcom *qcom) > >> +{ > >> + struct device *dev = qcom->dev; > >> + int ret; > >> + > >> + if (!device_is_bound(>dwc3->dev)) > >> + return -EPROBE_DEFER; > > > > this breaks allmodconfig. I'm dropping this series from my queue for > > this merge window. > > Sorry, I meant this patch ;-) I guess that's due to INTERCONNECT being a module. There is currently a discussion about this with Viresh and Georgi in response to another automated build failure. Viresh suggests changing CONFIG_INTERCONNECT from tristate to bool, which seems sensible to me given that interconnect is a core subsystem. Let's hold back with this patch/series then until that is sorted out.
Re: [PATCH] arm64: dts: qcom: c630: Add WiFi node
On 17-10-19, 22:58, Bjorn Andersson wrote: > Specify regulators and enable the node. The firmware uses the 8 > bit version of the host capability message, so specify this quirk. Reviewed-by: Vinod Koul -- ~Vinod
Re: [PATCH] drivers: ipa: use devm_kzalloc for simplicity
On Wed, 13 May 2020 20:55:20 -0700 Wang Wenhu wrote: > Make a substitution of kzalloc with devm_kzalloc to simplify the > ipa_probe() process. > > Signed-off-by: Wang Wenhu The code is perfectly fine as is. What problem are you trying to solve?
Re: [PATCH RESEND 3/4] Documentation/litmus-tests: Merge atomic's README into top-level one
On Thu, May 14, 2020 at 08:46:18AM +0800, Boqun Feng wrote: > On Wed, May 13, 2020 at 06:39:03AM +0900, Akira Yokosawa wrote: > > From 96fa6680e3b990633ecbb6d11acf03a161b790bd Mon Sep 17 00:00:00 2001 > > From: Akira Yokosawa > > Date: Sun, 10 May 2020 15:12:57 +0900 > > Subject: [PATCH RESEND 3/4] Documentation/litmus-tests: Merge atomic's > > README into top-level one > > > > Where Documentation/litmus-tests/README lists RCU litmus tests, > > Documentation/litmus-tests/atomic/README lists atomic litmus tests. > > For symmetry, merge the latter into former, with some context > > adjustment in the introduction. > > > > Signed-off-by: Akira Yokosawa > > Acked-by: Andrea Parri > > Acked-by: Joel Fernandes (Google) > > Acked-by: Boqun Feng > > Thanks! Applied, and thank you all! I rebased, cancelling the revert with the original, resulting in an updated lkmm branch on -rcu. There was one minor conflict, so could one of you please check to make sure that I resolved things appropriately? Thanx, Paul > Regards, > Boqun > > > --- > > Documentation/litmus-tests/README| 19 +++ > > Documentation/litmus-tests/atomic/README | 16 > > 2 files changed, 19 insertions(+), 16 deletions(-) > > delete mode 100644 Documentation/litmus-tests/atomic/README > > > > diff --git a/Documentation/litmus-tests/README > > b/Documentation/litmus-tests/README > > index c4307ea9f996..ac0b270b456c 100644 > > --- a/Documentation/litmus-tests/README > > +++ b/Documentation/litmus-tests/README > > @@ -2,6 +2,25 @@ > > LITMUS TESTS > > > > > > +Each subdirectory contains litmus tests that are typical to describe the > > +semantics of respective kernel APIs. > > +For more information about how to "run" a litmus test or how to generate > > +a kernel test module based on a litmus test, please see > > +tools/memory-model/README. > > + > > + > > +atomic (/atomic derectory) > > +-- > > + > > +Atomic-RMW+mb__after_atomic-is-stronger-than-acquire.litmus > > +Test that an atomic RMW followed by a smp_mb__after_atomic() is > > +stronger than a normal acquire: both the read and write parts of > > +the RMW are ordered before the subsequential memory accesses. > > + > > +Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus > > +Test that atomic_set() cannot break the atomicity of atomic RMWs. > > + > > + > > RCU (/rcu directory) > > > > > > diff --git a/Documentation/litmus-tests/atomic/README > > b/Documentation/litmus-tests/atomic/README > > deleted file mode 100644 > > index 714cf93816ea.. > > --- a/Documentation/litmus-tests/atomic/README > > +++ /dev/null > > @@ -1,16 +0,0 @@ > > -This directory contains litmus tests that are typical to describe the > > semantics > > -of our atomic APIs. For more information about how to "run" a litmus test > > or > > -how to generate a kernel test module based on a litmus test, please see > > -tools/memory-model/README. > > - > > - > > -LITMUS TESTS > > - > > - > > -Atomic-RMW+mb__after_atomic-is-stronger-than-acquire > > - Test that an atomic RMW followed by a smp_mb__after_atomic() is > > - stronger than a normal acquire: both the read and write parts of > > - the RMW are ordered before the subsequential memory accesses. > > - > > -Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus > > - Test that atomic_set() cannot break the atomicity of atomic RMWs. > > -- > > 2.17.1 > > > >
Re: [PATCH] net: phy: mdio-moxart: remove unneeded include
czw., 14 maj 2020 o 19:13 Florian Fainelli napisał(a): > > > > On 5/14/2020 9:59 AM, Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski > > > > mdio-moxart doesn't use regulators in the driver code. We can remove > > the regulator include. > > > > Signed-off-by: Bartosz Golaszewski > > Reviewed-by: Florian Fainelli > -- > Florian Hi Andrew, Florian, I noticed this by accident when I was looking at the PHY drivers to see how they handle regulators supplying PHYs. In the case of the MediaTek Pumpkin board I'm working on - the PHY is a Realtek RTL8201F and it's supplied by a regulator that's enabled on boot by the relevant PMIC driver. I'd like to model it in the device-tree but I'm not sure what the correct approach is. Some ethernet drivers have a phy-supply property but it looks wrong to me - IMO this should be handled at the PHY driver level. Is it fine if I add a probe() callback to the realtek driver and retrieve the "phy-supply" there? Bart
[PATCH v13 1/9] nvme-core: Clear any SGL flags in passthru commands
The host driver should decide whether to use SGLs or PRPs and they currently assume the flags are cleared after the call to nvme_setup_cmd(). However, passed-through commands may erroneously set these bits; so clear them for all cases. Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/host/core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index f3c037f5a9ba..d22859543e4b 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -756,6 +756,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req, case REQ_OP_DRV_IN: case REQ_OP_DRV_OUT: memcpy(cmd, nvme_req(req)->cmd, sizeof(*cmd)); + /* passthru commands should let the driver set the SGL flags */ + cmd->common.flags &= ~NVME_CMD_SGL_ALL; break; case REQ_OP_FLUSH: nvme_setup_flush(ns, cmd); -- 2.20.1
Re: [PATCH v1] arm64: dts: qcom: apq8016-sbc-d3: Add Qualcomm APQ8016 SBC + D3 mezzanine
On 22-04-20, 13:10, Robert Foss wrote: > Add device treee support for the Qualcomm APQ8016 SBC, otherwise known as > the Dragonboard 410c with the D3 mezzanine expansion board. > > The D3 mezzanine ships in a kit with a OmniVision 5640 sensor module, > which is what this DT targets. > > Signed-off-by: Robert Foss > --- > arch/arm64/boot/dts/qcom/Makefile | 1 + > arch/arm64/boot/dts/qcom/apq8016-sbc-d3.dts | 45 + > 2 files changed, 46 insertions(+) > create mode 100644 arch/arm64/boot/dts/qcom/apq8016-sbc-d3.dts > > diff --git a/arch/arm64/boot/dts/qcom/Makefile > b/arch/arm64/boot/dts/qcom/Makefile > index cc103f7020fd..025362471929 100644 > --- a/arch/arm64/boot/dts/qcom/Makefile > +++ b/arch/arm64/boot/dts/qcom/Makefile > @@ -1,5 +1,6 @@ > # SPDX-License-Identifier: GPL-2.0 > dtb-$(CONFIG_ARCH_QCOM) += apq8016-sbc.dtb > +dtb-$(CONFIG_ARCH_QCOM) += apq8016-sbc-d3.dtb > dtb-$(CONFIG_ARCH_QCOM) += apq8096-db820c.dtb > dtb-$(CONFIG_ARCH_QCOM) += apq8096-ifc6640.dtb > dtb-$(CONFIG_ARCH_QCOM) += ipq6018-cp01-c1.dtb > diff --git a/arch/arm64/boot/dts/qcom/apq8016-sbc-d3.dts > b/arch/arm64/boot/dts/qcom/apq8016-sbc-d3.dts > new file mode 100644 > index ..1b85adeeada1 > --- /dev/null > +++ b/arch/arm64/boot/dts/qcom/apq8016-sbc-d3.dts > @@ -0,0 +1,45 @@ > +// SPDX-License-Identifier: GPL-2.0-only Dual BSD + GPL please > +/* > + * Copyright (c) 2015, The Linux Foundation. All rights reserved. we are in 2020 now :) > + */ > + > +/dts-v1/; > + > +#include "apq8016-sbc.dtsi" > + > +/ { > + model = "Qualcomm Technologies, Inc. APQ 8016 SBC w/ D3 Mezzanine"; > + compatible = "qcom,apq8016-sbc", "qcom,apq8016", "qcom,sbc"; > +}; > + > +_i2c0 { > + /delete-node/ camera_rear@3b; > + > + camera_rear@76 { > + compatible = "ovti,ov5640"; > + reg = <0x76>; > + > + enable-gpios = < 34 GPIO_ACTIVE_HIGH>; > + reset-gpios = < 35 GPIO_ACTIVE_LOW>; > + pinctrl-names = "default"; > + pinctrl-0 = <_rear_default>; > + > + clocks = < GCC_CAMSS_MCLK0_CLK>; > + clock-names = "xclk"; > + clock-frequency = <2388>; > + > + vdddo-supply = <_vdddo_1v8>; > + vdda-supply = <_vdda_2v8>; > + vddd-supply = <_vddd_1v5>; > + > + status = "ok"; > + > + port { > + ov5640_ep: endpoint { > + clock-lanes = <1>; > + data-lanes = <0 2>; > + remote-endpoint = <_ep>; > + }; > + }; > + }; > +}; > -- > 2.25.1 -- ~Vinod
[PATCH v13 4/9] nvmet-passthru: Introduce NVMet passthru Kconfig option
From: Chaitanya Kulkarni This patch updates KConfig file for the NVMeOF target where we add new option so that user can selectively enable/disable passthru code. Signed-off-by: Chaitanya Kulkarni [log...@deltatee.com: fixed some of the wording in the help message] Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/target/Kconfig | 12 1 file changed, 12 insertions(+) diff --git a/drivers/nvme/target/Kconfig b/drivers/nvme/target/Kconfig index d7f48c0fb311..1ef775fa0048 100644 --- a/drivers/nvme/target/Kconfig +++ b/drivers/nvme/target/Kconfig @@ -15,6 +15,18 @@ config NVME_TARGET To configure the NVMe target you probably want to use the nvmetcli tool from http://git.infradead.org/users/hch/nvmetcli.git. +config NVME_TARGET_PASSTHRU + bool "NVMe Target Passthrough support" + depends on NVME_CORE + depends on NVME_TARGET + help + This enables target side NVMe passthru controller support for the + NVMe Over Fabrics protocol. It allows for hosts to manage and + directly access an actual NVMe controller residing on the target + side, incuding executing Vendor Unique Commands. + + If unsure, say N. + config NVME_TARGET_LOOP tristate "NVMe loopback device support" depends on NVME_TARGET -- 2.20.1
[PATCH v13 9/9] nvmet-configfs: Introduce passthru configfs interface
When CONFIG_NVME_TARGET_PASSTHRU as 'passthru' directory will be added to each subsystem. The directory is similar to a namespace and has two attributes: device_path and enable. The user must set the path to the nvme controller's char device and write '1' to enable the subsystem to use passthru. Any given subsystem is prevented from enabling both a regular namespace and the passthru device. If one is enabled, enabling the other will produce an error. Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/target/configfs.c | 99 ++ drivers/nvme/target/nvmet.h| 1 + 2 files changed, 100 insertions(+) diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c index e0ce6e5feb3a..59eb52a0d06c 100644 --- a/drivers/nvme/target/configfs.c +++ b/drivers/nvme/target/configfs.c @@ -613,6 +613,103 @@ static const struct config_item_type nvmet_namespaces_type = { .ct_owner = THIS_MODULE, }; +#ifdef CONFIG_NVME_TARGET_PASSTHRU + +static ssize_t nvmet_passthru_device_path_show(struct config_item *item, + char *page) +{ + struct nvmet_subsys *subsys = to_subsys(item->ci_parent); + + return snprintf(page, PAGE_SIZE, "%s\n", subsys->passthru_ctrl_path); +} + +static ssize_t nvmet_passthru_device_path_store(struct config_item *item, + const char *page, size_t count) +{ + struct nvmet_subsys *subsys = to_subsys(item->ci_parent); + size_t len; + int ret; + + mutex_lock(>lock); + + ret = -EBUSY; + if (subsys->passthru_ctrl) + goto out_unlock; + + ret = -EINVAL; + len = strcspn(page, "\n"); + if (!len) + goto out_unlock; + + kfree(subsys->passthru_ctrl_path); + ret = -ENOMEM; + subsys->passthru_ctrl_path = kstrndup(page, len, GFP_KERNEL); + if (!subsys->passthru_ctrl_path) + goto out_unlock; + + mutex_unlock(>lock); + + return count; +out_unlock: + mutex_unlock(>lock); + return ret; +} +CONFIGFS_ATTR(nvmet_passthru_, device_path); + +static ssize_t nvmet_passthru_enable_show(struct config_item *item, + char *page) +{ + struct nvmet_subsys *subsys = to_subsys(item->ci_parent); + + return sprintf(page, "%d\n", subsys->passthru_ctrl ? 1 : 0); +} + +static ssize_t nvmet_passthru_enable_store(struct config_item *item, + const char *page, size_t count) +{ + struct nvmet_subsys *subsys = to_subsys(item->ci_parent); + bool enable; + int ret = 0; + + if (strtobool(page, )) + return -EINVAL; + + if (enable) + ret = nvmet_passthru_ctrl_enable(subsys); + else + nvmet_passthru_ctrl_disable(subsys); + + return ret ? ret : count; +} +CONFIGFS_ATTR(nvmet_passthru_, enable); + +static struct configfs_attribute *nvmet_passthru_attrs[] = { + _passthru_attr_device_path, + _passthru_attr_enable, + NULL, +}; + +static const struct config_item_type nvmet_passthru_type = { + .ct_attrs = nvmet_passthru_attrs, + .ct_owner = THIS_MODULE, +}; + +static void nvmet_add_passthru_group(struct nvmet_subsys *subsys) +{ + config_group_init_type_name(>passthru_group, + "passthru", _passthru_type); + configfs_add_default_group(>passthru_group, + >group); +} + +#else /* CONFIG_NVME_TARGET_PASSTHRU */ + +static void nvmet_add_passthru_group(struct nvmet_subsys *subsys) +{ +} + +#endif /* CONFIG_NVME_TARGET_PASSTHRU */ + static int nvmet_port_subsys_allow_link(struct config_item *parent, struct config_item *target) { @@ -1047,6 +1144,8 @@ static struct config_group *nvmet_subsys_make(struct config_group *group, configfs_add_default_group(>allowed_hosts_group, >group); + nvmet_add_passthru_group(subsys); + return >group; } diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index 76c3a7cb9c89..d40452e50212 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -243,6 +243,7 @@ struct nvmet_subsys { #ifdef CONFIG_NVME_TARGET_PASSTHRU struct nvme_ctrl*passthru_ctrl; char*passthru_ctrl_path; + struct config_group passthru_group; #endif /* CONFIG_NVME_TARGET_PASSTHRU */ }; -- 2.20.1
[PATCH v13 6/9] nvme: Export existing nvme core functions
Export nvme_put_ns(), nvme_command_effects(), nvme_execute_passthru_rq() and nvme_find_get_ns() for use in the nvmet passthru code. The exports are conditional on CONFIG_NVME_TARGET_PASSTHRU. Based-on-a-patch-by: Chaitanya Kulkarni Signed-off-by: Logan Gunthorpe Reviewed-by: Max Gurtovoy Reviewed-by: Sagi Grimberg --- drivers/nvme/host/core.c | 14 +- drivers/nvme/host/nvme.h | 5 + 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 2604971362d8..c76f2e4b7c1c 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -457,7 +457,7 @@ static void nvme_free_ns(struct kref *kref) kfree(ns); } -static void nvme_put_ns(struct nvme_ns *ns) +void nvme_put_ns(struct nvme_ns *ns) { kref_put(>kref, nvme_free_ns); } @@ -890,8 +890,8 @@ static void *nvme_add_user_metadata(struct bio *bio, void __user *ubuf, return ERR_PTR(ret); } -static u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns, - u8 opcode) +u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns, +u8 opcode) { u32 effects = 0; @@ -976,7 +976,7 @@ static void nvme_passthru_end(struct nvme_ctrl *ctrl, u32 effects) nvme_queue_scan(ctrl); } -static void nvme_execute_passthru_rq(struct request *rq) +void nvme_execute_passthru_rq(struct request *rq) { struct nvme_command *cmd = nvme_req(rq)->cmd; struct nvme_ctrl *ctrl = nvme_req(rq)->ctrl; @@ -3530,7 +3530,7 @@ static int ns_cmp(void *priv, struct list_head *a, struct list_head *b) return nsa->head->ns_id - nsb->head->ns_id; } -static struct nvme_ns *nvme_find_get_ns(struct nvme_ctrl *ctrl, unsigned nsid) +struct nvme_ns *nvme_find_get_ns(struct nvme_ctrl *ctrl, unsigned nsid) { struct nvme_ns *ns, *ret = NULL; @@ -4311,6 +4311,10 @@ EXPORT_SYMBOL_GPL(nvme_sync_queues); * use by the nvmet-passthru and should not be used for * other things. */ +EXPORT_SYMBOL_GPL(nvme_put_ns); +EXPORT_SYMBOL_GPL(nvme_command_effects); +EXPORT_SYMBOL_GPL(nvme_execute_passthru_rq); +EXPORT_SYMBOL_GPL(nvme_find_get_ns); struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path) { diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 9195dd97b61b..76b4aa8f35e1 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -695,6 +695,11 @@ static inline void nvme_hwmon_init(struct nvme_ctrl *ctrl) { } * use by the nvmet-passthru and should not be used for * other things. */ +void nvme_put_ns(struct nvme_ns *ns); +u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns, +u8 opcode); +void nvme_execute_passthru_rq(struct request *rq); +struct nvme_ns *nvme_find_get_ns(struct nvme_ctrl *ctrl, unsigned int nsid); struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path); #endif /* CONFIG_NVME_TARGET_PASSTHRU */ -- 2.20.1
[PATCH v13 2/9] nvme: Create helper function to obtain command effects
Separate the code to obtain command effects from the code to start a passthru request and open code nvme_known_admin_effects() in the new helper. The new helper function will be necessary for nvmet passthru code to determine if we need to change out of interrupt context to handle the effects. Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/host/core.c | 39 ++- 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index d22859543e4b..5062a83c3634 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1317,22 +1317,8 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio) metadata, meta_len, lower_32_bits(io.slba), NULL, 0); } -static u32 nvme_known_admin_effects(u8 opcode) -{ - switch (opcode) { - case nvme_admin_format_nvm: - return NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC | - NVME_CMD_EFFECTS_CSE_MASK; - case nvme_admin_sanitize_nvm: - return NVME_CMD_EFFECTS_CSE_MASK; - default: - break; - } - return 0; -} - -static u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, - u8 opcode) +static u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns, + u8 opcode) { u32 effects = 0; @@ -1348,7 +1334,26 @@ static u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, if (ctrl->effects) effects = le32_to_cpu(ctrl->effects->acs[opcode]); - effects |= nvme_known_admin_effects(opcode); + + switch (opcode) { + case nvme_admin_format_nvm: + effects |= NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC | + NVME_CMD_EFFECTS_CSE_MASK; + break; + case nvme_admin_sanitize_nvm: + effects |= NVME_CMD_EFFECTS_CSE_MASK; + break; + default: + break; + } + + return effects; +} + +static u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, + u8 opcode) +{ + u32 effects = nvme_command_effects(ctrl, ns, opcode); /* * For simplicity, IO to all namespaces is quiesced even if the command -- 2.20.1
[PATCH v13 0/9] nvmet: add target passthru commands support
This is v13 of the passthru patchset which is mostly a resend of v12 with Sagi's reviewed-by tags collected. Below, I'll reiterrate some points I made previously that haven't been responded to: I don't think cloning the ctrl_id or the subsysnqn is a good idea. I sent an email trying to explain why here[1] but there was no response. In short, I think cloning the ctrl_id will break multipathing over fabrics and copying the subsysnqn only has the effect of breaking loopback; the user can always copy the underlying subsysnqn if it makes sense for their overall system. I maintain overriding the CMIC bit in the ctrl id is necessary to allow multipath over fabrics even if the underlying device did not support multipath. I also think the black list for admin commands is appropriate, and I added it based on Sagi's feedback[2]. There are plenty of commands that may be dangerous like firmware update and format NVM commands, and NS attach commands won't work out of the box because we don't copy the ctrl_id. It seems like there's more commands to be careful of than ones that are that are obviously acceptable. So, I think the prudent course is blacklisting by default until someone has a usecase and can show the command is safe seems and makes sense. For our present use cases, the identify, log page and vendor specific commands are all that we care about. A git branch is available here and is based on v5.7-rc5: https://github.com/sbates130272/linux-p2pmem nvmet_passthru_v13 [1] https://lore.kernel.org/linux-block/247eca47-c3bc-6452-fb19-f7aa27b05...@deltatee.com/ [2] https://lore.kernel.org/linux-block/e4430207-7def-8776-0289-0d58689dc...@grimberg.me/ -- v13 Changes: 1. Rebased onto v5.7-rc5 2. Collected Sagi's Reviewed-by tags v12 Changes: 1. Rebased onto v5.7-rc1 2. Collected Sagi's Reviewed-by tags 3. Per Sagi's feedback implement an whitelist for set/get features and audit the features for whether they are suitable to be passed-through v11 Changes: 1. Rebased onto v5.6-rc2 2. Collected Max's Reviewed-By tag v10 Changes: 1. Rebased onto v5.5-rc1 2. Disable all exports in core nvme if CONFIG_NVME_TARGET_PASSTHRU is not set and put them near the end of the file with a big fat comment (per Christoph) 3. Don't fake up the vs field: pass it through as is and bump it to 1.2.1 if it is below that (per Christoph) 4. Rework how passthru requests are submitted into the core with proper nvme_passthru_start/end handling (per Christoph) 5. Rework how commands are parsed with passthru hooks in parse_admin_cmd() and nvmet_parse_io_cmd() (per Christoph) 6. Rework commands are handled so they are only done in a work item if absolutely necessary (per Christoph) 7. The data_len hack was dropped as a patchset was introduced to remove data_len altogether (per Christoph) 8. The passthru accounting changes are now in v5.5-rc1 9. A large number of other minor cleanups that were pointed out by Christoph v9 Changes: 1. Rebased onto v5.4-rc2 (required adjusting nvme_identify_ns() usage) 2. Collected Sagi's Reviewed-By Tags 3. Squashed seperate Kconfig patch into passthru patch (Per Sagi) 4. Set REQ_FUA for flush requests and remove special casing on RQF_IO_STAT (Per Sagi) v8 Changes: 1. Rebased onto v5.3-rc6 2. Collected Max's Reviewed-By tags 3. Converted admin command black-list to a white-list, but allow all vendor specific commands. With this, we feel it's safe to allow multiple connections from hosts. (As per Sagi's feedback) v7 Changes: 1. Rebased onto v5.3-rc2 2. Rework nvme_ctrl_get_by_path() to use filp_open() instead of the cdev changes that were in v6. (Per Al) 3. Override the cmic bit to allow multipath and allow multiple connections from the same hostnqn. (At the same time I cleaned up the method of rejecting multiple connections.) See Patch 8) 4. Found a bug when used with the tcp transport (See Patch 10) v6 Changes: 1. Rebased onto v5.3-rc1 2. Rework configfs interface to simply be a passthru directory within the existing subsystem. The directory is similar to and consistent with a namespace directory. 3. Have the configfs take a path instead of a bare controller name 4. Renamed the main passthru file to io-cmd-passthru.c for consistency with the file and block-dev methods. 5. Cleaned up all the CONFIG_NVME_TARGET_PASSTHRU usage to remove all the inline #ifdefs 6. Restructured nvmet_passthru_make_request() a bit for clearer code 7. Moved nvme_find_get_ns() call into nvmet_passthru_execute_cmd() seeing calling it in nvmet_req_init() causes a lockdep warning due to nvme_find_get_ns() being able to sleep. 8. Added a check in nvmet_passthru_execute_cmd() to ensure we don't violate queue_max_segments or queue_max_hw_sectors and overrode mdts to ensure hosts don't intentionally submit commands that will exceed these
[PATCH v13 8/9] nvmet-passthru: Add enable/disable helpers
This patch adds helper functions which are used in the NVMeOF configfs when the user is configuring the passthru subsystem. Here we ensure that only one subsys is assigned to each nvme_ctrl by using an xarray on the cntlid. The subsystem's version number is overridden by the passed through controller's version. However, if that version is less than 1.2.1, then we bump the advertised version to that and print a warning in dmesg. Based-on-a-patch-by: Chaitanya Kulkarni Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/target/configfs.c | 4 ++ drivers/nvme/target/core.c | 10 +++- drivers/nvme/target/nvmet.h| 12 + drivers/nvme/target/passthru.c | 87 ++ 4 files changed, 112 insertions(+), 1 deletion(-) diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c index 58cabd7b6fc5..e0ce6e5feb3a 100644 --- a/drivers/nvme/target/configfs.c +++ b/drivers/nvme/target/configfs.c @@ -826,6 +826,10 @@ static ssize_t nvmet_subsys_attr_version_store(struct config_item *item, int major, minor, tertiary = 0; int ret; + /* passthru subsystems use the underlying controller's version */ + if (nvmet_passthru_ctrl(subsys)) + return -EINVAL; + ret = sscanf(page, "%d.%d.%d\n", , , ); if (ret != 2 && ret != 3) return -EINVAL; diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index 2547e0d8951c..a2cdd35ffb1d 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -522,6 +522,12 @@ int nvmet_ns_enable(struct nvmet_ns *ns) mutex_lock(>lock); ret = 0; + + if (nvmet_passthru_ctrl(subsys)) { + pr_info("cannot enable both passthru and regular namespaces for a single subsystem"); + goto out_unlock; + } + if (ns->enabled) goto out_unlock; @@ -1421,7 +1427,7 @@ struct nvmet_subsys *nvmet_subsys_alloc(const char *subsysnqn, if (!subsys) return ERR_PTR(-ENOMEM); - subsys->ver = NVME_VS(1, 3, 0); /* NVMe 1.3.0 */ + subsys->ver = NVMET_DEFAULT_VS; /* generate a random serial number as our controllers are ephemeral: */ get_random_bytes(>serial, sizeof(subsys->serial)); @@ -1463,6 +1469,8 @@ static void nvmet_subsys_free(struct kref *ref) WARN_ON_ONCE(!list_empty(>namespaces)); + nvmet_passthru_subsys_free(subsys); + kfree(subsys->subsysnqn); kfree_rcu(subsys->model, rcuhead); kfree(subsys); diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index deb996c90804..76c3a7cb9c89 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -20,6 +20,8 @@ #include #include +#define NVMET_DEFAULT_VS NVME_VS(1, 3, 0) + #define NVMET_ASYNC_EVENTS 4 #define NVMET_ERROR_LOG_SLOTS 128 #define NVMET_NO_ERROR_LOC ((u16)-1) @@ -240,6 +242,7 @@ struct nvmet_subsys { #ifdef CONFIG_NVME_TARGET_PASSTHRU struct nvme_ctrl*passthru_ctrl; + char*passthru_ctrl_path; #endif /* CONFIG_NVME_TARGET_PASSTHRU */ }; @@ -523,6 +526,9 @@ static inline u32 nvmet_dsm_len(struct nvmet_req *req) } #ifdef CONFIG_NVME_TARGET_PASSTHRU +void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys); +int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys); +void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys); u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req); u16 nvmet_parse_passthru_io_cmd(struct nvmet_req *req); static inline struct nvme_ctrl *nvmet_passthru_ctrl(struct nvmet_subsys *subsys) @@ -530,6 +536,12 @@ static inline struct nvme_ctrl *nvmet_passthru_ctrl(struct nvmet_subsys *subsys) return subsys->passthru_ctrl; } #else /* CONFIG_NVME_TARGET_PASSTHRU */ +static inline void nvmet_passthru_subsys_free(struct nvmet_subsys *subsys) +{ +} +static inline void nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys) +{ +} static inline u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req) { return 0; diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c index 15b23521e59f..a11c3ef34041 100644 --- a/drivers/nvme/target/passthru.c +++ b/drivers/nvme/target/passthru.c @@ -11,6 +11,11 @@ #include "../host/nvme.h" #include "nvmet.h" +/* + * xarray to maintain one passthru subsystem per nvme controller. + */ +static DEFINE_XARRAY(passthru_subsystems); + static void nvmet_passthru_execute_cmd_work(struct work_struct *w) { struct nvmet_req *req = container_of(w, struct nvmet_req, p.work); @@ -458,3 +463,85 @@ u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req) return NVME_SC_INVALID_OPCODE | NVME_SC_DNR; } } + +int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys) +{ + struct nvme_ctrl *ctrl; + int ret = -EINVAL; +
[PATCH v13 7/9] nvmet-passthru: Add passthru code to process commands
Add passthru command handling capability for the NVMeOF target and export passthru APIs which are used to integrate passthru code with nvmet-core. The new file passthru.c handles passthru cmd parsing and execution. In the passthru mode, we create a block layer request from the nvmet request and map the data on to the block layer request. Admin commands and features are on a white list as there are a number of each that don't make too much sense with passthrough. We use a white list so that new commands can be considered before being blindly passed through. In both cases, vendor specific commands are always allowed. We also blacklist reservation IO commands as the underlying device cannot differentiate between multiple hosts behind a fabric. Based-on-a-patch-by: Chaitanya Kulkarni Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/target/Makefile| 1 + drivers/nvme/target/admin-cmd.c | 7 +- drivers/nvme/target/core.c | 3 + drivers/nvme/target/nvmet.h | 39 +++ drivers/nvme/target/passthru.c | 460 include/linux/nvme.h| 4 + 6 files changed, 512 insertions(+), 2 deletions(-) create mode 100644 drivers/nvme/target/passthru.c diff --git a/drivers/nvme/target/Makefile b/drivers/nvme/target/Makefile index 2b33836f3d3e..ebf91fc4c72e 100644 --- a/drivers/nvme/target/Makefile +++ b/drivers/nvme/target/Makefile @@ -11,6 +11,7 @@ obj-$(CONFIG_NVME_TARGET_TCP) += nvmet-tcp.o nvmet-y+= core.o configfs.o admin-cmd.o fabrics-cmd.o \ discovery.o io-cmd-file.o io-cmd-bdev.o +nvmet-$(CONFIG_NVME_TARGET_PASSTHRU) += passthru.o nvme-loop-y+= loop.o nvmet-rdma-y += rdma.o nvmet-fc-y += fc.o diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c index 9d6f75cfa77c..a2dfd202eb32 100644 --- a/drivers/nvme/target/admin-cmd.c +++ b/drivers/nvme/target/admin-cmd.c @@ -734,7 +734,7 @@ u16 nvmet_set_feat_async_event(struct nvmet_req *req, u32 mask) return 0; } -static void nvmet_execute_set_features(struct nvmet_req *req) +void nvmet_execute_set_features(struct nvmet_req *req) { struct nvmet_subsys *subsys = req->sq->ctrl->subsys; u32 cdw10 = le32_to_cpu(req->cmd->common.cdw10); @@ -809,7 +809,7 @@ void nvmet_get_feat_async_event(struct nvmet_req *req) nvmet_set_result(req, READ_ONCE(req->sq->ctrl->aen_enabled)); } -static void nvmet_execute_get_features(struct nvmet_req *req) +void nvmet_execute_get_features(struct nvmet_req *req) { struct nvmet_subsys *subsys = req->sq->ctrl->subsys; u32 cdw10 = le32_to_cpu(req->cmd->common.cdw10); @@ -925,6 +925,9 @@ u16 nvmet_parse_admin_cmd(struct nvmet_req *req) if (unlikely(ret)) return ret; + if (nvmet_req_passthru_ctrl(req)) + return nvmet_parse_passthru_admin_cmd(req); + switch (cmd->common.opcode) { case nvme_admin_get_log_page: req->execute = nvmet_execute_get_log_page; diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index b685f99d56a1..2547e0d8951c 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -841,6 +841,9 @@ static u16 nvmet_parse_io_cmd(struct nvmet_req *req) if (unlikely(ret)) return ret; + if (nvmet_req_passthru_ctrl(req)) + return nvmet_parse_passthru_io_cmd(req); + req->ns = nvmet_find_namespace(req->sq->ctrl, cmd->rw.nsid); if (unlikely(!req->ns)) { req->error_loc = offsetof(struct nvme_common_command, nsid); diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index 421dff3ea143..deb996c90804 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -237,6 +237,10 @@ struct nvmet_subsys { struct config_group allowed_hosts_group; struct nvmet_subsys_model __rcu *model; + +#ifdef CONFIG_NVME_TARGET_PASSTHRU + struct nvme_ctrl*passthru_ctrl; +#endif /* CONFIG_NVME_TARGET_PASSTHRU */ }; static inline struct nvmet_subsys *to_subsys(struct config_item *item) @@ -313,6 +317,11 @@ struct nvmet_req { struct bio_vec *bvec; struct work_struct work; } f; + struct { + struct request *rq; + struct work_struct work; + u16 (*end_req)(struct nvmet_req *req); + } p; }; int sg_cnt; /* data length as parsed from the SGL descriptor: */ @@ -390,6 +399,8 @@ void nvmet_req_complete(struct nvmet_req *req, u16 status); int nvmet_req_alloc_sgl(struct nvmet_req *req); void nvmet_req_free_sgl(struct nvmet_req *req); +void nvmet_execute_set_features(struct nvmet_req *req); +void nvmet_execute_get_features(struct nvmet_req *req); void
[PATCH v13 3/9] nvme: Move nvme_passthru_[start|end]() calls to common helper
Introduce a new nvme_execute_passthru_rq() helper which calls nvme_passthru_[start|end]() around blk_execute_rq(). This ensures all passthru calls (including nvme_submit_io()) will be wrapped appropriately. nvme_execute_passthru_rq() will also be useful for the nvmet passthru code. Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/host/core.c | 193 --- 1 file changed, 100 insertions(+), 93 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 5062a83c3634..2ead7ad45a9d 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -890,6 +890,105 @@ static void *nvme_add_user_metadata(struct bio *bio, void __user *ubuf, return ERR_PTR(ret); } +static u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns, + u8 opcode) +{ + u32 effects = 0; + + if (ns) { + if (ctrl->effects) + effects = le32_to_cpu(ctrl->effects->iocs[opcode]); + if (effects & ~(NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC)) + dev_warn(ctrl->device, +"IO command:%02x has unhandled effects:%08x\n", +opcode, effects); + return 0; + } + + if (ctrl->effects) + effects = le32_to_cpu(ctrl->effects->acs[opcode]); + + switch (opcode) { + case nvme_admin_format_nvm: + effects |= NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC | + NVME_CMD_EFFECTS_CSE_MASK; + break; + case nvme_admin_sanitize_nvm: + effects |= NVME_CMD_EFFECTS_CSE_MASK; + break; + default: + break; + } + + return effects & ~NVME_CMD_EFFECTS_CSUPP; +} + +static u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, + u8 opcode) +{ + u32 effects = nvme_command_effects(ctrl, ns, opcode); + + /* +* For simplicity, IO to all namespaces is quiesced even if the command +* effects say only one namespace is affected. +*/ + if (effects & (NVME_CMD_EFFECTS_LBCC | NVME_CMD_EFFECTS_CSE_MASK)) { + mutex_lock(>scan_lock); + mutex_lock(>subsys->lock); + nvme_mpath_start_freeze(ctrl->subsys); + nvme_mpath_wait_freeze(ctrl->subsys); + nvme_start_freeze(ctrl); + nvme_wait_freeze(ctrl); + } + return effects; +} + +static void nvme_update_formats(struct nvme_ctrl *ctrl) +{ + struct nvme_ns *ns; + + down_read(>namespaces_rwsem); + list_for_each_entry(ns, >namespaces, list) + if (ns->disk && nvme_revalidate_disk(ns->disk)) + nvme_set_queue_dying(ns); + up_read(>namespaces_rwsem); +} + +static void nvme_passthru_end(struct nvme_ctrl *ctrl, u32 effects) +{ + /* +* Revalidate LBA changes prior to unfreezing. This is necessary to +* prevent memory corruption if a logical block size was changed by +* this command. +*/ + if (effects & NVME_CMD_EFFECTS_LBCC) + nvme_update_formats(ctrl); + if (effects & (NVME_CMD_EFFECTS_LBCC | NVME_CMD_EFFECTS_CSE_MASK)) { + nvme_unfreeze(ctrl); + nvme_mpath_unfreeze(ctrl->subsys); + mutex_unlock(>subsys->lock); + nvme_remove_invalid_namespaces(ctrl, NVME_NSID_ALL); + mutex_unlock(>scan_lock); + } + if (effects & NVME_CMD_EFFECTS_CCC) + nvme_init_identify(ctrl); + if (effects & (NVME_CMD_EFFECTS_NIC | NVME_CMD_EFFECTS_NCC)) + nvme_queue_scan(ctrl); +} + +static void nvme_execute_passthru_rq(struct request *rq) +{ + struct nvme_command *cmd = nvme_req(rq)->cmd; + struct nvme_ctrl *ctrl = nvme_req(rq)->ctrl; + struct nvme_ns *ns = rq->q->queuedata; + struct gendisk *disk = ns ? ns->disk : NULL; + u32 effects; + + effects = nvme_passthru_start(ctrl, ns, cmd->common.opcode); + blk_execute_rq(rq->q, disk, rq, 0); + nvme_passthru_end(ctrl, effects); +} + static int nvme_submit_user_cmd(struct request_queue *q, struct nvme_command *cmd, void __user *ubuffer, unsigned bufflen, void __user *meta_buffer, unsigned meta_len, @@ -928,7 +1027,7 @@ static int nvme_submit_user_cmd(struct request_queue *q, } } - blk_execute_rq(req->q, disk, req, 0); + nvme_execute_passthru_rq(req); if (nvme_req(req)->flags & NVME_REQ_CANCELLED) ret = -EINTR; else @@ -1317,99 +1416,12 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio) metadata, meta_len, lower_32_bits(io.slba), NULL, 0); } -static u32
[PATCH v13 5/9] nvme-core: Introduce nvme_ctrl_get_by_path()
nvme_ctrl_get_by_path() is analagous to blkdev_get_by_path() except it gets a struct nvme_ctrl from the path to its char dev (/dev/nvme0). It makes use of filp_open() to open the file and uses the private data to obtain a pointer to the struct nvme_ctrl. If the fops of the file do not match, -EINVAL is returned. The purpose of this function is to support NVMe-OF target passthru. Signed-off-by: Logan Gunthorpe Reviewed-by: Max Gurtovoy Reviewed-by: Sagi Grimberg --- drivers/nvme/host/core.c | 31 +++ drivers/nvme/host/nvme.h | 9 + 2 files changed, 40 insertions(+) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 2ead7ad45a9d..2604971362d8 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4305,6 +4305,37 @@ void nvme_sync_queues(struct nvme_ctrl *ctrl) } EXPORT_SYMBOL_GPL(nvme_sync_queues); +#ifdef CONFIG_NVME_TARGET_PASSTHRU +/* + * The exports that follow within this ifdef are only for + * use by the nvmet-passthru and should not be used for + * other things. + */ + +struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path) +{ + struct nvme_ctrl *ctrl; + struct file *f; + + f = filp_open(path, O_RDWR, 0); + if (IS_ERR(f)) + return ERR_CAST(f); + + if (f->f_op != _dev_fops) { + ctrl = ERR_PTR(-EINVAL); + goto out_close; + } + + ctrl = f->private_data; + nvme_get_ctrl(ctrl); + +out_close: + filp_close(f, NULL); + return ctrl; +} +EXPORT_SYMBOL_GPL(nvme_ctrl_get_by_path); +#endif /* CONFIG_NVME_TARGET_PASSTHRU */ + /* * Check we didn't inadvertently grow the command structure sizes: */ diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 2e04a36296d9..9195dd97b61b 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -689,4 +689,13 @@ void nvme_hwmon_init(struct nvme_ctrl *ctrl); static inline void nvme_hwmon_init(struct nvme_ctrl *ctrl) { } #endif +#ifdef CONFIG_NVME_TARGET_PASSTHRU +/* + * The exports that follow within this ifdef are only for + * use by the nvmet-passthru and should not be used for + * other things. + */ +struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path); +#endif /* CONFIG_NVME_TARGET_PASSTHRU */ + #endif /* _NVME_H */ -- 2.20.1
Re: [PATCH] arm64: dts: qcom: msm8916-samsung-a3u: add nodes for display panel
On Thu 14 May 10:01 PDT 2020, michael.s...@seznam.cz wrote: > From: Michael Srba > > This patch wires up display support on Samsung Galaxy A3 2015. > > Signed-off-by: Michael Srba > > --- > .../qcom/msm8916-samsung-a2015-common.dtsi| 44 +++ > .../boot/dts/qcom/msm8916-samsung-a3u-eur.dts | 54 +++ > 2 files changed, 98 insertions(+) > > diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi > b/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi > index af812f76e8be..2a64aa269f52 100644 > --- a/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi > +++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi > @@ -72,6 +72,24 @@ phy { > }; > }; > > + mdss@1a0 { > + dsi@1a98000 { > + #address-cells = <1>; > + #size-cells = <0>; > + > + vdda-supply = <_l2>; > + vddio-supply = <_l6>; > + > + pinctrl-names = "default", "sleep"; > + pinctrl-0 = <_default>; > + pinctrl-1 = <_sleep>; > + }; > + > + dsi-phy@1a98300 { > + vddio-supply = <_l6>; > + }; > + }; > + > wcnss@a21b000 { > status = "okay"; > }; > @@ -172,6 +190,32 @@ pinconf { > bias-disable; > }; > }; > + > + pmx-mdss { > + mdss_default: mdss-default { > + pinmux { > + function = "gpio"; > + pins = "gpio25"; > + }; > + pinconf { > + pins = "gpio25"; > + drive-strength = <8>; > + bias-disable; > + }; > + }; Fyi, when you have a state with a single set of properties you don't need the pinmux/pinconf level here, you can directly do: mdss_default: mdss-default { pins = "gpio25"; function = "gpio"; drive-strength = <8>; bias-disable; }; But this looks good, applied towards v5.8. Regards, Bjorn
Re: [PATCH] regulator: bd718x7: remove voltage change restriction from BD71847 LDOs
On Wed, 13 May 2020 17:39:21 +0300, Matti Vaittinen wrote: > The BD71837 had a HW "feature" where changing the regulator output > voltages of other regulators but bucks 1-4 might cause spikes if > regulators were enabled. Thus SW prohibit voltage changes for other > regulators except for bucks 1-4 when regulator is enabled. > > The HW colleagues did inadvertly fix this issue for BD71847 and > BD71850. > > [...] Applied to https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git for-5.8 Thanks! [1/1] regulator: bd718x7: remove voltage change restriction from BD71847 LDOs commit: 9bcbabafa19b9f27a283777eff32e7d66fcef09c All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark
Re: [EXT] Re: [PATCH net-next v1] net: phy: tja11xx: add cable-test support
On Thu, May 14, 2020 at 06:01:52PM +0200, Andrew Lunn wrote: > On Thu, May 14, 2020 at 03:47:16PM +, Christian Herber wrote: > > Hi Andrew, > > > > > On Wed, May 13, 2020 at 03:39:00PM +0200, Andrew Lunn wrote: > > >> On Thu, May 14, 2020 at 02:09:59PM +0200, Oleksij Rempel wrote: > > >> ETHTOOL_A_CABLE_RESULT_CODE_ACTIVE_PARTNER - the link partner is active. > > >> > > >> The TJA1102 is able to detect it if partner link is master. > > >> > > > master is not a cable diagnostics issue. This is a configuration > > > issue. > > > > > Master is very relevant for cable diagnostics, as a cable > > measurement should not be done with an active link partner on the > > other end (i.e. a PHY in master mode trying to train the link). > > > So if the measurement detects an active link partner disturbing the > > measurement, it is important to report this to the user. > > So with 'normal' PHYs, we use autoneg to make the link go quiet. But > you don't have autoneg. > > If there is no way to force the link quiet, then > ETHTOOL_A_CABLE_RESULT_CODE_ACTIVE_PARTNER makes sense. But we need to > keep the meaning generic. I don't want it to mean a T1 PHY with an > active master peer. It should be usable for any reason the link cannot > be made to go quiet. It looks for me, like AT803X_CDT_STATUS_STAT_FAIL has the same reason as ACTIVE PERTNER (Master) on TJA11xx. What kind of meaning, naming would be generic? -- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | signature.asc Description: PGP signature
Re: [PATCH RESEND] lockdown: Allow unprivileged users to see lockdown status
On Thu, 14 May 2020, Jeremy Cline wrote: > A number of userspace tools, such as systemtap, need a way to see the > current lockdown state so they can gracefully deal with the kernel being > locked down. The state is already exposed in > /sys/kernel/security/lockdown, but is only readable by root. Adjust the > permissions so unprivileged users can read the state. > > Fixes: 000d388ed3bb ("security: Add a static lockdown policy LSM") > Cc: Frank Ch. Eigler > Signed-off-by: Jeremy Cline Applied to git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general -- James Morris
Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
Mathieu Desnoyers writes: > - On May 5, 2020, at 9:49 AM, Thomas Gleixner t...@linutronix.de wrote: >> + >> +static __always_inline void debug_exit(unsigned long dr7) >> +{ >> +set_debugreg(dr7, 7); >> +} > > Out of curiosity, what prevents the compiler from moving instructions > outside of the code regions surrounded by entry/exit ? This is an always > inline, which invokes set_debugreg which is inline for CONFIG_PARAVIRT_XXL=n, > which in turn uses an asm() (not volatile), without any memory > clobber. > > Also, considering that "inline" is not sufficient to ensure the compiler > does not emit a traceable function, I suspect you'll also want to mark > "native_get_debugreg" and "native_set_debugreg" always inline as well. It can move it into a function and call that. Fine. If that function stays in the noinstr section then it should not emit a trace stub and if it moves it out of the section or reuses another instance in text then objtool will complain. Checking for trace stubs and other instrumentation nonsense is on the objtool wishlist anyway. But yes, marking these __always_inline prevents that. Those escaped my chase. But I would have found them once I go and fix that paravirt muck. Thanks, tglx
Re: [PATCH net-next 0/5] net: hns3: add some cleanups for -next
On Thu, 14 May 2020 20:41:21 +0800 Huazhong Tan wrote: > This patchset adds some cleanups for the HNS3 ethernet driver. You may want to spell out 'state' instead of 'stat' in patch 3, stat is often used as abbreviation of statistic. But that's a nit pick, up to you: Reviewed-by: Jakub Kicinski
Re: [PATCH] net: phy: mdio-moxart: remove unneeded include
On 5/14/2020 10:20 AM, Bartosz Golaszewski wrote: > czw., 14 maj 2020 o 19:13 Florian Fainelli napisał(a): >> >> >> >> On 5/14/2020 9:59 AM, Bartosz Golaszewski wrote: >>> From: Bartosz Golaszewski >>> >>> mdio-moxart doesn't use regulators in the driver code. We can remove >>> the regulator include. >>> >>> Signed-off-by: Bartosz Golaszewski >> >> Reviewed-by: Florian Fainelli >> -- >> Florian > > Hi Andrew, Florian, > > I noticed this by accident when I was looking at the PHY drivers to > see how they handle regulators supplying PHYs. In the case of the > MediaTek Pumpkin board I'm working on - the PHY is a Realtek RTL8201F > and it's supplied by a regulator that's enabled on boot by the > relevant PMIC driver. I'd like to model it in the device-tree but I'm > not sure what the correct approach is. Some ethernet drivers have a > phy-supply property but it looks wrong to me - IMO this should be > handled at the PHY driver level. Is it fine if I add a probe() > callback to the realtek driver and retrieve the "phy-supply" there? Don't you need to do this earlier than probe() though? If the PHY device is powered down, then surely get_phy_id() won't be able to read its registers and bind the device to the driver. This should be dealt the same way that resets are being dealt with, which is prior to the MDIO bus scan. -- Florian
Re: [PATCH] asm-generic: Update kernel documentation in io.h
On 5/14/20 10:08 AM, Miquel Raynal wrote: > The kernel documentation of: > * bus_to_virt() > * memcpy_fromio() > * memcpy_toio() > refers to older parameters. > > Update it to fit the actual names. > > Signed-off-by: Miquel Raynal > --- > include/asm-generic/io.h | 16 > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h > index d39ac997dda8..cb617baf8d47 100644 > --- a/include/asm-generic/io.h > +++ b/include/asm-generic/io.h > @@ -1051,8 +1051,8 @@ static inline void *bus_to_virt(unsigned long address) > /** > * memset_io Set a range of I/O memory to a constant value > * @addr:The beginning of the I/O-memory range to set > - * @val: The value to set the memory to > - * @count: The number of bytes to set > + * @value: The value to set the memory to > + * @size:The number of bytes to set > * > * Set a range of I/O memory to a given value. > */ > @@ -1067,9 +1067,9 @@ static inline void memset_io(volatile void __iomem > *addr, int value, > #define memcpy_fromio memcpy_fromio > /** > * memcpy_fromio Copy a block of data from I/O memory > - * @dst: The (RAM) destination for the copy > - * @src: The (I/O memory) source for the data > - * @count: The number of bytes to copy > + * @buffer: The (RAM) destination for the copy > + * @addr:The (I/O memory) source for the data > + * @size:The number of bytes to copy > * > * Copy a block of data from I/O memory. > */ > @@ -1085,9 +1085,9 @@ static inline void memcpy_fromio(void *buffer, > #define memcpy_toio memcpy_toio > /** > * memcpy_toio Copy a block of data into I/O memory > - * @dst: The (I/O memory) destination for the copy > - * @src: The (RAM) source for the data > - * @count: The number of bytes to copy > + * @addr:The (I/O memory) destination for the copy > + * @buffer: The (RAM) source for the data > + * @size:The number of bytes to copy > * > * Copy a block of data to I/O memory. > */ > https://lore.kernel.org/lkml/20200424020831.30494-1-wenhu.w...@vivo.com/ is a better (more complete) patch IMO. if Arnd would merge... thanks. -- ~Randy
Re: [PATCHv8 0/6] n_gsm serdev support and GNSS driver for droid4
* Pavel Machek [200513 19:10]: > Hi! > > > Here's the updated set of these patches fixed up for Johan's and > > Pavel's earlier comments. > > > > This series does the following: > > > > 1. Adds functions to n_gsm.c for serdev-ngsm.c driver to use > > > > 2. Adds a generic serdev-ngsm.c driver that brings up the TS 27.010 > >TTY ports configured in devicetree with help of n_gsm.c > > > > 3. Allows the use of standard Linux device drivers for dedicated > >TS 27.010 channels for devices like GNSS and ALSA found on some > >modems for example > > > 4. Adds a gnss-motmdm consumer driver for the GNSS device found on > >the Motorola Mapphone MDM6600 modem on devices like droid4 > > It does one thing ... it turns Droid 4 into useful phone! Right, a minor detail I forgot :) > Thanks a lot. I believe these are same patches as in > droid4-pending-v5.7 branch, so whole series is > > Tested-by: Pavel Machek > > Getting this into 5.8 would be nice :-). > > > Now without the chardev support, the /dev/gsmtty* using apps need > > to use "U1234AT+CFUN?" format for the packets. The advantage is > > less kernel code, and we keep the existing /dev/gsmtty* interface. > > > > If we still really need the custom chardev support, that can now > > be added as needed with the channel specific consumer driver(s), > > but looks like this won't be needed based on Pavel's ofono work. > > These work for me, and I have patched ofono with basic > functionality. It is no longer possible to use minicom for debugging, > but printf can be used instead, so that's not much of a problem. > > I have adjusted ofono code, and moved away from normal AT support > code. More API changes would not be welcome :-). There is no need for a new API or API changes as we now use the existing n_gsm tty interface for /dev/gsmtty* devices that have been around for years. Regards, Tony
Re: [PATCH v17 05/10] fs,landlock: Support filesystem access-control
On Thu, 14 May 2020, Mickaël Salaün wrote: > > This needs to be converted to the LSM API via superblock blob stacking. > > > > See Casey's old patch: > > https://lore.kernel.org/linux-security-module/20190829232935.7099-2-ca...@schaufler-ca.com/ > > s_landlock_inode_refs is quite similar to s_fsnotify_inode_refs, but I > can do it once the superblock security blob patch is upstream. Is it a > blocker for now? What is the current status of lbs_superblock? Yes it is a blocker. Landlock should not be adding its own functions in core code, it should be using the LSM API (and extending that as needed). > Anyway, we also need to have a call to landlock_release_inodes() in > generic_shutdown_super(), which does not fit the LSM framework, and I > think it is not an issue. Landlock handling of inodes is quite similar > to fsnotify. fsnotify is not an LSM. -- James Morris
Re: [PATCH 3/4] hwspinlock: qcom: Allow mmio usage in addition to syscon
On 14-05-20, 10:00, Bjorn Andersson wrote: > On Thu 14 May 07:15 PDT 2020, Vinod Koul wrote: > > > On 12-05-20, 17:54, Bjorn Andersson wrote: > > > In all modern Qualcomm platforms the mutex region of the TCSR is forked > > > off into its own block, all with a offset of 0 and stride of 4096. So > > > add support for directly memory mapping this register space, to avoid > > > the need to represent this block using a syscon. > > > > > > Signed-off-by: Bjorn Andersson > > > --- > > > drivers/hwspinlock/qcom_hwspinlock.c | 72 +--- > > > 1 file changed, 56 insertions(+), 16 deletions(-) > > > > > > diff --git a/drivers/hwspinlock/qcom_hwspinlock.c > > > b/drivers/hwspinlock/qcom_hwspinlock.c > > > index f0da544b14d2..d8d4d729816c 100644 > > > --- a/drivers/hwspinlock/qcom_hwspinlock.c > > > +++ b/drivers/hwspinlock/qcom_hwspinlock.c > > > @@ -70,41 +70,81 @@ static const struct of_device_id > > > qcom_hwspinlock_of_match[] = { > > > }; > > > MODULE_DEVICE_TABLE(of, qcom_hwspinlock_of_match); > > > > > > -static int qcom_hwspinlock_probe(struct platform_device *pdev) > > > +static struct regmap *qcom_hwspinlock_probe_syscon(struct > > > platform_device *pdev, > > > +u32 *base, u32 *stride) > > > { > > > - struct hwspinlock_device *bank; > > > struct device_node *syscon; > > > - struct reg_field field; > > > struct regmap *regmap; > > > - size_t array_size; > > > - u32 stride; > > > - u32 base; > > > int ret; > > > - int i; > > > > > > syscon = of_parse_phandle(pdev->dev.of_node, "syscon", 0); > > > - if (!syscon) { > > > - dev_err(>dev, "no syscon property\n"); > > > > any reason to drop the log? > > > > Given that we first check for the syscon and then fall back to trying to > use the reg, keeping this line would cause this log line to always show > up on targets putting this under /soc. > > So I think it's better to drop the line and then require the presence of > either syscon or reg using the DT schema. ok > > > - return -ENODEV; > > > - } > > > + if (!syscon) > > > + return ERR_PTR(-ENODEV); > > > > > > regmap = syscon_node_to_regmap(syscon); > > > of_node_put(syscon); > > > if (IS_ERR(regmap)) > > > - return PTR_ERR(regmap); > > > + return regmap; > > > > > > - ret = of_property_read_u32_index(pdev->dev.of_node, "syscon", 1, ); > > > + ret = of_property_read_u32_index(pdev->dev.of_node, "syscon", 1, base); > > > if (ret < 0) { > > > dev_err(>dev, "no offset in syscon\n"); > > > - return -EINVAL; > > > + return ERR_PTR(-EINVAL); > > > } > > > > > > - ret = of_property_read_u32_index(pdev->dev.of_node, "syscon", 2, > > > ); > > > + ret = of_property_read_u32_index(pdev->dev.of_node, "syscon", 2, > > > stride); > > > if (ret < 0) { > > > dev_err(>dev, "no stride syscon\n"); > > > - return -EINVAL; > > > + return ERR_PTR(-EINVAL); > > > } > > > > > > + return regmap; > > > +} > > > + > > > +static const struct regmap_config tcsr_mutex_config = { > > > + .reg_bits = 32, > > > + .reg_stride = 4, > > > + .val_bits = 32, > > > + .max_register = 0x4, > > > + .fast_io= true, > > > +}; > > > + > > > +static struct regmap *qcom_hwspinlock_probe_mmio(struct platform_device > > > *pdev, > > > + u32 *offset, u32 *stride) > > > +{ > > > + struct device *dev = >dev; > > > + struct resource *res; > > > + void __iomem *base; > > > + > > > + /* All modern platform has offset 0 and stride of 4k */ > > > + *offset = 0; > > > + *stride = 0x1000; > > > > Wouldn't it make sense to read this from DT rather than code in kernel? > > > > I did consider this as well as platform specific compatibles, but > realized that pretty much all 64-bit targets have these values. So given > that we still can represent this using the syscon approach I don't think > we need to add yet another mechanism to specify these. Sounds good. Reviewed-by: Vinod Koul -- ~Vinod
Re: [PATCH v4 7/7] drm/mediatek: mtk_dsi: Create connector for bridges
Hi again, On 14/5/20 19:12, Enric Balletbo i Serra wrote: > Hi Chun-Kuang, > > On 14/5/20 18:44, Chun-Kuang Hu wrote: >> Hi, Enric: >> >> Enric Balletbo i Serra 於 2020年5月14日 週四 >> 下午11:42寫道: >>> >>> Hi Chun-Kuang, >>> >>> On 14/5/20 16:28, Chun-Kuang Hu wrote: Hi, Enric: Enric Balletbo Serra 於 2020年5月14日 週四 上午12:41寫道: > > Hi Chun-Kuang, > > Missatge de Enric Balletbo i Serra del > dia dv., 1 de maig 2020 a les 17:25: >> >> Use the drm_bridge_connector helper to create a connector for pipelines >> that use drm_bridge. This allows splitting connector operations across >> multiple bridges when necessary, instead of having the last bridge in >> the chain creating the connector and handling all connector operations >> internally. >> >> Signed-off-by: Enric Balletbo i Serra >> Acked-by: Sam Ravnborg > > A gentle ping on this, I think that this one is the only one that > still needs a review in the series. This is what I reply in patch v3: >>> >>> Sorry for missing this. >>> I think the panel is wrapped into next_bridge here, >>> >>> Yes, you can have for example: >>> >>> 1. drm_bridge (mtk_dsi) -> drm_bridge (ps8640 - dsi-to-edp) -> >>> drm_panel_bridge >>> (edp panel) >>> >>> or a >>> >>> 2. drm_bridge (mtk_dsi)-> drm_panel_bridge (dsi panel) >>> >>> The _first_ one is my use case >>> if (panel) { >>> >>> This handles the second case, where you attach a dsi panel. >>> dsi->next_bridge = devm_drm_panel_bridge_add(dev, panel); so the next_bridge is a panel_bridge, in its attach function panel_bridge_attach(), according to the flag DRM_BRIDGE_ATTACH_NO_CONNECTOR, if not exist, it would create connector and attach connector to panel. I'm not sure this flag would exist or not, but for both case, it's strange. If exist, you create connector in this patch but no where to attach connector to panel. >>> >>> Yes, in fact, this is transitional patch needed, as once I converted >>> mtk_dpi, >>> mtk_dsi and mtk_hdmi to the new drm_bridge API the >>> drm_bridge_connector_init() >>> will be done in mtk_drm_drv. We will need to call drm_bridge_connector_init >>> for >>> dpi and dsi pipes and remove that call from mtk_dsi and mtk_dpi drivers. The >>> graphic controller driver should create connectors and CRTCs, as example >>> you can >>> take a look at drivers/gpu/drm/omapdrm/omap_drv.c >>> >> >> I have such question because I've reviewed omap's driver. In omap's >> driver, after it call drm_bridge_connector_init(), it does this: >> >> if (pipe->output->panel) { >> ret = drm_panel_attach(pipe->output->panel, >> pipe->connector); >> if (ret < 0) >> return ret; >> } >> >> In this patch, you does not do this. >> > > I see, so yes, I am probably missing call drm_panel_attach in case there is a > direct panel attached. Thanks for pointing it. > > I'll send a new version adding the drm_panel_attach call. > Wait, shouldn't panel be attached on the call of mtk_dsi_bridge_attach as next_bridge points to a bridge or a panel? static int mtk_dsi_bridge_attach(struct drm_bridge *bridge, enum drm_bridge_attach_flags flags) { struct mtk_dsi *dsi = bridge_to_dsi(bridge); /* Attach the panel or bridge to the dsi bridge */ return drm_bridge_attach(bridge->encoder, dsi->next_bridge, >bridge, flags); } Or I am continuing misunderstanding all this? If not exist, the next_brige would create one connector and this brige would create another connector. I think in your case, mtk_dsi does not directly connect to a panel, so >>> >>> Exactly >>> I need a exact explain. Or someone could test this on a directly-connect-panel platform. >>> >>> I don't think I am breaking this use case but AFAICS there is no users in >>> mainline that directly connect a panel using the mediatek driver. As I said >>> my >>> use case is the other so I can't really test. Do you know anyone that can >>> test this? >> >> I'm not sure who can test this, but [1], which is sent by YT Shen in a >> series, is a patch to support dsi command mode so dsi could directly >> connect to panel. >> >> [1] >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/mediatek?h=v5.7-rc5=21898816831fc60c92dd634ab4316a24da7eb4af >> >> It's better that someone could test this case, but if no one would >> test this, I could also accept a good-look patch. >> >> Regards, >> Chun-Kuang. >> >>> >>> Thanks, >>> Enric >>> Regards, Chun-Kuang. > > Thanks, > Enric > >> --- >> >> Changes in v4: None >> Changes in v3: >> - Move the bridge.type line to the patch that adds drm_bridge support. >> (Laurent Pinchart) >> >> Changes in v2: None >> >> drivers/gpu/drm/mediatek/mtk_dsi.c | 13
Re: [PATCH v2 0/5] Statsfs: a new ram-based file sytem for Linux kernel statistics
On Mon, May 11, 2020 at 10:34 AM Paolo Bonzini wrote: > > Hi Jonathan, I think the remaining sticky point is this one: Apologies it took a couple days for me to respond; I wanted to finish evaluating our current usage to make sure I had a full picture; I'll summarize our state at the bottom. > On 11/05/20 19:02, Jonathan Adams wrote: > > I think I'd characterize this slightly differently; we have a set of > > statistics which are essentially "in parallel": > > > > - a variety of statistics, N CPUs they're available for, or > > - a variety of statistics, N interfaces they're available for. > > - a variety of statistics, N kvm object they're available for. > > > > Recreating a parallel hierarchy of statistics any time we add/subtract > > a CPU or interface seems like a lot of overhead. Perhaps a better > > model would be some sort of "parameter enumn" (naming is hard; > > parameter set?), so when a CPU/network interface/etc is added you'd > > add its ID to the "CPUs" we know about, and at removal time you'd > > take it out; it would have an associated cbarg for the value getting > > callback. > > > >> Yep, the above "not create a dentry" flag would handle the case where > >> you sum things up in the kernel because the more fine grained counters > >> would be overwhelming. > > > > nodnod; or the callback could handle the sum itself. > > In general for statsfs we took a more explicit approach where each > addend in a sum is a separate stats_fs_source. In this version of the > patches it's also a directory, but we'll take your feedback and add both > the ability to hide directories (first) and to list values (second). > > So, in the cases of interfaces and KVM objects I would prefer to keep > each addend separate. This just feels like a lot of churn just to add a statistic or object; in your model, every time a KVM or VCPU is created, you create the N statistics, leading to N*M total objects. As I was imagining it, you'd have: A 'parameter enum' which maps names to object pointers and A set of statistics which map a statfs path to {callback, cbarg, zero or more parameter enums} So adding a new KVM VCPU would just be "add an object to the KVM's VCPU parameter enum", and removing it would be the opposite, and a couple callbacks could handle basically all of the stats. The only tricky part would be making sure the parameter enum value create/destroy and the callback calls are coordinated correctly. If you wanted stats for a particular VCPU, we could mark the overall directory as "include subdirs for VCPU parameter", and you'd automatically get one directory per VCPU, with the same set of stats in it, constrained to the single VCPU. I could also imagine having an ".agg_sum/{stata,statb,...}" to report using the aggregations you have, or a mode to say "stats in this directory are sums over the following VCPU parameter". > For CPUs that however would be pretty bad. Many subsystems might > accumulate stats percpu for performance reason, which would then be > exposed as the sum (usually). So yeah, native handling of percpu values > makes sense. I think it should fit naturally into the same custom > aggregation framework as hash table keys, we'll see if there's any devil > in the details. > > Core kernel stats such as /proc/interrupts or /proc/stat are the > exception here, since individual per-CPU values can be vital for > debugging. For those, creating a source per stat, possibly on-the-fly > at hotplug/hot-unplug time because NR_CPUS can be huge, would still be > my preferred way to do it. Our metricfs has basically two modes: report all per-CPU values (for the IPI counts etc; you pass a callback which takes a 'int cpu' argument) or a callback that sums over CPUs and reports the full value. It also seems hard to have any subsystem with a per-CPU stat having to install a hotplug callback to add/remove statistics. In my model, a "CPU" parameter enum which is automatically kept up-to-date is probably sufficient for the "report all per-CPU values". Does this make sense to you? I realize that this is a significant change to the model y'all are starting with; I'm willing to do the work to flesh it out. Thanks for your time, - Jonathan P.S. Here's a summary of the types of statistics we use in metricfs in google, to give a little context: - integer values (single value per stat, source also a single value); a couple of these are boolean values exported as '0' or '1'. - per-CPU integer values, reported as a table - per-CPU integer values, summed and reported as an aggregate - single-value values, keys related to objects: - many per-device (disk, network, etc) integer stats - some per-device string data (version strings, UUIDs, and occasional statuses.) - a few histograms (usually counts by duration ranges) - the "function name" to count for the WARN statistic I mentioned. - A single statistic with two keys (for livepatch statistics; the value is the livepatch status as a
Re: [patch V4 part 1 29/36] x86/mce: Send #MC singal from task work
Mathieu Desnoyers writes: > - On May 5, 2020, at 9:16 AM, Thomas Gleixner t...@linutronix.de wrote: > >> From: Peter Zijlstra >> > > Patch title: singal -> signal. > >> Convert #MC over to using task_work_add(); it will run the same code >> slightly later, on the return to user path of the same exception. > > So I suspect that switching the order between tracehook_notify_resume() > (which ends up calling task_work_run()) and do_signal() done by an > earlier patch in this series intends to ensure the information about the > instruction pointer causing the #MC is not overwritten by do_signal() > (but I'm just guessing). No, it does not. See the ordering discussion. Aside of that signal never transported any address information. It uses force_sig(SIGBUS). Even if a different signal would be sent first then the register frame of the #MC is still there when the fatal signal is sent later. But even w/o changing the ordering the taskwork check in do_signal() runs the pending work before delivering anything. Thanks, tglx
[PATCH net-next 0/2] DP83822 Fiber enablement
Hello The DP83822 Ethernet PHY has the ability to connect via a Fiber port. The DP83825 or DP83826 do not have this ability. In order to keep the same driver the DP83822 and the 825/826 phy_driver call backs need to be changed so that the DP83822 has it's own call back for config_init and adds a probe call back. A devicetree binding was added to set the signal polarity for the fiber connection. This property is only applicable in fiber mode and is optional in fiber mode. Dan Dan Murphy (2): dt-bindings: net: dp83822: Add TI dp83822 phy net: phy: DP83822: Add ability to advertise Fiber connection .../devicetree/bindings/net/ti,dp83822.yaml | 49 ++ drivers/net/phy/dp83822.c | 140 +- 2 files changed, 181 insertions(+), 8 deletions(-) create mode 100644 Documentation/devicetree/bindings/net/ti,dp83822.yaml -- 2.26.2
[PATCH net-next 2/2] net: phy: DP83822: Add ability to advertise Fiber connection
The DP83822 can be configured to use a Fiber connection. The strap register is read to determine if the device has been configured to use a fiber connection. With the fiber connection the PHY can be configured to detect whether the fiber connection is active by either a high signal or a low signal. Fiber mode is only applicable to the DP83822 so rework the PHY match table so that non-fiber PHYs can still use the same driver but not call or use any of the fiber features. Signed-off-by: Dan Murphy --- drivers/net/phy/dp83822.c | 140 +++--- 1 file changed, 132 insertions(+), 8 deletions(-) diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c index 1dd19d0cb269..fe7443bc8b06 100644 --- a/drivers/net/phy/dp83822.c +++ b/drivers/net/phy/dp83822.c @@ -27,6 +27,11 @@ #define MII_DP83822_MISR1 0x12 #define MII_DP83822_MISR2 0x13 #define MII_DP83822_RESET_CTRL 0x1f +#define MII_DP83822_GENCFG 0x465 +#define MII_DP83822_SOR1 0x467 + +/* GENCFG */ +#define DP83822_SIG_DET_POLARITY BIT(0) #define DP83822_HW_RESET BIT(15) #define DP83822_SW_RESET BIT(14) @@ -77,6 +82,21 @@ #define DP83822_WOL_INDICATION_SEL BIT(8) #define DP83822_WOL_CLR_INDICATION BIT(11) +/* SOR1 bits */ +#define DP83822_FX_EN_STRAPBIT(11) +#define DP83822_FX_DUPLEX_STRAPBIT(0) + +#define MII_DP83822_FIBER_ADVERTISE(SUPPORTED_AUI | SUPPORTED_FIBRE | \ +SUPPORTED_BNC | SUPPORTED_Pause | \ +SUPPORTED_Asym_Pause | \ +SUPPORTED_100baseT_Full) + +struct dp83822_private { + bool fx_signal_detect_low; + int fx_enabled; + u16 fx_duplex_mode; +}; + static int dp83822_ack_interrupt(struct phy_device *phydev) { int err; @@ -255,7 +275,7 @@ static int dp83822_config_intr(struct phy_device *phydev) return phy_write(phydev, MII_DP83822_PHYSCR, physcr_status); } -static int dp83822_config_init(struct phy_device *phydev) +static int dp8382x_disable_wol(struct phy_device *phydev) { int value = DP83822_WOL_EN | DP83822_WOL_MAGIC_EN | DP83822_WOL_SECURE_ON; @@ -264,6 +284,41 @@ static int dp83822_config_init(struct phy_device *phydev) MII_DP83822_WOL_CFG, value); } +static int dp83822_config_init(struct phy_device *phydev) +{ + struct dp83822_private *dp83822 = phydev->priv; + int err = 0; + + if (dp83822->fx_enabled) { + linkmode_set_bit(ETHTOOL_LINK_MODE_FIBRE_BIT, +phydev->supported); + linkmode_set_bit(ETHTOOL_LINK_MODE_FIBRE_BIT, +phydev->advertising); + + /* Auto negotiation is not available in fiber mode */ + phydev->autoneg = AUTONEG_DISABLE; + phydev->speed = SPEED_100; + phydev->duplex = DUPLEX_FULL; + + /* Setup fiber advertisement */ + err = phy_modify_changed(phydev, MII_ADVERTISE, +ADVERTISE_1000XFULL | +ADVERTISE_1000XPAUSE | +ADVERTISE_1000XPSE_ASYM, +MII_DP83822_FIBER_ADVERTISE); + + if (err < 0) + return err; + } + + return dp8382x_disable_wol(phydev); +} + +static int dp8382x_config_init(struct phy_device *phydev) +{ + return dp8382x_disable_wol(phydev); +} + static int dp83822_phy_reset(struct phy_device *phydev) { int err; @@ -272,7 +327,60 @@ static int dp83822_phy_reset(struct phy_device *phydev) if (err < 0) return err; - dp83822_config_init(phydev); + return phydev->drv->config_init(phydev); +} + +#ifdef CONFIG_OF_MDIO +static int dp83822_of_init(struct phy_device *phydev) +{ + struct dp83822_private *dp83822 = phydev->priv; + struct device *dev = >mdio.dev; + + if (dp83822->fx_enabled) + dp83822->fx_signal_detect_low = device_property_present(dev, + "ti,signal-polarity-low"); + + return 0; +} +#else +static int dp83822_of_init(struct phy_device *phydev) +{ + return 0; +} +#endif /* CONFIG_OF_MDIO */ + +static int dp83822_read_straps(struct phy_device *phydev) +{ + struct dp83822_private *dp83822 = phydev->priv; + u16 val; + + val = phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_SOR1); + if (val < 0) + return val; + + dp83822->fx_enabled = val & DP83822_FX_EN_STRAP; + dp83822->fx_duplex_mode = val & DP83822_FX_DUPLEX_STRAP; + + return 0; +} + +static int dp83822_probe(struct phy_device *phydev) +{ + struct dp83822_private *dp83822; + int ret; + + dp83822 =
[PATCH net-next 1/2] dt-bindings: net: dp83822: Add TI dp83822 phy
Add a dt binding for the TI dp83822 ethernet phy device. CC: Rob Herring Signed-off-by: Dan Murphy --- .../devicetree/bindings/net/ti,dp83822.yaml | 49 +++ 1 file changed, 49 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/ti,dp83822.yaml diff --git a/Documentation/devicetree/bindings/net/ti,dp83822.yaml b/Documentation/devicetree/bindings/net/ti,dp83822.yaml new file mode 100644 index ..60afd43ad3b6 --- /dev/null +++ b/Documentation/devicetree/bindings/net/ti,dp83822.yaml @@ -0,0 +1,49 @@ +# SPDX-License-Identifier: (GPL-2.0+ OR BSD-2-Clause) +# Copyright (C) 2020 Texas Instruments Incorporated +%YAML 1.2 +--- +$id: "http://devicetree.org/schemas/net/ti,dp83822.yaml#; +$schema: "http://devicetree.org/meta-schemas/core.yaml#; + +title: TI DP83822 ethernet PHY + +allOf: + - $ref: "ethernet-controller.yaml#" + +maintainers: + - Dan Murphy + +description: | + The DP83822 is a low-power, single-port, 10/100 Mbps Ethernet PHY. It + provides all of the physical layer functions needed to transmit and receive + data over standard, twisted-pair cables or to connect to an external, + fiber-optic transceiver. Additionally, the DP83822 provides flexibility to + connect to a MAC through a standard MII, RMII, or RGMII interface + + Specifications about the charger can be found at: +http://www.ti.com/lit/ds/symlink/dp83822i.pdf + +properties: + reg: +maxItems: 1 + + ti,signal-polarity-low: +type: boolean +description: | + DP83822 PHY in Fiber mode only. + Sets the DP83822 to detect a link drop condition when the signal goes + high. If not set then link drop will occur when the signal goes low. + +required: + - reg + +examples: + - | +mdio0 { + #address-cells = <1>; + #size-cells = <0>; + ethphy0: ethernet-phy@0 { +reg = <0>; +ti,signal-polarity-low; + }; +}; -- 2.26.2
Re: [patch V4 part 1 29/36] x86/mce: Send #MC singal from task work
- On May 14, 2020, at 1:38 PM, Thomas Gleixner t...@linutronix.de wrote: > Mathieu Desnoyers writes: >> - On May 5, 2020, at 9:16 AM, Thomas Gleixner t...@linutronix.de wrote: >> >>> From: Peter Zijlstra >>> >> >> Patch title: singal -> signal. >> >>> Convert #MC over to using task_work_add(); it will run the same code >>> slightly later, on the return to user path of the same exception. >> >> So I suspect that switching the order between tracehook_notify_resume() >> (which ends up calling task_work_run()) and do_signal() done by an >> earlier patch in this series intends to ensure the information about the >> instruction pointer causing the #MC is not overwritten by do_signal() >> (but I'm just guessing). > > No, it does not. See the ordering discussion. > > Aside of that signal never transported any address information. It uses > force_sig(SIGBUS). > > Even if a different signal would be sent first then the register frame > of the #MC is still there when the fatal signal is sent later. > > But even w/o changing the ordering the taskwork check in do_signal() > runs the pending work before delivering anything. Yep, that was the key thing I missed, Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
Re: [PATCH 11/20] amifb: get rid of pointless access_ok() calls
On Thu, May 14, 2020 at 04:25:35PM +0200, Bartlomiej Zolnierkiewicz wrote: > Thank you for in-detail explanations, for this patch: > > Acked-by: Bartlomiej Zolnierkiewicz > > Could you also please take care of adding missing checks for {get,put}_user() > failures later? Umm... OK; put_user() side is trivial - the interesting part is what to do about get_user() failures halfway through. Right now it treats them as "we'd read zeroes". On anything else I would say "screw it, memdup_user() the damn thing on the way in and copy from there", but... Amiga has how much RAM, again? OTOH, from my reading of that code it does appear to be limited to 4Kb of data to copy, so it's probably OK... Hell knows - I'm really confused by those #ifdef __mc68000__ in there; the driver *is* amiga-only: obj-$(CONFIG_FB_AMIGA)+= amifb.o c2p_planar.o config FB_AMIGA tristate "Amiga native chipset support" depends on FB && AMIGA and AMIGA is defined only in arch/m68k/Kconfig.machine. So how the hell can it *not* be true? OTOH, it looks like hand-optimized asm equivalents of C they have in #else, so that #else might be meant to document what's going on... I've no idea how to test any changes to that thing - the only m68k emulator I'm reasonably familiar with is aranym, and that's Atari, not Amiga. Never got around to setting up UAE... So I can do a patch more or less blindly (memdup_user() after it has checked the limits on height/width, then dereferencing from copy instead of get_user()), but I won't be able to test it.
Re: [PATCH v2 0/5] Statsfs: a new ram-based file sytem for Linux kernel statistics
On 14/05/20 19:35, Jonathan Adams wrote: >> In general for statsfs we took a more explicit approach where each >> addend in a sum is a separate stats_fs_source. In this version of the >> patches it's also a directory, but we'll take your feedback and add both >> the ability to hide directories (first) and to list values (second). >> >> So, in the cases of interfaces and KVM objects I would prefer to keep >> each addend separate. > > This just feels like a lot of churn just to add a statistic or object; > in your model, every time a KVM or VCPU is created, you create the N > statistics, leading to N*M total objects. While it's N*M files, only O(M) statsfs API calls are needed to create them. Whether you have O(N*M) total kmalloc-ed objects or O(M) is an implementation detail. Having O(N*M) API calls would be a non-started, I agree - especially once you start thinking of more efficient publishing mechanisms that unlike files are also O(M). >> For CPUs that however would be pretty bad. Many subsystems might >> accumulate stats percpu for performance reason, which would then be >> exposed as the sum (usually). So yeah, native handling of percpu values >> makes sense. I think it should fit naturally into the same custom >> aggregation framework as hash table keys, we'll see if there's any devil >> in the details. >> >> Core kernel stats such as /proc/interrupts or /proc/stat are the >> exception here, since individual per-CPU values can be vital for >> debugging. For those, creating a source per stat, possibly on-the-fly >> at hotplug/hot-unplug time because NR_CPUS can be huge, would still be >> my preferred way to do it. > > Our metricfs has basically two modes: report all per-CPU values (for > the IPI counts etc; you pass a callback which takes a 'int cpu' > argument) or a callback that sums over CPUs and reports the full > value. It also seems hard to have any subsystem with a per-CPU stat > having to install a hotplug callback to add/remove statistics. Yes, this is also why I think percpu values should have some kind of native handling. Reporting per-CPU values individually is the exception. > In my model, a "CPU" parameter enum which is automatically kept > up-to-date is probably sufficient for the "report all per-CPU values". Yes (or a separate CPU source in my model). Paolo > Does this make sense to you? I realize that this is a significant > change to the model y'all are starting with; I'm willing to do the > work to flesh it out. > Thanks for your time, > - Jonathan > > P.S. Here's a summary of the types of statistics we use in metricfs > in google, to give a little context: > > - integer values (single value per stat, source also a single value); > a couple of these are boolean values exported as '0' or '1'. > - per-CPU integer values, reported as a table > - per-CPU integer values, summed and reported as an aggregate > - single-value values, keys related to objects: > - many per-device (disk, network, etc) integer stats > - some per-device string data (version strings, UUIDs, and > occasional statuses.) > - a few histograms (usually counts by duration ranges) > - the "function name" to count for the WARN statistic I mentioned. > - A single statistic with two keys (for livepatch statistics; the > value is the livepatch status as a string) > > Most of the stats with keys are "complete" (every key has a value), > but there are several examples of statistics where only some of the > possible keys have values, or (e.g. for networking statistics) only > the keys visible to the reading process (e.g. in its namespaces) are > included. >
Re: [PATCH] Bluetooth: hci_qca: Enable WBS support for wcn3991
Hi, On Thu, May 14, 2020 at 9:30 AM Matthias Kaehlcke wrote: > > Hi Abhishek, > > On Wed, May 13, 2020 at 08:41:25PM -0700, Abhishek Pandit-Subedi wrote: > > WCN3991 supports transparent WBS (host encoded mSBC). Add a flag to the > > device match data to show WBS is supported. > > In general this looks good to me, a few nits inside. > > > This requires the matching firmware for WCN3991 in linux-firmware: > > 1a8b0dc00f77 (qca: Enable transparent WBS for WCN3991) > > > > Signed-off-by: Abhishek Pandit-Subedi > > --- > > > > drivers/bluetooth/hci_qca.c | 23 +-- > > 1 file changed, 17 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c > > index b3fd07a6f8127..305976c4dcf42 100644 > > --- a/drivers/bluetooth/hci_qca.c > > +++ b/drivers/bluetooth/hci_qca.c > > @@ -75,6 +75,9 @@ enum qca_flags { > > QCA_HW_ERROR_EVENT > > }; > > > > +enum qca_driver_flags { > > + QCA_DRV_WIDEBAND_SPEECH_SUPPORTED = 0x1, > > s/0x1/BIT(0)/ Will change in next version. > > > +}; > > The 'driver'/'DRV' midfix is a bit misleading. WBS support is a device > capability, it's not something the driver supports or doesn't. Maybe > name it 'qca_capabilities' or similar. > > > /* HCI_IBS transmit side sleep protocol states */ > > enum tx_ibs_states { > > @@ -187,10 +190,11 @@ struct qca_vreg { > > unsigned int load_uA; > > }; > > > > -struct qca_vreg_data { > > +struct qca_device_data { > > enum qca_btsoc_type soc_type; > > struct qca_vreg *vregs; > > size_t num_vregs; > > + uint32_t flags; > > capabilities? Capabilities sounds good to me. Thanks for the feedback. Abhishek
Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
- On May 14, 2020, at 1:28 PM, Thomas Gleixner t...@linutronix.de wrote: > Mathieu Desnoyers writes: >> - On May 5, 2020, at 9:49 AM, Thomas Gleixner t...@linutronix.de wrote: >>> + >>> +static __always_inline void debug_exit(unsigned long dr7) >>> +{ >>> + set_debugreg(dr7, 7); >>> +} >> * Question 1 >> Out of curiosity, what prevents the compiler from moving instructions >> outside of the code regions surrounded by entry/exit ? This is an always >> inline, which invokes set_debugreg which is inline for CONFIG_PARAVIRT_XXL=n, >> which in turn uses an asm() (not volatile), without any memory >> clobber. >> ? * Question 2 >> Also, considering that "inline" is not sufficient to ensure the compiler >> does not emit a traceable function, I suspect you'll also want to mark >> "native_get_debugreg" and "native_set_debugreg" always inline as well. > > It can move it into a function and call that. Fine. If that function > stays in the noinstr section then it should not emit a trace stub and if > it moves it out of the section or reuses another instance in text then > objtool will complain. > > Checking for trace stubs and other instrumentation nonsense is on the > objtool wishlist anyway. > > But yes, marking these __always_inline prevents that. Those escaped my > chase. But I would have found them once I go and fix that paravirt muck. This answers only my second question (see "Question 1" above). Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
Re: [PATCH v2 bpf-next 0/7] bpf, printk: add BTF-based type printing
On Wed, 13 May 2020, Alexei Starovoitov wrote: > On Tue, May 12, 2020 at 06:56:38AM +0100, Alan Maguire wrote: > > The printk family of functions support printing specific pointer types > > using %p format specifiers (MAC addresses, IP addresses, etc). For > > full details see Documentation/core-api/printk-formats.rst. > > > > This patchset proposes introducing a "print typed pointer" format > > specifier "%pT"; the argument associated with the specifier is of > > form "struct btf_ptr *" which consists of a .ptr value and a .type > > value specifying a stringified type (e.g. "struct sk_buff") or > > an .id value specifying a BPF Type Format (BTF) id identifying > > the appropriate type it points to. > > > > There is already support in kernel/bpf/btf.c for "show" functionality; > > the changes here generalize that support from seq-file specific > > verifier display to the more generic case and add another specific > > use case; snprintf()-style rendering of type information to a > > provided buffer. This support is then used to support printk > > rendering of types, but the intent is to provide a function > > that might be useful in other in-kernel scenarios; for example: > > > > - ftrace could possibly utilize the function to support typed > > display of function arguments by cross-referencing BTF function > > information to derive the types of arguments > > - oops/panic messaging could extend the information displayed to > > dig into data structures associated with failing functions > > > > The above potential use cases hint at a potential reply to > > a reasonable objection that such typed display should be > > solved by tracing programs, where the in kernel tracing records > > data and the userspace program prints it out. While this > > is certainly the recommended approach for most cases, I > > believe having an in-kernel mechanism would be valuable > > also. > > > > The function the printk() family of functions rely on > > could potentially be used directly for other use cases > > like ftrace where we might have the BTF ids of the > > pointers we wish to display; its signature is as follows: > > > > int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, > >char *buf, int len, u64 flags); > > > > So if ftrace say had the BTF ids of the types of arguments, > > we see that the above would allow us to convert the > > pointer data into displayable form. > > > > To give a flavour for what the printed-out data looks like, > > here we use pr_info() to display a struct sk_buff *. > > > > struct sk_buff *skb = alloc_skb(64, GFP_KERNEL); > > > > pr_info("%pT", BTF_PTR_TYPE(skb, "struct sk_buff")); > > > > ...gives us: > > > > (struct sk_buff){ > > .transport_header = (__u16)65535, > > .mac_header = (__u16)65535, > > .end = (sk_buff_data_t)192, > > .head = (unsigned char *)7524fd8b, > > .data = (unsigned char *)7524fd8b, > > could you add "0x" prefix here to make it even more C like > and unambiguous ? > Sure. > > .truesize = (unsigned int)768, > > .users = (refcount_t){ > > .refs = (atomic_t){ > >.counter = (int)1, > > }, > > }, > > } > > > > For bpf_trace_printk() a "struct __btf_ptr *" is used as > > argument; see tools/testing/selftests/bpf/progs/netif_receive_skb.c > > for example usage. > > > > The hope is that this functionality will be useful for debugging, > > and possibly help facilitate the cases mentioned above in the future. > > > > Changes since v1: > > > > - changed format to be more drgn-like, rendering indented type info > > along with type names by default (Alexei) > > - zeroed values are omitted (Arnaldo) by default unless the '0' > > modifier is specified (Alexei) > > - added an option to print pointer values without obfuscation. > > The reason to do this is the sysctls controlling pointer display > > are likely to be irrelevant in many if not most tracing contexts. > > Some questions on this in the outstanding questions section below... > > - reworked printk format specifer so that we no longer rely on format > > %pT but instead use a struct * which contains type information > > (Rasmus). This simplifies the printk parsing, makes use more dynamic > > and also allows specification by BTF id as well as name. > > - ensured that BTF-specific printk code is bracketed by > > #if ENABLED(CONFIG_BTF_PRINTF) > > I don't like this particular bit: > +config BTF_PRINTF > + bool "print type information using BPF type format" > + depends on DEBUG_INFO_BTF > + default n > > Looks like it's only purpose is to gate printk test. > In such case please convert it into hidden config that is > automatically selected when DEBUG_INFO_BTF is set. > Or just make printk tests to #if IS_ENABLED(DEBUG_INFO_BTF) > I think the latter makes sense; if BTF isn't there retrieving vmlinux BTF fails and we simply fall back to standard pointer printing. The only part of the
Re: [PATCH v4 06/10] dt-bindings: mtd: update STM32 FMC2 NAND controller documentation
On Thu, May 14, 2020 at 11:35 AM Christophe Kerello wrote: > > Hi Rob, > > On 5/14/20 5:00 PM, Rob Herring wrote: > > On Wed, May 06, 2020 at 11:11:15AM +0200, Christophe Kerello wrote: > >> These bindings can be used on SOCs where the FMC2 NAND controller is > >> in standalone. In case that the FMC2 embeds 2 controllers (an external > >> bus controller and a raw NAND controller), the register base and the > >> clock will be defined in the parent node. It is the reason why the > >> register base address and the clock are now optional. > >> > >> Signed-off-by: Christophe Kerello > >> --- > >> .../devicetree/bindings/mtd/st,stm32-fmc2-nand.yaml | 19 > >> ++- > >> 1 file changed, 10 insertions(+), 9 deletions(-) > >> > >> diff --git a/Documentation/devicetree/bindings/mtd/st,stm32-fmc2-nand.yaml > >> b/Documentation/devicetree/bindings/mtd/st,stm32-fmc2-nand.yaml > >> index b059267..68fac1a 100644 > >> --- a/Documentation/devicetree/bindings/mtd/st,stm32-fmc2-nand.yaml > >> +++ b/Documentation/devicetree/bindings/mtd/st,stm32-fmc2-nand.yaml > >> @@ -18,13 +18,15 @@ properties: > >> > >> reg: > >> items: > >> - - description: Registers > >> + - description: Registers (optional) > > > > The only thing that can be optional are the last entries. You have to do > > a 'oneOf' with 6 entries and 7 entries. > > Ok, so the way to describe the reg property in my case should be: > reg: > oneOf: > - description: FMC2 embeds the NFC controller in standalone. > items: > - description: Registers > - description: Chip select 0 data > - description: Chip select 0 command > - description: Chip select 0 address space > - description: Chip select 1 data > - description: Chip select 1 command > - description: Chip select 1 address space > > - description: FMC2 embeds the NFC controller and the EBI > controller. > items: > - description: Chip select 0 data > - description: Chip select 0 command > - description: Chip select 0 address space > - description: Chip select 1 data > - description: Chip select 1 command > - description: Chip select 1 address space > > > > > And where's your new compatible string for this different h/w? > > From NFC controller point of view, it is the same HW. That's what everyone says until they have some quirk or integration difference to handle. > In the case that we have 2 controllers embedded, the register base is > shared. > The NFC driver will check at probe time the compatible string of its > parent node. > In case that it is "st,stm32mp1-fmc2-ebi", then the driver will find the > register base in the parent node (EBI node), otherwise it will find it > in the NFC node. > Is it better to have 2 compatible strings (one for each reg description) > than checking the parent's compatible string and have only one > compatible string? Why not just put the register base into the child node too? While overlapping 'reg' regions for siblings is bad, it's fine for child nodes. I guess since there are chip selects for the child nodes that may not work here. It doesn't hurt to have another compatible. You can always make the old one a fallback. With different compatibles you can make sure reg has the right number of entries. Rob
Re: [PATCH 3/3] tools/perf: build fixes for arch_errno_names.sh
On Thu, May 14, 2020 at 8:04 AM Arnaldo Carvalho de Melo wrote: > > Em Thu, Mar 05, 2020 at 11:11:10PM -0800, Ian Rogers escreveu: > > Allow the CC compiler to accept a CFLAGS environment variable. > > Make the architecture test directory agree with the code comment. > > This doesn't change the code generated but makes it easier to integrate > > running the shell script in build systems like bazel. > > I've looked at this and split this part in a separate patch, and applied > it locally, please take a look, now looking at the other bit of the > patch. This bit looks good. The CFLAGS change is something I need to deal with a directory layout change in our build system. Thanks, Ian > - Arnaldo > > > Signed-off-by: Ian Rogers > > --- > > tools/perf/trace/beauty/arch_errno_names.sh | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/tools/perf/trace/beauty/arch_errno_names.sh > > b/tools/perf/trace/beauty/arch_errno_names.sh > > index 22c9fc900c84..9f9ea45cddc4 100755 > > --- a/tools/perf/trace/beauty/arch_errno_names.sh > > +++ b/tools/perf/trace/beauty/arch_errno_names.sh > > @@ -91,7 +91,7 @@ EoHEADER > > # in tools/perf/arch > > archlist="" > > for arch in $(find $toolsdir/arch -maxdepth 1 -mindepth 1 -type d -printf > > "%f\n" | grep -v x86 | sort); do > > - test -d arch/$arch && archlist="$archlist $arch" > > + test -d $toolsdir/perf/arch/$arch && archlist="$archlist $arch" > > done > > > > for arch in x86 $archlist generic; do > > -- > commit 1b59e3b7bfc6183d3dc9f119e1875f9607d29d96 > Author: Ian Rogers > Date: Thu Mar 5 23:11:10 2020 -0800 > > perf trace: Fix the selection for architectures to generate the errno > name tables > > Make the architecture test directory agree with the code comment. > > Committer notes: > > This was split from a larger patch. > > The code was assuming the developer always worked from tools/perf/, so > make sure we > do the test -d having $toolsdir/perf/arch/$arch, to match the intent > expressed in the comment, > just above that loop. > > Signed-off-by: Ian Rogers > Cc: Adrian Hunter > Cc: Alexander Shishkin > Cc: Alexios Zavras > Cc: Andi Kleen > Cc: Greg Kroah-Hartman > Cc: Igor Lubashev > Cc: Jiri Olsa > Cc: Kan Liang > Cc: Mark Rutland > Cc: Mathieu Poirier > Cc: Namhyung Kim > Cc: Nick Desaulniers > Cc: Peter Zijlstra > Cc: Stephane Eranian > Cc: Thomas Gleixner > Cc: Wei Li > Link: > http://lore.kernel.org/lkml/20200306071110.130202-4-irog...@google.com > Signed-off-by: Arnaldo Carvalho de Melo > > diff --git a/tools/perf/trace/beauty/arch_errno_names.sh > b/tools/perf/trace/beauty/arch_errno_names.sh > index 22c9fc900c84..f8c44a85650b 100755 > --- a/tools/perf/trace/beauty/arch_errno_names.sh > +++ b/tools/perf/trace/beauty/arch_errno_names.sh > @@ -91,7 +91,7 @@ EoHEADER > # in tools/perf/arch > archlist="" > for arch in $(find $toolsdir/arch -maxdepth 1 -mindepth 1 -type d -printf > "%f\n" | grep -v x86 | sort); do > - test -d arch/$arch && archlist="$archlist $arch" > + test -d $toolsdir/perf/arch/$arch && archlist="$archlist $arch" > done > > for arch in x86 $archlist generic; do
Re: [PATCH] coresight: etm4x: Add support to disable trace unit power up
Good morning Sai, On Thu, May 14, 2020 at 04:29:15PM +0530, Sai Prakash Ranjan wrote: > From: Tingwei Zhang > > On some Qualcomm Technologies Inc. SoCs like SC7180, there > exists a hardware errata where the APSS (Application Processor > SubSystem)/CPU watchdog counter is stopped when ETM register > TRCPDCR.PU=1. Fun stuff... > Since the ETMs share the same power domain as > that of respective CPU cores, they are powered on when the > CPU core is powered on. So we can disable powering up of the > trace unit after checking for this errata via new property > called "qcom,tupwr-disable". > > Signed-off-by: Tingwei Zhang > Co-developed-by: Sai Prakash Ranjan > Signed-off-by: Sai Prakash Ranjan Co-developed-by: Sai Prakash Ranjan Signed-off-by: Tingwei Zhang > --- > .../devicetree/bindings/arm/coresight.txt | 6 > drivers/hwtracing/coresight/coresight-etm4x.c | 29 --- Please split in two patches. > 2 files changed, 25 insertions(+), 10 deletions(-) > > diff --git a/Documentation/devicetree/bindings/arm/coresight.txt > b/Documentation/devicetree/bindings/arm/coresight.txt > index 846f6daae71b..d2030128fe46 100644 > --- a/Documentation/devicetree/bindings/arm/coresight.txt > +++ b/Documentation/devicetree/bindings/arm/coresight.txt > @@ -108,6 +108,12 @@ its hardware characteristcs. > * arm,cp14: must be present if the system accesses ETM/PTM management > registers via co-processor 14. > > + * qcom,tupwr-disable: boolean. Indicates that trace unit power up can > + be disabled on Qualcomm Technologies Inc. systems where ETMs are in > + the same power domain as their CPU cores. This property is required > + to identify such systems with hardware errata where the CPU watchdog > + counter is stopped when TRCPDCR.PU=1. > + I think something like "qcom,skip-power-up" would be clearer. Also, a better choice of words is that TRCPDCR.PU does not have to be set on Qualcomm... > * Optional property for TMC: > > * arm,buffer-size: size of contiguous buffer space for TMC ETR > diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c > b/drivers/hwtracing/coresight/coresight-etm4x.c > index fb0f5f4f3a91..6886b44f6947 100644 > --- a/drivers/hwtracing/coresight/coresight-etm4x.c > +++ b/drivers/hwtracing/coresight/coresight-etm4x.c > @@ -104,6 +104,11 @@ struct etm4_enable_arg { > int rc; > }; > > +static inline bool etm4_can_disable_tupwr(struct device *dev) > +{ > + return fwnode_property_present(dev_fwnode(dev), "qcom,tupwr-disable"); > +} > + Please call fwnode_property_present() at initialisation time to set a new drvdata::skip_power_up variable. From there just switch on that in etm4_enable/disable_hw(). Thanks, Mathieu > static int etm4_enable_hw(struct etmv4_drvdata *drvdata) > { > int i, rc; > @@ -196,12 +201,14 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata) > writel_relaxed(config->vmid_mask0, drvdata->base + TRCVMIDCCTLR0); > writel_relaxed(config->vmid_mask1, drvdata->base + TRCVMIDCCTLR1); > > - /* > - * Request to keep the trace unit powered and also > - * emulation of powerdown > - */ > - writel_relaxed(readl_relaxed(drvdata->base + TRCPDCR) | TRCPDCR_PU, > -drvdata->base + TRCPDCR); > + if (!etm4_can_disable_tupwr(etm_dev)) { > + /* > + * Request to keep the trace unit powered and also > + * emulation of powerdown > + */ > + writel_relaxed(readl_relaxed(drvdata->base + TRCPDCR) | > TRCPDCR_PU, > +drvdata->base + TRCPDCR); > + } > > /* Enable the trace unit */ > writel_relaxed(1, drvdata->base + TRCPRGCTLR); > @@ -476,10 +483,12 @@ static void etm4_disable_hw(void *info) > > CS_UNLOCK(drvdata->base); > > - /* power can be removed from the trace unit now */ > - control = readl_relaxed(drvdata->base + TRCPDCR); > - control &= ~TRCPDCR_PU; > - writel_relaxed(control, drvdata->base + TRCPDCR); > + if (!etm4_can_disable_tupwr(etm_dev)) { > + /* power can be removed from the trace unit now */ > + control = readl_relaxed(drvdata->base + TRCPDCR); > + control &= ~TRCPDCR_PU; > + writel_relaxed(control, drvdata->base + TRCPDCR); > + } > > control = readl_relaxed(drvdata->base + TRCPRGCTLR); > > -- > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member > of Code Aurora Forum, hosted by The Linux Foundation
[PATCH net v2 2/2] ipmr: Add lockdep expression to ipmr_for_each_table macro
During the initialization process, ipmr_new_table() is called to create new tables which in turn calls ipmr_get_table() which traverses net->ipv4.mr_tables without holding the writer lock. However, this is safe to do so as no tables exist at this time. Hence add a suitable lockdep expression to silence the following false-positive warning: = WARNING: suspicious RCU usage 5.7.0-rc3-next-20200428-syzkaller #0 Not tainted - net/ipv4/ipmr.c:136 RCU-list traversed in non-reader section!! ipmr_get_table+0x130/0x160 net/ipv4/ipmr.c:136 ipmr_new_table net/ipv4/ipmr.c:403 [inline] ipmr_rules_init net/ipv4/ipmr.c:248 [inline] ipmr_net_init+0x133/0x430 net/ipv4/ipmr.c:3089 Fixes: f0ad0860d01e ("ipv4: ipmr: support multiple tables") Reported-by: syzbot+1519f497f2f9f0818...@syzkaller.appspotmail.com Suggested-by: Jakub Kicinski Signed-off-by: Amol Grover --- v2: - Change the lockdep expression to check for list emptiness at init - Add Fixes tag - Add Reported-by tag for syzbot report net/ipv4/ipmr.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 4897f7420c8f..5c218db2dede 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -109,9 +109,10 @@ static void mroute_clean_tables(struct mr_table *mrt, int flags); static void ipmr_expire_process(struct timer_list *t); #ifdef CONFIG_IP_MROUTE_MULTIPLE_TABLES -#define ipmr_for_each_table(mrt, net) \ - list_for_each_entry_rcu(mrt, >ipv4.mr_tables, list, \ - lockdep_rtnl_is_held()) +#define ipmr_for_each_table(mrt, net) \ + list_for_each_entry_rcu(mrt, >ipv4.mr_tables, list,\ + lockdep_rtnl_is_held() || \ + list_empty(>ipv4.mr_tables)) static struct mr_table *ipmr_mr_table_iter(struct net *net, struct mr_table *mrt) -- 2.24.1
[PATCH net v2 1/2] ipmr: Fix RCU list debugging warning
ipmr_for_each_table() macro uses list_for_each_entry_rcu() for traversing outside of an RCU read side critical section but under the protection of rtnl_mutex. Hence, add the corresponding lockdep expression to silence the following false-positive warning at boot: [4.319347] = [4.319349] WARNING: suspicious RCU usage [4.319351] 5.5.4-stable #17 Tainted: GE [4.319352] - [4.319354] net/ipv4/ipmr.c:1757 RCU-list traversed in non-reader section!! Fixes: f0ad0860d01e ("ipv4: ipmr: support multiple tables") Signed-off-by: Amol Grover --- v2: - Add appropriate Fixes tag net/ipv4/ipmr.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 9cf83cc85e4a..4897f7420c8f 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -110,7 +110,8 @@ static void ipmr_expire_process(struct timer_list *t); #ifdef CONFIG_IP_MROUTE_MULTIPLE_TABLES #define ipmr_for_each_table(mrt, net) \ - list_for_each_entry_rcu(mrt, >ipv4.mr_tables, list) + list_for_each_entry_rcu(mrt, >ipv4.mr_tables, list, \ + lockdep_rtnl_is_held()) static struct mr_table *ipmr_mr_table_iter(struct net *net, struct mr_table *mrt) -- 2.24.1
Re: [PATCH v2 01/20] dt-bindings: power: Convert mti,mips-cpc to DT schema
On Thu, May 14, 2020 at 10:09:03AM -0500, Rob Herring wrote: > On Wed, 6 May 2020 20:42:19 +0300, wrote: > > From: Serge Semin > > > > It's a Cluster Power Controller embedded into the MIPS IP cores. > > Currently the corresponding dts node is supposed to have compatible > > and reg properties. > > > > Signed-off-by: Serge Semin > > Cc: Alexey Malahov > > Cc: Thomas Bogendoerfer > > Cc: Paul Burton > > Cc: Ralf Baechle > > Cc: Arnd Bergmann > > Cc: linux...@vger.kernel.org > > > > --- > > > > Changelog v2: > > - Reword the changelog summary - use shorter version. > > - Lowercase the example hex'es. > > --- > > .../bindings/power/mti,mips-cpc.txt | 8 - > > .../bindings/power/mti,mips-cpc.yaml | 35 +++ > > 2 files changed, 35 insertions(+), 8 deletions(-) > > delete mode 100644 Documentation/devicetree/bindings/power/mti,mips-cpc.txt > > create mode 100644 > > Documentation/devicetree/bindings/power/mti,mips-cpc.yaml > > > > Reviewed-by: Rob Herring Great! Thanks. -Sergey
[PATCH RFC 3/5] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf()
No functional change intended. We will need to analyze slot information to map PTEs for KVM_MEM_ALLONES slots aggressively. Signed-off-by: Vitaly Kuznetsov --- arch/x86/kvm/mmu/mmu.c | 14 -- arch/x86/kvm/mmu/paging_tmpl.h | 7 +-- 2 files changed, 13 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index e618472c572b..3db499df2dfc 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4078,11 +4078,10 @@ static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, kvm_vcpu_gfn_to_hva(vcpu, gfn), ); } -static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn, -gpa_t cr2_or_gpa, kvm_pfn_t *pfn, bool write, -bool *writable) +static bool try_async_pf(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, +bool prefault, gfn_t gfn, gpa_t cr2_or_gpa, +kvm_pfn_t *pfn, bool write, bool *writable) { - struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); bool async; /* Don't expose private memslots to L2. */ @@ -4118,7 +4117,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, bool exec = error_code & PFERR_FETCH_MASK; bool lpage_disallowed = exec && is_nx_huge_page_enabled(); bool map_writable; - + struct kvm_memory_slot *slot; gfn_t gfn = gpa >> PAGE_SHIFT; unsigned long mmu_seq; kvm_pfn_t pfn; @@ -4140,7 +4139,10 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, mmu_seq = vcpu->kvm->mmu_notifier_seq; smp_rmb(); - if (try_async_pf(vcpu, prefault, gfn, gpa, , write, _writable)) + slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); + + if (try_async_pf(vcpu, slot, prefault, gfn, gpa, , write, +_writable)) return RET_PF_RETRY; if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, pfn, ACC_ALL, )) diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index efec7d27b8c5..98e368788e8b 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -781,6 +781,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code, int write_fault = error_code & PFERR_WRITE_MASK; int user_fault = error_code & PFERR_USER_MASK; struct guest_walker walker; + struct kvm_memory_slot *slot; int r; kvm_pfn_t pfn; unsigned long mmu_seq; @@ -835,8 +836,10 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code, mmu_seq = vcpu->kvm->mmu_notifier_seq; smp_rmb(); - if (try_async_pf(vcpu, prefault, walker.gfn, addr, , write_fault, -_writable)) + slot = kvm_vcpu_gfn_to_memslot(vcpu, walker.gfn); + + if (try_async_pf(vcpu, slot, prefault, walker.gfn, addr, , +write_fault, _writable)) return RET_PF_RETRY; if (handle_abnormal_pfn(vcpu, addr, walker.gfn, pfn, walker.pte_access, )) -- 2.25.4
[PATCH RFC 5/5] KVM: selftests: add KVM_MEM_ALLONES test
Test the newly introduced KVM_MEM_ALLONES memslots: - Reads from all pages return '0xff' - Writes to all pages cause KVM_EXIT_MMIO Signed-off-by: Vitaly Kuznetsov --- tools/testing/selftests/kvm/Makefile | 1 + .../testing/selftests/kvm/include/kvm_util.h | 1 + tools/testing/selftests/kvm/lib/kvm_util.c| 81 +++-- .../kvm/x86_64/memory_region_allones.c| 112 ++ 4 files changed, 162 insertions(+), 33 deletions(-) create mode 100644 tools/testing/selftests/kvm/x86_64/memory_region_allones.c diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 7af62030c12f..0c9aff445755 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -15,6 +15,7 @@ LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c TEST_GEN_PROGS_x86_64 = x86_64/cr4_cpuid_sync_test TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid +TEST_GEN_PROGS_x86_64 += x86_64/memory_region_allones TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test TEST_GEN_PROGS_x86_64 += x86_64/set_sregs_test diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 53b11d725d81..8f5ebc8520b8 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -59,6 +59,7 @@ enum vm_mem_backing_src_type { VM_MEM_SRC_ANONYMOUS, VM_MEM_SRC_ANONYMOUS_THP, VM_MEM_SRC_ANONYMOUS_HUGETLB, + VM_MEM_SRC_NONE, }; int kvm_check_cap(long cap); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 33ab0a36d230..232b63ba0b4b 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -446,8 +446,11 @@ static void __vm_mem_region_delete(struct kvm_vm *vm, "rc: %i errno: %i", ret, errno); sparsebit_free(>unused_phy_pages); - ret = munmap(region->mmap_start, region->mmap_size); - TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, errno); + if (region->mmap_start) { + ret = munmap(region->mmap_start, region->mmap_size); + TEST_ASSERT(ret == 0, "munmap failed, rc: %i errno: %i", ret, + errno); + } free(region); } @@ -636,34 +639,42 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm, alignment = 1; #endif - if (src_type == VM_MEM_SRC_ANONYMOUS_THP) - alignment = max(huge_page_size, alignment); - - /* Add enough memory to align up if necessary */ - if (alignment > 1) - region->mmap_size += alignment; - - region->mmap_start = mmap(NULL, region->mmap_size, - PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANONYMOUS - | (src_type == VM_MEM_SRC_ANONYMOUS_HUGETLB ? MAP_HUGETLB : 0), - -1, 0); - TEST_ASSERT(region->mmap_start != MAP_FAILED, - "test_malloc failed, mmap_start: %p errno: %i", - region->mmap_start, errno); - - /* Align host address */ - region->host_mem = align(region->mmap_start, alignment); - - /* As needed perform madvise */ - if (src_type == VM_MEM_SRC_ANONYMOUS || src_type == VM_MEM_SRC_ANONYMOUS_THP) { - ret = madvise(region->host_mem, npages * vm->page_size, -src_type == VM_MEM_SRC_ANONYMOUS ? MADV_NOHUGEPAGE : MADV_HUGEPAGE); - TEST_ASSERT(ret == 0, "madvise failed,\n" - " addr: %p\n" - " length: 0x%lx\n" - " src_type: %x", - region->host_mem, npages * vm->page_size, src_type); + if (src_type != VM_MEM_SRC_NONE) { + if (src_type == VM_MEM_SRC_ANONYMOUS_THP) + alignment = max(huge_page_size, alignment); + + /* Add enough memory to align up if necessary */ + if (alignment > 1) + region->mmap_size += alignment; + + region->mmap_start = mmap(NULL, region->mmap_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS + | (src_type == VM_MEM_SRC_ANONYMOUS_HUGETLB ? +MAP_HUGETLB : 0), -1, 0); + TEST_ASSERT(region->mmap_start != MAP_FAILED, + "test_malloc failed, mmap_start: %p errno: %i", + region->mmap_start, errno); + + /* Align host address */ + region->host_mem = align(region->mmap_start, alignment); + + /* As needed perform madvise */ + if (src_type
[PATCH RFC 0/5] KVM: x86: KVM_MEM_ALLONES memory
The idea of the patchset was suggested by Michael S. Tsirkin. PCIe config space can (depending on the configuration) be quite big but usually is sparsely populated. Guest may scan it by accessing individual device's page which, when device is missing, is supposed to have 'pci holes' semantics: reads return '0xff' and writes get discarded. Currently, userspace has to allocate real memory for these holes and fill them with '0xff'. Moreover, different VMs usually require different memory. The idea behind the feature introduced by this patch is: let's have a single read-only page filled with '0xff' in KVM and map it to all such PCI holes in all VMs. This will free userspace of obligation to allocate real memory and also allow us to speed up access to these holes as we can aggressively map the whole slot upon first fault. RFC. I've only tested the feature with the selftest (PATCH5) on Intel/AMD with and wiuthout EPT/NPT. I haven't tested memslot modifications yet. Patches are against kvm/next. Vitaly Kuznetsov (5): KVM: rename labels in kvm_init() KVM: x86: introduce KVM_MEM_ALLONES memory KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf() KVM: x86: aggressively map PTEs in KVM_MEM_ALLONES slots KVM: selftests: add KVM_MEM_ALLONES test Documentation/virt/kvm/api.rst| 22 ++-- arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/mmu/mmu.c| 34 -- arch/x86/kvm/mmu/paging_tmpl.h| 30 - arch/x86/kvm/x86.c| 9 +- include/linux/kvm_host.h | 15 ++- include/uapi/linux/kvm.h | 2 + tools/testing/selftests/kvm/Makefile | 1 + .../testing/selftests/kvm/include/kvm_util.h | 1 + tools/testing/selftests/kvm/lib/kvm_util.c| 81 +++-- .../kvm/x86_64/memory_region_allones.c| 112 ++ virt/kvm/kvm_main.c | 110 + 12 files changed, 342 insertions(+), 76 deletions(-) create mode 100644 tools/testing/selftests/kvm/x86_64/memory_region_allones.c -- 2.25.4
Re: [PATCH v2 02/20] dt-bindings: bus: Add MIPS CDMM controller
On Thu, May 14, 2020 at 10:09:43AM -0500, Rob Herring wrote: > On Wed, 6 May 2020 20:42:20 +0300, wrote: > > From: Serge Semin > > > > It's a Common Device Memory Map controller embedded into the MIPS IP > > cores, which dts node is supposed to have compatible and reg properties. > > > > Signed-off-by: Serge Semin > > Cc: Alexey Malahov > > Cc: Thomas Bogendoerfer > > Cc: Paul Burton > > Cc: Ralf Baechle > > Cc: Arnd Bergmann > > Cc: linux-m...@vger.kernel.org > > Cc: linux...@vger.kernel.org > > > > --- > > > > Changelog v2: > > - Lowercase the example hex'es. > > --- > > .../bindings/bus/mti,mips-cdmm.yaml | 35 +++ > > 1 file changed, 35 insertions(+) > > create mode 100644 Documentation/devicetree/bindings/bus/mti,mips-cdmm.yaml > > > > Reviewed-by: Rob Herring Great! Thanks. -Sergey
[PATCH RFC 4/5] KVM: x86: aggressively map PTEs in KVM_MEM_ALLONES slots
All PTEs in KVM_MEM_ALLONES slots point to the same read-only page in KVM so instead of mapping each page upon first access we can map everything aggressively. Suggested-by: Michael S. Tsirkin Signed-off-by: Vitaly Kuznetsov --- arch/x86/kvm/mmu/mmu.c | 20 ++-- arch/x86/kvm/mmu/paging_tmpl.h | 23 +-- 2 files changed, 39 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 3db499df2dfc..e92ca9ed3ff5 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4154,8 +4154,24 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, goto out_unlock; if (make_mmu_pages_available(vcpu) < 0) goto out_unlock; - r = __direct_map(vcpu, gpa, write, map_writable, max_level, pfn, -prefault, is_tdp && lpage_disallowed); + + if (likely(!(slot->flags & KVM_MEM_ALLONES) || write)) { + r = __direct_map(vcpu, gpa, write, map_writable, max_level, pfn, +prefault, is_tdp && lpage_disallowed); + } else { + /* +* KVM_MEM_ALLONES are 4k only slots fully mapped to the same +* readonly 'allones' page, map all PTEs aggressively here. +*/ + for (gfn = slot->base_gfn; gfn < slot->base_gfn + slot->npages; +gfn++) { + r = __direct_map(vcpu, gfn << PAGE_SHIFT, write, +map_writable, max_level, pfn, prefault, +is_tdp && lpage_disallowed); + if (r) + break; + } + } out_unlock: spin_unlock(>kvm->mmu_lock); diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 98e368788e8b..7bf0c48b858f 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -789,6 +789,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code, bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) && is_nx_huge_page_enabled(); int max_level; + gfn_t gfn; pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code); @@ -873,8 +874,26 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code, kvm_mmu_audit(vcpu, AUDIT_PRE_PAGE_FAULT); if (make_mmu_pages_available(vcpu) < 0) goto out_unlock; - r = FNAME(fetch)(vcpu, addr, , write_fault, max_level, pfn, -map_writable, prefault, lpage_disallowed); + if (likely(!(slot->flags & KVM_MEM_ALLONES) || write_fault)) { + r = FNAME(fetch)(vcpu, addr, , write_fault, max_level, +pfn, map_writable, prefault, lpage_disallowed); + } else { + /* +* KVM_MEM_ALLONES are 4k only slots fully mapped to the same +* readonly 'allones' page, map all PTEs aggressively here. +*/ + for (gfn = slot->base_gfn; gfn < slot->base_gfn + slot->npages; +gfn++) { + walker.gfn = gfn; + r = FNAME(fetch)(vcpu, gfn << PAGE_SHIFT, , +write_fault, max_level, pfn, +map_writable, prefault, +lpage_disallowed); + if (r) + break; + } + } + kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT); out_unlock: -- 2.25.4
[PATCH RFC 1/5] KVM: rename labels in kvm_init()
Label names in kvm_init() are horrible, rename them to make it obvious what we are going to do on the failure path. Signed-off-by: Vitaly Kuznetsov --- virt/kvm/kvm_main.c | 33 - 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 33e1eee96f75..892ea0b9087e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4674,7 +4674,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, r = kvm_arch_init(opaque); if (r) - goto out_fail; + return r; /* * kvm_arch_init makes sure there's at most one caller @@ -4685,29 +4685,29 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, */ r = kvm_irqfd_init(); if (r) - goto out_irqfd; + goto out_arch_exit; if (!zalloc_cpumask_var(_hardware_enabled, GFP_KERNEL)) { r = -ENOMEM; - goto out_free_0; + goto out_irqfd_exit; } r = kvm_arch_hardware_setup(opaque); if (r < 0) - goto out_free_1; + goto out_free_hardware_enabled; c.ret = c.opaque = opaque; for_each_online_cpu(cpu) { smp_call_function_single(cpu, check_processor_compat, , 1); if (r < 0) - goto out_free_2; + goto out_free_hardware_unsetup; } r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, "kvm/cpu:starting", kvm_starting_cpu, kvm_dying_cpu); if (r) - goto out_free_2; + goto out_free_hardware_unsetup; register_reboot_notifier(_reboot_notifier); /* A kmem cache lets us meet the alignment requirements of fx_save. */ @@ -4721,12 +4721,12 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, NULL); if (!kvm_vcpu_cache) { r = -ENOMEM; - goto out_free_3; + goto out_free_cpuhp_unregister; } r = kvm_async_pf_init(); if (r) - goto out_free; + goto out_free_vcpu_cache; kvm_chardev_ops.owner = module; kvm_vm_fops.owner = module; @@ -4735,7 +4735,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, r = misc_register(_dev); if (r) { pr_err("kvm: misc device register failed\n"); - goto out_unreg; + goto out_async_pf_deinit; } register_syscore_ops(_syscore_ops); @@ -4750,22 +4750,21 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, return 0; -out_unreg: +out_async_pf_deinit: kvm_async_pf_deinit(); -out_free: +out_free_vcpu_cache: kmem_cache_destroy(kvm_vcpu_cache); -out_free_3: +out_free_cpuhp_unregister: unregister_reboot_notifier(_reboot_notifier); cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING); -out_free_2: +out_free_hardware_unsetup: kvm_arch_hardware_unsetup(); -out_free_1: +out_free_hardware_enabled: free_cpumask_var(cpus_hardware_enabled); -out_free_0: +out_irqfd_exit: kvm_irqfd_exit(); -out_irqfd: +out_arch_exit: kvm_arch_exit(); -out_fail: return r; } EXPORT_SYMBOL_GPL(kvm_init); -- 2.25.4
[PATCH RFC 2/5] KVM: x86: introduce KVM_MEM_ALLONES memory
PCIe config space can (depending on the configuration) be quite big but usually is sparsely populated. Guest may scan it by accessing individual device's page which, when device is missing, is supposed to have 'pci holes' semantics: reads return '0xff' and writes get discarded. Currently, userspace has to allocate real memory for these holes and fill them with '0xff'. Moreover, different VMs usually require different memory. The idea behind the feature introduced by this patch is: let's have a single read-only page filled with '0xff' in KVM and map it to all such PCI holes in all VMs. This will free userspace of obligation to allocate real memory. Later, this will also allow us to speed up access to these holes as we can aggressively map the whole slot upon first fault. Suggested-by: Michael S. Tsirkin Signed-off-by: Vitaly Kuznetsov --- Documentation/virt/kvm/api.rst | 22 ++--- arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/x86.c | 9 ++-- include/linux/kvm_host.h| 15 ++- include/uapi/linux/kvm.h| 2 + virt/kvm/kvm_main.c | 79 +++-- 6 files changed, 113 insertions(+), 15 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index d871dacb984e..2b87d588a7e0 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1236,7 +1236,8 @@ yet and must be cleared on entry. /* for kvm_memory_region::flags */ #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0) - #define KVM_MEM_READONLY (1UL << 1) + #define KVM_MEM_READONLY (1UL << 1) + #define KVM_MEM_ALLONES (1UL << 2) This ioctl allows the user to create, modify or delete a guest physical memory slot. Bits 0-15 of "slot" specify the slot id and this value @@ -1264,12 +1265,19 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr be identical. This allows large pages in the guest to be backed by large pages in the host. -The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and -KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of -writes to memory within the slot. See KVM_GET_DIRTY_LOG ioctl to know how to -use it. The latter can be set, if KVM_CAP_READONLY_MEM capability allows it, -to make a new slot read-only. In this case, writes to this memory will be -posted to userspace as KVM_EXIT_MMIO exits. +The flags field supports the following flags: KVM_MEM_LOG_DIRTY_PAGES, +KVM_MEM_READONLY, KVM_MEM_READONLY: +- KVM_MEM_LOG_DIRTY_PAGES can be set to instruct KVM to keep track of writes to + memory within the slot. See KVM_GET_DIRTY_LOG ioctl to know how to use it. +- KVM_MEM_READONLY can be set, if KVM_CAP_READONLY_MEM capability allows it, + to make a new slot read-only. In this case, writes to this memory will be + posted to userspace as KVM_EXIT_MMIO exits. +- KVM_MEM_ALLONES can be set, if KVM_CAP_ALLONES_MEM capability allows it, + to create a new virtual read-only slot which will always return '0xff' when + guest reads from it. 'userspace_addr' has to be set to NULL, KVM maps all + pages in such slots to a single readonly page. This flag is mutually exclusive + with KVM_MEM_LOG_DIRTY_PAGES/KVM_MEM_READONLY. All writes to this memory will be + posted to userspace as KVM_EXIT_MMIO exits. When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of the memory region are automatically reflected into the guest. For example, an diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 3f3f780c8c65..97ad1d90636e 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -48,6 +48,7 @@ #define __KVM_HAVE_XSAVE #define __KVM_HAVE_XCRS #define __KVM_HAVE_READONLY_MEM +#define __KVM_HAVE_ALLONES_MEM /* Architectural interrupt line count. */ #define KVM_NR_INTERRUPTS 256 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8c0b77ac8dc6..2c0fcab8ea1e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3389,6 +3389,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_GET_MSR_FEATURES: case KVM_CAP_MSR_PLATFORM_INFO: case KVM_CAP_EXCEPTION_PAYLOAD: + case KVM_CAP_ALLONES_MEM: r = 1; break; case KVM_CAP_SYNC_REGS: @@ -9962,9 +9963,11 @@ static int kvm_alloc_memslot_metadata(struct kvm_memory_slot *slot, ugfn = slot->userspace_addr >> PAGE_SHIFT; /* * If the gfn and userspace address are not aligned wrt each -* other, disable large page support for this slot. +* other, disable large page support for this slot. Also, +* disable large page support for KVM_MEM_ALLONES slots. */ - if ((slot->base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE(level) - 1)) { + if (slot->flags &
Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
On Wed, 13 May 2020 22:24:56 -0400 (EDT) Mathieu Desnoyers wrote: > Also, considering that "inline" is not sufficient to ensure the compiler > does not emit a traceable function, I suspect you'll also want to mark > "native_get_debugreg" and "native_set_debugreg" always inline as well. I was thinking that the noinstr sections was more about not doing tracing and kprobes. As "inline" has been defined as "notrace" for some time, where any function marked as "inline" will not be available to ftrace even if the compiler decides not to honor the inline. in linux/compiler_types.h: #define inline inline __gnu_inline __inline_maybe_unused notrace -- Steve
Re: [PATCH v3] ACPI/IORT: Fix PMCG node always look for a single ID mapping.
> On May 14, 2020, at 2:29 AM, Lorenzo Pieralisi > wrote: > > Please update the subject: > > Subject: "ACPI/IORT: Fix PMCG node single ID mapping handling" > > On Wed, May 13, 2020 at 05:12:02PM -0700, Tuan Phan wrote: >> PMCG node can have zero ID mapping if its overflow interrupt >> is wire based. The code to parse PMCG node can not assume it will >> have a single ID mapping. > > "An IORT PMCG node can have no ID mapping if its overflow interrupt is > wire based therefore the code that parses the PMCG node can not assume > the node will always have a single mapping present at index 0. > > Fix iort_get_id_mapping_index() by checking for an overflow interrupt > and mapping count." Thanks for the correction. Will update it. > >> Fixes: 24e516049360 ("ACPI/IORT: Add support for PMCG") >> Reviewed-by: Hanjun Guo >> Signed-off-by: Tuan Phan >> --- >> v1 -> v2: >> - Use pmcg node to detect wired base overflow interrupt. >> >> v2 -> v3: >> - Address Hanjun and Robin's comments. >> >> drivers/acpi/arm64/iort.c | 5 + >> 1 file changed, 5 insertions(+) > > Acked-by: Lorenzo Pieralisi > >> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c >> index ed3d2d1..12bb70e 100644 >> --- a/drivers/acpi/arm64/iort.c >> +++ b/drivers/acpi/arm64/iort.c >> @@ -414,6 +414,7 @@ static struct acpi_iort_node *iort_node_get_id(struct >> acpi_iort_node *node, >> static int iort_get_id_mapping_index(struct acpi_iort_node *node) >> { >> struct acpi_iort_smmu_v3 *smmu; >> +struct acpi_iort_pmcg *pmcg; >> >> switch (node->type) { >> case ACPI_IORT_NODE_SMMU_V3: >> @@ -441,6 +442,10 @@ static int iort_get_id_mapping_index(struct >> acpi_iort_node *node) >> >> return smmu->id_mapping_index; >> case ACPI_IORT_NODE_PMCG: >> +pmcg = (struct acpi_iort_pmcg *)node->node_data; >> +if (pmcg->overflow_gsiv || node->mapping_count == 0) >> +return -EINVAL; >> + >> return 0; >> default: >> return -EINVAL; >> -- >> 2.7.4 >>
Re: [PATCH 2/3] remoteproc: Add inline coredump functionality
On 2020-05-07 13:21, Bjorn Andersson wrote: On Thu 16 Apr 11:38 PDT 2020, Rishabh Bhatnagar wrote: This patch adds the inline coredump functionality. The current coredump implementation uses vmalloc area to copy all the segments. But this might put a lot of strain on low memory targets as the firmware size sometimes is in ten's of MBs. The situation becomes worse if there are multiple remote processors undergoing recovery at the same time. This patch directly copies the device memory to userspace buffer and avoids extra memory usage. This requires recovery to be halted until data is read by userspace and free function is called. Signed-off-by: Rishabh Bhatnagar --- drivers/remoteproc/remoteproc_coredump.c | 130 +++ drivers/remoteproc/remoteproc_internal.h | 23 +- include/linux/remoteproc.h | 2 + 3 files changed, 153 insertions(+), 2 deletions(-) diff --git a/drivers/remoteproc/remoteproc_coredump.c b/drivers/remoteproc/remoteproc_coredump.c index 9de0467..888b7dec91 100644 --- a/drivers/remoteproc/remoteproc_coredump.c +++ b/drivers/remoteproc/remoteproc_coredump.c @@ -12,6 +12,84 @@ #include #include "remoteproc_internal.h" +static void rproc_free_dump(void *data) rproc_coredump_free() +{ + struct rproc_coredump_state *dump_state = data; + + complete(_state->dump_done); vfree(dump_state->header); +} + +static unsigned long resolve_addr(loff_t user_offset, rproc_coredump_find_segment() + struct list_head *segments, + unsigned long *data_left) +{ + struct rproc_dump_segment *segment; + + list_for_each_entry(segment, segments, node) { + if (user_offset >= segment->size) + user_offset -= segment->size; + else + break; if (user_offset < segment->size) { *data_left = segment->size - user_offset; return segment->da + user_offset; } user_offset -= segment->size; + } *data_left = 0; return 0; + + if (>node == segments) { + *data_left = 0; + return 0; + } + + *data_left = segment->size - user_offset; + + return segment->da + user_offset; +} + +static ssize_t rproc_read_dump(char *buffer, loff_t offset, size_t count, + void *data, size_t header_size) +{ + void *device_mem; + size_t data_left, copy_size, bytes_left = count; + unsigned long addr; + struct rproc_coredump_state *dump_state = data; + struct rproc *rproc = dump_state->rproc; + void *elfcore = dump_state->header; + + /* Copy the header first */ + if (offset < header_size) { + copy_size = header_size - offset; + copy_size = min(copy_size, bytes_left); + + memcpy(buffer, elfcore + offset, copy_size); + offset += copy_size; + bytes_left -= copy_size; + buffer += copy_size; + } Perhaps you can take inspiration from devcd_readv() here? + + while (bytes_left) { + addr = resolve_addr(offset - header_size, + >dump_segments, _left); + /* EOF check */ + if (data_left == 0) { Afaict data_left denotes the amount of data left in this particular segment, rather than in the entire core. I think you should start by making bytes_left the minimum of the core size and @count and then have this loop as long as bytes_left, copying data to the buffer either from header or an appropriate segment based on the current offset. + pr_info("Ramdump complete %lld bytes read", offset); dev_dbg(>dev, ...) + break; + } + + copy_size = min_t(size_t, bytes_left, data_left); + + device_mem = rproc->ops->da_to_va(rproc, addr, copy_size); rproc_da_to_va() + if (!device_mem) { + pr_err("Address:%lx with size %zd out of remoteproc carveout\n", dev_err(>dev, "coredump: %#lx size %#zx outside of carveouts\n", ..); + addr, copy_size); + return -ENOMEM; + } + memcpy(buffer, device_mem, copy_size); + + offset += copy_size; + buffer += copy_size; + bytes_left -= copy_size; + } + + return count - bytes_left; +} + static void create_elf_header(void *data, int phnum, struct rproc *rproc) { struct elf32_phdr *phdr; @@ -55,6 +133,58 @@ static void create_elf_header(void *data, int phnum, struct rproc *rproc) } /** + * rproc_inline_coredump() - perform synchronized coredump + * @rproc: rproc handle + * + * This function will generate an ELF header for
Re: [GIT PULL] thread fixes v5.7-rc5
On Thu, May 14, 2020 at 10:05 AM Christian Brauner wrote: > > This contains a single fix for all exported legacy fork helpers to block > accidental access to clone3() features in the upper 32 bits of their > respective flags arguments. I've taken this pull, but I really think the minimal and more straightforward fix would have been to just make do_fork(), kernel_thread() and clone() change the flags field from "unsigned long" to "unsigned int". I don't see why that wouldn't have fixed things, and it would have been simpler and more obvious, I think. Doesn't matter, I guess. Linus
Re: linux-next boot error: WARNING: suspicious RCU usage in bpq_device_event
On Thu, May 14, 2020 at 08:24:54AM -0400, Qian Cai wrote: > > > > On May 14, 2020, at 7:37 AM, syzbot > > wrote: > > > > Hello, > > > > syzbot found the following crash on: > > > > HEAD commit: c9529331 Add linux-next specific files for 20200514 > > git tree: linux-next > > console output: https://syzkaller.appspot.com/x/log.txt?x=17119f4810 > > kernel config: https://syzkaller.appspot.com/x/.config?x=404a80e135048067 > > dashboard link: https://syzkaller.appspot.com/bug?extid=bb82cafc737c002d11ca > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > Reported-by: syzbot+bb82cafc737c002d1...@syzkaller.appspotmail.com > > > > = > > WARNING: suspicious RCU usage > > 5.7.0-rc5-next-20200514-syzkaller #0 Not tainted > > - > > drivers/net/hamradio/bpqether.c:149 RCU-list traversed in non-reader > > section!! > > How about teaching the bot to always CC Madhuparna and Amol for those > RCU-list bug reports? > Sounds good to me if this indeed is possible. > > > > other info that might help us debug this: > > > > > > rcu_scheduler_active = 2, debug_locks = 1 > > 1 lock held by ip/3967: > > #0: 8a7bad88 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock > > net/core/rtnetlink.c:72 [inline] > > #0: 8a7bad88 (rtnl_mutex){+.+.}-{3:3}, at: > > rtnetlink_rcv_msg+0x3f9/0xad0 net/core/rtnetlink.c:5458 > > > > stack backtrace: > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:77 [inline] > > dump_stack+0x18f/0x20d lib/dump_stack.c:118 > > bpq_get_ax25_dev drivers/net/hamradio/bpqether.c:149 [inline] > > bpq_device_event+0x796/0x8ee drivers/net/hamradio/bpqether.c:538 > > notifier_call_chain+0xc0/0x230 kernel/notifier.c:83 > > call_netdevice_notifiers_info net/core/dev.c:2016 [inline] > > call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2001 > > call_netdevice_notifiers_extack net/core/dev.c:2028 [inline] > > call_netdevice_notifiers net/core/dev.c:2042 [inline] > > __dev_notify_flags+0x121/0x2c0 net/core/dev.c:8279 > > dev_change_flags+0x100/0x160 net/core/dev.c:8317 > > do_setlink+0xa1c/0x35d0 net/core/rtnetlink.c:2605 > > __rtnl_newlink+0xad0/0x1590 net/core/rtnetlink.c:3273 > > rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3398 > > rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5461 > > netlink_rcv_skb+0x15a/0x430 net/netlink/af_netlink.c:2469 > > netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] > > netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329 > > netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918 > > sock_sendmsg_nosec net/socket.c:652 [inline] > > sock_sendmsg+0xcf/0x120 net/socket.c:672 > > sys_sendmsg+0x6e6/0x810 net/socket.c:2352 > > ___sys_sendmsg+0x100/0x170 net/socket.c:2406 > > __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439 > > do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 > > entry_SYSCALL_64_after_hwframe+0x49/0xb3 > > RIP: 0033:0x7f76dcdfcdc7 > > Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 8b 05 4a 49 > > 2b 00 85 c0 75 2e 48 63 ff 48 63 d2 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff > > ff 77 01 c3 48 8b 15 a1 f0 2a 00 f7 d8 64 89 02 48 > > RSP: 002b:7ffd45eccf28 EFLAGS: 0246 ORIG_RAX: 002e > > RAX: ffda RBX: 5ebd27cd RCX: 7f76dcdfcdc7 > > RDX: RSI: 7ffd45eccf70 RDI: 0003 > > RBP: 7ffd45eccf70 R08: 1000 R09: fefefeff77686d74 > > R10: 05e9 R11: 0246 R12: 7ffd45eccfb0 > > R13: 561a2ddea3c0 R14: 7ffd45ed5030 R15: > > ip (3967) used greatest stack depth: 23144 bytes left > > > > > > --- > > This bug is generated by a bot. It may contain errors. > > See https://goo.gl/tpsmEJ for more information about syzbot. > > syzbot engineers can be reached at syzkal...@googlegroups.com. > > > > syzbot will keep track of this bug report. See: > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. >
Re: [PATCH V3 1/4] x86/resctrl: Enable user to view and select thread throttling mode
Hi Babu, Thank you very much for taking a look. On 5/14/2020 9:45 AM, Babu Moger wrote: > Hi Reinnette, > > The patches did not apply on my tree. I got the latest tree today. You Which tree did you use as baseline? These patches should apply cleanly on top of the other resctrl patches already queued for inclusion into v5.8 as found on the x86/cache branch of the tip repo. > might want to check again. > Hunk #1 FAILED at 29. > 1 out of 7 hunks FAILED -- saving rejects to file > arch/x86/kernel/cpu/resctrl/rdtgroup.c.rej > > >> -Original Message- >> From: Reinette Chatre >> Sent: Wednesday, May 6, 2020 6:50 PM >> To: t...@linutronix.de; fenghua...@intel.com; b...@alien8.de; >> tony.l...@intel.com >> Cc: kuo-lang.ts...@intel.com; ravi.v.shan...@intel.com; mi...@redhat.com; >> Moger, Babu ; h...@zytor.com; x...@kernel.org; >> linux-kernel@vger.kernel.org; Reinette Chatre >> Subject: [PATCH V3 1/4] x86/resctrl: Enable user to view and select thread >> throttling mode >> ... >> diff --git a/arch/x86/kernel/cpu/resctrl/core.c >> b/arch/x86/kernel/cpu/resctrl/core.c >> index 12f967c6b603..1bc686777069 100644 >> --- a/arch/x86/kernel/cpu/resctrl/core.c >> +++ b/arch/x86/kernel/cpu/resctrl/core.c >> @@ -250,6 +250,30 @@ static inline bool rdt_get_mb_table(struct rdt_resource >> *r) >> return false; >> } >> >> +/* >> + * Model-specific test to determine if platform where memory bandwidth >> + * control is applied to a core can be configured to apply either the >> + * maximum or minimum of the per-thread delay values. >> + * By default, platforms where memory bandwidth control is applied to a >> + * core will select the maximum delay value of the per-thread CLOS. >> + * >> + * NOTE: delay value programmed to hardware is inverse of bandwidth >> + * percentage configured via user interface. >> + */ >> +bool mba_cfg_supports_min_max_intel(void) >> +{ >> +switch (boot_cpu_data.x86_model) { >> +case INTEL_FAM6_ATOM_TREMONT_D: >> +case INTEL_FAM6_ICELAKE_X: >> +case INTEL_FAM6_ICELAKE_D: >> +return true; >> +default: >> +return false; >> +} >> + >> +return false; >> +} > > I see that you are calling this function multiple times. Why don't you It is called: - once during initialization - once per package initialization (when first CPU on a package comes online) - once per read from or write to the new "thread_throttle_mode" file > make it as a property in rdt_resource. Set it only once during the > init(may be in get_mem_config_intel). Then you can use it wherever > required. This also probably help James to make everything architecture > independent. What do you think? If I understand correctly this would require understanding how each architecture behaves in this regard to ensure the property within rdt_resource is initialized correctly. While it could just be set within get_mem_config_intel as you suggest, this property would be present in AMD's resource and thus need a value ... could you please provide guidance what this property should be on AMD? I looked at the bandwidth enforcement section of https://developer.amd.com/wp-content/resources/56375.pdf but it is not obvious to me where bandwidth is actually enforced and how this enforcement is affected when threads and/or cores are running tasks with different CLOS that have different bandwidth limits assigned. With AMD's properties understood then the new "thread_throttle_mode" file could be visible on all platforms, not just Intel, and display accurate values for all and be architecture independent. Alternatively, the new property could have values: UNDEFINED, MIN, MAX, or PER_THREAD ... with AMD having UNDEFINED and the "thread_throttle_mode" continues to be visible on Intel platforms only. I would appreciate your thoughts on this. > I assume that this property is probably not part of CPUID. Correct. This specific property is model specific, but also transient in that it is replaced by X86_FEATURE_PER_THREAD_MBA (that is part of CPUID) in future platforms. Reinette
Re: Default enable RCU list lockdep debugging with PROVE_RCU
On Thu, May 14, 2020 at 11:46:23AM -0400, Qian Cai wrote: > > > > On May 14, 2020, at 11:34 AM, Paul E. McKenney wrote: > > > > On Thu, May 14, 2020 at 10:03:21AM -0400, Qian Cai wrote: > >> > >> > >>> On May 14, 2020, at 9:54 AM, Paul E. McKenney wrote: > >>> > >>> On Thu, May 14, 2020 at 09:44:28AM -0400, Qian Cai wrote: > > > > On May 14, 2020, at 9:33 AM, Paul E. McKenney > > wrote: > > > > On Thu, May 14, 2020 at 08:31:13AM -0400, Qian Cai wrote: > >> > >> > >>> On May 14, 2020, at 8:25 AM, Stephen Rothwell > >>> wrote: > >>> > >>> Hi Paul, > >>> > >>> This patch in the rcu tree > >>> > >>> d13fee049fa8 ("Default enable RCU list lockdep debugging with > >>> PROVE_RCU") > >>> > >>> is causing whack-a-mole in the syzbot testing of linux-next. Because > >>> they always do a debug build of linux-next, no testing is getting > >>> done. :-( > >>> > >>> Can we find another way to find all the bugs that are being discovered > >>> (very slowly)? > >> > >> Alternatively, could syzbot to use PROVE_RCU=n temporarily because it > >> can’t keep up with it? I personally found PROVE_RCU_LIST=y is still > >> useful for my linux-next testing, and don’t want to lose that coverage > >> overnight. > > > > The problem is that PROVE_RCU is exactly PROVE_LOCKING, and asking > > people > > to test without PROVE_LOCKING is a no-go in my opinion. But of course > > on the other hand if there is no testing of RCU list lockdep debugging, > > those issues will never be found, let alone fixed. > > > > One approach would be to do as Stephen asks (either remove d13fee049fa8 > > or pull it out of -next) and have testers force-enable the RCU list > > lockdep debugging. > > > > Would that work for you? > > Alternatively, how about having > > PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT > > since it is only syzbot can’t keep up with it? > >>> > >>> Sound good to me, assuming that this works for the syzkaller guys. > >>> Or could there be a "select PROVE_RCU_LIST" for the people who would > >>> like to test it. > >>> > >>> Alternatively, if we revert d13fee049fa8 from -next, I could provide > >>> you a script that updates your .config to set both RCU_EXPERT and > >>> PROVE_RCU_LIST. > >>> > >>> There are a lot of ways to appraoch this. > >>> > >>> So what would work best for everyone? > >> > >> > >> If PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT works for syzbot guys, that > >> would be great, so other testing agents could still report/fix those > >> RCU-list bugs and then pave a way for syzbot to return back once all those > >> false positives had been sorted out. > > > > On that, I must defer to the syzbot guys. > > > >> Otherwise, “select PROVE_RCU_LIST” *might* be better than buried into > >> RCU_EXPERT where we will probably never saw those false positives been > >> addressed since my configs does not cover a wide range of subsystems and > >> probably not many other bots would enable RCU_EXPERT. > > > > Yet another option would be to edit your local kernel/rcu/Kconfig.debug > > and change the code to the following: > > > > config PROVE_RCU_LIST > > def_bool y > > help > > Enable RCU lockdep checking for list usages. It is default > > enabled with CONFIG_PROVE_RCU. > > > > Removing the RCU_EXPERT dependency would not go over at all well with > > some people whose opinions are difficult to ignore. ;-) > > I am trying to not getting into a game of carrying any custom patch myself. > > Let’s see what syzbot guys will say, and then I’ll enable RCU_EXPERT myself > if needed, but again we probably never see PROVE_RCU_LIST to be used again in > syzbot for this path. I surely have no cycles to expand the testing coverage > for more subsystems at the moment. Fair enough! And yes, the Linux kernel is quite large, so I certainly am not asking you to test the whole thing yourself. Thanx, Paul
Re: [PATCH v2 03/20] dt-bindings: Add vendor prefix for Baikal Electronics, JSC
Rob, Could you also take a look at this patch? There are several patchsets I've sent which depend on the vendor-prefix it defines. So when you get to check those patchsets DT files, the dt_binding_check will fail without it. Is it possible somehow to pick this patch up from here and apply it before checking those Baikal-T1-specific binding files? -Sergey On Wed, May 06, 2020 at 08:42:21PM +0300, sergey.se...@baikalelectronics.ru wrote: > From: Serge Semin > > Add "BAIKAL ELECTRONICS, JSC" to the list of devicetree vendor prefixes > as "baikal". > > Website: http://www.baikalelectronics.com > > Signed-off-by: Serge Semin > Cc: Alexey Malahov > Cc: Thomas Bogendoerfer > Cc: Paul Burton > Cc: Ralf Baechle > Cc: Arnd Bergmann > Cc: linux-m...@vger.kernel.org > Cc: linux...@vger.kernel.org > > --- > > Changelog v2: > - Fix author and SoB emails mismatch. > - Add 'baikal' vendor prefix instead of ambiguous 'be'. > --- > Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml > b/Documentation/devicetree/bindings/vendor-prefixes.yaml > index d3891386d671..674c0d07c0ad 100644 > --- a/Documentation/devicetree/bindings/vendor-prefixes.yaml > +++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml > @@ -139,6 +139,8 @@ patternProperties: > description: Azoteq (Pty) Ltd >"^azw,.*": > description: Shenzhen AZW Technology Co., Ltd. > + "^baikal,.*": > +description: BAIKAL ELECTRONICS, JSC >"^bananapi,.*": > description: BIPAI KEJI LIMITED >"^beacon,.*": > -- > 2.25.1 >
[PATCH][next] rtw88: 8723d: fix incorrect setting of ldo_pwr
From: Colin Ian King Currently ldo_pwr has the LDO25 voltage bits set to zero and then it is overwritten with the new voltage setting. The assignment looks incorrect, it should be bit-wise or'ing in the new voltage setting rather than a direct assignment. Addresses-Coverity: ("Unused value") Fixes: 1afb5eb7a00d ("rtw88: 8723d: Add cfg_ldo25 to control LDO25") Signed-off-by: Colin Ian King --- drivers/net/wireless/realtek/rtw88/rtw8723d.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/wireless/realtek/rtw88/rtw8723d.c b/drivers/net/wireless/realtek/rtw88/rtw8723d.c index b517af417e0e..2c6e417c5bca 100644 --- a/drivers/net/wireless/realtek/rtw88/rtw8723d.c +++ b/drivers/net/wireless/realtek/rtw88/rtw8723d.c @@ -561,7 +561,7 @@ static void rtw8723d_cfg_ldo25(struct rtw_dev *rtwdev, bool enable) ldo_pwr = rtw_read8(rtwdev, REG_LDO_EFUSE_CTRL + 3); if (enable) { ldo_pwr &= ~BIT_MASK_LDO25_VOLTAGE; - ldo_pwr = (BIT_LDO25_VOLTAGE_V25 << 4) | BIT_LDO25_EN; + ldo_pwr |= (BIT_LDO25_VOLTAGE_V25 << 4) | BIT_LDO25_EN; } else { ldo_pwr &= ~BIT_LDO25_EN; } -- 2.25.1
Re: [PATCH 0/8] KVM: x86: Interrupt-based mechanism for async_pf 'page present' notifications
Vivek Goyal writes: > On Mon, May 11, 2020 at 06:47:44PM +0200, Vitaly Kuznetsov wrote: >> Concerns were expressed around (ab)using #PF for KVM's async_pf mechanism, >> it seems that re-using #PF exception for a PV mechanism wasn't a great >> idea after all. The Grand Plan is to switch to using e.g. #VE for 'page >> not present' events and normal APIC interrupts for 'page ready' events. >> This series does the later. > > Hi Vitaly, > > How does any of this impact nested virtualization code (if any). > > I have tried understanding that logic, but I have to admit, I could > never get it. > > arch/x86/kvm/mmu/mmu.c > > int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code, > u64 fault_address, char *insn, int insn_len) > { > switch (vcpu->arch.apf.host_apf_reason) { > case KVM_PV_REASON_PAGE_NOT_PRESENT: > kvm_async_pf_task_wait(fault_address, 0); > case KVM_PV_REASON_PAGE_READY: > kvm_async_pf_task_wake(fault_address); > } > } > "[PATCH 8/8] KVM: x86: drop KVM_PV_REASON_PAGE_READY case from kvm_handle_page_fault()" modifies this a little bit. Basically (and if I understand this correctly) we have the following APF related feature (bit 2 in MSR_KVM_ASYNC_PF_EN): "asynchronous page faults are delivered to L1 as #PF vmexits.". When enabled, it allows L0 to inject #PF when L2 guest is running. L1 will see this as '#PF vmexit' and the code you cite will do exactly what do_async_page_fault() is doing. When we switch to interrupt based delivery for 'page ready' events we don't need a special handling for them in L1 (as we don't need any special handling for all interrupts from devices in kernel when KVM guest is running). I have to admit I haven't tested nested scenario yet, "what could go wrong?" :-) -- Vitaly
Re: [PATCH 1/2] dt-bindings: chrome: Add cros-ec-typec mux props
Hi folks, On Tue, May 12, 2020 at 04:41:54PM +0300, Heikki Krogerus wrote: > Hi guys, > > On Mon, May 11, 2020 at 01:46:35PM -0700, Prashant Malani wrote: > > Hi Rob, > > > > Thank you for reviewing the patch. Kindly see my comments inline: > > > > On Mon, May 11, 2020 at 02:28:00PM -0500, Rob Herring wrote: > > > On Wed, Apr 22, 2020 at 03:22:39PM -0700, Prashant Malani wrote: > > > > Add properties for mode, orientation and USB data role switches for > > > > Type C connectors. When available, these will allow the Type C connector > > > > class port driver to configure the various switches according to USB PD > > > > information (like orientation, alt mode etc.) provided by the Chrome OS > > > > EC controller. > > > > > > > > Signed-off-by: Prashant Malani > > > > --- > > > > .../bindings/chrome/google,cros-ec-typec.yaml | 27 ++- > > > > 1 file changed, 26 insertions(+), 1 deletion(-) > > > > > > > > diff --git > > > > a/Documentation/devicetree/bindings/chrome/google,cros-ec-typec.yaml > > > > b/Documentation/devicetree/bindings/chrome/google,cros-ec-typec.yaml > > > > index 6d7396ab8bee..b5814640aa32 100644 > > > > --- a/Documentation/devicetree/bindings/chrome/google,cros-ec-typec.yaml > > > > +++ b/Documentation/devicetree/bindings/chrome/google,cros-ec-typec.yaml > > > > @@ -21,7 +21,21 @@ properties: > > > > const: google,cros-ec-typec > > > > > > > >connector: > > > > -$ref: /schemas/connector/usb-connector.yaml# > > > > +allOf: > > > > + - $ref: /schemas/connector/usb-connector.yaml# > > > > + - type: object > > > > +properties: > > > > > > These don't seem CrOS EC specific, so why document them as such. > > > > Are you referring to the "mode-switch", "orientation-switch" and > > "usb-role-switch" properties? If so, then yes, they aren't Cros EC > > specific. The Type C connector class framework requires the nodes to be > > named like this, and the cros-ec-typec driver uses this framework, hence > > the description here (the Type C connector class framework doesn't have > > any bindings). > > > > Would it be better to add in the description string that Type Connector > > class expects these switches to be named this way? : > > > > " Reference to a DT node for the USB Type C Multiplexer controlling the > > data lines routing for this connector. This switch is assumed registered > > with the Type C connector class framework, which requires it to be named > > this way." > > > > > > > + mode-switch: > > > > +description: Reference to a DT node for the USB Type C > > > > Multiplexer > > > > + controlling the data lines routing for this connector. > > > > > > This is for alternate mode muxing I presume. > > > > Yes, that's right. > > > > > > We already have a mux-control binding. Why not use that here? > > > > Heikki might be able to offer more insight into why this is the case, > > since the connector class framework seems to expect a phandle and for > > the device driver to implement a "set" command. Heikki, would you happen to > > know? > > The mode-switch here would actually represent the "consumer" part in > the mux-control bindings. So the mux-controls would describe the > relationship between the "mode-switch" and the mux controller(s), > while the mode-switch property describes the relationship between > something like USB Type-C Port Manager (or this cros_ec function) and > the "mux consumer". > Thanks for the explanation, Heikki. Hi Rob, Does the above explanation help clarify the usage here? If so, shall I upload a new patch version with the additional text (referencing Type C connector class framework) added to the *-switch descriptions? Best regards, -Prashant > > > > + > > > > + orientation-switch: > > > > +description: Reference to a DT node for the USB Type C > > > > orientation > > > > + switch for this connector. > > > > > > What's in this node? > > > > Similar to the other "-switch", this will contain a phandle to a device > > which can control orientation settings for the Type C Mux. The connector > > class API assumes the switches are named this way. For example: > > > > orientation-switch: > > https://elixir.bootlin.com/linux/v5.7-rc2/source/drivers/usb/typec/mux.c#L64 > > > > mode-switch: > > https://elixir.bootlin.com/linux/v5.7-rc2/source/drivers/usb/typec/mux.c#L258 > > > > > > > > > + > > > > + usb-role-switch: > > > > +description: Reference to a DT node for the USB Data role > > > > switch > > > > + for this connector. > > > > > > > > required: > > > >- compatible > > > > @@ -49,6 +63,17 @@ examples: > > > > data-role = "dual"; > > > > try-power-role = "source"; > > > >}; > > > > + > > > > + connector@1 { > > > > +compatible = "usb-c-connector"; > > > > +reg = <1>; > > > > +power-role = "dual";
[PATCH][next] fsinfo: fix an uninialized variable 'conn'
From: Colin Ian King Variable conn is not initialized and can potentially contain garbage causing a false -EPERM return on the !conn check. Fix this by initializing it to false. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: f2494de388bd ("fsinfo: Add an attribute that lists all the visible mounts in a namespace") Signed-off-by: Colin Ian King --- fs/namespace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/namespace.c b/fs/namespace.c index 3fd24575756b..ae489cbac467 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -4433,7 +4433,7 @@ int fsinfo_generic_mount_all(struct path *path, struct fsinfo_context *ctx) struct mnt_namespace *ns; struct mount *m, *p; struct path chroot; - bool conn; + bool conn = false; m = real_mount(path->mnt); ns = m->mnt_ns; -- 2.25.1
Re: [RFC v1 0/3] Add QTI QFPROM-Efuse driver support
Hi, On Wed, May 13, 2020 at 8:35 PM Ravi Kumar Bokka (Temp) wrote: > > Hi doug, > > Thanks for your feedback. Please find below inline comments. Probably the mailing list didn't see them. You responded with HTML mail. Please be careful to only respond in plain text. > Regards, > > Ravi Kumar.B > > > > On 5/13/2020 4:33 AM, Doug Anderson wrote: > > Hi, > > On Tue, May 12, 2020 at 11:18 AM Ravi Kumar Bokka > wrote: > > This patch series adds qfprom-efuse controller driver support. > > This driver can access the raw qfprom regions for fuse blowing. > > The current existed qfprom driver is only supports for cpufreq, thermal > sensors > drivers by read out calibration data, speed bins..etc which is stored by > qfprom efuses. > > I don't understand the interaction between this driver and the > existing "qcom,qfprom" driver. Can you please explain? Are they both > acting on the same values and this one has write access? Are there > two instances of the same hardware block and you're managing one of > them with this new driver and one with thue old driver? Something > else? > > [Ravi] Existing QFPROM driver in upstream kernel has limited support which is > some hard coded mapping of id vs set of fuses and user can read those fuse > with those id-bucket. > That is simply reading Hw-registers and it doesn't involve any hardware > programming sequence etc. Based on information given to us by QC-kernel team, > existing driver was created to read calibration/sensor fuses and it is very > basic/limited/fixed in functionalities and orthogonal to what we need to on > Trogdor. > > Requirement for Trogdor fuse blow driver is different which allows to > read/write almost whole fuse block and requires to follow HW programming > guide. Both are completely separate and has no overlapping in terms of > functionalities and capability. Please ignore the similarity of names of > drivers, they are different in terms of functionalities and driver internals > etc. If they are targeting the same type of hardware IP block then, in the very least, the bindings need to be the same. The bindings are supposed to be describing the hardware. Presumably if the underlying hardware is the same you should be able to write one driver and it can just operate in some sort of "read-only" mode if it's running somewhere it doesn't have access permissions to actually change the fuses. > Ravi Kumar Bokka (3): > dt-bindings: nvmem: Add devicetree bindings for qfprom-efuse > drivers: nvmem: Add driver for QTI qfprom-efuse support > arm64: dts: qcom: sc7180: Add qfprom-efuse > > .../devicetree/bindings/nvmem/qfprom-efuse.yaml| 40 ++ > arch/arm64/boot/dts/qcom/sc7180-idp.dts| 4 + > arch/arm64/boot/dts/qcom/sc7180.dtsi | 9 + > drivers/nvmem/Kconfig | 10 + > drivers/nvmem/Makefile | 2 + > drivers/nvmem/qfprom-efuse.c | 476 > + > 6 files changed, 541 insertions(+) > create mode 100644 Documentation/devicetree/bindings/nvmem/qfprom-efuse.yaml > create mode 100644 drivers/nvmem/qfprom-efuse.c > > -- > Qualcomm INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of > the Code Aurora Forum, hosted by the Linux Foundation. >
Re: [RFC v1 2/3] drivers: nvmem: Add driver for QTI qfprom-efuse support
Hi, I notice that you didn't respond to any of my feedback [1], only Srinivas's. Any reason why? It turns out that Srinivas had quite a lot of the same feedback as I did, but please make sure you didn't miss anything I suggested. More inline below. On Thu, May 14, 2020 at 5:26 AM Ravi Kumar Bokka (Temp) wrote: > > Hi Srinivas, > Thanks for your feedback by giving review comments. Please find my > inline comments. > > > Regards, > Ravi Kumar.B > > On 5/13/2020 6:50 PM, Srinivas Kandagatla wrote: > > > > > > On 12/05/2020 19:17, Ravi Kumar Bokka wrote: > >> This patch adds new driver for QTI qfprom-efuse controller. This > >> driver can > >> access the raw qfprom regions for fuse blowing. > > > > QTI? > > guidance I have received from internal Legal/LOST team is that the QCOM > prefix needs to be changed to QTI everywhere it is used I'll let Srinivas comment if he cares. I'm really not sure why a legal team cares about the Kconfig name in a GPL-licensed Linux kernel. > >> The current existed qfprom driver is only supports for cpufreq, > >> thermal sensors > >> drivers by read out calibration data, speed bins..etc which is stored > >> by qfprom efuses. > > > > Can you explain bit more about this QFPROM instance, Is this QFPROM part > > of secure controller address space? > > Is this closely tied to SoC or Secure controller version? > > > > Any reason why this can not be integrated into qfprom driver with > > specific compatible. > > > > QFPROM driver communicates with sec_controller address space however > scope and functionalities of this driver is different and not limited as > existing qfprom fuse Read-Only driver for specific “fuse buckets’ like > cpufreq, thermal sensors etc. QFPROM fuse write driver in this patch > requires specific sequence to write/blow fuses unlike other driver. > Scope/functionalities are different and this is separate driver. If the underlying IP blocks are the same it should be one driver and it should just work in read-only mode for the other range of stuff. > >> Signed-off-by: Ravi Kumar Bokka > >> --- > >> drivers/nvmem/Kconfig| 10 + > >> drivers/nvmem/Makefile | 2 + > >> drivers/nvmem/qfprom-efuse.c | 476 > >> +++ > >> 3 files changed, 488 insertions(+) > >> create mode 100644 drivers/nvmem/qfprom-efuse.c > >> > > ... > > > >> diff --git a/drivers/nvmem/qfprom-efuse.c b/drivers/nvmem/qfprom-efuse.c > >> new file mode 100644 > >> index 000..2e3c275 > >> --- /dev/null > >> +++ b/drivers/nvmem/qfprom-efuse.c > >> @@ -0,0 +1,476 @@ > >> +// SPDX-License-Identifier: GPL-2.0-only > >> +/* > >> + * Copyright (c) 2020, The Linux Foundation. All rights reserved. > >> + */ > >> + > >> +#include > >> +#include > >> +#include > >> +#include > >> +#include > >> +#include > >> +#include > >> +#include > >> +#include > >> +#include > >> +#include > >> + > >> +#define QFPROM_BLOW_STATUS_BUSY 0x1 > >> +#define QFPROM_BLOW_STATUS_READY 0x0 > >> + > >> +/* Blow timer clock frequency in Mhz for 10nm LPe technology */ > >> +#define QFPROM_BLOW_TIMER_OFFSET 0x03c > >> +#define QFPROM_BLOW_TIMER_RESET_VALUE 0x0 > >> + > >> +/* Amount of time required to hold charge to blow fuse in > >> micro-seconds */ > >> +#define QFPROM_FUSE_BLOW_POLL_PERIOD 100 > >> +#define QFPROM_BLOW_STATUS_OFFSET 0x048 > >> + > >> +#define QFPROM_ACCEL_OFFSET 0x044 > >> + > >> +/** > >> + * struct qfprom_efuse_platform_data - structure holding qfprom-efuse > >> + * platform data > >> + * > >> + * @name: qfprom-efuse compatible name > > > > ?? > > Thanks for your feedback. I will address this change > > >> + * @fuse_blow_time_in_us: Should contain the wait time when doing the > >> fuse blow > >> + * @accel_value: Should contain qfprom accel value > >> + * @accel_reset_value: The reset value of qfprom accel value > >> + * @qfprom_blow_timer_value: The timer value of qfprom when doing > >> efuse blow > >> + * @qfprom_blow_reset_freq: The frequency required to set when fuse > >> blowing > >> + * is done > >> + * @qfprom_blow_set_freq: The frequency required to set when we start > >> the > >> + * fuse blowing > >> + * @qfprom_max_vol: max voltage required to set fuse blow > >> + * @qfprom_min_vol: min voltage required to set fuse blow > > > > How specific are these values per SoC? > > > > This voltage level may change based on SoC and/or fuse-hardware > technology, it would change for SoC with different technology, hence we > have kept it in SOC specific settings. Generally I'd expect the SoC specific settings to be in the device tree. Drivers don't need to specify this. Please respond to the comments I posed in my review. > >> + */ > >> +struct qfprom_efuse_platform_data { > >> +const char *name; > >> +u8 fuse_blow_time_in_us; > >> +u32 accel_value; > >> +u32 accel_reset_value; > >> +u32 qfprom_blow_timer_value; > >> +u32 qfprom_blow_reset_freq; > >> +u32 qfprom_blow_set_freq; > >> +u32
Re: [GIT PULL] thread fixes v5.7-rc5
On May 14, 2020 8:07:59 PM GMT+02:00, Linus Torvalds wrote: >On Thu, May 14, 2020 at 10:05 AM Christian Brauner > wrote: >> >> This contains a single fix for all exported legacy fork helpers to >block >> accidental access to clone3() features in the upper 32 bits of their >> respective flags arguments. > >I've taken this pull, but I really think the minimal and more >straightforward fix would have been to just make do_fork(), >kernel_thread() and clone() change the flags field from "unsigned >long" to "unsigned int". > >I don't see why that wouldn't have fixed things, and it would have >been simpler and more obvious, I think. > >Doesn't matter, I guess. > > Linus Seemed weird to me to change something that's been exposed to userspace for that long. Christian
Re: WARNING: suspicious RCU usage with PROVE_RCU_LIST=y
On Mon, Apr 06, 2020 at 05:11:34PM +0530, Amol Grover wrote: > Hello, > > With respect to the patch https://lore.kernel.org/patchwork/patch/1202512/ > I boot tested with CONFIG_PROVE_RCU_LIST=y and encountered a susppicious RCU > usage warning in "security/apparmor/include/lib.h". I thought of going forward > and fix it myself, however, while going through the stack trace and the actual > code, I found that the function (__lookupn_profile) is required to be called > with rcu_read_locK() but the splat proves it otherwise. > > [ 12.727582] = > [ 12.727599] WARNING: suspicious RCU usage > [ 12.727601] 5.5.4-stable #17 Tainted: GE > [ 12.727602] - > [ 12.727604] security/apparmor/include/lib.h:191 RCU-list traversed in > non-reader section!! > [ 12.727605] >other info that might help us debug this: > > [ 12.727606] >rcu_scheduler_active = 2, debug_locks = 1 > [ 12.727608] 2 locks held by apparmor_parser/506: > [ 12.727609] #0: 9f0687562490 (sb_writers#10){.+.+}, at: > vfs_write+0x140/0x1a0 > [ 12.727614] #1: 9f0687f09ca8 (>lock){+.+.}, at: > aa_replace_profiles+0x17a/0xdd0 > [ 12.727619] >stack backtrace: > [ 12.727621] CPU: 3 PID: 506 Comm: apparmor_parser Tainted: GE >5.5.4-stable #17 > [ 12.727622] Hardware name: Gigabyte Technology Co., Ltd. > Z170-D3H/Z170-D3H-CF, BIOS F21 03/06/2017 > [ 12.727623] Call Trace: > [ 12.727627] dump_stack+0x8f/0xd0 > [ 12.727630] __lookupn_profile+0x19c/0x1a0 > [ 12.727632] ? aa_unpack+0x51b/0x580 > [ 12.727636] __lookup_replace+0x34/0xc0 > [ 12.727640] aa_replace_profiles+0x2a0/0xdd0 > [ 12.727649] policy_update+0x106/0x370 > [ 12.727653] profile_replace+0xa3/0x110 > [ 12.727657] vfs_write+0xb9/0x1a0 > [ 12.727661] ksys_write+0x68/0xe0 > [ 12.727666] do_syscall_64+0x5c/0xe0 > [ 12.727669] entry_SYSCALL_64_after_hwframe+0x49/0xbe > [ 12.727671] RIP: 0033:0x7ff83fec7f93 > [ 12.727673] Code: 75 05 48 83 c4 58 c3 e8 eb 41 ff ff 66 2e 0f 1f 84 00 00 > 00 00 00 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d > 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18 > [ 12.727674] RSP: 002b:7ffcebb5c398 EFLAGS: 0246 ORIG_RAX: > 0001 > [ 12.727676] RAX: ffda RBX: 7131 RCX: > 7ff83fec7f93 > [ 12.727677] RDX: 7131 RSI: 5610fd804a40 RDI: > 0006 > [ 12.727678] RBP: 5610fd804a40 R08: 7131 R09: > 5610fd802f38 > [ 12.727680] R10: fa8a R11: 0246 R12: > > [ 12.727681] R13: 0006 R14: 5610fd7dd490 R15: > 7131 > > Thanks > Amol Hello, Just a friendly request to please go through the above _bug_. Thanks Amol
Re: [RFC] mm/vmstat: Add events for THP migration without split
On 2020-05-11 21:22, Anshuman Khandual wrote: Add the following new trace events which will help in validating migration events involving PMD based THP pages. 1. THP_PMD_MIGRATION_ENTRY_SET 2. THP_PMD_MIGRATION_ENTRY_REMOVE There are no clear method to confirm whether a THP migration happened with out involving it's split. These trace events along with PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE will provide additional insights. After this change, Hi Anshuman, It's very nice to see this work, and I think that reporting a bit more about THP migration stats is going to make development and performance debugging a lot more efficient (and pleasant). A single 2M THP (2K base page) when migrated 1. Without split pgmigrate_success 1 pgmigrate_fail 0 thp_pmd_migration_entry_set 1 thp_pmd_migration_entry_remove 1 I do think we should decouple the trace event name(s) just a *little* more, from the mechanisms used to migrate THPs. In other words, let's report the number of THP migration successes, and name it accordingly--rather than "set" and "remove", which are pretty low-level and furthermore depend on today's exact code. Maybe Zi Yan's recommended name is exactly right, in fact: THP_PMD_MIGRATION_SUCCESS thanks, -- John Hubbard NVIDIA 2. With split pgmigrate_success 512 pgmigrate_fail 0 thp_pmd_migration_entry_set 0 thp_pmd_migration_entry_remove 0 pgmigrate_success as 1 instead of 512, provides a hint for possible THP migration event. But then it gets mixed with normal page migrations over time. These additional trace events provide required co-relation. Cc: Andrew Morton Cc: "Kirill A. Shutemov" Cc: Zi Yan Cc: linux...@kvack.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual --- This is an indirect way for counting PMD migrations without split. Is there a better method possible ? Just request for comments at the moment. include/linux/vm_event_item.h | 4 mm/migrate.c | 1 + mm/rmap.c | 1 + mm/vmstat.c | 4 4 files changed, 10 insertions(+) diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index ffef0f279747..4b25102cf3ad 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -91,6 +91,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, THP_ZERO_PAGE_ALLOC_FAILED, THP_SWPOUT, THP_SWPOUT_FALLBACK, +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION + THP_PMD_MIGRATION_ENTRY_SET, + THP_PMD_MIGRATION_ENTRY_REMOVE, +#endif #endif #ifdef CONFIG_MEMORY_BALLOON BALLOON_INFLATE, diff --git a/mm/migrate.c b/mm/migrate.c index 7160c1556f79..8d50d55cbe97 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -228,6 +228,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma, if (!pvmw.pte) { VM_BUG_ON_PAGE(PageHuge(page) || !PageTransCompound(page), page); remove_migration_pmd(, new); + count_vm_event(THP_PMD_MIGRATION_ENTRY_REMOVE); continue; } #endif diff --git a/mm/rmap.c b/mm/rmap.c index f79a206b271a..3c1fe3f45cb5 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1418,6 +1418,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, VM_BUG_ON_PAGE(PageHuge(page) || !PageTransCompound(page), page); set_pmd_migration_entry(, page); + count_vm_event(THP_PMD_MIGRATION_ENTRY_SET); continue; } #endif diff --git a/mm/vmstat.c b/mm/vmstat.c index 96d21a792b57..a5254b7ee531 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1274,6 +1274,10 @@ const char * const vmstat_text[] = { "thp_zero_page_alloc_failed", "thp_swpout", "thp_swpout_fallback", +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION + "thp_pmd_migration_entry_set", + "thp_pmd_migration_entry_remove", +#endif #endif #ifdef CONFIG_MEMORY_BALLOON "balloon_inflate",
Re: [PATCH v2 03/20] dt-bindings: Add vendor prefix for Baikal Electronics, JSC
On Wed, 6 May 2020 20:42:21 +0300, wrote: > From: Serge Semin > > Add "BAIKAL ELECTRONICS, JSC" to the list of devicetree vendor prefixes > as "baikal". > > Website: http://www.baikalelectronics.com > > Signed-off-by: Serge Semin > Cc: Alexey Malahov > Cc: Thomas Bogendoerfer > Cc: Paul Burton > Cc: Ralf Baechle > Cc: Arnd Bergmann > Cc: linux-m...@vger.kernel.org > Cc: linux...@vger.kernel.org > > --- > > Changelog v2: > - Fix author and SoB emails mismatch. > - Add 'baikal' vendor prefix instead of ambiguous 'be'. > --- > Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++ > 1 file changed, 2 insertions(+) > Applied, thanks!
[PATCH][next] net: dsa: felix: fix incorrect clamp calculation for burst
From: Colin Ian King Currently burst is clamping on rate and not burst, the assignment of burst from the clamping discards the previous assignment of burst. This looks like a cut-n-paste error from the previous clamping calculation on ramp. Fix this by replacing ramp with burst. Addresses-Coverity: ("Unused value") Fixes: 0fbabf875d18 ("net: dsa: felix: add support Credit Based Shaper(CBS) for hardware offload") Signed-off-by: Colin Ian King --- drivers/net/dsa/ocelot/felix_vsc9959.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c index df4498c0e864..85e34d85cc51 100644 --- a/drivers/net/dsa/ocelot/felix_vsc9959.c +++ b/drivers/net/dsa/ocelot/felix_vsc9959.c @@ -1360,7 +1360,7 @@ static int vsc9959_qos_port_cbs_set(struct dsa_switch *ds, int port, /* Burst unit is 4kB */ burst = DIV_ROUND_UP(cbs_qopt->hicredit, 4096); /* Avoid using zero burst size */ - burst = clamp_t(u32, rate, 1, GENMASK(5, 0)); + burst = clamp_t(u32, burst, 1, GENMASK(5, 0)); ocelot_write_gix(ocelot, QSYS_CIR_CFG_CIR_RATE(rate) | QSYS_CIR_CFG_CIR_BURST(burst), -- 2.25.1
Re: [GIT PULL] thread fixes v5.7-rc5
On Thu, May 14, 2020 at 11:22 AM Christian Brauner wrote: > > Seemed weird to me to change something that's been exposed to userspace for > that long. Well, the internal declarations aren't actually "exposed" to user space - it's not like it's the declaration of the system call, that's separate. And we have done that before: we have had a lot of history of using "unsigned long" to basically mean "register", and then ended up cleaning up types afterwards. In fact, if you look at the macros that do SYSCALL_DEFINE() (hint - don't actually do it, you'll go mad) you'll see that magical __SC_LONG() thing, which actually declares _all_ arguments as either "unsigned long" or "unsigned long long". That's the "native" representation of the low-level system call (it's also marked "asmlinkage" etc). We then end up casting them to the internal representation with that __SC_CAST() macro. So the actual types that we get are intentionally "cleaned up" versions of the raw registers that were passed in. But you really don't want to understand the __SYSCALL_DEFINEx() macro. It's clever, but it really is the Cthulhu of macros. Just looking at it might drive you mad. Linus
[tip: x86/splitlock] x86/split_lock: Add Icelake microserver CPU model
The following commit has been merged into the x86/splitlock branch of tip: Commit-ID: 0ed7bf1d92eafc37bb9eb7c8692a8e44d24f9b99 Gitweb: https://git.kernel.org/tip/0ed7bf1d92eafc37bb9eb7c8692a8e44d24f9b99 Author:Fenghua Yu AuthorDate:Thu, 30 Apr 2020 16:46:35 -07:00 Committer: Borislav Petkov CommitterDate: Thu, 14 May 2020 19:25:10 +02:00 x86/split_lock: Add Icelake microserver CPU model Icelake microserver CPU supports split lock detection while it doesn't have the split lock enumeration bit in IA32_CORE_CAPABILITIES. Enumerate the feature by model number. Signed-off-by: Fenghua Yu Signed-off-by: Borislav Petkov Reviewed-by: Tony Luck Link: https://lkml.kernel.org/r/1588290395-2677-1-git-send-email-fenghua...@intel.com --- arch/x86/kernel/cpu/intel.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index a19a680..b59bc4a 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -1135,6 +1135,7 @@ void switch_to_sld(unsigned long tifn) static const struct x86_cpu_id split_lock_cpu_ids[] __initconst = { X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, 0), X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_L, 0), + X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, 0), X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT,1), X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_D, 1), X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_L, 1),
Re: [PATCH net-next 1/2] dt-bindings: net: dp83822: Add TI dp83822 phy
On Thu, May 14, 2020 at 12:30:54PM -0500, Dan Murphy wrote: > Add a dt binding for the TI dp83822 ethernet phy device. > > CC: Rob Herring > Signed-off-by: Dan Murphy > --- > .../devicetree/bindings/net/ti,dp83822.yaml | 49 +++ > 1 file changed, 49 insertions(+) > create mode 100644 Documentation/devicetree/bindings/net/ti,dp83822.yaml > > diff --git a/Documentation/devicetree/bindings/net/ti,dp83822.yaml > b/Documentation/devicetree/bindings/net/ti,dp83822.yaml > new file mode 100644 > index ..60afd43ad3b6 > --- /dev/null > +++ b/Documentation/devicetree/bindings/net/ti,dp83822.yaml > @@ -0,0 +1,49 @@ > +# SPDX-License-Identifier: (GPL-2.0+ OR BSD-2-Clause) > +# Copyright (C) 2020 Texas Instruments Incorporated > +%YAML 1.2 > +--- > +$id: "http://devicetree.org/schemas/net/ti,dp83822.yaml#; > +$schema: "http://devicetree.org/meta-schemas/core.yaml#; > + > +title: TI DP83822 ethernet PHY > + > +allOf: > + - $ref: "ethernet-controller.yaml#" > + > +maintainers: > + - Dan Murphy > + > +description: | > + The DP83822 is a low-power, single-port, 10/100 Mbps Ethernet PHY. It > + provides all of the physical layer functions needed to transmit and receive > + data over standard, twisted-pair cables or to connect to an external, > + fiber-optic transceiver. Additionally, the DP83822 provides flexibility to > + connect to a MAC through a standard MII, RMII, or RGMII interface Hi Dan You say 10/100 Mbps Ethernet PHY, but then list RGMII? > + > + Specifications about the charger can be found at: > +http://www.ti.com/lit/ds/symlink/dp83822i.pdf > + > +properties: > + reg: > +maxItems: 1 > + > + ti,signal-polarity-low: > +type: boolean > +description: | > + DP83822 PHY in Fiber mode only. > + Sets the DP83822 to detect a link drop condition when the signal goes > + high. If not set then link drop will occur when the signal goes low. Are we talking about the LOS line from the SFP cage? In the SFF/SFP binding we have: - los-gpios : GPIO phandle and a specifier of the Receiver Loss of Signal Indication input gpio signal, active (signal lost) high It would be nice to have a consistent naming. Is it required the LOS signal is connected to the PHY? Russell King has some patches which allows the Marvell PHY to be used as a media converter. In that setting, i think the SFP signals are connected to GPIOs not the PHY. The SFP core can then control the transmit disable, module insertion detection etc. So i'm wondering if you need a property to indicate the LOS is not connected to the PHY? Andrew
Re: [PATCH 3/5] exec: Remove recursion from search_binary_handler
On 5/13/20 4:59 PM, Eric W. Biederman wrote: > Careful with your terminology. ELF sections are for .o's For > executables ELF have segments. And reading through the code it is the > program segments that are independently relocatable. Sorry, I have trouble keeping this stuff straight when it's not in front of me. (I have a paperback copy of the old "linkers and loaders" book and it was the driest thing I have _ever_ slogged through. Back before the Linux Foundation ate the FSG I was pushing https://refspecs.linuxbase.org/ to include missing ABI supplement, I have copies of ones it doesn't collected from now long-dead sites...) But more recently I've just made puppy eyes at Rich Felker to have him fix this stuff for me, because I do _not_ retain the terminology here. REL vs RELA vs PLT, can you have a PLT without a GOT...? > There is a flag but it is defined per architecture and I don't think one > of the architectures define it. They all check for one, but I don't remember there being a #define. I have a todo item to check more architectures' fdpic binaries, this was from sh2eb (ala j-core): https://github.com/landley/toybox/commit/d61aeaf9e#diff-4442ddbb8949R65 There was the out of tree arm fdpic toolchain from the french guys for cortex-m, and the original frv paper, and in theory blackfin but nothing they touched ever got merged upstream anywhere: In _theory_ you could do fdpic for x86, but as with u-boot for x86 nobody ever bothers because it's got an x86-only solution. (And then the x86 version of stuff gets pushed to other platforms because all our device tree files were GPLed so of course acpi for arm became a thing. Sigh...) > I looked at ARM and apparently with an MMU ARM turns fdpic binaries into > PIE executables. I am not certain why. Falling back to a more widely tested codepath, I expect. Also maybe it saves 3 registers if all 4 are using the same base register? Map them linearly and it becomes "single base + offset"? Which of course looses the extra ASLR benefits the security people wanted, but "undoing what the security people want in the name of an unmeasurable microbenchmark optimization" is a proud tradition. Just because the 4 segments are compiled as independently relocatable doesn't mean they HAVE to be. (You'd think the code would be using different register numbers to index stuff so you'd STILL be using 4 registers, but I haven't looked at what arm's doing...) > The registers passed to the entry point are also different for both > cases. >From the same machine code chunks? I boggle at what the ld.so fixup is doing >then... > I think it would have been nice if the fdpic support had used a > different ELF type, instead of a different depending on using a > different architecture. This is what you get when a blackfin developer talks to the gnu/binutils developers: https://sourceware.org/legacy-ml/binutils/2008-04/msg00350.html > All that aside the core dumping code looks to be essentially the same > between binfmt_elf.c and binfmt_elf_fdpic.c. Do you think people would > be interested in refactoring binfmt_elf.c and binfmt_elf_fdpic.c so that > they could share the same core dumping code? I think merging the two of them together entirely would be a good idea, and anything that can collapse together I'm happy to regression test on sh2. I also note that qemu-sh4eb can run these binaries, maybe I can whip up a qemu-system-sh4eb that runs a nommu fdpic userspace... [hours later] Ok, here's me asking Rich Felker a question: >>> So fdpic binaries run under qemu-sh2eb and there's a qemu-system-sh2eb that >>> SHOULD also be able to run them under the r2d board emulation, and the >>> kernel >>> builds fine under the sh2eb compiler but I can't enable fdpic support >>> without >>> CONFIG_NOMMU, and if I yank that dependency from Kconfig (which only sh2 >>> has, >>> arm and such do fdpic with or without mmu) the build breaks with: >>> >>> /home/landley/toybox/clean/ccc/sh2eb-linux-muslfdpic-cross/bin/sh2eb-linux-muslfdpic-ld: >>> fs/binfmt_elf_fdpic.o: in function `load_elf_fdpic_binary': >>> binfmt_elf_fdpic.c:(.text+0x1734): undefined reference to >>> `elf_fdpic_arch_lay_out_mm' >>> >>> The problem is if I switch off CONFIG_MMU in the kernel, buckets of stuff >>> in the >>> r2d board kernel config changes and suddenly I don't get serial output from >>> the >>> qemu-system-sh2eb -M r2d boot anymore. Before it was running the kernel but >>> just >>> failing to run init... And his response: >> I don't think qemu-system-sh4eb can boot a nommu kernel. But you don't >> need to in order to do userspace-only testing. Just build a normal >> sh4eb kernel. It doesn't need CONFIG_BINFMT_ELF_FDPIC. The normal ELF >> loader can load FDPIC just fine, because a valid FDPIC ELF file is a >> valid ELF file, just with more constraints (in same sense a square is >> a rectangle). The normal ELF loader won't independently float the text >> and data segments, but that's okay
Re: [PATCH] coresight: etm4x: Add support to disable trace unit power up
Hi Mathieu, On 2020-05-14 23:30, Mathieu Poirier wrote: Good morning Sai, On Thu, May 14, 2020 at 04:29:15PM +0530, Sai Prakash Ranjan wrote: From: Tingwei Zhang On some Qualcomm Technologies Inc. SoCs like SC7180, there exists a hardware errata where the APSS (Application Processor SubSystem)/CPU watchdog counter is stopped when ETM register TRCPDCR.PU=1. Fun stuff... Yes :) Since the ETMs share the same power domain as that of respective CPU cores, they are powered on when the CPU core is powered on. So we can disable powering up of the trace unit after checking for this errata via new property called "qcom,tupwr-disable". Signed-off-by: Tingwei Zhang Co-developed-by: Sai Prakash Ranjan Signed-off-by: Sai Prakash Ranjan Co-developed-by: Sai Prakash Ranjan Signed-off-by: Tingwei Zhang Tingwei is the author, so if I understand correctly, his signed-off-by should appear first, am I wrong? --- .../devicetree/bindings/arm/coresight.txt | 6 drivers/hwtracing/coresight/coresight-etm4x.c | 29 --- Please split in two patches. Sure, I will split the dt-binding into separate patch, checkpatch did warn. 2 files changed, 25 insertions(+), 10 deletions(-) diff --git a/Documentation/devicetree/bindings/arm/coresight.txt b/Documentation/devicetree/bindings/arm/coresight.txt index 846f6daae71b..d2030128fe46 100644 --- a/Documentation/devicetree/bindings/arm/coresight.txt +++ b/Documentation/devicetree/bindings/arm/coresight.txt @@ -108,6 +108,12 @@ its hardware characteristcs. * arm,cp14: must be present if the system accesses ETM/PTM management registers via co-processor 14. + * qcom,tupwr-disable: boolean. Indicates that trace unit power up can + be disabled on Qualcomm Technologies Inc. systems where ETMs are in + the same power domain as their CPU cores. This property is required + to identify such systems with hardware errata where the CPU watchdog + counter is stopped when TRCPDCR.PU=1. + I think something like "qcom,skip-power-up" would be clearer. Also, a better choice of words is that TRCPDCR.PU does not have to be set on Qualcomm... Yes "qcom,skip-power-up" is a lot better, thanks. Also will use something as you suggested for description. * Optional property for TMC: * arm,buffer-size: size of contiguous buffer space for TMC ETR diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c index fb0f5f4f3a91..6886b44f6947 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.c +++ b/drivers/hwtracing/coresight/coresight-etm4x.c @@ -104,6 +104,11 @@ struct etm4_enable_arg { int rc; }; +static inline bool etm4_can_disable_tupwr(struct device *dev) +{ + return fwnode_property_present(dev_fwnode(dev), "qcom,tupwr-disable"); +} + Please call fwnode_property_present() at initialisation time to set a new drvdata::skip_power_up variable. From there just switch on that in etm4_enable/disable_hw(). Will do, thanks. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation