date:20201208

Re: [PATCH] KVM/SVM: add support for SEV attestation command

2020-12-08 Thread Ard Biesheuvel

On Fri, 4 Dec 2020 at 22:30, Brijesh Singh  wrote:
>
> The SEV FW version >= 0.23 added a new command that can be used to query
> the attestation report containing the SHA-256 digest of the guest memory
> encrypted through the KVM_SEV_LAUNCH_UPDATE_{DATA, VMSA} commands and
> sign the report with the Platform Endorsement Key (PEK).
>
> See the SEV FW API spec section 6.8 for more details.
>
> Note there already exist a command (KVM_SEV_LAUNCH_MEASURE) that can be
> used to get the SHA-256 digest. The main difference between the
> KVM_SEV_LAUNCH_MEASURE and KVM_SEV_ATTESTATION_REPORT is that the later

latter

> can be called while the guest is running and the measurement value is
> signed with PEK.
>
> Cc: James Bottomley 
> Cc: Tom Lendacky 
> Cc: David Rientjes 
> Cc: Paolo Bonzini 
> Cc: Sean Christopherson 
> Cc: Borislav Petkov 
> Cc: John Allen 
> Cc: Herbert Xu 
> Cc: linux-cry...@vger.kernel.org
> Signed-off-by: Brijesh Singh 
> ---
>  .../virt/kvm/amd-memory-encryption.rst| 21 ++
>  arch/x86/kvm/svm/sev.c| 71 +++
>  drivers/crypto/ccp/sev-dev.c  |  1 +
>  include/linux/psp-sev.h   | 17 +
>  include/uapi/linux/kvm.h  |  8 +++
>  5 files changed, 118 insertions(+)
>
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst 
> b/Documentation/virt/kvm/amd-memory-encryption.rst
> index 09a8f2a34e39..4c6685d0fddd 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -263,6 +263,27 @@ Returns: 0 on success, -negative on error
>  __u32 trans_len;
>  };
>
> +10. KVM_SEV_GET_ATTESATION_REPORT

KVM_SEV_GET_ATTESTATION_REPORT

> +-
> +
> +The KVM_SEV_GET_ATTESATION_REPORT command can be used by the hypervisor to 
> query the attestation

KVM_SEV_GET_ATTESTATION_REPORT

> +report containing the SHA-256 digest of the guest memory and VMSA passed 
> through the KVM_SEV_LAUNCH
> +commands and signed with the PEK. The digest returned by the command should 
> match the digest
> +used by the guest owner with the KVM_SEV_LAUNCH_MEASURE.
> +
> +Parameters (in): struct kvm_sev_attestation
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +
> +struct kvm_sev_attestation_report {
> +__u8 mnonce[16];/* A random mnonce that will be 
> placed in the report */
> +
> +__u64 uaddr;/* userspace address where the 
> report should be copied */
> +__u32 len;
> +};
> +
>  References
>  ==
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 566f4d18185b..c4d3ee6be362 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -927,6 +927,74 @@ static int sev_launch_secret(struct kvm *kvm, struct 
> kvm_sev_cmd *argp)
> return ret;
>  }
>
> +static int sev_get_attestation_report(struct kvm *kvm, struct kvm_sev_cmd 
> *argp)
> +{
> +   void __user *report = (void __user *)(uintptr_t)argp->data;
> +   struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +   struct sev_data_attestation_report *data;
> +   struct kvm_sev_attestation_report params;
> +   void __user *p;
> +   void *blob = NULL;
> +   int ret;
> +
> +   if (!sev_guest(kvm))
> +   return -ENOTTY;
> +
> +   if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> sizeof(params)))
> +   return -EFAULT;
> +
> +   data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
> +   if (!data)
> +   return -ENOMEM;
> +
> +   /* User wants to query the blob length */
> +   if (!params.len)
> +   goto cmd;
> +
> +   p = (void __user *)(uintptr_t)params.uaddr;
> +   if (p) {
> +   if (params.len > SEV_FW_BLOB_MAX_SIZE) {
> +   ret = -EINVAL;
> +   goto e_free;
> +   }
> +
> +   ret = -ENOMEM;
> +   blob = kmalloc(params.len, GFP_KERNEL);
> +   if (!blob)
> +   goto e_free;
> +
> +   data->address = __psp_pa(blob);
> +   data->len = params.len;
> +   memcpy(data->mnonce, params.mnonce, sizeof(params.mnonce));
> +   }
> +cmd:
> +   data->handle = sev->handle;
> +   ret = sev_issue_cmd(kvm, SEV_CMD_ATTESTATION_REPORT, data,
> &argp->error);
> +   /*
> +* If we query the session length, FW responded with expected data.
> +*/
> +   if (!params.len)
> +   goto done;
> +
> +   if (ret)
> +   goto e_free_blob;
> +
> +   if (blob) {
> +   if (copy_to_user(p, blob, params.len))
> +   ret = -EFAULT;
> +   }
> +
> +done:
> +   params.len = data->len;
> +   if (copy_to_user(report, &params, sizeof(params)))
> +   ret = -EFAULT;
> +e_free_blob:
>
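The quoted handler supports a two-pass calling convention: a call with len == 0 asks for the required blob length, and a second call supplies a buffer of that size. The following is only a userspace mock of that convention, not the KVM ioctl itself. mock_get_report(), struct mock_report_params, MOCK_REPORT_SIZE and the -1 return value of the sizing call are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MOCK_REPORT_SIZE 8u /* hypothetical report size */

/* Hypothetical stand-in for struct kvm_sev_attestation_report. */
struct mock_report_params {
    uint8_t mnonce[16];
    void *buf;
    uint32_t len;
};

/*
 * Mock "firmware" call: if the buffer is missing or too small, report the
 * required length in p->len and fail; otherwise copy the report out.
 */
static int mock_get_report(struct mock_report_params *p)
{
    static const uint8_t report[MOCK_REPORT_SIZE] = "REPORT!";

    if (!p->buf || p->len < MOCK_REPORT_SIZE) {
        p->len = MOCK_REPORT_SIZE;
        return -1;
    }
    memcpy(p->buf, report, MOCK_REPORT_SIZE);
    p->len = MOCK_REPORT_SIZE;
    return 0;
}

/* Caller side: first query the length, then fetch into the caller's buffer. */
static int fetch_report(uint8_t *buf, uint32_t cap, uint32_t *out_len)
{
    struct mock_report_params p;

    memset(&p, 0, sizeof(p));
    (void)mock_get_report(&p); /* pass 1: len == 0, only p.len is of interest */
    if (p.len == 0 || p.len > cap)
        return -1;

    p.buf = buf; /* pass 2: p.len now holds the required size */
    if (mock_get_report(&p) != 0)
        return -1;

    *out_len = p.len;
    return 0;
}
```

A caller would pass a buffer and its capacity to fetch_report() and read the actual report length back through out_len.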

Re: [PATCH] bcache: consider the fragmentation when update the writeback rate

2020-12-08 Thread Dongsheng Yang




On Wednesday, 2020/12/9 at 12:48 PM, Dongdong Tao wrote:

Hi Dongsheng,

I'm working on it. As the next step, I'm gathering some test data and
will upload it (very sorry for the delay...).
Thanks for the comment.
One of the main concerns in alleviating this issue through the writeback
process is that we need to minimize the impact on client IO
performance.
writeback_percent is 10 by default; starting writeback when dirty buckets
reach 10 percent might be a bit too aggressive, as
writeback_cutoff_sync is 70 percent.
So I chose to start the writeback when dirty buckets reach 50
percent, so that this patch only takes effect once the dirty bucket
percentage is above that.


Agreed, reusing writeback_percent would be too aggressive, and it would
also be less flexible.


Okay, let's wait for your test results.


Thanks



Thanks,
Dongdong




On Wed, Dec 9, 2020 at 10:27 AM Dongsheng Yang
 wrote:


On Tuesday, 2020/11/3 at 8:42 PM, Dongdong Tao wrote:

From: dongdong tao 

The current way of calculating the writeback rate only considers the
dirty sectors. This usually works fine when fragmentation is not
high, but it gives an unreasonably small rate when very few dirty
sectors consume a lot of dirty buckets. In some cases, the dirty
buckets can reach CUTOFF_WRITEBACK_SYNC while the dirty data (sectors)
has not even reached writeback_percent; the writeback rate will still
be the minimum value (4k), so all writes get stuck in non-writeback
mode because of the slow writeback.

This patch tries to accelerate the writeback rate when fragmentation
is high. It calculates the proportional_scaled value as follows:
(dirty_sectors / writeback_rate_p_term_inverse) * fragment
As we can see, higher fragmentation results in a larger
proportional_scaled value and thus a larger writeback rate.
The fragment value is calculated as follows:
(dirty_buckets *  bucket_size) / dirty_sectors
The value of fragment always lies within [1, bucket_size], because
each dirty bucket holds at least one and at most bucket_size dirty
sectors.

This patch only considers fragmentation once the number of
dirty buckets reaches a dirty threshold (configurable via
writeback_fragment_percent, default 50), so bcache keeps
its original behaviour until the dirty buckets reach
the threshold.
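The two formulas in the commit message can be tried out with a small standalone sketch. It is only an illustration of the arithmetic as written above, not the code from the patch. The function names and the sample numbers are made up, and integer division is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* fragment = (dirty_buckets * bucket_size) / dirty_sectors */
static uint64_t demo_fragment(uint64_t dirty_buckets, uint64_t bucket_size,
                              uint64_t dirty_sectors)
{
    assert(dirty_sectors > 0); /* avoid dividing by zero in this sketch */
    return (dirty_buckets * bucket_size) / dirty_sectors;
}

/* proportional_scaled = (dirty_sectors / p_term_inverse) * fragment */
static uint64_t demo_proportional_scaled(uint64_t dirty_sectors,
                                         uint64_t p_term_inverse,
                                         uint64_t fragment)
{
    assert(p_term_inverse > 0);
    return (dirty_sectors / p_term_inverse) * fragment;
}
```

With a bucket size of 1024 sectors, 1000 dirty buckets holding only 8000 dirty sectors give a fragment of 128, while 8 completely full buckets (8192 dirty sectors) give a fragment of 1, so only the fragmented case scales the rate up.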

Signed-off-by: dongdong tao 
---
   drivers/md/bcache/bcache.h|  1 +
   drivers/md/bcache/sysfs.c |  6 ++
   drivers/md/bcache/writeback.c | 21 +
   3 files changed, 28 insertions(+)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 1d57f48307e6..87632f7032b6 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -374,6 +374,7 @@ struct cached_dev {
   unsigned intwriteback_metadata:1;
   unsigned intwriteback_running:1;
   unsigned char   writeback_percent;
+ unsigned char   writeback_fragment_percent;
   unsigned intwriteback_delay;

   uint64_twriteback_rate_target;
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 554e3afc9b68..69499113aef8 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -115,6 +115,7 @@ rw_attribute(stop_when_cache_set_failed);
   rw_attribute(writeback_metadata);
   rw_attribute(writeback_running);
   rw_attribute(writeback_percent);
+rw_attribute(writeback_fragment_percent);


Hi Dongdong and Coly,

  What is the status of this patch? In my opinion, it is a problem
we need to solve,

but can we just reuse the writeback_percent parameter, rather than
introducing a new writeback_fragment_percent?

That would mean writeback_percent applies to both the dirty data
percentage and the dirty bucket percentage.

When we find that there are more dirty buckets than (c->nbuckets *
writeback_percent), start the writeback.


Thanks

Yang


   rw_attribute(writeback_delay);
   rw_attribute(writeback_rate);

@@ -197,6 +198,7 @@ SHOW(__bch_cached_dev)
   var_printf(writeback_running,   "%i");
   var_print(writeback_delay);
   var_print(writeback_percent);
+ var_print(writeback_fragment_percent);
   sysfs_hprint(writeback_rate,
wb ? atomic_long_read(&dc->writeback_rate.rate) << 9 : 0);
   sysfs_printf(io_errors, "%i", atomic_read(&dc->io_errors));
@@ -308,6 +310,9 @@ STORE(__cached_dev)
   sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent,
   0, bch_cutoff_writeback);

+ sysfs_strtoul_clamp(writeback_fragment_percent, 
dc->writeback_fragment_percent,
+ 0, bch_cutoff_writeback_sync);
+
   if (attr == &sysfs_writeback_rate) {
   ssize_t ret;
   long int v = atomic_long_read(&dc->writeback_rate.rate);
@@ -498,6 +503,7 @@ static struct attribute *bch_cached_dev_files[] = {
   &sysfs_writeback_running,
   &sysfs_writeback_delay,
   &sysfs_writeback_percent,
+ &sysfs_writeback_fragment_percent,
   &sysfs_writeback_rate,

[PATCH] media: MAINTAINERS: correct entry in Amlogic GE2D driver section

2020-12-08 Thread Lukas Bulwahn

Commit aa821b2b9269 ("media: MAINTAINERS: Add myself as maintainer of the
Amlogic GE2D driver") introduced a new MAINTAINERS section, but the file
entry points to the wrong location.

Hence, ./scripts/get_maintainer.pl --self-test=patterns warns:

  warning: no file matches    F:    drivers/media/meson/ge2d/

Adjust the entry to the actual location of the driver.

Signed-off-by: Lukas Bulwahn 
---
applies on next-20201208, not on current master

Neil, please ack.
Hans, Mauro, please pick this minor non-urgent fix-up for your -next tree.

 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5b20babb9f7b..d66bf50fc640 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11520,7 +11520,7 @@ L:  linux-amlo...@lists.infradead.org
 S: Supported
 T: git git://linuxtv.org/media_tree.git
 F: Documentation/devicetree/bindings/media/amlogic,axg-ge2d.yaml
-F: drivers/media/meson/ge2d/
+F: drivers/media/platform/meson/ge2d/
 
 MESON NAND CONTROLLER DRIVER FOR AMLOGIC SOCS
 M: Liang Yang 
-- 
2.17.1

Re: [PATCH v1 1/1] scsi: ufs: Fix ufs power down/on specs violation

2020-12-08 Thread ziqichen




Hi Can,

On 2020-12-09 15:27, Can Guo wrote:

On 2020-12-09 15:09, Ziqi Chen wrote:

As per specs, e.g. JESD220E chapter 7.2, while powering
off/on the UFS device, the RST_N and REF_CLK signals
should be between VSS (Ground) and VCCQ/VCCQ2.

Power down:
1. Assert RST_N low
2. Turn off REF_CLK
3. Turn off VCC
4. Turn off VCCQ/VCCQ2

Power on:
1. Turn on VCC
2. Turn on VCCQ/VCCQ2
3. Turn on REF_CLK
4. Deassert RST_N high
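The ordering above can be checked with a toy model. This is a sketch under a simplified reading of the requirement, namely that REF_CLK must be off and RST_N must be low whenever either supply is off. The enum, struct and function names are hypothetical and unrelated to the ufshcd driver.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_step {
    DEMO_RST_N_LOW, DEMO_RST_N_HIGH,
    DEMO_REF_CLK_OFF, DEMO_REF_CLK_ON,
    DEMO_VCC_OFF, DEMO_VCC_ON,
    DEMO_VCCQ_OFF, DEMO_VCCQ_ON,
};

struct demo_rails {
    bool vcc;
    bool vccq;
    bool ref_clk;
    bool rst_n_high;
};

static void demo_apply(struct demo_rails *r, enum demo_step s)
{
    switch (s) {
    case DEMO_RST_N_LOW:   r->rst_n_high = false; break;
    case DEMO_RST_N_HIGH:  r->rst_n_high = true;  break;
    case DEMO_REF_CLK_OFF: r->ref_clk = false;    break;
    case DEMO_REF_CLK_ON:  r->ref_clk = true;     break;
    case DEMO_VCC_OFF:     r->vcc = false;        break;
    case DEMO_VCC_ON:      r->vcc = true;         break;
    case DEMO_VCCQ_OFF:    r->vccq = false;       break;
    case DEMO_VCCQ_ON:     r->vccq = true;        break;
    }
}

/* Assumed invariant: signals stay quiet while any supply is off. */
static bool demo_signals_ok(const struct demo_rails *r)
{
    if (r->vcc && r->vccq)
        return true;
    return !r->ref_clk && !r->rst_n_high;
}

/* Apply each step in order and check the invariant after every step. */
static bool demo_sequence_ok(struct demo_rails start,
                             const enum demo_step *seq, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        demo_apply(&start, seq[i]);
        if (!demo_signals_ok(&start))
            return false;
    }
    return true;
}
```

Both listed sequences pass this check, while a sequence that turns off VCC before REF_CLK and RST_N are quiet fails at its first step.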

Signed-off-by: Ziqi Chen 
---
 drivers/scsi/ufs/ufs-qcom.c | 14 ++
 drivers/scsi/ufs/ufshcd.c   | 19 +--
 drivers/scsi/ufs/ufshcd.h   |  4 ++--
 3 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index 1e434cc..5ed3a63d 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -582,6 +582,9 @@ static int ufs_qcom_suspend(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
ufs_qcom_disable_lane_clks(host);
phy_power_off(phy);

+   if (hba->vops && hba->vops->device_reset)
+   hba->vops->device_reset(hba, false);
+


Instead of doing the pull-down in ufshcd_vops_suspend(), can we do
it in ufshcd_suspend(), since it is a common problem for all SoC
vendors?


Sure, agree, Thanks.




} else if (!ufs_qcom_is_link_active(hba)) {
ufs_qcom_disable_lane_clks(host);
}
@@ -1400,10 +1403,11 @@ static void ufs_qcom_dump_dbg_regs(struct 
ufs_hba *hba)

 /**
  * ufs_qcom_device_reset() - toggle the (optional) device reset line
  * @hba: per-adapter instance
+ * @toggle: need pulling up or not
  *
  * Toggles the (optional) reset line to reset the attached device.
  */
-static int ufs_qcom_device_reset(struct ufs_hba *hba)
+static int ufs_qcom_device_reset(struct ufs_hba *hba, bool toggle)
 {
struct ufs_qcom_host *host = ufshcd_get_variant(hba);

@@ -1416,10 +1420,12 @@ static int ufs_qcom_device_reset(struct 
ufs_hba *hba)

 * be on the safe side.
 */
gpiod_set_value_cansleep(host->device_reset, 1);
-   usleep_range(10, 15);

-   gpiod_set_value_cansleep(host->device_reset, 0);
-   usleep_range(10, 15);
+   if (toggle) {
+   usleep_range(10, 15);
+   gpiod_set_value_cansleep(host->device_reset, 0);
+   usleep_range(10, 15);
+   }

return 0;
 }
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 92d433d..5ab1c02 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -8633,8 +8633,6 @@ static int ufshcd_suspend(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
if (ret)
goto set_dev_active;

-   ufshcd_vreg_set_lpm(hba);
-
 disable_clks:
/*
 	 * Call vendor specific suspend callback. As these callbacks may 
access

@@ -8664,6 +8662,7 @@ static int ufshcd_suspend(struct ufs_hba *hba,
enum ufs_pm_op pm_op)

/* Put the host controller in low power mode if possible */
ufshcd_hba_vreg_set_lpm(hba);
+   ufshcd_vreg_set_lpm(hba);


Can you put ufshcd_vreg_set_lpm() before ufshcd_hba_vreg_set_lpm()?


Sure, thanks.




goto out;

 set_link_active:
@@ -8729,18 +8728,18 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
old_link_state = hba->uic_link_state;

ufshcd_hba_vreg_set_hpm(hba);
+   ret = ufshcd_vreg_set_hpm(hba);
+   if (ret)
+   goto out;
+
/* Make sure clocks are enabled before accessing controller */
ret = ufshcd_setup_clocks(hba, true);
if (ret)
-   goto out;
+   goto disable_vreg;

/* enable the host irq as host controller would be active soon */
ufshcd_enable_irq(hba);

-   ret = ufshcd_vreg_set_hpm(hba);
-   if (ret)
-   goto disable_irq_and_vops_clks;
-
/*
 	 * Call vendor specific resume callback. As these callbacks may 
access

 * vendor specific host controller register space call them when the
@@ -8748,7 +8747,7 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
 */
ret = ufshcd_vops_resume(hba, pm_op);
if (ret)
-   goto disable_vreg;
+   goto disable_irq_and_vops_clks;

 	/* For DeepSleep, the only supported option is to have the link off 
*/
 	WARN_ON(ufshcd_is_ufs_dev_deepsleep(hba) && 
!ufshcd_is_link_off(hba));

@@ -8815,8 +8814,6 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
ufshcd_link_state_transition(hba, old_link_state, 0);
 vendor_suspend:
ufshcd_vops_suspend(hba, pm_op);
-disable_vreg:
-   ufshcd_vreg_set_lpm(hba);
 disable_irq_and_vops_clks:
ufshcd_disable_irq(hba);
if (hba->clk_scaling.is_allowed)
@@ -8827,6 +8824,8 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
trace_ufshcd_clk_gating(dev_name(hba->dev),

drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:34:6: warning: no previous prototype for 'ia_css_isys_ibuf_rmgr_init'

2020-12-08 Thread kernel test robot

Hi Mauro,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   a68a0262abdaa251e12c53715f48e698a18ef402
commit: 5b552b198c2557295becd471bff53bb520fefee5 media: atomisp: re-enable 
warnings again
date:   6 months ago
config: i386-randconfig-a003-20200826 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b552b198c2557295becd471bff53bb520fefee5
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 5b552b198c2557295becd471bff53bb520fefee5
# save the attached .config to linux build tree
make W=1 ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   cc1: warning: 
drivers/staging/media/atomisp//pci/hive_isp_css_include/memory_access/: No such 
file or directory [-Wmissing-include-dirs]
>> drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:34:6: 
>> warning: no previous prototype for 'ia_css_isys_ibuf_rmgr_init' 
>> [-Wmissing-prototypes]
  34 | void ia_css_isys_ibuf_rmgr_init(void)
 |  ^~
>> drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:40:6: 
>> warning: no previous prototype for 'ia_css_isys_ibuf_rmgr_uninit' 
>> [-Wmissing-prototypes]
  40 | void ia_css_isys_ibuf_rmgr_uninit(void)
 |  ^~~~
>> drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:46:6: 
>> warning: no previous prototype for 'ia_css_isys_ibuf_rmgr_acquire' 
>> [-Wmissing-prototypes]
  46 | bool ia_css_isys_ibuf_rmgr_acquire(
 |  ^
>> drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:106:6: 
>> warning: no previous prototype for 'ia_css_isys_ibuf_rmgr_release' 
>> [-Wmissing-prototypes]
 106 | void ia_css_isys_ibuf_rmgr_release(
 |  ^
   In file included from 
drivers/staging/media/atomisp//pci/input_system_local.h:10,
from 
drivers/staging/media/atomisp//pci/hive_isp_css_include/input_system.h:34,
from 
drivers/staging/media/atomisp//pci/runtime/isys/interface/ia_css_isys.h:20,
from 
drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:20:
   drivers/staging/media/atomisp//pci/isp2400_input_system_local.h:402:26: 
warning: 'SUB_SYSTEM_OFFSET' defined but not used [-Wunused-const-variable=]
 402 | static const hrt_address SUB_SYSTEM_OFFSET[N_SUB_SYSTEM_ID] = {
 |  ^
   drivers/staging/media/atomisp//pci/isp2400_input_system_local.h:391:30: 
warning: 'MIPI_PORT_LANES' defined but not used [-Wunused-const-variable=]
 391 | static const mipi_lane_cfg_t 
MIPI_PORT_LANES[N_RX_MODE][N_MIPI_PORT_ID] = {
 |  ^~~
   drivers/staging/media/atomisp//pci/isp2400_input_system_local.h:380:19: 
warning: 'MIPI_PORT_ACTIVE' defined but not used [-Wunused-const-variable=]
 380 | static const bool MIPI_PORT_ACTIVE[N_RX_MODE][N_MIPI_PORT_ID] = {
 |   ^~~~
   drivers/staging/media/atomisp//pci/isp2400_input_system_local.h:374:30: 
warning: 'MIPI_PORT_MAXLANES' defined but not used [-Wunused-const-variable=]
 374 | static const mipi_lane_cfg_t MIPI_PORT_MAXLANES[N_MIPI_PORT_ID] = {
 |  ^~
   drivers/staging/media/atomisp//pci/isp2400_input_system_local.h:368:26: 
warning: 'MIPI_PORT_OFFSET' defined but not used [-Wunused-const-variable=]
 368 | static const hrt_address MIPI_PORT_OFFSET[N_MIPI_PORT_ID] = {
 |  ^~~~
   In file included from drivers/staging/media/atomisp//pci/system_local.h:10,
from 
drivers/staging/media/atomisp//pci/hive_isp_css_include/input_system.h:33,
from 
drivers/staging/media/atomisp//pci/runtime/isys/interface/ia_css_isys.h:20,
from 
drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:20:
   drivers/staging/media/atomisp//pci/isp2400_system_local.h:178:26: warning: 
'RX_BASE' defined but not used [-Wunused-const-variable=]
 178 | static const hrt_address RX_BASE[N_RX_ID] = {
 |  ^~~
   drivers/staging/media/atomisp//pci/isp2400_system_local.h:163:26: warning: 
'INPUT_SYSTEM_BASE' defined but not used [-Wunused-const-variable=]
 163 | static const hrt_address INPUT_SYSTEM_BASE[N_INPUT_SYSTEM_ID] = {
 |  ^
   drivers/staging/media/atomisp//pci/isp2400_system_local.h:155:26: warning: 
'INPUT_FORMATTER_BASE'
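For reference, a -Wmissing-prototypes warning of the kind reported above generally goes away when a declaration is visible before the definition of a non-static function (normally through a shared header), or when a file-local function is made static. The sketch below uses hypothetical example_rmgr_* names rather than the atomisp symbols, and it places the declarations inline where a real tree would use a header.

```c
#include <assert.h>
#include <stdbool.h>

/* These declarations would normally live in a header that both the
 * defining file and its callers include. */
void example_rmgr_init(void);
void example_rmgr_uninit(void);
bool example_rmgr_is_ready(void);

static bool ready;

/* File-local helper: static, so no separate prototype is expected. */
static void reset_state(void)
{
    ready = false;
}

void example_rmgr_init(void)
{
    reset_state();
    ready = true;
}

void example_rmgr_uninit(void)
{
    assert(ready); /* uninit without a prior init would be a caller bug */
    reset_state();
}

bool example_rmgr_is_ready(void)
{
    return ready;
}
```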

Re: [PATCH 1/1] crypto: Fix possible buffer overflows in pkey_protkey_aes_attr_read

2020-12-08 Thread Harald Freudenberger

On 09.12.20 07:47, Xiaohui Zhang wrote:
> From: Zhang Xiaohui 
>
> pkey_protkey_aes_attr_read() calls memcpy() without checking the
> destination size may trigger a buffer overflower.
>
> Signed-off-by: Zhang Xiaohui 
> ---
>  drivers/s390/crypto/pkey_api.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/s390/crypto/pkey_api.c b/drivers/s390/crypto/pkey_api.c
> index 99cb60ea6..abc237130 100644
> --- a/drivers/s390/crypto/pkey_api.c
> +++ b/drivers/s390/crypto/pkey_api.c
> @@ -1589,6 +1589,8 @@ static ssize_t pkey_protkey_aes_attr_read(u32 keytype, 
> bool is_xts, char *buf,
>   if (rc)
>   return rc;
>  
> + if (protkey.len > MAXPROTKEYSIZE)
> + protkey.len = MAXPROTKEYSIZE;
>   protkeytoken.len = protkey.len;
>   memcpy(&protkeytoken.protkey, &protkey.protkey, protkey.len);
>  
> @@ -1599,6 +1601,8 @@ static ssize_t pkey_protkey_aes_attr_read(u32 keytype, 
> bool is_xts, char *buf,
>   if (rc)
>   return rc;
>  
> + if (protkey.len > MAXPROTKEYSIZE)
> + protkey.len = MAXPROTKEYSIZE;
>   protkeytoken.len = protkey.len;
>   memcpy(&protkeytoken.protkey, &protkey.protkey, protkey.len);
>  
Thanks, Xiaohui,
but one rule within the kernel is to trust other internal functions to do
the right thing.
Usually the API parameters are checked only on entry into the kernel; within
the kernel, each function trusts the others and no further parameter checks
are done. Otherwise, endless checks of input parameters would take place,
which would kill performance.
As you can see, the protkey object is filled in by pkey_genprotkey(), which
is called just two lines above. It is an internal function that the module
should trust, so I don't think an additional length check is needed here.
However, thanks for your contribution.
Harald Freudenberger
see this function calls another function in the very same file and
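The convention described in this reply, validating once at the entry point and trusting internal helpers afterwards, can be sketched as follows. This is a generic userspace illustration, not pkey_api.c. DEMO_MAX_KEY, struct demo_key and the demo_* functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_KEY 64 /* hypothetical limit, not MAXPROTKEYSIZE */

struct demo_key {
    size_t len;
    unsigned char data[DEMO_MAX_KEY];
};

/* Internal producer: callers guarantee len <= DEMO_MAX_KEY. */
static void demo_gen_key(struct demo_key *key, size_t len)
{
    /* Compiled out with NDEBUG; documents the contract, not a re-check. */
    assert(len <= DEMO_MAX_KEY);
    key->len = len;
    memset(key->data, 0xAB, len);
}

/* Internal consumer: trusts src->len because only demo_gen_key() sets it. */
static void demo_copy_key(struct demo_key *dst, const struct demo_key *src)
{
    dst->len = src->len;
    memcpy(dst->data, src->data, src->len);
}

/* Entry point: the only place where untrusted input is validated. */
static int demo_entry(size_t user_len, struct demo_key *out)
{
    struct demo_key tmp;

    if (user_len == 0 || user_len > DEMO_MAX_KEY)
        return -1;

    demo_gen_key(&tmp, user_len);
    demo_copy_key(out, &tmp);
    return 0;
}
```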

Re: [PATCH v6 1/4] dt-bindings: soc: imx8m: add DT Binding doc for soc unique ID

2020-12-08 Thread Krzysztof Kozlowski

On Wed, Dec 09, 2020 at 02:30:22AM +, Alice Guo (OSS) wrote:
> Gentle ping..  and Krzysztof Kozlowski, do you agree?

I did not know that you were waiting for something from my side.

> 
> Best Regards,
> Alice Guo
> 
> > -Original Message-
> > From: linux-arm-kernel  On
> > Behalf Of Alice Guo (OSS)
> > Sent: 2020年12月1日 11:31
> > To: Rob Herring ; Krzysztof Kozlowski ;
> > shawn...@kernel.org
> > Cc: devicet...@vger.kernel.org; Peng Fan ;
> > s.ha...@pengutronix.de; linux-kernel@vger.kernel.org; k...@kernel.org;
> > dl-linux-imx ; linux-arm-ker...@lists.infradead.org
> > Subject: RE: [PATCH v6 1/4] dt-bindings: soc: imx8m: add DT Binding doc for 
> > soc
> > unique ID
> > 
> > 
> > 
> > > -Original Message-
> > > From: linux-arm-kernel 
> > > On Behalf Of Rob Herring
> > > Sent: December 1, 2020 5:57
> > > To: Alice Guo 
> > > Cc: devicet...@vger.kernel.org; Peng Fan ;
> > > s.ha...@pengutronix.de; linux-kernel@vger.kernel.org; k...@kernel.org;
> > > dl-linux-imx ; shawn...@kernel.org;
> > > linux-arm-ker...@lists.infradead.org
> > > Subject: Re: [PATCH v6 1/4] dt-bindings: soc: imx8m: add DT Binding
> > > doc for soc unique ID
> > >
> > > On Tue, Nov 24, 2020 at 09:59:46AM +0800, Alice Guo wrote:
> > > > Add DT Binding doc for the Unique ID of i.MX 8M series.
> > > >
> > > > Signed-off-by: Alice Guo 
> > > > ---
> > > >
> > > > v2: remove the subject prefix "LF-2571-1"
> > > > v3: put it into Documentation/devicetree/bindings/arm/fsl.yaml
> > >
> > > No, I prefer this be a separate schema file and not clutter board/soc
> > > schemas with child nodes.
> > 
> > Hi,
> > Thank you for your comments. I read
> > "Documentation/devicetree/bindings/arm/arm,realview.yaml"
> > in which there is a "soc". So I added my "soc" to this current file. Can I 
> > keep it in
> > Documentation/devicetree/bindings/arm/fsl.yaml?

Please go with Rob's suggestion.

Best regards,
Krzysztof

RE: [PATCH v3 2/3] scsi: ufs: Keep device active mode only fWriteBoosterBufferFlushDuringHibernate == 1

2020-12-08 Thread Avri Altman

> From: Bean Huo 
> 
> According to the JEDEC UFS 3.1 Spec, If
> fWriteBoosterBufferFlushDuringHibernate
> is set to one, the device flushes the WriteBooster Buffer data automatically
> whenever the link enters the hibernate (HIBERN8) state. While the flushing
> operation is in progress, the device should be kept in Active power mode.
> Currently, we set this flag during the UFSHCD probe stage, but we didn't deal
> with its programming failure. Even this failure is less likely to occur, but
> still it is possible.
How about reading it on every ufshcd_wb_need_flush?

Thanks,
Avri

Re: Fair Pay: Some interesting observations of symboldevelopment, Uni / I-T

2020-12-08 Thread Ywe Cærlyn

I updated name now also, the ultimate name, Dian X - encouraging correct 
symbol interaction in all languages!


Serene Greetings,
Ywe Cærlyn
https://www.youtube.com/channel/UCqt17eaSO66UV4xvIYJvD4g

Re: [External] Re: [PATCH v7 05/15] mm/bootmem_info: Introduce {free,prepare}_vmemmap_page()

2020-12-08 Thread Muchun Song

On Mon, Dec 7, 2020 at 8:39 PM David Hildenbrand  wrote:
>
> On 30.11.20 16:18, Muchun Song wrote:
> > In the later patch, we can use the free_vmemmap_page() to free the
> > unused vmemmap pages and initialize a page for vmemmap page using
> > via prepare_vmemmap_page().
> >
> > Signed-off-by: Muchun Song 
> > ---
> >  include/linux/bootmem_info.h | 24 
> >  1 file changed, 24 insertions(+)
> >
> > diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
> > index 4ed6dee1adc9..239e3cc8f86c 100644
> > --- a/include/linux/bootmem_info.h
> > +++ b/include/linux/bootmem_info.h
> > @@ -3,6 +3,7 @@
> >  #define __LINUX_BOOTMEM_INFO_H
> >
> >  #include 
> > +#include 
> >
> >  /*
> >   * Types for free bootmem stored in page->lru.next. These have to be in
> > @@ -22,6 +23,29 @@ void __init register_page_bootmem_info_node(struct 
> > pglist_data *pgdat);
> >  void get_page_bootmem(unsigned long info, struct page *page,
> > unsigned long type);
> >  void put_page_bootmem(struct page *page);
> > +
> > +static inline void free_vmemmap_page(struct page *page)
> > +{
> > + VM_WARN_ON(!PageReserved(page) || page_ref_count(page) != 2);
> > +
> > + /* bootmem page has reserved flag in the reserve_bootmem_region */
> > + if (PageReserved(page)) {
> > + unsigned long magic = (unsigned long)page->freelist;
> > +
> > + if (magic == SECTION_INFO || magic == MIX_SECTION_INFO)
> > + put_page_bootmem(page);
> > + else
> > + WARN_ON(1);
> > + }
> > +}
> > +
> > +static inline void prepare_vmemmap_page(struct page *page)
> > +{
> > + unsigned long section_nr = pfn_to_section_nr(page_to_pfn(page));
> > +
> > + get_page_bootmem(section_nr, page, SECTION_INFO);
> > + mark_page_reserved(page);
> > +}
>
> Can you clarify in the description when exactly these functions are
> called and on which type of pages?
>
> Would indicating "bootmem" in the function names make it clearer what we
> are dealing with?
>
> E.g., any memory allocated via the memblock allocator and not via the
> buddy will be makred reserved already in the memmap. It's unclear to me
> why we need the mark_page_reserved() here - can you enlighten me? :)

Sorry for overlooking this question. The vmemmap pages are allocated
from the bootmem allocator, and such pages are marked as PG_reserved. To free
those bootmem pages, we should call put_page_bootmem(), which, as you can
see, clears PG_reserved. To be consistent, prepare_vmemmap_page() also
marks the page as PG_reserved.
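The pairing can be pictured with a toy model in plain C. It is only a sketch of the invariant being discussed (reserved flag set and reference count of 2 before the free), not the kernel's struct page handling. struct toy_page, TOY_SECTION_INFO and the toy_* functions are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SECTION_INFO 1UL /* hypothetical type tag */

struct toy_page {
    int refcount;
    bool reserved;
    unsigned long type;
};

/* Models prepare_vmemmap_page(): tag the page, take a reference and
 * mark it reserved so that the free path below accepts it. */
static void toy_prepare(struct toy_page *p)
{
    p->type = TOY_SECTION_INFO;
    p->refcount++;
    p->reserved = true;
}

/* Models the free path: drop the extra reference and, once only the
 * base reference is left, clear the tag and the reserved flag.
 * Returns true when the page was handed back. */
static bool toy_free(struct toy_page *p)
{
    assert(p->reserved && p->refcount == 2);

    if (--p->refcount == 1) {
        p->type = 0;
        p->reserved = false;
        return true;
    }
    return false;
}
```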

Thanks.

>
> --
> Thanks,
>
> David / dhildenb
>


-- 
Yours,
Muchun

[rcu:dev.2020.12.08c 96/104] arc-elf-ld: slab_common.c:undefined reference to `kmem_provenance'

2020-12-08 Thread kernel test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
dev.2020.12.08c
head:   0168e03a513cd576ca6ab24f428ce85cec1e3ff3
commit: fc2cf07ea6773cc71c15e5477f35b28080b824c8 [96/104] mm: Add 
mem_dump_obj() to print source of memory block
config: arc-defconfig (attached as .config)
compiler: arc-elf-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?id=fc2cf07ea6773cc71c15e5477f35b28080b824c8
git remote add rcu 
https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
git fetch --no-tags rcu dev.2020.12.08c
git checkout fc2cf07ea6773cc71c15e5477f35b28080b824c8
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   arc-elf-ld: mm/slab_common.o: in function `kmem_dump_obj':
   slab_common.c:(.text+0xba): undefined reference to `kmem_provenance'
>> arc-elf-ld: slab_common.c:(.text+0xba): undefined reference to 
>> `kmem_provenance'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



UBSAN: shift-out-of-bounds in f2fs_fill_super

2020-12-08 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:15ac8fdb Add linux-next specific files for 20201207
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14ba4b3750
kernel config:  https://syzkaller.appspot.com/x/.config?x=3696b8138207d24d
dashboard link: https://syzkaller.appspot.com/bug?extid=ca9a785f8ac472085994
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=14e17ccb50
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12c2128750

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ca9a785f8ac472085...@syzkaller.appspotmail.com

loop0: detected capacity change from 16384 to 0

UBSAN: shift-out-of-bounds in fs/f2fs/super.c:2812:16
shift exponent 59 is too large for 32-bit type 'int'
CPU: 0 PID: 8465 Comm: syz-executor962 Not tainted 
5.10.0-rc6-next-20201207-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 sanity_check_raw_super fs/f2fs/super.c:2812 [inline]
 read_raw_super_block fs/f2fs/super.c:3267 [inline]
 f2fs_fill_super.cold+0x16c9/0x16f6 fs/f2fs/super.c:3519
 mount_bdev+0x34d/0x410 fs/super.c:1366
 legacy_get_tree+0x105/0x220 fs/fs_context.c:592
 vfs_get_tree+0x89/0x2f0 fs/super.c:1496
 do_new_mount fs/namespace.c:2896 [inline]
 path_mount+0x12ae/0x1e70 fs/namespace.c:3227
 do_mount fs/namespace.c:3240 [inline]
 __do_sys_mount fs/namespace.c:3448 [inline]
 __se_sys_mount fs/namespace.c:3425 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3425
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x446d4a
Code: b8 08 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 fd ad fb ff c3 66 2e 0f 1f 
84 00 00 00 00 00 66 90 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 
da ad fb ff c3 66 0f 1f 84 00 00 00 00 00
RSP: 002b:7fff137bba68 EFLAGS: 0297 ORIG_RAX: 00a5
RAX: ffda RBX: 7fff137bbac0 RCX: 00446d4a
RDX: 2000 RSI: 2100 RDI: 7fff137bba80
RBP: 7fff137bba80 R08: 7fff137bbac0 R09: 7fff0015
R10:  R11: 0297 R12: 0002
R13: 0004 R14: 0003 R15: 0003
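In C, shifting a value by an amount greater than or equal to the width of its type is undefined behaviour, which is what UBSAN flags in the report above. A common guard is to range-check a log2 field read from untrusted input before using it as a shift count. The sketch below shows that idea only. The bounds and names are hypothetical and are not taken from fs/f2fs/super.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accepted range for a log2 block size field. */
#define DEMO_MIN_LOG_BLOCKSIZE 9u
#define DEMO_MAX_LOG_BLOCKSIZE 16u

/*
 * Convert an untrusted log2 value into a block size.
 * Returns false, without shifting, when the exponent is out of range.
 */
static bool demo_blocksize_from_log(uint32_t log_blocksize, uint32_t *blocksize)
{
    if (blocksize == NULL)
        return false;
    if (log_blocksize < DEMO_MIN_LOG_BLOCKSIZE ||
        log_blocksize > DEMO_MAX_LOG_BLOCKSIZE)
        return false;

    *blocksize = UINT32_C(1) << log_blocksize;
    return true;
}
```

An exponent of 12 yields 4096, while an exponent of 59, as in the report, is rejected before any shift happens.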



---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

UBSAN: shift-out-of-bounds in option_probe

2020-12-08 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:15ac8fdb Add linux-next specific files for 20201207
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17dc6adf50
kernel config:  https://syzkaller.appspot.com/x/.config?x=3696b8138207d24d
dashboard link: https://syzkaller.appspot.com/bug?extid=8881b478dad0a7971f79
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12e8961350
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1799362350

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+8881b478dad0a7971...@syzkaller.appspotmail.com

usb 1-1: config 0 interface 109 has no altsetting 0
usb 1-1: New USB device found, idVendor=12d1, idProduct=02cb, bcdDevice= 1.fb
usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 1-1: config 0 descriptor??

UBSAN: shift-out-of-bounds in drivers/usb/serial/option.c:2120:21
shift exponent 109 is too large for 64-bit type 'long unsigned int'
CPU: 0 PID: 3169 Comm: kworker/0:3 Not tainted 5.10.0-rc6-next-20201207-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 option_probe.cold+0x1a/0x1f drivers/usb/serial/option.c:2120
 usb_serial_probe+0x32d/0xef0 drivers/usb/serial/usb-serial.c:905
 usb_probe_interface+0x315/0x7f0 drivers/usb/core/driver.c:396
 really_probe+0x2b1/0xe40 drivers/base/dd.c:554
 driver_probe_device+0x285/0x3f0 drivers/base/dd.c:738
 __device_attach_driver+0x216/0x2d0 drivers/base/dd.c:844
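
For context on the report above: in C, shifting a value by an amount greater than or equal to the width of its type is undefined behaviour, which is what UBSAN flags here (exponent 109 on a 64-bit type). A minimal userspace sketch of a guarded bit test follows. The names `mask_has_bit()` and `ULONG_BITS` are hypothetical and chosen for illustration; they are not taken from drivers/usb/serial/option.c or from any eventual fix.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Number of bits in an unsigned long on this platform. */
#define ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

static_assert(ULONG_BITS >= 32, "unexpected unsigned long width");

/*
 * Hypothetical helper: test bit `nr` in `mask` without ever shifting by
 * an amount that is >= the width of the type (undefined behaviour in C).
 * Out-of-range bit numbers are reported as "not set".
 */
static bool mask_has_bit(unsigned long mask, unsigned int nr)
{
	if (nr >= ULONG_BITS)
		return false;

	return (mask & (1UL << nr)) != 0;
}
```

Whether an out-of-range index should be treated as "not set" or rejected as an error depends on the caller; the sketch only shows the range check that keeps the shift well defined.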



UBSAN: shift-out-of-bounds in ext4_fill_super

2020-12-08 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:    15ac8fdb Add linux-next specific files for 20201207
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1125c92350
kernel config:  https://syzkaller.appspot.com/x/.config?x=3696b8138207d24d
dashboard link: https://syzkaller.appspot.com/bug?extid=345b75652b1d24227443
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=151bf86b50
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=139212cb50

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+345b75652b1d24227...@syzkaller.appspotmail.com

loop0: detected capacity change from 4 to 0

UBSAN: shift-out-of-bounds in fs/ext4/super.c:4190:25
shift exponent 589825 is too large for 32-bit type 'int'
CPU: 1 PID: 8498 Comm: syz-executor023 Not tainted 5.10.0-rc6-next-20201207-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 ext4_fill_super.cold+0x154/0x3ce fs/ext4/super.c:4190
 mount_bdev+0x34d/0x410 fs/super.c:1366
 legacy_get_tree+0x105/0x220 fs/fs_context.c:592
 vfs_get_tree+0x89/0x2f0 fs/super.c:1496
 do_new_mount fs/namespace.c:2896 [inline]
 path_mount+0x12ae/0x1e70 fs/namespace.c:3227
 do_mount fs/namespace.c:3240 [inline]
 __do_sys_mount fs/namespace.c:3448 [inline]
 __se_sys_mount fs/namespace.c:3425 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3425
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x446d6a
Code: b8 08 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 fd ad fb ff c3 66 2e 0f 1f 
84 00 00 00 00 00 66 90 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 
da ad fb ff c3 66 0f 1f 84 00 00 00 00 00
RSP: 002b:7ffc2d215018 EFLAGS: 0206 ORIG_RAX: 00a5
RAX: ffda RBX: 7ffc2d215070 RCX: 00446d6a
RDX: 2000 RSI: 2100 RDI: 7ffc2d215030
RBP: 7ffc2d215030 R08: 7ffc2d215070 R09: 
R10:  R11: 0206 R12: 0001
R13: 0004 R14: 0003 R15: 0003




UBSAN: shift-out-of-bounds in parse_audio_format_i

2020-12-08 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:    15ac8fdb Add linux-next specific files for 20201207
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=16d4620f50
kernel config:  https://syzkaller.appspot.com/x/.config?x=3696b8138207d24d
dashboard link: https://syzkaller.appspot.com/bug?extid=df7dc146ebdd6435eea3
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=126afa1350
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1438961350

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+df7dc146ebdd6435e...@syzkaller.appspotmail.com

usb 1-1: New USB device found, idVendor=1d6b, idProduct=0101, bcdDevice= 0.40
usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-1: Product: syz
usb 1-1: Manufacturer: syz
usb 1-1: SerialNumber: syz
usb 1-1: 2:1 : no or invalid class specific endpoint descriptor

UBSAN: shift-out-of-bounds in sound/usb/format.c:44:17
shift exponent 4098 is too large for 64-bit type 'long long unsigned int'
CPU: 0 PID: 8656 Comm: kworker/0:4 Not tainted 5.10.0-rc6-next-20201207-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 parse_audio_format_i_type sound/usb/format.c:44 [inline]
 parse_audio_format_i.cold+0xba/0x3e2 sound/usb/format.c:653
 snd_usb_parse_audio_format+0x89/0x290 sound/usb/format.c:753
 snd_usb_get_audioformat_uac12 sound/usb/stream.c:841 [inline]
 __snd_usb_parse_audio_interface+0xce4/0x3cf0 sound/usb/stream.c:1170
 snd_usb_parse_audio_interface+0x79/0x130 sound/usb/stream.c:1240
 snd_usb_create_stream.isra.0+0x23a/0x530 sound/usb/card.c:206
 snd_usb_create_streams sound/usb/card.c:278 [inline]
 usb_audio_probe+0x93c/0x2ab0 sound/usb/card.c:802
 usb_probe_interface+0x315/0x7f0 drivers/usb/core/driver.c:396
 really_probe+0x2b1/0xe40 drivers/base/dd.c:554
 driver_probe_device+0x285/0x3f0 drivers/base/dd.c:738
 __device_attach_driver+0x216/0x2d0 drivers/base/dd.c:844
 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431
 __device_attach+0x228/0x4c0 drivers/base/dd.c:912
 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491
 device_add+0xbb2/0x1ce0 drivers/base/core.c:2934
 usb_set_configuration+0x113c/0x1910 drivers/usb/core/message.c:2167
 usb_generic_driver_probe+0xba/0x100 drivers/usb/core/generic.c:238
 usb_probe_device+0xd9/0x2c0 drivers/usb/core/driver.c:293
 really_probe+0x2b1/0xe40 drivers/base/dd.c:554
 driver_probe_device+0x285/0x3f0 drivers/base/dd.c:738
 __device_attach_driver+0x216/0x2d0 drivers/base/dd.c:844
 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431
 __device_attach+0x228/0x4c0 drivers/base/dd.c:912
 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491
 device_add+0xbb2/0x1ce0 drivers/base/core.c:2934
 usb_new_device.cold+0x725/0x1057 drivers/usb/core/hub.c:2555
 hub_port_connect drivers/usb/core/hub.c:5223 [inline]
 hub_port_connect_change drivers/usb/core/hub.c:5363 [inline]
 port_event drivers/usb/core/hub.c:5509 [inline]
 hub_event+0x2348/0x42d0 drivers/usb/core/hub.c:5591
 process_one_work+0x98d/0x1630 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296




UBSAN: shift-out-of-bounds in intel_pmu_refresh

2020-12-08 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:    15ac8fdb Add linux-next specific files for 20201207
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=10e6b92350
kernel config:  https://syzkaller.appspot.com/x/.config?x=3696b8138207d24d
dashboard link: https://syzkaller.appspot.com/bug?extid=ae488dc136a4cc6ba32b
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=168d927b50
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11e0f70350

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ae488dc136a4cc6ba...@syzkaller.appspotmail.com


UBSAN: shift-out-of-bounds in arch/x86/kvm/vmx/pmu_intel.c:348:45
shift exponent 197 is too large for 64-bit type 'long long unsigned int'
CPU: 0 PID: 8491 Comm: syz-executor902 Not tainted 5.10.0-rc6-next-20201207-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 intel_pmu_refresh.cold+0x75/0x99 arch/x86/kvm/vmx/pmu_intel.c:348
 kvm_vcpu_after_set_cpuid+0x65a/0xf80 arch/x86/kvm/cpuid.c:177
 kvm_vcpu_ioctl_set_cpuid2+0x160/0x440 arch/x86/kvm/cpuid.c:308
 kvm_arch_vcpu_ioctl+0x11b6/0x2d70 arch/x86/kvm/x86.c:4709
 kvm_vcpu_ioctl+0x7b9/0xdb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3386
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl fs/ioctl.c:739 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x448f39
Code: e8 3c ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 
4b ff fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:7fdfd8aadd98 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 006ddc68 RCX: 00448f39
RDX: 2480 RSI: 4008ae90 RDI: 0008
RBP: 006ddc60 R08:  R09: 
R10:  R11: 0246 R12: 006ddc6c
R13: ddd82e006500 R14: 099a300f0078010f R15: 2e320fc080b9




UBSAN: shift-out-of-bounds in snd_pcm_oss_change_params_locked

2020-12-08 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:    15ac8fdb Add linux-next specific files for 20201207
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1656cc1350
kernel config:  https://syzkaller.appspot.com/x/.config?x=3696b8138207d24d
dashboard link: https://syzkaller.appspot.com/bug?extid=33ef0b6639a8d2d42b4c
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13a8ad3750
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15bc6adf50

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+33ef0b6639a8d2d42...@syzkaller.appspotmail.com


UBSAN: shift-out-of-bounds in sound/core/oss/pcm_oss.c:705:23
shift exponent 58 is too large for 32-bit type 'int'
CPU: 1 PID: 8476 Comm: syz-executor572 Not tainted 5.10.0-rc6-next-20201207-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 snd_pcm_oss_period_size sound/core/oss/pcm_oss.c:705 [inline]
 snd_pcm_oss_change_params_locked.cold+0x55/0x78 sound/core/oss/pcm_oss.c:925
 snd_pcm_oss_change_params sound/core/oss/pcm_oss.c:1084 [inline]
 snd_pcm_oss_make_ready+0xe7/0x1b0 sound/core/oss/pcm_oss.c:1143
 snd_pcm_oss_sync+0x1de/0x800 sound/core/oss/pcm_oss.c:1708
 snd_pcm_oss_release+0x276/0x300 sound/core/oss/pcm_oss.c:2546
 __fput+0x283/0x920 fs/file_table.c:280
 task_work_run+0xdd/0x190 kernel/task_work.c:140
 exit_task_work include/linux/task_work.h:30 [inline]
 do_exit+0xb89/0x2a00 kernel/exit.c:823
 do_group_exit+0x125/0x310 kernel/exit.c:920
 __do_sys_exit_group kernel/exit.c:931 [inline]
 __se_sys_exit_group kernel/exit.c:929 [inline]
 __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:929
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x43ee98
Code: Unable to access opcode bytes at RIP 0x43ee6e.
RSP: 002b:7ffc0b9ddff8 EFLAGS: 0246 ORIG_RAX: 00e7
RAX: ffda RBX:  RCX: 0043ee98
RDX:  RSI: 003c RDI: 
RBP: 004be6a8 R08: 00e7 R09: ffd0
R10:  R11: 0246 R12: 0001
R13: 006d0180 R14:  R15: 




Re: [PATCH v2 07/12] x86: add new features for paravirt patching

2020-12-08 Thread Jürgen Groß


On 08.12.20 19:43, Borislav Petkov wrote:

On Fri, Nov 20, 2020 at 12:46:25PM +0100, Juergen Gross wrote:

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index dad350d42ecf..ffa23c655412 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -237,6 +237,9 @@
  #define X86_FEATURE_VMCALL          ( 8*32+18) /* "" Hypervisor supports the VMCALL instruction */
  #define X86_FEATURE_VMW_VMMCALL     ( 8*32+19) /* "" VMware prefers VMMCALL hypercall instruction */
  #define X86_FEATURE_SEV_ES          ( 8*32+20) /* AMD Secure Encrypted Virtualization - Encrypted State */
+#define X86_FEATURE_NOT_XENPV        ( 8*32+21) /* "" Inverse of X86_FEATURE_XENPV */
+#define X86_FEATURE_NO_PVUNLOCK      ( 8*32+22) /* "" No PV unlock function */
+#define X86_FEATURE_NO_VCPUPREEMPT   ( 8*32+23) /* "" No PV vcpu_is_preempted function */


Ew, negative features. ;-\


Hey, I already suggested to use ~FEATURE for that purpose (see
https://lore.kernel.org/lkml/f105a63d-6b51-3afb-83e0-e899ea408...@suse.com/ 
).




/me goes forward and looks at usage sites:

+   ALTERNATIVE_2 "jmp *paravirt_iret(%rip);",\
+ "jmp native_iret;", X86_FEATURE_NOT_XENPV,  \
+ "jmp xen_iret;", X86_FEATURE_XENPV

Can we make that:

ALTERNATIVE_TERNARY "jmp *paravirt_iret(%rip);",
  "jmp xen_iret;", X86_FEATURE_XENPV,
  "jmp native_iret;", X86_FEATURE_XENPV,


Would we really want to specify the feature twice?

I'd rather make the syntax:

ALTERNATIVE_TERNARY   
 

as this ...



where the last two lines are supposed to mean

X86_FEATURE_XENPV ? "jmp xen_iret;" : "jmp 
native_iret;"


... would match perfectly to this interpretation.



Now, in order to convey that logic to apply_alternatives(), you can do:

struct alt_instr {
        s32 instr_offset;       /* original instruction */
        s32 repl_offset;        /* offset to replacement instruction */
        u16 cpuid;              /* cpuid bit set for replacement */
        u8  instrlen;           /* length of original instruction */
        u8  replacementlen;     /* length of new instruction */
        u8  padlen;             /* length of build-time padding */
        u8  flags;              /* patching flags */    <--- THIS
} __packed;


Hmm, using flags is an alternative (pun intended :-) ).



and yes, we have had the flags thing in a lot of WIP diffs over the
years but we've never come to actually needing it.

Anyway, then, apply_alternatives() will do:

if (flags & ALT_NOT_FEATURE)

or something like that - I'm bad at naming stuff - then it should patch
only when the feature is NOT set and vice versa.

There in that

if (!boot_cpu_has(a->cpuid)) {

branch.

Hmm?
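
The inverted-flag idea described above can be sketched as a stand-alone decision function. `ALT_NOT_FEATURE` is the name floated in the mail; the bit value, the `should_patch()` helper, and the boolean argument standing in for `boot_cpu_has(a->cpuid)` are assumptions made for illustration, not the code that was eventually merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag name floated in the thread; the bit value is an assumption. */
#define ALT_NOT_FEATURE (1u << 0)

static_assert(ALT_NOT_FEATURE <= UINT8_MAX,
	      "flag must fit in the proposed u8 flags field");

/*
 * Hypothetical helper: decide whether a replacement should be patched in.
 * `feature_set` stands in for boot_cpu_has(a->cpuid).
 * Without the flag, patch when the feature is set.
 * With the flag, patch when the feature is NOT set.
 */
static bool should_patch(uint8_t flags, bool feature_set)
{
	bool inverted = (flags & ALT_NOT_FEATURE) != 0;

	return feature_set != inverted;
}
```

With such a flag, a single feature bit such as X86_FEATURE_XENPV could select both the Xen and the native replacement, so no separate "negative" feature bit would be needed.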


Fine with me (I'd prefer my ALTERNATIVE_TERNARY syntax, though).




  /* Intel-defined CPU features, CPUID level 0x0007:0 (EBX), word 9 */
  #define X86_FEATURE_FSGSBASE        ( 9*32+ 0) /* RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 2400ad62f330..f8f9700719cf 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -593,6 +593,18 @@ int alternatives_text_reserved(void *start, void *end)
  #endif /* CONFIG_SMP */
  
  #ifdef CONFIG_PARAVIRT

+static void __init paravirt_set_cap(void)
+{
+   if (!boot_cpu_has(X86_FEATURE_XENPV))
+   setup_force_cpu_cap(X86_FEATURE_NOT_XENPV);
+
+   if (pv_is_native_spin_unlock())
+   setup_force_cpu_cap(X86_FEATURE_NO_PVUNLOCK);
+
+   if (pv_is_native_vcpu_is_preempted())
+   setup_force_cpu_cap(X86_FEATURE_NO_VCPUPREEMPT);
+}
+
  void __init_or_module apply_paravirt(struct paravirt_patch_site *start,
 struct paravirt_patch_site *end)
  {
@@ -616,6 +628,8 @@ void __init_or_module apply_paravirt(struct 
paravirt_patch_site *start,
  }
  extern struct paravirt_patch_site __start_parainstructions[],
__stop_parainstructions[];
+#else
+static void __init paravirt_set_cap(void) { }
  #endif/* CONFIG_PARAVIRT */
  
  /*

@@ -723,6 +737,18 @@ void __init alternative_instructions(void)
 * patching.
 */
  
+	paravirt_set_cap();


Can that be called from somewhere in the Xen init path and not from
here? Somewhere before check_bugs() gets called.


No, this is needed for non-Xen cases, too (especially pv spinlocks).




+   /*
+* First patch paravirt functions, such that we overwrite the indirect
+* call with the direct call.
+*/
+   apply_paravirt(__parainstructions, __parainstructions_end);
+
+   /*
+* Then patch alternatives, such that those paravirt calls that are in
+* alternatives can

Re: [PATCH 2/2] pwm: pwm-gpio: Add DT bindings

2020-12-08 Thread Uwe Kleine-König

On Sat, Dec 05, 2020 at 10:46:16PM +0100, Nicola Di Lieto wrote:
> Added Documentation/devicetree/bindings/pwm/pwm-gpio.yaml
> 
> Signed-off-by: Nicola Di Lieto 
> ---
> .../devicetree/bindings/pwm/pwm-gpio.yaml  | 42 ++
> 1 file changed, 42 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/pwm/pwm-gpio.yaml
> 
> diff --git a/Documentation/devicetree/bindings/pwm/pwm-gpio.yaml 
> b/Documentation/devicetree/bindings/pwm/pwm-gpio.yaml
> new file mode 100644
> index ..2e021ac6ff4a
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/pwm/pwm-gpio.yaml
> @@ -0,0 +1,42 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/pwm/pwm-gpio.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Generic software PWM for modulating GPIOs
> +
> +maintainers:
> +  - Nicola Di Lieto 
> +
> +properties:
> +  "#pwm-cells":
> +description: |
> +  It must be 2. See pwm.yaml in this directory for a
> +  description of the cells format.

I think nowadays we prefer 3 here.

> +const: 2
> +
> +  compatible:
> +const: pwm-gpio
> +
> +  gpios:
> +description:
> +  GPIO to be modulated
> +maxItems: 1
> +

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |



Re: [PATCH] kunit: tool: simplify kconfig is_subset_of() logic

2020-12-08 Thread David Gow

On Wed, Dec 9, 2020 at 7:21 AM Daniel Latypov  wrote:
>
> Don't use an O(nm) algorithm* and make it more readable by using a dict.
>
> *Most obviously, it does a nested for-loop over the entire other config.
> A bit more subtle, it calls .entries(), which constructs a set from the
> list for _every_ outer iteration.
>
> Signed-off-by: Daniel Latypov 
> ---
Thanks! This works great here: I didn't time it to see how much faster
it is, but it's clearly an improvement.

Reviewed-by: David Gow 

Cheers,
-- David

Re: [PATCH v1 1/1] scsi: ufs: Fix ufs power down/on specs violation

2020-12-08 Thread Can Guo


On 2020-12-09 15:09, Ziqi Chen wrote:

As per specs, e.g. JESD220E chapter 7.2, while powering
off/on the UFS device, the RST_N and REF_CLK signals
should be between VSS (ground) and VCCQ/VCCQ2.

Power down:
1. Assert RST_N low
2. Turn off REF_CLK
3. Turn off VCC
4. Turn off VCCQ/VCCQ2
Power on:
1. Turn on VCC
2. Turn on VCCQ/VCCQ2
3. Turn on REF_CLK
4. Deassert RST_N high
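
The two sequences above can be written down as data and checked for the property the commit message cares about: RST_N and REF_CLK are only driven while the supplies are up. This is a minimal userspace sketch; the enum, array, and function names are hypothetical and are not taken from the ufshcd driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step identifiers; not taken from the ufshcd driver. */
enum ufs_pwr_step {
	STEP_RST_N,	/* asserted low on power down, deasserted high on power on */
	STEP_REF_CLK,
	STEP_VCC,
	STEP_VCCQ,	/* VCCQ/VCCQ2 */
};

#define N_STEPS 4

/* Power-down order as listed in the commit message. */
static const enum ufs_pwr_step power_down_seq[] = {
	STEP_RST_N, STEP_REF_CLK, STEP_VCC, STEP_VCCQ,
};

/* Power-on order as listed in the commit message. */
static const enum ufs_pwr_step power_up_seq[] = {
	STEP_VCC, STEP_VCCQ, STEP_REF_CLK, STEP_RST_N,
};

static_assert(sizeof(power_down_seq) / sizeof(power_down_seq[0]) == N_STEPS,
	      "power-down sequence must list every step once");
static_assert(sizeof(power_up_seq) / sizeof(power_up_seq[0]) == N_STEPS,
	      "power-on sequence must list every step once");

/* Return the position of `step` in `seq`, or -1 if it is missing. */
static int step_index(const enum ufs_pwr_step *seq, enum ufs_pwr_step step)
{
	for (size_t i = 0; i < N_STEPS; i++)
		if (seq[i] == step)
			return (int)i;
	return -1;
}

/*
 * RST_N and REF_CLK must go down before the first supply is removed and
 * must come up only after the last supply has been enabled.
 */
static int signals_stay_within_rails(void)
{
	int down_ok =
		step_index(power_down_seq, STEP_RST_N) < step_index(power_down_seq, STEP_VCC) &&
		step_index(power_down_seq, STEP_REF_CLK) < step_index(power_down_seq, STEP_VCC);
	int up_ok =
		step_index(power_up_seq, STEP_VCCQ) < step_index(power_up_seq, STEP_REF_CLK) &&
		step_index(power_up_seq, STEP_VCCQ) < step_index(power_up_seq, STEP_RST_N);

	return down_ok && up_ok;
}
```

The patch below achieves the same ordering by moving the ufshcd_vreg_set_lpm() and ufshcd_vreg_set_hpm() calls relative to the clock and vendor callbacks.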

Signed-off-by: Ziqi Chen 
---
 drivers/scsi/ufs/ufs-qcom.c | 14 ++
 drivers/scsi/ufs/ufshcd.c   | 19 +--
 drivers/scsi/ufs/ufshcd.h   |  4 ++--
 3 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index 1e434cc..5ed3a63d 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -582,6 +582,9 @@ static int ufs_qcom_suspend(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
ufs_qcom_disable_lane_clks(host);
phy_power_off(phy);

+   if (hba->vops && hba->vops->device_reset)
+   hba->vops->device_reset(hba, false);
+


Instead of doing the pull-down in ufshcd_vops_suspend(), can we do
it in ufshcd_suspend()? This is a common problem for all SoC
vendors.


} else if (!ufs_qcom_is_link_active(hba)) {
ufs_qcom_disable_lane_clks(host);
}
@@ -1400,10 +1403,11 @@ static void ufs_qcom_dump_dbg_regs(struct 
ufs_hba *hba)

 /**
  * ufs_qcom_device_reset() - toggle the (optional) device reset line
  * @hba: per-adapter instance
+ * @toggle: need pulling up or not
  *
  * Toggles the (optional) reset line to reset the attached device.
  */
-static int ufs_qcom_device_reset(struct ufs_hba *hba)
+static int ufs_qcom_device_reset(struct ufs_hba *hba, bool toggle)
 {
struct ufs_qcom_host *host = ufshcd_get_variant(hba);

@@ -1416,10 +1420,12 @@ static int ufs_qcom_device_reset(struct ufs_hba 
*hba)

 * be on the safe side.
 */
gpiod_set_value_cansleep(host->device_reset, 1);
-   usleep_range(10, 15);

-   gpiod_set_value_cansleep(host->device_reset, 0);
-   usleep_range(10, 15);
+   if (toggle) {
+   usleep_range(10, 15);
+   gpiod_set_value_cansleep(host->device_reset, 0);
+   usleep_range(10, 15);
+   }

return 0;
 }
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 92d433d..5ab1c02 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -8633,8 +8633,6 @@ static int ufshcd_suspend(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
if (ret)
goto set_dev_active;

-   ufshcd_vreg_set_lpm(hba);
-
 disable_clks:
/*
 	 * Call vendor specific suspend callback. As these callbacks may 
access

@@ -8664,6 +8662,7 @@ static int ufshcd_suspend(struct ufs_hba *hba,
enum ufs_pm_op pm_op)

/* Put the host controller in low power mode if possible */
ufshcd_hba_vreg_set_lpm(hba);
+   ufshcd_vreg_set_lpm(hba);


Can you put ufshcd_vreg_set_lpm() before ufshcd_hba_vreg_set_lpm()?


goto out;

 set_link_active:
@@ -8729,18 +8728,18 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
old_link_state = hba->uic_link_state;

ufshcd_hba_vreg_set_hpm(hba);
+   ret = ufshcd_vreg_set_hpm(hba);
+   if (ret)
+   goto out;
+
/* Make sure clocks are enabled before accessing controller */
ret = ufshcd_setup_clocks(hba, true);
if (ret)
-   goto out;
+   goto disable_vreg;

/* enable the host irq as host controller would be active soon */
ufshcd_enable_irq(hba);

-   ret = ufshcd_vreg_set_hpm(hba);
-   if (ret)
-   goto disable_irq_and_vops_clks;
-
/*
 	 * Call vendor specific resume callback. As these callbacks may 
access

 * vendor specific host controller register space call them when the
@@ -8748,7 +8747,7 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
 */
ret = ufshcd_vops_resume(hba, pm_op);
if (ret)
-   goto disable_vreg;
+   goto disable_irq_and_vops_clks;

 	/* For DeepSleep, the only supported option is to have the link off 
*/
 	WARN_ON(ufshcd_is_ufs_dev_deepsleep(hba) && 
!ufshcd_is_link_off(hba));

@@ -8815,8 +8814,6 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
ufshcd_link_state_transition(hba, old_link_state, 0);
 vendor_suspend:
ufshcd_vops_suspend(hba, pm_op);
-disable_vreg:
-   ufshcd_vreg_set_lpm(hba);
 disable_irq_and_vops_clks:
ufshcd_disable_irq(hba);
if (hba->clk_scaling.is_allowed)
@@ -8827,6 +8824,8 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
trace_ufshcd_clk_gating(dev_name(hba->dev),
hba->clk_gating.state);
}
+disable_vreg:
+   ufshcd_vreg_set_lpm(hba);
 out:

Re: [PATCH V5 0/2] mailbox: Add mhuv2 mailbox controller's support

2020-12-08 Thread Viresh Kumar

On 17-11-20, 15:32, Viresh Kumar wrote:
> Hi Jassi,
> 
> Here is the updated version based on your suggestions.
> 
> I feel bad that I haven't implemented the single-word protocol as a
> special case of multi-word one in the earlier attempt. Perhaps I was too
> consumed by the terminology used by the ARM folks in the previous
> version of the driver and the reference manual of the controller :)
> 
> V1/V4->V5

Hi Jassi,

I still don't see this here, hope it is going to get merged in the
coming merge window.

https://git.linaro.org/landing-teams/working/fujitsu/integration.git/log/?h=mailbox-for-next

Please let me know if you have any other concerns. Thanks.

-- 
viresh

Re: [PATCH v2 1/4] spi: LS7A: Add Loongson LS7A SPI controller driver support

2020-12-08 Thread zhangqing


Hi Brown,

Thank you for your suggestions. They are all achievable, and I will
send v3 soon.


Before sending v3, could I trouble you to check whether the changes
below are correct? They have been tested locally.


On 12/08/2020 09:56 PM, Mark Brown wrote:

On Tue, Dec 08, 2020 at 03:44:24PM +0800, Qing Zhang wrote:


v2:
- keep Kconfig and Makefile sorted
- make the entire comment a C++ one so things look more intentional

You say this but...


+++ b/drivers/spi/spi-ls7a.c
@@ -0,0 +1,324 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Loongson LS7A SPI Controller driver
+ *
+ * Copyright (C) 2020 Loongson Technology Corporation Limited
+ */

...this is still a mix of C and C++ comments?

  I will replace them all with // comments.




+static int set_cs(struct ls7a_spi *ls7a_spi, struct spi_device  *spi, int val)
+{
+   int cs = ls7a_spi_read_reg(ls7a_spi, SFCS) & ~(0x11 << 
spi->chip_select);
+
+   if (spi->mode  & SPI_CS_HIGH)
+   val = !val;
+   ls7a_spi_write_reg(ls7a_spi, SFCS,
+   (val ? (0x11 << spi->chip_select):(0x1 << spi->chip_select)) | 
cs);
+
+   return 0;
+}

Why not just expose this to the core and let it handle things?

Please also write normal conditional statements to improve legibility.
There's quite a lot of coding style issues in this with things like
missing spaces

static void ls7a_spi_set_cs(struct spi_device *spi, bool enable)
{
        struct ls7a_spi *ls7a_spi = spi_master_get_devdata(spi->master);
        int cs, val;

        /* Clear both bits belonging to this chip select. */
        cs = ls7a_spi_read_reg(ls7a_spi, SFCS) & ~(0x11 << spi->chip_select);

        if (!!(spi->mode & SPI_CS_HIGH) == enable)
                val = (0x11 << spi->chip_select) | cs;
        else
                val = (0x1 << spi->chip_select) | cs;

        ls7a_spi_write_reg(ls7a_spi, SFCS, val);
}
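
The register arithmetic in the draft set_cs callback above can be checked in isolation. This is a minimal userspace sketch: `sfcs_value()` is a hypothetical pure helper that mirrors only the bit manipulation, `cs_high` stands in for `spi->mode & SPI_CS_HIGH`, and the limit of four chip selects is an assumption so that both bits stay inside the low byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the arithmetic of the draft callback:
 * clear bits `cs` and `cs + 4` of the old register value, then set either
 * both bits (0x11 << cs) or only the low one (0x1 << cs).
 */
static uint32_t sfcs_value(uint32_t old, unsigned int cs, bool cs_high, bool enable)
{
	uint32_t kept;

	/* Assumption: at most four chip selects, so cs + 4 stays below bit 8. */
	assert(cs < 4);

	kept = old & ~(0x11u << cs);

	if (cs_high == enable)
		return kept | (0x11u << cs);

	return kept | (0x1u << cs);
}
```

For example, with an old value of 0xff, chip select 1, and an active-low device, `enable == true` yields 0xdf and `enable == false` yields 0xff; bits belonging to the other chip selects are left untouched.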

In ls7a_spi_pci_probe():

 + master->set_cs = ls7a_spi_set_cs;




+   if (t) {
+   hz = t->speed_hz;
+   if (!hz)
+   hz = spi->max_speed_hz;
+   } else
+   hz = spi->max_speed_hz;

If one branch of the conditional has braces please use them on both to
improve legibility.


+static int  ls7a_spi_transfer_one_message(struct spi_master *master,
+ struct spi_message *m)

I don't understand why the driver is implementing transfer_one_message()
- it looks like this is just open coding the standard loop that the
framework provides and should just be using transfer_one().


static int ls7a_spi_transfer_one(struct spi_master *master,
                                 struct spi_device *spi,
                                 struct spi_transfer *t)
{
        struct ls7a_spi *ls7a_spi;
        int param, status, r;

        ls7a_spi = spi_master_get_devdata(master);

        spin_lock(&ls7a_spi->lock);
        param = ls7a_spi_read_reg(ls7a_spi, PARA);
        ls7a_spi_write_reg(ls7a_spi, PARA, param & ~1);
        spin_unlock(&ls7a_spi->lock);

        status = ls7a_spi_do_transfer(ls7a_spi, spi, t);
        if (status < 0)
                return status;

        if (t->len)
                r = ls7a_spi_write_read(spi, t);

        spin_lock(&ls7a_spi->lock);
        ls7a_spi_write_reg(ls7a_spi, PARA, param);
        spin_unlock(&ls7a_spi->lock);

        return status;
}

In ls7a_spi_pci_probe():

 - master->transfer_one_message = ls7a_spi_transfer_one_message;
 + master->transfer_one = ls7a_spi_transfer_one;



+   r = ls7a_spi_write_read(spi, t);
+   if (r < 0) {
+   status = r;
+   goto error;
+   }

The indentation here isn't following the kernel coding style.


+   master = spi_alloc_master(&pdev->dev, sizeof(struct ls7a_spi));
+   if (!master)
+   return -ENOMEM;

Why not use devm_ here?


- master = spi_alloc_master(&pdev->dev, sizeof(struct ls7a_spi));

  error:
- spi_put_master(master);

+ master = devm_spi_alloc_master(&pdev->dev, sizeof(struct ls7a_spi));




+   ret = devm_spi_register_master(dev, master);
+   if (ret)
+   goto err_free_master;

The driver uses devm_spi_register_master() here but...


+static void ls7a_spi_pci_remove(struct pci_dev *pdev)
+{
+   struct spi_master *master = pci_get_drvdata(pdev);
+   struct ls7a_spi *spi;
+
+   spi = spi_master_get_devdata(master);
+   if (!spi)
+   return;
+
+   pci_release_regions(pdev);

...releases the PCI regions in the remove() function before the SPI
controller is freed so the controller could still be active.


 static void ls7a_spi_pci_remove(struct pci_dev *pdev)
{
struct spi_master *master = pci_get_drvdata(pdev);

 + spi_unregister_master(master);
pci_release_regions(pdev);
}

Thanks,

-Qing

[PATCH] ASoC: audio-graph-card: Drop remote-endpoint as required property

2020-12-08 Thread Sameer Pujar

The remote-endpoint may not be available if it is part of some
pluggable module. One such example is an audio card: the Codec
endpoint is not available until the card is plugged in.
Hence drop 'remote-endpoint' as a required property.

Cc: Rob Herring 
Cc: Kuninori Morimoto 
Signed-off-by: Sameer Pujar 
---
 Documentation/devicetree/bindings/sound/audio-graph-port.yaml | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/sound/audio-graph-port.yaml 
b/Documentation/devicetree/bindings/sound/audio-graph-port.yaml
index 2005014..766e910 100644
--- a/Documentation/devicetree/bindings/sound/audio-graph-port.yaml
+++ b/Documentation/devicetree/bindings/sound/audio-graph-port.yaml
@@ -71,9 +71,6 @@ properties:
 description: CPU to Codec rate channels.
 $ref: /schemas/types.yaml#/definitions/uint32
 
-required:
-  - remote-endpoint
-
   ports:
 description: multi OF-Graph subnode
 type: object
-- 
2.7.4

Re: [PATCH v12 14/17] s390/zcrypt: Notify driver on config changed and scan complete callbacks

2020-12-08 Thread Harald Freudenberger

On 30.11.20 10:18, h...@d06av26.portsmouth.uk.ibm.com wrote:
> On Tue, 24 Nov 2020 16:40:13 -0500
> Tony Krowiak  wrote:
>
>> This patch introduces an extension to the ap bus to notify device drivers
>> when the host AP configuration changes - i.e., adapters, domains or
>> control domains are added or removed. To that end, two new callbacks are
>> introduced for AP device drivers:
>>
>>   void (*on_config_changed)(struct ap_config_info *new_config_info,
>> struct ap_config_info *old_config_info);
>>
>>  This callback is invoked at the start of the AP bus scan
>>  function when it determines that the host AP configuration information
>>  has changed since the previous scan. This is done by storing
>>  an old and current QCI info struct and comparing them. If there is any
>>  difference, the callback is invoked.
>>
>>  Note that when the AP bus scan detects that AP adapters, domains or
>>  control domains have been removed from the host's AP configuration, it
>>  will remove the associated devices from the AP bus subsystem's device
>>  model. This callback gives the device driver a chance to respond to
>>  the removal of the AP devices from the host configuration prior to
>>  calling the device driver's remove callback. The primary purpose of
>>  this callback is to allow the vfio_ap driver to do a bulk unplug of
>>  all affected adapters, domains and control domains from affected
>>  guests rather than unplugging them one at a time when the remove
>>  callback is invoked.
>>
>>   void (*on_scan_complete)(struct ap_config_info *new_config_info,
>>struct ap_config_info *old_config_info);
>>
>>  The on_scan_complete callback is invoked after the ap bus scan is
>>  complete if the host AP configuration data has changed.
>>
>>  Note that when the AP bus scan detects that adapters, domains or
>>  control domains have been added to the host's configuration, it will
>>  create new devices in the AP bus subsystem's device model. The primary
>>  purpose of this callback is to allow the vfio_ap driver to do a bulk
>>  plug of all affected adapters, domains and control domains into
>>  affected guests rather than plugging them one at a time when the
>>  probe callback is invoked.
>>
>> Please note that changes to the apmask and aqmask do not trigger
>> these two callbacks since the bus scan function is not invoked by changes
>> to those masks.
>>
>> Signed-off-by: Harald Freudenberger 
>> Signed-off-by: Tony Krowiak 
>> ---
>>  drivers/s390/crypto/ap_bus.c  | 83 ++-
>>  drivers/s390/crypto/ap_bus.h  | 12 
>>  drivers/s390/crypto/vfio_ap_private.h | 14 -
>>  3 files changed, 106 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
>> index 593573740981..3a63f6b33d8a 100644
>> --- a/drivers/s390/crypto/ap_bus.c
>> +++ b/drivers/s390/crypto/ap_bus.c
>> @@ -75,6 +75,7 @@ DEFINE_MUTEX(ap_perms_mutex);
>>  EXPORT_SYMBOL(ap_perms_mutex);
>>  
>>  static struct ap_config_info *ap_qci_info;
>> +static struct ap_config_info *ap_qci_info_old;
>>  
>>  /*
>>   * AP bus related debug feature things.
>> @@ -1440,6 +1441,52 @@ static int __match_queue_device_with_queue_id(struct 
>> device *dev, const void *da
>>  && AP_QID_QUEUE(to_ap_queue(dev)->qid) == (int)(long) data;
>>  }
>>  
>> +/* Helper function for notify_config_changed */
>> +static int __drv_notify_config_changed(struct device_driver *drv, void 
>> *data)
>> +{
>> +struct ap_driver *ap_drv = to_ap_drv(drv);
>> +
>> +if (try_module_get(drv->owner)) {
>> +if (ap_drv->on_config_changed)
>> +ap_drv->on_config_changed(ap_qci_info,
>> +  ap_qci_info_old);
>> +module_put(drv->owner);
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +/* Notify all drivers about an qci config change */
>> +static inline void notify_config_changed(void)
>> +{
>> +bus_for_each_drv(&ap_bus_type, NULL, NULL,
>> + __drv_notify_config_changed);
>> +}
>> +
>> +/* Helper function for notify_scan_complete */
>> +static int __drv_notify_scan_complete(struct device_driver *drv, void *data)
>> +{
>> +struct ap_driver *ap_drv = to_ap_drv(drv);
>> +
>> +if (try_module_get(drv->owner)) {
>> +if (ap_drv->on_scan_complete)
>> +ap_drv->on_scan_complete(ap_qci_info,
>> + ap_qci_info_old);
>> +module_put(drv->owner);
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +/* Notify all drivers about bus scan complete */
>> +static inline void notify_scan_complete(void)
>> +{
>> +bus_for_each_drv(&ap_bus_type, NULL, NULL,
>> + __drv_notify_scan_complete);
>> +}
>> +
>> +
>> +
>>  /*
>>   * Helper function for ap_scan_bus().
>>   *
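
The notification scheme described in the quoted commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct cfg_info, struct demo_driver, demo_scan_bus() and the counters are hypothetical stand-ins, not the ap bus code. The sketch compares the current configuration with the one saved by the previous scan and calls the two optional callbacks only when they differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct ap_config_info: one bit per adapter/domain. */
struct cfg_info {
	uint64_t adapters;
	uint64_t domains;
};

/* Hypothetical driver carrying the two optional callbacks from the patch. */
struct demo_driver {
	void (*on_config_changed)(const struct cfg_info *new_cfg,
				  const struct cfg_info *old_cfg);
	void (*on_scan_complete)(const struct cfg_info *new_cfg,
				 const struct cfg_info *old_cfg);
};

/* Counters so a caller can observe which callbacks fired. */
int config_changed_calls;
int scan_complete_calls;

void demo_config_changed(const struct cfg_info *new_cfg,
			 const struct cfg_info *old_cfg)
{
	(void)new_cfg;
	(void)old_cfg;
	config_changed_calls++;
}

void demo_scan_complete(const struct cfg_info *new_cfg,
			const struct cfg_info *old_cfg)
{
	(void)new_cfg;
	(void)old_cfg;
	scan_complete_calls++;
}

/*
 * One bus scan: notify drivers before devices are added or removed and
 * again afterwards, but only if the configuration changed since the
 * previous scan. The saved copy is refreshed at the end.
 */
void demo_scan_bus(struct demo_driver *drvs, size_t ndrv,
		   const struct cfg_info *current, struct cfg_info *previous)
{
	int changed = memcmp(current, previous, sizeof(*current)) != 0;
	size_t i;

	if (changed)
		for (i = 0; i < ndrv; i++)
			if (drvs[i].on_config_changed)
				drvs[i].on_config_changed(current, previous);

	/* ... device creation and removal would happen here ... */

	if (changed) {
		for (i = 0; i < ndrv; i++)
			if (drvs[i].on_scan_complete)
				drvs[i].on_scan_complete(current, previous);
		*previous = *current;
	}
}
```

A driver that leaves either callback NULL is simply skipped, which matches the NULL checks in the quoted helpers.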

Re: [PATCH v1 1/1] scsi: ufs: Fix ufs power down/on specs violation

2020-12-08 Thread Can Guo


On 2020-12-09 15:09, Ziqi Chen wrote:

As per specs, e.g., JESD220E chapter 7.2, while powering
off/on the ufs device, RST_N signal and REF_CLK signal
should be between VSS(Ground) and VCCQ/VCCQ2.

Power down:
1. Assert RST_N low
2. Turn-off REF_CLK
3. Turn-off VCC
4. Turn-off VCCQ/VCCQ2.
Power on:
1. Turn-on VCC
2. Turn-on VCCQ/VCCQ2
3. Turn-On REF_CLK
4. Deassert RST_N high.

Signed-off-by: Ziqi Chen 
---
 drivers/scsi/ufs/ufs-qcom.c | 14 ++
 drivers/scsi/ufs/ufshcd.c   | 19 +--
 drivers/scsi/ufs/ufshcd.h   |  4 ++--
 3 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index 1e434cc..5ed3a63d 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -582,6 +582,9 @@ static int ufs_qcom_suspend(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
ufs_qcom_disable_lane_clks(host);
phy_power_off(phy);

+   if (hba->vops && hba->vops->device_reset)
+   hba->vops->device_reset(hba, false);
+


Can you make a new wrapper func to do this? ufs-qcom.c is the vendor
driver, why call hba->vops->func() in vendor driver?
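
One possible shape for such a wrapper, as a self-contained userspace sketch. The types and names (struct demo_hba, struct demo_vops, demo_vops_device_reset(), demo_vendor_device_reset()) are simplified stand-ins invented for illustration, not the ufshcd definitions. The point is that the NULL checks live in one core-side helper, so no caller has to open-code hba->vops->device_reset().

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_hba;

/* Simplified stand-in for the vendor operations table. */
struct demo_vops {
	int (*device_reset)(struct demo_hba *hba, bool toggle);
};

struct demo_hba {
	const struct demo_vops *vops;
	int reset_line;	/* 1 = reset asserted, 0 = deasserted */
};

/* Core-side wrapper: callers never dereference hba->vops themselves. */
int demo_vops_device_reset(struct demo_hba *hba, bool toggle)
{
	if (hba->vops && hba->vops->device_reset)
		return hba->vops->device_reset(hba, toggle);

	return 0;	/* the hook is optional: nothing to do */
}

/* Example vendor hook: assert reset, optionally release it again. */
int demo_vendor_device_reset(struct demo_hba *hba, bool toggle)
{
	hba->reset_line = 1;
	if (toggle)
		hba->reset_line = 0;

	return 0;
}
```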


} else if (!ufs_qcom_is_link_active(hba)) {
ufs_qcom_disable_lane_clks(host);
}
@@ -1400,10 +1403,11 @@ static void ufs_qcom_dump_dbg_regs(struct 
ufs_hba *hba)

 /**
  * ufs_qcom_device_reset() - toggle the (optional) device reset line
  * @hba: per-adapter instance
+ * @toggle: need pulling up or not
  *
  * Toggles the (optional) reset line to reset the attached device.
  */
-static int ufs_qcom_device_reset(struct ufs_hba *hba)
+static int ufs_qcom_device_reset(struct ufs_hba *hba, bool toggle)
 {
struct ufs_qcom_host *host = ufshcd_get_variant(hba);

@@ -1416,10 +1420,12 @@ static int ufs_qcom_device_reset(struct ufs_hba 
*hba)

 * be on the safe side.
 */
gpiod_set_value_cansleep(host->device_reset, 1);
-   usleep_range(10, 15);

-   gpiod_set_value_cansleep(host->device_reset, 0);
-   usleep_range(10, 15);
+   if (toggle) {
+   usleep_range(10, 15);
+   gpiod_set_value_cansleep(host->device_reset, 0);
+   usleep_range(10, 15);
+   }

return 0;
 }
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 92d433d..5ab1c02 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -8633,8 +8633,6 @@ static int ufshcd_suspend(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
if (ret)
goto set_dev_active;

-   ufshcd_vreg_set_lpm(hba);
-
 disable_clks:
/*
 	 * Call vendor specific suspend callback. As these callbacks may 
access

@@ -8664,6 +8662,7 @@ static int ufshcd_suspend(struct ufs_hba *hba,
enum ufs_pm_op pm_op)

/* Put the host controller in low power mode if possible */
ufshcd_hba_vreg_set_lpm(hba);
+   ufshcd_vreg_set_lpm(hba);
goto out;

 set_link_active:
@@ -8729,18 +8728,18 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
old_link_state = hba->uic_link_state;

ufshcd_hba_vreg_set_hpm(hba);
+   ret = ufshcd_vreg_set_hpm(hba);
+   if (ret)
+   goto out;
+
/* Make sure clocks are enabled before accessing controller */
ret = ufshcd_setup_clocks(hba, true);
if (ret)
-   goto out;
+   goto disable_vreg;

/* enable the host irq as host controller would be active soon */
ufshcd_enable_irq(hba);

-   ret = ufshcd_vreg_set_hpm(hba);
-   if (ret)
-   goto disable_irq_and_vops_clks;
-
/*
 	 * Call vendor specific resume callback. As these callbacks may 
access

 * vendor specific host controller register space call them when the
@@ -8748,7 +8747,7 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
 */
ret = ufshcd_vops_resume(hba, pm_op);
if (ret)
-   goto disable_vreg;
+   goto disable_irq_and_vops_clks;

 	/* For DeepSleep, the only supported option is to have the link off 
*/
 	WARN_ON(ufshcd_is_ufs_dev_deepsleep(hba) && 
!ufshcd_is_link_off(hba));

@@ -8815,8 +8814,6 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
ufshcd_link_state_transition(hba, old_link_state, 0);
 vendor_suspend:
ufshcd_vops_suspend(hba, pm_op);
-disable_vreg:
-   ufshcd_vreg_set_lpm(hba);
 disable_irq_and_vops_clks:
ufshcd_disable_irq(hba);
if (hba->clk_scaling.is_allowed)
@@ -8827,6 +8824,8 @@ static int ufshcd_resume(struct ufs_hba *hba,
enum ufs_pm_op pm_op)
trace_ufshcd_clk_gating(dev_name(hba->dev),
hba->clk_gating.state);
}
+disable_vreg:
+   ufshcd_vreg_set_lpm(hba);
 out:
hba->pm_op_in_progress = 0;
if (ret)
diff --git a/drivers/scsi/ufs/ufshcd.h

[PATCH v1 1/1] scsi: ufs: Fix ufs power down/on specs violation

2020-12-08 Thread Ziqi Chen

As per specs, e.g., JESD220E chapter 7.2, while powering
off/on the ufs device, RST_N signal and REF_CLK signal
should be between VSS(Ground) and VCCQ/VCCQ2.

Power down:
1. Assert RST_N low
2. Turn-off REF_CLK
3. Turn-off VCC
4. Turn-off VCCQ/VCCQ2.
Power on:
1. Turn-on VCC
2. Turn-on VCCQ/VCCQ2
3. Turn-On REF_CLK
4. Deassert RST_N high.
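
The two sequences above can be written down as a small userspace model. This is a sketch only: struct ufs_rails, enum ufs_step and the demo_* functions are hypothetical names, and the booleans stand in for the real GPIO, clock and regulator calls. Each step is logged so the ordering can be checked.

```c
#include <stdbool.h>

enum ufs_step { STEP_RST_N, STEP_REF_CLK, STEP_VCC, STEP_VCCQ };

/* Model of the four signals plus a log of the order they were touched in. */
struct ufs_rails {
	bool rst_n_high;
	bool ref_clk_on;
	bool vcc_on;
	bool vccq_on;
	enum ufs_step log[8];
	int nlog;
};

static void record(struct ufs_rails *r, enum ufs_step step)
{
	if (r->nlog < 8)
		r->log[r->nlog++] = step;
}

/* Power down: RST_N low, REF_CLK off, VCC off, VCCQ/VCCQ2 off. */
void demo_power_down(struct ufs_rails *r)
{
	r->rst_n_high = false;
	record(r, STEP_RST_N);
	r->ref_clk_on = false;
	record(r, STEP_REF_CLK);
	r->vcc_on = false;
	record(r, STEP_VCC);
	r->vccq_on = false;
	record(r, STEP_VCCQ);
}

/* Power on: VCC on, VCCQ/VCCQ2 on, REF_CLK on, RST_N high. */
void demo_power_on(struct ufs_rails *r)
{
	r->vcc_on = true;
	record(r, STEP_VCC);
	r->vccq_on = true;
	record(r, STEP_VCCQ);
	r->ref_clk_on = true;
	record(r, STEP_REF_CLK);
	r->rst_n_high = true;
	record(r, STEP_RST_N);
}
```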

Signed-off-by: Ziqi Chen 
---
 drivers/scsi/ufs/ufs-qcom.c | 14 ++
 drivers/scsi/ufs/ufshcd.c   | 19 +--
 drivers/scsi/ufs/ufshcd.h   |  4 ++--
 3 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index 1e434cc..5ed3a63d 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -582,6 +582,9 @@ static int ufs_qcom_suspend(struct ufs_hba *hba, enum 
ufs_pm_op pm_op)
ufs_qcom_disable_lane_clks(host);
phy_power_off(phy);
 
+   if (hba->vops && hba->vops->device_reset)
+   hba->vops->device_reset(hba, false);
+
} else if (!ufs_qcom_is_link_active(hba)) {
ufs_qcom_disable_lane_clks(host);
}
@@ -1400,10 +1403,11 @@ static void ufs_qcom_dump_dbg_regs(struct ufs_hba *hba)
 /**
  * ufs_qcom_device_reset() - toggle the (optional) device reset line
  * @hba: per-adapter instance
+ * @toggle: need pulling up or not
  *
  * Toggles the (optional) reset line to reset the attached device.
  */
-static int ufs_qcom_device_reset(struct ufs_hba *hba)
+static int ufs_qcom_device_reset(struct ufs_hba *hba, bool toggle)
 {
struct ufs_qcom_host *host = ufshcd_get_variant(hba);
 
@@ -1416,10 +1420,12 @@ static int ufs_qcom_device_reset(struct ufs_hba *hba)
 * be on the safe side.
 */
gpiod_set_value_cansleep(host->device_reset, 1);
-   usleep_range(10, 15);
 
-   gpiod_set_value_cansleep(host->device_reset, 0);
-   usleep_range(10, 15);
+   if (toggle) {
+   usleep_range(10, 15);
+   gpiod_set_value_cansleep(host->device_reset, 0);
+   usleep_range(10, 15);
+   }
 
return 0;
 }
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 92d433d..5ab1c02 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -8633,8 +8633,6 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum 
ufs_pm_op pm_op)
if (ret)
goto set_dev_active;
 
-   ufshcd_vreg_set_lpm(hba);
-
 disable_clks:
/*
 * Call vendor specific suspend callback. As these callbacks may access
@@ -8664,6 +8662,7 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum 
ufs_pm_op pm_op)
 
/* Put the host controller in low power mode if possible */
ufshcd_hba_vreg_set_lpm(hba);
+   ufshcd_vreg_set_lpm(hba);
goto out;
 
 set_link_active:
@@ -8729,18 +8728,18 @@ static int ufshcd_resume(struct ufs_hba *hba, enum 
ufs_pm_op pm_op)
old_link_state = hba->uic_link_state;
 
ufshcd_hba_vreg_set_hpm(hba);
+   ret = ufshcd_vreg_set_hpm(hba);
+   if (ret)
+   goto out;
+
/* Make sure clocks are enabled before accessing controller */
ret = ufshcd_setup_clocks(hba, true);
if (ret)
-   goto out;
+   goto disable_vreg;
 
/* enable the host irq as host controller would be active soon */
ufshcd_enable_irq(hba);
 
-   ret = ufshcd_vreg_set_hpm(hba);
-   if (ret)
-   goto disable_irq_and_vops_clks;
-
/*
 * Call vendor specific resume callback. As these callbacks may access
 * vendor specific host controller register space call them when the
@@ -8748,7 +8747,7 @@ static int ufshcd_resume(struct ufs_hba *hba, enum 
ufs_pm_op pm_op)
 */
ret = ufshcd_vops_resume(hba, pm_op);
if (ret)
-   goto disable_vreg;
+   goto disable_irq_and_vops_clks;
 
/* For DeepSleep, the only supported option is to have the link off */
WARN_ON(ufshcd_is_ufs_dev_deepsleep(hba) && !ufshcd_is_link_off(hba));
@@ -8815,8 +8814,6 @@ static int ufshcd_resume(struct ufs_hba *hba, enum 
ufs_pm_op pm_op)
ufshcd_link_state_transition(hba, old_link_state, 0);
 vendor_suspend:
ufshcd_vops_suspend(hba, pm_op);
-disable_vreg:
-   ufshcd_vreg_set_lpm(hba);
 disable_irq_and_vops_clks:
ufshcd_disable_irq(hba);
if (hba->clk_scaling.is_allowed)
@@ -8827,6 +8824,8 @@ static int ufshcd_resume(struct ufs_hba *hba, enum 
ufs_pm_op pm_op)
trace_ufshcd_clk_gating(dev_name(hba->dev),
hba->clk_gating.state);
}
+disable_vreg:
+   ufshcd_vreg_set_lpm(hba);
 out:
hba->pm_op_in_progress = 0;
if (ret)
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 61344c4..47c7dab6 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -323,7 +323,7 @@ struct

Re: "irq 4: Affinity broken due to vector space exhaustion." warning on restart of ttyS0 console

2020-12-08 Thread Shung-Hsi Yu

On Wed, Dec 09, 2020 at 02:33:04PM +0800, Shung-Hsi Yu wrote:
> Hi Thomas,
> 
> On Tue, Nov 10, 2020 at 09:56:27PM +0100, Thomas Gleixner wrote:
> > The real problem is irqbalanced aggressively exhausting the vector space
> > of a _whole_ socket to the point that there is not a single vector left
> > for serial. That's the problem you want to fix.
> 
> I believe this warning also gets triggered even when there's _no_ vector
> exhaustion.
> 
> This seems to happen when the IRQ's affinity mask is set (wrongly) to CPUs on
> a different NUMA node (e.g. cpumask_of_node(1) when the irqd->irq == 0).
 typo ^^

Should be "affinity mask set to cpumask_of_node(1) when
irq_data_get_node(irqd) == 0".


Shung-Hsi

>   $ lscpu
>   ...
>   NUMA node0 CPU(s):   0-25,52-77
>   NUMA node1 CPU(s):   26-51,78-103
> 
>   $ cat /sys/kernel/debug/tracing/trace
>...
>   irqbalance-1994[017] d...74.912799: irq_matrix_alloc: bit=33 cpu=26 
> online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, 
> global_rsvd=341, total_alloc=217
>   irqbalance-1994[017] d...74.912802: vector_alloc: irq=4 vector=33 
> reserved=0 ret=0
>   irqbalance-1994[017] d...74.912804: vector_update: irq=4 vector=33 
> cpu=26 prev_vector=33 prev_cpu=7
>   irqbalance-1994[017] d...74.912805: vector_config: irq=4 vector=33 
> cpu=26 apicdest=0x0040
>   -0   [007] d.h.74.970733: vector_free_moved: irq=4 cpu=7 
> vector=33 is_managed=0
>   -0   [007] d.h.74.970738: irq_matrix_free: bit=33 cpu=7 
> online=1 avl=200 alloc=1 managed=1 online_maps=104 global_avl=20687, 
> global_rsvd=341, total_alloc=217
>...
> (agetty)-3004[047] d...81.731231: vector_deactivate: irq=4 
> is_managed=0 can_reserve=1 reserve=0
> (agetty)-3004[047] d...81.738035: vector_clear: irq=4 vector=33 
> cpu=26 prev_vector=0 prev_cpu=7
> (agetty)-3004[047] d...81.738040: irq_matrix_free: bit=33 cpu=26 
> online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20689, 
> global_rsvd=341, total_alloc=215
> (agetty)-3004[047] d...81.738046: irq_matrix_reserve: 
> online_maps=104 global_avl=20689, global_rsvd=342, total_alloc=215
> (agetty)-3004[047] d...81.766739: vector_reserve: irq=4 ret=0
> (agetty)-3004[047] d...81.766741: vector_config: irq=4 vector=239 
> cpu=0 apicdest=0x
> (agetty)-3004[047] d...81.777152: vector_activate: irq=4 
> is_managed=0 can_reserve=1 reserve=0
> (agetty)-3004[047] d...81.777157: vector_alloc: irq=4 vector=0 
> reserved=1 ret=-22
> > irq_matrix_alloc() failed with
>   EINVAL because the cpumask
>   passed in is empty, which is a
>   result of affmask being
>   (ff,c000,000f,fc00)
>   and cpumask_of_node(node)
>   being
>   
> (00,3fff,fff0,03ff). 
> 
> (agetty)-3004[047] d...81.789349: irq_matrix_alloc: bit=33 cpu=1 
> online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20688, 
> global_rsvd=341, total_alloc=216
> (agetty)-3004[047] d...81.789351: vector_alloc: irq=4 vector=33 
> reserved=1 ret=0
> (agetty)-3004[047] d...81.789353: vector_update: irq=4 vector=33 
> cpu=1 prev_vector=0 prev_cpu=26
> (agetty)-3004[047] d...81.789355: vector_config: irq=4 vector=33 
> cpu=1 apicdest=0x0002
> > "irq 4: Affinity broken due to
>   vector space exhaustion."
>   warning shows up
> 
> (agetty)-3004[047] d...81.900783: irq_matrix_alloc: bit=33 cpu=26 
> online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, 
> global_rsvd=341, total_alloc=217
> (agetty)-3004[047] d...82.053535: vector_alloc: irq=4 vector=33 
> reserved=0 ret=0
> (agetty)-3004[047] d...82.053536: vector_update: irq=4 vector=33 
> cpu=26 prev_vector=33 prev_cpu=1
> (agetty)-3004[047] d...82.053538: vector_config: irq=4 vector=33 
> cpu=26 apicdest=0x0040
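
The empty-mask condition annotated in the trace can be reproduced with a small userspace bitmap model. The CPU ranges come from the lscpu output quoted above; everything else (struct demo_mask, the demo_* helpers, demo_alloc_vector()) is a hypothetical stand-in rather than the kernel's cpumask or irq_matrix API. With the affinity mask on node 1 and the IRQ's node being 0, the two masks share no CPU, so the allocation step has nothing to search and fails with -EINVAL even though plenty of vectors are free.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_CPUS 104
#define DEMO_WORDS ((DEMO_NR_CPUS + 63) / 64)

struct demo_mask {
	uint64_t bits[DEMO_WORDS];
};

/* Set every CPU in [first, last] in the mask. */
void demo_mask_set_range(struct demo_mask *m, unsigned int first,
			 unsigned int last)
{
	unsigned int cpu;

	for (cpu = first; cpu <= last && cpu < DEMO_NR_CPUS; cpu++)
		m->bits[cpu / 64] |= UINT64_C(1) << (cpu % 64);
}

/* Node layout taken from the quoted lscpu output. */
struct demo_mask demo_node_mask(int node)
{
	struct demo_mask m = { { 0 } };

	if (node == 0) {
		demo_mask_set_range(&m, 0, 25);
		demo_mask_set_range(&m, 52, 77);
	} else {
		demo_mask_set_range(&m, 26, 51);
		demo_mask_set_range(&m, 78, 103);
	}
	return m;
}

bool demo_mask_intersects(const struct demo_mask *a, const struct demo_mask *b)
{
	int i;

	for (i = 0; i < DEMO_WORDS; i++)
		if (a->bits[i] & b->bits[i])
			return true;
	return false;
}

/* An empty search mask cannot yield a vector, whatever the free count is. */
int demo_alloc_vector(const struct demo_mask *affinity,
		      const struct demo_mask *node)
{
	return demo_mask_intersects(affinity, node) ? 0 : -EINVAL;
}
```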

[PATCH v3 3/6] Drivers: hv: vmbus: Copy the hv_message in vmbus_on_msg_dpc()

2020-12-08 Thread Andrea Parri (Microsoft)

Since the message is in memory shared with the host, an erroneous or a
malicious Hyper-V could 'corrupt' the message while vmbus_on_msg_dpc()
or individual message handlers are executing.  To prevent it, copy the
message into private memory.
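
The idea can be shown outside the kernel with a minimal sketch. The names (struct demo_msg, DEMO_PAYLOAD_MAX, demo_fetch_msg()) are assumptions made for illustration and do not match the Hyper-V structures. The snapshot is taken first, and every later check reads only the private copy, so a writer that modifies the shared buffer afterwards cannot change a value that has already been validated.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_PAYLOAD_MAX 240	/* illustrative limit */

struct demo_msg {
	uint32_t type;		/* 0 means "no message" in this sketch */
	uint8_t payload_size;
	uint8_t payload[DEMO_PAYLOAD_MAX];
};

/*
 * Snapshot first, validate second. All checks read 'out', never 'shared'.
 * Returns 0 on success, -1 if the snapshot is empty or fails validation.
 */
int demo_fetch_msg(const struct demo_msg *shared, struct demo_msg *out)
{
	memcpy(out, shared, sizeof(*out));

	if (out->type == 0)
		return -1;
	if (out->payload_size > DEMO_PAYLOAD_MAX)
		return -1;

	return 0;
}
```

After a successful call, handlers would be given 'out' only; the shared buffer is not consulted again for that message.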

Reported-by: Juan Vazquez 
Signed-off-by: Andrea Parri (Microsoft) 
---
Changes since v2:
  - Revisit commit message and inline comment

 drivers/hv/vmbus_drv.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 44bcf9ccdaf5f..b1c5a89d75f9d 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1054,14 +1054,14 @@ void vmbus_on_msg_dpc(unsigned long data)
 {
struct hv_per_cpu_context *hv_cpu = (void *)data;
void *page_addr = hv_cpu->synic_message_page;
-   struct hv_message *msg = (struct hv_message *)page_addr +
+   struct hv_message msg_copy, *msg = (struct hv_message *)page_addr +
  VMBUS_MESSAGE_SINT;
struct vmbus_channel_message_header *hdr;
enum vmbus_channel_message_type msgtype;
const struct vmbus_channel_message_table_entry *entry;
struct onmessage_work_context *ctx;
-   u32 message_type = msg->header.message_type;
__u8 payload_size;
+   u32 message_type;
 
/*
 * 'enum vmbus_channel_message_type' is supposed to always be 'u32' as
@@ -1070,11 +1070,20 @@ void vmbus_on_msg_dpc(unsigned long data)
 */
BUILD_BUG_ON(sizeof(enum vmbus_channel_message_type) != sizeof(u32));
 
+   /*
+* Since the message is in memory shared with the host, an erroneous or
+* malicious Hyper-V could modify the message while vmbus_on_msg_dpc()
+* or individual message handlers are executing; to prevent this, copy
+* the message into private memory.
+*/
+   memcpy(&msg_copy, msg, sizeof(struct hv_message));
+
+   message_type = msg_copy.header.message_type;
if (message_type == HVMSG_NONE)
/* no msg */
return;
 
-   hdr = (struct vmbus_channel_message_header *)msg->u.payload;
+   hdr = (struct vmbus_channel_message_header *)msg_copy.u.payload;
msgtype = hdr->msgtype;
 
trace_vmbus_on_msg_dpc(hdr);
@@ -1084,7 +1093,7 @@ void vmbus_on_msg_dpc(unsigned long data)
goto msg_handled;
}
 
-   payload_size = msg->header.payload_size;
+   payload_size = msg_copy.header.payload_size;
if (payload_size > HV_MESSAGE_PAYLOAD_BYTE_COUNT) {
WARN_ONCE(1, "payload size is too large (%d)\n", payload_size);
goto msg_handled;
@@ -1106,7 +1115,7 @@ void vmbus_on_msg_dpc(unsigned long data)
return;
 
INIT_WORK(&ctx->work, vmbus_onmessage_work);
-   memcpy(&ctx->msg, msg, sizeof(msg->header) + payload_size);
+   memcpy(&ctx->msg, &msg_copy, sizeof(msg->header) + payload_size);
 
/*
 * The host can generate a rescind message while we
-- 
2.25.1

[PATCH] checkpatch: Add printk_once and printk_ratelimit to prefer pr_ warning

2020-12-08 Thread Joe Perches

Add the _once and _ratelimited variants to the test for
printk(KERN_ that should prefer pr_.

Miscellanea:

o Add comment description for the conversions
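
The name mapping performed by the updated test can be sketched in C for illustration. demo_suggest_pr() is a hypothetical helper, not part of checkpatch: it lowercases the KERN_ level, shortens "warning" to "warn", and appends the _once or _ratelimited modifier to form the preferred pr_ name.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the pr_<level><modifier> name preferred over
 * printk<modifier>(KERN_<LEVEL> ...).
 * kern_level: text after "KERN_", e.g. "WARNING".
 * modifier:   "", "_once" or "_ratelimited".
 * Returns 0 on success, -1 if an input or the result does not fit.
 */
int demo_suggest_pr(const char *kern_level, const char *modifier,
		    char *out, size_t outlen)
{
	char level[16];
	size_t i, n = strlen(kern_level);
	int ret;

	if (n >= sizeof(level))
		return -1;

	for (i = 0; i < n; i++)
		level[i] = (char)tolower((unsigned char)kern_level[i]);
	level[n] = '\0';

	/* KERN_WARNING maps to pr_warn, not pr_warning. */
	if (strcmp(level, "warning") == 0)
		strcpy(level, "warn");

	ret = snprintf(out, outlen, "pr_%s%s", level, modifier);
	return (ret < 0 || (size_t)ret >= outlen) ? -1 : 0;
}
```

For the dev_ and subsystem variants the patch additionally maps "debug" to "dbg"; that step is left out here to keep the sketch short.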

Signed-off-by: Joe Perches 
---
 scripts/checkpatch.pl | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 7b086d1cd6c2..52f467fd32f9 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -4543,16 +4543,22 @@ sub process {
 "printk() should include KERN_ facility 
level\n" . $herecurr);
}
 
-   if ($line =~ /\bprintk\s*\(\s*KERN_([A-Z]+)/) {
-   my $orig = $1;
+# prefer variants of (subsystem|netdev|dev|pr)_ to printk(KERN_
+   if ($line =~ 
/\b(printk(_once|_ratelimited)?)\s*\(\s*KERN_([A-Z]+)/) {
+   my $printk = $1;
+   my $modifier = $2;
+   my $orig = $3;
my $level = lc($orig);
$level = "warn" if ($level eq "warning");
my $level2 = $level;
$level2 = "dbg" if ($level eq "debug");
+   $level .= $modifier;
+   $level2 .= $modifier;
WARN("PREFER_PR_LEVEL",
-"Prefer [subsystem eg: 
netdev]_$level2([subsystem]dev, ... then dev_$level2(dev, ... then 
pr_$level(...  to printk(KERN_$orig ...\n" . $herecurr);
+"Prefer [subsystem eg: 
netdev]_$level2([subsystem]dev, ... then dev_$level2(dev, ... then 
pr_$level(...  to $printk(KERN_$orig ...\n" . $herecurr);
}
 
+# prefer dev_ to dev_printk(KERN_
if ($line =~ /\bdev_printk\s*\(\s*KERN_([A-Z]+)/) {
my $orig = $1;
my $level = lc($orig);

[PATCH v3 1/6] Drivers: hv: vmbus: Initialize memory to be sent to the host

2020-12-08 Thread Andrea Parri (Microsoft)

__vmbus_open() and vmbus_teardown_gpadl() do not initialize the memory
for the vmbus_channel_open_channel and the vmbus_channel_gpadl_teardown
objects they allocate respectively.  These objects contain padding bytes
and fields that are left uninitialized and that are later sent to the
host, potentially leaking guest data.  Zero initialize such fields to
avoid leaking sensitive information to the host.
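
A userspace analogue may make the concern concrete. In this sketch calloc() plays the role of kzalloc(); struct demo_open_msg and the helper names are hypothetical. Because the whole allocation starts out zeroed, any padding the compiler inserts between members, and any field the caller never writes, carries no stale heap contents when the buffer is later copied out byte by byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_open_msg {
	uint8_t msgtype;
	uint32_t child_relid;	/* compilers typically pad before this member */
	uint8_t userdata[6];
};

/* Zeroing allocation: every byte, padding included, starts as zero. */
struct demo_open_msg *demo_msg_alloc(void)
{
	return calloc(1, sizeof(struct demo_open_msg));
}

/* Count bytes that would expose non-zero data if the buffer were sent as is. */
size_t demo_count_nonzero(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (p[i])
			n++;
	return n;
}
```

With malloc() in place of calloc() the same count is indeterminate, which is the situation the patch removes.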

Reported-by: Juan Vazquez 
Signed-off-by: Andrea Parri (Microsoft) 
Reviewed-by: Michael Kelley 
---
Changes since v2:
  - Add Reviewed-by: tag

 drivers/hv/channel.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 0d63862d65518..9aa789e5f22bb 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -621,7 +621,7 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
goto error_clean_ring;
 
/* Create and init the channel open message */
-   open_info = kmalloc(sizeof(*open_info) +
+   open_info = kzalloc(sizeof(*open_info) +
   sizeof(struct vmbus_channel_open_channel),
   GFP_KERNEL);
if (!open_info) {
@@ -748,7 +748,7 @@ int vmbus_teardown_gpadl(struct vmbus_channel *channel, u32 
gpadl_handle)
unsigned long flags;
int ret;
 
-   info = kmalloc(sizeof(*info) +
+   info = kzalloc(sizeof(*info) +
   sizeof(struct vmbus_channel_gpadl_teardown), GFP_KERNEL);
if (!info)
return -ENOMEM;
-- 
2.25.1

[PATCH v3 5/6] Drivers: hv: vmbus: Resolve race condition in vmbus_onoffer_rescind()

2020-12-08 Thread Andrea Parri (Microsoft)

An erroneous or malicious host could send multiple rescind messages for
the same channel.  In vmbus_onoffer_rescind(), the guest maps the channel
ID to obtain a pointer to the channel object and it eventually releases
such object and associated data.  The host could time rescind messages
and lead to a use-after-free.  Add a new flag to the channel structure
to make sure that only one instance of vmbus_onoffer_rescind() can get
the reference to the channel object.
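
The claim-once pattern can be sketched in userspace with a pthread mutex standing in for the channel mutex. struct demo_channel, demo_channel_mutex and demo_claim_for_rescind() are hypothetical names for illustration. The flag is tested and set under the same lock, so of any number of racing callers exactly one sees it clear and proceeds.

```c
#include <pthread.h>
#include <stdbool.h>

struct demo_channel {
	bool rescind_ref;	/* set once a rescind handler owns the channel */
};

static pthread_mutex_t demo_channel_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns true for exactly one caller per channel. Later callers get
 * false and must return without touching the channel any further.
 */
bool demo_claim_for_rescind(struct demo_channel *ch)
{
	bool claimed = false;

	pthread_mutex_lock(&demo_channel_mutex);
	if (!ch->rescind_ref) {
		ch->rescind_ref = true;
		claimed = true;
	}
	pthread_mutex_unlock(&demo_channel_mutex);

	return claimed;
}
```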

Reported-by: Juan Vazquez 
Signed-off-by: Andrea Parri (Microsoft) 
---
 drivers/hv/channel_mgmt.c | 12 
 include/linux/hyperv.h|  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 4072fd1f22146..68950a1e4b638 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -1063,6 +1063,18 @@ static void vmbus_onoffer_rescind(struct 
vmbus_channel_message_header *hdr)
 
mutex_lock(&vmbus_connection.channel_mutex);
channel = relid2channel(rescind->child_relid);
+   if (channel != NULL) {
+   /*
+* Guarantee that no other instance of vmbus_onoffer_rescind()
+* has got a reference to the channel object.  Synchronize on
+* &vmbus_connection.channel_mutex.
+*/
+   if (channel->rescind_ref) {
+   mutex_unlock(&vmbus_connection.channel_mutex);
+   return;
+   }
+   channel->rescind_ref = true;
+   }
mutex_unlock(&vmbus_connection.channel_mutex);
 
if (channel == NULL) {
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2ea967bc17adf..f0d48a368f131 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -809,6 +809,7 @@ struct vmbus_channel {
u8 monitor_bit;
 
bool rescind; /* got rescind msg */
+   bool rescind_ref; /* got rescind msg, got channel reference */
struct completion rescind_event;
 
u32 ringbuffer_gpadlhandle;
-- 
2.25.1

[PATCH v3 6/6] Drivers: hv: vmbus: Do not allow overwriting vmbus_connection.channels[]

2020-12-08 Thread Andrea Parri (Microsoft)

Currently, vmbus_onoffer() and vmbus_process_offer() are not validating
whether a given entry in the vmbus_connection.channels[] array is empty
before filling the entry with a call of vmbus_channel_map_relid().  An
erroneous or malicious host could rely on this to leak channel objects.
Do not allow overwriting an entry in vmbus_connection.channels[].
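
The check can be sketched with a plain array in userspace. The table size and the names (struct demo_chan, demo_channels, demo_map_relid(), demo_unmap_relid()) are assumptions for illustration, and the memory-ordering aspects of the real code are left out. A slot is filled only if the index is in range and the slot is currently empty; otherwise the caller gets -EINVAL and can release the new object instead of silently replacing the old one.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_RELIDS 16	/* illustrative table size */

struct demo_chan {
	uint32_t relid;
};

static struct demo_chan *demo_channels[DEMO_MAX_RELIDS];

/* Map a channel into its slot; refuse out-of-range or occupied slots. */
int demo_map_relid(struct demo_chan *ch)
{
	if (ch->relid >= DEMO_MAX_RELIDS || demo_channels[ch->relid] != NULL)
		return -EINVAL;

	demo_channels[ch->relid] = ch;
	return 0;
}

/* Clear the slot so that the relid can be mapped again later. */
void demo_unmap_relid(struct demo_chan *ch)
{
	if (ch->relid >= DEMO_MAX_RELIDS)
		return;

	demo_channels[ch->relid] = NULL;
}
```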

Reported-by: Juan Vazquez 
Signed-off-by: Andrea Parri (Microsoft) 
---
Changes since v2:
  - Release channel_mutex before 'return' in vmbus_onoffer() error path

 drivers/hv/channel_mgmt.c | 40 +--
 drivers/hv/hyperv_vmbus.h |  2 +-
 2 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 68950a1e4b638..2c15693b86f1e 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -354,10 +354,12 @@ static void free_channel(struct vmbus_channel *channel)
kobject_put(&channel->kobj);
 }
 
-void vmbus_channel_map_relid(struct vmbus_channel *channel)
+int vmbus_channel_map_relid(struct vmbus_channel *channel)
 {
-   if (WARN_ON(channel->offermsg.child_relid >= MAX_CHANNEL_RELIDS))
-   return;
+   u32 relid = channel->offermsg.child_relid;
+
+   if (WARN_ON(relid >= MAX_CHANNEL_RELIDS || 
vmbus_connection.channels[relid] != NULL))
+   return -EINVAL;
/*
 * The mapping of the channel's relid is visible from the CPUs that
 * execute vmbus_chan_sched() by the time that vmbus_chan_sched() will
@@ -383,18 +385,17 @@ void vmbus_channel_map_relid(struct vmbus_channel 
*channel)
 *  of the VMBus driver and vmbus_chan_sched() can not run before
 *  vmbus_bus_resume() has completed execution (cf. resume_noirq).
 */
-   smp_store_mb(
-   vmbus_connection.channels[channel->offermsg.child_relid],
-   channel);
+   smp_store_mb(vmbus_connection.channels[relid], channel);
+   return 0;
 }
 
 void vmbus_channel_unmap_relid(struct vmbus_channel *channel)
 {
-   if (WARN_ON(channel->offermsg.child_relid >= MAX_CHANNEL_RELIDS))
+   u32 relid = channel->offermsg.child_relid;
+
+   if (WARN_ON(relid >= MAX_CHANNEL_RELIDS))
return;
-   WRITE_ONCE(
-   vmbus_connection.channels[channel->offermsg.child_relid],
-   NULL);
+   WRITE_ONCE(vmbus_connection.channels[relid], NULL);
 }
 
 static void vmbus_release_relid(u32 relid)
@@ -601,6 +602,12 @@ static void vmbus_process_offer(struct vmbus_channel 
*newchannel)
 */
atomic_dec(&vmbus_connection.offer_in_progress);
 
+   if (vmbus_channel_map_relid(newchannel)) {
+   mutex_unlock(&vmbus_connection.channel_mutex);
+   kfree(newchannel);
+   return;
+   }
+
list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
if (guid_equal(&channel->offermsg.offer.if_type,
   &newchannel->offermsg.offer.if_type) &&
@@ -619,6 +626,7 @@ static void vmbus_process_offer(struct vmbus_channel 
*newchannel)
 * Check to see if this is a valid sub-channel.
 */
if (newchannel->offermsg.offer.sub_channel_index == 0) {
+   vmbus_channel_unmap_relid(newchannel);
mutex_unlock(&vmbus_connection.channel_mutex);
/*
 * Don't call free_channel(), because newchannel->kobj
@@ -635,8 +643,6 @@ static void vmbus_process_offer(struct vmbus_channel 
*newchannel)
list_add_tail(&newchannel->sc_list, &channel->sc_list);
}
 
-   vmbus_channel_map_relid(newchannel);
-
mutex_unlock(&vmbus_connection.channel_mutex);
cpus_read_unlock();
 
@@ -920,6 +926,8 @@ static void vmbus_onoffer(struct 
vmbus_channel_message_header *hdr)
oldchannel = find_primary_channel_by_offer(offer);
 
if (oldchannel != NULL) {
+   u32 relid = offer->child_relid;
+
/*
 * We're resuming from hibernation: all the sub-channel and
 * hv_sock channels we had before the hibernation should have
@@ -954,8 +962,12 @@ static void vmbus_onoffer(struct 
vmbus_channel_message_header *hdr)
atomic_dec(&vmbus_connection.offer_in_progress);
 
WARN_ON(oldchannel->offermsg.child_relid != INVALID_RELID);
+   if (WARN_ON(vmbus_connection.channels[relid] != NULL)) {
+   mutex_unlock(&vmbus_connection.channel_mutex);
+   return;
+   }
/* Fix up the relid. */
-   oldchannel->offermsg.child_relid = offer->child_relid;
+   oldchannel->offermsg.child_relid = relid;
 
offer_sz = sizeof(*offer);
if (memcmp(offer, &oldchannel->offermsg, offer_sz) != 0) {
@@ -967,7 +979,7 @@ static void vmbus_onoffer(struct 
vmbus_channel_message_header *hdr)
 * reoffers the device upon

[PATCH v3 2/6] Drivers: hv: vmbus: Reduce number of references to message in vmbus_on_msg_dpc()

2020-12-08 Thread Andrea Parri (Microsoft)

Simplify the function by removing various references to the hv_message
'msg', introduce local variables 'msgtype' and 'payload_size'.

Suggested-by: Juan Vazquez 
Suggested-by: Michael Kelley 
Signed-off-by: Andrea Parri (Microsoft) 
---
Changes since v2:
  - Squash patches #2 and #3
  - Revisit commit message

 drivers/hv/vmbus_drv.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 502f8cd95f6d4..44bcf9ccdaf5f 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1057,9 +1057,11 @@ void vmbus_on_msg_dpc(unsigned long data)
struct hv_message *msg = (struct hv_message *)page_addr +
  VMBUS_MESSAGE_SINT;
struct vmbus_channel_message_header *hdr;
+   enum vmbus_channel_message_type msgtype;
const struct vmbus_channel_message_table_entry *entry;
struct onmessage_work_context *ctx;
u32 message_type = msg->header.message_type;
+   __u8 payload_size;
 
/*
 * 'enum vmbus_channel_message_type' is supposed to always be 'u32' as
@@ -1073,40 +1075,38 @@ void vmbus_on_msg_dpc(unsigned long data)
return;
 
hdr = (struct vmbus_channel_message_header *)msg->u.payload;
+   msgtype = hdr->msgtype;
 
trace_vmbus_on_msg_dpc(hdr);
 
-   if (hdr->msgtype >= CHANNELMSG_COUNT) {
-   WARN_ONCE(1, "unknown msgtype=%d\n", hdr->msgtype);
+   if (msgtype >= CHANNELMSG_COUNT) {
+   WARN_ONCE(1, "unknown msgtype=%d\n", msgtype);
goto msg_handled;
}
 
-   if (msg->header.payload_size > HV_MESSAGE_PAYLOAD_BYTE_COUNT) {
-   WARN_ONCE(1, "payload size is too large (%d)\n",
- msg->header.payload_size);
+   payload_size = msg->header.payload_size;
+   if (payload_size > HV_MESSAGE_PAYLOAD_BYTE_COUNT) {
+   WARN_ONCE(1, "payload size is too large (%d)\n", payload_size);
goto msg_handled;
}
 
-   entry = &channel_message_table[hdr->msgtype];
+   entry = &channel_message_table[msgtype];
 
if (!entry->message_handler)
goto msg_handled;
 
-   if (msg->header.payload_size < entry->min_payload_len) {
-   WARN_ONCE(1, "message too short: msgtype=%d len=%d\n",
- hdr->msgtype, msg->header.payload_size);
+   if (payload_size < entry->min_payload_len) {
+   WARN_ONCE(1, "message too short: msgtype=%d len=%d\n", msgtype, 
payload_size);
goto msg_handled;
}
 
if (entry->handler_type == VMHT_BLOCKING) {
-   ctx = kmalloc(sizeof(*ctx) + msg->header.payload_size,
- GFP_ATOMIC);
+   ctx = kmalloc(sizeof(*ctx) + payload_size, GFP_ATOMIC);
if (ctx == NULL)
return;
 
INIT_WORK(&ctx->work, vmbus_onmessage_work);
-   memcpy(&ctx->msg, msg, sizeof(msg->header) +
-  msg->header.payload_size);
+   memcpy(&ctx->msg, msg, sizeof(msg->header) + payload_size);
 
/*
 * The host can generate a rescind message while we
@@ -1115,7 +1115,7 @@ void vmbus_on_msg_dpc(unsigned long data)
 * by offer_in_progress and by channel_mutex.  See also the
 * inline comments in vmbus_onoffer_rescind().
 */
-   switch (hdr->msgtype) {
+   switch (msgtype) {
case CHANNELMSG_RESCIND_CHANNELOFFER:
/*
 * If we are handling the rescind message;
-- 
2.25.1

[PATCH v3 0/6] Drivers: hv: vmbus: More VMBus-hardening changes

2020-12-08 Thread Andrea Parri (Microsoft)

Integrating feedback from Juan, Michael and Wei. [1]  Changelogs are
inline/in the patches.

Thanks,
  Andrea

[1] https://lkml.kernel.org/r/20201202092214.13520-1-parri.and...@gmail.com

Andrea Parri (Microsoft) (6):
  Drivers: hv: vmbus: Initialize memory to be sent to the host
  Drivers: hv: vmbus: Reduce number of references to message in
vmbus_on_msg_dpc()
  Drivers: hv: vmbus: Copy the hv_message in vmbus_on_msg_dpc()
  Drivers: hv: vmbus: Avoid use-after-free in vmbus_onoffer_rescind()
  Drivers: hv: vmbus: Resolve race condition in vmbus_onoffer_rescind()
  Drivers: hv: vmbus: Do not allow overwriting
vmbus_connection.channels[]

 drivers/hv/channel.c  |  4 +--
 drivers/hv/channel_mgmt.c | 55 +++
 drivers/hv/hyperv_vmbus.h |  2 +-
 drivers/hv/vmbus_drv.c| 43 ++
 include/linux/hyperv.h|  1 +
 5 files changed, 69 insertions(+), 36 deletions(-)

-- 
2.25.1

[PATCH v3 4/6] Drivers: hv: vmbus: Avoid use-after-free in vmbus_onoffer_rescind()

2020-12-08 Thread Andrea Parri (Microsoft)

When channel->device_obj is non-NULL, vmbus_onoffer_rescind() could
invoke put_device(), that will eventually release the device and free
the channel object (cf. vmbus_device_release()).  However, a pointer
to the object is dereferenced again later to load the primary_channel.
The use-after-free can be avoided by noticing that this load/check is
redundant if device_obj is non-NULL: primary_channel must be NULL if
device_obj is non-NULL, cf. vmbus_add_channel_work().

Fixes: 54a66265d6754b ("Drivers: hv: vmbus: Fix rescind handling")
Reported-by: Juan Vazquez 
Signed-off-by: Andrea Parri (Microsoft) 
Reviewed-by: Michael Kelley 
---
Changes since v2:
  - Add Reviewed-by: tag

 drivers/hv/channel_mgmt.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 5bc5eef5da159..4072fd1f22146 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -1116,8 +1116,7 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)
vmbus_device_unregister(channel->device_obj);
put_device(dev);
}
-   }
-   if (channel->primary_channel != NULL) {
+   } else if (channel->primary_channel != NULL) {
/*
 * Sub-channel is being rescinded. Following is the channel
 * close sequence when initiated from the driveri (refer to
-- 
2.25.1
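
The control-flow change in this patch can be illustrated outside the kernel. The sketch below is only an analogy under stated assumptions: struct chan, put_device_and_free() and rescind() are hypothetical stand-ins, and free() plays the role of the release that put_device() may eventually trigger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct chan {
	bool has_device;      /* stands in for channel->device_obj != NULL */
	struct chan *primary; /* stands in for channel->primary_channel */
};

/* Dropping the last reference frees the object, as the release callback would. */
static void put_device_and_free(struct chan *c)
{
	free(c);
}

/* Returns 1 for the device path, 2 for the sub-channel path, 0 otherwise. */
int rescind(struct chan *c)
{
	if (c->has_device) {
		put_device_and_free(c);
		/* c must not be dereferenced from here on */
		return 1;
	} else if (c->primary != NULL) {
		/* only reached when no put happened, so c is still valid */
		return 2;
	}
	return 0;
}
```

With two separate if statements, the second condition would load c->primary after the object may already have been freed; the else-if makes the two paths mutually exclusive.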

Re: linux-next: build warning after merge of the akpm tree

2020-12-08 Thread Stephen Rothwell

Hi Michael,

On Wed, 09 Dec 2020 15:44:35 +1100 Michael Ellerman  wrote:
>
> They should really be in DATA_DATA or similar shouldn't they?

No other architecture appears to need them ...

-- 
Cheers,
Stephen Rothwell



Re: [External] Re: [PATCH v2] mm: memcontrol: optimize per-lruvec stats counter memory usage

2020-12-08 Thread Muchun Song

On Wed, Dec 9, 2020 at 11:52 AM Roman Gushchin  wrote:
>
> On Wed, Dec 09, 2020 at 10:31:55AM +0800, Muchun Song wrote:
> > On Wed, Dec 9, 2020 at 10:21 AM Roman Gushchin  wrote:
> > >
> > > On Tue, Dec 08, 2020 at 05:51:32PM +0800, Muchun Song wrote:
> > > > The vmstat threshold is 32 (MEMCG_CHARGE_BATCH), so the type of s32
> > > > of lruvec_stat_cpu is enough.
>
> Actually the threshold can be as big as MEMCG_CHARGE_BATCH * PAGE_SIZE.
> It still fits into s32, but without explicitly saying it it's hard to
> understand why not choosing s8, as in vmstat.c.

Yeah, here I need to update the commit log.

>
> > > >
> > > > The size of struct lruvec_stat is 304 bytes on 64 bits system. As it
> > > > is a per-cpu structure. So with this patch, we can save 304 / 2 * ncpu
> > > > bytes per-memcg per-node where ncpu is the number of the possible CPU.
> > > > If there are c memory cgroup (include dying cgroup) and n NUMA node in
> > > > the system. Finally, we can save (152 * ncpu * c * n) bytes.
> > >
> > > Honestly, I'm not convinced.
> > > Say, ncpu = 32, n = 2, c = 500. We're saving <5Mb of memory.
> > > If the machine has 128Gb of RAM, it's .3%.
> >
> > Hi Roman,
> >
> > When the cpu hotplug is enabled, the ncpu can be 256 on
> > some configurations. Also, the c can be more large when
> > there are many dying cgroup in the system.
> >
> > So the savings depends on the environment and
> > configurations. Right?
>
> Of course, but machines with more CPUs tend to have more RAM as well.

Here I mean possible CPUs, not online CPUs. The number of possible
CPUs may be greater than the number of online CPUs. The per-cpu
allocator is based on the number of possible CPUs. Right?

Thanks.

>
> Thanks!



-- 
Yours,
Muchun
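
The savings estimate discussed in this thread, (152 * ncpu * c * n) bytes, is easy to evaluate for different configurations. A small sketch, assuming the 304-byte structure size quoted in the commit message; the function name lruvec_saved_bytes() is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes saved by halving a 304-byte per-cpu structure, per the
 * formula in the commit message: 152 * ncpu * c * n.
 */
uint64_t lruvec_saved_bytes(uint64_t ncpu, uint64_t cgroups, uint64_t nodes)
{
	const uint64_t saved_per_cpu = 304 / 2; /* 152 bytes */

	return saved_per_cpu * ncpu * cgroups * nodes;
}
```

For ncpu = 32, c = 500, n = 2 this gives 4,864,000 bytes, which matches the "<5Mb" figure above; with 256 possible CPUs and the same c and n it gives 38,912,000 bytes.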

Re: [PATCH v3] pwm: bcm2835: Support apply function for atomic configuration

2020-12-08 Thread Uwe Kleine-König

Hello Lino,

On Tue, Dec 08, 2020 at 11:01:45PM +0100, Lino Sanfilippo wrote:
> Use the newer .apply function of pwm_ops instead of .config, .enable,
> .disable and .set_polarity. This guarantees atomic changes of the pwm
> controller configuration. It also reduces the size of the driver.
> 
> Since now period is a 64 bit value, add an extra check to reject periods
> that exceed the possible max value for the 32 bit register.
> 
> This has been tested on a Raspberry PI 4.

This looks right, just two small nitpicks below.

> Signed-off-by: Lino Sanfilippo 
> ---
> 
> v3: Check against period truncation (based on a review by Uwe Kleine-König)
> v2: Fix compiler error for 64 bit builds
> 
>  drivers/pwm/pwm-bcm2835.c | 72 +--
>  1 file changed, 26 insertions(+), 46 deletions(-)
> 
> diff --git a/drivers/pwm/pwm-bcm2835.c b/drivers/pwm/pwm-bcm2835.c
> index 6841dcf..d339898 100644
> --- a/drivers/pwm/pwm-bcm2835.c
> +++ b/drivers/pwm/pwm-bcm2835.c
> @@ -58,13 +58,15 @@ static void bcm2835_pwm_free(struct pwm_chip *chip, 
> struct pwm_device *pwm)
>   writel(value, pc->base + PWM_CONTROL);
>  }
>  
> -static int bcm2835_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
> -   int duty_ns, int period_ns)
> +static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> +  const struct pwm_state *state)
>  {
> +
>   struct bcm2835_pwm *pc = to_bcm2835_pwm(chip);
>   unsigned long rate = clk_get_rate(pc->clk);
> + unsigned long long period;
>   unsigned long scaler;
> - u32 period;
> + u32 val;
>  
>   if (!rate) {
>   dev_err(pc->dev, "failed to get clock rate\n");
> @@ -72,65 +74,43 @@ static int bcm2835_pwm_config(struct pwm_chip *chip, 
> struct pwm_device *pwm,
>   }
>  
>   scaler = DIV_ROUND_CLOSEST(NSEC_PER_SEC, rate);
> - period = DIV_ROUND_CLOSEST(period_ns, scaler);
> + /* set period */
> + period = DIV_ROUND_CLOSEST_ULL(state->period, scaler);
>  
> - if (period < PERIOD_MIN)
> + /* dont accept a period that is too small or has been truncated */
> + if ((period < PERIOD_MIN) || (period > U32_MAX))
>   return -EINVAL;
>  
> - writel(DIV_ROUND_CLOSEST(duty_ns, scaler),
> -pc->base + DUTY(pwm->hwpwm));
> - writel(period, pc->base + PERIOD(pwm->hwpwm));
> -
> - return 0;
> -}
> -
> -static int bcm2835_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm)
> -{
> - struct bcm2835_pwm *pc = to_bcm2835_pwm(chip);
> - u32 value;
> -
> - value = readl(pc->base + PWM_CONTROL);
> - value |= PWM_ENABLE << PWM_CONTROL_SHIFT(pwm->hwpwm);
> - writel(value, pc->base + PWM_CONTROL);
> -
> - return 0;
> -}
> -
> -static void bcm2835_pwm_disable(struct pwm_chip *chip, struct pwm_device 
> *pwm)
> -{
> - struct bcm2835_pwm *pc = to_bcm2835_pwm(chip);
> - u32 value;
> + writel((u32) period, pc->base + PERIOD(pwm->hwpwm));

This cast isn't necessary. (And if it was, I *think* the space between
"(u32)" and "period" is wrong. But my expectation that checkpatch warns
about this is wrong, so take this with a grain of salt.)
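
The range check under discussion can be tried in user space. This is a sketch under assumptions, not the driver code: period_to_ticks() and div_round_closest_u64() are hypothetical helpers, and the value chosen for PERIOD_MIN here is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define PERIOD_MIN   2ULL /* assumed minimum, in clock ticks */

/* Round-to-nearest division; ignores overflow of x + d / 2 for brevity. */
static uint64_t div_round_closest_u64(uint64_t x, uint64_t d)
{
	return (x + d / 2) / d;
}

/* Convert a 64-bit period in ns into a value that fits a 32-bit register. */
int period_to_ticks(uint64_t period_ns, unsigned long rate_hz, uint32_t *ticks)
{
	uint64_t scaler, period;

	if (!rate_hz)
		return -EINVAL;

	scaler = div_round_closest_u64(NSEC_PER_SEC, rate_hz);
	if (!scaler)
		return -EINVAL; /* clock too fast for ns resolution */

	period = div_round_closest_u64(period_ns, scaler);

	/* reject periods that are too small or would be truncated */
	if (period < PERIOD_MIN || period > UINT32_MAX)
		return -EINVAL;

	*ticks = (uint32_t)period; /* safe: range was checked above */
	return 0;
}
```

Because the comparison against the 32-bit maximum happens before the assignment, the narrowing conversion cannot lose bits, which is why an explicit cast at the register write is optional.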

> - value = readl(pc->base + PWM_CONTROL);
> - value &= ~(PWM_ENABLE << PWM_CONTROL_SHIFT(pwm->hwpwm));
> - writel(value, pc->base + PWM_CONTROL);
> -}
> + /* set duty cycle */
> + val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle, scaler);
> + writel(val, pc->base + DUTY(pwm->hwpwm));
>  
> -static int bcm2835_set_polarity(struct pwm_chip *chip, struct pwm_device 
> *pwm,
> - enum pwm_polarity polarity)
> -{
> - struct bcm2835_pwm *pc = to_bcm2835_pwm(chip);
> - u32 value;
> + /* set polarity */
> + val = readl(pc->base + PWM_CONTROL);
>  
> - value = readl(pc->base + PWM_CONTROL);
> + if (state->polarity == PWM_POLARITY_NORMAL)
> + val &= ~(PWM_POLARITY << PWM_CONTROL_SHIFT(pwm->hwpwm));
> + else
> + val |= PWM_POLARITY << PWM_CONTROL_SHIFT(pwm->hwpwm);
>  
> - if (polarity == PWM_POLARITY_NORMAL)
> - value &= ~(PWM_POLARITY << PWM_CONTROL_SHIFT(pwm->hwpwm));
> + /* enable/disable */
> + if (state->enabled)
> + val |= PWM_ENABLE << PWM_CONTROL_SHIFT(pwm->hwpwm);
>   else
> - value |= PWM_POLARITY << PWM_CONTROL_SHIFT(pwm->hwpwm);
> + val &= ~(PWM_ENABLE << PWM_CONTROL_SHIFT(pwm->hwpwm));
>  
> - writel(value, pc->base + PWM_CONTROL);
> + writel(val, pc->base + PWM_CONTROL);
>  
>   return 0;
>  }
>  
> +

I wouldn't have added this empty line. But I guess that's subjective. Or
did you add this by mistake?

>  static const struct pwm_ops bcm2835_pwm_ops = {
>   .request = bcm2835_pwm_request,
>   .free = bcm2835_pwm_free,
> - .config = bcm2835_pwm_config,
> - .enable = bcm2835_pwm_enable,
> - .disable = bcm2835_pwm_disable,
> - .set_polarity =

Re: [PATCH 069/141] ath5k: Fix fall-through warnings for Clang

2020-12-08 Thread Kalle Valo

"Gustavo A. R. Silva"  wrote:

> In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
> by explicitly adding a break statement instead of letting the code fall
> through to the next case.
> 
> Link: https://github.com/KSPP/linux/issues/115
> Signed-off-by: Gustavo A. R. Silva 
> Signed-off-by: Kalle Valo 

3 patches applied to ath-next branch of ath.git, thanks.

e64fa6d92ac4 ath5k: Fix fall-through warnings for Clang
e2cb11165445 carl9170: Fix fall-through warnings for Clang
b6041e1a3020 wcn36xx: Fix fall-through warnings for Clang

-- 
https://patchwork.kernel.org/project/linux-wireless/patch/e127232621c4de340509047a11d98093958303c5.1605896059.git.gustavo...@kernel.org/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

Re: [PATCH v2 1/1] ARM: dts: mmp2-olpc-xo-1-75: clear the warnings when make dtbs

2020-12-08 Thread Leizhen (ThunderTown)




On 2020/12/8 21:58, Arnd Bergmann wrote:
> On Mon, Dec 7, 2020 at 9:47 AM Zhen Lei  wrote:
>>
>> The check_spi_bus_bridge() in scripts/dtc/checks.c requires that the node
>> have "spi-slave" property must with "#address-cells = <0>" and
>> "#size-cells = <0>". But currently both "#address-cells" and "#size-cells"
>> properties are deleted, the corresponding default values are 2 and 1. As a
>> result, the check fails and below warnings is displayed.
>>
>> arch/arm/boot/dts/mmp2.dtsi:472.23-480.6: Warning (spi_bus_bridge): \
>> /soc/apb@d400/spi@d4037000: incorrect #address-cells for SPI bus
>>   also defined at arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts:225.7-237.3
>> arch/arm/boot/dts/mmp2.dtsi:472.23-480.6: Warning (spi_bus_bridge): \
>> /soc/apb@d400/spi@d4037000: incorrect #size-cells for SPI bus
>>   also defined at arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts:225.7-237.3
>> arch/arm/boot/dts/mmp2-olpc-xo-1-75.dtb: Warning (spi_bus_reg): \
>> Failed prerequisite 'spi_bus_bridge'
>>
>> Because the value of "#size-cells" is already defined as zero in the node
>> "ssp3: spi@d4037000" in arch/arm/boot/dts/mmp2.dtsi. So we only need to
>> explicitly add "#address-cells = <0>" and keep "#size-cells" no change.
>>
>> Signed-off-by: Zhen Lei 
> 
> Right, I already sent the same patch earlier.

Oh, sorry, I didn't know that. If you sent it earlier, please apply your patch!

> 
> Lubomir, can I apply this to the fixes branch?

This fix really should be considered for merging into v5.10.

> 
>>  arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts 
>> b/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts
>> index adde62d6fce73b9..82da44dacba7172 100644
>> --- a/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts
>> +++ b/arch/arm/boot/dts/mmp2-olpc-xo-1-75.dts
>> @@ -224,7 +224,7 @@
>>
>>  &ssp3 {
>> /delete-property/ #address-cells;
>> -   /delete-property/ #size-cells;
>> +   #address-cells = <0>;
>> spi-slave;
>> status = "okay";
>> ready-gpios = < 125 GPIO_ACTIVE_HIGH>;
>> --
>> 1.8.3
>>
>>
> 
> .
>

[tip:x86/cpu] BUILD SUCCESS 262bd5724afdefd4c48a260d6100e78cc43ee06b

2020-12-08 Thread kernel test robot

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git  x86/cpu
branch HEAD: 262bd5724afdefd4c48a260d6100e78cc43ee06b  x86/cpu/amd: Remove dead code for TSEG region remapping

elapsed time: 726m

configs tested: 118
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm         defconfig
arm64       allyesconfig
arm64       defconfig
arm         allyesconfig
arm         allmodconfig
powerpc     maple_defconfig
arm         rpc_defconfig
parisc      generic-32bit_defconfig
arm         milbeaut_m10v_defconfig
powerpc     warp_defconfig
sh          lboxre2_defconfig
powerpc64   defconfig
powerpc     pseries_defconfig
powerpc     canyonlands_defconfig
powerpc     mpc834x_mds_defconfig
sh          rsk7264_defconfig
arc         haps_hs_smp_defconfig
sh          se7724_defconfig
powerpc     ep8248e_defconfig
arm         assabet_defconfig
mips        cu1830-neo_defconfig
sh          se7751_defconfig
arm         lart_defconfig
powerpc     mpc7448_hpc2_defconfig
sh          rts7751r2dplus_defconfig
sh          se7721_defconfig
mips        omega2p_defconfig
arm         dove_defconfig
mips        ath79_defconfig
powerpc     kmeter1_defconfig
mips        maltaaprp_defconfig
powerpc     mgcoge_defconfig
arc         hsdk_defconfig
xtensa      defconfig
powerpc     pmac32_defconfig
arm         mmp2_defconfig
powerpc     holly_defconfig
arm         h5000_defconfig
sh          migor_defconfig
mips        bcm63xx_defconfig
arm         omap1_defconfig
h8300       defconfig
ia64        allmodconfig
ia64        defconfig
ia64        allyesconfig
m68k        allmodconfig
m68k        defconfig
m68k        allyesconfig
nios2       defconfig
arc         allyesconfig
nds32       allnoconfig
c6x         allyesconfig
nds32       defconfig
nios2       allyesconfig
csky        defconfig
alpha       defconfig
alpha       allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
arc         defconfig
sh          allmodconfig
parisc      defconfig
s390        allyesconfig
parisc      allyesconfig
s390        defconfig
i386        allyesconfig
sparc       allyesconfig
sparc       defconfig
i386        tinyconfig
i386        defconfig
mips        allyesconfig
mips        allmodconfig
powerpc     allyesconfig
powerpc     allmodconfig
powerpc     allnoconfig
i386        randconfig-a004-20201208
i386        randconfig-a005-20201208
i386        randconfig-a001-20201208
i386        randconfig-a002-20201208
i386        randconfig-a006-20201208
i386        randconfig-a003-20201208
x86_64      randconfig-a004-20201208
x86_64      randconfig-a006-20201208
x86_64      randconfig-a005-20201208
x86_64      randconfig-a001-20201208
x86_64      randconfig-a002-20201208
x86_64      randconfig-a003-20201208
i386        randconfig-a013-20201208
i386        randconfig-a014-20201208
i386        randconfig-a011-20201208
i386        randconfig-a015-20201208
i386        randconfig-a012-20201208
i386        randconfig-a016-20201208
i386        randconfig-a013-20201209
i386        randconfig-a014-20201209
i386        randconfig-a011-20201209
i386        randconfig-a015-20201209
i386        randconfig-a012-20201209
i386        randconfig-a016-20201209
riscv       nommu_k210_defconfig
riscv

RE: [PATCH v3 1/4] Input: adp5589-keys - add default platform data

2020-12-08 Thread Ardelean, Alexandru




> -Original Message-
> From: Alexandru Ardelean 
> Sent: Friday, November 27, 2020 1:14 PM
> To: linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org;
> devicet...@vger.kernel.org
> Cc: l...@metafoo.de; dmitry.torok...@gmail.com; robh...@kernel.org;
> Ardelean, Alexandru 
> Subject: [PATCH v3 1/4] Input: adp5589-keys - add default platform data
> 
> From: Lars-Peter Clausen 
> 
> If no platform data is supplied use a dummy platform data that configures the
> device in GPIO only mode. This change adds a adp5589_kpad_pdata_get() helper
> that returns the default platform-data. This can be later extended to load
> configuration from device-trees or ACPI.
> 

Ping on this for the input subsystem.
Since patch 4 was applied by Rob, maybe for input, only the first 3 should be 
applied.
Or, should I re-send just the first 3?

Thanks
Alex

> Signed-off-by: Lars-Peter Clausen 
> Signed-off-by: Alexandru Ardelean 
> ---
> 
> Changelog v2 - v3:
> * https://lore.kernel.org/linux-input/20201124082255.13427-1-
> alexandru.ardel...@analog.com/
> * added patch 'dt-bindings: add ADP5585/ADP5589 entries to trivial-devices'
> 
>  drivers/input/keyboard/adp5589-keys.c | 33 +++
>  1 file changed, 24 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/input/keyboard/adp5589-keys.c
> b/drivers/input/keyboard/adp5589-keys.c
> index e2cdf14d90cd..742bf4b97dbb 100644
> --- a/drivers/input/keyboard/adp5589-keys.c
> +++ b/drivers/input/keyboard/adp5589-keys.c
> @@ -369,6 +369,25 @@ static const struct adp_constants const_adp5585 = {
>   .reg= adp5585_reg,
>  };
> 
> +static const struct adp5589_gpio_platform_data adp5589_default_gpio_pdata
> = {
> + .gpio_start = -1,
> +};
> +
> +static const struct adp5589_kpad_platform_data adp5589_default_pdata = {
> + .gpio_data = &adp5589_default_gpio_pdata, };
> +
> +static const struct adp5589_kpad_platform_data *adp5589_kpad_pdata_get(
> + struct device *dev)
> +{
> + const struct adp5589_kpad_platform_data *pdata =
> +dev_get_platdata(dev);
> +
> + if (!pdata)
> + pdata = &adp5589_default_pdata;
> +
> + return pdata;
> +}
> +
>  static int adp5589_read(struct i2c_client *client, u8 reg)  {
>   int ret = i2c_smbus_read_byte_data(client, reg); @@ -498,7 +517,8 @@
> static int adp5589_build_gpiomap(struct adp5589_kpad *kpad,  static int
> adp5589_gpio_add(struct adp5589_kpad *kpad)  {
>   struct device *dev = &kpad->client->dev;
> - const struct adp5589_kpad_platform_data *pdata =
> dev_get_platdata(dev);
> + const struct adp5589_kpad_platform_data *pdata =
> + adp5589_kpad_pdata_get(dev);
>   const struct adp5589_gpio_platform_data *gpio_data = pdata-
> >gpio_data;
>   int i, error;
> 
> @@ -619,7 +639,7 @@ static int adp5589_setup(struct adp5589_kpad *kpad)  {
>   struct i2c_client *client = kpad->client;
>   const struct adp5589_kpad_platform_data *pdata =
> - dev_get_platdata(&client->dev);
> + adp5589_kpad_pdata_get(&client->dev);
>   u8 (*reg) (u8) = kpad->var->reg;
>   unsigned char evt_mode1 = 0, evt_mode2 = 0, evt_mode3 = 0;
>   unsigned char pull_mask = 0;
> @@ -824,7 +844,7 @@ static int adp5589_keypad_add(struct adp5589_kpad
> *kpad, unsigned int revid)  {
>   struct i2c_client *client = kpad->client;
>   const struct adp5589_kpad_platform_data *pdata =
> - dev_get_platdata(&client->dev);
> + adp5589_kpad_pdata_get(&client->dev);
>   struct input_dev *input;
>   unsigned int i;
>   int error;
> @@ -948,7 +968,7 @@ static int adp5589_probe(struct i2c_client *client,  {
>   struct adp5589_kpad *kpad;
>   const struct adp5589_kpad_platform_data *pdata =
> - dev_get_platdata(&client->dev);
> + adp5589_kpad_pdata_get(&client->dev);
>   unsigned int revid;
>   int error, ret;
> 
> @@ -958,11 +978,6 @@ static int adp5589_probe(struct i2c_client *client,
>   return -EIO;
>   }
> 
> - if (!pdata) {
> - dev_err(&client->dev, "no platform data?\n");
> - return -EINVAL;
> - }
> -
>   kpad = devm_kzalloc(&client->dev, sizeof(*kpad), GFP_KERNEL);
>   if (!kpad)
>   return -ENOMEM;
> --
> 2.27.0
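
The fallback helper added by this patch follows a common pattern: return the caller's data when present, otherwise a static default. A stripped-down sketch with hypothetical type and function names (gpio_pdata, kpad_pdata, kpad_pdata_get) that only mirror the shape of the quoted code:

```c
#include <assert.h>
#include <stddef.h>

struct gpio_pdata {
	int gpio_start;
};

struct kpad_pdata {
	const struct gpio_pdata *gpio_data;
};

/* Defaults used when the board supplies nothing: GPIO-only, dynamic base. */
static const struct gpio_pdata default_gpio_pdata = {
	.gpio_start = -1,
};

static const struct kpad_pdata default_pdata = {
	.gpio_data = &default_gpio_pdata,
};

/* Never returns NULL, so callers can drop their own NULL checks. */
const struct kpad_pdata *kpad_pdata_get(const struct kpad_pdata *supplied)
{
	return supplied ? supplied : &default_pdata;
}
```

Since the helper never returns NULL, the "no platform data?" error path in probe becomes unreachable, which is presumably why the patch removes it.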

Re: [PATCH RESEND v2] virtio-input: add multi-touch support

2020-12-08 Thread Greg KH

On Tue, Dec 08, 2020 at 11:01:50PM +0200, Vasyl Vavrychuk wrote:
> From: Mathias Crombez 
> 
> Without multi-touch slots allocated, ABS_MT_SLOT events will be lost by
> input_handle_abs_event.
> 
> Signed-off-by: Mathias Crombez 
> Signed-off-by: Vasyl Vavrychuk 
> Tested-by: Vasyl Vavrychuk 
> ---
> v2: fix patch corrupted by corporate email server
> 
>  drivers/virtio/Kconfig| 11 +++
>  drivers/virtio/virtio_input.c |  8 
>  2 files changed, 19 insertions(+)

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

Re: [PATCH v4] HID: i2c-hid: add polling mode based on connected GPIO chip's pin status

2020-12-08 Thread Greg KH

On Tue, Dec 08, 2020 at 09:59:20PM +, Barnabás Pőcze wrote:
> 2020. november 25., szerda 16:07 keltezéssel, Greg KH írta:
> 
> > [...]
> > > +static u8 polling_mode;
> > > +module_param(polling_mode, byte, 0444);
> > > +MODULE_PARM_DESC(polling_mode, "How to poll (default=0) - 0 disabled; 1 
> > > based on GPIO pin's status");
> >
> > Module parameters are for the 1990's, they are global and horrible to
> > try to work with. You should provide something on a per-device basis,
> > as what happens if your system requires different things here for
> > different devices? You set this for all devices :(
> > [...]
> 
> Hi
> 
> do you think something like what the usbcore has would be better?
> A module parameter like 
> "quirks=::[,::]*"?

Not really, that's just for debugging, and asking users to test
something, not for a final solution to anything.

thanks,

greg k-h

[PATCH] kexec: Fix error code in kexec_calculate_store_digests()

2020-12-08 Thread Dan Carpenter

Return -ENOMEM on allocation failure instead of returning success.

Fixes: a43cac0d9dc2 ("kexec: split kexec_file syscall code to kexec_file.c")
Signed-off-by: Dan Carpenter 
---
 kernel/kexec_file.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index b02086d70492..9570f380a825 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -735,8 +735,10 @@ static int kexec_calculate_store_digests(struct kimage *image)
 
sha_region_sz = KEXEC_SEGMENT_MAX * sizeof(struct kexec_sha_region);
sha_regions = vzalloc(sha_region_sz);
-   if (!sha_regions)
+   if (!sha_regions) {
+   ret = -ENOMEM;
goto out_free_desc;
+   }
 
desc->tfm   = tfm;
 
-- 
2.29.2
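
The bug class fixed here (jumping to a cleanup label while ret still holds 0) can be reproduced in a few lines. The sketch below is a user-space illustration with hypothetical names; the allocators are passed in only so that a failure can be simulated, and always_fail() is a test helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);

/* Test helper: an allocator that always fails. */
void *always_fail(size_t n)
{
	(void)n;
	return NULL;
}

int calculate_digests(alloc_fn alloc_desc, alloc_fn alloc_regions)
{
	int ret = 0;
	void *desc, *regions;

	desc = alloc_desc(64);
	if (!desc) {
		ret = -ENOMEM;
		goto out;
	}

	regions = alloc_regions(128);
	if (!regions) {
		ret = -ENOMEM; /* without this line the function would return 0 */
		goto out_free_desc;
	}

	/* ... the actual digest computation would go here ... */

	free(regions);
out_free_desc:
	free(desc);
out:
	return ret;
}
```

Calling calculate_digests(malloc, always_fail) exercises the path the patch corrects.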

Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index

2020-12-08 Thread Eli Cohen

On Wed, Dec 09, 2020 at 01:46:22AM -0500, Michael S. Tsirkin wrote:
> On Wed, Dec 09, 2020 at 08:02:30AM +0200, Eli Cohen wrote:
> > On Tue, Dec 08, 2020 at 04:45:04PM -0500, Michael S. Tsirkin wrote:
> > > On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> > > > Make sure to put write memory barrier after updating CQ consumer index
> > > > so the hardware knows that there are available CQE slots in the queue.
> > > > 
> > > > Failure to do this can cause the update of the RX doorbell record to get
> > > > updated before the CQ consumer index resulting in CQ overrun.
> > > > 
> > > > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> > > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 
> > > > devices")
> > > > Signed-off-by: Eli Cohen 
> > > 
> > > Aren't both memory writes?
> > 
> > Not sure what exactly you mean here.
> 
> Both updates are CPU writes into RAM that hardware then reads
> using DMA.
> 

You mean why I did not put a memory barrier right after updating the
receive doorbell record?

I thought about this and I think it is not required. Suppose it takes a
very long time till the hardware can actually see this update. The worst
effect would be that the hardware will drop received packets if it
sees none available due to the delayed update. Eventually it will see
the update and will continue working.

If I put a memory barrier, I put some delay waiting for the CPU to flush
the write before continuing. I tried both options while checking packet
rate and couldn't see a noticeable difference in either case.

> > > And given that, isn't dma_wmb() sufficient here?
> > 
> > I agree that dma_wmb() is more appropriate here.
> > 
> > > 
> > > 
> > > > ---
> > > >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +
> > > >  1 file changed, 5 insertions(+)
> > > > 
> > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> > > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > index 1f4089c6f9d7..295f46eea2a5 100644
> > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq 
> > > > *vcq)
> > > >  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue 
> > > > *mvq, int num)
> > > >  {
> > > > mlx5_cq_set_ci(&mvq->cq.mcq);
> > > > +
> > > > +   /* make sure CQ cosumer update is visible to the hardware 
> > > > before updating
> > > > +* RX doorbell record.
> > > > +*/
> > > > +   wmb();
> > > > rx_post(&mvq->vqqp, num);
> > > > if (mvq->event_cb.callback)
> > > > mvq->event_cb.callback(mvq->event_cb.private);
> > > > -- 
> > > > 2.27.0
> > > 
>
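
The ordering requirement discussed in this thread (consumer index must become visible before the doorbell record) has a rough user-space analogue in C11 atomics. This is only an analogy under stated assumptions: fake_queue and handle_completions() are hypothetical, and a C11 release fence orders CPU-visible stores between threads; it is not a substitute for wmb() or dma_wmb() when the observer is a DMA-capable device.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct fake_queue {
	_Atomic uint32_t cq_consumer_index; /* written first */
	_Atomic uint32_t rx_doorbell;       /* written second */
};

void handle_completions(struct fake_queue *q, uint32_t polled)
{
	uint32_t ci = atomic_load_explicit(&q->cq_consumer_index,
					   memory_order_relaxed);
	uint32_t db = atomic_load_explicit(&q->rx_doorbell,
					   memory_order_relaxed);

	/* 1. publish the new consumer index */
	atomic_store_explicit(&q->cq_consumer_index, ci + polled,
			      memory_order_relaxed);

	/* 2. order the store above before the store below */
	atomic_thread_fence(memory_order_release);

	/* 3. only now advertise the newly posted receive buffers */
	atomic_store_explicit(&q->rx_doorbell, db + polled,
			      memory_order_relaxed);
}
```

An observer that reads rx_doorbell with acquire semantics and sees the new value is then guaranteed to also see the updated consumer index.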

[PATCH v1 1/2] scsi: ufs: Protect some contexts from unexpected clock scaling

2020-12-08 Thread Can Guo

In contexts like suspend, shutdown and error handling, we need to suspend
devfreq to make sure these contexts won't be disturbed by clock scaling.
However, suspending devfreq is not enough since users can still trigger a
clock scaling by manipulating the sysfs node clkscale_enable and devfreq
sysfs nodes like min/max_freq and governor. Add one more flag in struct
clk_scaling such that these contexts can prevent clock scaling from being
invoked through above sysfs nodes.

Signed-off-by: Can Guo 
---
 drivers/scsi/ufs/ufshcd.c | 86 ---
 drivers/scsi/ufs/ufshcd.h |  2 ++
 2 files changed, 53 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 0c148fc..12266bd 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1147,12 +1147,18 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba 
*hba)
 */
ufshcd_scsi_block_requests(hba);
down_write(&hba->clk_scaling_lock);
-   if (ufshcd_wait_for_doorbell_clr(hba, DOORBELL_CLR_TOUT_US)) {
+   if (!hba->clk_scaling.is_allowed ||
+   ufshcd_wait_for_doorbell_clr(hba, DOORBELL_CLR_TOUT_US)) {
ret = -EBUSY;
up_write(&hba->clk_scaling_lock);
ufshcd_scsi_unblock_requests(hba);
+   goto out;
}
 
+   /* let's not get into low power until clock scaling is completed */
+   ufshcd_hold(hba, false);
+
+out:
return ret;
 }
 
@@ -1160,6 +1166,7 @@ static void ufshcd_clock_scaling_unprepare(struct ufs_hba 
*hba)
 {
up_write(&hba->clk_scaling_lock);
ufshcd_scsi_unblock_requests(hba);
+   ufshcd_release(hba);
 }
 
 /**
@@ -1175,12 +1182,9 @@ static int ufshcd_devfreq_scale(struct ufs_hba *hba, 
bool scale_up)
 {
int ret = 0;
 
-   /* let's not get into low power until clock scaling is completed */
-   ufshcd_hold(hba, false);
-
ret = ufshcd_clock_scaling_prepare(hba);
if (ret)
-   goto out;
+   return ret;
 
/* scale down the gear before scaling down clocks */
if (!scale_up) {
@@ -1212,8 +1216,6 @@ static int ufshcd_devfreq_scale(struct ufs_hba *hba, bool 
scale_up)
 
 out_unprepare:
ufshcd_clock_scaling_unprepare(hba);
-out:
-   ufshcd_release(hba);
return ret;
 }
 
@@ -1294,15 +1296,8 @@ static int ufshcd_devfreq_target(struct device *dev,
}
spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
 
-   pm_runtime_get_noresume(hba->dev);
-   if (!pm_runtime_active(hba->dev)) {
-   pm_runtime_put_noidle(hba->dev);
-   ret = -EAGAIN;
-   goto out;
-   }
start = ktime_get();
ret = ufshcd_devfreq_scale(hba, scale_up);
-   pm_runtime_put(hba->dev);
 
trace_ufshcd_profile_clk_scaling(dev_name(hba->dev),
(scale_up ? "up" : "down"),
@@ -1487,7 +1482,7 @@ static ssize_t ufshcd_clkscale_enable_show(struct device 
*dev,
 {
struct ufs_hba *hba = dev_get_drvdata(dev);
 
-   return snprintf(buf, PAGE_SIZE, "%d\n", hba->clk_scaling.is_allowed);
+   return snprintf(buf, PAGE_SIZE, "%d\n", hba->clk_scaling.is_enabled);
 }
 
 static ssize_t ufshcd_clkscale_enable_store(struct device *dev,
@@ -1496,12 +1491,20 @@ static ssize_t ufshcd_clkscale_enable_store(struct 
device *dev,
struct ufs_hba *hba = dev_get_drvdata(dev);
u32 value;
int err;
+   unsigned long flags;
+   bool update = true;
 
if (kstrtou32(buf, 0, &value))
return -EINVAL;
 
value = !!value;
-   if (value == hba->clk_scaling.is_allowed)
+   spin_lock_irqsave(hba->host->host_lock, flags);
+   if (value == hba->clk_scaling.is_enabled)
+   update = false;
+   else
+   hba->clk_scaling.is_enabled = value;
+   spin_unlock_irqrestore(hba->host->host_lock, flags);
+   if (!update)
goto out;
 
pm_runtime_get_sync(hba->dev);
@@ -1510,8 +1513,6 @@ static ssize_t ufshcd_clkscale_enable_store(struct device 
*dev,
cancel_work_sync(&hba->clk_scaling.suspend_work);
cancel_work_sync(&hba->clk_scaling.resume_work);
 
-   hba->clk_scaling.is_allowed = value;
-
if (value) {
ufshcd_resume_clkscaling(hba);
} else {
@@ -1845,8 +1846,6 @@ static void ufshcd_init_clk_scaling(struct ufs_hba *hba)
snprintf(wq_name, sizeof(wq_name), "ufs_clkscaling_%d",
 hba->host->host_no);
hba->clk_scaling.workq = create_singlethread_workqueue(wq_name);
-
-   ufshcd_clkscaling_init_sysfs(hba);
 }
 
 static void ufshcd_exit_clk_scaling(struct ufs_hba *hba)
@@ -1854,6 +1853,8 @@ static void ufshcd_exit_clk_scaling(struct ufs_hba *hba)
if (!ufshcd_is_clkscaling_supported(hba))
return;
 
+   if (hba->devfreq)
+   device_remove_file(hba->dev, &hba->clk_scaling.enable_attr);
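
The idea of this patch, as described in its commit message, is to separate what the user asked for (is_enabled, reported through sysfs) from what the driver currently permits (is_allowed, cleared during suspend, shutdown and error handling). A minimal sketch of that split with hypothetical function names; locking is deliberately omitted, whereas the patch checks the flag under clk_scaling_lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct clk_scaling_state {
	bool is_enabled; /* user preference, toggled through sysfs */
	bool is_allowed; /* cleared by contexts that must not be disturbed */
};

/* What a sysfs store would change: only the user preference. */
void user_set_enabled(struct clk_scaling_state *s, bool on)
{
	s->is_enabled = on;
}

/* What suspend/shutdown/error handling would call around their work. */
void context_set_allowed(struct clk_scaling_state *s, bool allowed)
{
	s->is_allowed = allowed;
}

/* Gate applied before any scaling attempt, whatever triggered it. */
int clock_scaling_prepare(const struct clk_scaling_state *s)
{
	if (!s->is_allowed)
		return -EBUSY;
	return 0;
}
```

The point of the second flag is that a sysfs write can flip is_enabled at any time without being able to start a scaling operation while is_allowed is false.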

[PATCH v1 2/2] scsi: ufs: Clean up some lines from ufshcd_hba_exit()

2020-12-08 Thread Can Guo

ufshcd_hba_exit() is always called after ufshcd_exit_clk_scaling() and
ufshcd_exit_clk_gating(), so no need to suspend clock scaling again in
ufshcd_hba_exit().

Signed-off-by: Can Guo 
---
 drivers/scsi/ufs/ufshcd.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 12266bd..0a5b197 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -7765,6 +7765,7 @@ static void ufshcd_async_scan(void *data, async_cookie_t 
cookie)
if (ret) {
pm_runtime_put_sync(hba->dev);
ufshcd_exit_clk_scaling(hba);
+   ufshcd_exit_clk_gating(hba);
ufshcd_hba_exit(hba);
}
 }
@@ -8203,10 +8204,6 @@ static void ufshcd_hba_exit(struct ufs_hba *hba)
if (hba->is_powered) {
ufshcd_variant_hba_exit(hba);
ufshcd_setup_vreg(hba, false);
-   ufshcd_suspend_clkscaling(hba);
-   if (ufshcd_is_clkscaling_supported(hba))
-   if (hba->devfreq)
-   ufshcd_suspend_clkscaling(hba);
ufshcd_setup_clocks(hba, false);
ufshcd_setup_hba_vreg(hba, false);
hba->is_powered = false;
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.

Re: [PATCH 1/1] crypto: Fix possible buffer overflows in pkey_protkey_aes_attr_read

2020-12-08 Thread Christian Borntraeger




On 09.12.20 07:47, Xiaohui Zhang wrote:
> From: Zhang Xiaohui 
> 
> pkey_protkey_aes_attr_read() calls memcpy() without checking the
> destination size, which may trigger a buffer overflow.

To me it looks like protkey.len is generated programmatically in 
pkey_genprotkey/pkey_clr2protkey
and this purely depends on the keytype and we do check for known ones.
Not sure how this can happen.

> 
> Signed-off-by: Zhang Xiaohui 
> ---
>  drivers/s390/crypto/pkey_api.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/s390/crypto/pkey_api.c b/drivers/s390/crypto/pkey_api.c
> index 99cb60ea6..abc237130 100644
> --- a/drivers/s390/crypto/pkey_api.c
> +++ b/drivers/s390/crypto/pkey_api.c
> @@ -1589,6 +1589,8 @@ static ssize_t pkey_protkey_aes_attr_read(u32 keytype, 
> bool is_xts, char *buf,
>   if (rc)
>   return rc;
>  
> + if (protkey.len > MAXPROTKEYSIZE)
> + protkey.len = MAXPROTKEYSIZE;
>   protkeytoken.len = protkey.len;
>   memcpy(, , protkey.len);
>  
> @@ -1599,6 +1601,8 @@ static ssize_t pkey_protkey_aes_attr_read(u32 keytype, 
> bool is_xts, char *buf,
>   if (rc)
>   return rc;
>  
> + if (protkey.len > MAXPROTKEYSIZE)
> + protkey.len = MAXPROTKEYSIZE;
>   protkeytoken.len = protkey.len;
>   memcpy(, , protkey.len);
>  
>
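
For reference, the clamp-before-copy pattern proposed in the quoted patch looks like this in isolation. The names are hypothetical (MAXKEY stands in for MAXPROTKEYSIZE, and its value here is an assumption), and the sketch says nothing about whether the length can actually exceed the limit in the driver, which is the open question above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAXKEY 64u /* assumed capacity of the destination buffer */

struct key_token {
	uint32_t len;
	uint8_t data[MAXKEY];
};

/* Copies at most MAXKEY bytes and returns the length actually stored. */
uint32_t copy_key_clamped(struct key_token *dst, const uint8_t *src,
			  uint32_t len)
{
	if (len > MAXKEY)
		len = MAXKEY; /* clamp so memcpy() cannot overrun dst->data */

	dst->len = len;
	memcpy(dst->data, src, len);
	return len;
}
```

Note that clamping silently truncates an oversized key; returning an error instead would be another option if such a length is considered a bug in the caller.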

Re: [RFC 0/2] nocopy bvec for direct IO

2020-12-08 Thread Christoph Hellwig

On Wed, Dec 09, 2020 at 02:19:50AM +, Pavel Begunkov wrote:
> A benchmark got me 430KIOPS vs 540KIOPS, or +25% on bare metal. And perf
> shows that bio_iov_iter_get_pages() was taking ~20%. The test is pretty
> silly, but still imposing. I'll redo it closer to reality for next
> iteration, anyway need to double check some cases.

That is pretty impressive.  But I only got this cover letter, not the
actual patches..

[PATCH -next v2] net/mlx5_core: remove unused including

2020-12-08 Thread Zou Wei

Remove including  that don't need it.

Fixes: 17a7612b99e6 ("net/mlx5_core: Clean driver version and name")
Signed-off-by: Zou Wei 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 989c70c..82ecc161 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -30,7 +30,6 @@
  * SOFTWARE.
  */
 
-#include 
 #include 
 #include 
 #include 
-- 
2.6.2

Re: [PATCH 1/1] scsi: ufs-mediatek: use correct path to fix compiling error

2020-12-08 Thread Stanley Chu

Hi Zhen,

On Wed, 2020-12-09 at 14:31 +0800, Zhen Lei wrote:
> When the kernel is compiled with allmodconfig, the following error is
> reported:
> In file included from drivers/scsi/ufs/ufs-mediatek-trace.h:36:0,
>  from drivers/scsi/ufs/ufs-mediatek.c:28:
> ./include/trace/define_trace.h:95:42: fatal error: ./ufs-mediatek-trace.h: No 
> such file or directory
>  #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
> 
> The comment in include/trace/define_trace.h specifies that:
> TRACE_INCLUDE_PATH: Note, the path is relative to define_trace.h, not the
> file including it. Full path names for out of tree modules must be used.
> 
> So without "CFLAGS_ufs-mediatek.o := -I$(src)", the current directory "."
> is "include/trace/", the relative path of ufs-mediatek-trace.h is
> "../../drivers/scsi/ufs/".
> 
> Fixes: ca1bb061d644 ("scsi: ufs-mediatek: Introduce event_notify 
> implementation")
> Signed-off-by: Zhen Lei 
> ---
>  drivers/scsi/ufs/ufs-mediatek-trace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/ufs/ufs-mediatek-trace.h 
> b/drivers/scsi/ufs/ufs-mediatek-trace.h
> index fd6f84c1b4e2256..895e82ea6ece551 100644
> --- a/drivers/scsi/ufs/ufs-mediatek-trace.h
> +++ b/drivers/scsi/ufs/ufs-mediatek-trace.h
> @@ -31,6 +31,6 @@ TRACE_EVENT(ufs_mtk_event,
>  
>  #undef TRACE_INCLUDE_PATH
>  #undef TRACE_INCLUDE_FILE
> -#define TRACE_INCLUDE_PATH .
> +#define TRACE_INCLUDE_PATH ../../drivers/scsi/ufs/
>  #define TRACE_INCLUDE_FILE ufs-mediatek-trace
>  #include 

Thanks for this fix.

Reviewed-by: Stanley Chu

Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index

2020-12-08 Thread Michael S. Tsirkin

On Wed, Dec 09, 2020 at 08:02:30AM +0200, Eli Cohen wrote:
> On Tue, Dec 08, 2020 at 04:45:04PM -0500, Michael S. Tsirkin wrote:
> > On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> > > Make sure to put write memory barrier after updating CQ consumer index
> > > so the hardware knows that there are available CQE slots in the queue.
> > > 
> > > Failure to do this can cause the update of the RX doorbell record to get
> > > updated before the CQ consumer index resulting in CQ overrun.
> > > 
> > > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> > > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 
> > > devices")
> > > Signed-off-by: Eli Cohen 
> > 
> > Aren't both memory writes?
> 
> Not sure what exactly you mean here.

Both updates are CPU writes into RAM that hardware then reads
using DMA.

> > And given that, isn't dma_wmb() sufficient here?
> 
> I agree that dma_wmb() is more appropriate here.
> 
> > 
> > 
> > > ---
> > >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > index 1f4089c6f9d7..295f46eea2a5 100644
> > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq 
> > > *vcq)
> > >  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue 
> > > *mvq, int num)
> > >  {
> > >   mlx5_cq_set_ci(>cq.mcq);
> > > +
> > > + /* make sure CQ cosumer update is visible to the hardware before 
> > > updating
> > > +  * RX doorbell record.
> > > +  */
> > > + wmb();
> > >   rx_post(>vqqp, num);
> > >   if (mvq->event_cb.callback)
> > >   mvq->event_cb.callback(mvq->event_cb.private);
> > > -- 
> > > 2.27.0
> >

Re: [f2fs-dev] [PATCH v3] f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE

2020-12-08 Thread Chao Yu


On 2020/12/3 14:56, Daeho Jeong wrote:

From: Daeho Jeong 
+   f2fs_balance_fs(F2FS_I_SB(inode), true);


Trivial cleanup:

f2fs_balance_fs(sbi, true);


+   f2fs_balance_fs(F2FS_I_SB(inode), true);


Ditto,

Jaegeuk, could you fix this directly?

Thanks,

Re: [PATCH -next] net/mlx5_core: remove unused including

2020-12-08 Thread Zouwei (Samuel)

OK, I will add the Fixes line and send v2 soon.

-----Original Message-----
From: Leon Romanovsky [mailto:l...@kernel.org] 
Sent: December 9, 2020 14:21
To: Jakub Kicinski 
Cc: Zouwei (Samuel) ; sae...@nvidia.com; 
da...@davemloft.net; net...@vger.kernel.org; linux-r...@vger.kernel.org; 
linux-kernel@vger.kernel.org
Subject: Re: [PATCH -next] net/mlx5_core: remove unused including 


On Tue, Dec 08, 2020 at 11:22:26AM -0800, Jakub Kicinski wrote:
> On Mon, 7 Dec 2020 20:14:00 +0800 Zou Wei wrote:
> > Remove including  that don't need it.
> >
> > Signed-off-by: Zou Wei 
> > ---
> >  drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c 
> > b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> > index 989c70c..82ecc161 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> > @@ -30,7 +30,6 @@
> >   * SOFTWARE.
> >   */
> >
> > -#include 
> >  #include 
> >  #include 
> >  #include 

Jakub,

You probably don't have the latest net-next.

In the commit 17a7612b99e6 ("net/mlx5_core: Clean driver version and name"), I 
removed "strlcpy(drvinfo->version, UTS_RELEASE, sizeof(drvinfo->version));" 
line.

The patch is OK, but it should have a Fixes line.
Fixes: 17a7612b99e6 ("net/mlx5_core: Clean driver version and name")

Thanks

>
>
> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c: In function 
> ‘mlx5e_rep_get_drvinfo’:
> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:66:28: error: ‘UTS_RELEASE’ 
> undeclared (first use in this function); did you mean ‘CSS_RELEASED’?
>66 |  strlcpy(drvinfo->version, UTS_RELEASE, sizeof(drvinfo->version));
>   |^~~
>   |CSS_RELEASED
> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:66:28: note: each 
> undeclared identifier is reported only once for each function it 
> appears in
> make[6]: *** [drivers/net/ethernet/mellanox/mlx5/core/en_rep.o] Error 
> 1
> make[5]: *** [drivers/net/ethernet/mellanox/mlx5/core] Error 2
> make[4]: *** [drivers/net/ethernet/mellanox] Error 2
> make[3]: *** [drivers/net/ethernet] Error 2
> make[2]: *** [drivers/net] Error 2
> make[2]: *** Waiting for unfinished jobs
> make[1]: *** [drivers] Error 2
> make: *** [__sub-make] Error 2

Re: [PATCH v4 2/7] Input: use input_device_enabled()

2020-12-08 Thread Dmitry Torokhov

On Tue, Dec 08, 2020 at 11:05:42AM +0100, Marek Szyprowski wrote:
> Hi Andrzej,
> 
> On 07.12.2020 16:50, Andrzej Pietrasiewicz wrote:
> > Hi Marek,
> >
> > W dniu 07.12.2020 o 14:32, Marek Szyprowski pisze:
> >> Hi Andrzej,
> >>
> >> On 08.06.2020 13:22, Andrzej Pietrasiewicz wrote:
> >>> Use the newly added helper in relevant input drivers.
> >>>
> >>> Signed-off-by: Andrzej Pietrasiewicz 
> >>
> >> This patch landed recently in linux-next as commit d69f0a43c677 ("Input:
> >> use input_device_enabled()"). Sadly it causes following warning during
> >> system suspend/resume cycle on ARM 32bit Samsung Exynos5250-based Snow
> >> Chromebook with kernel compiled from exynos_defconfig:
> >>
> >> [ cut here ]
> >> WARNING: CPU: 0 PID: 1777 at drivers/input/input.c:2230
> >> input_device_enabled+0x68/0x6c
> >> Modules linked in: cmac bnep mwifiex_sdio mwifiex sha256_generic
> >> libsha256 sha256_arm cfg80211 btmrvl_sdio btmrvl bluetooth s5p_mfc
> >> exynos_gsc v4l2_mem2mem videob
> >> CPU: 0 PID: 1777 Comm: rtcwake Not tainted
> >> 5.10.0-rc6-next-20201207-1-g49a0dc04c46d-dirty #9902
> >> Hardware name: Samsung Exynos (Flattened Device Tree)
> >> [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> >> [] (show_stack) from [] (dump_stack+0xb4/0xd4)
> >> [] (dump_stack) from [] (__warn+0xd8/0x11c)
> >> [] (__warn) from [] (warn_slowpath_fmt+0xb0/0xb8)
> >> [] (warn_slowpath_fmt) from []
> >> (input_device_enabled+0x68/0x6c)
> >> [] (input_device_enabled) from []
> >
> > Apparently you are hitting this line of code in drivers/input/input.c:
> >
> > lockdep_assert_held(>mutex);
> >
> > Inspecting input device's "users" member should happen under dev's lock.
> >
> This check and warning has been introduced by this patch. I assume that 
> the suspend/resume paths are correct, but it looks that they were not 
> tested with this patch thus it has not been noticed that they are not 
> called under the input's lock. This needs a fix. Dmitry: how would you 
> like to handle this issue?

The check is proper and the warning is legit, cyapa should not be
checking this field without holding the lock. I think we can simply
remove this check from the power ops for gen3 and gen5, and this should
shut up the warning on suspend, but there are other places in cyapa that do
check 'users', and they also need to be fixed.

Thanks.

-- 
Dmitry

Re: [PATCH] powerpc/mm: Refactor the floor/ceiling check in hugetlb range freeing functions

2020-12-08 Thread Aneesh Kumar K.V

Christophe Leroy  writes:

> All hugetlb range freeing functions have a verification like the following,
> which only differs by the mask used, depending on the page table level.
>
>   start &= MASK;
>   if (start < floor)
>   return;
>   if (ceiling) {
>   ceiling &= MASK;
>   if (! ceiling)
>   return;
>   }
>   if (end - 1 > ceiling - 1)
>   return;
>
> Refactor that into a helper function which takes the mask as
> an argument, returning true when [start;end[ is not fully
> contained inside [floor;ceiling[
>

Reviewed-by: Aneesh Kumar K.V 

> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/mm/hugetlbpage.c | 56 ---
>  1 file changed, 19 insertions(+), 37 deletions(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 36c3800769fb..f8d8a4988e15 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -294,6 +294,21 @@ static void hugepd_free(struct mmu_gather *tlb, void 
> *hugepte)
>  static inline void hugepd_free(struct mmu_gather *tlb, void *hugepte) {}
>  #endif
>  
> +/* Return true when the entry to be freed maps more than the area being 
> freed */
> +static bool range_is_outside_limits(unsigned long start, unsigned long end,
> + unsigned long floor, unsigned long ceiling,
> + unsigned long mask)
> +{
> + if ((start & mask) < floor)
> + return true;
> + if (ceiling) {
> + ceiling &= mask;
> + if (!ceiling)
> + return true;
> + }
> + return end - 1 > ceiling - 1;
> +}
> +
>  static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int 
> pdshift,
> unsigned long start, unsigned long end,
> unsigned long floor, unsigned long ceiling)
> @@ -309,15 +324,7 @@ static void free_hugepd_range(struct mmu_gather *tlb, 
> hugepd_t *hpdp, int pdshif
>   if (shift > pdshift)
>   num_hugepd = 1 << (shift - pdshift);
>  
> - start &= pdmask;
> - if (start < floor)
> - return;
> - if (ceiling) {
> - ceiling &= pdmask;
> - if (! ceiling)
> - return;
> - }
> - if (end - 1 > ceiling - 1)
> + if (range_is_outside_limits(start, end, floor, ceiling, pdmask))
>   return;
>  
>   for (i = 0; i < num_hugepd; i++, hpdp++)
> @@ -334,18 +341,9 @@ static void hugetlb_free_pte_range(struct mmu_gather 
> *tlb, pmd_t *pmd,
>  unsigned long addr, unsigned long end,
>  unsigned long floor, unsigned long ceiling)
>  {
> - unsigned long start = addr;
>   pgtable_t token = pmd_pgtable(*pmd);
>  
> - start &= PMD_MASK;
> - if (start < floor)
> - return;
> - if (ceiling) {
> - ceiling &= PMD_MASK;
> - if (!ceiling)
> - return;
> - }
> - if (end - 1 > ceiling - 1)
> + if (range_is_outside_limits(addr, end, floor, ceiling, PMD_MASK))
>   return;
>  
>   pmd_clear(pmd);
> @@ -395,15 +393,7 @@ static void hugetlb_free_pmd_range(struct mmu_gather 
> *tlb, pud_t *pud,
> addr, next, floor, ceiling);
>   } while (addr = next, addr != end);
>  
> - start &= PUD_MASK;
> - if (start < floor)
> - return;
> - if (ceiling) {
> - ceiling &= PUD_MASK;
> - if (!ceiling)
> - return;
> - }
> - if (end - 1 > ceiling - 1)
> + if (range_is_outside_limits(start, end, floor, ceiling, PUD_MASK))
>   return;
>  
>   pmd = pmd_offset(pud, start);
> @@ -446,15 +436,7 @@ static void hugetlb_free_pud_range(struct mmu_gather 
> *tlb, p4d_t *p4d,
>   }
>   } while (addr = next, addr != end);
>  
> - start &= PGDIR_MASK;
> - if (start < floor)
> - return;
> - if (ceiling) {
> - ceiling &= PGDIR_MASK;
> - if (!ceiling)
> - return;
> - }
> - if (end - 1 > ceiling - 1)
> + if (range_is_outside_limits(start, end, floor, ceiling, PGDIR_MASK))
>   return;
>  
>   pud = pud_offset(p4d, start);
> -- 
> 2.25.0

Re: [PATCH RFC 10/39] KVM: x86/xen: support upcall vector

2020-12-08 Thread Ankur Arora


On 2020-12-08 8:08 a.m., David Woodhouse wrote:

On Wed, 2020-12-02 at 19:02 +, David Woodhouse wrote:



I feel we could just accommodate it as a subtype in
KVM_XEN_ATTR_TYPE_CALLBACK_VIA.
Don't see the advantage in having another xen attr type.


Yeah, fair enough.


But kinda have mixed feelings in having the kernel handle all of the event
channels ABI, as opposed to only the ones userspace asked to offload. It looks
a tad unnecessary besides the added gain to VMMs that don't need to care about
the internals of event channels. But performance-wise it wouldn't bring
anything better. But maybe, the former is reason enough to consider it.


Yeah, we'll see. Especially when it comes to implementing FIFO event
channels, I'd rather just do it in one place — and if the kernel does
it anyway then it's hardly difficult to hook into that.

But I've been about as coherent as I can be in email, and I think we're
generally aligned on the direction. I'll do some more experiments and
see what I can get working, and what it looks like.



So... I did some more typing, and revived our completely userspace
based implementation of event channels. I wanted to declare that such
was *possible*, and that using the kernel for IPI and VIRQ was just a
very desirable optimisation.

It looks like Linux doesn't use the per-vCPU upcall vector that you
called 'KVM_XEN_CALLBACK_VIA_EVTCHN'. So I'm delivering interrupts via
KVM_INTERRUPT as if they were ExtINT

... except I'm not. Because the kernel really does expect that to be an
ExtINT from a legacy PIC, and kvm_apic_accept_pic_intr() only returns
true if LVT0 is set up for EXTINT and unmasked.

I messed around with this hack and increasingly desperate variations on
the theme (since this one doesn't cause the IRQ window to be opened to
userspace in the first place), but couldn't get anything working:


"Increasingly desperate variations" about sums up my process as well while
trying to get the upcall vector working.



--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2380,6 +2380,9 @@ int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
 if ((lvt0 & APIC_LVT_MASKED) == 0 &&
 GET_APIC_DELIVERY_MODE(lvt0) == APIC_MODE_EXTINT)
 r = 1;
+   /* Shoot me. */
+   if (vcpu->arch.pending_external_vector == 243)
+   r = 1;
 return r;
  }
  


Eventually I resorted to delivering the interrupt through the lapic
*anyway* (through KVM_SIGNAL_MSI with an MSI message constructed for
the appropriate vCPU/vector) and the following hack to auto-EOI:

--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2416,7 +2419,7 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
  */
  
 apic_clear_irr(vector, apic);

-   if (test_bit(vector, vcpu_to_synic(vcpu)->auto_eoi_bitmap)) {
+   if (vector == 243 || test_bit(vector, 
vcpu_to_synic(vcpu)->auto_eoi_bitmap)) {
 /*
  * For auto-EOI interrupts, there might be another pending
  * interrupt above PPR, so check whether to raise another


That works, and now my guest finishes the SMP bringup (and gets as far
as waiting on the XenStore implementation that I haven't put back yet).

So I think we need at least a tiny amount of support in-kernel for
delivering event channel interrupt vectors, even if we wanted to allow
for a completely userspace implementation.

Unless I'm missing something?


I did use the auto_eoi hack as well. So, yeah, I don't see any way of
getting around this.

Also, IIRC we had eventually gotten rid of the auto_eoi approach
because that wouldn't work with APICv. At that point we resorted to
direct queuing for vectored callbacks which was a hack that I never
grew fond of...
 

I will get on with implementing the in-kernel handling with IRQ routing
entries targeting a given { port, vcpu }. And I'm kind of vacillating
about whether the mode/vector should be separately configured, or
whether they might as well be in the IRQ routing table too, even if
it's kind of redundant because it's specified the same for *every* port
targeting the same vCPU. I *think* I prefer that redundancy over having
a separate configuration mechanism to set the vector for each vCPU. But
we'll see what happens when my fingers do the typing...



Good luck to your fingers!

Ankur

[PATCH v2 5/5] phy: freescale: phy-fsl-imx8-mipi-dphy: Add i.MX8qxp LVDS PHY mode support

2020-12-08 Thread Liu Ying

The i.MX8qxp SoC embeds a Mixel MIPI DPHY + LVDS PHY combo which supports
either a MIPI DSI display or an LVDS display.  The PHY mode is controlled
by SCU firmware, and the driver calls a SCU firmware function to
configure the PHY mode.  A single LVDS PHY has 4 data lanes to support
an LVDS display.  Also, a master LVDS PHY and a slave LVDS PHY
may work together to support an LVDS display with 8 data lanes (usually a
dual LVDS link display).  Note that this patch supports only the LVDS PHY mode
for the i.MX8qxp Mixel combo PHY, i.e., the MIPI DPHY mode is yet to be
supported, so for now ->set_mode() returns an error if MIPI
DPHY mode is passed to it for the combo PHY.

Cc: Guido Günther 
Cc: Robert Chiras 
Cc: Kishon Vijay Abraham I 
Cc: Vinod Koul 
Cc: Shawn Guo 
Cc: Sascha Hauer 
Cc: Pengutronix Kernel Team 
Cc: Fabio Estevam 
Cc: NXP Linux Team 
Signed-off-by: Liu Ying 
---
Guido, I also print invalid PHY mode from mixel_dphy_configure().

v1->v2:
* Print invalid PHY mode in dmesg. (Guido)

 drivers/phy/freescale/phy-fsl-imx8-mipi-dphy.c | 270 -
 1 file changed, 259 insertions(+), 11 deletions(-)

diff --git a/drivers/phy/freescale/phy-fsl-imx8-mipi-dphy.c 
b/drivers/phy/freescale/phy-fsl-imx8-mipi-dphy.c
index a95572b..25c97ad 100644
--- a/drivers/phy/freescale/phy-fsl-imx8-mipi-dphy.c
+++ b/drivers/phy/freescale/phy-fsl-imx8-mipi-dphy.c
@@ -4,17 +4,31 @@
  * Copyright 2019 Purism SPC
  */
 
+#include 
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
+
+/* Control and Status Registers(CSR) */
+#define PHY_CTRL   0x00
+#define  CCM_MASK  GENMASK(7, 5)
+#define  CCM(n)FIELD_PREP(CCM_MASK, (n))
+#define  CA_MASK   GENMASK(4, 2)
+#define  CA(n) FIELD_PREP(CA_MASK, (n))
+#define  RFB   BIT(1)
+#define  LVDS_EN   BIT(0)
 
 /* DPHY registers */
 #define DPHY_PD_DPHY   0x00
@@ -55,8 +69,15 @@
 #define PWR_ON 0
 #define PWR_OFF1
 
+#define MIN_VCO_FREQ 640000000
+#define MAX_VCO_FREQ 1500000000
+
+#define MIN_LVDS_REFCLK_FREQ 2400
+#define MAX_LVDS_REFCLK_FREQ 15000
+
 enum mixel_dphy_devtype {
MIXEL_IMX8MQ,
+   MIXEL_IMX8QXP,
 };
 
 struct mixel_dphy_devdata {
@@ -65,6 +86,7 @@ struct mixel_dphy_devdata {
u8 reg_rxlprp;
u8 reg_rxcdrp;
u8 reg_rxhs_settle;
+   bool is_combo;  /* MIPI DPHY and LVDS PHY combo */
 };
 
 static const struct mixel_dphy_devdata mixel_dphy_devdata[] = {
@@ -74,6 +96,10 @@ static const struct mixel_dphy_devdata mixel_dphy_devdata[] 
= {
.reg_rxlprp = 0x40,
.reg_rxcdrp = 0x44,
.reg_rxhs_settle = 0x48,
+   .is_combo = false,
+   },
+   [MIXEL_IMX8QXP] = {
+   .is_combo = true,
},
 };
 
@@ -95,8 +121,12 @@ struct mixel_dphy_cfg {
 struct mixel_dphy_priv {
struct mixel_dphy_cfg cfg;
struct regmap *regmap;
+   struct regmap *lvds_regmap;
struct clk *phy_ref_clk;
const struct mixel_dphy_devdata *devdata;
+   struct imx_sc_ipc *ipc_handle;
+   bool is_slave;
+   int id;
 };
 
 static const struct regmap_config mixel_dphy_regmap_config = {
@@ -317,7 +347,8 @@ static int mixel_dphy_set_pll_params(struct phy *phy)
return 0;
 }
 
-static int mixel_dphy_configure(struct phy *phy, union phy_configure_opts 
*opts)
+static int
+mixel_dphy_configure_mipi_dphy(struct phy *phy, union phy_configure_opts *opts)
 {
struct mixel_dphy_priv *priv = phy_get_drvdata(phy);
struct mixel_dphy_cfg cfg = { 0 };
@@ -345,15 +376,121 @@ static int mixel_dphy_configure(struct phy *phy, union 
phy_configure_opts *opts)
return 0;
 }
 
+static int
+mixel_dphy_configure_lvds_phy(struct phy *phy, union phy_configure_opts *opts)
+{
+   struct mixel_dphy_priv *priv = phy_get_drvdata(phy);
+   struct phy_configure_opts_lvds *lvds_opts = >lvds;
+   unsigned long data_rate;
+   unsigned long fvco;
+   u32 rsc;
+   u32 co;
+   int ret;
+
+   priv->is_slave = lvds_opts->is_slave;
+
+   /* LVDS interface pins */
+   regmap_write(priv->lvds_regmap, PHY_CTRL, CCM(0x5) | CA(0x4) | RFB);
+
+   /* enable MODE8 only for slave LVDS PHY */
+   rsc = priv->id ? IMX_SC_R_MIPI_1 : IMX_SC_R_MIPI_0;
+   ret = imx_sc_misc_set_control(priv->ipc_handle, rsc, IMX_SC_C_DUAL_MODE,
+ lvds_opts->is_slave);
+   if (ret) {
+   dev_err(>dev, "Failed to configure MODE8: %d\n", ret);
+   return ret;
+   }
+
+   /*
+* Choose an appropriate divider ratio to meet the requirement of
+* PLL VCO frequency range.
+*
+*  -  640MHz ~ 1500MHz

[PATCH v2 4/5] dt-bindings: phy: mixel: mipi-dsi-phy: Add Mixel combo PHY support for i.MX8qxp

2020-12-08 Thread Liu Ying

Add support for Mixel MIPI DPHY + LVDS PHY combo IP
as found on Freescale i.MX8qxp SoC.

Cc: Guido Günther 
Cc: Kishon Vijay Abraham I 
Cc: Vinod Koul 
Cc: Rob Herring 
Cc: NXP Linux Team 
Signed-off-by: Liu Ying 
---
v1->v2:
* Add the binding for i.MX8qxp Mixel combo PHY based on the converted binding.
  (Guido)

 .../bindings/phy/mixel,mipi-dsi-phy.yaml   | 41 --
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.yaml 
b/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.yaml
index f869fd2..07b9849 100644
--- a/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.yaml
+++ b/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.yaml
@@ -14,10 +14,14 @@ description: |
   MIPI-DSI IP from Northwest Logic). It represents the physical layer for the
   electrical signals for DSI.
 
+  The Mixel PHY IP block found on i.MX8qxp is a combo PHY that can work
+  in either MIPI-DSI PHY mode or LVDS PHY mode.
+
 properties:
   compatible:
 enum:
   - fsl,imx8mq-mipi-dphy
+  - fsl,imx8qxp-mipi-dphy
 
   reg:
 maxItems: 1
@@ -41,6 +45,11 @@ properties:
   "#phy-cells":
 const: 0
 
+  fsl,syscon:
+$ref: /schemas/types.yaml#/definitions/phandle
+description: |
+  A phandle which points to Control and Status Registers(CSR) module.
+
   power-domains:
 maxItems: 1
 
@@ -49,12 +58,38 @@ required:
   - reg
   - clocks
   - clock-names
-  - assigned-clocks
-  - assigned-clock-parents
-  - assigned-clock-rates
   - "#phy-cells"
   - power-domains
 
+allOf:
+  - if:
+  properties:
+compatible:
+  contains:
+const: fsl,imx8mq-mipi-dphy
+then:
+  properties:
+fsl,syscon: false
+
+  required:
+- assigned-clocks
+- assigned-clock-parents
+- assigned-clock-rates
+
+  - if:
+  properties:
+compatible:
+  contains:
+const: fsl,imx8qxp-mipi-dphy
+then:
+  properties:
+assigned-clocks: false
+assigned-clock-parents: false
+assigned-clock-rates: false
+
+  required:
+- fsl,syscon
+
 additionalProperties: false
 
 examples:
-- 
2.7.4

[PATCH v2 2/5] phy: Add LVDS configuration options

2020-12-08 Thread Liu Ying

This patch allows LVDS PHYs to be configured through
the generic functions and through a custom structure
added to the generic union.

The parameters added here are based on common LVDS PHY
implementation practices.  The set of parameters
should cover all potential users.

Cc: Kishon Vijay Abraham I 
Cc: Vinod Koul 
Cc: NXP Linux Team 
Signed-off-by: Liu Ying 
---
v1->v2:
* No change.

 include/linux/phy/phy-lvds.h | 48 
 include/linux/phy/phy.h  |  4 
 2 files changed, 52 insertions(+)
 create mode 100644 include/linux/phy/phy-lvds.h

diff --git a/include/linux/phy/phy-lvds.h b/include/linux/phy/phy-lvds.h
new file mode 100644
index 0000000..1b5b9d6
--- /dev/null
+++ b/include/linux/phy/phy-lvds.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright 2020 NXP
+ */
+
+#ifndef __PHY_LVDS_H_
+#define __PHY_LVDS_H_
+
+/**
+ * struct phy_configure_opts_lvds - LVDS configuration set
+ *
+ * This structure is used to represent the configuration state of a
+ * LVDS phy.
+ */
+struct phy_configure_opts_lvds {
+   /**
+* @bits_per_lane_and_dclk_cycle:
+*
+* Number of bits per data lane and differential clock cycle.
+*/
+   unsigned int bits_per_lane_and_dclk_cycle;
+
+   /**
+* @differential_clk_rate:
+*
+* Clock rate, in Hertz, of the LVDS differential clock.
+*/
+   unsigned long differential_clk_rate;
+
+   /**
+* @lanes:
+*
+* Number of active, consecutive, data lanes, starting from
+* lane 0, used for the transmissions.
+*/
+   unsigned int lanes;
+
+   /**
+* @is_slave:
+*
+* Boolean, true if the phy is a slave which works together
+* with a master phy to support dual link transmission,
+* otherwise a regular phy or a master phy.
+*/
+   bool is_slave;
+};
+
+#endif /* __PHY_LVDS_H_ */
diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h
index e435bdb..d450b44 100644
--- a/include/linux/phy/phy.h
+++ b/include/linux/phy/phy.h
@@ -17,6 +17,7 @@
 #include 
 
 #include 
+#include 
 #include 
 
 struct phy;
@@ -51,10 +52,13 @@ enum phy_mode {
  * the MIPI_DPHY phy mode.
  * @dp:Configuration set applicable for phys supporting
  * the DisplayPort protocol.
+ * @lvds:  Configuration set applicable for phys supporting
+ * the LVDS phy mode.
  */
 union phy_configure_opts {
struct phy_configure_opts_mipi_dphy mipi_dphy;
        struct phy_configure_opts_dp    dp;
+   struct phy_configure_opts_lvds  lvds;
 };
 
 /**
-- 
2.7.4
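
As a rough illustration of how a consumer might interpret the proposed fields,
the sketch below copies the structure into userspace and derives a per-lane
and an aggregate bit rate from it. The two helpers, lvds_lane_bit_rate() and
lvds_total_bit_rate(), are hypothetical and not part of the patch; they only
apply the arithmetic implied by the field descriptions (bits per lane per
clock cycle times clock rate), and the numbers in the usage note are example
values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace copy of the structure proposed in the patch, for illustration. */
struct phy_configure_opts_lvds {
	unsigned int bits_per_lane_and_dclk_cycle;
	unsigned long differential_clk_rate; /* in Hertz */
	unsigned int lanes;
	bool is_slave;
};

/* Hypothetical helper: bit rate of one data lane, in bits per second. */
static unsigned long long
lvds_lane_bit_rate(const struct phy_configure_opts_lvds *opts)
{
	assert(opts != NULL);
	return (unsigned long long)opts->bits_per_lane_and_dclk_cycle *
	       opts->differential_clk_rate;
}

/* Hypothetical helper: aggregate bit rate over all active data lanes. */
static unsigned long long
lvds_total_bit_rate(const struct phy_configure_opts_lvds *opts)
{
	return lvds_lane_bit_rate(opts) * opts->lanes;
}
```

For example, with 7 bits per lane per clock cycle, a 74.25 MHz differential
clock and 4 lanes, the helpers give 519750000 bits/s per lane and
2079000000 bits/s in total.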

Re: "irq 4: Affinity broken due to vector space exhaustion." warning on restart of ttyS0 console

2020-12-08 Thread Shung-Hsi Yu

Hi Thomas,

On Tue, Nov 10, 2020 at 09:56:27PM +0100, Thomas Gleixner wrote:
> The real problem is irqbalanced aggressively exhausting the vector space
> of a _whole_ socket to the point that there is not a single vector left
> for serial. That's the problem you want to fix.

I believe this warning also gets triggered even when there's _no_ vector
exhaustion.

This seems to happen when the IRQ's affinity mask is set (wrongly) to CPUs on
a different NUMA node (e.g. cpumask_of_node(1) when the irqd->irq == 0).

  $ lscpu
  ...
  NUMA node0 CPU(s):   0-25,52-77
  NUMA node1 CPU(s):   26-51,78-103

  $ cat /sys/kernel/debug/tracing/trace
   ...
  irqbalance-1994[017] d...74.912799: irq_matrix_alloc: bit=33 cpu=26 
online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, 
global_rsvd=341, total_alloc=217
  irqbalance-1994[017] d...74.912802: vector_alloc: irq=4 vector=33 
reserved=0 ret=0
  irqbalance-1994[017] d...74.912804: vector_update: irq=4 vector=33 
cpu=26 prev_vector=33 prev_cpu=7
  irqbalance-1994[017] d...74.912805: vector_config: irq=4 vector=33 
cpu=26 apicdest=0x0040
  -0   [007] d.h.74.970733: vector_free_moved: irq=4 cpu=7 
vector=33 is_managed=0
  -0   [007] d.h.74.970738: irq_matrix_free: bit=33 cpu=7 
online=1 avl=200 alloc=1 managed=1 online_maps=104 global_avl=20687, 
global_rsvd=341, total_alloc=217
   ...
(agetty)-3004[047] d...81.731231: vector_deactivate: irq=4 
is_managed=0 can_reserve=1 reserve=0
(agetty)-3004[047] d...81.738035: vector_clear: irq=4 vector=33 
cpu=26 prev_vector=0 prev_cpu=7
(agetty)-3004[047] d...81.738040: irq_matrix_free: bit=33 cpu=26 
online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20689, 
global_rsvd=341, total_alloc=215
(agetty)-3004[047] d...81.738046: irq_matrix_reserve: 
online_maps=104 global_avl=20689, global_rsvd=342, total_alloc=215
(agetty)-3004[047] d...81.766739: vector_reserve: irq=4 ret=0
(agetty)-3004[047] d...81.766741: vector_config: irq=4 vector=239 
cpu=0 apicdest=0x
(agetty)-3004[047] d...81.777152: vector_activate: irq=4 
is_managed=0 can_reserve=1 reserve=0
(agetty)-3004[047] d...81.777157: vector_alloc: irq=4 vector=0 
reserved=1 ret=-22
  --> irq_matrix_alloc() failed with EINVAL because the cpumask passed in is
      empty, which is a result of affmask being (ff,c000,000f,fc00) and
      cpumask_of_node(node) being (00,3fff,fff0,03ff).

(agetty)-3004[047] d...81.789349: irq_matrix_alloc: bit=33 cpu=1 
online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20688, 
global_rsvd=341, total_alloc=216
(agetty)-3004[047] d...81.789351: vector_alloc: irq=4 vector=33 
reserved=1 ret=0
(agetty)-3004[047] d...81.789353: vector_update: irq=4 vector=33 
cpu=1 prev_vector=0 prev_cpu=26
(agetty)-3004[047] d...81.789355: vector_config: irq=4 vector=33 
cpu=1 apicdest=0x0002
  --> "irq 4: Affinity broken due to vector space exhaustion." warning shows up

(agetty)-3004[047] d...81.900783: irq_matrix_alloc: bit=33 cpu=26 
online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, 
global_rsvd=341, total_alloc=217
(agetty)-3004[047] d...82.053535: vector_alloc: irq=4 vector=33 
reserved=0 ret=0
(agetty)-3004[047] d...82.053536: vector_update: irq=4 vector=33 
cpu=26 prev_vector=33 prev_cpu=1
(agetty)-3004[047] d...82.053538: vector_config: irq=4 vector=33 
cpu=26 apicdest=0x0040


Shung-Hsi Yu
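
The empty-intersection case described above can be reproduced with a small
userspace model. This is a sketch under stated assumptions: struct cpu_mask,
mask_set_range() and mask_intersects() are hypothetical helpers, not the
kernel cpumask API, and the CPU ranges are taken from the lscpu output in the
message (node0: 0-25,52-77; node1: 26-51,78-103).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK_WORDS 2 /* 128 bits, enough for the 104 CPUs in the lscpu output */

/* Hypothetical fixed-size CPU bitmap. */
struct cpu_mask {
	uint64_t bits[MASK_WORDS];
};

/* Set the bits for CPUs first..last (inclusive). */
static void mask_set_range(struct cpu_mask *m, unsigned int first,
			   unsigned int last)
{
	assert(first <= last && last < 64 * MASK_WORDS);
	for (unsigned int cpu = first; cpu <= last; cpu++)
		m->bits[cpu / 64] |= UINT64_C(1) << (cpu % 64);
}

/* True if the two masks share at least one CPU. */
static bool mask_intersects(const struct cpu_mask *a, const struct cpu_mask *b)
{
	for (int i = 0; i < MASK_WORDS; i++)
		if (a->bits[i] & b->bits[i])
			return true;
	return false;
}
```

With an affinity mask covering only node1 CPUs and a node mask covering only
node0 CPUs, mask_intersects() returns false, which models the empty cpumask
that makes the allocation fail even though plenty of vectors are available.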

[PATCH v2 3/5] dt-bindings: phy: Convert mixel,mipi-dsi-phy to json-schema

2020-12-08 Thread Liu Ying

This patch converts the mixel,mipi-dsi-phy binding to
DT schema format using json-schema.

Compared to the plain text version, the new binding adds
the 'assigned-clocks', 'assigned-clock-parents' and
'assigned-clock-rates' properties; otherwise 'make dtbs_check'
would complain about mismatches.  Also, the new
binding requires the 'power-domains' property since all potential
SoCs that embed this PHY would provide a power domain for it.
The example in the new binding is based on the latest
dphy node in imx8mq.dtsi.

Cc: Guido Günther 
Cc: Kishon Vijay Abraham I 
Cc: Vinod Koul 
Cc: Rob Herring 
Cc: NXP Linux Team 
Signed-off-by: Liu Ying 
---
v1->v2:
* Newly introduced in v2.  (Guido)

 .../devicetree/bindings/phy/mixel,mipi-dsi-phy.txt | 29 -
 .../bindings/phy/mixel,mipi-dsi-phy.yaml   | 73 ++
 2 files changed, 73 insertions(+), 29 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.txt
 create mode 100644 
Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.yaml

diff --git a/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.txt 
b/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.txt
deleted file mode 100644
index 9b23407..0000000
--- a/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.txt
+++ /dev/null
@@ -1,29 +0,0 @@
-Mixel DSI PHY for i.MX8
-
-The Mixel MIPI-DSI PHY IP block is e.g. found on i.MX8 platforms (along the
-MIPI-DSI IP from Northwest Logic). It represents the physical layer for the
-electrical signals for DSI.
-
-Required properties:
-- compatible: Must be:
-  - "fsl,imx8mq-mipi-dphy"
-- clocks: Must contain an entry for each entry in clock-names.
-- clock-names: Must contain the following entries:
-  - "phy_ref": phandle and specifier referring to the DPHY ref clock
-- reg: the register range of the PHY controller
-- #phy-cells: number of cells in PHY, as defined in
-  Documentation/devicetree/bindings/phy/phy-bindings.txt
-  this must be <0>
-
-Optional properties:
-- power-domains: phandle to power domain
-
-Example:
-   dphy: dphy@30a0030 {
-   compatible = "fsl,imx8mq-mipi-dphy";
-   clocks = < IMX8MQ_CLK_DSI_PHY_REF>;
-   clock-names = "phy_ref";
-   reg = <0x30a00300 0x100>;
-   power-domains = <_mipi0>;
-   #phy-cells = <0>;
-};
diff --git a/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.yaml b/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.yaml
new file mode 100644
index ..f869fd2
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.yaml
@@ -0,0 +1,73 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/mixel,mipi-dsi-phy.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Mixel DSI PHY for i.MX8
+
+maintainers:
+  - Guido Günther 
+
+description: |
+  The Mixel MIPI-DSI PHY IP block is e.g. found on i.MX8 platforms (along the
+  MIPI-DSI IP from Northwest Logic). It represents the physical layer for the
+  electrical signals for DSI.
+
+properties:
+  compatible:
+enum:
+  - fsl,imx8mq-mipi-dphy
+
+  reg:
+maxItems: 1
+
+  clocks:
+maxItems: 1
+
+  clock-names:
+items:
+  - const: phy_ref
+
+  assigned-clocks:
+maxItems: 1
+
+  assigned-clock-parents:
+maxItems: 1
+
+  assigned-clock-rates:
+maxItems: 1
+
+  "#phy-cells":
+const: 0
+
+  power-domains:
+maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+  - assigned-clocks
+  - assigned-clock-parents
+  - assigned-clock-rates
+  - "#phy-cells"
+  - power-domains
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+dphy: dphy@30a0030 {
+compatible = "fsl,imx8mq-mipi-dphy";
+reg = <0x30a00300 0x100>;
+clocks = < IMX8MQ_CLK_DSI_PHY_REF>;
+clock-names = "phy_ref";
+assigned-clocks = < IMX8MQ_CLK_DSI_PHY_REF>;
+assigned-clock-parents = < IMX8MQ_VIDEO_PLL1_OUT>;
+assigned-clock-rates = <2400>;
+#phy-cells = <0>;
+power-domains = <_mipi>;
+};
-- 
2.7.4

Re: scheduling while atomic in z3fold

2020-12-08 Thread Mike Galbraith

On Wed, 2020-12-09 at 07:13 +0100, Mike Galbraith wrote:
> On Wed, 2020-12-09 at 00:26 +0100, Vitaly Wool wrote:
> > Hi Mike,
> >
> > On 2020-12-07 16:41, Mike Galbraith wrote:
> > > On Mon, 2020-12-07 at 16:21 +0100, Vitaly Wool wrote:
> > >> On Mon, Dec 7, 2020 at 1:34 PM Mike Galbraith  wrote:
> > >>>
> > >>
> > >>> Unfortunately, that made zero difference.
> > >>
> > >> Okay, I suggest that you submit the patch that changes read_lock() to
> > >> write_lock() in __release_z3fold_page() and I'll ack it then.
> > >> I would like to rewrite the code so that write_lock is not necessary
> > >> there but I don't want to hold you back and it isn't likely that I'll
> > >> complete this today.
> > >
> > > Nah, I'm in no rush... especially not to sign off on "Because the
> > > little voices in my head said this bit should look like that bit over
> > > yonder, and testing _seems_ to indicate they're right about that" :)
> > >
> > >   -Mike
> > >
> >
> > okay, thanks. Would this make things better:
>
> Yup, z3fold became RT tolerant with this (un-munged and) applied.

Below is the other change that any RT users of z3fold will need.

mm, z3fold: Remove preempt disabled sections for RT

Replace get_cpu_ptr() with migrate_disable()+this_cpu_ptr() so RT can take
spinlocks that become sleeping locks.

Signed-off-by: Mike Galbraith
---
 mm/z3fold.c |   17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -617,14 +617,16 @@ static inline void add_to_unbuddied(stru
 {
if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0 ||
zhdr->middle_chunks == 0) {
-   struct list_head *unbuddied = get_cpu_ptr(pool->unbuddied);
-
+   struct list_head *unbuddied;
int freechunks = num_free_chunks(zhdr);
+
+   migrate_disable();
+   unbuddied = this_cpu_ptr(pool->unbuddied);
spin_lock(>lock);
list_add(>buddy, [freechunks]);
spin_unlock(>lock);
zhdr->cpu = smp_processor_id();
-   put_cpu_ptr(pool->unbuddied);
+   migrate_enable();
}
 }

@@ -861,8 +863,9 @@ static inline struct z3fold_header *__z3
int chunks = size_to_chunks(size), i;

 lookup:
+   migrate_disable();
/* First, try to find an unbuddied z3fold page. */
-   unbuddied = get_cpu_ptr(pool->unbuddied);
+   unbuddied = this_cpu_ptr(pool->unbuddied);
for_each_unbuddied_list(i, chunks) {
struct list_head *l = [i];

@@ -880,7 +883,7 @@ static inline struct z3fold_header *__z3
!z3fold_page_trylock(zhdr)) {
spin_unlock(>lock);
zhdr = NULL;
-   put_cpu_ptr(pool->unbuddied);
+   migrate_enable();
if (can_sleep)
cond_resched();
goto lookup;
@@ -894,7 +897,7 @@ static inline struct z3fold_header *__z3
test_bit(PAGE_CLAIMED, >private)) {
z3fold_page_unlock(zhdr);
zhdr = NULL;
-   put_cpu_ptr(pool->unbuddied);
+   migrate_enable();
if (can_sleep)
cond_resched();
goto lookup;
@@ -909,7 +912,7 @@ static inline struct z3fold_header *__z3
kref_get(>refcount);
break;
}
-   put_cpu_ptr(pool->unbuddied);
+   migrate_enable();

if (!zhdr) {
int cpu;

[PATCH v2 0/5] phy: phy-fsl-imx8-mipi-dphy: Add i.MX8qxp LVDS PHY mode support

2020-12-08 Thread Liu Ying

Hi,

This series adds i.MX8qxp LVDS PHY mode support for the Mixel PHY in the
Freescale i.MX8qxp SoC.

The Mixel PHY is a MIPI DPHY + LVDS PHY combo, which can work in either
MIPI DPHY mode or LVDS PHY mode.  The PHY mode is controlled by i.MX8qxp
SCU firmware.  The PHY driver would call a SCU function to configure the
mode.

The PHY driver already supports the Mixel MIPI DPHY in the i.MX8mq SoC,
where it appears to be a single MIPI DPHY.


Patch 1/5 sets PHY mode in the Northwest Logic MIPI DSI host controller
bridge driver, since i.MX8qxp SoC embeds this controller IP to support
MIPI DSI displays together with the Mixel PHY.

Patch 2/5 allows LVDS PHYs to be configured through the generic PHY functions
and through a custom structure added to the generic PHY configuration union.

Patch 3/5 converts mixel,mipi-dsi-phy plain text dt binding to json-schema.

Patch 4/5 adds dt binding support for the Mixel combo PHY in i.MX8qxp SoC.

Patch 5/5 adds the i.MX8qxp LVDS PHY mode support in the Mixel PHY driver.


Comments are welcome, thanks.

v1->v2:
* Convert mixel,mipi-dsi-phy plain text dt binding to json-schema. (Guido)
* Print invalid PHY mode in dmesg from the Mixel PHY driver. (Guido)
* Add Guido's R-b tag on the patch for the nwl-dsi drm bridge driver.

Liu Ying (5):
  drm/bridge: nwl-dsi: Set PHY mode in nwl_dsi_enable()
  phy: Add LVDS configuration options
  dt-bindings: phy: Convert mixel,mipi-dsi-phy to json-schema
  dt-bindings: phy: mixel: mipi-dsi-phy: Add Mixel combo PHY support for
i.MX8qxp
  phy: freescale: phy-fsl-imx8-mipi-dphy: Add i.MX8qxp LVDS PHY mode
support

 .../devicetree/bindings/phy/mixel,mipi-dsi-phy.txt |  29 ---
 .../bindings/phy/mixel,mipi-dsi-phy.yaml   | 108 +
 drivers/gpu/drm/bridge/nwl-dsi.c   |   6 +
 drivers/phy/freescale/phy-fsl-imx8-mipi-dphy.c | 270 -
 include/linux/phy/phy-lvds.h   |  48 
 include/linux/phy/phy.h|   4 +
 6 files changed, 425 insertions(+), 40 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.txt
 create mode 100644 Documentation/devicetree/bindings/phy/mixel,mipi-dsi-phy.yaml
 create mode 100644 include/linux/phy/phy-lvds.h

-- 
2.7.4

[PATCH v2 1/5] drm/bridge: nwl-dsi: Set PHY mode in nwl_dsi_enable()

2020-12-08 Thread Liu Ying

The Northwest Logic MIPI DSI host controller embedded in i.MX8qxp
works with a Mixel MIPI DPHY + LVDS PHY combo to support either
a MIPI DSI display or an LVDS display.  So, this patch calls
phy_set_mode() from nwl_dsi_enable() to set PHY mode to MIPI DPHY
explicitly.
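
The ordering described above can be sketched in userspace as follows. This is
only an illustration under stated assumptions: mock_phy_init(),
mock_phy_set_mode(), mock_phy_configure(), mock_phy_exit() and
mock_dsi_enable() are hypothetical stand-ins, not the kernel's generic PHY
API. The point is that the mode is set after init and before configure, and
that a failure unwinds through the same label that undoes the init.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the generic PHY API, for illustration only. */
enum mock_phy_mode {
	MOCK_PHY_MODE_INVALID,
	MOCK_PHY_MODE_MIPI_DPHY,
	MOCK_PHY_MODE_LVDS,
};

struct mock_phy {
	int initialized;
	enum mock_phy_mode mode;
	int configured;
	int fail_set_mode;	/* test knob: force mock_phy_set_mode() to fail */
};

static int mock_phy_init(struct mock_phy *phy)
{
	phy->initialized = 1;
	return 0;
}

static void mock_phy_exit(struct mock_phy *phy)
{
	phy->initialized = 0;
}

static int mock_phy_set_mode(struct mock_phy *phy, enum mock_phy_mode mode)
{
	if (phy->fail_set_mode)
		return -EINVAL;
	phy->mode = mode;
	return 0;
}

static int mock_phy_configure(struct mock_phy *phy)
{
	/* A combo PHY cannot be configured until its mode is known. */
	if (phy->mode == MOCK_PHY_MODE_INVALID)
		return -EINVAL;
	phy->configured = 1;
	return 0;
}

/* init -> set mode -> configure, with goto-based unwinding on error. */
static int mock_dsi_enable(struct mock_phy *phy)
{
	int ret;

	assert(phy != NULL);

	ret = mock_phy_init(phy);
	if (ret < 0)
		return ret;

	ret = mock_phy_set_mode(phy, MOCK_PHY_MODE_MIPI_DPHY);
	if (ret < 0) {
		fprintf(stderr, "Failed to set DSI phy mode: %d\n", ret);
		goto uninit_phy;
	}

	ret = mock_phy_configure(phy);
	if (ret < 0) {
		fprintf(stderr, "Failed to configure DSI phy: %d\n", ret);
		goto uninit_phy;
	}

	return 0;

uninit_phy:
	mock_phy_exit(phy);
	return ret;
}
```

Calling mock_dsi_enable() on a zero-initialized struct mock_phy leaves it
initialized, in DPHY mode and configured; with fail_set_mode set, the call
returns -EINVAL and the PHY is left uninitialized again.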

Cc: Guido Günther 
Cc: Robert Chiras 
Cc: Martin Kepplinger 
Cc: Andrzej Hajda 
Cc: Neil Armstrong 
Cc: Laurent Pinchart 
Cc: Jonas Karlman 
Cc: Jernej Skrabec 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: NXP Linux Team 
Reviewed-by: Guido Günther 
Signed-off-by: Liu Ying 
---
v1->v2:
* Add Guido's R-b tag.

 drivers/gpu/drm/bridge/nwl-dsi.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/bridge/nwl-dsi.c b/drivers/gpu/drm/bridge/nwl-dsi.c
index 66b6740..be6bfc5 100644
--- a/drivers/gpu/drm/bridge/nwl-dsi.c
+++ b/drivers/gpu/drm/bridge/nwl-dsi.c
@@ -678,6 +678,12 @@ static int nwl_dsi_enable(struct nwl_dsi *dsi)
return ret;
}
 
+   ret = phy_set_mode(dsi->phy, PHY_MODE_MIPI_DPHY);
+   if (ret < 0) {
+   DRM_DEV_ERROR(dev, "Failed to set DSI phy mode: %d\n", ret);
+   goto uninit_phy;
+   }
+
ret = phy_configure(dsi->phy, phy_cfg);
if (ret < 0) {
DRM_DEV_ERROR(dev, "Failed to configure DSI phy: %d\n", ret);
-- 
2.7.4

[PATCH 1/1] scsi: ufs-mediatek: use correct path to fix compiling error

2020-12-08 Thread Zhen Lei

When the kernel is compiled with allmodconfig, the following error is
reported:
In file included from drivers/scsi/ufs/ufs-mediatek-trace.h:36:0,
 from drivers/scsi/ufs/ufs-mediatek.c:28:
./include/trace/define_trace.h:95:42: fatal error: ./ufs-mediatek-trace.h: No such file or directory
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)

The comment in include/trace/define_trace.h specifies that:
TRACE_INCLUDE_PATH: Note, the path is relative to define_trace.h, not the
file including it. Full path names for out of tree modules must be used.
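
To illustrate the rule quoted above, here is a small userspace sketch that
resolves a TRACE_INCLUDE_PATH value against the directory that holds
define_trace.h. The helper resolve_relative() is hypothetical and only mimics
textual path resolution; it is not what the preprocessor literally runs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: resolve "rel" against directory "base" and write the
 * normalized result (no "." or ".." components) into "out".
 * Returns 0 on success, -1 if the result does not fit or climbs above the
 * top of the tree.
 */
static int resolve_relative(const char *base, const char *rel,
			    char *out, size_t outlen)
{
	char joined[512];
	const char *parts[64];
	int nparts = 0;
	size_t used = 0;
	char *tok;
	int i, n;

	assert(base && rel && out);
	if (outlen == 0)
		return -1;

	n = snprintf(joined, sizeof(joined), "%s/%s", base, rel);
	if (n < 0 || (size_t)n >= sizeof(joined))
		return -1;

	for (tok = strtok(joined, "/"); tok; tok = strtok(NULL, "/")) {
		if (strcmp(tok, ".") == 0)
			continue;		/* "." stays in the same directory */
		if (strcmp(tok, "..") == 0) {
			if (nparts == 0)
				return -1;	/* would leave the tree */
			nparts--;		/* ".." drops the last component */
			continue;
		}
		if (nparts == 64)
			return -1;
		parts[nparts++] = tok;
	}

	out[0] = '\0';
	for (i = 0; i < nparts; i++) {
		n = snprintf(out + used, outlen - used, "%s%s",
			     i ? "/" : "", parts[i]);
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Resolving "./ufs-mediatek-trace.h" against "include/trace" stays inside
include/trace/, where the header does not exist, while
"../../drivers/scsi/ufs//ufs-mediatek-trace.h" lands in drivers/scsi/ufs/.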

So without "CFLAGS_ufs-mediatek.o := -I$(src)", the current directory "."
is "include/trace/", the relative path of ufs-mediatek-trace.h is
"../../drivers/scsi/ufs/".

Fixes: ca1bb061d644 ("scsi: ufs-mediatek: Introduce event_notify implementation")
Signed-off-by: Zhen Lei 
---
 drivers/scsi/ufs/ufs-mediatek-trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufs-mediatek-trace.h b/drivers/scsi/ufs/ufs-mediatek-trace.h
index fd6f84c1b4e2256..895e82ea6ece551 100644
--- a/drivers/scsi/ufs/ufs-mediatek-trace.h
+++ b/drivers/scsi/ufs/ufs-mediatek-trace.h
@@ -31,6 +31,6 @@ TRACE_EVENT(ufs_mtk_event,
 
 #undef TRACE_INCLUDE_PATH
 #undef TRACE_INCLUDE_FILE
-#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_PATH ../../drivers/scsi/ufs/
 #define TRACE_INCLUDE_FILE ufs-mediatek-trace
 #include 
-- 
2.26.0.106.g9fadedd

[PATCH 0/1] scsi: ufs-mediatek: use correct path to fix compiling error

2020-12-08 Thread Zhen Lei

This patch is based on the latest linux-next code. So the Fixes commit-id
may change when it is merged into v5.11-rc1.


Zhen Lei (1):
  scsi: ufs-mediatek: use correct path to fix compiling error

 drivers/scsi/ufs/ufs-mediatek-trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.26.0.106.g9fadedd

Re: [f2fs-dev] [PATCH v4] f2fs: compress: support chksum

2020-12-08 Thread Chao Yu


On 2020/12/9 12:28, Chao Yu wrote:

On 2020/12/9 11:54, Jaegeuk Kim wrote:

Ah, could you please write another patch to adjust the new changes?


No problem, will drop "f2fs: compress:support chksum" based on your dev branch,
and apply all compress related patches on top of dev branch.


Jaegeuk, could you please
- drop "f2fs: compress:support chksum",
- manually fix conflict when applying "f2fs: add compress_mode mount option"
- and then apply my last resent patches.

Thanks,

[PATCH RESEND 5/6] f2fs: introduce a new per-sb directory in sysfs

2020-12-08 Thread Chao Yu

Add a new directory 'stat' under /sys/fs/f2fs//. New read-only stat sysfs
files can be added to this directory later, which keeps the parent
directory less cluttered.

Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h  |  5 +++-
 fs/f2fs/sysfs.c | 69 +
 2 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index fbaef39e51df..cb94f650ec3d 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1549,9 +1549,12 @@ struct f2fs_sb_info {
unsigned int node_io_flag;
 
/* For sysfs suppport */
-   struct kobject s_kobj;
+   struct kobject s_kobj;  /* /sys/fs/f2fs/ */
struct completion s_kobj_unregister;
 
+   struct kobject s_stat_kobj; /* /sys/fs/f2fs//stat 
*/
+   struct completion s_stat_kobj_unregister;
+
/* For shrinker support */
struct list_head s_list;
int s_ndevs;/* number of devices */
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 989a649cfa8b..ebca0b4961e8 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -711,6 +711,11 @@ static struct attribute *f2fs_feat_attrs[] = {
 };
 ATTRIBUTE_GROUPS(f2fs_feat);
 
+static struct attribute *f2fs_stat_attrs[] = {
+   NULL,
+};
+ATTRIBUTE_GROUPS(f2fs_stat);
+
 static const struct sysfs_ops f2fs_attr_ops = {
.show   = f2fs_attr_show,
.store  = f2fs_attr_store,
@@ -739,6 +744,44 @@ static struct kobject f2fs_feat = {
.kset   = _kset,
 };
 
+static ssize_t f2fs_stat_attr_show(struct kobject *kobj,
+   struct attribute *attr, char *buf)
+{
+   struct f2fs_sb_info *sbi = container_of(kobj, struct f2fs_sb_info,
+   s_stat_kobj);
+   struct f2fs_attr *a = container_of(attr, struct f2fs_attr, attr);
+
+   return a->show ? a->show(a, sbi, buf) : 0;
+}
+
+static ssize_t f2fs_stat_attr_store(struct kobject *kobj, struct attribute 
*attr,
+   const char *buf, size_t len)
+{
+   struct f2fs_sb_info *sbi = container_of(kobj, struct f2fs_sb_info,
+   s_stat_kobj);
+   struct f2fs_attr *a = container_of(attr, struct f2fs_attr, attr);
+
+   return a->store ? a->store(a, sbi, buf, len) : 0;
+}
+
+static void f2fs_stat_kobj_release(struct kobject *kobj)
+{
+   struct f2fs_sb_info *sbi = container_of(kobj, struct f2fs_sb_info,
+   s_stat_kobj);
+   complete(>s_stat_kobj_unregister);
+}
+
+static const struct sysfs_ops f2fs_stat_attr_ops = {
+   .show   = f2fs_stat_attr_show,
+   .store  = f2fs_stat_attr_store,
+};
+
+static struct kobj_type f2fs_stat_ktype = {
+   .default_groups = f2fs_stat_groups,
+   .sysfs_ops  = _stat_attr_ops,
+   .release= f2fs_stat_kobj_release,
+};
+
 static int __maybe_unused segment_info_seq_show(struct seq_file *seq,
void *offset)
 {
@@ -945,11 +988,15 @@ int f2fs_register_sysfs(struct f2fs_sb_info *sbi)
init_completion(>s_kobj_unregister);
err = kobject_init_and_add(>s_kobj, _sb_ktype, NULL,
"%s", sb->s_id);
-   if (err) {
-   kobject_put(>s_kobj);
-   wait_for_completion(>s_kobj_unregister);
-   return err;
-   }
+   if (err)
+   goto put_sb_kobj;
+
+   sbi->s_stat_kobj.kset = _kset;
+   init_completion(>s_stat_kobj_unregister);
+   err = kobject_init_and_add(>s_stat_kobj, _stat_ktype,
+   >s_kobj, "stat");
+   if (err)
+   goto put_stat_kobj;
 
if (f2fs_proc_root)
sbi->s_proc = proc_mkdir(sb->s_id, f2fs_proc_root);
@@ -965,6 +1012,13 @@ int f2fs_register_sysfs(struct f2fs_sb_info *sbi)
victim_bits_seq_show, sb);
}
return 0;
+put_stat_kobj:
+   kobject_put(>s_stat_kobj);
+   wait_for_completion(>s_stat_kobj_unregister);
+put_sb_kobj:
+   kobject_put(>s_kobj);
+   wait_for_completion(>s_kobj_unregister);
+   return err;
 }
 
 void f2fs_unregister_sysfs(struct f2fs_sb_info *sbi)
@@ -976,6 +1030,11 @@ void f2fs_unregister_sysfs(struct f2fs_sb_info *sbi)
remove_proc_entry("victim_bits", sbi->s_proc);
remove_proc_entry(sbi->sb->s_id, f2fs_proc_root);
}
+
+   kobject_del(>s_stat_kobj);
+   kobject_put(>s_stat_kobj);
+   wait_for_completion(>s_stat_kobj_unregister);
+
kobject_del(>s_kobj);
kobject_put(>s_kobj);
wait_for_completion(>s_kobj_unregister);
-- 
2.29.2

[RFC PATCH v7] sched/fair: select idle cpu from idle cpumask for task wakeup

2020-12-08 Thread Aubrey Li

Add idle cpumask to track idle cpus in sched domain. Every time
a CPU enters idle, the CPU is set in idle cpumask to be a wakeup
target. And if the CPU is not in idle, the CPU is cleared in idle
cpumask during scheduler tick to ratelimit idle cpumask update.

When a task wakes up to select an idle cpu, scanning idle cpumask
has lower cost than scanning all the cpus in last level cache domain,
especially when the system is heavily loaded.
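
A minimal userspace sketch of the mechanism described above, assuming a single
64-bit word stands in for the per-LLC idle cpumask. The names
(idle_mask_sketch, sketch_enter_idle(), sketch_tick(),
sketch_select_idle_cpu()) are hypothetical, and the sketch leaves out the scan
budget and the final idleness check that the real select_idle_cpu() performs.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 64	/* assumption: one 64-bit word is enough here */

/* Hypothetical stand-in for the shared idle mask of one LLC domain. */
struct idle_mask_sketch {
	uint64_t idle_cpus;	/* bit n set => CPU n is believed to be idle */
};

/* Idle entry: always mark the CPU as a wakeup candidate. */
static void sketch_enter_idle(struct idle_mask_sketch *m, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	m->idle_cpus |= UINT64_C(1) << cpu;
}

/* Scheduler tick: clear the bit only if the CPU is found busy. */
static void sketch_tick(struct idle_mask_sketch *m, int cpu, int cpu_is_idle)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	if (!cpu_is_idle)
		m->idle_cpus &= ~(UINT64_C(1) << cpu);
}

/*
 * Wakeup: scan only (idle mask & allowed mask), starting at "target" and
 * wrapping around. Returns the chosen CPU, or -1 if there is no candidate.
 */
static int sketch_select_idle_cpu(const struct idle_mask_sketch *m,
				  uint64_t allowed, int target)
{
	uint64_t candidates = m->idle_cpus & allowed;
	int i;

	assert(target >= 0 && target < SKETCH_NR_CPUS);
	for (i = 0; i < SKETCH_NR_CPUS; i++) {
		int cpu = (target + i) % SKETCH_NR_CPUS;

		if (candidates & (UINT64_C(1) << cpu))
			return cpu;
	}
	return -1;
}
```

Because the bit is set on every idle entry but cleared only from the tick, the
mask is updated at most once per tick on the busy side, which is the
ratelimiting the changelog refers to.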

Benchmarks including hackbench, schbench, uperf, sysbench mysql
and kbuild were run on an x86 4-socket system with 24 cores per
socket and 2 hyperthreads per core (192 CPUs in total); no regression
was found.

v6->v7:
- place the whole idle cpumask mechanism under CONFIG_SMP.

v5->v6:
- decouple idle cpumask update from stop_tick signal, set idle CPU
  in idle cpumask every time the CPU enters idle

v4->v5:
- add update_idle_cpumask for s2idle case
- keep the same ordering of tick_nohz_idle_stop_tick() and update_
  idle_cpumask() everywhere

v3->v4:
- change setting idle cpumask from every idle entry to tickless idle
  if cpu driver is available.
- move clearing idle cpumask to scheduler_tick to decouple nohz mode.

v2->v3:
- change setting idle cpumask to every idle entry, otherwise schbench
  has a regression of 99th percentile latency.
- change clearing idle cpumask to nohz_balancer_kick(), so updating
  idle cpumask is ratelimited in the idle exiting path.
- set SCHED_IDLE cpu in idle cpumask to allow it as a wakeup target.

v1->v2:
- idle cpumask is updated in the nohz routines, by initializing idle
  cpumask with sched_domain_span(sd), nohz=off case remains the original
  behavior.

Cc: Peter Zijlstra 
Cc: Mel Gorman 
Cc: Vincent Guittot 
Cc: Qais Yousef 
Cc: Valentin Schneider 
Cc: Jiang Biao 
Cc: Tim Chen 
Signed-off-by: Aubrey Li 
---
 include/linux/sched/topology.h | 13 +
 kernel/sched/core.c|  2 ++
 kernel/sched/fair.c| 51 +-
 kernel/sched/idle.c|  5 
 kernel/sched/sched.h   |  4 +++
 kernel/sched/topology.c|  3 +-
 6 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 820511289857..b47b85163607 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -65,8 +65,21 @@ struct sched_domain_shared {
atomic_tref;
atomic_tnr_busy_cpus;
int has_idle_cores;
+   /*
+* Span of all idle CPUs in this domain.
+*
+* NOTE: this field is variable length. (Allocated dynamically
+* by attaching extra space to the end of the structure,
+* depending on how many CPUs the kernel has booted up with)
+*/
+   unsigned long   idle_cpus_span[];
 };
 
+static inline struct cpumask *sds_idle_cpus(struct sched_domain_shared *sds)
+{
+   return to_cpumask(sds->idle_cpus_span);
+}
+
 struct sched_domain {
/* These fields must be setup */
struct sched_domain __rcu *parent;  /* top domain must be null 
terminated */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c4da7e17b906..c4c51ff3402a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4011,6 +4011,7 @@ void scheduler_tick(void)
 
 #ifdef CONFIG_SMP
rq->idle_balance = idle_cpu(cpu);
+   update_idle_cpumask(cpu, false);
trigger_load_balance(rq);
 #endif
 }
@@ -7186,6 +7187,7 @@ void __init sched_init(void)
rq->idle_stamp = 0;
rq->avg_idle = 2*sysctl_sched_migration_cost;
rq->max_idle_balance_cost = sysctl_sched_migration_cost;
+   rq->last_idle_state = 1;
 
INIT_LIST_HEAD(>cfs_tasks);
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c0c4d9ad7da8..7306f8886120 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6146,7 +6146,12 @@ static int select_idle_cpu(struct task_struct *p, struct 
sched_domain *sd, int t
 
time = cpu_clock(this);
 
-   cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+   /*
+* sched_domain_shared is set only at shared cache level,
+* this works only because select_idle_cpu is called with
+* sd_llc.
+*/
+   cpumask_and(cpus, sds_idle_cpus(sd->shared), p->cpus_ptr);
 
for_each_cpu_wrap(cpu, cpus, target) {
if (!--nr)
@@ -6806,6 +6811,50 @@ balance_fair(struct rq *rq, struct task_struct *prev, 
struct rq_flags *rf)
 
return newidle_balance(rq, rf) != 0;
 }
+
+/*
+ * Update cpu idle state and record this information
+ * in sd_llc_shared->idle_cpus_span.
+ */
+void update_idle_cpumask(int cpu, bool set_idle)
+{
+   struct sched_domain *sd;
+   struct rq *rq = cpu_rq(cpu);
+   int idle_state;
+
+   /*
+* If called from scheduler tick, only update
+* idle cpumask if the CPU is busy, as idle
+* cpumask is also updated on idle entry.

[PATCH RESEND 2/6] f2fs: compress: add compress_inode to cache compressed blocks

2020-12-08 Thread Chao Yu

Support using the address space of an inner inode to cache compressed blocks,
in order to improve the cache hit ratio of random reads.

Signed-off-by: Chao Yu 
---
 Documentation/filesystems/f2fs.rst |   3 +
 fs/f2fs/compress.c | 198 +++--
 fs/f2fs/data.c |  29 -
 fs/f2fs/debug.c|  13 ++
 fs/f2fs/f2fs.h |  34 -
 fs/f2fs/gc.c   |   1 +
 fs/f2fs/inode.c|  21 ++-
 fs/f2fs/segment.c  |   6 +-
 fs/f2fs/super.c|  19 ++-
 include/linux/f2fs_fs.h|   1 +
 10 files changed, 305 insertions(+), 20 deletions(-)

diff --git a/Documentation/filesystems/f2fs.rst 
b/Documentation/filesystems/f2fs.rst
index 41de149f11cb..1169e6e9981e 100644
--- a/Documentation/filesystems/f2fs.rst
+++ b/Documentation/filesystems/f2fs.rst
@@ -268,6 +268,9 @@ compress_mode=%s Control file compression mode. This 
supports "fs" and "user"
 compression/decompression on the compression enabled 
files using
 ioctls.
 compress_chksum Support verifying chksum of raw data in 
compressed cluster.
+compress_cache  Support to use address space of a filesystem managed 
inode to
+cache compressed block, in order to improve cache hit 
ratio of
+random read.
 inlinecrypt When possible, encrypt/decrypt the contents of 
encrypted
 files using the blk-crypto framework rather than
 filesystem-layer encryption. This allows the use of
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 4bcbacfe3325..446dd41a7bad 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -12,9 +12,11 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
+#include "segment.h"
 #include 
 
 static struct kmem_cache *cic_entry_slab;
@@ -721,25 +723,14 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
return ret;
 }
 
-void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
+void f2fs_do_decompress_pages(struct decompress_io_ctx *dic, bool verity)
 {
-   struct decompress_io_ctx *dic =
-   (struct decompress_io_ctx *)page_private(page);
-   struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
struct f2fs_inode_info *fi= F2FS_I(dic->inode);
const struct f2fs_compress_ops *cops =
f2fs_cops[fi->i_compress_algorithm];
int ret;
int i;
 
-   dec_page_count(sbi, F2FS_RD_DATA);
-
-   if (bio->bi_status || PageError(page))
-   dic->failed = true;
-
-   if (atomic_dec_return(>pending_pages))
-   return;
-
trace_f2fs_decompress_pages_start(dic->inode, dic->cluster_idx,
dic->cluster_size, fi->i_compress_algorithm);
 
@@ -797,6 +788,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page 
*page, bool verity)
ret = cops->decompress_pages(dic);
 
if (!ret && (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)) {
+   struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
u32 provided = le32_to_cpu(dic->cbuf->chksum);
u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
 
@@ -830,6 +822,30 @@ void f2fs_decompress_pages(struct bio *bio, struct page 
*page, bool verity)
f2fs_free_dic(dic);
 }
 
+void f2fs_cache_compressed_page(struct f2fs_sb_info *sbi, struct page *page,
+   nid_t ino, block_t blkaddr);
+void f2fs_decompress_pages(struct bio *bio, struct page *page,
+   bool verity, unsigned int ofs)
+{
+   struct decompress_io_ctx *dic =
+   (struct decompress_io_ctx *)page_private(page);
+   struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
+   block_t blkaddr;
+
+   dec_page_count(sbi, F2FS_RD_DATA);
+
+   if (bio->bi_status || PageError(page))
+   dic->failed = true;
+
+   blkaddr = SECTOR_TO_BLOCK(bio->bi_iter.bi_sector) + ofs;
+   f2fs_cache_compressed_page(sbi, page, dic->inode->i_ino, blkaddr);
+
+   if (atomic_dec_return(>pending_pages))
+   return;
+
+   f2fs_do_decompress_pages(dic, verity);
+}
+
 static bool is_page_in_cluster(struct compress_ctx *cc, pgoff_t index)
 {
if (cc->cluster_idx == NULL_CLUSTER)
@@ -1600,6 +1616,164 @@ void f2fs_decompress_end_io(struct page **rpages,
}
 }
 
+const struct address_space_operations f2fs_compress_aops = {
+   .releasepage = f2fs_release_page,
+   .invalidatepage = f2fs_invalidate_page,
+};
+
+struct address_space *COMPRESS_MAPPING(struct f2fs_sb_info *sbi)
+{
+   return sbi->compress_inode->i_mapping;
+}
+
+void f2fs_invalidate_compress_page(struct f2fs_sb_info *sbi, block_t blkaddr)
+{
+

[PATCH RESEND 3/6] f2fs: compress: support compress level

2020-12-08 Thread Chao Yu

Expand the 'compress_algorithm' mount option to accept a parameter in the format of
:, which lets the user configure the lz4 and zstd compression level
more specifically, so that f2fs compression can provide a better
compression ratio.
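
As a rough illustration of the option format, the following userspace sketch
parses an algorithm name with an optional level. It assumes the
"compress_algorithm=%s:%d" form and the level ranges given in the
documentation hunk of this patch (lz4: 3 - 16, zstd: 1 - 22). The names
compress_opt_sketch, sketch_is_known() and parse_compress_opt_sketch() are
hypothetical, and this is not the parser used in fs/f2fs/super.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for the sketch; not an f2fs structure. */
struct compress_opt_sketch {
	char algorithm[16];
	int level;		/* 0 means "no level given, use the default" */
};

static int sketch_is_known(const char *name)
{
	static const char *const known[] = { "lzo", "lz4", "zstd", "lzo-rle" };
	size_t i;

	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++)
		if (strcmp(name, known[i]) == 0)
			return 1;
	return 0;
}

/*
 * Parse "name" or "name:level". Only lz4 and zstd accept a level.
 * Returns 0 on success or -EINVAL on malformed or out-of-range input.
 */
static int parse_compress_opt_sketch(const char *str,
				     struct compress_opt_sketch *opt)
{
	const char *colon;
	size_t namelen;
	char *end;
	long level;
	int min, max;

	assert(str && opt);
	colon = strchr(str, ':');
	namelen = colon ? (size_t)(colon - str) : strlen(str);
	if (namelen == 0 || namelen >= sizeof(opt->algorithm))
		return -EINVAL;

	memcpy(opt->algorithm, str, namelen);
	opt->algorithm[namelen] = '\0';
	opt->level = 0;

	if (!sketch_is_known(opt->algorithm))
		return -EINVAL;
	if (!colon)
		return 0;

	if (strcmp(opt->algorithm, "lz4") == 0) {
		min = 3;
		max = 16;
	} else if (strcmp(opt->algorithm, "zstd") == 0) {
		min = 1;
		max = 22;
	} else {
		return -EINVAL;	/* level given for an algorithm without levels */
	}

	level = strtol(colon + 1, &end, 10);
	if (end == colon + 1 || *end != '\0' || level < min || level > max)
		return -EINVAL;

	opt->level = (int)level;
	return 0;
}
```

A level of 0 here corresponds to the case in the patch where no level is
stored and plain lz4 is used instead of lz4hc.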

To set a compress level for the lz4 algorithm, CONFIG_LZ4HC_COMPRESS and
CONFIG_F2FS_FS_LZ4HC must be set to enable the lz4hc compress
algorithm.

CR and performance number on lz4/lz4hc algorithm:

dd if=enwik9 of=compressed_file conv=fsync

Original blocks:        244382

                        lz4                     lz4hc-9
compressed blocks       170647                  163270
compress ratio          69.8%                   66.8%
speed                   16.4207 s, 60.9 MB/s    26.7299 s, 37.4 MB/s

compress ratio = after / before
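
The ratio figures can be reproduced from the block counts with a few lines of
C; compress_ratio_percent() is a hypothetical helper written for this note,
not part of the patch.

```c
#include <assert.h>

/*
 * Compress ratio as defined above: blocks after compression divided by
 * blocks before compression, expressed in percent (lower is better).
 */
static double compress_ratio_percent(unsigned long after, unsigned long before)
{
	assert(before != 0);
	return 100.0 * (double)after / (double)before;
}
```

With 244382 original blocks, 170647 compressed blocks give roughly 69.8% and
163270 give roughly 66.8%, matching the numbers reported for lz4 and lz4hc-9.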

Signed-off-by: Chao Yu 
---
 Documentation/filesystems/f2fs.rst |  5 +++
 fs/f2fs/Kconfig| 10 +
 fs/f2fs/compress.c | 41 +++--
 fs/f2fs/f2fs.h |  9 
 fs/f2fs/super.c| 71 +-
 include/linux/f2fs_fs.h|  3 ++
 6 files changed, 134 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/f2fs.rst 
b/Documentation/filesystems/f2fs.rst
index 1169e6e9981e..217c95117e97 100644
--- a/Documentation/filesystems/f2fs.rst
+++ b/Documentation/filesystems/f2fs.rst
@@ -249,6 +249,11 @@ checkpoint=%s[:%u[%]]   Set to "disable" to turn off 
checkpointing. Set to "enabl
 This space is reclaimed once checkpoint=enable.
 compress_algorithm=%s   Control compress algorithm, currently f2fs supports 
"lzo",
 "lz4", "zstd" and "lzo-rle" algorithm.
+compress_algorithm=%s:%d Control compress algorithm and its compress level, 
now, only
+"lz4" and "zstd" support compress level config.
+algorithm  level range
+lz43 - 16
+zstd   1 - 22
 compress_log_size=%uSupport configuring compress cluster size, the size 
will
 be 4KB * (1 << %u), 16KB is minimum size, also it's
 default size.
diff --git a/fs/f2fs/Kconfig b/fs/f2fs/Kconfig
index d13c5c6a9787..63c1fc1a0e3b 100644
--- a/fs/f2fs/Kconfig
+++ b/fs/f2fs/Kconfig
@@ -119,6 +119,16 @@ config F2FS_FS_LZ4
help
  Support LZ4 compress algorithm, if unsure, say Y.
 
+config F2FS_FS_LZ4HC
+   bool "LZ4HC compression support"
+   depends on F2FS_FS_COMPRESSION
+   depends on F2FS_FS_LZ4
+   select LZ4HC_COMPRESS
+   default y
+   help
+ Support LZ4HC compress algorithm, LZ4HC has compatible on-disk
+ layout with LZ4, if unsure, say Y.
+
 config F2FS_FS_ZSTD
bool "ZSTD compression support"
depends on F2FS_FS_COMPRESSION
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 446dd41a7bad..8840f5f41bf1 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -254,8 +254,14 @@ static const struct f2fs_compress_ops f2fs_lzo_ops = {
 #ifdef CONFIG_F2FS_FS_LZ4
 static int lz4_init_compress_ctx(struct compress_ctx *cc)
 {
-   cc->private = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
-   LZ4_MEM_COMPRESS, GFP_NOFS);
+   unsigned int size = LZ4_MEM_COMPRESS;
+
+#ifdef CONFIG_F2FS_FS_LZ4HC
+   if (F2FS_I(cc->inode)->i_compress_flag >> COMPRESS_LEVEL_OFFSET)
+   size = LZ4HC_MEM_COMPRESS;
+#endif
+
+   cc->private = f2fs_kvmalloc(F2FS_I_SB(cc->inode), size, GFP_NOFS);
if (!cc->private)
return -ENOMEM;
 
@@ -274,10 +280,34 @@ static void lz4_destroy_compress_ctx(struct compress_ctx 
*cc)
cc->private = NULL;
 }
 
+#ifdef CONFIG_F2FS_FS_LZ4HC
+static int lz4hc_compress_pages(struct compress_ctx *cc)
+{
+   unsigned char level = F2FS_I(cc->inode)->i_compress_flag >>
+   COMPRESS_LEVEL_OFFSET;
+   int len;
+
+   if (level)
+   len = LZ4_compress_HC(cc->rbuf, cc->cbuf->cdata, cc->rlen,
+   cc->clen, level, cc->private);
+   else
+   len = LZ4_compress_default(cc->rbuf, cc->cbuf->cdata, cc->rlen,
+   cc->clen, cc->private);
+   if (!len)
+   return -EAGAIN;
+
+   cc->clen = len;
+   return 0;
+}
+#endif
+
 static int lz4_compress_pages(struct compress_ctx *cc)
 {
int len;
 
+#ifdef CONFIG_F2FS_FS_LZ4HC
+   return lz4hc_compress_pages(cc);
+#endif
len = LZ4_compress_default(cc->rbuf, cc->cbuf->cdata, cc->rlen,
cc->clen, cc->private);
if (!len)
@@ -327,8 +357,13 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
ZSTD_CStream *stream;
void *workspace;
unsigned int workspace_size;
+   unsigned char level =

[PATCH RESEND 4/6] f2fs: compress: deny setting unsupported compress algorithm

2020-12-08 Thread Chao Yu

If the kernel doesn't support a given compress algorithm, refuse to set
it as the compress algorithm of f2fs via the 'compress_algorithm=%s' mount option.

Signed-off-by: Chao Yu 
---
 fs/f2fs/super.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 8637196dec7c..d128b5cb763d 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -936,9 +936,14 @@ static int parse_options(struct super_block *sb, char 
*options, bool is_remount)
if (!name)
return -ENOMEM;
if (!strcmp(name, "lzo")) {
+#ifdef CONFIG_F2FS_FS_LZO
F2FS_OPTION(sbi).compress_algorithm =
COMPRESS_LZO;
+#else
+   f2fs_info(sbi, "kernel doesn't support lzo 
compression");
+#endif
} else if (!strncmp(name, "lz4", 3)) {
+#ifdef CONFIG_F2FS_FS_LZ4
ret = f2fs_compress_set_level(sbi, name,
COMPRESS_LZ4);
if (ret) {
@@ -947,7 +952,11 @@ static int parse_options(struct super_block *sb, char 
*options, bool is_remount)
}
F2FS_OPTION(sbi).compress_algorithm =
COMPRESS_LZ4;
+#else
+   f2fs_info(sbi, "kernel doesn't support lz4 
compression");
+#endif
} else if (!strncmp(name, "zstd", 4)) {
+#ifdef CONFIG_F2FS_FS_ZSTD
ret = f2fs_compress_set_level(sbi, name,
COMPRESS_ZSTD);
if (ret) {
@@ -956,9 +965,16 @@ static int parse_options(struct super_block *sb, char 
*options, bool is_remount)
}
F2FS_OPTION(sbi).compress_algorithm =
COMPRESS_ZSTD;
+#else
+   f2fs_info(sbi, "kernel doesn't support zstd 
compression");
+#endif
} else if (!strcmp(name, "lzo-rle")) {
+#ifdef CONFIG_F2FS_FS_LZORLE
F2FS_OPTION(sbi).compress_algorithm =
COMPRESS_LZORLE;
+#else
+   f2fs_info(sbi, "kernel doesn't support lzorle 
compression");
+#endif
} else {
kfree(name);
return -EINVAL;
-- 
2.29.2

[PATCH RESEND 1/6] f2fs: compress: support chksum

2020-12-08 Thread Chao Yu

This patch adds support for storing a chksum value with compressed
data, and verifying the integrity of the compressed data while
reading it.

The feature can be enabled through specifying mount option
'compress_chksum'.
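
The store-then-verify idea can be sketched in userspace as below. Everything
here is an assumption made for illustration: sketch_crc32() is a plain bitwise
CRC-32 and is not claimed to match what f2fs_crc32() computes, and
struct cluster_sketch is a simplified, hypothetical layout rather than the
on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Plain bitwise CRC-32 (reflected polynomial 0xEDB88320), used here only as
 * a stand-in checksum for the illustration.
 */
static uint32_t sketch_crc32(const unsigned char *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical, simplified cluster header; the real layout differs. */
struct cluster_sketch {
	uint32_t clen;			/* length of the compressed payload */
	uint32_t chksum;		/* checksum recorded at write time */
	unsigned char cdata[64];	/* compressed payload */
};

/* Write path: record the checksum next to the compressed data. */
static void sketch_store_chksum(struct cluster_sketch *c)
{
	assert(c->clen <= sizeof(c->cdata));
	c->chksum = sketch_crc32(c->cdata, c->clen);
}

/* Read path: recompute and compare; returns 1 if intact, 0 if corrupted. */
static int sketch_verify_chksum(const struct cluster_sketch *c)
{
	assert(c->clen <= sizeof(c->cdata));
	return sketch_crc32(c->cdata, c->clen) == c->chksum;
}
```

In the patch itself a mismatch does not fail the read; it sets
FI_COMPRESS_CORRUPT on the inode, logs a ratelimited message and flags the
superblock with SBI_NEED_FSCK.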

Signed-off-by: Chao Yu 
---
 Documentation/filesystems/f2fs.rst |  1 +
 fs/f2fs/compress.c | 22 ++
 fs/f2fs/f2fs.h | 16 ++--
 fs/f2fs/inode.c|  3 +++
 fs/f2fs/super.c|  9 +
 include/linux/f2fs_fs.h|  2 +-
 6 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/f2fs.rst 
b/Documentation/filesystems/f2fs.rst
index 5eb8d63439ec..41de149f11cb 100644
--- a/Documentation/filesystems/f2fs.rst
+++ b/Documentation/filesystems/f2fs.rst
@@ -267,6 +267,7 @@ compress_mode=%s Control file compression mode. This 
supports "fs" and "user"
 choosing the target file and the timing. The user can 
do manual
 compression/decompression on the compression enabled 
files using
 ioctls.
+compress_chksum Support verifying chksum of raw data in 
compressed cluster.
 inlinecrypt When possible, encrypt/decrypt the contents of 
encrypted
 files using the blk-crypto framework rather than
 filesystem-layer encryption. This allows the use of
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 708a563583db..4bcbacfe3325 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
f2fs_cops[fi->i_compress_algorithm];
unsigned int max_len, new_nr_cpages;
struct page **new_cpages;
+   u32 chksum = 0;
int i, ret;
 
trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
@@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
 
cc->cbuf->clen = cpu_to_le32(cc->clen);
 
+   if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
+   chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
+   cc->cbuf->cdata, cc->clen);
+   cc->cbuf->chksum = cpu_to_le32(chksum);
+
for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
cc->cbuf->reserved[i] = cpu_to_le32(0);
 
@@ -790,6 +796,22 @@ void f2fs_decompress_pages(struct bio *bio, struct page 
*page, bool verity)
 
ret = cops->decompress_pages(dic);
 
+   if (!ret && (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)) {
+   u32 provided = le32_to_cpu(dic->cbuf->chksum);
+   u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
+
+   if (provided != calculated) {
+   if (!is_inode_flag_set(dic->inode, 
FI_COMPRESS_CORRUPT)) {
+   set_inode_flag(dic->inode, FI_COMPRESS_CORRUPT);
+   printk_ratelimited(
+   "%sF2FS-fs (%s): checksum invalid, nid 
= %lu, %x vs %x",
+   KERN_INFO, sbi->sb->s_id, 
dic->inode->i_ino,
+   provided, calculated);
+   }
+   set_sbi_flag(sbi, SBI_NEED_FSCK);
+   }
+   }
+
 out_vunmap_cbuf:
vm_unmap_ram(dic->cbuf, dic->nr_cpages);
 out_vunmap_rbuf:
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 6edb2adc410e..7364d453783f 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -145,7 +145,8 @@ struct f2fs_mount_info {
 
/* For compression */
unsigned char compress_algorithm;   /* algorithm type */
-   unsigned compress_log_size; /* cluster log size */
+   unsigned char compress_log_size;/* cluster log size */
+   bool compress_chksum;   /* compressed data chksum */
unsigned char compress_ext_cnt; /* extension count */
int compress_mode;  /* compression mode */
unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* 
extensions */
@@ -675,6 +676,7 @@ enum {
FI_ATOMIC_REVOKE_REQUEST, /* request to drop atomic data */
FI_VERITY_IN_PROGRESS,  /* building fs-verity Merkle tree */
FI_COMPRESSED_FILE, /* indicate file's data can be compressed */
+   FI_COMPRESS_CORRUPT,/* indicate compressed cluster is corrupted */
FI_MMAP_FILE,   /* indicate file was mmapped */
FI_ENABLE_COMPRESS, /* enable compression in "user" compression 
mode */
FI_MAX, /* max flag, never be used */
@@ -733,6 +735,7 @@ struct f2fs_inode_info {
atomic_t i_compr_blocks;/* # of compressed blocks */
unsigned char i_compress_algorithm; /* algorithm type */
unsigned char i_log_cluster_size;   /* log of cluster size */
+
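
The write and read paths in the hunks above follow a store-then-verify pattern: checksum the compressed bytes when the cluster is written, recompute and compare after it is read back. A minimal user-space sketch of that pattern follows. It assumes a plain CRC-32 (IEEE) as a stand-in for f2fs_crc32(), and `struct demo_cluster`, `demo_seal()` and `demo_verify()` are hypothetical names, not f2fs code. The endianness conversion done by cpu_to_le32() in the patch is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
 * Assumption: a stand-in for f2fs_crc32(), whose seed may differ. */
static uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	assert(data != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		crc ^= data[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical, simplified cluster header: length and checksum sit in
 * front of the compressed payload. */
struct demo_cluster {
	uint32_t clen;
	uint32_t chksum;
	uint8_t cdata[64];
};

/* Write side: record the checksum of the compressed bytes. */
static void demo_seal(struct demo_cluster *c)
{
	c->chksum = crc32_ieee(c->cdata, c->clen);
}

/* Read side: return 0 if the stored checksum matches, -1 otherwise. */
static int demo_verify(const struct demo_cluster *c)
{
	return crc32_ieee(c->cdata, c->clen) == c->chksum ? 0 : -1;
}
```

In this sketch a single flipped payload bit makes demo_verify() fail, which corresponds to the point where the patch sets FI_COMPRESS_CORRUPT and SBI_NEED_FSCK.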

[PATCH RESEND 6/6] f2fs: introduce sb_status sysfs node

2020-12-08 Thread Chao Yu

Introduce /sys/fs/f2fs//stat/sb_status to show superblock
status in real time as below:

IS_DIRTY:               no
IS_CLOSE:               no
IS_SHUTDOWN:            no
IS_RECOVERED:           no
IS_RESIZEFS:            no
NEED_FSCK:              no
POR_DOING:              no
NEED_SB_WRITE:          no
NEED_CP:                no
CP_DISABLED:            no
CP_DISABLED_QUICK:      no
QUOTA_NEED_FLUSH:       no
QUOTA_SKIP_FLUSH:       no
QUOTA_NEED_REPAIR:      no

Signed-off-by: Chao Yu 
---
 Documentation/ABI/testing/sysfs-fs-f2fs |  5 
 fs/f2fs/sysfs.c | 36 +
 2 files changed, 41 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs 
b/Documentation/ABI/testing/sysfs-fs-f2fs
index 3dfee94e0618..57ab839dc3a2 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -377,3 +377,8 @@ Description:This gives a control to limit the bio 
size in f2fs.
Default is zero, which will follow underlying block layer limit,
whereas, if it has a certain bytes value, f2fs won't submit a
bio larger than that size.
+
+What:  /sys/fs/f2fs//stat/sb_status
+Date:  December 2020
+Contact:   "Chao Yu" 
+Description:   Show status of f2fs superblock in real time.
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index ebca0b4961e8..1b85e6d16a94 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -101,6 +101,40 @@ static ssize_t lifetime_write_kbytes_show(struct f2fs_attr 
*a,
sbi->sectors_written_start) >> 1)));
 }
 
+#define SB_STATUS(s) (s ? "yes" : "no")
+static ssize_t sb_status_show(struct f2fs_attr *a,
+   struct f2fs_sb_info *sbi, char *buf)
+{
+   return sprintf(buf, "IS_DIRTY:  %s\n"
+   "IS_CLOSE:  %s\n"
+   "IS_SHUTDOWN:   %s\n"
+   "IS_RECOVERED:  %s\n"
+   "IS_RESIZEFS:   %s\n"
+   "NEED_FSCK: %s\n"
+   "POR_DOING: %s\n"
+   "NEED_SB_WRITE: %s\n"
+   "NEED_CP:   %s\n"
+   "CP_DISABLED:   %s\n"
+   "CP_DISABLED_QUICK: %s\n"
+   "QUOTA_NEED_FLUSH:  %s\n"
+   "QUOTA_SKIP_FLUSH:  %s\n"
+   "QUOTA_NEED_REPAIR: %s\n",
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_IS_DIRTY)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_IS_CLOSE)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_IS_SHUTDOWN)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_IS_RECOVERED)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_IS_RESIZEFS)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_NEED_FSCK)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_POR_DOING)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_NEED_SB_WRITE)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_NEED_CP)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_CP_DISABLED)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_CP_DISABLED_QUICK)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_QUOTA_NEED_FLUSH)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_QUOTA_SKIP_FLUSH)),
+   SB_STATUS(is_sbi_flag_set(sbi, SBI_QUOTA_NEED_REPAIR)));
+}
+
 static ssize_t features_show(struct f2fs_attr *a,
struct f2fs_sb_info *sbi, char *buf)
 {
@@ -711,7 +745,9 @@ static struct attribute *f2fs_feat_attrs[] = {
 };
 ATTRIBUTE_GROUPS(f2fs_feat);
 
+F2FS_GENERAL_RO_ATTR(sb_status);
 static struct attribute *f2fs_stat_attrs[] = {
+   ATTR_LIST(sb_status),
NULL,
 };
 ATTRIBUTE_GROUPS(f2fs_stat);
-- 
2.29.2
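
The sb_status output format can also be produced with a table-driven loop instead of one long format string. The following is a user-space sketch under stated assumptions: the `demo_*` names and flag values are hypothetical (the real SBI_* flags live in fs/f2fs/f2fs.h), and snprintf() with an explicit size replaces the kernel's sprintf() into a sysfs page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag bits, for illustration only. */
enum demo_flag { DEMO_IS_DIRTY, DEMO_NEED_FSCK, DEMO_NEED_CP, DEMO_NR_FLAGS };

static const char *const demo_flag_names[DEMO_NR_FLAGS] = {
	[DEMO_IS_DIRTY]  = "IS_DIRTY",
	[DEMO_NEED_FSCK] = "NEED_FSCK",
	[DEMO_NEED_CP]   = "NEED_CP",
};

/* Write one "NAME:<tab>yes|no" line per flag. Returns the number of
 * bytes stored in buf, never more than size - 1. */
static size_t demo_status_show(unsigned long flags, char *buf, size_t size)
{
	size_t len = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (int i = 0; i < DEMO_NR_FLAGS && len < size - 1; i++) {
		int n = snprintf(buf + len, size - len, "%s:\t%s\n",
				 demo_flag_names[i],
				 (flags & (1UL << i)) ? "yes" : "no");
		if (n < 0)
			break;
		/* snprintf() reports the untruncated length; clamp it. */
		len += (size_t)n < size - len ? (size_t)n : size - len - 1;
	}
	return len;
}
```

Adding a flag then means adding one enum entry and one name, and the output can never run past the buffer.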

Re: [PATCH -next] net/mlx5_core: remove unused including

2020-12-08 Thread Leon Romanovsky

On Tue, Dec 08, 2020 at 11:22:26AM -0800, Jakub Kicinski wrote:
> On Mon, 7 Dec 2020 20:14:00 +0800 Zou Wei wrote:
> > Remove including  that don't need it.
> >
> > Signed-off-by: Zou Wei 
> > ---
> >  drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c 
> > b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> > index 989c70c..82ecc161 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> > @@ -30,7 +30,6 @@
> >   * SOFTWARE.
> >   */
> >
> > -#include 
> >  #include 
> >  #include 
> >  #include 

Jakub,

You probably don't have the latest net-next.

In the commit 17a7612b99e6 ("net/mlx5_core: Clean driver version and
name"), I removed "strlcpy(drvinfo->version, UTS_RELEASE,
sizeof(drvinfo->version));" line.

The patch is OK, but it should have a Fixes line.
Fixes: 17a7612b99e6 ("net/mlx5_core: Clean driver version and name")

Thanks

>
>
> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c: In function 
> ‘mlx5e_rep_get_drvinfo’:
> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:66:28: error: ‘UTS_RELEASE’ 
> undeclared (first use in this function); did you mean ‘CSS_RELEASED’?
>66 |  strlcpy(drvinfo->version, UTS_RELEASE, sizeof(drvinfo->version));
>   |^~~
>   |CSS_RELEASED
> drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:66:28: note: each undeclared 
> identifier is reported only once for each function it appears in
> make[6]: *** [drivers/net/ethernet/mellanox/mlx5/core/en_rep.o] Error 1
> make[5]: *** [drivers/net/ethernet/mellanox/mlx5/core] Error 2
> make[4]: *** [drivers/net/ethernet/mellanox] Error 2
> make[3]: *** [drivers/net/ethernet] Error 2
> make[2]: *** [drivers/net] Error 2
> make[2]: *** Waiting for unfinished jobs
> make[1]: *** [drivers] Error 2
> make: *** [__sub-make] Error 2

Re: scheduling while atomic in z3fold

2020-12-08 Thread Mike Galbraith

On Wed, 2020-12-09 at 00:26 +0100, Vitaly Wool wrote:
> Hi Mike,
>
> On 2020-12-07 16:41, Mike Galbraith wrote:
> > On Mon, 2020-12-07 at 16:21 +0100, Vitaly Wool wrote:
> >> On Mon, Dec 7, 2020 at 1:34 PM Mike Galbraith  wrote:
> >>>
> >>
> >>> Unfortunately, that made zero difference.
> >>
> >> Okay, I suggest that you submit the patch that changes read_lock() to
> >> write_lock() in __release_z3fold_page() and I'll ack it then.
> >> I would like to rewrite the code so that write_lock is not necessary
> >> there but I don't want to hold you back and it isn't likely that I'll
> >> complete this today.
> >
> > Nah, I'm in no rush... especially not to sign off on "Because the
> > little voices in my head said this bit should look like that bit over
> > yonder, and testing _seems_ to indicate they're right about that" :)
> >
> > -Mike
> >
>
> okay, thanks. Would this make things better:

Yup, z3fold became RT tolerant with this (un-munged and) applied.

BTW, I don't think my write_lock() hacklet actually plugged the hole
that leads to the below.  I think it just reduced the odds of the two
meeting enough to make it look ~solid in fairly limited test time.

[ 3369.373023] kworker/-7413  4.12 3368809247us : do_compact_page: 
zhdr: 95d93abd8000 zhdr->slots: 95d951f5df80 zhdr->slots->slot[0]: 0
[ 3369.373025] kworker/-7413  4.12 3368809248us : do_compact_page: 
old_handle 95d951f5df98 *old_handle was 95d93abd804f now is 
95da3ab8104c
[ 3369.373027] kworker/-7413  4.11 3368809249us : 
__release_z3fold_page.constprop.25: freed 95d951f5df80
[ 3369.373028] -
[ 3369.373029] CR2: 0018
crash> p debug_handle | grep '\[2'
  [2]: 95dc1ecac0d0
crash> rd 95dc1ecac0d0
95dc1ecac0d0:  95d951f5df98...Q
crash> p debug_zhdr | grep '\[2'
  [2]: 95dc1ecac0c8
crash> rd 95dc1ecac0c8
95dc1ecac0c8:  95da3ab81000...:  <== kworker is 
not working on same zhdr as free_handle()..
crash> p debug_slots | grep '\[2'
  [2]: 95dc1ecac0c0
crash> rd 95dc1ecac0c0
95dc1ecac0c0:  95d951f5df80...Q  <== ..but is 
the same slots, and frees it under free_handle().
crash> bt -sx  leading 
to use after free+corruption+explosion 1us later.
PID: 9334   TASK: 95d95a1eb3c0  CPU: 2   COMMAND: "swapoff"
 #0 [b4248847f8f0] machine_kexec+0x16e at a605f8ce
 #1 [b4248847f938] __crash_kexec+0xd2 at a614c162
 #2 [b4248847f9f8] crash_kexec+0x30 at a614d350
 #3 [b4248847fa08] oops_end+0xca at a602680a
 #4 [b4248847fa28] no_context+0x14d at a606d7cd
 #5 [b4248847fa88] exc_page_fault+0x2b8 at a68bdb88
 #6 [b4248847fae0] asm_exc_page_fault+0x1e at a6a00ace
 #7 [b4248847fb68] mark_wakeup_next_waiter+0x51 at a60ea121
 #8 [b4248847fbd0] rt_mutex_futex_unlock+0x4f at a68c93ef
 #9 [b4248847fc10] z3fold_zpool_free+0x593 at c0ecb663 [z3fold]
#10 [b4248847fc78] zswap_free_entry+0x43 at a627c823
#11 [b4248847fc88] zswap_frontswap_invalidate_page+0x8a at a627c92a
#12 [b4248847fcb0] __frontswap_invalidate_page+0x48 at a627c018
#13 [b4248847fcd8] swapcache_free_entries+0x1ee at a6276f5e
#14 [b4248847fd20] free_swap_slot+0x9f at a627b8ff
#15 [b4248847fd40] delete_from_swap_cache+0x61 at a6274621
#16 [b4248847fd60] try_to_free_swap+0x70 at a6277520
#17 [b4248847fd70] unuse_vma+0x55c at a627869c
#18 [b4248847fe90] try_to_unuse+0x139 at a6278e89
#19 [b4248847fee8] __x64_sys_swapoff+0x1eb at a62798cb
#20 [b4248847ff40] do_syscall_64+0x33 at a68b9ab3
#21 [b4248847ff50] entry_SYSCALL_64_after_hwframe+0x44 at a6a0007c
RIP: 7fbd835a5d17  RSP: 7ffd60634458  RFLAGS: 0202
RAX: ffda  RBX: 559540e34b60  RCX: 7fbd835a5d17
RDX: 0001  RSI: 0001  RDI: 559540e34b60
RBP: 0001   R8: 7ffd606344c0   R9: 0003
R10: 559540e34721  R11: 0202  R12: 0001
R13:   R14: 7ffd606344c0  R15: 
ORIG_RAX: 00a8  CS: 0033  SS: 002b
crash>

>
> diff --git a/mm/z3fold.c b/mm/z3fold.c
> index 18feaa0bc537..340c38a5ffac 100644
> --- a/mm/z3fold.c
> +++ b/mm/z3fold.c
> @@ -303,10 +303,9 @@ static inline void put_z3fold_header(struct
> z3fold_header *zhdr)
>   z3fold_page_unlock(zhdr);
>   }
>
> -static inline void free_handle(unsigned long handle)
> +static inline void free_handle(unsigned long handle, struct
> z3fold_header *zhdr)
>   {
>   struct z3fold_buddy_slots *slots;
> - struct z3fold_header *zhdr;
>   int i;
>   bool is_free;
>
> @@ -316,22 +315,13 @@ static inline void

Re: memory leak in generic_parse_monolithic [+PATCH]

2020-12-08 Thread Randy Dunlap

On 12/8/20 10:03 PM, Dmitry Vyukov wrote:
> On Wed, Dec 9, 2020 at 12:15 AM Randy Dunlap  wrote:
>>
>> On 12/8/20 2:54 PM, David Howells wrote:
>>> Randy Dunlap  wrote:
>>>
> Now the backtrace only shows what the state was when the string was 
> allocated;
> it doesn't show what happened to it after that, so another possibility is 
> that
> the filesystem being mounted nicked what vfs_parse_fs_param() had 
> rightfully
> stolen, transferring fc->source somewhere else and then failed to release 
> it -
> most likely on mount failure (ie. it's an error handling bug in the
> filesystem).
>
> Do we know what filesystem it was?

 Yes, it's called AFS (or kAFS).
>>>
>>> Hmmm...  afs parses the string in afs_parse_source() without modifying it,
>>> then moves the pointer to fc->source (parallelling vfs_parse_fs_param()) and
>>> doesn't touch it again.  fc->source should be cleaned up by do_new_mount()
>>> calling put_fs_context() at the end of the function.
>>>
>>> As far as I can tell with the attached print-insertion patch, it works, 
>>> called
>>> by the following commands, some of which are correct and some which aren't:
>>>
>>> # mount -t afs none /xfstest.test/ -o dyn
>>> # umount /xfstest.test
>>> # mount -t afs "" /xfstest.test/ -o foo
>>> mount: /xfstest.test: bad option; for several filesystems (e.g. nfs, cifs) 
>>> you might need a /sbin/mount. helper program.
>>> # umount /xfstest.test
>>> umount: /xfstest.test: not mounted.
>>> # mount -t afs %xfstest.test20 /xfstest.test/ -o foo
>>> mount: /xfstest.test: bad option; for several filesystems (e.g. nfs, cifs) 
>>> you might need a /sbin/mount. helper program.
>>> # umount /xfstest.test
>>> umount: /xfstest.test: not mounted.
>>> # mount -t afs %xfstest.test20 /xfstest.test/
>>> # umount /xfstest.test
>>>
>>> Do you know if the mount was successful and what the mount parameters were?
>>
>> Here's the syzbot reproducer:
>> https://syzkaller.appspot.com/x/repro.c?x=129ca3d650
>>
>> The "interesting" mount params are:
>> source=%^]$[+%](${:\017k[)-:,source=%^]$[+.](%{:\017\200[)-:,\000
>>
>> There is no other AFS activity: nothing mounted, no cells known (or
>> whatever that is), etc.
>>
>> I don't recall if the mount was successful and I can't test it just now.
>> My laptop is mucked up.
>>
>>
>> Be aware that this report could just be a false positive: it waits
>> for 5 seconds then looks for a memleak. AFAIK, it's possible that the 
>> "leaked"
>> memory is still in valid use and will be freed some day.
> 
> FWIW KMEMLEAK scans memory for pointers. If it claims a memory leak,
> it means the heap object is not referenced anywhere anymore. There are
> no live pointers to it to call kfree or anything else.
> Some false positives are theoretically possible, but so far I don't
> remember any; all reported ones were true leaks:
> https://syzkaller.appspot.com/upstream/fixed?manager=ci-upstream-gce-leak
> 

OK, great, thanks for the info.

> 
> 
>>> David
>>> ---
>>> diff --git a/fs/afs/super.c b/fs/afs/super.c
>>> index 6c5900df6aa5..4c44ec0196c9 100644
>>> --- a/fs/afs/super.c
>>> +++ b/fs/afs/super.c
>>> @@ -299,7 +299,7 @@ static int afs_parse_source(struct fs_context *fc, 
>>> struct fs_parameter *param)
>>>   ctx->cell = cell;
>>>   }
>>>
>>> - _debug("CELL:%s [%p] VOLUME:%*.*s SUFFIX:%s TYPE:%d%s",
>>> + kdebug("CELL:%s [%p] VOLUME:%*.*s SUFFIX:%s TYPE:%d%s",
>>>  ctx->cell->name, ctx->cell,
>>>  ctx->volnamesz, ctx->volnamesz, ctx->volname,
>>>  suffix ?: "-", ctx->type, ctx->force ? " FORCE" : "");
>>> @@ -318,6 +318,8 @@ static int afs_parse_param(struct fs_context *fc, 
>>> struct fs_parameter *param)
>>>   struct afs_fs_context *ctx = fc->fs_private;
>>>   int opt;
>>>
>>> + kenter("%s,%p '%s'", param->key, param->string, param->string);
>>> +
>>>   opt = fs_parse(fc, afs_fs_parameters, param, );
>>>   if (opt < 0)
>>>   return opt;
>>> diff --git a/fs/fs_context.c b/fs/fs_context.c
>>> index 2834d1afa6e8..f530a33876ce 100644
>>> --- a/fs/fs_context.c
>>> +++ b/fs/fs_context.c
>>> @@ -450,6 +450,8 @@ void put_fs_context(struct fs_context *fc)
>>>   put_user_ns(fc->user_ns);
>>>   put_cred(fc->cred);
>>>   put_fc_log(fc);
>>> + if (strcmp(fc->fs_type->name, "afs") == 0)
>>> + printk("PUT %p '%s'\n", fc->source, fc->source);
>>>   put_filesystem(fc->fs_type);
>>>   kfree(fc->source);
>>>   kfree(fc);
>>> @@ -671,6 +673,8 @@ void vfs_clean_context(struct fs_context *fc)
>>>   fc->s_fs_info = NULL;
>>>   fc->sb_flags = 0;
>>>   security_free_mnt_opts(&fc->security);
>>> + if (strcmp(fc->fs_type->name, "afs") == 0)
>>> + printk("CLEAN %p '%s'\n", fc->source, fc->source);
>>>   kfree(fc->source);
>>>   fc->source = NULL;
>>>
>>>
>>
>> I'll check more after my test machine is working again.
>>
>> thanks.
>> --
>> ~Randy
>>
>> --
>>
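
For illustration only: the reproducer's mount string carries two source= options, and the thread describes the parser "stealing" the parameter string into fc->source. The sketch below shows one hypothetical way such a hand-over could lose a pointer if an earlier value is overwritten without a check. It is not a diagnosis confirmed in this thread, and all `demo_*` names are invented stand-ins rather than kernel code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for fs_context / fs_parameter. */
struct demo_ctx   { char *source; };
struct demo_param { char *string; };

static int demo_live_allocs; /* crude allocation counter for the example */

static char *demo_strdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	assert(p != NULL);
	memcpy(p, s, n);
	demo_live_allocs++;
	return p;
}

static void demo_free(char *p)
{
	if (p) {
		free(p);
		demo_live_allocs--;
	}
}

/* Steal the string without checking for an earlier one: a second
 * "source=" overwrites the first pointer, which the context then no
 * longer references. */
static int demo_parse_source_unchecked(struct demo_ctx *ctx,
				       struct demo_param *p)
{
	ctx->source = p->string;
	p->string = NULL;
	return 0;
}

/* Same hand-over, but a duplicate is refused so the first string stays
 * owned by the context and the caller keeps the second. */
static int demo_parse_source_checked(struct demo_ctx *ctx,
				     struct demo_param *p)
{
	if (ctx->source)
		return -1;
	ctx->source = p->string;
	p->string = NULL;
	return 0;
}

/* Cleanup step: free whatever the context still owns. */
static void demo_put_ctx(struct demo_ctx *ctx)
{
	demo_free(ctx->source);
	ctx->source = NULL;
}
```

With the unchecked variant, two calls followed by demo_put_ctx() leave one allocation that nothing points to, which is the kind of object a pointer-scanning leak detector would report.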

[PATCH v16 1/9] arm64: Probe for the presence of KVM hypervisor

2020-12-08 Thread Jianyong Wu

From: Will Deacon 

Although the SMCCC specification provides some limited functionality for
describing the presence of hypervisor and firmware services, this is
generally applicable only to functions designated as "Arm Architecture
Service Functions" and no portable discovery mechanism is provided for
standard hypervisor services, despite having a designated range of
function identifiers reserved by the specification.

In an attempt to avoid the need for additional firmware changes every
time a new function is added, introduce a UID to identify the service
provider as being compatible with KVM. Once this has been established,
additional services can be discovered via a feature bitmap.
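
As a rough user-space sketch of the feature bitmap idea: the four 32-bit result registers of the features call are expanded into a 128-bit bitmap, and a service is available when its bit is set. `struct demo_res` and the `demo_*` helpers are hypothetical stand-ins for struct arm_smccc_res and the kernel bitmap helpers. The sketch tests each bit with the mask `1 << i`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NUM_FUNCS 128 /* mirrors ARM_SMCCC_KVM_NUM_FUNCS */

/* Hypothetical container for the four result registers a0..a3. */
struct demo_res { uint32_t a[4]; };

static uint32_t demo_services[DEMO_NUM_FUNCS / 32];

static void demo_set_bit(unsigned int nr)
{
	assert(nr < DEMO_NUM_FUNCS);
	demo_services[nr / 32] |= UINT32_C(1) << (nr % 32);
}

/* Bit i of register r becomes service number i + 32 * r. */
static void demo_init_services(const struct demo_res *res)
{
	for (unsigned int r = 0; r < 4; r++)
		for (unsigned int i = 0; i < 32; i++)
			if (res->a[r] & (UINT32_C(1) << i))
				demo_set_bit(i + 32 * r);
}

static bool demo_service_available(unsigned int func)
{
	assert(func < DEMO_NUM_FUNCS);
	return demo_services[func / 32] & (UINT32_C(1) << (func % 32));
}
```

A guest driver would then gate its setup on a check such as demo_service_available(1) for the function number it needs.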

Change from Jianyong Wu:
Move kvm_arm_hyp_service_available to a common place so that both arm and
arm64 can use it. Also call kvm_init_hyp_services on arm so that arm KVM
guests can use this service.

Cc: Marc Zyngier 
Signed-off-by: Will Deacon 
Signed-off-by: Jianyong Wu 
---
 arch/arm/kernel/setup.c|  5 
 arch/arm64/kernel/setup.c  |  1 +
 drivers/firmware/smccc/smccc.c | 37 +
 include/linux/arm-smccc.h  | 43 ++
 4 files changed, 86 insertions(+)

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 1a5edf562e85..adcefa9c8fab 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -1156,6 +1156,11 @@ void __init setup_arch(char **cmdline_p)
 
arm_dt_init_cpu_maps();
psci_dt_init();
+
+#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
+   kvm_init_hyp_services();
+#endif
+
 #ifdef CONFIG_SMP
if (is_smp()) {
if (!mdesc->smp_init || !mdesc->smp_init()) {
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index a950d5bc1ba5..97037b15c6ea 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -353,6 +353,7 @@ void __init __no_sanitize_address setup_arch(char 
**cmdline_p)
else
psci_acpi_init();
 
+   kvm_init_hyp_services();
init_bootcpu_ops();
smp_init_cpus();
smp_build_mpidr_hash();
diff --git a/drivers/firmware/smccc/smccc.c b/drivers/firmware/smccc/smccc.c
index 00c88b809c0c..e153c71ece99 100644
--- a/drivers/firmware/smccc/smccc.c
+++ b/drivers/firmware/smccc/smccc.c
@@ -7,10 +7,47 @@
 
 #include 
 #include 
+#include 
+#include 
 
 static u32 smccc_version = ARM_SMCCC_VERSION_1_0;
 static enum arm_smccc_conduit smccc_conduit = SMCCC_CONDUIT_NONE;
 
+DECLARE_BITMAP(__kvm_arm_hyp_services, ARM_SMCCC_KVM_NUM_FUNCS) = { };
+EXPORT_SYMBOL_GPL(__kvm_arm_hyp_services);
+
+void __init kvm_init_hyp_services(void)
+{
+   int i;
+   struct arm_smccc_res res;
+
+   if (arm_smccc_get_version() == ARM_SMCCC_VERSION_1_0)
+   return;
+
+   arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
+   if (res.a0 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0 ||
+   res.a1 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_1 ||
+   res.a2 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_2 ||
+   res.a3 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3)
+   return;
+
+   memset(&res, 0, sizeof(res));
+   arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID, &res);
+   for (i = 0; i < 32; ++i) {
+   if (res.a0 & BIT(i))
+   set_bit(i + (32 * 0), __kvm_arm_hyp_services);
+   if (res.a1 & BIT(i))
+   set_bit(i + (32 * 1), __kvm_arm_hyp_services);
+   if (res.a2 & BIT(i))
+   set_bit(i + (32 * 2), __kvm_arm_hyp_services);
+   if (res.a3 & BIT(i))
+   set_bit(i + (32 * 3), __kvm_arm_hyp_services);
+   }
+
+   pr_info("KVM hypervisor services detected (0x%08lx 0x%08lx 0x%08lx 
0x%08lx)\n",
+res.a3, res.a2, res.a1, res.a0);
+}
+
 void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit)
 {
smccc_version = version;
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index f860645f6512..d75408141137 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -55,6 +55,8 @@
 #define ARM_SMCCC_OWNER_TRUSTED_OS 50
 #define ARM_SMCCC_OWNER_TRUSTED_OS_END 63
 
+#define ARM_SMCCC_FUNC_QUERY_CALL_UID  0xff01
+
 #define ARM_SMCCC_QUIRK_NONE   0
 #define ARM_SMCCC_QUIRK_QCOM_A6 1 /* Save/restore register a6 */
 
@@ -87,6 +89,29 @@
   ARM_SMCCC_SMC_32,\
   0, 0x7fff)
 
+#define ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID  \
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_32,\
+  ARM_SMCCC_OWNER_VENDOR_HYP,  \
+  ARM_SMCCC_FUNC_QUERY_CALL_UID)
+
+/* KVM UID value: 28b46fb6-2ec5-11e9-a9ca-4b564d003a74 */
+#define

[PATCH v16 6/9] arm64/kvm: Add hypercall service for kvm ptp.

2020-12-08 Thread Jianyong Wu

ptp_kvm will get this service through an SMCCC call.
The service offers the host's wall time and cycle count to the guest.
The caller must specify whether they want the host cycle count
or the difference between the host cycle count and cntvoff.

Signed-off-by: Jianyong Wu 
---
 arch/arm64/kvm/hypercalls.c | 59 +
 include/linux/arm-smccc.h   | 16 ++
 2 files changed, 75 insertions(+)

diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index b9d8607083eb..9a4834502388 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -9,6 +9,49 @@
 #include 
 #include 
 
+static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
+{
+   struct system_time_snapshot systime_snapshot;
+   u64 cycles = ~0UL;
+   u32 feature;
+
+   /*
+* system time and counter value must be captured at the same
+* time to keep consistency and precision.
+*/
+   ktime_get_snapshot(&systime_snapshot);
+
+   // binding ptp_kvm clocksource to arm_arch_counter
+   if (systime_snapshot.cs_id != CSID_ARM_ARCH_COUNTER)
+   return;
+
+   val[0] = upper_32_bits(systime_snapshot.real);
+   val[1] = lower_32_bits(systime_snapshot.real);
+
+   /*
+* Whether the virtual counter or the physical counter is
+* requested is decided by the r1 value of the SMCCC
+* call. If an invalid r1 value is offered, the default cycle
+* value (-1) will be returned.
+* Note: keep in mind that feature is u32 and smccc_get_arg1
+* returns u64, so the value is implicitly narrowed here.
+*/
+   feature = smccc_get_arg1(vcpu);
+   switch (feature) {
+   case ARM_PTP_VIRT_COUNTER:
+   cycles = systime_snapshot.cycles - vcpu_read_sys_reg(vcpu, 
CNTVOFF_EL2);
+   break;
+   case ARM_PTP_PHY_COUNTER:
+   cycles = systime_snapshot.cycles;
+   break;
+   default:
+   val[0] = SMCCC_RET_NOT_SUPPORTED;
+   break;
+   }
+   val[2] = upper_32_bits(cycles);
+   val[3] = lower_32_bits(cycles);
+}
+
 int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 {
u32 func_id = smccc_get_function(vcpu);
@@ -79,6 +122,22 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
break;
case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
+   val[0] |= BIT(ARM_SMCCC_KVM_FUNC_PTP);
+   break;
+   /*
+* This serves virtual kvm_ptp.
+* Four values will be passed back.
+* reg0 stores high 32-bits of host ktime;
+* reg1 stores low 32-bits of host ktime;
+* For ARM_PTP_VIRT_COUNTER:
+* reg2 stores high 32-bits of difference of host cycles and cntvoff;
+* reg3 stores low 32-bits of difference of host cycles and cntvoff.
+* For ARM_PTP_PHY_COUNTER:
+* reg2 stores the high 32-bits of host cycles;
+* reg3 stores the low 32-bits of host cycles.
+*/
+   case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
+   kvm_ptp_get_time(vcpu, val);
break;
default:
return kvm_psci_call(vcpu);
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index d75408141137..7924069f8f0a 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -103,6 +103,7 @@
 
 /* KVM "vendor specific" services */
 #define ARM_SMCCC_KVM_FUNC_FEATURES    0
+#define ARM_SMCCC_KVM_FUNC_PTP         1
 #define ARM_SMCCC_KVM_FUNC_FEATURES_2  127
 #define ARM_SMCCC_KVM_NUM_FUNCS        128
 
@@ -114,6 +115,21 @@
 
 #define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED   1
 
+/*
+ * ptp_kvm is a feature used for time sync between vm and host.
+ * ptp_kvm module in guest kernel will get service from host using
+ * this hypercall ID.
+ */
+#define ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID   \
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_32,\
+  ARM_SMCCC_OWNER_VENDOR_HYP,  \
+  ARM_SMCCC_KVM_FUNC_PTP)
+
+/* ptp_kvm counter type ID */
+#define ARM_PTP_VIRT_COUNTER   0
+#define ARM_PTP_PHY_COUNTER    1
+
 /* Paravirtualised time calls (defined by ARM DEN0057A) */
 #define ARM_SMCCC_HV_PV_TIME_FEATURES  \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
-- 
2.17.1
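
The value produced by ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID can be worked out by hand from the SMCCC function ID layout. The sketch below re-implements that bit packing in user space. The `DEMO_*` constants and `demo_call_val()` are illustrative names, with field positions taken from the SMCCC function identifier encoding: bit 31 call type, bit 30 calling convention, bits 29:24 owner, bits 15:0 function number.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FAST_CALL        1u /* bit 31: fast call */
#define DEMO_SMC_32           0u /* bit 30: SMC32/HVC32 convention */
#define DEMO_OWNER_VENDOR_HYP 6u /* bits 29:24: vendor hypervisor owner */

/* Pack the four fields into a 32-bit function ID. */
static uint32_t demo_call_val(uint32_t type, uint32_t conv,
			      uint32_t owner, uint32_t func)
{
	assert(type <= 1 && conv <= 1);
	return (type << 31) | (conv << 30) |
	       ((owner & 0x3Fu) << 24) | (func & 0xFFFFu);
}
```

With function number 1 (ARM_SMCCC_KVM_FUNC_PTP) this yields 0x86000001, and with function number 0xff01 (ARM_SMCCC_FUNC_QUERY_CALL_UID) it yields 0x8600ff01.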

[PATCH v16 7/9] ptp: arm/arm64: Enable ptp_kvm for arm/arm64

2020-12-08 Thread Jianyong Wu

Currently, there is no mechanism to keep time in sync between guest and host
in an arm/arm64 virtualization environment. Time in the guest will drift
relative to the host after boot, as each may use its own third-party time
source to correct its time. The time deviation will be on the order
of milliseconds. But in some scenarios, such as cloud environments, we need
higher time precision.

kvm ptp clock, which chooses the host clock source as a reference
clock to sync time between guest and host, has been adopted by x86
which takes the time sync order from milliseconds to nanoseconds.

This patch enables kvm ptp clock for arm/arm64 and improves clock sync precision
significantly.

Test results with and without kvm ptp clock on arm/arm64 are compared
below. They are taken from the output of the command 'chronyc sources'.
Pay particular attention to the last sample column, which shows
the offset between the local clock and the source at the last measurement.

no kvm ptp in guest:
MS Name/IP address         Stratum Poll Reach LastRx Last sample

^* dns1.synet.edu.cn             2    6   377     13  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn             2    6   377     21  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn             2    6   377     29  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn             2    6   377     37  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn             2    6   377     45  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn             2    6   377     53  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn             2    6   377     61  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn             2    6   377      4   -130us[ +796us] +/-   21ms
^* dns1.synet.edu.cn             2    6   377     12   -130us[ +796us] +/-   21ms
^* dns1.synet.edu.cn             2    6   377     20   -130us[ +796us] +/-   21ms

in host:
MS Name/IP address         Stratum Poll Reach LastRx Last sample

^* 120.25.115.20                 2    7   377     72   -470us[ -603us] +/-   18ms
^* 120.25.115.20                 2    7   377     92   -470us[ -603us] +/-   18ms
^* 120.25.115.20                 2    7   377    112   -470us[ -603us] +/-   18ms
^* 120.25.115.20                 2    7   377      2   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20                 2    7   377     22   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20                 2    7   377     43   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20                 2    7   377     63   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20                 2    7   377     83   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20                 2    7   377    103   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20                 2    7   377    123   +872ns[-6808ns] +/-   17ms

dns1.synet.edu.cn is the network reference clock for the guest and
120.25.115.20 is the network reference clock for the host. We can't get the
clock error between guest and host directly, but a rough estimate
is on the order of hundreds of us to ms.

with kvm ptp in guest:
chrony has been disabled in host to remove disturbance from the network clock.

MS Name/IP address         Stratum Poll Reach LastRx Last sample

* PHC0                           0    3   377      8     -7ns[   +1ns] +/-    3ns
* PHC0                           0    3   377      8     +1ns[  +16ns] +/-    3ns
* PHC0                           0    3   377      6     -4ns[   -0ns] +/-    6ns
* PHC0                           0    3   377      6     -8ns[  -12ns] +/-    5ns
* PHC0                           0    3   377      5     +2ns[   +4ns] +/-    4ns
* PHC0                           0    3   377     13     +2ns[   +4ns] +/-    4ns
* PHC0                           0    3   377     12     -4ns[   -6ns] +/-    4ns
* PHC0                           0    3   377     11     -8ns[  -11ns] +/-    6ns
* PHC0                           0    3   377     10    -14ns[  -20ns] +/-    4ns
* PHC0                           0    3   377      8     +4ns[   +5ns] +/-    4ns
PHC0 is the ptp clock, which chooses the host clock as its source
clock. So we can see that the clock difference between host and guest
is on the order of ns.

Signed-off-by: Jianyong Wu 
---
 drivers/clocksource/arm_arch_timer.c | 29 ++
 drivers/ptp/Kconfig  |  2 +-
 drivers/ptp/Makefile |  1 +
 drivers/ptp/ptp_kvm_arm.c| 45 
 4 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 drivers/ptp/ptp_kvm_arm.c

diff --git a/drivers/clocksource/arm_arch_timer.c 
b/drivers/clocksource/arm_arch_timer.c
index d55acffb0b90..16cd0a663587 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -25,6 +25,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -1650,3 +1652,30 @@ static int __init arch_timer_acpi_init(struct 
acpi_table_header *table)
 }
 TIMER_ACPI_DECLARE(arch_timer, ACPI_SIG_GTDT, arch_timer_acpi_init);
 #endif
+
+int

[PATCH v16 8/9] doc: add ptp_kvm introduction for arm64 support

2020-12-08 Thread Jianyong Wu

The PTP_KVM implementation depends on a hypercall using SMCCC, so we
introduce a new SMCCC service ID. This doc explains how the
ID is defined and how PTP_KVM works on arm/arm64.

Signed-off-by: Jianyong Wu 
---
 Documentation/virt/kvm/api.rst |  9 +++
 Documentation/virt/kvm/arm/index.rst   |  1 +
 Documentation/virt/kvm/arm/ptp_kvm.rst | 31 +++
 Documentation/virt/kvm/timekeeping.rst | 35 ++
 4 files changed, 76 insertions(+)
 create mode 100644 Documentation/virt/kvm/arm/ptp_kvm.rst

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e00a66d72372..3769cc2f7d9c 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6390,3 +6390,12 @@ When enabled, KVM will disable paravirtual features 
provided to the
 guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
 (0x40000001). Otherwise, a guest may use the paravirtual features
 regardless of what has actually been exposed through the CPUID leaf.
+
+8.27 KVM_CAP_PTP_KVM
+--------------------
+
+:Architectures: arm64
+
+This capability indicates that the KVM virtual PTP service is supported in the host.
+It must accompany the implementation of the KVM virtual PTP service in the host,
+so that a VMM can probe for the service by checking this capability.
diff --git a/Documentation/virt/kvm/arm/index.rst 
b/Documentation/virt/kvm/arm/index.rst
index 3e2b2aba90fc..78a9b670aafe 100644
--- a/Documentation/virt/kvm/arm/index.rst
+++ b/Documentation/virt/kvm/arm/index.rst
@@ -10,3 +10,4 @@ ARM
hyp-abi
psci
pvtime
+   ptp_kvm
diff --git a/Documentation/virt/kvm/arm/ptp_kvm.rst 
b/Documentation/virt/kvm/arm/ptp_kvm.rst
new file mode 100644
index ..d729c1388a5c
--- /dev/null
+++ b/Documentation/virt/kvm/arm/ptp_kvm.rst
@@ -0,0 +1,31 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+PTP_KVM support for arm/arm64
+=============================
+
+PTP_KVM is used for high-precision time sync between guest and host.
+It needs to get the wall time and counter value from the host and transfer these
+to the guest via a hypercall service, so one more hypercall service has been added.
+
+This new SMCCC hypercall is defined as:
+
+* ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID: 0x86000001
+
+Because both 32-bit and 64-bit ptp_kvm clients must be supported, the SMC32/HVC32
+calling convention is used.
+
+ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
+
+=============    ============  ==========================================
+Function ID:     (uint32)      0x86000001
+Arguments:       (uint32)      ARM_PTP_PHY_COUNTER(1) or ARM_PTP_VIRT_COUNTER(0),
+                               which select the physical counter or the
+                               virtual counter respectively.
+Return Value:    val0(uint32)  NOT_SUPPORTED(-1) or upper 32 bits of wall clock time (64 bits).
+                 val1(uint32)  Lower 32 bits of wall clock time.
+                 val2(uint32)  Upper 32 bits of counter cycle (64 bits).
+                 val3(uint32)  Lower 32 bits of counter cycle.
+Endianness:                    No Restrictions.
+=============    ============  ==========================================
+
+For more information, see section 5 of Documentation/virt/kvm/timekeeping.rst.
diff --git a/Documentation/virt/kvm/timekeeping.rst 
b/Documentation/virt/kvm/timekeeping.rst
index 21ae7efa29ba..c81383e38372 100644
--- a/Documentation/virt/kvm/timekeeping.rst
+++ b/Documentation/virt/kvm/timekeeping.rst
@@ -13,6 +13,7 @@ Timekeeping Virtualization for X86-Based Architectures
2) Timing Devices
3) TSC Hardware
4) Virtualization Problems
+   5) KVM virtual PTP clock
 
 1. Overview
 ===
@@ -643,3 +644,37 @@ by using CPU utilization itself as a signalling channel.  
Preventing such
 problems would require completely isolated virtual time which may not track
 real time any longer.  This may be useful in certain security or QA contexts,
 but in general isn't recommended for real-world deployment scenarios.
+
+5. KVM virtual PTP clock
+========================
+
+NTP (Network Time Protocol) is often used to sync time in a VM. Unfortunately,
+the precision of NTP is limited due to unknown delays in the network.
+
+KVM virtual PTP clock (PTP_KVM) offers another way to sync time in a VM: use the
+host's clock rather than one from a remote machine. Having a synchronization
+mechanism for the virtualization environment allows us to keep all the guests
+running on the same host in sync.
+In general, the delay of communication between host and guest is quite
+small, so ptp_kvm can offer time sync precision on the order of nanoseconds.
+Please keep in mind that ptp_kvm limits itself to being a channel which
+transmits the remote clock from host to guest. An application, e.g. chrony, is
+needed in userspace of the VM in order to set the guest time.
+
+After ptp_kvm is initialized, there will be a new device node under /dev called
+ptp%d. A guest userspace service, like chrony, can use this device to get host

[PATCH v16 5/9] clocksource: Add clocksource id for arm arch counter

2020-12-08 Thread Jianyong Wu

Add clocksource id for arm arch counter to let it be identified easily and
elegantly in ptp_kvm implementation for arm.

Signed-off-by: Jianyong Wu 
---
 drivers/clocksource/arm_arch_timer.c | 2 ++
 include/linux/clocksource_ids.h  | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 6c3e84180146..d55acffb0b90 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -191,6 +192,7 @@ static u64 arch_counter_read_cc(const struct cyclecounter *cc)
 
 static struct clocksource clocksource_counter = {
.name   = "arch_sys_counter",
+   .id = CSID_ARM_ARCH_COUNTER,
.rating = 400,
.read   = arch_counter_read,
.mask   = CLOCKSOURCE_MASK(56),
diff --git a/include/linux/clocksource_ids.h b/include/linux/clocksource_ids.h
index 4d8e19e05328..16775d7d8f8d 100644
--- a/include/linux/clocksource_ids.h
+++ b/include/linux/clocksource_ids.h
@@ -5,6 +5,7 @@
 /* Enum to give clocksources a unique identifier */
 enum clocksource_ids {
CSID_GENERIC= 0,
+   CSID_ARM_ARCH_COUNTER,
CSID_MAX,
 };
 
-- 
2.17.1

[PATCH v16 9/9] arm64: Add kvm capability check extension for ptp_kvm

2020-12-08 Thread Jianyong Wu

Let userspace check whether the host provides the KVM PTP service.
Before migrating a VM to another host, the VMM may check whether this
capability is available to decide how to proceed.

Signed-off-by: Jianyong Wu 
Suggested-by: Marc Zyngier 
---
 arch/arm64/kvm/arm.c | 1 +
 include/uapi/linux/kvm.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index f60f4a5e1a22..1bb1f64f9bb5 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -199,6 +199,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_ARM_INJECT_EXT_DABT:
case KVM_CAP_SET_GUEST_DEBUG:
case KVM_CAP_VCPU_ATTRIBUTES:
+   case KVM_CAP_PTP_KVM:
r = 1;
break;
case KVM_CAP_ARM_SET_DEVICE_ADDR:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index ca41220b40b8..797c40bbc31f 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1053,6 +1053,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_X86_USER_SPACE_MSR 188
 #define KVM_CAP_X86_MSR_FILTER 189
 #define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 190
+#define KVM_CAP_PTP_KVM 191
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.17.1

[PATCH v16 4/9] time: Add mechanism to recognize clocksource in time_get_snapshot

2020-12-08 Thread Jianyong Wu

From: Thomas Gleixner 

System time snapshots are not conveying information about the current
clocksource which was used, but callers like the PTP KVM guest
implementation have the requirement to evaluate the clocksource type to
select the appropriate mechanism.

Introduce a clocksource id field in struct clocksource which is by default
set to CSID_GENERIC (0). Clocksource implementations can set that field to
a value which allows to identify the clocksource.

Store the clocksource id of the current clocksource in the
system_time_snapshot so callers can evaluate which clocksource was used to
take the snapshot and act accordingly.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Jianyong Wu 
---
 include/linux/clocksource.h |  6 ++
 include/linux/clocksource_ids.h | 11 +++
 include/linux/timekeeping.h | 12 +++-
 kernel/time/clocksource.c   |  2 ++
 kernel/time/timekeeping.c   |  1 +
 5 files changed, 27 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/clocksource_ids.h

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 86d143db6523..1290d0dce840 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -62,6 +63,10 @@ struct module;
  * 400-499: Perfect
  * The ideal clocksource. A must-use where
  * available.
+ * @id:Defaults to CSID_GENERIC. The id value is captured
+ * in certain snapshot functions to allow callers to
+ * validate the clocksource from which the snapshot was
+ * taken.
  * @flags: Flags describing special properties
  * @enable:Optional function to enable the clocksource
  * @disable:   Optional function to disable the clocksource
@@ -100,6 +105,7 @@ struct clocksource {
const char  *name;
struct list_headlist;
int rating;
+   enum clocksource_idsid;
enum vdso_clock_modevdso_clock_mode;
unsigned long   flags;
 
diff --git a/include/linux/clocksource_ids.h b/include/linux/clocksource_ids.h
new file mode 100644
index ..4d8e19e05328
--- /dev/null
+++ b/include/linux/clocksource_ids.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_CLOCKSOURCE_IDS_H
+#define _LINUX_CLOCKSOURCE_IDS_H
+
+/* Enum to give clocksources a unique identifier */
+enum clocksource_ids {
+   CSID_GENERIC= 0,
+   CSID_MAX,
+};
+
+#endif
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index d47009611109..688ec2e1a3bf 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -3,6 +3,7 @@
 #define _LINUX_TIMEKEEPING_H
 
 #include 
+#include 
 
 /* Included from linux/ktime.h */
 
@@ -243,11 +244,12 @@ struct ktime_timestamps {
  * @cs_was_changed_seq:The sequence number of clocksource change events
  */
 struct system_time_snapshot {
-   u64 cycles;
-   ktime_t real;
-   ktime_t raw;
-   unsigned intclock_was_set_seq;
-   u8  cs_was_changed_seq;
+   u64 cycles;
+   ktime_t real;
+   ktime_t raw;
+   enum clocksource_idscs_id;
+   unsigned intclock_was_set_seq;
+   u8  cs_was_changed_seq;
 };
 
 /**
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index cce484a2cc7c..4fe1df894ee5 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -920,6 +920,8 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
 
clocksource_arch_init(cs);
 
+   if (WARN_ON_ONCE((unsigned int)cs->id >= CSID_MAX))
+   cs->id = CSID_GENERIC;
if (cs->vdso_clock_mode < 0 ||
cs->vdso_clock_mode >= VDSO_CLOCKMODE_MAX) {
pr_warn("clocksource %s registered with invalid VDSO mode %d. Disabling VDSO support.\n",
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index a45cedda93a7..50f08632165c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1049,6 +1049,7 @@ void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
do {
seq = read_seqcount_begin(&tk_core.seq);
now = tk_clock_read(&tk->tkr_mono);
+   systime_snapshot->cs_id = tk->tkr_mono.clock->id;
systime_snapshot->cs_was_changed_seq = tk->cs_was_changed_seq;
systime_snapshot->clock_was_set_seq = tk->clock_was_set_seq;
base_real = ktime_add(tk->tkr_mono.base,
-- 
2.17.1
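
The clocksource id mechanism described in this patch can be illustrated with a small userspace sketch. It is not the kernel code: the `sketch_` names and the reduced snapshot struct are hypothetical stand-ins, and only the two ideas from the patch are modelled, namely sanitizing an out-of-range id at registration time and letting a caller check which clocksource a snapshot came from.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's enum clocksource_ids. */
enum sketch_csid {
	SKETCH_CSID_GENERIC = 0,
	SKETCH_CSID_ARCH_COUNTER,
	SKETCH_CSID_MAX,
};

/* Reduced, hypothetical stand-in for struct system_time_snapshot. */
struct sketch_snapshot {
	uint64_t cycles;
	enum sketch_csid cs_id;
};

/* Models the registration-time check: out-of-range ids fall back to generic. */
enum sketch_csid sketch_sanitize_csid(unsigned int id)
{
	return id >= SKETCH_CSID_MAX ? SKETCH_CSID_GENERIC : (enum sketch_csid)id;
}

/*
 * A caller that only understands one counter type accepts a snapshot
 * only if it was taken from that clocksource.
 * Returns 0 and stores the cycle count on success, -1 otherwise.
 */
int sketch_cycles_if_from(const struct sketch_snapshot *snap,
			  enum sketch_csid wanted, uint64_t *cycles)
{
	assert(snap && cycles);
	if (snap->cs_id != wanted)
		return -1;
	*cycles = snap->cycles;
	return 0;
}
```

A caller would take a snapshot, then call `sketch_cycles_if_from()` with the id it knows how to handle and fall back to another mechanism when it returns -1.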

[PATCH v16 3/9] ptp: Reorganize ptp_kvm module to make it arch-independent.

2020-12-08 Thread Jianyong Wu

Currently, the ptp_kvm module is implemented only for x86 and includes
a large amount of arch-specific code.  This patch moves all of that code
into a new arch-specific file in the same directory.

Signed-off-by: Jianyong Wu 
---
 drivers/ptp/Makefile|  1 +
 drivers/ptp/{ptp_kvm.c => ptp_kvm_common.c} | 84 +-
 drivers/ptp/ptp_kvm_x86.c   | 96 +
 include/linux/ptp_kvm.h | 16 
 4 files changed, 135 insertions(+), 62 deletions(-)
 rename drivers/ptp/{ptp_kvm.c => ptp_kvm_common.c} (60%)
 create mode 100644 drivers/ptp/ptp_kvm_x86.c
 create mode 100644 include/linux/ptp_kvm.h

diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile
index 7aff75f745dc..699a4e4d19c2 100644
--- a/drivers/ptp/Makefile
+++ b/drivers/ptp/Makefile
@@ -4,6 +4,7 @@
 #
 
 ptp-y  := ptp_clock.o ptp_chardev.o ptp_sysfs.o
+ptp_kvm-$(CONFIG_X86)  := ptp_kvm_x86.o ptp_kvm_common.o
 obj-$(CONFIG_PTP_1588_CLOCK)   += ptp.o
 obj-$(CONFIG_PTP_1588_CLOCK_DTE)   += ptp_dte.o
 obj-$(CONFIG_PTP_1588_CLOCK_INES)  += ptp_ines.o
diff --git a/drivers/ptp/ptp_kvm.c b/drivers/ptp/ptp_kvm_common.c
similarity index 60%
rename from drivers/ptp/ptp_kvm.c
rename to drivers/ptp/ptp_kvm_common.c
index 658d33fc3195..721ddcede5e1 100644
--- a/drivers/ptp/ptp_kvm.c
+++ b/drivers/ptp/ptp_kvm_common.c
@@ -8,11 +8,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
-#include 
-#include 
 #include 
 
 #include 
@@ -24,56 +24,29 @@ struct kvm_ptp_clock {
 
 static DEFINE_SPINLOCK(kvm_ptp_lock);
 
-static struct pvclock_vsyscall_time_info *hv_clock;
-
-static struct kvm_clock_pairing clock_pair;
-static phys_addr_t clock_pair_gpa;
-
 static int ptp_kvm_get_time_fn(ktime_t *device_time,
   struct system_counterval_t *system_counter,
   void *ctx)
 {
-   unsigned long ret;
+   long ret;
+   u64 cycle;
struct timespec64 tspec;
-   unsigned version;
-   int cpu;
-   struct pvclock_vcpu_time_info *src;
+   struct clocksource *cs;
 
spin_lock(&kvm_ptp_lock);
 
preempt_disable_notrace();
-   cpu = smp_processor_id();
-   src = &hv_clock[cpu].pvti;
-
-   do {
-   /*
-* We are using a TSC value read in the hosts
-* kvm_hc_clock_pairing handling.
-* So any changes to tsc_to_system_mul
-* and tsc_shift or any other pvclock
-* data invalidate that measurement.
-*/
-   version = pvclock_read_begin(src);
-
-   ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
-clock_pair_gpa,
-KVM_CLOCK_PAIRING_WALLCLOCK);
-   if (ret != 0) {
-   pr_err_ratelimited("clock pairing hypercall ret %lu\n", ret);
-   spin_unlock(&kvm_ptp_lock);
-   preempt_enable_notrace();
-   return -EOPNOTSUPP;
-   }
-
-   tspec.tv_sec = clock_pair.sec;
-   tspec.tv_nsec = clock_pair.nsec;
-   ret = __pvclock_read_cycles(src, clock_pair.tsc);
-   } while (pvclock_read_retry(src, version));
+   ret = kvm_arch_ptp_get_crosststamp(&cycle, &tspec, &cs);
+   if (ret) {
+   spin_unlock(&kvm_ptp_lock);
+   preempt_enable_notrace();
+   return ret;
+   }
 
preempt_enable_notrace();
 
-   system_counter->cycles = ret;
-   system_counter->cs = &kvm_clock;
+   system_counter->cycles = cycle;
+   system_counter->cs = cs;
 
*device_time = timespec64_to_ktime(tspec);
 
@@ -111,22 +84,17 @@ static int ptp_kvm_settime(struct ptp_clock_info *ptp,
 
 static int ptp_kvm_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
 {
-   unsigned long ret;
+   long ret;
struct timespec64 tspec;
 
spin_lock(&kvm_ptp_lock);
 
-   ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
-clock_pair_gpa,
-KVM_CLOCK_PAIRING_WALLCLOCK);
-   if (ret != 0) {
-   pr_err_ratelimited("clock offset hypercall ret %lu\n", ret);
+   ret = kvm_arch_ptp_get_clock(&tspec);
+   if (ret) {
spin_unlock(&kvm_ptp_lock);
-   return -EOPNOTSUPP;
+   return ret;
}
 
-   tspec.tv_sec = clock_pair.sec;
-   tspec.tv_nsec = clock_pair.nsec;
spin_unlock(&kvm_ptp_lock);
 
memcpy(ts, &tspec, sizeof(struct timespec64));
@@ -168,19 +136,11 @@ static int __init ptp_kvm_init(void)
 {
long ret;
 
-   if (!kvm_para_available())
-   return -ENODEV;
-
-   clock_pair_gpa = slow_virt_to_phys(&clock_pair);
-   hv_clock = pvclock_get_pvti_cpu0_va();
-
-   if (!hv_clock)
-   return -ENODEV;
-
-   ret =

[PATCH v16 2/9] arm/arm64: KVM: Advertise KVM UID to guests via SMCCC

2020-12-08 Thread Jianyong Wu

From: Will Deacon 

We can advertise ourselves to guests as KVM and provide a basic features
bitmap for discoverability of future hypervisor services.

Cc: Marc Zyngier 
Signed-off-by: Will Deacon 
Signed-off-by: Jianyong Wu 
---
 arch/arm64/kvm/hypercalls.c | 27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 25ea4ecb6449..b9d8607083eb 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -12,13 +12,13 @@
 int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 {
u32 func_id = smccc_get_function(vcpu);
-   long val = SMCCC_RET_NOT_SUPPORTED;
+   u64 val[4] = {SMCCC_RET_NOT_SUPPORTED};
u32 feature;
gpa_t gpa;
 
switch (func_id) {
case ARM_SMCCC_VERSION_FUNC_ID:
-   val = ARM_SMCCC_VERSION_1_1;
+   val[0] = ARM_SMCCC_VERSION_1_1;
break;
case ARM_SMCCC_ARCH_FEATURES_FUNC_ID:
feature = smccc_get_arg1(vcpu);
@@ -28,10 +28,10 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
case SPECTRE_VULNERABLE:
break;
case SPECTRE_MITIGATED:
-   val = SMCCC_RET_SUCCESS;
+   val[0] = SMCCC_RET_SUCCESS;
break;
case SPECTRE_UNAFFECTED:
-   val = SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED;
+   val[0] = SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED;
break;
}
break;
@@ -54,27 +54,36 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
break;
fallthrough;
case SPECTRE_UNAFFECTED:
-   val = SMCCC_RET_NOT_REQUIRED;
+   val[0] = SMCCC_RET_NOT_REQUIRED;
break;
}
break;
case ARM_SMCCC_HV_PV_TIME_FEATURES:
-   val = SMCCC_RET_SUCCESS;
+   val[0] = SMCCC_RET_SUCCESS;
break;
}
break;
case ARM_SMCCC_HV_PV_TIME_FEATURES:
-   val = kvm_hypercall_pv_features(vcpu);
+   val[0] = kvm_hypercall_pv_features(vcpu);
break;
case ARM_SMCCC_HV_PV_TIME_ST:
gpa = kvm_init_stolen_time(vcpu);
if (gpa != GPA_INVALID)
-   val = gpa;
+   val[0] = gpa;
+   break;
+   case ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID:
+   val[0] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0;
+   val[1] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_1;
+   val[2] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_2;
+   val[3] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3;
+   break;
+   case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
+   val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
break;
default:
return kvm_psci_call(vcpu);
}
 
-   smccc_set_retval(vcpu, val, 0, 0, 0);
+   smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);
return 1;
 }
-- 
2.17.1
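
The switch from a single `val` to `val[4]` in this patch relies on a C initializer rule: naming only the first element leaves the remaining elements zero. The following is a minimal userspace sketch of that behaviour, assuming a hypothetical `SKETCH_RET_NOT_SUPPORTED` of -1 and hypothetical `sketch_` names; it does not use any kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for SMCCC_RET_NOT_SUPPORTED. */
#define SKETCH_RET_NOT_SUPPORTED (-1L)

/* Four return registers, as passed on to the guest by the handler. */
struct sketch_retval {
	uint64_t reg[4];
};

/* Builds the default reply used when no case in the switch matches. */
struct sketch_retval sketch_default_retval(void)
{
	/* Only element 0 is named; C zero-initialises elements 1..3. */
	uint64_t val[4] = { (uint64_t)SKETCH_RET_NOT_SUPPORTED };
	struct sketch_retval r = { { val[0], val[1], val[2], val[3] } };

	return r;
}
```

With this default, a handler that only sets `val[0]` still returns zeros in the other three registers, which matches the previous behaviour of passing `val, 0, 0, 0`.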

[PATCH v16 0/9] Enable ptp_kvm for arm/arm64

2020-12-08 Thread Jianyong Wu

Currently, we often use NTP (which syncs time with a remote network clock)
to sync time in a VM. But the precision of NTP is subject to network delay,
so it is difficult to sync time with high precision.

The KVM virtual PTP clock (ptp_kvm) offers another way to sync time in a VM,
as the remote clock is located in the host instead of on the network.
It aims to sync time between guest and host in a virtualization
environment; in this way, we can keep the time of all the VMs running
on the same host in sync. In general, the delay of communication between
host and guest is quite small, so ptp_kvm can offer time sync precision
on the order of nanoseconds. Please keep in mind that ptp_kvm limits
itself to being a channel which transmits the remote clock from
host to guest and leaves the time sync job to an application, e.g. chrony,
in userspace in the VM.

How ptp_kvm works:
After ptp_kvm is initialized, there will be a new device node under
/dev called ptp%d. A guest userspace service, like chrony, can use this
device to get the host walltime, and sometimes also the counter cycle,
depending on the service it calls. The guest userspace service can then use
this data to do the time sync for the guest.
Here is a rough sketch to show how the kvm ptp clock works.

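|----------------------------|              |--------------------------|
|       guest userspace      |              |          host            |
|ioctl -> /dev/ptp%d         |              |                          |
|       ^   |                |              |                          |
|----------------------------|              |                          |
|       |   | guest kernel   |              |                          |
|       |   V      (get host walltime/counter cycle)                   |
|      ptp_kvm -> hypercall - - - - - - - - - - ->hypercall service    |
|                         <- - - - - - - - - - - -                     |
|----------------------------|              |--------------------------|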

1. The time sync service in guest userspace calls the ptp device through /dev/ptp%d.
2. The ptp_kvm module in the guest receives this request, then invokes a hypercall
that routes into the host kernel to request the host walltime/counter cycle.
3. The ptp_kvm hypercall service in the host responds to the request and sends the data back.
4. ptp (not ptp_kvm) in the guest copies the data to userspace.

This ptp_kvm implementation focuses on steps 2 and 3; step 2 runs
in the guest, while step 3 runs in the host kernel.
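
The documentation patch in this series describes the data sent back in step 3 as four 32-bit values: the upper and lower halves of the wall clock time and of the counter cycle, with NOT_SUPPORTED(-1) in the first value on failure. A guest-side decoder could look roughly like the sketch below. It is plain userspace C with hypothetical `sketch_` names, and treating a first value of 0xFFFFFFFF as "not supported" is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: NOT_SUPPORTED(-1) as seen in a 32-bit return register. */
#define SKETCH_NOT_SUPPORTED 0xFFFFFFFFu

/* Hypothetical container for one decoded reply. */
struct sketch_ptp_sample {
	uint64_t wall_clock;	/* 64-bit wall clock time */
	uint64_t counter;	/* 64-bit counter cycle */
};

/* Joins an upper and a lower 32-bit half into one 64-bit value. */
uint64_t sketch_join32(uint32_t upper, uint32_t lower)
{
	return ((uint64_t)upper << 32) | lower;
}

/*
 * Decodes val0..val3 of the hypercall reply.
 * Returns 0 on success, -1 if the host reported NOT_SUPPORTED.
 */
int sketch_decode_ptp_reply(const uint32_t val[4],
			    struct sketch_ptp_sample *out)
{
	assert(val && out);
	if (val[0] == SKETCH_NOT_SUPPORTED)
		return -1;
	out->wall_clock = sketch_join32(val[0], val[1]);
	out->counter = sketch_join32(val[2], val[3]);
	return 0;
}
```

Splitting each 64-bit quantity into two 32-bit values is what allows the same reply layout to serve both 32-bit and 64-bit guests.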

change log:

from v15 to v16:
(1) remove ARM_PTP_NONE_COUNTER suggested by Marc.
(2) add more detail for ptp_kvm doc.
(3) fix ci issues reported by test robot.

from v14 to v15:
(1) enable ptp_kvm on arm32 guest, also ptp_kvm has been tested
on both arm64 and arm32 guest running on arm64 kvm host.
(2) move arch-agnostic part of ptp_kvm.rst into timekeeping.rst.
(3) rename KVM_CAP_ARM_PTP_KVM to KVM_CAP_PTP_KVM as it should be
arch agnostic.
(4) add description for KVM_CAP_PTP_KVM in Documentation/virt/kvm/api.rst.
(5) adjust dependency in Kconfig for ptp_kvm.
(6) refine multi-arch process in driver/ptp/Makefile.
(7) fix make pdfdocs htmldocs issue for ptp_kvm doc.
(8) address other issues from comments in v14.
(9) fold hypercall service of ptp_kvm as a function.
(10) rebase to 5.10-rc3.

from v13 to v14
(1) rebase code on 5.9-rc3.
(2) add a document to introduce implementation of PTP_KVM on
arm64.
(3) fix comments issue in hypercall.c.
(4) export arm_smccc_1_1_get_conduit using EXPORT_SYMBOL_GPL.
(5) fix make issue on x86 reported by kernel test robot.

from v12 to v13:
(1) rebase code on 5.8-rc1.
(2) this patch set is based on patches 1/8 and 2/8 from Will Deacon.
(3) remove the change to ptp device code of extend getcrosststamp.
(4) remove the mechanism of letting user choose the counter type in
ptp_kvm for arm64.
(5) add virtual counter option in ptp_kvm service to let user choose
the specific counter explicitly.

from v11 to v12:
(1) rebase code on 5.7-rc6 and rebase 2 patches from Will Deacon,
1/11 and 2/11, as these patches introduce the discovery mechanism of
vendor smccc service.
(2) rebase ptp_kvm hypercall service from standard smccc to vendor
smccc and add ptp_kvm to vendor smccc service discover mechanism.
(3) add detail of why we need ptp_kvm and how ptp_kvm works in cover
letter.

from v10 to v11:
(1) rebase code on 5.7-rc2.
(2) remove support for arm32, as kvm support for arm32 will be
removed [1]
(3) add error report in ptp_kvm initialization.

from v9 to v10:
(1) change code base to v5.5.
(2) enable ptp_kvm both for arm32 and arm64.
(3) let user choose which of virtual counter or physical counter
should return when using crosstimestamp mode of ptp_kvm for arm/arm64.
(4) extend input argument for getcrosstimestamp API.

Re: memory leak in generic_parse_monolithic [+PATCH]

2020-12-08 Thread Dmitry Vyukov

On Wed, Dec 9, 2020 at 12:15 AM Randy Dunlap  wrote:
>
> On 12/8/20 2:54 PM, David Howells wrote:
> > Randy Dunlap  wrote:
> >
> >>> Now the backtrace only shows what the state was when the string was allocated;
> >>> it doesn't show what happened to it after that, so another possibility is that
> >>> the filesystem being mounted nicked what vfs_parse_fs_param() had rightfully
> >>> stolen, transferring fc->source somewhere else and then failed to release it -
> >>> most likely on mount failure (ie. it's an error handling bug in the
> >>> filesystem).
> >>>
> >>> Do we know what filesystem it was?
> >>
> >> Yes, it's call AFS (or kAFS).
> >
> > Hmmm...  afs parses the string in afs_parse_source() without modifying it,
> > then moves the pointer to fc->source (parallelling vfs_parse_fs_param()) and
> > doesn't touch it again.  fc->source should be cleaned up by do_new_mount()
> > calling put_fs_context() at the end of the function.
> >
> > As far as I can tell with the attached print-insertion patch, it works, called
> > by the following commands, some of which are correct and some which aren't:
> >
> > # mount -t afs none /xfstest.test/ -o dyn
> > # umount /xfstest.test
> > # mount -t afs "" /xfstest.test/ -o foo
> > mount: /xfstest.test: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount. helper program.
> > # umount /xfstest.test
> > umount: /xfstest.test: not mounted.
> > # mount -t afs %xfstest.test20 /xfstest.test/ -o foo
> > mount: /xfstest.test: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount. helper program.
> > # umount /xfstest.test
> > umount: /xfstest.test: not mounted.
> > # mount -t afs %xfstest.test20 /xfstest.test/
> > # umount /xfstest.test
> >
> > Do you know if the mount was successful and what the mount parameters were?
>
> Here's the syzbot reproducer:
> https://syzkaller.appspot.com/x/repro.c?x=129ca3d650
>
> The "interesting" mount params are:
> source=%^]$[+%](${:\017k[)-:,source=%^]$[+.](%{:\017\200[)-:,\000
>
> There is no other AFS activity: nothing mounted, no cells known (or
> whatever that is), etc.
>
> I don't recall if the mount was successful and I can't test it just now.
> My laptop is mucked up.
>
>
> Be aware that this report could just be a false positive: it waits
> for 5 seconds then looks for a memleak. AFAIK, it's possible that the "leaked"
> memory is still in valid use and will be freed some day.

FWIW KMEMLEAK scans memory for pointers. If it claims a memory leak,
it means the heap object is not referenced anywhere anymore. There are
no live pointers to it to call kfree or anything else.
Some false positives are theoretically possible, but I don't
remember any; all reported ones were true leaks:
https://syzkaller.appspot.com/upstream/fixed?manager=ci-upstream-gce-leak
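
As an illustration of how a heap object can end up with no live pointer to it, here is a minimal userspace sketch of the "steal the string into the context" pattern discussed in this thread. Whether this is what happens in the reported AFS case is not established here. The `sketch_` names are hypothetical and nothing below is the kernel's fs_context API; the point is only that overwriting `source` a second time without freeing it would orphan the first string, so the sketch refuses a second value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for a mount context that owns one string. */
struct sketch_fs_context {
	char *source;
};

/*
 * Moves ownership of *param_string into the context.
 * Returns 0 on success. Returns -1 if a source is already set; in that
 * case the caller still owns *param_string and must free it.
 */
int sketch_take_source(struct sketch_fs_context *fc, char **param_string)
{
	assert(fc && param_string);
	if (fc->source)
		return -1;	/* overwriting here would orphan the old string */
	fc->source = *param_string;
	*param_string = NULL;	/* ownership moved to the context */
	return 0;
}

/* Releases whatever the context owns. */
void sketch_put_context(struct sketch_fs_context *fc)
{
	assert(fc);
	free(fc->source);
	fc->source = NULL;
}
```

If `sketch_take_source()` instead assigned unconditionally, the first string would no longer be reachable from anywhere, which is the kind of object a pointer-scanning leak detector reports.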



> > David
> > ---
> > diff --git a/fs/afs/super.c b/fs/afs/super.c
> > index 6c5900df6aa5..4c44ec0196c9 100644
> > --- a/fs/afs/super.c
> > +++ b/fs/afs/super.c
> > @@ -299,7 +299,7 @@ static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param)
> >   ctx->cell = cell;
> >   }
> >
> > - _debug("CELL:%s [%p] VOLUME:%*.*s SUFFIX:%s TYPE:%d%s",
> > + kdebug("CELL:%s [%p] VOLUME:%*.*s SUFFIX:%s TYPE:%d%s",
> >  ctx->cell->name, ctx->cell,
> >  ctx->volnamesz, ctx->volnamesz, ctx->volname,
> >  suffix ?: "-", ctx->type, ctx->force ? " FORCE" : "");
> > @@ -318,6 +318,8 @@ static int afs_parse_param(struct fs_context *fc, struct fs_parameter *param)
> >   struct afs_fs_context *ctx = fc->fs_private;
> >   int opt;
> >
> > + kenter("%s,%p '%s'", param->key, param->string, param->string);
> > +
> >   opt = fs_parse(fc, afs_fs_parameters, param, &result);
> >   if (opt < 0)
> >   return opt;
> > diff --git a/fs/fs_context.c b/fs/fs_context.c
> > index 2834d1afa6e8..f530a33876ce 100644
> > --- a/fs/fs_context.c
> > +++ b/fs/fs_context.c
> > @@ -450,6 +450,8 @@ void put_fs_context(struct fs_context *fc)
> >   put_user_ns(fc->user_ns);
> >   put_cred(fc->cred);
> >   put_fc_log(fc);
> > + if (strcmp(fc->fs_type->name, "afs") == 0)
> > + printk("PUT %p '%s'\n", fc->source, fc->source);
> >   put_filesystem(fc->fs_type);
> >   kfree(fc->source);
> >   kfree(fc);
> > @@ -671,6 +673,8 @@ void vfs_clean_context(struct fs_context *fc)
> >   fc->s_fs_info = NULL;
> >   fc->sb_flags = 0;
> >   security_free_mnt_opts(&fc->security);
> > + if (strcmp(fc->fs_type->name, "afs") == 0)
> > + printk("CLEAN %p '%s'\n", fc->source, fc->source);
> >   kfree(fc->source);
> >   fc->source = NULL;
> >
> >
>
> I'll check more after my test machine is working again.
>
> thanks.
> --
> ~Randy
>

Re: [PATCH] vdpa/mlx5: Use write memory barrier after updating CQ index

2020-12-08 Thread Eli Cohen

On Tue, Dec 08, 2020 at 04:45:04PM -0500, Michael S. Tsirkin wrote:
> On Sun, Dec 06, 2020 at 12:57:19PM +0200, Eli Cohen wrote:
> > Make sure to put write memory barrier after updating CQ consumer index
> > so the hardware knows that there are available CQE slots in the queue.
> > 
> > Failure to do this can cause the update of the RX doorbell record to get
> > updated before the CQ consumer index resulting in CQ overrun.
> > 
> > Change-Id: Ib0ae4c118cce524c9f492b32569179f3c1f04cc1
> > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > Signed-off-by: Eli Cohen 
> 
> Aren't both memory writes?

Not sure what exactly you mean here.

> And given that, isn't dma_wmb() sufficient here?

I agree that dma_wmb() is more appropriate here.
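
The ordering requirement under discussion (consumer index store first, doorbell store second) can be sketched in userspace with C11 atomics. This is only an analogy under stated assumptions: C11 fences order accesses between CPU threads and say nothing about what a device observes through DMA, which is what wmb() and dma_wmb() address in the driver. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values the hardware reads. */
struct sketch_queue {
	_Atomic uint32_t cq_consumer_index;
	_Atomic uint32_t rx_doorbell;
};

/* Publishes `num` consumed completions, then posts `num` receive buffers. */
void sketch_handle_completions(struct sketch_queue *q, uint32_t num)
{
	uint32_t ci, db;

	assert(q);
	ci = atomic_load_explicit(&q->cq_consumer_index, memory_order_relaxed);
	atomic_store_explicit(&q->cq_consumer_index, ci + num,
			      memory_order_relaxed);

	/* Order the consumer index store before the doorbell store. */
	atomic_thread_fence(memory_order_release);

	db = atomic_load_explicit(&q->rx_doorbell, memory_order_relaxed);
	atomic_store_explicit(&q->rx_doorbell, db + num, memory_order_relaxed);
}
```

An observer that reads the doorbell with acquire ordering and sees the new value is then guaranteed to also see the updated consumer index, which is the property the patch wants the hardware to have.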

> 
> 
> > ---
> >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index 1f4089c6f9d7..295f46eea2a5 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -478,6 +478,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
> >  static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
> >  {
> > mlx5_cq_set_ci(&mvq->cq.mcq);
> > +
> > +   /* make sure CQ consumer update is visible to the hardware before updating
> > +* RX doorbell record.
> > +*/
> > +   wmb();
> > rx_post(&mvq->vqqp, num);
> > if (mvq->event_cb.callback)
> > mvq->event_cb.callback(mvq->event_cb.private);
> > -- 
> > 2.27.0
>

Re: [PATCH v2 3/3] pinctrl: qcom: Clear possible pending irq when remuxing GPIOs

2020-12-08 Thread Maulik Shah


Hi Doug,

On 12/4/2020 2:34 AM, Doug Anderson wrote:

Hi,

On Thu, Dec 3, 2020 at 3:22 AM Maulik Shah  wrote:

+ /*
+  * Clear IRQs if switching to/from GPIO mode since muxing to/from
+  * the GPIO path can cause phantom edges.
+  */
+ old_i = (oldval & mask) >> g->mux_bit;
+ if (old_i != i &&
+ (i == pctrl->soc->gpio_func || old_i == pctrl->soc->gpio_func))
+ msm_pinctrl_clear_pending_irq(pctrl, group, irq);
+

The phantom irq can come when switching to GPIO irq mode. so may be only
check if (i == pctrl->soc->gpio_func) {

Have you tested this experimentally?

Yes

Yes means that you tried switching away from GPIO mode and you
couldn't get a phantom interrupt?  OK, I'll re-test then.

I'll test on the Chrome OS kernel tree since that's easiest for me,
but I can test on mainline if you think it would make a difference...

1. Pick  and put that kernel on the device.

2. In Cr50 console, make the WP line low with:
   wp enable

3. In AP console do:
   echo bogus > /sys/module/gpio_keys/parameters/doug_test

4. See bogus interrupt:

localhost ~ # echo bogus > /sys/module/gpio_keys/parameters/doug_test
[   62.006346] DOUG: selecting state bogus
[   62.011813] DOUG: ret 0
[   62.011875] DOUG: in dual edge parent: hwirq=66, type=1
[   62.020300] DOUG: gpio_keys_gpio_isr

Can you try replicating again?



I have experimentally tested this and I can actually see an interrupt
generated when I _leave_ GPIO as well as when I enter GPIO mode.  If
you can't see this I can re-setup my test, but this was one of those
things that convinced me that the _transition_ is what was causing the
fake interrupt.

I think my test CL  can help you with
testing if you wish.



even better if you can clear this unconditionally.

Why?  It should only matter if we're going to/from GPIO mode.

Probably i was not clear, the phantom irq should be cleared when
switching gpio to gpio IRQ mode.

When GPIO was used as Rx line in example QUP/UART use case, it can latch
the phantom IRQ

This is where I disagree with you.  I don't think the interrupt is
latching while it's used as an Rx line.  I think it's the pinmux
change that introduces an phantom interrupt.

Specifically, with the same test patch above, AKA
, I can do this:

1. On AP:
   echo bogus > /sys/module/gpio_keys/parameters/doug_test

2. On Cr50 console:
   wp disable
   wp enable
   wp disable
   wp enable
   wp disable
   wp enable

3. Go back and check the AP and see that no interrupts fired.

Said another way: when we're muxed away the interrupts aren't getting
latched.  It's the act of changing the mux that causes the phantom
interrupts.
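
The condition in the quoted patch hunk ("clear IRQs if switching to/from GPIO mode") can be written as a small predicate. This is a userspace sketch with hypothetical names, not the pinctrl driver code; function numbers are plain integers here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when a mux change crosses the GPIO function boundary,
 * i.e. the function actually changes and either the old or the new
 * function is the GPIO function.
 */
bool sketch_needs_irq_clear(unsigned int old_func, unsigned int new_func,
			    unsigned int gpio_func)
{
	return old_func != new_func &&
	       (new_func == gpio_func || old_func == gpio_func);
}
```

The alternative proposed in the thread, clearing only when entering GPIO mode, would correspond to dropping the `old_func == gpio_func` term.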



but as long as its IRQ is in disabled/masked state it
doesn't matter.

...but there's no requirement that someone would need to disable/mask
an interrupt while switching the muxing, is there?  So it does matter.



its only when the GPIO is again set to IRQ mode with set_mux callback,
the phantom IRQ needs clear to start as clean.

So we should check only for if (i == pctrl->soc->gpio_func) then clear
phantom IRQ.

The same is case with .direction_output callback, when GPIO is used as
output say as clock, need not clear any phantom IRQ,

The reason is with every pulse of clock it can latch as pending IRQ in
GIC_ISPEND as long as it stay as output mode/clock.

its only when switching back GPIO from output direction to input & IRQ
function, need to clear the phantom IRQ.

so we do not require clear phantom irq in .direction_output callback.

I think all the above explanation is with the model that the interrupt
detection logic is still happening even when muxed away.  I don't
believe that's true.
It's not that the interrupt detection logic is still happening when muxed
away; rather, the GPIO line is routed to the GIC from the PDC.
The GPIO line gets forwarded to the GIC whenever the system is active/out of
system-level low power mode, irrespective of whether the GPIO is used as an
interrupt or not.
Due to this, it can still latch the IRQ at the GIC after switching to, let's
say, Rx mode: whenever the line receives any data, the line state
toggles can be latched as an error interrupt at the GIC.


As the interrupt is in the disabled state, it won't be sent to the CPU.
It's only when the driver chooses to switch back to interrupt mode that we
want to clear the latched error interrupt to start clean. The same is the
case when used in the output direction.


Hope above is clear.

Thanks,
Maulik

Please run my test patch or code up something
similar yourself.



In step (3) msm_gpio_irq_set_type() touches the RAW_STATUS_EN making the
phantom irq pending again.
To resolve this, you will need to invoke msm_pinctrl_clear_pending_irq()
at the end of the msm_gpio_irq_set_type().

I would like Rajendra's (already in cc) review as well on above part.

Ugh, so we need a clear in yet another place.  Joy.  OK, I will wait
for Rajendra's comment but I can add similar code in

Re: [PATCH v5 0/4] CPUFreq: Add support for opp-sharing cpus

2020-12-08 Thread Viresh Kumar

On 08-12-20, 17:42, Nicola Mazzucato wrote:
> Hi All,
> 
> In this V5 posting I have addressed suggestions on opp/of and scmi-cpufreq
> driver.
> 
> This is to support systems where exposed cpu performance controls are more
> fine-grained than the platform's ability to scale cpus independently.
> 
> Nicola Mazzucato (3):
>   dt-bindings: opp: Allow empty OPP tables
>   opp/of: Allow empty opp-table with opp-shared

Applied these two for now. Please rework the other patches based on
the feedback given on the other thread. Thanks.

-- 
viresh

[PATCH] gpio: eic-sprd: break loop when getting NULL device resource

2020-12-08 Thread Chunyan Zhang

From: Chunyan Zhang 

The EIC controller has a varying number of banks on different Spreadtrum SoCs,
and each bank has its own base address. The loop that gets their base
addresses in the driver should break once the resource returned by
platform_get_resource() is NULL. The later ones would all be NULL
even if the loop continued.

Fixes: 25518e024e3a ("gpio: Add Spreadtrum EIC driver support")
Signed-off-by: Chunyan Zhang 
---
 drivers/gpio/gpio-eic-sprd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-eic-sprd.c b/drivers/gpio/gpio-eic-sprd.c
index ad61daf6c212..865ab2b34fdd 100644
--- a/drivers/gpio/gpio-eic-sprd.c
+++ b/drivers/gpio/gpio-eic-sprd.c
@@ -598,7 +598,7 @@ static int sprd_eic_probe(struct platform_device *pdev)
 */
res = platform_get_resource(pdev, IORESOURCE_MEM, i);
if (!res)
-   continue;
+   break;
 
sprd_eic->base[i] = devm_ioremap_resource(&pdev->dev, res);
if (IS_ERR(sprd_eic->base[i]))
-- 
2.25.1
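
The effect of the one-line change above can be modelled in plain userspace C. This is only a sketch under stated assumptions: get_bank() is a hypothetical stand-in for platform_get_resource(), MAX_BANKS is an arbitrary bound chosen for illustration, and the banks are assumed to be contiguous so that every index after the first NULL is NULL too.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BANKS 8 /* hypothetical upper bound, not the driver's value */

/* Hypothetical stand-in for platform_get_resource(): banks are
 * contiguous, so every index at or past n yields NULL. */
static const int *get_bank(const int *const banks[], size_t n, size_t i)
{
	return i < n ? banks[i] : NULL;
}

/* Fill base[] with the available banks and return the number of
 * lookups performed. Breaking on the first NULL avoids the pointless
 * lookups that "continue" would still make. */
static size_t probe_banks(const int *const banks[], size_t n,
			  const int *base[], size_t *mapped)
{
	size_t lookups = 0;

	assert(base != NULL && mapped != NULL);
	*mapped = 0;

	for (size_t i = 0; i < MAX_BANKS; i++) {
		const int *res = get_bank(banks, n, i);

		lookups++;
		if (!res)
			break; /* later banks would be NULL as well */

		base[i] = res;
		(*mapped)++;
	}
	return lookups;
}
```

With two banks present, the loop stops after the third lookup instead of walking all MAX_BANKS slots; the set of mapped banks is the same either way.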

Re: [PATCH v4 3/4] scmi-cpufreq: get opp_shared_cpus from opp-v2 for EM

2020-12-08 Thread Viresh Kumar

On 08-12-20, 11:20, Sudeep Holla wrote:
> It is because of per-CPU vs per domain drama here. Imagine a system with
> 4 CPUs which the firmware puts in individual domains while they all are
> in the same perf domain and hence OPP is marked shared in DT.
> 
> Since this probe gets called for all the cpus, we need to skip adding
> OPPs for the last 3(add only for 1st one and mark others as shared).

Okay and this wasn't happening before this series because the firmware
was only returning the current CPU from scmi_get_sharing_cpus() ?

Is this driver also used for the cases where we have multiple CPUs in
a policy ? Otherwise we won't be required to call
dev_pm_opp_set_sharing_cpus().

So I assume that we want to support both the cases here ?

> If we attempt to add OPPs on second cpu probe, it *will* shout as duplicate
> OPP as we would have already marked it as shared table with the first cpu.
> Am I missing anything ? I suggested this as Nicola saw OPP duplicate
> warnings when he was hacking up this patch.

The common stuff (for all the CPUs) is better moved to probe() in this
case, instead of the ->init() callback. Otherwise it will always be
messy. You can initialize the OPP and cpufreq tables in probe()
itself, save the pointer somewhere and then just use it here in
->init().

Also do EM registration from there.

> > > otherwise no need as they would be duplicated.
> > > > And we don't check the return value of
> > > > the below call anymore, moreover we have to call it twice now.
> 
> Yes, that looks wrong, we need to add the check for non zero values, but 
> 
> > > 
> > > This second get_opp_count is required such that we register em with the
> > > correct opp number after having added them. Without this the opp_count
> > > would not be correct.
> >
> 
> ... I have a question here. Why do you need to call
> 
> em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb, opp_shared_cpus..)
> 
> on each CPU ? Why can't that be done once for unique opp_shared_cpus ?
> 
> The whole drama of per-CPU vs perf domain is to have energy model and
> if feeding it opp_shared_cpus once is not sufficient, then something is
> wrong or simply duplicated or just not necessary IMO.
> 
> > What if the count is still 0 ? What about deferred probe we were doing 
> > earlier ?
> 
> OK, you made me think with that question. I think the check was originally
> added for deferred probe but then scmi core was changed to add the cpufreq
> device only after everything needed is ready. So the condition must never
> occur now.

The deferred probe shall be handled in a different patch in that case.

Nicola, please break the patch into multiple patches, with one patch
dealing only with one task.

-- 
viresh
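
The "set up once in probe(), reuse in ->init()" idea from the reply above can be illustrated with a small userspace model. This is a sketch only: none of the names below are kernel or SCMI APIs, the CPU-to-domain mapping is hard-coded to the 4-CPU, single-perf-domain case described in the quoted text, and the counter merely stands in for an energy-model registration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical per-perf-domain state, built once at probe time. */
struct domain_data {
	bool ready;           /* tables already built for this domain */
	int nr_opp;           /* OPP count recorded when they were built */
	int em_registrations; /* how often the domain was registered */
};

static struct domain_data domains[NR_CPUS];

/* All four CPUs share one perf domain, as in the example above. */
static const int cpu_to_domain[NR_CPUS] = { 0, 0, 0, 0 };

/* Probe-time work: build tables and register once per unique domain. */
static void probe_all(int opp_count)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct domain_data *d = &domains[cpu_to_domain[cpu]];

		if (d->ready)
			continue; /* another CPU of this domain did it */

		d->nr_opp = opp_count;
		d->em_registrations++;
		d->ready = true;
	}
}

/* Per-CPU init: only look up what probe prepared; -1 if not ready. */
static int cpu_init(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	const struct domain_data *d = &domains[cpu_to_domain[cpu]];

	return d->ready ? d->nr_opp : -1;
}
```

In this model the per-CPU path never adds OPPs or registers anything, so the duplicate-OPP situation discussed in the thread cannot arise there.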

[PATCH v2 9/9] soundwire: bus: clarify dev_err/dbg device references

2020-12-08 Thread Bard Liao

From: Pierre-Louis Bossart 

The SoundWire bus code confuses bus and Slave device levels for
dev_err/dbg logs. That's not impacting functionality but the accuracy
of kernel logs.

We should only use bus->dev for bus-level operations and handling of
Device0. For all other logs where the device number is not zero, we
should use &slave->dev to provide more precision to the
user/integrator.

Reported-by: Rander Wang 
Signed-off-by: Pierre-Louis Bossart 
Reviewed-by: Rander Wang 
Signed-off-by: Bard Liao 
---
 drivers/soundwire/bus.c | 63 +
 1 file changed, 33 insertions(+), 30 deletions(-)

diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index 7c4717dd9a34..39edf87cf832 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -636,6 +636,7 @@ static int sdw_get_device_num(struct sdw_slave *slave)
 
 static int sdw_assign_device_num(struct sdw_slave *slave)
 {
+   struct sdw_bus *bus = slave->bus;
int ret, dev_num;
bool new_device = false;
 
@@ -646,7 +647,7 @@ static int sdw_assign_device_num(struct sdw_slave *slave)
dev_num = sdw_get_device_num(slave);
mutex_unlock(&slave->bus->bus_lock);
if (dev_num < 0) {
-   dev_err(slave->bus->dev, "Get dev_num failed: %d\n",
+   dev_err(bus->dev, "Get dev_num failed: %d\n",
dev_num);
return dev_num;
}
@@ -659,7 +660,7 @@ static int sdw_assign_device_num(struct sdw_slave *slave)
}
 
if (!new_device)
-   dev_dbg(slave->bus->dev,
+   dev_dbg(bus->dev,
"Slave already registered, reusing dev_num:%d\n",
slave->dev_num);
 
@@ -669,7 +670,7 @@ static int sdw_assign_device_num(struct sdw_slave *slave)
 
ret = sdw_write_no_pm(slave, SDW_SCP_DEVNUMBER, dev_num);
if (ret < 0) {
-   dev_err(&slave->dev, "Program device_num %d failed: %d\n",
+   dev_err(bus->dev, "Program device_num %d failed: %d\n",
dev_num, ret);
return ret;
}
@@ -748,7 +749,7 @@ static int sdw_program_device_num(struct sdw_bus *bus)
 */
ret = sdw_assign_device_num(slave);
if (ret) {
-   dev_err(slave->bus->dev,
+   dev_err(bus->dev,
"Assign dev_num failed:%d\n",
ret);
return ret;
@@ -788,9 +789,11 @@ static int sdw_program_device_num(struct sdw_bus *bus)
 static void sdw_modify_slave_status(struct sdw_slave *slave,
enum sdw_slave_status status)
 {
-   mutex_lock(&slave->bus->bus_lock);
+   struct sdw_bus *bus = slave->bus;
 
-   dev_vdbg(&slave->dev,
+   mutex_lock(&bus->bus_lock);
+
+   dev_vdbg(bus->dev,
 "%s: changing status slave %d status %d new status %d\n",
 __func__, slave->dev_num, slave->status, status);
 
@@ -811,7 +814,7 @@ static void sdw_modify_slave_status(struct sdw_slave *slave,
complete(&slave->enumeration_complete);
}
slave->status = status;
-   mutex_unlock(&slave->bus->bus_lock);
+   mutex_unlock(&bus->bus_lock);
 }
 
 static enum sdw_clk_stop_mode sdw_get_clk_stop_mode(struct sdw_slave *slave)
@@ -1140,7 +1143,7 @@ int sdw_configure_dpn_intr(struct sdw_slave *slave,
 
ret = sdw_update(slave, addr, (mask | SDW_DPN_INT_PORT_READY), val);
if (ret < 0)
-   dev_err(slave->bus->dev,
+   dev_err(&slave->dev,
"SDW_DPN_INTMASK write failed:%d\n", val);
 
return ret;
@@ -1271,7 +1274,7 @@ static int sdw_initialize_slave(struct sdw_slave *slave)
/* Enable SCP interrupts */
ret = sdw_update_no_pm(slave, SDW_SCP_INTMASK1, val, val);
if (ret < 0) {
-   dev_err(slave->bus->dev,
+   dev_err(&slave->dev,
"SDW_SCP_INTMASK1 write failed:%d\n", ret);
return ret;
}
@@ -1286,7 +1289,7 @@ static int sdw_initialize_slave(struct sdw_slave *slave)
 
ret = sdw_update_no_pm(slave, SDW_DP0_INTMASK, val, val);
if (ret < 0)
-   dev_err(slave->bus->dev,
+   dev_err(&slave->dev,
"SDW_DP0_INTMASK read failed:%d\n", ret);
return ret;
 }
@@ -1298,7 +1301,7 @@ static int sdw_handle_dp0_interrupt(struct sdw_slave *slave, u8 *slave_status)
 
status = sdw_read_no_pm(slave, SDW_DP0_INT);
if (status < 0) {
-   dev_err(slave->bus->dev,
+   dev_err(&slave->dev,
"SDW_DP0_INT read failed:%d\n", status);

Re: [SPECIFICATION RFC] The firmware and bootloader log specification

2020-12-08 Thread Frank Rowand

On 12/4/20 7:23 AM, Paul Menzel wrote:
> Dear Wim, dear Daniel,
> 
> 
> First, thank you for including all parties in the discussion.
> Am 04.12.20 um 13:52 schrieb Wim Vervoorn:
> 
>> I agree with you. Using an existing standard is better than inventing
>> a new one in this case. I think using the coreboot logging is a good
>> idea as there is indeed a lot of support already available and it is
>> lightweight and simple.
> In my opinion coreboot’s format is lacking in that it does not record the 
> timestamp, and the log level is not stored as metadata, but (in coreboot) 
> only used to decide whether to print the message or not.
> 
> I agree with you, that an existing standard should be used, and in my opinion 
> it’s Linux message format. That is most widely supported, and existing tools 
> could then also work with pre-Linux messages.
> 
> Sean Hudson from Mentor Graphics presented that idea at Embedded Linux 
> Conference Europe 2016 [1]. No idea, if anything came out of that effort. 
> (Unfortunately, I couldn’t find an email. Does somebody have contacts at 
> Mentor to find out, how to reach him?)

I forwarded this to Sean.

-Frank

> 
> 
> Kind regards,
> 
> Paul
> 
> 
> [1]: 
> http://events17.linuxfoundation.org/sites/events/files/slides/2016-10-12%20-%20ELCE%20-%20Shared%20Logging%20-%20Part%20Deux.pdf

Re: [PATCH bpf-next v4 04/11] bpf: Rename BPF_XADD and prepare to encode other atomics in .imm

2020-12-08 Thread John Fastabend

Brendan Jackman wrote:
> Hi John, thanks a lot for the reviews!
> 
> On Mon, Dec 07, 2020 at 01:56:53PM -0800, John Fastabend wrote:
> > Brendan Jackman wrote:
> > > A subsequent patch will add additional atomic operations. These new
> > > operations will use the same opcode field as the existing XADD, with
> > > the immediate discriminating different operations.
> > > 
> > > In preparation, rename the instruction mode BPF_ATOMIC and start
> > > calling the zero immediate BPF_ADD.
> > > 
> > > This is possible (doesn't break existing valid BPF progs) because the
> > > immediate field is currently reserved MBZ and BPF_ADD is zero.
> > > 
> > > All uses are removed from the tree but the BPF_XADD definition is
> > > kept around to avoid breaking builds for people including kernel
> > > headers.
> > > 
> > > Signed-off-by: Brendan Jackman 
> > > ---

[...]

> > > + case BPF_STX | BPF_ATOMIC | BPF_W:
> > > + case BPF_STX | BPF_ATOMIC | BPF_DW:
> > > + if (insn->imm != BPF_ADD) {
> > > + pr_err("bpf-jit: not supported: atomic operation %02x ***\n",
> > > +insn->imm);
> > > + return -EINVAL;
> > > + }
> > 
> > Can we standardize the error across jits and the error return code? It seems
> > odd that we use pr_err, pr_info_once, pr_err_ratelimited and then return
> > ENOTSUPP, EFAULT or EINVAL.
> 
> That would be a noble cause but I don't think it makes sense in this
> patchset: they are already inconsistent, so here I've gone for intra-JIT
> consistency over inter-JIT consistency.
> 
> I think it would be more annoying, for example, if the s390 JIT returned
> -EOPNOTSUPP for a bad atomic but -1 for other unsupported ops, than it
> is already that the s390 JIT returns -1 where the MIPS returns -EINVAL.

ok works for me thanks for the explanation.
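
For readers following the encoding discussed in this thread, the check in the quoted hunk can be reproduced outside the kernel. This is a sketch only: the SKETCH_* constants mirror the BPF UAPI values as I understand them (BPF_ATOMIC shares the 0xc0 mode bits of the legacy BPF_XADD, and BPF_ADD is zero) but are defined locally so the block stands alone, and struct sketch_insn and classify() are hypothetical names, not part of any JIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the relevant opcode bits (assumed to match the UAPI). */
#define SKETCH_BPF_STX    0x03 /* instruction class: store from register */
#define SKETCH_BPF_W      0x00 /* size: 32-bit word */
#define SKETCH_BPF_DW     0x18 /* size: 64-bit double word */
#define SKETCH_BPF_ATOMIC 0xc0 /* mode: same bits as legacy BPF_XADD */
#define SKETCH_BPF_ADD    0x00 /* atomic op selected by a zero immediate */

/* Hypothetical, trimmed-down instruction: opcode byte plus immediate. */
struct sketch_insn {
	uint8_t code;
	int32_t imm;
};

/* Returns 0 for an atomic add, -1 for an atomic operation this
 * sketch does not handle, and 1 for any non-atomic instruction. */
static int classify(const struct sketch_insn *insn)
{
	assert(insn != NULL);

	switch (insn->code) {
	case SKETCH_BPF_STX | SKETCH_BPF_ATOMIC | SKETCH_BPF_W:
	case SKETCH_BPF_STX | SKETCH_BPF_ATOMIC | SKETCH_BPF_DW:
		/* The immediate discriminates between atomic operations. */
		return insn->imm == SKETCH_BPF_ADD ? 0 : -1;
	default:
		return 1;
	}
}
```

Because the immediate was previously reserved as must-be-zero and the add operation is encoded as zero, an old XADD instruction classifies as an atomic add without any change to its bytes.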

[PATCH v2 7/9] regmap: sdw-mbq: use MODULE_LICENSE("GPL")

2020-12-08 Thread Bard Liao

"GPL v2" is the same as "GPL". It exists for historic reasons.
See Documentation/process/license-rules.rst

Signed-off-by: Pierre-Louis Bossart 
Signed-off-by: Bard Liao 
---
 drivers/base/regmap/regmap-sdw-mbq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/regmap/regmap-sdw-mbq.c b/drivers/base/regmap/regmap-sdw-mbq.c
index 6675c3a4b829..fe3ac26b66ad 100644
--- a/drivers/base/regmap/regmap-sdw-mbq.c
+++ b/drivers/base/regmap/regmap-sdw-mbq.c
@@ -98,4 +98,4 @@ struct regmap *__devm_regmap_init_sdw_mbq(struct sdw_slave *sdw,
 EXPORT_SYMBOL_GPL(__devm_regmap_init_sdw_mbq);
 
 MODULE_DESCRIPTION("Regmap SoundWire MBQ Module");
-MODULE_LICENSE("GPL v2");
+MODULE_LICENSE("GPL");
-- 
2.17.1
