date:20201019

Re: [PATCH v2] Documentation: kunit: Update Kconfig parts for KUNIT's module support

2020-10-19 Thread SeongJae Park

I just realized I missed adding Marco Elver as a recipient, so adding him.
Sorry, Marco.


Thanks,
SeongJae Park

On Tue, 13 Oct 2020 08:37:43 +0200 SeongJae Park  wrote:

> From: SeongJae Park 
> 
> If 'CONFIG_KUNIT=m', making kunit tests that do not support loadable
> module builds depend on 'KUNIT' instead of 'KUNIT=y' results in compile
> errors.  This commit updates the documentation accordingly.
> 
> Fixes: 9fe124bf1b77 ("kunit: allow kunit to be loaded as a module")
> Signed-off-by: SeongJae Park 
> ---
> 
> Changes from v1
> (https://lore.kernel.org/linux-kselftest/20201012105420.5945-1-sjp...@amazon.com/):
> - Fix a typo (Marco Elver)
> 
> ---
>  Documentation/dev-tools/kunit/start.rst | 2 +-
>  Documentation/dev-tools/kunit/usage.rst | 5 +
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/dev-tools/kunit/start.rst 
> b/Documentation/dev-tools/kunit/start.rst
> index d23385e3e159..454f307813ea 100644
> --- a/Documentation/dev-tools/kunit/start.rst
> +++ b/Documentation/dev-tools/kunit/start.rst
> @@ -197,7 +197,7 @@ Now add the following to ``drivers/misc/Kconfig``:
>  
>   config MISC_EXAMPLE_TEST
>   bool "Test for my example"
> - depends on MISC_EXAMPLE && KUNIT
> + depends on MISC_EXAMPLE && KUNIT=y
>  
>  and the following to ``drivers/misc/Makefile``:
>  
> diff --git a/Documentation/dev-tools/kunit/usage.rst 
> b/Documentation/dev-tools/kunit/usage.rst
> index 3c3fe8b5fecc..b331f5a5b0b9 100644
> --- a/Documentation/dev-tools/kunit/usage.rst
> +++ b/Documentation/dev-tools/kunit/usage.rst
> @@ -556,6 +556,11 @@ Once the kernel is built and installed, a simple
>  
>  ...will run the tests.
>  
> +.. note::
> +   Note that you should make your test depends on ``KUNIT=y`` in Kconfig if the
> +   test does not support module build.  Otherwise, it will trigger compile
> +   errors if ``CONFIG_KUNIT`` is ``m``.
> +
>  Writing new tests for other architectures
>  -
>  
> -- 
> 2.17.1
>

drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:34:6: warning: no previous prototype for function 'ia_css_isys_ibuf_rmgr_init'

2020-10-19 Thread kernel test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   270315b8235e3d10c2e360cff56c2f9e0915a252
commit: 5b552b198c2557295becd471bff53bb520fefee5 media: atomisp: re-enable 
warnings again
date:   4 months ago
config: x86_64-randconfig-a003-20201020 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 
ea693a162786d933863ab079648d4261ac0ead47)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b552b198c2557295becd471bff53bb520fefee5
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 5b552b198c2557295becd471bff53bb520fefee5
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:34:6: 
>> warning: no previous prototype for function 'ia_css_isys_ibuf_rmgr_init' 
>> [-Wmissing-prototypes]
   void ia_css_isys_ibuf_rmgr_init(void)
^
   drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:34:1: 
note: declare 'static' if the function is not intended to be used outside of 
this translation unit
   void ia_css_isys_ibuf_rmgr_init(void)
   ^
   static 
>> drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:40:6: 
>> warning: no previous prototype for function 'ia_css_isys_ibuf_rmgr_uninit' 
>> [-Wmissing-prototypes]
   void ia_css_isys_ibuf_rmgr_uninit(void)
^
   drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:40:1: 
note: declare 'static' if the function is not intended to be used outside of 
this translation unit
   void ia_css_isys_ibuf_rmgr_uninit(void)
   ^
   static 
>> drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:46:6: 
>> warning: no previous prototype for function 'ia_css_isys_ibuf_rmgr_acquire' 
>> [-Wmissing-prototypes]
   bool ia_css_isys_ibuf_rmgr_acquire(
^
   drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:46:1: 
note: declare 'static' if the function is not intended to be used outside of 
this translation unit
   bool ia_css_isys_ibuf_rmgr_acquire(
   ^
   static 
>> drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:106:6: 
>> warning: no previous prototype for function 'ia_css_isys_ibuf_rmgr_release' 
>> [-Wmissing-prototypes]
   void ia_css_isys_ibuf_rmgr_release(
^
   drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c:106:1: 
note: declare 'static' if the function is not intended to be used outside of 
this translation unit
   void ia_css_isys_ibuf_rmgr_release(
   ^
   static 
   4 warnings generated.

vim +/ia_css_isys_ibuf_rmgr_init +34 
drivers/staging/media/atomisp/pci/runtime/isys/src/ibuf_ctrl_rmgr.c

ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19   33  
ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19  @34  void ia_css_isys_ibuf_rmgr_init(void)
ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19   35  {
ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19   36  memset(&ibuf_rsrc, 0, 
sizeof(ibuf_rsrc));
ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19   37  ibuf_rsrc.free_size = 
MAX_INPUT_BUFFER_SIZE;
ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19   38  }
ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19   39  
ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19  @40  void ia_css_isys_ibuf_rmgr_uninit(void)
ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19   41  {
ad85094b293e40e 
drivers/staging/media/atomisp/pci/atomisp2/css2400/runtime/isys/src/ibuf_ctrl_rmgr.c
 Mauro Carvalho Chehab 2020-04-19   42  memset(&ibuf_rsrc, 0, 
sizeof(ibuf_rsrc));
ad85094b293e40e 
drivers/staging

Re: [PATCH 2/2] kunit: tool: Mark 'kunittest_config' as constant again

2020-10-19 Thread SeongJae Park

ping

On Mon, 12 Oct 2020 12:26:21 +0200 SeongJae Park  wrote:

> From: SeongJae Park 
> 
> 'kunit_kernel.kunittest_config' was constant at first, and therefore it
> used UPPER_SNAKE_CASE naming convention that usually means it is
> constant in Python world.  But, commit e3212513a8f0 ("kunit: Create
> default config in '--build_dir'") made it modifiable to fix a use case
> of the tool and thus the naming also changed to lower_snake_case.
> However, this resulted in confusion.  As a result, some subsequent
> changes made the tool unittest fail, and a fix[1] of it again incurred
> the '--build_dir' use case failure.
> 
> As the previous commit fixed the '--build_dir' use case without
> modifying the variable again, this commit marks the variable as constant
> again with UPPER_SNAKE_CASE, to reduce future confusion.
> 
> [1] Commit d43c7fb05765 ("kunit: tool: fix improper treatment of file 
> location")
> 
> Signed-off-by: SeongJae Park 
> ---
>  tools/testing/kunit/kunit.py| 4 ++--
>  tools/testing/kunit/kunit_kernel.py | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
> index 611c23e178f8..0a58c1fb87d9 100755
> --- a/tools/testing/kunit/kunit.py
> +++ b/tools/testing/kunit/kunit.py
> @@ -44,9 +44,9 @@ class KunitStatus(Enum):
>   TEST_FAILURE = auto()
>  
>  def create_default_kunitconfig():
> - if not os.path.exists(kunit_kernel.kunitconfig_path):
> + if not os.path.exists(kunit_kernel.KUNITCONFIG_PATH):
>   shutil.copyfile('arch/um/configs/kunit_defconfig',
> - kunit_kernel.kunitconfig_path)
> + kunit_kernel.KUNITCONFIG_PATH)
>  
>  def get_kernel_root_path():
>   parts = sys.argv[0] if not __file__ else __file__
> diff --git a/tools/testing/kunit/kunit_kernel.py 
> b/tools/testing/kunit/kunit_kernel.py
> index 16a997504317..42dca0163479 100644
> --- a/tools/testing/kunit/kunit_kernel.py
> +++ b/tools/testing/kunit/kunit_kernel.py
> @@ -18,7 +18,7 @@ import kunit_config
>  import kunit_parser
>  
>  KCONFIG_PATH = '.config'
> -kunitconfig_path = '.kunitconfig'
> +KUNITCONFIG_PATH = '.kunitconfig'
>  BROKEN_ALLCONFIG_PATH = 'tools/testing/kunit/configs/broken_on_uml.config'
>  
>  class ConfigError(Exception):
> @@ -106,7 +106,7 @@ class LinuxSourceTree(object):
>  
>   def __init__(self, build_dir):
>   self._kconfig = kunit_config.Kconfig()
> - self._kconfig.read_from_file(os.path.join(build_dir, 
> kunitconfig_path))
> + self._kconfig.read_from_file(os.path.join(build_dir, 
> KUNITCONFIG_PATH))
>   self._ops = LinuxSourceTreeOperations()
>   signal.signal(signal.SIGINT, self.signal_handler)
>  
> -- 
> 2.17.1
>

Re: [PATCH 3/3] mmc: rtsx: Add SD Express mode support for RTS5261

2020-10-19 Thread 冯锐

Hi All,

A month has passed. Do I need to modify these patches?

Thanks

> 
> From: Rui Feng 
> 
> RTS5261 supports legacy SD mode and SD Express mode.
> In SD7.x, the SD Association introduced SD Express as a new mode.
> This patch makes RTS5261 support SD Express mode.
> 
> Signed-off-by: Rui Feng 
> ---
>  drivers/mmc/host/rtsx_pci_sdmmc.c | 59
> +++
>  1 file changed, 59 insertions(+)
> 
> diff --git a/drivers/mmc/host/rtsx_pci_sdmmc.c
> b/drivers/mmc/host/rtsx_pci_sdmmc.c
> index 2763a376b054..efde374a4a5e 100644
> --- a/drivers/mmc/host/rtsx_pci_sdmmc.c
> +++ b/drivers/mmc/host/rtsx_pci_sdmmc.c
> @@ -895,7 +895,9 @@ static int sd_set_bus_width(struct realtek_pci_sdmmc
> *host,  static int sd_power_on(struct realtek_pci_sdmmc *host)  {
>   struct rtsx_pcr *pcr = host->pcr;
> + struct mmc_host *mmc = host->mmc;
>   int err;
> + u32 val;
> 
>   if (host->power_state == SDMMC_POWER_ON)
>   return 0;
> @@ -922,6 +924,14 @@ static int sd_power_on(struct realtek_pci_sdmmc
> *host)
>   if (err < 0)
>   return err;
> 
> + if (PCI_PID(pcr) == PID_5261) {
> + val = rtsx_pci_readl(pcr, RTSX_BIPR);
> + if (val & SD_WRITE_PROTECT) {
> + pcr->extra_caps &= ~EXTRA_CAPS_SD_EXPRESS;
> + mmc->caps2 &= ~(MMC_CAP2_SD_EXP |
> MMC_CAP2_SD_EXP_1_2V);
> + }
> + }
> +
>   host->power_state = SDMMC_POWER_ON;
>   return 0;
>  }
> @@ -1127,6 +1137,8 @@ static int sdmmc_get_cd(struct mmc_host *mmc)
>   if (val & SD_EXIST)
>   cd = 1;
> 
> + if (pcr->extra_caps & EXTRA_CAPS_SD_EXPRESS)
> + mmc->caps2 |= MMC_CAP2_SD_EXP | MMC_CAP2_SD_EXP_1_2V;
>   mutex_unlock(&pcr->pcr_mutex);
> 
>   return cd;
> @@ -1308,6 +1320,50 @@ static int sdmmc_execute_tuning(struct
> mmc_host *mmc, u32 opcode)
>   return err;
>  }
> 
> +static int sdmmc_init_sd_express(struct mmc_host *mmc, struct mmc_ios
> +*ios) {
> + u32 relink_time, val;
> + struct realtek_pci_sdmmc *host = mmc_priv(mmc);
> + struct rtsx_pcr *pcr = host->pcr;
> +
> + /*
> +  * If card has PCIe availability and WP if off,
> +  * reader switch to PCIe mode.
> +  */
> + val = rtsx_pci_readl(pcr, RTSX_BIPR);
> + if (!(val & SD_WRITE_PROTECT)) {
> + /* Set relink_time for changing to PCIe card */
> + relink_time = 0x8FFF;
> +
> + rtsx_pci_write_register(pcr, 0xFF01, 0xFF, relink_time);
> + rtsx_pci_write_register(pcr, 0xFF02, 0xFF, relink_time >> 8);
> + rtsx_pci_write_register(pcr, 0xFF03, 0x01, relink_time >> 16);
> +
> + rtsx_pci_write_register(pcr, PETXCFG, 0x80, 0x80);
> + rtsx_pci_write_register(pcr, LDO_VCC_CFG0,
> + RTS5261_LDO1_OCP_THD_MASK,
> + pcr->option.sd_800mA_ocp_thd);
> +
> + if (pcr->ops->disable_auto_blink)
> + pcr->ops->disable_auto_blink(pcr);
> +
> + /* For PCIe/NVMe mode can't enter delink issue */
> + pcr->hw_param.interrupt_en &= ~(SD_INT_EN);
> + rtsx_pci_writel(pcr, RTSX_BIER, pcr->hw_param.interrupt_en);
> +
> + rtsx_pci_write_register(pcr, RTS5260_AUTOLOAD_CFG4,
> + RTS5261_AUX_CLK_16M_EN, RTS5261_AUX_CLK_16M_EN);
> + rtsx_pci_write_register(pcr, RTS5261_FW_CFG0,
> + RTS5261_FW_ENTER_EXPRESS, RTS5261_FW_ENTER_EXPRESS);
> + rtsx_pci_write_register(pcr, RTS5261_FW_CFG1,
> + RTS5261_MCU_BUS_SEL_MASK |
> RTS5261_MCU_CLOCK_SEL_MASK
> + | RTS5261_MCU_CLOCK_GATING |
> RTS5261_DRIVER_ENABLE_FW,
> + RTS5261_MCU_CLOCK_SEL_16M |
> RTS5261_MCU_CLOCK_GATING
> + | RTS5261_DRIVER_ENABLE_FW);
> + }
> + return 0;
> +}
> +
>  static const struct mmc_host_ops realtek_pci_sdmmc_ops = {
>   .pre_req = sdmmc_pre_req,
>   .post_req = sdmmc_post_req,
> @@ -1317,6 +1373,7 @@ static const struct mmc_host_ops
> realtek_pci_sdmmc_ops = {
>   .get_cd = sdmmc_get_cd,
>   .start_signal_voltage_switch = sdmmc_switch_voltage,
>   .execute_tuning = sdmmc_execute_tuning,
> + .init_sd_express = sdmmc_init_sd_express,
>  };
> 
>  static void init_extra_caps(struct realtek_pci_sdmmc *host) @@ -1338,6
> +1395,8 @@ static void init_extra_caps(struct realtek_pci_sdmmc *host)
>   mmc->caps |= MMC_CAP_8_BIT_DATA;
>   if (pcr->extra_caps & EXTRA_CAPS_NO_MMC)
>   mmc->caps2 |= MMC_CAP2_NO_MMC;
> + if (pcr->extra_caps & EXTRA_CAPS_SD_EXPRESS)
> + mmc->caps2 |= MMC_CAP2_SD_EXP | MMC_CAP2_SD_EXP_1_2V;
>  }
> 
>  static void realtek_init_host(struct realtek_pci_sdmmc *host)
> --
> 2.17.1

Re: [PATCH v2] fat: Add KUnit tests for checksums and timestamps

2020-10-19 Thread OGAWA Hirofumi

David Gow  writes:

> diff --git a/fs/fat/misc.c b/fs/fat/misc.c
> index f1b2a1fc2a6a..445ad3542e74 100644
> --- a/fs/fat/misc.c
> +++ b/fs/fat/misc.c
> @@ -229,6 +229,7 @@ void fat_time_fat2unix(struct msdos_sb_info *sbi, struct 
> timespec64 *ts,
>   ts->tv_nsec = 0;
>   }
>  }
> +EXPORT_SYMBOL_GPL(fat_time_fat2unix);

Hm, can this be exported only if FAT_KUNIT_TEST is built in or built as a
module (maybe with #if IS_ENABLED(...))? The #if would also serve as a
comment.
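
A minimal user-space sketch of the conditional export suggested above.
It is not the actual fs/fat code: fat_helper_stub() and
fat_helper_is_exported() are hypothetical names, EXPORT_SYMBOL_GPL() is
replaced by a stand-in so the file compiles outside the kernel tree, and
the CONFIG_FAT_KUNIT_TEST_MODULE define is an assumption that simulates
a =m build. In kernel code the condition would be written with
IS_ENABLED(), which is true for both =y and =m.

```c
#include <stdbool.h>

/*
 * Assumption: pretend Kconfig built the FAT KUnit test as a module (=m).
 * In a real tree this define comes from the generated autoconf.h.
 */
#define CONFIG_FAT_KUNIT_TEST_MODULE 1

/*
 * Stand-in so the sketch compiles outside the kernel tree. The real
 * EXPORT_SYMBOL_GPL() comes from <linux/export.h>; this one only leaves
 * a compile-time marker behind.
 */
#define EXPORT_SYMBOL_GPL(sym) enum { sym##_exported_marker = 1 }

/* Hypothetical placeholder for the helper the test wants to call. */
int fat_helper_stub(int x)
{
	return x + 1;
}

/*
 * In kernel code this condition would be spelled
 * #if IS_ENABLED(CONFIG_FAT_KUNIT_TEST).
 */
#if defined(CONFIG_FAT_KUNIT_TEST) || defined(CONFIG_FAT_KUNIT_TEST_MODULE)
EXPORT_SYMBOL_GPL(fat_helper_stub);
#define FAT_HELPER_EXPORTED true
#else
#define FAT_HELPER_EXPORTED false
#endif

/* Report whether the export was compiled in. */
bool fat_helper_is_exported(void)
{
	return FAT_HELPER_EXPORTED;
}
```

With the test option disabled, the export line disappears from the
build, and the #if itself documents why the symbol is exported.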

>  
>  /* Convert linear UNIX date to a FAT time/date pair. */
>  void fat_time_unix2fat(struct msdos_sb_info *sbi, struct timespec64 *ts,

-- 
OGAWA Hirofumi

[RFC PATCH 2/2] PKCS#7: Check codeSigning EKU for kernel module and kexec pe verification

2020-10-19 Thread Lee, Chun-Yi

This patch adds the logic for checking the CodeSigning extended
key usage extension when verifying the signature of a kernel module or
kexec PE binary in PKCS#7.

Signed-off-by: "Lee, Chun-Yi" 
---
 certs/system_keyring.c   |  2 +-
 crypto/asymmetric_keys/Kconfig   | 10 ++
 crypto/asymmetric_keys/pkcs7_trust.c | 37 +---
 include/crypto/pkcs7.h   |  3 ++-
 4 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/certs/system_keyring.c b/certs/system_keyring.c
index 798291177186..4104f5465d8a 100644
--- a/certs/system_keyring.c
+++ b/certs/system_keyring.c
@@ -242,7 +242,7 @@ int verify_pkcs7_message_sig(const void *data, size_t len,
goto error;
}
}
-   ret = pkcs7_validate_trust(pkcs7, trusted_keys);
+   ret = pkcs7_validate_trust(pkcs7, trusted_keys, usage);
if (ret < 0) {
if (ret == -ENOKEY)
pr_devel("PKCS#7 signature not signed with a trusted 
key\n");
diff --git a/crypto/asymmetric_keys/Kconfig b/crypto/asymmetric_keys/Kconfig
index 1f1f004dc757..6e3de0c3b5f0 100644
--- a/crypto/asymmetric_keys/Kconfig
+++ b/crypto/asymmetric_keys/Kconfig
@@ -96,4 +96,14 @@ config SIGNED_PE_FILE_VERIFICATION
  This option provides support for verifying the signature(s) on a
  signed PE binary.
 
+config CHECK_CODESIGN_EKU
+   bool "Check codeSigning extended key usage"
+   depends on PKCS7_MESSAGE_PARSER=y
+   depends on SYSTEM_DATA_VERIFICATION
+   help
+ This option provides support for checking the codeSigning extended
+ key usage extension when verifying the signature in PKCS#7. It
+ affects kernel module verification and kexec PE binary verification
+ now.
+
 endif # ASYMMETRIC_KEY_TYPE
diff --git a/crypto/asymmetric_keys/pkcs7_trust.c 
b/crypto/asymmetric_keys/pkcs7_trust.c
index 61af3c4d82cc..1d2318ff63db 100644
--- a/crypto/asymmetric_keys/pkcs7_trust.c
+++ b/crypto/asymmetric_keys/pkcs7_trust.c
@@ -16,12 +16,36 @@
 #include 
 #include "pkcs7_parser.h"
 
+#ifdef CONFIG_CHECK_CODESIGN_EKU
+static bool check_codesign_eku(struct key *key,
+enum key_being_used_for usage)
+{
+   struct public_key *public_key = key->payload.data[asym_crypto];
+
+   switch (usage) {
+   case VERIFYING_MODULE_SIGNATURE:
+   case VERIFYING_KEXEC_PE_SIGNATURE:
+   return !!(public_key->eku & EKU_codeSigning);
+   default:
+   break;
+   }
+   return true;
+}
+#else
+static bool check_codesign_eku(struct key *key,
+enum key_being_used_for usage)
+{
+   return true;
+}
+#endif
+
 /**
  * Check the trust on one PKCS#7 SignedInfo block.
  */
 static int pkcs7_validate_trust_one(struct pkcs7_message *pkcs7,
struct pkcs7_signed_info *sinfo,
-   struct key *trust_keyring)
+   struct key *trust_keyring,
+   enum key_being_used_for usage)
 {
struct public_key_signature *sig = sinfo->sig;
struct x509_certificate *x509, *last = NULL, *p;
@@ -112,6 +136,12 @@ static int pkcs7_validate_trust_one(struct pkcs7_message 
*pkcs7,
return -ENOKEY;
 
 matched:
+   if (!check_codesign_eku(key, usage)) {
+   pr_warn("sinfo %u: The signer %x key is not CodeSigning\n",
+   sinfo->index, key_serial(key));
+   key_put(key);
+   return -ENOKEY;
+   }
ret = verify_signature(key, sig);
key_put(key);
if (ret < 0) {
@@ -156,7 +186,8 @@ static int pkcs7_validate_trust_one(struct pkcs7_message 
*pkcs7,
  * May also return -ENOMEM.
  */
 int pkcs7_validate_trust(struct pkcs7_message *pkcs7,
-struct key *trust_keyring)
+struct key *trust_keyring,
+enum key_being_used_for usage)
 {
struct pkcs7_signed_info *sinfo;
struct x509_certificate *p;
@@ -167,7 +198,7 @@ int pkcs7_validate_trust(struct pkcs7_message *pkcs7,
p->seen = false;
 
for (sinfo = pkcs7->signed_infos; sinfo; sinfo = sinfo->next) {
-   ret = pkcs7_validate_trust_one(pkcs7, sinfo, trust_keyring);
+   ret = pkcs7_validate_trust_one(pkcs7, sinfo, trust_keyring, 
usage);
switch (ret) {
case -ENOKEY:
continue;
diff --git a/include/crypto/pkcs7.h b/include/crypto/pkcs7.h
index 38ec7f5f9041..b3b48240ba73 100644
--- a/include/crypto/pkcs7.h
+++ b/include/crypto/pkcs7.h
@@ -30,7 +30,8 @@ extern int pkcs7_get_content_data(const struct pkcs7_message 
*pkcs7,
  * pkcs7_trust.c
  */
 extern int pkcs7_validate_trust(struct pkcs7_message *pkcs7,
-   struct key *trust_keyring);
+   struct key *trust_k

[RFC PATCH 1/2] X.509: Add CodeSigning extended key usage parsing

2020-10-19 Thread Lee, Chun-Yi

This patch adds the logic for parsing the CodeSign extended key usage
extension in X.509. The parsing result will be set to the eku flag
which is carried by public key. It can be used in the PKCS#7
verification.

Signed-off-by: "Lee, Chun-Yi" 
---
 crypto/asymmetric_keys/x509_cert_parser.c | 24 
 include/crypto/public_key.h   |  1 +
 include/linux/oid_registry.h  |  5 +
 3 files changed, 30 insertions(+)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index 26ec20ef4899..5179da8b7cd9 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -480,6 +480,8 @@ int x509_process_extension(void *context, size_t hdrlen,
struct x509_parse_context *ctx = context;
struct asymmetric_key_id *kid;
const unsigned char *v = value;
+   int i = 0;
+   enum OID oid;
 
pr_debug("Extension: %u\n", ctx->last_oid);
 
@@ -509,6 +511,28 @@ int x509_process_extension(void *context, size_t hdrlen,
return 0;
}
 
+   if (ctx->last_oid == OID_extKeyUsage) {
+   if (v[0] != ((ASN1_UNIV << 6) | ASN1_CONS_BIT | ASN1_SEQ) ||
+   v[1] != vlen - 2)
+   return -EBADMSG;
+   i += 2;
+
+   while (i < vlen) {
+   /* A 10 bytes EKU OID Octet blob =
+* ASN1_OID + size byte + 8 bytes OID */
+   if (v[i] != ASN1_OID || v[i + 1] != 8 || (i + 10) > 
vlen)
+   return -EBADMSG;
+
+   oid = look_up_OID(v + i + 2, v[i + 1]);
+   if (oid == OID_codeSigning) {
+   ctx->cert->pub->eku |= EKU_codeSigning;
+   }
+   i += 10;
+   }
+   pr_debug("extKeyUsage: %d\n", ctx->cert->pub->eku);
+   return 0;
+   }
+
return 0;
 }
 
diff --git a/include/crypto/public_key.h b/include/crypto/public_key.h
index 11f535cfb810..7c7342648260 100644
--- a/include/crypto/public_key.h
+++ b/include/crypto/public_key.h
@@ -28,6 +28,7 @@ struct public_key {
bool key_is_private;
const char *id_type;
const char *pkey_algo;
+   unsigned int eku : 9;  /* Extended Key Usage (9-bit) */
 };
 
 extern void public_key_free(struct public_key *key);
diff --git a/include/linux/oid_registry.h b/include/linux/oid_registry.h
index 657d6bf2c064..cd448e9b02fc 100644
--- a/include/linux/oid_registry.h
+++ b/include/linux/oid_registry.h
@@ -107,9 +107,14 @@ enum OID {
OID_gostTC26Sign512B,   /* 1.2.643.7.1.2.1.2.2 */
OID_gostTC26Sign512C,   /* 1.2.643.7.1.2.1.2.3 */
 
+   /* Extended key purpose OIDs [RFC 5280] */
+   OID_codeSigning,/* 1.3.6.1.5.5.7.3.3 */
+
OID__NR
 };
 
+#define EKU_codeSigning(1 << 2)
+
 extern enum OID look_up_OID(const void *data, size_t datasize);
 extern int sprint_oid(const void *, size_t, char *, size_t);
 extern int sprint_OID(enum OID, char *, size_t);
-- 
2.16.4

[RFC PATCH 0/2] Check codeSigning extended key usage extension

2020-10-19 Thread Lee, Chun-Yi

NIAP PP_OS certification requests that the OS shall validate the
CodeSigning extended key usage extension field for integrity
verification of executable code:

https://www.niap-ccevs.org/MMO/PP/-442-/
FIA_X509_EXT.1.1

This patchset adds the logic for parsing the codeSigning EKU extension
field in X.509. It also checks the codeSigning EKU when verifying the
signature of a kernel module or kexec PE binary in PKCS#7.
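
The parsing step described above can be sketched in user space roughly
as follows. This is an illustration under stated assumptions, not the
patch itself: parse_eku() and EKU_CODE_SIGNING are hypothetical names
(the bit value mirrors EKU_codeSigning from patch 1/2), only short-form
DER lengths and 8-byte OIDs are handled, and the bounds check is done
before the bytes are read.

```c
#include <stddef.h>
#include <string.h>

/* Same bit position as EKU_codeSigning in the patch (1 << 2). */
#define EKU_CODE_SIGNING (1u << 2)

/* DER content bytes of OID 1.3.6.1.5.5.7.3.3 (codeSigning). */
static const unsigned char code_signing_oid[8] = {
	0x2b, 0x06, 0x01, 0x05, 0x05, 0x07, 0x03, 0x03
};

/*
 * Walk an extKeyUsage value of the form SEQUENCE { OID, OID, ... }.
 * Returns the EKU flags found, or -1 on malformed input.
 * Simplification: every entry must be a 10-byte blob
 * (0x06 tag + length byte 8 + 8 OID bytes).
 */
int parse_eku(const unsigned char *v, size_t vlen)
{
	unsigned int eku = 0;
	size_t i = 2;

	/* 0x30 is the DER tag for a constructed universal SEQUENCE. */
	if (vlen < 2 || v[0] != 0x30 || v[1] != vlen - 2)
		return -1;

	while (i < vlen) {
		/* Check the bounds first, then the tag and length bytes. */
		if (i + 10 > vlen || v[i] != 0x06 || v[i + 1] != 8)
			return -1;
		if (memcmp(v + i + 2, code_signing_oid, 8) == 0)
			eku |= EKU_CODE_SIGNING;
		i += 10;
	}
	return (int)eku;
}
```

A caller would then test the returned flags against EKU_CODE_SIGNING
before accepting a key for module or kexec PE signature verification.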

Lee, Chun-Yi (2):
  X.509: Add CodeSigning extended key usage parsing
  PKCS#7: Check codeSigning EKU for kernel module and kexec pe
verification

 certs/system_keyring.c|  2 +-
 crypto/asymmetric_keys/Kconfig| 10 +
 crypto/asymmetric_keys/pkcs7_trust.c  | 37 ---
 crypto/asymmetric_keys/x509_cert_parser.c | 24 
 include/crypto/pkcs7.h|  3 ++-
 include/crypto/public_key.h   |  1 +
 include/linux/oid_registry.h  |  5 +
 7 files changed, 77 insertions(+), 5 deletions(-)

-- 
2.16.4

Re: general protection fault in __switch_to_asm

2020-10-19 Thread syzbot

syzbot suspects this issue was fixed by commit:

commit 033724d6864245a11f8e04c066002e6ad22b3fd0
Author: Tetsuo Handa 
Date:   Wed Jul 15 01:51:02 2020 +

fbdev: Detect integer underflow at "struct fbcon_ops"->clear_margins.

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17f3766f90
start commit:   04300d66 Merge tag 'riscv-for-linus-5.8-rc7' of git://git...
git tree:   upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=f87a5e4232fdb267
dashboard link: https://syzkaller.appspot.com/bug?extid=fe6eeea133f070606074
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1575d10290
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15dd6ac090

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: fbdev: Detect integer underflow at "struct fbcon_ops"->clear_margins.

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Re: [PATCH v5 1/5] arm64: Add framework to turn IPI as NMI

2020-10-19 Thread Sumit Garg

On Mon, 19 Oct 2020 at 17:07, Marc Zyngier  wrote:
>
> On 2020-10-14 12:12, Sumit Garg wrote:
> > Introduce framework to turn an IPI as NMI using pseudo NMIs. In case a
> > particular platform doesn't support pseudo NMIs, then request IPI as a
> > regular IRQ.
> >
> > The main motivation for this feature is to have an IPI that can be
> > leveraged to invoke NMI functions on other CPUs. And current
> > prospective
> > users are NMI backtrace and KGDB CPUs round-up whose support is added
> > via future patches.
> >
> > Signed-off-by: Sumit Garg 
> > ---
> >  arch/arm64/include/asm/nmi.h | 16 +
> >  arch/arm64/kernel/Makefile   |  2 +-
> >  arch/arm64/kernel/ipi_nmi.c  | 77
> > 
> >  3 files changed, 94 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/nmi.h
> >  create mode 100644 arch/arm64/kernel/ipi_nmi.c
> >
> > diff --git a/arch/arm64/include/asm/nmi.h
> > b/arch/arm64/include/asm/nmi.h
> > new file mode 100644
> > index 000..3433c55
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/nmi.h
> > @@ -0,0 +1,16 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef __ASM_NMI_H
> > +#define __ASM_NMI_H
> > +
> > +#ifndef __ASSEMBLER__
> > +
> > +#include 
> > +
> > +extern void arch_send_call_nmi_func_ipi_mask(cpumask_t *mask);
> > +
> > +void set_smp_ipi_nmi(int ipi);
> > +void ipi_nmi_setup(int cpu);
> > +void ipi_nmi_teardown(int cpu);
> > +
> > +#endif /* !__ASSEMBLER__ */
> > +#endif
> > diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> > index bbaf0bc..525a1e0 100644
> > --- a/arch/arm64/kernel/Makefile
> > +++ b/arch/arm64/kernel/Makefile
> > @@ -17,7 +17,7 @@ obj-y   := debug-monitors.o entry.o 
> > irq.o fpsimd.o  \
> >  return_address.o cpuinfo.o cpu_errata.o
> >   \
> >  cpufeature.o alternative.o cacheinfo.o 
> >   \
> >  smp.o smp_spin_table.o topology.o smccc-call.o 
> >   \
> > -syscall.o proton-pack.o
> > +syscall.o proton-pack.o ipi_nmi.o
> >
> >  targets  += efi-entry.o
> >
> > diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> > new file mode 100644
> > index 000..a959256
> > --- /dev/null
> > +++ b/arch/arm64/kernel/ipi_nmi.c
> > @@ -0,0 +1,77 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * NMI support for IPIs
> > + *
> > + * Copyright (C) 2020 Linaro Limited
> > + * Author: Sumit Garg 
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include 
> > +
> > +static struct irq_desc *ipi_desc __read_mostly;
> > +static int ipi_id __read_mostly;
> > +static bool is_nmi __read_mostly;
> > +
> > +void arch_send_call_nmi_func_ipi_mask(cpumask_t *mask)
> > +{
> > + if (WARN_ON_ONCE(!ipi_desc))
> > + return;
> > +
> > + __ipi_send_mask(ipi_desc, mask);
> > +}
> > +
> > +static irqreturn_t ipi_nmi_handler(int irq, void *data)
> > +{
> > + /* nop, NMI handlers for special features can be added here. */
> > +
> > + return IRQ_HANDLED;
>
> This definitely is the *wrong* default. If nothing is explicitly
> handling the interrupt, it should be reported as such to the core
> code to be disabled if this happens too often.

Okay, will change the default to "IRQ_NONE".
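
A minimal user-space sketch of that default, assuming a hypothetical
feature hook. The irqreturn_t values are stand-ins for the kernel's
definitions, and register_nmi_feature(), demo_feature_handles() and
ipi_nmi_handler_sketch() are illustrative names that are not part of
the patch.

```c
#include <stddef.h>

/* Stand-ins for the kernel's irqreturn_t values. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

/* Hypothetical hook: returns nonzero if the feature handled the IPI. */
typedef int (*nmi_feature_fn)(void);

static nmi_feature_fn nmi_feature;

void register_nmi_feature(nmi_feature_fn fn)
{
	nmi_feature = fn;
}

/* Example feature that always claims the interrupt. */
int demo_feature_handles(void)
{
	return 1;
}

irqreturn_t ipi_nmi_handler_sketch(int irq, void *data)
{
	(void)irq;
	(void)data;

	/* Report IRQ_HANDLED only if something actually handled it. */
	if (nmi_feature && nmi_feature())
		return IRQ_HANDLED;

	/* Nothing claimed the interrupt, so let the core code know. */
	return IRQ_NONE;
}
```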

>
> > +}
> > +
> > +void ipi_nmi_setup(int cpu)
>
> The naming is awful. "ipi" is nowhere near specific enough (we already have
> another 7 of them), and this doesn't necessarily use pseudo-NMIs, since
> you are requesting IRQs.

How about the following naming convention?

- dynamic_ipi_setup()
- dynamic_ipi_teardown()
- set_smp_dynamic_ipi()

>
> > +{
> > + if (!ipi_desc)
> > + return;
> > +
> > + if (is_nmi) {
> > + if (!prepare_percpu_nmi(ipi_id))
> > + enable_percpu_nmi(ipi_id, IRQ_TYPE_NONE);
> > + } else {
> > + enable_percpu_irq(ipi_id, IRQ_TYPE_NONE);
>
> I'm not keen on this. Normal IRQs can't reliably work, so why do you
> even bother with this?

Yeah I agree but we need to support existing functionality for kgdb
roundup and sysrq backtrace using normal IRQs as well.

>
> > + }
> > +}
> > +
> > +void ipi_nmi_teardown(int cpu)
> > +{
> > + if (!ipi_desc)
> > + return;
> > +
> > + if (is_nmi) {
> > + disable_percpu_nmi(ipi_id);
> > + teardown_percpu_nmi(ipi_id);
> > + } else {
> > + disable_percpu_irq(ipi_id);
> > + }
> > +}
> > +
> > +void __init set_smp_ipi_nmi(int ipi)
> > +{
> > + int err;
> > +
> > + err = request_percpu_nmi(ipi, ipi_nmi_handler, "IPI", &cpu_number);
> > + if (err) {
> > + err = request_percpu_irq(ipi, ipi_nmi_handler, "IPI",
> > +  &cpu_number);
> > + WARN_ON(err);
> > + is_nmi = false;
> > + } else {
> > +

Re: [PATCH v5 2/2] leds: mt6360: Add LED driver for MT6360

2020-10-19 Thread Gene Chen

On Fri, Oct 9, 2020 at 5:51 AM Jacek Anaszewski wrote:
>
> Hi Gene,
>
> On 10/7/20 3:42 AM, Gene Chen wrote:
> > From: Gene Chen 
> >
> > Add an MT6360 LED driver that includes a 2-channel flash LED with
> > torch/strobe mode, a 3-channel RGB LED that supports
> > Register/Flash/Breath mode, and a 1-channel moonlight LED.
> >
> > Signed-off-by: Gene Chen 
> > ---
> >   drivers/leds/Kconfig   |  12 +
> >   drivers/leds/Makefile  |   1 +
> >   drivers/leds/leds-mt6360.c | 783 
> > +
> >   3 files changed, 796 insertions(+)
> >   create mode 100644 drivers/leds/leds-mt6360.c
> >
> > diff --git a/drivers/leds/Kconfig b/drivers/leds/Kconfig
> > index 1c181df..c7192dd 100644
> > --- a/drivers/leds/Kconfig
> > +++ b/drivers/leds/Kconfig
> > @@ -271,6 +271,18 @@ config LEDS_MT6323
> > This option enables support for on-chip LED drivers found on
> > Mediatek MT6323 PMIC.
> >
> > +config LEDS_MT6360
> > + tristate "LED Support for Mediatek MT6360 PMIC"
> > + depends on LEDS_CLASS_FLASH && OF
> > + depends on LEDS_CLASS_MULTICOLOR
>
> Since CONFIG_LEDS_CLASS_MULTICOLOR can be turned off you need to have
> the following instead:
>
> depends on LEDS_CLASS_MULTICOLOR || !LEDS_CLASS_MULTICOLOR
>
> Unless you want to prevent enabling the driver without RGB LED,
> but that does not seem to be reasonable at first glance.
>

May I change it to "select LEDS_CLASS_MULTICOLOR"?
I suppose the RGB LED always uses multicolor mode.

> > + depends on V4L2_FLASH_LED_CLASS || !V4L2_FLASH_LED_CLASS
> > + depends on MFD_MT6360
> > + help
> > +   This option enables support for dual Flash LED drivers found on
> > +   Mediatek MT6360 PMIC.
> > +   Independent current sources supply for each flash LED support torch
> > +   and strobe mode.
> > +
>
> --
> Best regards,
> Jacek Anaszewski

Re: [PATCH v2] usb: gadget: configfs: Fix use-after-free issue with udc_name

2020-10-19 Thread Macpaul Lin

On Sat, 2020-07-18 at 10:45 +0800, Macpaul Lin wrote:
> From: Eddie Hung 
> There is a use-after-free issue if udc_name is accessed
> in function gadget_dev_desc_UDC_store after another context
> frees udc_name in function unregister_gadget.
> 
> Context 1:
> gadget_dev_desc_UDC_store()->unregister_gadget()->
> free udc_name->set udc_name to NULL
> 
> Context 2:
> gadget_dev_desc_UDC_show()-> access udc_name
> 
> Call trace:
> dump_backtrace+0x0/0x340
> show_stack+0x14/0x1c
> dump_stack+0xe4/0x134
> print_address_description+0x78/0x478
> __kasan_report+0x270/0x2ec
> kasan_report+0x10/0x18
> __asan_report_load1_noabort+0x18/0x20
> string+0xf4/0x138
> vsnprintf+0x428/0x14d0
> sprintf+0xe4/0x12c
> gadget_dev_desc_UDC_show+0x54/0x64
> configfs_read_file+0x210/0x3a0
> __vfs_read+0xf0/0x49c
> vfs_read+0x130/0x2b4
> SyS_read+0x114/0x208
> el0_svc_naked+0x34/0x38
> 
> Add mutex_lock to protect this kind of scenario.
> 
> Signed-off-by: Eddie Hung 
> Signed-off-by: Macpaul Lin 
> Reviewed-by: Peter Chen 
> Cc: sta...@vger.kernel.org
> ---
> Changes for v2:
>   - Fix typo %s/contex/context, Thanks Peter.
> 
>  drivers/usb/gadget/configfs.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/gadget/configfs.c b/drivers/usb/gadget/configfs.c
> index 9dc06a4e1b30..21110b2865b9 100644
> --- a/drivers/usb/gadget/configfs.c
> +++ b/drivers/usb/gadget/configfs.c
> @@ -221,9 +221,16 @@ static ssize_t gadget_dev_desc_bcdUSB_store(struct 
> config_item *item,
>  
>  static ssize_t gadget_dev_desc_UDC_show(struct config_item *item, char *page)
>  {
> - char *udc_name = to_gadget_info(item)->composite.gadget_driver.udc_name;
> + struct gadget_info *gi = to_gadget_info(item);
> + char *udc_name;
> + int ret;
> +
> + mutex_lock(&gi->lock);
> + udc_name = gi->composite.gadget_driver.udc_name;
> + ret = sprintf(page, "%s\n", udc_name ?: "");
> + mutex_unlock(&gi->lock);
>  
> - return sprintf(page, "%s\n", udc_name ?: "");
> + return ret;
>  }
>  
>  static int unregister_gadget(struct gadget_info *gi)

Just a reminder that we have a fix here for usb/gadget/configfs.c.
If the patch needs to be revised further, please let us know.

Thanks!
Macpaul Lin
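
The locking pattern in the patch above can be illustrated outside the kernel. This is a minimal user-space sketch using a pthread mutex in place of the kernel mutex; the struct and function names are hypothetical, and it assumes the path that frees the name runs under the same lock as the reader.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the gadget state guarded by one lock. */
struct gadget {
	pthread_mutex_t lock;
	char *udc_name;	/* may be freed and set to NULL by the writer */
};

/* Reader: dereference and format the name only while holding the lock. */
static int gadget_show(struct gadget *g, char *page, size_t size)
{
	int ret;

	assert(page != NULL && size > 0);
	pthread_mutex_lock(&g->lock);
	ret = snprintf(page, size, "%s\n", g->udc_name ? g->udc_name : "");
	pthread_mutex_unlock(&g->lock);
	return ret;
}

/* Writer: free and clear the name under the same lock. */
static void gadget_unregister(struct gadget *g)
{
	pthread_mutex_lock(&g->lock);
	free(g->udc_name);
	g->udc_name = NULL;
	pthread_mutex_unlock(&g->lock);
}
```

Because the pointer is read and used inside one critical section, the reader sees either the valid string or NULL, never a pointer that has already been freed.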

Re: [PATCH] venus: vdec: return parsed crop information from stream

2020-10-19 Thread Alexandre Courbot

On Mon, Oct 19, 2020 at 7:19 AM Fritz Koenig  wrote:
>
> It looks like only h.264 streams are populating the event.input_crop
> struct when receiving the HFI_INDEX_EXTRADATA_INPUT_CROP message in
> event_seq_changed().  vp8/vp9 streams end up with the struct filled
> with 0.

Indeed. :( I guess we can fall back to the previous behavior of using
the coded resolution as the visible rect when the reported visible rect's
area is 0. That will preserve the previous behavior until the firmware
starts reporting this information for all encoded streams.
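
A rough sketch of that fallback, written as stand-alone C rather than against the venus driver; the struct layout and the function name are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rectangle type, loosely modelled on a V4L2-style rect. */
struct rect {
	int32_t left, top;
	uint32_t width, height;
};

/*
 * Return the crop rectangle reported by the firmware, or the full coded
 * resolution when the reported rectangle has zero area.
 */
static struct rect visible_rect(struct rect reported,
				uint32_t coded_w, uint32_t coded_h)
{
	struct rect full = { 0, 0, coded_w, coded_h };

	assert(coded_w > 0 && coded_h > 0);
	if ((uint64_t)reported.width * reported.height == 0)
		return full;
	return reported;
}
```

With this shape, streams for which the firmware leaves the crop struct zeroed keep reporting the coded size, while streams with a real crop report it unchanged.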

>
> On Fri, Oct 9, 2020 at 1:45 AM Alexandre Courbot  
> wrote:
> >
> > Per the stateful codec specification, VIDIOC_G_SELECTION with a target
> > of V4L2_SEL_TGT_COMPOSE is supposed to return the crop area of capture
> > buffers containing the decoded frame. Until now the driver did not get
> > that information from the firmware and just returned the dimensions of
> > CAPTURE buffers.
> >
> > Signed-off-by: Alexandre Courbot 
> > ---
> >  drivers/media/platform/qcom/venus/core.h |  1 +
> >  drivers/media/platform/qcom/venus/vdec.c | 21 -
> >  2 files changed, 17 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/media/platform/qcom/venus/core.h 
> > b/drivers/media/platform/qcom/venus/core.h
> > index 7b79a33dc9d6..3bc129a4f817 100644
> > --- a/drivers/media/platform/qcom/venus/core.h
> > +++ b/drivers/media/platform/qcom/venus/core.h
> > @@ -361,6 +361,7 @@ struct venus_inst {
> > unsigned int streamon_cap, streamon_out;
> > u32 width;
> > u32 height;
> > +   struct v4l2_rect crop;
> > u32 out_width;
> > u32 out_height;
> > u32 colorspace;
> > diff --git a/drivers/media/platform/qcom/venus/vdec.c 
> > b/drivers/media/platform/qcom/venus/vdec.c
> > index ea13170a6a2c..ee74346f0cae 100644
> > --- a/drivers/media/platform/qcom/venus/vdec.c
> > +++ b/drivers/media/platform/qcom/venus/vdec.c
> > @@ -325,6 +325,10 @@ static int vdec_s_fmt(struct file *file, void *fh, 
> > struct v4l2_format *f)
> >
> > inst->width = format.fmt.pix_mp.width;
> > inst->height = format.fmt.pix_mp.height;
> > +   inst->crop.top = 0;
> > +   inst->crop.left = 0;
> > +   inst->crop.width = inst->width;
> > +   inst->crop.height = inst->height;
> >
> > if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > inst->fmt_out = fmt;
> > @@ -343,6 +347,9 @@ vdec_g_selection(struct file *file, void *fh, struct 
> > v4l2_selection *s)
> > s->type != V4L2_BUF_TYPE_VIDEO_OUTPUT)
> > return -EINVAL;
> >
> > +   s->r.top = 0;
> > +   s->r.left = 0;
> > +
> > switch (s->target) {
> > case V4L2_SEL_TGT_CROP_BOUNDS:
> > case V4L2_SEL_TGT_CROP_DEFAULT:
> > @@ -363,16 +370,12 @@ vdec_g_selection(struct file *file, void *fh, struct 
> > v4l2_selection *s)
> > case V4L2_SEL_TGT_COMPOSE:
> > if (s->type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
> > return -EINVAL;
> > -   s->r.width = inst->out_width;
> > -   s->r.height = inst->out_height;
> > +   s->r = inst->crop;
> > break;
> > default:
> > return -EINVAL;
> > }
> >
> > -   s->r.top = 0;
> > -   s->r.left = 0;
> > -
> > return 0;
> >  }
> >
> > @@ -1309,6 +1312,10 @@ static void vdec_event_change(struct venus_inst 
> > *inst,
> >
> > inst->width = format.fmt.pix_mp.width;
> > inst->height = format.fmt.pix_mp.height;
> > +   inst->crop.left = ev_data->input_crop.left;
> > +   inst->crop.top = ev_data->input_crop.top;
> > +   inst->crop.width = ev_data->input_crop.width;
> > +   inst->crop.height = ev_data->input_crop.height;
> >
> > inst->out_width = ev_data->width;
> > inst->out_height = ev_data->height;
> > @@ -1412,6 +1419,10 @@ static void vdec_inst_init(struct venus_inst *inst)
> > inst->fmt_cap = &vdec_formats[0];
> > inst->width = frame_width_min(inst);
> > inst->height = ALIGN(frame_height_min(inst), 32);
> > +   inst->crop.left = 0;
> > +   inst->crop.top = 0;
> > +   inst->crop.width = inst->width;
> > +   inst->crop.height = inst->height;
> > inst->out_width = frame_width_min(inst);
> > inst->out_height = frame_height_min(inst);
> > inst->fps = 30;
> > --
> > 2.28.0.1011.ga647a8990f-goog
> >

Re: [PATCH v4 0/4] can: add support for ETAS ES58X CAN USB

2020-10-19 Thread Marc Kleine-Budde

On 10/16/20 7:13 PM, Vincent Mailhol wrote:
> The purpose of this patch series is to introduce a new CAN USB
> driver to support ETAS USB interfaces (ES58X series).
> 
> During development, issues in drivers/net/can/dev.c were discovered,
> the fix for those issues are included in this patch series.
> 
> We also propose to add one helper functions in include/linux/can/dev.h
> which we think can benefit other drivers: get_can_len().

I applied patches 1-3 to linux-can; I've changed get_can_len() -> can_get_len()
to use a common can_ prefix for all CAN-related functions.

Marc

-- 
Pengutronix e.K. | Marc Kleine-Budde   |
Embedded Linux   | https://www.pengutronix.de  |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917- |




Re: [PATCH] can: mcp251xfd: fix semicolon.cocci warnings

2020-10-19 Thread Marc Kleine-Budde

On 10/19/20 2:08 PM, kernel test robot wrote:
> From: kernel test robot 
> 
> drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c:176:2-3: Unneeded semicolon
> 
> 
>  Remove unneeded semicolon.
> 
> Generated by: scripts/coccinelle/misc/semicolon.cocci
> 
> Fixes: f4f77366f21d ("can: mcp251xfd: rename all user facing strings to 
> mcp251xfd")

The correct fixes tag is:

Fixes: 875347fe5756 ("can: mcp25xxfd: add regmap infrastructure")

> Signed-off-by: kernel test robot 

Applied with that to linux-can.

tnx,
Marc

-- 
Pengutronix e.K. | Marc Kleine-Budde   |
Embedded Linux   | https://www.pengutronix.de  |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917- |




[PATCH net] net: hdlc_raw_eth: Clear the IFF_TX_SKB_SHARING flag after calling ether_setup

2020-10-19 Thread Xie He

This driver calls ether_setup to set up the network device.
The ether_setup function would add the IFF_TX_SKB_SHARING flag to the
device. This flag indicates that it is safe to transmit shared skbs to
the device.

However, this is not true. This driver may pad the frame (in eth_tx)
before transmission, so the skb may be modified.

Cc: Neil Horman 
Cc: Krzysztof Halasa 
Signed-off-by: Xie He 
---
 drivers/net/wan/hdlc_raw_eth.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wan/hdlc_raw_eth.c b/drivers/net/wan/hdlc_raw_eth.c
index 08e0a46501de..c70a518b8b47 100644
--- a/drivers/net/wan/hdlc_raw_eth.c
+++ b/drivers/net/wan/hdlc_raw_eth.c
@@ -99,6 +99,7 @@ static int raw_eth_ioctl(struct net_device *dev, struct ifreq 
*ifr)
old_qlen = dev->tx_queue_len;
ether_setup(dev);
dev->tx_queue_len = old_qlen;
+   dev->priv_flags &= ~IFF_TX_SKB_SHARING;
eth_hw_addr_random(dev);
call_netdevice_notifiers(NETDEV_POST_TYPE_CHANGE, dev);
netif_dormant_off(dev);
-- 
2.25.1
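
The one-line change above follows a common pattern: call a generic setup helper, then clear the capability bit that does not hold for this device. A stand-alone sketch, with made-up flag values (they are not the kernel's actual bit assignments) and hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this illustration. */
#define FLAG_OTHER		(1u << 0)
#define FLAG_TX_SKB_SHARING	(1u << 3)

/* Stand-in for a generic setup helper that advertises buffer sharing. */
static uint32_t generic_setup_flags(uint32_t flags)
{
	return flags | FLAG_TX_SKB_SHARING;
}

/*
 * Stand-in for a driver that may modify buffers before transmission and
 * therefore withdraws the sharing flag after the generic setup.
 */
static uint32_t driver_setup_flags(uint32_t flags)
{
	flags = generic_setup_flags(flags);
	flags &= ~FLAG_TX_SKB_SHARING;

	assert(!(flags & FLAG_TX_SKB_SHARING));
	return flags;
}
```

All other bits set by the generic helper or by the caller are left untouched; only the sharing bit is masked out.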

Re: [PATCH 2/2] KVM: not link irqfd with a fake IRQ bypass producer

2020-10-19 Thread Jason Wang




On 2020/10/19 5:06 PM, Zhenzhong Duan wrote:
> In case failure to setup Post interrupt for an IRQ, it make no sense
> to assign irqfd->producer to the producer.
>
> This change makes code more robust.

It's better to describe what issue we will get without this patch.

Thanks

> Signed-off-by: Zhenzhong Duan
> ---
>   arch/x86/kvm/x86.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ce856e0..277e961 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10683,13 +10683,14 @@ int kvm_arch_irq_bypass_add_producer(struct
> irq_bypass_consumer *cons,
>   		container_of(cons, struct kvm_kernel_irqfd, consumer);
>   	int ret;
>
> -	irqfd->producer = prod;
>   	kvm_arch_start_assignment(irqfd->kvm);
>   	ret = kvm_x86_ops.update_pi_irte(irqfd->kvm,
>   					 prod->irq, irqfd->gsi, 1);
>
>   	if (ret)
>   		kvm_arch_end_assignment(irqfd->kvm);
> +	else
> +		irqfd->producer = prod;
>
>   	return ret;
>   }

[PATCH v2] dmaengine: pl330: Remove unreachable code

2020-10-19 Thread Surendran K

_setup_req(..) never returns a negative value.
Hence the condition ret < 0 is never met.

Signed-off-by: Surendran K 
---
Commit tag is changed

 drivers/dma/pl330.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index e9f0101d92fa..8355586c9788 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -1527,8 +1527,6 @@ static int pl330_submit_req(struct pl330_thread *thrd,

/* First dry run to check if req is acceptable */
ret = _setup_req(pl330, 1, thrd, idx, &xs);
-   if (ret < 0)
-   goto xfer_exit;

if (ret > pl330->mcbufsz / 2) {
dev_info(pl330->ddma.dev, "%s:%d Try increasing mcbufsz 
(%i/%i)\n",
--
2.17.1

[PATCH 2/2] fs:regfs: add panic notifier callback for saving regs

2020-10-19 Thread Zou Cao

Register a panic notifier callback for saving registers, and add a module
param, regfs_panic, to enable printing the register info on panic.

Signed-off-by: Zou Cao 
---
 fs/regfs/inode.c | 39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/fs/regfs/inode.c b/fs/regfs/inode.c
index 1643fcd..6c79f73 100644
--- a/fs/regfs/inode.c
+++ b/fs/regfs/inode.c
@@ -46,10 +46,41 @@
 static LIST_HEAD(regfs_head);
 
 static const struct inode_operations regfs_dir_inode_operations;
-int regfs_debug;
+int regfs_debug = 1;
 module_param(regfs_debug, int, S_IRUGO);
 MODULE_PARM_DESC(regfs_debug, "enable regfs debug mode");
 
+static int regfs_panic = 1;
+module_param(regfs_panic, int, S_IRUGO);
+MODULE_PARM_DESC(regfs_debug, "printk the register when panic");
+
+//save all register val when panic
+static int regfs_panic_event(struct notifier_block *self,
+unsigned long val, void *data)
+{
+   struct regfs_fs_info *fsi;
+   struct inode *inode, *next;
+
+
+   list_for_each_entry(fsi, &regfs_head, regfs_head) {
+   list_for_each_entry_safe(inode, next, &fsi->sb->s_inodes, 
i_sb_list) {
+   struct regfs_inode_info *info =  REGFS_I(inode);;
+   //save the regs val
+   if (info->type == RES_TYPE_ITEM) {
+   info->val = readl_relaxed(info->base + 
info->offset);
+   if (regfs_panic)
+   printk("%llx:%llx\n", (u64)(info->base 
+ info->offset), info->val);
+   }
+   }
+   }
+
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block regfs_panic_event_nb = {
+   .notifier_call   = regfs_panic_event,
+};
+
 struct inode *regfs_get_inode(struct super_block *sb, const struct inode *dir, 
umode_t mode, dev_t dev)
 {
struct inode *inode = new_inode(sb);
@@ -328,6 +359,7 @@ static void init_once(void *foo)
 
 static int __init init_regfs_fs(void)
 {
+   int ret;
 
regfs_inode_cachep = kmem_cache_create_usercopy("regfs_inode_cache",
sizeof(struct regfs_inode_info), 0,
@@ -337,11 +369,16 @@ static int __init init_regfs_fs(void)
if (!regfs_inode_cachep)
return -ENOMEM;
 
+   ret = atomic_notifier_chain_register(&panic_notifier_list, 
&regfs_panic_event_nb);
+   if (ret)
+   pr_warn("regfs regiter panic notifier failed\n");
+
return  register_filesystem(&regfs_fs_type);
 }
 
 static void __exit exit_regfs_fs(void)
 {
+   atomic_notifier_chain_unregister(&panic_notifier_list, 
&regfs_panic_event_nb);
unregister_filesystem(&regfs_fs_type);
rcu_barrier();
kmem_cache_destroy(regfs_inode_cachep);
-- 
1.8.3.1

[PATCH 1/2] fs:regfs: add register easy filesystem

2020-10-19 Thread Zou Cao

The register filesystem maps registers to file dentries and reads each
register value through I/O reads. A DTB file is used to describe the
register tree; you can use it as follows:

mount -t regfs -o dtb=test.dtb none /mnt

test.dts:
/ {

compatible = "hisilicon,hi6220-hikey", "hisilicon,hi6220";
#address-cells = <0x2>;
#size-cells = <0x2>;
model = "HiKey Development Board";

gic-v3-dist{
reg = <0x0 0x800 0x0 0x1>;
GIC_CTRL {
offset = <0x0>;
};
GICD_TYPER {
offset = <0x4>;
};
   };
};

This creates a dentry file for every register under /mnt.

Signed-off-by: Zou Cao 
---
 fs/Kconfig |   1 +
 fs/Makefile|   1 +
 fs/regfs/Kconfig   |   7 +
 fs/regfs/Makefile  |   8 ++
 fs/regfs/file.c| 107 +++
 fs/regfs/inode.c   | 354 +
 fs/regfs/internal.h|  32 +
 fs/regfs/regfs_inode.h |  32 +
 fs/regfs/supper.c  |  71 ++
 9 files changed, 613 insertions(+)
 create mode 100644 fs/regfs/Kconfig
 create mode 100644 fs/regfs/Makefile
 create mode 100644 fs/regfs/file.c
 create mode 100644 fs/regfs/inode.c
 create mode 100644 fs/regfs/internal.h
 create mode 100644 fs/regfs/regfs_inode.h
 create mode 100644 fs/regfs/supper.c

diff --git a/fs/Kconfig b/fs/Kconfig
index a88aa3a..d95acaf 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -324,6 +324,7 @@ endif # NETWORK_FILESYSTEMS
 source "fs/nls/Kconfig"
 source "fs/dlm/Kconfig"
 source "fs/unicode/Kconfig"
+source "fs/regfs/Kconfig"
 
 config IO_WQ
bool
diff --git a/fs/Makefile b/fs/Makefile
index 2ce5112..24f3878 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -136,3 +136,4 @@ obj-$(CONFIG_EFIVAR_FS) += efivarfs/
 obj-$(CONFIG_EROFS_FS) += erofs/
 obj-$(CONFIG_VBOXSF_FS)+= vboxsf/
 obj-$(CONFIG_ZONEFS_FS)+= zonefs/
+obj-$(CONFIG_REGFS_FS) += zonefs/
diff --git a/fs/regfs/Kconfig b/fs/regfs/Kconfig
new file mode 100644
index 000..74ba85b
--- /dev/null
+++ b/fs/regfs/Kconfig
@@ -0,0 +1,7 @@
+config REGFS_FS
+   tristate "registers filesystem support"
+   depends on ARM64
+   help
+ regfs support the read and write register of device resource by
+ dentry filesystem, it is more easy to support bsp debug. it also
+ support to printk the register val when panic
diff --git a/fs/regfs/Makefile b/fs/regfs/Makefile
new file mode 100644
index 000..26d5eef
--- /dev/null
+++ b/fs/regfs/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+#Makefile for the linux ramfs routines.
+#
+
+obj-y += regfs.o
+
+regfs-objs += inode.o file.o supper.o
diff --git a/fs/regfs/file.c b/fs/regfs/file.c
new file mode 100644
index 000..6cd9f3d
--- /dev/null
+++ b/fs/regfs/file.c
@@ -0,0 +1,107 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "regfs_inode.h"
+#include "internal.h"
+
+ssize_t regfs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
+{
+   struct file *file = iocb->ki_filp;
+   struct inode *inode = file->f_mapping->host;
+   ssize_t ret;
+
+   inode_lock(inode);
+   ret = generic_write_checks(iocb, from);
+   if (ret > 0)
+   ret = __generic_file_write_iter(iocb, from);
+   inode_unlock(inode);
+
+   if (ret > 0)
+   ret = generic_write_sync(iocb, ret);
+   return ret;
+}
+
+static ssize_t regfs_file_read(struct file *file, char __user *buf, size_t 
len, loff_t *ppos)
+{
+   struct address_space *mapping = file->f_mapping;
+   struct regfs_inode_info  *info = REGFS_I(mapping->host);
+   char str[64];
+   unsigned long val;
+
+   val = readl_relaxed(info->base + info->offset);
+
+   loc_debug("name:%s base:%p val:%lx\n"
+   , file->f_path.dentry->d_iname
+   , info->base + info->offset
+   , val);
+
+   snprintf(str, 64, "%lx", val);
+
+   return simple_read_from_buffer(buf, len, ppos, str, strlen(str));
+}
+
+static ssize_t regfs_file_write(struct file *file, const char __user *buf, 
size_t len, loff_t *ppos)
+{
+   struct address_space *mapping = file->f_mapping;
+   struct regfs_inode_info  *info = REGFS_I(mapping->host);
+   char str[67];
+   unsigned long val = 0;
+   loff_t pos = *ppos;
+   size_t res;
+
+   if (pos < 0)
+   return -EINVAL;
+   if (pos >= len || len > 66)
+   return 0;
+
+   res = copy_from_user(str, buf, len);
+   if (res)
+   return -EFAULT;
+   str[len] = 0;
+
+   if (kstrtoul(str, 16, &val) < 0)
+   re

De Mr Armand AGBO(Demande de Partenariat / Partnership Request)

2020-10-19 Thread Mr Armand AGBO

By Mr Armand AGBO
I am Mr Armand AGBO of Beninese nationality, account manager in a banking 
institution. I apologize for this unexpected intrusion on my part because it is 
following an internet search that I found your contact and after having browsed 
your profile, I am convinced that you will be in a better position to execute 
this transaction. commercial with me.
Indeed, this concerns one of our clients who has been deceased for almost 5 
years and who has an account without a successor or representative mentioned in 
the files or in the archives. So I come by this message to solicit you for a 
partnership as follows: This account is currently inactive and is blocked but 
it is a physical account linked to an online account. I would like to have your 
collaboration agreement in order to register your name as beneficiary of the 
estate in the codicil and last will of the latter in our files and archives.
This transaction is 100% risk free only mutual trust as all legal documents 
that will be used to process this case will be handled by me, in your agreement 
to cooperate with me on this matter.
Please send me your letter of acceptance to allow me to start getting all the 
legal documents for the release of its funds. Here is my personal email 
address: mrarmand.a...@gmail.com
Cordially …
Mr AGBO A


Re: [PATCH 1/2] KVM: not register a IRQ bypass producer if unsupported or disabled

2020-10-19 Thread Jason Wang




On 2020/10/19 5:06 PM, Zhenzhong Duan wrote:
> If Post interrupt is disabled due to hardware limit or forcely disabled
> by "intremap=nopost" parameter, return -EINVAL so that the legacy mode IRQ
> isn't registered as IRQ bypass producer.

Is there any side effect if it was still registered?

> With this change, below message is printed:
> "vfio-pci :db:00.0: irq bypass producer (token 60c8cda5) registration
> fails: -22"

I may be missing something, but the patch only touches vhost-vDPA instead of VFIO?

Thanks

> ..which also hints us if a vfio or vdpa device works in PI mode or legacy
> remapping mode.
>
> Add a print to vdpa code just like what vfio_msi_set_vector_signal() does.
>
> Signed-off-by: Zhenzhong Duan
> ---
>   arch/x86/kvm/svm/avic.c | 3 +--
>   arch/x86/kvm/vmx/vmx.c  | 5 ++---
>   drivers/vhost/vdpa.c    | 5 +
>   3 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index ac830cd..316142a 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -814,7 +814,7 @@ int svm_update_pi_irte(struct kvm *kvm, unsigned int
> host_irq,
>
>   	if (!kvm_arch_has_assigned_device(kvm) ||
>   	    !irq_remapping_cap(IRQ_POSTING_CAP))
> -		return 0;
> +		return ret;
>
>   	pr_debug("SVM: %s: host_irq=%#x, guest_irq=%#x, set=%#x\n",
>   		 __func__, host_irq, guest_irq, set);
> @@ -899,7 +899,6 @@ int svm_update_pi_irte(struct kvm *kvm, unsigned int
> host_irq,
>   		}
>   	}
>
> -	ret = 0;
>   out:
>   	srcu_read_unlock(&kvm->irq_srcu, idx);
>   	return ret;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index f0a9954..1fed6d6 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7716,12 +7716,12 @@ static int vmx_update_pi_irte(struct kvm *kvm, unsigned
> int host_irq,
>   	struct kvm_lapic_irq irq;
>   	struct kvm_vcpu *vcpu;
>   	struct vcpu_data vcpu_info;
> -	int idx, ret = 0;
> +	int idx, ret = -EINVAL;
>
>   	if (!kvm_arch_has_assigned_device(kvm) ||
>   	    !irq_remapping_cap(IRQ_POSTING_CAP) ||
>   	    !kvm_vcpu_apicv_active(kvm->vcpus[0]))
> -		return 0;
> +		return ret;
>
>   	idx = srcu_read_lock(&kvm->irq_srcu);
>   	irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu);
> @@ -7787,7 +7787,6 @@ static int vmx_update_pi_irte(struct kvm *kvm, unsigned
> int host_irq,
>   		}
>   	}
>
> -	ret = 0;
>   out:
>   	srcu_read_unlock(&kvm->irq_srcu, idx);
>   	return ret;
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 62a9bb0..b20060a 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -107,6 +107,11 @@ static void vhost_vdpa_setup_vq_irq(struct vhost_vdpa *v,
> u16 qid)
>   	vq->call_ctx.producer.token = vq->call_ctx.ctx;
>   	vq->call_ctx.producer.irq = irq;
>   	ret = irq_bypass_register_producer(&vq->call_ctx.producer);
> +	if (unlikely(ret))
> +		dev_info(&vdpa->dev,
> +		"irq bypass producer (token %p) registration fails: %d\n",
> +		vq->call_ctx.producer.token, ret);
> +
>   	spin_unlock(&vq->call_ctx.ctx_lock);
>   }

Re: [PATCH 1/4] ftgmac100: Fix race issue on TX descriptor[0]

2020-10-19 Thread Benjamin Herrenschmidt

On Tue, 2020-10-20 at 04:13 +, Joel Stanley wrote:
> On Mon, 19 Oct 2020 at 23:20, Benjamin Herrenschmidt
>  wrote:
> > 
> > On Mon, 2020-10-19 at 16:57 +0800, Dylan Hung wrote:
> > > These rules must be followed when accessing the TX descriptor:
> > > 
> > > 1. A TX descriptor is "cleanable" only when its value is non-zero
> > > and the owner bit is set to "software"
> > 
> > Can you elaborate ? What is the point of that change ? The owner
> > bit
> > should be sufficient, why do we need to check other fields ?
> 
> I would like Dylan to clarify too. The datasheet has a footnote below
> the descriptor layout:
> 
>  - TXDES#0: Bits 27 ~ 14 are valid only when FTS = 1
>  - TXDES#1: Bits 31 ~ 0 are valid only when FTS = 1
> 
> So the ownership bit (31) is not valid unless FTS is set. However,
> this isn't what his patch does. It adds checks for EDOTR.

No, I think it adds a check for everything except EDOTR, which just marks
the end of the ring and needs to be ignored in the comparison.

That said, we do need a better explanation.
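
To make the masking concrete, here is a stand-alone sketch of the two rules from the patch description as predicates. The owner bit at position 31 matches the datasheet note quoted above; the EDOTR position is an assumption (the driver keeps it in a per-chip mask), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TXDES0_DMA_OWN	(1u << 31)	/* set while the hardware owns the descriptor */
#define TXDES0_EDOTR	(1u << 30)	/* end-of-ring marker; position assumed here */

/* Rule 1: cleanable when owned by software and non-zero apart from EDOTR. */
static bool txdes_cleanable(uint32_t ctl_stat)
{
	assert((TXDES0_DMA_OWN & TXDES0_EDOTR) == 0);

	if (ctl_stat & TXDES0_DMA_OWN)
		return false;
	return (ctl_stat & ~TXDES0_EDOTR) != 0;
}

/* Rule 2: writable when everything except EDOTR is zero. */
static bool txdes_writable(uint32_t ctl_stat)
{
	return (ctl_stat & ~TXDES0_EDOTR) == 0;
}
```

Under these definitions a descriptor holding only the end-of-ring bit is writable but not cleanable, and the two predicates can never both be true for the same value.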

One potential bug I did find by looking at my code however is:

static bool ftgmac100_tx_complete_packet(struct ftgmac100 *priv)
{
struct net_device *netdev = priv->netdev;
struct ftgmac100_txdes *txdes;
struct sk_buff *skb;
unsigned int pointer;
u32 ctl_stat;

pointer = priv->tx_clean_pointer;
txdes = &priv->txdes[pointer];

ctl_stat = le32_to_cpu(txdes->txdes0);
if (ctl_stat & FTGMAC100_TXDES0_TXDMA_OWN)
return false;

skb = priv->tx_skbs[pointer];
netdev->stats.tx_packets++;
netdev->stats.tx_bytes += skb->len;
ftgmac100_free_tx_packet(priv, pointer, skb, txdes, ctl_stat);
txdes->txdes0 = cpu_to_le32(ctl_stat & priv->txdes0_edotr_mask);

   There should probably be an smp_wmb() here to ensure that all the above
stores are visible before the tx clean pointer is updated.

priv->tx_clean_pointer = ftgmac100_next_tx_pointer(priv, pointer);

return true;
}

Similarly, we probably should have one before setting tx_pointer in start_xmit().

As for the read side of this, I'm not 100% sure. I'll have to think more about
it; I *think* the existing barriers are sufficient at first sight.
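
As a user-space analogue of that ordering concern (not the kernel API), the sketch below uses C11 atomics: the descriptor is reset with a plain store, and the clean pointer is then published with a release store so that a consumer doing an acquire load of the pointer also observes the reset. The ring layout and names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 4u

/* Hypothetical miniature TX ring. */
struct ring {
	uint32_t desc[RING_SIZE];
	atomic_uint clean_pointer;
};

/* Completion side: reset the descriptor, then publish the new pointer. */
static void ring_complete_one(struct ring *r)
{
	unsigned int p = atomic_load_explicit(&r->clean_pointer,
					      memory_order_relaxed);

	assert(p < RING_SIZE);
	r->desc[p] = 0;
	/* Release: the store to desc[p] is ordered before this store. */
	atomic_store_explicit(&r->clean_pointer, (p + 1) % RING_SIZE,
			      memory_order_release);
}

/* Transmit side: the acquire load pairs with the release store above. */
static unsigned int ring_clean_pointer(struct ring *r)
{
	return atomic_load_explicit(&r->clean_pointer, memory_order_acquire);
}
```

A release store is somewhat stronger than a pure write barrier, so this only illustrates the intended ordering; whether the driver needs smp_wmb() or a different primitive is the open question in the thread.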

Cheers,
Ben.

> > 
> > > 2. A TX descriptor is "writable" only when its value is zero
> > > regardless the edotr mask.
> > 
> > Again, why is that ? Can you elaborate ? What race are you trying
> > to
> > address here ?
> > 
> > Cheers,
> > Ben.
> > 
> > > Fixes: 52c0cae87465 ("ftgmac100: Remove tx descriptor accessors")
> > > Signed-off-by: Dylan Hung 
> > > Signed-off-by: Joel Stanley 
> > > ---
> > >  drivers/net/ethernet/faraday/ftgmac100.c | 10 ++
> > >  1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/drivers/net/ethernet/faraday/ftgmac100.c
> > > b/drivers/net/ethernet/faraday/ftgmac100.c
> > > index 00024dd41147..7cacbe4aecb7 100644
> > > --- a/drivers/net/ethernet/faraday/ftgmac100.c
> > > +++ b/drivers/net/ethernet/faraday/ftgmac100.c
> > > @@ -647,6 +647,9 @@ static bool
> > > ftgmac100_tx_complete_packet(struct
> > > ftgmac100 *priv)
> > >   if (ctl_stat & FTGMAC100_TXDES0_TXDMA_OWN)
> > >   return false;
> > > 
> > > + if ((ctl_stat & ~(priv->txdes0_edotr_mask)) == 0)
> > > + return false;
> > > +
> > >   skb = priv->tx_skbs[pointer];
> > >   netdev->stats.tx_packets++;
> > >   netdev->stats.tx_bytes += skb->len;
> > > @@ -756,6 +759,9 @@ static netdev_tx_t
> > > ftgmac100_hard_start_xmit(struct sk_buff *skb,
> > >   pointer = priv->tx_pointer;
> > >   txdes = first = &priv->txdes[pointer];
> > > 
> > > + if (le32_to_cpu(txdes->txdes0) & ~priv->txdes0_edotr_mask)
> > > + goto drop;
> > > +
> > >   /* Setup it up with the packet head. Don't write the head
> > > to
> > > the
> > >* ring just yet
> > >*/
> > > @@ -787,6 +793,10 @@ static netdev_tx_t
> > > ftgmac100_hard_start_xmit(struct sk_buff *skb,
> > >   /* Setup descriptor */
> > >   priv->tx_skbs[pointer] = skb;
> > >   txdes = &priv->txdes[pointer];
> > > +
> > > + if (le32_to_cpu(txdes->txdes0) & ~priv-
> > > > txdes0_edotr_mask)
> > > 
> > > + goto dma_err;
> > > +
> > >   ctl_stat = ftgmac100_base_tx_ctlstat(priv,
> > > pointer);
> > >   ctl_stat |= FTGMAC100_TXDES0_TXDMA_OWN;
> > >   ctl_stat |= FTGMAC100_TXDES0_TXBUF_SIZE(len);

[PATCH] power: suspend: Add sleep timer and timeout handler

2020-10-19 Thread Joseph Jang

Add a sleep timer and timeout handler to keep a device from getting stuck
during the suspend/resume process. The timeout handler dumps disk sleep
tasks at the first round timeout and triggers a kernel panic at the second
round timeout. The default timer for each round is defined in
CONFIG_PM_SLEEP_TIMER_TIMEOUT.
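
The two-round escalation described above reduces to a small state machine. The following stand-alone sketch models only the decision logic; the type and function names are hypothetical, and the real handler's stack dumps, timer re-arming and panic call are represented by return values.

```c
#include <assert.h>

/* What the handler decides to do on a given expiry. */
enum timeout_action {
	ACTION_DUMP_AND_REARM,	/* first expiry: dump tasks, restart the timer */
	ACTION_PANIC		/* second expiry: give up and panic */
};

/* Hypothetical monitor state; counts expiries seen so far. */
struct sleep_monitor {
	int timeout_count;
};

/* Decide the action for one timer expiry and update the counter. */
static enum timeout_action sleep_timeout(struct sleep_monitor *m)
{
	assert(m->timeout_count >= 0);

	if (m->timeout_count < 1) {
		m->timeout_count++;
		return ACTION_DUMP_AND_REARM;
	}
	return ACTION_PANIC;
}
```

Note that nothing in this model resets the counter after a successful suspend or resume; whether the real code should do so is a design question for the patch.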

Signed-off-by: Joseph Jang 
---
 MAINTAINERS   |  2 +
 include/linux/console.h   |  1 +
 include/linux/suspend_timer.h | 90 +++
 kernel/power/Kconfig  | 15 ++
 kernel/power/suspend.c| 19 
 kernel/printk/printk.c|  5 ++
 6 files changed, 132 insertions(+)
 create mode 100644 include/linux/suspend_timer.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 867157311dc8..8ae91f5ff3ff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7787,6 +7787,7 @@ F:drivers/base/power/
 F: include/linux/freezer.h
 F: include/linux/pm.h
 F: include/linux/suspend.h
+F: include/linux/suspend_timer.h
 F: kernel/power/
 
 HID CORE LAYER
@@ -16629,6 +16630,7 @@ F:  drivers/base/power/
 F: include/linux/freezer.h
 F: include/linux/pm.h
 F: include/linux/suspend.h
+F: include/linux/suspend_timer.h
 F: kernel/power/
 
 SVGA HANDLING
diff --git a/include/linux/console.h b/include/linux/console.h
index 0670d3491e0e..5436d8dc600f 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -192,6 +192,7 @@ static inline void console_sysfs_notify(void)
 { }
 #endif
 extern bool console_suspend_enabled;
+extern int console_is_suspended(void);
 
 /* Suspend and resume console messages over PM events */
 extern void suspend_console(void);
diff --git a/include/linux/suspend_timer.h b/include/linux/suspend_timer.h
new file mode 100644
index ..7e4c9e31bf09
--- /dev/null
+++ b/include/linux/suspend_timer.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_SLEEP_TIMER_H
+#define _LINUX_SLEEP_TIMER_H
+
+#include 
+
+#ifdef CONFIG_PM_SLEEP_MONITOR
+struct sleep_timer {
+   struct task_struct  *tsk;
+   struct timer_list   timer;
+};
+
+#define DECLARE_SLEEP_TIMER(st) \
+   struct sleep_timer st
+
+/**
+ * init_sleep_timer - Initialize sleep timer.
+ * @st: Sleep timer to initialize.
+ * @func: Sleep timer timeout handler.
+ */
+static void init_sleep_timer(struct sleep_timer *st, void (*func))
+{
+   struct timer_list *timer = &st->timer;
+
+   timer_setup(timer, func, 0);
+}
+
+/**
+ * start_sleep_timer - Enable sleep timer to monitor suspend thread.
+ * @st: Sleep timer to enable.
+ */
+static void start_sleep_timer(struct sleep_timer *st)
+{
+   struct timer_list *timer = &st->timer;
+
+   st->tsk = current;
+
+   /* use same timeout value for both suspend and resume */
+   timer->expires = jiffies + HZ * CONFIG_PM_SLEEP_TIMER_TIMEOUT;
+   add_timer(timer);
+}
+
+/**
+ * stop_sleep_timer - Disable sleep timer.
+ * @st: sleep timer to disable.
+ */
+static void stop_sleep_timer(struct sleep_timer *st)
+{
+   struct timer_list *timer = &st->timer;
+
+   del_timer_sync(timer);
+}
+
+/**
+ * sleep_timeout_handler - sleep timer timeout handler.
+ * @t: The timer list that sleep timer depends on.
+ *
+ * Called when suspend thread has timeout suspending or resuming.
+ * Dump all uninterruptible tasks' call stack and call panic() to
+ * reboot system in second round timeout.
+ */
+static void sleep_timeout_handler(struct timer_list *t)
+{
+   struct sleep_timer *st = from_timer(st, t, timer);
+   static int timeout_count;
+
+   pr_info("Sleep timeout (timer is %d seconds)\n",
+   (CONFIG_PM_SLEEP_TIMER_TIMEOUT));
+   show_stack(st->tsk, NULL, KERN_EMERG);
+   show_state_filter(TASK_UNINTERRUPTIBLE);
+
+   if (timeout_count < 1) {
+   timeout_count++;
+   start_sleep_timer(st);
+   return;
+   }
+
+   if (console_is_suspended())
+   resume_console();
+
+   panic("Sleep timeout and panic\n");
+}
+#else
+#define DECLARE_SLEEP_TIMER(st)
+#define init_sleep_timer(x, y)
+#define start_sleep_timer(x)
+#define stop_sleep_timer(x)
+#endif
+
+#endif /* _LINUX_SLEEP_TIMER_H */
diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
index a7320f07689d..9e2b274db0c1 100644
--- a/kernel/power/Kconfig
+++ b/kernel/power/Kconfig
@@ -207,6 +207,21 @@ config PM_SLEEP_DEBUG
def_bool y
depends on PM_DEBUG && PM_SLEEP
 
+config PM_SLEEP_MONITOR
+   bool "Linux kernel suspend/resume process monitor"
+   depends on PM_SLEEP
+   help
+   This option will enable sleep timer to prevent device stuck
+   during suspend/resume process. Sleep timeout handler will dump
+   disk sleep task at first round timeout and trigger kernel panic
+   at second round timeout. The timer for each round is defined in
+   CONFIG_PM_SLEEP_TIMER_TIMEOUT.
+
+config PM_SLEEP_TIMER_TIMEOUT
+   int "Sleep timer timeout in seconds"
+   range

[RFCv2 03/16] x86/kvm: Make DMA pages shared

2020-10-19 Thread Kirill A. Shutemov

Make force_dma_unencrypted() return true for KVM to get DMA pages mapped
as shared.

__set_memory_enc_dec() now informs the host via hypercall if the state
of the page has changed from shared to private or back.
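
The argument computation for that hypercall can be sketched in isolation. This is stand-alone C under stated assumptions: 4 KiB pages, placeholder hypercall numbers (the real values are defined by the patch's header change), and a hypothetical struct that just records what would be passed to the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* 4 KiB pages assumed */

/* Placeholder numbers, not the values used by the patch. */
enum { HC_MEM_SHARE = 1, HC_MEM_UNSHARE = 2 };

/* Hypothetical record of the hypercall number and its two arguments. */
struct hypercall {
	int nr;
	uint64_t gfn;
	uint64_t npages;
};

/*
 * Encrypting (enc == true) makes the range private again, so it maps to
 * "unshare"; decrypting maps to "share".
 */
static struct hypercall enc_dec_hypercall(uint64_t phys_addr, int numpages,
					  bool enc)
{
	struct hypercall hc;

	assert(numpages > 0);
	hc.nr = enc ? HC_MEM_UNSHARE : HC_MEM_SHARE;
	hc.gfn = phys_addr >> PAGE_SHIFT;
	hc.npages = (uint64_t)numpages;
	return hc;
}
```

The guest frame number is simply the physical address shifted down by the page size, so a range starting at physical address 0x12345000 is reported as frame 0x12345.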

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/Kconfig | 1 +
 arch/x86/mm/mem_encrypt_common.c | 5 +++--
 arch/x86/mm/pat/set_memory.c | 7 +++
 include/uapi/linux/kvm_para.h| 2 ++
 4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 619ebf40e457..cd272e3babbc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -805,6 +805,7 @@ config KVM_GUEST
select PARAVIRT_CLOCK
select ARCH_CPUIDLE_HALTPOLL
select X86_HV_CALLBACK_VECTOR
+   select X86_MEM_ENCRYPT_COMMON
default y
help
  This option enables various optimizations for running under the KVM
diff --git a/arch/x86/mm/mem_encrypt_common.c b/arch/x86/mm/mem_encrypt_common.c
index 964e04152417..a878e7f246d5 100644
--- a/arch/x86/mm/mem_encrypt_common.c
+++ b/arch/x86/mm/mem_encrypt_common.c
@@ -10,14 +10,15 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
 bool force_dma_unencrypted(struct device *dev)
 {
/*
-* For SEV, all DMA must be to unencrypted/shared addresses.
+* For SEV and KVM, all DMA must be to unencrypted/shared addresses.
 */
-   if (sev_active())
+   if (sev_active() || kvm_mem_protected())
return true;
 
/*
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index d1b2a889f035..4c49303126c9 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1977,6 +1978,12 @@ static int __set_memory_enc_dec(unsigned long addr, int 
numpages, bool enc)
struct cpa_data cpa;
int ret;
 
+   if (kvm_mem_protected()) {
+   unsigned long gfn = __pa(addr) >> PAGE_SHIFT;
+   int call = enc ? KVM_HC_MEM_UNSHARE : KVM_HC_MEM_SHARE;
+   return kvm_hypercall2(call, gfn, numpages);
+   }
+
/* Nothing to do if memory encryption is not active */
if (!mem_encrypt_active())
return 0;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 1a216f32e572..c6d8c988e330 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -30,6 +30,8 @@
 #define KVM_HC_SEND_IPI                 10
 #define KVM_HC_SCHED_YIELD              11
 #define KVM_HC_ENABLE_MEM_PROTECTED     12
+#define KVM_HC_MEM_SHARE                13
+#define KVM_HC_MEM_UNSHARE              14
 
 /*
  * hypercalls use architecture specific
-- 
2.26.2

[PATCH v2 0/2] watchdog: f71808e_wdt: migrate to new kernel API

2020-10-19 Thread Ahmad Fatoum

This series migrates the driver to use first the driver model, then
the new kernel watchdog API.

I tested it on a f81866.

v1 had a wrong title (f71808e_wdt: migrate to kernel). It's available here:
https://lore.kernel.org/linux-watchdog/20200611191750.28096-1-a.fat...@pengutronix.de/

v1 -> v2:
- reworked to platform device/driver pair (Guenter)
- squashed identifier renaming into the patches that touch
  the respective lines anyway
- fixed checkpatch.pl nitpicks (Guenter)
- fixed locally used variable declared without static (0-day)
- fixed unneeded line break due to old line limit (Guenter)
- renamed struct fintek_wdog_data to struct fintek_wdt

Ahmad Fatoum (2):
  watchdog: f71808e_wdt: refactor to platform device/driver pair
  watchdog: f71808e_wdt: migrate to new kernel watchdog API

 drivers/watchdog/Kconfig   |   1 +
 drivers/watchdog/f71808e_wdt.c | 815 -
 2 files changed, 292 insertions(+), 524 deletions(-)

-- 
2.28.0

[PATCH v2 1/2] watchdog: f71808e_wdt: refactor to platform device/driver pair

2020-10-19 Thread Ahmad Fatoum

The driver had not yet been ported to the driver model; it set up its
miscdevice from the module init function after probing the I/O ports for
a watchdog with the correct vendor and device revision.

Keep the device detection part at init time, but move watchdog setup
to a platform driver probe function.

While at it, refactor some of the driver code that we now have to touch
anyway:

 - platform_device_id is used instead of the two big switches mapping
   hardware ID to an enum and then mapping it to a pinconf function
 - we rename f71808e_ and watchdog_data to fintek_wdt, to avoid mixing up
   the generic parts with the device-specific parts
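
To illustrate the table-driven idea in the first bullet, here is a minimal
userspace sketch. It is not the driver code: the real patch matches through
platform_device_id, whereas this sketch uses a plain array lookup, and all
sketch_* names and the ID values are placeholders, not real Fintek chip IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_wdt;

/* One table entry per chip: hardware ID plus its pin configuration hook. */
struct sketch_wdt_variant {
	uint16_t id;
	void (*pinconf)(struct sketch_wdt *wd);
};

struct sketch_wdt {
	const struct sketch_wdt_variant *variant;
	int last_pinconf;	/* sketch-only: records which hook ran */
};

/* Placeholder IDs for illustration only. */
#define SKETCH_ID_A 0x0001
#define SKETCH_ID_B 0x0002

static void sketch_pinconf_a(struct sketch_wdt *wd) { wd->last_pinconf = 'A'; }
static void sketch_pinconf_b(struct sketch_wdt *wd) { wd->last_pinconf = 'B'; }

static const struct sketch_wdt_variant sketch_variants[] = {
	{ SKETCH_ID_A, sketch_pinconf_a },
	{ SKETCH_ID_B, sketch_pinconf_b },
};

/* A single lookup replaces "ID -> enum" and "enum -> pinconf" switches. */
static const struct sketch_wdt_variant *sketch_find_variant(uint16_t id)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_variants) / sizeof(sketch_variants[0]); i++)
		if (sketch_variants[i].id == id)
			return &sketch_variants[i];
	return NULL;
}

/* Returns 0 on success, -1 if the chip ID is not in the table. */
static int sketch_probe(struct sketch_wdt *wd, uint16_t id)
{
	wd->variant = sketch_find_variant(id);
	if (!wd->variant)
		return -1;
	wd->variant->pinconf(wd);
	return 0;
}

/* Usage example: probing a known chip runs its own pinconf hook. */
static void sketch_variant_selfcheck(void)
{
	struct sketch_wdt wd = { NULL, 0 };

	assert(sketch_probe(&wd, SKETCH_ID_B) == 0);
	assert(wd.last_pinconf == 'B');
}
```

Supporting another chip then means adding one table entry rather than
extending two switch statements.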

Suggested-by: Guenter Roeck 
Signed-off-by: Ahmad Fatoum 
---
 drivers/watchdog/f71808e_wdt.c | 377 +++--
 1 file changed, 215 insertions(+), 162 deletions(-)

diff --git a/drivers/watchdog/f71808e_wdt.c b/drivers/watchdog/f71808e_wdt.c
index f60beec1bbae..4ff7a2509125 100644
--- a/drivers/watchdog/f71808e_wdt.c
+++ b/drivers/watchdog/f71808e_wdt.c
@@ -9,12 +9,15 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -110,22 +113,6 @@ module_param(start_withtimeout, uint, 0);
 MODULE_PARM_DESC(start_withtimeout, "Start watchdog timer on module load with"
" given initial timeout. Zero (default) disables this feature.");
 
-enum chips { f71808fg, f71858fg, f71862fg, f71868, f71869, f71882fg, f71889fg,
-f81803, f81865, f81866};
-
-static const char *f71808e_names[] = {
-   "f71808fg",
-   "f71858fg",
-   "f71862fg",
-   "f71868",
-   "f71869",
-   "f71882fg",
-   "f71889fg",
-   "f81803",
-   "f81865",
-   "f81866",
-};
-
 /* Super-I/O Function prototypes */
 static inline int superio_inb(int base, int reg);
 static inline int superio_inw(int base, int reg);
@@ -136,9 +123,17 @@ static inline int superio_enter(int base);
 static inline void superio_select(int base, int ld);
 static inline void superio_exit(int base);
 
-struct watchdog_data {
+struct fintek_wdt;
+
+struct fintek_wdt_variant {
+   u16 id;
+   void (*pinconf)(struct fintek_wdt *wd);
+   const char *identity_override;
+};
+
+struct fintek_wdt {
unsigned short  sioaddr;
-   enum chips  type;
+   const struct fintek_wdt_variant *variant;
unsigned long   opened;
struct mutex    lock;
char            expect_close;
@@ -152,10 +147,15 @@ struct watchdog_data {
char            caused_reboot;  /* last reboot was by the watchdog */
 };
 
-static struct watchdog_data watchdog = {
+static struct fintek_wdt watchdog = {
.lock = __MUTEX_INITIALIZER(watchdog.lock),
 };
 
+static inline bool has_f81865_wdo_conf(struct fintek_wdt *wd)
+{
+   return wd->variant->id == SIO_F81865_ID || wd->variant->id == 
SIO_F81866_ID;
+}
+
 /* Super I/O functions */
 static inline int superio_inb(int base, int reg)
 {
@@ -247,7 +247,7 @@ static int watchdog_set_pulse_width(unsigned int pw)
int err = 0;
unsigned int t1 = 25, t2 = 125, t3 = 5000;
 
-   if (watchdog.type == f71868) {
+   if (watchdog.variant->id == SIO_F71868_ID) {
t1 = 30;
t2 = 150;
t3 = 6000;
@@ -309,7 +309,6 @@ static int watchdog_keepalive(void)
 static int watchdog_start(void)
 {
int err;
-   u8 tmp;
 
/* Make sure we don't die as soon as the watchdog is enabled below */
err = watchdog_keepalive();
@@ -323,81 +322,12 @@ static int watchdog_start(void)
superio_select(watchdog.sioaddr, SIO_F71808FG_LD_WDT);
 
/* Watchdog pin configuration */
-   switch (watchdog.type) {
-   case f71808fg:
-   /* Set pin 21 to GPIO23/WDTRST#, then to WDTRST# */
-   superio_clear_bit(watchdog.sioaddr, SIO_REG_MFUNCT2, 3);
-   superio_clear_bit(watchdog.sioaddr, SIO_REG_MFUNCT3, 3);
-   break;
-
-   case f71862fg:
-   if (f71862fg_pin == 63) {
-   /* SPI must be disabled first to use this pin! */
-   superio_clear_bit(watchdog.sioaddr, 
SIO_REG_ROM_ADDR_SEL, 6);
-   superio_set_bit(watchdog.sioaddr, SIO_REG_MFUNCT3, 4);
-   } else if (f71862fg_pin == 56) {
-   superio_set_bit(watchdog.sioaddr, SIO_REG_MFUNCT1, 1);
-   }
-   break;
-
-   case f71868:
-   case f71869:
-   /* GPIO14 --> WDTRST# */
-   superio_clear_bit(watchdog.sioaddr, SIO_REG_MFUNCT1, 4);
-   break;
-
-   case f71882fg:
-   /* Set pin 56 to WDTRST# */
-   superio_set_bit(watchdog.sioaddr, SIO_REG_MFUNCT1, 1);
-   break;
-
-   case f71889fg:
-   /* set pin 40 to WDTRST# */
-   superio_outb(watchdog.sioaddr, SIO_REG_MFUNCT3,
-

[RFCv2 04/16] x86/kvm: Use bounce buffers for KVM memory protection

2020-10-19 Thread Kirill A. Shutemov

Mirror SEV, use SWIOTLB always if KVM memory protection is enabled.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/Kconfig |  1 +
 arch/x86/kernel/kvm.c|  2 ++
 arch/x86/kernel/pci-swiotlb.c|  3 ++-
 arch/x86/mm/mem_encrypt.c| 21 -
 arch/x86/mm/mem_encrypt_common.c | 23 +++
 5 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cd272e3babbc..b22b95517437 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -806,6 +806,7 @@ config KVM_GUEST
select ARCH_CPUIDLE_HALTPOLL
select X86_HV_CALLBACK_VECTOR
select X86_MEM_ENCRYPT_COMMON
+   select SWIOTLB
default y
help
  This option enables various optimizations for running under the KVM
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 2c1f8952b92a..30bb3d2d6ccd 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -759,6 +760,7 @@ static void __init kvm_init_platform(void)
pr_info("KVM memory protection enabled\n");
mem_protected = true;
setup_force_cpu_cap(X86_FEATURE_KVM_MEM_PROTECTED);
+   swiotlb_force = SWIOTLB_FORCE;
}
 }
 
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e7c152..814060a6ceb0 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 int swiotlb __read_mostly;
 
@@ -49,7 +50,7 @@ int __init pci_swiotlb_detect_4gb(void)
 * buffers are allocated and used for devices that do not support
 * the addressing range required for the encryption mask.
 */
-   if (sme_active())
+   if (sme_active() || kvm_mem_protected())
swiotlb = 1;
 
return swiotlb;
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 4dbdc9dac36b..5de64e068b0a 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -369,24 +369,3 @@ void __init mem_encrypt_free_decrypted_mem(void)
 
free_init_pages("unused decrypted", vaddr, vaddr_end);
 }
-
-/* Architecture __weak replacement functions */
-void __init mem_encrypt_init(void)
-{
-   if (!sme_me_mask)
-   return;
-
-   /* Call into SWIOTLB to update the SWIOTLB DMA buffers */
-   swiotlb_update_mem_attributes();
-
-   /*
-* With SEV, we need to unroll the rep string I/O instructions.
-*/
-   if (sev_active())
-   static_branch_enable(&sev_enable_key);
-
-   pr_info("AMD %s active\n",
-   sev_active() ? "Secure Encrypted Virtualization (SEV)"
-: "Secure Memory Encryption (SME)");
-}
-
diff --git a/arch/x86/mm/mem_encrypt_common.c b/arch/x86/mm/mem_encrypt_common.c
index a878e7f246d5..7900f3788010 100644
--- a/arch/x86/mm/mem_encrypt_common.c
+++ b/arch/x86/mm/mem_encrypt_common.c
@@ -37,3 +37,26 @@ bool force_dma_unencrypted(struct device *dev)
 
return false;
 }
+
+void __init mem_encrypt_init(void)
+{
+   if (!sme_me_mask && !kvm_mem_protected())
+   return;
+
+   /* Call into SWIOTLB to update the SWIOTLB DMA buffers */
+   swiotlb_update_mem_attributes();
+
+   /*
+* With SEV, we need to unroll the rep string I/O instructions.
+*/
+   if (sev_active())
+   static_branch_enable(&sev_enable_key);
+
+   if (sme_me_mask) {
+   pr_info("AMD %s active\n",
+   sev_active() ? "Secure Encrypted Virtualization (SEV)"
+   : "Secure Memory Encryption (SME)");
+   } else {
+   pr_info("KVM memory protection enabled\n");
+   }
+}
-- 
2.26.2

[RFCv2 05/16] x86/kvm: Make VirtIO use DMA API in KVM guest

2020-10-19 Thread Kirill A. Shutemov

VirtIO is the primary way to provide I/O for KVM guests. All memory that is
used for communication with the host has to be marked as shared.

The easiest way to achieve that is to use the DMA API, which already knows
how to deal with shared memory.

Signed-off-by: Kirill A. Shutemov 
---
 drivers/virtio/virtio_ring.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index becc77697960..ace733845d5d 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef DEBUG
 /* For development, we want to crash whenever the ring is screwed. */
@@ -255,6 +256,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
if (xen_domain())
return true;
 
+   if (kvm_mem_protected())
+   return true;
+
return false;
 }
 
-- 
2.26.2

[RFCv2 06/16] x86/kvmclock: Share hvclock memory with the host

2020-10-19 Thread Kirill A. Shutemov

hvclock is shared between the guest and the hypervisor. It has to be
accessible by the host.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/kernel/kvmclock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 34b18f6eeb2c..ac6c2abe0d0f 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -253,7 +253,7 @@ static void __init kvmclock_init_mem(void)
 * hvclock is shared between the guest and the hypervisor, must
 * be mapped decrypted.
 */
-   if (sev_active()) {
+   if (sev_active() || kvm_mem_protected()) {
r = set_memory_decrypted((unsigned long) hvclock_mem,
 1UL << order);
if (r) {
-- 
2.26.2

[RFCv2 08/16] KVM: Use GUP instead of copy_from/to_user() to access guest memory

2020-10-19 Thread Kirill A. Shutemov

Add new helpers copy_from_guest()/copy_to_guest() to be used if the KVM
memory protection feature is enabled.
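
As a rough illustration of the page-by-page copy these helpers perform, here
is a userspace sketch under stated assumptions. It is not the kernel code:
an array of buffers stands in for pages obtained through GUP, the page size
is a tiny 16 bytes for demonstration, and all sketch_* names are
hypothetical. The sketch advances the destination pointer after every
segment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 16	/* tiny "page" so the example stays readable */

/* Same shape as next_segment(): bytes that fit before the page ends. */
static int sketch_next_segment(unsigned long len, int offset)
{
	if (len > (unsigned long)(SKETCH_PAGE_SIZE - offset))
		return SKETCH_PAGE_SIZE - offset;
	return (int)len;
}

/*
 * Copy len bytes starting at byte address addr out of an array of pages.
 * Returns 0 on success, -1 if a needed page does not exist (the stand-in
 * for a failed page lookup).
 */
static int sketch_copy_from_pages(void *data, unsigned char *const *pages,
				  size_t npages, unsigned long addr, int len)
{
	unsigned char *dst = data;
	int offset = (int)(addr % SKETCH_PAGE_SIZE);
	int seg;

	while ((seg = sketch_next_segment((unsigned long)len, offset)) != 0) {
		unsigned long idx = addr / SKETCH_PAGE_SIZE;

		if (idx >= npages)
			return -1;
		memcpy(dst, pages[idx] + offset, (size_t)seg);
		dst += seg;		/* move on in the destination buffer */
		len -= seg;
		addr += (unsigned long)seg;
		offset = 0;		/* later segments start at a page boundary */
	}
	return 0;
}

/* Usage example: an 8-byte read that straddles two pages. */
static void sketch_copy_selfcheck(void)
{
	unsigned char p0[SKETCH_PAGE_SIZE], p1[SKETCH_PAGE_SIZE];
	unsigned char *pages[] = { p0, p1 };
	unsigned char out[8];
	int i;

	for (i = 0; i < SKETCH_PAGE_SIZE; i++) {
		p0[i] = (unsigned char)i;
		p1[i] = (unsigned char)(SKETCH_PAGE_SIZE + i);
	}
	assert(sketch_copy_from_pages(out, pages, 2, 12, 8) == 0);
	for (i = 0; i < 8; i++)
		assert(out[i] == 12 + i);
}
```

The first segment covers the tail of the first page and every following
segment starts at offset 0, which is why offset is reset inside the loop.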

Signed-off-by: Kirill A. Shutemov 
---
 include/linux/kvm_host.h |  4 ++
 virt/kvm/kvm_main.c  | 90 +++-
 2 files changed, 75 insertions(+), 19 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 05e3c2fb3ef7..380a64613880 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -504,6 +504,7 @@ struct kvm {
struct srcu_struct irq_srcu;
pid_t userspace_pid;
unsigned int max_halt_poll_ns;
+   bool mem_protected;
 };
 
 #define kvm_err(fmt, ...) \
@@ -728,6 +729,9 @@ void kvm_set_pfn_dirty(kvm_pfn_t pfn);
 void kvm_set_pfn_accessed(kvm_pfn_t pfn);
 void kvm_get_pfn(kvm_pfn_t pfn);
 
+int copy_from_guest(void *data, unsigned long hva, int len, bool protected);
+int copy_to_guest(unsigned long hva, const void *data, int len, bool 
protected);
+
 void kvm_release_pfn(kvm_pfn_t pfn, bool dirty, struct gfn_to_pfn_cache 
*cache);
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
int len);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cf88233b819a..a9884cb8c867 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2313,19 +2313,70 @@ static int next_segment(unsigned long len, int offset)
return len;
 }
 
+int copy_from_guest(void *data, unsigned long hva, int len, bool protected)
+{
+   int offset = offset_in_page(hva);
+   struct page *page;
+   int npages, seg;
+
+   if (!protected)
+   return __copy_from_user(data, (void __user *)hva, len);
+
+   might_fault();
+   kasan_check_write(data, len);
+   check_object_size(data, len, false);
+
+   while ((seg = next_segment(len, offset)) != 0) {
+   npages = get_user_pages_unlocked(hva, 1, &page, 0);
+   if (npages != 1)
+   return -EFAULT;
+   memcpy(data, page_address(page) + offset, seg);
+   put_page(page);
+   len -= seg;
+   hva += seg;
+   offset = 0;
+   }
+
+   return 0;
+}
+
+int copy_to_guest(unsigned long hva, const void *data, int len, bool protected)
+{
+   int offset = offset_in_page(hva);
+   struct page *page;
+   int npages, seg;
+
+   if (!protected)
+   return __copy_to_user((void __user *)hva, data, len);
+
+   might_fault();
+   kasan_check_read(data, len);
+   check_object_size(data, len, true);
+
+   while ((seg = next_segment(len, offset)) != 0) {
+   npages = get_user_pages_unlocked(hva, 1, &page, FOLL_WRITE);
+   if (npages != 1)
+   return -EFAULT;
+   memcpy(page_address(page) + offset, data, seg);
+   put_page(page);
+   len -= seg;
+   hva += seg;
+   offset = 0;
+   }
+
+   return 0;
+}
+
 static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
-void *data, int offset, int len)
+void *data, int offset, int len,
+bool protected)
 {
-   int r;
unsigned long addr;
 
addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
if (kvm_is_error_hva(addr))
return -EFAULT;
-   r = __copy_from_user(data, (void __user *)addr + offset, len);
-   if (r)
-   return -EFAULT;
-   return 0;
+   return copy_from_guest(data, addr + offset, len, protected);
 }
 
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
@@ -2333,7 +2384,8 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void 
*data, int offset,
 {
struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
 
-   return __kvm_read_guest_page(slot, gfn, data, offset, len);
+   return __kvm_read_guest_page(slot, gfn, data, offset, len,
+kvm->mem_protected);
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_page);
 
@@ -2342,7 +2394,8 @@ int kvm_vcpu_read_guest_page(struct kvm_vcpu *vcpu, gfn_t 
gfn, void *data,
 {
struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 
-   return __kvm_read_guest_page(slot, gfn, data, offset, len);
+   return __kvm_read_guest_page(slot, gfn, data, offset, len,
+vcpu->kvm->mem_protected);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_page);
 
@@ -2415,7 +2468,8 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, 
gpa_t gpa,
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
 
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
- const void *data, int offset, int len)
+ const void *data, int offset, int len,
+ bool protected)
 {
i

[PATCH v2 2/2] watchdog: f71808e_wdt: migrate to new kernel watchdog API

2020-10-19 Thread Ahmad Fatoum

Migrating the driver lets us drop the watchdog misc device boilerplate
and reduces its size by about 280 lines. It also brings us support for
new functionality like CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED.

While at it, also rename all local identifiers starting with watchdog_
to start with fintek_wdt_ instead, which is easier to tell apart.

This incurs a slight backwards-compatibility break, because the new
kernel watchdog API doesn't support unloading modules for drivers
whose watchdog hardware is reported to be running.

This means the following scenario will no longer be supported:
 - BIOS has enabled watchdog
 - Module is loaded and unloaded without opening watchdog
 - module_exit is expected to succeed and disable watchdog HW

Signed-off-by: Ahmad Fatoum 
---
 drivers/watchdog/Kconfig   |   1 +
 drivers/watchdog/f71808e_wdt.c | 514 -
 2 files changed, 115 insertions(+), 400 deletions(-)

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index ab7aad5a1e69..81f9817b291c 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -1066,6 +1066,7 @@ config EBC_C384_WDT
 config F71808E_WDT
tristate "Fintek F718xx, F818xx Super I/O Watchdog"
depends on X86
+   select WATCHDOG_CORE
help
  This is the driver for the hardware watchdog on the Fintek F71808E,
  F71862FG, F71868, F71869, F71882FG, F71889FG, F81803, F81865, and
diff --git a/drivers/watchdog/f71808e_wdt.c b/drivers/watchdog/f71808e_wdt.c
index 4ff7a2509125..32e759356354 100644
--- a/drivers/watchdog/f71808e_wdt.c
+++ b/drivers/watchdog/f71808e_wdt.c
@@ -3,6 +3,7 @@
  *   Copyright (C) 2006 by Hans Edgington   *
  *   Copyright (C) 2007-2009 Hans de Goede*
  *   Copyright (C) 2010 Giel van Schijndel   *
+ *   Copyright (C) 2020 Ahmad Fatoum  *
  * *
  ***/
 
@@ -10,18 +11,12 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-#include 
-#include 
-#include 
-#include 
 #include 
 
 #define DRVNAME "f71808e_wdt"
@@ -132,23 +127,15 @@ struct fintek_wdt_variant {
 };
 
 struct fintek_wdt {
+   struct watchdog_device wdd;
unsigned short  sioaddr;
const struct fintek_wdt_variant *variant;
-   unsigned long   opened;
-   struct mutex    lock;
-   char            expect_close;
struct watchdog_info ident;
 
-   unsigned short  timeout;
u8  timer_val;  /* content for the wd_time register */
char            minutes_mode;
u8  pulse_val;  /* pulse width flag */
char            pulse_mode; /* enable pulse output mode? */
-   char            caused_reboot;  /* last reboot was by the watchdog */
-};
-
-static struct fintek_wdt watchdog = {
-   .lock = __MUTEX_INITIALIZER(watchdog.lock),
 };
 
 static inline bool has_f81865_wdo_conf(struct fintek_wdt *wd)
@@ -218,505 +205,244 @@ static inline void superio_exit(int base)
release_region(base, 2);
 }
 
-static int watchdog_set_timeout(int timeout)
+static int fintek_wdt_set_timeout(struct watchdog_device *wdd, unsigned int 
timeout)
 {
-   if (timeout <= 0
-|| timeout >  max_timeout) {
-   pr_err("watchdog timeout out of range\n");
-   return -EINVAL;
-   }
+   struct fintek_wdt *wd = watchdog_get_drvdata(wdd);
 
-   mutex_lock(&watchdog.lock);
-
-   watchdog.timeout = timeout;
+   wdd->timeout = timeout;
if (timeout > 0xff) {
-   watchdog.timer_val = DIV_ROUND_UP(timeout, 60);
-   watchdog.minutes_mode = true;
+   wd->timer_val = DIV_ROUND_UP(timeout, 60);
+   wd->minutes_mode = true;
} else {
-   watchdog.timer_val = timeout;
-   watchdog.minutes_mode = false;
+   wd->timer_val = timeout;
+   wd->minutes_mode = false;
}
 
-   mutex_unlock(&watchdog.lock);
-
return 0;
 }
 
-static int watchdog_set_pulse_width(unsigned int pw)
+static int fintek_wdt_set_pulse_width(struct fintek_wdt *wd, unsigned int pw)
 {
-   int err = 0;
unsigned int t1 = 25, t2 = 125, t3 = 5000;
 
-   if (watchdog.variant->id == SIO_F71868_ID) {
+   if (wd->variant->id == SIO_F71868_ID) {
t1 = 30;
t2 = 150;
t3 = 6000;
}
 
-   mutex_lock(&watchdog.lock);
-
if(pw <=  1) {
-   watchdog.pulse_val = 0;
+   wd->pulse_val = 0;
} else if (pw <= t1) {
-   watchdog.pulse_val = 1;
+   wd->pulse_val = 1;
} else if (pw <= t2) {
-   watchdog.pulse_val = 2;
+   wd->pulse_val = 2;
} else if (pw <= t3) {
-   watchdog.

[RFCv2 02/16] x86/kvm: Introduce KVM memory protection feature

2020-10-19 Thread Kirill A. Shutemov

Provide basic helpers, KVM_FEATURE, CPUID flag and a hypercall.

The host side doesn't provide the feature yet, so it is dead code for now.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/include/asm/cpufeatures.h   |  1 +
 arch/x86/include/asm/kvm_para.h  |  5 +
 arch/x86/include/uapi/asm/kvm_para.h |  3 ++-
 arch/x86/kernel/kvm.c| 18 ++
 include/uapi/linux/kvm_para.h|  3 ++-
 5 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index 2901d5df4366..a72157137764 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -236,6 +236,7 @@
 #define X86_FEATURE_EPT_AD ( 8*32+17) /* Intel Extended Page Table 
access-dirty bit */
 #define X86_FEATURE_VMCALL ( 8*32+18) /* "" Hypervisor supports 
the VMCALL instruction */
 #define X86_FEATURE_VMW_VMMCALL( 8*32+19) /* "" VMware prefers 
VMMCALL hypercall instruction */
+#define X86_FEATURE_KVM_MEM_PROTECTED  ( 8*32+20) /* KVM memory protection 
extension */
 
 /* Intel-defined CPU features, CPUID level 0x0007:0 (EBX), word 9 */
 #define X86_FEATURE_FSGSBASE   ( 9*32+ 0) /* RDFSBASE, WRFSBASE, 
RDGSBASE, WRGSBASE instructions*/
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 338119852512..74aea18f3130 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -11,11 +11,16 @@ extern void kvmclock_init(void);
 
 #ifdef CONFIG_KVM_GUEST
 bool kvm_check_and_clear_guest_paused(void);
+bool kvm_mem_protected(void);
 #else
 static inline bool kvm_check_and_clear_guest_paused(void)
 {
return false;
 }
+static inline bool kvm_mem_protected(void)
+{
+   return false;
+}
 #endif /* CONFIG_KVM_GUEST */
 
 #define KVM_HYPERCALL \
diff --git a/arch/x86/include/uapi/asm/kvm_para.h 
b/arch/x86/include/uapi/asm/kvm_para.h
index 812e9b4c1114..defbfc630a9f 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -28,10 +28,11 @@
 #define KVM_FEATURE_PV_UNHALT  7
 #define KVM_FEATURE_PV_TLB_FLUSH   9
 #define KVM_FEATURE_ASYNC_PF_VMEXIT    10
-#define KVM_FEATURE_PV_SEND_IPI        11
+#define KVM_FEATURE_PV_SEND_IPI        11
 #define KVM_FEATURE_POLL_CONTROL   12
 #define KVM_FEATURE_PV_SCHED_YIELD 13
 #define KVM_FEATURE_ASYNC_PF_INT   14
+#define KVM_FEATURE_MEM_PROTECTED  15
 
 #define KVM_HINTS_REALTIME  0
 
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 9663ba31347c..2c1f8952b92a 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -37,6 +37,13 @@
 #include 
 #include 
 
+static bool mem_protected;
+
+bool kvm_mem_protected(void)
+{
+   return mem_protected;
+}
+
 DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
 
 static int kvmapf = 1;
@@ -742,6 +749,17 @@ static void __init kvm_init_platform(void)
 {
kvmclock_init();
x86_platform.apic_post_init = kvm_apic_init;
+
+   if (kvm_para_has_feature(KVM_FEATURE_MEM_PROTECTED)) {
+   if (kvm_hypercall0(KVM_HC_ENABLE_MEM_PROTECTED)) {
+   pr_err("Failed to enable KVM memory protection\n");
+   return;
+   }
+
+   pr_info("KVM memory protection enabled\n");
+   mem_protected = true;
+   setup_force_cpu_cap(X86_FEATURE_KVM_MEM_PROTECTED);
+   }
 }
 
 const __initconst struct hypervisor_x86 x86_hyper_kvm = {
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 8b86609849b9..1a216f32e572 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -27,8 +27,9 @@
 #define KVM_HC_MIPS_EXIT_VM7
 #define KVM_HC_MIPS_CONSOLE_OUTPUT 8
 #define KVM_HC_CLOCK_PAIRING   9
-#define KVM_HC_SEND_IPI                 10
+#define KVM_HC_SEND_IPI                 10
 #define KVM_HC_SCHED_YIELD              11
+#define KVM_HC_ENABLE_MEM_PROTECTED     12
 
 /*
  * hypercalls use architecture specific
-- 
2.26.2

[RFCv2 13/16] KVM: Rework copy_to/from_guest() to avoid direct mapping

2020-10-19 Thread Kirill A. Shutemov

We are going to unmap guest pages from the direct mapping and cannot rely
on it for guest memory access. Use a temporary kmap_atomic()-style mapping
to access guest memory.

Signed-off-by: Kirill A. Shutemov 
---
 virt/kvm/kvm_main.c  |  27 ++-
 virt/lib/mem_protected.c | 101 +++
 2 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4c008c7b4974..9b569b78874a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -51,6 +51,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -154,6 +155,12 @@ static void kvm_uevent_notify_change(unsigned int type, 
struct kvm *kvm);
 static unsigned long long kvm_createvm_count;
 static unsigned long long kvm_active_vms;
 
+void *kvm_map_page_atomic(struct page *page);
+void kvm_unmap_page_atomic(void *vaddr);
+
+int kvm_init_protected_memory(void);
+void kvm_exit_protected_memory(void);
+
 int __kvm_protect_memory(unsigned long start, unsigned long end, bool protect);
 
 __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
@@ -2329,6 +2336,7 @@ int copy_from_guest(void *data, unsigned long hva, int 
len, bool protected)
int offset = offset_in_page(hva);
struct page *page;
int npages, seg;
+   void *vaddr;
 
if (!protected)
return __copy_from_user(data, (void __user *)hva, len);
@@ -2341,7 +2349,11 @@ int copy_from_guest(void *data, unsigned long hva, int 
len, bool protected)
npages = get_user_pages_unlocked(hva, 1, &page, FOLL_KVM);
if (npages != 1)
return -EFAULT;
-   memcpy(data, page_address(page) + offset, seg);
+
+   vaddr = kvm_map_page_atomic(page);
+   memcpy(data, vaddr + offset, seg);
+   kvm_unmap_page_atomic(vaddr);
+
put_page(page);
len -= seg;
hva += seg;
@@ -2356,6 +2368,7 @@ int copy_to_guest(unsigned long hva, const void *data, 
int len, bool protected)
int offset = offset_in_page(hva);
struct page *page;
int npages, seg;
+   void *vaddr;
 
if (!protected)
return __copy_to_user((void __user *)hva, data, len);
@@ -2369,7 +2382,11 @@ int copy_to_guest(unsigned long hva, const void *data, 
int len, bool protected)
 FOLL_WRITE | FOLL_KVM);
if (npages != 1)
return -EFAULT;
-   memcpy(page_address(page) + offset, data, seg);
+
+   vaddr = kvm_map_page_atomic(page);
+   memcpy(vaddr + offset, data, seg);
+   kvm_unmap_page_atomic(vaddr);
+
put_page(page);
len -= seg;
hva += seg;
@@ -4945,6 +4962,10 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned 
vcpu_align,
if (r)
goto out_free;
 
+   if (IS_ENABLED(CONFIG_HAVE_KVM_PROTECTED_MEMORY) &&
+   kvm_init_protected_memory())
+   goto out_unreg;
+
kvm_chardev_ops.owner = module;
kvm_vm_fops.owner = module;
kvm_vcpu_fops.owner = module;
@@ -4968,6 +4989,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned 
vcpu_align,
return 0;
 
 out_unreg:
+   kvm_exit_protected_memory();
kvm_async_pf_deinit();
 out_free:
kmem_cache_destroy(kvm_vcpu_cache);
@@ -4989,6 +5011,7 @@ EXPORT_SYMBOL_GPL(kvm_init);
 
 void kvm_exit(void)
 {
+   kvm_exit_protected_memory();
debugfs_remove_recursive(kvm_debugfs_dir);
misc_deregister(&kvm_dev);
kmem_cache_destroy(kvm_vcpu_cache);
diff --git a/virt/lib/mem_protected.c b/virt/lib/mem_protected.c
index 0b01dd74f29c..1dfe82534242 100644
--- a/virt/lib/mem_protected.c
+++ b/virt/lib/mem_protected.c
@@ -5,6 +5,100 @@
 #include 
 #include 
 
+static pte_t **guest_map_ptes;
+static struct vm_struct *guest_map_area;
+
+void *kvm_map_page_atomic(struct page *page)
+{
+   pte_t *pte;
+   void *vaddr;
+
+   preempt_disable();
+   pte = guest_map_ptes[smp_processor_id()];
+   vaddr = guest_map_area->addr + smp_processor_id() * PAGE_SIZE;
+   set_pte(pte, mk_pte(page, PAGE_KERNEL));
+   return vaddr;
+}
+EXPORT_SYMBOL_GPL(kvm_map_page_atomic);
+
+void kvm_unmap_page_atomic(void *vaddr)
+{
+   pte_t *pte = guest_map_ptes[smp_processor_id()];
+   set_pte(pte, __pte(0));
+   flush_tlb_one_kernel((unsigned long)vaddr);
+   preempt_enable();
+}
+EXPORT_SYMBOL_GPL(kvm_unmap_page_atomic);
+
+int kvm_init_protected_memory(void)
+{
+   guest_map_ptes = kmalloc_array(num_possible_cpus(),
+  sizeof(pte_t *), GFP_KERNEL);
+   if (!guest_map_ptes)
+   return -ENOMEM;
+
+   guest_map_area = alloc_vm_area(PAGE_SIZE * num_possible_cpus(),
+  guest_map_ptes);
+   if

[RFCv2 07/16] x86/realmode: Share trampoline area if KVM memory protection enabled

2020-10-19 Thread Kirill A. Shutemov

If KVM memory protection is active, the trampoline area will need to be
in shared memory.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/realmode/init.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 1ed1208931e0..7392940a7f96 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct real_mode_header *real_mode_header;
 u32 *trampoline_cr4_features;
@@ -55,11 +56,11 @@ static void __init setup_real_mode(void)
base = (unsigned char *)real_mode_header;
 
/*
-* If SME is active, the trampoline area will need to be in
-* decrypted memory in order to bring up other processors
+* If SME or KVM memory protection is active, the trampoline area will
+* need to be in decrypted memory in order to bring up other processors
 * successfully. This is not needed for SEV.
 */
-   if (sme_active())
+   if (sme_active() || kvm_mem_protected())
set_memory_decrypted((unsigned long)base, size >> PAGE_SHIFT);
 
memcpy(base, real_mode_blob, size);
-- 
2.26.2

[RFCv2 10/16] KVM: x86: Use GUP for page walk instead of __get_user()

2020-10-19 Thread Kirill A. Shutemov

The user mapping doesn't have the page mapping for protected memory.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/kvm/mmu/paging_tmpl.h | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 4dd6b1e5b8cf..258a6361b9b2 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -397,8 +397,14 @@ static int FNAME(walk_addr_generic)(struct guest_walker 
*walker,
goto error;
 
ptep_user = (pt_element_t __user *)((void *)host_addr + offset);
-   if (unlikely(__get_user(pte, ptep_user)))
-   goto error;
+   if (vcpu->kvm->mem_protected) {
+   if (copy_from_guest(&pte, host_addr + offset,
+   sizeof(pte), true))
+   goto error;
+   } else {
+   if (unlikely(__get_user(pte, ptep_user)))
+   goto error;
+   }
walker->ptep_user[walker->level - 1] = ptep_user;
 
trace_kvm_mmu_paging_element(pte, walker->level);
-- 
2.26.2

[RFCv2 09/16] KVM: mm: Introduce VM_KVM_PROTECTED

2020-10-19 Thread Kirill A. Shutemov

The new VMA flag indicates a VMA that is not accessible to userspace but
is usable by the kernel with GUP if FOLL_KVM is specified.

FOLL_KVM is only used in the KVM code. The code has to know how to
deal with such pages.

Signed-off-by: Kirill A. Shutemov 
---
 include/linux/mm.h  |  8 
 mm/gup.c| 20 
 mm/huge_memory.c| 20 
 mm/memory.c |  3 +++
 mm/mmap.c   |  3 +++
 virt/kvm/async_pf.c |  2 +-
 virt/kvm/kvm_main.c |  9 +
 7 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 16b799a0522c..c8d8cdcbc425 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -342,6 +342,8 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPYVM_ARCH_1   /* T if mapped copy of data 
(nommu mmap) */
 #endif
 
+#define VM_KVM_PROTECTED 0
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
 #endif
@@ -658,6 +660,11 @@ static inline bool vma_is_accessible(struct vm_area_struct 
*vma)
return vma->vm_flags & VM_ACCESS_FLAGS;
 }
 
+static inline bool vma_is_kvm_protected(struct vm_area_struct *vma)
+{
+   return vma->vm_flags & VM_KVM_PROTECTED;
+}
+
 #ifdef CONFIG_SHMEM
 /*
  * The vma_is_shmem is not inline because it is used only by slow
@@ -2766,6 +2773,7 @@ struct page *follow_page(struct vm_area_struct *vma, 
unsigned long address,
 #define FOLL_SPLIT_PMD 0x2 /* split huge pmd before returning */
 #define FOLL_PIN   0x4 /* pages must be released via unpin_user_page */
 #define FOLL_FAST_ONLY 0x8 /* gup_fast: prevent fall-back to slow gup */
+#define FOLL_KVM   0x10 /* access to VM_KVM_PROTECTED VMAs */
 
 /*
  * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
diff --git a/mm/gup.c b/mm/gup.c
index e869c634cc9a..accf6db0c06f 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -384,10 +384,19 @@ static int follow_pfn_pte(struct vm_area_struct *vma, 
unsigned long address,
  * FOLL_FORCE can write to even unwritable pte's, but only
  * after we've gone through a COW cycle and they are dirty.
  */
-static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
+static inline bool can_follow_write_pte(struct vm_area_struct *vma,
+   pte_t pte, unsigned int flags)
 {
-   return pte_write(pte) ||
-   ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
+   if (pte_write(pte))
+   return true;
+
+   if ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte))
+   return true;
+
+   if (!vma_is_kvm_protected(vma) || !(vma->vm_flags & VM_WRITE))
+   return false;
+
+   return (vma->vm_flags & VM_SHARED) || page_mapcount(pte_page(pte)) == 1;
 }
 
 static struct page *follow_page_pte(struct vm_area_struct *vma,
@@ -430,7 +439,7 @@ static struct page *follow_page_pte(struct vm_area_struct 
*vma,
}
if ((flags & FOLL_NUMA) && pte_protnone(pte))
goto no_page;
-   if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags)) {
+   if ((flags & FOLL_WRITE) && !can_follow_write_pte(vma, pte, flags)) {
pte_unmap_unlock(ptep, ptl);
return NULL;
}
@@ -750,6 +759,9 @@ static struct page *follow_page_mask(struct vm_area_struct 
*vma,
 
ctx->page_mask = 0;
 
+   if (vma_is_kvm_protected(vma) && (flags & FOLL_KVM))
+   flags &= ~FOLL_NUMA;
+
/* make this handle hugepd */
page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
if (!IS_ERR(page)) {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index da397779a6d4..ec8cf9a40cfd 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1322,10 +1322,19 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, 
pmd_t orig_pmd)
  * FOLL_FORCE can write to even unwritable pmd's, but only
  * after we've gone through a COW cycle and they are dirty.
  */
-static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags)
+static inline bool can_follow_write_pmd(struct vm_area_struct *vma,
+   pmd_t pmd, unsigned int flags)
 {
-   return pmd_write(pmd) ||
-  ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd));
+   if (pmd_write(pmd))
+   return true;
+
+   if ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd))
+   return true;
+
+   if (!vma_is_kvm_protected(vma) || !(vma->vm_flags & VM_WRITE))
+   return false;
+
+   return (vma->vm_flags & VM_SHARED) || page_mapcount(pmd_page(pmd)) == 1;
 }
 
 struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
@@ -1338,7 +1347,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct 
*vma,
 
assert_spin_locked(pmd_lockptr(mm, pmd));
 
-   if (flags & FOLL_WRITE && !can_follow_write_pmd(*pmd, flags))
+   if (flags & FOLL_WRITE && !c

[RFCv2 15/16] KVM: Unmap protected pages from direct mapping

2020-10-19 Thread Kirill A. Shutemov

If the protected memory feature is enabled, unmap guest memory from the
kernel's direct mapping.

Migration and KSM are disabled for protected memory as they would require
special treatment.

Signed-off-by: Kirill A. Shutemov 
---
 include/linux/mm.h   |  3 +++
 mm/huge_memory.c |  8 
 mm/ksm.c |  2 ++
 mm/memory.c  | 12 
 mm/rmap.c|  4 
 virt/lib/mem_protected.c | 21 +
 6 files changed, 50 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ee274d27e764..74efc51e63f0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -671,6 +671,9 @@ static inline bool vma_is_kvm_protected(struct 
vm_area_struct *vma)
return vma->vm_flags & VM_KVM_PROTECTED;
 }
 
+void kvm_map_page(struct page *page, int nr_pages);
+void kvm_unmap_page(struct page *page, int nr_pages);
+
 #ifdef CONFIG_SHMEM
 /*
  * The vma_is_shmem is not inline because it is used only by slow
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ec8cf9a40cfd..40974656cb43 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -627,6 +627,10 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct 
vm_fault *vmf,
spin_unlock(vmf->ptl);
count_vm_event(THP_FAULT_ALLOC);
count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
+
+   /* Unmap page from direct mapping */
+   if (vma_is_kvm_protected(vma))
+   kvm_unmap_page(page, HPAGE_PMD_NR);
}
 
return 0;
@@ -1689,6 +1693,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct 
vm_area_struct *vma,
page_remove_rmap(page, true);
VM_BUG_ON_PAGE(page_mapcount(page) < 0, page);
VM_BUG_ON_PAGE(!PageHead(page), page);
+
+   /* Map the page back to the direct mapping */
+   if (vma_is_kvm_protected(vma))
+   kvm_map_page(page, HPAGE_PMD_NR);
} else if (thp_migration_supported()) {
swp_entry_t entry;
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 9afccc36dbd2..c720e271448f 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -528,6 +528,8 @@ static struct vm_area_struct *find_mergeable_vma(struct 
mm_struct *mm,
return NULL;
if (!(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
return NULL;
+   if (vma_is_kvm_protected(vma))
+   return NULL;
return vma;
 }
 
diff --git a/mm/memory.c b/mm/memory.c
index 2c9756b4e52f..e28bd5f902a7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1245,6 +1245,11 @@ static unsigned long zap_pte_range(struct mmu_gather 
*tlb,
likely(!(vma->vm_flags & VM_SEQ_READ)))
mark_page_accessed(page);
}
+
+   /* Map the page back to the direct mapping */
+   if (vma_is_anonymous(vma) && vma_is_kvm_protected(vma))
+   kvm_map_page(page, 1);
+
rss[mm_counter(page)]--;
page_remove_rmap(page, false);
if (unlikely(page_mapcount(page) < 0))
@@ -3466,6 +3471,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
struct page *page;
vm_fault_t ret = 0;
pte_t entry;
+   bool set = false;
 
/* File mapping without ->vm_ops ? */
if (vma->vm_flags & VM_SHARED)
@@ -3554,6 +3560,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
page_add_new_anon_rmap(page, vma, vmf->address, false);
lru_cache_add_inactive_or_unevictable(page, vma);
+   set = true;
 setpte:
set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
 
@@ -3561,6 +3568,11 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
update_mmu_cache(vma, vmf->address, vmf->pte);
 unlock:
pte_unmap_unlock(vmf->pte, vmf->ptl);
+
+   /* Unmap page from direct mapping */
+   if (vma_is_kvm_protected(vma) && set)
+   kvm_unmap_page(page, 1);
+
return ret;
 release:
put_page(page);
diff --git a/mm/rmap.c b/mm/rmap.c
index 9425260774a1..247548d6d24b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1725,6 +1725,10 @@ static bool try_to_unmap_one(struct page *page, struct 
vm_area_struct *vma,
 
 static bool invalid_migration_vma(struct vm_area_struct *vma, void *arg)
 {
+   /* TODO */
+   if (vma_is_kvm_protected(vma))
+   return true;
+
return vma_is_temporary_stack(vma);
 }
 
diff --git a/virt/lib/mem_protected.c b/virt/lib/mem_protected.c
index 1dfe82534242..9d2ef99285e5 100644
--- a/virt/lib/mem_protected.c
+++ b/virt/lib/mem_protected.c
@@ -30,6 +30,27 @@ void kvm_unmap_page_atomic(void *vaddr)
 }
 EXPORT_SYMBOL_GPL(kvm_unmap_page_atomic);
 
+void kvm_m

[RFCv2 14/16] KVM: Handle protected memory in __kvm_map_gfn()/__kvm_unmap_gfn()

2020-10-19 Thread Kirill A. Shutemov

We cannot access protected pages directly. Use ioremap() to
create a temporary mapping of the page. The mapping is destroyed
on __kvm_unmap_gfn().

The new interface gfn_to_pfn_memslot_protected() is used to detect if
the page is protected.

ioremap_cache_force() is a hack to bypass IORES_MAP_SYSTEM_RAM check in
the x86 ioremap code. We need a better solution.

Signed-off-by: Kirill A. Shutemov 
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c|  2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
 arch/x86/include/asm/io.h  |  2 +
 arch/x86/include/asm/pgtable_types.h   |  1 +
 arch/x86/kvm/mmu/mmu.c |  6 ++-
 arch/x86/mm/ioremap.c  | 16 ++--
 include/linux/kvm_host.h   |  3 +-
 include/linux/kvm_types.h  |  1 +
 virt/kvm/kvm_main.c| 52 +++---
 9 files changed, 63 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 38ea396a23d6..8e06cd3f759c 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -590,7 +590,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_vcpu *vcpu,
} else {
/* Call KVM generic code to do the slow-path check */
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
-  writing, &write_ok);
+  writing, &write_ok, NULL);
if (is_error_noslot_pfn(pfn))
return -EFAULT;
page = NULL;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 22a677b18695..6fd4e3f9b66a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -822,7 +822,7 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 
/* Call KVM generic code to do the slow-path check */
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
-  writing, upgrade_p);
+  writing, upgrade_p, NULL);
if (is_error_noslot_pfn(pfn))
return -EFAULT;
page = NULL;
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index c58d52fd7bf2..a3e1bfad1026 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -184,6 +184,8 @@ extern void __iomem *ioremap_uc(resource_size_t offset, 
unsigned long size);
 #define ioremap_uc ioremap_uc
 extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
 #define ioremap_cache ioremap_cache
+extern void __iomem *ioremap_cache_force(resource_size_t offset, unsigned long 
size);
+#define ioremap_cache_force ioremap_cache_force
 extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, 
unsigned long prot_val);
 #define ioremap_prot ioremap_prot
 extern void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned 
long size);
diff --git a/arch/x86/include/asm/pgtable_types.h 
b/arch/x86/include/asm/pgtable_types.h
index 816b31c68550..4c16a9583786 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -147,6 +147,7 @@ enum page_cache_mode {
_PAGE_CACHE_MODE_UC   = 3,
_PAGE_CACHE_MODE_WT   = 4,
_PAGE_CACHE_MODE_WP   = 5,
+   _PAGE_CACHE_MODE_WB_FORCE = 6,
 
_PAGE_CACHE_MODE_NUM  = 8
 };
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 71aa3da2a0b7..162cb285b87b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4058,7 +4058,8 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool 
prefault, gfn_t gfn,
}
 
async = false;
-   *pfn = __gfn_to_pfn_memslot(slot, gfn, false, &async, write, writable);
+   *pfn = __gfn_to_pfn_memslot(slot, gfn, false, &async, write, writable,
+   NULL);
if (!async)
return false; /* *pfn has correct page already */
 
@@ -4072,7 +4073,8 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool 
prefault, gfn_t gfn,
return true;
}
 
-   *pfn = __gfn_to_pfn_memslot(slot, gfn, false, NULL, write, writable);
+   *pfn = __gfn_to_pfn_memslot(slot, gfn, false, NULL, write, writable,
+   NULL);
return false;
 }
 
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 9e5ccc56f8e0..4409785e294c 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -202,9 +202,12 @@ __ioremap_caller(resource_size_t phys_addr, unsigned long 
size,
__ioremap_check_mem(phys_addr, size, &io_desc);
 
/*
-* Don't allow anybody to remap normal RAM that we're using..
+* Don't allow anybody to remap normal RAM that we're using, unless
+* _PAGE_CACHE_MODE_WB_FORCE is used.

[RFCv2 12/16] KVM: x86: Enabled protected memory extension

2020-10-19 Thread Kirill A. Shutemov

Wire up hypercalls for the feature and define VM_KVM_PROTECTED.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/Kconfig | 1 +
 arch/x86/kvm/Kconfig | 1 +
 arch/x86/kvm/cpuid.c | 3 ++-
 arch/x86/kvm/x86.c   | 9 +
 include/linux/mm.h   | 6 ++
 5 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b22b95517437..0bcbdadb97d6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -807,6 +807,7 @@ config KVM_GUEST
select X86_HV_CALLBACK_VECTOR
select X86_MEM_ENCRYPT_COMMON
select SWIOTLB
+   select ARCH_USES_HIGH_VMA_FLAGS
default y
help
  This option enables various optimizations for running under the KVM
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index fbd5bd7a945a..2ea77c2a8029 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -46,6 +46,7 @@ config KVM
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_VFIO
select SRCU
+   select HAVE_KVM_PROTECTED_MEMORY
help
  Support hosting fully virtualized guest machines using hardware
  virtualization extensions.  You will need a fairly recent
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 3fd6eec202d7..eed33db032fb 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -746,7 +746,8 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array 
*array, u32 function)
 (1 << KVM_FEATURE_PV_SEND_IPI) |
 (1 << KVM_FEATURE_POLL_CONTROL) |
 (1 << KVM_FEATURE_PV_SCHED_YIELD) |
-(1 << KVM_FEATURE_ASYNC_PF_INT);
+(1 << KVM_FEATURE_ASYNC_PF_INT) |
+(1 << KVM_FEATURE_MEM_PROTECTED);
 
if (sched_info_on())
entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ce856e0ece84..e89ff39204f0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7752,6 +7752,15 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
kvm_sched_yield(vcpu->kvm, a0);
ret = 0;
break;
+   case KVM_HC_ENABLE_MEM_PROTECTED:
+   ret = kvm_protect_all_memory(vcpu->kvm);
+   break;
+   case KVM_HC_MEM_SHARE:
+   ret = kvm_protect_memory(vcpu->kvm, a0, a1, false);
+   break;
+   case KVM_HC_MEM_UNSHARE:
+   ret = kvm_protect_memory(vcpu->kvm, a0, a1, true);
+   break;
default:
ret = -KVM_ENOSYS;
break;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c8d8cdcbc425..ee274d27e764 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -304,11 +304,13 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HIGH_ARCH_BIT_2 34  /* bit only usable on 64-bit 
architectures */
 #define VM_HIGH_ARCH_BIT_3 35  /* bit only usable on 64-bit 
architectures */
 #define VM_HIGH_ARCH_BIT_4 36  /* bit only usable on 64-bit 
architectures */
+#define VM_HIGH_ARCH_BIT_5 37  /* bit only usable on 64-bit 
architectures */
 #define VM_HIGH_ARCH_0 BIT(VM_HIGH_ARCH_BIT_0)
 #define VM_HIGH_ARCH_1 BIT(VM_HIGH_ARCH_BIT_1)
 #define VM_HIGH_ARCH_2 BIT(VM_HIGH_ARCH_BIT_2)
 #define VM_HIGH_ARCH_3 BIT(VM_HIGH_ARCH_BIT_3)
 #define VM_HIGH_ARCH_4 BIT(VM_HIGH_ARCH_BIT_4)
+#define VM_HIGH_ARCH_5 BIT(VM_HIGH_ARCH_BIT_5)
 #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
 
 #ifdef CONFIG_ARCH_HAS_PKEYS
@@ -342,7 +344,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPYVM_ARCH_1   /* T if mapped copy of data 
(nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86_64) && defined(CONFIG_KVM)
+#define VM_KVM_PROTECTED VM_HIGH_ARCH_5
+#else
 #define VM_KVM_PROTECTED 0
+#endif
 
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
-- 
2.26.2

[RFCv2 01/16] x86/mm: Move force_dma_unencrypted() to common code

2020-10-19 Thread Kirill A. Shutemov

force_dma_unencrypted() has to return true for a KVM guest with memory
protection enabled. Move it out of the AMD SME code.

Introduce new config option X86_MEM_ENCRYPT_COMMON that has to be
selected by all x86 memory encryption features.

This is preparation for the following patches.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/Kconfig |  8 +--
 arch/x86/include/asm/io.h|  4 +++-
 arch/x86/mm/Makefile |  2 ++
 arch/x86/mm/mem_encrypt.c| 30 -
 arch/x86/mm/mem_encrypt_common.c | 38 
 5 files changed, 49 insertions(+), 33 deletions(-)
 create mode 100644 arch/x86/mm/mem_encrypt_common.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7101ac64bb20..619ebf40e457 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1514,13 +1514,17 @@ config X86_CPA_STATISTICS
  helps to determine the effectiveness of preserving large and huge
  page mappings when mapping protections are changed.
 
+config X86_MEM_ENCRYPT_COMMON
+   select ARCH_HAS_FORCE_DMA_UNENCRYPTED
+   select DYNAMIC_PHYSICAL_MASK
+   def_bool n
+
 config AMD_MEM_ENCRYPT
bool "AMD Secure Memory Encryption (SME) support"
depends on X86_64 && CPU_SUP_AMD
select DMA_COHERENT_POOL
-   select DYNAMIC_PHYSICAL_MASK
select ARCH_USE_MEMREMAP_PROT
-   select ARCH_HAS_FORCE_DMA_UNENCRYPTED
+   select X86_MEM_ENCRYPT_COMMON
help
  Say yes to enable support for the encryption of system memory.
  This requires an AMD processor that supports Secure Memory
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index e1aa17a468a8..c58d52fd7bf2 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -256,10 +256,12 @@ static inline void slow_down_io(void)
 
 #endif
 
-#ifdef CONFIG_AMD_MEM_ENCRYPT
 #include 
 
 extern struct static_key_false sev_enable_key;
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+
 static inline bool sev_key_active(void)
 {
return static_branch_unlikely(&sev_enable_key);
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 5864219221ca..b31cb52bf1bd 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -52,6 +52,8 @@ obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS)+= 
pkeys.o
 obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o
 obj-$(CONFIG_PAGE_TABLE_ISOLATION) += pti.o
 
+obj-$(CONFIG_X86_MEM_ENCRYPT_COMMON)   += mem_encrypt_common.o
+
 obj-$(CONFIG_AMD_MEM_ENCRYPT)  += mem_encrypt.o
 obj-$(CONFIG_AMD_MEM_ENCRYPT)  += mem_encrypt_identity.o
 obj-$(CONFIG_AMD_MEM_ENCRYPT)  += mem_encrypt_boot.o
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 9f1177edc2e7..4dbdc9dac36b 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -15,10 +15,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
-#include 
-#include 
 
 #include 
 #include 
@@ -350,32 +346,6 @@ bool sev_active(void)
return sme_me_mask && sev_enabled;
 }
 
-/* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
-bool force_dma_unencrypted(struct device *dev)
-{
-   /*
-* For SEV, all DMA must be to unencrypted addresses.
-*/
-   if (sev_active())
-   return true;
-
-   /*
-* For SME, all DMA must be to unencrypted addresses if the
-* device does not support DMA to addresses that include the
-* encryption mask.
-*/
-   if (sme_active()) {
-   u64 dma_enc_mask = DMA_BIT_MASK(__ffs64(sme_me_mask));
-   u64 dma_dev_mask = min_not_zero(dev->coherent_dma_mask,
-   dev->bus_dma_limit);
-
-   if (dma_dev_mask <= dma_enc_mask)
-   return true;
-   }
-
-   return false;
-}
-
 void __init mem_encrypt_free_decrypted_mem(void)
 {
unsigned long vaddr, vaddr_end, npages;
diff --git a/arch/x86/mm/mem_encrypt_common.c b/arch/x86/mm/mem_encrypt_common.c
new file mode 100644
index ..964e04152417
--- /dev/null
+++ b/arch/x86/mm/mem_encrypt_common.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Memory Encryption Support
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky 
+ */
+
+#include 
+#include 
+#include 
+
+/* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
+bool force_dma_unencrypted(struct device *dev)
+{
+   /*
+* For SEV, all DMA must be to unencrypted/shared addresses.
+*/
+   if (sev_active())
+   return true;
+
+   /*
+* For SME, all DMA must be to unencrypted addresses if the
+* device does not support DMA to addresses that include the
+* encryption mask.
+*/
+   if (sme_active()) {
+   u64 dma_enc_mask = DMA_BIT_MASK(__ffs64(sme_me_mask));
+   u64 dma_d

[RFCv2 00/16] KVM protected memory extension

2020-10-19 Thread Kirill A. Shutemov

== Background / Problem ==

There are a number of hardware features (MKTME, SEV) which protect guest
memory from some unauthorized host access. The patchset proposes a purely
software feature that mitigates some of the same host-side read-only
attacks.


== What does this set mitigate? ==

 - Host kernel "accidental" access to guest data (think speculation)

 - Host kernel induced access to guest data (write(fd, &guest_data_ptr, len))

 - Host userspace access to guest data (compromised qemu)

 - Guest privilege escalation via compromised QEMU device emulation

== What does this set NOT mitigate? ==

 - Full host kernel compromise.  Kernel will just map the pages again.

 - Hardware attacks


The second RFC revision addresses /most/ of the feedback.

I still haven't found a good solution for reboot and kexec. Unprotecting all
the memory on such operations defeats the goal of the feature. Clearing
most of the memory before unprotecting what is required for reboot (or
kexec) is tedious and error-prone.
Maybe we should just declare them unsupported?

== Series Overview ==

The hardware features protect guest data by encrypting it and then
ensuring that only the right guest can decrypt it.  This has the
side-effect of making the kernel direct map and userspace mapping
(QEMU et al) useless.  But, this teaches us something very useful:
neither the kernel nor the userspace mappings are really necessary for normal
guest operations.

Instead of using encryption, this series simply unmaps the memory. One
advantage compared to allowing access to ciphertext is that it allows bad
accesses to be caught instead of simply reading garbage.

Protection from physical attacks needs to be provided by some other means.
On Intel platforms, (single-key) Total Memory Encryption (TME) provides
mitigation against physical attacks, such as DIMM interposers sniffing
memory bus traffic.

The patchset modifies both host and guest kernel. The guest OS must enable
the feature via hypercall and mark any memory range that has to be shared
with the host: DMA regions, bounce buffers, etc. SEV does this marking via a
bit in the guest’s page table while this approach uses a hypercall.
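
As a rough illustration of the enable/share/unshare flow described above, here
is a small userspace model in C. It is only a sketch under assumptions: the
model_* names, the fixed 16-page guest and the -1 error value are made up for
the example, and the real host-side handling lives in kvm_emulate_hypercall()
and kvm_protect_memory().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical size of the modelled guest, in pages. */
#define MODEL_NPAGES 16

/* Illustrative stand-ins for the three hypercalls; not the real numbers. */
enum model_hc {
	MODEL_HC_ENABLE_MEM_PROTECTED,
	MODEL_HC_MEM_SHARE,
	MODEL_HC_MEM_UNSHARE,
};

struct model_vm {
	bool mem_protected;                 /* feature enabled by the guest */
	bool page_protected[MODEL_NPAGES];  /* per-page protection state */
};

/* Mark [gfn, gfn + npages) as protected or shared; -1 if out of range. */
static int model_protect_range(struct model_vm *vm, size_t gfn,
			       size_t npages, bool protect)
{
	if (gfn > MODEL_NPAGES || npages > MODEL_NPAGES - gfn)
		return -1;
	for (size_t i = 0; i < npages; i++)
		vm->page_protected[gfn + i] = protect;
	return 0;
}

/* Dispatch one hypercall: a0 is the first gfn, a1 the number of pages. */
static int model_hypercall(struct model_vm *vm, enum model_hc nr,
			   size_t a0, size_t a1)
{
	assert(vm != NULL);

	switch (nr) {
	case MODEL_HC_ENABLE_MEM_PROTECTED:
		/* Enabling the feature protects all guest memory at once. */
		vm->mem_protected = true;
		return model_protect_range(vm, 0, MODEL_NPAGES, true);
	case MODEL_HC_MEM_SHARE:
		/* Sharing a range means dropping its protection. */
		return model_protect_range(vm, a0, a1, false);
	case MODEL_HC_MEM_UNSHARE:
		return model_protect_range(vm, a0, a1, true);
	}
	return -1;
}
```

In this model a guest would first issue the enable call and then share only
the ranges the host really needs, for example a DMA bounce buffer.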

For removing the userspace mapping, use a trick similar to what NUMA
balancing does: convert memory that belongs to KVM memory slots to
PROT_NONE: all existing entries are converted to PROT_NONE with mprotect() and
the newly faulted in pages get PROT_NONE from the updated vm_page_prot.
The new VMA flag -- VM_KVM_PROTECTED -- indicates that the pages in the
VMA must be treated in a special way in the GUP and fault paths. The flag
allows GUP to return the page even though it is mapped with PROT_NONE, but
only if the new GUP flag -- FOLL_KVM -- is specified. Any userspace access
to the memory would result in SIGBUS. Any GUP access without FOLL_KVM
would result in -EFAULT.
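
The GUP rule above can be sketched as a tiny userspace model in C. This is
an assumption-laden illustration, not the code from the series: the
MODEL_* flags, struct model_vma and model_gup_check() are hypothetical, and
the model ignores NUMA hinting and every other reason a lookup can fail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flags; the values are illustrative. */
#define MODEL_VM_KVM_PROTECTED (1u << 0)
#define MODEL_FOLL_KVM         (1u << 1)

struct model_vma {
	unsigned int vm_flags;
	bool prot_none;  /* true when the mapping is PROT_NONE */
};

/*
 * Returns 0 if a GUP-style lookup may hand back the page and
 * -EFAULT otherwise.
 */
static int model_gup_check(const struct model_vma *vma,
			   unsigned int foll_flags)
{
	assert(vma != NULL);

	/* An accessible mapping needs no special treatment. */
	if (!vma->prot_none)
		return 0;

	/* PROT_NONE is tolerated only for a protected VMA plus FOLL_KVM. */
	if ((vma->vm_flags & MODEL_VM_KVM_PROTECTED) &&
	    (foll_flags & MODEL_FOLL_KVM))
		return 0;

	return -EFAULT;
}
```

The point of the two-flag check is that neither side alone is enough: an
ordinary caller cannot opt in by passing the GUP flag on an unrelated VMA,
and a protected VMA stays closed to every caller that does not pass it.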

Removing the userspace mapping of guest memory from the QEMU process can help
to address some guest privilege escalation attacks. Consider the case when
an unprivileged guest user exploits a bug in QEMU device emulation to gain
access to data it cannot normally access within the guest.

Any anonymous page faulted into the VM_KVM_PROTECTED VMA gets removed from
the direct mapping with kernel_map_pages(). Note that kernel_map_pages() only
flushes the local TLB. I think it's a reasonable compromise between security and
performance.

Zapping the PTE would bring the page back to the direct mapping after clearing.
At least for now, we don't remove file-backed pages from the direct mapping.
File-backed pages could be accessed via read/write syscalls. It adds
complexity.

Occasionally, the host kernel has to access guest memory that was not made
shared by the guest. For instance, it happens for instruction emulation.
Normally, it's done via copy_to/from_user() which would fail with -EFAULT
now. We introduced a new pair of helpers: copy_to/from_guest(). The new
helpers acquire the page via GUP, map it into kernel address space with
kmap_atomic()-style mechanism and only then copy the data.
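
To make the "acquire, map temporarily, copy" sequence concrete, here is a
minimal userspace model in C. Everything in it is an assumption made for
illustration: model_map_page() stands in for the GUP plus temporary mapping
step, model_unmap_page() for tearing the mapping down, and the page-by-page
loop is only one plausible way to structure such a helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_NPAGES    4u

/* Hypothetical guest memory: pages with no permanent host mapping. */
struct model_guest {
	unsigned char pages[MODEL_NPAGES][MODEL_PAGE_SIZE];
};

/* Stand-in for "GUP the page, then map it temporarily"; NULL if no page. */
static unsigned char *model_map_page(struct model_guest *g, size_t pfn)
{
	return pfn < MODEL_NPAGES ? g->pages[pfn] : NULL;
}

/* Stand-in for dropping the temporary mapping; a no-op in this model. */
static void model_unmap_page(unsigned char *vaddr)
{
	(void)vaddr;
}

/*
 * Copy len bytes starting at guest address gaddr into dst, one page at a
 * time. Returns 0 on success or -EFAULT if a page cannot be acquired; dst
 * may be partially filled in that case.
 */
static int model_copy_from_guest(struct model_guest *g, void *dst,
				 size_t gaddr, size_t len)
{
	unsigned char *out = dst;

	assert(g != NULL && dst != NULL);

	while (len) {
		size_t offset = gaddr % MODEL_PAGE_SIZE;
		size_t chunk = MODEL_PAGE_SIZE - offset;
		unsigned char *vaddr;

		/* Never copy past the end of the current page. */
		if (chunk > len)
			chunk = len;

		vaddr = model_map_page(g, gaddr / MODEL_PAGE_SIZE);
		if (!vaddr)
			return -EFAULT;

		memcpy(out, vaddr + offset, chunk);
		model_unmap_page(vaddr);

		out += chunk;
		gaddr += chunk;
		len -= chunk;
	}

	return 0;
}
```

Keeping each mapping alive only for the duration of one memcpy() mirrors the
intent of the helpers: the guest page is visible to the host for as short a
time as possible.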

For some instruction emulation copying is not good enough: cmpxchg
emulation has to have direct access to the guest memory. __kvm_map_gfn()
is modified to accommodate the case.

The patchset is on top of v5.9

Kirill A. Shutemov (16):
  x86/mm: Move force_dma_unencrypted() to common code
  x86/kvm: Introduce KVM memory protection feature
  x86/kvm: Make DMA pages shared
  x86/kvm: Use bounce buffers for KVM memory protection
  x86/kvm: Make VirtIO use DMA API in KVM guest
  x86/kvmclock: Share hvclock memory with the host
  x86/realmode: Share trampoline area if KVM memory protection enabled
  KVM: Use GUP instead of copy_from/to_user() to access guest memory
  KVM: mm: Introduce VM_KVM_PROTECTED
  KVM: x86: Use GUP for page walk instead of __get_user()
  KVM: Protected memory extension
  KVM: x86: Enabled protected memory extension
  KVM: Rework copy_to/from_guest() to avoid direct mapping
  KVM: Handle protected memory in __kvm_map_gfn()/_

[RFCv2 11/16] KVM: Protected memory extension

2020-10-19 Thread Kirill A. Shutemov

Add infrastructure that handles protected memory extension.

Arch-specific code has to provide hypercalls and define non-zero
VM_KVM_PROTECTED.

Signed-off-by: Kirill A. Shutemov 
---
 include/linux/kvm_host.h |  4 +++
 virt/kvm/Kconfig |  3 ++
 virt/kvm/kvm_main.c  | 68 ++
 virt/lib/Makefile|  1 +
 virt/lib/mem_protected.c | 71 
 5 files changed, 147 insertions(+)
 create mode 100644 virt/lib/mem_protected.c

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 380a64613880..6655e8da4555 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -701,6 +701,10 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm);
 void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
   struct kvm_memory_slot *slot);
 
+int kvm_protect_all_memory(struct kvm *kvm);
+int kvm_protect_memory(struct kvm *kvm,
+  unsigned long gfn, unsigned long npages, bool protect);
+
 int gfn_to_page_many_atomic(struct kvm_memory_slot *slot, gfn_t gfn,
struct page **pages, int nr_pages);
 
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 1c37ccd5d402..50d7422386aa 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -63,3 +63,6 @@ config HAVE_KVM_NO_POLL
 
 config KVM_XFER_TO_GUEST_WORK
bool
+
+config HAVE_KVM_PROTECTED_MEMORY
+   bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 125db5a73e10..4c008c7b4974 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -154,6 +154,8 @@ static void kvm_uevent_notify_change(unsigned int type, 
struct kvm *kvm);
 static unsigned long long kvm_createvm_count;
 static unsigned long long kvm_active_vms;
 
+int __kvm_protect_memory(unsigned long start, unsigned long end, bool protect);
+
 __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
   unsigned long start, 
unsigned long end)
 {
@@ -1371,6 +1373,15 @@ int __kvm_set_memory_region(struct kvm *kvm,
if (r)
goto out_bitmap;
 
+   if (IS_ENABLED(CONFIG_HAVE_KVM_PROTECTED_MEMORY) &&
+   mem->memory_size && kvm->mem_protected) {
+   r = __kvm_protect_memory(new.userspace_addr,
+new.userspace_addr + new.npages * 
PAGE_SIZE,
+true);
+   if (r)
+   goto out_bitmap;
+   }
+
if (old.dirty_bitmap && !new.dirty_bitmap)
kvm_destroy_dirty_bitmap(&old);
return 0;
@@ -2720,6 +2731,63 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, 
gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);
 
+int kvm_protect_memory(struct kvm *kvm,
+  unsigned long gfn, unsigned long npages, bool protect)
+{
+   struct kvm_memory_slot *memslot;
+   unsigned long start, end;
+   gfn_t numpages;
+
+   if (!IS_ENABLED(CONFIG_HAVE_KVM_PROTECTED_MEMORY))
+   return -KVM_ENOSYS;
+
+   if (!npages)
+   return 0;
+
+   memslot = gfn_to_memslot(kvm, gfn);
+   /* Not backed by memory. It's okay. */
+   if (!memslot)
+   return 0;
+
+   start = gfn_to_hva_many(memslot, gfn, &numpages);
+   end = start + npages * PAGE_SIZE;
+
+   /* XXX: Share range across memory slots? */
+   if (WARN_ON(numpages < npages))
+   return -EINVAL;
+
+   return __kvm_protect_memory(start, end, protect);
+}
+EXPORT_SYMBOL_GPL(kvm_protect_memory);
+
+int kvm_protect_all_memory(struct kvm *kvm)
+{
+   struct kvm_memslots *slots;
+   struct kvm_memory_slot *memslot;
+   unsigned long start, end;
+   int i, ret = 0;
+
+   if (!IS_ENABLED(CONFIG_HAVE_KVM_PROTECTED_MEMORY))
+   return -KVM_ENOSYS;
+
+   mutex_lock(&kvm->slots_lock);
+   kvm->mem_protected = true;
+   for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
+   slots = __kvm_memslots(kvm, i);
+   kvm_for_each_memslot(memslot, slots) {
+   start = memslot->userspace_addr;
+   end = start + memslot->npages * PAGE_SIZE;
+   ret = __kvm_protect_memory(start, end, true);
+   if (ret)
+   goto out;
+   }
+   }
+out:
+   mutex_unlock(&kvm->slots_lock);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(kvm_protect_all_memory);
+
 void kvm_sigset_activate(struct kvm_vcpu *vcpu)
 {
if (!vcpu->sigset_active)
diff --git a/virt/lib/Makefile b/virt/lib/Makefile
index bd7f9a78bb6b..d6e50510801f 100644
--- a/virt/lib/Makefile
+++ b/virt/lib/Makefile
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_IRQ_BYPASS_MANAGER) += irqbypass.o
+obj-$(CONFIG_HAVE_KVM_PROTECTED_MEMORY) += mem_protected.o
diff --git a/virt/lib/mem_protected.c b/virt/lib/mem_prot

[RFCv2 16/16] mm: Do not use zero page for VM_KVM_PROTECTED VMAs

2020-10-19 Thread Kirill A. Shutemov

Presence of zero pages in the mapping would disclose content of the
mapping. Don't use them if KVM memory protection is enabled.

Signed-off-by: Kirill A. Shutemov 
---
 arch/s390/include/asm/pgtable.h | 2 +-
 include/linux/mm.h  | 4 ++--
 mm/huge_memory.c| 3 +--
 mm/memory.c | 3 +--
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index b55561cc8786..72ca3b3f04cb 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -543,7 +543,7 @@ static inline int mm_alloc_pgste(struct mm_struct *mm)
  * In the case that a guest uses storage keys
  * faults should no longer be backed by zero pages
  */
-#define mm_forbids_zeropage mm_has_pgste
+#define vma_forbids_zeropage(vma) mm_has_pgste(vma->vm_mm)
 static inline int mm_uses_skeys(struct mm_struct *mm)
 {
 #ifdef CONFIG_PGSTE
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 74efc51e63f0..ee713b7c2819 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -130,8 +130,8 @@ extern int mmap_rnd_compat_bits __read_mostly;
  * s390 does this to prevent multiplexing of hardware bits
  * related to the physical page in case of virtualization.
  */
-#ifndef mm_forbids_zeropage
-#define mm_forbids_zeropage(X) (0)
+#ifndef vma_forbids_zeropage
+#define vma_forbids_zeropage(vma) vma_is_kvm_protected(vma)
 #endif
 
 /*
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 40974656cb43..383614b24c4f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -709,8 +709,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
return VM_FAULT_OOM;
if (unlikely(khugepaged_enter(vma, vma->vm_flags)))
return VM_FAULT_OOM;
-   if (!(vmf->flags & FAULT_FLAG_WRITE) &&
-   !mm_forbids_zeropage(vma->vm_mm) &&
+   if (!(vmf->flags & FAULT_FLAG_WRITE) && !vma_forbids_zeropage(vma) &&
transparent_hugepage_use_zero_page()) {
pgtable_t pgtable;
struct page *zero_page;
diff --git a/mm/memory.c b/mm/memory.c
index e28bd5f902a7..9907ffe00490 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3495,8 +3495,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
return 0;
 
/* Use the zero-page for reads */
-   if (!(vmf->flags & FAULT_FLAG_WRITE) &&
-   !mm_forbids_zeropage(vma->vm_mm)) {
+   if (!(vmf->flags & FAULT_FLAG_WRITE) && !vma_forbids_zeropage(vma)) {
entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
vma->vm_page_prot));
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
-- 
2.26.2

Re: [Industrypack-devel] [PATCH] ipack: iopctal: remove unneeded break

2020-10-19 Thread Samuel Iglesias Gonsálvez

Hi Tom,

Thanks for the patch!

Patch is,

Acked-by: Samuel Iglesias Gonsalvez 

Greg, Would you mind picking this patch through your char-misc
tree?

Thanks!

Sam

On Mon, 2020-10-19 at 12:32 -0700, t...@redhat.com wrote:
> From: Tom Rix 
> 
> A break is not needed if it is preceded by a return
> 
> Signed-off-by: Tom Rix 
> ---
>  drivers/ipack/devices/ipoctal.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/ipack/devices/ipoctal.c
> b/drivers/ipack/devices/ipoctal.c
> index d480a514c983..3940714e4397 100644
> --- a/drivers/ipack/devices/ipoctal.c
> +++ b/drivers/ipack/devices/ipoctal.c
> @@ -544,7 +544,6 @@ static void ipoctal_set_termios(struct tty_struct
> *tty,
>   break;
>   default:
>   return;
> - break;
>   }
>  
>   baud = tty_get_baud_rate(tty);
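
A minimal illustration of why the removed statement was dead code. The helper below is hypothetical and is not taken from ipoctal.c; it only shows that nothing placed after a return inside a switch arm can execute.

```c
/* Hypothetical helper, not from the driver: maps a selector to a word size. */
static int classify(int sel)
{
	switch (sel) {
	case 0:
		return 5;
	case 1:
		return 6;
	default:
		return -1;
		/* A "break;" placed here would be unreachable. */
	}
}
```

Compilers generally accept the extra break without complaint, which is why such leftovers tend to survive until a cleanup pass like this one.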



[tip:perf/urgent] BUILD SUCCESS f3d301c1f2f5676465cdf3259737ea19cc82731f

2020-10-19 Thread kernel test robot

allyesconfig
sparc    defconfig
i386     defconfig
mips     allyesconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
x86_64   randconfig-a004-20201019
x86_64   randconfig-a002-20201019
x86_64   randconfig-a006-20201019
x86_64   randconfig-a003-20201019
x86_64   randconfig-a005-20201019
x86_64   randconfig-a001-20201019
i386     randconfig-a006-20201019
i386     randconfig-a005-20201019
i386     randconfig-a001-20201019
i386     randconfig-a003-20201019
i386     randconfig-a004-20201019
i386     randconfig-a002-20201019
i386     randconfig-a015-20201019
i386     randconfig-a013-20201019
i386     randconfig-a016-20201019
i386     randconfig-a012-20201019
i386     randconfig-a011-20201019
i386     randconfig-a014-20201019
riscv    nommu_k210_defconfig
riscv    allyesconfig
riscv    allnoconfig
riscv    defconfig
riscv    rv32_defconfig
riscv    allmodconfig
x86_64   rhel
x86_64   allyesconfig
x86_64   rhel-7.6-kselftests
x86_64   defconfig
x86_64   rhel-8.3
x86_64   kexec

clang tested configs:
x86_64   randconfig-a016-20201019
x86_64   randconfig-a015-20201019
x86_64   randconfig-a012-20201019
x86_64   randconfig-a013-20201019
x86_64   randconfig-a011-20201019
x86_64   randconfig-a014-20201019

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

Re: slowdown due to reader-owned rwsem time-based spinning

2020-10-19 Thread Julia Lawall




On Mon, 19 Oct 2020, Waiman Long wrote:

> On 10/19/20 3:48 PM, Julia Lawall wrote:
> >
> > On Mon, 19 Oct 2020, Waiman Long wrote:
> >
> > > On 10/15/20 7:38 AM, Julia Lawall wrote:
> > > > Hello,
> > > >
> > > > Phoenix is an implementation of map reduce:
> > > >
> > > > https://github.com/kozyraki/phoenix
> > > >
> > > > The phoenix-2.0/tests subdirectory contains some benchmarks, including
> > > > word_count.
> > > >
> > > > At the same time, on my server, since v5.8, the kernel has changed from
> > > > using the governor intel_pstate by default to using intel_cpufreq.
> > > > Intel_cpufreq causes kworkers to run on all cores every 0.004 seconds,
> > > > while intel_pstate involves very few such stray processes.
> > > >
> > > > Surprisingly, all those kworkers cause the word_count benchmark to run
> > > > 2-3 times faster.  I bisected the problem back to the following commit,
> > > > which was introduced in v5.3:
> > > >
> > > > commit 7d43f1ce9dd075d8b2aa3ad1f3970ef386a5c358
> > > > Author: Waiman Long 
> > > > Date:   Mon May 20 16:59:13 2019 -0400
> > > >
> > > >   locking/rwsem: Enable time-based spinning on reader-owned rwsem
> > > >
> > > > Representative traces are attached.  word_count_5.9pwrsvpassive_1.pdf is
> > > > the one with the kworkers.
> > > >
> > > > I don't know the Phoenix code in detail, but the problem seems to be in
> > > > the infrastructure, not the specific word count application, because
> > > > most of the benchmarks seem to suffer similarly.  Some of the other
> > > > benchmarks
> > > > seem to take a variable and long amount of time to get started in the
> > > > active mode, so perhaps the problem could be in reading the initial
> > > > dataset.
> > > >
> > > > Before I plunge into it, do you have any suggestions as to what could be
> > > > the problem?
> > > I am a bit confused as to what you are looking for. So you said this patch
> > > makes the benchmark run 2-3 times faster. Is this a problem? What are you
> > > trying to achieve? Is it to make the passive case similar to the active
> > > case?
> > Sorry, it seems that I was not clear.  Prior to the commit above the
> > active case had good performance.  The patch caused the active case to
> > slow down by 2-3 times.  Adding lots of kworkers that interrupt the
> > threads eliminated the slowdown.
> >
> > > What this patch does is to allow a writer waiting for a rwsem to spin for a
> > > while, hoping the readers will release the lock soon so that it can acquire
> > > the lock. Before that, the writer would go to sleep immediately when the
> > > rwsem was owned by readers. Probably because of that, the kworkers keep on
> > > running for a much longer time as long as there are no other tasks competing
> > > for the CPUs.
> > No, the kworkers don't run for a long time.  My hypothesis is that the
> > kworkers interrupt a thread that is spinning waiting for a lock and thus
> > allow the thread that is holding the lock to run.
> >
> Thanks for the clarification. Now I see what you mean by thinking this is a
> problem?
>
> However, the reader spinning is about 25us max. So I am puzzled by the long
> idle periods in between busy periods in the active chart. I will need to
> reproduce this condition myself to see what has gone wrong. What is the
> configuration of your test machine, as well as the config options you used for
> the kernel and the boot command line parameters?

80 physical cores, 160 hardware threads.  4 sockets.  Intel(R) Xeon(R) CPU
E7-8870 v4 @ 2.10GHz

Boot options:  ro quiet intel_pstate=active

Benchmark suite: https://github.com/kozyraki/phoenix.git

phoenix-2.0/tests/word_count/word_count 
datasets/word_count/word_count_datafiles/word_100MB.txt

Traces from Linux 5.9 of several of the benchmarks are available at
https://pages.lip6.fr/Julia.Lawall/px.pdf

julia

Re: [PATCH v9 03/15] dt-bindings: usb: Maxim type-c controller device tree binding document

2020-10-19 Thread Badhri Jagan Sridharan

Hi Rob,

Apologies for the delay. Was coordinating care for my parents who
caught the COVID bug.

Thanks,
Badhri

On Tue, Oct 13, 2020 at 6:50 AM Rob Herring  wrote:
>
> On Tue, Oct 13, 2020 at 8:43 AM Rob Herring  wrote:
> >
> > On Wed, Oct 7, 2020 at 7:43 PM Badhri Jagan Sridharan  
> > wrote:
> > >
> > > > Hi Rob,
> > >
> > > Thanks for the reviews ! Responses inline.
> > >
> > > Regards,
> > > Badhri
> > >
> > > On Mon, Oct 5, 2020 at 7:46 AM Rob Herring  wrote:
> > > >
> > > > On Mon, Sep 28, 2020 at 07:39:52PM -0700, Badhri Jagan Sridharan wrote:
> > > > > Add device tree binding document for Maxim TCPCI based Type-C chip 
> > > > > driver
> > > > >
> > > > > Signed-off-by: Badhri Jagan Sridharan 
> > > > > ---
> > > > > Changes since v1:
> > > > > - Changing patch version to v6 to fix version number confusion.
> > > > >
> > > > > Changes since v6:
> > > > > - Migrated to yaml format.
> > > > >
> > > > > Changes since v7:
> > > > > - Rebase on usb-next
> > > > >
> > > > > Changes since v8:
> > > > > - Fix errors from make dt_binding_check as suggested by
> > > > >   Rob Herring.
> > > > > ---
> > > > >  .../devicetree/bindings/usb/maxim,tcpci.yaml  | 68 
> > > > > +++
> > > > >  1 file changed, 68 insertions(+)
> > > > >  create mode 100644 
> > > > > Documentation/devicetree/bindings/usb/maxim,tcpci.yaml
> > > > >
> > > > > diff --git a/Documentation/devicetree/bindings/usb/maxim,tcpci.yaml 
> > > > > b/Documentation/devicetree/bindings/usb/maxim,tcpci.yaml
> > > > > new file mode 100644
> > > > > index ..f4b5f1a09b98
> > > > > --- /dev/null
> > > > > +++ b/Documentation/devicetree/bindings/usb/maxim,tcpci.yaml
> > > > > @@ -0,0 +1,68 @@
> > > > > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > > > +%YAML 1.2
> > > > > +---
> > > > > +$id: "http://devicetree.org/schemas/usb/maxim,tcpci.yaml#";
> > > > > +$schema: "http://devicetree.org/meta-schemas/core.yaml#";
> > > > > +
> > > > > +title: Maxim TCPCI Type-C PD controller DT bindings
> > > > > +
> > > > > +maintainers:
> > > > > +  - Badhri Jagan Sridharan 
> > > > > +
> > > > > +description: Maxim TCPCI Type-C PD controller
> > > > > +
> > > > > +properties:
> > > > > +  compatible:
> > > > > +enum:
> > > > > +  - maxim,tcpci
> > > >
> > > > Is there a datasheet for this? Searching for 'tcpci' doesn't really come
> > > > up with anything other than this patch. Only chip I found is MAX77958.
> > > > Bindings are for specific h/w devices.
> > >
> > > > Unfortunately the datasheet cannot be made public yet. Does the datasheet
> > > > have to be made public before sending the bindings?
> >
> > No, but we need a part number or some assurance that 'tcpci' is a specific 
> > part.
Sure. I added the part number to the binding and changed the compatible string.
I am sending this as part of v11.


>
> I guess TCPCI is USB Type-C Port Controller Interface Specification.
>
> That's just a protocol definition, not a chip. DT describes h/w which
> is more than just the protocol.
>
> Rob

Re: [PATCH] net: ftgmac100: Fix missing TX-poll issue

2020-10-19 Thread Benjamin Herrenschmidt

On Mon, 2020-10-19 at 19:57 -0700, Jakub Kicinski wrote:
> > I suspect the problem is that the HW (and yes this would be a HW bug)
> > doesn't order the CPU -> memory and the CPU -> MMIO path.
> > 
> > What I think happens is that the store to txde0 is potentially still in
> > a buffer somewhere on its way to memory, gets bypassed by the store to
> > MMIO, causing the MAC to try to read the descriptor, and getting the
> > "old" data from memory.
> 
> I see, but in general this sort of a problem should be resolved by
> adding an appropriate memory barrier. And in fact such barrier should
> (these days) be implied by a writel (I'm not 100% clear on why this
> driver uses iowrite, and if it matters).

No, a barrier won't solve this I think.

This is a coherency problem at the fabric/interconnect level. It has to
do with the way they implemented the DMA path from memory to the
ethernet controller using a different "port" of the memory controller
than the one used by the CPU, separately from the MMIO path, with no
proper ordering between those busses. Old school design, and
broken.

By doing a read back, they probably force the previous write to memory
to get past the point where it will be visible to a subsequent DMA read
by the ethernet controller.

> > It's ... fishy, but that isn't the first time an Aspeed chip has that
> > type of bug (there's a similar one in the USB device controller iirc).

Cheers,
Ben.

Re: [PATCH v4 11/15] ASoC: dt-bindings: tegra: Add json-schema for Tegra audio graph card

2020-10-19 Thread Sameer Pujar





Add YAML schema for Tegra audio graph sound card DT bindings. It uses the
same DT bindings provided by generic audio graph driver. Along with this
a few standard clock DT bindings are added which are specifically required
for Tegra audio.

Signed-off-by: Sameer Pujar 
---
  .../sound/nvidia,tegra-audio-graph-card.yaml   | 158 +
  1 file changed, 158 insertions(+)
  create mode 100644 
Documentation/devicetree/bindings/sound/nvidia,tegra-audio-graph-card.yaml

diff --git 
a/Documentation/devicetree/bindings/sound/nvidia,tegra-audio-graph-card.yaml 
b/Documentation/devicetree/bindings/sound/nvidia,tegra-audio-graph-card.yaml
new file mode 100644
index 000..284d185
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/nvidia,tegra-audio-graph-card.yaml
@@ -0,0 +1,158 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/sound/nvidia,tegra-audio-graph-card.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Audio Graph based Tegra sound card driver
+
+description: |
+  This is based on generic audio graph card driver along with additional
+  customizations for Tegra platforms. It uses the same bindings with
+  additional standard clock DT bindings required for Tegra.
+
+  See{LINUX}/Documentation/devicetree/bindings/sound/audio-graph-card.yaml

You should be able to just $ref this at the top level.


I am seeing one problem while using $ref like below.
allOf:
  - $ref: /schemas/sound/audio-graph-card.yaml

I see below while running doc validator.
"Documentation/devicetree/bindings/sound/nvidia,tegra-audio-graph-card.example.dt.yaml: 
tegra_sound: compatible:0: 'audio-graph-card' was expected"


Is there a way to avoid this?




+
+maintainers:
+  - Jon Hunter 
+  - Sameer Pujar 
+
+properties:
+  compatible:
+items:
+  - enum:
+  - nvidia,tegra210-audio-graph-card
+  - nvidia,tegra186-audio-graph-card
+



+  dais:
+$ref: /schemas/sound/audio-graph-card.yaml#/properties/dais
+
+  label:
+$ref: /schemas/sound/simple-card.yaml#/properties/label
+
+  pa-gpios:
+$ref: /schemas/sound/audio-graph-card.yaml#/properties/pa-gpios
+
+  widgets:
+$ref: /schemas/sound/simple-card.yaml#/definitions/widgets
+
+  routing:
+$ref: /schemas/sound/simple-card.yaml#/definitions/routing
+
+  mclk-fs:
+$ref: /schemas/sound/simple-card.yaml#/definitions/mclk-fs
+
+  prefix:
+$ref: /schemas/sound/simple-card.yaml#/definitions/prefix

And drop all of these.


Could not re-use because of above compatible problem. Also require some 
additional properties for Tegra.



+
+  clocks:
+   minItems: 2
+
+  clock-names:
+   minItems: 2

Don't need this.


This is required for Tegra audio graph card to update clock rates at 
runtime.





+   items:
+ - const: pll_a
+ - const: plla_out0
+
+  assigned-clocks:
+minItems: 1
+maxItems: 3
+
+  assigned-clock-parents:
+minItems: 1
+maxItems: 3
+
+  assigned-clock-rates:
+minItems: 1
+maxItems: 3
+


It is required for initialisation of above clocks with specific rates.


+  ports:
+$ref: /schemas/sound/audio-graph-card.yaml#/properties/ports
+
+patternProperties:
+  "^port(@[0-9a-f]+)?$":
+$ref: /schemas/sound/audio-graph-card.yaml#/definitions/port

And these can be dropped. Unless what each port is is Tegra specific.


Maybe I can drop this if I could just directly include 
audio-graph-card.yaml and extend the required properties for Tegra.

block, bfq: lockdep circular locking dependency gripe

2020-10-19 Thread Mike Galbraith

[ 1917.361401] ==
[ 1917.361406] WARNING: possible circular locking dependency detected
[ 1917.361413] 5.9.0.g7cf726a-master #2 Tainted: G S  E
[ 1917.361417] --
[ 1917.361422] kworker/u16:35/15995 is trying to acquire lock:
[ 1917.361428] 89232237f7e0 (&ioc->lock){..-.}-{2:2}, at: 
put_io_context+0x30/0x90
[ 1917.361440]
   but task is already holding lock:
[ 1917.361445] 892244d2cc08 (&bfqd->lock){-.-.}-{2:2}, at: 
bfq_insert_requests+0x89/0x680
[ 1917.361456]
   which lock already depends on the new lock.

[ 1917.361463]
   the existing dependency chain (in reverse order) is:
[ 1917.361469]
   -> #1 (&bfqd->lock){-.-.}-{2:2}:
[ 1917.361479]_raw_spin_lock_irqsave+0x3d/0x50
[ 1917.361484]bfq_exit_icq_bfqq+0x48/0x3f0
[ 1917.361489]bfq_exit_icq+0x13/0x20
[ 1917.361494]put_io_context_active+0x55/0x80
[ 1917.361499]do_exit+0x72c/0xca0
[ 1917.361504]do_group_exit+0x47/0xb0
[ 1917.361508]__x64_sys_exit_group+0x14/0x20
[ 1917.361513]do_syscall_64+0x33/0x40
[ 1917.361518]entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1917.361523]
   -> #0 (&ioc->lock){..-.}-{2:2}:
[ 1917.361532]__lock_acquire+0x149d/0x1a70
[ 1917.361537]lock_acquire+0x1a7/0x3b0
[ 1917.361542]_raw_spin_lock_irqsave+0x3d/0x50
[ 1917.361547]put_io_context+0x30/0x90
[ 1917.361552]blk_mq_free_request+0x4f/0x140
[ 1917.361557]blk_attempt_req_merge+0x19/0x30
[ 1917.361563]elv_attempt_insert_merge+0x4f/0x90
[ 1917.361568]blk_mq_sched_try_insert_merge+0x28/0x40
[ 1917.361574]bfq_insert_requests+0x94/0x680
[ 1917.361579]blk_mq_sched_insert_requests+0xd1/0x2a0
[ 1917.361584]blk_mq_flush_plug_list+0x12d/0x240
[ 1917.361589]blk_flush_plug_list+0xb4/0xd0
[ 1917.361594]io_schedule_prepare+0x3c/0x40
[ 1917.361599]io_schedule+0xb/0x40
[ 1917.361604]blk_mq_get_tag+0x13a/0x250
[ 1917.361608]__blk_mq_alloc_request+0x5c/0x130
[ 1917.361613]blk_mq_submit_bio+0xf3/0x770
[ 1917.361618]submit_bio_noacct+0x41e/0x4b0
[ 1917.361622]submit_bio+0x33/0x160
[ 1917.361644]ext4_io_submit+0x49/0x60 [ext4]
[ 1917.361661]ext4_writepages+0x683/0x1070 [ext4]
[ 1917.361667]do_writepages+0x3c/0xe0
[ 1917.361672]__writeback_single_inode+0x62/0x630
[ 1917.361677]writeback_sb_inodes+0x218/0x4d0
[ 1917.361681]__writeback_inodes_wb+0x5f/0xc0
[ 1917.361686]wb_writeback+0x283/0x490
[ 1917.361691]wb_workfn+0x29a/0x670
[ 1917.361696]process_one_work+0x283/0x620
[ 1917.361701]worker_thread+0x39/0x3f0
[ 1917.361706]kthread+0x152/0x170
[ 1917.361711]ret_from_fork+0x1f/0x30
[ 1917.361715]
   other info that might help us debug this:

[ 1917.361722]  Possible unsafe locking scenario:

[ 1917.361728]CPU0CPU1
[ 1917.361731]
[ 1917.361736]   lock(&bfqd->lock);
[ 1917.361740]lock(&ioc->lock);
[ 1917.361746]lock(&bfqd->lock);
[ 1917.361752]   lock(&ioc->lock);
[ 1917.361757]
*** DEADLOCK ***

[ 1917.361763] 5 locks held by kworker/u16:35/15995:
[ 1917.361767]  #0: 892240c9bd38 ((wq_completion)writeback){+.+.}-{0:0}, 
at: process_one_work+0x1fa/0x620
[ 1917.361778]  #1: 94569342fe78 
((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: 
process_one_work+0x1fa/0x620
[ 1917.361789]  #2: 8921424ae0e0 (&type->s_umount_key#39){}-{3:3}, at: 
trylock_super+0x16/0x50
[ 1917.361800]  #3: 8921424aaa40 (&sbi->s_writepages_rwsem){.+.+}-{0:0}, 
at: do_writepages+0x3c/0xe0
[ 1917.361811]  #4: 892244d2cc08 (&bfqd->lock){-.-.}-{2:2}, at: 
bfq_insert_requests+0x89/0x680
[ 1917.361821]
   stack backtrace:
[ 1917.361827] CPU: 6 PID: 15995 Comm: kworker/u16:35 Kdump: loaded Tainted: G 
S  E 5.9.0.g7cf726a-master #2
[ 1917.361833] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 
09/23/2013
[ 1917.361840] Workqueue: writeback wb_workfn (flush-8:32)
[ 1917.361846] Call Trace:
[ 1917.361854]  dump_stack+0x77/0x97
[ 1917.361860]  check_noncircular+0xe7/0x100
[ 1917.361866]  ? __lock_acquire+0x2ce/0x1a70
[ 1917.361872]  ? __lock_acquire+0x149d/0x1a70
[ 1917.361877]  __lock_acquire+0x149d/0x1a70
[ 1917.361884]  lock_acquire+0x1a7/0x3b0
[ 1917.361889]  ? put_io_context+0x30/0x90
[ 1917.361894]  ? bfq_put_queue+0xcf/0x480
[ 1917.361901]  _raw_spin_lock_irqsave+0x3d/0x50
[ 1917.361906]  ? put_io_context+0x30/0x90
[ 1917.361911]  put_io_context+0x30/0x90
[ 1917.361916]  blk_mq_free_request+0x4f/0x140
[ 1917.361921]  blk_attempt_req_merge+0x19/0x30
[ 1917.361926]  elv_attempt_insert_merge+0x4f/0x90
[ 1917.361932]  blk_mq_sched_try_insert

RE: [PATCH] net: ftgmac100: Fix missing TX-poll issue

2020-10-19 Thread Dylan Hung

> -Original Message-
> From: Jakub Kicinski [mailto:k...@kernel.org]
> Sent: Tuesday, October 20, 2020 3:01 AM
> To: Joel Stanley 
> Cc: Dylan Hung ; Benjamin Herrenschmidt
> ; David S . Miller ;
> net...@vger.kernel.org; Linux Kernel Mailing List
> ; Po-Yu Chuang ;
> linux-aspeed ; OpenBMC Maillist
> ; BMC-SW 
> Subject: Re: [PATCH] net: ftgmac100: Fix missing TX-poll issue
> 
> On Mon, 19 Oct 2020 08:57:03 + Joel Stanley wrote:
> > > diff --git a/drivers/net/ethernet/faraday/ftgmac100.c
> > > b/drivers/net/ethernet/faraday/ftgmac100.c
> > > index 00024dd41147..9a99a87f29f3 100644
> > > --- a/drivers/net/ethernet/faraday/ftgmac100.c
> > > +++ b/drivers/net/ethernet/faraday/ftgmac100.c
> > > @@ -804,7 +804,8 @@ static netdev_tx_t
> ftgmac100_hard_start_xmit(struct sk_buff *skb,
> > >  * before setting the OWN bit on the first descriptor.
> > >  */
> > > dma_wmb();
> > > -   first->txdes0 = cpu_to_le32(f_ctl_stat);
> > > +   WRITE_ONCE(first->txdes0, cpu_to_le32(f_ctl_stat));
> > > +   READ_ONCE(first->txdes0);
> >
> > I understand what you're trying to do here, but I'm not sure that this
> > is the correct way to go about it.
> >
> > It does cause the compiler to produce a store and then a load.

Yes, the load instruction here is to guarantee the previous store is indeed 
pushed onto the physical memory.

> 
> +1 @first is system memory from dma_alloc_coherent(), right?
> 
> You shouldn't have to do this. Is coherent DMA memory broken on your
> platform?

It is about the arbitration in the DRAM controller.  There are two queues in 
the DRAM controller: one is for CPU accesses and the other is for the HW 
engines.
When the CPU issues a store command, the DRAM controller just acknowledges the 
CPU's request and pushes the request into the queue.  Then the CPU triggers the 
HW MAC engine, and the HW engine starts to fetch the DMA memory.
But since the CPU's request may still be sitting in the queue, the HW engine may 
fetch the wrong data.
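
The store-then-load pattern under discussion can be sketched in userspace C as below. This is only an illustration: struct txdes and publish_txdes0() are made-up names, and the volatile accesses stand in for the kernel's WRITE_ONCE()/READ_ONCE(). On an ordinary machine the read-back has no visible effect beyond returning the value just stored; whether it really pushes the store past the DRAM controller queue is a property of the hardware described in this thread, not something this sketch can demonstrate.

```c
#include <stdint.h>

/* Hypothetical stand-in for a TX descriptor; not the driver's real layout. */
struct txdes {
	uint32_t txdes0;
};

/*
 * Store the control/status word, then load it back.  The volatile casts
 * force the compiler to emit both the store and the load, in that order,
 * similar in spirit to WRITE_ONCE() followed by READ_ONCE().
 */
static uint32_t publish_txdes0(struct txdes *d, uint32_t ctl_stat)
{
	*(volatile uint32_t *)&d->txdes0 = ctl_stat;
	return *(volatile uint32_t *)&d->txdes0;
}
```

The return value is deliberately handed back to the caller so the load cannot be discarded as unused.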

Re: [PATCH net v2] Revert "virtio-net: ethtool configurable RXCSUM"

2020-10-19 Thread Jason Wang




On 2020/10/20 上午1:32, Michael S. Tsirkin wrote:

This reverts commit 3618ad2a7c0e78e4258386394d5d5f92a3dbccf8.

When the device does not have a control vq (e.g. when using a
version of QEMU based on upstream v0.10 or older, or when specifying
ctrl_vq=off,ctrl_rx=off,ctrl_vlan=off,ctrl_rx_extra=off,ctrl_mac_addr=off
for the device on the QEMU command line), that commit causes a crash:

[   72.229171] kernel BUG at drivers/net/virtio_net.c:1667!
[   72.230266] invalid opcode:  [#1] PREEMPT SMP
[   72.231172] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
5.9.0-rc8-02934-g3618ad2a7c0e7 #1
[   72.231172] EIP: virtnet_send_command+0x120/0x140
[   72.231172] Code: 00 0f 94 c0 8b 7d f0 65 33 3d 14 00 00 00 75 1c 8d 65 f4 
5b 5e 5f 5d c3 66 90 be 01 00 00 00 e9 6e ff ff ff 8d b6 00
+00 00 00 <0f> 0b e8 d9 bb 82 00 eb 17 8d b4 26 00 00 00 00 8d b4 26 00 00 00
[   72.231172] EAX: 000d EBX: f72895c0 ECX: 0017 EDX: 0011
[   72.231172] ESI: f7197800 EDI: ed69bd00 EBP: ed69bcf4 ESP: ed69bc98
[   72.231172] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010246
[   72.231172] CR0: 80050033 CR2:  CR3: 02c84000 CR4: 000406f0
[   72.231172] Call Trace:
[   72.231172]  ? __virt_addr_valid+0x45/0x60
[   72.231172]  ? ___cache_free+0x51f/0x760
[   72.231172]  ? kobject_uevent_env+0xf4/0x560
[   72.231172]  virtnet_set_guest_offloads+0x4d/0x80
[   72.231172]  virtnet_set_features+0x85/0x120
[   72.231172]  ? virtnet_set_guest_offloads+0x80/0x80
[   72.231172]  __netdev_update_features+0x27a/0x8e0
[   72.231172]  ? kobject_uevent+0xa/0x20
[   72.231172]  ? netdev_register_kobject+0x12c/0x160
[   72.231172]  register_netdevice+0x4fe/0x740
[   72.231172]  register_netdev+0x1c/0x40
[   72.231172]  virtnet_probe+0x728/0xb60
[   72.231172]  ? _raw_spin_unlock+0x1d/0x40
[   72.231172]  ? virtio_vdpa_get_status+0x1c/0x20
[   72.231172]  virtio_dev_probe+0x1c6/0x271
[   72.231172]  really_probe+0x195/0x2e0
[   72.231172]  driver_probe_device+0x26/0x60
[   72.231172]  device_driver_attach+0x49/0x60
[   72.231172]  __driver_attach+0x46/0xc0
[   72.231172]  ? device_driver_attach+0x60/0x60
[   72.231172]  bus_add_driver+0x197/0x1c0
[   72.231172]  driver_register+0x66/0xc0
[   72.231172]  register_virtio_driver+0x1b/0x40
[   72.231172]  virtio_net_driver_init+0x61/0x86
[   72.231172]  ? veth_init+0x14/0x14
[   72.231172]  do_one_initcall+0x76/0x2e4
[   72.231172]  ? rdinit_setup+0x2a/0x2a
[   72.231172]  do_initcalls+0xb2/0xd5
[   72.231172]  kernel_init_freeable+0x14f/0x179
[   72.231172]  ? rest_init+0x100/0x100
[   72.231172]  kernel_init+0xd/0xe0
[   72.231172]  ret_from_fork+0x1c/0x30
[   72.231172] Modules linked in:
[   72.269563] ---[ end trace a6ebc4afea0e6cb1 ]---

The reason is that virtnet_set_features now calls virtnet_set_guest_offloads
unconditionally; it used to call it only when there was something
to configure.

If the device does not have a control vq, everything breaks.
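
The failure mode can be pictured with a small sketch. Everything here is hypothetical (struct fake_vi, send_offload_cmd() and set_features() are illustrative names, not virtio_net.c code), and it is not the fix proposed in this thread, which is a plain revert. It only shows the idea that the command path should be reached only when something actually changed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the driver's private state. */
struct fake_vi {
	bool has_cvq;      /* device negotiated a control virtqueue */
	uint64_t offloads; /* currently configured guest offloads */
	int cmds_sent;     /* counts commands, for demonstration only */
};

/* Stand-in for the control command; returns -1 where the driver would BUG. */
static int send_offload_cmd(struct fake_vi *vi, uint64_t offloads)
{
	if (!vi->has_cvq)
		return -1;
	vi->offloads = offloads;
	vi->cmds_sent++;
	return 0;
}

/* Only touch the control path when the requested set differs. */
static int set_features(struct fake_vi *vi, uint64_t wanted)
{
	if (wanted == vi->offloads)
		return 0;
	return send_offload_cmd(vi, wanted);
}
```

With this shape, a device without a control vq survives registration as long as nothing asks for a different offload set.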

Looking at this some more, I noticed that it's not really checking the
hardware too much. E.g.

 if ((dev->features ^ features) & NETIF_F_LRO) {
 if (features & NETIF_F_LRO)
 offloads |= GUEST_OFFLOAD_LRO_MASK &
 vi->guest_offloads_capable;
 else
 offloads &= ~GUEST_OFFLOAD_LRO_MASK;
 }

and

 (1ULL << VIRTIO_NET_F_GUEST_TSO6) | \
 (1ULL << VIRTIO_NET_F_GUEST_ECN)  | \
 (1ULL << VIRTIO_NET_F_GUEST_UFO))

But there's no guarantee that e.g. VIRTIO_NET_F_GUEST_TSO6 is set.

If it isn't, the command should not send it.

Further

static int virtnet_set_features(struct net_device *dev,
 netdev_features_t features)
{
 struct virtnet_info *vi = netdev_priv(dev);
 u64 offloads = vi->guest_offloads;

seems wrong since guest_offloads is zero initialized,



I'm not sure I get here.

Did you mean vi->guest_offloads?

We initialize it during probe

    for (i = 0; i < ARRAY_SIZE(guest_offloads); i++)
        if (virtio_has_feature(vi->vdev, guest_offloads[i]))
            set_bit(guest_offloads[i], &vi->guest_offloads);



it does not reflect the state after reset which comes from
the features.

Revert the original commit for now.

Cc: Tonghao Zhang 
Cc: Willem de Bruijn 
Fixes: 3618ad2a7c0e7 ("virtio-net: ethtool configurable RXCSUM")
Reported-by: kernel test robot 
Signed-off-by: Michael S. Tsirkin 
---

changes from v1:
- clarify how to reproduce the bug in the log


  drivers/net/virtio_net.c | 50 +++-
  1 file changed, 13 insertions(+), 37 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d2d2c4a53cf2..21b71148c532 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -68,8 +68,6 @@ static const unsigned long guest_offloads[] = {
(1ULL << VIRTI

Re: [PATCH v6 0/2] perf: Make tsc testing as a common testing case

2020-10-19 Thread Jiri Olsa

On Mon, Oct 19, 2020 at 06:02:34PM +0800, Leo Yan wrote:
> This patch set moves the tsc test from being x86 specific to a common
> test case.  Since Arnaldo found a build failure in patch set
> v4 [1], the first four patches have been merged but the last two patches
> were left out; this patch set resends the last two patches with the
> build failure fixed (by removing the header "arch-tests.h" from the
> testing code).
> 
> These two patches have been tested on x86_64 and Arm64.  Though I haven't
> tested them on other archs (MIPS, PowerPC, etc.), I checked every header to
> ensure the included headers are supported on all archs.
> 
> These two patches have been rebased on the perf/core branch with its
> latest commit 744aec4df2c5 ("perf c2c: Update documentation for metrics
> reorganization").
> 
> Changes from v5:
> * Found a merge conflict on the latest perf/core, so rebased it.
> 
> [1] https://lore.kernel.org/patchwork/cover/1305382/#1505752
> 
> 
> Leo Yan (2):
>   perf tests tsc: Make tsc testing as a common testing
>   perf tests tsc: Add checking helper is_supported()

Acked-by: Jiri Olsa 

thanks,
jirka

> 
>  tools/perf/arch/x86/include/arch-tests.h  |  1 -
>  tools/perf/arch/x86/tests/Build   |  1 -
>  tools/perf/arch/x86/tests/arch-tests.c|  4 
>  tools/perf/tests/Build|  1 +
>  tools/perf/tests/builtin-test.c   |  5 +
>  .../{arch/x86 => }/tests/perf-time-to-tsc.c   | 19 +++
>  tools/perf/tests/tests.h  |  2 ++
>  7 files changed, 23 insertions(+), 10 deletions(-)
>  rename tools/perf/{arch/x86 => }/tests/perf-time-to-tsc.c (92%)
> 
> -- 
> 2.17.1
>

[PATCH] [v3] rtc: sun6i: Fix memleak in sun6i_rtc_clk_init

2020-10-19 Thread Dinghao Liu

When clk_hw_register_fixed_rate_with_accuracy() fails,
clk_data should be freed. It's the same for the subsequent
two error paths, but we should also unregister the already
registered clocks in them.

Signed-off-by: Dinghao Liu 
---

Changelog:

v2: - Unregister the already registered clocks on failure.

v3: - Add a new label 'err_register' to unify code style.
---
 drivers/rtc/rtc-sun6i.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c
index e2b8b150bcb4..f2818cdd11d8 100644
--- a/drivers/rtc/rtc-sun6i.c
+++ b/drivers/rtc/rtc-sun6i.c
@@ -272,7 +272,7 @@ static void __init sun6i_rtc_clk_init(struct device_node 
*node,
3);
if (IS_ERR(rtc->int_osc)) {
pr_crit("Couldn't register the internal oscillator\n");
-   return;
+   goto err;
}
 
parents[0] = clk_hw_get_name(rtc->int_osc);
@@ -290,7 +290,7 @@ static void __init sun6i_rtc_clk_init(struct device_node 
*node,
rtc->losc = clk_register(NULL, &rtc->hw);
if (IS_ERR(rtc->losc)) {
pr_crit("Couldn't register the LOSC clock\n");
-   return;
+   goto err_register;
}
 
of_property_read_string_index(node, "clock-output-names", 1,
@@ -301,7 +301,7 @@ static void __init sun6i_rtc_clk_init(struct device_node 
*node,
  &rtc->lock);
if (IS_ERR(rtc->ext_losc)) {
pr_crit("Couldn't register the LOSC external gate\n");
-   return;
+   goto err_register;
}
 
clk_data->num = 2;
@@ -314,6 +314,8 @@ static void __init sun6i_rtc_clk_init(struct device_node 
*node,
of_clk_add_hw_provider(node, of_clk_hw_onecell_get, clk_data);
return;
 
+err_register:
+   clk_hw_unregister_fixed_rate(rtc->int_osc);
 err:
kfree(clk_data);
 }
-- 
2.17.1
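
The unwinding order used by the err_register/err labels can be sketched in plain C. This is an illustration under stated assumptions: acquire(), release() and init_clocks() are made-up helpers standing in for the allocation of clk_data and the clock registrations, and the live counter exists only so the sketch can show that nothing leaks.

```c
#include <stdlib.h>

static int live; /* number of resources currently held */

/* Hypothetical acquisition; fails on request to simulate an error path. */
static void *acquire(int fail)
{
	void *p;

	if (fail)
		return NULL;
	p = malloc(1);
	if (p)
		live++;
	return p;
}

static void release(void *p)
{
	if (p) {
		free(p);
		live--;
	}
}

/* fail_at: 0 = no failure, 1..3 = which acquisition fails. */
static int init_clocks(int fail_at)
{
	void *data, *osc, *losc;

	data = acquire(fail_at == 1);
	if (!data)
		return -1;

	osc = acquire(fail_at == 2);
	if (!osc)
		goto err;          /* only data to undo */

	losc = acquire(fail_at == 3);
	if (!losc)
		goto err_register; /* undo osc, then data */

	/* Success: a real driver would keep these; the sketch frees them. */
	release(losc);
	release(osc);
	release(data);
	return 0;

err_register:
	release(osc);
err:
	release(data);
	return -1;
}
```

Stacking the labels so that later failures fall through the earlier cleanup keeps each resource's release in exactly one place.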

Re: KASAN: unknown-crash Read in do_exit

2020-10-19 Thread Dmitry Vyukov

On Mon, Oct 19, 2020 at 11:38 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit a49145acfb975d921464b84fe00279f99827d816
> Author: George Kennedy 
> Date:   Tue Jul 7 19:26:03 2020 +
>
> fbmem: add margin check to fb_check_caps()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17ce19c850
> start commit:   729e3d09 Merge tag 'ceph-for-5.9-rc5' of git://github.com/..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c61610091f4ca8c4
> dashboard link: https://syzkaller.appspot.com/bug?extid=d9ae84069cff753e94bf
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1064254590
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=141f2bed90
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: fbmem: add margin check to fb_check_caps()
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Based on the reproducer it looks reasonable:

#syz fix: fbmem: add margin check to fb_check_caps()

[PATCH] PM / devfreq: Remove redundant governor_name from struct devfreq

2020-10-19 Thread Chanwoo Choi

The devfreq structure instance contains both the governor_name and a governor
instance. When the governor name needs to be shown, it is better to use the
name from the devfreq_governor structure. So the governor_name variable in
struct devfreq is redundant. Remove it and use the name of the
devfreq_governor instance instead.

Signed-off-by: Chanwoo Choi 
---
 drivers/devfreq/devfreq.c  | 18 +++---
 drivers/devfreq/governor.h |  2 ++
 include/linux/devfreq.h|  4 
 3 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 99df27368628..8fc773bced25 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -811,7 +811,6 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->dev.release = devfreq_dev_release;
INIT_LIST_HEAD(&devfreq->node);
devfreq->profile = profile;
-   strscpy(devfreq->governor_name, governor_name, DEVFREQ_NAME_LEN);
devfreq->previous_freq = profile->initial_freq;
devfreq->last_status.current_frequency = profile->initial_freq;
devfreq->data = data;
@@ -907,7 +906,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
 
mutex_lock(&devfreq_list_lock);
 
-   governor = try_then_request_governor(devfreq->governor_name);
+   governor = try_then_request_governor(governor_name);
if (IS_ERR(governor)) {
dev_err(dev, "%s: Unable to find governor for the device\n",
__func__);
@@ -1250,7 +1249,7 @@ int devfreq_add_governor(struct devfreq_governor *governor)
int ret = 0;
struct device *dev = devfreq->dev.parent;
 
-   if (!strncmp(devfreq->governor_name, governor->name,
+   if (!strncmp(devfreq->governor->name, governor->name,
 DEVFREQ_NAME_LEN)) {
/* The following should never occur */
if (devfreq->governor) {
@@ -1312,7 +1311,7 @@ int devfreq_remove_governor(struct devfreq_governor *governor)
int ret;
struct device *dev = devfreq->dev.parent;
 
-   if (!strncmp(devfreq->governor_name, governor->name,
+   if (!strncmp(devfreq->governor->name, governor->name,
 DEVFREQ_NAME_LEN)) {
/* we should have a devfreq governor! */
if (!devfreq->governor) {
@@ -1407,20 +1406,17 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr,
 * Start the new governor and create the specific sysfs files
 * which depend on new governor.
 */
-   strncpy(df->governor_name, new_governor->name, DEVFREQ_NAME_LEN);
ret = new_governor->event_handler(df, DEVFREQ_GOV_START, NULL);
if (ret) {
dev_warn(dev, "%s: Governor %s not started(%d)\n",
 __func__, new_governor->name, ret);
-   strncpy(df->governor_name, prev_governor->name,
-   DEVFREQ_NAME_LEN);
 
/* Restore prev_governor when failed to start new governor */
ret = prev_governor->event_handler(df, DEVFREQ_GOV_START, NULL);
if (ret) {
dev_err(dev,
"%s: reverting to Governor %s failed (%d)\n",
-   __func__, df->governor_name, ret);
+   __func__, prev_governor->name, ret);
df->governor = NULL;
goto out;
}
@@ -1457,7 +1453,7 @@ static ssize_t available_governors_show(struct device *d,
 */
if (IS_SUPPORTED_FLAG(df->governor->flags, IMMUTABLE)) {
count = scnprintf(&buf[count], DEVFREQ_NAME_LEN,
- "%s ", df->governor_name);
+ "%s ", df->governor->name);
/*
 * The devfreq device shows the registered governor except for
 * immutable governors such as passive governor .
@@ -1900,7 +1896,7 @@ static int devfreq_summary_show(struct seq_file *s, void *data)
 
list_for_each_entry_reverse(devfreq, &devfreq_list, node) {
 #if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
-   if (!strncmp(devfreq->governor_name, DEVFREQ_GOV_PASSIVE,
+   if (!strncmp(devfreq->governor->name, DEVFREQ_GOV_PASSIVE,
DEVFREQ_NAME_LEN)) {
struct devfreq_passive_data *data = devfreq->data;
 
@@ -1926,7 +1922,7 @@ static int devfreq_summary_show(struct seq_file *s, void *data)
"%-30s %-30s %-15s %-10s %10d %12ld %12ld %12ld\n",
dev_name(&devfreq->dev),
p_devfreq ? dev_name(&p_devfreq->dev) : "null",
-   devfreq->govern

Re: [PATCH] perf test: Implement skip_reason callback for watchpoint tests

2020-10-19 Thread Namhyung Kim

Hello,

On Fri, Oct 16, 2020 at 10:17 PM Tommi Rantala wrote:
>
> Currently the reason for skipping the read only watchpoint test is only
> shown when running in verbose mode:
>
>   $ perf test watchpoint
>   23: Watchpoint:
>   23.1: Read Only Watchpoint: Skip
>   23.2: Write Only Watchpoint   : Ok
>   23.3: Read / Write Watchpoint : Ok
>   23.4: Modify Watchpoint   : Ok
>
>   $ perf test -v watchpoint
>   23: Watchpoint:
>   23.1: Read Only Watchpoint:
>   --- start ---
>   test child forked, pid 60204
>   Hardware does not support read only watchpoints.
>   test child finished with -2
>
> Implement the skip_reason callback for the watchpoint tests, so that it is
> easy to see why a test is skipped:
>
>   $ perf test watchpoint
>   23: Watchpoint:
>   23.1: Read Only Watchpoint: Skip (missing hardware support)
>   23.2: Write Only Watchpoint   : Ok
>   23.3: Read / Write Watchpoint : Ok
>   23.4: Modify Watchpoint   : Ok
>
> Signed-off-by: Tommi Rantala 

Acked-by: Namhyung Kim 

Thanks
Namhyung


> ---
>  tools/perf/tests/builtin-test.c |  1 +
>  tools/perf/tests/tests.h|  1 +
>  tools/perf/tests/wp.c   | 21 +++--
>  3 files changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
> index d328caaba45d..3bfad4ee31ae 100644
> --- a/tools/perf/tests/builtin-test.c
> +++ b/tools/perf/tests/builtin-test.c
> @@ -142,6 +142,7 @@ static struct test generic_tests[] = {
> .skip_if_fail   = false,
> .get_nr = test__wp_subtest_get_nr,
> .get_desc   = test__wp_subtest_get_desc,
> +   .skip_reason= test__wp_subtest_skip_reason,
> },
> },
> {
> diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
> index 4447a516c689..0630301087a6 100644
> --- a/tools/perf/tests/tests.h
> +++ b/tools/perf/tests/tests.h
> @@ -66,6 +66,7 @@ int test__bp_signal_overflow(struct test *test, int subtest);
>  int test__bp_accounting(struct test *test, int subtest);
>  int test__wp(struct test *test, int subtest);
>  const char *test__wp_subtest_get_desc(int subtest);
> +const char *test__wp_subtest_skip_reason(int subtest);
>  int test__wp_subtest_get_nr(void);
>  int test__task_exit(struct test *test, int subtest);
>  int test__mem(struct test *test, int subtest);
> diff --git a/tools/perf/tests/wp.c b/tools/perf/tests/wp.c
> index d262d6639829..9387fa76faa5 100644
> --- a/tools/perf/tests/wp.c
> +++ b/tools/perf/tests/wp.c
> @@ -174,10 +174,12 @@ static bool wp_ro_supported(void)
>  #endif
>  }
>
> -static void wp_ro_skip_msg(void)
> +static const char *wp_ro_skip_msg(void)
>  {
>  #if defined (__x86_64__) || defined (__i386__)
> -   pr_debug("Hardware does not support read only watchpoints.\n");
> +   return "missing hardware support";
> +#else
> +   return NULL;
>  #endif
>  }
>
> @@ -185,7 +187,7 @@ static struct {
> const char *desc;
> int (*target_func)(void);
> bool (*is_supported)(void);
> -   void (*skip_msg)(void);
> +   const char *(*skip_msg)(void);
>  } wp_testcase_table[] = {
> {
> .desc = "Read Only Watchpoint",
> @@ -219,16 +221,23 @@ const char *test__wp_subtest_get_desc(int i)
> return wp_testcase_table[i].desc;
>  }
>
> +const char *test__wp_subtest_skip_reason(int i)
> +{
> +   if (i < 0 || i >= (int)ARRAY_SIZE(wp_testcase_table))
> +   return NULL;
> +   if (!wp_testcase_table[i].skip_msg)
> +   return NULL;
> +   return wp_testcase_table[i].skip_msg();
> +}
> +
>  int test__wp(struct test *test __maybe_unused, int i)
>  {
> if (i < 0 || i >= (int)ARRAY_SIZE(wp_testcase_table))
> return TEST_FAIL;
>
> if (wp_testcase_table[i].is_supported &&
> -   !wp_testcase_table[i].is_supported()) {
> -   wp_testcase_table[i].skip_msg();
> +   !wp_testcase_table[i].is_supported())
> return TEST_SKIP;
> -   }
>
> return !wp_testcase_table[i].target_func() ? TEST_OK : TEST_FAIL;
>  }
> --
> 2.26.2
>
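
For readers following along, here is a minimal userspace sketch of the
callback pattern the quoted patch introduces. It is not the perf code: the
table contents and the names cases, subtest_skip_reason() and run_subtest()
are hypothetical stand-ins, and the TEST_* values are assumed for
illustration (TEST_SKIP as -2 matches the "finished with -2" line in the
verbose output quoted above).

```c
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Assumed result codes, for illustration only. */
enum { TEST_OK = 0, TEST_FAIL = -1, TEST_SKIP = -2 };

struct wp_case {
	const char *desc;
	int (*target_func)(void);        /* returns 0 on success */
	bool (*is_supported)(void);      /* NULL means "always supported" */
	const char *(*skip_msg)(void);   /* NULL means "no reason available" */
};

/* Hypothetical stand-ins for the real test bodies and capability checks. */
static int always_passes(void) { return 0; }
static bool ro_supported(void) { return false; }
static const char *ro_skip_msg(void) { return "missing hardware support"; }

static const struct wp_case cases[] = {
	{ "Read Only Watchpoint",  always_passes, ro_supported, ro_skip_msg },
	{ "Write Only Watchpoint", always_passes, NULL,         NULL },
};

/* Return a short skip reason, or NULL if the index is bad or none is set. */
const char *subtest_skip_reason(int i)
{
	if (i < 0 || i >= (int)ARRAY_SIZE(cases))
		return NULL;
	if (!cases[i].skip_msg)
		return NULL;
	return cases[i].skip_msg();
}

/* Run one subtest; the skip path no longer prints anything itself. */
int run_subtest(int i)
{
	if (i < 0 || i >= (int)ARRAY_SIZE(cases))
		return TEST_FAIL;
	if (cases[i].is_supported && !cases[i].is_supported())
		return TEST_SKIP;
	return !cases[i].target_func() ? TEST_OK : TEST_FAIL;
}
```

The point of the split is that the runner decides whether to skip, while the
caller that prints the result asks for the reason separately and can show it
even without verbose mode.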

Re: [PATCH] perf mem2node: Improve warning if detected no memory nodes

2020-10-19 Thread Jiri Olsa

On Mon, Oct 19, 2020 at 08:36:13AM +0800, Leo Yan wrote:
> Some archs (e.g. x86 and Arm64) don't enable CONFIG_MEMORY_HOTPLUG by
> default. If this option is not enabled when building the kernel image,
> the sysfs entries for memory nodes will be missing. As a result, the perf
> tool has no chance to capture the memory node information; when it
> reports the result and detects no memory nodes, it outputs
> "assertion failed at util/mem2node.c:99".
> 
> The output log doesn't give the reason for the failure, and users have no
> clue how to fix it.  This patch makes the warning explicit: it tells the
> user that no memory nodes were detected and suggests enabling
> CONFIG_MEMORY_HOTPLUG when building the kernel.
> 
> Signed-off-by: Leo Yan 

Acked-by: Jiri Olsa 

thanks,
jirka

> ---
>  tools/perf/util/mem2node.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/mem2node.c b/tools/perf/util/mem2node.c
> index c84f5841c7ab..03a7d7b27737 100644
> --- a/tools/perf/util/mem2node.c
> +++ b/tools/perf/util/mem2node.c
> @@ -96,7 +96,8 @@ int mem2node__init(struct mem2node *map, struct perf_env *env)
>  
>   /* Cut unused entries, due to merging. */
>   tmp_entries = realloc(entries, sizeof(*entries) * j);
> - if (tmp_entries || WARN_ON_ONCE(j == 0))
> + if (tmp_entries ||
> + WARN_ONCE(j == 0, "No memory nodes, is CONFIG_MEMORY_HOTPLUG enabled?\n"))
>   entries = tmp_entries;
>  
>   for (i = 0; i < j; i++) {
> -- 
> 2.17.1
>
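
As a rough illustration of why the condition in the quoted hunk works, here
is a userspace sketch of a warn-once helper. It is only an approximation: the
real WARN_ONCE() is a macro, and warn_once(), its explicit state flag and
warnings_emitted are hypothetical names used here. The property that matters
is that the helper evaluates to the condition itself, so it can sit inside an
if () while printing its message at most once.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts how many messages were actually printed (for demonstration). */
static int warnings_emitted;

/*
 * Print msg the first time cond is true, and always return cond so the
 * helper can be used directly inside a condition.
 */
bool warn_once(bool cond, bool *already_warned, const char *msg)
{
	if (cond && !*already_warned) {
		*already_warned = true;
		fputs(msg, stderr);
		warnings_emitted++;
	}
	return cond;
}
```

In the patch, the j == 0 case is the one where realloc() is asked for zero
bytes and may legitimately return NULL, so the warning doubles as the branch
that still accepts the result.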

Re: [RFC PATCH] mm: memcg/slab: Stop reparented obj_cgroups from charging root

2020-10-19 Thread Richard Palethorpe

Hello Roman,

Roman Gushchin  writes:

> On Fri, Oct 16, 2020 at 07:15:02PM +0200, Michal Koutny wrote:
>> On Fri, Oct 16, 2020 at 10:53:08AM -0400, Johannes Weiner wrote:
>> > The central try_charge() function charges recursively all the way up
>> > to and including the root.
>> Except for use_hierarchy=0 (which is the case here as Richard
>> wrote). The reparenting is hence somewhat incompatible with
>> new_parent.use_hierarchy=0 :-/
>> 
>> > We should clean this up one way or another: either charge the root or
>> > don't, but do it consistently.
>> I agree this'd be good to unify. One upside of excluding root memcg from
>> charging is that users are spared from the charging overhead when memcg
>> tree is not created.  (Actually, I thought that was the reason for this
>> exception.)
>
> Yeah, I'm completely on the same page. Moving a process to the root memory
> cgroup is currently a good way to estimate the memory cgroup overhead.
>
> How about the patch below, which consistently avoids charging the root
> memory cgroup? It seems like it doesn't add too many checks.
>
> Thanks!
>
> --
>
> From f50ea74d8f118b9121da3754acdde630ddc060a7 Mon Sep 17 00:00:00 2001
> From: Roman Gushchin 
> Date: Mon, 19 Oct 2020 14:37:35 -0700
> Subject: [PATCH RFC] mm: memcontrol: do not charge the root memory cgroup
>
> Currently the root memory cgroup is never charged directly, but
> if an ancestor cgroup is charged, the charge is propagated up to the
> root memory cgroup. The root memory cgroup doesn't show the charge
> to a user, nor does it allow setting any limits/protections.
> So the information about the current charge is completely useless.
>
> Not charging the root memory cgroup allows us to:
> 1) simplify the model and the code, so, hopefully, fewer bugs will
>be introduced in the future;
> 2) avoid unnecessary atomic operations, which are used to (un)charge
>corresponding root page counters.
>
> In the default hierarchy case or if use_hierarchy == true, it's very
> straightforward: when the page counters tree is traversed to the root,
> the root page counter (the one with parent == NULL) should be
> skipped. To avoid multiple identical checks over the page counters
> code, the for_each_nonroot_ancestor() macro is introduced.
>
> To handle the use_hierarchy == false case without adding custom
> checks, let's make the page counters of all non-root memory cgroups
> direct descendants of the corresponding root memory cgroup's page
> counters. In this case for_each_nonroot_ancestor() will work correctly
> as well.
>
> Please note that cgroup v1 provides a root-level memory.usage_in_bytes.
> However, it's not based on page counters (refer to mem_cgroup_usage()).
>
> Signed-off-by: Roman Gushchin 
> ---
>  mm/memcontrol.c   | 21 -
>  mm/page_counter.c | 21 -
>  2 files changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2636f8bad908..34cac7522e74 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5339,17 +5339,28 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>   memcg->swappiness = mem_cgroup_swappiness(parent);
>   memcg->oom_kill_disable = parent->oom_kill_disable;
>   }
> - if (parent && parent->use_hierarchy) {
> + if (!parent) {
> + /* root memory cgroup */
> + page_counter_init(&memcg->memory, NULL);
> + page_counter_init(&memcg->swap, NULL);
> + page_counter_init(&memcg->kmem, NULL);
> + page_counter_init(&memcg->tcpmem, NULL);
> + } else if (parent->use_hierarchy) {
>   memcg->use_hierarchy = true;
>   page_counter_init(&memcg->memory, &parent->memory);
>   page_counter_init(&memcg->swap, &parent->swap);
>   page_counter_init(&memcg->kmem, &parent->kmem);
>   page_counter_init(&memcg->tcpmem, &parent->tcpmem);
>   } else {
> - page_counter_init(&memcg->memory, NULL);
> - page_counter_init(&memcg->swap, NULL);
> - page_counter_init(&memcg->kmem, NULL);
> - page_counter_init(&memcg->tcpmem, NULL);
> + /*
> +  * If use_hierarchy == false, consider all page counters direct
> +  * descendants of the corresponding root level counters.
> +  */
> + page_counter_init(&memcg->memory, &root_mem_cgroup->memory);
> + page_counter_init(&memcg->swap, &root_mem_cgroup->swap);
> + page_counter_init(&memcg->kmem, &root_mem_cgroup->kmem);
> + page_counter_init(&memcg->tcpmem, &root_mem_cgroup->tcpmem);
> +
>   /*
>* Deeper hierachy with use_hierarchy == false doesn't make
>* much sense so let cgroup subsystem know about this

Perhaps in this case, where the hierarchy is broken, objcgs should also
be reparented directly to root? Otherwise it will still be poss

Re: [PATCH v4 10/15] ASoC: dt-bindings: tegra: Add graph bindings

2020-10-19 Thread Sameer Pujar





Add device tree binding properties of the generic graph to ASoC component
devices. This allows audio ports to be defined for these components or
DAIs, so that an audio-graph-based sound card can be realised.

Signed-off-by: Sameer Pujar 
---
  Documentation/devicetree/bindings/sound/nvidia,tegra186-dspk.yaml  | 7 +++
  .../devicetree/bindings/sound/nvidia,tegra210-admaif.yaml  | 7 +++
  Documentation/devicetree/bindings/sound/nvidia,tegra210-ahub.yaml  | 7 +++
  Documentation/devicetree/bindings/sound/nvidia,tegra210-dmic.yaml  | 7 +++
  Documentation/devicetree/bindings/sound/nvidia,tegra210-i2s.yaml   | 7 +++
  5 files changed, 35 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/nvidia,tegra186-dspk.yaml b/Documentation/devicetree/bindings/sound/nvidia,tegra186-dspk.yaml
index ed2fb32..23875b1 100644
--- a/Documentation/devicetree/bindings/sound/nvidia,tegra186-dspk.yaml
+++ b/Documentation/devicetree/bindings/sound/nvidia,tegra186-dspk.yaml
@@ -55,6 +55,13 @@ properties:
The name can be "DSPK1" or "DSPKx", where x depends on the maximum
available instances on a Tegra SoC.

+  ports:
+$ref: /schemas/sound/audio-graph-card.yaml#/definitions/ports
+



+patternProperties:
+  "^port(@[0-9a-f]+)?$":
+$ref: /schemas/sound/audio-graph-card.yaml#/definitions/port

You should have either 'ports' or a single 'port' (yes, the graph
binding allowed multiple port nodes without 'ports', but that should be
deprecated IMO)


OK, will drop this and just use 'port' here.

RE: [PATCH v2 0/5] Remove LPC register partitioning

2020-10-19 Thread ChiaWei Wang

Hi All,

Do you have any comment on the v2 changes?
Thanks.

Chiawei

> -Original Message-
> From: ChiaWei Wang 
> Sent: Monday, October 5, 2020 4:28 PM
> To: lee.jo...@linaro.org; robh...@kernel.org; j...@jms.id.au;
> and...@aj.id.au; miny...@acm.org; a...@arndb.de;
> gre...@linuxfoundation.org; linus.wall...@linaro.org;
> haiyue.w...@linux.intel.com; cyril...@gmail.com; rlipp...@google.com;
> linux-arm-ker...@lists.infradead.org; linux-asp...@lists.ozlabs.org;
> linux-kernel@vger.kernel.org; open...@lists.ozlabs.org;
> linux-g...@vger.kernel.org
> Subject: [PATCH v2 0/5] Remove LPC register partitioning
> 
> The LPC controller has no concept of the BMC and the Host partitions.
> The incorrect partitioning can impose unnecessary range restrictions on
> register access through the syscon regmap interface.
> 
> For instance, HICRB contains the I/O port address configuration of KCS channel
> 1/2. However, the KCS#1/#2 drivers cannot access HICRB as it is located at the
> other LPC partition.
> 
> In addition, to be backward compatible, the newly added HW control bits could
> be located at any reserved bits over the LPC addressing space.
> 
> Thereby, this patch series aims to remove the LPC partitioning for better
> driver development and maintenance.
> 
> 
> Changes since v1:
>   - Add the fix to the aspeed-lpc binding documentation.
> 
> Chia-Wei, Wang (5):
>   ARM: dts: Remove LPC BMC and Host partitions
>   soc: aspeed: Fix LPC register offsets
>   ipmi: kcs: aspeed: Fix LPC register offsets
>   pinctrl: aspeed-g5: Fix LPC register offsets
>   dt-bindings: aspeed-lpc: Remove LPC partitioning
> 
>  .../devicetree/bindings/mfd/aspeed-lpc.txt|  85 ++-
>  arch/arm/boot/dts/aspeed-g4.dtsi  |  74 --
>  arch/arm/boot/dts/aspeed-g5.dtsi  | 135 --
>  arch/arm/boot/dts/aspeed-g6.dtsi  | 135 --
>  drivers/char/ipmi/kcs_bmc_aspeed.c|  13 +-
>  drivers/pinctrl/aspeed/pinctrl-aspeed-g5.c|   2 +-
>  drivers/soc/aspeed/aspeed-lpc-ctrl.c  |   6 +-
>  drivers/soc/aspeed/aspeed-lpc-snoop.c |  11 +-
>  8 files changed, 176 insertions(+), 285 deletions(-)
> 
> --
> 2.17.1

[PATCH] media/platform/marvell-ccic: fix warnings when CONFIG_PM is not enabled

2020-10-19 Thread Randy Dunlap

From: Randy Dunlap 

Fix build warnings when CONFIG_PM is not set/enabled:

../drivers/media/platform/marvell-ccic/mmp-driver.c:324:12: warning: 'mmpcam_runtime_suspend' defined but not used [-Wunused-function]
  324 | static int mmpcam_runtime_suspend(struct device *dev)
../drivers/media/platform/marvell-ccic/mmp-driver.c:310:12: warning: 'mmpcam_runtime_resume' defined but not used [-Wunused-function]
  310 | static int mmpcam_runtime_resume(struct device *dev)

Signed-off-by: Randy Dunlap 
Cc: Jonathan Corbet 
Cc: linux-me...@vger.kernel.org
Cc: Mauro Carvalho Chehab 
---
 drivers/media/platform/marvell-ccic/mmp-driver.c |2 ++
 1 file changed, 2 insertions(+)

--- linux-next-20201009.orig/drivers/media/platform/marvell-ccic/mmp-driver.c
+++ linux-next-20201009/drivers/media/platform/marvell-ccic/mmp-driver.c
@@ -307,6 +307,7 @@ static int mmpcam_platform_remove(struct
  * Suspend/resume support.
  */
 
+#ifdef CONFIG_PM
 static int mmpcam_runtime_resume(struct device *dev)
 {
struct mmp_camera *cam = dev_get_drvdata(dev);
@@ -352,6 +353,7 @@ static int __maybe_unused mmpcam_resume(
return mccic_resume(&cam->mcam);
return 0;
 }
+#endif
 
 static const struct dev_pm_ops mmpcam_pm_ops = {
SET_RUNTIME_PM_OPS(mmpcam_runtime_suspend, mmpcam_runtime_resume, NULL)

Re: [Industrypack-devel] [PATCH] ipack: iopctal: remove unneeded break

2020-10-19 Thread Greg KH

On Tue, Oct 20, 2020 at 07:50:39AM +0200, Samuel Iglesias Gonsálvez wrote:
> Hi Tom,
> 
> Thanks for the patch!
> 
> Patch is,
> 
> Acked-by: Samuel Iglesias Gonsalvez 
> 
> Greg, Would you mind picking this patch through your char-misc
> tree?

Will do after -rc1 is out.

thanks,

greg k-h

[PATCH v2] fat: Add KUnit tests for checksums and timestamps

2020-10-19 Thread David Gow

Add some basic sanity-check tests for the fat_checksum() function and
the fat_time_unix2fat() and fat_time_fat2unix() functions. These unit
tests verify these functions return correct output for a number of test
inputs.

These tests were inspired by -- and serve a similar purpose to -- the
timestamp parsing KUnit tests in ext4[1].

Note that, unlike fat_time_unix2fat, fat_time_fat2unix wasn't previously
exported, so this patch exports it as well. This is required for the
case where we're building the fat and fat_test as modules.

[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/ext4/inode-test.c

Signed-off-by: David Gow 
Acked-by: OGAWA Hirofumi 
---

Changes since v1:
https://lore.kernel.org/linux-kselftest/20201017064107.375174-1-david...@google.com/
- Now export fat_time_fat2unix() so that the test can access it when
  built as a module.

 fs/fat/Kconfig|  13 +++
 fs/fat/Makefile   |   2 +
 fs/fat/fat_test.c | 197 ++
 fs/fat/misc.c |   1 +
 4 files changed, 213 insertions(+)
 create mode 100644 fs/fat/fat_test.c

diff --git a/fs/fat/Kconfig b/fs/fat/Kconfig
index 66532a71e8fd..fdef03b79c69 100644
--- a/fs/fat/Kconfig
+++ b/fs/fat/Kconfig
@@ -115,3 +115,16 @@ config FAT_DEFAULT_UTF8
  Say Y if you use UTF-8 encoding for file names, N otherwise.
 
  See  for more information.
+
+config FAT_KUNIT_TEST
+   tristate "Unit Tests for FAT filesystems" if !KUNIT_ALL_TESTS
+   select FAT_FS
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+   help
+ This builds the FAT KUnit tests
+
+ For more information on KUnit and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit
+
+ If unsure, say N
diff --git a/fs/fat/Makefile b/fs/fat/Makefile
index 70645ce2f7fc..2b034112690d 100644
--- a/fs/fat/Makefile
+++ b/fs/fat/Makefile
@@ -10,3 +10,5 @@ obj-$(CONFIG_MSDOS_FS) += msdos.o
 fat-y := cache.o dir.o fatent.o file.o inode.o misc.o nfs.o
 vfat-y := namei_vfat.o
 msdos-y := namei_msdos.o
+
+obj-$(CONFIG_FAT_KUNIT_TEST) += fat_test.o
diff --git a/fs/fat/fat_test.c b/fs/fat/fat_test.c
new file mode 100644
index ..c1b4348b9b3b
--- /dev/null
+++ b/fs/fat/fat_test.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit tests for FAT filesystems.
+ *
+ * Copyright (C) 2020 Google LLC.
+ * Author: David Gow 
+ */
+
+#include 
+
+#include "fat.h"
+
+static void fat_checksum_test(struct kunit *test)
+{
+   /* With no extension. */
+   KUNIT_EXPECT_EQ(test, fat_checksum("VMLINUX"), 44);
+   /* With 3-letter extension. */
+   KUNIT_EXPECT_EQ(test, fat_checksum("README  TXT"), 115);
+   /* With short (1-letter) extension. */
+   KUNIT_EXPECT_EQ(test, fat_checksum("ABCDEFGHA  "), 98);
+}
+
+
+struct fat_timestamp_testcase {
+   const char *name;
+   struct timespec64 ts;
+   __le16 time;
+   __le16 date;
+   u8 cs;
+   int time_offset;
+};
+
+const static struct fat_timestamp_testcase time_test_cases[] = {
+   {
+   .name = "Earliest possible UTC (1980-01-01 00:00:00)",
+   .ts = {.tv_sec = 315532800LL, .tv_nsec = 0L},
+   .time = 0,
+   .date = 33,
+   .cs = 0,
+   .time_offset = 0,
+   },
+   {
+   .name = "Latest possible UTC (2107-12-31 23:59:58)",
+   .ts = {.tv_sec = 4354819198LL, .tv_nsec = 0L},
+   .time = 49021,
+   .date = 65439,
+   .cs = 0,
+   .time_offset = 0,
+   },
+   {
+   .name = "Earliest possible (UTC-11) (== 1979-12-31 13:00:00 UTC)",
+   .ts = {.tv_sec = 315493200LL, .tv_nsec = 0L},
+   .time = 0,
+   .date = 33,
+   .cs = 0,
+   .time_offset = 11 * 60,
+   },
+   {
+   .name = "Latest possible (UTC+11) (== 2108-01-01 10:59:58 UTC)",
+   .ts = {.tv_sec = 4354858798LL, .tv_nsec = 0L},
+   .time = 49021,
+   .date = 65439,
+   .cs = 0,
+   .time_offset = -11 * 60,
+   },
+   {
+   .name = "Leap Day / Year (1996-02-29 00:00:00)",
+   .ts = {.tv_sec = 825552000LL, .tv_nsec = 0L},
+   .time = 0,
+   .date = 8285,
+   .cs = 0,
+   .time_offset = 0,
+   },
+   {
+   .name = "Year 2000 is leap year (2000-02-29 00:00:00)",
+   .ts = {.tv_sec = 951782400LL, .tv_nsec = 0L},
+   .time = 0,
+   .date = 10333,
+   .cs = 0,
+   .time_offset = 0,
+   },
+   {
+   .name = "Year 2100 not leap year (2100-03-01 00:00:00)",
+   .ts = {.tv_sec = 4107542400LL, .tv_nsec = 0L},
+   .time = 0,
+   .date = 61537,
+   .cs = 0,
+
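
As a cross-check of the expected .time and .date values in the test table
above, the sketch below packs a calendar date and time into the usual 16-bit
FAT fields (year since 1980 in the top 7 bits of the date, two-second
granularity in the time). It is a standalone illustration, not the kernel's
fat_time_unix2fat(): fat_pack_date() and fat_pack_time() are hypothetical
helpers, and they ignore time zones, endianness and the centisecond field.

```c
#include <stdint.h>

/* date: bits 15-9 = year - 1980, bits 8-5 = month (1-12), bits 4-0 = day */
uint16_t fat_pack_date(unsigned int year, unsigned int month, unsigned int day)
{
	return (uint16_t)(((year - 1980) << 9) | (month << 5) | day);
}

/* time: bits 15-11 = hour, bits 10-5 = minute, bits 4-0 = seconds / 2 */
uint16_t fat_pack_time(unsigned int hour, unsigned int min, unsigned int sec)
{
	return (uint16_t)((hour << 11) | (min << 5) | (sec >> 1));
}
```

For example, 1980-01-01 packs to (0 << 9) | (1 << 5) | 1 = 33, and
2107-12-31 23:59:58 packs to date 65439 and time 49021, which agree with the
first two table entries.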

Re: [PATCH 1/1] kobject: Don't emit change events if not in sysfs

2020-10-19 Thread Greg Kroah-Hartman

On Mon, Oct 19, 2020 at 03:32:57PM -0700, Abhishek Pandit-Subedi wrote:
> Add a check to make sure the kobj is created and in sysfs before sending
> a change event notification. Otherwise, udev rules that depend on the
> change notification may find that the path that changed doesn't actually
> exist.

Why is the user of the kobject trying to emit a uevent before it is
registered?  Shouldn't we fix the root problem here instead?  Otherwise
the event is still "gone", the caller will not know what to do about it.

Please fix the root problem here.

thanks,

greg k-h

Re: Re: [PATCH] [v2] rtc: sun6i: Fix memleak in sun6i_rtc_clk_init

2020-10-19 Thread dinghao . liu

> Hi,
> 
> On Sun, Oct 18, 2020 at 03:28:10PM +0800, Dinghao Liu wrote:
> > When clk_hw_register_fixed_rate_with_accuracy() fails,
> > clk_data should be freed. It's the same for the subsequent
> > two error paths, but we should also unregister the already
> > registered clocks in them.
> > 
> > Signed-off-by: Dinghao Liu 
> > ---
> > 
> > Changelog:
> > 
> > v2: - Unregister the already registered clocks on failure.
> > ---
> >  drivers/rtc/rtc-sun6i.c | 8 +---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/rtc/rtc-sun6i.c b/drivers/rtc/rtc-sun6i.c
> > index e2b8b150bcb4..6de0d3ad736a 100644
> > --- a/drivers/rtc/rtc-sun6i.c
> > +++ b/drivers/rtc/rtc-sun6i.c
> > @@ -272,7 +272,7 @@ static void __init sun6i_rtc_clk_init(struct device_node *node,
> > 3);
> > if (IS_ERR(rtc->int_osc)) {
> > pr_crit("Couldn't register the internal oscillator\n");
> > -   return;
> > +   goto err;
> > }
> >  
> > parents[0] = clk_hw_get_name(rtc->int_osc);
> > @@ -290,7 +290,8 @@ static void __init sun6i_rtc_clk_init(struct device_node *node,
> > rtc->losc = clk_register(NULL, &rtc->hw);
> > if (IS_ERR(rtc->losc)) {
> > pr_crit("Couldn't register the LOSC clock\n");
> > -   return;
> > +   clk_hw_unregister_fixed_rate(rtc->int_osc);
> > +   goto err;
> > }
> 
> The point of having labels for the error sequence is to avoid
> duplicating the error handling code in each and every error code path.
> 
> You should add another label for the fixed rate clock unregistration
> 

Fine, I will fix this soon.

Regards,
Dinghao
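
The label-per-resource unwinding suggested in the review can be sketched in
plain userspace C as follows. This is only an illustration of the pattern,
not the sun6i driver: struct res, res_get(), res_put(), struct ctx,
ctx_init() and the fail_at knob are all hypothetical, and live_resources
exists only so the cleanup can be observed.

```c
#include <stdlib.h>

struct res { int unused; };

/* Number of resources currently held; lets a caller check for leaks. */
static int live_resources;

/* Acquire a resource, or fail on request to exercise the error paths. */
static struct res *res_get(int fail)
{
	struct res *r = fail ? NULL : malloc(sizeof(*r));

	if (r)
		live_resources++;
	return r;
}

static void res_put(struct res *r)
{
	free(r);
	live_resources--;
}

struct ctx { struct res *a, *b, *c; };

/* fail_at selects which acquisition fails (0 = none). */
int ctx_init(struct ctx *ctx, int fail_at)
{
	ctx->a = res_get(fail_at == 1);
	if (!ctx->a)
		goto err;

	ctx->b = res_get(fail_at == 2);
	if (!ctx->b)
		goto err_put_a;

	ctx->c = res_get(fail_at == 3);
	if (!ctx->c)
		goto err_put_b;

	return 0;

	/* Labels run in reverse acquisition order and fall through. */
err_put_b:
	res_put(ctx->b);
err_put_a:
	res_put(ctx->a);
err:
	return -1;
}

void ctx_exit(struct ctx *ctx)
{
	res_put(ctx->c);
	res_put(ctx->b);
	res_put(ctx->a);
}
```

Each failure site only names the label for what has already been acquired,
so adding a new resource means adding one label rather than touching every
earlier error path.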

Re: [PATCH V2 1/2] opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER

2020-10-19 Thread Viresh Kumar

On 20-10-20, 10:35, Viresh Kumar wrote:
> On 19-10-20, 15:10, Sudeep Holla wrote:
> > On Mon, Oct 19, 2020 at 04:05:35PM +0530, Viresh Kumar wrote:
> > > On 19-10-20, 11:12, Sudeep Holla wrote:
> > > > Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not
> > > > by any clock provider driver. E.g. the issue you will see if "clocks"
> > > > property is used instead of "qcom,freq-domain" on Qcom parts.
> > > 
> > > Okay, I understand. But what I still don't understand is why it fails
> > > for you. You have a clocks property in DT for the CPU, the OPP core
> > > tries to get it and will get deferred-probed, which will try probing
> > > at a later point of time and it shall work then. Isn't it ?
> > >
> > 
> > Nope unfortunately. We don't have clock provider, so clk_get will
> > never succeed and always return -EPROBE_DEFER.
> 
> Now this is really bad, you have a fake clocks property, how is the
> OPP core supposed to know it ? Damn.

What about instead of fixing the OPP core, which really is doing the
right thing, we fix your driver (as you can't fix the DT) and add a
dummy CPU clk to make it all work ?

-- 
viresh

Re: [PATCH v3] mm: memcg/slab: Stop reparented obj_cgroups from charging root

2020-10-19 Thread Richard Palethorpe

Hello Shakeel,

Shakeel Butt  writes:
>>
>> V3: Handle common case where use_hierarchy=1 and update description.
>>
>>  mm/memcontrol.c | 7 +--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 6877c765b8d0..34b8c4a66853 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -291,7 +291,7 @@ static void obj_cgroup_release(struct percpu_ref *ref)
>>
>> spin_lock_irqsave(&css_set_lock, flags);
>> memcg = obj_cgroup_memcg(objcg);
>> -   if (nr_pages)
>> +   if (nr_pages && (!mem_cgroup_is_root(memcg) || memcg->use_hierarchy))
>
> If we have non-root memcg with use_hierarchy as 0 and this objcg was
> reparented then this __memcg_kmem_uncharge() can potentially underflow
> the page counter and give the same warning.

Yes, although the kernel considers such a config to be broken, and
prints a warning to the log, it does allow it.

>
> We never set root_mem_cgroup->objcg, so, no need to check for root

I don't think that is relevant as we get the memcg from objcg->memcg
which is set during reparenting. I suppose however, we can determine if
the objcg was reparented by inspecting memcg->objcg.

> here. I think checking just memcg->use_hierarchy should be sufficient.

If we just check use_hierarchy then objects directly charged to the
memcg where use_hierarchy=0 will not be uncharged. However, maybe it is
better to check if it was reparented and if use_hierarchy=0.

>
>> __memcg_kmem_uncharge(memcg, nr_pages);
>> list_del(&objcg->list);
>> mem_cgroup_put(memcg);
>> @@ -3100,6 +3100,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
>>  static void drain_obj_stock(struct memcg_stock_pcp *stock)
>>  {
>> struct obj_cgroup *old = stock->cached_objcg;
>> +   struct mem_cgroup *memcg;
>>
>> if (!old)
>> return;
>> @@ -3110,7 +3111,9 @@ static void drain_obj_stock(struct memcg_stock_pcp *stock)
>>
>> if (nr_pages) {
>> rcu_read_lock();
>> -   __memcg_kmem_uncharge(obj_cgroup_memcg(old), nr_pages);
>> +   memcg = obj_cgroup_memcg(old);
>> +   if (!mem_cgroup_is_root(memcg) || memcg->use_hierarchy)
>> +   __memcg_kmem_uncharge(memcg, nr_pages);
>> rcu_read_unlock();
>> }
>>
>> --
>> 2.28.0
>>


-- 
Thank you,
Richard.

[RESEND PATCH V2 2/3] pwm: imx27: Use dev_err_probe() to simplify error handling

2020-10-19 Thread Anson Huang

dev_err_probe() can reduce code size, unify error handling, and record the
reason for a deferred probe, so use it to simplify the code.

Signed-off-by: Anson Huang 
Acked-by: Uwe Kleine-König 
---
changes since V1:
- remove redundant return value print.
---
 drivers/pwm/pwm-imx27.c | 25 ++---
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/drivers/pwm/pwm-imx27.c b/drivers/pwm/pwm-imx27.c
index c50d453..ceaed03 100644
--- a/drivers/pwm/pwm-imx27.c
+++ b/drivers/pwm/pwm-imx27.c
@@ -315,27 +315,14 @@ static int pwm_imx27_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, imx);
 
imx->clk_ipg = devm_clk_get(&pdev->dev, "ipg");
-   if (IS_ERR(imx->clk_ipg)) {
-   int ret = PTR_ERR(imx->clk_ipg);
-
-   if (ret != -EPROBE_DEFER)
-   dev_err(&pdev->dev,
-   "getting ipg clock failed with %d\n",
-   ret);
-   return ret;
-   }
+   if (IS_ERR(imx->clk_ipg))
+   return dev_err_probe(&pdev->dev, PTR_ERR(imx->clk_ipg),
+"getting ipg clock failed\n");
 
imx->clk_per = devm_clk_get(&pdev->dev, "per");
-   if (IS_ERR(imx->clk_per)) {
-   int ret = PTR_ERR(imx->clk_per);
-
-   if (ret != -EPROBE_DEFER)
-   dev_err(&pdev->dev,
-   "failed to get peripheral clock: %d\n",
-   ret);
-
-   return ret;
-   }
+   if (IS_ERR(imx->clk_per))
+   return dev_err_probe(&pdev->dev, PTR_ERR(imx->clk_per),
+"failed to get peripheral clock\n");
 
imx->chip.ops = &pwm_imx27_ops;
imx->chip.dev = &pdev->dev;
-- 
2.7.4
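
To make the behaviour these conversions rely on concrete, here is a small
userspace model of the helper. It is an approximation under stated
assumptions, not the kernel implementation: err_probe_sketch(), defer_reason
and errors_logged are hypothetical names, the device is reduced to a string,
and the deferral reason is simply kept in a static buffer.

```c
#include <stdarg.h>
#include <stdio.h>

/* Kernel-internal errno meaning "driver requests probe retry". */
#define EPROBE_DEFER 517

static char defer_reason[128];   /* last recorded deferral reason */
static int errors_logged;        /* number of messages actually printed */

/*
 * Log the message unless the error is a probe deferral, in which case only
 * remember the reason. Always return err so callers can "return" the call.
 */
int err_probe_sketch(const char *dev, int err, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	if (err == -EPROBE_DEFER) {
		vsnprintf(defer_reason, sizeof(defer_reason), fmt, ap);
	} else {
		fprintf(stderr, "%s: error %d: ", dev, err);
		vfprintf(stderr, fmt, ap);
		errors_logged++;
	}
	va_end(ap);

	return err;
}
```

Because the error code is printed by the helper and passed straight through
as the return value, the open-coded "if (ret != -EPROBE_DEFER)" checks and
the "%d" in each message become unnecessary, which is what the V2 changelog
refers to.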

[RESEND PATCH V2 1/3] pwm: imx-tpm: Use dev_err_probe() to simplify error handling

2020-10-19 Thread Anson Huang

dev_err_probe() can reduce code size, unify error handling, and record the
reason for a deferred probe, so use it to simplify the code.

Signed-off-by: Anson Huang 
Acked-by: Uwe Kleine-König 
---
changes since V1:
- remove redundant return value print.
---
 drivers/pwm/pwm-imx-tpm.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/pwm/pwm-imx-tpm.c b/drivers/pwm/pwm-imx-tpm.c
index fcdf6be..aaf629b 100644
--- a/drivers/pwm/pwm-imx-tpm.c
+++ b/drivers/pwm/pwm-imx-tpm.c
@@ -350,13 +350,9 @@ static int pwm_imx_tpm_probe(struct platform_device *pdev)
return PTR_ERR(tpm->base);
 
tpm->clk = devm_clk_get(&pdev->dev, NULL);
-   if (IS_ERR(tpm->clk)) {
-   ret = PTR_ERR(tpm->clk);
-   if (ret != -EPROBE_DEFER)
-   dev_err(&pdev->dev,
-   "failed to get PWM clock: %d\n", ret);
-   return ret;
-   }
+   if (IS_ERR(tpm->clk))
+   return dev_err_probe(&pdev->dev, PTR_ERR(tpm->clk),
+"failed to get PWM clock\n");
 
ret = clk_prepare_enable(tpm->clk);
if (ret) {
-- 
2.7.4

[RESEND PATCH V2 3/3] pwm: imx1: Use dev_err_probe() to simplify error handling

2020-10-19 Thread Anson Huang

dev_err_probe() can reduce code size, unify error handling and record the
deferred probe reason, so use it to simplify the code.

Signed-off-by: Anson Huang 
Acked-by: Uwe Kleine-König 
---
changes since V1:
- remove redundant return value print.
---
 drivers/pwm/pwm-imx1.c | 21 ++---
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/drivers/pwm/pwm-imx1.c b/drivers/pwm/pwm-imx1.c
index f8b2c2e..4877734 100644
--- a/drivers/pwm/pwm-imx1.c
+++ b/drivers/pwm/pwm-imx1.c
@@ -145,23 +145,14 @@ static int pwm_imx1_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, imx);
 
imx->clk_ipg = devm_clk_get(&pdev->dev, "ipg");
-   if (IS_ERR(imx->clk_ipg)) {
-   dev_err(&pdev->dev, "getting ipg clock failed with %ld\n",
-   PTR_ERR(imx->clk_ipg));
-   return PTR_ERR(imx->clk_ipg);
-   }
+   if (IS_ERR(imx->clk_ipg))
+   return dev_err_probe(&pdev->dev, PTR_ERR(imx->clk_ipg),
+"getting ipg clock failed\n");
 
imx->clk_per = devm_clk_get(&pdev->dev, "per");
-   if (IS_ERR(imx->clk_per)) {
-   int ret = PTR_ERR(imx->clk_per);
-
-   if (ret != -EPROBE_DEFER)
-   dev_err(&pdev->dev,
-   "failed to get peripheral clock: %d\n",
-   ret);
-
-   return ret;
-   }
+   if (IS_ERR(imx->clk_per))
+   return dev_err_probe(&pdev->dev, PTR_ERR(imx->clk_per),
+"failed to get peripheral clock\n");
 
imx->chip.ops = &pwm_imx1_ops;
imx->chip.dev = &pdev->dev;
-- 
2.7.4
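
The pattern applied by the PWM patches above can be illustrated with a small userspace sketch. Everything prefixed with mock_ is hypothetical and only imitates the behaviour the commit messages describe (return the error, and stay quiet on probe deferral); it is not the kernel implementation. EPROBE_DEFER is defined locally because userspace errno.h does not provide it.

```c
#include <stdarg.h>
#include <stdio.h>

/* Kernel-internal errno value; not exported to userspace headers. */
#define EPROBE_DEFER 517

/* Counts how many messages were actually printed. */
static int mock_logged_errors;

/*
 * Hypothetical stand-in for dev_err_probe(): print the message unless the
 * error is -EPROBE_DEFER, and always hand the error code back to the caller.
 */
static int mock_dev_err_probe(const char *dev_name, int err, const char *fmt, ...)
{
	va_list args;

	if (err != -EPROBE_DEFER) {
		fprintf(stderr, "%s: error %d: ", dev_name, err);
		va_start(args, fmt);
		vfprintf(stderr, fmt, args);
		va_end(args);
		mock_logged_errors++;
	}

	return err;
}

/* Probe-like caller: one statement replaces the old if/print/return block. */
static int mock_probe(int clk_err)
{
	if (clk_err)
		return mock_dev_err_probe("pwm-mock", clk_err,
					  "failed to get PWM clock\n");

	return 0;
}
```

With this shape the caller no longer needs a local ret variable or an explicit comparison against -EPROBE_DEFER.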

Re: [PATCH 1/2] perf jevents: Tidy error handling

2020-10-19 Thread Namhyung Kim

Hello,

On Tue, Oct 20, 2020 at 12:42 AM John Garry  wrote:
>
> There is much duplication in the error handling for directory traversing
> for processing JSONs.
>
> Factor out the common code to tidy a bit.
>
> Signed-off-by: John Garry 
> ---
[SNIP]
> -empty_map:
> +err_processing_std_arch_event_dir:
> +   err_string_ext = " for std arch event";
> +err_processing_dir:
> +   if (verbose || rc > 0) {
> +   pr_info("%s: Error walking file tree %s%s\n", prog, ldirname,
> +   err_string_ext);

This used to be printed only if verbose was set, but that has changed now.

Thanks
Namhyung


> +   empty_map = 1;
> +   } else {
> +   ret = 1;
> +   }
> +err_close_eventsfp:
> fclose(eventsfp);
> -   create_empty_mapping(output_file);
> +   if (empty_map)
> +   create_empty_mapping(output_file);
> +err_out:
> free_arch_std_events();
> -out_free_mapfile:
> free(mapfile);
> return ret;
>  }
> --
> 2.26.2
>

Re: [PATCH RFC 0/8] kasan: hardware tag-based mode for production use on arm64

2020-10-19 Thread Dmitry Vyukov

On Tue, Oct 20, 2020 at 12:51 AM Kostya Serebryany  wrote:
>
> Hi,
> I would like to hear opinions from others in CC on these choices:
> * Production use of In-kernel MTE should be based on stripped-down
> KASAN, or implemented independently?

Andrey, what are the fundamental consequences of basing MTE on KASAN?
I would assume that there are none as we can change KASAN code and
special case some code paths as necessary.

> * Should we aim at a single boot-time flag (with several values) or
> for several independent flags (OFF/SYNC/ASYNC, Stack traces on/off)

We won't be able to answer this question for several years until we
have actual hardware/users...
It's definitely safer to aim at multiple options. I would reuse the fs
opt parsing code as we seem to have lots of potential things to
configure so that we can do:
kasan_options=quarantine=off,fault=panic,trap=async
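
A minimal userspace sketch of parsing such a comma-separated key=value string follows. It does not use the kernel's fs option parser; the struct, the function names and the alternative values ("on", "report", "sync") are assumptions made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical option set; field names are illustrative only. */
struct mock_kasan_opts {
	bool quarantine;
	bool fault_panic;
	bool trap_async;
};

/* Set *flag from val, accepting exactly the two given spellings. */
static int mock_set_flag(const char *val, const char *on, const char *off,
			 bool *flag)
{
	if (strcmp(val, on) == 0) {
		*flag = true;
		return 0;
	}
	if (strcmp(val, off) == 0) {
		*flag = false;
		return 0;
	}
	return -1;
}

/*
 * Parse "key=value,key=value,...". The buffer is modified in place.
 * Returns 0 on success and -1 on an unknown key or value.
 */
static int mock_parse_kasan_options(char *buf, struct mock_kasan_opts *opts)
{
	char *tok = buf;

	while (tok && *tok) {
		char *next = strchr(tok, ',');
		char *val;
		int ret;

		if (next)
			*next++ = '\0';

		val = strchr(tok, '=');
		if (!val)
			return -1;
		*val++ = '\0';

		if (strcmp(tok, "quarantine") == 0)
			ret = mock_set_flag(val, "on", "off", &opts->quarantine);
		else if (strcmp(tok, "fault") == 0)
			ret = mock_set_flag(val, "panic", "report", &opts->fault_panic);
		else if (strcmp(tok, "trap") == 0)
			ret = mock_set_flag(val, "async", "sync", &opts->trap_async);
		else
			ret = -1;

		if (ret)
			return ret;

		tok = next;
	}

	return 0;
}
```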

I am also always confused by the term "debug" when configuring the
kernel. In some cases it's for debugging of the subsystem (for
developers of KASAN), in some cases it adds additional checks to catch
misuses of the subsystem, and in some it just adds more debugging output
on the console. In this case it's actually none of these. But I am
not sure what's a better name ("full"?). Even if we split options into
multiple ones, we can still have some kind of presets that just flip all
other options to reasonable values.



> Andrey, please give us some idea of the CPU and RAM overheads other
> than those coming from MTE
> * stack trace collection and storage
> * adding redzones to every allocation - not strictly needed for MTE,
> but convenient to store the stack trace IDs.
>
> Andrey: with production MTE we should not be using quarantine, which
> means storing the stack trace IDs
> in the deallocated memory doesn't provide good report quality.
> We may need to consider another approach, e.g. the one used in HWASAN
> (separate ring buffer, per thread or per core)
>
> --kcc
>
>
> On Fri, Oct 16, 2020 at 8:52 AM Andrey Konovalov  
> wrote:
> >
> > On Fri, Oct 16, 2020 at 3:31 PM Marco Elver  wrote:
> > >
> > > On Fri, 16 Oct 2020 at 15:17, 'Andrey Konovalov' via kasan-dev
> > >  wrote:
> > > [...]
> > > > > > The intention with this kind of a high level switch is to hide the
> > > > > > implementation details. Arguably, we could add multiple switches 
> > > > > > that allow
> > > > > > to separately control each KASAN or MTE feature, but I'm not sure 
> > > > > > there's
> > > > > > much value in that.
> > > > > >
> > > > > > Does this make sense? Any preference regarding the name of the 
> > > > > > parameter
> > > > > > and its values?
> > > > >
> > > > > KASAN itself used to be a debugging tool only. So introducing an "on"
> > > > > mode which no longer follows this convention may be confusing.
> > > >
> > > > Yeah, perhaps "on" is not the best name here.
> > > >
> > > > > Instead, maybe the following might be less confusing:
> > > > >
> > > > > "full" - current "debug", normal KASAN, all debugging help available.
> > > > > "opt" - current "on", optimized mode for production.
> > > >
> > > > How about "prod" here?
> > >
> > > SGTM.
> > >
> > > [...]
> > > >
> > > > > > Should we somehow control whether to panic the kernel on a tag 
> > > > > > fault?
> > > > > > Another boot time parameter perhaps?
> > > > >
> > > > > It already respects panic_on_warn, correct?
> > > >
> > > > Yes, but Android is unlikely to enable panic_on_warn as they have
> > > > warnings happening all over. AFAIR Pixel 3/4 kernels actually have a
> > > > custom patch that enables kernel panic for KASAN crashes specifically
> > > > (even though they don't obviously use KASAN in production), and I
> > > > think it's better to provide a similar facility upstream. Maybe call
> > > > it panic_on_kasan or something?
> > >
> > > Best would be if kasan= can take another option, e.g.
> > > "kasan=prod,panic". I think you can change the strcmp() to a
> > > str_has_prefix() for the checks for full/prod/on/off, and then check
> > > if what comes after it is ",panic".
> > >
> > > Thanks,
> > > -- Marco
> >
> > CC Kostya and Serban.
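
The "kasan=prod,panic" idea quoted above (match the mode by prefix, then check what follows it) can be sketched in userspace as below. mock_str_has_prefix() imitates the kernel helper's contract of returning the prefix length on a match and 0 otherwise; the enum, the function names and the exact set of modes are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum mock_kasan_mode { MOCK_KASAN_OFF, MOCK_KASAN_PROD, MOCK_KASAN_FULL };

/* Userspace stand-in: length of prefix if str starts with it, else 0. */
static size_t mock_str_has_prefix(const char *str, const char *prefix)
{
	size_t len = strlen(prefix);

	return strncmp(str, prefix, len) == 0 ? len : 0;
}

/*
 * Parse the value of a hypothetical "kasan=" parameter.
 * Accepts "<mode>" or "<mode>,panic". Returns 0 on success, -1 otherwise.
 */
static int mock_parse_kasan_arg(const char *arg, enum mock_kasan_mode *mode,
				bool *panic)
{
	size_t len;

	if ((len = mock_str_has_prefix(arg, "off")))
		*mode = MOCK_KASAN_OFF;
	else if ((len = mock_str_has_prefix(arg, "prod")))
		*mode = MOCK_KASAN_PROD;
	else if ((len = mock_str_has_prefix(arg, "full")))
		*mode = MOCK_KASAN_FULL;
	else
		return -1;

	/* Whatever follows the mode must be empty or exactly ",panic". */
	arg += len;
	if (*arg == '\0') {
		*panic = false;
		return 0;
	}
	if (strcmp(arg, ",panic") == 0) {
		*panic = true;
		return 0;
	}

	return -1;
}
```

Checking the remainder strictly means a value such as "production" is rejected rather than silently treated as "prod".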

Re: [PATCH v4 08/15] Documentation: of: Convert graph bindings to json-schema

2020-10-19 Thread Sameer Pujar





Signed-off-by: Sameer Pujar 
Cc: Philipp Zabel 
---
  Documentation/devicetree/bindings/graph.txt  | 128 
  Documentation/devicetree/bindings/graph.yaml | 170 +++
  2 files changed, 170 insertions(+), 128 deletions(-)
  delete mode 100644 Documentation/devicetree/bindings/graph.txt
  create mode 100644 Documentation/devicetree/bindings/graph.yaml

I'd like to move this to the dtschema repository instead.


Do you mean I need to submit this patch separately to the dtschema repo?

...

+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/graph.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Common bindings for device graphs
+
+description: |
+  The hierarchical organisation of the device tree is well suited to describe
+  control flow to devices, but there can be more complex connections between
+  devices that work together to form a logical compound device, following an
+  arbitrarily complex graph.
+  There already is a simple directed graph between devices tree nodes using
+  phandle properties pointing to other nodes to describe connections that
+  can not be inferred from device tree parent-child relationships. The device
+  tree graph bindings described herein abstract more complex devices that can
+  have multiple specifiable ports, each of which can be linked to one or more
+  ports of other devices.
+
+  These common bindings do not contain any information about the direction or
+  type of the connections, they just map their existence. Specific properties
+  may be described by specialized bindings depending on the type of connection.
+
+  To see how this binding applies to video pipelines, for example, see
+  Documentation/devicetree/bindings/media/video-interfaces.txt.
+  Here the ports describe data interfaces, and the links between them are
+  the connecting data buses. A single port with multiple connections can
+  correspond to multiple devices being connected to the same physical bus.
+
+maintainers:
+  - Philipp Zabel 
+
+definitions:
+
+  port:
+type: object
+description: |
+  If there is more than one 'port' or more than one 'endpoint' node
+  or 'reg' property present in the port and/or endpoint nodes then
+  '#address-cells' and '#size-cells' properties are required in relevant
+  parent node.

reg property.


done




+
+patternProperties:
+  "^endpoint(@[0-9a-f]+)?$":
+type: object
+properties:

reg?


done


+  remote-endpoint:
+description: |
+  phandle to an 'endpoint' subnode of a remote device node.
+$ref: /schemas/types.yaml#/definitions/phandle
+
+  ports:
+type: object
+patternProperties:
+  "^port(@[0-9a-f]+)?$":
+$ref: "#/definitions/port"

No reason for this to be under 'definitions'. Just move down.


Would definitions be needed if some schemas want to refer to the base graph 
schema? Or can they just include the base schema directly, so that 
definitions are not really required?


But what if they want to extend a few properties? For example:

graph.yaml
--
endpoint {
    remote-endpoint = <>;
};

*audio-graph-card.yaml
--
endpoint {
    remote-endpoint = <>;

    property-x;
    node-x {
    ...
    };
};




+
+properties:
+  ports:
+$ref: "#/definitions/ports"
+
+patternProperties:
+  "^port(@[0-9a-f]+)?$":
+$ref: "#/definitions/port"
+
+additionalProperties: false

This needs to be true here. But you need this within 'ports' and 'port'.
(I think... I think we only have extra properties within endpoint
nodes.)


I think audio-graph currently allows a few properties at port/ports. I am 
not sure whether Morimoto-san plans to get rid of this.

Re: [PATCH v2] ARM: kprobes: Avoid fortify_panic() when copying optprobe template

2020-10-19 Thread Joel Stanley

On Fri, 9 Oct 2020 at 05:20, Joel Stanley  wrote:
>
> On Thu, 1 Oct 2020 at 04:30, Andrew Jeffery  wrote:
> >
> > Setting both CONFIG_KPROBES=y and CONFIG_FORTIFY_SOURCE=y on ARM leads
> > to a panic in memcpy() when injecting a kprobe despite the fixes found
> > in commit e46daee53bb5 ("ARM: 8806/1: kprobes: Fix false positive with
> > FORTIFY_SOURCE") and commit 0ac569bf6a79 ("ARM: 8834/1: Fix: kprobes:
> > optimized kprobes illegal instruction").
> >
> > arch/arm/include/asm/kprobes.h effectively declares
> > the target type of the optprobe_template_entry assembly label as a u32
> > which leads memcpy()'s __builtin_object_size() call to determine that
> > the pointed-to object is of size four. However, the symbol is used as a 
> > handle
> > for the optimised probe assembly template that is at least 96 bytes in size.
> > The symbol's use despite its type blows up the memcpy() in ARM's
> > arch_prepare_optimized_kprobe() with a false-positive fortify_panic() when 
> > it
> > should instead copy the optimised probe template into place:
> >
> > ```
> > $ sudo perf probe -a aspeed_g6_pinctrl_probe
> > [  158.457252] detected buffer overflow in memcpy
> > ```
> >
> > Fixes: e46daee53bb5 ("ARM: 8806/1: kprobes: Fix false positive with 
> > FORTIFY_SOURCE")
> > Fixes: 0ac569bf6a79 ("ARM: 8834/1: Fix: kprobes: optimized kprobes illegal 
> > instruction")
> > Cc: Luka Oreskovic 
> > Cc: Juraj Vijtiuk 
> > Suggested-by: Kees Cook 
> > Signed-off-by: Andrew Jeffery 
>
> Tested-by: Joel Stanley 
> Reviewed-by: Joel Stanley 
>
> Thanks Andrew.
>
> > ---
> > v1 was sent some time back, in May:
> >
> > https://lore.kernel.org/linux-arm-kernel/20200517153959.293224-1-and...@aj.id.au/

Russell, are you picking this fix up?

Would you prefer it to go through someone else's tree?

Cheers,

Joel

> >
> > I've taken the patch that Kees' suggested in the replies and tested it.
> > ---
> >  arch/arm/include/asm/kprobes.h| 22 +++---
> >  arch/arm/probes/kprobes/opt-arm.c | 18 +-
> >  2 files changed, 20 insertions(+), 20 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/kprobes.h b/arch/arm/include/asm/kprobes.h
> > index 213607a1f45c..e26a278d301a 100644
> > --- a/arch/arm/include/asm/kprobes.h
> > +++ b/arch/arm/include/asm/kprobes.h
> > @@ -44,20 +44,20 @@ int kprobe_exceptions_notify(struct notifier_block 
> > *self,
> >  unsigned long val, void *data);
> >
> >  /* optinsn template addresses */
> > -extern __visible kprobe_opcode_t optprobe_template_entry;
> > -extern __visible kprobe_opcode_t optprobe_template_val;
> > -extern __visible kprobe_opcode_t optprobe_template_call;
> > -extern __visible kprobe_opcode_t optprobe_template_end;
> > -extern __visible kprobe_opcode_t optprobe_template_sub_sp;
> > -extern __visible kprobe_opcode_t optprobe_template_add_sp;
> > -extern __visible kprobe_opcode_t optprobe_template_restore_begin;
> > -extern __visible kprobe_opcode_t optprobe_template_restore_orig_insn;
> > -extern __visible kprobe_opcode_t optprobe_template_restore_end;
> > +extern __visible kprobe_opcode_t optprobe_template_entry[];
> > +extern __visible kprobe_opcode_t optprobe_template_val[];
> > +extern __visible kprobe_opcode_t optprobe_template_call[];
> > +extern __visible kprobe_opcode_t optprobe_template_end[];
> > +extern __visible kprobe_opcode_t optprobe_template_sub_sp[];
> > +extern __visible kprobe_opcode_t optprobe_template_add_sp[];
> > +extern __visible kprobe_opcode_t optprobe_template_restore_begin[];
> > +extern __visible kprobe_opcode_t optprobe_template_restore_orig_insn[];
> > +extern __visible kprobe_opcode_t optprobe_template_restore_end[];
> >
> >  #define MAX_OPTIMIZED_LENGTH   4
> >  #define MAX_OPTINSN_SIZE   \
> > -   ((unsigned long)&optprobe_template_end -\
> > -(unsigned long)&optprobe_template_entry)
> > +   ((unsigned long)optprobe_template_end - \
> > +(unsigned long)optprobe_template_entry)
> >  #define RELATIVEJUMP_SIZE  4
> >
> >  struct arch_optimized_insn {
> > diff --git a/arch/arm/probes/kprobes/opt-arm.c 
> > b/arch/arm/probes/kprobes/opt-arm.c
> > index 7a449df0b359..c78180172120 100644
> > --- a/arch/arm/probes/kprobes/opt-arm.c
> > +++ b/arch/arm/probes/kprobes/opt-arm.c
> > @@ -85,21 +85,21 @@ asm (
> > "optprobe_template_end:\n");
> >
> >  #define TMPL_VAL_IDX \
> > -   ((unsigned long *)&optprobe_template_val - (unsigned long 
> > *)&optprobe_template_entry)
> > +   ((unsigned long *)optprobe_template_val - (unsigned long 
> > *)optprobe_template_entry)
> >  #define TMPL_CALL_IDX \
> > -   ((unsigned long *)&optprobe_template_call - (unsigned long 
> > *)&optprobe_template_entry)
> > +   ((unsigned long *)optprobe_template_call - (unsigned long 
> > *)optprobe_template_entry)
> >  #define TMPL_END_IDX \
> > -   ((unsigned long *)&optprobe_template_end - (unsigned long 
> > *)&optprobe_template_entry)
> >

Re: [PATCH] opp: Don't always remove static OPPs in _of_add_opp_table_v1()

2020-10-19 Thread Viresh Kumar

On 15-10-20, 10:01, Viresh Kumar wrote:
> On 15-10-20, 02:35, Aisheng Dong wrote:
> > Hi Viresh
> > 
> > Thanks for the quick fix.
> > 
> > > From: Viresh Kumar 
> > > Sent: Wednesday, October 14, 2020 12:26 PM
> > > 
> > > The patch missed returning 0 early in case of success and hence the 
> > > static OPPs
> > > got removed by mistake. Fix it.
> > > 
> > > Fixes: 90d46d71cce2 ("opp: Handle multiple calls for same OPP table in
> > > _of_add_opp_table_v1()")
> > > Reported-by: Aisheng Dong 
> > > Signed-off-by: Viresh Kumar 
> > 
> > Tested-by: Dong Aisheng 
> 
> Thanks.
> 
> Rafael: Please apply this one directly for 5.10-rc. Thanks.

Rafael: Ping.

-- 
viresh

Re: [PATCH RFC 0/8] kasan: hardware tag-based mode for production use on arm64

2020-10-19 Thread Dmitry Vyukov

On Mon, Oct 19, 2020 at 2:23 PM Marco Elver  wrote:
>
> On Wed, 14 Oct 2020 at 22:44, Andrey Konovalov  wrote:
> [...]
> > A question to KASAN maintainers: what would be the best way to support the
> > "off" mode? I see two potential approaches: add a check into each kasan
> > callback (easier to implement, but we still call kasan callbacks, even
> > though they immediately return), or add inline header wrappers that do the
> > same.
>
> This is tricky, because we don't know how bad the performance will be
> if we keep them as calls. We'd have to understand the performance
> impact of keeping them as calls, and if the performance impact is
> acceptable or not.
>
> Without understanding the performance impact, the only viable option I
> see is to add __always_inline kasan_foo() wrappers, which use the
> static branch to guard calls to __kasan_foo().

This sounds reasonable to me.
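
The wrapper shape discussed above can be sketched in userspace roughly as follows. A plain bool stands in for the static branch, and all names (mock_kasan_foo(), mock_kasan_foo_impl(), the counter) are hypothetical; a real static key patches the branch at runtime, which this sketch does not attempt to reproduce.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a static key; in the kernel this would be a patched branch. */
static bool mock_kasan_enabled;

/* Counts how often the out-of-line implementation was reached. */
static int mock_slow_path_calls;

/* Out-of-line implementation, corresponding to __kasan_foo() in the thread. */
static void mock_kasan_foo_impl(void *addr, size_t size)
{
	(void)addr;
	(void)size;
	mock_slow_path_calls++;
}

/*
 * Always-inlined wrapper, corresponding to kasan_foo(): when the feature is
 * off, the caller pays only for the flag check and never makes the call.
 */
static inline __attribute__((always_inline))
void mock_kasan_foo(void *addr, size_t size)
{
	if (mock_kasan_enabled)
		mock_kasan_foo_impl(addr, size);
}
```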

Re: [PATCH V2 1/2] opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER

2020-10-19 Thread Viresh Kumar

On 19-10-20, 15:10, Sudeep Holla wrote:
> On Mon, Oct 19, 2020 at 04:05:35PM +0530, Viresh Kumar wrote:
> > On 19-10-20, 11:12, Sudeep Holla wrote:
> > > Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not
> > > by any clock provider driver. E.g. the issue you will see if "clocks"
> > > property is used instead of "qcom,freq-domain" on Qcom parts.
> > 
> > Okay, I understand. But what I still don't understand is why it fails
> > for you. You have a clocks property in DT for the CPU, the OPP core
> > tries to get it and will get deferred-probed, which will try probing
> > at a later point of time and it shall work then. Isn't it ?
> >
> 
> Nope unfortunately. We don't have clock provider, so clk_get will
> never succeed and always return -EPROBE_DEFER.

Now this is really bad, you have a fake clocks property, how is the
OPP core supposed to know it ? Damn.

-- 
viresh

Re: [PATCH] wireguard: convert selftest/{counter,ratelimiter}.c to KUnit

2020-10-19 Thread kernel test robot

Hi Daniel,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on 7cf726a59435301046250c42131554d9ccc566b8]

url:    https://github.com/0day-ci/linux/commits/Daniel-Latypov/wireguard-convert-selftest-counter-ratelimiter-c-to-KUnit/20201020-042650
base:   7cf726a59435301046250c42131554d9ccc566b8
config: mips-allyesconfig (attached as .config)
compiler: mips-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/7a0f82af0af9735a7f20ef9e291e704aff218e8f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Daniel-Latypov/wireguard-convert-selftest-counter-ratelimiter-c-to-KUnit/20201020-042650
        git checkout 7a0f82af0af9735a7f20ef9e291e704aff218e8f
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=mips

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   drivers/net/wireguard/counter_test.c:84:2: note: in expansion of macro 'T'
  84 |  T(COUNTER_WINDOW_SIZE + 1, true);
 |  ^
   include/linux/minmax.h:18:28: warning: comparison of distinct pointer types 
lacks a cast
  18 |  (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
 |^~
   include/kunit/test.h:748:9: note: in expansion of macro '__typecheck'
 748 |  ((void)__typecheck(__left, __right));   \
 | ^~~
   include/kunit/test.h:772:2: note: in expansion of macro 
'KUNIT_BASE_BINARY_ASSERTION'
 772 |  KUNIT_BASE_BINARY_ASSERTION(test,   \
 |  ^~~
   include/kunit/test.h:861:2: note: in expansion of macro 
'KUNIT_BASE_EQ_MSG_ASSERTION'
 861 |  KUNIT_BASE_EQ_MSG_ASSERTION(test,   \
 |  ^~~
   include/kunit/test.h:871:2: note: in expansion of macro 
'KUNIT_BINARY_EQ_MSG_ASSERTION'
 871 |  KUNIT_BINARY_EQ_MSG_ASSERTION(test,   \
 |  ^
   include/kunit/test.h:1234:2: note: in expansion of macro 
'KUNIT_BINARY_EQ_ASSERTION'
1234 |  KUNIT_BINARY_EQ_ASSERTION(test, KUNIT_EXPECTATION, left, right)
 |  ^
   drivers/net/wireguard/counter_test.c:22:3: note: in expansion of macro 
'KUNIT_EXPECT_EQ'
  22 |   KUNIT_EXPECT_EQ(test, counter_validate(counter, n), v)
 |   ^~~
   drivers/net/wireguard/counter_test.c:85:2: note: in expansion of macro 'T'
  85 |  T(0, false);
 |  ^
   include/linux/minmax.h:18:28: warning: comparison of distinct pointer types 
lacks a cast
  18 |  (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
 |^~
   include/kunit/test.h:748:9: note: in expansion of macro '__typecheck'
 748 |  ((void)__typecheck(__left, __right));   \
 | ^~~
   include/kunit/test.h:772:2: note: in expansion of macro 
'KUNIT_BASE_BINARY_ASSERTION'
 772 |  KUNIT_BASE_BINARY_ASSERTION(test,   \
 |  ^~~
   include/kunit/test.h:861:2: note: in expansion of macro 
'KUNIT_BASE_EQ_MSG_ASSERTION'
 861 |  KUNIT_BASE_EQ_MSG_ASSERTION(test,   \
 |  ^~~
   include/kunit/test.h:871:2: note: in expansion of macro 
'KUNIT_BINARY_EQ_MSG_ASSERTION'
 871 |  KUNIT_BINARY_EQ_MSG_ASSERTION(test,   \
 |  ^
   include/kunit/test.h:1234:2: note: in expansion of macro 
'KUNIT_BINARY_EQ_ASSERTION'
1234 |  KUNIT_BINARY_EQ_ASSERTION(test, KUNIT_EXPECTATION, left, right)
 |  ^
   drivers/net/wireguard/counter_test.c:22:3: note: in expansion of macro 
'KUNIT_EXPECT_EQ'
  22 |   KUNIT_EXPECT_EQ(test, counter_validate(counter, n), v)
 |   ^~~
   drivers/net/wireguard/counter_test.c:89:3: note: in expansion of macro 'T'
  89 |   T(i, true);
 |   ^
   include/linux/minmax.h:18:28: warning: comparison of distinct pointer types 
lacks a cast
  18 |  (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
 |^~
   include/kunit/test.h:748:9: note: in expansion of macro '__typecheck'
 748 |  ((void)__typecheck(__left, __right));   \
 | ^~~
   include/kunit/test.h:772:2: note: in expansion of macro 
'KUNIT_BASE_BINARY_ASSERTION'
 772 |  KUNIT_BASE_BINARY_ASSERTION(test,   \
 |  ^~~
   include/kunit/test.h:861:2: note: in expansion of macro 
'KUNIT_BASE_EQ_MSG_ASSERTION'
 861 |  KUNIT_BASE_EQ_MSG_ASSERTION(test,   \
 |  ^~~
   include/kunit/test.h:871:2

RE: [PATCH 08/20] dt-bindings: usb: renesas-xhci: Refer to the usb-xhci.yaml file

2020-10-19 Thread Yoshihiro Shimoda

Hi,

> From: Serge Semin, Sent: Wednesday, October 14, 2020 7:14 PM
> 
> With minor peculiarities (like uploading some vendor-specific firmware)
> these are just Generic xHCI controllers fully compatible with its
> properties. Make sure the Renesas USB xHCI DT nodes are also validated
> against the Generic xHCI DT schema.
> 
> Signed-off-by: Serge Semin 
> ---

Thank you for the patch!

Reviewed-by: Yoshihiro Shimoda 

Best regards,
Yoshihiro Shimoda

Re: [PATCH v5 2/4] leds: Add driver for Qualcomm LPG

2020-10-19 Thread Bjorn Andersson

On Sun 18 Oct 15:12 CDT 2020, Andy Shevchenko wrote:

> On Sat, Oct 17, 2020 at 8:41 AM Bjorn Andersson
>  wrote:
> >
> > The Light Pulse Generator (LPG) is a PWM-block found in a wide range of
> > PMICs from Qualcomm. It can operate on fixed parameters or based on a
> > lookup-table, altering the duty cycle over time - which provides the
> > means for e.g. hardware assisted transitions of LED brightness.
> 
> > +config LEDS_QCOM_LPG
> > +   tristate "LED support for Qualcomm LPG"
> > +   depends on LEDS_CLASS_MULTICOLOR
> > +   depends on OF
> > +   depends on SPMI
> 
> 
> > +#include 
> > +#include 
> 
> ...
> 
> > +struct lpg {
> > +   struct device *dev;
> > +   struct regmap *map;
> 
> Can't you derive the former from the latter?
> 

No, because map->dev is actually the dev->parent.

> > +
> > +   struct pwm_chip pwm;
> > +
> > +   const struct lpg_data *data;
> > +
> > +   u32 lut_base;
> > +   u32 lut_size;
> > +   unsigned long *lut_bitmap;
> > +
> > +   u32 triled_base;
> > +   u32 triled_src;
> > +
> > +   struct lpg_channel *channels;
> > +   unsigned int num_channels;
> > +};
> 
> ...
> 
> > +static int lpg_lut_store(struct lpg *lpg, struct led_pattern *pattern,
> > +size_t len, unsigned int *lo_idx, unsigned int 
> > *hi_idx)
> > +{
> > +   unsigned int idx;
> > +   u8 val[2];
> 
> __be16 val;
> 
> > +   int i;
> > +
> > +   /* Hardware does not behave when LO_IDX == HI_IDX */
> > +   if (len == 1)
> > +   return -EINVAL;
> > +
> > +   idx = bitmap_find_next_zero_area(lpg->lut_bitmap, lpg->lut_size,
> > +0, len, 0);
> > +   if (idx >= lpg->lut_size)
> > +   return -ENOMEM;
> > +
> > +   for (i = 0; i < len; i++) {
> > +   val[0] = pattern[i].brightness & 0xff;
> > +   val[1] = pattern[i].brightness >> 8;
> 
> cpu_to_be16();
> 

I like it, but isn't that a le16?
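
To make the byte-order point concrete, here is a small userspace sketch. The helper names are hypothetical; the first one repeats the byte split used in the quoted loop, and the second shows what a big-endian encoding would produce instead.

```c
#include <stdint.h>

/* Same split as the quoted loop: low byte first, i.e. little-endian. */
static void mock_put_le16(uint16_t value, uint8_t out[2])
{
	out[0] = value & 0xff;
	out[1] = value >> 8;
}

/* Big-endian layout for comparison: high byte first. */
static void mock_put_be16(uint16_t value, uint8_t out[2])
{
	out[0] = value >> 8;
	out[1] = value & 0xff;
}
```

For a brightness of 0x1234 the first helper stores 0x34 then 0x12, while the second stores 0x12 then 0x34.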

> > +
> > +   regmap_bulk_write(lpg->map,
> > + lpg->lut_base + LPG_LUT_REG(idx + i), 
> > val, 2);
> > +   }
> > +
> > +   bitmap_set(lpg->lut_bitmap, idx, len);
> > +
> > +   *lo_idx = idx;
> > +   *hi_idx = idx + len - 1;
> > +
> > +   return 0;
> > +}
> 
> ...
> 
> > +static void lpg_calc_freq(struct lpg_channel *chan, unsigned int period_us)
> > +{
> > +   int n, m, clk, div;
> > +   int best_m, best_div, best_clk;
> > +   unsigned int last_err, cur_err, min_err;
> > +   unsigned int tmp_p, period_n;
> > +
> > +   if (period_us == chan->period_us)
> > +   return;
> > +
> > +   /* PWM Period / N */
> > +   if (period_us < ((unsigned int)(-1) / NSEC_PER_USEC)) {
> 
> Please, replace all these -1 with castings to unsigned types with
> corresponding limits, like
> UINT_MAX here.
> 

Sure thing.

> > +   period_n = (period_us * NSEC_PER_USEC) >> 6;
> > +   n = 6;
> > +   } else {
> > +   period_n = (period_us >> 9) * NSEC_PER_USEC;
> > +   n = 9;
> > +   }
> 
> Why inconsistency in branches? Can you rather derive n and calculate
> only once like
> 
>period_n = (period_us >> n) * NSEC_PER_USEC;
> 
> ?

I inherited this piece from the downstream driver and I assume that the
purpose was to avoid loss of precision. I will review this and if
nothing else it seems like I would be able to cast period_us to more
bits, do the multiply and then shift - in both cases.
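
A rough userspace sketch of that idea, next to the shift-first variant from the quoted else branch, is shown below. The function names and MOCK_NSEC_PER_USEC are made up for illustration, and the wide version returns a 64-bit value because the result may not fit in 32 bits for large periods.

```c
#include <stdint.h>

#define MOCK_NSEC_PER_USEC 1000U

/* Widen first, multiply, then shift: no overflow and no early truncation. */
static uint64_t mock_period_n_wide(uint32_t period_us, unsigned int n)
{
	return ((uint64_t)period_us * MOCK_NSEC_PER_USEC) >> n;
}

/* Shift before multiplying, as in the quoted else branch: loses low bits. */
static uint32_t mock_period_n_shift_first(uint32_t period_us, unsigned int n)
{
	return (period_us >> n) * MOCK_NSEC_PER_USEC;
}
```

For period_us = 5000000 and n = 9, the widened version yields 9765625, while shifting first yields 9765000, which shows the precision that the 32-bit variant gives up.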

> 
> > +   min_err = (unsigned int)(-1);
> > +   last_err = (unsigned int)(-1);
> > +   best_m = 0;
> > +   best_clk = 0;
> > +   best_div = 0;
> > +   for (clk = 0; clk < NUM_PWM_CLK; clk++) {
> > +   for (div = 0; div < NUM_PWM_PREDIV; div++) {
> > +   /* period_n = (PWM Period / N) */
> > +   /* tmp_p = (Pre-divide * Clock Period) * 2^m */
> > +   tmp_p = lpg_clk_table[div][clk];
> > +   for (m = 0; m <= NUM_EXP; m++) {
> > +   if (period_n > tmp_p)
> > +   cur_err = period_n - tmp_p;
> > +   else
> > +   cur_err = tmp_p - period_n;
> > +
> > +   if (cur_err < min_err) {
> > +   min_err = cur_err;
> > +   best_m = m;
> > +   best_clk = clk;
> > +   best_div = div;
> > +   }
> > +
> > +   if (m && cur_err > last_err)
> > +   /* Break for bigger cur_err */
> > +   break;
> > +
> > +   last_err = cur_err;
> > +

Re: [PATCH v4 1/2] dt-bindings: spmi: document binding for the Mediatek SPMI controller

2020-10-19 Thread Hsin-Hsiung Wang

Hi,

On Mon, 2020-10-19 at 14:54 -0500, Rob Herring wrote:
> On Sat, Oct 17, 2020 at 12:10:33AM +0800, Hsin-Hsiung Wang wrote:
> > This adds documentation for the SPMI controller found on Mediatek SoCs.
> > 
> > Signed-off-by: Hsin-Hsiung Wang 
> > ---
> 
> If you have a dependency such as the include, please note it here.
> 
Sorry, I didn't mention the prerequisite patchset [1] in the commit message.
I will add it in the next version.

[1]
https://patchwork.kernel.org/project/linux-mediatek/list/?series=342593

> >  .../bindings/spmi/mtk,spmi-mtk-pmif.yaml  | 70 +++
> >  1 file changed, 70 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/spmi/mtk,spmi-mtk-pmif.yaml
> > 
> > diff --git a/Documentation/devicetree/bindings/spmi/mtk,spmi-mtk-pmif.yaml 
> > b/Documentation/devicetree/bindings/spmi/mtk,spmi-mtk-pmif.yaml
> > new file mode 100644
> > index ..9945200a35b3
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/spmi/mtk,spmi-mtk-pmif.yaml
> > @@ -0,0 +1,70 @@
> > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/spmi/mtk,spmi-mtk-pmif.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: Mediatek SPMI Controller Device Tree Bindings
> > +
> > +maintainers:
> > +  - Hsin-Hsiung Wang 
> > +
> > +description: |+
> > +  On MediaTek SoCs the PMIC is connected via SPMI and the controller allows
> > +  for multiple SoCs to control a single SPMI master.
> 
> Need a $ref to spmi.yaml.
> 
Thanks, I will update it in the next patch.
> > +
> > +properties:
> > +  compatible:
> > +const: mediatek,mt6873-spmi
> > +
> > +  reg:
> > +minItems: 2
> > +maxItems: 2
> > +
> > +  reg-names:
> > +items:
> > +  - const: "pmif"
> > +  - const: "spmimst"
> 
> Don't need quotes
> 
Thanks, I will update it in the next patch.
> > +
> > +  clocks:
> > +minItems: 3
> > +maxItems: 3
> > +
> > +  clock-names:
> > +items:
> > +  - const: "pmif_sys_ck"
> > +  - const: "pmif_tmr_ck"
> > +  - const: "spmimst_clk_mux"
> > +
> > +  assigned-clocks:
> > +maxItems: 1
> > +
> > +  assigned-clock-parents:
> > +maxItems: 1
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - reg-names
> > +  - clocks
> > +  - clock-names
> 
> unevaluatedProperties: false
> 
Thanks, I will update it in the next patch.
> > +
> > +examples:
> > +  - |
> > +#include 
> > +
> > +spmi: spmi@10027000 {
> > +compatible = "mediatek,mt6873-spmi";
> > +reg = <0 0x10027000 0 0x000e00>,
> > +  <0 0x10029000 0 0x000100>;
> > +reg-names = "pmif", "spmimst";
> > +clocks = <&infracfg CLK_INFRA_PMIC_AP>,
> > + <&infracfg CLK_INFRA_PMIC_TMR>,
> > + <&topckgen CLK_TOP_SPMI_MST_SEL>;
> > +clock-names = "pmif_sys_ck",
> > +  "pmif_tmr_ck",
> > +  "spmimst_clk_mux";
> > +assigned-clocks = <&topckgen CLK_TOP_PWRAP_ULPOSC_SEL>;
> > +assigned-clock-parents = <&topckgen CLK_TOP_OSC_D10>;
> > +};
> > +...
> > -- 
> > 2.18.0

Re: [PATCH 4/4] ftgmac100: Restart MAC HW once

2020-10-19 Thread Joel Stanley

On Mon, 19 Oct 2020 at 08:57, Dylan Hung  wrote:
>
> The interrupt handler may set the flag to reset the MAC in the future,
> but that flag is not cleared once the reset has occurred.
>
> Fixes: 10cbd6407609 ("ftgmac100: Rework NAPI & interrupts handling")
> Signed-off-by: Dylan Hung 
> Signed-off-by: Joel Stanley 

Reviewed-by: Joel Stanley 

> ---
>  drivers/net/ethernet/faraday/ftgmac100.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/faraday/ftgmac100.c 
> b/drivers/net/ethernet/faraday/ftgmac100.c
> index 0c67fc3e27df..57736b049de3 100644
> --- a/drivers/net/ethernet/faraday/ftgmac100.c
> +++ b/drivers/net/ethernet/faraday/ftgmac100.c
> @@ -1326,6 +1326,7 @@ static int ftgmac100_poll(struct napi_struct *napi, int 
> budget)
>  */
> if (unlikely(priv->need_mac_restart)) {
> ftgmac100_start_hw(priv);
> +   priv->need_mac_restart = false;
>
> /* Re-enable "bad" interrupts */
> ftgmac100_write(FTGMAC100_INT_BAD, priv->base + 
> FTGMAC100_OFFSET_IER);
> --
> 2.17.1
>

[RFC, net-next 1/3] net: dsa: ethtool preempt ops support on slave ports

2020-10-19 Thread Xiaoliang Yang

set_preempt and get_preempt are new ethtool ops that configure frame
preemption according to IEEE 802.1Qbu and 802.3br. Add them to the slave
ports of the DSA framework so that DSA devices can configure frame
preemption through ethtool.

Signed-off-by: Xiaoliang Yang 
---
 include/net/dsa.h | 12 
 net/dsa/slave.c   | 26 ++
 2 files changed, 38 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 35429a140dfa..85b196ade511 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -499,6 +499,18 @@ struct dsa_switch_ops {
int (*get_ts_info)(struct dsa_switch *ds, int port,
   struct ethtool_ts_info *ts);
 
+   /*
+* ethtool --set-frame-preemption
+*/
+   int (*set_preempt)(struct dsa_switch *ds, int port,
+  struct ethtool_fp *fpcmd);
+
+   /*
+* ethtool --show-frame-preemption
+*/
+   int (*get_preempt)(struct dsa_switch *ds, int port,
+  struct ethtool_fp *fpcmd);
+
/*
 * Suspend and resume
 */
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index e7c1d62fde99..f51a1575266c 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1281,6 +1281,30 @@ static int dsa_slave_get_ts_info(struct net_device *dev,
return ds->ops->get_ts_info(ds, p->dp->index, ts);
 }
 
+static int dsa_slave_set_preempt(struct net_device *dev,
+struct ethtool_fp *fpcmd)
+{
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   struct dsa_switch *ds = p->dp->ds;
+
+   if (!ds->ops->set_preempt)
+   return -EOPNOTSUPP;
+
+   return ds->ops->set_preempt(ds, p->dp->index, fpcmd);
+}
+
+static int dsa_slave_get_preempt(struct net_device *dev,
+struct ethtool_fp *fpcmd)
+{
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   struct dsa_switch *ds = p->dp->ds;
+
+   if (!ds->ops->get_preempt)
+   return -EOPNOTSUPP;
+
+   return ds->ops->get_preempt(ds, p->dp->index, fpcmd);
+}
+
 static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
 u16 vid)
 {
@@ -1571,6 +1595,8 @@ static const struct ethtool_ops dsa_slave_ethtool_ops = {
.get_rxnfc  = dsa_slave_get_rxnfc,
.set_rxnfc  = dsa_slave_set_rxnfc,
.get_ts_info= dsa_slave_get_ts_info,
+   .set_preempt= dsa_slave_set_preempt,
+   .get_preempt= dsa_slave_get_preempt,
 };
 
 /* legacy way, bypassing the bridge */
-- 
2.18.4
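
The slave-port wrappers in this patch follow the usual optional-callback
pattern: forward to the switch driver if it provides the op, otherwise return
-EOPNOTSUPP. A minimal user-space sketch of that pattern, under the assumption
that the reader only wants the control flow; `fp_cfg`, `sw_ops`, `sw`,
`port_set_preempt()` and `demo_set_preempt()` are hypothetical names, not the
DSA structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct ethtool_fp. */
struct fp_cfg {
	int enabled;
};

/* Hypothetical, simplified stand-in for struct dsa_switch_ops. */
struct sw_ops {
	/* Optional: a switch driver may leave this NULL. */
	int (*set_preempt)(int port, struct fp_cfg *cfg);
};

struct sw {
	const struct sw_ops *ops;
};

/* Forward to the driver op if present, else report "not supported". */
static int port_set_preempt(const struct sw *sw, int port, struct fp_cfg *cfg)
{
	assert(sw && sw->ops);

	if (!sw->ops->set_preempt)
		return -EOPNOTSUPP;

	return sw->ops->set_preempt(port, cfg);
}

/* Example driver callback used to exercise the wrapper. */
static int demo_set_preempt(int port, struct fp_cfg *cfg)
{
	(void)port;
	cfg->enabled = 1;
	return 0;
}
```

A driver that leaves the pointer NULL is reported as unsupported and the
configuration struct is left untouched.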

[RFC, net-next 3/3] net: dsa: felix: tc-taprio preempt set support

2020-10-19 Thread Xiaoliang Yang

After using ethtool to enable and configure frame preemption on
vsc9959, use tc-taprio preempt set to mark the preempt queues and
express queues.

Signed-off-by: Xiaoliang Yang 
---
 drivers/net/dsa/ocelot/felix_vsc9959.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
index c0e41d499639..f2b9a5ee1ff5 100644
--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -1310,6 +1310,20 @@ static int vsc9959_qos_port_cbs_set(struct dsa_switch *ds, int port,
return 0;
 }
 
+static int vsc9959_port_preempt_queues(struct ocelot *ocelot, int port,
+  struct tc_preempt_qopt_offload *qopt)
+{
+   u8 p_queues = qopt->preemptible_queues;
+
+   ocelot_rmw_rix(ocelot,
+  QSYS_PREEMPTION_CFG_P_QUEUES(p_queues),
+  QSYS_PREEMPTION_CFG_P_QUEUES_M,
+  QSYS_PREEMPTION_CFG,
+  port);
+
+   return 0;
+}
+
 static int vsc9959_port_setup_tc(struct dsa_switch *ds, int port,
 enum tc_setup_type type,
 void *type_data)
@@ -1321,6 +1335,8 @@ static int vsc9959_port_setup_tc(struct dsa_switch *ds, int port,
return vsc9959_qos_port_tas_set(ocelot, port, type_data);
case TC_SETUP_QDISC_CBS:
return vsc9959_qos_port_cbs_set(ds, port, type_data);
+   case TC_SETUP_PREEMPT:
+   return vsc9959_port_preempt_queues(ocelot, port, type_data);
default:
return -EOPNOTSUPP;
}
-- 
2.18.4

Re: [PATCH 1/4] ftgmac100: Fix race issue on TX descriptor[0]

2020-10-19 Thread Joel Stanley

On Mon, 19 Oct 2020 at 23:20, Benjamin Herrenschmidt
 wrote:
>
> On Mon, 2020-10-19 at 16:57 +0800, Dylan Hung wrote:
> > These rules must be followed when accessing the TX descriptor:
> >
> > 1. A TX descriptor is "cleanable" only when its value is non-zero
> > and the owner bit is set to "software"
>
> Can you elaborate ? What is the point of that change ? The owner bit
> should be sufficient, why do we need to check other fields ?

I would like Dylan to clarify too. The datasheet has a footnote below
the descriptor layout:

 - TXDES#0: Bits 27 ~ 14 are valid only when FTS = 1
 - TXDES#1: Bits 31 ~ 0 are valid only when FTS = 1

So the ownership bit (31) is not valid unless FTS is set. However,
this isn't what his patch does. It adds checks for EDOTR.

>
> > 2. A TX descriptor is "writable" only when its value is zero
> > regardless of the edotr mask.
>
> Again, why is that ? Can you elaborate ? What race are you trying to
> address here ?
>
> Cheers,
> Ben.
>
> > Fixes: 52c0cae87465 ("ftgmac100: Remove tx descriptor accessors")
> > Signed-off-by: Dylan Hung 
> > Signed-off-by: Joel Stanley 
> > ---
> >  drivers/net/ethernet/faraday/ftgmac100.c | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
> > index 00024dd41147..7cacbe4aecb7 100644
> > --- a/drivers/net/ethernet/faraday/ftgmac100.c
> > +++ b/drivers/net/ethernet/faraday/ftgmac100.c
> > @@ -647,6 +647,9 @@ static bool ftgmac100_tx_complete_packet(struct ftgmac100 *priv)
> >   if (ctl_stat & FTGMAC100_TXDES0_TXDMA_OWN)
> >   return false;
> >
> > + if ((ctl_stat & ~(priv->txdes0_edotr_mask)) == 0)
> > + return false;
> > +
> >   skb = priv->tx_skbs[pointer];
> >   netdev->stats.tx_packets++;
> >   netdev->stats.tx_bytes += skb->len;
> > @@ -756,6 +759,9 @@ static netdev_tx_t ftgmac100_hard_start_xmit(struct sk_buff *skb,
> >   pointer = priv->tx_pointer;
> >   txdes = first = &priv->txdes[pointer];
> >
> > + if (le32_to_cpu(txdes->txdes0) & ~priv->txdes0_edotr_mask)
> > + goto drop;
> > +
> >   /* Setup it up with the packet head. Don't write the head to the
> >* ring just yet
> >*/
> > @@ -787,6 +793,10 @@ static netdev_tx_t ftgmac100_hard_start_xmit(struct sk_buff *skb,
> >   /* Setup descriptor */
> >   priv->tx_skbs[pointer] = skb;
> >   txdes = &priv->txdes[pointer];
> > +
> > + if (le32_to_cpu(txdes->txdes0) & ~priv->txdes0_edotr_mask)
> > + goto dma_err;
> > +
> >   ctl_stat = ftgmac100_base_tx_ctlstat(priv, pointer);
> >   ctl_stat |= FTGMAC100_TXDES0_TXDMA_OWN;
> >   ctl_stat |= FTGMAC100_TXDES0_TXBUF_SIZE(len);
>
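
For reference while following this thread, the two checks the quoted patch
adds can be written as stand-alone predicates. This is only a sketch of what
the hunks test, not an answer to the questions raised above about why they
are needed. The `DEMO_` bit positions and function names are illustrative
assumptions; the driver's own definitions are authoritative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; the driver defines the real values. */
#define DEMO_TXDES0_TXDMA_OWN	(UINT32_C(1) << 31)
#define DEMO_TXDES0_EDOTR	(UINT32_C(1) << 30)

/* "Cleanable": owned by software and something besides EDOTR is set. */
static bool demo_txdes_cleanable(uint32_t txdes0, uint32_t edotr_mask)
{
	/* The end-of-ring mask must not cover the owner bit. */
	assert((edotr_mask & DEMO_TXDES0_TXDMA_OWN) == 0);

	if (txdes0 & DEMO_TXDES0_TXDMA_OWN)
		return false;

	return (txdes0 & ~edotr_mask) != 0;
}

/* "Writable": nothing but the EDOTR bit may be set. */
static bool demo_txdes_writable(uint32_t txdes0, uint32_t edotr_mask)
{
	return (txdes0 & ~edotr_mask) == 0;
}
```

Under these definitions a descriptor holding only the end-of-ring bit counts
as writable and not cleanable, which is the case the patch treats specially.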

[RFC, net-next 2/3] net: dsa: felix: add preempt queues set support for vsc9959

2020-10-19 Thread Xiaoliang Yang

VSC9959 supports preemptible queues according to 802.1qbu and 802.3br. This
patch adds the ethtool preempt set operation to configure preemption.

In user space, it can be set like this:
ethtool --set-frame-preemption swp0 enable min-frag-size 0

Signed-off-by: Xiaoliang Yang 
---
 drivers/net/dsa/ocelot/felix.c | 26 ++
 drivers/net/dsa/ocelot/felix.h |  4 +++
 drivers/net/dsa/ocelot/felix_vsc9959.c | 49 ++
 include/soc/mscc/ocelot.h  | 11 ++
 include/soc/mscc/ocelot_dev.h  | 23 
 5 files changed, 113 insertions(+)

diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
index f791860d495f..e08effbeb6bf 100644
--- a/drivers/net/dsa/ocelot/felix.c
+++ b/drivers/net/dsa/ocelot/felix.c
@@ -350,6 +350,30 @@ static int felix_get_ts_info(struct dsa_switch *ds, int port,
return ocelot_get_ts_info(ocelot, port, info);
 }
 
+static int felix_set_preempt(struct dsa_switch *ds, int port,
+struct ethtool_fp *fpcmd)
+{
+   struct ocelot *ocelot = ds->priv;
+   struct felix *felix = ocelot_to_felix(ocelot);
+
+   if (felix->info->port_set_preempt)
+   return felix->info->port_set_preempt(ocelot, port, fpcmd);
+
+   return -EOPNOTSUPP;
+}
+
+static int felix_get_preempt(struct dsa_switch *ds, int port,
+struct ethtool_fp *fpcmd)
+{
+   struct ocelot *ocelot = ds->priv;
+   struct felix *felix = ocelot_to_felix(ocelot);
+
+   if (felix->info->port_get_preempt)
+   return felix->info->port_get_preempt(ocelot, port, fpcmd);
+
+   return -EOPNOTSUPP;
+}
+
 static int felix_parse_ports_node(struct felix *felix,
  struct device_node *ports_node,
  phy_interface_t *port_phy_modes)
@@ -777,6 +801,8 @@ const struct dsa_switch_ops felix_switch_ops = {
.get_ethtool_stats  = felix_get_ethtool_stats,
.get_sset_count = felix_get_sset_count,
.get_ts_info= felix_get_ts_info,
+   .set_preempt= felix_set_preempt,
+   .get_preempt= felix_get_preempt,
.phylink_validate   = felix_phylink_validate,
.phylink_mac_config = felix_phylink_mac_config,
.phylink_mac_link_down  = felix_phylink_mac_link_down,
diff --git a/drivers/net/dsa/ocelot/felix.h b/drivers/net/dsa/ocelot/felix.h
index 4c717324ac2f..e0c93d4a351d 100644
--- a/drivers/net/dsa/ocelot/felix.h
+++ b/drivers/net/dsa/ocelot/felix.h
@@ -37,6 +37,10 @@ struct felix_info {
void(*port_sched_speed_set)(struct ocelot *ocelot, int port,
u32 speed);
void(*xmit_template_populate)(struct ocelot *ocelot, int port);
+   int (*port_set_preempt)(struct ocelot *ocelot, int port,
+   struct ethtool_fp *fpcmd);
+   int (*port_get_preempt)(struct ocelot *ocelot, int port,
+   struct ethtool_fp *fpcmd);
 };
 
 extern const struct dsa_switch_ops felix_switch_ops;
diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
index 3e925b8d5306..c0e41d499639 100644
--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -340,6 +341,10 @@ static const u32 vsc9959_dev_gmii_regmap[] = {
REG(DEV_MAC_FC_MAC_LOW_CFG, 0x3c),
REG(DEV_MAC_FC_MAC_HIGH_CFG,0x40),
REG(DEV_MAC_STICKY, 0x44),
+   REG(DEV_MM_ENABLE_CONFIG,   0x48),
+   REG(DEV_MM_VERIF_CONFIG,0x4c),
+   REG(DEV_MM_STATUS,  0x50),
+
REG_RESERVED(PCS1G_CFG),
REG_RESERVED(PCS1G_MODE_CFG),
REG_RESERVED(PCS1G_SD_CFG),
@@ -1321,6 +1326,48 @@ static int vsc9959_port_setup_tc(struct dsa_switch *ds, int port,
}
 }
 
+static int vsc9959_port_set_preempt(struct ocelot *ocelot, int port,
+   struct ethtool_fp *fpcmd)
+{
+   struct ocelot_port *ocelot_port = ocelot->ports[port];
+   int mm_fragsize = fpcmd->min_frag_size_mult;
+
+   if (mm_fragsize > 3)
+   return -EINVAL;
+
+   ocelot_port_rmwl(ocelot_port,
+(fpcmd->enabled ?
+ (DEV_MM_CONFIG_ENABLE_CONFIG_MM_RX_ENA |
+  DEV_MM_CONFIG_ENABLE_CONFIG_MM_TX_ENA) : 0),
+DEV_MM_CONFIG_ENABLE_CONFIG_MM_RX_ENA |
+DEV_MM_CONFIG_ENABLE_CONFIG_MM_TX_ENA,
+DEV_MM_ENABLE_CONFIG);
+
+   ocelot_rmw_rix(ocelot,
+  QSYS_PREEMPTION_CFG_MM_ADD_FRAG_SIZE(mm_fragsize),
+  QSYS_PREEMPTION_CFG_MM_ADD_FRAG_SIZE_M,
+  QSYS_PREEMPTION_C

[RFC, net-next 0/3] net: dsa: felix: frame preemption support

2020-10-19 Thread Xiaoliang Yang

VSC9959 supports frame preemption according to 802.1qbu and 802.3br.
This patch series uses ethtool to enable and configure frame preemption,
then uses tc-taprio preempt set to mark the preemptible queues and express
queues.

This series depends on the series "ethtool: Add support for frame preemption":
http://patchwork.ozlabs.org/project/netdev/patch/20201012235642.1384318-2-vinicius.go...@intel.com/

Xiaoliang Yang (3):
  net: dsa: ethtool preempt ops support on slave ports
  net: dsa: felix: add preempt queues set support for vsc9959
  net: dsa: felix: tc-taprio preempt set support

 drivers/net/dsa/ocelot/felix.c | 26 +++
 drivers/net/dsa/ocelot/felix.h |  4 ++
 drivers/net/dsa/ocelot/felix_vsc9959.c | 65 ++
 include/net/dsa.h  | 12 +
 include/soc/mscc/ocelot.h  | 11 +
 include/soc/mscc/ocelot_dev.h  | 23 +
 net/dsa/slave.c| 26 +++
 7 files changed, 167 insertions(+)

-- 
2.18.4

Re: [PATCH v4 1/2] dt-bindings: spmi: document binding for the Mediatek SPMI controller

2020-10-19 Thread Hsin-Hsiung Wang

Hi,

On Mon, 2020-10-19 at 14:52 -0500, Rob Herring wrote:
> On Sat, 17 Oct 2020 00:10:33 +0800, Hsin-Hsiung Wang wrote:
> > This adds documentation for the SPMI controller found on Mediatek SoCs.
> > 
> > Signed-off-by: Hsin-Hsiung Wang 
> > ---
> >  .../bindings/spmi/mtk,spmi-mtk-pmif.yaml  | 70 +++
> >  1 file changed, 70 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/spmi/mtk,spmi-mtk-pmif.yaml
> > 
> 
> 
> My bot found errors running 'make dt_binding_check' on your patch:
> 
> Documentation/devicetree/bindings/spmi/mtk,spmi-mtk-pmif.example.dts:19:18: fatal error: dt-bindings/clock/mt8192-clk.h: No such file or directory
>19 | #include 
>   |  ^~~~
> compilation terminated.
> make[1]: *** [scripts/Makefile.lib:342: Documentation/devicetree/bindings/spmi/mtk,spmi-mtk-pmif.example.dt.yaml] Error 1
> make[1]: *** Waiting for unfinished jobs
> make: *** [Makefile:1366: dt_binding_check] Error 2
> 
> 
> See https://patchwork.ozlabs.org/patch/1383441
> 
> If you already ran 'make dt_binding_check' and didn't see the above
> error(s), then make sure dt-schema is up to date:
> 
> pip3 install git+https://github.com/devicetree-org/dt-schema.git@master --upgrade
> 
> Please check and re-submit.
> 

Sorry, I did not mention the prerequisite series [1] in the commit message.
I will add it in the next version. Thanks for the review.

[1]
https://patchwork.kernel.org/project/linux-mediatek/list/?series=342593

[PATCH] x86, libnvdimm/test: Remove COPY_MC_TEST

2020-10-19 Thread Dan Williams

The COPY_MC_TEST facility has served its purpose for validating the
early termination conditions of the copy_mc_fragile() implementation.
Remove it and the EXPORT_SYMBOL_GPL of copy_mc_fragile().

Reported-by: Borislav Petkov 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
Cc: Vishal Verma 
Cc: Dave Jiang 
Cc: Ira Weiny 
Signed-off-by: Dan Williams 
---
 arch/x86/Kconfig.debug  |3 -
 arch/x86/include/asm/copy_mc_test.h |   75 -
 arch/x86/lib/copy_mc.c  |4 -
 arch/x86/lib/copy_mc_64.S   |   10 ---
 tools/testing/nvdimm/test/nfit.c|  103 ---
 5 files changed, 195 deletions(-)
 delete mode 100644 arch/x86/include/asm/copy_mc_test.h

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 27b5e2bc6a01..80b57e7f4947 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -62,9 +62,6 @@ config EARLY_PRINTK_USB_XDBC
  You should normally say N here, unless you want to debug early
  crashes or need a very simple printk logging facility.
 
-config COPY_MC_TEST
-   def_bool n
-
 config EFI_PGT_DUMP
bool "Dump the EFI pagetable"
depends on EFI
diff --git a/arch/x86/include/asm/copy_mc_test.h b/arch/x86/include/asm/copy_mc_test.h
deleted file mode 100644
index e4991ba96726..
--- a/arch/x86/include/asm/copy_mc_test.h
+++ /dev/null
@@ -1,75 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _COPY_MC_TEST_H_
-#define _COPY_MC_TEST_H_
-
-#ifndef __ASSEMBLY__
-#ifdef CONFIG_COPY_MC_TEST
-extern unsigned long copy_mc_test_src;
-extern unsigned long copy_mc_test_dst;
-
-static inline void copy_mc_inject_src(void *addr)
-{
-   if (addr)
-   copy_mc_test_src = (unsigned long) addr;
-   else
-   copy_mc_test_src = ~0UL;
-}
-
-static inline void copy_mc_inject_dst(void *addr)
-{
-   if (addr)
-   copy_mc_test_dst = (unsigned long) addr;
-   else
-   copy_mc_test_dst = ~0UL;
-}
-#else /* CONFIG_COPY_MC_TEST */
-static inline void copy_mc_inject_src(void *addr)
-{
-}
-
-static inline void copy_mc_inject_dst(void *addr)
-{
-}
-#endif /* CONFIG_COPY_MC_TEST */
-
-#else /* __ASSEMBLY__ */
-#include 
-
-#ifdef CONFIG_COPY_MC_TEST
-.macro COPY_MC_TEST_CTL
-   .pushsection .data
-   .align 8
-   .globl copy_mc_test_src
-   copy_mc_test_src:
-   .quad 0
-   EXPORT_SYMBOL_GPL(copy_mc_test_src)
-   .globl copy_mc_test_dst
-   copy_mc_test_dst:
-   .quad 0
-   EXPORT_SYMBOL_GPL(copy_mc_test_dst)
-   .popsection
-.endm
-
-.macro COPY_MC_TEST_SRC reg count target
-   leaq \count(\reg), %r9
-   cmp copy_mc_test_src, %r9
-   ja \target
-.endm
-
-.macro COPY_MC_TEST_DST reg count target
-   leaq \count(\reg), %r9
-   cmp copy_mc_test_dst, %r9
-   ja \target
-.endm
-#else
-.macro COPY_MC_TEST_CTL
-.endm
-
-.macro COPY_MC_TEST_SRC reg count target
-.endm
-
-.macro COPY_MC_TEST_DST reg count target
-.endm
-#endif /* CONFIG_COPY_MC_TEST */
-#endif /* __ASSEMBLY__ */
-#endif /* _COPY_MC_TEST_H_ */
diff --git a/arch/x86/lib/copy_mc.c b/arch/x86/lib/copy_mc.c
index c13e8c9ee926..80efd45a7761 100644
--- a/arch/x86/lib/copy_mc.c
+++ b/arch/x86/lib/copy_mc.c
@@ -10,10 +10,6 @@
 #include 
 
 #ifdef CONFIG_X86_MCE
-/*
- * See COPY_MC_TEST for self-test of the copy_mc_fragile()
- * implementation.
- */
 static DEFINE_STATIC_KEY_FALSE(copy_mc_fragile_key);
 
 void enable_copy_mc_fragile(void)
diff --git a/arch/x86/lib/copy_mc_64.S b/arch/x86/lib/copy_mc_64.S
index 892d8915f609..e5f77e293034 100644
--- a/arch/x86/lib/copy_mc_64.S
+++ b/arch/x86/lib/copy_mc_64.S
@@ -2,14 +2,11 @@
 /* Copyright(c) 2016-2020 Intel Corporation. All rights reserved. */
 
 #include 
-#include 
-#include 
 #include 
 
 #ifndef CONFIG_UML
 
 #ifdef CONFIG_X86_MCE
-COPY_MC_TEST_CTL
 
 /*
  * copy_mc_fragile - copy memory with indication if an exception / fault happened
@@ -38,8 +35,6 @@ SYM_FUNC_START(copy_mc_fragile)
subl %ecx, %edx
 .L_read_leading_bytes:
movb (%rsi), %al
-   COPY_MC_TEST_SRC %rsi 1 .E_leading_bytes
-   COPY_MC_TEST_DST %rdi 1 .E_leading_bytes
 .L_write_leading_bytes:
movb %al, (%rdi)
incq %rsi
@@ -55,8 +50,6 @@ SYM_FUNC_START(copy_mc_fragile)
 
 .L_read_words:
movq (%rsi), %r8
-   COPY_MC_TEST_SRC %rsi 8 .E_read_words
-   COPY_MC_TEST_DST %rdi 8 .E_write_words
 .L_write_words:
movq %r8, (%rdi)
addq $8, %rsi
@@ -73,8 +66,6 @@ SYM_FUNC_START(copy_mc_fragile)
movl %edx, %ecx
 .L_read_trailing_bytes:
movb (%rsi), %al
-   COPY_MC_TEST_SRC %rsi 1 .E_trailing_bytes
-   COPY_MC_TEST_DST %rdi 1 .E_trailing_bytes
 .L_write_trailing_bytes:
movb %al, (%rdi)
incq %rsi
@@ -88,7 +79,6 @@ SYM_FUNC_START(copy_mc_fragile)
 .L_done:
ret
 SYM_FUNC_END(copy_mc_fragile)
-EXPORT_SYMBOL_GPL(copy_mc_fragile)
 
.

[PATCH v1] i2c: tegra: Fix i2c_writesl() to use writel() instead of writesl()

2020-10-19 Thread Sowjanya Komatineni

VI I2C doesn't have DMA support and uses PIO mode all the time.

The current driver uses writesl() to fill the TX FIFO based on the number
of available empty slots. With this, a strange silent hang is seen on any
I2C register access after filling the TX FIFO with 8 words.

Using writel() followed by i2c_readl() in a loop to write all words
to the TX FIFO, instead of using writesl(), helps for large transfers in
PIO mode.

So, this patch updates the i2c_writesl() API to use writel() in a loop
instead of writesl().

Signed-off-by: Sowjanya Komatineni 
---
 drivers/i2c/busses/i2c-tegra.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 6f08c0c..274bf3a 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -333,10 +333,13 @@ static u32 i2c_readl(struct tegra_i2c_dev *i2c_dev, unsigned int reg)
return readl_relaxed(i2c_dev->base + tegra_i2c_reg_addr(i2c_dev, reg));
 }
 
-static void i2c_writesl(struct tegra_i2c_dev *i2c_dev, void *data,
+static void i2c_writesl(struct tegra_i2c_dev *i2c_dev, u32 *data,
unsigned int reg, unsigned int len)
 {
-   writesl(i2c_dev->base + tegra_i2c_reg_addr(i2c_dev, reg), data, len);
+   while (len--) {
+   writel(*data++, i2c_dev->base + tegra_i2c_reg_addr(i2c_dev, reg));
+   i2c_readl(i2c_dev, I2C_INT_STATUS);
+   }
 }
 
 static void i2c_readsl(struct tegra_i2c_dev *i2c_dev, void *data,
@@ -811,7 +814,7 @@ static int tegra_i2c_fill_tx_fifo(struct tegra_i2c_dev *i2c_dev)
i2c_dev->msg_buf_remaining = buf_remaining;
i2c_dev->msg_buf = buf + words_to_transfer * BYTES_PER_FIFO_WORD;
 
-   i2c_writesl(i2c_dev, buf, I2C_TX_FIFO, words_to_transfer);
+   i2c_writesl(i2c_dev, (u32 *)buf, I2C_TX_FIFO, words_to_transfer);
 
buf += words_to_transfer * BYTES_PER_FIFO_WORD;
}
-- 
2.7.4
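
A small user-space model of the access pattern this patch switches to: one
FIFO write per word, each followed by a register read-back. It is only a
sketch under stated assumptions; `demo_dev`, `demo_fifo_writel()`,
`demo_status_readl()` and `demo_fifo_write_words()` are hypothetical and stand
in for the MMIO accessors, and the model says nothing about why the hardware
hangs with writesl().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FIFO_DEPTH 8

/* Hypothetical model of a device with a TX FIFO and a status register. */
struct demo_dev {
	uint32_t fifo[DEMO_FIFO_DEPTH];
	size_t fifo_level;		/* words currently queued */
	unsigned int status_reads;	/* read-backs issued so far */
};

/* Models writel() to the TX FIFO register. */
static void demo_fifo_writel(struct demo_dev *dev, uint32_t word)
{
	assert(dev->fifo_level < DEMO_FIFO_DEPTH);
	dev->fifo[dev->fifo_level++] = word;
}

/* Models a register read-back such as i2c_readl(dev, I2C_INT_STATUS). */
static uint32_t demo_status_readl(struct demo_dev *dev)
{
	dev->status_reads++;
	return 0;
}

/* Push len words, reading a register back after every single write. */
static void demo_fifo_write_words(struct demo_dev *dev, const uint32_t *data,
				  size_t len)
{
	while (len--) {
		demo_fifo_writel(dev, *data++);
		(void)demo_status_readl(dev);
	}
}
```

The words land in the FIFO in order, and the number of read-backs equals the
number of words written.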

Re: [PATCH] spi: spi-sun6i: implement DMA-based transfer mode

2020-10-19 Thread Chen-Yu Tsai

On Tue, Oct 20, 2020 at 1:43 AM Alexander Kochetkov  wrote:
>
>
>
> > On 19 Oct 2020, at 11:21, Maxime Ripard wrote:
> >
> > Hi!
> >
> > On Thu, Oct 15, 2020 at 06:47:40PM +0300, Alexander Kochetkov wrote:
> >> DMA-based transfer will be enabled if data length is larger than FIFO size
> >> (64 bytes for A64). This greatly reduces the number of interrupts for
> >> transferring data.
> >>
> >> For smaller data sizes PIO mode will be used. In PIO mode the whole buffer
> >> will be loaded into the FIFO.
> >>
> >> If the driver fails to request DMA channels, it falls back to PIO mode.
> >>
> >> Tested on SOPINE (https://www.pine64.org/sopine/)
> >>
> >> Signed-off-by: Alexander Kochetkov 
> >
> > Thanks for working on this, it's been a bit overdue
>
> Hi, Maxime!
>
> We built a custom A64-based computation module for our product.
> Do you mean that the A64 is an obsolete or EOL product?
> If so, can you recommend an active replacement for the A64 from Allwinner at the same price?

I believe what Maxime meant was that DMA transfer for SPI is a long
sought-after feature, but no one had finished it.

ChenYu

[PATCH v3 3/3] net: dsa: mv88e6xxx: Support serdes ports on MV88E6123/6131

2020-10-19 Thread Chris Packham

Implement serdes_power, serdes_get_lane and serdes_pcs_get_state ops for
the MV88E6123 so that the ports without a built-in PHY can be supported as
serdes ports and directly connected to other network interfaces or to
SFPs. Also implement serdes_get_regs_len and serdes_get_regs to aid
future debugging.

Signed-off-by: Chris Packham 
---

This is untested (apart from compilation). It assumes the SERDES "phy"
address corresponds to the port number, but I'm not confident that is a
valid assumption.

Changes in v3:
- None
Changes in v2:
- new

 drivers/net/dsa/mv88e6xxx/chip.c   | 10 +++
 drivers/net/dsa/mv88e6xxx/serdes.c | 44 ++
 drivers/net/dsa/mv88e6xxx/serdes.h |  4 +++
 3 files changed, 58 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 62d4d7b5d9ac..5344fc84b03e 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3574,6 +3574,11 @@ static const struct mv88e6xxx_ops mv88e6123_ops = {
.set_egress_port = mv88e6095_g1_set_egress_port,
.watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6352_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6123_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
+   .serdes_get_regs_len = mv88e6123_serdes_get_regs_len,
+   .serdes_get_regs = mv88e6123_serdes_get_regs,
.pot_clear = mv88e6xxx_g2_pot_clear,
.reset = mv88e6352_g1_reset,
.atu_get_hash = mv88e6165_g1_atu_get_hash,
@@ -3613,6 +3618,11 @@ static const struct mv88e6xxx_ops mv88e6131_ops = {
.set_egress_port = mv88e6095_g1_set_egress_port,
.watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6185_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6123_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
+   .serdes_get_regs_len = mv88e6123_serdes_get_regs_len,
+   .serdes_get_regs = mv88e6123_serdes_get_regs,
.ppu_enable = mv88e6185_g1_ppu_enable,
.set_cascade_port = mv88e6185_g1_set_cascade_port,
.ppu_disable = mv88e6185_g1_ppu_disable,
diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
index d4f40a739b17..eb89debbf576 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.c
+++ b/drivers/net/dsa/mv88e6xxx/serdes.c
@@ -428,6 +428,50 @@ u8 mv88e6341_serdes_get_lane(struct mv88e6xxx_chip *chip, int port)
return lane;
 }
 
+int mv88e6123_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
+  bool up)
+{
+   u16 val, new_val;
+   int err;
+
+   err = mv88e6xxx_phy_read(chip, port, MII_BMCR, &val);
+   if (err)
+   return err;
+
+   if (up)
+   new_val = val & ~BMCR_PDOWN;
+   else
+   new_val = val | BMCR_PDOWN;
+
+   if (val != new_val)
+   err = mv88e6xxx_phy_write(chip, port, MII_BMCR, val);
+
+   return err;
+}
+
+int mv88e6123_serdes_get_regs_len(struct mv88e6xxx_chip *chip, int port)
+{
+   if (mv88e6xxx_serdes_get_lane(chip, port) == 0)
+   return 0;
+
+   return 26 * sizeof(u16);
+}
+
+void mv88e6123_serdes_get_regs(struct mv88e6xxx_chip *chip, int port, void *_p)
+{
+   u16 *p = _p;
+   u16 reg;
+   int i;
+
+   if (mv88e6xxx_serdes_get_lane(chip, port) == 0)
+   return;
+
+   for (i = 0; i < 26; i++) {
+   mv88e6xxx_phy_read(chip, port, i, &reg);
+   p[i] = reg;
+   }
+}
+
 int mv88e6185_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
   bool up)
 {
diff --git a/drivers/net/dsa/mv88e6xxx/serdes.h b/drivers/net/dsa/mv88e6xxx/serdes.h
index c24ec4122c9e..b573139928c4 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.h
+++ b/drivers/net/dsa/mv88e6xxx/serdes.h
@@ -104,6 +104,8 @@ unsigned int mv88e6352_serdes_irq_mapping(struct mv88e6xxx_chip *chip,
  int port);
 unsigned int mv88e6390_serdes_irq_mapping(struct mv88e6xxx_chip *chip,
  int port);
+int mv88e6123_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
+  bool up);
 int mv88e6185_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
   bool up);
 int mv88e6352_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
@@ -129,6 +131,8 @@ int mv88e6390_serdes_get_strings(struct mv88e6xxx_chip *chip,
 int mv88e6390_serdes_get_stats(struct mv88e6xxx_chip *chip, int port,
   uint64_t *data);
 
+int mv88e6123_serdes_get_regs_len(struct mv88e6xxx_chip *chip, int port);
+void mv88e6123_serdes_get_regs(struct mv88e6xxx_chip *chip, int port, void *_p);
 int mv88e6352_serdes_get_regs_len(struct mv88e6xxx_chip *chip, int port);
 void mv88e6352_ser
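
A stand-alone sketch of the read-modify-write that mv88e6123_serdes_power()
in this patch is aiming for, using a hypothetical one-register PHY model
(`demo_phy`, `demo_phy_read()`, `demo_phy_write()` and `demo_serdes_power()`
are illustration-only names). One difference is deliberate: the sketch writes
the updated value back, whereas the hunk as posted passes `val`, not
`new_val`, to mv88e6xxx_phy_write().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BMCR_PDOWN 0x0800	/* value of BMCR_PDOWN in <linux/mii.h> */

/* Hypothetical one-register PHY model. */
struct demo_phy {
	uint16_t bmcr;
	unsigned int writes;	/* number of register writes performed */
};

static int demo_phy_read(const struct demo_phy *phy, uint16_t *val)
{
	*val = phy->bmcr;
	return 0;
}

static int demo_phy_write(struct demo_phy *phy, uint16_t val)
{
	phy->bmcr = val;
	phy->writes++;
	return 0;
}

/* Read-modify-write of the power-down bit; skips the write if unchanged. */
static int demo_serdes_power(struct demo_phy *phy, bool up)
{
	uint16_t val, new_val;
	int err;

	assert(phy);

	err = demo_phy_read(phy, &val);
	if (err)
		return err;

	if (up)
		new_val = val & ~DEMO_BMCR_PDOWN;
	else
		new_val = val | DEMO_BMCR_PDOWN;

	if (val != new_val)
		err = demo_phy_write(phy, new_val);	/* write the updated value */

	return err;
}
```

Powering down twice in a row issues a single write in this model, because the
second call finds the bit already set.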

[PATCH v3 2/3] net: dsa: mv88e6xxx: Support serdes ports on MV88E6097/6095/6185

2020-10-19 Thread Chris Packham

Implement serdes_power, serdes_get_lane and serdes_pcs_get_state ops for
the MV88E6097/6095/6185 so that ports 8 & 9 can be supported as serdes
ports and directly connected to other network interfaces or to SFPs
without a PHY.

Signed-off-by: Chris Packham 
Reviewed-by: Andrew Lunn 
---
Changes in v3:
- Add comment to mv88e6185_serdes_get_lane
- Add review from Andrew
Changes in v2:
- expand support to cover 6095 and 6185
- move serdes related code to serdes.c

 drivers/net/dsa/mv88e6xxx/chip.c   |  9 +
 drivers/net/dsa/mv88e6xxx/serdes.c | 62 ++
 drivers/net/dsa/mv88e6xxx/serdes.h |  5 +++
 3 files changed, 76 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 1ef392ee52c5..62d4d7b5d9ac 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3496,6 +3496,9 @@ static const struct mv88e6xxx_ops mv88e6095_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.mgmt_rsvd2cpu = mv88e6185_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6185_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
.ppu_enable = mv88e6185_g1_ppu_enable,
.ppu_disable = mv88e6185_g1_ppu_disable,
.reset = mv88e6185_g1_reset,
@@ -3534,6 +3537,9 @@ static const struct mv88e6xxx_ops mv88e6097_ops = {
.set_egress_port = mv88e6095_g1_set_egress_port,
.watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6352_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6185_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
.pot_clear = mv88e6xxx_g2_pot_clear,
.reset = mv88e6352_g1_reset,
.rmu_disable = mv88e6085_g1_rmu_disable,
@@ -3958,6 +3964,9 @@ static const struct mv88e6xxx_ops mv88e6185_ops = {
.set_egress_port = mv88e6095_g1_set_egress_port,
.watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6185_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6185_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
.set_cascade_port = mv88e6185_g1_set_cascade_port,
.ppu_enable = mv88e6185_g1_ppu_enable,
.ppu_disable = mv88e6185_g1_ppu_disable,
diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
index 9c07b4f3d345..d4f40a739b17 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.c
+++ b/drivers/net/dsa/mv88e6xxx/serdes.c
@@ -428,6 +428,68 @@ u8 mv88e6341_serdes_get_lane(struct mv88e6xxx_chip *chip, int port)
return lane;
 }
 
+int mv88e6185_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
+  bool up)
+{
+   /* The serdes power can't be controlled on this switch chip but we need
+* to supply this function to avoid returning -EOPNOTSUPP in
+* mv88e6xxx_serdes_power_up/mv88e6xxx_serdes_power_down
+*/
+   return 0;
+}
+
+u8 mv88e6185_serdes_get_lane(struct mv88e6xxx_chip *chip, int port)
+{
+   /* There are no configurable serdes lanes on this switch chip but we
+* need to return non-zero so that callers of
+* mv88e6xxx_serdes_get_lane() know this is a serdes port.
+*/
+   switch (chip->ports[port].cmode) {
+   case MV88E6185_PORT_STS_CMODE_SERDES:
+   case MV88E6185_PORT_STS_CMODE_1000BASE_X:
+   return 0xff;
+   default:
+   return 0;
+   }
+}
+
+int mv88e6185_serdes_pcs_get_state(struct mv88e6xxx_chip *chip, int port,
+  u8 lane, struct phylink_link_state *state)
+{
+   int err;
+   u16 status;
+
+   err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_STS, &status);
+   if (err)
+   return err;
+
+   state->link = !!(status & MV88E6XXX_PORT_STS_LINK);
+
+   if (state->link) {
+   state->duplex = status & MV88E6XXX_PORT_STS_DUPLEX ? DUPLEX_FULL : DUPLEX_HALF;
+
+   switch (status &  MV88E6XXX_PORT_STS_SPEED_MASK) {
+   case MV88E6XXX_PORT_STS_SPEED_1000:
+   state->speed = SPEED_1000;
+   break;
+   case MV88E6XXX_PORT_STS_SPEED_100:
+   state->speed = SPEED_100;
+   break;
+   case MV88E6XXX_PORT_STS_SPEED_10:
+   state->speed = SPEED_10;
+   break;
+   default:
+   dev_err(chip->dev, "invalid PHY speed\n");
+   return -EINVAL;
+   }
+   } else {
+   state->duplex = DUPLEX_UNKNOWN;
+   state->speed = SPEED_UNKNOWN;
+   }
+
+   return 0;
+}
+
 u8 mv88e6390_serdes_get_lane(struct mv88e6xxx_chip

[PATCH v3 1/3] net: dsa: mv88e6xxx: Don't force link when using in-band-status

2020-10-19 Thread Chris Packham

When a port is configured with 'managed = "in-band-status"' don't force
the link up, the switch MAC will detect the link status correctly.

Signed-off-by: Chris Packham 
Reviewed-by: Andrew Lunn 
---
Changes in v3:
- None
Changes in v2:
- Add review from Andrew

 drivers/net/dsa/mv88e6xxx/chip.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index f0dbc05e30a4..1ef392ee52c5 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -767,8 +767,11 @@ static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port,
goto error;
}
 
-   if (ops->port_set_link)
-   err = ops->port_set_link(chip, port, LINK_FORCED_UP);
+   if (ops->port_set_link) {
+   int link = mode == MLO_AN_INBAND ? LINK_UNFORCED : LINK_FORCED_UP;
+
+   err = ops->port_set_link(chip, port, link);
+   }
}
 error:
mv88e6xxx_reg_unlock(chip);
-- 
2.28.0
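
The decision this patch adds to mv88e6xxx_mac_link_up() reduces to a single
mapping from the negotiation mode to the link setting. A minimal sketch, with
hypothetical enums standing in for the phylink MLO_AN_* modes and the
mv88e6xxx LINK_* values (the numeric values here are not the kernel's):

```c
#include <assert.h>

/* Hypothetical stand-ins for the phylink negotiation modes. */
enum demo_an_mode { DEMO_AN_PHY, DEMO_AN_FIXED, DEMO_AN_INBAND };

/* Hypothetical stand-ins for the mv88e6xxx link settings. */
enum demo_link { DEMO_LINK_FORCED_DOWN, DEMO_LINK_FORCED_UP, DEMO_LINK_UNFORCED };

/* In-band status: let the MAC detect the link; otherwise force it up. */
static enum demo_link demo_link_for_mode(enum demo_an_mode mode)
{
	assert(mode <= DEMO_AN_INBAND);

	return mode == DEMO_AN_INBAND ? DEMO_LINK_UNFORCED : DEMO_LINK_FORCED_UP;
}
```

Only the in-band mode leaves the link unforced; every other mode keeps the
previous behaviour of forcing the link up.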

[PATCH v3 0/3] net: dsa: mv88e6xxx: serdes link without phy

2020-10-19 Thread Chris Packham

This small series gets my hardware into a working state. The key points are to
make sure we don't force the link and that we ask the MAC for the link status.
I also have updated my dts to say `phy-mode = "1000base-x";` and `managed =
"in-band-status";`

I've included patch #3 in this series but I don't have anything to test it on.
It's just a guess based on the datasheets. I'd suggest applying patch 1 & 2
and leaving 3 for the mailing list archives.

Chris Packham (3):
  net: dsa: mv88e6xxx: Don't force link when using in-band-status
  net: dsa: mv88e6xxx: Support serdes ports on MV88E6097/6095/6185
  net: dsa: mv88e6xxx: Support serdes ports on MV88E6123/6131

 drivers/net/dsa/mv88e6xxx/chip.c   |  26 ++-
 drivers/net/dsa/mv88e6xxx/serdes.c | 106 +
 drivers/net/dsa/mv88e6xxx/serdes.h |   9 +++
 3 files changed, 139 insertions(+), 2 deletions(-)

-- 
2.28.0

Re: [PATCH v8 -tip 13/26] kernel/entry: Add support for core-wide protection of kernel-mode

2020-10-19 Thread Randy Dunlap


On 10/19/20 6:43 PM, Joel Fernandes (Google) wrote:


---
  .../admin-guide/kernel-parameters.txt |   7 +
  include/linux/entry-common.h  |   2 +-
  include/linux/sched.h |  12 +
  kernel/entry/common.c |  25 +-
  kernel/sched/core.c   | 229 ++
  kernel/sched/sched.h  |   3 +
  6 files changed, 275 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3236427e2215..48567110f709 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4678,6 +4678,13 @@
  
  	sbni=		[NET] Granch SBNI12 leased line adapter
  
+	sched_core_protect_kernel=


Needs a list of possible values after '=', along with telling us
what the default value/setting is.



+   [SCHED_CORE] Pause SMT siblings of a core running in
+   user mode, if at least one of the siblings of the core
+   is running in kernel mode. This is to guarantee that
+   kernel data is not leaked to tasks which are not trusted
+   by the kernel.
+



thanks.

Re: [PATCH v8 -tip 25/26] Documentation: Add core scheduling documentation

2020-10-19 Thread Randy Dunlap


Hi Joel,

On 10/19/20 6:43 PM, Joel Fernandes (Google) wrote:

Document the usecases, design and interfaces for core scheduling.

Co-developed-by: Vineeth Pillai 
Tested-by: Julien Desfossez 
Signed-off-by: Joel Fernandes (Google) 
---
  .../admin-guide/hw-vuln/core-scheduling.rst   | 312 ++
  Documentation/admin-guide/hw-vuln/index.rst   |   1 +
  2 files changed, 313 insertions(+)
  create mode 100644 Documentation/admin-guide/hw-vuln/core-scheduling.rst

diff --git a/Documentation/admin-guide/hw-vuln/core-scheduling.rst b/Documentation/admin-guide/hw-vuln/core-scheduling.rst
new file mode 100644
index ..eacafbb8fa3f
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/core-scheduling.rst
@@ -0,0 +1,312 @@
+Core Scheduling
+***************
+Core scheduling support allows userspace to define groups of tasks that can
+share a core. These groups can be specified either for security usecases (one
+group of tasks don't trust another), or for performance usecases (some
+workloads may benefit from running on the same core as they don't need the same
+hardware resources of the shared core).
+
+Security usecase
+----------------
+A cross-HT attack involves the attacker and victim running on different
+Hyper Threads of the same core. MDS and L1TF are examples of such attacks.
+Without core scheduling, the only full mitigation of cross-HT attacks is to
+disable Hyper Threading (HT). Core scheduling allows HT to be turned on safely
+by ensuring that trusted tasks can share a core. This increase in core sharing
+can improvement performance, however it is not guaranteed that performance will
+always improve, though that is seen to be the case with a number of real world
+workloads. In theory, core scheduling aims to perform at least as good as when
+Hyper Threading is disabled. In practise, this is mostly the case though not
+always: as synchronizing scheduling decisions across 2 or more CPUs in a core
+involves additional overhead - especially when the system is lightly loaded
+(``total_threads <= N/2``).


N is number of CPUs?


+
+Usage
+-----
+Core scheduling support is enabled via the ``CONFIG_SCHED_CORE`` config option.
+Using this feature, userspace defines groups of tasks that trust each other.
+The core scheduler uses this information to make sure that tasks that do not
+trust each other will never run simultaneously on a core, while doing its best
+to satisfy the system's scheduling requirements.
+
+There are 2 ways to use core-scheduling:
+
+CGroup
+######
+Core scheduling adds additional files to the CPU controller CGroup:
+
+* ``cpu.tag``
+Writing ``1`` into this file results in all tasks in the group get tagged. This


  getting
  or being


+results in all the CGroup's tasks allowed to run concurrently on a core's
+hyperthreads (also called siblings).
+
+The file being a value of ``0`` means the tag state of the CGroup is inheritted



inherited


+from its parent hierarchy. If any ancestor of the CGroup is tagged, then the
+group is tagged.
+
+.. note:: Once a CGroup is tagged via cpu.tag, it is not possible to set this
+  for any descendant of the tagged group. For finer grained control, the
+  ``cpu.tag_color`` file described next may be used.
+
+.. note:: When a CGroup is not tagged, all the tasks within the group can share
+  a core with kernel threads and untagged system threads. For this reason,
+  if a group has ``cpu.tag`` of 0, it is considered to be trusted.
+
+* ``cpu.tag_color``
+For finer grained control over core sharing, a color can also be set in
+addition to the tag. This allows to further control core sharing between child
+CGroups within an already tagged CGroup. The color and the tag are both used to
+generate a `cookie` which is used by the scheduler to identify the group.
+
+Upto 256 different colors can be set (0-255) by writing into this file.


  Up to


+
+A sample real-world usage of this file follows:
+
+Google uses DAC controls to make ``cpu.tag`` writeable only by root and the


$search tells me "writable".


+``cpu.tag_color`` can be changed by anyone.
+
+The hierarchy looks like this:
+::
+  Root group
+ / \
+A   B(These are created by the root daemon - borglet).
+   / \   \
+  C   D   E  (These are created by AppEngine within the container).
+
+A and B are containers for 2 different jobs or apps that are created by a root
+daemon called borglet. borglet then tags each of these group with the ``cpu.tag``
+file. The job itself can create additional child CGroups which are colored by
+the container's AppEngine with the ``cpu.tag_color`` file.
+
+The reason why Google uses this 2-level tagging system is that AppEngine wants to
+allow a subset of child CGroups within a tagged parent CGroup to be co-sche
