Re: [PATCH v9 2/4] fs: Add standard casefolding support
On Tue, Jun 23, 2020 at 09:33:39PM -0700, Daniel Rosenberg wrote: > This adds general supporting functions for filesystems that use > utf8 casefolding. It provides standard dentry_operations and adds the > necessary structures in struct super_block to allow this standardization. > > Ext4 and F2fs will switch to these common implementations. > > Signed-off-by: Daniel Rosenberg > --- > fs/libfs.c | 101 + > include/linux/fs.h | 22 ++ > 2 files changed, 123 insertions(+) > > diff --git a/fs/libfs.c b/fs/libfs.c > index 4d08edf19c782..f7345a5ed562f 100644 > --- a/fs/libfs.c > +++ b/fs/libfs.c > @@ -20,6 +20,8 @@ > #include > #include > #include > +#include > +#include > > #include > > @@ -1363,3 +1365,102 @@ bool is_empty_dir_inode(struct inode *inode) > return (inode->i_fop == _dir_operations) && > (inode->i_op == _dir_inode_operations); > } > + > +#ifdef CONFIG_UNICODE > +/** > + * needs_casefold - generic helper to determine if a filename should be > casefolded > + * @dir: Parent directory > + * > + * Generic helper for filesystems to use to determine if the name of a dentry > + * should be casefolded. It does not make sense to casefold the no-key token > of > + * an encrypted filename. > + * > + * Return: if names will need casefolding > + */ > +bool needs_casefold(const struct inode *dir) > +{ > + return IS_CASEFOLDED(dir) && dir->i_sb->s_encoding && > + (!IS_ENCRYPTED(dir) || fscrypt_has_encryption_key(dir)); > +} > +EXPORT_SYMBOL(needs_casefold); Note that the '!IS_ENCRYPTED(dir) || fscrypt_has_encryption_key(dir)' check can be racy, because a process can be looking up a no-key token in a directory while concurrently another process initializes the directory's ->i_crypt_info, causing fscrypt_has_encryption_key(dir) to suddenly start returning true. In my rework of filename handling in f2fs, I actually ended up removing all calls to needs_casefold(), thus avoiding this race. 
f2fs now decides whether the name is going to need casefolding early on, in __f2fs_setup_filename(), where it knows in a race-free way whether the filename is a no-key token or not. Perhaps ext4 should work the same way? It did look like there would be some extra complexity due to how the ext4 directory hashing works in comparison to f2fs's, but I haven't had a chance to properly investigate it. - Eric
Re: [PATCH][next] dmaengine: hisilicon: Use struct_size() in devm_kzalloc()
On 17-06-20, 16:11, Gustavo A. R. Silva wrote: > Make use of the struct_size() helper instead of an open-coded version > in order to avoid any potential type mistakes. > > This code was detected with the help of Coccinelle and, audited and > fixed manually. Applied, thanks -- ~Vinod
Re: [PATCH][next] dmaengine: ti: k3-udma: Use struct_size() in kzalloc()
On Wed, 2020-06-24 at 11:25 +0530, Vinod Koul wrote: > On 19-06-20, 17:43, Gustavo A. R. Silva wrote: > > Make use of the struct_size() helper instead of an open-coded version > > in order to avoid any potential type mistakes. > > > > This code was detected with the help of Coccinelle and, audited and > > fixed manually. > > > > Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83 Is this odd tag really useful? > > Signed-off-by: Gustavo A. R. Silva > > --- > > drivers/dma/ti/k3-udma.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c > > index 0d5fb154b8e2..411c54b86ba8 100644 > > --- a/drivers/dma/ti/k3-udma.c > > +++ b/drivers/dma/ti/k3-udma.c > > @@ -2209,7 +2209,7 @@ udma_prep_slave_sg_pkt(struct udma_chan *uc, struct > > scatterlist *sgl, > > u32 ring_id; > > unsigned int i; > > > > - d = kzalloc(sizeof(*d) + sglen * sizeof(d->hwdesc[0]), GFP_NOWAIT); > > + d = kzalloc(struct_size(d, hwdesc, sglen), GFP_NOWAIT); > > struct_size() is a * b + c but here we need, a + b * c.. the trailing > struct is N times here.. > > > > if (!d) > > return NULL; > > > > @@ -2525,7 +2525,7 @@ udma_prep_dma_cyclic_pkt(struct udma_chan *uc, > > dma_addr_t buf_addr, > > if (period_len >= SZ_4M) > > return NULL; > > > > - d = kzalloc(sizeof(*d) + periods * sizeof(d->hwdesc[0]), GFP_NOWAIT); > > + d = kzalloc(struct_size(d, hwdesc, periods), GFP_NOWAIT); > > if (!d) > > return NULL; > > > > -- > > 2.27.0
Re: [PATCH v12 00/10] Introduce support for guest CET feature
On Tue, Jun 23, 2020 at 11:39:19AM -0700, Sean Christopherson wrote: > On Thu, Jun 11, 2020 at 09:29:13AM +0800, Yang Weijiang wrote: > > On Wed, Jun 10, 2020 at 09:56:36AM -0700, Sean Christopherson wrote: > > > On Wed, May 06, 2020 at 04:20:59PM +0800, Yang Weijiang wrote: > > > > Several parts in KVM have been updated to provide VM CET support, > > > > including: > > > > CPUID/XSAVES config, MSR pass-through, user space MSR access interface, > > > > vmentry/vmexit config, nested VM etc. These patches have dependency on > > > > CET > > > > kernel patches for xsaves support and CET definitions, e.g., MSR and > > > > related > > > > feature flags. > > > > > > Other than the MSR and cpufeatures flags definitions, is there any direct > > > dependency on kernel CET support? I.e. if/when XSAVES support is merged, > > > is there anything beyond the architectural definitions that are required > > > to > > > merge KVM CET virtualization? > > No, KVM CET patches only depend on kernel CET related definitions and > > XSAVES > > support now. > > Neato. > > > But to make guest CET work, we need CET patches for QEMU. > > Ya, but we don't need to wait for host kernel support, which was the crux of > my question. > > > Can you please respin this series with the CET definition patches included? > The XSAVES support has been queued to tip/x86/fpu. Assuming that lands in > kernel 5.9, I _think_ KVM support for CET can land in 5.10. Sure. Besides this change and the unrestricted guest case change, any other changes I should do to v12 patch? Thanks for review! > > Base your series on kvm/queue, i.e. don't worry about the XSAVES patches, > I'll merge them in from tip/x86/fpu for testing. > > Thanks!
Re: [PATCH 5.4 000/314] 5.4.49-rc1 review
On Tue, Jun 23, 2020 at 09:14:21PM -0700, Nathan Chancellor wrote: > On Tue, Jun 23, 2020 at 09:53:15PM +0200, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 5.4.49 release. > > There are 314 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Thu, 25 Jun 2020 19:52:30 +. > > Anything received after that time might be too late. > > > > The whole patch series can be found in one patch at: > > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.49-rc1.gz > > or in the git tree and branch at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > > linux-5.4.y > > and the diffstat can be found below. > > > > thanks, > > > > greg k-h > > I ran this through my LLVM build tests and saw no regressions compared > to 5.4.47. Great, thanks for testing! greg k-h
Re: [PATCH][next] dmaengine: ti: k3-udma: Use struct_size() in kzalloc()
On 19-06-20, 17:43, Gustavo A. R. Silva wrote: > Make use of the struct_size() helper instead of an open-coded version > in order to avoid any potential type mistakes. > > This code was detected with the help of Coccinelle and, audited and > fixed manually. > > Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83 > Signed-off-by: Gustavo A. R. Silva > --- > drivers/dma/ti/k3-udma.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c > index 0d5fb154b8e2..411c54b86ba8 100644 > --- a/drivers/dma/ti/k3-udma.c > +++ b/drivers/dma/ti/k3-udma.c > @@ -2209,7 +2209,7 @@ udma_prep_slave_sg_pkt(struct udma_chan *uc, struct > scatterlist *sgl, > u32 ring_id; > unsigned int i; > > - d = kzalloc(sizeof(*d) + sglen * sizeof(d->hwdesc[0]), GFP_NOWAIT); > + d = kzalloc(struct_size(d, hwdesc, sglen), GFP_NOWAIT); struct_size() is a * b + c but here we need, a + b * c.. the trailing struct is N times here.. > if (!d) > return NULL; > > @@ -2525,7 +2525,7 @@ udma_prep_dma_cyclic_pkt(struct udma_chan *uc, > dma_addr_t buf_addr, > if (period_len >= SZ_4M) > return NULL; > > - d = kzalloc(sizeof(*d) + periods * sizeof(d->hwdesc[0]), GFP_NOWAIT); > + d = kzalloc(struct_size(d, hwdesc, periods), GFP_NOWAIT); > if (!d) > return NULL; > > -- > 2.27.0 -- ~Vinod
Re: [PATCH 5.4 000/314] 5.4.49-rc1 review
On Tue, Jun 23, 2020 at 10:05:35PM -0700, Guenter Roeck wrote: > On 6/23/20 12:53 PM, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 5.4.49 release. > > There are 314 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Thu, 25 Jun 2020 19:52:30 +. > > Anything received after that time might be too late. > > > > [ ... ] > > > Christophe JAILLET > > pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak > > in case of error in 'imx_pinctrl_probe()' > > > This patch has since been reverted in the upstream kernel (commit > 13f2d25b951f) > and needs to be dropped. Now dropped, thanks. greg k-h
Re: [PATCH 5.7 000/477] 5.7.6-rc1 review
On Tue, Jun 23, 2020 at 10:07:08PM -0700, Guenter Roeck wrote: > On 6/23/20 12:49 PM, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 5.7.6 release. > > There are 477 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Thu, 25 Jun 2020 19:52:30 +. > > Anything received after that time might be too late. > > > [ ... ] > > Christophe JAILLET > > pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak > > in case of error in 'imx_pinctrl_probe()' > > > This patch has since been reverted in the upstream kernel (commit > 13f2d25b951f) > and needs to be dropped. Now dropped, thanks. greg k-h
Re: [PATCH 5.7 318/477] pinctrl: freescale: imx: Use devm_of_iomap() to avoid a resource leak in case of error in imx_pinctrl_probe()
On Wed, Jun 24, 2020 at 07:05:05AM +0200, Marion & Christophe JAILLET wrote: > Hi, > > This one must NOT be included. It generates a regression. > This should be removed from 5.4 as well. > > See 13f2d25b951f139064ec2dd53c0c7ebdf8d8007e. > > There is also a thread on ML about it. I couldn't find it right away, but > I'm sure that Dan will be quicker than me for finding it, if needed ;-). Now dropped, thanks! greg k-h
Re: [PATCH v2 2/2] cpufreq: Specify default governor on command line
On 23-06-20, 15:21, Quentin Perret wrote: > Currently, the only way to specify the default CPUfreq governor is via > Kconfig options, which suits users who can build the kernel themselves > perfectly. > > However, for those who use a distro-like kernel (such as Android, with > the Generic Kernel Image project), the only way to use a different > default is to boot to userspace, and to then switch using the sysfs > interface. Being able to specify the default governor on the command > line, like is the case for cpuidle, would enable those users to specify > their governor of choice earlier on, and to simplify slightly the > userspace boot procedure. > > To support this use-case, add a kernel command line parameter enabling > to specify a default governor for CPUfreq, which takes precedence over > the builtin default. > > This implementation has one notable limitation: the default governor > must be registered before the driver. This is solved for builtin > governors and drivers using appropriate *_initcall() functions. And in > the modular case, this must be reflected as a constraint on the module > loading order. > > Signed-off-by: Quentin Perret > --- > .../admin-guide/kernel-parameters.txt | 5 > Documentation/admin-guide/pm/cpufreq.rst | 6 ++--- > drivers/cpufreq/cpufreq.c | 23 +++ > 3 files changed, 26 insertions(+), 8 deletions(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt > b/Documentation/admin-guide/kernel-parameters.txt > index fb95fad81c79..5fd3c9f187eb 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -703,6 +703,11 @@ > cpufreq.off=1 [CPU_FREQ] > disable the cpufreq sub-system > > + cpufreq.default_governor= > + [CPU_FREQ] Name of the default cpufreq governor to use. > + This governor must be registered in the kernel before > + the cpufreq driver probes.
> + > cpu_init_udelay=N > [X86] Delay for N microsec between assert and de-assert > of APIC INIT to start processors. This delay occurs > diff --git a/Documentation/admin-guide/pm/cpufreq.rst > b/Documentation/admin-guide/pm/cpufreq.rst > index 0c74a7784964..368e612145d2 100644 > --- a/Documentation/admin-guide/pm/cpufreq.rst > +++ b/Documentation/admin-guide/pm/cpufreq.rst > @@ -147,9 +147,9 @@ CPUs in it. > > The next major initialization step for a new policy object is to attach a > scaling governor to it (to begin with, that is the default scaling governor > -determined by the kernel configuration, but it may be changed later > -via ``sysfs``). First, a pointer to the new policy object is passed to the > -governor's ``->init()`` callback which is expected to initialize all of the > +determined by the kernel command line or configuration, but it may be changed > +later via ``sysfs``). First, a pointer to the new policy object is passed to > +the governor's ``->init()`` callback which is expected to initialize all of > the > data structures necessary to handle the given policy and, possibly, to add > a governor ``sysfs`` interface to it. Next, the governor is started by > invoking its ``->start()`` callback. > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c > index 0128de3603df..4b1a5c0173cf 100644 > --- a/drivers/cpufreq/cpufreq.c > +++ b/drivers/cpufreq/cpufreq.c > @@ -50,6 +50,9 @@ static LIST_HEAD(cpufreq_governor_list); > #define for_each_governor(__governor)\ > list_for_each_entry(__governor, _governor_list, governor_list) > > +static char cpufreq_param_governor[CPUFREQ_NAME_LEN]; > +static struct cpufreq_governor *default_governor; > + > /** > * The "cpufreq driver" - the arch- or hardware-dependent low > * level driver of CPUFreq support, and its spinlock. 
This lock > @@ -1055,7 +1058,6 @@ __weak struct cpufreq_governor > *cpufreq_default_governor(void) > > static int cpufreq_init_policy(struct cpufreq_policy *policy) > { > - struct cpufreq_governor *def_gov = cpufreq_default_governor(); > struct cpufreq_governor *gov = NULL; > unsigned int pol = CPUFREQ_POLICY_UNKNOWN; > > @@ -1065,8 +1067,8 @@ static int cpufreq_init_policy(struct cpufreq_policy > *policy) > if (gov) { > pr_debug("Restoring governor %s for cpu %d\n", >policy->governor->name, policy->cpu); > - } else if (def_gov) { > - gov = def_gov; > + } else if (default_governor) { > + gov = default_governor; > } else { > return -ENODATA; > } > @@ -1074,8 +1076,8 @@ static int cpufreq_init_policy(struct cpufreq_policy > *policy) > /* Use the default policy if there is no last_policy.
Re: linux-next: build failure after merge of the drm-misc tree
On Wed, 24 Jun 2020 at 11:36, Stephen Rothwell wrote: > > Hi all, > > On Wed, 17 Jun 2020 10:59:29 +1000 Stephen Rothwell > wrote: > > > > After merging the drm-misc tree, today's linux-next build (x86_64 > > allmodconfig) failed like this: > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c: In function > > 'amdgpu_amdkfd_gpuvm_free_memory_of_gpu': > > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:1357:2: error: implicit > > declaration of function 'drm_gem_object_put_unlocked'; did you mean > > 'drm_gem_object_put_locked'? [-Werror=implicit-function-declaration] > > 1357 | drm_gem_object_put_unlocked(>bo->tbo.base); > > | ^~~ > > | drm_gem_object_put_locked > > > > Caused by commit > > > > ab15d56e27be ("drm: remove transient drm_gem_object_put_unlocked()") > > > > interacting with commit > > > > fd9a9f8801de ("drm/amdgpu: Use GEM obj reference for KFD BOs") > > > > from Linus' tree. > > > > I have applied the following merge fix up patch for today. > > > > From: Stephen Rothwell > > Date: Wed, 17 Jun 2020 10:55:32 +1000 > > Subject: [PATCH] drm/amdgpu: remove stray drm_gem_object_put_unlocked > > > > Signed-off-by: Stephen Rothwell > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > > index b91b5171270f..9015c7b76d60 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > > @@ -1354,7 +1354,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu( > > } > > > > /* Free the BO*/ > > - drm_gem_object_put_unlocked(>bo->tbo.base); > > + drm_gem_object_put(>bo->tbo.base); > > mutex_destroy(>lock); > > kfree(mem); > > > > -- > > 2.26.2 > > This fix is now needed when I merge the drm tree :-( > > Given that the drm tree is based on v5.8-rc2 and the commit from Linus' > tree above was merged before v5.8-rc1, the 
above patch should be > applied to the drm tree (and should have been part of the patch that > merged the drm-misc tree). I am a bit surprised that the drm tree > currently passes CI. My bad, my local builds passed, as I had made the change but forgot the commit --amend. Pushed out a new head with it in it now. Dave.
Re: [PATCH 2/3] nvme-pci: Add controller memory buffer supported macro
On Tue, Jun 23, 2020 at 07:58:17PM -0700, Keith Busch wrote: > On Tue, Jun 23, 2020 at 06:27:51PM +0200, Christoph Hellwig wrote: > > On Tue, Jun 23, 2020 at 09:24:33PM +0800, Baolin Wang wrote: > > > Introduce a new capability macro to indicate if the controller > > > supports the memory buffer or not, instead of reading the > > > NVME_REG_CMBSZ register. > > > > This is a complex issue. The CMBS bit was only added in NVMe 1.4 as > > a backwards incompatible change, as the CMB addressing scheme can lead > > to data corruption. The CMBS was added as part of the horrible hack > > that also involves the CBA field, which we'll need to see before > > using it to work around the addressing issue. At the same time we > > should also continue supporting the legacy pre-1.4 CMB with a warning > > (and may reject it if we know we run in a VM). > > Well, a CMB from an emulated controller (like qemu's) can be used within > a VM. It's only if you direct assign a PCI function that CMB usage > breaks. But we have no idea if a controller is assigned or emulated.
Re: [PATCH v9 4/4] ext4: Use generic casefolding support
Daniel Rosenberg writes: > - > const struct dentry_operations ext4_dentry_ops = { > - .d_hash = ext4_d_hash, > - .d_compare = ext4_d_compare, > + .d_hash = generic_ci_d_hash, > + .d_compare = generic_ci_d_compare, > }; > #endif Can you make the structure generic since it is the same for f2fs and ext4, which would let you drop the code guards? Unless that becomes a problem for d_revalidate with fscrypt, it is fine like this. > #ifdef CONFIG_UNICODE > - sbi = EXT4_SB(sb); > - if (ext4_has_strict_mode(sbi) && IS_CASEFOLDED(dir) && > - sbi->s_encoding && utf8_validate(sbi->s_encoding, >d_name)) > + if (sb_has_enc_strict_mode(sb) && IS_CASEFOLDED(dir) && I keep reading the 'enc' in sb_has_enc_strict_mode() as 'encryption'. What do you think about renaming it to sb_has_strict_encoding()? These comments apply equally to patches 3 and 4. Other than that, Reviewed-by: Gabriel Krisman Bertazi -- Gabriel Krisman Bertazi
Re: [PATCH v9 2/4] fs: Add standard casefolding support
On Tue, Jun 23, 2020 at 09:33:39PM -0700, Daniel Rosenberg wrote: > This adds general supporting functions for filesystems that use > utf8 casefolding. It provides standard dentry_operations and adds the > necessary structures in struct super_block to allow this standardization. > > Ext4 and F2fs will switch to these common implementations. It would be helpful to explicitly call out anything that's "new" in this commit, i.e. anything that isn't simply moving code into the libfs with no behavior changes. There's the change of ->d_hash() to use utf8_casefold_hash() instead of allocating memory; that's not present in the ext4 and f2fs versions. There's also the change of needs_casefold() to be aware of encrypt+casefold. (Maybe that small change would better belong in a later patchset that actually introduces encrypt+casefold support?) Anything else? > +/** > + * generic_ci_d_hash - generic d_hash implementation for casefolding > filesystems > + * @dentry: dentry whose name we are hashing > + * @str: qstr of name whose hash we should fill in > + * > + * Return: 0 if hash was successful, or -ERRNO > + */ It also returns 0 if the hashing was not done because it wants to fallback to the standard hashing. > +static inline bool needs_casefold(const struct inode *dir) > +{ > + return 0; > +} The return type is bool, so it should 'return false', not 'return 0'. - Eric
Re: [PATCH v9 1/4] unicode: Add utf8_casefold_hash
On Tue, Jun 23, 2020 at 09:33:38PM -0700, Daniel Rosenberg wrote: > This adds a case insensitive hash function to allow taking the hash > without needing to allocate a casefolded copy of the string. It would be helpful to add a few more details in this commit message. Somewhat along the lines of: ->d_hash() for casefolding currently allocates memory; it needs to use GFP_ATOMIC due to ->d_hash() being called in rcu-walk mode, which is unreliable and inefficient. This patch allows solving that problem by removing the need to allocate memory. - Eric
Re: process '/usr/bin/rsync' started with executable stack
On Tue, 23 Jun 2020, Kees Cook wrote: > > If you run something with exec stack after the message > > you shouldn't get it second time. > > If you want to reset this flag, you can do: > # echo 1 > /sys/kernel/debug/clear_warn_once Thanks. Although, I tend to not mount /sys/kernel/{config,debug,tracing} and other things, I always thought they are not needed and could maybe lower the attack surface if not mounted. Or maybe my tinfoil hat needs some adjustment... Christian. -- BOFH excuse #279: The static electricity routing is acting up...
Re: [PATCH v9 0/4] Prepare for upcoming Casefolding/Encryption patches
On Tue, Jun 23, 2020 at 09:33:37PM -0700, Daniel Rosenberg wrote: > This lays the ground work for enabling casefolding and encryption at the > same time for ext4 and f2fs. A future set of patches will enable that > functionality. These unify the highly similar dentry_operations that ext4 > and f2fs both use for casefolding. I think this undersells this patchset a bit. This patchset makes ext4 and f2fs share the casefolded ->d_compare() and ->d_hash() implementations, which eliminates duplicated code. That's a good thing regardless of whether we're going to add encrypt+casefold support or not. It also changes the casefolded ->d_hash() implementation to not have to allocate memory (with GFP_ATOMIC, no less), which was a big problem with the old implementation as it's unreliable and inefficient. So yes, this prepares for supporting encrypt+casefold. But these changes make sense on their own too as an improvement of the casefold feature. Except for the one line of code in needs_casefold() that is specific to encrypt+casefold; maybe that should be left out for now. (Side note: I think you could drop linux-doc and linux-mtd from Cc, as this patchset isn't really relevant to those mailing lists.) - Eric
[tip:ras/core] BUILD SUCCESS bb2de0adca217a114ce023489426e24152e4bfcf
defconfig
powerpc   rhel-kconfig
powerpc   allmodconfig
powerpc   allnoconfig
i386      randconfig-a006-20200623
i386      randconfig-a002-20200623
i386      randconfig-a003-20200623
i386      randconfig-a001-20200623
i386      randconfig-a005-20200623
i386      randconfig-a004-20200623
i386      randconfig-a013-20200623
i386      randconfig-a016-20200623
i386      randconfig-a012-20200623
i386      randconfig-a014-20200623
i386      randconfig-a015-20200623
i386      randconfig-a011-20200623
riscv     allyesconfig
riscv     allnoconfig
riscv     allmodconfig
s390      allyesconfig
s390      allmodconfig
s390      defconfig
sparc     allyesconfig
sparc     defconfig
sparc64   allnoconfig
sparc64   allyesconfig
sparc64   allmodconfig
um        allmodconfig
um        allnoconfig
um        allyesconfig
um        defconfig
x86_64    rhel-7.6
x86_64    rhel-7.6-kselftests
x86_64    rhel-8.3
x86_64    kexec
x86_64    rhel
x86_64    rhel-7.2-clear
x86_64    lkp
x86_64    fedora-25

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH v9 2/4] fs: Add standard casefolding support
Daniel Rosenberg writes: > This adds general supporting functions for filesystems that use > utf8 casefolding. It provides standard dentry_operations and adds the > necessary structures in struct super_block to allow this standardization. > > Ext4 and F2fs will switch to these common implementations. > > Signed-off-by: Daniel Rosenberg > --- > fs/libfs.c | 101 + > include/linux/fs.h | 22 ++ > 2 files changed, 123 insertions(+) > > diff --git a/fs/libfs.c b/fs/libfs.c > index 4d08edf19c782..f7345a5ed562f 100644 > --- a/fs/libfs.c > +++ b/fs/libfs.c > @@ -20,6 +20,8 @@ > #include > #include > #include > +#include > +#include > > #include > > @@ -1363,3 +1365,102 @@ bool is_empty_dir_inode(struct inode *inode) > return (inode->i_fop == _dir_operations) && > (inode->i_op == _dir_inode_operations); > } > + > +#ifdef CONFIG_UNICODE > +/** > + * needs_casefold - generic helper to determine if a filename should be > casefolded > + * @dir: Parent directory > + * > + * Generic helper for filesystems to use to determine if the name of a dentry > + * should be casefolded. It does not make sense to casefold the no-key token > of > + * an encrypted filename. 
> + * > + * Return: if names will need casefolding > + */ > +bool needs_casefold(const struct inode *dir) > +{ > + return IS_CASEFOLDED(dir) && dir->i_sb->s_encoding && > + (!IS_ENCRYPTED(dir) || fscrypt_has_encryption_key(dir)); > +} > +EXPORT_SYMBOL(needs_casefold); > + > +/** > + * generic_ci_d_compare - generic d_compare implementation for casefolding > filesystems > + * @dentry: dentry whose name we are checking against > + * @len: len of name of dentry > + * @str: str pointer to name of dentry > + * @name:Name to compare against > + * > + * Return: 0 if names match, 1 if mismatch, or -ERRNO > + */ > +int generic_ci_d_compare(const struct dentry *dentry, unsigned int len, > + const char *str, const struct qstr *name) > +{ > + const struct dentry *parent = READ_ONCE(dentry->d_parent); > + const struct inode *inode = READ_ONCE(parent->d_inode); > + const struct super_block *sb = dentry->d_sb; > + const struct unicode_map *um = sb->s_encoding; > + struct qstr qstr = QSTR_INIT(str, len); > + char strbuf[DNAME_INLINE_LEN]; > + int ret; > + > + if (!inode || !needs_casefold(inode)) > + goto fallback; > + /* > + * If the dentry name is stored in-line, then it may be concurrently > + * modified by a rename. If this happens, the VFS will eventually retry > + * the lookup, so it doesn't matter what ->d_compare() returns. > + * However, it's unsafe to call utf8_strncasecmp() with an unstable > + * string. Therefore, we have to copy the name into a temporary buffer. 
> + */ > + if (len <= DNAME_INLINE_LEN - 1) { > + memcpy(strbuf, str, len); > + strbuf[len] = 0; > + qstr.name = strbuf; > + /* prevent compiler from optimizing out the temporary buffer */ > + barrier(); > + } > + ret = utf8_strncasecmp(um, name, ); > + if (ret >= 0) > + return ret; > + > + if (sb_has_enc_strict_mode(sb)) > + return -EINVAL; > +fallback: > + if (len != name->len) > + return 1; > + return !!memcmp(str, name->name, len); > +} > +EXPORT_SYMBOL(generic_ci_d_compare); > + > +/** > + * generic_ci_d_hash - generic d_hash implementation for casefolding > filesystems > + * @dentry: dentry whose name we are hashing > + * @str: qstr of name whose hash we should fill in > + * > + * Return: 0 if hash was successful, or -ERRNO > + */ > +int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str) > +{ > + const struct inode *inode = READ_ONCE(dentry->d_inode); > + struct super_block *sb = dentry->d_sb; > + const struct unicode_map *um = sb->s_encoding; > + int ret = 0; > + > + if (!inode || !needs_casefold(inode)) > + return 0; > + > + ret = utf8_casefold_hash(um, dentry, str); > + if (ret < 0) > + goto err; > + > + return 0; > +err: > + if (sb_has_enc_strict_mode(sb)) > + ret = -EINVAL; > + else > + ret = 0; > + return ret; > +} Maybe drop the err label and simplify: ret = utf8_casefold_hash(um, dentry, str); if (ret < 0 && sb_has_enc_strict_mode(sb)) return -EINVAL; return 0; > +EXPORT_SYMBOL(generic_ci_d_hash); > +#endif > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 3f881a892ea74..261904e06873b 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -1392,6 +1392,12 @@ extern int send_sigurg(struct fown_struct *fown); > #define SB_ACTIVE(1<<30) > #define SB_NOUSER(1<<31) > > +/* These flags relate to encoding and casefolding */ > +#define SB_ENC_STRICT_MODE_FL(1 << 0) > + >
[PATCH 2/9] soundwire: intel: revisit SHIM programming sequences.
From: Pierre-Louis Bossart Somehow the existing code is not aligned with the steps described in the documentation, refactor code and make sure the register programming sequences are correct. Also add missing power-up, power-down and wake capabilities (the last two are used in follow-up patches but introduced here for consistency). Some of the SHIM registers exposed fields that are link specific, and in addition some of the power-related registers (SPA/CPA) take time to be updated. Uncontrolled access leads to timeouts or errors. Add a mutex, shared by all links, so that all accesses to such registers are serialized, and follow a pattern of read-modify-write. This includes making sure SHIM_SYNC is programmed only once, before the first master is powered on. We use a 'shim_mask' field, shared between all links and protected by a mutex, to deal with power-up and power-down sequences. Note that the SYNCPRD value is tied only to the XTAL value and not the current bus frequency or the frame rate. 
BugLink: https://github.com/thesofproject/linux/issues/1555 Signed-off-by: Rander Wang Signed-off-by: Pierre-Louis Bossart Signed-off-by: Bard Liao --- drivers/soundwire/intel.c | 241 +++- drivers/soundwire/intel.h | 4 + drivers/soundwire/intel_init.c | 4 + include/linux/soundwire/sdw_intel.h | 2 + 4 files changed, 215 insertions(+), 36 deletions(-) diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c index 8c7ae07c0fe1..4792613e8e5a 100644 --- a/drivers/soundwire/intel.c +++ b/drivers/soundwire/intel.c @@ -46,7 +46,8 @@ #define SDW_SHIM_LCTL_SPA BIT(0) #define SDW_SHIM_LCTL_CPA BIT(8) -#define SDW_SHIM_SYNC_SYNCPRD_VAL 0x176F +#define SDW_SHIM_SYNC_SYNCPRD_VAL_24 (24000 / SDW_CADENCE_GSYNC_KHZ - 1) +#define SDW_SHIM_SYNC_SYNCPRD_VAL_38_4 (38400 / SDW_CADENCE_GSYNC_KHZ - 1) #define SDW_SHIM_SYNC_SYNCPRD GENMASK(14, 0) #define SDW_SHIM_SYNC_SYNCCPU BIT(15) #define SDW_SHIM_SYNC_CMDSYNC_MASK GENMASK(19, 16) @@ -272,8 +273,46 @@ static int intel_link_power_up(struct sdw_intel *sdw) { unsigned int link_id = sdw->instance; void __iomem *shim = sdw->link_res->shim; + u32 *shim_mask = sdw->link_res->shim_mask; + struct sdw_bus *bus = >cdns.bus; + struct sdw_master_prop *prop = >prop; int spa_mask, cpa_mask; - int link_control, ret; + int link_control; + int ret = 0; + u32 syncprd; + u32 sync_reg; + + mutex_lock(sdw->link_res->shim_lock); + + /* +* The hardware relies on an internal counter, typically 4kHz, +* to generate the SoundWire SSP - which defines a 'safe' +* synchronization point between commands and audio transport +* and allows for multi link synchronization. The SYNCPRD value +* is only dependent on the oscillator clock provided to +* the IP, so adjust based on _DSD properties reported in DSDT +* tables. The values reported are based on either 24MHz +* (CNL/CML) or 38.4 MHz (ICL/TGL+). 
+*/ + if (prop->mclk_freq % 600) + syncprd = SDW_SHIM_SYNC_SYNCPRD_VAL_38_4; + else + syncprd = SDW_SHIM_SYNC_SYNCPRD_VAL_24; + + if (!*shim_mask) { + /* we first need to program the SyncPRD/CPU registers */ + dev_dbg(sdw->cdns.dev, + "%s: first link up, programming SYNCPRD\n", __func__); + + /* set SyncPRD period */ + sync_reg = intel_readl(shim, SDW_SHIM_SYNC); + sync_reg |= (syncprd << +SDW_REG_SHIFT(SDW_SHIM_SYNC_SYNCPRD)); + + /* Set SyncCPU bit */ + sync_reg |= SDW_SHIM_SYNC_SYNCCPU; + intel_writel(shim, SDW_SHIM_SYNC, sync_reg); + } /* Link power up sequence */ link_control = intel_readl(shim, SDW_SHIM_LCTL); @@ -282,70 +321,182 @@ static int intel_link_power_up(struct sdw_intel *sdw) link_control |= spa_mask; ret = intel_set_bit(shim, SDW_SHIM_LCTL, link_control, cpa_mask); - if (ret < 0) - return ret; + if (ret < 0) { + dev_err(sdw->cdns.dev, "Failed to power up link: %d\n", ret); + goto out; + } + + if (!*shim_mask) { + /* SyncCPU will change once link is active */ + ret = intel_wait_bit(shim, SDW_SHIM_SYNC, +SDW_SHIM_SYNC_SYNCCPU, 0); + if (ret < 0) { + dev_err(sdw->cdns.dev, + "Failed to set SHIM_SYNC: %d\n", ret); + goto out; + } + } + + *shim_mask |= BIT(link_id); sdw->cdns.link_up = true; - return 0; +out: + mutex_unlock(sdw->link_res->shim_lock); + + return ret; } -static int intel_shim_init(struct
Re: [PATCH v5 1/2] remoteproc: qcom: Add per subsystem SSR notification
On Tue 23 Jun 18:41 PDT 2020, risha...@codeaurora.org wrote: > On 2020-06-23 14:45, Alex Elder wrote: > > On 6/22/20 8:04 PM, Rishabh Bhatnagar wrote: > > > Currently there is a single notification chain which is called > > > whenever any > > > remoteproc shuts down. This leads to all the listeners being > > > notified, and > > > is not an optimal design as kernel drivers might only be interested in > > > listening to notifications from a particular remoteproc. Create a > > > global > > > list of remoteproc notification info data structures. This will hold > > > the > > > name and notifier_list information for a particular remoteproc. The > > > API > > > to register for notifications will use name argument to retrieve the > > > notification info data structure and the notifier block will be > > > added to > > > that data structure's notification chain. Also move from blocking > > > notifier > > > to srcu notifer based implementation to support dynamic notifier head > > > creation. > > > > > > Signed-off-by: Siddharth Gupta > > > Signed-off-by: Rishabh Bhatnagar > > > > Sorry, a few more comments, but I think your next one will > > likely be fine. > > > > General: > > - SSR subsystems are added but never removed. Note that > > "qcom_common.o" can be built as a module, and if that > > module were ever removed, memory allocated for these > > subsystems would be leaked. > Hi Alex, > Thank you for reviewing this patchset quickly. This point was > brought up by Bjorn and it was decided that I will push another patch on > top in which I'll do the cleanup during module exit. > > - Will a remoteproc subdev (and in particular, an SSR subdev) > > ever be removed? What happens to entities that have > > registered for SSR notifications in that case? > In practice it should never be removed. If it is clients will > never get notification about subsystem shutdown/powerup. 
Given that clients make direct function calls into qcom_common.ko, qcom_common.ko cannot be rmmod'ed until all clients have been rmmod'ed. As such there shouldn't be any remaining listeners, or subdevices, when this happens. Regards, Bjorn
[PATCH 6/9] soundwire: intel_init: use EXPORT_SYMBOL_NS
From: Pierre-Louis Bossart Make sure all symbols in this soundwire-intel-init module are exported with a namespace. The MODULE_IMPORT_NS will be used in Intel/SOF HDaudio modules to be posted in a separate series. Namespaces are only introduced for the Intel parts of the SoundWire code at this time, in future patches we should also add namespaces for Cadence parts and the SoundWire core. Suggested-by: Greg KH Signed-off-by: Pierre-Louis Bossart Signed-off-by: Bard Liao --- drivers/soundwire/intel_init.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/soundwire/intel_init.c b/drivers/soundwire/intel_init.c index d8f0c1472f1f..ad3175272e88 100644 --- a/drivers/soundwire/intel_init.c +++ b/drivers/soundwire/intel_init.c @@ -164,7 +164,7 @@ void sdw_intel_enable_irq(void __iomem *mmio_base, bool enable) writel(val, mmio_base + HDA_DSP_REG_ADSPIC2); } -EXPORT_SYMBOL(sdw_intel_enable_irq); +EXPORT_SYMBOL_NS(sdw_intel_enable_irq, SOUNDWIRE_INTEL_INIT); static struct sdw_intel_ctx *sdw_intel_probe_controller(struct sdw_intel_res *res) @@ -353,7 +353,7 @@ int sdw_intel_acpi_scan(acpi_handle *parent_handle, return sdw_intel_scan_controller(info); } -EXPORT_SYMBOL(sdw_intel_acpi_scan); +EXPORT_SYMBOL_NS(sdw_intel_acpi_scan, SOUNDWIRE_INTEL_INIT); /** * sdw_intel_probe() - SoundWire Intel probe routine @@ -370,7 +370,7 @@ struct sdw_intel_ctx { return sdw_intel_probe_controller(res); } -EXPORT_SYMBOL(sdw_intel_probe); +EXPORT_SYMBOL_NS(sdw_intel_probe, SOUNDWIRE_INTEL_INIT); /** * sdw_intel_startup() - SoundWire Intel startup @@ -383,7 +383,7 @@ int sdw_intel_startup(struct sdw_intel_ctx *ctx) { return sdw_intel_startup_controller(ctx); } -EXPORT_SYMBOL(sdw_intel_startup); +EXPORT_SYMBOL_NS(sdw_intel_startup, SOUNDWIRE_INTEL_INIT); /** * sdw_intel_exit() - SoundWire Intel exit * @ctx: SoundWire context allocated in the probe @@ -394,7 +394,7 @@ void sdw_intel_exit(struct sdw_intel_ctx *ctx) { sdw_intel_cleanup(ctx); } 
-EXPORT_SYMBOL(sdw_intel_exit); +EXPORT_SYMBOL_NS(sdw_intel_exit, SOUNDWIRE_INTEL_INIT); MODULE_LICENSE("Dual BSD/GPL"); MODULE_DESCRIPTION("Intel Soundwire Init Library"); -- 2.17.1
[PATCH 5/9] soundwire: intel_init: add implementation of sdw_intel_enable_irq()
From: Pierre-Louis Bossart This function is required to enable all interrupts across all links. Signed-off-by: Pierre-Louis Bossart Signed-off-by: Bard Liao --- drivers/soundwire/intel_init.c | 24 1 file changed, 24 insertions(+) diff --git a/drivers/soundwire/intel_init.c b/drivers/soundwire/intel_init.c index f50a93130d12..d8f0c1472f1f 100644 --- a/drivers/soundwire/intel_init.c +++ b/drivers/soundwire/intel_init.c @@ -142,6 +142,30 @@ sdw_intel_scan_controller(struct sdw_intel_acpi_info *info) return 0; } +#define HDA_DSP_REG_ADSPIC2 (0x10) +#define HDA_DSP_REG_ADSPIS2 (0x14) +#define HDA_DSP_REG_ADSPIC2_SNDWBIT(5) + +/** + * sdw_intel_enable_irq() - enable/disable Intel SoundWire IRQ + * @mmio_base: The mmio base of the control register + * @enable: true if enable + */ +void sdw_intel_enable_irq(void __iomem *mmio_base, bool enable) +{ + u32 val; + + val = readl(mmio_base + HDA_DSP_REG_ADSPIC2); + + if (enable) + val |= HDA_DSP_REG_ADSPIC2_SNDW; + else + val &= ~HDA_DSP_REG_ADSPIC2_SNDW; + + writel(val, mmio_base + HDA_DSP_REG_ADSPIC2); +} +EXPORT_SYMBOL(sdw_intel_enable_irq); + static struct sdw_intel_ctx *sdw_intel_probe_controller(struct sdw_intel_res *res) { -- 2.17.1
[PATCH 8/9] soundwire: intel: add wake interrupt support
From: Rander Wang When system is suspended in clock stop mode on intel platforms, both master and slave are in clock stop mode and soundwire bus is taken over by a glue hardware. The bus message for jack event is processed by this glue hardware, which will trigger an interrupt to resume audio pci device. Then audio pci driver will resume soundwire master and slave, transfer bus ownership to master, finally slave will report jack event to master and codec driver is triggered to check jack status. if a slave has been attached to a bus, the slave->dev_num_sticky should be non-zero, so we can check this value to skip the ghost devices defined in ACPI table but not populated in hardware. Signed-off-by: Rander Wang Signed-off-by: Pierre-Louis Bossart Signed-off-by: Bard Liao --- drivers/soundwire/intel.c | 48 +- drivers/soundwire/intel.h | 1 + drivers/soundwire/intel_init.c | 22 3 files changed, 70 insertions(+), 1 deletion(-) diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c index 06c553d94890..22d9fd3e34fa 100644 --- a/drivers/soundwire/intel.c +++ b/drivers/soundwire/intel.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -436,7 +437,7 @@ static int intel_shim_init(struct sdw_intel *sdw, bool clock_stop) return ret; } -static void __maybe_unused intel_shim_wake(struct sdw_intel *sdw, bool wake_enable) +static void intel_shim_wake(struct sdw_intel *sdw, bool wake_enable) { void __iomem *shim = sdw->link_res->shim; unsigned int link_id = sdw->instance; @@ -1337,6 +1338,51 @@ static int intel_master_remove(struct platform_device *pdev) return 0; } +int intel_master_process_wakeen_event(struct platform_device *pdev) +{ + struct device *dev = >dev; + struct sdw_intel *sdw; + struct sdw_bus *bus; + struct sdw_slave *slave; + void __iomem *shim; + u16 wake_sts; + + sdw = platform_get_drvdata(pdev); + bus = >cdns.bus; + + if (bus->prop.hw_disabled) { + dev_dbg(dev, + "SoundWire master %d is disabled, ignoring\n", + 
bus->link_id); + return 0; + } + + shim = sdw->link_res->shim; + wake_sts = intel_readw(shim, SDW_SHIM_WAKESTS); + + if (!(wake_sts & BIT(sdw->instance))) + return 0; + + /* disable WAKEEN interrupt ASAP to prevent interrupt flood */ + intel_shim_wake(sdw, false); + + /* +* wake up master and slave so that slave can notify master +* the wakeen event and let codec driver check codec status +*/ + list_for_each_entry(slave, >slaves, node) { + /* +* discard devices that are defined in ACPI tables but +* not physically present and devices that cannot +* generate wakes +*/ + if (slave->dev_num_sticky && slave->prop.wake_capable) + pm_request_resume(>dev); + } + + return 0; +} + static struct platform_driver sdw_intel_drv = { .probe = intel_master_probe, .remove = intel_master_remove, diff --git a/drivers/soundwire/intel.h b/drivers/soundwire/intel.h index bf127c88eb51..4ea3d262d249 100644 --- a/drivers/soundwire/intel.h +++ b/drivers/soundwire/intel.h @@ -47,5 +47,6 @@ struct sdw_intel { #define SDW_INTEL_QUIRK_MASK_BUS_DISABLE BIT(1) int intel_master_startup(struct platform_device *pdev); +int intel_master_process_wakeen_event(struct platform_device *pdev); #endif /* __SDW_INTEL_LOCAL_H */ diff --git a/drivers/soundwire/intel_init.c b/drivers/soundwire/intel_init.c index 63b3beda443d..eff4e385bb59 100644 --- a/drivers/soundwire/intel_init.c +++ b/drivers/soundwire/intel_init.c @@ -415,5 +415,27 @@ void sdw_intel_exit(struct sdw_intel_ctx *ctx) } EXPORT_SYMBOL_NS(sdw_intel_exit, SOUNDWIRE_INTEL_INIT); +void sdw_intel_process_wakeen_event(struct sdw_intel_ctx *ctx) +{ + struct sdw_intel_link_res *link; + u32 link_mask; + int i; + + if (!ctx->links) + return; + + link = ctx->links; + link_mask = ctx->link_mask; + + /* Startup SDW Master devices */ + for (i = 0; i < ctx->count; i++, link++) { + if (!(link_mask & BIT(i))) + continue; + + intel_master_process_wakeen_event(link->pdev); + } +} +EXPORT_SYMBOL_NS(sdw_intel_process_wakeen_event, SOUNDWIRE_INTEL_INIT); + 
MODULE_LICENSE("Dual BSD/GPL"); MODULE_DESCRIPTION("Intel Soundwire Init Library"); -- 2.17.1
[PATCH 1/9] soundwire: intel: reuse code for wait loops to set/clear bits
From: Pierre-Louis Bossart Refactor code and use same routines on set/clear Signed-off-by: Pierre-Louis Bossart Signed-off-by: Bard Liao --- drivers/soundwire/intel.c | 45 +-- 1 file changed, 19 insertions(+), 26 deletions(-) diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c index 7a65414e5714..8c7ae07c0fe1 100644 --- a/drivers/soundwire/intel.c +++ b/drivers/soundwire/intel.c @@ -123,40 +123,33 @@ static inline void intel_writew(void __iomem *base, int offset, u16 value) writew(value, base + offset); } +static int intel_wait_bit(void __iomem *base, int offset, u32 mask, u32 target) +{ + int timeout = 10; + u32 reg_read; + + do { + reg_read = readl(base + offset); + if ((reg_read & mask) == target) + return 0; + + timeout--; + usleep_range(50, 100); + } while (timeout != 0); + + return -EAGAIN; +} + static int intel_clear_bit(void __iomem *base, int offset, u32 value, u32 mask) { - int timeout = 10; - u32 reg_read; - writel(value, base + offset); - do { - reg_read = readl(base + offset); - if (!(reg_read & mask)) - return 0; - - timeout--; - udelay(50); - } while (timeout != 0); - - return -EAGAIN; + return intel_wait_bit(base, offset, mask, 0); } static int intel_set_bit(void __iomem *base, int offset, u32 value, u32 mask) { - int timeout = 10; - u32 reg_read; - writel(value, base + offset); - do { - reg_read = readl(base + offset); - if (reg_read & mask) - return 0; - - timeout--; - udelay(50); - } while (timeout != 0); - - return -EAGAIN; + return intel_wait_bit(base, offset, mask, mask); } /* -- 2.17.1
[PATCH 7/9] soundwire: intel/cadence: merge Soundwire interrupt handlers/threads
The existing code uses one pair of interrupt handler/thread per link but at the hardware level the interrupt is shared. This works fine for legacy PCI interrupts, but leads to timeouts in MSI (Message-Signaled Interrupt) mode, likely due to edges being lost. This patch unifies interrupt handling for all links. The dedicated handler is removed since we use a common one for all shared interrupt sources, and the thread function takes care of dealing with interrupt sources. This partition follows the model used for the SOF IPC on HDaudio platforms, where similar timeout issues were noticed and doing all the interrupt handling/clearing in the thread improved reliability/stability. Validation results with 4 links active in parallel show a night-and-day improvement with no timeouts noticed even during stress tests. Latency and quality of service are not affected by the change - mostly because events on a SoundWire link are throttled by the bus frame rate (typically 8..48kHz). Signed-off-by: Bard Liao Signed-off-by: Pierre-Louis Bossart --- drivers/soundwire/cadence_master.c | 18 ++ drivers/soundwire/cadence_master.h | 4 drivers/soundwire/intel.c | 15 --- drivers/soundwire/intel.h | 4 drivers/soundwire/intel_init.c | 19 +++ 5 files changed, 37 insertions(+), 23 deletions(-) diff --git a/drivers/soundwire/cadence_master.c b/drivers/soundwire/cadence_master.c index 613dbd415b91..24eafe0aa1c3 100644 --- a/drivers/soundwire/cadence_master.c +++ b/drivers/soundwire/cadence_master.c @@ -17,6 +17,7 @@ #include #include #include +#include #include "bus.h" #include "cadence_master.h" @@ -790,7 +791,7 @@ irqreturn_t sdw_cdns_irq(int irq, void *dev_id) CDNS_MCP_INT_SLAVE_MASK, 0); int_status &= ~CDNS_MCP_INT_SLAVE_MASK; - ret = IRQ_WAKE_THREAD; + schedule_work(>work); } cdns_writel(cdns, CDNS_MCP_INTSTAT, int_status); @@ -799,13 +800,15 @@ irqreturn_t sdw_cdns_irq(int irq, void *dev_id) EXPORT_SYMBOL(sdw_cdns_irq); /** - * sdw_cdns_thread() - Cadence irq thread handler - * @irq: irq 
number - * @dev_id: irq context + * To update slave status in a work since we will need to handle + * other interrupts eg. CDNS_MCP_INT_RX_WL during the update slave + * process. + * @work: cdns worker thread */ -irqreturn_t sdw_cdns_thread(int irq, void *dev_id) +static void cdns_update_slave_status_work(struct work_struct *work) { - struct sdw_cdns *cdns = dev_id; + struct sdw_cdns *cdns = + container_of(work, struct sdw_cdns, work); u32 slave0, slave1; dev_dbg_ratelimited(cdns->dev, "Slave status change\n"); @@ -822,9 +825,7 @@ irqreturn_t sdw_cdns_thread(int irq, void *dev_id) cdns_updatel(cdns, CDNS_MCP_INTMASK, CDNS_MCP_INT_SLAVE_MASK, CDNS_MCP_INT_SLAVE_MASK); - return IRQ_HANDLED; } -EXPORT_SYMBOL(sdw_cdns_thread); /* * init routines @@ -1427,6 +1428,7 @@ int sdw_cdns_probe(struct sdw_cdns *cdns) init_completion(>tx_complete); cdns->bus.port_ops = _port_ops; + INIT_WORK(>work, cdns_update_slave_status_work); return 0; } EXPORT_SYMBOL(sdw_cdns_probe); diff --git a/drivers/soundwire/cadence_master.h b/drivers/soundwire/cadence_master.h index b410656f8194..7638858397df 100644 --- a/drivers/soundwire/cadence_master.h +++ b/drivers/soundwire/cadence_master.h @@ -129,6 +129,10 @@ struct sdw_cdns { bool link_up; unsigned int msg_count; + + struct work_struct work; + + struct list_head list; }; #define bus_to_cdns(_bus) container_of(_bus, struct sdw_cdns, bus) diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c index 0a4fc7f65743..06c553d94890 100644 --- a/drivers/soundwire/intel.c +++ b/drivers/soundwire/intel.c @@ -1258,21 +1258,7 @@ static int intel_master_probe(struct platform_device *pdev) "SoundWire master %d is disabled, will be ignored\n", bus->link_id); - /* Acquire IRQ */ - ret = request_threaded_irq(sdw->link_res->irq, - sdw_cdns_irq, sdw_cdns_thread, - IRQF_SHARED, KBUILD_MODNAME, cdns); - if (ret < 0) { - dev_err(dev, "unable to grab IRQ %d, disabling device\n", - sdw->link_res->irq); - goto err_init; - } - return 0; - -err_init: - 
sdw_bus_master_delete(bus); - return ret; } int intel_master_startup(struct platform_device *pdev) @@ -1344,7 +1330,6 @@ static int intel_master_remove(struct platform_device *pdev) if (!bus->prop.hw_disabled) { intel_debugfs_exit(sdw); sdw_cdns_enable_interrupt(cdns, false); - free_irq(sdw->link_res->irq, sdw); snd_soc_unregister_component(dev); }
[PATCH 9/9] Soundwire: intel_init: save Slave(s) _ADR info in sdw_intel_ctx
Save ACPI information in context so that we can match machine driver with sdw _ADR matching tables. Suggested-by: Guennadi Liakhovetski Signed-off-by: Bard Liao Signed-off-by: Pierre-Louis Bossart --- drivers/soundwire/intel_init.c | 24 1 file changed, 24 insertions(+) diff --git a/drivers/soundwire/intel_init.c b/drivers/soundwire/intel_init.c index eff4e385bb59..6502a5e82229 100644 --- a/drivers/soundwire/intel_init.c +++ b/drivers/soundwire/intel_init.c @@ -188,7 +188,11 @@ static struct sdw_intel_ctx struct sdw_intel_link_res *link; struct sdw_intel_ctx *ctx; struct acpi_device *adev; + struct sdw_slave *slave; + struct list_head *node; + struct sdw_bus *bus; u32 link_mask; + int num_slaves = 0; int count; int i; @@ -265,6 +269,26 @@ static struct sdw_intel_ctx link->cdns = platform_get_drvdata(pdev); list_add_tail(>list, >link_list); + bus = >cdns->bus; + /* Calculate number of slaves */ + list_for_each(node, >slaves) + num_slaves++; + } + + ctx->ids = devm_kcalloc(>dev, num_slaves, + sizeof(*ctx->ids), GFP_KERNEL); + if (!ctx->ids) + goto err; + + ctx->num_slaves = num_slaves; + i = 0; + list_for_each_entry(link, >link_list, list) { + bus = >cdns->bus; + list_for_each_entry(slave, >slaves, node) { + ctx->ids[i].id = slave->id; + ctx->ids[i].link_id = bus->link_id; + i++; + } } return ctx; -- 2.17.1
[PATCH 3/9] soundwire: intel: introduce a helper to arm link synchronization
From: Pierre-Louis Bossart Move code from pre_bank_switch to dedicated helper, will be used in follow-up patches as recommended by programming flows. Signed-off-by: Pierre-Louis Bossart Signed-off-by: Bard Liao --- drivers/soundwire/intel.c | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c index 4792613e8e5a..6a745602c9cc 100644 --- a/drivers/soundwire/intel.c +++ b/drivers/soundwire/intel.c @@ -497,6 +497,21 @@ static int __maybe_unused intel_link_power_down(struct sdw_intel *sdw) return 0; } +static void intel_shim_sync_arm(struct sdw_intel *sdw) +{ + void __iomem *shim = sdw->link_res->shim; + u32 sync_reg; + + mutex_lock(sdw->link_res->shim_lock); + + /* update SYNC register */ + sync_reg = intel_readl(shim, SDW_SHIM_SYNC); + sync_reg |= (SDW_SHIM_SYNC_CMDSYNC << sdw->instance); + intel_writel(shim, SDW_SHIM_SYNC, sync_reg); + + mutex_unlock(sdw->link_res->shim_lock); +} + /* * PDI routines */ @@ -710,21 +725,12 @@ static int intel_pre_bank_switch(struct sdw_bus *bus) { struct sdw_cdns *cdns = bus_to_cdns(bus); struct sdw_intel *sdw = cdns_to_intel(cdns); - void __iomem *shim = sdw->link_res->shim; - int sync_reg; /* Write to register only for multi-link */ if (!bus->multi_link) return 0; - mutex_lock(sdw->link_res->shim_lock); - - /* Read SYNC register */ - sync_reg = intel_readl(shim, SDW_SHIM_SYNC); - sync_reg |= SDW_SHIM_SYNC_CMDSYNC << sdw->instance; - intel_writel(shim, SDW_SHIM_SYNC, sync_reg); - - mutex_unlock(sdw->link_res->shim_lock); + intel_shim_sync_arm(sdw); return 0; } -- 2.17.1
[PATCH 0/9] soundwire: intel: revisit SHIM programming
This series does some cleanup, revisits SHIM programming sequences, and merges Soundwire interrupt handlers/threads. Bard Liao (2): soundwire: intel/cadence: merge Soundwire interrupt handlers/threads Soundwire: intel_init: save Slave(s) _ADR info in sdw_intel_ctx Pierre-Louis Bossart (6): soundwire: intel: reuse code for wait loops to set/clear bits soundwire: intel: revisit SHIM programming sequences. soundwire: intel: introduce a helper to arm link synchronization soundwire: intel: introduce helper for link synchronization soundwire: intel_init: add implementation of sdw_intel_enable_irq() soundwire: intel_init: use EXPORT_SYMBOL_NS Rander Wang (1): soundwire: intel: add wake interrupt support drivers/soundwire/cadence_master.c | 18 +- drivers/soundwire/cadence_master.h | 4 + drivers/soundwire/intel.c | 389 ++-- drivers/soundwire/intel.h | 9 + drivers/soundwire/intel_init.c | 101 +++- include/linux/soundwire/sdw_intel.h | 2 + 6 files changed, 425 insertions(+), 98 deletions(-) -- 2.17.1
[PATCH 4/9] soundwire: intel: introduce helper for link synchronization
From: Pierre-Louis Bossart After arming the synchronization, the SYNCGO field controls the hardware-based synchronization between links. Move the programming and wait for clear of SYNCGO to dedicated helper. Signed-off-by: Pierre-Louis Bossart Signed-off-by: Bard Liao --- drivers/soundwire/intel.c | 34 ++ 1 file changed, 26 insertions(+), 8 deletions(-) diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c index 6a745602c9cc..0a4fc7f65743 100644 --- a/drivers/soundwire/intel.c +++ b/drivers/soundwire/intel.c @@ -512,6 +512,31 @@ static void intel_shim_sync_arm(struct sdw_intel *sdw) mutex_unlock(sdw->link_res->shim_lock); } +static int intel_shim_sync_go_unlocked(struct sdw_intel *sdw) +{ + void __iomem *shim = sdw->link_res->shim; + u32 sync_reg; + int ret; + + /* Read SYNC register */ + sync_reg = intel_readl(shim, SDW_SHIM_SYNC); + + /* +* Set SyncGO bit to synchronously trigger a bank switch for +* all the masters. A write to SYNCGO bit clears CMDSYNC bit for all +* the Masters. +*/ + sync_reg |= SDW_SHIM_SYNC_SYNCGO; + + ret = intel_clear_bit(shim, SDW_SHIM_SYNC, sync_reg, + SDW_SHIM_SYNC_SYNCGO); + + if (ret < 0) + dev_err(sdw->cdns.dev, "SyncGO clear failed: %d\n", ret); + + return ret; +} + /* * PDI routines */ @@ -763,15 +788,8 @@ static int intel_post_bank_switch(struct sdw_bus *bus) ret = 0; goto unlock; } - /* -* Set SyncGO bit to synchronously trigger a bank switch for -* all the masters. A write to SYNCGO bit clears CMDSYNC bit for all -* the Masters. -*/ - sync_reg |= SDW_SHIM_SYNC_SYNCGO; - ret = intel_clear_bit(shim, SDW_SHIM_SYNC, sync_reg, - SDW_SHIM_SYNC_SYNCGO); + ret = intel_shim_sync_go_unlocked(sdw); unlock: mutex_unlock(sdw->link_res->shim_lock); -- 2.17.1
Re: [PATCH v2 9/9] docs: staging: use small font for literal includes
On Tue, 23 Jun 2020 16:12:04 -0700 Joe Perches wrote: > On Tue, 2020-06-23 at 11:53 +0200, Mauro Carvalho Chehab wrote: > > The normal font is too big to display 80 columns, causing extra > > breaks to be added at weird places. > > > > change to the footnotesize, as this would fit a little bit > > better. > [] > > diff --git a/Documentation/staging/index.rst > > b/Documentation/staging/index.rst > [] > > @@ -19,14 +19,41 @@ Unsorted Documentation > > Atomic Types > > > > > > -.. literalinclude:: ../atomic_t.txt > > +.. raw:: latex > > + > > +\footnotesize > > + > > +.. include:: ../atomic_t.txt > > + :literal: > > + > > +.. raw:: latex > > + > > +\normalsize > > Is there something like push/pop for styles? > Maybe some variant of include:: with a style? Well, one could use, instead: .. css-class:: some_footnote_class Which should then require a "some_footnote_class" css file defined for html output and a set of macros to be added at the "latex_elements" section of conf.py. If properly defined, I guess it should be possible to avoid the need of returning to the normal size after using it. This is easier said than done. One of the problems with using such macros is that the macro should be prepared to work not only with plain text but also with more complex structures like tables or graphics. We do need to use tricks like changing the font size on media docs, in order to display tables with long lines. We have several of them there, with bit mappings. I tried a few times to play with that for PDF. I ended up needing to revert all such attempts[1]. The main problem playing with those is that you'll end up needing to play with the already defined macros found in sphinx.sty, which is shipped together with Sphinx. The macros there changed a lot over time. Making something that would work with different versions is really hard. [1] I used those when I ported media docs from DocBook, but had to drop several macros because version 1.7 broke the ones I defined. 
At the time, I spent several days writing the first version, and, when they broke, I spent a long time figuring out what the best approach would be. In the end, using "raw:: latex" like the above was the only thing that worked reliably across different Sphinx versions. Thanks, Mauro
Re: [dm-devel] [dm-crypt] [RFC PATCH 1/1] Add DM_CRYPT_FORCE_INLINE flag to dm-crypt target
On Wed, Jun 24, 2020 at 05:21:24AM +, Damien Le Moal wrote: > >> @@ -1458,13 +1459,18 @@ static void crypt_alloc_req_skcipher(struct > >> crypt_config *cc, > >> > >>skcipher_request_set_tfm(ctx->r.req, cc->cipher_tfm.tfms[key_index]); > >> > >> - /* > >> - * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs > >> - * requests if driver request queue is full. > >> - */ > >> - skcipher_request_set_callback(ctx->r.req, > >> - CRYPTO_TFM_REQ_MAY_BACKLOG, > >> - kcryptd_async_done, dmreq_of_req(cc, ctx->r.req)); > >> + if (test_bit(DM_CRYPT_FORCE_INLINE, >flags)) > >> + /* make sure we zero important fields of the request */ > >> + skcipher_request_set_callback(ctx->r.req, > >> + 0, NULL, NULL); > >> + else > >> + /* > >> + * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs > >> + * requests if driver request queue is full. > >> + */ > >> + skcipher_request_set_callback(ctx->r.req, > >> + CRYPTO_TFM_REQ_MAY_BACKLOG, > >> + kcryptd_async_done, dmreq_of_req(cc, ctx->r.req)); > >> } > > > > This looks wrong. Unless type=0 and mask=CRYPTO_ALG_ASYNC are passed to > > crypto_alloc_skcipher(), the skcipher implementation can still be > > asynchronous, > > in which case providing a callback is required. > > > > Do you intend that the "force_inline" option forces the use of a synchronous > > skcipher (alongside the other things it does)? Or should it still allow > > asynchronous ones? > > > > We may not actually have a choice in that matter, since xts-aes-aesni has > > the > > CRYPTO_ALG_ASYNC bit set (as I mentioned) despite being synchronous in most > > cases; thus, the crypto API won't give you it if you ask for a synchronous > > cipher. So I think you still need to allow async skciphers? That means a > > callback is still always required. > > Arg... So it means that some skciphers will not be OK at all for SMR writes. I > was not aware of these differences (tested with aes-xts-plain64 only). 
The > ugly > way to support async ciphers would be to just wait inline for the crypto API > to > complete using a completion for instance. But that is very ugly. Back to > brainstorming, and need to learn more about the crypto API... > It's easy to wait for crypto API requests to complete if you need to -- just use crypto_wait_req(). We do this in fs/crypto/, for example. (Not many people are using fscrypt with crypto API based accelerators, so there hasn't yet been much need to support the complexity of issuing multiple async crypto requests like dm-crypt supports.) - Eric
Re: [RFC PATCH 0/1] dm-crypt excessive overhead
On Wed, Jun 24 2020 at 12:54am -0400, Damien Le Moal wrote: > On 2020/06/24 0:23, Mike Snitzer wrote: > > On Tue, Jun 23 2020 at 11:07am -0400, > > Ignat Korchagin wrote: > > > >> Do you think it may be better to break it in two flags: one for read > >> path and one for write? So, depending on the needs and workflow these > >> could be enabled independently? > > > > If there is a need to split, then sure. But I think Damien had a hard > > requirement that writes had to be inlined but that reads didn't _need_ > > to be for his dm-zoned usecase. Damien may not yet have assessed the > > performance implications, of not have reads inlined, as much as you > > have. > > We did do performance testing :) > The results are mixed and performance differences between inline vs workqueues > depend on the workload (IO size, IO queue depth and number of drives being > used > mostly). In many cases, inlining everything does really improve performance as > Ignat reported. > > In our testing, we used hard drives and so focused mostly on throughput rather > than command latency. The added workqueue context switch overhead and crypto > work latency compared to typical HDD IO times is small, and significant only > if > the backend storage as short IO times. > > In the case of HDDs, especially for large IO sizes, inlining crypto work does > not shine as it prevents an efficient use of CPU resources. This is especially > true with reads on a large system with many drives connected to a single HBA: > the softirq context decryption work does not lend itself well to using other > CPUs that did not receive the HBA IRQ signaling command completions. The test > results clearly show much higher throughputs using dm-crypt as is. 
> > On the other hand, inlining crypto work significantly improves workloads of > small random IOs, even for a large number of disks: removing the overhead of > context switches allows faster completions, allowing sending more requests to > the drives more quickly, keeping them busy. > > For SMR, the inlining of write requests is *mandatory* to preserve the issuer > write sequence, but encryption work being done in the issuer context (writes > to > SMR drives can only be O_DIRECT writes), efficient CPU resource usage can be > achieved by simply using multiple writer thread/processes, working on > different > zones of different disks. This is a very reasonable model for SMR as writes > into > a single zone have to be done under mutual exclusion to ensure sequentiality. > > For reads, SMR drives are essentially exactly the same as regular disks, so > as-is or inline are both OK. Based on our performance results, allowing the > user > to have the choice of inlining or not reads based on the target workload would > be great. > > Of note is that zone append writes (emulated in SCSI, native with NVMe) are > not > subject to the sequential write constraint, so they can also be executed > either > inline or asynchronously. > > > So let's see how Damien's work goes and if he trully doesn't need/want > > reads to be inlined then 2 flags can be created. > > For SMR, I do not need inline reads, but I do want the user to have the > possibility of using this setup as that can provide better performance for > some > workloads. I think that splitting the inline flag in 2 is exactly what we > want: > > 1) For SMR, the write-inline flag can be automatically turned on when the > target > device is created if the backend device used is a host-managed zoned drive > (scsi > or NVMe ZNS). For reads, it would be the user choice, based on the target > workload. 
> 2) For regular block devices, write-inline only, read-inline only or both > would > be the user choice, to optimize for their target workload. > > With the split into 2 flags, my SMR support patch becomes very simple. OK, thanks for all the context. Was a fun read ;) So let's run with splitting into 2 flags. Ignat, would you be up to tweaking your patch to provide that and post a v2? An added bonus would be to consolidate your 0/1 and 1/1 patch headers, and add in the additional answers you provided in this thread to help others understand the patch (mainly some more detail about why a tasklet is used). Thanks, Mike
Re: [dm-devel] [dm-crypt] [RFC PATCH 1/1] Add DM_CRYPT_FORCE_INLINE flag to dm-crypt target
On 2020/06/24 14:05, Eric Biggers wrote: > On Fri, Jun 19, 2020 at 05:41:32PM +0100, Ignat Korchagin wrote: >> Sometimes extra thread offloading imposed by dm-crypt hurts IO latency. This >> is >> especially visible on busy systems with many processes/threads. Moreover, >> most >> Crypto API implementaions are async, that is they offload crypto operations >> on >> their own, so this dm-crypt offloading is excessive. > > This really should say "some Crypto API implementations are async" instead of > "most Crypto API implementations are async". > > Notably, the AES-NI implementation of AES-XTS is synchronous if you call it > in a > context where SIMD instructions are usable. It's only asynchronous when SIMD > is > not usable. (This seems to have been missed in your blog post.) > >> This adds a new flag, which directs dm-crypt not to offload crypto operations >> and process everything inline. For cases, where crypto operations cannot >> happen >> inline (hard interrupt context, for example the read path of the NVME >> driver), >> we offload the work to a tasklet rather than a workqueue. > > This patch both removes some dm-crypt specific queueing, and changes > decryption > to use softIRQ context instead of a workqueue. It would be useful to know how > much of a difference the workqueue => softIRQ change makes by itself. Such a > change could be useful for fscrypt as well. (fscrypt uses a workqueue for > decryption, but besides that doesn't use any other queueing.) > >> @@ -127,7 +128,7 @@ struct iv_elephant_private { >> * and encrypts / decrypts at the same time. >> */ >> enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID, >> - DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD }; >> + DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD, DM_CRYPT_FORCE_INLINE = >> (sizeof(unsigned long) * 8 - 1) }; > > Assigning a specific enum value isn't necessary. 
> >> @@ -1458,13 +1459,18 @@ static void crypt_alloc_req_skcipher(struct >> crypt_config *cc, >> >> skcipher_request_set_tfm(ctx->r.req, cc->cipher_tfm.tfms[key_index]); >> >> -/* >> - * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs >> - * requests if driver request queue is full. >> - */ >> -skcipher_request_set_callback(ctx->r.req, >> -CRYPTO_TFM_REQ_MAY_BACKLOG, >> -kcryptd_async_done, dmreq_of_req(cc, ctx->r.req)); >> +if (test_bit(DM_CRYPT_FORCE_INLINE, >flags)) >> +/* make sure we zero important fields of the request */ >> +skcipher_request_set_callback(ctx->r.req, >> +0, NULL, NULL); >> +else >> +/* >> + * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs >> + * requests if driver request queue is full. >> + */ >> +skcipher_request_set_callback(ctx->r.req, >> +CRYPTO_TFM_REQ_MAY_BACKLOG, >> +kcryptd_async_done, dmreq_of_req(cc, ctx->r.req)); >> } > > This looks wrong. Unless type=0 and mask=CRYPTO_ALG_ASYNC are passed to > crypto_alloc_skcipher(), the skcipher implementation can still be > asynchronous, > in which case providing a callback is required. > > Do you intend that the "force_inline" option forces the use of a synchronous > skcipher (alongside the other things it does)? Or should it still allow > asynchronous ones? > > We may not actually have a choice in that matter, since xts-aes-aesni has the > CRYPTO_ALG_ASYNC bit set (as I mentioned) despite being synchronous in most > cases; thus, the crypto API won't give you it if you ask for a synchronous > cipher. So I think you still need to allow async skciphers? That means a > callback is still always required. Arg... So it means that some skciphers will not be OK at all for SMR writes. I was not aware of these differences (tested with aes-xts-plain64 only). The ugly way to support async ciphers would be to just wait inline for the crypto API to complete using a completion for instance. But that is very ugly. Back to brainstorming, and need to learn more about the crypto API... 
> > - Eric > > -- > dm-devel mailing list > dm-de...@redhat.com > https://www.redhat.com/mailman/listinfo/dm-devel > > -- Damien Le Moal Western Digital Research
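The completion-based wait Damien mentions already has stock helpers in the crypto API (`DECLARE_CRYPTO_WAIT()`, `crypto_req_done()`, `crypto_wait_req()`). A minimal kernel-context sketch — not a complete or compiled example, and whether blocking like this is acceptable on the dm-crypt I/O path is exactly the open question:

```c
/* Sketch: submit to a possibly-async skcipher and wait inline for it.
 * crypto_wait_req() handles both cases: if the request completed
 * synchronously it returns immediately; if the driver went async
 * (returned -EINPROGRESS/-EBUSY) it sleeps until crypto_req_done()
 * fires from the completion callback. Not usable in atomic context. */
DECLARE_CRYPTO_WAIT(wait);

skcipher_request_set_callback(req,
			      CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP,
			      crypto_req_done, &wait);
err = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);
/* err == 0 here means the request fully completed, even if the
 * underlying implementation (e.g. xts-aes-aesni's async path) deferred. */
```

This is the standard pattern other synchronous-style crypto users in the tree rely on; it would preserve write ordering for SMR at the cost of blocking the issuer.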
Re: [PATCH 5.7 v2] x86/crypto: aesni: Fix build with LLVM_IAS=1
On Tue, Jun 23, 2020 at 8:44 PM Nick Desaulniers wrote: > > On Mon, Jun 22, 2020 at 7:56 PM Sedat Dilek wrote: > > > > When building with LLVM_IAS=1 means using Clang's Integrated Assembly (IAS) > > from LLVM/Clang >= v10.0.1-rc1+ instead of GNU/as from GNU/binutils > > I see the following breakage in Debian/testing AMD64: > > > > :15:74: error: too many positional arguments > > PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, > > ^ > > arch/x86/crypto/aesni-intel_asm.S:1598:2: note: while in macro > > instantiation > > GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp) > > ^ > > :47:2: error: unknown use of instruction mnemonic without a > > size suffix > > GHASH_4_ENCRYPT_4_PARALLEL_dec %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, > > %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc > > ^ > > arch/x86/crypto/aesni-intel_asm.S:1599:2: note: while in macro instantiation > > GCM_ENC_DEC dec > > ^ > > :15:74: error: too many positional arguments > > PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, > > ^ > > arch/x86/crypto/aesni-intel_asm.S:1686:2: note: while in macro instantiation > > GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp) > > ^ > > :47:2: error: unknown use of instruction mnemonic without a > > size suffix > > GHASH_4_ENCRYPT_4_PARALLEL_enc %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, > > %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc > > ^ > > arch/x86/crypto/aesni-intel_asm.S:1687:2: note: while in macro instantiation > > GCM_ENC_DEC enc > > > > Craig Topper suggested me in ClangBuiltLinux issue #1050: > > > > > I think the "too many positional arguments" is because the parser isn't > > > able > > > to handle the trailing commas. 
> > > > > > The "unknown use of instruction mnemonic" is because the macro was named > > > GHASH_4_ENCRYPT_4_PARALLEL_DEC but its being instantiated with > > > GHASH_4_ENCRYPT_4_PARALLEL_dec I guess gas ignores case on the > > > macro instantiation, but llvm doesn't. > > > > First, I removed the trailing comma in the PRECOMPUTE line. > > > > Second, I substituted: > > 1. GHASH_4_ENCRYPT_4_PARALLEL_DEC -> GHASH_4_ENCRYPT_4_PARALLEL_dec > > 2. GHASH_4_ENCRYPT_4_PARALLEL_ENC -> GHASH_4_ENCRYPT_4_PARALLEL_enc > > > > With these changes I was able to build with LLVM_IAS=1 and boot on bare > > metal. > > > > I confirmed that this works with Linux-kernel v5.7.5 final. > > > > NOTE: This patch is on top of Linux v5.7 final. > > Thanks for the note, still applies cleanly on top of linux-next today for me. > > > > > Thanks to Craig and especially Nick for double-checking and his comments. > > > > Suggested-by: Craig Topper > > Suggested-by: Craig Topper > > Suggested-by: Nick Desaulniers ndesaulni...@google.com > > ^ oh, may have missed <> around email addr. > > > Cc: "ClangBuiltLinux" > > Link: https://github.com/ClangBuiltLinux/linux/issues/1050 > > Signed-off-by: Sedat Dilek > > Following the same testing methodology from V1 > (https://lore.kernel.org/patchwork/comment/1456822/) I verified for > GCC+GAS this is no functional change. > > $ wget https://lore.kernel.org/patchwork/patch/1261340/mbox/ -O sedat_v3.patch > $ git am sedat_v3.patch > $ make -j71 arch/x86/crypto/aesni-intel_asm.o > $ llvm-objdump -dr arch/x86/crypto/aesni-intel_asm.o > postpatch_v3.txt > $ diff -u <(cat prepatch.txt | tr -s ' ' | cut -d '' -f 2-) <(cat > postpatch_v3.txt| tr -s ' ' | cut -d ' ' -f 2-) | less > (no output) > > Reviewed-by: Nick Desaulniers > Thanks a lot for your feedback. I have sent a v4 with corrected email-address and your Reviewed-by. 
- Sedat - [1] https://lore.kernel.org/patchwork/patch/1263102/ [2] https://lore.kernel.org/lkml/20200624051538.5355-1-sedat.di...@gmail.com/ > > --- > > arch/x86/crypto/aesni-intel_asm.S | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/arch/x86/crypto/aesni-intel_asm.S > > b/arch/x86/crypto/aesni-intel_asm.S > > index cad6e1bfa7d5..c216de287742 100644 > > --- a/arch/x86/crypto/aesni-intel_asm.S > > +++ b/arch/x86/crypto/aesni-intel_asm.S > > @@ -266,7 +266,7 @@ ALL_F: .octa 0x > > PSHUFB_XMM %xmm2, %xmm0 > > movdqu %xmm0, CurCount(%arg2) # ctx_data.current_counter = iv > > > > - PRECOMPUTE \SUBKEY, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, > > + PRECOMPUTE \SUBKEY, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7 > > movdqu HashKey(%arg2), %xmm13 > > > > CALC_AAD_HASH %xmm13, \AAD, \AADLEN, %xmm0, %xmm1, %xmm2, %xmm3, \ > > @@ -978,7 +978,7 @@ _initial_blocks_done\@: > > * arg1, %arg3, %arg4 are used as pointers only, not modified > > * %r11 is the data offset value > > */ > > -.macro GHASH_4_ENCRYPT_4_PARALLEL_ENC TMP1 TMP2 TMP3 TMP4 TMP5
Re: [PATCH] libnvdimm/security: Fix key lookup permissions
On Tue, Jun 23, 2020 at 09:35:26PM -0700, Dan Williams wrote: > As of commit 8c0637e950d6 ("keys: Make the KEY_NEED_* perms an enum rather > than a mask") lookup_user_key() needs an explicit declaration of what it > wants to do with the key. Add KEY_NEED_SEARCH to fix a warning with the > below signature, and fixes the inability to retrieve a key. > > WARNING: CPU: 15 PID: 6276 at security/keys/permission.c:35 > key_task_permission+0xd3/0x140 > [..] > RIP: 0010:key_task_permission+0xd3/0x140 > [..] > Call Trace: > lookup_user_key+0xeb/0x6b0 > ? vsscanf+0x3df/0x840 > ? key_validate+0x50/0x50 > ? key_default_cmp+0x20/0x20 > nvdimm_get_user_key_payload.part.0+0x21/0x110 [libnvdimm] > nvdimm_security_store+0x67d/0xb20 [libnvdimm] > security_store+0x67/0x1a0 [libnvdimm] > kernfs_fop_write+0xcf/0x1c0 > vfs_write+0xde/0x1d0 > ksys_write+0x68/0xe0 > do_syscall_64+0x5c/0xa0 > entry_SYSCALL_64_after_hwframe+0x49/0xb3 > > Cc: Dan Williams > Cc: Vishal Verma > Cc: Dave Jiang > Cc: Ira Weiny Reviewed-by: Ira Weiny > Suggested-by: David Howells > Fixes: 8c0637e950d6 ("keys: Make the KEY_NEED_* perms an enum rather than a > mask") > Signed-off-by: Dan Williams > --- > drivers/nvdimm/security.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/nvdimm/security.c b/drivers/nvdimm/security.c > index 89b85970912d..4cef69bd3c1b 100644 > --- a/drivers/nvdimm/security.c > +++ b/drivers/nvdimm/security.c > @@ -95,7 +95,7 @@ static struct key *nvdimm_lookup_user_key(struct nvdimm > *nvdimm, > struct encrypted_key_payload *epayload; > struct device *dev = >dev; > > - keyref = lookup_user_key(id, 0, 0); > + keyref = lookup_user_key(id, 0, KEY_NEED_SEARCH); > if (IS_ERR(keyref)) > return NULL; > >
[PATCH 5.7 v4] x86/crypto: aesni: Fix build with LLVM_IAS=1
Building with LLVM_IAS=1 means using Clang's Integrated Assembler (IAS) from LLVM/Clang >= v10.0.1-rc1 instead of GNU as from GNU binutils. Doing so, I see the following breakage on Debian/testing AMD64:

<instantiation>:15:74: error: too many positional arguments
 PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
                                                                         ^
arch/x86/crypto/aesni-intel_asm.S:1598:2: note: while in macro instantiation
 GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
 ^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
 GHASH_4_ENCRYPT_4_PARALLEL_dec %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
 ^
arch/x86/crypto/aesni-intel_asm.S:1599:2: note: while in macro instantiation
 GCM_ENC_DEC dec
 ^
<instantiation>:15:74: error: too many positional arguments
 PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
                                                                         ^
arch/x86/crypto/aesni-intel_asm.S:1686:2: note: while in macro instantiation
 GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
 ^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
 GHASH_4_ENCRYPT_4_PARALLEL_enc %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
 ^
arch/x86/crypto/aesni-intel_asm.S:1687:2: note: while in macro instantiation
 GCM_ENC_DEC enc

Craig Topper suggested to me in ClangBuiltLinux issue #1050:

> I think the "too many positional arguments" is because the parser isn't able
> to handle the trailing commas.
>
> The "unknown use of instruction mnemonic" is because the macro was named
> GHASH_4_ENCRYPT_4_PARALLEL_DEC but it's being instantiated with
> GHASH_4_ENCRYPT_4_PARALLEL_dec. I guess gas ignores case on the
> macro instantiation, but llvm doesn't.

First, I removed the trailing comma in the PRECOMPUTE line.

Second, I substituted:
1. GHASH_4_ENCRYPT_4_PARALLEL_DEC -> GHASH_4_ENCRYPT_4_PARALLEL_dec
2. GHASH_4_ENCRYPT_4_PARALLEL_ENC -> GHASH_4_ENCRYPT_4_PARALLEL_enc

With these changes I was able to build with LLVM_IAS=1 and boot on bare metal.

I confirmed that this works with Linux-kernel v5.7.5 final.

NOTE: This patch is on top of Linux v5.7 final.

Thanks to Craig and especially Nick for double-checking and their comments.

Suggested-by: Craig Topper
Suggested-by: Craig Topper
Suggested-by: Nick Desaulniers
Reviewed-by: Nick Desaulniers
Cc: "ClangBuiltLinux"
Link: https://github.com/ClangBuiltLinux/linux/issues/1050
Signed-off-by: Sedat Dilek
---
Changes v3->v4:
- Add <> around email address as desired by Nick
- Add Nick's Reviewed-by

Changes v2->v3:
- Add this Changelog

Changes v1->v2:
- Replace Cc by Suggested-by for Craig
- Replace Cc by Suggested-by for Nick (dropped Cc as desired)
- Really follow the suggestions of Craig
- Drop unneeded comments for my build-environment and Links

 arch/x86/crypto/aesni-intel_asm.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index cad6e1bfa7d5..c216de287742 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -266,7 +266,7 @@ ALL_F:      .octa 0x
 	PSHUFB_XMM %xmm2, %xmm0
 	movdqu %xmm0, CurCount(%arg2) # ctx_data.current_counter = iv
 
-	PRECOMPUTE \SUBKEY, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
+	PRECOMPUTE \SUBKEY, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7
 	movdqu HashKey(%arg2), %xmm13
 
 	CALC_AAD_HASH %xmm13, \AAD, \AADLEN, %xmm0, %xmm1, %xmm2, %xmm3, \
@@ -978,7 +978,7 @@ _initial_blocks_done\@:
 * arg1, %arg3, %arg4 are used as pointers only, not modified
 * %r11 is the data offset value
 */
-.macro GHASH_4_ENCRYPT_4_PARALLEL_ENC TMP1 TMP2 TMP3 TMP4 TMP5 \
+.macro GHASH_4_ENCRYPT_4_PARALLEL_enc TMP1 TMP2 TMP3 TMP4 TMP5 \
 	TMP6 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 operation
 	movdqa	  \XMM1, \XMM5
@@ -1186,7 +1186,7 @@ aes_loop_par_enc_done\@:
 * arg1, %arg3, %arg4 are used as pointers only, not modified
 * %r11 is the data offset value
 */
-.macro GHASH_4_ENCRYPT_4_PARALLEL_DEC TMP1 TMP2 TMP3 TMP4 TMP5 \
+.macro GHASH_4_ENCRYPT_4_PARALLEL_dec TMP1 TMP2 TMP3 TMP4 TMP5 \
 	TMP6 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 operation
 	movdqa	  \XMM1, \XMM5
-- 
2.27.0
Re: [dm-devel] [RFC PATCH 1/1] Add DM_CRYPT_FORCE_INLINE flag to dm-crypt target
On 2020/06/22 17:47, Ignat Korchagin wrote: > Sometimes extra thread offloading imposed by dm-crypt hurts IO latency. This > is > especially visible on busy systems with many processes/threads. Moreover, most > Crypto API implementaions are async, that is they offload crypto operations on > their own, so this dm-crypt offloading is excessive. > > This adds a new flag, which directs dm-crypt not to offload crypto operations > and process everything inline. For cases, where crypto operations cannot > happen > inline (hard interrupt context, for example the read path of the NVME driver), > we offload the work to a tasklet rather than a workqueue. > > Signed-off-by: Ignat Korchagin > --- > drivers/md/dm-crypt.c | 55 +-- > 1 file changed, 43 insertions(+), 12 deletions(-) > > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c > index 000ddfab5ba0..5a9bac4fdffb 100644 > --- a/drivers/md/dm-crypt.c > +++ b/drivers/md/dm-crypt.c > @@ -69,6 +69,7 @@ struct dm_crypt_io { > u8 *integrity_metadata; > bool integrity_metadata_from_pool; > struct work_struct work; > + struct tasklet_struct tasklet; > > struct convert_context ctx; > > @@ -127,7 +128,7 @@ struct iv_elephant_private { > * and encrypts / decrypts at the same time. > */ > enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID, > - DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD }; > + DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD, DM_CRYPT_FORCE_INLINE = > (sizeof(unsigned long) * 8 - 1) }; I do not see why a special value needs to be defined for DM_CRYPT_FORCE_INLINE. It is clear from the number of members in the enum that we have far less than 32 flags. Also, it may be good to add DM_CRYPT_FORCE_INLINE as a new line to avoid the long-ish line. 
> > enum cipher_flags { > CRYPT_MODE_INTEGRITY_AEAD, /* Use authenticated mode for cihper */ > @@ -1458,13 +1459,18 @@ static void crypt_alloc_req_skcipher(struct > crypt_config *cc, > > skcipher_request_set_tfm(ctx->r.req, cc->cipher_tfm.tfms[key_index]); > > - /* > - * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs > - * requests if driver request queue is full. > - */ > - skcipher_request_set_callback(ctx->r.req, > - CRYPTO_TFM_REQ_MAY_BACKLOG, > - kcryptd_async_done, dmreq_of_req(cc, ctx->r.req)); > + if (test_bit(DM_CRYPT_FORCE_INLINE, >flags)) > + /* make sure we zero important fields of the request */ > + skcipher_request_set_callback(ctx->r.req, > + 0, NULL, NULL); May be add a return here to avoid the need for else ? > + else > + /* > + * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs > + * requests if driver request queue is full. > + */ > + skcipher_request_set_callback(ctx->r.req, > + CRYPTO_TFM_REQ_MAY_BACKLOG, > + kcryptd_async_done, dmreq_of_req(cc, ctx->r.req)); > } > > static void crypt_alloc_req_aead(struct crypt_config *cc, > @@ -1566,7 +1572,8 @@ static blk_status_t crypt_convert(struct crypt_config > *cc, > atomic_dec(>cc_pending); > ctx->cc_sector += sector_step; > tag_offset++; > - cond_resched(); > + if (!test_bit(DM_CRYPT_FORCE_INLINE, >flags)) > + cond_resched(); > continue; > /* >* There was a data integrity error. > @@ -1892,6 +1899,11 @@ static void kcryptd_crypt_write_io_submit(struct > dm_crypt_io *io, int async) > > clone->bi_iter.bi_sector = cc->start + io->sector; > > + if (test_bit(DM_CRYPT_FORCE_INLINE, >flags)) { > + generic_make_request(clone); > + return; > + } > + > if (likely(!async) && test_bit(DM_CRYPT_NO_OFFLOAD, >flags)) { A little inline helper such as kcryptd_crypt_write_io_inline(io, async) would simplify things here: the conditions leading to an inline write will be gathered together and can be explained. 
And for SMR, since we also need an IO type based selection, we can extent the helper without needing another change here. > generic_make_request(clone); > return; > @@ -2031,12 +2043,26 @@ static void kcryptd_crypt(struct work_struct *work) > kcryptd_crypt_write_convert(io); > } > > +static void kcryptd_crypt_tasklet(unsigned long work) > +{ > + kcryptd_crypt((struct work_struct *)work); > +} > + > static void kcryptd_queue_crypt(struct dm_crypt_io *io) > { > struct crypt_config *cc = io->cc; > > - INIT_WORK(>work, kcryptd_crypt); > - queue_work(cc->crypt_queue, >work); > + if (test_bit(DM_CRYPT_FORCE_INLINE, >flags)) { > + if (in_irq()) { > + /* Crypto API will fail hard in hard IRQ context */ > +
Re: [PATCH v9 1/4] unicode: Add utf8_casefold_hash
Daniel Rosenberg writes: > This adds a case insensitive hash function to allow taking the hash > without needing to allocate a casefolded copy of the string. > > Signed-off-by: Daniel Rosenberg > --- > fs/unicode/utf8-core.c | 23 ++- > include/linux/unicode.h | 3 +++ > 2 files changed, 25 insertions(+), 1 deletion(-) > > diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c > index 2a878b739115d..90656b9980720 100644 > --- a/fs/unicode/utf8-core.c > +++ b/fs/unicode/utf8-core.c > @@ -6,6 +6,7 @@ > #include > #include > #include > +#include > > #include "utf8n.h" > > @@ -122,9 +123,29 @@ int utf8_casefold(const struct unicode_map *um, const > struct qstr *str, > } > return -EINVAL; > } > - > EXPORT_SYMBOL(utf8_casefold); > > +int utf8_casefold_hash(const struct unicode_map *um, const void *salt, > +struct qstr *str) > +{ > + const struct utf8data *data = utf8nfdicf(um->version); > + struct utf8cursor cur; > + int c; > + unsigned long hash = init_name_hash(salt); > + > + if (utf8ncursor(, data, str->name, str->len) < 0) > + return -EINVAL; > + > + while ((c = utf8byte())) { > + if (c < 0) > + return c; Return -EINVAL here to match other unicode functions, since utf8byte will return -1 on a binary blob, which doesn't make sense for this. Other than that, looks good to me. 
Reviewed-by: Gabriel Krisman Bertazi > + hash = partial_name_hash((unsigned char)c, hash); > + } > + str->hash = end_name_hash(hash); > + return 0; > +} > +EXPORT_SYMBOL(utf8_casefold_hash); > + > int utf8_normalize(const struct unicode_map *um, const struct qstr *str, > unsigned char *dest, size_t dlen) > { > diff --git a/include/linux/unicode.h b/include/linux/unicode.h > index 990aa97d80496..74484d44c7554 100644 > --- a/include/linux/unicode.h > +++ b/include/linux/unicode.h > @@ -27,6 +27,9 @@ int utf8_normalize(const struct unicode_map *um, const > struct qstr *str, > int utf8_casefold(const struct unicode_map *um, const struct qstr *str, > unsigned char *dest, size_t dlen); > > +int utf8_casefold_hash(const struct unicode_map *um, const void *salt, > +struct qstr *str); > + > struct unicode_map *utf8_load(const char *version); > void utf8_unload(struct unicode_map *um); -- Gabriel Krisman Bertazi
Re: [PATCH 5.7 000/477] 5.7.6-rc1 review
On 6/23/20 12:49 PM, Greg Kroah-Hartman wrote: > This is the start of the stable review cycle for the 5.7.6 release. > There are 477 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. > > Responses should be made by Thu, 25 Jun 2020 19:52:30 +. > Anything received after that time might be too late. > [ ... ] > Christophe JAILLET > pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak > in case of error in 'imx_pinctrl_probe()' > This patch has since been reverted in the upstream kernel (commit 13f2d25b951f) and needs to be dropped. Guenter
Re: [PATCH 5.4 000/314] 5.4.49-rc1 review
On 6/23/20 12:53 PM, Greg Kroah-Hartman wrote: > This is the start of the stable review cycle for the 5.4.49 release. > There are 314 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. > > Responses should be made by Thu, 25 Jun 2020 19:52:30 +. > Anything received after that time might be too late. > [ ... ] > Christophe JAILLET > pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak > in case of error in 'imx_pinctrl_probe()' > This patch has since been reverted in the upstream kernel (commit 13f2d25b951f) and needs to be dropped. Guenter
Re: [PATCH 5.7 318/477] pinctrl: freescale: imx: Use devm_of_iomap() to avoid a resource leak in case of error in imx_pinctrl_probe()
Hi, This one must NOT be included. It generates a regression. This should be removed from 5.4 as well. See 13f2d25b951f139064ec2dd53c0c7ebdf8d8007e. There is also a thread on ML about it. I couldn't find it right away, but I'm sure that Dan will be quicker than me for finding it, if needed ;-). CJ Le 23/06/2020 à 21:55, Greg Kroah-Hartman a écrit : From: Christophe JAILLET [ Upstream commit ba403242615c2c99e27af7984b1650771a2cc2c9 ] Use 'devm_of_iomap()' instead 'of_iomap()' to avoid a resource leak in case of error. Update the error handling code accordingly. Fixes: 26d8cde5260b ("pinctrl: freescale: imx: add shared input select reg support") Suggested-by: Dan Carpenter Signed-off-by: Christophe JAILLET Reviewed-by: Dan Carpenter Link: https://lore.kernel.org/r/20200602200626.677981-1-christophe.jail...@wanadoo.fr Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin --- drivers/pinctrl/freescale/pinctrl-imx.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/pinctrl/freescale/pinctrl-imx.c b/drivers/pinctrl/freescale/pinctrl-imx.c index 1f81569c7ae3b..cb7e0f08d2cf4 100644 --- a/drivers/pinctrl/freescale/pinctrl-imx.c +++ b/drivers/pinctrl/freescale/pinctrl-imx.c @@ -824,12 +824,13 @@ int imx_pinctrl_probe(struct platform_device *pdev, return -EINVAL; } - ipctl->input_sel_base = of_iomap(np, 0); + ipctl->input_sel_base = devm_of_iomap(>dev, np, + 0, NULL); of_node_put(np); - if (!ipctl->input_sel_base) { + if (IS_ERR(ipctl->input_sel_base)) { dev_err(>dev, "iomuxc input select base address not found\n"); - return -ENOMEM; + return PTR_ERR(ipctl->input_sel_base); } } }
Re: [dm-crypt] [RFC PATCH 1/1] Add DM_CRYPT_FORCE_INLINE flag to dm-crypt target
On Fri, Jun 19, 2020 at 05:41:32PM +0100, Ignat Korchagin wrote: > Sometimes extra thread offloading imposed by dm-crypt hurts IO latency. This > is > especially visible on busy systems with many processes/threads. Moreover, most > Crypto API implementaions are async, that is they offload crypto operations on > their own, so this dm-crypt offloading is excessive. This really should say "some Crypto API implementations are async" instead of "most Crypto API implementations are async". Notably, the AES-NI implementation of AES-XTS is synchronous if you call it in a context where SIMD instructions are usable. It's only asynchronous when SIMD is not usable. (This seems to have been missed in your blog post.) > This adds a new flag, which directs dm-crypt not to offload crypto operations > and process everything inline. For cases, where crypto operations cannot > happen > inline (hard interrupt context, for example the read path of the NVME driver), > we offload the work to a tasklet rather than a workqueue. This patch both removes some dm-crypt specific queueing, and changes decryption to use softIRQ context instead of a workqueue. It would be useful to know how much of a difference the workqueue => softIRQ change makes by itself. Such a change could be useful for fscrypt as well. (fscrypt uses a workqueue for decryption, but besides that doesn't use any other queueing.) > @@ -127,7 +128,7 @@ struct iv_elephant_private { > * and encrypts / decrypts at the same time. > */ > enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID, > - DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD }; > + DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD, DM_CRYPT_FORCE_INLINE = > (sizeof(unsigned long) * 8 - 1) }; Assigning a specific enum value isn't necessary. 
> @@ -1458,13 +1459,18 @@ static void crypt_alloc_req_skcipher(struct > crypt_config *cc, > > skcipher_request_set_tfm(ctx->r.req, cc->cipher_tfm.tfms[key_index]); > > - /* > - * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs > - * requests if driver request queue is full. > - */ > - skcipher_request_set_callback(ctx->r.req, > - CRYPTO_TFM_REQ_MAY_BACKLOG, > - kcryptd_async_done, dmreq_of_req(cc, ctx->r.req)); > + if (test_bit(DM_CRYPT_FORCE_INLINE, >flags)) > + /* make sure we zero important fields of the request */ > + skcipher_request_set_callback(ctx->r.req, > + 0, NULL, NULL); > + else > + /* > + * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs > + * requests if driver request queue is full. > + */ > + skcipher_request_set_callback(ctx->r.req, > + CRYPTO_TFM_REQ_MAY_BACKLOG, > + kcryptd_async_done, dmreq_of_req(cc, ctx->r.req)); > } This looks wrong. Unless type=0 and mask=CRYPTO_ALG_ASYNC are passed to crypto_alloc_skcipher(), the skcipher implementation can still be asynchronous, in which case providing a callback is required. Do you intend that the "force_inline" option forces the use of a synchronous skcipher (alongside the other things it does)? Or should it still allow asynchronous ones? We may not actually have a choice in that matter, since xts-aes-aesni has the CRYPTO_ALG_ASYNC bit set (as I mentioned) despite being synchronous in most cases; thus, the crypto API won't give you it if you ask for a synchronous cipher. So I think you still need to allow async skciphers? That means a callback is still always required. - Eric
Re: Subject: [PATCH v3 1/2] USB: serial: cp210x: Enable usb generic throttle/unthrottle
On Wed, Jun 24, 2020 at 04:11:21AM +, Phu Luu wrote: > Assign the .throttle and .unthrottle functions to be generic function > in the driver structure to prevent data loss that can otherwise occur > if the host does not enable USB throttling. > > Signed-off-by: Phu Luu An > Signed-off-by: Brant Merryman Why does your Subject: line have "Subject:" in it again? Can you please fix up and resend these patches? thanks, greg k-h
Re: [dm-devel] [RFC PATCH 0/1] dm-crypt excessive overhead
On 2020/06/24 0:23, Mike Snitzer wrote: > On Tue, Jun 23 2020 at 11:07am -0400, > Ignat Korchagin wrote: > >> Do you think it may be better to break it in two flags: one for read >> path and one for write? So, depending on the needs and workflow these >> could be enabled independently? > > If there is a need to split, then sure. But I think Damien had a hard > requirement that writes had to be inlined but that reads didn't _need_ > to be for his dm-zoned usecase. Damien may not yet have assessed the > performance implications, of not have reads inlined, as much as you > have. We did do performance testing :) The results are mixed and performance differences between inline vs workqueues depend on the workload (IO size, IO queue depth and number of drives being used mostly). In many cases, inlining everything does really improve performance as Ignat reported. In our testing, we used hard drives and so focused mostly on throughput rather than command latency. The added workqueue context switch overhead and crypto work latency compared to typical HDD IO times is small, and significant only if the backend storage as short IO times. In the case of HDDs, especially for large IO sizes, inlining crypto work does not shine as it prevents an efficient use of CPU resources. This is especially true with reads on a large system with many drives connected to a single HBA: the softirq context decryption work does not lend itself well to using other CPUs that did not receive the HBA IRQ signaling command completions. The test results clearly show much higher throughputs using dm-crypt as is. On the other hand, inlining crypto work significantly improves workloads of small random IOs, even for a large number of disks: removing the overhead of context switches allows faster completions, allowing sending more requests to the drives more quickly, keeping them busy. 
For SMR, the inlining of write requests is *mandatory* to preserve the issuer write sequence, but encryption work being done in the issuer context (writes to SMR drives can only be O_DIRECT writes), efficient CPU resource usage can be achieved by simply using multiple writer thread/processes, working on different zones of different disks. This is a very reasonable model for SMR as writes into a single zone have to be done under mutual exclusion to ensure sequentiality. For reads, SMR drives are essentially exactly the same as regular disks, so as-is or inline are both OK. Based on our performance results, allowing the user to have the choice of inlining or not reads based on the target workload would be great. Of note is that zone append writes (emulated in SCSI, native with NVMe) are not subject to the sequential write constraint, so they can also be executed either inline or asynchronously. > So let's see how Damien's work goes and if he trully doesn't need/want > reads to be inlined then 2 flags can be created. For SMR, I do not need inline reads, but I do want the user to have the possibility of using this setup as that can provide better performance for some workloads. I think that splitting the inline flag in 2 is exactly what we want: 1) For SMR, the write-inline flag can be automatically turned on when the target device is created if the backend device used is a host-managed zoned drive (scsi or NVMe ZNS). For reads, it would be the user choice, based on the target workload. 2) For regular block devices, write-inline only, read-inline only or both would be the user choice, to optimize for their target workload. With the split into 2 flags, my SMR support patch becomes very simple. > > Mike > > -- > dm-devel mailing list > dm-de...@redhat.com > https://www.redhat.com/mailman/listinfo/dm-devel > > -- Damien Le Moal Western Digital Research
Re: [PATCH v3 05/10] x86: Make sure _etext includes function sections
On Tue, Jun 23, 2020 at 10:23:22AM -0700, Kristen Carlson Accardi wrote: > When using -ffunction-sections to place each function in > its own text section so it can be randomized at load time, the > linker considers these .text.* sections "orphaned sections", and > will place them after the first similar section (.text). In order > to accurately represent the end of the text section and the > orphaned sections, _etext must be moved so that it is after both > .text and .text.*. The text size must also be calculated to > include .text AND .text.* > > Signed-off-by: Kristen Carlson Accardi Reviewed-by: Kees Cook -- Kees Cook
[PATCH] libnvdimm/security: Fix key lookup permissions
As of commit 8c0637e950d6 ("keys: Make the KEY_NEED_* perms an enum rather than a mask") lookup_user_key() needs an explicit declaration of what it wants to do with the key. Add KEY_NEED_SEARCH to fix a warning with the below signature, and to fix the inability to retrieve a key. WARNING: CPU: 15 PID: 6276 at security/keys/permission.c:35 key_task_permission+0xd3/0x140 [..] RIP: 0010:key_task_permission+0xd3/0x140 [..] Call Trace: lookup_user_key+0xeb/0x6b0 ? vsscanf+0x3df/0x840 ? key_validate+0x50/0x50 ? key_default_cmp+0x20/0x20 nvdimm_get_user_key_payload.part.0+0x21/0x110 [libnvdimm] nvdimm_security_store+0x67d/0xb20 [libnvdimm] security_store+0x67/0x1a0 [libnvdimm] kernfs_fop_write+0xcf/0x1c0 vfs_write+0xde/0x1d0 ksys_write+0x68/0xe0 do_syscall_64+0x5c/0xa0 entry_SYSCALL_64_after_hwframe+0x49/0xb3 Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Ira Weiny Suggested-by: David Howells Fixes: 8c0637e950d6 ("keys: Make the KEY_NEED_* perms an enum rather than a mask") Signed-off-by: Dan Williams --- drivers/nvdimm/security.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nvdimm/security.c b/drivers/nvdimm/security.c index 89b85970912d..4cef69bd3c1b 100644 --- a/drivers/nvdimm/security.c +++ b/drivers/nvdimm/security.c @@ -95,7 +95,7 @@ static struct key *nvdimm_lookup_user_key(struct nvdimm *nvdimm, struct encrypted_key_payload *epayload; struct device *dev = &nvdimm->dev; - keyref = lookup_user_key(id, 0, 0); + keyref = lookup_user_key(id, 0, KEY_NEED_SEARCH); if (IS_ERR(keyref)) return NULL;
Re: [PATCH 5.4 000/314] 5.4.49-rc1 review
On 6/23/20 12:53 PM, Greg Kroah-Hartman wrote: > This is the start of the stable review cycle for the 5.4.49 release. > There are 314 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. > > Responses should be made by Thu, 25 Jun 2020 19:52:30 +. > Anything received after that time might be too late. > I see a number of reboot failures in mcimx7d-sabre images; it looks like the gpio driver doesn't instantiate. I'll bisect. This affects both v5.4.y and v5.7.y. Guenter
Re: [PATCH v3 3/9] efi/libstub: Remove .note.gnu.property
On Tue, Jun 23, 2020 at 08:31:42PM -0700, 'Fangrui Song' via Clang Built Linux wrote: > On 2020-06-23, Kees Cook wrote: > > In preparation for adding --orphan-handling=warn to more architectures, > > make sure unwanted sections don't end up appearing under the .init > > section prefix that libstub adds to itself during objcopy. > > > > Signed-off-by: Kees Cook > > --- > > drivers/firmware/efi/libstub/Makefile | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/firmware/efi/libstub/Makefile > > b/drivers/firmware/efi/libstub/Makefile > > index 75daaf20374e..9d2d2e784bca 100644 > > --- a/drivers/firmware/efi/libstub/Makefile > > +++ b/drivers/firmware/efi/libstub/Makefile > > @@ -66,6 +66,9 @@ lib-$(CONFIG_X86) += x86-stub.o > > CFLAGS_arm32-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET) > > CFLAGS_arm64-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET) > > > > +# Remove unwanted sections first. > > +STUBCOPY_FLAGS-y += --remove-section=.note.gnu.property > > + > > # > > # For x86, bootloaders like systemd-boot or grub-efi do not zero-initialize > > the > > # .bss section, so the .bss section of the EFI stub needs to be included in > > the > > arch/arm64/Kconfig enables ARM64_PTR_AUTH by default. When the config is on > > ifeq ($(CONFIG_ARM64_BTI_KERNEL),y) > branch-prot-flags-$(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET_BTI) := > -mbranch-protection=pac-ret+leaf+bti > else > branch-prot-flags-$(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET) := > -mbranch-protection=pac-ret+leaf > endif > > This option creates .note.gnu.property: > > % readelf -n drivers/firmware/efi/libstub/efi-stub.o > > Displaying notes found in: .note.gnu.property > OwnerData sizeDescription > GNU 0x0010 NT_GNU_PROPERTY_TYPE_0 > Properties: AArch64 feature: PAC > > If .note.gnu.property is not desired in drivers/firmware/efi/libstub, > specifying > -mbranch-protection=none can override -mbranch-protection=pac-ret+leaf We want to keep the branch protection enabled. 
But since it's not a "regular" ELF, we don't need to keep the property that identifies the feature. -- Kees Cook
Re: [PATCH] soc: qcom: rpmh-rsc: Don't use ktime for timeout in write_tcs_reg_sync()
Reviewed-by: Maulik Shah Thanks, Maulik On 5/28/2020 8:18 PM, Douglas Anderson wrote: The write_tcs_reg_sync() may be called after timekeeping is suspended so it's not OK to use ktime. The readl_poll_timeout_atomic() macro implicitly uses ktime. This was causing a warning at suspend time. Change to just loop 1000000 times with a delay of 1 us between loops. This may give a timeout of more than 1 second but never less and is safe even if timekeeping is suspended. NOTE: I don't have any actual evidence that we need to loop here. It's possible that all we really need to do is just read the value back to ensure that the pipes are cleaned and the looping/comparing is totally not needed. I never saw the loop being needed in my tests. However, the loop shouldn't hurt. Fixes: 91160150aba0 ("soc: qcom: rpmh-rsc: Timeout after 1 second in write_tcs_reg_sync()") Reported-by: Maulik Shah Signed-off-by: Douglas Anderson --- drivers/soc/qcom/rpmh-rsc.c | 18 +- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c index 076fd27f3081..906778e2c1fa 100644 --- a/drivers/soc/qcom/rpmh-rsc.c +++ b/drivers/soc/qcom/rpmh-rsc.c @@ -175,13 +175,21 @@ static void write_tcs_reg(const struct rsc_drv *drv, int reg, int tcs_id, static void write_tcs_reg_sync(const struct rsc_drv *drv, int reg, int tcs_id, u32 data) { - u32 new_data; + int i; writel(data, tcs_reg_addr(drv, reg, tcs_id)); - if (readl_poll_timeout_atomic(tcs_reg_addr(drv, reg, tcs_id), new_data, - new_data == data, 1, USEC_PER_SEC)) - pr_err("%s: error writing %#x to %d:%#x\n", drv->name, - data, tcs_id, reg); + + /* +* Wait until we read back the same value. Use a counter rather than +* ktime for timeout since this may be called after timekeeping stops.
+*/ + for (i = 0; i < USEC_PER_SEC; i++) { + if (readl(tcs_reg_addr(drv, reg, tcs_id)) == data) + return; + udelay(1); + } + pr_err("%s: error writing %#x to %d:%#x\n", drv->name, + data, tcs_id, reg); } /** -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH 05/15] mm: allow read-ahead with IOCB_NOWAIT set
On Thu, Jun 18, 2020 at 08:43:45AM -0600, Jens Axboe wrote: > The read-ahead shouldn't block, so allow it to be done even if > IOCB_NOWAIT is set in the kiocb. > > Acked-by: Johannes Weiner > Signed-off-by: Jens Axboe BTW, Jens, in case nobody had mentioned it, the Reply-To field for the patches in this patchset is screwed up: | Reply-To: a...@vger.kernel.org, supp...@vger.kernel.org, f...@vger.kernel.org, | as...@vger.kernel.org, buffe...@vger.kernel.org, | re...@vger.kernel.org Cheers, Dave. -- Dave Chinner da...@fromorbit.com
[PATCH v9 1/4] unicode: Add utf8_casefold_hash
This adds a case insensitive hash function to allow taking the hash without needing to allocate a casefolded copy of the string. Signed-off-by: Daniel Rosenberg --- fs/unicode/utf8-core.c | 23 ++- include/linux/unicode.h | 3 +++ 2 files changed, 25 insertions(+), 1 deletion(-) diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c index 2a878b739115d..90656b9980720 100644 --- a/fs/unicode/utf8-core.c +++ b/fs/unicode/utf8-core.c @@ -6,6 +6,7 @@ #include #include #include +#include #include "utf8n.h" @@ -122,9 +123,29 @@ int utf8_casefold(const struct unicode_map *um, const struct qstr *str, } return -EINVAL; } - EXPORT_SYMBOL(utf8_casefold); +int utf8_casefold_hash(const struct unicode_map *um, const void *salt, + struct qstr *str) +{ + const struct utf8data *data = utf8nfdicf(um->version); + struct utf8cursor cur; + int c; + unsigned long hash = init_name_hash(salt); + + if (utf8ncursor(&cur, data, str->name, str->len) < 0) + return -EINVAL; + + while ((c = utf8byte(&cur))) { + if (c < 0) + return c; + hash = partial_name_hash((unsigned char)c, hash); + } + str->hash = end_name_hash(hash); + return 0; +} +EXPORT_SYMBOL(utf8_casefold_hash); + int utf8_normalize(const struct unicode_map *um, const struct qstr *str, unsigned char *dest, size_t dlen) { diff --git a/include/linux/unicode.h b/include/linux/unicode.h index 990aa97d80496..74484d44c7554 100644 --- a/include/linux/unicode.h +++ b/include/linux/unicode.h @@ -27,6 +27,9 @@ int utf8_normalize(const struct unicode_map *um, const struct qstr *str, int utf8_casefold(const struct unicode_map *um, const struct qstr *str, unsigned char *dest, size_t dlen); +int utf8_casefold_hash(const struct unicode_map *um, const void *salt, + struct qstr *str); + struct unicode_map *utf8_load(const char *version); void utf8_unload(struct unicode_map *um); -- 2.27.0.111.gc72c7da667-goog
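The core idea of utf8_casefold_hash(), feeding each casefolded byte into a rolling hash instead of allocating a casefolded copy, can be sketched in plain userspace C. This is an illustration only: ASCII tolower() stands in for the kernel's UTF-8 casefolding cursor, ci_hash() and partial_hash() are invented names, and partial_hash() mirrors the formula of the kernel's partial_name_hash():

```c
#include <stdio.h>
#include <ctype.h>

/* Same arithmetic as the kernel's partial_name_hash(): a cheap
 * multiply-add rolling hash over one byte at a time. */
static unsigned long partial_hash(unsigned char c, unsigned long prev)
{
    return (prev + (c << 4) + (c >> 4)) * 11;
}

/* Hash the casefolded form of a name without materializing a
 * casefolded copy: each byte is folded on the fly and fed into
 * the rolling hash. (init_name_hash(salt) would seed this in
 * the kernel; 0 is used here for simplicity.) */
static unsigned long ci_hash(const char *name)
{
    unsigned long hash = 0;

    while (*name)
        hash = partial_hash((unsigned char)tolower((unsigned char)*name++),
                            hash);
    return hash;
}

int main(void)
{
    /* Names differing only in case hash identically... */
    printf("%d\n", ci_hash("Foo.TXT") == ci_hash("foo.txt"));
    /* ...while distinct names (almost always) do not. */
    printf("%d\n", ci_hash("foo") == ci_hash("bar"));
    return 0;
}
```

The point of the streaming design is visible in the patch itself: utf8byte() yields one decomposed, casefolded byte per call, so the hash needs O(1) extra memory instead of a kmalloc'd casefolded buffer.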
[PATCH v9 3/4] f2fs: Use generic casefolding support
This switches f2fs over to the generic support provided in commit 5f829feca774 ("fs: Add standard casefolding support") Signed-off-by: Daniel Rosenberg --- fs/f2fs/dir.c | 84 + fs/f2fs/f2fs.h | 4 -- fs/f2fs/super.c | 10 ++--- fs/f2fs/sysfs.c | 10 +++-- include/linux/f2fs_fs.h | 3 -- 5 files changed, 21 insertions(+), 90 deletions(-) diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c index d35976785e8c5..b59c2673daa09 100644 --- a/fs/f2fs/dir.c +++ b/fs/f2fs/dir.c @@ -75,21 +75,22 @@ int f2fs_init_casefolded_name(const struct inode *dir, struct f2fs_filename *fname) { #ifdef CONFIG_UNICODE - struct f2fs_sb_info *sbi = F2FS_SB(dir->i_sb); + struct super_block *sb = dir->i_sb; + struct f2fs_sb_info *sbi = F2FS_SB(sb); if (IS_CASEFOLDED(dir)) { fname->cf_name.name = f2fs_kmalloc(sbi, F2FS_NAME_LEN, GFP_NOFS); if (!fname->cf_name.name) return -ENOMEM; - fname->cf_name.len = utf8_casefold(sbi->s_encoding, + fname->cf_name.len = utf8_casefold(sb->s_encoding, fname->usr_fname, fname->cf_name.name, F2FS_NAME_LEN); if ((int)fname->cf_name.len <= 0) { kfree(fname->cf_name.name); fname->cf_name.name = NULL; - if (f2fs_has_strict_mode(sbi)) + if (sb_has_enc_strict_mode(sb)) return -EINVAL; /* fall back to treating name as opaque byte sequence */ } @@ -215,8 +216,9 @@ static struct f2fs_dir_entry *find_in_block(struct inode *dir, static bool f2fs_match_ci_name(const struct inode *dir, const struct qstr *name, const u8 *de_name, u32 de_name_len) { - const struct f2fs_sb_info *sbi = F2FS_SB(dir->i_sb); - const struct unicode_map *um = sbi->s_encoding; + const struct super_block *sb = dir->i_sb; + const struct f2fs_sb_info *sbi = F2FS_SB(sb); + const struct unicode_map *um = sb->s_encoding; struct qstr entry = QSTR_INIT(de_name, de_name_len); int res; @@ -226,7 +228,7 @@ static bool f2fs_match_ci_name(const struct inode *dir, const struct qstr *name, * In strict mode, ignore invalid names. In non-strict mode, * fall back to treating them as opaque byte sequences. 
*/ - if (f2fs_has_strict_mode(sbi) || name->len != entry.len) + if (sb_has_enc_strict_mode(sb) || name->len != entry.len) return false; return !memcmp(name->name, entry.name, name->len); } @@ -1107,75 +1109,9 @@ const struct file_operations f2fs_dir_operations = { }; #ifdef CONFIG_UNICODE -static int f2fs_d_compare(const struct dentry *dentry, unsigned int len, - const char *str, const struct qstr *name) -{ - const struct dentry *parent = READ_ONCE(dentry->d_parent); - const struct inode *dir = READ_ONCE(parent->d_inode); - const struct f2fs_sb_info *sbi = F2FS_SB(dentry->d_sb); - struct qstr entry = QSTR_INIT(str, len); - char strbuf[DNAME_INLINE_LEN]; - int res; - - if (!dir || !IS_CASEFOLDED(dir)) - goto fallback; - - /* -* If the dentry name is stored in-line, then it may be concurrently -* modified by a rename. If this happens, the VFS will eventually retry -* the lookup, so it doesn't matter what ->d_compare() returns. -* However, it's unsafe to call utf8_strncasecmp() with an unstable -* string. Therefore, we have to copy the name into a temporary buffer. -*/ - if (len <= DNAME_INLINE_LEN - 1) { - memcpy(strbuf, str, len); - strbuf[len] = 0; - entry.name = strbuf; - /* prevent compiler from optimizing out the temporary buffer */ - barrier(); - } - - res = utf8_strncasecmp(sbi->s_encoding, name, ); - if (res >= 0) - return res; - - if (f2fs_has_strict_mode(sbi)) - return -EINVAL; -fallback: - if (len != name->len) - return 1; - return !!memcmp(str, name->name, len); -} - -static int f2fs_d_hash(const struct dentry *dentry, struct qstr *str) -{ - struct f2fs_sb_info *sbi = F2FS_SB(dentry->d_sb); - const struct unicode_map *um = sbi->s_encoding; - const struct inode *inode = READ_ONCE(dentry->d_inode); - unsigned char *norm; - int len, ret = 0; - - if (!inode || !IS_CASEFOLDED(inode)) - return 0; - - norm =
[PATCH v9 4/4] ext4: Use generic casefolding support
This switches ext4 over to the generic support provided in commit 5f829feca774 ("fs: Add standard casefolding support") Signed-off-by: Daniel Rosenberg --- fs/ext4/dir.c | 64 ++--- fs/ext4/ext4.h | 12 -- fs/ext4/hash.c | 2 +- fs/ext4/namei.c | 20 +++- fs/ext4/super.c | 12 +- 5 files changed, 17 insertions(+), 93 deletions(-) diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c index 1d82336b1cd45..b437120f0b3f5 100644 --- a/fs/ext4/dir.c +++ b/fs/ext4/dir.c @@ -669,68 +669,8 @@ const struct file_operations ext4_dir_operations = { }; #ifdef CONFIG_UNICODE -static int ext4_d_compare(const struct dentry *dentry, unsigned int len, - const char *str, const struct qstr *name) -{ - struct qstr qstr = {.name = str, .len = len }; - const struct dentry *parent = READ_ONCE(dentry->d_parent); - const struct inode *inode = READ_ONCE(parent->d_inode); - char strbuf[DNAME_INLINE_LEN]; - - if (!inode || !IS_CASEFOLDED(inode) || - !EXT4_SB(inode->i_sb)->s_encoding) { - if (len != name->len) - return -1; - return memcmp(str, name->name, len); - } - - /* -* If the dentry name is stored in-line, then it may be concurrently -* modified by a rename. If this happens, the VFS will eventually retry -* the lookup, so it doesn't matter what ->d_compare() returns. -* However, it's unsafe to call utf8_strncasecmp() with an unstable -* string. Therefore, we have to copy the name into a temporary buffer. 
-*/ - if (len <= DNAME_INLINE_LEN - 1) { - memcpy(strbuf, str, len); - strbuf[len] = 0; - qstr.name = strbuf; - /* prevent compiler from optimizing out the temporary buffer */ - barrier(); - } - - return ext4_ci_compare(inode, name, , false); -} - -static int ext4_d_hash(const struct dentry *dentry, struct qstr *str) -{ - const struct ext4_sb_info *sbi = EXT4_SB(dentry->d_sb); - const struct unicode_map *um = sbi->s_encoding; - const struct inode *inode = READ_ONCE(dentry->d_inode); - unsigned char *norm; - int len, ret = 0; - - if (!inode || !IS_CASEFOLDED(inode) || !um) - return 0; - - norm = kmalloc(PATH_MAX, GFP_ATOMIC); - if (!norm) - return -ENOMEM; - - len = utf8_casefold(um, str, norm, PATH_MAX); - if (len < 0) { - if (ext4_has_strict_mode(sbi)) - ret = -EINVAL; - goto out; - } - str->hash = full_name_hash(dentry, norm, len); -out: - kfree(norm); - return ret; -} - const struct dentry_operations ext4_dentry_ops = { - .d_hash = ext4_d_hash, - .d_compare = ext4_d_compare, + .d_hash = generic_ci_d_hash, + .d_compare = generic_ci_d_compare, }; #endif diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 42f5060f3cdf1..5cd8be24a4fd9 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1393,14 +1393,6 @@ struct ext4_super_block { #define EXT4_ENC_UTF8_12_1 1 -/* - * Flags for ext4_sb_info.s_encoding_flags. 
- */ -#define EXT4_ENC_STRICT_MODE_FL(1 << 0) - -#define ext4_has_strict_mode(sbi) \ - (sbi->s_encoding_flags & EXT4_ENC_STRICT_MODE_FL) - /* * fourth extended-fs super-block data in memory */ @@ -1450,10 +1442,6 @@ struct ext4_sb_info { struct kobject s_kobj; struct completion s_kobj_unregister; struct super_block *s_sb; -#ifdef CONFIG_UNICODE - struct unicode_map *s_encoding; - __u16 s_encoding_flags; -#endif /* Journaling */ struct journal_s *s_journal; diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c index 3e133793a5a34..143b0073b3f46 100644 --- a/fs/ext4/hash.c +++ b/fs/ext4/hash.c @@ -275,7 +275,7 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len, struct dx_hash_info *hinfo) { #ifdef CONFIG_UNICODE - const struct unicode_map *um = EXT4_SB(dir->i_sb)->s_encoding; + const struct unicode_map *um = dir->i_sb->s_encoding; int r, dlen; unsigned char *buff; struct qstr qstr = {.name = name, .len = len }; diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 56738b538ddf4..7e9fb77fd2cc7 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1286,8 +1286,8 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block) int ext4_ci_compare(const struct inode *parent, const struct qstr *name, const struct qstr *entry, bool quick) { - const struct ext4_sb_info *sbi = EXT4_SB(parent->i_sb); - const struct unicode_map *um = sbi->s_encoding; + const struct super_block *sb = parent->i_sb; + const struct unicode_map *um = sb->s_encoding; int ret; if (quick) @@ -1299,7 +1299,7 @@ int
[PATCH v9 0/4] Prepare for upcoming Casefolding/Encryption patches
This lays the groundwork for enabling casefolding and encryption at the same time for ext4 and f2fs. A future set of patches will enable that functionality. These patches unify the highly similar dentry_operations that ext4 and f2fs both use for casefolding. Daniel Rosenberg (4): unicode: Add utf8_casefold_hash fs: Add standard casefolding support f2fs: Use generic casefolding support ext4: Use generic casefolding support fs/ext4/dir.c | 64 + fs/ext4/ext4.h | 12 - fs/ext4/hash.c | 2 +- fs/ext4/namei.c | 20 fs/ext4/super.c | 12 ++--- fs/f2fs/dir.c | 84 - fs/f2fs/f2fs.h | 4 -- fs/f2fs/super.c | 10 ++-- fs/f2fs/sysfs.c | 10 ++-- fs/libfs.c | 101 fs/unicode/utf8-core.c | 23 - include/linux/f2fs_fs.h | 3 -- include/linux/fs.h | 22 + include/linux/unicode.h | 3 ++ 14 files changed, 186 insertions(+), 184 deletions(-) -- 2.27.0.111.gc72c7da667-goog
[PATCH v9 2/4] fs: Add standard casefolding support
This adds general supporting functions for filesystems that use utf8 casefolding. It provides standard dentry_operations and adds the necessary structures in struct super_block to allow this standardization. Ext4 and F2fs will switch to these common implementations. Signed-off-by: Daniel Rosenberg --- fs/libfs.c | 101 + include/linux/fs.h | 22 ++ 2 files changed, 123 insertions(+) diff --git a/fs/libfs.c b/fs/libfs.c index 4d08edf19c782..f7345a5ed562f 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -20,6 +20,8 @@ #include #include #include +#include +#include #include @@ -1363,3 +1365,102 @@ bool is_empty_dir_inode(struct inode *inode) return (inode->i_fop == _dir_operations) && (inode->i_op == _dir_inode_operations); } + +#ifdef CONFIG_UNICODE +/** + * needs_casefold - generic helper to determine if a filename should be casefolded + * @dir: Parent directory + * + * Generic helper for filesystems to use to determine if the name of a dentry + * should be casefolded. It does not make sense to casefold the no-key token of + * an encrypted filename. 
+ * + * Return: if names will need casefolding + */ +bool needs_casefold(const struct inode *dir) +{ + return IS_CASEFOLDED(dir) && dir->i_sb->s_encoding && + (!IS_ENCRYPTED(dir) || fscrypt_has_encryption_key(dir)); +} +EXPORT_SYMBOL(needs_casefold); + +/** + * generic_ci_d_compare - generic d_compare implementation for casefolding filesystems + * @dentry:dentry whose name we are checking against + * @len: len of name of dentry + * @str: str pointer to name of dentry + * @name: Name to compare against + * + * Return: 0 if names match, 1 if mismatch, or -ERRNO + */ +int generic_ci_d_compare(const struct dentry *dentry, unsigned int len, + const char *str, const struct qstr *name) +{ + const struct dentry *parent = READ_ONCE(dentry->d_parent); + const struct inode *inode = READ_ONCE(parent->d_inode); + const struct super_block *sb = dentry->d_sb; + const struct unicode_map *um = sb->s_encoding; + struct qstr qstr = QSTR_INIT(str, len); + char strbuf[DNAME_INLINE_LEN]; + int ret; + + if (!inode || !needs_casefold(inode)) + goto fallback; + /* +* If the dentry name is stored in-line, then it may be concurrently +* modified by a rename. If this happens, the VFS will eventually retry +* the lookup, so it doesn't matter what ->d_compare() returns. +* However, it's unsafe to call utf8_strncasecmp() with an unstable +* string. Therefore, we have to copy the name into a temporary buffer. 
+*/ + if (len <= DNAME_INLINE_LEN - 1) { + memcpy(strbuf, str, len); + strbuf[len] = 0; + qstr.name = strbuf; + /* prevent compiler from optimizing out the temporary buffer */ + barrier(); + } + ret = utf8_strncasecmp(um, name, ); + if (ret >= 0) + return ret; + + if (sb_has_enc_strict_mode(sb)) + return -EINVAL; +fallback: + if (len != name->len) + return 1; + return !!memcmp(str, name->name, len); +} +EXPORT_SYMBOL(generic_ci_d_compare); + +/** + * generic_ci_d_hash - generic d_hash implementation for casefolding filesystems + * @dentry:dentry whose name we are hashing + * @str: qstr of name whose hash we should fill in + * + * Return: 0 if hash was successful, or -ERRNO + */ +int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str) +{ + const struct inode *inode = READ_ONCE(dentry->d_inode); + struct super_block *sb = dentry->d_sb; + const struct unicode_map *um = sb->s_encoding; + int ret = 0; + + if (!inode || !needs_casefold(inode)) + return 0; + + ret = utf8_casefold_hash(um, dentry, str); + if (ret < 0) + goto err; + + return 0; +err: + if (sb_has_enc_strict_mode(sb)) + ret = -EINVAL; + else + ret = 0; + return ret; +} +EXPORT_SYMBOL(generic_ci_d_hash); +#endif diff --git a/include/linux/fs.h b/include/linux/fs.h index 3f881a892ea74..261904e06873b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1392,6 +1392,12 @@ extern int send_sigurg(struct fown_struct *fown); #define SB_ACTIVE (1<<30) #define SB_NOUSER (1<<31) +/* These flags relate to encoding and casefolding */ +#define SB_ENC_STRICT_MODE_FL (1 << 0) + +#define sb_has_enc_strict_mode(sb) \ + (sb->s_encoding_flags & SB_ENC_STRICT_MODE_FL) + /* * Umount options */ @@ -1461,6 +1467,10 @@ struct super_block { #endif #ifdef CONFIG_FS_VERITY const struct fsverity_operations *s_vop; +#endif +#ifdef CONFIG_UNICODE + struct unicode_map *s_encoding; + __u16 s_encoding_flags; #endif
Re: [PATCH -next] scsi: ufs: allow exynos ufs driver to build as module
On Sat, 20 Jun 2020 23:02:32 +0530, Alim Akhtar wrote: > Allow Exynos UFS driver to build as a module. > This patch fixes the below build issue reported by > the kernel build robot. > > drivers/scsi/ufs/ufs-exynos.o: in function `exynos_ufs_probe': > drivers/scsi/ufs/ufs-exynos.c:1231: undefined reference to > `ufshcd_pltfrm_init' > drivers/scsi/ufs/ufs-exynos.o: in function `exynos_ufs_pre_pwr_mode': > drivers/scsi/ufs/ufs-exynos.c:635: undefined reference to > `ufshcd_get_pwr_dev_param' > drivers/scsi/ufs/ufs-exynos.o:undefined reference to `ufshcd_pltfrm_shutdown' > drivers/scsi/ufs/ufs-exynos.o:undefined reference to `ufshcd_pltfrm_suspend' > drivers/scsi/ufs/ufs-exynos.o:undefined reference to `ufshcd_pltfrm_resume' > drivers/scsi/ufs/ufs-exynos.o:undefined reference to > `ufshcd_pltfrm_runtime_suspend' > drivers/scsi/ufs/ufs-exynos.o:undefined reference to > `ufshcd_pltfrm_runtime_resume' > drivers/scsi/ufs/ufs-exynos.o:undefined reference to > `ufshcd_pltfrm_runtime_idle' Applied to 5.9/scsi-queue, thanks! [1/1] scsi: ufs: Allow exynos ufs driver to build as module https://git.kernel.org/mkp/scsi/c/d31503fe395d -- Martin K. Petersen Oracle Linux Engineering
Re: Switching dmaengine tree to kernel.org
Hi Stephen, On 24-06-20, 08:09, Stephen Rothwell wrote: > Hi Vinod, > > On Tue, 23 Jun 2020 20:13:13 +0530 Vinod Koul wrote: > > > > I have switched dmaengine tree to kernel.org [1], please update your > > database to new tree which can be found at [2] > > > > [1]: > > https://lore.kernel.org/dmaengine/20200623143729.781403-1-vk...@kernel.org/ > > [2]: git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git > > OK, done. I also renamed them from slave-dma{,-fixes} to dmaengine{,-fixes). Thank you, rename makes sense. -- ~Vinod
Re: memcg missing charge when setting BPF
Hello Xie! It's actually not a surprise, it's a known limitation/exception. Partially it was so because historically there was no way to account percpu memory, and some bpf maps use it quite extensively. Fortunately, it changed recently, and 5.9 will likely gain the ability to account percpu memory. The latest version of the patchset I've actually sent today: https://lore.kernel.org/linux-mm/20200623184515.4132564-1-g...@fb.com/T/#m0be45dd71e6a238985181c213d9934731949c089 I also have a patchset in the works which adds memcg accounting of bpf memory (programs and maps). I plan to send it upstream next week. If everything goes smoothly it might appear in 5.9 as well. Unfortunately the magnitude of the required changes does not allow backporting them to older kernels. Thanks! PS I'll be completely offline till the end of the week. I'll respond to all e-mail on Monday, Jun 29th. Thanks! On Wed, Jun 24, 2020 at 03:46:58AM +, Xie Xun wrote: > Hello, > > I found that programs can consume much more memory than the memcg limit by > setting BPF many times. It's because allocations made during setting BPF > are not charged to the memcg. > > > Below is how I did it: > > 1. Run Linux kernel in a QEMU virtual machine (x86_64) with 1GB physical > memory. > The kernel is built with memcg and memcg kmem accounting enabled. > > 2. Create a docker (runC) container, with memory limit 100MB. > > docker run --name debian --memory 1 --kernel-memory 5000 \ > debian:slim /bin/bash > > 3. In the container, run a program that sets BPF many times. I use prctl to > set BPF. > > while(1) > { > prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &bpf); > } > > 4. Physical memory usage (as reported by `free` or `top`) is increased by around > 40MB, > but memory usage of the container's memcg doesn't increase a lot (around > 100KB). > > 5. Run several processes to set BPF, and almost all physical memory is > consumed. > Sometimes some processes not in the container are also killed due to OOM.
> > I also try this with user namespace on, and I can still kill host processes > inside container in this way. So this problem may be dangerous for containers > that based on cgroups. > > > kernel version: 5.3.6 > kernel configuration: in attachment (CONFIG_MEMCG_KMEM is on) > > > This blog also shows this problem: > https://urldefense.proofpoint.com/v2/url?u=https-3A__blog.xiexun.tech_break-2Dmemcg.html=DwIFaQ=5VD0RTtNlTh3ycd41b3MUw=jJYgtDM7QT-W-Fz_d29HYQ=IBhsN9u88bNDFoDHNutIMKB-YrCvCOIvw-8z9RpB8RI=O1b3udJv7obq8vZ88-YPEDzs7hhGov3o_Txskn4IeyA= > > > > Cause of this problem: > > Memory allocations during setting BPF are not charged by memcg. For example, > in kernel/bpf/core.c:bpf_prog_alloc, bpf_prog_alloc_no_stats and > alloc_percpu_gfp > are called to allocate memory. However, neither of them are charged by memcg. > So if we trigger this path for many times, we can consume lots of memory, > without > increasing our memcg usage. > > /* */ > struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags) > { > gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | gfp_extra_flags; > struct bpf_prog *prog; > int cpu; > > prog = bpf_prog_alloc_no_stats(size, gfp_extra_flags); > if (!prog) > return NULL; > > prog->aux->stats = alloc_percpu_gfp(struct bpf_prog_stats, gfp_flags); > > /* ... 
*/ > > } > /* */ > > > My program that sets BPF: > > /* */ > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > int main() > { > struct sock_filter insns[] = > { > { > .code = 0x6, > .jt = 0, > .jf = 0, > .k = SECCOMP_RET_ALLOW > } > }; > struct sock_fprog bpf = > { > .len = 1, > .filter = insns > }; > int ret; > > ret = prctl(PR_SET_NO_NEW_PRIVS, 1, NULL, 0, 0); > if (ret) > { > printf("error1 %d\n", errno); > return 1; > } > int count = 0; > while (1) > { > ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ); > if (ret) > { > sleep(1); > printf("error %d\n", errno); > } > else > { > count++; > printf("ok %d\n", count); > } > } > return 0; > } > /* */ > > > Thanks, > Xie Xun
Re: [PATCH] scsi: lpfc: Avoid another null dereference in lpfc_sli4_hba_unset()
On Tue, 23 Jun 2020 10:41:22 +0200, SeongJae Park wrote: > Commit cdb42becdd40 ("scsi: lpfc: Replace io_channels for nvme and fcp > with general hdw_queues per cpu") has introduced static checker warnings > for potential null dereferences in 'lpfc_sli4_hba_unset()' and > commit 1ffdd2c0440d ("scsi: lpfc: resolve static checker warning in > lpfc_sli4_hba_unset") has tried to fix it. However, yet another > potential null dereference is remaining. This commit fixes it. > > [...] Applied to 5.8/scsi-fixes, thanks! [1/1] scsi: lpfc: Avoid another null dereference in lpfc_sli4_hba_unset() https://git.kernel.org/mkp/scsi/c/46da547e21d6 -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] qla2xxx: Set NVME status code for failed NVME FCP request
On Thu, 4 Jun 2020 12:07:45 +0200, Daniel Wagner wrote: > The qla2xxx driver knows when a request was processed successfully or > not. But it always sets the NVME status code to 0/NVME_SC_SUCCESS. The > upper layer needs to figure out from the rcv_rsplen and > transferred_length variables whether the request was successful. This is > not always possible, e.g. when the request data length is 0, the > transferred_length is also set to 0, which is interpreted as success in > nvme_fc_fcpio_done(). Let's inform the upper > layer (nvme_fc_fcpio_done()) when something went wrong. > > [...] Applied to 5.8/scsi-fixes, thanks! [1/1] scsi: qla2xxx: Set NVMe status code for failed NVMe FCP request https://git.kernel.org/mkp/scsi/c/ef2e3ec520a8 -- Martin K. Petersen Oracle Linux Engineering
Re: [RFC PATCH 0/1] dm-crypt excessive overhead
On 2020/06/24 0:01, Mike Snitzer wrote: > On Sun, Jun 21 2020 at 8:45pm -0400, > Damien Le Moal wrote: > >> On 2020/06/20 1:56, Mike Snitzer wrote: >>> On Fri, Jun 19 2020 at 12:41pm -0400, >>> Ignat Korchagin wrote: >>> This is a follow-up from the long-forgotten [1], but with some more convincing evidence. Consider the following script: #!/bin/bash -e # create 4G ramdisk sudo modprobe brd rd_nr=1 rd_size=4194304 # create a dm-crypt device with NULL cipher on top of /dev/ram0 echo '0 8388608 crypt capi:ecb(cipher_null) - 0 /dev/ram0 0' | sudo dmsetup create eram0 # create a dm-crypt device with NULL cipher and custom force_inline flag echo '0 8388608 crypt capi:ecb(cipher_null) - 0 /dev/ram0 0 1 force_inline' | sudo dmsetup create inline-eram0 # read all data from /dev/ram0 sudo dd if=/dev/ram0 bs=4k iflag=direct | sha256sum # read the same data from /dev/mapper/eram0 sudo dd if=/dev/mapper/eram0 bs=4k iflag=direct | sha256sum # read the same data from /dev/mapper/inline-eram0 sudo dd if=/dev/mapper/inline-eram0 bs=4k iflag=direct | sha256sum This script creates a ramdisk (to eliminate hardware bias in the benchmark) and two dm-crypt instances on top. Both dm-crypt instances use the NULL cipher to eliminate potentially expensive crypto bias (the NULL cipher just uses memcpy for "encryption"). The first instance is the current dm-crypt implementation from 5.8-rc1, the second is the dm-crypt instance with a custom new flag enabled from the patch attached to this thread.
On my VM (Debian in VirtualBox with 4 cores on 2.8 GHz Quad-Core Intel Core i7) I get the following output (formatted for better readability): # plain ram0 1048576+0 records in 1048576+0 records out 4294967296 bytes (4.3 GB, 4.0 GiB) copied, 21.2305 s, 202 MB/s 8479e43911dc45e89f934fe48d01297e16f51d17aa561d4d1c216b1ae0fcddca - # eram0 (current dm-crypt) 1048576+0 records in 1048576+0 records out 4294967296 bytes (4.3 GB, 4.0 GiB) copied, 53.2212 s, 80.7 MB/s 8479e43911dc45e89f934fe48d01297e16f51d17aa561d4d1c216b1ae0fcddca - # inline-eram0 (patched dm-crypt) 1048576+0 records in 1048576+0 records out 4294967296 bytes (4.3 GB, 4.0 GiB) copied, 21.3472 s, 201 MB/s 8479e43911dc45e89f934fe48d01297e16f51d17aa561d4d1c216b1ae0fcddca - As we can see, current dm-crypt implementation creates a significant IO performance overhead (at least on small IO block sizes) for both latency and throughput. We suspect offloading IO request processing into workqueues and async threads is more harmful these days with the modern fast storage. I also did some digging into the dm-crypt git history and much of this async processing is not needed anymore, because the reasons it was added are mostly gone from the kernel. More details can be found in [2] (see "Git archeology" section). We have been running the attached patch on different hardware generations in more than 200 datacentres on both SATA SSDs and NVME SSDs and so far were very happy with the performance benefits. [1]: https://www.spinics.net/lists/dm-crypt/msg07516.html [2]: https://blog.cloudflare.com/speeding-up-linux-disk-encryption/ Ignat Korchagin (1): Add DM_CRYPT_FORCE_INLINE flag to dm-crypt target drivers/md/dm-crypt.c | 55 +-- 1 file changed, 43 insertions(+), 12 deletions(-) -- 2.20.1 >>> >>> Hi, >>> >>> I saw [2] and have been expecting something from cloudflare ever since. >>> Nice to see this submission. >>> >>> There is useful context in your 0th patch header. 
I'll likely merge >>> parts of this patch header with the more terse 1/1 header (reality is >>> there only needed to be a single patch submission). >>> >>> Will review and stage accordingly if all looks fine to me. Mikulas, >>> please have a look too. >> >> Very timely: I was about to send a couple of patches to add zoned block >> device >> support to dm-crypt :) >> >> I used [1] work as a base to have all _write_ requests be processed inline in >> the submitter context so that the submission order is preserved, avoiding the >> potential reordering of sequential writes that the normal workqueue based >> processing can generate. This inline processing is done only for writes. >> Reads >> are unaffected. >> >> To do this, I added a "inline_io" flag to struct convert_context which is >> initialized in crypt_io_init() based on the BIO op. >> kcryptd_crypt_write_io_submit() then uses this flag to call directly >> generic_make_request() if inline_io is true. >> >> This simplifies things compared to [1] since reads can still be processed as
Re: [PATCH v6 0/2] Renovate memcpy_mcsafe with copy_mc_to_{user, kernel}
On Wed, Jun 17, 2020 at 05:31:58PM -0700, Dan Williams wrote: > No changes since v5 [1], just rebased to v5.8-rc1. No comments since > that posting back at the end of May either, will continue to re-post > weekly, I am otherwise at a loss for what else to do to move this > forward. Should it go through Andrew since it's across PPC and x86? > Thanks again to Michael for the PPC acks. > > [1]: > http://lore.kernel.org/r/159062136234.2192412.7285856919306307817.st...@dwillia2-desk3.amr.corp.intel.com > > --- > > The primary motivation to go touch memcpy_mcsafe() is that the existing > benefit of doing slow "handle with care" copies is obviated on newer > CPUs. With that concern lifted it also obviates the need to continue to > update the MCA-recovery capability detection code currently gated by > "mcsafe_key". Now the old "mcsafe_key" opt-in to perform the copy with > concerns for recovery fragility can instead be made an opt-out from the > default fast copy implementation (enable_copy_mc_fragile()). > > The discussion with Linus on the first iteration of this patch > identified that memcpy_mcsafe() was misnamed relative to its usage. The > new names copy_mc_to_user() and copy_mc_to_kernel() clearly indicate the > intended use case and let the architecture organize the implementation > accordingly. > > For both powerpc and x86 a copy_mc_generic() implementation is added as > the backend for these interfaces. > > Patches are relative to v5.8-rc1. It has a recent build success > notification from the kbuild robot and is passing local nvdimm tests. > Looks good to me. I tested on a Broadwell generation system (i.e. one of the systems you now list as "fragile") and injecting uncorrected errors into buffers that are then copied using copy_mc_to_kernel() results in recovered machine checks with the return value correctly indicating the amount remaining. Reviewed-by: Tony Luck -Tony
arch/s390/include/asm/atomic.h:51:35: sparse: sparse: context imbalance in 'btrfs_set_lock_blocking_read' - unexpected unlock
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 3e08a95294a4fb3702bb3d35ed08028433c37fe6 commit: d6156218bec93965b6a43ba2686ad962ce77c854 btrfs: make locking assertion helpers static inline date: 7 months ago config: s390-randconfig-s031-20200623 (attached as .config) compiler: s390-linux-gcc (GCC) 9.3.0 reproduce: # apt-get install sparse # sparse version: v0.6.2-dirty git checkout d6156218bec93965b6a43ba2686ad962ce77c854 # save the attached .config to linux build tree make W=1 C=1 ARCH=s390 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot sparse warnings: (new ones prefixed by >>) >> arch/s390/include/asm/atomic.h:51:35: sparse: sparse: context imbalance in >> 'btrfs_set_lock_blocking_read' - unexpected unlock fs/btrfs/locking.c:25:9: sparse: sparse: context imbalance in 'btrfs_set_lock_blocking_write' - unexpected unlock fs/btrfs/locking.c:127:6: sparse: sparse: context imbalance in 'btrfs_tree_read_lock' - different lock contexts for basic block fs/btrfs/locking.c:166:5: sparse: sparse: context imbalance in 'btrfs_tree_read_lock_atomic' - different lock contexts for basic block fs/btrfs/locking.c:186:5: sparse: sparse: context imbalance in 'btrfs_try_tree_read_lock' - different lock contexts for basic block fs/btrfs/locking.c:208:5: sparse: sparse: context imbalance in 'btrfs_try_tree_write_lock' - different lock contexts for basic block >> arch/s390/include/asm/atomic.h:51:35: sparse: sparse: context imbalance in >> 'btrfs_tree_read_unlock' - unexpected unlock fs/btrfs/locking.c:19:9: sparse: sparse: context imbalance in 'btrfs_tree_lock' - wrong count at exit fs/btrfs/locking.c:25:9: sparse: sparse: context imbalance in 'btrfs_tree_unlock' - unexpected unlock vim +/btrfs_set_lock_blocking_read +51 arch/s390/include/asm/atomic.h 56fefbbc3f13ad8 Peter Zijlstra 2016-04-18 46 5692e4d11cf5b51 Heiko Carstens 2013-09-16 47 static inline void 
atomic_add(int i, atomic_t *v) 5692e4d11cf5b51 Heiko Carstens 2013-09-16 48 { 5692e4d11cf5b51 Heiko Carstens 2013-09-16 49 #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES 5692e4d11cf5b51 Heiko Carstens 2013-09-16 50 if (__builtin_constant_p(i) && (i > -129) && (i < 128)) { 126b30c3cb476ce Martin Schwidefsky 2016-11-11 @51 __atomic_add_const(i, &v->counter); 0ccc8b7ac860533 Heiko Carstens 2014-03-20 52 return; 5692e4d11cf5b51 Heiko Carstens 2013-09-16 53 } 5692e4d11cf5b51 Heiko Carstens 2013-09-16 54 #endif 126b30c3cb476ce Martin Schwidefsky 2016-11-11 55 __atomic_add(i, &v->counter); 5692e4d11cf5b51 Heiko Carstens 2013-09-16 56 } 5692e4d11cf5b51 Heiko Carstens 2013-09-16 57 :: The code at line 51 was first introduced by commit :: 126b30c3cb476ce68489a657a7defb8e73775e6f s390/atomic: refactor atomic primitives :: TO: Martin Schwidefsky :: CC: Martin Schwidefsky --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH 5.4 000/314] 5.4.49-rc1 review
On Tue, Jun 23, 2020 at 09:53:15PM +0200, Greg Kroah-Hartman wrote: > This is the start of the stable review cycle for the 5.4.49 release. > There are 314 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. > > Responses should be made by Thu, 25 Jun 2020 19:52:30 +. > Anything received after that time might be too late. > > The whole patch series can be found in one patch at: > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.49-rc1.gz > or in the git tree and branch at: > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > linux-5.4.y > and the diffstat can be found below. > > thanks, > > greg k-h I ran this through my LLVM build tests and saw no regressions compared to 5.4.47. Cheers, Nathan
Subject: [PATCH v3 1/2] USB: serial: cp210x: Enable usb generic throttle/unthrottle
Assign the .throttle and .unthrottle functions in the driver structure to the generic functions to prevent data loss that can otherwise occur if the host does not enable USB throttling. Signed-off-by: Phu Luu An Signed-off-by: Brant Merryman --- 06/09/2020: Patch v3 1/2 Modified based on feedback from Johan Hovold 12/18/2019: Patch v2 Broken into two patches and modified based on feedback from Johan Hovold 12/09/2019: Initial submission of patch "Proper RTS control when buffers fill" drivers/usb/serial/cp210x.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c index f5143eedbc48..bcceb4ad8be0 100644 --- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -272,6 +272,8 @@ static struct usb_serial_driver cp210x_device = { .break_ctl = cp210x_break_ctl, .set_termios = cp210x_set_termios, .tx_empty = cp210x_tx_empty, + .throttle = usb_serial_generic_throttle, + .unthrottle = usb_serial_generic_unthrottle, .tiocmget = cp210x_tiocmget, .tiocmset = cp210x_tiocmset, .attach = cp210x_attach, -- 2.17.0
Re: linux-next: manual merge of the rcu tree with the tip tree
On Wed, Jun 24, 2020 at 01:04:50PM +1000, Stephen Rothwell wrote: > Hi all, > > Today's linux-next merge of the rcu tree got a conflict in: > > kernel/sched/core.c > > between commit: > > 964ed98b0752 ("sched/core: Fix ttwu() race") > > from the tip tree and commit: > > 3c88d09bfb1b ("EXP sched: Alleged fix for v5.8 merge-window scheduler > issue") > > from the rcu tree. > > I fixed it up (I used the version from the tip tree) and can carry the > fix as necessary. This is now fixed as far as linux-next is concerned, > but any non trivial conflicts should be mentioned to your upstream > maintainer when your tree is submitted for merging. You may also want > to consider cooperating with the maintainer of the conflicting tree to > minimise any particularly complex conflicts. Gah. I will move my copy of this patch out of the rcu/next batch. I included it so that I could find other bugs. ;-) Thanx, Paul
Re: [PATCH 2/2] Documentation/litmus-tests: Add note on herd7 7.56 in atomic litmus test
On Wed, Jun 24, 2020 at 01:24:25AM +0200, Andrea Parri wrote: > On Wed, Jun 24, 2020 at 07:09:01AM +0900, Akira Yokosawa wrote: > > From f808c371075d2f92b955da1a83ecb3828db1972e Mon Sep 17 00:00:00 2001 > > From: Akira Yokosawa > > Date: Wed, 24 Jun 2020 06:59:26 +0900 > > Subject: [PATCH 2/2] Documentation/litmus-tests: Add note on herd7 7.56 in > > atomic litmus test > > > > herdtools 7.56 has enhanced herd7's C parser so that the "(void)expr" > > construct in Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus is > > accepted. > > > > This is independent of LKMM's cat model, so mention the required > > version in the header of the litmus test and its entry in README. > > > > CC: Boqun Feng > > Reported-by: Andrea Parri > > Signed-off-by: Akira Yokosawa > > Frankly, I was hoping that we could simply bump the herd7 version in > tools/memory-model/README; I understand your point, but I admit that > I haven't being playing with 7.52 for a while now... Maybe in a few years it will no longer be relevant, and could then be removed? > Acked-by: Andrea Parri I queued both, thank you both! Thanx, Paul > Andrea > > > > --- > > Documentation/litmus-tests/README| 1 + > > .../atomic/Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus | 1 + > > 2 files changed, 2 insertions(+) > > > > diff --git a/Documentation/litmus-tests/README > > b/Documentation/litmus-tests/README > > index b79e640214b9..7f5c6c3ed6c3 100644 > > --- a/Documentation/litmus-tests/README > > +++ b/Documentation/litmus-tests/README > > @@ -19,6 +19,7 @@ > > Atomic-RMW+mb__after_atomic-is-stronger-than-acquire.litmus > > > > Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus > > Test that atomic_set() cannot break the atomicity of atomic RMWs. > > +NOTE: Require herd7 7.56 or later which supports "(void)expr". 
> > > > > > RCU (/rcu directory) > > diff --git > > a/Documentation/litmus-tests/atomic/Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus > > > > b/Documentation/litmus-tests/atomic/Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus > > index 49385314d911..ffd4d3e79c4a 100644 > > --- > > a/Documentation/litmus-tests/atomic/Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus > > +++ > > b/Documentation/litmus-tests/atomic/Atomic-RMW-ops-are-atomic-WRT-atomic_set.litmus > > @@ -4,6 +4,7 @@ C Atomic-RMW-ops-are-atomic-WRT-atomic_set > > * Result: Never > > * > > * Test that atomic_set() cannot break the atomicity of atomic RMWs. > > + * NOTE: This requires herd7 7.56 or later which supports "(void)expr". > > *) > > > > { > > -- > > 2.17.1 > > > >
Re: reset-brcmstb-rescal.c:undefined reference to `devm_ioremap_resource'
On 6/23/2020 6:26 PM, kernel test robot wrote: > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > master > head: 3e08a95294a4fb3702bb3d35ed08028433c37fe6 > commit: 4cf176e52397853e4a4dd37e917c5eafb47ba8d1 reset: Add Broadcom STB > RESCAL reset controller > date: 6 months ago > config: um-allmodconfig (attached as .config) > compiler: gcc-9 (Debian 9.3.0-13) 9.3.0 > reproduce (this is a W=1 build): > git checkout 4cf176e52397853e4a4dd37e917c5eafb47ba8d1 > # save the attached .config to linux build tree > make W=1 ARCH=um > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > > All errors (new ones prefixed by >>): > >/usr/bin/ld: arch/um/drivers/vde.o: in function `vde_open_real': >(.text+0x9cb): warning: Using 'getgrnam' in statically linked applications > requires at runtime the shared libraries from the glibc version used for > linking >/usr/bin/ld: (.text+0x61d): warning: Using 'getpwuid' in statically linked > applications requires at runtime the shared libraries from the glibc version > used for linking >/usr/bin/ld: arch/um/drivers/vector_user.o: in function > `user_init_socket_fds': >vector_user.c:(.text+0x53a): warning: Using 'getaddrinfo' in statically > linked applications requires at runtime the shared libraries from the glibc > version used for linking >/usr/bin/ld: arch/um/drivers/pcap.o: in function `pcap_nametoaddr': >(.text+0x10095): warning: Using 'gethostbyname' in statically linked > applications requires at runtime the shared libraries from the glibc version > used for linking >/usr/bin/ld: arch/um/drivers/pcap.o: in function `pcap_nametonetaddr': >(.text+0x10155): warning: Using 'getnetbyname' in statically linked > applications requires at runtime the shared libraries from the glibc version > used for linking >/usr/bin/ld: arch/um/drivers/pcap.o: in function `pcap_nametoproto': >(.text+0x10395): warning: Using 'getprotobyname' in statically linked > applications requires 
at runtime the shared libraries from the glibc version > used for linking >/usr/bin/ld: arch/um/drivers/pcap.o: in function `pcap_nametoport': >(.text+0x1018b): warning: Using 'getservbyname' in statically linked > applications requires at runtime the shared libraries from the glibc version > used for linking >/usr/bin/ld: drivers/reset/reset-brcmstb-rescal.o: in function > `brcm_rescal_reset_probe': >>> reset-brcmstb-rescal.c:(.text+0x21b): undefined reference to >>> `devm_ioremap_resource' >/usr/bin/ld: drivers/reset/reset-intel-gw.o: in function > `intel_reset_probe': >reset-intel-gw.c:(.text+0x5be): undefined reference to > `devm_platform_ioremap_resource' >collect2: error: ld returned 1 exit status This is a late report, it was fixed upstream with: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7fbcc53514c5e410dd598685c5fbbe1e6a5a87e1 -- Florian
[PATCH] powerpc/boot: Use address-of operator on section symbols
Clang warns: arch/powerpc/boot/main.c:107:18: warning: array comparison always evaluates to a constant [-Wtautological-compare] if (_initrd_end > _initrd_start) { ^ arch/powerpc/boot/main.c:155:20: warning: array comparison always evaluates to a constant [-Wtautological-compare] if (_esm_blob_end <= _esm_blob_start) ^ 2 warnings generated. These are not true arrays, they are linker defined symbols, which are just addresses. Using the address of operator silences the warning and does not change the resulting assembly with either clang/ld.lld or gcc/ld (tested with diff + objdump -Dr). Link: https://github.com/ClangBuiltLinux/linux/issues/212 Reported-by: Joel Stanley Signed-off-by: Nathan Chancellor --- arch/powerpc/boot/main.c | 4 ++-- arch/powerpc/boot/ps3.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/boot/main.c b/arch/powerpc/boot/main.c index a9d209135975..cae31a6e8f02 100644 --- a/arch/powerpc/boot/main.c +++ b/arch/powerpc/boot/main.c @@ -104,7 +104,7 @@ static struct addr_range prep_initrd(struct addr_range vmlinux, void *chosen, { /* If we have an image attached to us, it overrides anything * supplied by the loader. */ - if (_initrd_end > _initrd_start) { + if (&_initrd_end > &_initrd_start) { printf("Attached initrd image at 0x%p-0x%p\n\r", _initrd_start, _initrd_end); initrd_addr = (unsigned long)_initrd_start; @@ -152,7 +152,7 @@ static void prep_esm_blob(struct addr_range vmlinux, void *chosen) unsigned long esm_blob_addr, esm_blob_size; /* Do we have an ESM (Enter Secure Mode) blob? 
*/ - if (_esm_blob_end <= _esm_blob_start) + if (&_esm_blob_end <= &_esm_blob_start) return; printf("Attached ESM blob at 0x%p-0x%p\n\r", diff --git a/arch/powerpc/boot/ps3.c b/arch/powerpc/boot/ps3.c index c52552a681c5..6e4efbdb6b7c 100644 --- a/arch/powerpc/boot/ps3.c +++ b/arch/powerpc/boot/ps3.c @@ -127,7 +127,7 @@ void platform_init(void) ps3_repository_read_rm_size(&rm_size); dt_fixup_memory(0, rm_size); - if (_initrd_end > _initrd_start) { + if (&_initrd_end > &_initrd_start) { setprop_val(chosen, "linux,initrd-start", (u32)(_initrd_start)); setprop_val(chosen, "linux,initrd-end", (u32)(_initrd_end)); } base-commit: 3e08a95294a4fb3702bb3d35ed08028433c37fe6 -- 2.27.0
Re: [PATCH] mm/spase: never partially remove memmap for early section
On Wed, Jun 24, 2020 at 11:52:36AM +0800, Baoquan He wrote: >On 06/24/20 at 11:46am, Wei Yang wrote: >> On Wed, Jun 24, 2020 at 09:47:37AM +0800, Baoquan He wrote: >> >On 06/23/20 at 05:21pm, Dan Williams wrote: >> >> On Tue, Jun 23, 2020 at 2:43 AM Wei Yang >> >> wrote: >> >> > >> >> > For early sections, we assumes its memmap will never be partially >> >> > removed. But current behavior breaks this. >> >> >> >> Where do we assume that? >> >> >> >> The primary use case for this was mapping pmem that collides with >> >> System-RAM in the same 128MB section. That collision will certainly be >> >> depopulated on-demand depending on the state of the pmem device. So, >> >> I'm not understanding the problem or the benefit of this change. >> > >> >I was also confused when review this patch, the patch log is a little >> >short and simple. From the current code, with SPARSE_VMEMMAP enabled, we >> >do build memmap for the whole memory section during boot, even though >> >some of them may be partially populated. We just mark the subsection map >> >for present pages. >> > >> >Later, if pmem device is mapped into the partially boot memory section, >> >we just fill the relevant subsection map, do return directly, w/o building >> >the memmap for it, in section_activate(). Because the memmap for the >> >unpresent RAM part have been there. I guess this is what Wei is trying to >> >do to keep the behaviour be consistent for pmem device adding, or >> >pmem device removing and later adding again. >> > >> >Please correct me if I am wrong. >> >> You are right here. >> >> > >> >To me, fixing it looks good. But a clear doc or code comment is >> >necessary so that people can understand the code with less time. >> >Leaving it as is doesn't cause harm. I personally tend to choose >> >the former. >> > >> >> The former is to add a clear doc? > >Sorry for the confusion. The former means the fix in your patch. Maybe a >improved log and some code comment adding can make it more perfect. 
> Sure, I will try to add more log text and comments; in case you have any good suggestions, just let me know :) -- Wei Yang Help you, Help me
Re: [PATCH] mm/spase: never partially remove memmap for early section
On 06/24/20 at 11:46am, Wei Yang wrote: > On Wed, Jun 24, 2020 at 09:47:37AM +0800, Baoquan He wrote: > >On 06/23/20 at 05:21pm, Dan Williams wrote: > >> On Tue, Jun 23, 2020 at 2:43 AM Wei Yang > >> wrote: > >> > > >> > For early sections, we assumes its memmap will never be partially > >> > removed. But current behavior breaks this. > >> > >> Where do we assume that? > >> > >> The primary use case for this was mapping pmem that collides with > >> System-RAM in the same 128MB section. That collision will certainly be > >> depopulated on-demand depending on the state of the pmem device. So, > >> I'm not understanding the problem or the benefit of this change. > > > >I was also confused when review this patch, the patch log is a little > >short and simple. From the current code, with SPARSE_VMEMMAP enabled, we > >do build memmap for the whole memory section during boot, even though > >some of them may be partially populated. We just mark the subsection map > >for present pages. > > > >Later, if pmem device is mapped into the partially boot memory section, > >we just fill the relevant subsection map, do return directly, w/o building > >the memmap for it, in section_activate(). Because the memmap for the > >unpresent RAM part have been there. I guess this is what Wei is trying to > >do to keep the behaviour be consistent for pmem device adding, or > >pmem device removing and later adding again. > > > >Please correct me if I am wrong. > > You are right here. > > > > >To me, fixing it looks good. But a clear doc or code comment is > >necessary so that people can understand the code with less time. > >Leaving it as is doesn't cause harm. I personally tend to choose > >the former. > > > > The former is to add a clear doc? Sorry for the confusion. The former means the fix in your patch. Maybe a improved log and some code comment adding can make it more perfect. > > > paging_init() > > ->sparse_init() > > ->sparse_init_nid() > > { > > ... 
> > for_each_present_section_nr(pnum_begin, pnum) { > > ... > > map = __populate_section_memmap(pfn, > > PAGES_PER_SECTION, > > nid, NULL); > > ... > > } > > } > > ... > > ->zone_sizes_init() > > ->free_area_init() > > { > > for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, > > &end_pfn, &nid) { > > subsection_map_init(start_pfn, end_pfn - > > start_pfn); > > } > > } > > > > __add_pages() > > ->sparse_add_section() > > ->section_activate() > > { > > ... > > fill_subsection_map(); > > if (nr_pages < PAGES_PER_SECTION && > > early_section(ms)) <--* > > return pfn_to_page(pfn); > > ... > > } > >> > > -- > Wei Yang > Help you, Help me >
Re: Add MediaTek MT6873 devapc driver
On Tue, 2020-06-09 at 11:32 -0600, Rob Herring wrote: > On Tue, Jun 09, 2020 at 06:24:19PM +0800, Neal Liu wrote: > > This patch series introduces a MediaTek MT6873 devapc driver. > > > > MT6873 bus fabric provides TrustZone security support and data > > protection to prevent slaves from being accessed by unexpected > > masters. > > The security violations are logged and sent to the processor for > > further analysis or countermeasures. > > > > Any occurrence of a security violation would raise an interrupt, and > > it will be handled by the devapc-mt6873 driver. The violation > > information is printed in order to find the murderer. > > There's also a proposed driver with similar functionality[1]. Come up > with a common solution. > > Rob > > [1] > https://lore.kernel.org/linux-arm-kernel/20200128153806.7780-1-benjamin.gaign...@st.com/ Actually, MediaTek devapc HW does similar things, but the real "firewall" functionality is implemented in TrustZone instead of REE. This devapc-mt6873 driver mainly handles the violations. The bus firewall framework does not seem to cover this part.
RE: [PATCH 1/1] power_supply: wilco_ec: Add permanent long life charging mode
> > > Since this is normally only done in the factory context, can you > > > please confirm does something need to be artificially done to block > > > userland from trying to set the battery charging to this mode? Or > > > will the EC already > > handle > > > blocking it directly? > > > > This is a feature of the battery, when the EC receives this setting it > > will be sent to the battery and stored there. Afterwards any attempt > > to change this mode will return a failure. > > Sorry this still isn't clear to me. You're saying that if EC receives > longlife > setting it will be able to do this in the field? If so, then I think this > patch will > need to block that setting to not allow field conversions into longlife mode. > The EC does handle blocking attempts to change the mode. The EC reads the current mode before applying a new setting. If it sees permanent long life already in use, then any attempt to put the battery charging into a different mode will get failure 0x01 from the EC. > > > > It's even better to block mode selection in the userland if long life > > mode already in use. > > Yes, I agree. This sounds like a good follow-up to me too.
Re: [PATCH 1/2] selftests/lkdtm: Don't clear dmesg when running tests
On 6/22/20 4:51 AM, Naresh Kamboju wrote: On Fri, 8 May 2020 at 12:23, Michael Ellerman wrote: It is Very Rude to clear dmesg in test scripts. That's because the script may be part of a larger test run, and clearing dmesg potentially destroys the output of other tests. We can avoid using dmesg -c by saving the content of dmesg before the test, and then using diff to compare that to the dmesg afterward, producing a log with just the added lines. Signed-off-by: Michael Ellerman --- tools/testing/selftests/lkdtm/run.sh | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/lkdtm/run.sh b/tools/testing/selftests/lkdtm/run.sh index dadf819148a4..0b409e187c7b 100755 --- a/tools/testing/selftests/lkdtm/run.sh +++ b/tools/testing/selftests/lkdtm/run.sh @@ -59,23 +59,25 @@ if [ -z "$expect" ]; then expect="call trace:" fi -# Clear out dmesg for output reporting -dmesg -c >/dev/null - # Prepare log for report checking -LOG=$(mktemp --tmpdir -t lkdtm-XX) +LOG=$(mktemp --tmpdir -t lkdtm-log-XX) +DMESG=$(mktemp --tmpdir -t lkdtm-dmesg-XX) cleanup() { - rm -f "$LOG" + rm -f "$LOG" "$DMESG" } trap cleanup EXIT +# Save existing dmesg so we can detect new content below +dmesg > "$DMESG" + # Most shells yell about signals and we're expecting the "cat" process # to usually be killed by the kernel. So we have to run it in a sub-shell # and silence errors. ($SHELL -c 'cat <(echo '"$test"') >'"$TRIGGER" 2>/dev/null) || true # Record and dump the results -dmesg -c >"$LOG" +dmesg | diff --changed-group-format='%>' --unchanged-group-format='' "$DMESG" - > "$LOG" || true We are facing problems with the diff `=%>` part of the option. This report is from the OpenEmbedded environment. We have the same problem from livepatch_testcases. # selftests lkdtm BUG.sh lkdtm: BUG.sh_ # # diff unrecognized option '--changed-group-format=%>' unrecognized: option_'--changed-group-format=%>' # # BusyBox v1.27.2 (2020-03-30 164108 UTC) multi-call binary. 
v1.27.2: (2020-03-30_164108 # # : _ # # Usage diff [-abBdiNqrTstw] [-L LABEL] [-S FILE] [-U LINES] FILE1 FILE2 diff: [-abBdiNqrTstw]_[-L # # BUG missing 'kernel BUG at' [FAIL] Full test output log, https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20200621/testrun/2850083/suite/kselftest/test/lkdtm_BUG.sh/log D'oh! Using diff's changed/unchanged group format was a nice trick to easily fetch the new kernel log messages. I can't think of any simple alternative off the top of my head, so here's a kludgy tested-once awk script: SAVED_DMESG="$(dmesg | tail -n1)" ... tests ... NEW_DMESG=$(dmesg | awk -v last="$SAVED_DMESG" 'p; $0 == last{p=1}') I think timestamps should make each log line unique, but this probably won't handle kernel log buffer overflow. Maybe it would be easier to log a known unique test delimiter msg and then fetch all new messages after that? -- Joe
Re: [PATCH] mm/spase: never partially remove memmap for early section
On Wed, Jun 24, 2020 at 09:47:37AM +0800, Baoquan He wrote: >On 06/23/20 at 05:21pm, Dan Williams wrote: >> On Tue, Jun 23, 2020 at 2:43 AM Wei Yang >> wrote: >> > >> > For early sections, we assumes its memmap will never be partially >> > removed. But current behavior breaks this. >> >> Where do we assume that? >> >> The primary use case for this was mapping pmem that collides with >> System-RAM in the same 128MB section. That collision will certainly be >> depopulated on-demand depending on the state of the pmem device. So, >> I'm not understanding the problem or the benefit of this change. > >I was also confused when review this patch, the patch log is a little >short and simple. From the current code, with SPARSE_VMEMMAP enabled, we >do build memmap for the whole memory section during boot, even though >some of them may be partially populated. We just mark the subsection map >for present pages. > >Later, if pmem device is mapped into the partially boot memory section, >we just fill the relevant subsection map, do return directly, w/o building >the memmap for it, in section_activate(). Because the memmap for the >unpresent RAM part have been there. I guess this is what Wei is trying to >do to keep the behaviour be consistent for pmem device adding, or >pmem device removing and later adding again. > >Please correct me if I am wrong. You are right here. > >To me, fixing it looks good. But a clear doc or code comment is >necessary so that people can understand the code with less time. >Leaving it as is doesn't cause harm. I personally tend to choose >the former. > The former is to add a clear doc? > paging_init() > ->sparse_init() > ->sparse_init_nid() > { > ... > for_each_present_section_nr(pnum_begin, pnum) { > ... > map = __populate_section_memmap(pfn, > PAGES_PER_SECTION, > nid, NULL); > ... > } > } > ... 
> > ->zone_sizes_init() > > ->free_area_init() > > { > > for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, > > &end_pfn, &nid) { > > subsection_map_init(start_pfn, end_pfn - start_pfn); > > } > > } > > > > __add_pages() > > ->sparse_add_section() > > ->section_activate() > > { > > ... > > fill_subsection_map(); > > if (nr_pages < PAGES_PER_SECTION && early_section(ms)) > > <--* > > return pfn_to_page(pfn); > > ... > > } >> -- > Wei Yang > Help you, Help me >
RE: [EXT] Re: [PATCH v1] ARM: imx6plus: enable internal routing of clk_enet_ref where possible
From: Sven Van Asbroeck Sent: Wednesday, June 24, 2020 10:56 AM > Hi Andy, > > On Tue, Jun 23, 2020 at 9:40 PM Andy Duan wrote: > > > > The patch looks good. > > Reviewed-by: Fugang Duan > > Thank you ! > > To check we're on a plus, the patch uses: > cpu_is_imx6q() && imx_get_soc_revision() >= IMX_CHIP_REVISION_2_0 > > Fabio Estevam suggested that perhaps checking > of_machine_is_compatible("fsl,imx6qp") > might be preferable. What is your opinion on this? Yes, I also think of_machine_is_compatible() is much better. Thanks.
ERROR: "min_low_pfn" undefined!
Hi Jiasen, First bad commit (maybe != root cause): tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 3e08a95294a4fb3702bb3d35ed08028433c37fe6 commit: 99a06056124dcf5cfc4c95278b86c6ff96aaa1ec NTB: ntb_perf: Fix address err in perf_copy_chunk date: 3 months ago config: microblaze-randconfig-c021-20200623 (attached as .config) compiler: microblaze-linux-gcc (GCC) 9.3.0 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): ERROR: "min_low_pfn" [net/mac80211/mac80211.ko] undefined! ERROR: "min_low_pfn" [drivers/ntb/ntb_transport.ko] undefined! >> ERROR: "min_low_pfn" [drivers/ntb/test/ntb_perf.ko] undefined! ERROR: "min_low_pfn" [drivers/rpmsg/virtio_rpmsg_bus.ko] undefined! ERROR: "min_low_pfn" [drivers/mtd/nand/raw/nand.ko] undefined! ERROR: "min_low_pfn" [drivers/mmc/core/mmc_core.ko] undefined! ERROR: "min_low_pfn" [crypto/essiv.ko] undefined! ERROR: "min_low_pfn" [crypto/ccm.ko] undefined! --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH RESEND] net/cisco: Fix a sleep-in-atomic-context bug in enic_init_affinity_hint()
On 06/24/2020 11:23 AM, David Miller wrote: From: Kaige Li Date: Wed, 24 Jun 2020 09:56:47 +0800 On 06/24/2020 06:26 AM, David Miller wrote: From: David Miller Date: Tue, 23 Jun 2020 14:33:11 -0700 (PDT) Calling a NIC driver open function from a context holding a spinlock is very much the real problem, so many operations have to sleep and in fact that ->ndo_open() method is defined as being allowed to sleep and that's why the core networking never invokes it with spinlocks I mean "without" of course. :-) held. Did you mean that the open function should be called outside the spinlock? If so, I will send a V2 patch. Yes, but only if that is safe. You have to analyze the locking done by this driver and fix it properly. I anticipate it is not just a matter of changing where the spinlock is held, you will have to rearchitect things a bit. Okay, I will carefully analyze this question, and make a suitable patch in V2. Thank you.
[for-linus][PATCH 4/4] tracing/boottime: Fix kprobe multiple events
From: Sascha Ortmann Fix boottime kprobe events to report and abort after each failure when adding probes. As an example, when we try to set multiprobe kprobe events in bootconfig like this: ftrace.event.kprobes.vfsevents { probes = "vfs_read $arg1 $arg2,, !error! not reported;?", // leads to error "vfs_write $arg1 $arg2" } This will not work as expected. After commit da0f1f4167e3af69e ("tracing/boottime: Fix kprobe event API usage"), the function trace_boot_add_kprobe_event will not produce any error message when adding a probe fails at kprobe_event_gen_cmd_start. Furthermore, we continue to add probes when kprobe_event_gen_cmd_end fails (and kprobe_event_gen_cmd_start did not fail). In this case the function even returns successfully when the last call to kprobe_event_gen_cmd_end is successful. The behaviour of reporting and aborting after failures is not consistent. The function trace_boot_add_kprobe_event now reports each failure and stops adding probes immediately. Link: https://lkml.kernel.org/r/20200618163301.25854-1-sascha.ortm...@stud.uni-hannover.de Cc: sta...@vger.kernel.org Cc: linux-ker...@i4.cs.fau.de Co-developed-by: Maximilian Werner Fixes: da0f1f4167e3 ("tracing/boottime: Fix kprobe event API usage") Acked-by: Masami Hiramatsu Signed-off-by: Maximilian Werner Signed-off-by: Sascha Ortmann Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/trace_boot.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c index 8b5490cb02bb..fa0fc08c6ef8 100644 --- a/kernel/trace/trace_boot.c +++ b/kernel/trace/trace_boot.c @@ -101,12 +101,16 @@ trace_boot_add_kprobe_event(struct xbc_node *node, const char *event) kprobe_event_cmd_init(&cmd, buf, MAX_BUF_LEN); ret = kprobe_event_gen_cmd_start(&cmd, event, val); - if (ret) + if (ret) { + pr_err("Failed to generate probe: %s\n", buf); break; + } ret = kprobe_event_gen_cmd_end(&cmd); - if (ret) + if (ret) { pr_err("Failed to add probe: %s\n", buf); + break;
+ } } return ret; -- 2.26.2
[for-linus][PATCH 0/4] tracing: Fixes for 5.8-rc
Masami Hiramatsu (2): tracing/boot: Fix config dependency for synthedic event tracing: Fix event trigger to accept redundant spaces Sascha Ortmann (1): tracing/boottime: Fix kprobe multiple events Steven Rostedt (VMware) (1): ring-buffer: Zero out time extend if it is nested and not absolute kernel/trace/ring_buffer.c | 2 +- kernel/trace/trace_boot.c | 10 +++--- kernel/trace/trace_events_trigger.c | 21 +++-- 3 files changed, 27 insertions(+), 6 deletions(-)
[for-linus][PATCH 1/4] ring-buffer: Zero out time extend if it is nested and not absolute
From: "Steven Rostedt (VMware)" Currently the ring buffer makes events that happen in interrupts that preempt another event have a delta of zero. (Hopefully we can change this soon). But this is to deal with the races of updating a global counter with lockless and nesting functions updating deltas. With the addition of absolute time stamps, the time extend didn't follow this rule. A time extend can happen if two events happen longer than 2^27 nanoseconds apart, as the delta time field in each event is only 27 bits. If that happens, then a time extend is injected with 2^59 bits of nanoseconds to use (18 years). But if the 2^27 nanoseconds happen between two events, and as it is writing the event, an interrupt triggers, it will see the 2^27 difference as well and inject a time extend of its own. But a recent change made the time extend logic not take into account the nesting, and this can cause two time extend deltas to happen moving the time stamp much further ahead than the current time. This gets all reset when the ring buffer moves to the next page, but that can cause time to appear to go backwards. This was observed in a trace-cmd recording, and since the data is saved in a file, with trace-cmd report --debug, it was possible to see that this indeed did happen! bash-52501 110d... 81778.908247: sched_switch: bash:52501 [120] S ==> swapper/110:0 [120] [12770284:0x2e8:64] -0 110d... 81778.908757: sched_switch: swapper/110:0 [120] R ==> bash:52501 [120] [509947:0x32c:64] TIME EXTEND: delta:306454770 length:0 bash-52501 110 81779.215212: sched_swap_numa: src_pid=52501 src_tgid=52388 src_ngid=52501 src_cpu=110 src_nid=2 dst_pid=52509 dst_tgid=52388 dst_ngid=52501 dst_cpu=49 dst_nid=1 [0:0x378:48] TIME EXTEND: delta:306458165 length:0 bash-52501 110dNh. 81779.521670: sched_wakeup: migration/110:565 [0] success=1 CPU:110 [0:0x3b4:40] and at the next page, caused the time to go backwards: bash-52504 110d...
81779.685411: sched_switch: bash:52504 [120] S ==> swapper/110:0 [120] [8347057:0xfb4:64] CPU:110 [SUBBUFFER START] [81779379165886:0x132] -0 110dN.. 81779.379166: sched_wakeup: bash:52504 [120] success=1 CPU:110 [0:0x10:40] -0 110d... 81779.379167: sched_switch: swapper/110:0 [120] R ==> bash:52504 [120] [1168:0x3c:64] Link: https://lkml.kernel.org/r/20200622151815.345d1...@oasis.local.home Cc: Ingo Molnar Cc: Andrew Morton Cc: Tom Zanussi Cc: sta...@vger.kernel.org Fixes: dc4e2801d400b ("ring-buffer: Redefine the unimplemented RINGBUF_TYPE_TIME_STAMP") Reported-by: Julia Lawall Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/ring_buffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index b8e1ca48be50..00867ff82412 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -2427,7 +2427,7 @@ rb_update_event(struct ring_buffer_per_cpu *cpu_buffer, if (unlikely(info->add_timestamp)) { bool abs = ring_buffer_time_stamp_abs(cpu_buffer->buffer); - event = rb_add_time_stamp(event, info->delta, abs); + event = rb_add_time_stamp(event, abs ? info->delta : delta, abs); length -= RB_LEN_TIME_EXTEND; delta = 0; } -- 2.26.2
[for-linus][PATCH 2/4] tracing/boot: Fix config dependency for synthedic event
From: Masami Hiramatsu Since commit 726721a51838 ("tracing: Move synthetic events to a separate file") decoupled synthetic events from histograms, boot-time tracing also has to check CONFIG_SYNTH_EVENTS instead of CONFIG_HIST_TRIGGERS. Link: http://lkml.kernel.org/r/159262475441.185015.5300725180746017555.stgit@devnote2 Fixes: 726721a51838 ("tracing: Move synthetic events to a separate file") Reviewed-by: Tom Zanussi Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/trace_boot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c index 9de29bb45a27..8b5490cb02bb 100644 --- a/kernel/trace/trace_boot.c +++ b/kernel/trace/trace_boot.c @@ -120,7 +120,7 @@ trace_boot_add_kprobe_event(struct xbc_node *node, const char *event) } #endif -#ifdef CONFIG_HIST_TRIGGERS +#ifdef CONFIG_SYNTH_EVENTS static int __init trace_boot_add_synth_event(struct xbc_node *node, const char *event) { -- 2.26.2
[for-linus][PATCH 3/4] tracing: Fix event trigger to accept redundant spaces
From: Masami Hiramatsu Fix the event trigger to accept redundant spaces in the trigger input. For example, these return -EINVAL echo " traceon" > events/ftrace/print/trigger echo "traceon  if common_pid == 0" > events/ftrace/print/trigger echo "disable_event:kmem:kmalloc " > events/ftrace/print/trigger But these are hard to find what is wrong. To fix this issue, use skip_spaces() to remove spaces in front of actual tokens, and set NULL if there is no token. Link: http://lkml.kernel.org/r/159262476352.185015.5261566783045364186.stgit@devnote2 Cc: Tom Zanussi Cc: sta...@vger.kernel.org Fixes: 85f2b08268c0 ("tracing: Add basic event trigger framework") Reviewed-by: Tom Zanussi Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/trace_events_trigger.c | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c index 3a74736da363..f725802160c0 100644 --- a/kernel/trace/trace_events_trigger.c +++ b/kernel/trace/trace_events_trigger.c @@ -216,11 +216,17 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file) int trigger_process_regex(struct trace_event_file *file, char *buff) { - char *command, *next = buff; + char *command, *next; struct event_command *p; int ret = -EINVAL; + next = buff = skip_spaces(buff); command = strsep(&next, ": \t"); + if (next) { + next = skip_spaces(next); + if (!*next) + next = NULL; + } command = (command[0] != '!') ?
command : command + 1; mutex_lock(&trigger_cmd_mutex); @@ -630,8 +636,14 @@ event_trigger_callback(struct event_command *cmd_ops, int ret; /* separate the trigger from the filter (t:n [if filter]) */ - if (param && isdigit(param[0])) + if (param && isdigit(param[0])) { trigger = strsep(&param, " \t"); + if (param) { + param = skip_spaces(param); + if (!*param) + param = NULL; + } + } trigger_ops = cmd_ops->get_trigger_ops(cmd, trigger); @@ -1368,6 +1380,11 @@ int event_enable_trigger_func(struct event_command *cmd_ops, trigger = strsep(&param, " \t"); if (!trigger) return -EINVAL; + if (param) { + param = skip_spaces(param); + if (!*param) + param = NULL; + } system = strsep(&trigger, ":"); if (!trigger) -- 2.26.2
Re: [PATCH v3 2/2] net: phy: call phy_disable_interrupts() in phy_init_hw()
Le 2020-06-23 à 20:26, Jisheng Zhang a écrit : > Call phy_disable_interrupts() in phy_init_hw() to "have a defined init > state as we don't know in which state the PHY is if the PHY driver is > loaded. We shouldn't assume that it's the chip power-on defaults, BIOS > or boot loader could have changed this. Or in case of dual-boot > systems the other OS could leave the PHY in whatever state." as pointed > out by Heiner. > > Suggested-by: Heiner Kallweit > Signed-off-by: Jisheng Zhang > --- > drivers/net/phy/phy_device.c | 7 +-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c > index 04946de74fa0..f17d397ba689 100644 > --- a/drivers/net/phy/phy_device.c > +++ b/drivers/net/phy/phy_device.c > @@ -1090,10 +1090,13 @@ int phy_init_hw(struct phy_device *phydev) > if (ret < 0) > return ret; > > - if (phydev->drv->config_init) > + if (phydev->drv->config_init) { > ret = phydev->drv->config_init(phydev); > + if (ret < 0) > + return ret; > + } > > - return ret; > + return phy_disable_interrupts(phydev); Not sure if the order makes sense here, it may seem more natural for a driver writer to have interrupts disabled first and then config_init called (which could enable interrupts not related to link management like thermal events etc.) -- Florian
Re: [PATCH v1 0/5] ethernet: dec: tulip: use generic power management
From: Vaibhav Gupta Date: Mon, 22 Jun 2020 17:12:23 +0530 > Linux Kernel Mentee: Remove Legacy Power Management. > > The purpose of this patch series is to remove legacy power management > callbacks and invocation of PCI helper functions from tulip ethernet drivers. > > With legacy PM, drivers themselves are responsible for handling the device's > power states. And they do this with the help of PCI helper functions like > pci_enable/disable_device(), pci_set/restore_state(), pci_set_power_state(), > etc. > which is not recommended. > > In generic PM, all the required tasks are handled by PCI core and drivers need > to perform device-specific operations only. > > All patches are compile-tested only. Series applied to net-next, thanks.
Re: [PATCH v2 0/3] ethernet: amd: Convert to generic power management
From: Vaibhav Gupta Date: Mon, 22 Jun 2020 16:43:57 +0530 > Linux Kernel Mentee: Remove Legacy Power Management. > > The purpose of this patch series is to remove legacy power management > callbacks > from amd ethernet drivers. > > The callbacks performing suspend() and resume() operations are still calling > pci_save_state(), pci_set_power_state(), etc. and handling the power > management > themselves, which is not recommended. > > The conversion requires the removal of the those function calls and change the > callback definition accordingly and make use of dev_pm_ops structure. > > All patches are compile-tested only. Series applied, thanks.
Re: [PATCH -V2] swap: Reduce lock contention on swap cache from swap slots allocation
"Huang, Ying" writes: > Andrew Morton writes: > >> On Wed, 20 May 2020 11:15:02 +0800 Huang Ying wrote: >> >>> In some swap scalability tests, it is found that there is heavy lock >>> contention on the swap cache even if we have split one swap cache radix >>> tree per swap device into one swap cache radix tree every 64 MB trunk in >>> commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks"). >>> >>> The reason is as follows. After the swap device becomes fragmented so >>> that there's no free swap cluster, the swap device will be scanned >>> linearly to find the free swap slots. swap_info_struct->cluster_next >>> is the next scanning base that is shared by all CPUs. So nearby free >>> swap slots will be allocated for different CPUs. The probability for >>> multiple CPUs to operate on the same 64 MB trunk is high. This causes >>> the lock contention on the swap cache. >>> >>> To solve the issue, in this patch, for SSD swap device, a percpu >>> version next scanning base (cluster_next_cpu) is added. Every CPU >>> will use its own per-cpu next scanning base. And after finishing >>> scanning a 64MB trunk, the per-cpu scanning base will be changed to >>> the beginning of another randomly selected 64MB trunk. In this way, >>> the probability for multiple CPUs to operate on the same 64 MB trunk >>> is reduced greatly. Thus the lock contention is reduced too. For >>> HDD, because sequential access is more important for IO performance, >>> the original shared next scanning base is used. >>> >>> To test the patch, we have run 16-process pmbench memory benchmark on >>> a 2-socket server machine with 48 cores. One ram disk is configured >> >> What does "ram disk" mean here? Which driver(s) are in use and backed >> by what sort of memory? > > We use the following kernel command line > > memmap=48G!6G memmap=48G!68G > > to create 2 DRAM based /dev/pmem disks (48GB each). Then we use these > ram disks as swap devices. > >>> as the swap device per socket.
The pmbench working-set size is much >>> larger than the available memory so that swapping is triggered. The >>> memory read/write ratio is 80/20 and the accessing pattern is random. >>> In the original implementation, the lock contention on the swap cache >>> is heavy. The perf profiling data of the lock contention code path is >>> as follows, >>> >>> _raw_spin_lock_irq.add_to_swap_cache.add_to_swap.shrink_page_list: 7.91 >>> _raw_spin_lock_irqsave.__remove_mapping.shrink_page_list: 7.11 >>> _raw_spin_lock.swapcache_free_entries.free_swap_slot.__swap_entry_free: 2.51 >>> _raw_spin_lock_irqsave.swap_cgroup_record.mem_cgroup_uncharge_swap: 1.66 >>> _raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node: 1.29 >>> _raw_spin_lock.free_pcppages_bulk.drain_pages_zone.drain_pages: 1.03 >>> _raw_spin_lock_irq.shrink_active_list.shrink_lruvec.shrink_node: 0.93 >>> >>> After applying this patch, it becomes, >>> >>> _raw_spin_lock.swapcache_free_entries.free_swap_slot.__swap_entry_free: 3.58 >>> _raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node: 2.3 >>> _raw_spin_lock_irqsave.swap_cgroup_record.mem_cgroup_uncharge_swap: 2.26 >>> _raw_spin_lock_irq.shrink_active_list.shrink_lruvec.shrink_node: 1.8 >>> _raw_spin_lock.free_pcppages_bulk.drain_pages_zone.drain_pages: 1.19 >>> >>> The lock contention on the swap cache is almost eliminated. >>> >>> And the pmbench score increases 18.5%. The swapin throughput >>> increases 18.7% from 2.96 GB/s to 3.51 GB/s. While the swapout >>> throughput increases 18.5% from 2.99 GB/s to 3.54 GB/s. >> >> If this was backed by plain old RAM, can we assume that the performance >> improvement on SSD swap is still good? > > We need a really fast disk to show the benefit. I have tried this on 2 > Intel P3600 NVMe disks. The performance improvement is only about 1%. > The improvement should be better on the faster disks, such as Intel > Optane disk. I will try to find some to test. I finally found 2 Intel Optane disks to test.
The pmbench throughput (page accesses per second) increases ~1.7% with the patch. The swapin throughput increases 2% (~1.36 GB/s to ~1.39 GB/s), the swapout throughput increases 1.7% (~1.61 GB/s to 1.63 GB/s). Perf profile shows the CPU cycles on the swap cache radix tree spinlock is reduced from ~1.76% to nearly 0. So the performance difference is much smaller, but still measurable. Best Regards, Huang, Ying
[PATCH] [drivers/x86] fix bound check in pmc_core_mphy_pg_show
Check bounds before accessing map[]. Signed-off-by: Gaurav Singh --- drivers/platform/x86/intel_pmc_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/platform/x86/intel_pmc_core.c b/drivers/platform/x86/intel_pmc_core.c index 7c8bdab078cf..f571f9cf7217 100644 --- a/drivers/platform/x86/intel_pmc_core.c +++ b/drivers/platform/x86/intel_pmc_core.c @@ -795,7 +795,7 @@ static int pmc_core_mphy_pg_show(struct seq_file *s, void *unused) msleep(10); val_high = pmc_core_reg_read(pmcdev, SPT_PMC_MFPMC_OFFSET); - for (index = 0; map[index].name && index < 8; index++) { + for (index = 0; index < 8 && map[index].name; index++) { seq_printf(s, "%-32s\tState: %s\n", map[index].name, map[index].bit_mask & val_low ? "Not power gated" : -- 2.17.1
Re: [PATCH v3 3/9] efi/libstub: Remove .note.gnu.property
On 2020-06-23, Kees Cook wrote: In preparation for adding --orphan-handling=warn to more architectures, make sure unwanted sections don't end up appearing under the .init section prefix that libstub adds to itself during objcopy. Signed-off-by: Kees Cook --- drivers/firmware/efi/libstub/Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile index 75daaf20374e..9d2d2e784bca 100644 --- a/drivers/firmware/efi/libstub/Makefile +++ b/drivers/firmware/efi/libstub/Makefile @@ -66,6 +66,9 @@ lib-$(CONFIG_X86) += x86-stub.o CFLAGS_arm32-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET) CFLAGS_arm64-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET) +# Remove unwanted sections first. +STUBCOPY_FLAGS-y += --remove-section=.note.gnu.property + # # For x86, bootloaders like systemd-boot or grub-efi do not zero-initialize the # .bss section, so the .bss section of the EFI stub needs to be included in the arch/arm64/Kconfig enables ARM64_PTR_AUTH by default. When the config is on: ifeq ($(CONFIG_ARM64_BTI_KERNEL),y) branch-prot-flags-$(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET_BTI) := -mbranch-protection=pac-ret+leaf+bti else branch-prot-flags-$(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET) := -mbranch-protection=pac-ret+leaf endif This option creates .note.gnu.property: % readelf -n drivers/firmware/efi/libstub/efi-stub.o Displaying notes found in: .note.gnu.property Owner  Data size  Description GNU 0x0010 NT_GNU_PROPERTY_TYPE_0 Properties: AArch64 feature: PAC If .note.gnu.property is not desired in drivers/firmware/efi/libstub, specifying -mbranch-protection=none can override -mbranch-protection=pac-ret+leaf.
Re: [PATCH] [net] dcb_doit: remove redundant skb check
From: Gaurav Singh Date: Mon, 22 Jun 2020 22:50:39 -0400 > skb cannot be NULL here since its already being accessed > before: sock_net(skb->sk). Remove the redundant null check. > > Signed-off-by: Gaurav Singh Applied.
Re: [PATCH] [net/decnet] dn_route_rcv: remove redundant dev null check
From: Gaurav Singh Date: Mon, 22 Jun 2020 23:41:19 -0400 > dev cannot be NULL here since its already being accessed > before. Remove the redundant null check. > > Signed-off-by: Gaurav Singh Applied.
Re: [PATCH][next] net: ipv6: Use struct_size() helper and kcalloc()
From: "Gustavo A. R. Silva" Date: Mon, 22 Jun 2020 18:07:41 -0500 > Make use of the struct_size() helper instead of an open-coded version > in order to avoid any potential type mistakes. Also, remove unnecessary > function ipv6_rpl_srh_alloc_size() and replace kzalloc() with kcalloc(), > which has a 2-factor argument form for multiplication. > > This code was detected with the help of Coccinelle and, audited and > fixed manually. > > Signed-off-by: Gustavo A. R. Silva Applied.
[PATCH] mm/page_alloc: fix documentation error and remove magic numbers
When I increased the upper bound of the min_free_kbytes value in ee8eb9a5fe863, I forgot to tweak the above comment to reflect the new value. This patch fixes that mistake. In addition, this patch replaces the magic number bounds with symbolic constants to clarify the logic. Suggested-by: John Hubbard Suggested-by: Vlastimil Babka Signed-off-by: Joel Savitz --- mm/page_alloc.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 48eb0f1410d4..f725addc2748 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -7832,7 +7832,7 @@ void setup_per_zone_wmarks(void) * Initialise min_free_kbytes. * * For small machines we want it small (128k min). For large machines - * we want it large (64MB max). But it is not linear, because network + * we want it large (256MB max). But it is not linear, because network * bandwidth does not increase linearly with machine size. We use * * min_free_kbytes = 4 * sqrt(lowmem_kbytes), for better accuracy: @@ -7852,6 +7852,9 @@ void setup_per_zone_wmarks(void) * 8192MB: 11584k * 16384MB:16384k */ +static const int MIN_FREE_KBYTES_LOWER_BOUND = 1 << 7; +static const int MIN_FREE_KBYTES_UPPER_BOUND = 1 << 18; + int __meminit init_per_zone_wmark_min(void) { unsigned long lowmem_kbytes; @@ -7862,10 +7865,10 @@ int __meminit init_per_zone_wmark_min(void) if (new_min_free_kbytes > user_min_free_kbytes) { min_free_kbytes = new_min_free_kbytes; - if (min_free_kbytes < 128) - min_free_kbytes = 128; - if (min_free_kbytes > 262144) - min_free_kbytes = 262144; + if (min_free_kbytes < MIN_FREE_KBYTES_LOWER_BOUND) + min_free_kbytes = MIN_FREE_KBYTES_LOWER_BOUND; + if (min_free_kbytes > MIN_FREE_KBYTES_UPPER_BOUND) + min_free_kbytes = MIN_FREE_KBYTES_UPPER_BOUND; } else { pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n", new_min_free_kbytes, user_min_free_kbytes); -- 2.23.0
[PATCH v3 1/2] net: phy: make phy_disable_interrupts() non-static
We face an issue with rtl8211f, a pin is shared between INTB and PMEB, and the PHY Register Accessible Interrupt is enabled by default, so the INTB/PMEB pin is always active in polling mode case. As Heiner pointed out "I was thinking about calling phy_disable_interrupts() in phy_init_hw(), to have a defined init state as we don't know in which state the PHY is if the PHY driver is loaded. We shouldn't assume that it's the chip power-on defaults, BIOS or boot loader could have changed this. Or in case of dual-boot systems the other OS could leave the PHY in whatever state." Make phy_disable_interrupts() non-static so that it could be used in phy_init_hw() to have a defined init state. Suggested-by: Heiner Kallweit Signed-off-by: Jisheng Zhang --- drivers/net/phy/phy.c | 2 +- include/linux/phy.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index 1de3938628f4..56cfae950472 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -840,7 +840,7 @@ static void phy_error(struct phy_device *phydev) * phy_disable_interrupts - Disable the PHY interrupts from the PHY side * @phydev: target phy_device struct */ -static int phy_disable_interrupts(struct phy_device *phydev) +int phy_disable_interrupts(struct phy_device *phydev) { int err; diff --git a/include/linux/phy.h b/include/linux/phy.h index 8c05d0fb5c00..b693b609b2f5 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -1416,6 +1416,7 @@ int phy_ethtool_ksettings_set(struct phy_device *phydev, int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd); int phy_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd); int phy_do_ioctl_running(struct net_device *dev, struct ifreq *ifr, int cmd); +int phy_disable_interrupts(struct phy_device *phydev); void phy_request_interrupt(struct phy_device *phydev); void phy_free_interrupt(struct phy_device *phydev); void phy_print_status(struct phy_device *phydev); -- 2.27.0
[PATCH v3 0/2] net: phy: call phy_disable_interrupts() in phy_init_hw()
We face an issue with rtl8211f, a pin is shared between INTB and PMEB, and the PHY Register Accessible Interrupt is enabled by default, so the INTB/PMEB pin is always active in polling mode case. As Heiner pointed out "I was thinking about calling phy_disable_interrupts() in phy_init_hw(), to have a defined init state as we don't know in which state the PHY is if the PHY driver is loaded. We shouldn't assume that it's the chip power-on defaults, BIOS or boot loader could have changed this. Or in case of dual-boot systems the other OS could leave the PHY in whatever state." patch1 makes phy_disable_interrupts() non-static so that it could be used in phy_init_hw() to have a defined init state. patch2 calls phy_disable_interrupts() in phy_init_hw() to have a defined init state. Since v2: - Don't export phy_disable_interrupts() but just make it non-static Since v1: - EXPORT the correct symbol Jisheng Zhang (2): net: phy: make phy_disable_interrupts() non-static net: phy: call phy_disable_interrupts() in phy_init_hw() drivers/net/phy/phy.c| 2 +- drivers/net/phy/phy_device.c | 7 +-- include/linux/phy.h | 1 + 3 files changed, 7 insertions(+), 3 deletions(-) -- 2.27.0
[PATCH v3 2/2] net: phy: call phy_disable_interrupts() in phy_init_hw()
Call phy_disable_interrupts() in phy_init_hw() to "have a defined init state as we don't know in which state the PHY is if the PHY driver is loaded. We shouldn't assume that it's the chip power-on defaults, BIOS or boot loader could have changed this. Or in case of dual-boot systems the other OS could leave the PHY in whatever state." as pointed out by Heiner. Suggested-by: Heiner Kallweit Signed-off-by: Jisheng Zhang --- drivers/net/phy/phy_device.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 04946de74fa0..f17d397ba689 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -1090,10 +1090,13 @@ int phy_init_hw(struct phy_device *phydev) if (ret < 0) return ret; - if (phydev->drv->config_init) + if (phydev->drv->config_init) { ret = phydev->drv->config_init(phydev); + if (ret < 0) + return ret; + } - return ret; + return phy_disable_interrupts(phydev); } EXPORT_SYMBOL(phy_init_hw); -- 2.27.0
Re: [PATCH] net: qrtr: free flow in __qrtr_node_release
From: Carl Huang Date: Tue, 23 Jun 2020 11:22:03 +0800 > @@ -168,6 +168,7 @@ static void __qrtr_node_release(struct kref *kref) > struct radix_tree_iter iter; > unsigned long flags; > void __rcu **slot; > + struct qrtr_tx_flow *flow; Please retain the reverse christmas tree ordering of local variables here. Thanks.
Re: [PATCH] fs/read_write.c: Fix memory leak in read_write.c
On Wed, Jun 24, 2020 at 11:07:03AM +0800, Peng Fan wrote: > kmemleak report: > unreferenced object 0x9802bb591d00 (size 256): > comm "ftest03", pid 24778, jiffies 4301603810 (age 490.665s) > hex dump (first 32 bytes): > 00 01 04 20 01 00 00 00 80 00 00 00 00 00 00 00 ... > f0 02 04 20 01 00 00 00 80 00 00 00 00 00 00 00 ... > backtrace: > [<50b162cb>] __kmalloc+0x234/0x438 > [<491da9c7>] rw_copy_check_uvector+0x1ac/0x1f0 > [] import_iovec+0x50/0xe8 > [ ] vfs_readv+0x50/0xb0 > [ ] do_readv+0x80/0x160 > [ ] syscall_common+0x34/0x58 > > This is because "iov" allocated by kmalloc() is not destroyed. Under normal > circumstances, "ret_pointer" should be equal to "iov". But if the previous > statements fail to execute, and the allocation is successful, then the > block of memory will not be released, because it is necessary to > determine whether they are equal. So we need to change the order. This patch doesn't make sense. It will _introduce_ a memory leak, not fix one.
Re: [PATCH RESEND] net/cisco: Fix a sleep-in-atomic-context bug in enic_init_affinity_hint()
From: Kaige Li Date: Wed, 24 Jun 2020 10:07:16 +0800 > You are right. Should I do spin_unlock before the enic_open, or remove > spin_lock in enic_reset? You need to learn how this driver's locking works and design a correct adjustment.