Re: [PATCH 03/19] fs: release anon dev_t in deactivate_locked_super

2023-09-13 Thread Al Viro
On Thu, Sep 14, 2023 at 03:37:05AM +0100, Al Viro wrote:
> On Thu, Sep 14, 2023 at 12:27:12AM +0100, Al Viro wrote:
> > On Wed, Sep 13, 2023 at 08:09:57AM -0300, Christoph Hellwig wrote:
> > > Releasing an anon dev_t is a very common thing when freeing a
> > > super_block, as that's done for basically any not block based file
> > > system (modulo the odd mtd special case).  So instead of requiring
> > > a special ->kill_sb helper and a lot of boilerplate in more complicated
> > > file systems, just release the anon dev_t in deactivate_locked_super if
> > > the super_block was using one.
> > > 
> > > As the freeing is done after the main call to kill_super_notify, this
> > > removes the need for having two slightly different call sites for it.
> > 
> > Huh?  At this stage in your series freeing is still in ->kill_sb()
> > instances, after the calls of kill_anon_super() you've turned into
> > the calls of generic_shutdown_super().
> > 
> > You do split it off into a separate method later in the series, but
> > at this point you are reopening the same UAF that had been dealt with
> > in dc3216b14160 "super: ensure valid info".
> > 
> > Either move the introduction of ->free_sb() before that one, or
> > split it into lifting put_anon_bdev() (left here) and getting rid
> > of kill_anon_super() (after ->free_sb() introduction).
> 
> Actually, looking at the final stage in the series, you still have
> kill_super_notify() done *AFTER* ->free_sb() call.  So the problem
> persists until the very end...

It's worse - look at the rationale for 2c18a63b760a "super: wait until
we passed kill super".  Basically, "don't remove from the lists
until after block device closing".  IOW, we have

* stuff that needs to be done before generic_shutdown_super() (things
  like pinned dentries on ramfs, etc.)
* generic_shutdown_super() itself (dentry/inode eviction, optionally
  ->put_super())
* stuff that needs to be done before eviction from the lists (block
  device closing, since 2c18a63b760a)
* eviction from the lists
* stuff that needs to be done *after* eviction from the lists.

BTW, this part of commit message in 2c18a63b760a is rather confused:
    Recent rework moved block device closing out of sb->put_super() and into
    sb->kill_sb() to avoid deadlocks as s_umount is held in put_super() and
    blkdev_put() can end up taking s_umount again.

That was *NOT* what a recent rework had done.  Block device closing had never
been inside ->put_super() - at no point since that (closing, that is) had been
introduced back in 0.97 ;-)  ->put_super() predates it (0.95c+).

The race is real, but the cause is not some kind of move of blkdev_put().
Your 2ea6f68932f7 "fs: use the super_block as holder when mounting file
systems" is where it actually came from.

Christoph, could you explain what the hell do we need that for?  It does
create the race in question and AFAICS 2c18a63b760a (and followups trying
to plug holes in it) had been nothing but headache.

Old logics: if mount attempt with a different fs type happens, -EBUSY
is precisely correct - we would've gotten just that if mount() came
before umount().  If the type matches, we might
    1) come before deactivate_locked_super() by umount(2).
       No problem, we succeed.
    2) come after the beginning of shutdown, but before the
       removal from the list; fine, we'll wait for the sucker to be
       unlocked (which happens in the end of generic_shutdown_super()),
       notice it's dead and create a new superblock.  Since the only
       part left on the umount side is closing the device, we are
       just fine.
    3) come after the removal from the list.  So we won't
       wait for the old superblock to be unlocked, other than that
       it's exactly the same as (2).  It doesn't matter whether we
       open the device before or after close by umount - same owner
       anyway, no -EBUSY.

Your "owner shall be the superblock" breaks that...

If you want to mess with _three_-way split of ->kill_sb(),
please start with writing down the rules re what should
go into each of those parts; such writeup should go into
Documentation/filesystems/porting anyway, even if the
split is a two-way one, BTW.
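
As a rough illustration of the ordering listed in the message above, here is a minimal userspace sketch in plain C. It models the five stages as an enum and checks that a sequence of steps never moves back to an earlier stage. The names (sb_phase, sb_order_ok) are hypothetical and are not kernel API; this is only one way the rules might be written down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical phase names mirroring the five bullets in the message;
 * they do not correspond to any real kernel enum. */
enum sb_phase {
	SB_PRE_SHUTDOWN,	/* e.g. pinned dentries on ramfs */
	SB_GENERIC_SHUTDOWN,	/* dentry/inode eviction, optional ->put_super() */
	SB_PRE_LIST_REMOVAL,	/* block device closing */
	SB_LIST_REMOVAL,	/* eviction from the lists */
	SB_POST_LIST_REMOVAL,	/* work that must follow eviction */
};

/* Return true if the steps never go back to an earlier phase. */
bool sb_order_ok(const enum sb_phase *steps, size_t n)
{
	assert(steps != NULL || n == 0);
	for (size_t i = 1; i < n; i++)
		if (steps[i] < steps[i - 1])
			return false;
	return true;
}
```

A sequence that closes the block device after list removal is rejected by this check, which is the constraint 2c18a63b760a is described as introducing.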


Re: [PATCH 03/19] fs: release anon dev_t in deactivate_locked_super

2023-09-13 Thread Al Viro
On Thu, Sep 14, 2023 at 12:27:12AM +0100, Al Viro wrote:
> On Wed, Sep 13, 2023 at 08:09:57AM -0300, Christoph Hellwig wrote:
> > Releasing an anon dev_t is a very common thing when freeing a
> > super_block, as that's done for basically any not block based file
> > system (modulo the odd mtd special case).  So instead of requiring
> > a special ->kill_sb helper and a lot of boilerplate in more complicated
> > file systems, just release the anon dev_t in deactivate_locked_super if
> > the super_block was using one.
> > 
> > As the freeing is done after the main call to kill_super_notify, this
> > removes the need for having two slightly different call sites for it.
> 
> Huh?  At this stage in your series freeing is still in ->kill_sb()
> instances, after the calls of kill_anon_super() you've turned into
> the calls of generic_shutdown_super().
> 
> You do split it off into a separate method later in the series, but
> at this point you are reopening the same UAF that had been dealt with
> in dc3216b14160 "super: ensure valid info".
> 
> Either move the introduction of ->free_sb() before that one, or
> split it into lifting put_anon_bdev() (left here) and getting rid
> of kill_anon_super() (after ->free_sb() introduction).

Actually, looking at the final stage in the series, you still have
kill_super_notify() done *AFTER* ->free_sb() call.  So the problem
persists until the very end...


Re: [PATCH 13/19] fs: convert kill_block_super to block_free_sb

2023-09-13 Thread Al Viro
On Wed, Sep 13, 2023 at 08:10:07AM -0300, Christoph Hellwig wrote:

> -static void affs_kill_sb(struct super_block *sb)
> +static void affs_free_sb(struct super_block *sb)
>  {
>   struct affs_sb_info *sbi = AFFS_SB(sb);
> - kill_block_super(sb);
> +
> + block_free_sb(sb);
>   if (sbi) {
>   affs_free_bitmap(sb);
>   affs_brelse(sbi->s_root_bh);



Yep, that's printk + brelse()...  Could we have that
block_free_sb() (an awful name aside) done after the
if (sbi) { ... } there?


Re: [PATCH 11/19] fs: add new shutdown_sb and free_sb methods

2023-09-13 Thread Al Viro
On Wed, Sep 13, 2023 at 08:10:05AM -0300, Christoph Hellwig wrote:
> Currently super_blocks are shut down using the ->kill_sb method, which
> must call generic_shutdown_super, but allows the file system to
> add extra work before or after the call to generic_shutdown_super.
> 
> File systems tend to get rather confused by this, so add an alternative
> shutdown sequence where generic_shutdown_super is called by the core
> code, and there are extra ->shutdown_sb and ->free_sb hooks before and
> after it.  To remove the amount of boilerplate code ->shutdown_sb is only
> called if the super has finished initialization and ->d_root is set.

The last sentence doesn't match the patchset.  That aside, there
is an issue with method names.

->shutdown_sb() is... odd.  ->begin_shutdown_sb(), perhaps?  For the
majority of filesystems it's NULL, after all...

Worse, ->free_sb() is seriously misguiding - the name implies that
we are, well, freeing a superblock passed to it.  Which is not what is
happening here - superblock itself is freed only when all passive
references go away.  It's asking for trouble down the road.

We already have more than enough confusion in the area.  Note, BTW,
that there's a delicate issue around RCU accesses and freeing stuff -
->d_compare() can bloody well be called when superblock is getting
shut down.  For anything that might be needed by it (or by other
RCU'd methods) we must arrange for RCU-delayed destruction.
E.g. in case of fatfs we have sbi freeing done via call_rcu() (from
fat_put_super(), called by generic_shutdown_super()).



Oh, bugger...  AFAICS, exfat has a problem - exfat_free_sbi() is called
directly from exfat_kill_sb(), without any concern for this:
    static int exfat_utf8_d_cmp(const struct dentry *dentry, unsigned int len,
                    const char *str, const struct qstr *name)
    {
            struct super_block *sb = dentry->d_sb;
            unsigned int alen = exfat_striptail_len(name->len, name->name,
                                    EXFAT_SB(sb)->options.keep_last_dots);

That kfree() needs to be RCU-delayed...  While we are at it, there's
this:
    static int exfat_d_hash(const struct dentry *dentry, struct qstr *qstr)
    {
            struct super_block *sb = dentry->d_sb;
            struct nls_table *t = EXFAT_SB(sb)->nls_io;
and we need this
    unload_nls(sbi->nls_io);
in exfat_put_super() RCU-delayed as well.  And I suspect that
    exfat_free_upcase_table(sbi);
right after it needs the same treatment.

AFFS: similar problem, wants ->s_fs_info freeing RCU-delayed.

hfsplus: similar, including non-delayed unload_nls() calls.

ntfs3:
    /*
     * Try slow way with current upcase table
     */
    sbi = dentry->d_sb->s_fs_info;
    uni1 = __getname();
    if (!uni1)
            return -ENOMEM;
__getname().  "Give me a page and you might block, while you are
at it".  Done from ->d_compare().  Called under dentry->d_lock
and rcu_read_lock().  OK, any further investigation of that
one is... probably not worth bothering with at that point.

Other in-tree instances appear to be correct.  I'll push fixes for
those (well, ntfs3 aside) out tomorrow.
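
The idea of RCU-delayed destruction discussed above can be sketched in userspace as a deferred-callback queue. This is a single-threaded toy model, not the kernel's call_rcu(): defer_call(), grace_period_end() and struct toy_sbi are hypothetical names, and the "grace period" is simply whenever the caller says readers are done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct rcu_head: callbacks are queued and only run
 * once the caller declares the grace period over. */
struct deferred {
	struct deferred *next;
	void (*func)(struct deferred *);
};

static struct deferred *pending;
int freed_count;

void defer_call(struct deferred *d, void (*func)(struct deferred *))
{
	assert(d != NULL && func != NULL);
	d->func = func;
	d->next = pending;
	pending = d;
}

/* Run all queued callbacks; returns how many ran. */
int grace_period_end(void)
{
	int n = 0;

	while (pending) {
		struct deferred *d = pending;

		pending = d->next;
		d->func(d);
		n++;
	}
	return n;
}

/* Hypothetical per-superblock info with an embedded deferral head. */
struct toy_sbi {
	int keep_last_dots;
	struct deferred rcu;
};

void toy_sbi_free(struct deferred *d)
{
	/* container_of-style recovery of the enclosing object */
	struct toy_sbi *sbi =
		(struct toy_sbi *)((char *)d - offsetof(struct toy_sbi, rcu));

	free(sbi);
	freed_count++;
}

/* "Unmount" path: do not free immediately, a reader such as a
 * ->d_compare() instance may still dereference sbi. */
void toy_put_super(struct toy_sbi *sbi)
{
	defer_call(&sbi->rcu, toy_sbi_free);
}
```

In this model the object stays readable between toy_put_super() and grace_period_end(), which is the property the message says ->d_compare() and ->d_hash() instances depend on.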


Re: [PATCH v3] scsi: target: Replace strlcpy with strscpy

2023-09-13 Thread Martin K. Petersen
On Thu, 31 Aug 2023 14:36:38 +, Azeem Shaikh wrote:

> strlcpy() reads the entire source buffer first.
> This read may exceed the destination size limit.
> This is both inefficient and can lead to linear read
> overflows if a source string is not NUL-terminated [1].
> In an effort to remove strlcpy() completely [2], replace
> strlcpy() here with strscpy().
> 
> [...]

Applied to 6.6/scsi-fixes, thanks!

[1/1] scsi: target: Replace strlcpy with strscpy
  https://git.kernel.org/mkp/scsi/c/5c584fe6098a

-- 
Martin K. Petersen  Oracle Linux Engineering
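
For readers unfamiliar with the difference the quoted commit message describes, here is a small userspace model of strscpy() semantics. It is a sketch written for illustration, not the kernel implementation; model_strscpy is a hypothetical name. It never reads more than size bytes of the source, always NUL-terminates a non-empty destination, and reports truncation with -E2BIG.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Userspace model of strscpy(): copy at most size - 1 bytes, always
 * NUL-terminate, and return -E2BIG if the source did not fit. */
ssize_t model_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* src[i] is still within the first 'size' bytes of the source */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

Unlike strlcpy(), which returns strlen(src) and therefore has to walk the whole source, the model stops as soon as the destination is full.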


Re: linux-next: Tree for Sep 12 (bcachefs)

2023-09-13 Thread Kees Cook
On Tue, Sep 12, 2023 at 03:26:45PM +1000, Stephen Rothwell wrote:
> New tree: bcachefs

Thanks for going through and fixing all the fake flexible array members.
It looks much nicer. :)

I have some questions about the remaining "markers", for example:

$ git grep -A8 '\bkey_start\b' -- fs/bcachefs
fs/bcachefs/bcachefs_format.h:  __u8 key_start[0];
...
fs/bcachefs/bcachefs_format.h-  __u8 pad[sizeof(struct bkey) - 3];
--
fs/bcachefs/bkey.c: u8 *l = k->key_start;

Why isn't this just:

u8 *l = k->pad

and you can drop the marker?

And some seem entirely unused, like all of "struct bch_reflink_v".

And some are going to fail at runtime, since they're still zero-sized
and being used as an actual array:

struct bch_sb_field_journal_seq_blacklist {
        struct bch_sb_field field;

        struct journal_seq_blacklist_entry start[0];
        __u64   _data[];
};
...
        memmove(&bl->start[i],
                &bl->start[i + 1],
                sizeof(bl->start[0]) * (nr - i));

It looks like you just want a type union for the flexible array.
This can be done like this:

struct bch_sb_field_journal_seq_blacklist {
        struct bch_sb_field field;

        union {
                DECLARE_FLEX_ARRAY(struct journal_seq_blacklist_entry, start);
                DECLARE_FLEX_ARRAY(__u64, _data);
        };
};

Hopefully that helps!

-Kees

-- 
Kees Cook
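
The union-of-flexible-arrays suggestion above can be tried out in userspace with an approximation of the kernel macro. The sketch below assumes gcc or clang, since it relies on GNU C extensions (an empty struct placed next to the flexible array). struct toy_entry and struct toy_blacklist are hypothetical stand-ins, not the bcachefs types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Userspace approximation of the kernel's DECLARE_FLEX_ARRAY(). The empty
 * struct keeps the flexible array from being the only member, which is
 * what lets it sit inside a union (GNU C extension).
 */
#define DECLARE_FLEX_ARRAY(TYPE, NAME) \
	struct { struct { } __empty_##NAME; TYPE NAME[]; }

/* Hypothetical stand-ins for the on-disk types. */
struct toy_entry {
	uint64_t start;
	uint64_t end;
};

struct toy_blacklist {
	uint64_t header;
	union {
		DECLARE_FLEX_ARRAY(struct toy_entry, start);
		DECLARE_FLEX_ARRAY(uint64_t, _data);
	};
};

static_assert(offsetof(struct toy_blacklist, start) ==
	      offsetof(struct toy_blacklist, _data),
	      "both views must begin at the same offset");

/* Allocate room for nr entries after the fixed header. */
struct toy_blacklist *toy_blacklist_alloc(size_t nr)
{
	return calloc(1, sizeof(struct toy_blacklist) +
			 nr * sizeof(struct toy_entry));
}
```

Both members are true flexible arrays here, so bounds-checking instrumentation sees an array of unknown length rather than a zero-sized one.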


Re: [PATCH 09/19] zonefs: remove duplicate cleanup in zonefs_fill_super

2023-09-13 Thread Al Viro
On Wed, Sep 13, 2023 at 08:10:03AM -0300, Christoph Hellwig wrote:
> When ->fill_super fails, ->kill_sb is called which already cleans up
> the inodes and zgroups.

Ugh...  The use of "->" strongly suggests that you are talking about
a method; 'fill_super' here actually refers to callback passed to
mount_bdev().  Have a pity for those who'll be trying to parse it
- that might be yourself a couple of years down the road...

Something like

"If zonefs_fill_super() returns an error, its caller (mount_bdev()) will
make sure to call zonefs_kill_super(), which already cleans up
the inodes and zgroups.", perhaps?

> 
> Drop the extra cleanup code in zonefs_fill_super.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  fs/zonefs/super.c | 21 +
>  1 file changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
> index 9d1a9808fbbba6..35b2554ce2ac2e 100644
> --- a/fs/zonefs/super.c
> +++ b/fs/zonefs/super.c
> @@ -1309,13 +1309,12 @@ static int zonefs_fill_super(struct super_block *sb, 
> void *data, int silent)
>   /* Initialize the zone groups */
>   ret = zonefs_init_zgroups(sb);
>   if (ret)
> - goto cleanup;
> + return ret;
>  
>   /* Create the root directory inode */
> - ret = -ENOMEM;
>   inode = new_inode(sb);
>   if (!inode)
> - goto cleanup;
> + return -ENOMEM;
>  
>   inode->i_ino = bdev_nr_zones(sb->s_bdev);
>   inode->i_mode = S_IFDIR | 0555;
> @@ -1333,7 +1332,7 @@ static int zonefs_fill_super(struct super_block *sb, 
> void *data, int silent)
>  
>   sb->s_root = d_make_root(inode);
>   if (!sb->s_root)
> - goto cleanup;
> + return -ENOMEM;
>  
>   /*
>* Take a reference on the zone groups directory inodes
> @@ -1341,19 +1340,9 @@ static int zonefs_fill_super(struct super_block *sb, 
> void *data, int silent)
>*/
>   ret = zonefs_get_zgroup_inodes(sb);
>   if (ret)
> - goto cleanup;
> -
> - ret = zonefs_sysfs_register(sb);
> - if (ret)
> - goto cleanup;
> -
> - return 0;
> -
> -cleanup:
> - zonefs_release_zgroup_inodes(sb);
> - zonefs_free_zgroups(sb);
> + return ret;
>  
> - return ret;
> + return zonefs_sysfs_register(sb);
>  }
>  
>  static struct dentry *zonefs_mount(struct file_system_type *fs_type,
> -- 
> 2.39.2
> 


Re: [PATCH 05/19] fs: assign an anon dev_t in common code

2023-09-13 Thread Al Viro
On Wed, Sep 13, 2023 at 08:09:59AM -0300, Christoph Hellwig wrote:

> diff --git a/fs/super.c b/fs/super.c
> index bbe55f0651cca4..5c685b4944c2d6 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -787,7 +787,7 @@ struct super_block *sget_fc(struct fs_context *fc,
>   struct super_block *s = NULL;
>   struct super_block *old;
>   struct user_namespace *user_ns = fc->global ? &init_user_ns : 
> fc->user_ns;
> - int err;
> + int err = 0;
>  
>  retry:
>   spin_lock(&sb_lock);
> @@ -806,14 +806,26 @@ struct super_block *sget_fc(struct fs_context *fc,
>   }
>  
>   s->s_fs_info = fc->s_fs_info;
> - err = set(s, fc);
> - if (err) {
> - s->s_fs_info = NULL;
> - spin_unlock(&sb_lock);
> - destroy_unused_super(s);
> - return ERR_PTR(err);
> + if (set) {
> + err = set(s, fc);
> + if (err) {
> + s->s_fs_info = NULL;

Pointless (as the original had been); destroy_unused_super() doesn't
even look at ->s_fs_info.

> + goto unlock_and_destroy;
> + }
>   }
>   fc->s_fs_info = NULL;

Here we are transferring the ownership from fc to superblock; it used to sit
right next to insertion into lists and all failure exits past that point must
go through deactivate_locked_super(), so ->kill_sb() would be called on those
and it would take care of s->s_fs_info.  However, your variant has that
ownership transfer done at the point before get_anon_bdev(), and that got
you a new failure exit where you are still calling destroy_unused_super():

> + if (!s->s_dev) {
> + /*
> +  * If the file system didn't set a s_dev (which is usually only
> +  * done for block based file systems), set an anonymous dev_t
> +  * here now so that we always have a valid ->s_dev.
> +  */
> + err = get_anon_bdev(&s->s_dev);
> + if (err)
> + goto unlock_and_destroy;

This.  And that's a leak - fc has no reference left in it, and your
unlock_and_destroy won't even look at what's in ->s_fs_info, let alone know
what to do with it.

IOW, clearing fc->s_fs_info should've been done after that chunk.

And looking at the change in sget(),

> + if (set) {
> + err = set(s, data);
> + if (err)
> + goto unlock_and_destroy;
>   }
> +
> + if (!s->s_dev) {
> + err = get_anon_bdev(&s->s_dev);
> + if (err)
> + goto unlock_and_destroy;
> + }

I'd rather express it (both there and in sget_fc() as well) as
    if (set)
            err = set(s, data);
    if (!err && !s->s_dev)
            err = get_anon_bdev(&s->s_dev);
    if (err)
            goto unlock_and_destroy;

That's really what your transformation does - you are lifting the
calls of get_anon_bdev() (in guise of set_anon_super()) from the
tails of 'set' callbacks into the caller, making them conditional
upon the lack of other errors from 'set' and upon the ->s_dev left
zero and allow NULL for the case when that was all that had been
there.

The only place where you do something different is this:

> @@ -1191,7 +1191,6 @@ static struct dentry *ceph_real_mount(struct 
> ceph_fs_client *fsc,
>  static int ceph_set_super(struct super_block *s, struct fs_context *fc)
>  {
>   struct ceph_fs_client *fsc = s->s_fs_info;
> - int ret;
>  
>   dout("set_super %p\n", s);
>  
> @@ -1211,11 +1210,7 @@ static int ceph_set_super(struct super_block *s, 
> struct fs_context *fc)
>   s->s_flags |= SB_NODIRATIME | SB_NOATIME;
>  
>   ceph_fscrypt_set_ops(s);
> -
> - ret = set_anon_super_fc(s, fc);
> - if (ret != 0)
> - fsc->sb = NULL;
> - return ret;
> + return 0;

fsc->sb = NULL has disappeared here; it *is* OK (the caller won't look at
fsc->sb after failed sget_fc()), but that's worth a mention somewhere.
A separate commit removing that clearing fsc->sb in ceph_set_super(),
perhaps?
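
The two points made in this review (chaining the optional 'set' callback with the anonymous dev_t allocation, and clearing fc->s_fs_info only after the last failure exit) can be put together in a small userspace model. Everything here is hypothetical: struct toy_fc, struct toy_sb, toy_get_anon_dev() and toy_sget() are stand-ins invented for the sketch, and the error value is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for fs_context and super_block; not the kernel structures. */
struct toy_fc { void *s_fs_info; };
struct toy_sb { void *s_fs_info; unsigned int s_dev; };

typedef int (*toy_set_fn)(struct toy_sb *s, struct toy_fc *fc);

/* Test knob: make the anonymous dev_t allocation fail on demand. */
int toy_anon_dev_should_fail;

int toy_get_anon_dev(unsigned int *dev)
{
	if (toy_anon_dev_should_fail)
		return -ENOMEM;	/* arbitrary error for the model */
	*dev = 1;
	return 0;
}

int toy_sget(struct toy_sb *s, struct toy_fc *fc, toy_set_fn set)
{
	int err = 0;

	assert(s != NULL && fc != NULL);
	s->s_fs_info = fc->s_fs_info;

	if (set)
		err = set(s, fc);
	if (!err && !s->s_dev)
		err = toy_get_anon_dev(&s->s_dev);
	if (err)
		return err;	/* fc still owns s_fs_info; caller discards s */

	/* Nothing past this point can fail: hand ownership to the sb. */
	fc->s_fs_info = NULL;
	return 0;
}
```

With the transfer placed last, a failed allocation leaves the reference in the context, so whoever tears down the context can still release it.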


Re: [PATCH 09/19] zonefs: remove duplicate cleanup in zonefs_fill_super

2023-09-13 Thread Damien Le Moal
On 9/13/23 20:10, Christoph Hellwig wrote:
> When ->fill_super fails, ->kill_sb is called which already cleans up
> the inodes and zgroups.
> 
> Drop the extra cleanup code in zonefs_fill_super.
> 
> Signed-off-by: Christoph Hellwig 

Looks good to me.

Acked-by: Damien Le Moal 

> ---
>  fs/zonefs/super.c | 21 +
>  1 file changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
> index 9d1a9808fbbba6..35b2554ce2ac2e 100644
> --- a/fs/zonefs/super.c
> +++ b/fs/zonefs/super.c
> @@ -1309,13 +1309,12 @@ static int zonefs_fill_super(struct super_block *sb, 
> void *data, int silent)
>   /* Initialize the zone groups */
>   ret = zonefs_init_zgroups(sb);
>   if (ret)
> - goto cleanup;
> + return ret;
>  
>   /* Create the root directory inode */
> - ret = -ENOMEM;
>   inode = new_inode(sb);
>   if (!inode)
> - goto cleanup;
> + return -ENOMEM;
>  
>   inode->i_ino = bdev_nr_zones(sb->s_bdev);
>   inode->i_mode = S_IFDIR | 0555;
> @@ -1333,7 +1332,7 @@ static int zonefs_fill_super(struct super_block *sb, 
> void *data, int silent)
>  
>   sb->s_root = d_make_root(inode);
>   if (!sb->s_root)
> - goto cleanup;
> + return -ENOMEM;
>  
>   /*
>* Take a reference on the zone groups directory inodes
> @@ -1341,19 +1340,9 @@ static int zonefs_fill_super(struct super_block *sb, 
> void *data, int silent)
>*/
>   ret = zonefs_get_zgroup_inodes(sb);
>   if (ret)
> - goto cleanup;
> -
> - ret = zonefs_sysfs_register(sb);
> - if (ret)
> - goto cleanup;
> -
> - return 0;
> -
> -cleanup:
> - zonefs_release_zgroup_inodes(sb);
> - zonefs_free_zgroups(sb);
> + return ret;
>  
> - return ret;
> + return zonefs_sysfs_register(sb);
>  }
>  
>  static struct dentry *zonefs_mount(struct file_system_type *fs_type,

-- 
Damien Le Moal
Western Digital Research



[PATCH v2] module: Clarify documentation of module_param_call()

2023-09-13 Thread Kees Cook
Commit 9bbb9e5a3310 ("param: use ops in struct kernel_param, rather than
get and set fns directly") added the comment that module_param_call()
was deprecated, during a large scale refactoring to bring sanity to type
casting back then. In 2017 following more cleanups, it became useful
again as it wraps a common pattern of creating an ops struct for a
given get/set pair:

  b2f270e87473 ("module: Prepare to convert all module_param_call() prototypes")
  ece1996a21ee ("module: Do not paper over type mismatches in 
module_param_call()")

static const struct kernel_param_ops __param_ops_##name = \
{ .flags = 0, .set = _set, .get = _get }; \
__module_param_call(MODULE_PARAM_PREFIX, \
name, &__param_ops_##name, arg, perm, -1, 0)

__module_param_call(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1, 0)

Many users of module_param_cb() appear to be almost universally
open-coding the same thing that module_param_call() does now. Don't
discourage[1] people from using module_param_call(): clarify the comment
to show that module_param_cb() is useful if you repeatedly use the same
pair of get/set functions.

[1] https://lore.kernel.org/lkml/202308301546.5C789E5EC@keescook/

Cc: Luis Chamberlain 
Cc: Johan Hovold 
Cc: Jessica Yu 
Cc: Sagi Grimberg 
Cc: Nick Desaulniers 
Cc: Miguel Ojeda 
Cc: Joe Perches 
Cc: linux-modu...@vger.kernel.org
Reviewed-by: Miguel Ojeda 
Signed-off-by: Kees Cook 
---
Luis, I note that include/linux/moduleparam.h isn't in the MAINTAINERS
file pattern. Perhaps you want to use include/linux/module*.h?
---
 include/linux/moduleparam.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index 962cd41a2cb5..d4452f93d060 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -293,7 +293,11 @@ struct kparam_array
= { __param_str_##name, THIS_MODULE, ops,   \
VERIFY_OCTAL_PERMISSIONS(perm), level, flags, { arg } }
 
-/* Obsolete - use module_param_cb() */
+/*
+ * Useful for describing a set/get pair used only once (i.e. for this
+ * parameter). For repeated set/get pairs (i.e. the same struct
+ * kernel_param_ops), use module_param_cb() instead.
+ */
 #define module_param_call(name, _set, _get, arg, perm) \
static const struct kernel_param_ops __param_ops_##name =   \
{ .flags = 0, .set = _set, .get = _get };   \
-- 
2.34.1
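
To make the distinction in the new comment concrete, here is a userspace sketch of the two macro styles. It is a toy model only: struct toy_param_ops, toy_param_cb() and toy_param_call() are hypothetical analogues written for this illustration and do not reproduce the real moduleparam.h machinery (no sections, permissions or sysfs).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy versions of kernel_param_ops / kernel_param, for illustration only. */
struct toy_param_ops {
	int (*set)(const char *val, void *arg);
	int (*get)(char *buf, size_t len, void *arg);
};

struct toy_param {
	const char *name;
	const struct toy_param_ops *ops;
	void *arg;
};

/* Caller supplies an existing ops struct (shareable between parameters). */
#define toy_param_cb(name, ops, arg) \
	const struct toy_param toy_param_##name = { #name, ops, arg }

/* Caller supplies a set/get pair; a one-off ops struct is generated. */
#define toy_param_call(name, _set, _get, arg) \
	const struct toy_param_ops toy_param_ops_##name = \
		{ .set = _set, .get = _get }; \
	toy_param_cb(name, &toy_param_ops_##name, arg)

int toy_set_int(const char *val, void *arg)
{
	assert(arg != NULL);
	*(int *)arg = (int)strtol(val, NULL, 10);
	return 0;
}

int toy_get_int(char *buf, size_t len, void *arg)
{
	return snprintf(buf, len, "%d", *(int *)arg);
}

/* One parameter, one set/get pair: the "call" form avoids writing the
 * ops struct by hand. */
int toy_level;
toy_param_call(level, toy_set_int, toy_get_int, &toy_level);
```

If several parameters shared toy_set_int/toy_get_int, declaring one ops struct and passing it to toy_param_cb() for each would avoid generating a duplicate struct per parameter.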



Re: [PATCH 03/19] fs: release anon dev_t in deactivate_locked_super

2023-09-13 Thread Al Viro
On Wed, Sep 13, 2023 at 08:09:57AM -0300, Christoph Hellwig wrote:
> Releasing an anon dev_t is a very common thing when freeing a
> super_block, as that's done for basically any not block based file
> system (modulo the odd mtd special case).  So instead of requiring
> a special ->kill_sb helper and a lot of boilerplate in more complicated
> file systems, just release the anon dev_t in deactivate_locked_super if
> the super_block was using one.
> 
> As the freeing is done after the main call to kill_super_notify, this
> removes the need for having two slightly different call sites for it.

Huh?  At this stage in your series freeing is still in ->kill_sb()
instances, after the calls of kill_anon_super() you've turned into
the calls of generic_shutdown_super().

You do split it off into a separate method later in the series, but
at this point you are reopening the same UAF that had been dealt with
in dc3216b14160 "super: ensure valid info".

Either move the introduction of ->free_sb() before that one, or
split it into lifting put_anon_bdev() (left here) and getting rid
of kill_anon_super() (after ->free_sb() introduction).


Re: [PATCH 5.10-rt] arm64: signal: Use ARCH_RT_DELAYS_SIGNAL_SEND

2023-09-13 Thread Luis Claudio R. Goncalves
On Tue, Sep 12, 2023 at 03:14:28PM +0800, yang.yan...@zte.com.cn wrote:
> From: Wang Yong 
> 
> The ltp test prompts the following bug information under the 5.10 kernel:
> BUG: sleeping function called from invalid context at 
> kernel/locking/rtmutex.c:969
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 796, name: cat
> Preemption disabled at:
> [] do_debug_exception+0x60/0x180
> CPU: 3 PID: 796 Comm: cat Not tainted 5.10.59-rt52-KERNEL_VERSION #38
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  dump_backtrace+0x0/0x198
>  show_stack+0x20/0x30
>  dump_stack+0xf0/0x13c
>  ___might_sleep+0x140/0x178
>  rt_spin_lock+0x30/0x90
>  force_sig_info_to_task+0x30/0xe0
>  force_sig_fault_to_task+0x54/0x78
>  force_sig_fault+0x1c/0x28
>  arm64_force_sig_fault+0x48/0x78
>  send_user_sigtrap+0x4c/0x80
>  brk_handler+0x3c/0x68
>  do_debug_exception+0xac/0x180
>  el0_dbg+0x34/0x58
>  el0_sync_handler+0x50/0xb8
>  el0_sync+0x180/0x1c0
> 
> It has been fixed by
> 0c34700de5e7 ("arm64: signal: Use ARCH_RT_DELAYS_SIGNAL_SEND.") in
> higher versions of the kernel. This patch needs to be compatible with 5.10.
> 5.10 kernel does not have signal.h file, so adding signal.h file to
> define ARCH_RT_DELAYS_SIGNAL_SEND.
> 
> Signed-off-by: Wang Yong 
> Cc: Xuexin Jiang 
> Cc: Yang Yang 
> Cc: Xiaokai Ran 
> ---

Thank you for this fix. I will add it to the release candidate that will
be posted on Friday.

Best regards,
Luis

>  arch/arm64/include/asm/signal.h | 12 
>  arch/arm64/kernel/signal.c  |  9 +
>  2 files changed, 21 insertions(+)
>  create mode 100644 arch/arm64/include/asm/signal.h
> 
> diff --git a/arch/arm64/include/asm/signal.h b/arch/arm64/include/asm/signal.h
> new file mode 100644
> index 0..0fb418cf4
> --- /dev/null
> +++ b/arch/arm64/include/asm/signal.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ARM64_ASM_SIGNAL_H
> +#define __ARM64_ASM_SIGNAL_H
> +
> +#include 
> +#include 
> +
> +#if defined(CONFIG_PREEMPT_RT)
> +#define ARCH_RT_DELAYS_SIGNAL_SEND
> +#endif
> +
> +#endif
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index fe94a3e1f..538d4aadb 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -927,6 +927,15 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>   } else {
>   local_daif_restore(DAIF_PROCCTX);
> 
> +#ifdef ARCH_RT_DELAYS_SIGNAL_SEND
> + if (unlikely(current->forced_info.si_signo)) {
> + struct task_struct *t = current;
> +
> + force_sig_info(&t->forced_info);
> + t->forced_info.si_signo = 0;
> + }
> +#endif
> +
>   if (thread_flags & _TIF_UPROBE)
>   uprobe_notify_resume(regs);
> 
> -- 
> 2.25.1
> 
---end quoted text---



Re: [PATCH 12/19] fs: convert kill_litter_super to litter_shutdown_sb

2023-09-13 Thread Kees Cook
On Wed, Sep 13, 2023 at 08:10:06AM -0300, Christoph Hellwig wrote:
> Replace kill_litter_super with litter_shutdown_sb, which is wired up to
> the ->shutdown_sb method.  For file systems that wrapped
> kill_litter_super, ->kill_sb is replaced with ->shutdown and ->free_sb
> methods as needed.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Kees Cook  # for pstore

-- 
Kees Cook


Re: [PATCH 08/19] pstore: shrink the pstore_sb_lock critical section in pstore_kill_sb

2023-09-13 Thread Kees Cook
On Wed, Sep 13, 2023 at 08:10:02AM -0300, Christoph Hellwig wrote:
> ->kill_sb can't race with creating ->fill_super because pstore is a
> _single file system that only ever has a single sb instance, and we wait
> for the previous one to go away before creating a new one.  Reduce
> the critical section so that it is not held over generic_shutdown_super.
> 
> Signed-off-by: Christoph Hellwig 

Thanks for the refactoring!

Reviewed-by: Kees Cook 

-- 
Kees Cook


[PATCH] firmware: ti_sci: refactor deprecated strncpy

2023-09-13 Thread Justin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

We should prefer more robust and less ambiguous string interfaces.

A suitable replacement is `strscpy` [2] due to the fact that it guarantees
NUL-termination on the destination buffer.

It does not seem like `ver->firmware_description` requires NUL-padding
(which is a behavior that strncpy provides) but if it does let's opt for
`strscpy_pad()`.

Link: 
https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
 [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Signed-off-by: Justin Stitt 
---
Note: build-tested only.
---
 drivers/firmware/ti_sci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/ti_sci.c b/drivers/firmware/ti_sci.c
index 26a37f47f4ca..ce546f391959 100644
--- a/drivers/firmware/ti_sci.c
+++ b/drivers/firmware/ti_sci.c
@@ -485,7 +485,7 @@ static int ti_sci_cmd_get_revision(struct ti_sci_info *info)
ver->abi_major = rev_info->abi_major;
ver->abi_minor = rev_info->abi_minor;
ver->firmware_revision = rev_info->firmware_revision;
-   strncpy(ver->firmware_description, rev_info->firmware_description,
+   strscpy(ver->firmware_description, rev_info->firmware_description,
sizeof(ver->firmware_description));
 
 fail:

---
base-commit: 3669558bdf354cd352be955ef2764cde6a9bf5ec
change-id: 20230913-strncpy-drivers-firmware-ti_sci-c-22667413c18f

Best regards,
--
Justin Stitt 



[PATCH] firmware: tegra: bpmp: refactor deprecated strncpy

2023-09-13 Thread Justin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

We should prefer more robust and less ambiguous string interfaces.

It seems like the filename stored at `namevirt` is expected to be
NUL-terminated.

A suitable replacement is `strscpy_pad` due to the fact that it
guarantees NUL-termination on the destination buffer whilst maintaining
the NUL-padding behavior that strncpy provides.

Link: 
https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
 [1]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Signed-off-by: Justin Stitt 
---
Note: compile tested only.
---
 drivers/firmware/tegra/bpmp-debugfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/tegra/bpmp-debugfs.c 
b/drivers/firmware/tegra/bpmp-debugfs.c
index 6dfe3d34109e..bbcdd9fed3fb 100644
--- a/drivers/firmware/tegra/bpmp-debugfs.c
+++ b/drivers/firmware/tegra/bpmp-debugfs.c
@@ -610,7 +610,7 @@ static int debugfs_show(struct seq_file *m, void *p)
}
 
len = strlen(filename);
-   strncpy(namevirt, filename, namesize);
+   strscpy_pad(namevirt, filename, namesize);
 
err = mrq_debugfs_read(bpmp, namephys, len, dataphys, datasize,
   );
@@ -661,7 +661,7 @@ static ssize_t debugfs_store(struct file *file, const char 
__user *buf,
}
 
len = strlen(filename);
-   strncpy(namevirt, filename, namesize);
+   strscpy_pad(namevirt, filename, namesize);
 
if (copy_from_user(datavirt, buf, count)) {
err = -EFAULT;

---
base-commit: 3669558bdf354cd352be955ef2764cde6a9bf5ec
change-id: 20230913-strncpy-drivers-firmware-tegra-bpmp-debugfs-c-54f7baaf32c0

Best regards,
--
Justin Stitt 
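
The padding behaviour mentioned in the commit message above can be modelled in userspace as follows. This is a sketch under stated assumptions, not the kernel's strscpy_pad(): toy_strscpy_pad is a hypothetical name, and the model simply NUL-terminates and then zero-fills whatever is left of the destination.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Userspace model of strscpy_pad(): bounded copy that always
 * NUL-terminates and zero-fills the unused tail of the destination. */
ssize_t toy_strscpy_pad(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];

	/* Terminator plus padding in one go: bytes i .. size - 1 become 0. */
	memset(dst + i, 0, size - i);

	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

Compared with strncpy(), the padding is kept but the destination is terminated even when the source is too long, and the return value tells the caller whether truncation happened.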



Re: [PATCH v3 1/7] drivers: genpd: qcom: rpmpd: Fix MSM8976 power domains setup

2023-09-13 Thread Bjorn Andersson
On Sat, Aug 12, 2023 at 01:24:44PM +0200, Adam Skladowski wrote:

Please drop the "drivers:" prefix in $subject, and resubmit this
patch (alone should be fine) with the new maintainer and the appropriate
mailing list included.

Thanks,
Bjorn

> Downstream kernel parses resource names based on pm8950-rpm-regulator.dtsi
> in such file qcom,resource-name takes three values: smpa,ldoa and clk0.
> First appearance of RWSC/RWSM point to msm-4.4 kernel
> which is way newer than what this platform was shipped with (msm-3.10).
> For the max_state downstream code limit value to TURBO inside dts
> with only one turbo_high being placed in msm-thermal bindings.
> One of effects of requesting TURBO_HIGH vote is rebooting of device
> which happens during voting inside WCNSS/IRIS,
> this behavior was observed on LeEco S2 smartphone.
> Fix regulator setup and drop unused resources.
> 
> Fixes: b1d522443b4b ("soc: qcom: rpmpd: Add rpm power domains for msm8976")
> Signed-off-by: Adam Skladowski 
> Reviewed-by: Dmitry Baryshkov 
> ---
>  drivers/genpd/qcom/rpmpd.c | 27 ++-
>  1 file changed, 10 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/genpd/qcom/rpmpd.c b/drivers/genpd/qcom/rpmpd.c
> index 3135dd1dafe0..59caa4e7b99a 100644
> --- a/drivers/genpd/qcom/rpmpd.c
> +++ b/drivers/genpd/qcom/rpmpd.c
> @@ -166,13 +166,6 @@ static struct rpmpd cx_rwcx0_vfl = {
>   .key = KEY_FLOOR_LEVEL,
>  };
>  
> -static struct rpmpd cx_rwsc2_vfl = {
> - .pd = { .name = "cx_vfl", },
> - .res_type = RPMPD_RWSC,
> - .res_id = 2,
> - .key = KEY_FLOOR_LEVEL,
> -};
> -
>  static struct rpmpd cx_s1a_vfc = {
>   .pd = { .name = "cx_vfc", },
>   .res_type = RPMPD_SMPA,
> @@ -329,6 +322,13 @@ static struct rpmpd mx_s6a_lvl_ao = {
>   .key = KEY_LEVEL,
>  };
>  
> +static struct rpmpd mx_s6a_vfl = {
> + .pd = { .name = "mx_vfl", },
> + .res_type = RPMPD_SMPA,
> + .res_id = 6,
> + .key = KEY_FLOOR_LEVEL,
> +};
> +
>  static struct rpmpd mx_s7a_lvl_ao;
>  static struct rpmpd mx_s7a_lvl = {
>   .pd = { .name = "mx", },
> @@ -361,13 +361,6 @@ static struct rpmpd mx_rwmx0_vfl = {
>   .key = KEY_FLOOR_LEVEL,
>  };
>  
> -static struct rpmpd mx_rwsm6_vfl = {
> - .pd = { .name = "mx_vfl", },
> - .res_type = RPMPD_RWSM,
> - .res_id = 6,
> - .key = KEY_FLOOR_LEVEL,
> -};
> -
>  /* MD */
>  static struct rpmpd md_s1a_corner_ao;
>  static struct rpmpd md_s1a_corner = {
> @@ -591,16 +584,16 @@ static const struct rpmpd_desc msm8953_desc = {
> static struct rpmpd *msm8976_rpmpds[] = {
>   [MSM8976_VDDCX] =   &cx_s2a_lvl,
>   [MSM8976_VDDCX_AO] =&cx_s2a_lvl_ao,
> - [MSM8976_VDDCX_VFL] =   &cx_rwsc2_vfl,
> + [MSM8976_VDDCX_VFL] =   &cx_s2a_vfl,
>   [MSM8976_VDDMX] =   &mx_s6a_lvl,
>   [MSM8976_VDDMX_AO] =&mx_s6a_lvl_ao,
> - [MSM8976_VDDMX_VFL] =   &mx_rwsm6_vfl,
> + [MSM8976_VDDMX_VFL] =   &mx_s6a_vfl,
>  };
>  
>  static const struct rpmpd_desc msm8976_desc = {
>   .rpmpds = msm8976_rpmpds,
>   .num_pds = ARRAY_SIZE(msm8976_rpmpds),
> - .max_state = RPM_SMD_LEVEL_TURBO_HIGH,
> + .max_state = RPM_SMD_LEVEL_TURBO,
>  };
>  
>  static struct rpmpd *msm8994_rpmpds[] = {
> -- 
> 2.41.0
> 


Re: [PATCH v2] ipmi: refactor deprecated strncpy

2023-09-13 Thread Corey Minyard
On Wed, Sep 13, 2023 at 05:13:04PM +, Justin Stitt wrote:
> `strncpy` is deprecated for use on NUL-terminated destination strings [1].

Thanks, applied to my next tree.

-corey

> 
> In this case, strncpy is being used specifically for its NUL-padding
> behavior (and has been commented as such). Moreover, the destination
> string is not required to be NUL-terminated [2].
> 
> We can use a more robust and less ambiguous interface in
> `memcpy_and_pad` which makes the code more readable and even eliminates
> the need for that comment.
> 
> Let's also use `strnlen` instead of `strlen()` with an upper-bounds
> check as this is intrinsically a part of `strnlen`.
> 
> Also included in this patch is a simple 1:1 change of `strncpy` to
> `strscpy` for ipmi_ssif.c. If NUL-padding is wanted here as well then we
> should opt again for `strscpy_pad`.
> 
> Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
> Link: https://lore.kernel.org/all/zqeadybl0uz1n...@mail.minyard.net/ [2]
> Link: https://github.com/KSPP/linux/issues/90
> Cc: linux-harden...@vger.kernel.org
> Cc: Kees Cook 
> Signed-off-by: Justin Stitt 
> ---
> Changes in v2:
> - use memcpy_and_pad (thanks Corey)
> - Link to v1: https://lore.kernel.org/r/20230912-strncpy-drivers-char-ipmi-ipmi-v1-1-cc43e0d1c...@google.com
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 11 +++
>  drivers/char/ipmi/ipmi_ssif.c   |  2 +-
>  2 files changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> b/drivers/char/ipmi/ipmi_msghandler.c
> index 186f1fee7534..d6f14279684d 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -5377,20 +5377,15 @@ static void send_panic_events(struct ipmi_smi *intf, 
> char *str)
>  
>   j = 0;
>   while (*p) {
> - int size = strlen(p);
> + int size = strnlen(p, 11);
>  
> - if (size > 11)
> - size = 11;
>   data[0] = 0;
>   data[1] = 0;
>   data[2] = 0xf0; /* OEM event without timestamp. */
>   data[3] = intf->addrinfo[0].address;
>   data[4] = j++; /* sequence # */
> - /*
> -  * Always give 11 bytes, so strncpy will fill
> -  * it with zeroes for me.
> -  */
> - strncpy(data+5, p, 11);
> +
> + memcpy_and_pad(data+5, 11, p, size, '\0');
>   p += size;
>  
>   ipmi_panic_request_and_wait(intf, , );
> diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
> index 3b921c78ba08..edcb83765dce 100644
> --- a/drivers/char/ipmi/ipmi_ssif.c
> +++ b/drivers/char/ipmi/ipmi_ssif.c
> @@ -1940,7 +1940,7 @@ static int new_ssif_client(int addr, char *adapter_name,
>   }
>   }
>  
> - strncpy(addr_info->binfo.type, DEVICE_NAME,
> + strscpy(addr_info->binfo.type, DEVICE_NAME,
>   sizeof(addr_info->binfo.type));
>   addr_info->binfo.addr = addr;
>   addr_info->binfo.platform_data = addr_info;
> 
> ---
> base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
> change-id: 20230912-strncpy-drivers-char-ipmi-ipmi-dda47b3773fd
> 
> Best regards,
> --
> Justin Stitt 
> 


Re: [PATCH v17 00/14] Add support for MT8195 SCP 2nd core

2023-09-13 Thread Mathieu Poirier
On Fri, Sep 01, 2023 at 04:09:21PM +0800, Tinghan Shen wrote:
> The mediatek remoteproc driver currently only allows bringing up a
> single core SCP, e.g. MT8183. It also only brings up the 1st
> core in SoCs with a dual-core SCP, e.g. MT8195. This series adds support
> for bringing up the 2nd core of the dual-core SCP.
> 
> v16 -> v17:
> 1. add a comment in scp_add_multi_core() at patchset 8

Other than patch 2 and 14, I have applied this set.  The remaining patches will
have to be resent to Matthias.

Thanks,
Mathieu

> 
> v15 -> v16:
> 1. fix the checkpatch warning at patchset 1 
> 2. move changes on scp_probe() to the new added patchset 6
> 3. revise platform_set_drvdata() at patchset 8
> 4. fix commit message at patchset 9
> 
> v14 -> v15:
> 1. use the common SCP registers in struct mtk_scp_of_cluster instead of
>copy it to struct mtk_scp at patchset 5 
> 2. use platform_set_drvdata instead of platform_device_add_data at patchset 5 
> 3. rename l2tcm_lock to cluster_lock at patchset 8
> 4. check l2tcm_refcnt value before decreasing at patchset 8
> 5. revise the commit message at patchset 11
> 
> v13 -> v14:
> 1. add review tag to patchset 1,6
> 2. exchange the order of sram power on and reset assert in
> mt8195_scp_c1_before_load at patchset 2
> 3. use ERR_CAST in patchset 5
> 4. re-write patchset 7 to remove dependency between core 0 and core 1 
> 5. add patch set 10 to report watchdog timeout to all cores
> 
> v12 -> v13:
> 1. replace subdevice with new mediatek scp operations in patchset 7 
> 2. add review tag to patchset 3
> 3. modify mediatek,scp phandle name of video-codec@1800 at patchset 11
> 
> v11 -> v12:
> 1. add scp_add_single/multi_core() to patchset 6
> 2. remove unused comment in patchset 6
> 3. rename list name from mtk_scp_cluster to mtk_scp_list
> 4. rewrite the multi-core probe flow 
> 5. disable rproc->autoboot and boot rproc by request_firmware_nowait at 
> patchset 7 
> 6. remove patchset 7 review tag  
> 
> v10 -> v11:
> 1. rewrite patchset 5 to probe single-core SCP with the cluster list
> 2. Also in patchset 5, move the pointer of mtk_scp object from the
>platform data property to the driver data property 
> 3. move the appearance of mtk_scp cluster property to patchset 7
> 
> v9 -> v10:
> 1. move the global mtk_scp list into the platform device driver data structure
> 2. remove an unnecessary if() condition
> 
> v8 -> v9:
> 1. initialize l1tcm_size/l1tcm_phys at patchset 05/11 
> 2. rewrite patchset 06/11 to unify the flow and remove hacks
> 
> v7 -> v8:
> 1. update the node name of mt8192 asurada SCP rpmsg subnode
> 2. squash register definitions into driver patches
> 3. initialize local variables on the declaration at patch v8 06/11 
> 
> v6 -> v7:
> 1. merge the mtk_scp_cluster struct into the mtk_scp structure
>at the "Probe multi-core SCP" patch
> 
> v5 -> v6:
> 1. move the mtk_scp_of_regs structure from mtk_common.h to mtk_scp.c
> 2. rename the SCP core 0 label from 'scp' to 'scp_c0'
> 
> v4 -> v5:
> 1. move resource release actions to the platform driver remove operation 
> 2. fix dual-core watchdog handling
> 
> v3 -> v4:
> 1. change the representation of dual-core SCP in dts file and update SCP yaml
> 2. rewrite SCP driver to reflect the change of dts node
> 3. drop 'remove redundant call of rproc_boot for SCP' in v3 for further 
> investigation
> 
> v2 -> v3:
> 1. change the representation of dual-core SCP in dts file and update SCP yaml
> 2. rewrite SCP driver to reflect the change of dts node
> 3. add SCP core 1 node to mt8195.dtsi
> 4. remove redundant call of rproc_boot for SCP
> 5. refine IPI error message
> 
> v1 -> v2:
> 1. update dt-binding property description
> 2. remove kconfig for scp dual driver
> 3. merge mtk_scp_dual.c and mtk_scp_subdev.c to mtk_scp.c
> 
> 
> Tinghan Shen (14):
>   dt-bindings: remoteproc: mediatek: Improve the rpmsg subnode
> definition
>   arm64: dts: mediatek: Update the node name of SCP rpmsg subnode
>   dt-bindings: remoteproc: mediatek: Support MT8195 dual-core SCP
>   remoteproc: mediatek: Add MT8195 SCP core 1 operations
>   remoteproc: mediatek: Extract SCP common registers
>   remoteproc: mediatek: Revise SCP rproc initialization flow for
> multi-core SCP
>   remoteproc: mediatek: Probe SCP cluster on single-core SCP
>   remoteproc: mediatek: Probe SCP cluster on multi-core SCP
>   remoteproc: mediatek: Remove dependency of MT8195 SCP L2TCM power
> control on dual-core SCP
>   remoteproc: mediatek: Setup MT8195 SCP core 1 SRAM offset
>   remoteproc: mediatek: Handle MT8195 SCP core 1 watchdog timeout
>   remoteproc: mediatek: Report watchdog crash to all cores
>   remoteproc: mediatek: Refine ipi handler error message
>   arm64: dts: mediatek: mt8195: Add SCP 2nd core
> 
>  .../bindings/remoteproc/mtk,scp.yaml  | 176 +-
>  .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |   2 +-
>  .../boot/dts/mediatek/mt8192-asurada.dtsi |   2 +-
>  .../boot/dts/mediatek/mt8195-cherry.dtsi  |   

[PATCH v3] EDAC/mc_sysfs: refactor deprecated strncpy

2023-09-13 Thread Justin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

We should prefer more robust and less ambiguous string interfaces.

A suitable replacement is `strscpy` [2] due to the fact that it
guarantees NUL-termination on the destination buffer without needlessly
NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Signed-off-by: Justin Stitt 
---
Changes in v3:
- prefer strscpy to strscpy_pad (thanks Tony)
- Link to v2: https://lore.kernel.org/r/20230913-strncpy-drivers-edac-edac_mc_sysfs-c-v2-1-2d2e6bd43...@google.com

Changes in v2:
- included refactor of another strncpy in same file
- Link to v1: https://lore.kernel.org/r/20230913-strncpy-drivers-edac-edac_mc_sysfs-c-v1-1-d232891b0...@google.com
---
Note: build-tested only.
---
 drivers/edac/edac_mc_sysfs.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 15f63452a9be..9a5b4bbd8191 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -229,8 +229,7 @@ static ssize_t channel_dimm_label_store(struct device *dev,
if (copy_count == 0 || copy_count >= sizeof(rank->dimm->label))
return -EINVAL;
 
-   strncpy(rank->dimm->label, data, copy_count);
-   rank->dimm->label[copy_count] = '\0';
+   strscpy(rank->dimm->label, data, copy_count);
 
return count;
 }
@@ -535,7 +534,7 @@ static ssize_t dimmdev_label_store(struct device *dev,
if (copy_count == 0 || copy_count >= sizeof(dimm->label))
return -EINVAL;
 
-   strncpy(dimm->label, data, copy_count);
+   strscpy(dimm->label, data, copy_count);
dimm->label[copy_count] = '\0';
 
return count;

---
base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
change-id: 20230913-strncpy-drivers-edac-edac_mc_sysfs-c-e619b00124a3

Best regards,
--
Justin Stitt 
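
One detail worth checking when reading the hunks above is what the third argument means to each helper. The following is a minimal user-space sketch, not kernel code: sketch_strscpy() is a hypothetical stand-in written from the documented behaviour of strscpy() (the size counts the terminating NUL, and truncation is reported with a negative return, -E2BIG in the kernel and -1 here). The label text "CPU0\n" and the sizes are made-up illustration values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the kernel's strscpy(): copy at most
 * size - 1 bytes, always NUL-terminate when size > 0, and return the
 * number of bytes copied, or -1 if the source had to be truncated.
 */
static long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -1;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	return src[i] == '\0' ? (long)i : -1;
}

/* Made-up example: a 4-character label followed by a newline. */
static void sketch_label_demo(void)
{
	char label[32];

	/* A size of 4 leaves room for 3 characters plus the NUL. */
	assert(sketch_strscpy(label, "CPU0\n", 4) == -1);
	assert(strcmp(label, "CPU") == 0);

	/* A size of 4 + 1 keeps all 4 characters. */
	assert(sketch_strscpy(label, "CPU0\n", 5) == -1);
	assert(strcmp(label, "CPU0") == 0);
}
```

If the stand-in matches the kernel helper, passing copy_count as the size stores one character fewer than the old strncpy() plus explicit '\0' pair did, so the size argument deserves a second look during review.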



Re: [PATCH] EDAC/mc_sysfs: refactor deprecated strncpy

2023-09-13 Thread Justin Stitt
On Wed, Sep 13, 2023 at 8:13 AM Luck, Tony  wrote:
>
> > `strncpy` is deprecated for use on NUL-terminated destination strings [1].
> >
> > We should prefer more robust and less ambiguous string interfaces.
> >
> > A suitable replacement is `strscpy_pad` [2] due to the fact that it 
> > guarantees
> > NUL-termination on the destination buffer whilst maintaining the
> > NUL-padding behavior that `strncpy` provides. This may not be strictly
> > necessary but as I couldn't understand what this code does I wanted to
> > ensure that the functionality is the same.
> >
> > Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
> > Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
> > Link: https://github.com/KSPP/linux/issues/90
> > Cc: linux-harden...@vger.kernel.org
> > Signed-off-by: Justin Stitt 
> > ---
> > Note: build-tested only.
> > ---
> >  drivers/edac/edac_mc_sysfs.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> > index 15f63452a9be..b303309a63cf 100644
> > --- a/drivers/edac/edac_mc_sysfs.c
> > +++ b/drivers/edac/edac_mc_sysfs.c
> > @@ -229,8 +229,7 @@ static ssize_t channel_dimm_label_store(struct device 
> > *dev,
> > if (copy_count == 0 || copy_count >= sizeof(rank->dimm->label))
> > return -EINVAL;
> >
> > -   strncpy(rank->dimm->label, data, copy_count);
> > -   rank->dimm->label[copy_count] = '\0';
> > +   strscpy_pad(rank->dimm->label, data, copy_count);
>
> That doc page says the problem with strncpy() is that it doesn't guarantee to
> NUL terminate the target string. But this code is aware of that limitation and
> zaps a '\0' at the end to be sure.
>
> So this code doesn't suffer from the potential problems.

Right, the original code did not have an existing bug due to the
reason you mentioned. However, I'm pretty keen on eliminating uses of
this interface treewide as there is always a more robust and less
ambiguous option.


>
> If it is going to be fixed, then some further analysis of the original code
> would be wise. Just replacing with strscpy_pad() means the code probably
> still suffers from the "needless performance penalty" also mentioned in
> the deprecation document.
Got it, sending a v2 that prefers `strscpy` to `strscpy_pad` resolving
the performance issue.

>
> -Tony
>

Thanks for the timely review!
Justin


[PATCH v2] ipmi: refactor deprecated strncpy

2023-09-13 Thread Justin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

In this case, strncpy is being used specifically for its NUL-padding
behavior (and has been commented as such). Moreover, the destination
string is not required to be NUL-terminated [2].

We can use a more robust and less ambiguous interface in
`memcpy_and_pad` which makes the code more readable and even eliminates
the need for that comment.

Let's also use `strnlen` instead of `strlen()` with an upper-bounds
check as this is intrinsically a part of `strnlen`.

Also included in this patch is a simple 1:1 change of `strncpy` to
`strscpy` for ipmi_ssif.c. If NUL-padding is wanted here as well then we
should opt again for `strscpy_pad`.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://lore.kernel.org/all/zqeadybl0uz1n...@mail.minyard.net/ [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Cc: Kees Cook 
Signed-off-by: Justin Stitt 
---
Changes in v2:
- use memcpy_and_pad (thanks Corey)
- Link to v1: https://lore.kernel.org/r/20230912-strncpy-drivers-char-ipmi-ipmi-v1-1-cc43e0d1c...@google.com
---
 drivers/char/ipmi/ipmi_msghandler.c | 11 +++
 drivers/char/ipmi/ipmi_ssif.c   |  2 +-
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
b/drivers/char/ipmi/ipmi_msghandler.c
index 186f1fee7534..d6f14279684d 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -5377,20 +5377,15 @@ static void send_panic_events(struct ipmi_smi *intf, 
char *str)
 
j = 0;
while (*p) {
-   int size = strlen(p);
+   int size = strnlen(p, 11);
 
-   if (size > 11)
-   size = 11;
data[0] = 0;
data[1] = 0;
data[2] = 0xf0; /* OEM event without timestamp. */
data[3] = intf->addrinfo[0].address;
data[4] = j++; /* sequence # */
-   /*
-* Always give 11 bytes, so strncpy will fill
-* it with zeroes for me.
-*/
-   strncpy(data+5, p, 11);
+
+   memcpy_and_pad(data+5, 11, p, size, '\0');
p += size;
 
ipmi_panic_request_and_wait(intf, , );
diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
index 3b921c78ba08..edcb83765dce 100644
--- a/drivers/char/ipmi/ipmi_ssif.c
+++ b/drivers/char/ipmi/ipmi_ssif.c
@@ -1940,7 +1940,7 @@ static int new_ssif_client(int addr, char *adapter_name,
}
}
 
-   strncpy(addr_info->binfo.type, DEVICE_NAME,
+   strscpy(addr_info->binfo.type, DEVICE_NAME,
sizeof(addr_info->binfo.type));
addr_info->binfo.addr = addr;
addr_info->binfo.platform_data = addr_info;

---
base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
change-id: 20230912-strncpy-drivers-char-ipmi-ipmi-dda47b3773fd

Best regards,
--
Justin Stitt 
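
For readers who want to see the chunking behaviour in isolation, here is a minimal user-space sketch of the loop body above. It is an illustration under assumptions, not the driver code: sketch_memcpy_and_pad() is a hypothetical stand-in that mirrors the argument order used in the patch (dest, dest_len, src, count, pad), and sketch_bounded_len() plays the role of strnlen(). The chunk size of 11 comes from the patch; the sample strings are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_CHUNK 11	/* payload bytes per event, as in the patch */

/* Hypothetical stand-in: copy count bytes, pad the rest of dest. */
static void sketch_memcpy_and_pad(void *dest, size_t dest_len,
				  const void *src, size_t count, int pad)
{
	if (dest_len > count) {
		memcpy(dest, src, count);
		memset((char *)dest + count, pad, dest_len - count);
	} else {
		memcpy(dest, src, dest_len);
	}
}

/* Length of s, but never more than max (what strnlen() provides). */
static size_t sketch_bounded_len(const char *s, size_t max)
{
	size_t n = 0;

	while (n < max && s[n] != '\0')
		n++;
	return n;
}

/* Fill one chunk from p and return how many source bytes were used. */
static size_t sketch_fill_chunk(unsigned char *chunk, const char *p)
{
	size_t size = sketch_bounded_len(p, SKETCH_CHUNK);

	sketch_memcpy_and_pad(chunk, SKETCH_CHUNK, p, size, '\0');
	return size;
}

static void sketch_chunk_demo(void)
{
	unsigned char chunk[SKETCH_CHUNK];

	/* Short tail: 5 bytes of text, 6 bytes of zero padding. */
	memset(chunk, 0xff, sizeof(chunk));
	assert(sketch_fill_chunk(chunk, "panic") == 5);
	assert(memcmp(chunk, "panic\0\0\0\0\0\0", SKETCH_CHUNK) == 0);

	/* Long input: exactly 11 bytes, no terminator in the chunk. */
	assert(sketch_fill_chunk(chunk, "0123456789ABCDEF") == 11);
	assert(memcmp(chunk, "0123456789A", SKETCH_CHUNK) == 0);
}
```

The point of the sketch is that the padding is now explicit in the call rather than a side effect of strncpy(), which is why the explanatory comment in the old code is no longer needed.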



Re: [PATCH 01/19] fs: reflow deactivate_locked_super

2023-09-13 Thread Christian Brauner
On Wed, Sep 13, 2023 at 08:09:55AM -0300, Christoph Hellwig wrote:
> Return early for the case where the super block isn't cleaned up to
> reduce level of indentation.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  fs/super.c | 35 ++-
>  1 file changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index 2d762ce67f6e6c..127a17d958a482 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -476,27 +476,28 @@ static void kill_super_notify(struct super_block *sb)
>  void deactivate_locked_super(struct super_block *s)

I wouldn't mind s/s/sb/ here as well. So we stop using @s in some and
@sb in other places.

Otherwise looks good to me,
Reviewed-by: Christian Brauner 
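
The reflow described in the commit message is the usual early-return pattern. A minimal sketch follows, using a made-up struct sketch_sb with hypothetical fields rather than the real struct super_block or the actual body of deactivate_locked_super(); it only shows that the two shapes are equivalent.

```c
#include <assert.h>

/* Hypothetical stand-in type; not the kernel's struct super_block. */
struct sketch_sb {
	int active;	/* reference count */
	int locked;	/* 1 while the caller holds the lock */
	int shut_down;	/* set once teardown has run */
};

/* Nested form: the teardown path lives inside the if body. */
static void sketch_deactivate_nested(struct sketch_sb *sb)
{
	if (--sb->active == 0) {
		sb->shut_down = 1;
		sb->locked = 0;
	} else {
		sb->locked = 0;
	}
}

/* Reflowed form: handle the "still in use" case first and return. */
static void sketch_deactivate_early(struct sketch_sb *sb)
{
	if (--sb->active != 0) {
		sb->locked = 0;
		return;
	}

	sb->shut_down = 1;
	sb->locked = 0;
}
```

Both functions leave the structure in the same state for every starting count; the second simply keeps the longer teardown path at the outer indentation level.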


Re: [PATCH 10/19] USB: gadget/legacy: remove sb_mutex

2023-09-13 Thread Alan Stern
On Wed, Sep 13, 2023 at 08:10:04AM -0300, Christoph Hellwig wrote:
> Creating a new super_block vs freeing the old one for single instance
> file systems is serialized by the wait for SB_DEAD.
> 
> Remove the superfluous sb_mutex.
> 
> Signed-off-by: Christoph Hellwig 
> ---

You might mention that this is essentially a reversion of commit 
d18dcfe9860e ("USB: gadgetfs: Fix race between mounting and 
unmounting").

Alan Stern

>  drivers/usb/gadget/legacy/inode.c | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/usb/gadget/legacy/inode.c 
> b/drivers/usb/gadget/legacy/inode.c
> index ce9e31f3d26bcc..a203266bc0dc82 100644
> --- a/drivers/usb/gadget/legacy/inode.c
> +++ b/drivers/usb/gadget/legacy/inode.c
> @@ -229,7 +229,6 @@ static void put_ep (struct ep_data *data)
>   */
>  
>  static const char *CHIP;
> -static DEFINE_MUTEX(sb_mutex);   /* Serialize superblock 
> operations */
>  
>  /*--*/
>  
> @@ -2012,8 +2011,6 @@ gadgetfs_fill_super (struct super_block *sb, struct 
> fs_context *fc)
>   struct dev_data *dev;
>   int rc;
>  
> - mutex_lock(&sb_mutex);
> -
>   if (the_device) {
>   rc = -ESRCH;
>   goto Done;
> @@ -2069,7 +2066,6 @@ gadgetfs_fill_super (struct super_block *sb, struct 
> fs_context *fc)
>   rc = -ENOMEM;
>  
>   Done:
> - mutex_unlock(&sb_mutex);
>   return rc;
>  }
>  
> @@ -2092,7 +2088,6 @@ static int gadgetfs_init_fs_context(struct fs_context 
> *fc)
>  static void
>  gadgetfs_kill_sb (struct super_block *sb)
>  {
> - mutex_lock(&sb_mutex);
>   kill_litter_super (sb);
>   if (the_device) {
>   put_dev (the_device);
> @@ -2100,7 +2095,6 @@ gadgetfs_kill_sb (struct super_block *sb)
>   }
>   kfree(CHIP);
>   CHIP = NULL;
> - mutex_unlock(&sb_mutex);
>  }
>  
>  /*--*/
> -- 
> 2.39.2
> 


Re: [PATCH v6 1/4] dt-bindings: remoteproc: k3-m4f: Add K3 AM64x SoCs

2023-09-13 Thread Krzysztof Kozlowski
On 13/09/2023 14:36, Rob Herring wrote:
> 
> On Wed, 13 Sep 2023 06:16:41 -0500, Hari Nagalla wrote:
>> K3 AM64x SoC has a Cortex M4F subsystem in the MCU voltage domain.
>> The remote processor's life cycle management and IPC mechanisms are
>> similar across the R5F and M4F cores from remote processor driver
>> point of view. However, there are subtle differences in image loading
>> and starting the M4F subsystems.
>>
>> The YAML binding document provides the various node properties to be
>> configured by the consumers of the M4F subsystem.
>>
>> Signed-off-by: Martyn Welch 
>> Signed-off-by: Hari Nagalla 
>> ---
>> Changes since v1:
>>  - Spelling corrections
>>  - Corrected to pass DT checks
>>
>> Changes since v2:
>>  - Missed spelling correction to commit message
>>
>> Changes since v3:
>>  - Removed unnecessary descriptions and used generic memory region names
>>  - Made mboxes and memory-region optional
>>  - Removed unrelated items from examples
>>
>> Changes since v4:
>>  - Rebased to the latest kernel-next tree
>>  - Added optional sram memory region for m4f device node
>>
>> Changes since v5:
>>  - None
>>
>>  .../bindings/remoteproc/ti,k3-m4f-rproc.yaml  | 136 ++
>>  1 file changed, 136 insertions(+)
>>  create mode 100644 
>> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml
>>
> 
> My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
> on your patch (DT_CHECKER_FLAGS is new in v5.13):
> 
> yamllint warnings/errors:
> 
> dtschema/dtc warnings/errors:
> /builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/dma/stericsson,dma40.example.dtb:
>  dma-controller@801c: sram:0: [4294967295, 4294967295] is too long
>   from schema $id: 
> http://devicetree.org/schemas/dma/stericsson,dma40.yaml#

This looks unrelated, but it is caused by this patch, probably by a
conflicting type for 'sram'. It seems we need to make an exception for
'sram' in dtschema.

Best regards,
Krzysztof



Re: [PATCH v4 16/18] x86/sgx: Limit process EPC usage with misc cgroup controller

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Kristen Carlson Accardi 
>
> Implement support for cgroup control of SGX Enclave Page Cache (EPC)
> memory using the misc cgroup controller. EPC memory is independent
> from normal system memory, e.g. must be reserved at boot from RAM and
> cannot be converted between EPC and normal memory while the system is
> running. EPC is managed by the SGX subsystem and is not accounted by
> the memory controller.
>
> Much like normal system memory, EPC memory can be overcommitted via
> virtual memory techniques and pages can be swapped out of the EPC to
> their backing store (normal system memory, e.g. shmem).  The SGX EPC
> subsystem is analogous to the memory subsystem and the SGX EPC controller
> is in turn analogous to the memory controller; it implements limit and
> protection models for EPC memory.
>
> The misc controller provides a mechanism to set a hard limit of EPC
> usage via the "sgx_epc" resource in "misc.max". The total EPC memory
> available on the system is reported via the "sgx_epc" resource in
> "misc.capacity".
>
> This patch was modified from its original version to use the misc cgroup
> controller instead of a custom controller.
>
> Signed-off-by: Sean Christopherson 
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> Tested-by: Mikko Ylinen 
>
> Cc: Sean Christopherson 
> ---
> V4:
> - Fix a white space issue in Kconfig (Randy).
> - Update comments for LRU list as it can be owned by a cgroup.
> - Fix comments for sgx_reclaim_epc_pages() and use IS_ENABLED consistently 
> (Mikko)
>
> V3:
>
> 1) Use the same maximum number of reclaiming candidate pages to be
> processed, SGX_NR_TO_SCAN_MAX, for each reclaiming iteration in both
> cgroup worker function and ksgxd. This fixes an overflow in the
> backing store buffer with the same fixed size allocated on stack in
> sgx_reclaim_epc_pages().
>
> 2) Initialize max for root EPC cgroup. Otherwise, all
> misc_cg_try_charge() calls would fail as it checks for all limits of
> ancestors all the way to the root node.
>
> 3) Start reclaiming whenever misc_cg_try_charge fails. Removed all
> re-checks for limits and current usage. For all intents and purposes,
> when misc_cg_try_charge() fails, reclaiming is needed. This also corrects
> an error of not reclaiming when the child limit is larger than one of
> its ancestors.
>
> 4) Handle failure on charging to the root EPC cgroup. Failure on charging
> to root means we are at or above capacity, so start reclaiming or return
> OOM error.
>
> 5) Removed the custom cgroup tree walking iterator with epoch tracking
> logic. Replaced it with just the plain css_for_each_descendant_pre
> iterator. The custom iterator implemented a rather complex epoch scheme
> I believe was intended to prevent extra reclaiming from multiple worker
> threads doing the same walk but it turned out not to matter much as each
> thread would only reclaim when usage is above limit. Using the plain
> css_for_each_descendant_pre iterator simplified code a bit.
>
> 6) Do not reclaim synchronously in misc_max_write callback which would
> block the user. Instead queue an async work item to run the reclaiming
> loop.
>
> 7) Other minor refactoring:
> - Remove unused params in epc_cgroup APIs
> - centralize uncharge into sgx_free_epc_page()
> ---
>  arch/x86/Kconfig |  13 +
>  arch/x86/kernel/cpu/sgx/Makefile |   1 +
>  arch/x86/kernel/cpu/sgx/epc_cgroup.c | 406 +++
>  arch/x86/kernel/cpu/sgx/epc_cgroup.h |  59 
>  arch/x86/kernel/cpu/sgx/main.c   |  67 -
>  arch/x86/kernel/cpu/sgx/sgx.h|  17 +-
>  6 files changed, 547 insertions(+), 16 deletions(-)
>  create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.c
>  create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.h
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 982b777eadc7..55fcf182d4a3 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1921,6 +1921,19 @@ config X86_SGX
>  
> If unsure, say N.
>  
> +config CGROUP_SGX_EPC
> + bool "Miscellaneous Cgroup Controller for Enclave Page Cache (EPC) for 
> Intel SGX"
> + depends on X86_SGX && CGROUP_MISC
> + help
> +   Provides control over the EPC footprint of tasks in a cgroup via
> +   the Miscellaneous cgroup controller.
> +
> +   EPC is a subset of regular memory that is usable only by SGX
> +   enclaves and is very limited in quantity, e.g. less than 1%
> +   of total DRAM.
> +
> +   Say N if unsure.
> +
>  config X86_USER_SHADOW_STACK
>   bool "X86 userspace shadow stack"
>   depends on AS_WRUSS
> diff --git a/arch/x86/kernel/cpu/sgx/Makefile 
> b/arch/x86/kernel/cpu/sgx/Makefile
> index 9c1656779b2a..12901a488da7 100644
> --- a/arch/x86/kernel/cpu/sgx/Makefile
> +++ b/arch/x86/kernel/cpu/sgx/Makefile
> @@ -4,3 +4,4 @@ obj-y += \
>   ioctl.o \
>   main.o
>  obj-$(CONFIG_X86_SGX_KVM)+= virt.o
> 

Re: [PATCH v6 1/4] dt-bindings: remoteproc: k3-m4f: Add K3 AM64x SoCs

2023-09-13 Thread Krzysztof Kozlowski
On 13/09/2023 15:59, Hari Nagalla wrote:
> On 9/13/23 06:32, Krzysztof Kozlowski wrote:
>>>   - Removed unrelated items from examples
>>>
>>> Changes since v4:
>>>   - Rebased to the latest kernel-next tree
>>>   - Added optional sram memory region for m4f device node
>>>
>>> Changes since v5:
>>>   - None
>> Hm, why none? There were errors in the binding to which you did not
>> respond. Did you just ignore them?
>>
> I do not see any errors in my builds. Am I missing something? Please
> excuse my lack of knowledge here. I thought the bot errors were outside of
> the patch submitted
> (Documentation/devicetree/bindings/dma/stericsson,dma40.yaml).
> I appreciate your kind input.

I would then expect some confirmation that the errors can be ignored.
Instead the report was just left unanswered.

> 
> $ make -j`nproc` ARCH=arm64 V=1 CROSS_COMPILE=aarch64-none-linux-gnu- 
> DT_CHEKCER_FLAGS=-m dt_binding_check 
> DT_SCHEMA_FILES=Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml
> make --no-print-directory -C /home/a0868345/temp/linux-next \
> -f /home/a0868345/temp/linux-next/Makefile dt_binding_check
> make -f ./scripts/Makefile.build obj=scripts/basic
> make -f ./scripts/Makefile.build obj=scripts/dtc
> make -f ./scripts/Makefile.build obj=Documentation/devicetree/bindings
> # LINTDocumentation/devicetree/bindings
>(find ./Documentation/devicetree/bindings \( -name '*.yaml' ! -name 
> 'processed-schema*' \) | grep -F -e 
> "Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml" | 
> xargs -n200 -P$(nproc) /home/a0868345/.local/bin/yamllint -f parsable -c 
> ./Documentation/devicetree/bindings/.yamllint >&2) || true
> # DTEX 
> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dts
>dt-extract-example 
> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml > 
> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dts
> # CHKDT   Documentation/devicetree/bindings/processed-schema.json
>(find ./Documentation/devicetree/bindings \( -name '*.yaml' ! -name 
> 'processed-schema*' \) | grep -F -e 
> "Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml" | 
> xargs -n200 -P$(nproc) dt-doc-validate -u 
> ./Documentation/devicetree/bindings) || true
> # SCHEMA  Documentation/devicetree/bindings/processed-schema.json
>f=$(mktemp) ; find ./Documentation/devicetree/bindings \( -name 
> '*.yaml' ! -name 'processed-schema*' \) > $f ; dt-mk-schema -j  @$f > 
> Documentation/devicetree/bindings/processed-schema.json ; rm -f $f
> # DTC_CHK 
> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dtb
>gcc -E 
> -Wp,-MMD,Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d.pre.tmp
>  
> -nostdinc -I./scripts/dtc/include-prefixes -undef -D__DTS__ -x 
> assembler-with-cpp -o 
> Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.dts.tmp
>  
> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dts 
> ; ./scripts/dtc/dtc -o 
> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dtb 
> -b 0 -iDocumentation/devicetree/bindings/remoteproc/ 
> -i./scripts/dtc/include-prefixes -Wno-avoid_unnecessary_addr_size 
> -Wno-graph_child_address -Wno-interrupt_provider 
> -Wno-unique_unit_address -Wunique_unit_address_if_enabled -d 
> Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d.dtc.tmp
>  
> Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.dts.tmp
>  
> ; cat 
> Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d.pre.tmp
>  
> Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d.dtc.tmp
>  
>  > 
> Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d 
> ; dt-validate  -u ./Documentation/devicetree/bindings -p 
> ./Documentation/devicetree/bindings/processed-schema.json 
> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dtb 
> || true
> 
> 
>  >> +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
>  >> +%YAML 1.2
>  >> +---
>  >> +$id:http://devicetree.org/schemas/remoteproc/ti,k3-m4f-rproc.yaml#
>  >> +$schema:http://devicetree.org/meta-schemas/core.yaml#
>  >> +
>  >> +title: TI K3 M4F processor subsystems
>  >> +
>  >> +maintainers:
>  >> +  - Hari Nagalla
>  >> +  - Mathieu Poirier
>  > Are you sure Mathieu has this device and is a maintainer of this device?
>  >
> Earlier, Mathieu suggested he can be the maintainer. Beagle play is 
> based on AM625 device.
> 

Sure, no problem, good for me.

Best regards,
Krzysztof



Re: [PATCH v4 15/18] x86/sgx: Prepare for multiple LRUs

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> Add sgx_can_reclaim() wrapper and encapsulate direct references to the
> global LRU list in the reclaimer functions so that they can be called with
> an LRU list per EPC cgroup.
>
> Signed-off-by: Sean Christopherson 
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> Cc: Sean Christopherson 
> ---
> V4:
> - Re-organized this patch to include all changes related to
> encapsulation of the global LRU
> - Moved this patch to precede the EPC cgroup patch
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 41 +++---
>  1 file changed, 28 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index ce316bd5e5bb..3d396fe5ec09 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -34,6 +34,16 @@ static DEFINE_XARRAY(sgx_epc_address_space);
>   */
>  static struct sgx_epc_lru_lists sgx_global_lru;
>  
> +static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct sgx_epc_page *epc_page)
> +{
> + return &sgx_global_lru;
> +}

I'd simply export sgx_global_lru.

> +static inline bool sgx_can_reclaim(void)
> +{
> + return !list_empty(&sgx_global_lru.reclaimable);
> +}


Accessors for the object should be named so that this fact is reflected,
e.g. sgx_global_lru_can_reclaim() in this case.

I would just open-code this at the call sites, though.

> +
>  static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0);
>  
>  /* Nodes with one or more EPC sections. */
> @@ -339,6 +349,7 @@ size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool 
> ignore_age)
>   struct sgx_backing backing[SGX_NR_TO_SCAN_MAX];
>   struct sgx_epc_page *epc_page, *tmp;
>   struct sgx_encl_page *encl_page;
> + struct sgx_epc_lru_lists *lru;
>   pgoff_t page_index;
>   LIST_HEAD(iso);
>   size_t ret;
> @@ -372,10 +383,11 @@ size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool 
> ignore_age)
>   continue;
>  
>  skip:
> - spin_lock(&sgx_global_lru.lock);
> + lru = sgx_lru_lists(epc_page);
> + spin_lock(&lru->lock);
>   sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIMABLE);
> - list_move_tail(&epc_page->list, &sgx_global_lru.reclaimable);
> - spin_unlock(&sgx_global_lru.lock);
> + list_move_tail(&epc_page->list, &lru->reclaimable);
> + spin_unlock(&lru->lock);
>  
>   kref_put(&encl_page->encl->refcount, sgx_encl_release);
>   }
> @@ -399,7 +411,7 @@ size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool 
> ignore_age)
>  static bool sgx_should_reclaim(unsigned long watermark)
>  {
>   return atomic_long_read(&sgx_nr_free_pages) < watermark &&
> -        !list_empty(&sgx_global_lru.reclaimable);
> + sgx_can_reclaim();
>  }
>  
>  /*
> @@ -529,14 +541,16 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void)
>   */
>  void sgx_record_epc_page(struct sgx_epc_page *page, unsigned long flags)
>  {
> - spin_lock(&sgx_global_lru.lock);
> + struct sgx_epc_lru_lists *lru = sgx_lru_lists(page);
> +
> + spin_lock(&lru->lock);
>   WARN_ON_ONCE(sgx_epc_page_reclaimable(page->flags));
>   page->flags |= flags;
>   if (sgx_epc_page_reclaimable(flags))
> - list_add_tail(&page->list, &sgx_global_lru.reclaimable);
> + list_add_tail(&page->list, &lru->reclaimable);
>   else
> - list_add_tail(&page->list, &sgx_global_lru.unreclaimable);
> - spin_unlock(&sgx_global_lru.lock);
> + list_add_tail(&page->list, &lru->unreclaimable);
> + spin_unlock(&lru->lock);
>  }
>  
>  /**
> @@ -551,15 +565,16 @@ void sgx_record_epc_page(struct sgx_epc_page *page, 
> unsigned long flags)
>   */
>  int sgx_drop_epc_page(struct sgx_epc_page *page)
>  {
> - spin_lock(&sgx_global_lru.lock);
> + struct sgx_epc_lru_lists *lru = sgx_lru_lists(page);
> +
> + spin_lock(&lru->lock);
>   if (sgx_epc_page_reclaim_in_progress(page->flags)) {
> - spin_unlock(&sgx_global_lru.lock);
> + spin_unlock(&lru->lock);
>   return -EBUSY;
>   }
> -
>   list_del(&page->list);
>   sgx_epc_page_reset_state(page);
> - spin_unlock(&sgx_global_lru.lock);
> + spin_unlock(&lru->lock);
>  
>   return 0;
>  }
> @@ -592,7 +607,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool 
> reclaim)
>   break;
>   }
>  
> - if (list_empty(&sgx_global_lru.reclaimable))
> + if (!sgx_can_reclaim())
>   return ERR_PTR(-ENOMEM);
>  
>   if (!reclaim) {
> -- 
> 2.25.1

BR, Jarkko


Re: [PATCH v4 13/18] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Sean Christopherson 
>
> Adjust and expose the top-level reclaim function as
> sgx_reclaim_epc_pages() for use by the upcoming EPC cgroup, which will
> initiate reclaim to enforce the max limit.
>
> Make these adjustments to the function signature.
>
> 1) To take a parameter that specifies the number of pages to scan for
> reclaiming. Define a max value of 32, but scan 16 in the case for the
> global reclaimer (ksgxd). The EPC cgroup will use it to specify a
> desired number of pages to be reclaimed up to the max value of 32.
>
> 2) To take a flag to force reclaiming a page regardless of its age.  The
> EPC cgroup will use the flag to enforce its limits by draining the
> reclaimable lists before resorting to other measures, e.g. forcefully
> kill enclaves.
>
> 3) Return the number of reclaimed pages. The EPC cgroup will use the
> result to track reclaiming progress and escalate to a more forceful
> reclaiming mode, e.g., calling this function with the flag to ignore age
> of pages.
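
The calling pattern described in points 1) to 3) can be sketched in userspace as below. This is only an illustration under stated assumptions: `struct sketch_lru`, `sketch_reclaim()` and `sketch_reclaim_escalating()` are hypothetical names, the LRU is reduced to two counters, and `nr_to_scan` is treated as a cap on reclaimed pages rather than scanned pages. It is not the SGX code itself.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_TO_SCAN_MAX 32 /* mirrors the max of 32 from the commit message */

/* Hypothetical stand-in for an LRU: just counts of old and young pages. */
struct sketch_lru {
	size_t nr_old;
	size_t nr_young;
};

/*
 * Same shape as the new signature: take a page budget and an ignore_age
 * flag, return the number of pages reclaimed. Young pages are only taken
 * when ignore_age is set. Simplification: the budget caps reclaimed pages.
 */
static size_t sketch_reclaim(struct sketch_lru *lru, size_t nr_to_scan,
			     bool ignore_age)
{
	size_t reclaimed = 0;

	if (nr_to_scan > SKETCH_NR_TO_SCAN_MAX)
		nr_to_scan = SKETCH_NR_TO_SCAN_MAX;

	while (reclaimed < nr_to_scan && lru->nr_old) {
		lru->nr_old--;
		reclaimed++;
	}
	while (ignore_age && reclaimed < nr_to_scan && lru->nr_young) {
		lru->nr_young--;
		reclaimed++;
	}
	return reclaimed;
}

/* Caller side: respect page age first, escalate only if that falls short. */
static size_t sketch_reclaim_escalating(struct sketch_lru *lru, size_t want)
{
	size_t got = sketch_reclaim(lru, want, false);

	if (got < want)
		got += sketch_reclaim(lru, want - got, true);
	return got;
}
```

The return value is what lets the caller decide whether the gentler pass was enough before it sets the flag to ignore page age.
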
>
> Signed-off-by: Sean Christopherson 
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> Cc: Sean Christopherson 
> ---
> V4:
> - Combined the 3 patches that made the individual changes to the
> function signature.
> - Removed 'high' limit in commit message.
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 30 --
>  arch/x86/kernel/cpu/sgx/sgx.h  |  1 +
>  2 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index c8900d62cfff..e1dde431a400 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -17,6 +17,10 @@
>  #include "driver.h"
>  #include "encl.h"
>  #include "encls.h"

newline here

> +/**

/*

> + * Maximum number of pages to scan for reclaiming.
> + */
> +#define SGX_NR_TO_SCAN_MAX   32
>  
>  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
>  static int sgx_nr_epc_sections;
> @@ -279,7 +283,11 @@ static void sgx_reclaimer_write(struct sgx_epc_page 
> *epc_page,
>   mutex_unlock(&encl->lock);
>  }
>  
> -/*
> +/**
> + * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers
> + * @nr_to_scan:   Number of EPC pages to scan for reclaim
> + * @ignore_age:   Reclaim a page even if it is young
> + *
>   * Take a fixed number of pages from the head of the active page pool and
>   * reclaim them to the enclave's private shmem files. Skip the pages, which 
> have
>   * been accessed since the last scan. Move those pages to the tail of active
> @@ -292,15 +300,15 @@ static void sgx_reclaimer_write(struct sgx_epc_page 
> *epc_page,
>   * problematic as it would increase the lock contention too much, which would
>   * halt forward progress.
>   */
> -static void sgx_reclaim_pages(void)
> +size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age)
>  {
> - struct sgx_backing backing[SGX_NR_TO_SCAN];
> + struct sgx_backing backing[SGX_NR_TO_SCAN_MAX];
>   struct sgx_epc_page *epc_page, *tmp;
>   struct sgx_encl_page *encl_page;
>   pgoff_t page_index;
>   LIST_HEAD(iso);
> - int ret;
> - int i;
> + size_t ret;
> + size_t i;

I don't mind having these on separate lines, but you could also write:

size_t ret, i;

>  
>   spin_lock(&sgx_global_lru.lock);
>   for (i = 0; i < SGX_NR_TO_SCAN; i++) {
> @@ -326,13 +334,14 @@ static void sgx_reclaim_pages(void)
>   spin_unlock(&sgx_global_lru.lock);
>  
>   if (list_empty(&iso))
> - return;
> + return 0;
>  
>   i = 0;
>   list_for_each_entry_safe(epc_page, tmp, &iso, list) {
>   encl_page = epc_page->encl_page;
>  
> - if (!sgx_reclaimer_age(epc_page))
> + if (i == SGX_NR_TO_SCAN_MAX ||
> + (!ignore_age && !sgx_reclaimer_age(epc_page)))
>   goto skip;
>  
>   page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base);
> @@ -371,6 +380,7 @@ static void sgx_reclaim_pages(void)
>  
>   sgx_free_epc_page(epc_page);
>   }

newline

> + return i;
>  }
>  
>  static bool sgx_should_reclaim(unsigned long watermark)
> @@ -387,7 +397,7 @@ static bool sgx_should_reclaim(unsigned long watermark)
>  void sgx_reclaim_direct(void)
>  {
>   if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
> - sgx_reclaim_pages();
> + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false);
>  }
>  
>  static int ksgxd(void *p)
> @@ -410,7 +420,7 @@ static int ksgxd(void *p)
>sgx_should_reclaim(SGX_NR_HIGH_PAGES));
>  
>   if (sgx_should_reclaim(SGX_NR_HIGH_PAGES))
> - sgx_reclaim_pages();
> + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false);
>  
>   cond_resched();
>   }
> @@ -582,7 +592,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool 
> reclaim)
>   

[PATCH v2] tracefs: Add show_events_dentries

2023-09-13 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

Add a file in tracefs that shows the "events" allocated entries and the
dentries that are attached to them. This is used to see what dentries have
been dynamically allocated as well as their current ref counts.

 ~# cat /sys/kernel/tracing/events/sched/sched_switch/enable
 0
 ~# grep -A4 sched_switch /sys/kernel/tracing/show_events_dentries
 sched_switch/ dentry: (1)
 enable dentry: (0)
 id
 filter
 trigger

The first value is the name of the file or directory. If a dentry is
allocated, then a "dentry: ()" is displayed showing the address
of the dentry as well as its ref count.

Signed-off-by: Steven Rostedt (Google) 
---
Changes since v1:  
https://lore.kernel.org/linux-trace-kernel/20230913030021.550427...@goodmis.org

 - Fixed the missed deletion of the structures from the event_inode.c file
   that was incorrectly in another patch

 - Removed addresses (no need to expose KASLR)

 - Added comments to explain how it works!

 - Now a stand alone patch to go in for this release

 fs/tracefs/Makefile |   1 +
 fs/tracefs/event_inode.c|  42 +--
 fs/tracefs/event_show.c | 147 
 fs/tracefs/internal.h   |  44 +++
 include/linux/tracefs.h |   2 +
 kernel/trace/trace_events.c |   3 +
 6 files changed, 198 insertions(+), 41 deletions(-)
 create mode 100644 fs/tracefs/event_show.c

diff --git a/fs/tracefs/Makefile b/fs/tracefs/Makefile
index 73c56da8e284..8f48f4fc6698 100644
--- a/fs/tracefs/Makefile
+++ b/fs/tracefs/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 tracefs-objs   := inode.o
 tracefs-objs   += event_inode.o
+tracefs-objs   += event_show.o
 
 obj-$(CONFIG_TRACING)  += tracefs.o
 
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 9f64e7332796..b23bb0957bb4 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -23,47 +23,7 @@
 #include 
 #include "internal.h"
 
-struct eventfs_inode {
-   struct list_head    e_top_files;
-};
-
-/*
- * struct eventfs_file - hold the properties of the eventfs files and
- *   directories.
- * @name:  the name of the file or directory to create
- * @d_parent:   holds parent's dentry
- * @dentry: once accessed holds dentry
- * @list:  file or directory to be added to parent directory
- * @ei:    list of files and directories within directory
- * @fop:   file_operations for file or directory
- * @iop:   inode_operations for file or directory
- * @data:  something that the caller will want to get to later on
- * @mode:  the permission that the file or directory should have
- */
-struct eventfs_file {
-   const char                      *name;
-   struct dentry                   *d_parent;
-   struct dentry                   *dentry;
-   struct list_head                list;
-   struct eventfs_inode            *ei;
-   const struct file_operations    *fop;
-   const struct inode_operations   *iop;
-   /*
-    * Union - used for deletion
-    * @del_list:   list of eventfs_file to delete
-    * @rcu:        eventfs_file to delete in RCU
-    * @is_freed:   node is freed if one of the above is set
-    */
-   union {
-       struct list_head    del_list;
-       struct rcu_head     rcu;
-       unsigned long       is_freed;
-   };
-   void                            *data;
-   umode_t                         mode;
-};
-
-static DEFINE_MUTEX(eventfs_mutex);
+DEFINE_MUTEX(eventfs_mutex);
 DEFINE_STATIC_SRCU(eventfs_srcu);
 
 static struct dentry *eventfs_root_lookup(struct inode *dir,
diff --git a/fs/tracefs/event_show.c b/fs/tracefs/event_show.c
new file mode 100644
index ..66dece7cc810
--- /dev/null
+++ b/fs/tracefs/event_show.c
@@ -0,0 +1,147 @@
+#include 
+#include 
+#include "internal.h"
+
+/*
+ * This will iterate three lists that correspond to the directory level
+ * of the eventfs directory.
+ *
+ * level 0 : /sys/kernel/tracing/events
+ * level 1 : /sys/kernel/tracing/events/
+ * level 2 : /sys/kernel/tracing/events//event
+ *
+ * The iterator needs to see all levels as they all contain dynamically
+ * allocated dentries and inodes.
+ */
+struct event_list {
+   int level;
+   struct list_head    *head[3];
+   struct list_head    *next[3];
+};
+
+static void *e_next(struct seq_file *m, void *v, loff_t *pos)
+{
+   struct event_list *elist = m->private;
+   int level = elist->level;
+   struct list_head *head = elist->head[level];
+   struct list_head *next = elist->next[level];
+   struct eventfs_file *ef;
+
+   (*pos)++;
+
+   /* If next is equal to head, then the list is complete */
+   while (next == head) {
+   if (!level)
+   return NULL;
+
+   /* sublevel below top level, go up one */
+   

Re: [PATCH] dax: refactor deprecated strncpy

2023-09-13 Thread Dave Jiang



On 9/12/23 18:10, Justin Stitt wrote:
> `strncpy` is deprecated for use on NUL-terminated destination strings [1].
> 
> We should prefer more robust and less ambiguous string interfaces.
> 
> `dax_id->dev_name` is expected to be NUL-terminated and has been 
> zero-allocated.
> 
> A suitable replacement is `strscpy` [2] due to the fact that it
> guarantees NUL-termination on the destination buffer. Moreover, due to
> `dax_id` being zero-allocated the padding behavior of `strncpy` is not
> needed and a simple 1:1 replacement of strncpy -> strscpy should
> suffice.
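
For readers unfamiliar with the difference, here is a userspace sketch of the truncation behaviour being relied on. `sketch_strscpy()` is a hypothetical stand-in written for this note, not the kernel implementation: it assumes the documented strscpy() contract of always NUL-terminating a non-empty destination and returning -E2BIG when the source does not fit.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical userspace stand-in for strscpy(): copy at most size - 1
 * bytes, always NUL-terminate when size > 0, and return the number of
 * bytes copied (excluding the NUL) or -E2BIG if src was truncated.
 */
static ssize_t sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If we stopped because the buffer was full, src was truncated. */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

With a 4-byte destination and the source "abcdef", this leaves "abc" plus a terminator and reports -E2BIG, whereas strncpy() with the same arguments would fill all 4 bytes and leave the destination unterminated.
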
> 
> Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
> Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
> Link: https://github.com/KSPP/linux/issues/90
> Cc: linux-harden...@vger.kernel.org
> Signed-off-by: Justin Stitt 

Reviewed-by: Dave Jiang 
> ---
> Note: build-tested only.
> ---
>  drivers/dax/bus.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 0ee96e6fc426..1659b787b65f 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -103,7 +103,7 @@ static ssize_t do_id_store(struct device_driver *drv, 
> const char *buf,
>   if (action == ID_ADD) {
>   dax_id = kzalloc(sizeof(*dax_id), GFP_KERNEL);
>   if (dax_id) {
> - strncpy(dax_id->dev_name, buf, DAX_NAME_LEN);
> + strscpy(dax_id->dev_name, buf, DAX_NAME_LEN);
>   list_add(&dax_id->list, &dax_drv->ids);
>   } else
>   rc = -ENOMEM;
> 
> ---
> base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
> change-id: 20230913-strncpy-drivers-dax-bus-c-f12e3153e44b
> 
> Best regards,
> --
> Justin Stitt 
> 


Re: [PATCH v4 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Sean Christopherson 
>
> Introduce the OOM path for killing an enclave with a reclaimer that is no
> longer able to reclaim enough EPC pages. Find a victim enclave, which
> will be an enclave with only "unreclaimable" EPC pages left in the
> cgroup LRU lists. Once a victim is identified, mark the enclave as OOM
> and zap the enclave's entire page range, and drain all mm references in
> encl->mm_list. Block allocating any EPC pages in #PF handler, or
> reloading any pages in all paths, or creating any new mappings.
>
> The OOM killing path may race with the reclaimers: in some cases, the
> victim enclave is in the process of reclaiming the last EPC pages when
> OOM happens, that is, all pages other than SECS and VA pages are in
> RECLAIMING_IN_PROGRESS state. The reclaiming process requires access to
> the enclave backing, VA pages as well as SECS. So the OOM killer does
> not directly release those enclave resources; instead, it lets all
> in-progress reclaiming finish, and relies (as currently done) on
> kref_put on encl->refcount to trigger sgx_encl_release() to do the
> final cleanup.
>
> Signed-off-by: Sean Christopherson 
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> Cc: Sean Christopherson 
> ---
> V4:
> - Updates for patch reordering and typo fixes.
>
> V3:
> - Rebased to use the new VMA_ITERATOR to zap VMAs.
> - Fixed the racing cases by blocking new page allocation/mapping and
> reloading when enclave is marked for OOM. And do not release any enclave
> resources other than draining mm_list entries, and let pages in
> RECLAIMING_IN_PROGRESS to be reaped by reclaimers.
> - Due to above changes, also removed the no-longer needed encl->lock in
> the OOM path which was causing deadlocks reported by the lock prover.
> ---
>  arch/x86/kernel/cpu/sgx/driver.c |  27 +-
>  arch/x86/kernel/cpu/sgx/encl.c   |  48 ++-
>  arch/x86/kernel/cpu/sgx/encl.h   |   2 +
>  arch/x86/kernel/cpu/sgx/ioctl.c  |   9 ++
>  arch/x86/kernel/cpu/sgx/main.c   | 140 +++
>  arch/x86/kernel/cpu/sgx/sgx.h|   1 +
>  6 files changed, 200 insertions(+), 27 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/driver.c 
> b/arch/x86/kernel/cpu/sgx/driver.c
> index 262f5fb18d74..ff42d649c7b6 100644
> --- a/arch/x86/kernel/cpu/sgx/driver.c
> +++ b/arch/x86/kernel/cpu/sgx/driver.c
> @@ -44,7 +44,6 @@ static int sgx_open(struct inode *inode, struct file *file)
>  static int sgx_release(struct inode *inode, struct file *file)
>  {
>   struct sgx_encl *encl = file->private_data;
> - struct sgx_encl_mm *encl_mm;
>  
>   /*
>* Drain the remaining mm_list entries. At this point the list contains
> @@ -52,31 +51,7 @@ static int sgx_release(struct inode *inode, struct file 
> *file)
>* not exited yet. The processes, which have exited, are gone from the
>* list by sgx_mmu_notifier_release().
>*/
> - for ( ; ; )  {
> - spin_lock(&encl->mm_lock);
> -
> - if (list_empty(&encl->mm_list)) {
> - encl_mm = NULL;
> - } else {
> - encl_mm = list_first_entry(&encl->mm_list,
> -    struct sgx_encl_mm, list);
> - list_del_rcu(&encl_mm->list);
> - }
> -
> - spin_unlock(&encl->mm_lock);
> -
> - /* The enclave is no longer mapped by any mm. */
> - if (!encl_mm)
> - break;
> -
> - synchronize_srcu(&encl->srcu);
> - mmu_notifier_unregister(&encl_mm->mmu_notifier, encl_mm->mm);
> - kfree(encl_mm);
> -
> - /* 'encl_mm' is gone, put encl_mm->encl reference: */
> - kref_put(&encl->refcount, sgx_encl_release);
> - }
> -
> + sgx_encl_mm_drain(encl);
>   kref_put(&encl->refcount, sgx_encl_release);
>   return 0;
>  }
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index bf0ac3677ca8..85b6f218f029 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -453,6 +453,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>   if (unlikely(!encl))
>   return VM_FAULT_SIGBUS;
>  
> + if (test_bit(SGX_ENCL_OOM, &encl->flags))
> + return VM_FAULT_SIGBUS;
> +
>   /*
>* The page_array keeps track of all enclave pages, whether they
>* are swapped out or not. If there is no entry for this page and
> @@ -651,7 +654,8 @@ static int sgx_vma_access(struct vm_area_struct *vma, 
> unsigned long addr,
>   if (!encl)
>   return -EFAULT;
>  
> - if (!test_bit(SGX_ENCL_DEBUG, &encl->flags))
> + if (!test_bit(SGX_ENCL_DEBUG, &encl->flags) ||
> +     test_bit(SGX_ENCL_OOM, &encl->flags))
>   return -EFAULT;
>  
>   for (i = 0; i < len; i += cnt) {
> @@ -776,6 +780,45 @@ void sgx_encl_release(struct kref *ref)
>   kfree(encl);
>  }
>  
> 

Re: [PATCH v4 11/18] x86/sgx: store unreclaimable pages in LRU lists

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Kristen Carlson Accardi 
>
> When an OOM event occurs, all pages associated with an enclave will need
> to be freed, including pages that are not currently tracked by the
> cgroup LRU lists.
>
> Add a new "unreclaimable" list to the sgx_epc_lru_lists struct and
> update the "sgx_record/drop_epc_pages()" functions for adding/removing
> VA and SECS pages to/from this "unreclaimable" list.
>
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> ---
> V4:
> - Updates for patch reordering.
> - Revised commit messages.
> - Revised comments for the list.
>
> V3:
> - Removed tracking virtual EPC pages in unreclaimable list as host
> kernel does not reclaim them. The EPC cgroups implemented later only
> blocks allocating for a guest if the limit is reached by returning
> -ENOMEM from sgx_alloc_epc_page() called by virt_epc, and does nothing
> else. Therefore, no need to track those in LRU lists.
> ---
>  arch/x86/kernel/cpu/sgx/encl.c  | 2 ++
>  arch/x86/kernel/cpu/sgx/ioctl.c | 1 +
>  arch/x86/kernel/cpu/sgx/main.c  | 3 +++
>  arch/x86/kernel/cpu/sgx/sgx.h   | 8 +++-
>  4 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 91f83a5e543d..bf0ac3677ca8 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -748,6 +748,7 @@ void sgx_encl_release(struct kref *ref)
>   xa_destroy(&encl->page_array);
>  
>   if (!encl->secs_child_cnt && encl->secs.epc_page) {
> + sgx_drop_epc_page(encl->secs.epc_page);
>   sgx_encl_free_epc_page(encl->secs.epc_page);
>   encl->secs.epc_page = NULL;
>   }
> @@ -756,6 +757,7 @@ void sgx_encl_release(struct kref *ref)
>   va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
>  list);
>   list_del(&va_page->list);
> + sgx_drop_epc_page(va_page->epc_page);
>   sgx_encl_free_epc_page(va_page->epc_page);
>   kfree(va_page);
>   }
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 95ec20a6992f..8c23bb524674 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -48,6 +48,7 @@ void sgx_encl_shrink(struct sgx_encl *encl, struct 
> sgx_va_page *va_page)
>   encl->page_cnt--;
>  
>   if (va_page) {
> + sgx_drop_epc_page(va_page->epc_page);
>   sgx_encl_free_epc_page(va_page->epc_page);
>   list_del(&va_page->list);
>   kfree(va_page);
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index ed813288af44..f3a3ed894616 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -268,6 +268,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page 
> *epc_page,
>   goto out;
>  
>   sgx_encl_ewb(encl->secs.epc_page, &secs_backing);
> + sgx_drop_epc_page(encl->secs.epc_page);
>   sgx_encl_free_epc_page(encl->secs.epc_page);
>   encl->secs.epc_page = NULL;
>  
> @@ -510,6 +511,8 @@ void sgx_record_epc_page(struct sgx_epc_page *page, 
> unsigned long flags)
>   page->flags |= flags;
>   if (sgx_epc_page_reclaimable(flags))
>   list_add_tail(&page->list, &sgx_global_lru.reclaimable);
> + else
> + list_add_tail(&page->list, &sgx_global_lru.unreclaimable);
>   spin_unlock(&sgx_global_lru.lock);
>  }
>  
> diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> index e06b4aadb6a1..e210af77f0cf 100644
> --- a/arch/x86/kernel/cpu/sgx/sgx.h
> +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> @@ -150,17 +150,23 @@ static inline void *sgx_get_epc_virt_addr(struct 
> sgx_epc_page *page)
>  }
>  
>  /*
> - * Tracks EPC pages reclaimable by the reclaimer (ksgxd).
> + * Contains EPC pages tracked by the reclaimer (ksgxd).
>   */
>  struct sgx_epc_lru_lists {
>   spinlock_t lock;
>   struct list_head reclaimable;
> + /*
> +  * Tracks SECS, VA pages,etc., pages only freeable after all its
> +  * dependent reclaimables are freed.
> +  */
> + struct list_head unreclaimable;
>  };
>  
>  static inline void sgx_lru_init(struct sgx_epc_lru_lists *lrus)
>  {
>   spin_lock_init(&lrus->lock);
>   INIT_LIST_HEAD(&lrus->reclaimable);
> + INIT_LIST_HEAD(&lrus->unreclaimable);
>  }
>  
>  struct sgx_epc_page *__sgx_alloc_epc_page(void);
> -- 
> 2.25.1

LGTM

BR, Jarkko


Re: [PATCH v4 09/18] x86/sgx: Store struct sgx_encl when allocating new VA pages

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> In a later patch, when a cgroup has exceeded the max capacity for EPC
> pages, it may need to identify and OOM kill a less active enclave to
> make room for other enclaves within the same group. Such a victim
> enclave would have no active pages other than the unreclaimable Version
> Array (VA) and SECS pages.  Therefore, the cgroup needs to examine its
> unreclaimable page list and find an enclave given a SECS page or a
> VA page. This will require a backpointer from a page to an enclave,
> which is not available for VA pages.
>
> Because struct sgx_epc_page instances of VA pages are not owned by an
> sgx_encl_page instance, mark their owner as sgx_encl: pass the struct
> sgx_encl of the enclave allocating the VA page to sgx_alloc_epc_page(),
> which will store this value in the owner field of the struct
> sgx_epc_page.  In a later patch, VA pages will be placed in an
> unreclaimable queue that can be examined by the cgroup to select the OOM
> killed enclave.
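
One way to picture the backpointer described above is the sketch below. It is an assumption made for illustration, not the layout taken from the patch: the `sketch_*` types, the `SKETCH_PAGE_VA` flag and the helper are hypothetical, and only the field names `encl_page` and `encl` echo identifiers that appear in the series.

```c
#include <stddef.h>

struct sketch_encl { int id; };
struct sketch_encl_page { struct sketch_encl *encl; };

#define SKETCH_PAGE_VA 0x1 /* hypothetical flag: this EPC page is a VA page */

struct sketch_epc_page {
	unsigned int flags;
	union {
		/* Regular enclave page: owner is the sgx_encl_page analogue. */
		struct sketch_encl_page *encl_page;
		/* VA page: no encl_page exists, so point at the enclave itself. */
		struct sketch_encl *encl;
	};
};

/* Resolve the owning enclave for either kind of page. */
static struct sketch_encl *sketch_page_to_encl(const struct sketch_epc_page *page)
{
	if (page->flags & SKETCH_PAGE_VA)
		return page->encl;
	return page->encl_page ? page->encl_page->encl : NULL;
}
```

A victim search over an unreclaimable list could then use one helper of this kind for every page it finds there, whichever owner type the page has.
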
>
> Signed-off-by: Sean Christopherson 
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> Cc: Sean Christopherson 
> ---
> V4:
> - Changes needed for patch reordering
> - Revised commit messages (Jarkko)
> ---
>  arch/x86/kernel/cpu/sgx/encl.c  |  5 +++--
>  arch/x86/kernel/cpu/sgx/encl.h  |  2 +-
>  arch/x86/kernel/cpu/sgx/ioctl.c |  2 +-
>  arch/x86/kernel/cpu/sgx/main.c  | 20 ++--
>  arch/x86/kernel/cpu/sgx/sgx.h   |  5 -
>  5 files changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index d11d4111aa98..1aee0ad00e66 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -1238,6 +1238,7 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, 
> unsigned long addr)
>  
>  /**
>   * sgx_alloc_va_page() - Allocate a Version Array (VA) page
> + * @encl:The enclave that this page is allocated to.

Maybe this would be clearer:

* @encl:The new owner of the page

>   * @reclaim: Reclaim EPC pages directly if none available. Enclave
>   *   mutex should not be held if this is set.
>   *
> @@ -1247,12 +1248,12 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, 
> unsigned long addr)
>   *   a VA page,
>   *   -errno otherwise
>   */
> -struct sgx_epc_page *sgx_alloc_va_page(bool reclaim)
> +struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim)
>  {
>   struct sgx_epc_page *epc_page;
>   int ret;
>  
> - epc_page = sgx_alloc_epc_page(NULL, reclaim);
> + epc_page = sgx_alloc_epc_page(encl, reclaim);
>   if (IS_ERR(epc_page))
>   return ERR_CAST(epc_page);
>  
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index f94ff14c9486..831d63f80f5a 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -116,7 +116,7 @@ struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl 
> *encl,
> unsigned long offset,
> u64 secinfo_flags);
>  void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr);
> -struct sgx_epc_page *sgx_alloc_va_page(bool reclaim);
> +struct sgx_epc_page *sgx_alloc_va_page(struct sgx_encl *encl, bool reclaim);
>  unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
>  void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
>  bool sgx_va_page_full(struct sgx_va_page *va_page);
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index c28f074d5d71..3ab8c050e665 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -30,7 +30,7 @@ struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, 
> bool reclaim)
>   if (!va_page)
>   return ERR_PTR(-ENOMEM);
>  
> - va_page->epc_page = sgx_alloc_va_page(reclaim);
> + va_page->epc_page = sgx_alloc_va_page(encl, reclaim);
>   if (IS_ERR(va_page->epc_page)) {
>   err = ERR_CAST(va_page->epc_page);
>   kfree(va_page);
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index fba06dc5abfe..ed813288af44 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -107,7 +107,7 @@ static unsigned long __sgx_sanitize_pages(struct 
> list_head *dirty_page_list)
>  
>  static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page)
>  {
> - struct sgx_encl_page *page = epc_page->owner;
> + struct sgx_encl_page *page = epc_page->encl_page;
>   struct sgx_encl *encl = page->encl;
>   struct sgx_encl_mm *encl_mm;
>   bool ret = true;
> @@ -139,7 +139,7 @@ static bool sgx_reclaimer_age(struct sgx_epc_page 
> *epc_page)
>  
>  static void sgx_reclaimer_block(struct sgx_epc_page *epc_page)
>  {
> - struct sgx_encl_page 

Re: [PATCH v4 08/18] x86/sgx: Use a list to track to-be-reclaimed pages

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Kristen Carlson Accardi 
>
> Change sgx_reclaim_pages() to use a list rather than an array for
> storing the epc_pages which will be reclaimed. This change is needed
> to transition to the LRU implementation for EPC cgroup support.
>
> When the EPC cgroup is implemented, the reclaiming process will do a
> pre-order tree walk for the subtree starting from the limit-violating
> cgroup.  When each node is visited, candidate pages are selected from
> its "reclaimable" LRU list and moved into this temporary list. Passing a
> list from node to node for temporary storage in this walk is more
> straightforward than using an array.
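
To make the "pass one list from node to node" point concrete, here is a userspace sketch with a minimal intrusive list. All `sketch_*` names are hypothetical and the helpers only imitate the kernel's list_head operations; locking and page state are left out.

```c
#include <stddef.h>

/* Minimal doubly linked circular list, modelled on the kernel's list_head. */
struct sketch_list {
	struct sketch_list *prev, *next;
};

static void sketch_list_init(struct sketch_list *h)
{
	h->prev = h->next = h;
}

static int sketch_list_empty(const struct sketch_list *h)
{
	return h->next == h;
}

static void sketch_list_del(struct sketch_list *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->prev = e->next = e;
}

static void sketch_list_add_tail(struct sketch_list *e, struct sketch_list *h)
{
	e->prev = h->prev;
	e->next = h;
	h->prev->next = e;
	h->prev = e;
}

/*
 * Move up to nr entries from the head of one node's LRU to the tail of the
 * shared isolation list; return how many were moved.
 */
static size_t sketch_isolate(struct sketch_list *lru, struct sketch_list *iso,
			     size_t nr)
{
	size_t moved = 0;

	while (moved < nr && !sketch_list_empty(lru)) {
		struct sketch_list *e = lru->next;

		sketch_list_del(e);
		sketch_list_add_tail(e, iso);
		moved++;
	}
	return moved;
}
```

A tree walk can call `sketch_isolate()` once per visited node with the same `iso` head and the remaining budget, with no index bookkeeping across nodes as an array would need.
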
>
> Signed-off-by: Sean Christopherson 
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang
> Cc: Sean Christopherson 
> ---
> V4:
> - Changes needed for patch reordering
> - Revised commit message
>
> V3:
> - Removed list wrappers
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 40 +++---
>  1 file changed, 18 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index c1ae19a154d0..fba06dc5abfe 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -293,12 +293,11 @@ static void sgx_reclaimer_write(struct sgx_epc_page 
> *epc_page,
>   */
>  static void sgx_reclaim_pages(void)
>  {
> - struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
>   struct sgx_backing backing[SGX_NR_TO_SCAN];
> + struct sgx_epc_page *epc_page, *tmp;
>   struct sgx_encl_page *encl_page;
> - struct sgx_epc_page *epc_page;
>   pgoff_t page_index;
> - int cnt = 0;
> + LIST_HEAD(iso);
>   int ret;
>   int i;
>  
> @@ -314,18 +313,22 @@ static void sgx_reclaim_pages(void)
>  
>   if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) {
>   sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIM_IN_PROGRESS);
> - chunk[cnt++] = epc_page;
> + list_move_tail(&epc_page->list, &iso);
>   } else {
> - /* The owner is freeing the page. No need to add the
> -  * page back to the list of reclaimable pages.
> + /* The owner is freeing the page, remove it from the
> +  * LRU list
>    */
>   sgx_epc_page_reset_state(epc_page);
> + list_del_init(&epc_page->list);
>   }
>   }
>   spin_unlock(&sgx_global_lru.lock);
>  
> - for (i = 0; i < cnt; i++) {
> - epc_page = chunk[i];
> + if (list_empty(&iso))
> + return;
> +
> + i = 0;
> + list_for_each_entry_safe(epc_page, tmp, &iso, list) {
>   encl_page = epc_page->owner;
>  
>   if (!sgx_reclaimer_age(epc_page))
> @@ -340,6 +343,7 @@ static void sgx_reclaim_pages(void)
>   goto skip;
>   }
>  
> + i++;
>   encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
>   mutex_unlock(&encl_page->encl->lock);
>   continue;
> @@ -347,27 +351,19 @@ static void sgx_reclaim_pages(void)
>  skip:
>   spin_lock(&sgx_global_lru.lock);
>   sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIMABLE);
> - list_add_tail(&epc_page->list, &sgx_global_lru.reclaimable);
> + list_move_tail(&epc_page->list, &sgx_global_lru.reclaimable);
>   spin_unlock(&sgx_global_lru.lock);
>  
>   kref_put(&encl_page->encl->refcount, sgx_encl_release);
> -
> - chunk[i] = NULL;
> - }
> -
> - for (i = 0; i < cnt; i++) {
> - epc_page = chunk[i];
> - if (epc_page)
> - sgx_reclaimer_block(epc_page);
>   }
>  
> - for (i = 0; i < cnt; i++) {
> - epc_page = chunk[i];
> - if (!epc_page)
> - continue;
> + list_for_each_entry(epc_page, &iso, list)
> + sgx_reclaimer_block(epc_page);
>  
> + i = 0;
> + list_for_each_entry_safe(epc_page, tmp, &iso, list) {
>   encl_page = epc_page->owner;
> - sgx_reclaimer_write(epc_page, &backing[i]);
> + sgx_reclaimer_write(epc_page, &backing[i++]);
>  
>   kref_put(&encl_page->encl->refcount, sgx_encl_release);
>   sgx_epc_page_reset_state(epc_page);
> -- 
> 2.25.1

LGTM

BR, Jarkko


Re: [PATCH RFC 00/37] Add support for arm64 MTE dynamic tag storage reuse

2023-09-13 Thread Catalin Marinas
On Mon, Sep 11, 2023 at 02:29:03PM +0200, David Hildenbrand wrote:
> On 11.09.23 13:52, Catalin Marinas wrote:
> > On Wed, Sep 06, 2023 at 12:23:21PM +0100, Alexandru Elisei wrote:
> > > On Thu, Aug 24, 2023 at 04:24:30PM +0100, Catalin Marinas wrote:
> > > > On Thu, Aug 24, 2023 at 01:25:41PM +0200, David Hildenbrand wrote:
> > > > > On 24.08.23 13:06, David Hildenbrand wrote:
> > > > > > > Regarding one complication: "The kernel needs to know where to allocate
> > > > > > > a PROT_MTE page from or migrate a current page if it becomes PROT_MTE
> > > > > > > (mprotect()) and the range it is in does not support tagging.",
> > > > > > simplified handling would be if it's in a MIGRATE_CMA pageblock, it
> > > > > > doesn't support tagging. You have to migrate to a !CMA page (for
> > > > > > example, not specifying GFP_MOVABLE as a quick way to achieve that).
> > > > > 
> > > > > > Okay, I now realize that this patch set effectively duplicates some CMA
> > > > > > behavior using a new migrate-type.
> > [...]
> > > > I considered mixing the tag storage memory with normal memory and
> > > adding it to MIGRATE_CMA. But since tag storage memory cannot be tagged,
> > > this means that it's not enough anymore to have a __GFP_MOVABLE allocation
> > > request to use MIGRATE_CMA.
> > > 
> > > I considered two solutions to this problem:
> > > 
> > > > 1. Only allocate from MIGRATE_CMA if the requested memory is not tagged =>
> > > this effectively means transforming all memory from MIGRATE_CMA into the
> > > MIGRATE_METADATA migratetype that the series introduces. Not very
> > > appealing, because that means treating normal memory that is also on the
> > > MIGRATE_CMA lists as tagged memory.
> > 
> > That's indeed not ideal. We could try this if it makes the patches
> > significantly simpler, though I'm not so sure.
> > 
> > Allocating metadata is the easier part as we know the correspondence
> > from the tagged pages (32 PROT_MTE pages) to the metadata page (1 tag
> > storage page), so alloc_contig_range() does this for us. Just adding it
> > to the CMA range is sufficient.
> > 
> > However, making sure that we don't allocate PROT_MTE pages from the
> > metadata range is what led us to another migrate type. I guess we could
> > achieve something similar with a new zone or a CPU-less NUMA node,
> 
> Ideally, no significant core-mm changes to optimize for an architecture
> oddity. That implies, no new zones and no new migratetypes -- unless it is
> unavoidable and you are confident that you can convince core-MM people that
> the use case (giving back 3% of system RAM at max in some setups) is worth
> the trouble.

If I was an mm maintainer, I'd also question this ;). But vendors seem
pretty picky about the amount of RAM reserved for MTE (e.g. 0.5G for a
16G platform does look somewhat big). As more and more apps adopt MTE,
the wastage would be smaller but the first step is getting vendors to
enable it.
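
As a rough check on those numbers, here is a minimal sketch of the arithmetic, assuming the 32:1 ratio between PROT_MTE data pages and tag storage pages mentioned earlier in this thread. The helper names and the purely linear index mapping are hypothetical and only illustrate the ratio; they are not taken from the series.

```c
#include <stdint.h>

/* Assumption from the thread: one tag storage page covers 32 data pages. */
#define DATA_PAGES_PER_TAG_PAGE 32ULL

/* Bytes of tag storage needed to cover ram_bytes of taggable memory. */
static uint64_t tag_storage_bytes(uint64_t ram_bytes)
{
	return ram_bytes / DATA_PAGES_PER_TAG_PAGE;
}

/*
 * Hypothetical linear layout: index of the tag storage page that backs
 * the data page with the given index (both counted from their own base).
 */
static uint64_t tag_page_index(uint64_t data_page_index)
{
	return data_page_index / DATA_PAGES_PER_TAG_PAGE;
}
```

With these helpers, tag_storage_bytes(16ULL << 30) gives 512 MiB, i.e. the "0.5G for a 16G platform" figure, or 1/32 (about 3%) of RAM.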

> I also had CPU-less NUMA nodes in mind when thinking about that, but not
> sure how easy it would be to integrate it. If the tag memory has actually
> different performance characteristics as well, a NUMA node would be the
> right choice.

In general I'd expect the same characteristics. However, changing the
memory designation from tag to data (and vice-versa) requires some cache
maintenance. The allocation cost is slightly higher (not the runtime
one), so it would help if the page allocator does not favour this range.
Anyway, that's an optimisation to worry about later.

> If we could find some way to easily support this either via CMA or CPU-less
> NUMA nodes, that would be much preferable; even if we cannot cover each and
> every future use case right now. I expect some issues with CXL+MTE either
> way, but am happy to be taught otherwise :)

I think CXL+MTE is rather theoretical at the moment. Given that PCIe
doesn't have any notion of MTE, more likely there would be some piece of
interconnect that generates two memory accesses: one for data and the
other for tags at a configurable offset (which may or may not be in the
same CXL range).

> Another thought I had was adding something like CMA memory characteristics.
> Like, asking if a given CMA area/page supports tagging (i.e., flag for the
> CMA area set?)?

I don't think adding CMA memory characteristics helps much. The metadata
allocation wouldn't go through cma_alloc() but rather
alloc_contig_range() directly for a specific pfn corresponding to the
data pages with PROT_MTE. The core mm code doesn't need to know about
the tag storage layout.

It's also unlikely for cma_alloc() memory to be mapped as PROT_MTE.
That's typically coming from device drivers (DMA API) with their own
mmap() implementation that doesn't normally set VM_MTE_ALLOWED (and
therefore PROT_MTE is rejected).

What we need though is to prevent vma_alloc_folio() from allocating from
a MIGRATE_CMA list if PROT_MTE (VM_MTE). I guess that's basically
removing 

Re: [PATCH v4 06/18] x86/sgx: Introduce EPC page states

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> Use the lower 3 bits in the flags field of sgx_epc_page struct to
> track EPC states in its life cycle and define an enum for possible
> states. More state(s) will be added later.
>
> Signed-off-by: Haitao Huang 
> ---
> V4:
> - No changes other than required for patch reordering.
>
> V3:
> - This is new in V3 to replace the bit mask based approach (requested by Jarkko)
> ---
>  arch/x86/kernel/cpu/sgx/encl.c  | 14 +++---
>  arch/x86/kernel/cpu/sgx/ioctl.c |  7 +++--
>  arch/x86/kernel/cpu/sgx/main.c  | 19 +++--
>  arch/x86/kernel/cpu/sgx/sgx.h   | 49 ++---
>  4 files changed, 71 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index f84ee2eeb058..d11d4111aa98 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -244,8 +244,12 @@ static struct sgx_epc_page *sgx_encl_load_secs(struct 
> sgx_encl *encl)
>  {
>   struct sgx_epc_page *epc_page = encl->secs.epc_page;
>  
> - if (!epc_page)
> + if (!epc_page) {
>   epc_page = sgx_encl_eldu(&encl->secs, NULL);
> + if (!IS_ERR(epc_page))
> + sgx_record_epc_page(epc_page,
> + SGX_EPC_PAGE_UNRECLAIMABLE);

sgx_record_epc_page(epc_page, SGX_EPC_PAGE_UNRECLAIMABLE);

> + }
>  
>   return epc_page;
>  }
> @@ -273,7 +277,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct 
> sgx_encl *encl,
>  
>   encl->secs_child_cnt++;
>   sgx_record_epc_page(epc_page,
> - SGX_EPC_PAGE_RECLAIMER_TRACKED);
> + SGX_EPC_PAGE_RECLAIMABLE);
>  
>   return entry;
>  }
> @@ -400,7 +404,7 @@ static vm_fault_t sgx_encl_eaug_page(struct 
> vm_area_struct *vma,
>   encl->secs_child_cnt++;
>  
>   sgx_record_epc_page(epc_page,
> - SGX_EPC_PAGE_RECLAIMER_TRACKED);
> + SGX_EPC_PAGE_RECLAIMABLE);
>  
>   phys_addr = sgx_get_epc_phys_addr(epc_page);
>   /*
> @@ -1258,6 +1262,8 @@ struct sgx_epc_page *sgx_alloc_va_page(bool reclaim)
>   sgx_encl_free_epc_page(epc_page);
>   return ERR_PTR(-EFAULT);
>   }
> + sgx_record_epc_page(epc_page,
> + SGX_EPC_PAGE_UNRECLAIMABLE);
>  
>   return epc_page;
>  }
> @@ -1317,7 +1323,7 @@ void sgx_encl_free_epc_page(struct sgx_epc_page *page)
>  {
>   int ret;
>  
> - WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
> + WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_STATE_MASK);
>  
>   ret = __eremove(sgx_get_epc_virt_addr(page));
>   if (WARN_ONCE(ret, EREMOVE_ERROR_MESSAGE, ret, ret))
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 0d79dec408af..c28f074d5d71 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -113,6 +113,9 @@ static int sgx_encl_create(struct sgx_encl *encl, struct 
> sgx_secs *secs)
>   encl->attributes = secs->attributes;
>   encl->attributes_mask = SGX_ATTR_UNPRIV_MASK;
>  
> + sgx_record_epc_page(encl->secs.epc_page,
> + SGX_EPC_PAGE_UNRECLAIMABLE);
> +
>   /* Set only after completion, as encl->lock has not been taken. */
>   set_bit(SGX_ENCL_CREATED, &encl->flags);
>  
> @@ -323,7 +326,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, 
> unsigned long src,
>   }
>  
>   sgx_record_epc_page(epc_page,
> - SGX_EPC_PAGE_RECLAIMER_TRACKED);
> + SGX_EPC_PAGE_RECLAIMABLE);
>   mutex_unlock(&encl->lock);
>   mmap_read_unlock(current->mm);
>   return ret;
> @@ -978,7 +981,7 @@ static long sgx_enclave_modify_types(struct sgx_encl 
> *encl,
>   mutex_lock(&encl->lock);
>  
>   sgx_record_epc_page(entry->epc_page,
> - SGX_EPC_PAGE_RECLAIMER_TRACKED);
> + SGX_EPC_PAGE_RECLAIMABLE);
>   }
>  
>   /* Change EPC type */
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index dec1d57cbff6..b26860399402 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -318,7 +318,7 @@ static void sgx_reclaim_pages(void)
>   /* The owner is freeing the page. No need to add the
>* page back to the list of reclaimable pages.
>*/
> - epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
> + sgx_epc_page_reset_state(epc_page);
>   }
>   spin_unlock(&sgx_global_lru.lock);
>  
> @@ -344,6 +344,7 @@ static void sgx_reclaim_pages(void)
>  
>  skip:
>   spin_lock(&sgx_global_lru.lock);
> + sgx_epc_page_set_state(epc_page, 

Re: [PATCH v4 05/18] x86/sgx: Store reclaimable EPC pages in sgx_epc_lru_lists

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Kristen Carlson Accardi 
>
> Replace sgx_mark_page_reclaimable() and sgx_unmark_page_reclaimable()
> with sgx_record_epc_page() and sgx_drop_epc_page(). The
> sgx_record_epc_page() function adds the epc_page to the "reclaimable"
> list in the sgx_epc_lru_lists struct, while sgx_drop_epc_page() removes
> the page from the LRU list.
>
> For now, this change serves as a straightforward replacement of the two
> functions for pages tracked by the reclaimer. When the unreclaimable
> list is added to track VA and SECS pages for cgroups, these functions
> will be updated to add/remove them from the unreclaimable lists.
>
> Signed-off-by: Sean Christopherson 
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> Cc: Sean Christopherson 
> ---
> V4:
> - Code update needed for patch reordering
> - Revised commit message.
> ---
>  arch/x86/kernel/cpu/sgx/encl.c  |  8 +---
>  arch/x86/kernel/cpu/sgx/ioctl.c | 10 ++
>  arch/x86/kernel/cpu/sgx/main.c  | 22 --
>  arch/x86/kernel/cpu/sgx/sgx.h   |  4 ++--
>  4 files changed, 25 insertions(+), 19 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 279148e72459..f84ee2eeb058 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -272,7 +272,8 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct 
> sgx_encl *encl,
>   return ERR_CAST(epc_page);
>  
>   encl->secs_child_cnt++;
> - sgx_mark_page_reclaimable(entry->epc_page);
> + sgx_record_epc_page(epc_page,
> + SGX_EPC_PAGE_RECLAIMER_TRACKED);

sgx_record_epc_page(epc_page, SGX_EPC_PAGE_RECLAIMER_TRACKED);

... even less than 80 characters (100 is the max these days)

>  
>   return entry;
>  }
> @@ -398,7 +399,8 @@ static vm_fault_t sgx_encl_eaug_page(struct 
> vm_area_struct *vma,
>   encl_page->type = SGX_PAGE_TYPE_REG;
>   encl->secs_child_cnt++;
>  
> - sgx_mark_page_reclaimable(encl_page->epc_page);
> + sgx_record_epc_page(epc_page,
> + SGX_EPC_PAGE_RECLAIMER_TRACKED);

Ditto.

>  
>   phys_addr = sgx_get_epc_phys_addr(epc_page);
>   /*
> @@ -714,7 +716,7 @@ void sgx_encl_release(struct kref *ref)
>* The page and its radix tree entry cannot be freed
>* if the page is being held by the reclaimer.
>*/
> - if (sgx_unmark_page_reclaimable(entry->epc_page))
> + if (sgx_drop_epc_page(entry->epc_page))
>   continue;
>  
>   sgx_encl_free_epc_page(entry->epc_page);
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 5d390df21440..0d79dec408af 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -322,7 +322,8 @@ static int sgx_encl_add_page(struct sgx_encl *encl, 
> unsigned long src,
>   goto err_out;
>   }
>  
> - sgx_mark_page_reclaimable(encl_page->epc_page);
> + sgx_record_epc_page(epc_page,
> + SGX_EPC_PAGE_RECLAIMER_TRACKED);

Ditto.

>   mutex_unlock(&encl->lock);
>   mmap_read_unlock(current->mm);
>   return ret;
> @@ -961,7 +962,7 @@ static long sgx_enclave_modify_types(struct sgx_encl 
> *encl,
>* Prevent page from being reclaimed while mutex
>* is released.
>*/
> - if (sgx_unmark_page_reclaimable(entry->epc_page)) {
> + if (sgx_drop_epc_page(entry->epc_page)) {
>   ret = -EAGAIN;
>   goto out_entry_changed;
>   }
> @@ -976,7 +977,8 @@ static long sgx_enclave_modify_types(struct sgx_encl 
> *encl,
>  
>   mutex_lock(&encl->lock);
>  
> - sgx_mark_page_reclaimable(entry->epc_page);
> + sgx_record_epc_page(entry->epc_page,
> + SGX_EPC_PAGE_RECLAIMER_TRACKED);

Ditto.

>   }
>  
>   /* Change EPC type */
> @@ -1133,7 +1135,7 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
>   goto out_unlock;
>   }
>  
> - if (sgx_unmark_page_reclaimable(entry->epc_page)) {
> + if (sgx_drop_epc_page(entry->epc_page)) {
>   ret = -EBUSY;
>   goto out_unlock;
>   }
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index afce51d6e94a..dec1d57cbff6 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -268,7 +268,6 @@ static void sgx_reclaimer_write(struct sgx_epc_page 
> *epc_page,
>   goto out;
>  
>   

RE: [PATCH] EDAC/mc_sysfs: refactor deprecated strncpy

2023-09-13 Thread Luck, Tony
> `strncpy` is deprecated for use on NUL-terminated destination strings [1].
>
> We should prefer more robust and less ambiguous string interfaces.
>
> A suitable replacement is `strscpy_pad` [2] due to the fact that it guarantees
> NUL-termination on the destination buffer whilst maintaining the
> NUL-padding behavior that `strncpy` provides. This may not be strictly
> necessary but as I couldn't understand what this code does I wanted to
> ensure that the functionality is the same.
>
> Link: 
> https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
>  [1]
> Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html 
> [2]
> Link: https://github.com/KSPP/linux/issues/90
> Cc: linux-harden...@vger.kernel.org
> Signed-off-by: Justin Stitt 
> ---
> Note: build-tested only.
> ---
>  drivers/edac/edac_mc_sysfs.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index 15f63452a9be..b303309a63cf 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -229,8 +229,7 @@ static ssize_t channel_dimm_label_store(struct device 
> *dev,
> if (copy_count == 0 || copy_count >= sizeof(rank->dimm->label))
> return -EINVAL;
>
> -   strncpy(rank->dimm->label, data, copy_count);
> -   rank->dimm->label[copy_count] = '\0';
> +   strscpy_pad(rank->dimm->label, data, copy_count);

That doc page says the problem with strncpy() is that it doesn't guarantee to
NUL terminate the target string. But this code is aware of that limitation and
zaps a '\0' at the end to be sure.

So this code doesn't suffer from the potential problems.

If it is going to be fixed, then some further analysis of the original code
would be wise. Just replacing with strscpy_pad() means the code probably
still suffers from the "needless performance penalty" also mentioned in
the deprecation document.
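
To make the comparison concrete, here is a small userspace sketch. It is not kernel code: copy_old() mirrors the pattern being removed, and copy_pad() is a hypothetical stand-in for strscpy_pad(), written on the assumption that the size argument is the total number of destination bytes including the terminating NUL (as the kernel's strscpy() is documented to treat it).

```c
#include <stddef.h>
#include <string.h>

/* The existing pattern: copy copy_count bytes, then terminate by hand.
 * The caller must guarantee copy_count < size of dst. */
static void copy_old(char *dst, const char *src, size_t copy_count)
{
	strncpy(dst, src, copy_count);
	dst[copy_count] = '\0';
}

/*
 * Hypothetical stand-in for strscpy_pad(): copies at most size - 1 bytes,
 * always NUL-terminates, and zero-fills the remainder of the first size
 * bytes. Returns the number of bytes copied, or -1 on truncation (the
 * kernel helper returns -E2BIG instead).
 */
static long copy_pad(char *dst, const char *src, size_t size)
{
	size_t i = 0;
	int truncated;

	if (size == 0)
		return -1;

	while (i < size - 1 && src[i] != '\0') {
		dst[i] = src[i];
		i++;
	}
	truncated = src[i] != '\0';
	memset(dst + i, 0, size - i);	/* terminator plus padding */

	return truncated ? -1 : (long)i;
}
```

Under that assumption, passing copy_count as the size keeps only copy_count - 1 characters, whereas copy_old() keeps all copy_count of them. Passing the full size of the destination buffer would avoid that but pads the whole buffer on every store, which is the performance cost mentioned above.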

-Tony



Re: [PATCH v3] x86/platform/uv: refactor deprecated strcpy and strncpy

2023-09-13 Thread Hans de Goede
Hi,

On 9/6/23 16:09, Ingo Molnar wrote:
> 
> * Hans de Goede  wrote:
> 
>> Hi Ingo,
>>
>> On 9/6/23 14:10, Ingo Molnar wrote:
>>>
>>> * Justin Stitt  wrote:
>>>
>>>> Both `strncpy` and `strcpy` are deprecated for use on NUL-terminated
>>>> destination strings [1].
>>>>
>>>> We can see that `arg` and `uv_nmi_action` are expected to be
>>>> NUL-terminated strings due to their use within `strcmp()` and format
>>>> strings respectively.
>>>>
>>>> With this in mind, a suitable replacement is `strscpy` [2] due to the
>>>> fact that it guarantees NUL-termination on its destination buffer
>>>> argument which is _not_ the case for `strncpy` or `strcpy`!
>>>>
>>>> In this case, we can drop both the forced NUL-termination and the `... -1` from:
>>>> |   strncpy(arg, val, ACTION_LEN - 1);
>>>> as `strscpy` implicitly has this behavior.
>>>>
>>>> Link: www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
>>>> Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
>>>> Link: https://github.com/KSPP/linux/issues/90
>>>> Cc: linux-harden...@vger.kernel.org
>>>> Signed-off-by: Justin Stitt 
>>>
>>>>  arch/x86/platform/uv/uv_nmi.c | 7 +++
>>>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> Note that this commit is already upstream:
>>>
>>>   1e6f01f72855 ("x86/platform/uv: Refactor code using deprecated strcpy()/strncpy() interfaces to use strscpy()")
>>>
>>> Below is the delta your v3 patch has compared to what is upstream - is it 
>>> really necessary to open code it, instead of using strnchrnul() as your 
>>> original patch did? Am I missing anything here?
>>
>> The new version is a result of a review from me because IMHO:
>>
>>  strscpy(arg, val, strnchrnul(val, sizeof(arg)-1, '\n') - val + 1);
>>
>> Is really unreadable / really hard to reason about if
>> this is actually correct and does not contain any
>> off-by-one bugs.
>>
>> Note that the diff of v3 compared to the code before v2 landed is
>> actually smaller now and actually matches the subject of:
>> "refactor deprecated strcpy and strncpy"
>>
>> Whereas v2 actually touches more code / refactors things
>> which fall outside of a "one change per patch" approach.
>> The:
>>
>>  p = strchr(arg, '\n');
>>  if (p)
>>  *p = '\0';
>>
>> was already there before v2 landed.
>>
>> I also suggested to do a follow up patch to change things to:
>>
>>  strscpy(arg, val, sizeof(arg));
>>  p = strchrnul(arg, '\n');
>>  *p = '\0';
>>
>> Which IMHO is much more readable than what has landed
>> now. But since v2 has already landed I guess the best
>> thing is just to stick with what we have upstream now...
> 
> Well, how about we do a delta patch with all the changes
> you suggested? I'm all for readability.

So I started doing this and noticed that all the string
manipulation + parsing done here is really just a DIY
implementation of sysfs_match_string().

So I have prepared a patch to switch to sysfs_match_string(),
which completely removes the need to make a copy of the val
string.
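
For readers unfamiliar with that helper, the idea can be sketched in userspace roughly as follows. This is an illustration only, assuming the helper matches a value against a table of strings and treats a single trailing newline as insignificant; streq_nl() and match_string_nl() are hypothetical names, and the table in the usage note holds placeholder values, not necessarily the real uv_nmi action list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the strings are equal, ignoring one
 * trailing newline on either of them. */
static bool streq_nl(const char *s1, const char *s2)
{
	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}

	if (*s1 == *s2)
		return true;
	if (!*s1 && *s2 == '\n' && !s2[1])
		return true;
	if (*s1 == '\n' && !s1[1] && !*s2)
		return true;
	return false;
}

/* Returns the index of the table entry matching val, or -1 if none does. */
static int match_string_nl(const char *const *table, size_t n, const char *val)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (streq_nl(table[i], val))
			return (int)i;
	}
	return -1;
}
```

With a table such as { "dump", "kdump", "panic" }, the input "kdump\n" matches index 1 directly, so there is no need to copy val into a local buffer and strip the newline first.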

I'll submit the patch right after this email.

Regards,

Hans




Re: [PATCH v4 04/18] x86/sgx: Use sgx_epc_lru_lists for existing active page list

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Kristen Carlson Accardi 
>
> All EPC pages of enclaves including VA and SECS will be tracked in

s/VA/Version Array (VA)/
s/SECS/SGX Enclave Control Structure (SECS)/

Just a nitpick, because it is always good to remind what these acronyms
are (there are so many of them in this world).

> sgx_epc_lru_lists structs, one per cgroup. For now just replace the
> existing sgx_active_page_list in the reclaimer and its spinlock with a
> global sgx_epc_lru_lists struct. VA and SECS pages are still not tracked
> at this point but they will be tracked after an unreclaimable LRU list
> is added to the sgx_epc_lru_lists struct.
>
> Signed-off-by: Sean Christopherson 
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> Cc: Sean Christopherson 
> ---
> V4:
> - No change, only reordered the patch.
>
> V3:
> - Remove usage of list wrapper
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 39 +-
>  1 file changed, 20 insertions(+), 19 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 166692f2d501..afce51d6e94a 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -26,10 +26,9 @@ static DEFINE_XARRAY(sgx_epc_address_space);
>  
>  /*
>   * These variables are part of the state of the reclaimer, and must be accessed
> - * with sgx_reclaimer_lock acquired.
> + * with sgx_global_lru.lock acquired.
>   */
> -static LIST_HEAD(sgx_active_page_list);
> -static DEFINE_SPINLOCK(sgx_reclaimer_lock);
> +static struct sgx_epc_lru_lists sgx_global_lru;
>  
>  static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0);
>  
> @@ -304,13 +303,13 @@ static void sgx_reclaim_pages(void)
>   int ret;
>   int i;
>  
> - spin_lock(&sgx_reclaimer_lock);
> + spin_lock(&sgx_global_lru.lock);
>   for (i = 0; i < SGX_NR_TO_SCAN; i++) {
> - if (list_empty(&sgx_active_page_list))
> + epc_page = list_first_entry_or_null(&sgx_global_lru.reclaimable,
> + struct sgx_epc_page, list);
> + if (!epc_page)
>   break;
>  
> - epc_page = list_first_entry(&sgx_active_page_list,
> - struct sgx_epc_page, list);
>   list_del_init(&epc_page->list);
>   encl_page = epc_page->owner;
>  
> @@ -322,7 +321,7 @@ static void sgx_reclaim_pages(void)
>*/
>   epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
>   }
> - spin_unlock(&sgx_reclaimer_lock);
> + spin_unlock(&sgx_global_lru.lock);
>  
>   for (i = 0; i < cnt; i++) {
>   epc_page = chunk[i];
> @@ -345,9 +344,9 @@ static void sgx_reclaim_pages(void)
>   continue;
>  
>  skip:
> - spin_lock(&sgx_reclaimer_lock);
> - list_add_tail(&epc_page->list, &sgx_active_page_list);
> - spin_unlock(&sgx_reclaimer_lock);
> + spin_lock(&sgx_global_lru.lock);
> + list_add_tail(&epc_page->list, &sgx_global_lru.reclaimable);
> + spin_unlock(&sgx_global_lru.lock);
>  
>   kref_put(&encl_page->encl->refcount, sgx_encl_release);
>  
> @@ -378,7 +377,7 @@ static void sgx_reclaim_pages(void)
>  static bool sgx_should_reclaim(unsigned long watermark)
>  {
>   return atomic_long_read(&sgx_nr_free_pages) < watermark &&
> -!list_empty(&sgx_active_page_list);
> +!list_empty(&sgx_global_lru.reclaimable);
>  }
>  
>  /*
> @@ -430,6 +429,8 @@ static bool __init sgx_page_reclaimer_init(void)
>  
>   ksgxd_tsk = tsk;
>  
> + sgx_lru_init(&sgx_global_lru);
> +
>   return true;
>  }
>  
> @@ -505,10 +506,10 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void)
>   */
>  void sgx_mark_page_reclaimable(struct sgx_epc_page *page)
>  {
> - spin_lock(&sgx_reclaimer_lock);
> + spin_lock(&sgx_global_lru.lock);
>   page->flags |= SGX_EPC_PAGE_RECLAIMER_TRACKED;
> - list_add_tail(&page->list, &sgx_active_page_list);
> - spin_unlock(&sgx_reclaimer_lock);
> + list_add_tail(&page->list, &sgx_global_lru.reclaimable);
> + spin_unlock(&sgx_global_lru.lock);
>  }
>  
>  /**
> @@ -523,18 +524,18 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page 
> *page)
>   */
>  int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
>  {
> - spin_lock(&sgx_reclaimer_lock);
> + spin_lock(&sgx_global_lru.lock);
>   if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) {
>   /* The page is being reclaimed. */
>   if (list_empty(&page->list)) {
> - spin_unlock(&sgx_reclaimer_lock);
> + spin_unlock(&sgx_global_lru.lock);
>   return -EBUSY;
>   }
>  
>   list_del(&page->list);
>   page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
>   }
> - spin_unlock(&sgx_reclaimer_lock);
> + spin_unlock(&sgx_global_lru.lock);
>  
>   return 0;
>  }
> @@ -567,7 +568,7 @@ struct 

Re: [PATCH 3/3] tracing/selftests: Update kprobe args char/string to match new functions

2023-09-13 Thread Masami Hiramatsu (Google)
On Tue, 12 Sep 2023 22:58:58 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> The function that the kprobe_args_char and kprobe_args_string tests attach to
> has changed its name once again. Now we need to check for
> eventfs_create_dir(), and if it exists, use that, otherwise check for
> eventfs_add_dir() and if that exists use that, otherwise use the original
> tracefs_create_dir()!

OK, looks good to me.

Acked-by: Masami Hiramatsu (Google) 

Thank you,

> 
> Signed-off-by: Steven Rostedt (Google) 
> ---
>  .../selftests/ftrace/test.d/kprobe/kprobe_args_char.tc| 4 +++-
>  .../selftests/ftrace/test.d/kprobe/kprobe_args_string.tc  | 4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_char.tc 
> b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_char.tc
> index ff7499eb98d6..c639c6c8ca03 100644
> --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_char.tc
> +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_char.tc
> @@ -34,7 +34,9 @@ mips*)
>  esac
>  
>  : "Test get argument (1)"
> -if grep -q eventfs_add_dir available_filter_functions; then
> +if grep -q eventfs_create_dir available_filter_functions; then
> +  DIR_NAME="eventfs_create_dir"
> +elif grep -q eventfs_add_dir available_filter_functions; then
>DIR_NAME="eventfs_add_dir"
>  else
>DIR_NAME="tracefs_create_dir"
> diff --git 
> a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc 
> b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc
> index a202b2ea4baf..a5ab4d5c74ac 100644
> --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc
> +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_string.tc
> @@ -37,7 +37,9 @@ loongarch*)
>  esac
>  
>  : "Test get argument (1)"
> -if grep -q eventfs_add_dir available_filter_functions; then
> +if grep -q eventfs_create_dir available_filter_functions; then
> +  DIR_NAME="eventfs_create_dir"
> +elif grep -q eventfs_add_dir available_filter_functions; then
>DIR_NAME="eventfs_add_dir"
>  else
>DIR_NAME="tracefs_create_dir"
> -- 
> 2.40.1


-- 
Masami Hiramatsu (Google) 


Re: [PATCH v6 1/4] dt-bindings: remoteproc: k3-m4f: Add K3 AM64x SoCs

2023-09-13 Thread Hari Nagalla

On 9/13/23 06:32, Krzysztof Kozlowski wrote:

>>   - Removed unrelated items from examples
>>
>> Changes since v4:
>>   - Rebased to the latest kernel-next tree
>>   - Added optional sram memory region for m4f device node
>>
>> Changes since v5:
>>   - None
>
> Hm, why none? There were errors in the binding to which you did not
> respond. Did you just ignore them?

I do not see any errors in my builds. Am I missing something? Please
excuse my lack of knowledge here. I thought the bot errors were outside of
the patch submitted
(Documentation/devicetree/bindings/dma/stericsson,dma40.yaml).
I would appreciate your input.


$ make -j`nproc` ARCH=arm64 V=1 CROSS_COMPILE=aarch64-none-linux-gnu- 
DT_CHEKCER_FLAGS=-m dt_binding_check 
DT_SCHEMA_FILES=Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml

make --no-print-directory -C /home/a0868345/temp/linux-next \
-f /home/a0868345/temp/linux-next/Makefile dt_binding_check
make -f ./scripts/Makefile.build obj=scripts/basic
make -f ./scripts/Makefile.build obj=scripts/dtc
make -f ./scripts/Makefile.build obj=Documentation/devicetree/bindings
# LINTDocumentation/devicetree/bindings
  (find ./Documentation/devicetree/bindings \( -name '*.yaml' ! -name 
'processed-schema*' \) | grep -F -e 
"Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml" | 
xargs -n200 -P$(nproc) /home/a0868345/.local/bin/yamllint -f parsable -c 
./Documentation/devicetree/bindings/.yamllint >&2) || true
# DTEX 
Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dts
  dt-extract-example 
Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml > 
Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dts

# CHKDT   Documentation/devicetree/bindings/processed-schema.json
  (find ./Documentation/devicetree/bindings \( -name '*.yaml' ! -name 
'processed-schema*' \) | grep -F -e 
"Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml" | 
xargs -n200 -P$(nproc) dt-doc-validate -u 
./Documentation/devicetree/bindings) || true

# SCHEMA  Documentation/devicetree/bindings/processed-schema.json
  f=$(mktemp) ; find ./Documentation/devicetree/bindings \( -name 
'*.yaml' ! -name 'processed-schema*' \) > $f ; dt-mk-schema -j  @$f > 
Documentation/devicetree/bindings/processed-schema.json ; rm -f $f
# DTC_CHK 
Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dtb
  gcc -E 
-Wp,-MMD,Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d.pre.tmp 
-nostdinc -I./scripts/dtc/include-prefixes -undef -D__DTS__ -x 
assembler-with-cpp -o 
Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.dts.tmp 
Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dts 
; ./scripts/dtc/dtc -o 
Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dtb 
-b 0 -iDocumentation/devicetree/bindings/remoteproc/ 
-i./scripts/dtc/include-prefixes -Wno-avoid_unnecessary_addr_size 
-Wno-graph_child_address -Wno-interrupt_provider 
-Wno-unique_unit_address -Wunique_unit_address_if_enabled -d 
Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d.dtc.tmp 
Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.dts.tmp 
; cat 
Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d.pre.tmp 
Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d.dtc.tmp 
> 
Documentation/devicetree/bindings/remoteproc/.ti_k3-m4f-rproc.example.dtb.d 
; dt-validate  -u ./Documentation/devicetree/bindings -p 
./Documentation/devicetree/bindings/processed-schema.json 
Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.example.dtb 
|| true



>> +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id:http://devicetree.org/schemas/remoteproc/ti,k3-m4f-rproc.yaml#
>> +$schema:http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: TI K3 M4F processor subsystems
>> +
>> +maintainers:
>> +  - Hari Nagalla
>> +  - Mathieu Poirier
> Are you sure Mathieu has this device and is a maintainer of this device?
>
Earlier, Mathieu suggested he can be the maintainer. Beagle play is 
based on AM625 device.


I will look into the other comments for the 'ti,k3-m4f-rproc.yaml' 
binding doc.


Thanks


Re: [PATCH 2/3] x86/xen: move paravirt lazy code

2023-09-13 Thread Juergen Gross

On 13.09.23 15:26, Steven Rostedt wrote:

> On Wed, 13 Sep 2023 13:38:27 +0200
> Juergen Gross  wrote:
>
>> diff --git a/include/trace/events/xen.h b/include/trace/events/xen.h
>> index 44a3f565264d..0577f0cdd231 100644
>> --- a/include/trace/events/xen.h
>> +++ b/include/trace/events/xen.h
>> @@ -6,26 +6,26 @@
>>  #define _TRACE_XEN_H
>>  
>>  #include 
>> -#include 
>> +#include 
>>  #include 
>>  
>>  struct multicall_entry;
>>  
>>  /* Multicalls */
>>  DECLARE_EVENT_CLASS(xen_mc__batch,
>> - TP_PROTO(enum paravirt_lazy_mode mode),
>> + TP_PROTO(enum xen_lazy_mode mode),
>>   TP_ARGS(mode),
>>   TP_STRUCT__entry(
>> - __field(enum paravirt_lazy_mode, mode)
>> + __field(enum xen_lazy_mode, mode)
>>   ),
>>   TP_fast_assign(__entry->mode = mode),
>>   TP_printk("start batch LAZY_%s",
>> -   (__entry->mode == PARAVIRT_LAZY_MMU) ? "MMU" :
>> -   (__entry->mode == PARAVIRT_LAZY_CPU) ? "CPU" : "NONE")
>> +   (__entry->mode == XEN_LAZY_MMU) ? "MMU" :
>> +   (__entry->mode == XEN_LAZY_CPU) ? "CPU" : "NONE")
>
> There are helper functions that make the above easier to implement and
> also export the symbols so that user space can parse this better:
>
> TRACE_DEFINE_ENUM(XEN_LAZY_NONE);
> TRACE_DEFINE_ENUM(XEN_LAZY_MMU);
> TRACE_DEFINE_ENUM(XEN_LAZY_CPU);
>
> [..]
>
> TP_printk("start batch LAZY_%s",
>   __print_symbolic(__entry->mode,
>    { XEN_LAZY_NONE, "NONE" },
>    { XEN_LAZY_MMU,  "MMU"  },
>    { XEN_LAZY_CPU,  "CPU"  }))
>
> Then user space parsers that read the raw data can convert these events
> into something humans can read.

Thanks. I'll add that to another patch I'm just writing for cleaning up
include/trace/events/xen.h (some defined trace events are no longer used).


Juergen



Re: [PATCH 2/3] x86/xen: move paravirt lazy code

2023-09-13 Thread Steven Rostedt
On Wed, 13 Sep 2023 13:38:27 +0200
Juergen Gross  wrote:

> diff --git a/include/trace/events/xen.h b/include/trace/events/xen.h
> index 44a3f565264d..0577f0cdd231 100644
> --- a/include/trace/events/xen.h
> +++ b/include/trace/events/xen.h
> @@ -6,26 +6,26 @@
>  #define _TRACE_XEN_H
>  
>  #include 
> -#include 
> +#include 
>  #include 
>  
>  struct multicall_entry;
>  
>  /* Multicalls */
>  DECLARE_EVENT_CLASS(xen_mc__batch,
> - TP_PROTO(enum paravirt_lazy_mode mode),
> + TP_PROTO(enum xen_lazy_mode mode),
>   TP_ARGS(mode),
>   TP_STRUCT__entry(
> - __field(enum paravirt_lazy_mode, mode)
> + __field(enum xen_lazy_mode, mode)
>   ),
>   TP_fast_assign(__entry->mode = mode),
>   TP_printk("start batch LAZY_%s",
> -   (__entry->mode == PARAVIRT_LAZY_MMU) ? "MMU" :
> -   (__entry->mode == PARAVIRT_LAZY_CPU) ? "CPU" : "NONE")
> +   (__entry->mode == XEN_LAZY_MMU) ? "MMU" :
> +   (__entry->mode == XEN_LAZY_CPU) ? "CPU" : "NONE")

There are helper functions that make the above easier to implement and that
also export the symbols so that user space can parse this better:

TRACE_DEFINE_ENUM(XEN_LAZY_NONE);
TRACE_DEFINE_ENUM(XEN_LAZY_MMU);
TRACE_DEFINE_ENUM(XEN_LAZY_CPU);

[..]

TP_printk("start batch LAZY_%s",
  __print_symbolic(mode,
   { XEN_LAZY_NONE, "NONE" },
   { XEN_LAZY_MMU,  "MMU"  },
   { XEN_LAZY_CPU,  "CPU"  }))

Then user space parsers that read the raw data can convert these events
into something humans can read.
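
For readers unfamiliar with the idea, here is a minimal userspace sketch of
the table-driven lookup that __print_symbolic() stands for. It is not the
kernel macro: struct sym_entry, lazy_mode_syms, print_symbolic_sketch() and
its fallback argument are hypothetical names invented for this illustration,
and inside a real TP_printk() the value would normally be read from the
recorded entry (__entry->mode).

```c
#include <stddef.h>

/* Mirrors the enum introduced by the patch under discussion. */
enum xen_lazy_mode {
	XEN_LAZY_NONE,
	XEN_LAZY_MMU,
	XEN_LAZY_CPU,
};

/* Hypothetical stand-in for one { value, "name" } pair. */
struct sym_entry {
	unsigned long value;
	const char *name;
};

/* One table replaces the nested ?: chain in the original TP_printk(). */
static const struct sym_entry lazy_mode_syms[] = {
	{ XEN_LAZY_NONE, "NONE" },
	{ XEN_LAZY_MMU,  "MMU"  },
	{ XEN_LAZY_CPU,  "CPU"  },
};

/*
 * Return the name registered for value, or fallback when the value is not
 * in the table (the fallback is an assumption of this sketch).
 */
const char *print_symbolic_sketch(unsigned long value, const char *fallback)
{
	size_t i;

	for (i = 0; i < sizeof(lazy_mode_syms) / sizeof(lazy_mode_syms[0]); i++) {
		if (lazy_mode_syms[i].value == value)
			return lazy_mode_syms[i].name;
	}
	return fallback;
}
```

Because the mapping is plain data rather than code, a parser that only sees
the raw numeric field can apply the same table, which is the point made above.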

-- Steve

>   );
>  #define DEFINE_XEN_MC_BATCH(name)\
>   DEFINE_EVENT(xen_mc__batch, name,   \
> - TP_PROTO(enum paravirt_lazy_mode mode), \
> + TP_PROTO(enum xen_lazy_mode mode),  \
>TP_ARGS(mode))
>  
>  DEFINE_XEN_MC_BATCH(xen_mc_batch);


Re: [PATCH v6 1/4] dt-bindings: remoteproc: k3-m4f: Add K3 AM64x SoCs

2023-09-13 Thread Rob Herring


On Wed, 13 Sep 2023 06:16:41 -0500, Hari Nagalla wrote:
> K3 AM64x SoC has a Cortex M4F subsystem in the MCU voltage domain.
> The remote processor's life cycle management and IPC mechanisms are
> similar across the R5F and M4F cores from remote processor driver
> point of view. However, there are subtle differences in image loading
> and starting the M4F subsystems.
> 
> The YAML binding document provides the various node properties to be
> configured by the consumers of the M4F subsystem.
> 
> Signed-off-by: Martyn Welch 
> Signed-off-by: Hari Nagalla 
> ---
> Changes since v1:
>  - Spelling corrections
>  - Corrected to pass DT checks
> 
> Changes since v2:
>  - Missed spelling correction to commit message
> 
> Changes since v3:
>  - Removed unnecessary descriptions and used generic memory region names
>  - Made mboxes and memory-region optional
>  - Removed unrelated items from examples
> 
> Changes since v4:
>  - Rebased to the latest kernel-next tree
>  - Added optional sram memory region for m4f device node
> 
> Changes since v5:
>  - None
> 
>  .../bindings/remoteproc/ti,k3-m4f-rproc.yaml  | 136 ++
>  1 file changed, 136 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml
> 

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:

dtschema/dtc warnings/errors:
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/dma/stericsson,dma40.example.dtb:
 dma-controller@801c: sram:0: [4294967295, 4294967295] is too long
from schema $id: 
http://devicetree.org/schemas/dma/stericsson,dma40.yaml#

doc reference errors (make refcheckdocs):

See 
https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20230913111644.29889-2-hnaga...@ti.com

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.



Re: [PATCH] ipmi: refactor deprecated strncpy

2023-09-13 Thread Corey Minyard
On Tue, Sep 12, 2023 at 05:55:02PM -0700, Justin Stitt wrote:
> On Tue, Sep 12, 2023 at 5:19 PM Corey Minyard  wrote:
> >
> > On Tue, Sep 12, 2023 at 11:43:05PM +, Justin Stitt wrote:
> > > `strncpy` is deprecated for use on NUL-terminated destination strings [1].
> > >
> > > In this case, strncpy is being used specifically for its NUL-padding
> > > behavior (and has been commented as such). We can use a more robust and
> > > less ambiguous interface in `strscpy_pad` which makes the code more
> > > readable and even eliminates the need for that comment.
> > >
> > > Let's also use `strnlen` instead of `strlen()` with an upper-bounds
> > > check as this is intrinsically a part of `strnlen`.
> > >
> > > Also included in this patch is a simple 1:1 change of `strncpy` to
> > > `strscpy` for ipmi_ssif.c. If NUL-padding is wanted here as well then we
> > > should opt again for `strscpy_pad`.
> > >
> > > Link: 
> > > https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
> > >  [1]
> > > Link: https://github.com/KSPP/linux/issues/90
> > > Cc: linux-harden...@vger.kernel.org
> > > Cc: Kees Cook 
> > > Signed-off-by: Justin Stitt 
> > > ---
> > >  drivers/char/ipmi/ipmi_msghandler.c | 11 +++
> > >  drivers/char/ipmi/ipmi_ssif.c   |  2 +-
> > >  2 files changed, 4 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/char/ipmi/ipmi_msghandler.c 
> > > b/drivers/char/ipmi/ipmi_msghandler.c
> > > index 186f1fee7534..04f7622cb703 100644
> > > --- a/drivers/char/ipmi/ipmi_msghandler.c
> > > +++ b/drivers/char/ipmi/ipmi_msghandler.c
> > > @@ -5377,20 +5377,15 @@ static void send_panic_events(struct ipmi_smi 
> > > *intf, char *str)
> > >
> > >   j = 0;
> > >   while (*p) {
> > > - int size = strlen(p);
> > > + int size = strnlen(p, 11);
> > >
> > > - if (size > 11)
> > > - size = 11;
> > >   data[0] = 0;
> > >   data[1] = 0;
> > >   data[2] = 0xf0; /* OEM event without timestamp. */
> > >   data[3] = intf->addrinfo[0].address;
> > >   data[4] = j++; /* sequence # */
> > > - /*
> > > -  * Always give 11 bytes, so strncpy will fill
> > > -  * it with zeroes for me.
> > > -  */
> > > - strncpy(data+5, p, 11);
> > > +
> > > + strscpy_pad(data+5, p, 11);
> >
> > This is incorrect; the destination should *not* be NUL-terminated if the
> > destination is full.  strncpy does exactly what is needed here.
> 
> Could we use `memcpy_and_pad()` as this matches the behavior of
> strncpy in this case? I understand strncpy works here but I'm really
> keen on snuffing out all its uses -- treewide.

Sure, I think "memcpy_and_pad(data + 5, 11, p, size, 0);" should work.
And that's self-documenting.
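
To make the difference concrete, here is a small userspace sketch of the
chunking loop body. memcpy_and_pad_sketch() re-implements the behaviour of
the kernel helper as I understand it (copy count bytes, pad the rest, never
add a terminator when the destination is full); bounded_len(), next_chunk()
and CHUNK are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 11	/* payload bytes per event, as in send_panic_events() */

/* Length of s, but never more than max (what strnlen() provides). */
size_t bounded_len(const char *s, size_t max)
{
	size_t n = 0;

	while (n < max && s[n] != '\0')
		n++;
	return n;
}

/* Userspace sketch of the kernel helper: copy, then pad what is left. */
void memcpy_and_pad_sketch(void *dest, size_t dest_len,
			   const void *src, size_t count, int pad)
{
	if (dest_len > count) {
		memcpy(dest, src, count);
		memset((char *)dest + count, pad, dest_len - count);
	} else {
		/* Destination is full: no terminator is written. */
		memcpy(dest, src, dest_len);
	}
}

/* Fill out[] with the next chunk of str; return the bytes consumed. */
size_t next_chunk(const char *str, unsigned char out[CHUNK])
{
	size_t size = bounded_len(str, CHUNK);

	memcpy_and_pad_sketch(out, CHUNK, str, size, 0);
	return size;
}
```

With an input of 11 or more characters all 11 payload bytes carry data, which
a NUL-terminating copy such as strscpy_pad() would not allow; shorter input is
zero-padded, matching what the strncpy() call relied on.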

-corey

> 
> >
> > A comment should be added here, this is not the first time this has been
> > brought up.
> >
> > >   p += size;
> > >
> > >   ipmi_panic_request_and_wait(intf, , );
> > > diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
> > > index 3b921c78ba08..edcb83765dce 100644
> > > --- a/drivers/char/ipmi/ipmi_ssif.c
> > > +++ b/drivers/char/ipmi/ipmi_ssif.c
> > > @@ -1940,7 +1940,7 @@ static int new_ssif_client(int addr, char 
> > > *adapter_name,
> > >   }
> > >   }
> > >
> > > - strncpy(addr_info->binfo.type, DEVICE_NAME,
> > > + strscpy(addr_info->binfo.type, DEVICE_NAME,
> > >   sizeof(addr_info->binfo.type));
> >
> > This one is good.
> >
> > -corey
> >
> > >   addr_info->binfo.addr = addr;
> > >   addr_info->binfo.platform_data = addr_info;
> > >
> > > ---
> > > base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
> > > change-id: 20230912-strncpy-drivers-char-ipmi-ipmi-dda47b3773fd
> > >
> > > Best regards,
> > > --
> > > Justin Stitt 
> > >


Re: [PATCH v2 06/14] arm64: dts: qcom: sdm630: Drop RPM bus clocks

2023-09-13 Thread Konrad Dybcio
On 13.09.2023 09:13, Krzysztof Kozlowski wrote:
> On 12/09/2023 15:31, Konrad Dybcio wrote:
>> These clocks are now handled from within the icc framework and are
>> no longer registered from within the CCF. Remove them.
>>
>> Signed-off-by: Konrad Dybcio 
>> ---
[...]

>>  anoc2_smmu: iommu@16c {
>>  compatible = "qcom,sdm630-smmu-v2", "qcom,smmu-v2";
>>  reg = <0x016c 0x4>;
>> -
>> -assigned-clocks = < RPM_SMD_AGGR2_NOC_CLK>;
>> -assigned-clock-rates = <1000>;
>> -clocks = < RPM_SMD_AGGR2_NOC_CLK>;
>> -clock-names = "bus";
> 
> This is also against bindings. After your patch #4, such bus clock (or
> other combinations) is still required.
So, we have 4 SMMU instances on this platform:

MMSS (described, iface, mem, mem_iface)
GPU (described, iface-mm, iface-smmu, bus-smmu)

ANOC2 (this one, no clocks after removing rpmcc bus)
LPASS (no clocks)

Should I then create a new entry in the bindings, replicating
what's there for msm8998[1] and dropping the entry with just "bus"
from anyOf?

Konrad

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/devicetree/bindings/iommu/arm,smmu.yaml?h=next-20230913#n272


[PATCH 2/3] x86/xen: move paravirt lazy code

2023-09-13 Thread Juergen Gross
Only Xen is using the paravirt lazy mode code, so it can be moved to
Xen specific sources.

This allows making some of the functions static or merging them into
their only call sites.

While at it do a rename from "paravirt" to "xen" for all moved
specifiers.

No functional change.

Signed-off-by: Juergen Gross 
---
 arch/x86/include/asm/paravirt_types.h | 15 --
 arch/x86/include/asm/xen/hypervisor.h | 26 +++
 arch/x86/kernel/paravirt.c| 67 ---
 arch/x86/xen/enlighten_pv.c   | 39 +---
 arch/x86/xen/mmu_pv.c | 55 ++
 arch/x86/xen/multicalls.h |  4 +-
 include/trace/events/xen.h| 12 ++---
 7 files changed, 102 insertions(+), 116 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h 
b/arch/x86/include/asm/paravirt_types.h
index 4acbcc29..772d03487520 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -9,13 +9,6 @@ struct paravirt_patch_site {
u8 type;/* type of this instruction */
u8 len; /* length of original instruction */
 };
-
-/* Lazy mode for batching updates / context switch */
-enum paravirt_lazy_mode {
-   PARAVIRT_LAZY_NONE,
-   PARAVIRT_LAZY_MMU,
-   PARAVIRT_LAZY_CPU,
-};
 #endif
 
 #ifdef CONFIG_PARAVIRT
@@ -549,14 +542,6 @@ int paravirt_disable_iospace(void);
__PVOP_VCALL(op, PVOP_CALL_ARG1(arg1), PVOP_CALL_ARG2(arg2),\
 PVOP_CALL_ARG3(arg3), PVOP_CALL_ARG4(arg4))
 
-enum paravirt_lazy_mode paravirt_get_lazy_mode(void);
-void paravirt_start_context_switch(struct task_struct *prev);
-void paravirt_end_context_switch(struct task_struct *next);
-
-void paravirt_enter_lazy_mmu(void);
-void paravirt_leave_lazy_mmu(void);
-void paravirt_flush_lazy_mmu(void);
-
 void _paravirt_nop(void);
 void paravirt_BUG(void);
 unsigned long paravirt_ret0(void);
diff --git a/arch/x86/include/asm/xen/hypervisor.h 
b/arch/x86/include/asm/xen/hypervisor.h
index 5fc35f889cd1..ed05ce3df5c7 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -36,6 +36,7 @@
 extern struct shared_info *HYPERVISOR_shared_info;
 extern struct start_info *xen_start_info;
 
+#include 
 #include 
 
 #define XEN_SIGNATURE "XenVMMXenVMM"
@@ -63,4 +64,29 @@ void __init xen_pvh_init(struct boot_params *boot_params);
 void __init mem_map_via_hcall(struct boot_params *boot_params_p);
 #endif
 
+/* Lazy mode for batching updates / context switch */
+enum xen_lazy_mode {
+   XEN_LAZY_NONE,
+   XEN_LAZY_MMU,
+   XEN_LAZY_CPU,
+};
+
+DECLARE_PER_CPU(enum xen_lazy_mode, xen_lazy_mode);
+
+static inline void enter_lazy(enum xen_lazy_mode mode)
+{
+   BUG_ON(this_cpu_read(xen_lazy_mode) != XEN_LAZY_NONE);
+
+   this_cpu_write(xen_lazy_mode, mode);
+}
+
+static inline void leave_lazy(enum xen_lazy_mode mode)
+{
+   BUG_ON(this_cpu_read(xen_lazy_mode) != mode);
+
+   this_cpu_write(xen_lazy_mode, XEN_LAZY_NONE);
+}
+
+enum xen_lazy_mode xen_get_lazy_mode(void);
+
 #endif /* _ASM_X86_XEN_HYPERVISOR_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 975f98d5eee5..97f1436c1a20 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -143,66 +143,7 @@ int paravirt_disable_iospace(void)
return request_resource(_resource, _ioports);
 }
 
-static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = 
PARAVIRT_LAZY_NONE;
-
-static inline void enter_lazy(enum paravirt_lazy_mode mode)
-{
-   BUG_ON(this_cpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
-
-   this_cpu_write(paravirt_lazy_mode, mode);
-}
-
-static void leave_lazy(enum paravirt_lazy_mode mode)
-{
-   BUG_ON(this_cpu_read(paravirt_lazy_mode) != mode);
-
-   this_cpu_write(paravirt_lazy_mode, PARAVIRT_LAZY_NONE);
-}
-
-void paravirt_enter_lazy_mmu(void)
-{
-   enter_lazy(PARAVIRT_LAZY_MMU);
-}
-
-void paravirt_leave_lazy_mmu(void)
-{
-   leave_lazy(PARAVIRT_LAZY_MMU);
-}
-
-void paravirt_flush_lazy_mmu(void)
-{
-   preempt_disable();
-
-   if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_MMU) {
-   arch_leave_lazy_mmu_mode();
-   arch_enter_lazy_mmu_mode();
-   }
-
-   preempt_enable();
-}
-
 #ifdef CONFIG_PARAVIRT_XXL
-void paravirt_start_context_switch(struct task_struct *prev)
-{
-   BUG_ON(preemptible());
-
-   if (this_cpu_read(paravirt_lazy_mode) == PARAVIRT_LAZY_MMU) {
-   arch_leave_lazy_mmu_mode();
-   set_ti_thread_flag(task_thread_info(prev), 
TIF_LAZY_MMU_UPDATES);
-   }
-   enter_lazy(PARAVIRT_LAZY_CPU);
-}
-
-void paravirt_end_context_switch(struct task_struct *next)
-{
-   BUG_ON(preemptible());
-
-   leave_lazy(PARAVIRT_LAZY_CPU);
-
-   if (test_and_clear_ti_thread_flag(task_thread_info(next), 
TIF_LAZY_MMU_UPDATES))
-   arch_enter_lazy_mmu_mode();
-}

[PATCH 0/3] xen: cleanup and fix lazy mode handling

2023-09-13 Thread Juergen Gross
This small series is cleaning up Xen lazy mode handling by removing
unused stuff and moving purely Xen-specific code away from general
kernel code.

The last patch is fixing a regression which was introduced in the
6.6 merge window.

Juergen Gross (3):
  arm/xen: remove lazy mode related definitions
  x86/xen: move paravirt lazy code
  x86/xen: allow nesting of same lazy mode

 arch/x86/include/asm/paravirt_types.h | 15 --
 arch/x86/include/asm/xen/hypervisor.h | 37 +++
 arch/x86/kernel/paravirt.c| 67 ---
 arch/x86/xen/enlighten_pv.c   | 40 +---
 arch/x86/xen/mmu_pv.c | 55 ++
 arch/x86/xen/multicalls.h |  4 +-
 include/trace/events/xen.h| 12 ++---
 include/xen/arm/hypervisor.h  | 12 -
 8 files changed, 114 insertions(+), 128 deletions(-)

-- 
2.35.3



Re: [PATCH v6 3/4] remoteproc: k3: Split out functions common with M4 driver

2023-09-13 Thread Krzysztof Kozlowski
On 13/09/2023 13:16, Hari Nagalla wrote:
> From: Martyn Welch 
> 
> In the next commit we will be adding the M4F driver which shares a lot of
> commonality with the DSP driver. Split this shared functionality out so
> that it can be used by both drivers.
> 
> Signed-off-by: Martyn Welch 
> Signed-off-by: Hari Nagalla 
> ---
> Changes since v2:
>  - New patch (reordered refactored from v2)
> 
> Changes since v3:
>  - Removed "ipc_only" element from k3_rproc structure
>  - Refactored to bring 3 more common functions
> 
> Changes since v4:
>  - None
> 
> Changes since v5:
>  - Rearranged the functions order to match with the functions in
>ti_k3_dsp_remoteproc.c to ease review.
> 
>  drivers/remoteproc/Makefile   |   2 +-
>  drivers/remoteproc/ti_k3_common.c | 513 +++
>  drivers/remoteproc/ti_k3_dsp_remoteproc.c | 598 ++

Generate your patch correctly with -M/-B/-C so the move will be detected.

Best regards,
Krzysztof



Re: [PATCH v6 4/4] remoteproc: k3-m4: Add a remoteproc driver for M4F subsystem

2023-09-13 Thread Krzysztof Kozlowski
On 13/09/2023 13:16, Hari Nagalla wrote:
> From: Martyn Welch 
> 
> The AM62x and AM64x SoCs of the TI K3 family has a Cortex M4F core in
> the MCU domain. This core is typically used for safety applications in a
> stand alone mode. However, some application (non safety related) may
> want to use the M4F core as a generic remote processor with IPC to the
> host processor. The M4F core has internal IRAM and DRAM memories and are
> exposed to the system bus for code and data loading.
> 


>  drivers/remoteproc/Kconfig   |  13 +
>  drivers/remoteproc/Makefile  |   1 +
>  drivers/remoteproc/ti_k3_m4_remoteproc.c | 331 +++
>  3 files changed, 345 insertions(+)
>  create mode 100644 drivers/remoteproc/ti_k3_m4_remoteproc.c
> 
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> index 48845dc8fa85..85c1a3a2b987 100644
> --- a/drivers/remoteproc/Kconfig
> +++ b/drivers/remoteproc/Kconfig
> @@ -339,6 +339,19 @@ config TI_K3_DSP_REMOTEPROC
> It's safe to say N here if you're not interested in utilizing
> the DSP slave processors.
>  
> +config TI_K3_M4_REMOTEPROC
> + tristate "TI K3 M4 remoteproc support"
> + depends on ARCH_K3


Missing compile testing.

...

> +
> +static int k3_m4_rproc_probe(struct platform_device *pdev)
> +{
> + struct device *dev = &pdev->dev;
> + struct device_node *np = dev->of_node;
> + const struct k3_rproc_dev_data *data;
> + struct k3_rproc *kproc;
> + struct rproc *rproc;
> + const char *fw_name;
> + bool r_state = false;
> + bool p_state = false;
> + int ret = 0;
> + int ret1;
> +
> + data = of_device_get_match_data(dev);
> + if (!data)
> + return -ENODEV;
> +
> + ret = rproc_of_parse_firmware(dev, 0, &fw_name);
> + if (ret) {
> + dev_err(dev, "failed to parse firmware-name property, ret = 
> %d\n",
> + ret);
> + return ret;

Nope, the syntax is dev_err_probe().

> + }
> +
> + rproc = rproc_alloc(dev, dev_name(dev), _m4_rproc_ops, fw_name,
> + sizeof(*kproc));
> + if (!rproc)
> + return -ENOMEM;
> +
> + rproc->has_iommu = false;
> + rproc->recovery_disabled = true;
> + if (data->uses_lreset) {
> + rproc->ops->prepare = k3_rproc_prepare;
> + rproc->ops->unprepare = k3_rproc_unprepare;
> + }
> + kproc = rproc->priv;
> + kproc->rproc = rproc;
> + kproc->dev = dev;
> + kproc->data = data;
> +
> + kproc->ti_sci = ti_sci_get_by_phandle(np, "ti,sci");
> + if (IS_ERR(kproc->ti_sci)) {
> + ret = PTR_ERR(kproc->ti_sci);
> + if (ret != -EPROBE_DEFER) {

No, really, do not open-code existing code.

> + dev_err(dev, "failed to get ti-sci handle, ret = %d\n",
> + ret);
> + }
> + kproc->ti_sci = NULL;
> + goto free_rproc;
> + }
> +
> + ret = of_property_read_u32(np, "ti,sci-dev-id", &kproc->ti_sci_id);
> + if (ret) {
> + dev_err(dev, "missing 'ti,sci-dev-id' property\n");
> + goto put_sci;
> + }
> +
> + kproc->reset = devm_reset_control_get_exclusive(dev, NULL);
> + if (IS_ERR(kproc->reset)) {
> + ret = PTR_ERR(kproc->reset);
> + dev_err(dev, "failed to get reset, status = %d\n", ret);

Syntax is return dev_err_probe. And everywhere else as well...
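
As a rough illustration of the pattern being asked for, the userspace mock
below shows why a single "return helper(...)" replaces the open-coded
-EPROBE_DEFER check: the helper stays quiet on deferral and hands the error
code back. err_probe_sketch(), get_handle_sketch() and messages_logged are
hypothetical names for this sketch only; the real kernel helper is
dev_err_probe(), which additionally takes the struct device.

```c
#include <stdarg.h>
#include <stdio.h>

#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace <errno.h> */

int messages_logged;		/* counter used only to observe the sketch */

/* Log unless the error is a probe deferral, then return the error. */
int err_probe_sketch(int err, const char *fmt, ...)
{
	if (err != -EPROBE_DEFER) {
		va_list args;

		va_start(args, fmt);
		fprintf(stderr, "error %d: ", err);
		vfprintf(stderr, fmt, args);
		va_end(args);
		messages_logged++;
	}
	return err;
}

/* Hypothetical probe step: one line handles both logging and return. */
int get_handle_sketch(int lookup_result)
{
	if (lookup_result < 0)
		return err_probe_sketch(lookup_result,
					"failed to get ti-sci handle\n");
	return 0;
}
```

The caller no longer needs its own "if (ret != -EPROBE_DEFER)" branch, which
is the open-coding objected to above.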


Best regards,
Krzysztof



Re: [PATCH v6 4/4] remoteproc: k3-m4: Add a remoteproc driver for M4F subsystem

2023-09-13 Thread Krzysztof Kozlowski
On 13/09/2023 13:16, Hari Nagalla wrote:
> From: Martyn Welch 
> 
> The AM62x and AM64x SoCs of the TI K3 family has a Cortex M4F core in
> the MCU domain. This core is typically used for safety applications in a
> stand alone mode. However, some application (non safety related) may
> want to use the M4F core as a generic remote processor with IPC to the
> host processor. The M4F core has internal IRAM and DRAM memories and are
> exposed to the system bus for code and data loading.
> 
> A remote processor driver is added to support this subsystem, including
> being able to load and boot the M4F core. Loading includes to M4F
> internal memories and predefined external code/data memories. The
> carve outs for external contiguous memory is defined in the M4F device
> node and should match with the external memory declarations in the M4F
> image binary. The M4F subsystem has two resets. One reset is for the
> entire subsystem i.e including the internal memories and the other, a
> local reset is only for the M4F processing core. When loading the image,
> the driver first releases the subsystem reset, loads the firmware image
> and then releases the local reset to let the M4F processing core run.
> 
> Signed-off-by: Martyn Welch 
> Signed-off-by: Hari Nagalla 
> ---
> Changes since v1:
>  - Addressed minor review comments (refactoring completed in separate
>patch)
> 
> Changes since v2:
>  - Refactoring completed first, thus smaller change
> 
> Changes since v3:
>  - Removed 'ipc_only' flag and made changes in probe() to enact right
>operations
>  - Fixed spelling mistakes in commit message
>  - Changed some 'dev_err' messages to 'dev_info'
>  - Removed unnecessary checks rproc state
> 
> Changes since v4:
>  - None
> 
> Changes since v5:
>  - None
> 
>  drivers/remoteproc/Kconfig   |  13 +
>  drivers/remoteproc/Makefile  |   1 +
>  drivers/remoteproc/ti_k3_m4_remoteproc.c | 331 +++
>  3 files changed, 345 insertions(+)
>  create mode 100644 drivers/remoteproc/ti_k3_m4_remoteproc.c
> 
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> index 48845dc8fa85..85c1a3a2b987 100644
> --- a/drivers/remoteproc/Kconfig
> +++ b/drivers/remoteproc/Kconfig
> @@ -339,6 +339,19 @@ config TI_K3_DSP_REMOTEPROC
> It's safe to say N here if you're not interested in utilizing
> the DSP slave processors.
>  
> +config TI_K3_M4_REMOTEPROC
> + tristate "TI K3 M4 remoteproc support"
> + depends on ARCH_K3
> + select MAILBOX
> + select OMAP2PLUS_MBOX
> + help
> +   Say m here to support TI's M4 remote processor subsystems
> +   on various TI K3 family of SoCs through the remote processor
> +   framework.
> +
> +   It's safe to say N here if you're not interested in utilizing
> +   a remote processor.
> +
>  config TI_K3_R5_REMOTEPROC
>   tristate "TI K3 R5 remoteproc support"
>   depends on ARCH_K3
> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> index 55c552e27a45..e30908ca4bfc 100644
> --- a/drivers/remoteproc/Makefile
> +++ b/drivers/remoteproc/Makefile
> @@ -37,5 +37,6 @@ obj-$(CONFIG_ST_REMOTEPROC) += st_remoteproc.o
>  obj-$(CONFIG_ST_SLIM_REMOTEPROC) += st_slim_rproc.o
>  obj-$(CONFIG_STM32_RPROC)+= stm32_rproc.o
>  obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)   += ti_k3_dsp_remoteproc.o ti_k3_common.o
> +obj-$(CONFIG_TI_K3_M4_REMOTEPROC)+= ti_k3_m4_remoteproc.o ti_k3_common.o

Nope, please compile your code and fix all the warnings. There is a big
fat warning about including objects twice.

Best regards,
Krzysztof



Re: [PATCH v6 2/4] remoteproc: k3: Split out data structures common with M4 driver

2023-09-13 Thread Krzysztof Kozlowski
On 13/09/2023 13:16, Hari Nagalla wrote:
> From: Martyn Welch 
> 
> We will be adding the M4F driver which shares a lot of commonality
> with the DSP driver. Common data structures are introduced here.
> 
> Signed-off-by: Martyn Welch 
> Signed-off-by: Hari Nagalla 
> ---
> Changes in v6:
>  - Created a separate patch for data structures to ease review
> 
>  drivers/remoteproc/ti_k3_common.h | 103 ++
>  1 file changed, 103 insertions(+)
>  create mode 100644 drivers/remoteproc/ti_k3_common.h
> 
> diff --git a/drivers/remoteproc/ti_k3_common.h 
> b/drivers/remoteproc/ti_k3_common.h
> new file mode 100644
> index ..5e1f27741183
> --- /dev/null
> +++ b/drivers/remoteproc/ti_k3_common.h
> @@ -0,0 +1,103 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * TI K3 Remote Processor(s) driver common code
> + *
> + * Refactored from ti_k3_dsp_remoteproc.c.
> + *
> + * ti_k3_dsp_remoteproc.c:
> + * Copyright (C) 2018-2022 Texas Instruments Incorporated - 
> https://www.ti.com/
> + *   Suman Anna 
> + */
> +
> +#ifndef REMOTEPROC_TI_K3_COMMON_H
> +#define REMOTEPROC_TI_K3_COMMON_H
> +
> +#define KEYSTONE_RPROC_LOCAL_ADDRESS_MASK (SZ_16M - 1)
> +
> +/**
> + * struct k3_rproc_mem - internal memory structure
> + * @cpu_addr: MPU virtual address of the memory region
> + * @bus_addr: Bus address used to access the memory region
> + * @dev_addr: Device address of the memory region from DSP view
> + * @size: Size of the memory region
> + */
> +struct k3_rproc_mem {
> + void __iomem *cpu_addr;
> + phys_addr_t bus_addr;
> + u32 dev_addr;
> + size_t size;

Where is the split? I see only addition here.

Where is the usage of this header? This is basically dead code. Don't
add dead code, but instead actually move the structures here! Move is
cut and paste, not just paste.

Best regards,
Krzysztof



Re: [PATCH v6 1/4] dt-bindings: remoteproc: k3-m4f: Add K3 AM64x SoCs

2023-09-13 Thread Krzysztof Kozlowski
On 13/09/2023 13:16, Hari Nagalla wrote:
> K3 AM64x SoC has a Cortex M4F subsystem in the MCU voltage domain.
> The remote processor's life cycle management and IPC mechanisms are
> similar across the R5F and M4F cores from remote processor driver
> point of view. However, there are subtle differences in image loading
> and starting the M4F subsystems.
> 
> The YAML binding document provides the various node properties to be
> configured by the consumers of the M4F subsystem.
> 
> Signed-off-by: Martyn Welch 
> Signed-off-by: Hari Nagalla 
> ---
> Changes since v1:
>  - Spelling corrections
>  - Corrected to pass DT checks
> 
> Changes since v2:
>  - Missed spelling correction to commit message
> 
> Changes since v3:
>  - Removed unnecessary descriptions and used generic memory region names
>  - Made mboxes and memory-region optional
>  - Removed unrelated items from examples
> 
> Changes since v4:
>  - Rebased to the latest kernel-next tree
>  - Added optional sram memory region for m4f device node
> 
> Changes since v5:
>  - None

Hm, why none? There were errors in the binding to which you did not
respond. Did you just ignore them?

> 
>  .../bindings/remoteproc/ti,k3-m4f-rproc.yaml  | 136 ++
>  1 file changed, 136 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml
> 
> diff --git 
> a/Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml 
> b/Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml
> new file mode 100644
> index ..21b7f14d9dc4
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml
> @@ -0,0 +1,136 @@
> +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/remoteproc/ti,k3-m4f-rproc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: TI K3 M4F processor subsystems
> +
> +maintainers:
> +  - Hari Nagalla 
> +  - Mathieu Poirier 

Are you sure Mathieu has this device and is a maintainer of this device?

> +
> +description: |
> +  Some K3 family SoCs have Arm Cortex M4F cores. AM64x is a SoC in K3
> +  family with a M4F core. Typically safety oriented applications may use
> +  the M4F core in isolation without an IPC. Where as some industrial and
> +  home automation applications, may use the M4F core as a remote processor
> +  with IPC communications.
> +
> +$ref: /schemas/arm/keystone/ti,k3-sci-common.yaml#
> +
> +properties:
> +

Drop blank line.

> +  compatible:
> +enum:
> +  - ti,am64-m4fss
> +
> +  power-domains:
> +maxItems: 1
> +
> +  "#address-cells":
> +const: 2
> +
> +  "#size-cells":
> +const: 2
> +
> +  reg:
> +items:
> +  - description: IRAM internal memory region
> +  - description: DRAM internal memory region
> +
> +  reg-names:
> +items:
> +  - const: iram
> +  - const: dram
> +
> +  resets:
> +maxItems: 1
> +
> +  firmware-name:
> +$ref: /schemas/types.yaml#/definitions/string

Wrong type. This is an array. You need maxItems instead.

> +description: Name of firmware to load for the M4F core
> +
> +  mboxes:
> +description: |
> +  Mailbox specifier denoting the sub-mailbox, to be used for 
> communication
> +  with the remote processor. This property should match with the
> +  sub-mailbox node used in the firmware image.
> +maxItems: 2

You need to describe the items instead.

> +
> +  memory-region:
> +description: |
> +  phandle to the reserved memory nodes to be associated with the
> +  remoteproc device. The reserved memory nodes should be carveout nodes,
> +  and should be defined with a "no-map" property as per the bindings in
> +  Documentation/devicetree/bindings/reserved-memory/reserved-memory.yaml
> +  Optional memory regions available for firmware specific purposes.
> +maxItems: 8
> +items:
> +  - description: regions used for DMA allocations like vrings, vring 
> buffers
> + and memory dedicated to firmware's specific purposes.
> +additionalItems: true


Best regards,
Krzysztof



[PATCH] x86/hyperv: Restrict get_vtl to only VTL platforms

2023-09-13 Thread Saurabh Sengar
For non-VTL platforms the VTL is always 0, so there is no need to call
get_vtl(). On VTL platforms get_vtl() should always succeed and
return the correct VTL.

Signed-off-by: Saurabh Sengar 
---
 arch/x86/hyperv/hv_init.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 783ed339f341..e589c240565a 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -416,8 +416,8 @@ static u8 __init get_vtl(void)
if (hv_result_success(ret)) {
ret = output->as64.low & HV_X64_VTL_MASK;
} else {
-   pr_err("Failed to get VTL(%lld) and set VTL to zero by 
default.\n", ret);
-   ret = 0;
+   pr_err("Failed to get VTL(error: %lld) exiting...\n", ret);
+   BUG();
}
 
local_irq_restore(flags);
@@ -604,8 +604,10 @@ void __init hyperv_init(void)
hv_query_ext_cap(0);
 
/* Find the VTL */
-   if (!ms_hyperv.paravisor_present && hv_isolation_type_snp())
+   if (IS_ENABLED(CONFIG_HYPERV_VTL_MODE))
ms_hyperv.vtl = get_vtl();
+   else
+   ms_hyperv.vtl = 0;
 
return;
 
-- 
2.34.1




[PATCH v6 3/4] remoteproc: k3: Split out functions common with M4 driver

2023-09-13 Thread Hari Nagalla
From: Martyn Welch 

In the next commit we will be adding the M4F driver which shares a lot of
commonality with the DSP driver. Split this shared functionality out so
that it can be used by both drivers.

Signed-off-by: Martyn Welch 
Signed-off-by: Hari Nagalla 
---
Changes since v2:
 - New patch (reordered refactored from v2)

Changes since v3:
 - Removed "ipc_only" element from k3_rproc structure
 - Refactored to bring 3 more common functions

Changes since v4:
 - None

Changes since v5:
 - Rearranged the functions order to match with the functions in
   ti_k3_dsp_remoteproc.c to ease review.

 drivers/remoteproc/Makefile   |   2 +-
 drivers/remoteproc/ti_k3_common.c | 513 +++
 drivers/remoteproc/ti_k3_dsp_remoteproc.c | 598 ++
 3 files changed, 543 insertions(+), 570 deletions(-)
 create mode 100644 drivers/remoteproc/ti_k3_common.c

diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
index 91314a9b43ce..55c552e27a45 100644
--- a/drivers/remoteproc/Makefile
+++ b/drivers/remoteproc/Makefile
@@ -36,6 +36,6 @@ obj-$(CONFIG_RCAR_REMOTEPROC) += rcar_rproc.o
 obj-$(CONFIG_ST_REMOTEPROC)+= st_remoteproc.o
 obj-$(CONFIG_ST_SLIM_REMOTEPROC)   += st_slim_rproc.o
 obj-$(CONFIG_STM32_RPROC)  += stm32_rproc.o
-obj-$(CONFIG_TI_K3_DSP_REMOTEPROC) += ti_k3_dsp_remoteproc.o
+obj-$(CONFIG_TI_K3_DSP_REMOTEPROC) += ti_k3_dsp_remoteproc.o ti_k3_common.o
 obj-$(CONFIG_TI_K3_R5_REMOTEPROC)  += ti_k3_r5_remoteproc.o
 obj-$(CONFIG_XLNX_R5_REMOTEPROC)   += xlnx_r5_remoteproc.o
diff --git a/drivers/remoteproc/ti_k3_common.c 
b/drivers/remoteproc/ti_k3_common.c
new file mode 100644
index ..4f42b068387b
--- /dev/null
+++ b/drivers/remoteproc/ti_k3_common.c
@@ -0,0 +1,513 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * TI K3 Remote Processor(s) driver common code
+ *
+ * Refactored from ti_k3_dsp_remoteproc.c.
+ *
+ * ti_k3_dsp_remoteproc.c:
+ * Copyright (C) 2018-2022 Texas Instruments Incorporated - https://www.ti.com/
+ * Suman Anna 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "omap_remoteproc.h"
+#include "remoteproc_internal.h"
+#include "ti_sci_proc.h"
+#include "ti_k3_common.h"
+
+/**
+ * k3_rproc_mbox_callback() - inbound mailbox message handler
+ * @client: mailbox client pointer used for requesting the mailbox channel
+ * @data: mailbox payload
+ *
+ * This handler is invoked by the K3 mailbox driver whenever a mailbox
+ * message is received. Usually, the mailbox payload simply contains
+ * the index of the virtqueue that is kicked by the remote processor,
+ * and we let remoteproc core handle it.
+ *
+ * In addition to virtqueue indices, we also have some out-of-band values
+ * that indicate different events. Those values are deliberately very
+ * large so they don't coincide with virtqueue indices.
+ */
+static void k3_rproc_mbox_callback(struct mbox_client *client, void *data)
+{
+   struct k3_rproc *kproc = container_of(client, struct k3_rproc,
+ client);
+   struct device *dev = kproc->rproc->dev.parent;
+   const char *name = kproc->rproc->name;
+   u32 msg = omap_mbox_message(data);
+
+   dev_dbg(dev, "mbox msg: 0x%x\n", msg);
+
+   switch (msg) {
+   case RP_MBOX_CRASH:
+   /*
+* remoteproc detected an exception, but error recovery is not
+* supported. So, just log this for now
+*/
+   dev_err(dev, "K3 rproc %s crashed\n", name);
+   break;
+   case RP_MBOX_ECHO_REPLY:
+   dev_info(dev, "received echo reply from %s\n", name);
+   break;
+   default:
+   /* silently handle all other valid messages */
+   if (msg >= RP_MBOX_READY && msg < RP_MBOX_END_MSG)
+   return;
+   if (msg > kproc->rproc->max_notifyid) {
+   dev_dbg(dev, "dropping unknown message 0x%x", msg);
+   return;
+   }
+   /* msg contains the index of the triggered vring */
+   if (rproc_vq_interrupt(kproc->rproc, msg) == IRQ_NONE)
+   dev_dbg(dev, "no message was found in vqid %d\n", msg);
+   }
+}
+
+/*
+ * Kick the remote processor to notify about pending unprocessed messages.
+ * The vqid usage is not used and is inconsequential, as the kick is performed
+ * through a simulated GPIO (a bit in an IPC interrupt-triggering register),
+ * the remote processor is expected to process both its Tx and Rx virtqueues.
+ */
+void k3_rproc_kick(struct rproc *rproc, int vqid)
+{
+   struct k3_rproc *kproc = rproc->priv;
+   struct device *dev = rproc->dev.parent;
+   mbox_msg_t msg = (mbox_msg_t)vqid;
+   int ret;
+
+   /* send the index of the triggered virtqueue in 

[PATCH v6 4/4] remoteproc: k3-m4: Add a remoteproc driver for M4F subsystem

2023-09-13 Thread Hari Nagalla
From: Martyn Welch 

The AM62x and AM64x SoCs of the TI K3 family have a Cortex M4F core in
the MCU domain. This core is typically used for safety applications in a
standalone mode. However, some (non-safety-related) applications may
want to use the M4F core as a generic remote processor with IPC to the
host processor. The M4F core has internal IRAM and DRAM memories, which
are exposed to the system bus for code and data loading.

A remote processor driver is added to support this subsystem, including
being able to load and boot the M4F core. Firmware can be loaded into the
M4F internal memories and into predefined external code/data memories. The
carveouts for external contiguous memory are defined in the M4F device
node and should match the external memory declarations in the M4F
image binary. The M4F subsystem has two resets: one resets the entire
subsystem, including the internal memories, and the other, a local
reset, affects only the M4F processing core. When loading the image,
the driver first releases the subsystem reset, loads the firmware image
and then releases the local reset to let the M4F processing core run.
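
The ordering described above can be sketched as a small stand-alone
model. This is only an illustration, not the driver code: the struct and
every sketch_* name are hypothetical, and the rule that loading needs the
subsystem out of reset while the core is still held is an assumption
drawn from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the M4F subsystem state; not a kernel structure. */
struct sketch_m4f {
	bool subsys_in_reset;	/* subsystem reset: core plus internal memories */
	bool core_in_reset;	/* local reset: processing core only */
	bool fw_loaded;
};

/* Start with both resets asserted and nothing loaded. */
void sketch_m4f_init(struct sketch_m4f *m)
{
	m->subsys_in_reset = true;
	m->core_in_reset = true;
	m->fw_loaded = false;
}

int sketch_release_subsys_reset(struct sketch_m4f *m)
{
	m->subsys_in_reset = false;
	return 0;
}

/*
 * Assumption of this model: the image can only be written once the
 * subsystem reset is released, and the core must still be held in local
 * reset so that it cannot run a partially loaded image.
 */
int sketch_load_firmware(struct sketch_m4f *m)
{
	if (m->subsys_in_reset || !m->core_in_reset)
		return -1;
	m->fw_loaded = true;
	return 0;
}

int sketch_release_local_reset(struct sketch_m4f *m)
{
	if (m->subsys_in_reset || !m->fw_loaded)
		return -1;
	m->core_in_reset = false;
	return 0;
}

/* The three steps in the order given in the commit message. */
int sketch_m4f_boot(struct sketch_m4f *m)
{
	int ret;

	ret = sketch_release_subsys_reset(m);
	if (ret)
		return ret;
	ret = sketch_load_firmware(m);
	if (ret)
		return ret;
	return sketch_release_local_reset(m);
}
```

In this model, calling sketch_load_firmware() before the subsystem reset
is released fails, which is the reason for the ordering.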

Signed-off-by: Martyn Welch 
Signed-off-by: Hari Nagalla 
---
Changes since v1:
 - Addressed minor review comments (refactoring completed in separate
   patch)

Changes since v2:
 - Refactoring completed first, thus smaller change

Changes since v3:
 - Removed 'ipc_only' flag and made changes in probe() to enact right
   operations
 - Fixed spelling mistakes in commit message
 - Changed some 'dev_err' messages to 'dev_info'
 - Removed unnecessary checks rproc state

Changes since v4:
 - None

Changes since v5:
 - None

 drivers/remoteproc/Kconfig   |  13 +
 drivers/remoteproc/Makefile  |   1 +
 drivers/remoteproc/ti_k3_m4_remoteproc.c | 331 +++
 3 files changed, 345 insertions(+)
 create mode 100644 drivers/remoteproc/ti_k3_m4_remoteproc.c

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 48845dc8fa85..85c1a3a2b987 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -339,6 +339,19 @@ config TI_K3_DSP_REMOTEPROC
  It's safe to say N here if you're not interested in utilizing
  the DSP slave processors.
 
+config TI_K3_M4_REMOTEPROC
+   tristate "TI K3 M4 remoteproc support"
+   depends on ARCH_K3
+   select MAILBOX
+   select OMAP2PLUS_MBOX
+   help
+ Say m here to support TI's M4 remote processor subsystems
+ on various TI K3 family of SoCs through the remote processor
+ framework.
+
+ It's safe to say N here if you're not interested in utilizing
+ a remote processor.
+
 config TI_K3_R5_REMOTEPROC
tristate "TI K3 R5 remoteproc support"
depends on ARCH_K3
diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
index 55c552e27a45..e30908ca4bfc 100644
--- a/drivers/remoteproc/Makefile
+++ b/drivers/remoteproc/Makefile
@@ -37,5 +37,6 @@ obj-$(CONFIG_ST_REMOTEPROC)   += st_remoteproc.o
 obj-$(CONFIG_ST_SLIM_REMOTEPROC)   += st_slim_rproc.o
 obj-$(CONFIG_STM32_RPROC)  += stm32_rproc.o
 obj-$(CONFIG_TI_K3_DSP_REMOTEPROC) += ti_k3_dsp_remoteproc.o ti_k3_common.o
+obj-$(CONFIG_TI_K3_M4_REMOTEPROC)  += ti_k3_m4_remoteproc.o ti_k3_common.o
 obj-$(CONFIG_TI_K3_R5_REMOTEPROC)  += ti_k3_r5_remoteproc.o
 obj-$(CONFIG_XLNX_R5_REMOTEPROC)   += xlnx_r5_remoteproc.o
diff --git a/drivers/remoteproc/ti_k3_m4_remoteproc.c b/drivers/remoteproc/ti_k3_m4_remoteproc.c
new file mode 100644
index ..19151cc31bde
--- /dev/null
+++ b/drivers/remoteproc/ti_k3_m4_remoteproc.c
@@ -0,0 +1,331 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * TI K3 Cortex-M4 Remote Processor(s) driver
+ *
+ * Copyright (C) 2021 Texas Instruments Incorporated - https://www.ti.com/
+ * Hari Nagalla 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "omap_remoteproc.h"
+#include "remoteproc_internal.h"
+#include "ti_sci_proc.h"
+#include "ti_k3_common.h"
+
+/*
+ * Power up the M4F remote processor.
+ *
+ * This function will be invoked only after the firmware for this rproc
+ * was loaded, parsed successfully, and all of its resource requirements
+ * were met. This callback is invoked only in remoteproc mode.
+ */
+static int k3_m4_rproc_start(struct rproc *rproc)
+{
+   struct k3_rproc *kproc = rproc->priv;
+   u32 boot_addr;
+   int ret;
+
+   ret = k3_rproc_request_mbox(rproc);
+   if (ret)
+   return ret;
+
+   boot_addr = rproc->bootaddr;
+   ret = k3_rproc_release(kproc);
+   if (ret)
+   goto put_mbox;
+
+   return 0;
+
+put_mbox:
+   mbox_free_channel(kproc->mbox);
+   return ret;
+}
+
+/*
+ * Stop the M4 remote processor.
+ *
+ * This function puts the M4 processor into reset, and finishes processing
+ * of any pending messages. This 

[PATCH v6 1/4] dt-bindings: remoteproc: k3-m4f: Add K3 AM64x SoCs

2023-09-13 Thread Hari Nagalla
The K3 AM64x SoC has a Cortex M4F subsystem in the MCU voltage domain.
The remote processor's life cycle management and IPC mechanisms are
similar across the R5F and M4F cores from the remote processor driver's
point of view. However, there are subtle differences in image loading
and starting the M4F subsystems.

The YAML binding document provides the various node properties to be
configured by the consumers of the M4F subsystem.

Signed-off-by: Martyn Welch 
Signed-off-by: Hari Nagalla 
---
Changes since v1:
 - Spelling corrections
 - Corrected to pass DT checks

Changes since v2:
 - Missed spelling correction to commit message

Changes since v3:
 - Removed unnecessary descriptions and used generic memory region names
 - Made mboxes and memory-region optional
 - Removed unrelated items from examples

Changes since v4:
 - Rebased to the latest kernel-next tree
 - Added optional sram memory region for m4f device node

Changes since v5:
 - None

 .../bindings/remoteproc/ti,k3-m4f-rproc.yaml  | 136 ++
 1 file changed, 136 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml

diff --git a/Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml b/Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml
new file mode 100644
index ..21b7f14d9dc4
--- /dev/null
+++ b/Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml
@@ -0,0 +1,136 @@
+# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/remoteproc/ti,k3-m4f-rproc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: TI K3 M4F processor subsystems
+
+maintainers:
+  - Hari Nagalla 
+  - Mathieu Poirier 
+
+description: |
+  Some K3 family SoCs have Arm Cortex M4F cores. AM64x is an SoC in the K3
+  family with an M4F core. Safety-oriented applications typically use
+  the M4F core in isolation without IPC, whereas some industrial and
+  home automation applications may use the M4F core as a remote processor
+  with IPC communications.
+
+$ref: /schemas/arm/keystone/ti,k3-sci-common.yaml#
+
+properties:
+
+  compatible:
+enum:
+  - ti,am64-m4fss
+
+  power-domains:
+maxItems: 1
+
+  "#address-cells":
+const: 2
+
+  "#size-cells":
+const: 2
+
+  reg:
+items:
+  - description: IRAM internal memory region
+  - description: DRAM internal memory region
+
+  reg-names:
+items:
+  - const: iram
+  - const: dram
+
+  resets:
+maxItems: 1
+
+  firmware-name:
+$ref: /schemas/types.yaml#/definitions/string
+description: Name of firmware to load for the M4F core
+
+  mboxes:
+description: |
+  Mailbox specifier denoting the sub-mailbox, to be used for communication
+  with the remote processor. This property should match with the
+  sub-mailbox node used in the firmware image.
+maxItems: 2
+
+  memory-region:
+description: |
+  phandle to the reserved memory nodes to be associated with the
+  remoteproc device. The reserved memory nodes should be carveout nodes,
+  and should be defined with a "no-map" property as per the bindings in
+  Documentation/devicetree/bindings/reserved-memory/reserved-memory.yaml
+  Optional memory regions available for firmware specific purposes.
+maxItems: 8
+items:
+  - description: regions used for DMA allocations like vrings, vring buffers
+ and memory dedicated to firmware's specific purposes.
+additionalItems: true
+
+  sram:
+$ref: /schemas/types.yaml#/definitions/phandle-array
+minItems: 1
+maxItems: 4
+items:
+  maxItems: 4
+description: |
+  phandles to one or more reserved on-chip SRAM regions. The regions
+  should be defined as child nodes of the respective SRAM node, and
+  should be defined as per the generic bindings in,
+  Documentation/devicetree/bindings/sram/sram.yaml
+
+required:
+  - compatible
+  - reg
+  - reg-names
+  - ti,sci
+  - ti,sci-dev-id
+  - ti,sci-proc-ids
+  - resets
+  - firmware-name
+
+unevaluatedProperties: false
+
+examples:
+  - |
+reserved-memory {
+#address-cells = <2>;
+#size-cells = <2>;
+
+mcu_m4fss_dma_memory_region: m4f-dma-memory@9cb0 {
+compatible = "shared-dma-pool";
+reg = <0x00 0x9cb0 0x00 0x10>;
+no-map;
+};
+
+mcu_m4fss_memory_region: m4f-memory@9cc0 {
+compatible = "shared-dma-pool";
+reg = <0x00 0x9cc0 0x00 0xe0>;
+no-map;
+};
+};
+
+soc {
+#address-cells = <2>;
+#size-cells = <2>;
+
+remoteproc@500 {
+compatible = "ti,am64-m4fss";
+reg = <0x00 0x500 0x00 0x3>,
+  <0x00 0x504 0x00 0x1>;
+reg-names = "iram", "dram";
+ti,sci = <>;
+ti,sci-dev-id = <9>;
+

[PATCH v6 0/4] TI K3 M4F support on AM64x and AM62x SoCs

2023-09-13 Thread Hari Nagalla
The following series introduces K3 M4F remoteproc driver support for
the AM64x and AM62x SoC families. These SoCs have an Arm Cortex M4F core in
the MCU voltage domain. For safety-oriented applications, this core is
operated independently, without any IPC to other cores on the SoC.
However, for non-safety applications, some customers use it as a remote
processor, so Linux remoteproc support is extended to the M4F core.

See AM64x Technical Reference Manual (SPRUIM2C – SEPTEMBER 2021) for
further details: https://www.ti.com/lit/pdf/SPRUIM2

See AM62x Technical Reference Manual (SPRUIV7A – MAY 2022) for
further details: https://www.ti.com/lit/pdf/SPRUIV7A

Hari Nagalla (1):
  dt-bindings: remoteproc: k3-m4f: Add K3 AM64x SoCs

Martyn Welch (3):
  remoteproc: k3: Split out data structures common with M4 driver
  remoteproc: k3: Split out functions common with M4 driver
  remoteproc: k3-m4: Add a remoteproc driver for M4F subsystem

 .../bindings/remoteproc/ti,k3-m4f-rproc.yaml  | 136 
 drivers/remoteproc/Kconfig|  13 +
 drivers/remoteproc/Makefile   |   3 +-
 drivers/remoteproc/ti_k3_common.c | 513 +++
 drivers/remoteproc/ti_k3_common.h | 103 +++
 drivers/remoteproc/ti_k3_dsp_remoteproc.c | 598 +-
 drivers/remoteproc/ti_k3_m4_remoteproc.c  | 331 ++
 7 files changed, 1127 insertions(+), 570 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/remoteproc/ti,k3-m4f-rproc.yaml
 create mode 100644 drivers/remoteproc/ti_k3_common.c
 create mode 100644 drivers/remoteproc/ti_k3_common.h
 create mode 100644 drivers/remoteproc/ti_k3_m4_remoteproc.c

-- 
2.34.1



[PATCH v6 2/4] remoteproc: k3: Split out data structures common with M4 driver

2023-09-13 Thread Hari Nagalla
From: Martyn Welch 

We will be adding the M4F driver which shares a lot of commonality
with the DSP driver. Common data structures are introduced here.

Signed-off-by: Martyn Welch 
Signed-off-by: Hari Nagalla 
---
Changes in v6:
 - Created a separate patch for data structures to ease review

 drivers/remoteproc/ti_k3_common.h | 103 ++
 1 file changed, 103 insertions(+)
 create mode 100644 drivers/remoteproc/ti_k3_common.h

diff --git a/drivers/remoteproc/ti_k3_common.h b/drivers/remoteproc/ti_k3_common.h
new file mode 100644
index ..5e1f27741183
--- /dev/null
+++ b/drivers/remoteproc/ti_k3_common.h
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * TI K3 Remote Processor(s) driver common code
+ *
+ * Refactored from ti_k3_dsp_remoteproc.c.
+ *
+ * ti_k3_dsp_remoteproc.c:
+ * Copyright (C) 2018-2022 Texas Instruments Incorporated - https://www.ti.com/
+ * Suman Anna 
+ */
+
+#ifndef REMOTEPROC_TI_K3_COMMON_H
+#define REMOTEPROC_TI_K3_COMMON_H
+
+#define KEYSTONE_RPROC_LOCAL_ADDRESS_MASK  (SZ_16M - 1)
+
+/**
+ * struct k3_rproc_mem - internal memory structure
+ * @cpu_addr: MPU virtual address of the memory region
+ * @bus_addr: Bus address used to access the memory region
+ * @dev_addr: Device address of the memory region from DSP view
+ * @size: Size of the memory region
+ */
+struct k3_rproc_mem {
+   void __iomem *cpu_addr;
+   phys_addr_t bus_addr;
+   u32 dev_addr;
+   size_t size;
+};
+
+/**
+ * struct k3_rproc_mem_data - memory definitions for a DSP
+ * @name: name for this memory entry
+ * @dev_addr: device address for the memory entry
+ */
+struct k3_rproc_mem_data {
+   const char *name;
+   const u32 dev_addr;
+};
+
+/**
+ * struct k3_rproc_dev_data - device data structure for a DSP
+ * @mems: pointer to memory definitions for a DSP
+ * @num_mems: number of memory regions in @mems
+ * @boot_align_addr: boot vector address alignment granularity
+ * @uses_lreset: flag to denote the need for local reset management
+ */
+struct k3_rproc_dev_data {
+   const struct k3_rproc_mem_data *mems;
+   u32 num_mems;
+   u32 boot_align_addr;
+   bool uses_lreset;
+};
+
+/**
+ * struct k3_rproc - k3 remote processor driver structure
+ * @dev: cached device pointer
+ * @rproc: remoteproc device handle
+ * @mem: internal memory regions data
+ * @num_mems: number of internal memory regions
+ * @rmem: reserved memory regions data
+ * @num_rmems: number of reserved memory regions
+ * @reset: reset control handle
+ * @data: pointer to device data
+ * @tsp: TI-SCI processor control handle
+ * @ti_sci: TI-SCI handle
+ * @ti_sci_id: TI-SCI device identifier
+ * @mbox: mailbox channel handle
+ * @client: mailbox client to request the mailbox channel
+ */
+struct k3_rproc {
+   struct device *dev;
+   struct rproc *rproc;
+   struct k3_rproc_mem *mem;
+   int num_mems;
+   struct k3_rproc_mem *rmem;
+   int num_rmems;
+   struct reset_control *reset;
+   const struct k3_rproc_dev_data *data;
+   struct ti_sci_proc *tsp;
+   const struct ti_sci_handle *ti_sci;
+   u32 ti_sci_id;
+   struct mbox_chan *mbox;
+   struct mbox_client client;
+};
+
+void k3_rproc_kick(struct rproc *rproc, int vqid);
+int k3_rproc_reset(struct k3_rproc *kproc);
+int k3_rproc_release(struct k3_rproc *kproc);
+int k3_rproc_request_mbox(struct rproc *rproc);
+int k3_rproc_prepare(struct rproc *rproc);
+int k3_rproc_unprepare(struct rproc *rproc);
+struct resource_table *k3_get_loaded_rsc_table(struct rproc *rproc,
+  size_t *rsc_table_sz);
+void *k3_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len,
+   bool *is_iomem);
+int k3_rproc_of_get_memories(struct platform_device *pdev,
+struct k3_rproc *kproc);
+int k3_reserved_mem_init(struct k3_rproc *kproc);
+void k3_reserved_mem_exit(struct k3_rproc *kproc);
+struct ti_sci_proc *k3_rproc_of_get_tsp(struct device *dev,
+   const struct ti_sci_handle *sci);
+#endif /* REMOTEPROC_TI_K3_COMMON_H */
-- 
2.34.1



Re: [PATCH v2 01/14] arm64: dts: qcom: msm8916: Drop RPM bus clocks

2023-09-13 Thread Konrad Dybcio
On 13.09.2023 13:14, Krzysztof Kozlowski wrote:
> On 13/09/2023 12:48, Konrad Dybcio wrote:
>> On 13.09.2023 10:53, Krzysztof Kozlowski wrote:
>>> On 13/09/2023 10:47, Konrad Dybcio wrote:
>>>> On 13.09.2023 09:07, Krzysztof Kozlowski wrote:
>>>>> On 12/09/2023 15:31, Konrad Dybcio wrote:
>>>>>> These clocks are now handled from within the icc framework and are
>>>>>
>>>>> That's a driver behavior, not hardware.
>>>> I believe we've been over this already..
>>>>
>>>> The rationale behind this change is: that hardware, which falls
>>>> under the "interconnect" class, was previously misrepresented as
>>>> a bunch of clocks. There are clocks underneath, but accessing them
>>>> directly would be equivalent to e.g. circumventing the PHY subsystem
>>>> and initializing your UFS PHY from within the UFS device.
>>>
>>> And every time one write such commit msg, how should we remember there
>>> is some exception and actually it is about clock representation not CCF
>>> or ICC framework.
>> So is your reply essentially "fine, but please make it clear in
>> each commit message"?
> 
> I am fine with this change. If commit msg had such statement, I would
> not have doubts :/
Ok, I'll resend, thanks for confirming!

Konrad


Re: [PATCH v2 01/14] arm64: dts: qcom: msm8916: Drop RPM bus clocks

2023-09-13 Thread Krzysztof Kozlowski
On 13/09/2023 12:48, Konrad Dybcio wrote:
> On 13.09.2023 10:53, Krzysztof Kozlowski wrote:
>> On 13/09/2023 10:47, Konrad Dybcio wrote:
>>> On 13.09.2023 09:07, Krzysztof Kozlowski wrote:
>>>> On 12/09/2023 15:31, Konrad Dybcio wrote:
>>>>> These clocks are now handled from within the icc framework and are
>>>>
>>>> That's a driver behavior, not hardware.
>>> I believe we've been over this already..
>>>
>>> The rationale behind this change is: that hardware, which falls
>>> under the "interconnect" class, was previously misrepresented as
>>> a bunch of clocks. There are clocks underneath, but accessing them
>>> directly would be equivalent to e.g. circumventing the PHY subsystem
>>> and initializing your UFS PHY from within the UFS device.
>>
>> And every time one write such commit msg, how should we remember there
>> is some exception and actually it is about clock representation not CCF
>> or ICC framework.
> So is your reply essentially "fine, but please make it clear in
> each commit message"?

I am fine with this change. If commit msg had such statement, I would
not have doubts :/

Best regards,
Krzysztof



[PATCH 11/19] fs: add new shutdown_sb and free_sb methods

2023-09-13 Thread Christoph Hellwig
Currently super_blocks are shut down using the ->kill_sb method, which
must call generic_shutdown_super, but allows the file system to
add extra work before or after the call to generic_shutdown_super.

File systems tend to get rather confused by this, so add an alternative
shutdown sequence where generic_shutdown_super is called by the core
code, and there are extra ->shutdown_sb and ->free_sb hooks before and
after it.  To reduce the amount of boilerplate code, ->shutdown_sb is only
called if the super has finished initialization and ->d_root is set.
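
As a rough illustration of the resulting call order, here is a small
user-space model of the branch this patch adds to
deactivate_locked_super(). It is a sketch only: all sketch_* and demo_*
names are hypothetical stand-ins, not kernel API, and the "finished
initialization" check mentioned above is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_sb;

/* Hypothetical stand-in for the three file_system_type hooks. */
struct sketch_fs_type {
	void (*shutdown_sb)(struct sketch_sb *sb);
	void (*kill_sb)(struct sketch_sb *sb);
	void (*free_sb)(struct sketch_sb *sb);
};

struct sketch_sb {
	const struct sketch_fs_type *type;
	char trace[128];	/* records the order in which steps ran */
};

/* Append a step name to the trace, never overflowing the buffer. */
static void sketch_trace(struct sketch_sb *sb, const char *step)
{
	strncat(sb->trace, step, sizeof(sb->trace) - strlen(sb->trace) - 1);
}

static void sketch_generic_shutdown_super(struct sketch_sb *sb)
{
	sketch_trace(sb, "generic_shutdown;");
}

static void sketch_kill_super_notify(struct sketch_sb *sb)
{
	sketch_trace(sb, "notify;");
}

/* Same branch structure as the fs/super.c hunk in this patch. */
void sketch_deactivate(struct sketch_sb *sb)
{
	const struct sketch_fs_type *fs = sb->type;

	if (fs->kill_sb) {
		fs->kill_sb(sb);
	} else {
		if (fs->shutdown_sb)
			fs->shutdown_sb(sb);
		sketch_generic_shutdown_super(sb);
		if (fs->free_sb)
			fs->free_sb(sb);
	}
	sketch_kill_super_notify(sb);
}

static void demo_shutdown(struct sketch_sb *sb)
{
	sketch_trace(sb, "shutdown_sb;");
}

static void demo_free(struct sketch_sb *sb)
{
	sketch_trace(sb, "free_sb;");
}

/* Run a file system that uses only the new hooks; return the call order. */
const char *sketch_run_new_style(struct sketch_sb *sb)
{
	static const struct sketch_fs_type fs = {
		.shutdown_sb = demo_shutdown,
		.free_sb = demo_free,
	};

	sb->type = &fs;
	sb->trace[0] = '\0';
	sketch_deactivate(sb);
	return sb->trace;
}
```

In this model ->free_sb still runs before kill_super_notify(), which is
the ordering questioned elsewhere in this thread.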

Signed-off-by: Christoph Hellwig 
---
 Documentation/filesystems/locking.rst |  4 
 Documentation/filesystems/vfs.rst | 12 
 fs/super.c|  9 +++--
 include/linux/fs.h|  2 ++
 4 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 7be2900806c853..c33e2f03ed1f69 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -220,7 +220,9 @@ prototypes::
 
struct dentry *(*mount) (struct file_system_type *, int,
   const char *, void *);
+   void (*shutdown_sb) (struct super_block *);
void (*kill_sb) (struct super_block *);
+   void (*free_sb) (struct super_block *);
 
 locking rules:
 
@@ -228,7 +230,9 @@ locking rules:
 ops            may block
 ====
 mount          yes
+shutdown_sb    yes
 kill_sb        yes
+free_sb        yes
 ====
 
 ->mount() returns ERR_PTR or the root dentry; its superblock should be locked
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 99acc2e9867391..1a7c6926c31f34 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -119,7 +119,9 @@ members are defined:
const struct fs_parameter_spec *parameters;
struct dentry *(*mount) (struct file_system_type *, int,
const char *, void *);
+   void (*shutdown_sb) (struct super_block *);
void (*kill_sb) (struct super_block *);
+   void (*free_sb) (struct super_block *);
struct module *owner;
struct file_system_type * next;
struct hlist_head fs_supers;
@@ -155,10 +157,20 @@ members are defined:
the method to call when a new instance of this filesystem should
be mounted
 
+``shutdown_sb``
+   Cleanup after a super_block has reached a zero active count, and before
+   the VFS level cleanup happens.  Typically picks all fs-specific objects
+   (if any) that need destruction out of superblock and releases them.
+   Note: dentries and inodes are normally taken care of and do not need
+   specific handling unless they are pinned by kernel users.
+
 ``kill_sb``
the method to call when an instance of this filesystem should be
shut down
 
+``free_sb``
+   Free file system specific resources like sb->s_fs_info that are
+   still needed while inodes are freed during umount.
 
 ``owner``
for internal VFS use: you should initialize this to THIS_MODULE
diff --git a/fs/super.c b/fs/super.c
index 5c685b4944c2d6..8e173eccc8c113 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -480,10 +480,15 @@ void deactivate_locked_super(struct super_block *s)
 
unregister_shrinker(&s->s_shrink);
 
-   if (fs->kill_sb)
+   if (fs->kill_sb) {
fs->kill_sb(s);
-   else
+   } else {
+   if (fs->shutdown_sb)
+   fs->shutdown_sb(s);
generic_shutdown_super(s);
+   if (fs->free_sb)
+   fs->free_sb(s);
+   }
 
kill_super_notify(s);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 31b6b235b36efa..12fff7df3cc46b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2341,6 +2341,8 @@ struct file_system_type {
struct dentry *(*mount) (struct file_system_type *, int,
   const char *, void *);
void (*kill_sb) (struct super_block *);
+   void (*shutdown_sb)(struct super_block *sb);
+   void (*free_sb)(struct super_block *sb);
struct module *owner;
struct file_system_type * next;
struct hlist_head fs_supers;
-- 
2.39.2



[PATCH 05/19] fs: assign an anon dev_t in common code

2023-09-13 Thread Christoph Hellwig
All super_blocks need to have a valid dev_t, and except for block
based file systems that tends to be an anonymous dev_t.  Instead of
leaving that work to the file systems, assign the anonymous dev_t in
the core sget_fc and sget routines unless the file system already
assigned one in the set callback.  Note that this now makes the
set callback optional as a lot of file systems don't need it any more.
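
A minimal user-space sketch of that rule, for illustration only. It
assumes that "no dev_t assigned yet" is represented by s_dev == 0; the
fs/super.c hunk is not quoted in full here, so that detail and all
sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int sketch_dev_t;

/* Hypothetical stand-in for the two super_block fields involved. */
struct sketch_sb {
	sketch_dev_t s_dev;	/* assumption: 0 means "not assigned yet" */
	void *s_fs_info;
};

typedef int (*sketch_set_fn)(struct sketch_sb *sb, void *data);

static sketch_dev_t sketch_next_anon = 1;

/* Stand-in for the anonymous dev_t allocator; it never fails here. */
static int sketch_get_anon_bdev(sketch_dev_t *dev)
{
	*dev = sketch_next_anon++;
	return 0;
}

/*
 * Core setup: the set callback is optional, and an anonymous dev_t is
 * assigned only if the callback did not assign one itself.
 */
int sketch_sget_setup(struct sketch_sb *sb, sketch_set_fn set, void *data)
{
	int err;

	sb->s_dev = 0;
	sb->s_fs_info = NULL;
	if (set) {
		err = set(sb, data);
		if (err)
			return err;
	}
	if (!sb->s_dev)
		return sketch_get_anon_bdev(&sb->s_dev);
	return 0;
}

/* A callback that only stores private data, like v9fs_set_super() above. */
int sketch_set_info_only(struct sketch_sb *sb, void *data)
{
	sb->s_fs_info = data;
	return 0;
}

/* A callback that brings its own dev_t, as a block based fs would. */
int sketch_set_own_dev(struct sketch_sb *sb, void *data)
{
	sb->s_dev = *(sketch_dev_t *)data;
	return 0;
}
```

Passing NULL as the callback, as the afs and ecryptfs hunks below do,
takes the same path as a callback that leaves s_dev untouched.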

Signed-off-by: Christoph Hellwig 
---
 fs/9p/vfs_super.c  |  2 +-
 fs/afs/super.c | 12 +++-
 fs/btrfs/super.c   |  6 ++--
 fs/ceph/super.c|  7 +
 fs/ecryptfs/main.c |  2 +-
 fs/fuse/inode.c|  2 +-
 fs/fuse/virtio_fs.c|  2 +-
 fs/kernfs/mount.c  |  2 +-
 fs/nfs/super.c |  2 +-
 fs/orangefs/super.c|  2 +-
 fs/smb/client/cifsfs.c |  3 +-
 fs/super.c | 68 +-
 fs/ubifs/super.c   |  2 +-
 include/linux/fs.h |  2 --
 14 files changed, 58 insertions(+), 56 deletions(-)

diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
index 9e60eddf5179ed..e8b3641c98f886 100644
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -40,7 +40,7 @@ static const struct super_operations v9fs_super_ops, v9fs_super_ops_dotl;
 static int v9fs_set_super(struct super_block *s, void *data)
 {
s->s_fs_info = data;
-   return set_anon_super(s, data);
+   return 0;
 }
 
 /**
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 754b9828233497..84b135ad3496b1 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -435,11 +435,6 @@ static int afs_dynroot_test_super(struct super_block *sb, struct fs_context *fc)
as->dyn_root);
 }
 
-static int afs_set_super(struct super_block *sb, struct fs_context *fc)
-{
-   return set_anon_super(sb, NULL);
-}
-
 /*
  * fill in the superblock
  */
@@ -574,9 +569,10 @@ static int afs_get_tree(struct fs_context *fc)
fc->s_fs_info = as;
 
/* allocate a deviceless superblock */
-   sb = sget_fc(fc,
-as->dyn_root ? afs_dynroot_test_super : afs_test_super,
-afs_set_super);
+   if (as->dyn_root)
+   sb = sget_fc(fc, afs_dynroot_test_super, NULL);
+   else
+   sb = sget_fc(fc, afs_test_super, NULL);
if (IS_ERR(sb)) {
ret = PTR_ERR(sb);
goto error;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 01b86bd4eae8dc..063b9aa313c227 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1350,10 +1350,8 @@ static int btrfs_test_super(struct super_block *s, void *data)
 
 static int btrfs_set_super(struct super_block *s, void *data)
 {
-   int err = set_anon_super(s, data);
-   if (!err)
-   s->s_fs_info = data;
-   return err;
+   s->s_fs_info = data;
+   return 0;
 }
 
 /*
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 7feef0b35b97b5..cbeaab8c21d8e6 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -1191,7 +1191,6 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc,
 static int ceph_set_super(struct super_block *s, struct fs_context *fc)
 {
struct ceph_fs_client *fsc = s->s_fs_info;
-   int ret;
 
dout("set_super %p\n", s);
 
@@ -1211,11 +1210,7 @@ static int ceph_set_super(struct super_block *s, struct fs_context *fc)
s->s_flags |= SB_NODIRATIME | SB_NOATIME;
 
ceph_fscrypt_set_ops(s);
-
-   ret = set_anon_super_fc(s, fc);
-   if (ret != 0)
-   fsc->sb = NULL;
-   return ret;
+   return 0;
 }
 
 /*
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index d99b2311759166..3ed91537a3991a 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -505,7 +505,7 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
}
mount_crypt_stat = &sbi->mount_crypt_stat;
 
-   s = sget(fs_type, NULL, set_anon_super, flags, NULL);
+   s = sget(fs_type, NULL, NULL, flags, NULL);
if (IS_ERR(s)) {
rc = PTR_ERR(s);
goto out;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 42523edb32fd53..5731003b56a9c9 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1511,7 +1511,7 @@ static int fuse_get_tree_submount(struct fs_context *fsc)
 
fm->fc = fuse_conn_get(fc);
fsc->s_fs_info = fm;
-   sb = sget_fc(fsc, NULL, set_anon_super_fc);
+   sb = sget_fc(fsc, NULL, NULL);
if (fsc->s_fs_info)
fuse_mount_destroy(fm);
if (IS_ERR(sb))
diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 0a0d593e5a9c79..a52957df956394 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -1454,7 +1454,7 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
virtqueue_size - FUSE_HEADER_OVERHEAD);
 
fsc->s_fs_info = fm;
-   sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
+   sb = sget_fc(fsc, virtio_fs_test_super, NULL);
if 

[PATCH 06/19] qibfs: use simple_release_fs

2023-09-13 Thread Christoph Hellwig
qibfs currently has convoluted code to allow registering HCAs while qibfs
is not mounted and vice versa.  Switch to pinning the fs instance with
simple_pin_fs every time an entry is added and dropping that pin with
simple_release_fs when the entry is removed, and remove all the
boilerplate code.
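
The pin/release pairing can be modelled in a few lines of user-space C.
This is a sketch with hypothetical sketch_* names, not the libfs
implementation; it assumes the intent is that a failed add drops its own
pin, so that only successfully added entries keep the instance alive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vfsmount. */
struct sketch_mount {
	int unused;
};

static struct sketch_mount sketch_the_mount;

/* Model of pinning: create the mount on first use, count every pin. */
int sketch_pin_fs(struct sketch_mount **mnt, int *count)
{
	if (!*mnt)
		*mnt = &sketch_the_mount;
	(*count)++;
	return 0;
}

/* Model of releasing: drop one pin, tear the mount down with the last. */
void sketch_release_fs(struct sketch_mount **mnt, int *count)
{
	if (--(*count) == 0)
		*mnt = NULL;
}

static struct sketch_mount *sketch_qib_mount;
static int sketch_qib_mnt_count;

/* Add an entry; create_fails simulates an error while creating files. */
int sketch_qibfs_add(int create_fails)
{
	int ret = sketch_pin_fs(&sketch_qib_mount, &sketch_qib_mnt_count);

	if (ret)
		return ret;
	if (create_fails) {
		/* assumption: the failed add gives its pin back */
		sketch_release_fs(&sketch_qib_mount, &sketch_qib_mnt_count);
		return -1;
	}
	return 0;
}

/* Remove an entry and drop the pin taken when it was added. */
void sketch_qibfs_remove(void)
{
	sketch_release_fs(&sketch_qib_mount, &sketch_qib_mnt_count);
}

int sketch_qibfs_is_mounted(void)
{
	return sketch_qib_mount != NULL;
}
```

With two entries added, the instance in this model stays mounted until
the second remove, with no explicit "is it mounted yet" bookkeeping.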

Signed-off-by: Christoph Hellwig 
---
 drivers/infiniband/hw/qib/qib.h  |   4 +-
 drivers/infiniband/hw/qib/qib_fs.c   | 105 ++-
 drivers/infiniband/hw/qib/qib_init.c |  32 +++-
 3 files changed, 36 insertions(+), 105 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h
index 26c615772be390..f73c321d0bff88 100644
--- a/drivers/infiniband/hw/qib/qib.h
+++ b/drivers/infiniband/hw/qib/qib.h
@@ -1370,10 +1370,10 @@ void qib_device_remove(struct qib_devdata *);
 extern int qib_qsfp_dump(struct qib_pportdata *ppd, char *buf, int len);
 
 int __init qib_init_qibfs(void);
-int __exit qib_exit_qibfs(void);
+void __exit qib_exit_qibfs(void);
 
 int qibfs_add(struct qib_devdata *);
-int qibfs_remove(struct qib_devdata *);
+void qibfs_remove(struct qib_devdata *);
 
 int qib_pcie_init(struct pci_dev *, const struct pci_device_id *);
 int qib_pcie_ddinit(struct qib_devdata *, struct pci_dev *,
diff --git a/drivers/infiniband/hw/qib/qib_fs.c b/drivers/infiniband/hw/qib/qib_fs.c
index ed7d4b02f45a63..c52ca34b32e67d 100644
--- a/drivers/infiniband/hw/qib/qib_fs.c
+++ b/drivers/infiniband/hw/qib/qib_fs.c
@@ -43,7 +43,9 @@
 
 #define QIBFS_MAGIC 0x726a77
 
-static struct super_block *qib_super;
+static struct file_system_type qibfs_fs_type;
+static struct vfsmount *qib_mount;
+static int qib_mnt_count;
 
 #define private2dd(file) (file_inode(file)->i_private)
 
@@ -355,15 +357,19 @@ static const struct file_operations flash_ops = {
.llseek = default_llseek,
 };
 
-static int add_cntr_files(struct super_block *sb, struct qib_devdata *dd)
+int qibfs_add(struct qib_devdata *dd)
 {
struct dentry *dir, *tmp;
char unit[10];
int ret, i;
+   
+   ret = simple_pin_fs(&qibfs_fs_type, &qib_mount, &qib_mnt_count);
+   if (ret)
+   return ret;
 
/* create the per-unit directory */
snprintf(unit, sizeof(unit), "%u", dd->unit);
-   ret = create_file(unit, S_IFDIR|S_IRUGO|S_IXUGO, sb->s_root, &dir,
+   ret = create_file(unit, S_IFDIR|S_IRUGO|S_IXUGO, qib_mount->mnt_root, &dir,
  &simple_dir_operations, dd);
if (ret) {
pr_err("create_file(%s) failed: %d\n", unit, ret);
@@ -422,65 +428,35 @@ static int add_cntr_files(struct super_block *sb, struct qib_devdata *dd)
pr_err("create_file(%s/flash) failed: %d\n",
unit, ret);
 bail:
+   simple_release_fs(&qib_mount, &qib_mnt_count);
return ret;
 }
 
-static int remove_device_files(struct super_block *sb,
-  struct qib_devdata *dd)
+void qibfs_remove(struct qib_devdata *dd)
 {
struct dentry *dir;
char unit[10];
 
snprintf(unit, sizeof(unit), "%u", dd->unit);
-   dir = lookup_one_len_unlocked(unit, sb->s_root, strlen(unit));
-
-   if (IS_ERR(dir)) {
-   pr_err("Lookup of %s failed\n", unit);
-   return PTR_ERR(dir);
-   }
-   simple_recursive_removal(dir, NULL);
-   return 0;
+   dir = lookup_one_len_unlocked(unit, qib_mount->mnt_root, strlen(unit));
+   if (!IS_ERR(dir))
+   simple_recursive_removal(dir, NULL);
+   simple_release_fs(&qib_mount, &qib_mnt_count);
 }
 
-/*
- * This fills everything in when the fs is mounted, to handle umount/mount
- * after device init.  The direct add_cntr_files() call handles adding
- * them from the init code, when the fs is already mounted.
- */
 static int qibfs_fill_super(struct super_block *sb, struct fs_context *fc)
 {
-   struct qib_devdata *dd;
-   unsigned long index;
-   int ret;
-
static const struct tree_descr files[] = {
[2] = {"driver_stats", &driver_ops[0], S_IRUGO},
[3] = {"driver_stats_names", &driver_ops[1], S_IRUGO},
{""},
};
-
-   ret = simple_fill_super(sb, QIBFS_MAGIC, files);
-   if (ret) {
-   pr_err("simple_fill_super failed: %d\n", ret);
-   goto bail;
-   }
-
-   xa_for_each(&qib_dev_table, index, dd) {
-   ret = add_cntr_files(sb, dd);
-   if (ret)
-   goto bail;
-   }
-
-bail:
-   return ret;
+   return simple_fill_super(sb, QIBFS_MAGIC, files);
 }
 
 static int qibfs_get_tree(struct fs_context *fc)
 {
-   int ret = get_tree_single(fc, qibfs_fill_super);
-   if (ret == 0)
-   qib_super = fc->root->d_sb;
-   return ret;
+   return get_tree_single(fc, qibfs_fill_super);
 }
 
 static const struct fs_context_operations qibfs_context_ops = {
@@ -493,46 +469,11 @@ static int qibfs_init_fs_context(struct fs_context *fc)
return 0;
 }
 
-static void qibfs_kill_super(struct super_block *s)
-{
-   

[PATCH 17/19] NFS: move nfs_kill_super to fs_context.c

2023-09-13 Thread Christoph Hellwig
nfs_kill_super is only used in fs_context, so move it there.

Signed-off-by: Christoph Hellwig 
---
 fs/nfs/fs_context.c | 13 +
 fs/nfs/internal.h   |  1 -
 fs/nfs/super.c  | 16 
 fs/nfs/sysfs.h  |  2 ++
 4 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
index 853e8d609bb3bc..ee82e4cfb38bb5 100644
--- a/fs/nfs/fs_context.c
+++ b/fs/nfs/fs_context.c
@@ -21,8 +21,10 @@
 
 #include 
 
+#include "fscache.h"
 #include "nfs.h"
 #include "internal.h"
+#include "sysfs.h"
 
 #include "nfstrace.h"
 
@@ -1644,6 +1646,17 @@ static int nfs_init_fs_context(struct fs_context *fc)
return 0;
 }
 
+static void nfs_kill_super(struct super_block *s)
+{
+   struct nfs_server *server = NFS_SB(s);
+
+   nfs_sysfs_move_sb_to_server(server);
+   generic_shutdown_super(s);
+
+   nfs_fscache_release_super_cookie(s);
+   nfs_free_server(server);
+}
+
 struct file_system_type nfs_fs_type = {
.owner  = THIS_MODULE,
.name   = "nfs",
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 9c9cf764f6000d..49d5b03176c02d 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -447,7 +447,6 @@ extern const struct super_operations nfs_sops;
 bool nfs_auth_info_match(const struct nfs_auth_info *, rpc_authflavor_t);
 int nfs_try_get_tree(struct fs_context *);
 int nfs_get_tree_common(struct fs_context *);
-void nfs_kill_super(struct super_block *);
 
 extern struct rpc_stat nfs_rpcstat;
 
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 89131e855e1393..5ba793e7f262d4 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -1329,22 +1329,6 @@ int nfs_get_tree_common(struct fs_context *fc)
goto out;
 }
 
-/*
- * Destroy an NFS superblock
- */
-void nfs_kill_super(struct super_block *s)
-{
-   struct nfs_server *server = NFS_SB(s);
-
-   nfs_sysfs_move_sb_to_server(server);
-   generic_shutdown_super(s);
-
-   nfs_fscache_release_super_cookie(s);
-
-   nfs_free_server(server);
-}
-EXPORT_SYMBOL_GPL(nfs_kill_super);
-
 #if IS_ENABLED(CONFIG_NFS_V4)
 
 /*
diff --git a/fs/nfs/sysfs.h b/fs/nfs/sysfs.h
index c5d1990cade50a..44c8a1712149c2 100644
--- a/fs/nfs/sysfs.h
+++ b/fs/nfs/sysfs.h
@@ -8,6 +8,8 @@
 
 #define CONTAINER_ID_MAXLEN (64)
 
+struct nfs_net;
+
 struct nfs_netns_client {
struct kobject kobject;
struct kobject nfs_net_kobj;
-- 
2.39.2



[PATCH 03/19] fs: release anon dev_t in deactivate_locked_super

2023-09-13 Thread Christoph Hellwig
Releasing an anon dev_t is a very common thing when freeing a
super_block, as that's done for basically any not block based file
system (modulo the odd mtd special case).  So instead of requiring
a special ->kill_sb helper and a lot of boilerplate in more complicated
file systems, just release the anon dev_t in deactivate_locked_super if
the super_block was using one.

As the freeing is done after the main call to kill_super_notify, this
removes the need for having two slightly different call sites for it.

Signed-off-by: Christoph Hellwig 
---
 block/bdev.c|  1 -
 drivers/dax/super.c |  1 -
 drivers/dma-buf/dma-buf.c   |  1 -
 drivers/gpu/drm/drm_drv.c   |  1 -
 drivers/misc/cxl/api.c  |  1 -
 drivers/scsi/cxlflash/ocxl_hw.c |  1 -
 fs/9p/vfs_super.c   |  2 +-
 fs/afs/super.c  |  2 +-
 fs/aio.c|  1 -
 fs/anon_inodes.c|  1 -
 fs/autofs/inode.c   |  4 ++--
 fs/btrfs/super.c|  3 ++-
 fs/btrfs/tests/btrfs-tests.c|  1 -
 fs/ceph/super.c |  2 +-
 fs/coda/inode.c |  1 -
 fs/ecryptfs/main.c  |  3 ++-
 fs/erofs/super.c|  4 ++--
 fs/fuse/inode.c |  2 +-
 fs/fuse/virtio_fs.c |  2 +-
 fs/hostfs/hostfs_kern.c |  2 +-
 fs/kernfs/mount.c   |  2 +-
 fs/nfs/super.c  |  2 +-
 fs/nsfs.c   |  1 -
 fs/openpromfs/inode.c   |  1 -
 fs/orangefs/super.c |  2 +-
 fs/overlayfs/super.c|  1 -
 fs/pipe.c   |  1 -
 fs/proc/root.c  |  4 ++--
 fs/smb/client/cifsfs.c  |  2 +-
 fs/super.c  | 22 --
 fs/ubifs/super.c|  3 ++-
 fs/vboxsf/super.c   |  1 -
 include/linux/fs.h  |  1 -
 kernel/resource.c   |  1 -
 mm/secretmem.c  |  1 -
 net/socket.c|  1 -
 security/apparmor/apparmorfs.c  |  1 -
 37 files changed, 30 insertions(+), 53 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index f3b13aa1b7d428..9db691401497bb 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -358,7 +358,6 @@ static int bd_init_fs_context(struct fs_context *fc)
 static struct file_system_type bd_type = {
.name   = "bdev",
.init_fs_context = bd_init_fs_context,
-   .kill_sb= kill_anon_super,
 };
 
 struct super_block *blockdev_superblock __read_mostly;
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 0da9232ea1754b..a9315b7396e68a 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -397,7 +397,6 @@ static int dax_init_fs_context(struct fs_context *fc)
 static struct file_system_type dax_fs_type = {
.name   = "dax",
.init_fs_context = dax_init_fs_context,
-   .kill_sb= kill_anon_super,
 };
 
 static int dax_test(struct inode *inode, void *data)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 21916bba77d58b..7313e99f6e8ea5 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -125,7 +125,6 @@ static int dma_buf_fs_init_context(struct fs_context *fc)
 static struct file_system_type dma_buf_fs_type = {
.name = "dmabuf",
.init_fs_context = dma_buf_fs_init_context,
-   .kill_sb = kill_anon_super,
 };
 
 static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 3eda026ffac6a9..83676229cbe233 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -530,7 +530,6 @@ static struct file_system_type drm_fs_type = {
.name   = "drm",
.owner  = THIS_MODULE,
.init_fs_context = drm_fs_init_fs_context,
-   .kill_sb= kill_anon_super,
 };
 
 static struct inode *drm_fs_inode_new(void)
diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index d85c5653086357..05b40076a0b481 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -44,7 +44,6 @@ static struct file_system_type cxl_fs_type = {
.name   = "cxl",
.owner  = THIS_MODULE,
.init_fs_context = cxl_fs_init_fs_context,
-   .kill_sb= kill_anon_super,
 };
 
 
diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 6542818e595a64..20f22610b104df 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -43,7 +43,6 @@ static struct file_system_type ocxlflash_fs_type = {
.name   = "ocxlflash",
.owner  = THIS_MODULE,
.init_fs_context = ocxlflash_fs_init_fs_context,
-   .kill_sb= kill_anon_super,
 };
 
 /*
diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
index 73db55c050bf10..9e60eddf5179ed 100644
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -217,7 +217,7 

[PATCH 09/19] zonefs: remove duplicate cleanup in zonefs_fill_super

2023-09-13 Thread Christoph Hellwig
When ->fill_super fails, ->kill_sb is called which already cleans up
the inodes and zgroups.

Drop the extra cleanup code in zonefs_fill_super.

Signed-off-by: Christoph Hellwig 
---
 fs/zonefs/super.c | 21 +
 1 file changed, 5 insertions(+), 16 deletions(-)

diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 9d1a9808fbbba6..35b2554ce2ac2e 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -1309,13 +1309,12 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
/* Initialize the zone groups */
ret = zonefs_init_zgroups(sb);
if (ret)
-   goto cleanup;
+   return ret;
 
/* Create the root directory inode */
-   ret = -ENOMEM;
inode = new_inode(sb);
if (!inode)
-   goto cleanup;
+   return -ENOMEM;
 
inode->i_ino = bdev_nr_zones(sb->s_bdev);
inode->i_mode = S_IFDIR | 0555;
@@ -1333,7 +1332,7 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
 
sb->s_root = d_make_root(inode);
if (!sb->s_root)
-   goto cleanup;
+   return -ENOMEM;
 
/*
 * Take a reference on the zone groups directory inodes
@@ -1341,19 +1340,9 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
 */
ret = zonefs_get_zgroup_inodes(sb);
if (ret)
-   goto cleanup;
-
-   ret = zonefs_sysfs_register(sb);
-   if (ret)
-   goto cleanup;
-
-   return 0;
-
-cleanup:
-   zonefs_release_zgroup_inodes(sb);
-   zonefs_free_zgroups(sb);
+   return ret;
 
-   return ret;
+   return zonefs_sysfs_register(sb);
 }
 
 static struct dentry *zonefs_mount(struct file_system_type *fs_type,
-- 
2.39.2
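
The reasoning in the zonefs patch (a failed ->fill_super is followed by
->kill_sb, so fill_super can simply return the error) can be sketched as a
small userspace model.  Everything here is hypothetical: model_mount,
model_fill_super and model_kill_sb are illustrative stand-ins, not kernel
functions, and the struct is not the real super_block.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace model; names are illustrative only. */
struct super_block {
	void *zgroups;		/* resource set up by fill_super */
	int frees;		/* how many times it was released */
};

/* Release the resource if present; safe to call on a half-built sb. */
static void model_kill_sb(struct super_block *sb)
{
	if (sb->zgroups) {
		free(sb->zgroups);
		sb->zgroups = NULL;
		sb->frees++;
	}
}

/* Fails after allocating when fail_late is set; does no cleanup itself. */
static int model_fill_super(struct super_block *sb, int fail_late)
{
	sb->zgroups = malloc(16);
	if (!sb->zgroups)
		return -ENOMEM;
	if (fail_late)
		return -EINVAL;	/* plain return: teardown is the caller's job */
	return 0;
}

/* Models the mount path: a failed fill_super is followed by kill_sb. */
static int model_mount(struct super_block *sb, int fail_late)
{
	int ret = model_fill_super(sb, fail_late);

	if (ret)
		model_kill_sb(sb);
	return ret;
}
```

In this model the resource is released exactly once on the failure path,
which is why an extra cleanup label inside fill_super would be redundant.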



[PATCH 16/19] x86/resctrl: release rdtgroup_mutex and the CPU hotplug lock in rdt_shutdown_sb

2023-09-13 Thread Christoph Hellwig
While the resctrl code is a bit confusing, I can't find anything protected
by rdtgroup_mutex or the CPU hotplug lock in generic_shutdown_super or
kernfs_free_sb.  Drop the locks at the end of rdt_shutdown_sb to avoid
holding locks over method calls and VFS code which itself already has a
rather complicated locking hierarchy.

Signed-off-by: Christoph Hellwig 
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 8db767fd80df6b..e87de519493021 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2793,11 +2793,6 @@ static void rdt_shutdown_sb(struct super_block *sb)
static_branch_disable_cpuslocked(&rdt_mon_enable_key);
static_branch_disable_cpuslocked(&rdt_enable_key);
kernfs_shutdown_sb(sb);
-}
-
-static void rdt_free_sb(struct super_block *sb)
-{
-   kernfs_free_sb(sb);
mutex_unlock(&rdtgroup_mutex);
cpus_read_unlock();
 }
@@ -2807,7 +2802,7 @@ static struct file_system_type rdt_fs_type = {
.init_fs_context= rdt_init_fs_context,
.parameters = rdt_fs_parameters,
.shutdown_sb= rdt_shutdown_sb,
-   .free_sb= rdt_free_sb,
+   .free_sb= kernfs_free_sb,
 };
 
 static int mon_addfile(struct kernfs_node *parent_kn, const char *name,
-- 
2.39.2



[PATCH 02/19] fs: make ->kill_sb optional

2023-09-13 Thread Christoph Hellwig
Call generic_shutdown_super if ->kill_sb is not provided by the file
system.  This can't currently happen but will become common soon.

Signed-off-by: Christoph Hellwig 
---
 fs/super.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/super.c b/fs/super.c
index 127a17d958a482..ab234e6af48605 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -483,7 +483,11 @@ void deactivate_locked_super(struct super_block *s)
}
 
unregister_shrinker(&s->s_shrink);
-   fs->kill_sb(s);
+
+   if (fs->kill_sb)
+   fs->kill_sb(s);
+   else
+   generic_shutdown_super(s);
 
kill_super_notify(s);
 
-- 
2.39.2
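
The optional-method dispatch in this patch can be modeled outside the kernel
as follows.  This is a minimal sketch under stated assumptions: the structs
are cut-down stand-ins for the real ones, the call counters exist only for
illustration, and deactivate() and custom_kill_sb() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct super_block {
	int generic_calls;	/* times the generic fallback ran */
	int custom_calls;	/* times a filesystem-specific hook ran */
};

struct file_system_type {
	void (*kill_sb)(struct super_block *sb);	/* may be NULL */
};

/* Models the fs-independent teardown. */
static void generic_shutdown_super(struct super_block *sb)
{
	sb->generic_calls++;
}

/* Models a filesystem that wraps the generic helper with extra cleanup. */
static void custom_kill_sb(struct super_block *sb)
{
	sb->custom_calls++;
	generic_shutdown_super(sb);
}

/* Dispatch as in the hunk: use ->kill_sb if set, else the default. */
static void deactivate(const struct file_system_type *fs,
		       struct super_block *sb)
{
	if (fs->kill_sb)
		fs->kill_sb(sb);
	else
		generic_shutdown_super(sb);
}
```

Either way the generic teardown runs exactly once; a filesystem without
special needs just leaves the pointer unset.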



[PATCH 10/19] USB: gadget/legacy: remove sb_mutex

2023-09-13 Thread Christoph Hellwig
Creating a new super_block vs freeing the old one for single instance
file systems is serialized by the wait for SB_DEAD.

Remove the superfluous sb_mutex.

Signed-off-by: Christoph Hellwig 
---
 drivers/usb/gadget/legacy/inode.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c
index ce9e31f3d26bcc..a203266bc0dc82 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -229,7 +229,6 @@ static void put_ep (struct ep_data *data)
  */
 
 static const char *CHIP;
-static DEFINE_MUTEX(sb_mutex); /* Serialize superblock operations */
 
 /*--*/
 
@@ -2012,8 +2011,6 @@ gadgetfs_fill_super (struct super_block *sb, struct fs_context *fc)
struct dev_data *dev;
int rc;
 
-   mutex_lock(&sb_mutex);
-
if (the_device) {
rc = -ESRCH;
goto Done;
@@ -2069,7 +2066,6 @@ gadgetfs_fill_super (struct super_block *sb, struct fs_context *fc)
rc = -ENOMEM;
 
  Done:
-   mutex_unlock(&sb_mutex);
return rc;
 }
 
@@ -2092,7 +2088,6 @@ static int gadgetfs_init_fs_context(struct fs_context *fc)
 static void
 gadgetfs_kill_sb (struct super_block *sb)
 {
-   mutex_lock(&sb_mutex);
kill_litter_super (sb);
if (the_device) {
put_dev (the_device);
@@ -2100,7 +2095,6 @@ gadgetfs_kill_sb (struct super_block *sb)
}
kfree(CHIP);
CHIP = NULL;
-   mutex_unlock(&sb_mutex);
 }
 
 /*--*/
-- 
2.39.2



[PATCH 14/19] jffs2: convert to ->shutdown_sb and ->free_sb

2023-09-13 Thread Christoph Hellwig
Convert jffs2 from ->kill_sb to ->shutdown_sb and ->free_sb.  Drop
the otherwise unused kill_mtd_super helper, as it offers no benefit
over just calling put_mtd_device on sb->s_mtd.

Signed-off-by: Christoph Hellwig 
---
 drivers/mtd/mtdsuper.c| 12 
 fs/jffs2/super.c  | 22 ++
 include/linux/mtd/super.h |  2 --
 3 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/drivers/mtd/mtdsuper.c b/drivers/mtd/mtdsuper.c
index b7e3763c47f0cd..66da2e6f90f5f5 100644
--- a/drivers/mtd/mtdsuper.c
+++ b/drivers/mtd/mtdsuper.c
@@ -165,15 +165,3 @@ int get_tree_mtd(struct fs_context *fc,
return -EINVAL;
 }
 EXPORT_SYMBOL_GPL(get_tree_mtd);
-
-/*
- * destroy an MTD-based superblock
- */
-void kill_mtd_super(struct super_block *sb)
-{
-   generic_shutdown_super(sb);
-   put_mtd_device(sb->s_mtd);
-   sb->s_mtd = NULL;
-}
-
-EXPORT_SYMBOL_GPL(kill_mtd_super);
diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
index 7ea37f49f1e18e..14577368202e90 100644
--- a/fs/jffs2/super.c
+++ b/fs/jffs2/super.c
@@ -340,21 +340,27 @@ static void jffs2_put_super (struct super_block *sb)
jffs2_dbg(1, "%s(): returning\n", __func__);
 }
 
-static void jffs2_kill_sb(struct super_block *sb)
+static void jffs2_shutdown_sb(struct super_block *sb)
 {
struct jffs2_sb_info *c = JFFS2_SB_INFO(sb);
+
if (c && !sb_rdonly(sb))
jffs2_stop_garbage_collect_thread(c);
-   kill_mtd_super(sb);
-   kfree(c);
+}
+
+static void jffs2_free_sb(struct super_block *sb)
+{
+   put_mtd_device(sb->s_mtd);
+   kfree(JFFS2_SB_INFO(sb));
 }
 
 static struct file_system_type jffs2_fs_type = {
-   .owner =THIS_MODULE,
-   .name = "jffs2",
-   .init_fs_context = jffs2_init_fs_context,
-   .parameters =   jffs2_fs_parameters,
-   .kill_sb =  jffs2_kill_sb,
+   .owner  = THIS_MODULE,
+   .name   = "jffs2",
+   .init_fs_context= jffs2_init_fs_context,
+   .parameters = jffs2_fs_parameters,
+   .shutdown_sb= jffs2_shutdown_sb,
+   .free_sb= jffs2_free_sb,
 };
 MODULE_ALIAS_FS("jffs2");
 
diff --git a/include/linux/mtd/super.h b/include/linux/mtd/super.h
index 3608a6c36faceb..f6d5c1a17eec23 100644
--- a/include/linux/mtd/super.h
+++ b/include/linux/mtd/super.h
@@ -17,8 +17,6 @@
 extern int get_tree_mtd(struct fs_context *fc,
 int (*fill_super)(struct super_block *sb,
   struct fs_context *fc));
-extern void kill_mtd_super(struct super_block *sb);
-
 
 #endif /* __KERNEL__ */
 
-- 
2.39.2
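
The split that this conversion relies on (->shutdown_sb first, then
generic_shutdown_super, then ->free_sb) can be sketched as a userspace model.
It is not kernel code: the structs are reduced stand-ins, and record(),
model_deactivate() and the other model_* helpers are hypothetical names used
only to make the ordering observable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model of the split teardown; not kernel code. */
struct super_block {
	void *s_fs_info;		/* per-fs private data */
	char order[8];			/* phases seen, as 'S', 'G', 'F' */
	int info_live_in_generic;	/* was s_fs_info set during 'G'? */
};

struct file_system_type {
	void (*shutdown_sb)(struct super_block *sb);	/* optional */
	void (*free_sb)(struct super_block *sb);	/* optional */
};

/* Append one phase letter; order[] must start out zeroed. */
static void record(struct super_block *sb, char phase)
{
	size_t n = strlen(sb->order);

	if (n + 1 < sizeof(sb->order))
		sb->order[n] = phase;
}

/* Phase 1: stop background work that would disturb eviction. */
static void model_shutdown_sb(struct super_block *sb)
{
	record(sb, 'S');
}

/* Phase 2: fs-independent eviction; may still look at s_fs_info. */
static void model_generic_shutdown(struct super_block *sb)
{
	record(sb, 'G');
	sb->info_live_in_generic = (sb->s_fs_info != NULL);
}

/* Phase 3: release private data once eviction is finished. */
static void model_free_sb(struct super_block *sb)
{
	record(sb, 'F');
	free(sb->s_fs_info);
	sb->s_fs_info = NULL;
}

/* Run the three phases in order, skipping unset optional hooks. */
static void model_deactivate(const struct file_system_type *fs,
			     struct super_block *sb)
{
	if (fs->shutdown_sb)
		fs->shutdown_sb(sb);
	model_generic_shutdown(sb);
	if (fs->free_sb)
		fs->free_sb(sb);
}
```

The point of the model is that the private data outlives the generic phase,
which is the property the old kill_sb wrappers had to arrange by hand.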



[PATCH 15/19] kernfs: split ->kill_sb

2023-09-13 Thread Christoph Hellwig
Split the kernfs_kill_sb helper into helpers for the new split
shutdown_sb and free_sb methods.  Note that resctrl has very odd
locking in ->kill_sb, so with this commit the locks acquired in
rdt_shutdown_sb are only released in rdt_free_sb.  This is not very good
code and relies on ->shutdown_sb and ->free_sb always being called
in pairs, which they currently are.  The next commit will try to clean
this up.

Signed-off-by: Christoph Hellwig 
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 12 +---
 fs/kernfs/mount.c  | 18 --
 fs/sysfs/mount.c   |  7 ---
 include/linux/kernfs.h |  5 ++---
 kernel/cgroup/cgroup.c | 10 ++
 5 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 725344048f85da..8db767fd80df6b 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2772,7 +2772,7 @@ static void rmdir_all_sub(void)
kernfs_remove(kn_mondata);
 }
 
-static void rdt_kill_sb(struct super_block *sb)
+static void rdt_shutdown_sb(struct super_block *sb)
 {
struct rdt_resource *r;
 
@@ -2792,7 +2792,12 @@ static void rdt_kill_sb(struct super_block *sb)
static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
static_branch_disable_cpuslocked(&rdt_mon_enable_key);
static_branch_disable_cpuslocked(&rdt_enable_key);
-   kernfs_kill_sb(sb);
+   kernfs_shutdown_sb(sb);
+}
+
+static void rdt_free_sb(struct super_block *sb)
+{
+   kernfs_free_sb(sb);
mutex_unlock(&rdtgroup_mutex);
cpus_read_unlock();
 }
@@ -2801,7 +2806,8 @@ static struct file_system_type rdt_fs_type = {
.name   = "resctrl",
.init_fs_context= rdt_init_fs_context,
.parameters = rdt_fs_parameters,
-   .kill_sb= rdt_kill_sb,
+   .shutdown_sb= rdt_shutdown_sb,
+   .free_sb= rdt_free_sb,
 };
 
 static int mon_addfile(struct kernfs_node *parent_kn, const char *name,
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index d6d3cba669dbdd..32ec4ec3c878f6 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -379,14 +379,14 @@ void kernfs_free_fs_context(struct fs_context *fc)
 }
 
 /**
- * kernfs_kill_sb - kill_sb for kernfs
+ * kernfs_shutdown_sb - shutdown_sb for kernfs
  * @sb: super_block being killed
  *
- * This can be used directly for file_system_type->kill_sb().  If a kernfs
- * user needs extra cleanup, it can implement its own kill_sb() and call
+ * This can be used directly for file_system_type->shutdown_sb().  If a kernfs
+ * user needs extra cleanup, it can implement its own shutdown_sb() and call
  * this function at the end.
  */
-void kernfs_kill_sb(struct super_block *sb)
+void kernfs_shutdown_sb(struct super_block *sb)
 {
struct kernfs_super_info *info = kernfs_info(sb);
struct kernfs_root *root = info->root;
@@ -394,13 +394,11 @@ void kernfs_kill_sb(struct super_block *sb)
down_write(&root->kernfs_supers_rwsem);
list_del(&info->node);
up_write(&root->kernfs_supers_rwsem);
+}
 
-   /*
-* Remove the superblock from fs_supers/s_instances
-* so we can't find it, before freeing kernfs_super_info.
-*/
-   generic_shutdown_super(sb);
-   kfree(info);
+void kernfs_free_sb(struct super_block *sb)
+{
+   kfree(kernfs_info(sb));
 }
 
 static void __init kernfs_mutex_init(void)
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 98467bb7673781..804391342599bc 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -79,18 +79,19 @@ static int sysfs_init_fs_context(struct fs_context *fc)
return 0;
 }
 
-static void sysfs_kill_sb(struct super_block *sb)
+static void sysfs_free_sb(struct super_block *sb)
 {
void *ns = (void *)kernfs_super_ns(sb);
 
-   kernfs_kill_sb(sb);
+   kernfs_free_sb(sb);
kobj_ns_drop(KOBJ_NS_TYPE_NET, ns);
 }
 
 static struct file_system_type sysfs_fs_type = {
.name   = "sysfs",
.init_fs_context= sysfs_init_fs_context,
-   .kill_sb= sysfs_kill_sb,
+   .shutdown_sb= kernfs_shutdown_sb,
+   .free_sb= sysfs_free_sb,
.fs_flags   = FS_USERNS_MOUNT,
 };
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 2a36f3218b5106..940059251deac8 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -453,7 +453,8 @@ int kernfs_xattr_set(struct kernfs_node *kn, const char *name,
 const void *kernfs_super_ns(struct super_block *sb);
 int kernfs_get_tree(struct fs_context *fc);
 void kernfs_free_fs_context(struct fs_context *fc);
-void kernfs_kill_sb(struct super_block *sb);
+void kernfs_shutdown_sb(struct super_block *sb);
+void kernfs_free_sb(struct super_block *sb);
 
 void kernfs_init(void);
 
@@ -572,8 +573,6 @@ static 

[PATCH 12/19] fs: convert kill_litter_super to litter_shutdown_sb

2023-09-13 Thread Christoph Hellwig
Replace kill_litter_super with litter_shutdown_sb, which is wired up to
the ->shutdown_sb method.  For file systems that wrapped
kill_litter_super, ->kill_sb is replaced with ->shutdown_sb and ->free_sb
methods as needed.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/platforms/cell/spufs/inode.c | 10 +-
 arch/s390/hypfs/inode.c   |  6 +++---
 drivers/android/binderfs.c| 12 +++-
 drivers/base/devtmpfs.c   |  8 
 drivers/infiniband/hw/qib/qib_fs.c|  2 +-
 drivers/misc/ibmasm/ibmasmfs.c|  8 
 drivers/usb/gadget/function/f_fs.c|  6 +++---
 drivers/usb/gadget/legacy/inode.c | 12 ++--
 drivers/xen/xenfs/super.c |  8 
 fs/binfmt_misc.c  |  8 
 fs/configfs/mount.c   |  8 
 fs/debugfs/inode.c|  8 
 fs/devpts/inode.c |  6 +++---
 fs/efivarfs/super.c   | 13 ++---
 fs/fuse/control.c | 12 ++--
 fs/hugetlbfs/inode.c  |  2 +-
 fs/nfsd/nfsctl.c  | 22 --
 fs/ocfs2/dlmfs/dlmfs.c|  2 +-
 fs/pstore/inode.c |  7 +++
 fs/ramfs/inode.c  |  6 +++---
 fs/super.c| 14 +++---
 fs/tracefs/inode.c|  2 +-
 include/linux/fs.h|  2 +-
 include/linux/ramfs.h |  2 +-
 init/do_mounts.c  |  6 +++---
 ipc/mqueue.c  |  2 +-
 kernel/bpf/inode.c|  2 +-
 mm/shmem.c|  5 +++--
 net/sunrpc/rpc_pipe.c | 19 ---
 security/inode.c  |  8 
 security/selinux/selinuxfs.c  | 15 +--
 security/smack/smackfs.c  |  6 +++---
 32 files changed, 126 insertions(+), 123 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 38c5be34c8951f..2610a0731ea242 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -763,11 +763,11 @@ static int spufs_init_fs_context(struct fs_context *fc)
 }
 
 static struct file_system_type spufs_type = {
-   .owner = THIS_MODULE,
-   .name = "spufs",
-   .init_fs_context = spufs_init_fs_context,
-   .parameters = spufs_fs_parameters,
-   .kill_sb = kill_litter_super,
+   .owner  = THIS_MODULE,
+   .name   = "spufs",
+   .init_fs_context= spufs_init_fs_context,
+   .parameters = spufs_fs_parameters,
+   .shutdown_sb= litter_shutdown_sb,
 };
 MODULE_ALIAS_FS("spufs");
 
diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
index 3261fb9cade648..f18e3b844c5d9b 100644
--- a/arch/s390/hypfs/inode.c
+++ b/arch/s390/hypfs/inode.c
@@ -294,9 +294,8 @@ static int hypfs_init_fs_context(struct fs_context *fc)
return 0;
 }
 
-static void hypfs_kill_super(struct super_block *sb)
+static void hypfs_free_sb(struct super_block *sb)
 {
-   kill_litter_super(sb);
kfree(sb->s_fs_info);
 }
 
@@ -417,7 +416,8 @@ static struct file_system_type hypfs_type = {
.name   = "s390_hypfs",
.init_fs_context = hypfs_init_fs_context,
.parameters = hypfs_fs_parameters,
-   .kill_sb= hypfs_kill_super
+   .shutdown_sb= litter_shutdown_sb,
+   .free_sb= hypfs_free_sb,
 };
 
 static const struct super_operations hypfs_s_ops = {
diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c
index 81effec17b3d63..f48196391239c0 100644
--- a/drivers/android/binderfs.c
+++ b/drivers/android/binderfs.c
@@ -771,19 +771,12 @@ static int binderfs_init_fs_context(struct fs_context *fc)
return 0;
 }
 
-static void binderfs_kill_super(struct super_block *sb)
+static void binderfs_free_sb(struct super_block *sb)
 {
struct binderfs_info *info = sb->s_fs_info;
 
-   /*
-* During inode eviction struct binderfs_info is needed.
-* So first wipe the super_block then free struct binderfs_info.
-*/
-   kill_litter_super(sb);
-
if (info && info->ipc_ns)
put_ipc_ns(info->ipc_ns);
-
kfree(info);
 }
 
@@ -791,7 +784,8 @@ static struct file_system_type binder_fs_type = {
.name   = "binder",
.init_fs_context= binderfs_init_fs_context,
.parameters = binderfs_fs_parameters,
-   .kill_sb= binderfs_kill_super,
+   .shutdown_sb= litter_shutdown_sb,
+   .free_sb= binderfs_free_sb,
.fs_flags   = 

[PATCH 19/19] fs: remove ->kill_sb

2023-09-13 Thread Christoph Hellwig
Now that no instances are left, remove ->kill_sb and mark
generic_shutdown_super static.

Signed-off-by: Christoph Hellwig 
---
 Documentation/filesystems/locking.rst |  5 -
 Documentation/filesystems/vfs.rst |  5 -
 fs/super.c| 25 +
 include/linux/fs.h|  2 --
 4 files changed, 9 insertions(+), 28 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index c33e2f03ed1f69..e4ca99c0828d00 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -221,7 +221,6 @@ prototypes::
struct dentry *(*mount) (struct file_system_type *, int,
   const char *, void *);
void (*shutdown_sb) (struct super_block *);
-   void (*kill_sb) (struct super_block *);
void (*free_sb) (struct super_block *);
 
 locking rules:
@@ -231,16 +230,12 @@ ops   may block
 ====
 mount  yes
 shutdown_sbyes
-kill_sbyes
 free_sbyes
 ====
 
 ->mount() returns ERR_PTR or the root dentry; its superblock should be locked
 on return.
 
-->kill_sb() takes a write-locked superblock, does all shutdown work on it,
-unlocks and drops the reference.
-
 address_space_operations
 
 prototypes::
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 1a7c6926c31f34..29513ee1d34ede 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -120,7 +120,6 @@ members are defined:
struct dentry *(*mount) (struct file_system_type *, int,
const char *, void *);
void (*shutdown_sb) (struct super_block *);
-   void (*kill_sb) (struct super_block *);
void (*free_sb) (struct super_block *);
struct module *owner;
struct file_system_type * next;
@@ -164,10 +163,6 @@ members are defined:
Note: dentries and inodes are normally taken care of and do not need
specific handling unless they are pinned by kernel users.
 
-``kill_sb``
-   the method to call when an instance of this filesystem should be
-   shut down
-
 ``free_sb``
Free file system specific resources like sb->s_fs_info that are
still needed while inodes are freed during umount.
diff --git a/fs/super.c b/fs/super.c
index 805ca1dd1e23f2..d9c564e70ffcd5 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -458,6 +458,8 @@ static void kill_super_notify(struct super_block *sb)
super_wake(sb, SB_DEAD);
 }
 
+static void generic_shutdown_super(struct super_block *sb);
+
 /**
  * deactivate_locked_super -   drop an active reference to superblock
  * @s: superblock to deactivate
@@ -480,15 +482,11 @@ void deactivate_locked_super(struct super_block *s)
 
unregister_shrinker(&s->s_shrink);
 
-   if (fs->kill_sb) {
-   fs->kill_sb(s);
-   } else {
-   if (fs->shutdown_sb)
-   fs->shutdown_sb(s);
-   generic_shutdown_super(s);
-   if (fs->free_sb)
-   fs->free_sb(s);
-   }
+   if (fs->shutdown_sb)
+   fs->shutdown_sb(s);
+   generic_shutdown_super(s);
+   if (fs->free_sb)
+   fs->free_sb(s);
 
kill_super_notify(s);
 
@@ -661,16 +659,13 @@ EXPORT_SYMBOL(retire_super);
  * @sb: superblock to kill
  *
  * generic_shutdown_super() does all fs-independent work on superblock
- * shutdown.  Typical ->kill_sb() should pick all fs-specific objects
- * that need destruction out of superblock, call generic_shutdown_super()
- * and release aforementioned objects.  Note: dentries and inodes _are_
- * taken care of and do not need specific handling.
+ * shutdown. 
  *
  * Upon calling this function, the filesystem may no longer alter or
  * rearrange the set of dentries belonging to this super_block, nor may it
  * change the attachments of dentries to inodes.
  */
-void generic_shutdown_super(struct super_block *sb)
+static void generic_shutdown_super(struct super_block *sb)
 {
const struct super_operations *sop = sb->s_op;
 
@@ -743,8 +738,6 @@ void generic_shutdown_super(struct super_block *sb)
}
 }
 
-EXPORT_SYMBOL(generic_shutdown_super);
-
 bool mount_capable(struct fs_context *fc)
 {
if (!(fc->fs_type->fs_flags & FS_USERNS_MOUNT))
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 302be5dfc1a04a..f57d3a27b488f7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2340,7 +2340,6 @@ struct file_system_type {
const struct fs_parameter_spec *parameters;
struct dentry *(*mount) (struct file_system_type *, int,
   const char *, void *);
-   void (*kill_sb) (struct super_block *);
void 

[PATCH 08/19] pstore: shrink the pstore_sb_lock critical section in pstore_kill_sb

2023-09-13 Thread Christoph Hellwig
->kill_sb can't race with creating ->fill_super because pstore is a
_single file system that only ever has a single sb instance, and we wait
for the previous one to go away before creating a new one.  Reduce
the critical section so that it is not held over generic_shutdown_super.

Signed-off-by: Christoph Hellwig 
---
 fs/pstore/inode.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 585360706b335f..fd1d24b47160d0 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -467,10 +467,9 @@ static struct dentry *pstore_mount(struct file_system_type *fs_type,
 
 static void pstore_kill_sb(struct super_block *sb)
 {
-   mutex_lock(&pstore_sb_lock);
-   WARN_ON(pstore_sb && pstore_sb != sb);
-
kill_litter_super(sb);
+
+   mutex_lock(&pstore_sb_lock);
pstore_sb = NULL;
 
mutex_lock(&records_list_lock);
-- 
2.39.2



[PATCH 13/19] fs: convert kill_block_super to block_free_sb

2023-09-13 Thread Christoph Hellwig
Replace kill_block_super with block_free_sb, which is wired up to
the ->free_sb method.  For file systems that wrapped kill_block_super,
->kill_sb is replaced with ->shutdown_sb and ->free_sb methods as needed.

Signed-off-by: Christoph Hellwig 
---
 fs/adfs/super.c  |  2 +-
 fs/affs/super.c  |  7 ---
 fs/befs/linuxvfs.c   |  2 +-
 fs/bfs/inode.c   |  2 +-
 fs/efs/super.c   |  7 ---
 fs/erofs/super.c | 25 ++---
 fs/exfat/super.c |  6 +++---
 fs/ext2/super.c  |  2 +-
 fs/ext4/super.c  | 12 ++--
 fs/f2fs/super.c  |  6 +++---
 fs/fat/namei_msdos.c |  2 +-
 fs/fat/namei_vfat.c  |  2 +-
 fs/freevxfs/vxfs_super.c |  2 +-
 fs/fuse/inode.c  | 12 ++--
 fs/gfs2/ops_fstype.c | 11 ---
 fs/hfs/super.c   |  2 +-
 fs/hfsplus/super.c   |  2 +-
 fs/hpfs/super.c  |  2 +-
 fs/isofs/inode.c |  2 +-
 fs/jfs/super.c   |  2 +-
 fs/minix/inode.c |  2 +-
 fs/nilfs2/super.c|  2 +-
 fs/ntfs/super.c  |  2 +-
 fs/ntfs3/super.c |  6 +++---
 fs/ocfs2/super.c |  2 +-
 fs/omfs/inode.c  |  2 +-
 fs/qnx4/inode.c  |  7 ---
 fs/qnx6/inode.c  |  2 +-
 fs/reiserfs/super.c  |  7 +++
 fs/squashfs/super.c  |  2 +-
 fs/super.c   |  6 ++
 fs/sysv/super.c  |  4 ++--
 fs/udf/super.c   |  2 +-
 fs/ufs/super.c   |  2 +-
 fs/xfs/xfs_buf.c |  2 +-
 fs/xfs/xfs_super.c   |  6 +++---
 fs/zonefs/super.c| 13 ++---
 include/linux/fs.h   |  2 +-
 38 files changed, 86 insertions(+), 95 deletions(-)

diff --git a/fs/adfs/super.c b/fs/adfs/super.c
index e8bfc38239cd59..22f0137f485e5f 100644
--- a/fs/adfs/super.c
+++ b/fs/adfs/super.c
@@ -463,7 +463,7 @@ static struct file_system_type adfs_fs_type = {
.owner  = THIS_MODULE,
.name   = "adfs",
.mount  = adfs_mount,
-   .kill_sb= kill_block_super,
+   .free_sb= block_free_sb,
.fs_flags   = FS_REQUIRES_DEV,
 };
 MODULE_ALIAS_FS("adfs");
diff --git a/fs/affs/super.c b/fs/affs/super.c
index 58b391446ae1fd..775e878797f9fc 100644
--- a/fs/affs/super.c
+++ b/fs/affs/super.c
@@ -631,10 +631,11 @@ static struct dentry *affs_mount(struct file_system_type *fs_type,
return mount_bdev(fs_type, flags, dev_name, data, affs_fill_super);
 }
 
-static void affs_kill_sb(struct super_block *sb)
+static void affs_free_sb(struct super_block *sb)
 {
struct affs_sb_info *sbi = AFFS_SB(sb);
-   kill_block_super(sb);
+
+   block_free_sb(sb);
if (sbi) {
affs_free_bitmap(sb);
affs_brelse(sbi->s_root_bh);
@@ -648,7 +649,7 @@ static struct file_system_type affs_fs_type = {
.owner  = THIS_MODULE,
.name   = "affs",
.mount  = affs_mount,
-   .kill_sb= affs_kill_sb,
+   .free_sb= affs_free_sb,
.fs_flags   = FS_REQUIRES_DEV,
 };
 MODULE_ALIAS_FS("affs");
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index 9a16a51fbb88d4..7682c027d44782 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -981,7 +981,7 @@ static struct file_system_type befs_fs_type = {
.owner  = THIS_MODULE,
.name   = "befs",
.mount  = befs_mount,
-   .kill_sb= kill_block_super,
+   .free_sb= block_free_sb,
.fs_flags   = FS_REQUIRES_DEV,
 };
 MODULE_ALIAS_FS("befs");
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
index e6a76ae9eb4442..4d894d5dd07074 100644
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -458,7 +458,7 @@ static struct file_system_type bfs_fs_type = {
.owner  = THIS_MODULE,
.name   = "bfs",
.mount  = bfs_mount,
-   .kill_sb= kill_block_super,
+   .free_sb= block_free_sb,
.fs_flags   = FS_REQUIRES_DEV,
 };
 MODULE_ALIAS_FS("bfs");
diff --git a/fs/efs/super.c b/fs/efs/super.c
index b287f47c165ba8..1f808a455e7e87 100644
--- a/fs/efs/super.c
+++ b/fs/efs/super.c
@@ -28,10 +28,11 @@ static struct dentry *efs_mount(struct file_system_type 
*fs_type,
return mount_bdev(fs_type, flags, dev_name, data, efs_fill_super);
 }
 
-static void efs_kill_sb(struct super_block *s)
+static void efs_free_sb(struct super_block *s)
 {
struct efs_sb_info *sbi = SUPER_INFO(s);
-   kill_block_super(s);
+
+   block_free_sb(s);
kfree(sbi);
 }
 
@@ -39,7 +40,7 @@ static struct file_system_type efs_fs_type = {
.owner  = THIS_MODULE,
.name   = "efs",
.mount  = efs_mount,
-   .kill_sb= efs_kill_sb,
+   .free_sb= efs_free_sb,
.fs_flags   = FS_REQUIRES_DEV,
 };
 MODULE_ALIAS_FS("efs");
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 

[PATCH 07/19] hypfs: use d_genocide to kill fs entries

2023-09-13 Thread Christoph Hellwig
hypfs is entirely synthetic and doesn't care about i_nlink when dropping
entries from the cache.  Switch to d_genocide instead of a home-grown
file removal loop for unmount and write (yes, really!).

Signed-off-by: Christoph Hellwig 
---
 arch/s390/hypfs/inode.c | 37 ++---
 1 file changed, 2 insertions(+), 35 deletions(-)

diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
index dbe8a7dcafa922..3261fb9cade648 100644
--- a/arch/s390/hypfs/inode.c
+++ b/arch/s390/hypfs/inode.c
@@ -64,33 +64,6 @@ static void hypfs_add_dentry(struct dentry *dentry)
hypfs_last_dentry = dentry;
 }
 
-static void hypfs_remove(struct dentry *dentry)
-{
-   struct dentry *parent;
-
-   parent = dentry->d_parent;
-   inode_lock(d_inode(parent));
-   if (simple_positive(dentry)) {
-   if (d_is_dir(dentry))
-   simple_rmdir(d_inode(parent), dentry);
-   else
-   simple_unlink(d_inode(parent), dentry);
-   }
-   d_drop(dentry);
-   dput(dentry);
-   inode_unlock(d_inode(parent));
-}
-
-static void hypfs_delete_tree(struct dentry *root)
-{
-   while (hypfs_last_dentry) {
-   struct dentry *next_dentry;
-   next_dentry = hypfs_last_dentry->d_fsdata;
-   hypfs_remove(hypfs_last_dentry);
-   hypfs_last_dentry = next_dentry;
-   }
-}
-
 static struct inode *hypfs_make_inode(struct super_block *sb, umode_t mode)
 {
struct inode *ret = new_inode(sb);
@@ -183,14 +156,14 @@ static ssize_t hypfs_write_iter(struct kiocb *iocb, 
struct iov_iter *from)
rc = -EBUSY;
goto out;
}
-   hypfs_delete_tree(sb->s_root);
+   d_genocide(sb->s_root);
if (MACHINE_IS_VM)
rc = hypfs_vm_create_files(sb->s_root);
else
rc = hypfs_diag_create_files(sb->s_root);
if (rc) {
pr_err("Updating the hypfs tree failed\n");
-   hypfs_delete_tree(sb->s_root);
+   d_genocide(sb->s_root);
goto out;
}
hypfs_update_update(sb);
@@ -323,12 +296,6 @@ static int hypfs_init_fs_context(struct fs_context *fc)
 
 static void hypfs_kill_super(struct super_block *sb)
 {
-   struct hypfs_sb_info *sb_info = sb->s_fs_info;
-
-   if (sb->s_root)
-   hypfs_delete_tree(sb->s_root);
-   if (sb_info && sb_info->update_file)
-   hypfs_remove(sb_info->update_file);
kill_litter_super(sb);
kfree(sb->s_fs_info);
 }
-- 
2.39.2



[PATCH 04/19] NFS: remove the s_dev field from struct nfs_server

2023-09-13 Thread Christoph Hellwig
Don't duplicate the dev_t in the nfs_server structure given that it can
be trivially retrieved from the super_block.

Signed-off-by: Christoph Hellwig 
---
 fs/nfs/client.c   |  2 +-
 fs/nfs/nfs4proc.c |  8 
 fs/nfs/nfs4trace.h|  6 +++---
 fs/nfs/nfs4xdr.c  |  2 +-
 fs/nfs/super.c| 10 +++---
 include/linux/nfs_fs_sb.h |  1 -
 6 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 44eca51b28085d..039fd67ac17c82 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -1337,7 +1337,7 @@ static int nfs_volume_list_show(struct seq_file *m, void 
*v)
clp = server->nfs_client;
 
snprintf(dev, sizeof(dev), "%u:%u",
-MAJOR(server->s_dev), MINOR(server->s_dev));
+MAJOR(server->super->s_dev), MINOR(server->super->s_dev));
 
snprintf(fsid, sizeof(fsid), "%llx:%llx",
 (unsigned long long) server->fsid.major,
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 794343790ea8bb..4d002cc514983c 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -6763,7 +6763,7 @@ static int _nfs4_proc_getlk(struct nfs4_state *state, int 
cmd, struct file_lock
goto out;
lsp = request->fl_u.nfs4_fl.owner;
arg.lock_owner.id = lsp->ls_seqid.owner_id;
-   arg.lock_owner.s_dev = server->s_dev;
+   arg.lock_owner.s_dev = server->super->s_dev;
status = nfs4_call_sync(server->client, server, &msg, &arg.seq_args, &res.seq_res, 1);
switch (status) {
case 0:
@@ -7088,7 +7088,7 @@ static struct nfs4_lockdata *nfs4_alloc_lockdata(struct 
file_lock *fl,
goto out_free_seqid;
p->arg.lock_owner.clientid = server->nfs_client->cl_clientid;
p->arg.lock_owner.id = lsp->ls_seqid.owner_id;
-   p->arg.lock_owner.s_dev = server->s_dev;
+   p->arg.lock_owner.s_dev = server->super->s_dev;
p->res.lock_seqid = p->arg.lock_seqid;
p->lsp = lsp;
p->server = server;
@@ -7475,7 +7475,7 @@ nfs4_retry_setlk(struct nfs4_state *state, int cmd, 
struct file_lock *request)
.inode = state->inode,
.owner = { .clientid = clp->cl_clientid,
   .id = lsp->ls_seqid.owner_id,
-  .s_dev = server->s_dev },
+  .s_dev = server->super->s_dev },
};
int status;
 
@@ -7689,7 +7689,7 @@ nfs4_release_lockowner(struct nfs_server *server, struct 
nfs4_lock_state *lsp)
data->server = server;
data->args.lock_owner.clientid = server->nfs_client->cl_clientid;
data->args.lock_owner.id = lsp->ls_seqid.owner_id;
-   data->args.lock_owner.s_dev = server->s_dev;
+   data->args.lock_owner.s_dev = server->super->s_dev;
 
msg.rpc_argp = &data->args;
msg.rpc_resp = &data->res;
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index d27919d7241d38..13a602c675ddb2 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -945,7 +945,7 @@ TRACE_EVENT(nfs4_delegreturn_exit,
),
 
TP_fast_assign(
-   __entry->dev = res->server->s_dev;
+   __entry->dev = res->server->super->s_dev;
__entry->fhandle = nfs_fhandle_hash(args->fhandle);
__entry->error = error < 0 ? -error : 0;
__entry->stateid_seq =
@@ -1269,7 +1269,7 @@ DECLARE_EVENT_CLASS(nfs4_getattr_event,
),
 
TP_fast_assign(
-   __entry->dev = server->s_dev;
+   __entry->dev = server->super->s_dev;
__entry->valid = fattr->valid;
__entry->fhandle = nfs_fhandle_hash(fhandle);
__entry->fileid = (fattr->valid & 
NFS_ATTR_FATTR_FILEID) ? fattr->fileid : 0;
@@ -1966,7 +1966,7 @@ DECLARE_EVENT_CLASS(nfs4_deviceid_status,
),
 
TP_fast_assign(
-   __entry->dev = server->s_dev;
+   __entry->dev = server->super->s_dev;
__entry->status = status;
__assign_str(dstaddr, server->nfs_client->cl_hostname);
memcpy(__entry->deviceid, deviceid->data,
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index deec76cf5afeaf..9767c5e2ed1a9a 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1420,7 +1420,7 @@ static inline void encode_openhdr(struct xdr_stream *xdr, 
const struct nfs_opena
p = xdr_encode_hyper(p, arg->clientid);
*p++ = cpu_to_be32(24);
p = xdr_encode_opaque_fixed(p, "open id:", 8);
-   *p++ = cpu_to_be32(arg->server->s_dev);
+   *p++ = cpu_to_be32(arg->server->super->s_dev);
*p++ = cpu_to_be32(arg->id.uniquifier);
xdr_encode_hyper(p, arg->id.create_time);
 }
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 29d6a55b9d400d..561221a87b02a6 100644

[PATCH 01/19] fs: reflow deactivate_locked_super

2023-09-13 Thread Christoph Hellwig
Return early for the case where the super block isn't cleaned up to
reduce the level of indentation.

Signed-off-by: Christoph Hellwig 
---
 fs/super.c | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 2d762ce67f6e6c..127a17d958a482 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -476,27 +476,28 @@ static void kill_super_notify(struct super_block *sb)
 void deactivate_locked_super(struct super_block *s)
 {
struct file_system_type *fs = s->s_type;
-   if (atomic_dec_and_test(&s->s_active)) {
-   unregister_shrinker(&s->s_shrink);
-   fs->kill_sb(s);
 
-   kill_super_notify(s);
-
-   /*
-* Since list_lru_destroy() may sleep, we cannot call it from
-* put_super(), where we hold the sb_lock. Therefore we destroy
-* the lru lists right now.
-*/
-   list_lru_destroy(&s->s_dentry_lru);
-   list_lru_destroy(&s->s_inode_lru);
-
-   put_filesystem(fs);
-   put_super(s);
-   } else {
+   if (!atomic_dec_and_test(&s->s_active)) {
super_unlock_excl(s);
+   return;
}
-}
 
+   unregister_shrinker(&s->s_shrink);
+   fs->kill_sb(s);
+
+   kill_super_notify(s);
+
+   /*
+* Since list_lru_destroy() may sleep, we cannot call it from
+* put_super(), where we hold the sb_lock. Therefore we destroy
+* the lru lists right now.
+*/
+   list_lru_destroy(&s->s_dentry_lru);
+   list_lru_destroy(&s->s_inode_lru);
+
+   put_filesystem(fs);
+   put_super(s);
+}
 EXPORT_SYMBOL(deactivate_locked_super);
 
 /**
-- 
2.39.2



[PATCH 18/19] fs: simple ->shutdown_sb and ->free_sb conversions

2023-09-13 Thread Christoph Hellwig
Convert all file systems whose ->kill_sb just called
generic_shutdown_super, without keeping any state across the call, to
->shutdown_sb and ->free_sb as needed.

Signed-off-by: Christoph Hellwig 
---
 fs/9p/vfs_super.c | 14 +++---
 fs/afs/super.c| 15 +++
 fs/autofs/autofs_i.h  |  3 ++-
 fs/autofs/init.c  |  3 ++-
 fs/autofs/inode.c | 24 +---
 fs/btrfs/super.c  | 11 ---
 fs/ceph/super.c   | 13 +
 fs/cramfs/inode.c |  6 ++
 fs/ecryptfs/main.c| 11 ++-
 fs/fuse/inode.c   |  7 +++
 fs/fuse/virtio_fs.c   | 19 +++
 fs/hostfs/hostfs_kern.c   |  5 ++---
 fs/nfs/fs_context.c   | 18 ++
 fs/orangefs/orangefs-kernel.h |  2 +-
 fs/orangefs/orangefs-mod.c|  2 +-
 fs/orangefs/super.c   | 11 ---
 fs/proc/root.c| 16 +---
 fs/romfs/super.c  |  6 ++
 fs/smb/client/cifsfs.c| 14 +-
 fs/ubifs/super.c  |  9 +++--
 20 files changed, 103 insertions(+), 106 deletions(-)

diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
index e8b3641c98f886..a238065dd8b361 100644
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -205,25 +205,17 @@ static struct dentry *v9fs_mount(struct file_system_type 
*fs_type, int flags,
return ERR_PTR(retval);
 }
 
-/**
- * v9fs_kill_super - Kill Superblock
- * @s: superblock
- *
- */
-
-static void v9fs_kill_super(struct super_block *s)
+static void v9fs_free_sb(struct super_block *s)
 {
struct v9fs_session_info *v9ses = s->s_fs_info;
 
p9_debug(P9_DEBUG_VFS, " %p\n", s);
 
-   generic_shutdown_super(s);
-
v9fs_session_cancel(v9ses);
v9fs_session_close(v9ses);
kfree(v9ses);
s->s_fs_info = NULL;
-   p9_debug(P9_DEBUG_VFS, "exiting kill_super\n");
+   p9_debug(P9_DEBUG_VFS, "exiting free_sb\n");
 }
 
 static void
@@ -340,7 +332,7 @@ static const struct super_operations v9fs_super_ops_dotl = {
 struct file_system_type v9fs_fs_type = {
.name = "9p",
.mount = v9fs_mount,
-   .kill_sb = v9fs_kill_super,
+   .free_sb = v9fs_free_sb,
.owner = THIS_MODULE,
.fs_flags = FS_RENAME_DOES_D_MOVE,
 };
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 84b135ad3496b1..bd85554056415d 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -30,7 +30,8 @@
 #include "internal.h"
 
 static void afs_i_init_once(void *foo);
-static void afs_kill_super(struct super_block *sb);
+static void afs_shutdown_sb(struct super_block *sb);
+static void afs_free_sb(struct super_block *sb);
 static struct inode *afs_alloc_inode(struct super_block *sb);
 static void afs_destroy_inode(struct inode *inode);
 static void afs_free_inode(struct inode *inode);
@@ -45,7 +46,8 @@ struct file_system_type afs_fs_type = {
.name   = "afs",
.init_fs_context= afs_init_fs_context,
.parameters = afs_fs_parameters,
-   .kill_sb= afs_kill_super,
+   .shutdown_sb= afs_shutdown_sb,
+   .free_sb= afs_free_sb,
.fs_flags   = FS_RENAME_DOES_D_MOVE,
 };
 MODULE_ALIAS_FS("afs");
@@ -527,7 +529,7 @@ static void afs_destroy_sbi(struct afs_super_info *as)
}
 }
 
-static void afs_kill_super(struct super_block *sb)
+static void afs_shutdown_sb(struct super_block *sb)
 {
struct afs_super_info *as = AFS_FS_S(sb);
 
@@ -539,7 +541,12 @@ static void afs_kill_super(struct super_block *sb)
 */
if (as->volume)
rcu_assign_pointer(as->volume->sb, NULL);
-   generic_shutdown_super(sb);
+}
+
+static void afs_free_sb(struct super_block *sb)
+{
+   struct afs_super_info *as = AFS_FS_S(sb);
+
if (as->volume)
afs_deactivate_volume(as->volume);
afs_destroy_sbi(as);
diff --git a/fs/autofs/autofs_i.h b/fs/autofs/autofs_i.h
index d5a44fa88acf9a..f60f425c08b55c 100644
--- a/fs/autofs/autofs_i.h
+++ b/fs/autofs/autofs_i.h
@@ -276,4 +276,5 @@ static inline void autofs_del_expiring(struct dentry 
*dentry)
}
 }
 
-void autofs_kill_sb(struct super_block *);
+void autofs_shutdown_sb(struct super_block *sb);
+void autofs_free_sb(struct super_block *sb);
diff --git a/fs/autofs/init.c b/fs/autofs/init.c
index d3f55e87433890..1f7bed5391f822 100644
--- a/fs/autofs/init.c
+++ b/fs/autofs/init.c
@@ -17,7 +17,8 @@ struct file_system_type autofs_fs_type = {
.owner  = THIS_MODULE,
.name   = "autofs",
.mount  = autofs_mount,
-   .kill_sb= autofs_kill_sb,
+   .shutdown_sb= autofs_shutdown_sb,
+   .free_sb= autofs_free_sb,
 };
 MODULE_ALIAS_FS("autofs");
 MODULE_ALIAS("autofs");
diff --git a/fs/autofs/inode.c b/fs/autofs/inode.c
index 

split up ->kill_sb

2023-09-13 Thread Christoph Hellwig
Hi Al and Christian,

this series splits ->kill_sb into separate ->shutdown_sb and ->free_sb
methods and then calls generic_shutdown_super from common code to clean
up the file system shutdown interface.

As a first step towards that it moves allocating and freeing the
anonymous block device dev_t into common code. As every super_block must
have a valid s_dev it makes sense to just do that if the file system
didn't set one by itself, and we can also detect if one was assigned
easily when shutting down.
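
To make the intended split easier to picture, here is a rough userspace
model of the call order. It is only a sketch under assumptions: the
model_* types and functions are made up for illustration, the trace
buffer exists purely so the order can be inspected, and the point at
which the anonymous dev_t is released is an assumption rather than
something this text pins down. It does not claim to match fs/super.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
struct model_sb;

struct model_fs_type {
	void (*shutdown_sb)(struct model_sb *sb);	/* optional, runs first */
	void (*free_sb)(struct model_sb *sb);		/* optional, runs after eviction */
};

struct model_sb {
	const struct model_fs_type *type;
	int anon_dev;		/* nonzero if common code assigned an anon dev_t */
	char trace[64];		/* records the order of the steps */
};

/* Append a step name to the trace, never overflowing the buffer. */
static void model_step(struct model_sb *sb, const char *name)
{
	strncat(sb->trace, name, sizeof(sb->trace) - strlen(sb->trace) - 1);
}

/* Stand-in for generic_shutdown_super(): dentry/inode eviction etc. */
static void model_generic_shutdown(struct model_sb *sb)
{
	model_step(sb, "generic;");
}

/* Stand-in for the common teardown once the last active ref is dropped. */
static void model_deactivate(struct model_sb *sb)
{
	assert(sb && sb->type);

	if (sb->type->shutdown_sb)
		sb->type->shutdown_sb(sb);
	model_generic_shutdown(sb);
	if (sb->type->free_sb)
		sb->type->free_sb(sb);
	/* Assumed placement: release the anon dev_t only if one was assigned. */
	if (sb->anon_dev) {
		model_step(sb, "put_anon;");
		sb->anon_dev = 0;
	}
}

static void model_afs_shutdown(struct model_sb *sb)
{
	model_step(sb, "shutdown;");
}

static void model_afs_free(struct model_sb *sb)
{
	model_step(sb, "free;");
}

/* A file system that needs both hooks, and one that needs neither. */
static const struct model_fs_type model_afs = {
	.shutdown_sb	= model_afs_shutdown,
	.free_sb	= model_afs_free,
};
static const struct model_fs_type model_plain = { NULL, NULL };
```

With a zero-initialised model_sb pointing at model_afs and anon_dev set,
model_deactivate() leaves "shutdown;generic;free;put_anon;" in the trace;
with model_plain and no anon dev_t only "generic;" is recorded.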

A git tree is available here:

git://git.infradead.org/users/hch/misc.git fs-kill_sb

Gitweb:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/fs-kill_sb

Diffstat:
 Documentation/filesystems/locking.rst |9 -
 Documentation/filesystems/vfs.rst |   15 ++
 arch/powerpc/platforms/cell/spufs/inode.c |   10 -
 arch/s390/hypfs/inode.c   |   43 
 arch/x86/kernel/cpu/resctrl/rdtgroup.c|7 -
 block/bdev.c  |1 
 drivers/android/binderfs.c|   12 --
 drivers/base/devtmpfs.c   |8 -
 drivers/dax/super.c   |1 
 drivers/dma-buf/dma-buf.c |1 
 drivers/gpu/drm/drm_drv.c |1 
 drivers/infiniband/hw/qib/qib.h   |4 
 drivers/infiniband/hw/qib/qib_fs.c|  105 ---
 drivers/infiniband/hw/qib/qib_init.c  |   32 ++
 drivers/misc/cxl/api.c|1 
 drivers/misc/ibmasm/ibmasmfs.c|8 -
 drivers/mtd/mtdsuper.c|   12 --
 drivers/scsi/cxlflash/ocxl_hw.c   |1 
 drivers/usb/gadget/function/f_fs.c|6 -
 drivers/usb/gadget/legacy/inode.c |   18 +--
 drivers/xen/xenfs/super.c |8 -
 fs/9p/vfs_super.c |   16 ---
 fs/adfs/super.c   |2 
 fs/affs/super.c   |7 -
 fs/afs/super.c|   27 ++---
 fs/aio.c  |1 
 fs/anon_inodes.c  |1 
 fs/autofs/autofs_i.h  |3 
 fs/autofs/init.c  |3 
 fs/autofs/inode.c |   24 ++--
 fs/befs/linuxvfs.c|2 
 fs/bfs/inode.c|2 
 fs/binfmt_misc.c  |8 -
 fs/btrfs/super.c  |   16 +--
 fs/btrfs/tests/btrfs-tests.c  |1 
 fs/ceph/super.c   |   20 +--
 fs/coda/inode.c   |1 
 fs/configfs/mount.c   |8 -
 fs/cramfs/inode.c |6 -
 fs/debugfs/inode.c|8 -
 fs/devpts/inode.c |6 -
 fs/ecryptfs/main.c|   14 --
 fs/efivarfs/super.c   |   13 +-
 fs/efs/super.c|7 -
 fs/erofs/super.c  |   25 +---
 fs/exfat/super.c  |6 -
 fs/ext2/super.c   |2 
 fs/ext4/super.c   |   12 +-
 fs/f2fs/super.c   |6 -
 fs/fat/namei_msdos.c  |2 
 fs/fat/namei_vfat.c   |2 
 fs/freevxfs/vxfs_super.c  |2 
 fs/fuse/control.c |   12 +-
 fs/fuse/inode.c   |   19 +--
 fs/fuse/virtio_fs.c   |   21 ++-
 fs/gfs2/ops_fstype.c  |   11 --
 fs/hfs/super.c|2 
 fs/hfsplus/super.c|2 
 fs/hostfs/hostfs_kern.c   |5 
 fs/hpfs/super.c   |2 
 fs/hugetlbfs/inode.c  |2 
 fs/isofs/inode.c  |2 
 fs/jffs2/super.c  |   22 ++--
 fs/jfs/super.c|2 
 fs/kernfs/mount.c |   20 +--
 fs/minix/inode.c  |2 
 fs/nfs/client.c   |2 
 fs/nfs/fs_context.c   |   19 +++
 fs/nfs/internal.h |1 
 fs/nfs/nfs4proc.c |8 -
 fs/nfs/nfs4trace.h|6 -
 fs/nfs/nfs4xdr.c  |2 
 fs/nfs/super.c|   26 
 fs/nfs/sysfs.h|2 
 fs/nfsd/nfsctl.c  |   22 ++--
 fs/nilfs2/super.c |2 
 fs/nsfs.c |1 
 fs/ntfs/super.c   |2 
 fs/ntfs3/super.c  |6 -
 fs/ocfs2/dlmfs/dlmfs.c|2 
 fs/ocfs2/super.c  |2 
 fs/omfs/inode.c   |2 
 

Re: [PATCH v2 01/14] arm64: dts: qcom: msm8916: Drop RPM bus clocks

2023-09-13 Thread Konrad Dybcio
On 13.09.2023 10:53, Krzysztof Kozlowski wrote:
> On 13/09/2023 10:47, Konrad Dybcio wrote:
>> On 13.09.2023 09:07, Krzysztof Kozlowski wrote:
>>> On 12/09/2023 15:31, Konrad Dybcio wrote:
>>>> These clocks are now handled from within the icc framework and are
>>>
>>> That's a driver behavior, not hardware.
>> I believe we've been over this already..
>>
>> The rationale behind this change is: that hardware, which falls
>> under the "interconnect" class, was previously misrepresented as
>> a bunch of clocks. There are clocks underneath, but accessing them
>> directly would be equivalent to e.g. circumventing the PHY subsystem
>> and initializing your UFS PHY from within the UFS device.
> 
> And every time one write such commit msg, how should we remember there
> is some exception and actually it is about clock representation not CCF
> or ICC framework.
So is your reply essentially "fine, but please make it clear in
each commit message"?

Konrad


Re: [PATCH v2 13/14] arm64: dts: qcom: msm8996: Remove PNoC clock from MSS

2023-09-13 Thread Yassine Oudjana
On 12/09/2023 4:31 pm, Konrad Dybcio wrote:
> The PNoC clock is a clock for the entire PNoC bus, managed from
> within the interconnect driver. Attaching it to MSS was a total hack.
> Get rid of it and take the liberty to make the clock-names entries
> more readable.
>
> Signed-off-by: Konrad Dybcio 
> ---
>   arch/arm64/boot/dts/qcom/msm8996.dtsi | 11 ---
>   1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> index 7061a8e12c81..51db5ddcef88 100644
> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> @@ -2479,10 +2479,15 @@ mss_pil: remoteproc@208 {
>< GCC_MSS_GPLL0_DIV_CLK>,
>< GCC_MSS_SNOC_AXI_CLK>,
>< GCC_MSS_MNOC_BIMC_AXI_CLK>,
> -  < RPM_SMD_PCNOC_CLK>,
>< RPM_SMD_QDSS_CLK>;
> - clock-names = "iface", "bus", "mem", "xo", "gpll0_mss",
> -   "snoc_axi", "mnoc_axi", "pnoc", "qdss";
> + clock-names = "iface",
> +   "bus",
> +   "mem",
> +   "xo",
> +   "gpll0_mss",
> +   "snoc_axi",
> +   "mnoc_axi",
> +   "qdss";
>   
>   resets = < GCC_MSS_RESTART>;
>   reset-names = "mss_restart";

Tested-by: Yassine Oudjana 



Re: [PATCH v2 12/14] arm64: dts: qcom: msm8996: Remove AGGRE2 clock from SLPI

2023-09-13 Thread Yassine Oudjana
On 12/09/2023 4:31 pm, Konrad Dybcio wrote:
> The AGGRE2 clock is a clock for the entire AGGRE2 bus, managed from
> within the interconnect driver. Attaching it to SLPI was a total hack.
> Get rid of it.
>
> Signed-off-by: Konrad Dybcio 
> ---
>   arch/arm64/boot/dts/qcom/msm8996.dtsi | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> index acef67ab0581..7061a8e12c81 100644
> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> @@ -2433,9 +2433,8 @@ slpi_pil: remoteproc@1c0 {
> "handover",
> "stop-ack";
>   
> - clocks = <_board>,
> -  < RPM_SMD_AGGR2_NOC_CLK>;
> - clock-names = "xo", "aggre2";
> + clocks = <_board>;
> + clock-names = "xo";
>   
>   memory-region = <_mem>;
>   

Tested-by: Yassine Oudjana 



Re: [PATCH v2 02/14] arm64: dts: qcom: msm8996: Drop RPM bus clocks

2023-09-13 Thread Yassine Oudjana
On 12/09/2023 4:31 pm, Konrad Dybcio wrote:
> These clocks are now handled from within the icc framework and are
> no longer registered from within the CCF. Remove them.
>
> Signed-off-by: Konrad Dybcio 
> ---
>   arch/arm64/boot/dts/qcom/msm8996.dtsi | 27 ---
>   1 file changed, 4 insertions(+), 23 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> index c8e0986425ab..acef67ab0581 100644
> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> @@ -798,9 +798,6 @@ bimc: interconnect@408000 {
>   compatible = "qcom,msm8996-bimc";
>   reg = <0x00408000 0x5a000>;
>   #interconnect-cells = <1>;
> - clock-names = "bus", "bus_a";
> - clocks = < RPM_SMD_BIMC_CLK>,
> -  < RPM_SMD_BIMC_A_CLK>;
>   };
>   
>   tsens0: thermal-sensor@4a9000 {
> @@ -851,18 +848,12 @@ cnoc: interconnect@50 {
>   compatible = "qcom,msm8996-cnoc";
>   reg = <0x0050 0x1000>;
>   #interconnect-cells = <1>;
> - clock-names = "bus", "bus_a";
> - clocks = < RPM_SMD_CNOC_CLK>,
> -  < RPM_SMD_CNOC_A_CLK>;
>   };
>   
>   snoc: interconnect@524000 {
>   compatible = "qcom,msm8996-snoc";
>   reg = <0x00524000 0x1c000>;
>   #interconnect-cells = <1>;
> - clock-names = "bus", "bus_a";
> - clocks = < RPM_SMD_SNOC_CLK>,
> -  < RPM_SMD_SNOC_A_CLK>;
>   };
>   
>   a0noc: interconnect@543000 {
> @@ -882,19 +873,14 @@ a1noc: interconnect@562000 {
>   compatible = "qcom,msm8996-a1noc";
>   reg = <0x00562000 0x5000>;
>   #interconnect-cells = <1>;
> - clock-names = "bus", "bus_a";
> - clocks = < RPM_SMD_AGGR1_NOC_CLK>,
> -  < RPM_SMD_AGGR1_NOC_A_CLK>;
>   };
>   
>   a2noc: interconnect@583000 {
>   compatible = "qcom,msm8996-a2noc";
>   reg = <0x00583000 0x7000>;
>   #interconnect-cells = <1>;
> - clock-names = "bus", "bus_a", "aggre2_ufs_axi", 
> "ufs_axi";
> - clocks = < RPM_SMD_AGGR2_NOC_CLK>,
> -  < RPM_SMD_AGGR2_NOC_A_CLK>,
> -  < GCC_AGGRE2_UFS_AXI_CLK>,
> + clock-names = "aggre2_ufs_axi", "ufs_axi";
> + clocks = < GCC_AGGRE2_UFS_AXI_CLK>,
>< GCC_UFS_AXI_CLK>;
>   };
>   
> @@ -902,19 +888,14 @@ mnoc: interconnect@5a4000 {
>   compatible = "qcom,msm8996-mnoc";
>   reg = <0x005a4000 0x1c000>;
>   #interconnect-cells = <1>;
> - clock-names = "bus", "bus_a", "iface";
> - clocks = < RPM_SMD_MMAXI_CLK>,
> -  < RPM_SMD_MMAXI_A_CLK>,
> -  < AHB_CLK_SRC>;
> + clock-names = "iface";
> + clocks = < AHB_CLK_SRC>;
>   };
>   
>   pnoc: interconnect@5c {
>   compatible = "qcom,msm8996-pnoc";
>   reg = <0x005c 0x3000>;
>   #interconnect-cells = <1>;
> - clock-names = "bus", "bus_a";
> - clocks = < RPM_SMD_PCNOC_CLK>,
> -  < RPM_SMD_PCNOC_A_CLK>;
>   };
>   
>   tcsr_mutex: hwlock@74 {

Tested-by: Yassine Oudjana 





Re: [PATCH v4 03/18] x86/sgx: Add sgx_epc_lru_lists to encapsulate LRU lists

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Kristen Carlson Accardi 
>
> Introduce a data structure to wrap the existing reclaimable list and its
> spinlock. Each cgroup later will have one instance of this structure to
> track EPC pages allocated for processes associated with the same cgroup.
> Just like the global SGX reclaimer (ksgxd), an EPC cgroup reclaims pages
> from the reclaimable list in this structure when its usage reaches near
> its limit.
>
> Currently, ksgxd does not track the VA, SECS pages. They are considered
> as 'unreclaimable' pages that are only deallocated when their respective
> owning enclaves are destroyed and all associated resources released.
>
> When an EPC cgroup can not reclaim any more reclaimable EPC pages to
> reduce its usage below its limit, the cgroup must also reclaim those
> unreclaimables by killing their owning enclaves. The VA and SECS pages
> later are also tracked in an 'unreclaimable' list added to this structure
> to support this OOM killing of enclaves.
>
> Signed-off-by: Sean Christopherson 
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> Cc: Sean Christopherson 
> ---
> V4:
> - Removed unneeded comments for the spinlock and the non-reclaimables.
> (Kai, Jarkko)
> - Revised the commit to add introduction comments for unreclaimables and
> multiple LRU lists.(Kai)
> - Reordered the patches: delay all changes for unreclaimables to
> later, and this one becomes the first change in the SGX subsystem.
>
> V3:
> - Removed the helper functions and revised commit messages.
> ---
>  arch/x86/kernel/cpu/sgx/sgx.h | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> index d2dad21259a8..018414b2abe8 100644
> --- a/arch/x86/kernel/cpu/sgx/sgx.h
> +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> @@ -83,6 +83,20 @@ static inline void *sgx_get_epc_virt_addr(struct 
> sgx_epc_page *page)
>   return section->virt_addr + index * PAGE_SIZE;
>  }
>  
> +/*
> + * Tracks EPC pages reclaimable by the reclaimer (ksgxd).
> + */
> +struct sgx_epc_lru_lists {
> + spinlock_t lock;
> + struct list_head reclaimable;
> +};
> +
> +static inline void sgx_lru_init(struct sgx_epc_lru_lists *lrus)
> +{
> + spin_lock_init(&lrus->lock);
> + INIT_LIST_HEAD(&lrus->reclaimable);
> +}
> +
>  struct sgx_epc_page *__sgx_alloc_epc_page(void);
>  void sgx_free_epc_page(struct sgx_epc_page *page);
>  
> -- 
> 2.25.1
>

Looks good but not yet time for ack'ing.

BR, Jarkko
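
For readers without the SGX code at hand, the "one lock guarding one
list" shape of the quoted structure can be sketched in userspace as
below. This is an analogue under assumptions, not the kernel code: the
toy_* names are invented, a C11 atomic_flag stands in for spinlock_t,
the list is a minimal imitation of list_head, and adding at the tail
while reclaiming from the head is an assumed policy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct toy_list {
	struct toy_list *prev, *next;
};

/* Userspace analogue of sgx_epc_lru_lists: one lock guarding one list. */
struct toy_lru {
	atomic_flag lock;		/* stands in for spinlock_t */
	struct toy_list reclaimable;
};

static void toy_lru_init(struct toy_lru *lru)
{
	atomic_flag_clear(&lru->lock);
	lru->reclaimable.prev = &lru->reclaimable;
	lru->reclaimable.next = &lru->reclaimable;
}

static void toy_lru_lock(struct toy_lru *lru)
{
	while (atomic_flag_test_and_set(&lru->lock))
		;	/* spin until the flag was previously clear */
}

static void toy_lru_unlock(struct toy_lru *lru)
{
	atomic_flag_clear(&lru->lock);
}

/* Add @page at the tail, so the head is always the oldest entry. */
static void toy_lru_add_tail(struct toy_lru *lru, struct toy_list *page)
{
	assert(page != NULL);

	toy_lru_lock(lru);
	page->prev = lru->reclaimable.prev;
	page->next = &lru->reclaimable;
	lru->reclaimable.prev->next = page;
	lru->reclaimable.prev = page;
	toy_lru_unlock(lru);
}

/* Detach and return the oldest entry, or NULL if the list is empty. */
static struct toy_list *toy_lru_pop_head(struct toy_lru *lru)
{
	struct toy_list *page = NULL;

	toy_lru_lock(lru);
	if (lru->reclaimable.next != &lru->reclaimable) {
		page = lru->reclaimable.next;
		page->prev->next = page->next;
		page->next->prev = page->prev;
		page->prev = page->next = page;
	}
	toy_lru_unlock(lru);
	return page;
}
```

Keeping the lock inside the same structure as the list is what lets each
cgroup later own an independent instance without any shared global lock.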


[PATCH v1 1/1] lib/string_helpers: Don't copy a tail in kstrdup_and_replace() if 'new' is \0

2023-09-13 Thread Andy Shevchenko
kstrdup_and_replace() takes two characters, old and new, and replaces
the former with the latter after copying the original string. But when
new is a NUL, there is no point in copying the rest of the string: the
contract with the callers is that the function returns a NUL-terminated
string, not a buffer of a given size filled with data. With this we can
reduce memory consumption by copying only the meaningful part of the
original string and dropping the rest.

Signed-off-by: Andy Shevchenko 
---

The first user of this is pending:
https://lore.kernel.org/platform-driver-x86/20230913092701.440959-1-andriy.shevche...@linux.intel.com/

 lib/string_helpers.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 7713f73e66b0..e385bf3cc2de 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -723,11 +723,17 @@ EXPORT_SYMBOL_GPL(kstrdup_quotable_file);
 
 /*
  * Returns duplicate string in which the @old characters are replaced by @new.
+ *
+ * If @new is NUL, copy the string up to the first occurrence of @old, which
+ * will be replaced by a NUL.
  */
 char *kstrdup_and_replace(const char *src, char old, char new, gfp_t gfp)
 {
char *dst;
 
+   if (new == '\0')
+   return kmemdup_nul(src, strchrnul(src, old) - src, gfp);
+
dst = kstrdup(src, gfp);
if (!dst)
return NULL;
-- 
2.40.0.1.gaa8946217a0b
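
The behaviour described in this patch can be tried out in userspace with
a sketch like the following. It is an analogue under assumptions, not
the kernel function: dup_and_replace() is a made-up name, malloc()
stands in for the kernel allocators (so there is no gfp argument), and
plain strchr() is used where the patch uses strchrnul().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace analogue of kstrdup_and_replace(). Returns a freshly
 * allocated copy of @src with every @old replaced by @repl, or NULL on
 * allocation failure. If @repl is NUL, only the part of @src before the
 * first @old is copied.
 */
static char *dup_and_replace(const char *src, char old, char repl)
{
	size_t len;
	char *dst, *p;

	assert(src != NULL);
	len = strlen(src);

	if (repl == '\0') {
		/* Stop at the first @old; keep the whole string if none. */
		const char *end = strchr(src, old);

		if (end)
			len = (size_t)(end - src);
	}

	dst = malloc(len + 1);
	if (!dst)
		return NULL;
	memcpy(dst, src, len);
	dst[len] = '\0';

	/* Nothing left to replace when the copy was already truncated. */
	if (repl != '\0')
		for (p = dst; *p; p++)
			if (*p == old)
				*p = repl;
	return dst;
}
```

For example, dup_and_replace("foo:bar:baz", ':', '\0') allocates four
bytes and returns "foo", whereas replacing ':' with '/' in "foo:bar"
still copies the full string and returns "foo/bar".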



Re: [PATCH v4 02/18] cgroup/misc: Add SGX EPC resource type and export APIs for SGX driver

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Kristen Carlson Accardi 
>
> The SGX driver will need to get access to the root misc_cg object
> to do iterative walks and also determine if a charge will be
> towards the root cgroup or not.

What is "a charge" and why does the SGX driver need to do iterative walks?
Neither is explained here.
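
For context, in the misc controller a "charge" adds an amount to the
usage counter of a cgroup and of each of its ancestors, and the walk
through the parent pointers is how that happens. A toy userspace model
might look like the sketch below. Everything named toy_* is invented for
illustration, there is only one resource type, and the counters are
plain integers rather than atomics, so this is not the code in
kernel/cgroup/misc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct misc_cg with a single resource type. */
struct toy_cg {
	struct toy_cg *parent;	/* NULL for the root */
	unsigned long max;	/* limit for this cgroup */
	unsigned long usage;	/* amount currently charged */
};

static struct toy_cg *toy_cg_parent(struct toy_cg *cg)
{
	return cg ? cg->parent : NULL;
}

/* Undo a charge on every cgroup from @cg up to, but excluding, @stop. */
static void toy_uncharge_until(struct toy_cg *cg, struct toy_cg *stop,
			       unsigned long amount)
{
	struct toy_cg *i;

	for (i = cg; i != stop; i = toy_cg_parent(i)) {
		assert(i->usage >= amount);
		i->usage -= amount;
	}
}

/*
 * Charge @amount to @cg and all of its ancestors. Returns 0 on success.
 * If any level would exceed its limit, the levels charged so far are
 * rolled back and -1 is returned.
 */
static int toy_try_charge(struct toy_cg *cg, unsigned long amount)
{
	struct toy_cg *i;

	for (i = cg; i; i = toy_cg_parent(i)) {
		if (i->usage + amount > i->max) {
			toy_uncharge_until(cg, i, amount);
			return -1;
		}
		i->usage += amount;
	}
	return 0;
}
```

In this model a charge against a leaf can fail because an intermediate
cgroup is at its limit even though the leaf and the root have room,
which is why the whole chain has to be visited.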

> To manage the SGX EPC memory via the misc controller, the SGX
> driver will also need to be able to iterate over the misc cgroup
> hierarchy.

Ambiguous language: misc_cg vs "misc controller". Are they different
types of objects? If not, then stick to misc_cg everywhere.

> Move parent_misc() into misc_cgroup.h and make inline to make this
> function available to SGX, rename it to misc_cg_parent(), and update
> misc.c to use the new name.

net/rxrpc/misc.c?

The point being that plain "misc.c" is ambiguous.

> Add per resource type private data so that SGX can store additional
> per cgroup data with the misc_cg struct.

Yet another term "misc cg struct", and not just "misc_cg" like in the
first paragraph.

>
> Allow SGX EPC memory to be a valid resource type for the misc
> controller.
>
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> ---
> V4:
> - Moved this to the second in the series.
> ---
>  include/linux/misc_cgroup.h | 29 +
>  kernel/cgroup/misc.c| 25 -
>  2 files changed, 41 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h
> index e1bcd176c2de..6f8330f435ba 100644
> --- a/include/linux/misc_cgroup.h
> +++ b/include/linux/misc_cgroup.h
> @@ -17,6 +17,10 @@ enum misc_res_type {
>   MISC_CG_RES_SEV,
>   /* AMD SEV-ES ASIDs resource */
>   MISC_CG_RES_SEV_ES,
> +#endif
> +#ifdef CONFIG_CGROUP_SGX_EPC
> + /* SGX EPC memory resource */
> + MISC_CG_RES_SGX_EPC,
>  #endif
>   MISC_CG_RES_TYPES
>  };
> @@ -37,6 +41,7 @@ struct misc_res {
>   u64 max;
>   atomic64_t usage;
>   atomic64_t events;
> + void *priv;
>  
>   /* per resource callback ops */
>   int (*misc_cg_alloc)(struct misc_cg *cg);
> @@ -59,6 +64,7 @@ struct misc_cg {
>   struct misc_res res[MISC_CG_RES_TYPES];
>  };
>  
> +struct misc_cg *misc_cg_root(void);
>  u64 misc_cg_res_total_usage(enum misc_res_type type);
>  int misc_cg_set_capacity(enum misc_res_type type, u64 capacity);
>  int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg, u64 
> amount);
> @@ -78,6 +84,20 @@ static inline struct misc_cg *css_misc(struct 
> cgroup_subsys_state *css)
>   return css ? container_of(css, struct misc_cg, css) : NULL;
>  }
>  
> +/**
> + * misc_cg_parent() - Get the parent of the passed misc cgroup.
> + * @cgroup: cgroup whose parent needs to be fetched.
> + *
> + * Context: Any context.
> + * Return:
> + * * struct misc_cg* - Parent of the @cgroup.
> + * * %NULL - If @cgroup is null or the passed cgroup does not have a parent.
> + */
> +static inline struct misc_cg *misc_cg_parent(struct misc_cg *cgroup)
> +{
> + return cgroup ? css_misc(cgroup->css.parent) : NULL;
> +}
> +
>  /*
>   * get_current_misc_cg() - Find and get the misc cgroup of the current task.
>   *
> @@ -102,6 +122,15 @@ static inline void put_misc_cg(struct misc_cg *cg)
>  }
>  
>  #else /* !CONFIG_CGROUP_MISC */
> +static inline struct misc_cg *misc_cg_root(void)
> +{
> + return NULL;
> +}
> +
> +static inline struct misc_cg *misc_cg_parent(struct misc_cg *cg)
> +{
> + return NULL;
> +}
>  
>  static inline u64 misc_cg_res_total_usage(enum misc_res_type type)
>  {
> diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c
> index e0092170d0dd..dbd881be773f 100644
> --- a/kernel/cgroup/misc.c
> +++ b/kernel/cgroup/misc.c
> @@ -24,6 +24,10 @@ static const char *const misc_res_name[] = {
>   /* AMD SEV-ES ASIDs resource */
>   "sev_es",
>  #endif
> +#ifdef CONFIG_CGROUP_SGX_EPC
> + /* Intel SGX EPC memory bytes */
> + "sgx_epc",
> +#endif
>  };
>  
>  /* Root misc cgroup */
> @@ -40,18 +44,13 @@ static struct misc_cg root_cg;
>  static u64 misc_res_capacity[MISC_CG_RES_TYPES];
>  
>  /**
> - * parent_misc() - Get the parent of the passed misc cgroup.
> - * @cgroup: cgroup whose parent needs to be fetched.
> - *
> - * Context: Any context.
> - * Return:
> - * * struct misc_cg* - Parent of the @cgroup.
> - * * %NULL - If @cgroup is null or the passed cgroup does not have a parent.
> + * misc_cg_root() - Return the root misc cgroup.
>   */
> -static struct misc_cg *parent_misc(struct misc_cg *cgroup)
> +struct misc_cg *misc_cg_root(void)
>  {
> - return cgroup ? css_misc(cgroup->css.parent) : NULL;
> + return &root_cg;
>  }
> +EXPORT_SYMBOL_GPL(misc_cg_root);
>  
>  /**
>   * valid_type() - Check if @type is valid or not.
> @@ -150,7 +149,7 @@ int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg, u64 amount)
>   if (!amount)
>   return 0;
>  
> - for (i = cg; i; 

Re: [PATCH v4 01/18] cgroup/misc: Add per resource callbacks for CSS events

2023-09-13 Thread Jarkko Sakkinen
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Kristen Carlson Accardi 
>
> Consumers of the misc cgroup controller might need to perform separate
> actions for Cgroups Subsystem State(CSS) events: cgroup alloc and free.

nit: s/State(CSS)/State (CSS)/

"cgroup alloc" and "cgroup free" mean absolutely nothing.


> In addition, writes to the max value may also need separate action. Add

What is "the max value"?

> the ability to allow downstream users to setup callbacks for these
> operations, and call the corresponding per-resource-type callback when
> appropriate.

Who are "the downstream users", and what sort of callbacks do they set up?

>
> This code will be utilized by the SGX driver in a future patch.
>
> Signed-off-by: Kristen Carlson Accardi 
> Signed-off-by: Haitao Huang 
> ---
> V4:
> - Moved this to the front of the series.
> - Applies on cgroup/for-6.6 with the overflow fix for misc.
>
> V3:
> - Removed the released() callback
> ---
>  include/linux/misc_cgroup.h |  5 +
>  kernel/cgroup/misc.c| 32 +---
>  2 files changed, 34 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h
> index e799b1f8d05b..e1bcd176c2de 100644
> --- a/include/linux/misc_cgroup.h
> +++ b/include/linux/misc_cgroup.h
> @@ -37,6 +37,11 @@ struct misc_res {
>   u64 max;
>   atomic64_t usage;
>   atomic64_t events;
> +
> + /* per resource callback ops */
> + int (*misc_cg_alloc)(struct misc_cg *cg);
> + void (*misc_cg_free)(struct misc_cg *cg);
> + void (*misc_cg_max_write)(struct misc_cg *cg);
>  };
>  
>  /**
> diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c
> index 79a3717a5803..e0092170d0dd 100644
> --- a/kernel/cgroup/misc.c
> +++ b/kernel/cgroup/misc.c
> @@ -276,10 +276,13 @@ static ssize_t misc_cg_max_write(struct kernfs_open_file *of, char *buf,
>  
>   cg = css_misc(of_css(of));
>  
> - if (READ_ONCE(misc_res_capacity[type]))
> + if (READ_ONCE(misc_res_capacity[type])) {
>   WRITE_ONCE(cg->res[type].max, max);
> - else
> + if (cg->res[type].misc_cg_max_write)
> + cg->res[type].misc_cg_max_write(cg);
> + } else {
>   ret = -EINVAL;
> + }
>  
>   return ret ? ret : nbytes;
>  }
> @@ -383,23 +386,39 @@ static struct cftype misc_cg_files[] = {
>  static struct cgroup_subsys_state *
>  misc_cg_alloc(struct cgroup_subsys_state *parent_css)
>  {
> + struct misc_cg *parent_cg;
>   enum misc_res_type i;
>   struct misc_cg *cg;
> + int ret;
>  
>   if (!parent_css) {
>   cg = &root_cg;
> + parent_cg = &root_cg;
>   } else {
>   cg = kzalloc(sizeof(*cg), GFP_KERNEL);
>   if (!cg)
>   return ERR_PTR(-ENOMEM);
> + parent_cg = css_misc(parent_css);
>   }
>  
>   for (i = 0; i < MISC_CG_RES_TYPES; i++) {
>   WRITE_ONCE(cg->res[i].max, MAX_NUM);
>   atomic64_set(&cg->res[i].usage, 0);
> + if (parent_cg->res[i].misc_cg_alloc) {
> + ret = parent_cg->res[i].misc_cg_alloc(cg);
> + if (ret)
> + goto alloc_err;
> + }
>   }
>  
>   return &cg->css;
> +
> +alloc_err:
> + for (i = 0; i < MISC_CG_RES_TYPES; i++)
> + if (parent_cg->res[i].misc_cg_free)
> + cg->res[i].misc_cg_free(cg);
> + kfree(cg);
> + return ERR_PTR(ret);
>  }
>  
>  /**
> @@ -410,7 +429,14 @@ misc_cg_alloc(struct cgroup_subsys_state *parent_css)
>   */
>  static void misc_cg_free(struct cgroup_subsys_state *css)
>  {
> - kfree(css_misc(css));
> + struct misc_cg *cg = css_misc(css);
> + enum misc_res_type i;
> +
> + for (i = 0; i < MISC_CG_RES_TYPES; i++)
> + if (cg->res[i].misc_cg_free)
> + cg->res[i].misc_cg_free(cg);
> +
> + kfree(cg);
>  }
>  
>  /* Cgroup controller callbacks */
> -- 
> 2.25.1
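
For readers following the questions above, the callback pattern in the quoted patch can be sketched in plain userspace C. This is only an illustration under stated assumptions: every demo_* name below is hypothetical and is not part of the misc cgroup API. The sketch mirrors just the max-write path, where the new limit is stored first and the per-resource callback is invoked only if one was installed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical resource types; the real enum is misc_res_type. */
enum demo_res_type { DEMO_RES_A, DEMO_RES_TYPES };

struct demo_cg;

/* Per-resource state plus an optional callback, modelled on the quoted struct misc_res. */
struct demo_res {
	unsigned long long max;
	void (*max_write)(struct demo_cg *cg);
};

struct demo_cg {
	struct demo_res res[DEMO_RES_TYPES];
	int max_write_calls;	/* bookkeeping for this demo only */
};

/* Example consumer callback: it only counts how often it was invoked. */
static void demo_count_max_write(struct demo_cg *cg)
{
	cg->max_write_calls++;
}

/* Store the new limit, then notify the consumer if it installed a callback. */
static int demo_set_max(struct demo_cg *cg, enum demo_res_type type,
			unsigned long long max)
{
	assert(cg != NULL);
	if ((unsigned int)type >= (unsigned int)DEMO_RES_TYPES)
		return -1;

	cg->res[type].max = max;
	if (cg->res[type].max_write)
		cg->res[type].max_write(cg);
	return 0;
}
```

A consumer would assign demo_count_max_write to res[DEMO_RES_A].max_write before the first call to demo_set_max(); with no callback installed, the write still succeeds and nothing else happens.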

BR, Jarkko


[PATCH v2 4/8] x86/vdso: Remove now superfluous sentinel element from ctl_table array

2023-09-13 Thread Joel Granados via B4 Relay
From: Joel Granados 

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information:
https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)

Remove sentinel element from abi_table2. This removal is safe because
register_sysctl implicitly uses ARRAY_SIZE() in addition to checking for
the sentinel.

Signed-off-by: Joel Granados 
---
 arch/x86/entry/vdso/vdso32-setup.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/entry/vdso/vdso32-setup.c b/arch/x86/entry/vdso/vdso32-setup.c
index f3b3cacbcbb0..37b761802181 100644
--- a/arch/x86/entry/vdso/vdso32-setup.c
+++ b/arch/x86/entry/vdso/vdso32-setup.c
@@ -66,8 +66,7 @@ static struct ctl_table abi_table2[] = {
.proc_handler   = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
-   },
-   {}
+   }
 };
 
 static __init int ia32_binfmt_init(void)

-- 
2.30.2



[PATCH v2 7/8] ia64: Remove now superfluous sentinel element from ctl_table array

2023-09-13 Thread Joel Granados via B4 Relay
From: Joel Granados 

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information:
https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)

Remove sentinel from kdump_ctl_table. This removal is safe because
register_sysctl implicitly uses ARRAY_SIZE() in addition to checking for
the sentinel.

Signed-off-by: Joel Granados 
---
 arch/ia64/kernel/crash.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/ia64/kernel/crash.c b/arch/ia64/kernel/crash.c
index 88b3ce3e66cd..fbf8893a570c 100644
--- a/arch/ia64/kernel/crash.c
+++ b/arch/ia64/kernel/crash.c
@@ -231,8 +231,7 @@ static struct ctl_table kdump_ctl_table[] = {
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec,
-   },
-   { }
+   }
 };
 #endif
 

-- 
2.30.2



[PATCH v2 8/8] c-sky: Remove now superfluous sentinel element from ctl_table array

2023-09-13 Thread Joel Granados via B4 Relay
From: Joel Granados 

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information:
https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)

Remove sentinel from alignment_tbl ctl_table array. This removal is safe
because register_sysctl_init implicitly uses ARRAY_SIZE() in addition to
checking for the sentinel.

Acked-by: Guo Ren 
Signed-off-by: Joel Granados 
---
 arch/csky/abiv1/alignment.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/csky/abiv1/alignment.c b/arch/csky/abiv1/alignment.c
index b60259daed1b..0d75ce7b0328 100644
--- a/arch/csky/abiv1/alignment.c
+++ b/arch/csky/abiv1/alignment.c
@@ -328,8 +328,7 @@ static struct ctl_table alignment_tbl[5] = {
.maxlen = sizeof(align_usr_count),
.mode = 0666,
.proc_handler = &proc_dointvec
-   },
-   {}
+   }
 };
 
 static int __init csky_alignment_init(void)

-- 
2.30.2



[PATCH v2 6/8] powerpc: Remove now superfluous sentinel element from ctl_table arrays

2023-09-13 Thread Joel Granados via B4 Relay
From: Joel Granados 

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information:
https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)

Remove sentinel from powersave_nap_ctl_table and nmi_wd_lpm_factor_ctl_table.
This removal is safe because register_sysctl implicitly uses ARRAY_SIZE()
in addition to checking for the sentinel.

Signed-off-by: Joel Granados 
---
 arch/powerpc/kernel/idle.c| 3 +--
 arch/powerpc/platforms/pseries/mobility.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/idle.c b/arch/powerpc/kernel/idle.c
index b1c0418b25c8..a8591f5fa70e 100644
--- a/arch/powerpc/kernel/idle.c
+++ b/arch/powerpc/kernel/idle.c
@@ -104,8 +104,7 @@ static struct ctl_table powersave_nap_ctl_table[] = {
.maxlen = sizeof(int),
.mode   = 0644,
.proc_handler   = proc_dointvec,
-   },
-   {}
+   }
 };
 
 static int __init
diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 0161226d8fec..d82b0c802fbb 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -60,8 +60,7 @@ static struct ctl_table nmi_wd_lpm_factor_ctl_table[] = {
.maxlen = sizeof(int),
.mode   = 0644,
.proc_handler   = proc_douintvec_minmax,
-   },
-   {}
+   }
 };
 
 static int __init register_nmi_wd_lpm_factor_sysctl(void)

-- 
2.30.2



[PATCH v2 3/8] arch/x86: Remove now superfluous sentinel elem from ctl_table arrays

2023-09-13 Thread Joel Granados via B4 Relay
From: Joel Granados 

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information:
https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)

Remove sentinel element from sld_sysctl and itmt_kern_table. This
removal is safe because register_sysctl_init and register_sysctl
implicitly use the array size in addition to checking for the sentinel.

Reviewed-by: Ingo Molnar 
Acked-by: Dave Hansen  # for x86
Signed-off-by: Joel Granados 
---
 arch/x86/kernel/cpu/intel.c | 3 +--
 arch/x86/kernel/itmt.c  | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index be4045628fd3..e63391b82624 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1015,8 +1015,7 @@ static struct ctl_table sld_sysctls[] = {
.proc_handler   = proc_douintvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
-   },
-   {}
+   }
 };
 
 static int __init sld_mitigate_sysctl_init(void)
diff --git a/arch/x86/kernel/itmt.c b/arch/x86/kernel/itmt.c
index ee4fe8cdb857..5f2ccff38297 100644
--- a/arch/x86/kernel/itmt.c
+++ b/arch/x86/kernel/itmt.c
@@ -73,8 +73,7 @@ static struct ctl_table itmt_kern_table[] = {
.proc_handler   = sched_itmt_update_handler,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
-   },
-   {}
+   }
 };
 
 static struct ctl_table_header *itmt_sysctl_header;

-- 
2.30.2



[PATCH v2 2/8] arm: Remove now superfluous sentinel elem from ctl_table arrays

2023-09-13 Thread Joel Granados via B4 Relay
From: Joel Granados 

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information:
https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)

Removed the sentinel as well as the explicit size from ctl_isa_vars. The
size is redundant as the initialization sets it. Changed
insn_emulation->sysctl from a 2 element array of struct ctl_table to a
simple struct. This has no consequence for the sysctl registration as it
is forwarded as a pointer. Removed sentinel from sve_default_vl_table,
sme_default_vl_table, tagged_addr_sysctl_table and
armv8_pmu_sysctl_table.

This removal is safe because register_sysctl_sz and register_sysctl use
the array size in addition to checking for the sentinel.
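
As a rough illustration of why a single struct works as well as a two-element array here, consider the following userspace sketch. It is not the kernel code: struct demo_ctl_table, struct demo_insn_emulation, demo_register_sz() and demo_register_insn() are all hypothetical names, and the stub only models a registration helper that receives a pointer plus an explicit element count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ctl_table; only two fields are modelled. */
struct demo_ctl_table {
	const char *procname;
	int mode;
};

/* Hypothetical stand-in for struct insn_emulation after the change. */
struct demo_insn_emulation {
	const char *name;
	struct demo_ctl_table sysctl;	/* one entry, no sentinel slot */
};

/* Stub "registration": walks at most table_size entries and returns how many it saw. */
static size_t demo_register_sz(const struct demo_ctl_table *table, size_t table_size)
{
	size_t n = 0;

	assert(table != NULL);
	while (n < table_size && table[n].procname)
		n++;
	return n;
}

/* Fill in the single embedded entry and hand it over as a one-element table. */
static size_t demo_register_insn(struct demo_insn_emulation *insn)
{
	struct demo_ctl_table *sysctl;

	assert(insn != NULL);
	sysctl = &insn->sysctl;
	sysctl->procname = insn->name;
	sysctl->mode = 0644;
	return demo_register_sz(sysctl, 1);
}
```

Because the count passed alongside the pointer is 1, the stub never reads past the embedded struct, so no trailing empty element is needed.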

Signed-off-by: Joel Granados 
---
 arch/arm/kernel/isa.c| 4 ++--
 arch/arm64/kernel/armv8_deprecated.c | 8 +++-
 arch/arm64/kernel/fpsimd.c   | 6 ++
 arch/arm64/kernel/process.c  | 3 +--
 drivers/perf/arm_pmuv3.c | 3 +--
 5 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/arch/arm/kernel/isa.c b/arch/arm/kernel/isa.c
index 20218876bef2..0b9c28077092 100644
--- a/arch/arm/kernel/isa.c
+++ b/arch/arm/kernel/isa.c
@@ -16,7 +16,7 @@
 
 static unsigned int isa_membase, isa_portbase, isa_portshift;
 
-static struct ctl_table ctl_isa_vars[4] = {
+static struct ctl_table ctl_isa_vars[] = {
{
.procname   = "membase",
.data   = &isa_membase,
@@ -35,7 +35,7 @@ static struct ctl_table ctl_isa_vars[4] = {
.maxlen = sizeof(isa_portshift),
.mode   = 0444,
.proc_handler   = proc_dointvec,
-   }, {}
+   }
 };
 
 static struct ctl_table_header *isa_sysctl_header;
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index e459cfd33711..dd6ce86d4332 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -52,10 +52,8 @@ struct insn_emulation {
int min;
int max;
 
-   /*
-* sysctl for this emulation + a sentinal entry.
-*/
-   struct ctl_table sysctl[2];
+   /* sysctl for this emulation */
+   struct ctl_table sysctl;
 };
 
 #define ARM_OPCODE_CONDTEST_FAIL   0
@@ -558,7 +556,7 @@ static void __init register_insn_emulation(struct insn_emulation *insn)
update_insn_emulation_mode(insn, INSN_UNDEF);
 
if (insn->status != INSN_UNAVAILABLE) {
-   sysctl = &insn->sysctl[0];
+   sysctl = &insn->sysctl;
 
sysctl->mode = 0644;
sysctl->maxlen = sizeof(int);
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 91e44ac7150f..db3ad1ba8272 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -588,8 +588,7 @@ static struct ctl_table sve_default_vl_table[] = {
.mode   = 0644,
.proc_handler   = vec_proc_do_default_vl,
.extra1 = &vl_info[ARM64_VEC_SVE],
-   },
-   { }
+   }
 };
 
 static int __init sve_sysctl_init(void)
@@ -612,8 +611,7 @@ static struct ctl_table sme_default_vl_table[] = {
.mode   = 0644,
.proc_handler   = vec_proc_do_default_vl,
.extra1 = &vl_info[ARM64_VEC_SME],
-   },
-   { }
+   }
 };
 
 static int __init sme_sysctl_init(void)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 0fcc4eb1a7ab..48861cdc3aae 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -723,8 +723,7 @@ static struct ctl_table tagged_addr_sysctl_table[] = {
.proc_handler   = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
-   },
-   { }
+   }
 };
 
 static int __init tagged_addr_init(void)
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index e5a2ac4155f6..c4aa6a8d1b05 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -1172,8 +1172,7 @@ static struct ctl_table armv8_pmu_sysctl_table[] = {
.proc_handler   = armv8pmu_proc_user_access_handler,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
-   },
-   { }
+   }
 };
 
 static void armv8_pmu_register_sysctl_table(void)

-- 
2.30.2



[PATCH v2 0/8] sysctl: Remove sentinel elements from arch

2023-09-13 Thread Joel Granados via B4 Relay
From: Joel Granados 

What?
These commits remove the sentinel element (last empty element) from the
sysctl arrays of all the files under the "arch/" directory that use a
sysctl array for registration. The merging of the preparation patches
(in https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)
to mainline allows us to just remove sentinel elements without changing
behavior. This is now safe because the sysctl registration code
(register_sysctl() and friends) uses the array size in addition to
checking for a sentinel (see [1] for more info).
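
A minimal userspace sketch of that idea follows. It is an illustration only, not the code in fs/proc/proc_sysctl.c: struct demo_ctl_table, DEMO_ARRAY_SIZE() and count_entries() are hypothetical names. The loop is bounded by the array size and also stops at an entry whose procname is NULL, so a table yields the same entries with or without a trailing empty element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ctl_table; only two fields are modelled. */
struct demo_ctl_table {
	const char *procname;
	int mode;
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Count usable entries: stop at the array bound or at an entry with a
 * NULL procname (the old sentinel), whichever comes first.
 */
static size_t count_entries(const struct demo_ctl_table *table, size_t table_size)
{
	size_t n = 0;

	assert(table != NULL);
	while (n < table_size && table[n].procname)
		n++;
	return n;
}

/* Old style: the last element is an empty sentinel. */
static const struct demo_ctl_table with_sentinel[] = {
	{ "alpha", 0644 },
	{ "beta",  0444 },
	{ NULL, 0 }
};

/* New style: no sentinel; the array size alone bounds the walk. */
static const struct demo_ctl_table without_sentinel[] = {
	{ "alpha", 0644 },
	{ "beta",  0444 }
};
```

Calling count_entries() with DEMO_ARRAY_SIZE() of either table reports two entries, and the two arrays differ in size by exactly one element, which is the saving each patch in this set is after.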

These commits are part of a bigger set (bigger patchset here
https://github.com/Joelgranados/linux/tree/tag/sysctl_remove_empty_elem_V4)
that removes the ctl_table sentinel. The idea is to make the review
process easier by chunking the 52 commits into manageable pieces. By
sending out one chunk at a time, they can be reviewed separately without
noise from parallel sets. After the "arch/" commits in this set are
reviewed, I will continue with drivers/*, fs/*, kernel/*, net/* and
miscellaneous. The final set will remove the unneeded check for
->procname == NULL.

Why?
By removing the sysctl sentinel elements we avoid kernel bloat as
ctl_table arrays get moved out of kernel/sysctl.c into their own
respective subsystems. This move was started long ago to avoid merge
conflicts; the sentinel removal bit came after Matthew Wilcox suggested
it to avoid bloating the kernel by one element as arrays moved out. This
patchset will reduce the overall build time size of the kernel and run
time memory bloat by about 64 bytes per declared ctl_table array. I
have consolidated some links that shed light on the history of this
effort [2].

V2:
* Added clarification both in the commit messages and the coverletter as
  to why this patch is safe to apply.
* Added {Acked,Reviewed,Tested}-by from list
* Link to v1: https://lore.kernel.org/r/20230906-jag-sysctl_remove_empty_elem_arch-v1-0-3935d4854...@samsung.com

Testing:
* Ran sysctl selftests (./tools/testing/selftests/sysctl/sysctl.sh)
* Ran this through 0-day with no errors or warnings

Size saving after removing all sentinels:
  Removing all the sentinels (64 bytes per sentinel) will eventually
  save the bytes listed below. These are *not* the numbers that we will
  get after this patch set; these are the numbers that we will get after
  removing all the sentinels. I included them here because they are
  relevant and give an idea of just how much memory we are talking
  about.
* bloat-o-meter:
- The "yesall" configuration results save 9158 bytes (bloat-o-meter output here
  https://lore.kernel.org/all/20230621091000.424843-1-j.grana...@samsung.com/)
- The "tiny" config + CONFIG_SYSCTL save 1215 bytes (bloat-o-meter output here
  https://lore.kernel.org/all/20230809105006.1198165-1-j.grana...@samsung.com/)
* memory usage:
We save some bytes in main memory as well. In my testing kernel
I measured a difference of 7296 bytes. The way to measure this is
included in [3].

Size saving after this patchset:
  Here I give the values that I measured for the architecture that I'm
  running (x86_64). This can give an approximation of how many bytes are
  saved for each arch. I won't publish for all the archs because I don't
  have access to all of them.
* bloat-o-meter
- The "yesall" config saves 192 bytes (bloat-o-meter output [4])
- The "tiny" config saves 64 bytes (bloat-o-meter output [5])
* memory usage:
In this case there were no bytes saved. To measure it, comment out the
printk in `new_dir` and uncomment the if conditional in `new_links`
[3].
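
The bloat-o-meter figures above are consistent with the ~64 bytes per sentinel estimate: 192 bytes corresponds to three sentinels and 64 bytes to one. A minimal sketch of that arithmetic, assuming exactly 64 bytes per removed element (the real size depends on the architecture and on the layout of struct ctl_table); DEMO_SENTINEL_BYTES and demo_bytes_saved() are names made up for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed per-sentinel cost, taken from the "~64 bytes per sentinel" estimate. */
#define DEMO_SENTINEL_BYTES ((size_t)64)

/* Bytes saved in the image when the given number of sentinel elements is dropped. */
static size_t demo_bytes_saved(size_t sentinels_removed)
{
	/* Guard against overflow of the multiplication. */
	assert(sentinels_removed <= SIZE_MAX / DEMO_SENTINEL_BYTES);
	return sentinels_removed * DEMO_SENTINEL_BYTES;
}
```

Three is also the number of x86 tables touched in patches 3/8 and 4/8 (sld_sysctls, itmt_kern_table and abi_table2), although mapping the 192 bytes onto exactly those tables is an inference and was not measured here.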

Comments/feedback greatly appreciated

Best
Joel

[1]
https://lore.kernel.org/all/20230809105006.1198165-1-j.grana...@samsung.com/

[2]
Links Related to the ctl_table sentinel removal:
* Good summary from Luis sent with the "pull request" for the
  preparation patches.
  https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/
* Another very good summary from Luis.
  https://lore.kernel.org/all/zmfizkfkvxuft...@bombadil.infradead.org/
* This is a patch set that replaces register_sysctl_table with register_sysctl
  https://lore.kernel.org/all/20230302204612.782387-1-mcg...@kernel.org/
* Patch set to deprecate register_sysctl_paths()
  https://lore.kernel.org/all/20230302202826.776286-1-mcg...@kernel.org/
* Here there is an explicit expectation for the removal of the sentinel element.
  https://lore.kernel.org/all/20230321130908.6972-1-frank...@vivo.com
* The "ARRAY_SIZE" approach was mentioned (proposed?) in this thread
  https://lore.kernel.org/all/20220220060626.15885-1-tangm...@uniontech.com

[3]
To measure the in memory savings apply this on top of this patchset.

"
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index c88854df0b62..e0073a627bac 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -976,6 +976,8 @@ static struct ctl_dir 

[PATCH v2 5/8] riscv: Remove now superfluous sentinel element from ctl_table array

2023-09-13 Thread Joel Granados via B4 Relay
From: Joel Granados 

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information:
https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)

Remove sentinel element from riscv_v_default_vstate_table. This removal
is safe because register_sysctl implicitly uses ARRAY_SIZE() in addition
to checking for the sentinel.

Signed-off-by: Joel Granados 
---
 arch/riscv/kernel/vector.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
index 8d92fb6c522c..a1ae68b2ac0f 100644
--- a/arch/riscv/kernel/vector.c
+++ b/arch/riscv/kernel/vector.c
@@ -254,8 +254,7 @@ static struct ctl_table riscv_v_default_vstate_table[] = {
.maxlen = sizeof(riscv_v_implicit_uacc),
.mode   = 0644,
.proc_handler   = proc_dobool,
-   },
-   { }
+   }
 };
 
 static int __init riscv_v_sysctl_init(void)

-- 
2.30.2



[PATCH v2 1/8] S390: Remove now superfluous sentinel elem from ctl_table arrays

2023-09-13 Thread Joel Granados via B4 Relay
From: Joel Granados 

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which will
reduce the overall build time size of the kernel and run time memory
bloat by ~64 bytes per sentinel (further information:
https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)

Remove the sentinel element from appldata_table, s390dbf_table,
topology_ctl_table, cmm_table and page_table_sysctl. Reduced the memory
allocation in appldata_register_ops by 1, effectively removing the
sentinel from ops->ctl_table.

This removal is safe because register_sysctl_sz and register_sysctl use
the array size in addition to checking for the sentinel.

Tested-by: Alexander Gordeev 
Acked-by: Heiko Carstens 
Signed-off-by: Joel Granados 
---
 arch/s390/appldata/appldata_base.c | 6 ++
 arch/s390/kernel/debug.c   | 3 +--
 arch/s390/kernel/topology.c| 3 +--
 arch/s390/mm/cmm.c | 3 +--
 arch/s390/mm/pgalloc.c | 3 +--
 5 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/arch/s390/appldata/appldata_base.c b/arch/s390/appldata/appldata_base.c
index 3b0994625652..872a644b1fd1 100644
--- a/arch/s390/appldata/appldata_base.c
+++ b/arch/s390/appldata/appldata_base.c
@@ -62,8 +62,7 @@ static struct ctl_table appldata_table[] = {
.procname   = "interval",
.mode   = S_IRUGO | S_IWUSR,
.proc_handler   = appldata_interval_handler,
-   },
-   { },
+   }
 };
 
 /*
@@ -351,8 +350,7 @@ int appldata_register_ops(struct appldata_ops *ops)
if (ops->size > APPLDATA_MAX_REC_SIZE)
return -EINVAL;
 
-   /* The last entry must be an empty one */
-   ops->ctl_table = kcalloc(2, sizeof(struct ctl_table), GFP_KERNEL);
+   ops->ctl_table = kcalloc(1, sizeof(struct ctl_table), GFP_KERNEL);
if (!ops->ctl_table)
return -ENOMEM;
 
diff --git a/arch/s390/kernel/debug.c b/arch/s390/kernel/debug.c
index a85e0c3e7027..150e2bfff0b3 100644
--- a/arch/s390/kernel/debug.c
+++ b/arch/s390/kernel/debug.c
@@ -977,8 +977,7 @@ static struct ctl_table s390dbf_table[] = {
.maxlen = sizeof(int),
.mode   = S_IRUGO | S_IWUSR,
.proc_handler   = s390dbf_procactive,
-   },
-   { }
+   }
 };
 
 static struct ctl_table_header *s390dbf_sysctl_header;
diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c
index 68adf1de..9dcfac416669 100644
--- a/arch/s390/kernel/topology.c
+++ b/arch/s390/kernel/topology.c
@@ -635,8 +635,7 @@ static struct ctl_table topology_ctl_table[] = {
.procname   = "topology",
.mode   = 0644,
.proc_handler   = topology_ctl_handler,
-   },
-   { },
+   }
 };
 
 static int __init topology_init(void)
diff --git a/arch/s390/mm/cmm.c b/arch/s390/mm/cmm.c
index f47515313226..8937aa7090b3 100644
--- a/arch/s390/mm/cmm.c
+++ b/arch/s390/mm/cmm.c
@@ -331,8 +331,7 @@ static struct ctl_table cmm_table[] = {
.procname   = "cmm_timeout",
.mode   = 0644,
.proc_handler   = cmm_timeout_handler,
-   },
-   { }
+   }
 };
 
 #ifdef CONFIG_CMM_IUCV
diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
index 07fc660a24aa..e8cecd31715f 100644
--- a/arch/s390/mm/pgalloc.c
+++ b/arch/s390/mm/pgalloc.c
@@ -29,8 +29,7 @@ static struct ctl_table page_table_sysctl[] = {
.proc_handler   = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
-   },
-   { }
+   }
 };
 
 static int __init page_table_register_sysctl(void)

-- 
2.30.2


