Re: [PATCH] mctp i2c: Add rx trace

2024-05-28 Thread Jeremy Kerr
Hi Tal,

Thanks for the contribution! Some comments:

> mctp-i2c rx implementation doesn't call
> __i2c_transfer which calls the i2c reply trace function.

No, but we can trace the i2c rx path through the trace_i2c_slave
tracepoint. It is a little messier than tracing trace_i2c_write, but
has been sufficient with the debugging I've needed in the past.

> Add an mctp_reply trace function that will be used instead.

Can you elaborate a little on what you were/are looking to inspect
here? (mainly: which packet fields are you interested in?) That will
help to determine the best approach here.

Cheers,


Jeremy



Re: [PATCH 0/3] tracing: Fix some selftest issues

2024-05-28 Thread Google
On Wed, 29 May 2024 01:46:40 +0900
Masami Hiramatsu (Google)  wrote:

> On Mon, 27 May 2024 19:29:07 -0400
> Steven Rostedt  wrote:
> 
> > On Sun, 26 May 2024 19:10:57 +0900
> > "Masami Hiramatsu (Google)"  wrote:
> > 
> > > Hi,
> > > 
> > > Here is a series of some fixes/improvements for the test modules and boot
> > > time selftest of kprobe events. I found a WARNING message with some boot 
> > > time selftest configuration, which came from the combination of embedded
> > > kprobe generate API tests module and ftrace boot-time selftest. So the 
> > > main
> > > problem is that the test module should not be built-in. But I also think
> > > this WARNING message is useless (because there are warning messages 
> > > already)
> > > and the cleanup code is redundant. This series fixes those issues.
> > 
> > Note, when I enable trace tests as builtin instead of modules, I just
> > disable the bootup self tests when it detects this. This helps with
> > doing tests via config options than having to add user space code that
> > loads modules.
> > 
> > Could you do something similar?
> 
> OK, in that case, I would like to move the test cleanup code in
> module_exit function into the end of module_init function. 
> It looks there is no reason to split those into 2 parts.

Wait, I would like to hear Tom's opinion. I found following usage comments
in the code.

 * Following that are a few examples using the created events to test
 * various ways of tracing a synthetic event.
 *
 * To test, select CONFIG_SYNTH_EVENT_GEN_TEST and build the module.
 * Then:
 *
 * # insmod kernel/trace/synth_event_gen_test.ko
 * # cat /sys/kernel/tracing/trace
 *
 * You should see several events in the trace buffer -
 * "create_synth_test", "empty_synth_test", and several instances of
 * "gen_synth_test".
 *
 * To remove the events, remove the module:
 *
 * # rmmod synth_event_gen_test

Tom, is that intended behavior ? and are you expected to reuse these
events outside of the module? e.g. load the test module and run some
test script in user space which uses those events?

As far as I can see, those tests are not intended to be embedded in the
kernel because those are expected to be removed.

Thank you,

> 
> Thank you,
> 
> > 
> > -- Steve
> > 
> > 
> > > 
> > > Thank you,
> > > 
> > > ---
> > > 
> > > Masami Hiramatsu (Google) (3):
> > >   tracing: Build event generation tests only as modules
> > >   tracing/kprobe: Remove unneeded WARN_ON_ONCE() in selftests
> > >   tracing/kprobe: Remove cleanup code unrelated to selftest
> > > 
> 
> 
> -- 
> Masami Hiramatsu (Google) 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH DNM 2/2] arm64: dts: qcom: qcm6490-fairphone-fp5: Add DisplayPort sound support

2024-05-28 Thread Bjorn Andersson
On Fri, May 10, 2024 at 02:27:09PM GMT, Luca Weiss wrote:
> Add the required nodes for sound playback via a connected external
> display (DisplayPort over USB-C).
> 
> Signed-off-by: Luca Weiss 
> ---
> Depends on a bunch of patches upstream doing bringup of Display (DSI),
> DisplayPort, GPU, and then finally audio could land. But we're blocked
> on DPU 1:1:1 topology for all of that unfortunately.
> 
> And also machine driver for sound just exists a bit hackily.

Thanks for sharing this, Luca. Can you please resubmit this once it's
ready to be merged, so that I don't need to keep track of it?

Regards,
Bjorn

> ---
>  arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts | 36 
> ++
>  1 file changed, 36 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts 
> b/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
> index 05bbf1da5cb8..2bbbcaeff95e 100644
> --- a/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
> +++ b/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
> @@ -14,6 +14,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include "sc7280.dtsi"
>  #include "pm7250b.dtsi"
>  #include "pm7325.dtsi"
> @@ -774,6 +776,12 @@ _resin {
>   status = "okay";
>  };
>  
> + {
> + dai@104 {
> + reg = ;
> + };
> +};
> +
>  _spi13_cs {
>   drive-strength = <6>;
>   bias-disable;
> @@ -847,6 +855,34 @@ _2 {
>   status = "okay";
>  };
>  
> + {
> + compatible = "fairphone,fp5-sndcard";
> + model = "Fairphone 5";
> +
> + mm1-dai-link {
> + link-name = "MultiMedia1";
> + cpu {
> + sound-dai = < MSM_FRONTEND_DAI_MULTIMEDIA1>;
> + };
> + };
> +
> + displayport-rx-dai-link {
> + link-name = "DisplayPort Playback";
> +
> + cpu {
> + sound-dai = < DISPLAY_PORT_RX>;
> + };
> +
> + platform {
> + sound-dai = <>;
> + };
> +
> + codec {
> + sound-dai = <_dp>;
> + };
> + };
> +};
> +
>   {
>   status = "okay";
>  
> 
> -- 
> 2.45.0
> 



Re: [PATCH v5 5/7] remoteproc: core: support of the tee interface

2024-05-28 Thread Mathieu Poirier
On Tue, May 21, 2024 at 10:09:59AM +0200, Arnaud Pouliquen wrote:
> 1) on start:
> - Using the TEE loader, the resource table is loaded by an external entity.
> In such case the resource table address is not find from the firmware but
> provided by the TEE remoteproc framework.
> Use the rproc_get_loaded_rsc_table instead of rproc_find_loaded_rsc_table
> - test that rproc->cached_table is not null before performing the memcpy
> 
> 2)on stop
> The use of the cached_table seems mandatory:
> - during recovery sequence to have a snapshot of the resource table
>   resources used,
> - on stop to allow  for the deinitialization of resources after the
>   the remote processor has been shutdown.
> However if the TEE interface is being used, we first need to unmap the
> table_ptr before setting it to rproc->cached_table.
> The update of rproc->table_ptr to rproc->cached_table is performed in
> tee_remoteproc.
> 
> Signed-off-by: Arnaud Pouliquen 
> ---
>  drivers/remoteproc/remoteproc_core.c | 31 +---
>  1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c 
> b/drivers/remoteproc/remoteproc_core.c
> index 42bca01f3bde..3a642151c983 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1267,6 +1267,7 @@ EXPORT_SYMBOL(rproc_resource_cleanup);
>  static int rproc_set_rsc_table_on_start(struct rproc *rproc, const struct 
> firmware *fw)
>  {
>   struct resource_table *loaded_table;
> + struct device *dev = >dev;
>  
>   /*
>* The starting device has been given the rproc->cached_table as the
> @@ -1276,12 +1277,21 @@ static int rproc_set_rsc_table_on_start(struct rproc 
> *rproc, const struct firmwa
>* this information to device memory. We also update the table_ptr so
>* that any subsequent changes will be applied to the loaded version.
>*/
> - loaded_table = rproc_find_loaded_rsc_table(rproc, fw);
> - if (loaded_table) {
> - memcpy(loaded_table, rproc->cached_table, rproc->table_sz);
> - rproc->table_ptr = loaded_table;
> + if (rproc->tee_interface) {
> + loaded_table = rproc_get_loaded_rsc_table(rproc, 
> >table_sz);
> + if (IS_ERR(loaded_table)) {
> + dev_err(dev, "can't get resource table\n");
> + return PTR_ERR(loaded_table);
> + }
> + } else {
> + loaded_table = rproc_find_loaded_rsc_table(rproc, fw);
>   }
>  
> + if (loaded_table && rproc->cached_table)
> + memcpy(loaded_table, rproc->cached_table, rproc->table_sz);
> +

Why is this not part of the else {} above as it was the case before this patch?
And why was an extra check for ->cached_table added?

This should be a simple change, i.e introduce an if {} else {} block to take
care of the two scenarios.  Plus the comment is misplaced now. 

More comments tomorrow.

Thanks,
Mathieu

> + rproc->table_ptr = loaded_table;
> +
>   return 0;
>  }
>  
> @@ -1318,11 +1328,16 @@ static int rproc_reset_rsc_table_on_stop(struct rproc 
> *rproc)
>   kfree(rproc->clean_table);
>  
>  out:
> - /*
> -  * Use a copy of the resource table for the remainder of the
> -  * shutdown process.
> + /* If the remoteproc_tee interface is used, then we have first to unmap 
> the resource table
> +  * before updating the proc->table_ptr reference.
>*/
> - rproc->table_ptr = rproc->cached_table;
> + if (!rproc->tee_interface) {
> + /*
> +  * Use a copy of the resource table for the remainder of the
> +  * shutdown process.
> +  */
> + rproc->table_ptr = rproc->cached_table;
> + }
>   return 0;
>  }
>  
> -- 
> 2.25.1
> 



Re: [PATCH v10] remoteproc: qcom: Move minidump related layout and API to soc/qcom directory

2024-05-28 Thread Bjorn Andersson
On Fri, May 03, 2024 at 01:48:07PM GMT, Mukesh Ojha wrote:
> Currently, Qualcomm Minidump is being used to collect mini version of
> remoteproc coredump with the help of boot firmware however, Minidump
> as a feature is not limited to be used only for remote processor and
> can also be used for Application processors. So, in preparation of
> using it move the Minidump related data structure and its function
> to its own file under drivers/soc/qcom with qcom_rproc_minidump.c
> which clearly says it is only for remoteproc minidump.
> 
> Extra changes made apart from the movement is,
> 1. Adds new config, kernel headers and module macros to get this
>module compiled.
> 2. Guards the qcom_minidump() with CONFIG_QCOM_RPROC_MINIDUMP.
> 3. Selects this QCOM_RPROC_MINIDUMP config when QCOM_RPROC_COMMON
>enabled.
> 4. Added new header qcom_minidump.h .
> 
> Signed-off-by: Mukesh Ojha 

I wouldn't be able to merge this without anything depending on it...

[..]
> diff --git a/drivers/soc/qcom/qcom_rproc_minidump.c 
> b/drivers/soc/qcom/qcom_rproc_minidump.c
[..]
> +void qcom_minidump(struct rproc *rproc, unsigned int minidump_id,
> + void (*rproc_dumpfn_t)(struct rproc *rproc,
> + struct rproc_dump_segment *segment, void *dest, size_t offset,
> + size_t size))
> +{
> + int ret;
> + struct minidump_subsystem *subsystem;
> + struct minidump_global_toc *toc;
> +
> + /* Get Global minidump ToC*/
> + toc = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, NULL);
> +
> + /* check if global table pointer exists and init is set */
> + if (IS_ERR(toc) || !toc->status) {
> + dev_err(>dev, "Minidump TOC not found in SMEM\n");
> + return;
> + }
> +
> + /* Get subsystem table of contents using the minidump id */
> + subsystem = >subsystems[minidump_id];
> +
> + /**
> +  * Collect minidump if SS ToC is valid and segment table
> +  * is initialized in memory and encryption status is set.
> +  */
> + if (subsystem->regions_baseptr == 0 ||
> + le32_to_cpu(subsystem->status) != 1 ||
> + le32_to_cpu(subsystem->enabled) != MINIDUMP_SS_ENABLED) {
> + return rproc_coredump(rproc);
> + }
> +
> + if (le32_to_cpu(subsystem->encryption_status) != MINIDUMP_SS_ENCR_DONE) 
> {
> +     dev_err(>dev, "Minidump not ready, skipping\n");
> + return;
> + }
> +
> + /**
> +  * Clear out the dump segments populated by parse_fw before
> +  * re-populating them with minidump segments.
> +  */
> + rproc_coredump_cleanup(rproc);

I don't think this should be invoked outside drivers/remoteproc, and the
comment talks about a remoteproc-internal concern...

> +
> + ret = qcom_add_minidump_segments(rproc, subsystem, rproc_dumpfn_t);

This function changes the internal state of the remoteproc and relies on
other operations to clean things up.

I think we could come up with a better design of this, and I don't think
we should spread this outside of the remoteproc framework.

Regards,
Bjorn

> + if (ret) {
> + dev_err(>dev, "Failed with error: %d while adding 
> minidump entries\n", ret);
> + goto clean_minidump;
> + }
> + rproc_coredump_using_sections(rproc);
> +clean_minidump:
> + qcom_minidump_cleanup(rproc);
> +}
> +EXPORT_SYMBOL_GPL(qcom_minidump);
> +
> +MODULE_DESCRIPTION("Qualcomm Remoteproc Minidump helper module");
> +MODULE_LICENSE("GPL");
> diff --git a/include/soc/qcom/qcom_minidump.h 
> b/include/soc/qcom/qcom_minidump.h
> new file mode 100644
> index ..0fe156066bc0
> --- /dev/null
> +++ b/include/soc/qcom/qcom_minidump.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _QCOM_MINIDUMP_H_
> +#define _QCOM_MINIDUMP_H_
> +
> +struct rproc;
> +struct rproc_dump_segment;
> +
> +#if IS_ENABLED(CONFIG_QCOM_RPROC_MINIDUMP)
> +void qcom_minidump(struct rproc *rproc, unsigned int minidump_id,
> +void (*rproc_dumpfn_t)(struct rproc *rproc,
> +struct rproc_dump_segment *segment, void *dest, size_t 
> offset,
> +size_t size));
> +#else
> +static inline void qcom_minidump(struct rproc *rproc, unsigned int 
> minidump_id,
> +void (*rproc_dumpfn_t)(struct rproc *rproc,
> +struct rproc_dump_segment *segment, void *dest, size_t 
> offset,
> +size_t size)) { }
> +#endif /* CONFIG_QCOM_RPROC_MINIDUMP */
> +#endif /* _QCOM_MINIDUMP_H_ */
> -- 
> 2.7.4
> 



Re: [PATCH v5 4/7] remoteproc: core introduce rproc_set_rsc_table_on_start function

2024-05-28 Thread Mathieu Poirier
On Tue, May 21, 2024 at 10:09:58AM +0200, Arnaud Pouliquen wrote:
> Split rproc_start()to prepare the update of the management of

I don't see any "splitting" for rproc_start() in this patch.  Please consider
rewording or removing.

> the cache table on start, for the support of the firmware loading
> by the TEE interface.
> - create rproc_set_rsc_table_on_start() to address the management of
>   the cache table in a specific function, as done in
>   rproc_reset_rsc_table_on_stop().
> - rename rproc_set_rsc_table in rproc_set_rsc_table_on_attach()
> - move rproc_reset_rsc_table_on_stop() to be close to the
>   rproc_set_rsc_table_on_start() function

This patch is really hard to read due to all 3 operations happening at the same
time.  Please split in 3 smaller patches.

> 
> Suggested-by: Mathieu Poirier 
> Signed-off-by: Arnaud Pouliquen 
> ---
>  drivers/remoteproc/remoteproc_core.c | 116 ++-
>  1 file changed, 62 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c 
> b/drivers/remoteproc/remoteproc_core.c
> index f276956f2c5c..42bca01f3bde 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1264,18 +1264,9 @@ void rproc_resource_cleanup(struct rproc *rproc)
>  }
>  EXPORT_SYMBOL(rproc_resource_cleanup);
>  
> -static int rproc_start(struct rproc *rproc, const struct firmware *fw)
> +static int rproc_set_rsc_table_on_start(struct rproc *rproc, const struct 
> firmware *fw)
>  {
>   struct resource_table *loaded_table;
> - struct device *dev = >dev;
> - int ret;
> -
> - /* load the ELF segments to memory */
> - ret = rproc_load_segments(rproc, fw);
> - if (ret) {
> - dev_err(dev, "Failed to load program segments: %d\n", ret);
> - return ret;
> - }
>  
>   /*
>* The starting device has been given the rproc->cached_table as the
> @@ -1291,6 +1282,64 @@ static int rproc_start(struct rproc *rproc, const 
> struct firmware *fw)
>   rproc->table_ptr = loaded_table;
>   }
>  
> + return 0;
> +}
> +
> +static int rproc_reset_rsc_table_on_stop(struct rproc *rproc)
> +{
> + /* A resource table was never retrieved, nothing to do here */
> + if (!rproc->table_ptr)
> + return 0;
> +
> + /*
> +  * If a cache table exists the remote processor was started by
> +  * the remoteproc core.  That cache table should be used for
> +  * the rest of the shutdown process.
> +  */
> + if (rproc->cached_table)
> + goto out;
> +
> + /*
> +  * If we made it here the remote processor was started by another
> +  * entity and a cache table doesn't exist.  As such make a copy of
> +  * the resource table currently used by the remote processor and
> +  * use that for the rest of the shutdown process.  The memory
> +  * allocated here is free'd in rproc_shutdown().
> +  */
> + rproc->cached_table = kmemdup(rproc->table_ptr,
> +   rproc->table_sz, GFP_KERNEL);
> + if (!rproc->cached_table)
> + return -ENOMEM;
> +
> + /*
> +  * Since the remote processor is being switched off the clean table
> +  * won't be needed.  Allocated in rproc_set_rsc_table_on_start().
> +  */
> + kfree(rproc->clean_table);
> +
> +out:
> + /*
> +  * Use a copy of the resource table for the remainder of the
> +  * shutdown process.
> +  */
> + rproc->table_ptr = rproc->cached_table;
> + return 0;
> +}
> +
> +static int rproc_start(struct rproc *rproc, const struct firmware *fw)
> +{
> + struct device *dev = >dev;
> + int ret;
> +
> + /* load the ELF segments to memory */
> + ret = rproc_load_segments(rproc, fw);
> + if (ret) {
> + dev_err(dev, "Failed to load program segments: %d\n", ret);
> + return ret;
> + }
> +
> + rproc_set_rsc_table_on_start(rproc, fw);
> +
>   ret = rproc_prepare_subdevices(rproc);
>   if (ret) {
>   dev_err(dev, "failed to prepare subdevices for %s: %d\n",
> @@ -1450,7 +1499,7 @@ static int rproc_fw_boot(struct rproc *rproc, const 
> struct firmware *fw)
>   return ret;
>  }
>  
> -static int rproc_set_rsc_table(struct rproc *rproc)
> +static int rproc_set_rsc_table_on_attach(struct rproc *rproc)
>  {
>   struct resource_table *table_ptr;
>   struct device *dev = >dev;
> @@ -1540,54 +1589,13 @@ static int rproc_reset_rsc_table_on_detach(struct 
> rproc *rproc)
>  
>   /*
>* The clean resource table is no longer needed.  Allocated in
> -  * rproc_set_rsc_table().
> +  * rproc_set_rsc_table_on_attach().
>*/
>   kfree(rproc->clean_table);
>  
>   return 0;
>  }
>  
> -static int rproc_reset_rsc_table_on_stop(struct rproc *rproc)
> -{
> - /* A resource table was never retrieved, nothing to do here */
> - if (!rproc->table_ptr)
> - return 0;
> 

Re: [PATCH 2/2] objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids

2024-05-28 Thread Jiri Olsa
On Wed, Apr 24, 2024 at 02:52:14PM -0700, Andrii Nakryiko wrote:
> Profiling shows that calling nr_possible_cpus() in objpool_pop() takes
> a noticeable amount of CPU (when profiled on 80-core machine), as we
> need to recalculate number of set bits in a CPU bit mask. This number
> can't change, so there is no point in paying the price for recalculating
> it. As such, cache this value in struct objpool_head and use it in
> objpool_pop().
> 
> On the other hand, cached pool->nr_cpus isn't necessary, as it's not
> used in hot path and is also a pretty trivial value to retrieve. So drop
> pool->nr_cpus in favor of using nr_cpu_ids everywhere. This way the size
> of struct objpool_head remains the same, which is a nice bonus.
> 
> Same BPF selftests benchmarks were used to evaluate the effect. Using
> changes in previous patch (inlining of objpool_pop/objpool_push) as
> baseline, here are the differences:
> 
> BASELINE
> 
> kretprobe  :9.937 ± 0.174M/s
> kretprobe-multi:   10.440 ± 0.108M/s
> 
> AFTER
> =
> kretprobe  :   10.106 ± 0.120M/s (+1.7%)
> kretprobe-multi:   10.515 ± 0.180M/s (+0.7%)

nice, overall lgtm

jirka

> 
> Cc: Matt (Qiang) Wu 
> Signed-off-by: Andrii Nakryiko 
> ---
>  include/linux/objpool.h |  6 +++---
>  lib/objpool.c   | 12 ++--
>  2 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/objpool.h b/include/linux/objpool.h
> index d8b1f7b91128..cb1758eaa2d3 100644
> --- a/include/linux/objpool.h
> +++ b/include/linux/objpool.h
> @@ -73,7 +73,7 @@ typedef int (*objpool_fini_cb)(struct objpool_head *head, 
> void *context);
>   * struct objpool_head - object pooling metadata
>   * @obj_size:   object size, aligned to sizeof(void *)
>   * @nr_objs:total objs (to be pre-allocated with objpool)
> - * @nr_cpus:local copy of nr_cpu_ids
> + * @nr_possible_cpus: cached value of num_possible_cpus()
>   * @capacity:   max objs can be managed by one objpool_slot
>   * @gfp:gfp flags for kmalloc & vmalloc
>   * @ref:refcount of objpool
> @@ -85,7 +85,7 @@ typedef int (*objpool_fini_cb)(struct objpool_head *head, 
> void *context);
>  struct objpool_head {
>   int obj_size;
>   int nr_objs;
> - int nr_cpus;
> + int nr_possible_cpus;
>   int capacity;
>   gfp_t   gfp;
>   refcount_t  ref;
> @@ -176,7 +176,7 @@ static inline void *objpool_pop(struct objpool_head *pool)
>   raw_local_irq_save(flags);
>  
>   cpu = raw_smp_processor_id();
> - for (i = 0; i < num_possible_cpus(); i++) {
> + for (i = 0; i < pool->nr_possible_cpus; i++) {
>   obj = __objpool_try_get_slot(pool, cpu);
>   if (obj)
>   break;
> diff --git a/lib/objpool.c b/lib/objpool.c
> index f696308fc026..234f9d0bd081 100644
> --- a/lib/objpool.c
> +++ b/lib/objpool.c
> @@ -50,7 +50,7 @@ objpool_init_percpu_slots(struct objpool_head *pool, int 
> nr_objs,
>  {
>   int i, cpu_count = 0;
>  
> - for (i = 0; i < pool->nr_cpus; i++) {
> + for (i = 0; i < nr_cpu_ids; i++) {
>  
>   struct objpool_slot *slot;
>   int nodes, size, rc;
> @@ -60,8 +60,8 @@ objpool_init_percpu_slots(struct objpool_head *pool, int 
> nr_objs,
>   continue;
>  
>   /* compute how many objects to be allocated with this slot */
> - nodes = nr_objs / num_possible_cpus();
> - if (cpu_count < (nr_objs % num_possible_cpus()))
> + nodes = nr_objs / pool->nr_possible_cpus;
> + if (cpu_count < (nr_objs % pool->nr_possible_cpus))
>   nodes++;
>   cpu_count++;
>  
> @@ -103,7 +103,7 @@ static void objpool_fini_percpu_slots(struct objpool_head 
> *pool)
>   if (!pool->cpu_slots)
>   return;
>  
> - for (i = 0; i < pool->nr_cpus; i++)
> + for (i = 0; i < nr_cpu_ids; i++)
>   kvfree(pool->cpu_slots[i]);
>   kfree(pool->cpu_slots);
>  }
> @@ -130,13 +130,13 @@ int objpool_init(struct objpool_head *pool, int 
> nr_objs, int object_size,
>  
>   /* initialize objpool pool */
>   memset(pool, 0, sizeof(struct objpool_head));
> - pool->nr_cpus = nr_cpu_ids;
> + pool->nr_possible_cpus = num_possible_cpus();
>   pool->obj_size = object_size;
>   pool->capacity = capacity;
>   pool->gfp = gfp & ~__GFP_ZERO;
>   pool->context = context;
>   pool->release = release;
> - slot_size = pool->nr_cpus * sizeof(struct objpool_slot);
> + slot_size = nr_cpu_ids * sizeof(struct objpool_slot);
>   pool->cpu_slots = kzalloc(slot_size, pool->gfp);
>   if (!pool->cpu_slots)
>   return -ENOMEM;
> -- 
> 2.43.0
> 
> 



Re: [PATCH v5 2/7] dt-bindings: remoteproc: Add compatibility for TEE support

2024-05-28 Thread Mathieu Poirier
On Tue, May 21, 2024 at 10:09:56AM +0200, Arnaud Pouliquen wrote:
> The "st,stm32mp1-m4-tee" compatible is utilized in a system configuration
> where the Cortex-M4 firmware is loaded by the Trusted execution Environment
> (TEE).
> For instance, this compatible is used in both the Linux and OP-TEE
> device-tree:
> - In OP-TEE, a node is defined in the device tree with the
>   st,stm32mp1-m4-tee to support signed remoteproc firmware.
>   Based on DT properties, OP-TEE authenticates, loads, starts, and stops
>   the firmware.

I don't see how firmware can be started and stopped.  Please rework.

> - On Linux, when the compatibility is set, the Cortex-M resets should not
>   be declared in the device tree.

This is a description of "what" is happening and not "why".

More comments to come shortly.

Thanks,
Mathieu

> 
> Signed-off-by: Arnaud Pouliquen 
> Reviewed-by: Rob Herring 
> ---
>  .../bindings/remoteproc/st,stm32-rproc.yaml   | 51 ---
>  1 file changed, 43 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml 
> b/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
> index 370af61d8f28..36ea54016b76 100644
> --- a/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
> +++ b/Documentation/devicetree/bindings/remoteproc/st,stm32-rproc.yaml
> @@ -16,7 +16,12 @@ maintainers:
>  
>  properties:
>compatible:
> -const: st,stm32mp1-m4
> +enum:
> +  - st,stm32mp1-m4
> +  - st,stm32mp1-m4-tee
> +description:
> +  Use "st,stm32mp1-m4" for the Cortex-M4 coprocessor management by 
> non-secure context
> +  Use "st,stm32mp1-m4-tee" for the Cortex-M4 coprocessor management by 
> secure context
>  
>reg:
>  description:
> @@ -142,21 +147,41 @@ properties:
>  required:
>- compatible
>- reg
> -  - resets
>  
>  allOf:
>- if:
>properties:
> -reset-names:
> -  not:
> -contains:
> -  const: hold_boot
> +compatible:
> +  contains:
> +const: st,stm32mp1-m4
>  then:
> +  if:
> +properties:
> +  reset-names:
> +not:
> +  contains:
> +const: hold_boot
> +  then:
> +required:
> +  - st,syscfg-holdboot
> +  else:
> +properties:
> +  st,syscfg-holdboot: false
> +required:
> +  - reset-names
>required:
> -- st,syscfg-holdboot
> -else:
> +- resets
> +
> +  - if:
> +  properties:
> +compatible:
> +  contains:
> +const: st,stm32mp1-m4-tee
> +then:
>properties:
>  st,syscfg-holdboot: false
> +reset-names: false
> +resets: false
>  
>  additionalProperties: false
>  
> @@ -188,5 +213,15 @@ examples:
>st,syscfg-rsc-tbl = < 0x144 0x>;
>st,syscfg-m4-state = < 0x148 0x>;
>  };
> +  - |
> +#include 
> +m4@1000 {
> +  compatible = "st,stm32mp1-m4-tee";
> +  reg = <0x1000 0x4>,
> +<0x3000 0x4>,
> +<0x3800 0x1>;
> +  st,syscfg-rsc-tbl = < 0x144 0x>;
> +  st,syscfg-m4-state = < 0x148 0x>;
> +};
>  
>  ...
> -- 
> 2.25.1
> 



Re: [PATCH] remoteproc: stm32_rproc: Fix mailbox interrupts queuing

2024-05-28 Thread Mathieu Poirier
On Tue, May 21, 2024 at 06:23:16PM +0200, Gwenael Treuveur wrote:
> Manage interrupt coming from coprocessor also when state is
> ATTACHED.
> 
> Fixes: 35bdafda40cc ("remoteproc: stm32_rproc: Add mutex protection for 
> workqueue")
> Signed-off-by: Gwenael Treuveur 
> Acked-by: Arnaud Pouliquen 

I will pickup this patch but this time only - next time all reviews and tagging
need to happend on the mailing list.

Mathieu

> ---
>  drivers/remoteproc/stm32_rproc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/stm32_rproc.c 
> b/drivers/remoteproc/stm32_rproc.c
> index 88623df7d0c3..8c7f7950b80e 100644
> --- a/drivers/remoteproc/stm32_rproc.c
> +++ b/drivers/remoteproc/stm32_rproc.c
> @@ -294,7 +294,7 @@ static void stm32_rproc_mb_vq_work(struct work_struct 
> *work)
>  
>   mutex_lock(>lock);
>  
> - if (rproc->state != RPROC_RUNNING)
> + if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED)
>   goto unlock_mutex;
>  
>   if (rproc_vq_interrupt(rproc, mb->vq_id) == IRQ_NONE)
> 
> base-commit: 4d5ba6ead1dc9fa298d727e92db40cd98564d1ac
> -- 
> 2.25.1
> 



Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-28 Thread Steven Rostedt
On Tue, 28 May 2024 07:51:30 +0300
Ilkka Naulapää  wrote:

> yeah, the cache_from_obj tracing bug (without panic) has been
> displayed quite some time now - maybe even since 6.7.x or so. I could
> try checking a few versions back for this and try bisecting it if I
> can find when this started.
> 

OK, so I don't think the commit your last bisect hit is the cause of
the bug. It added a delay (via RCU) and is causing the real bug to blow
up more.

Can you add this patch to v6.9.2 and hopefully it crashes in a better
location that we can find where the mixup happened.

You may need to add the other commit (too if this doesn't trigger.

Thanks,

-- Steve

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 417c840e6403..7af3f696696d 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -50,6 +50,7 @@ static struct inode *tracefs_alloc_inode(struct super_block 
*sb)
list_add_rcu(>list, _inodes);
spin_unlock_irqrestore(_inode_lock, flags);
 
+   ti->magic = 20240823;
return >vfs_inode;
 }
 
@@ -66,6 +67,7 @@ static void tracefs_free_inode(struct inode *inode)
struct tracefs_inode *ti = get_tracefs(inode);
unsigned long flags;
 
+   BUG_ON(ti->magic != 20240823);
spin_lock_irqsave(_inode_lock, flags);
list_del_rcu(>list);
spin_unlock_irqrestore(_inode_lock, flags);
@@ -271,16 +273,6 @@ static const struct inode_operations 
tracefs_file_inode_operations = {
.setattr= tracefs_setattr,
 };
 
-struct inode *tracefs_get_inode(struct super_block *sb)
-{
-   struct inode *inode = new_inode(sb);
-   if (inode) {
-   inode->i_ino = get_next_ino();
-   simple_inode_init_ts(inode);
-   }
-   return inode;
-}
-
 struct tracefs_mount_opts {
kuid_t uid;
kgid_t gid;
@@ -448,6 +440,17 @@ static const struct super_operations 
tracefs_super_operations = {
.show_options   = tracefs_show_options,
 };
 
+struct inode *tracefs_get_inode(struct super_block *sb)
+{
+   struct inode *inode = new_inode(sb);
+   BUG_ON(sb->s_op != _super_operations);
+   if (inode) {
+   inode->i_ino = get_next_ino();
+   simple_inode_init_ts(inode);
+   }
+   return inode;
+}
+
 /*
  * It would be cleaner if eventfs had its own dentry ops.
  *
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index f704d8348357..dda7d2708e30 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -16,6 +16,7 @@ struct tracefs_inode {
};
/* The below gets initialized with memset_after(ti, 0, vfs_inode) */
struct list_headlist;
+   unsigned long   magic;
unsigned long   flags;
void*private;
 };



Re: (subset) [PATCH v2 0/2] Allow gpio-hog nodes in qcom,pmic-gpio bindings (& dt fixup)

2024-05-28 Thread Bjorn Andersson


On Tue, 09 Apr 2024 20:36:35 +0200, Luca Weiss wrote:
> Resolve the dt validation failure on Nexus 5.
> 
> 

Applied, thanks!

[2/2] ARM: dts: qcom: msm8974-hammerhead: Update gpio hog node name
  commit: 92b9ce5b11d7ba281f5bf0029185d5c891b29344

Best regards,
-- 
Bjorn Andersson 



Re: (subset) [PATCH 0/2] Fix msm8974 apcs syscon compatible

2024-05-28 Thread Bjorn Andersson


On Mon, 08 Apr 2024 21:32:02 +0200, Luca Weiss wrote:
> Finally fix a warning about the apcs-global syscon used on msm8974 that
> has been around forever.
> 
> 

Applied, thanks!

[2/2] ARM: dts: qcom: msm8974: Use proper compatible for APCS syscon
  commit: c133cfc12cd717b72ce534477415446e1c33de47

Best regards,
-- 
Bjorn Andersson 



Re: [PATCH 0/3] tracing: Fix some selftest issues

2024-05-28 Thread Google
On Mon, 27 May 2024 19:29:07 -0400
Steven Rostedt  wrote:

> On Sun, 26 May 2024 19:10:57 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > Hi,
> > 
> > Here is a series of some fixes/improvements for the test modules and boot
> > time selftest of kprobe events. I found a WARNING message with some boot 
> > time selftest configuration, which came from the combination of embedded
> > kprobe generate API tests module and ftrace boot-time selftest. So the main
> > problem is that the test module should not be built-in. But I also think
> > this WARNING message is useless (because there are warning messages already)
> > and the cleanup code is redundant. This series fixes those issues.
> 
> Note, when I enable trace tests as builtin instead of modules, I just
> disable the bootup self tests when it detects this. This helps with
> doing tests via config options than having to add user space code that
> loads modules.
> 
> Could you do something similar?

OK, in that case, I would like to move the test cleanup code in
module_exit function into the end of module_init function. 
It looks there is no reason to split those into 2 parts.

Thank you,

> 
> -- Steve
> 
> 
> > 
> > Thank you,
> > 
> > ---
> > 
> > Masami Hiramatsu (Google) (3):
> >   tracing: Build event generation tests only as modules
> >   tracing/kprobe: Remove unneeded WARN_ON_ONCE() in selftests
> >   tracing/kprobe: Remove cleanup code unrelated to selftest
> > 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH 1/2] objpool: enable inlining objpool_push() and objpool_pop() operations

2024-05-28 Thread Google
Hi,

Sorry for late reply.

On Fri, 10 May 2024 10:20:56 +0200
Vlastimil Babka  wrote:

> On 5/10/24 9:59 AM, wuqiang.matt wrote:
> > On 2024/5/7 21:55, Vlastimil Babka wrote:
>  >>
> >>> + } while (!try_cmpxchg_acquire(>tail, , tail + 1));
> >>> +
> >>> + /* now the tail position is reserved for the given obj */
> >>> + WRITE_ONCE(slot->entries[tail & slot->mask], obj);
> >>> + /* update sequence to make this obj available for pop() */
> >>> + smp_store_release(>last, tail + 1);
> >>> +
> >>> + return 0;
> >>> +}
> >>>   
> >>>   /**
> >>>* objpool_push() - reclaim the object and return back to objpool
> >>> @@ -134,7 +219,19 @@ void *objpool_pop(struct objpool_head *pool);
> >>>* return: 0 or error code (it fails only when user tries to push
> >>>* the same object multiple times or wrong "objects" into objpool)
> >>>*/
> >>> -int objpool_push(void *obj, struct objpool_head *pool);
> >>> +static inline int objpool_push(void *obj, struct objpool_head *pool)
> >>> +{
> >>> + unsigned long flags;
> >>> + int rc;
> >>> +
> >>> + /* disable local irq to avoid preemption & interruption */
> >>> + raw_local_irq_save(flags);
> >>> + rc = __objpool_try_add_slot(obj, pool, raw_smp_processor_id());
> >> 
> >> And IIUC, we could in theory objpool_pop() on one cpu, then later another
> >> cpu might do objpool_push() and cause the latter cpu's pool to go over
> >> capacity? Is there some implicit requirements of objpool users to take care
> >> of having matched cpu for pop and push? Are the current objpool users
> >> obeying this requirement? (I can see the selftests do, not sure about the
> >> actual users).
> >> Or am I missing something? Thanks.
> > 
> > The objects are all pre-allocated along with creation of the new objpool
> > and the total number of objects never exceeds the capacity on local node.
> 
> Aha, I see, the capacity of entries is enough to hold objects from all nodes
> in the most unfortunate case they all end up freed from a single cpu.
> 
> > So objpool_push() would always find an available slot from the ring-array
> > for the given object to insert back. objpool_pop() would try looping all
> > the percpu slots until an object is found or whole objpool is empty.
> 
> So it's correct, but seems rather wasteful to have the whole capacity for
> entries replicated on every cpu? It does make objpool_push() simple and
> fast, but as you say, objpool_pop() still has to search potentially all
> non-local percpu slots, with disabled irqs, which is far from ideal.

For the kretprobe/fprobe use-case, it is important to push (return) object
fast. We can reservce enough number of objects when registering but push
operation will happen always on random CPU.

> 
> And the "abort if the slot was already full" comment for
> objpool_try_add_slot() seems still misleading? Maybe that was your initial
> idea but changed later?

Ah, it should not happen...

> 
> > Currently kretprobe is the only actual usecase of objpool.

Note that fprobe is also using this objpool, but currently I'm working on
integrating fprobe on function-graph tracer[1] which will make fprobe not
using objpool. And also I'm planning to replace kretprobe with the new
fprobe eventually. So if SLUB will use objpool for frontend caching, it
sounds good to me. (Maybe it can speed up the object allocation/free)

> > 
> > I'm testing an updated objpool in our HIDS project for critical pathes,
> > which is widely deployed on servers inside my company. The new version
> > eliminates the raw_local_irq_save and raw_local_irq_restore pair of
> > objpool_push and gains up to 5% of performance boost.
> 
> Mind Ccing me and linux-mm once you are posting that?

Can you add me too?

Thank you,

> 
> Thanks,
> Vlastimil
> 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH v3 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-05-28 Thread Dave Hansen
On 5/17/24 04:06, Dmitrii Kuvaiskii wrote:
...

First, why is SGX so special here?  How is the SGX problem different
than what the core mm code does?

> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -25,6 +25,9 @@
>  /* 'desc' bit marking that the page is being reclaimed. */
>  #define SGX_ENCL_PAGE_BEING_RECLAIMEDBIT(3)
>  
> +/* 'desc' bit marking that the page is being removed. */
> +#define SGX_ENCL_PAGE_BEING_REMOVED  BIT(2)

Second, convince me that this _needs_ a new bit.  Why can't we just have
a bit that effectively means "return EBUSY if you see this bit when
handling a fault".

>  struct sgx_encl_page {
>   unsigned long desc;
>   unsigned long vm_max_prot_bits:8;
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 5d390df21440..de59219ae794 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -1142,6 +1142,7 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
>* Do not keep encl->lock because of dependency on
>* mmap_lock acquired in sgx_zap_enclave_ptes().
>*/
> + entry->desc |= SGX_ENCL_PAGE_BEING_REMOVED;

This also needs a comment, no matter what.



Re: [PATCH v3 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows

2024-05-28 Thread Dave Hansen
On 5/17/24 04:06, Dmitrii Kuvaiskii wrote:
> We wrote a trivial stress test to reproduce the hangs observed in
> real-world applications. The test stresses #PF-based page allocation and
> SGX_IOC_ENCLAVE_REMOVE_PAGES flows in the SGX driver:

This seems like something we'd want in the kernel SGX selftests.



Re: (subset) [PATCH v2 0/3] arm64: dts: qcom: msm8996: enable fastrpc and glink-edge

2024-05-28 Thread Bjorn Andersson


On Thu, 18 Apr 2024 09:44:19 +0300, Dmitry Baryshkov wrote:
> Enable the FastRPC and glink-edge nodes on MSM8996 platform. Tested on
> APQ8096 Dragonboard820c.
> 
> 

Applied, thanks!

[2/3] arm64: dts: qcom: msm8996: add glink-edge nodes
  commit: 56ae780a4387d71dd709895acd95112d01f37fb4
[3/3] arm64: dts: msm8996: add fastrpc nodes
  commit: 1b80b83f893dd69efe3c3bf84cd9f661218ccfc0

Best regards,
-- 
Bjorn Andersson 



Re: [PATCH] rv: Update rv_en(dis)able_monitor doc to match kernel-doc

2024-05-28 Thread Daniel Bristot de Oliveira
On 5/20/24 07:42, Yang Li wrote:
> The patch updates the function documentation comment for
> rv_en(dis)able_monitor to adhere to the kernel-doc specification.
> 
> Signed-off-by: Yang Li 

Acked-by: Daniel Bristot de Oliveira 

Thanks
-- Daniel



Re: [PATCH] [v5] kallsyms: rework symbol lookup return codes

2024-05-28 Thread Geert Uytterhoeven
Hi Arnd,

On Thu, Apr 4, 2024 at 4:52 PM Arnd Bergmann  wrote:
> From: Arnd Bergmann 
>
> Building with W=1 in some configurations produces a false positive
> warning for kallsyms:
>
> kernel/kallsyms.c: In function '__sprint_symbol.isra':
> kernel/kallsyms.c:503:17: error: 'strcpy' source argument is the same as 
> destination [-Werror=restrict]
>   503 | strcpy(buffer, name);
>   | ^~~~
>
> This originally showed up while building with -O3, but later started
> happening in other configurations as well, depending on inlining
> decisions. The underlying issue is that the local 'name' variable is
> always initialized to the be the same as 'buffer' in the called functions
> that fill the buffer, which gcc notices while inlining, though it could
> see that the address check always skips the copy.
>
> The calling conventions here are rather unusual, as all of the internal
> lookup functions (bpf_address_lookup, ftrace_mod_address_lookup,
> ftrace_func_address_lookup, module_address_lookup and
> kallsyms_lookup_buildid) already use the provided buffer and either return
> the address of that buffer to indicate success, or NULL for failure,
> but the callers are written to also expect an arbitrary other buffer
> to be returned.
>
> Rework the calling conventions to return the length of the filled buffer
> instead of its address, which is simpler and easier to follow as well
> as avoiding the warning. Leave only the kallsyms_lookup() calling conventions
> unchanged, since that is called from 16 different functions and
> adapting this would be a much bigger change.
>
> Link: https://lore.kernel.org/all/20200107214042.855757-1-a...@arndb.de/
> Link: https://lore.kernel.org/lkml/20240326130647.7bfb1...@gandalf.local.home/
> Reviewed-by: Luis Chamberlain 
> Acked-by: Steven Rostedt (Google) 
> Signed-off-by: Arnd Bergmann 
> ---
> v5: fix ftrace_mod_address_lookup return value,
> rebased on top of 2e114248e086 ("bpf: Replace deprecated strncpy with 
> strscpy")
> v4: fix string length
> v3: use strscpy() instead of strlcpy()
> v2: complete rewrite after the first patch was rejected (in 2020). This
> is now one of only two warnings that are in the way of enabling
> -Wextra/-Wrestrict by default.

Aha, commit 06bb7fc0feee32d9 ("kbuild: turn on -Wrestrict by default")
still made v6.10-rc1, without this one...

> Signed-off-by: Arnd Bergmann 

Thanks, this fixes

kernel/kallsyms.c: In function ‘__sprint_symbol.constprop’:
kernel/kallsyms.c:492:17: warning: ‘strcpy’ source argument is the
same as destination [-Werror=restrict]
  492 | strcpy(buffer, name);
  | ^~~~

I am seeing with shmobile_defconfig and gcc version 11.4.0 (Ubuntu
11.4.0-1ubuntu1~22.04).

Tested-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds



Re: [linus:master] [mm] d99e3140a4: BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report

2024-05-28 Thread Miaohe Lin
On 2024/5/28 15:43, David Hildenbrand wrote:
> Am 28.05.24 um 09:11 schrieb kernel test robot:
>>
>>
>> Hello,
>>
>> kernel test robot noticed 
>> "BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report" on:
>>
>> commit: d99e3140a4d33e26066183ff727d8f02f56bec64 ("mm: turn 
>> folio_test_hugetlb into a PageType")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> [test failed on linus/master  c760b3725e52403dc1b28644fb09c47a83cacea6]
>> [test failed on linux-next/master 3689b0ef08b70e4e03b82ebd37730a03a672853a]
>>
>> in testcase: trinity
>> version: trinity-i386-abe9de86-1_20230429
>> with following parameters:
>>
>> runtime: 300s
>> group: group-04
>> nr_groups: 5
>>
>>
>>
>> compiler: gcc-13
>> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>>
>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>
>>
>> we noticed this issue does not always happen. we also noticed there are
>> different random KCSAN issues for both this commit and its parent. but below
>> 4 only happen on this commit with not small rate and keep clean on parent.
>>
> 
> Likely that's just a page_type check racing against concurrent
> mapcount changes.
> 
> In __folio_rmap_sanity_checks() we check
> VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
> 
> To make sure we don't get hugetlb folios in the wrong rmap code path. That
> can easily race with concurrent mapcount changes, just like any other
> page_type checks that end up in folio_test_type/page_has_type e.g., from
> PFN walkers.
> 
> Load tearing in these functions shouldn't really result in false positives
> (what we care about), but READ_ONCE shouldn't hurt or make a difference.
> 
> 
> From b03dc9bf27571442d886d8da624a4e4f737433f2 Mon Sep 17 00:00:00 2001
> From: David Hildenbrand 
> Date: Tue, 28 May 2024 09:37:20 +0200
> Subject: [PATCH] mm: read page_type using READ_ONCE
> 
> KCSAN complains about possible data races: while we check for a
> page_type -- for example for sanity checks -- we might concurrently
> modify the mapcount that overlays page_type.
> 
> Let's use READ_ONCE to avoid laod tearing (shouldn't make a difference)
> and to make KCSAN happy.
> 
> Note: nothing should really be broken besides wrong KCSAN complaints.
> 
> Reported-by: kernel test robot 
> Closes: https://lore.kernel.org/oe-lkp/202405281431.c46a3be9-...@intel.com
> Signed-off-by: David Hildenbrand 

LGTM. Thanks for fixing.

Reviewed-by: Miaohe Lin 
Thanks.
.




Re: [PATCH v2 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP

2024-05-28 Thread Krzysztof Kozlowski
On 28/05/2024 09:26, Bartosz Golaszewski wrote:
> On Mon, May 27, 2024 at 10:56 AM Krzysztof Kozlowski  wrote:
>>
>> On 27/05/2024 10:43, Bartosz Golaszewski wrote:
>>> From: Bartosz Golaszewski 
>>>
>>> Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and
>>> GPDSP1 on the SA8775p SoC.
>>>
>>> Co-developed-by: Tengfei Fan 
>>
>> Missing SoB.
>>
> 
> Does it though? The patch never passed through Tengfei's hands, I just
> wanted to give him credit for the work modifying the sm8550-pas
> bindings.

Then what was co-developed by Tengfei? If nothing, then indeed no SoB
but also no Co-developed-by. Docs are clear here: every Co-developed-by
*must* be followed by SoB.

Best regards,
Krzysztof




Re: [linus:master] [mm] d99e3140a4: BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report

2024-05-28 Thread Oscar Salvador
On Tue, May 28, 2024 at 09:43:39AM +0200, David Hildenbrand wrote:
> Likely that's just a page_type check racing against concurrent
> mapcount changes.
> 
> In __folio_rmap_sanity_checks() we check
>   VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);

Yeah, and that "collides" with

last = atomic_add_negative(-1, >_mapcount)

from __folio_remove_rmap.

> To make sure we don't get hugetlb folios in the wrong rmap code path. That
> can easily race with concurrent mapcount changes, just like any other
> page_type checks that end up in folio_test_type/page_has_type e.g., from
> PFN walkers.
> 
> Load tearing in these functions shouldn't really result in false positives
> (what we care about), but READ_ONCE shouldn't hurt or make a difference.
> 
> 
> From b03dc9bf27571442d886d8da624a4e4f737433f2 Mon Sep 17 00:00:00 2001
> From: David Hildenbrand 
> Date: Tue, 28 May 2024 09:37:20 +0200
> Subject: [PATCH] mm: read page_type using READ_ONCE
> 
> KCSAN complains about possible data races: while we check for a
> page_type -- for example for sanity checks -- we might concurrently
> modify the mapcount that overlays page_type.
> 
> Let's use READ_ONCE to avoid laod tearing (shouldn't make a difference)
> and to make KCSAN happy.
> 
> Note: nothing should really be broken besides wrong KCSAN complaints.
> 
> Reported-by: kernel test robot 
> Closes: https://lore.kernel.org/oe-lkp/202405281431.c46a3be9-...@intel.com
> Signed-off-by: David Hildenbrand 

Reviewed-by: Oscar Salvador 

Thanks!

-- 
Oscar Salvador
SUSE Labs



Re: [linus:master] [mm] d99e3140a4: BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report

2024-05-28 Thread David Hildenbrand

Am 28.05.24 um 09:11 schrieb kernel test robot:



Hello,

kernel test robot noticed 
"BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report" on:

commit: d99e3140a4d33e26066183ff727d8f02f56bec64 ("mm: turn folio_test_hugetlb into 
a PageType")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master  c760b3725e52403dc1b28644fb09c47a83cacea6]
[test failed on linux-next/master 3689b0ef08b70e4e03b82ebd37730a03a672853a]

in testcase: trinity
version: trinity-i386-abe9de86-1_20230429
with following parameters:

runtime: 300s
group: group-04
nr_groups: 5



compiler: gcc-13
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


we noticed this issue does not always happen. we also noticed there are
different random KCSAN issues for both this commit and its parent. but below
4 only happen on this commit with not small rate and keep clean on parent.



Likely that's just a page_type check racing against concurrent
mapcount changes.

In __folio_rmap_sanity_checks() we check
VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);

To make sure we don't get hugetlb folios in the wrong rmap code path. That
can easily race with concurrent mapcount changes, just like any other
page_type checks that end up in folio_test_type/page_has_type e.g., from
PFN walkers.

Load tearing in these functions shouldn't really result in false positives
(what we care about), but READ_ONCE shouldn't hurt or make a difference.


From b03dc9bf27571442d886d8da624a4e4f737433f2 Mon Sep 17 00:00:00 2001
From: David Hildenbrand 
Date: Tue, 28 May 2024 09:37:20 +0200
Subject: [PATCH] mm: read page_type using READ_ONCE

KCSAN complains about possible data races: while we check for a
page_type -- for example for sanity checks -- we might concurrently
modify the mapcount that overlays page_type.

Let's use READ_ONCE to avoid laod tearing (shouldn't make a difference)
and to make KCSAN happy.

Note: nothing should really be broken besides wrong KCSAN complaints.

Reported-by: kernel test robot 
Closes: https://lore.kernel.org/oe-lkp/202405281431.c46a3be9-...@intel.com
Signed-off-by: David Hildenbrand 
---
 include/linux/page-flags.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 104078afe0b1..e46ccbb9aa58 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -955,9 +955,9 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
 #define PG_slab0x1000
 
 #define PageType(page, flag)		\

-   ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
+   ((READ_ONCE(page->page_type) & (PAGE_TYPE_BASE | flag)) == 
PAGE_TYPE_BASE)
 #define folio_test_type(folio, flag)   \
-   ((folio->page.page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
+   ((READ_ONCE(folio->page.page_type) & (PAGE_TYPE_BASE | flag))  == 
PAGE_TYPE_BASE)
 
 static inline int page_type_has_type(unsigned int page_type)

 {
@@ -966,7 +966,7 @@ static inline int page_type_has_type(unsigned int page_type)
 
 static inline int page_has_type(const struct page *page)

 {
-   return page_type_has_type(page->page_type);
+   return page_type_has_type(READ_ONCE(page->page_type));
 }
 
 #define FOLIO_TYPE_OPS(lname, fname)	\

--
2.45.1


--
Thanks,

David / dhildenb




Re: [PATCH v2 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP

2024-05-28 Thread Bartosz Golaszewski
On Mon, May 27, 2024 at 10:56 AM Krzysztof Kozlowski  wrote:
>
> On 27/05/2024 10:43, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski 
> >
> > Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and
> > GPDSP1 on the SA8775p SoC.
> >
> > Co-developed-by: Tengfei Fan 
>
> Missing SoB.
>

Does it though? The patch never passed through Tengfei's hands, I just
wanted to give him credit for the work modifying the sm8550-pas
bindings.

Bart



Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Thorsten Scherer
Hello,

On Mon, May 27, 2024 at 01:51:49PM +0100, Russell King (Oracle) wrote:
> On Mon, May 27, 2024 at 02:28:59PM +0200, Uwe Kleine-König wrote:
> > On Mon, May 27, 2024 at 08:56:16AM +0100, Russell King (Oracle) wrote:
> > > On Mon, May 27, 2024 at 09:43:41AM +0200, Thorsten Scherer wrote:
> > > > Hello,
> > > > 
> > > > in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe 
> > > > pointed me to
> > > > this thread.  With the proposed code change applied the procedure
> > > > 
> > > > # set to some known good (randomly guessed) filter function and 
> > > > enable function_graph
> > > > echo mtdblock_open > set_ftrace_filter
> > > > echo function_graph > current_tracer
> > > > 
> > > > # walk available filter funcs
> > > > cat available_filter_functions | while read f; do echo $f | tee -a 
> > > > set_ftrace_filter; sleep 1; done
> > > > 
> > > > produces the following output
> > > > 
> > > > [  159.832173] Insufficient stack space to handle exception!
> > > > [  159.832241] Task stack: [0xc8e44000..0xc8e46000]
> > > > [  159.842701] IRQ stack:  [0xc880..0xc8802000]
> > > > [  159.847712] Overflow stack: [0xc1934000..0xc1935000]
> > > > [  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
> > > > [  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
> > > > industrialio_triggered_buffer kfifo_buf capture_events_irq 
> > > > capture_events iio_trig_hrtimer industrialio_sw_trigger 
> > > > industrialio_configfs dm_mod
> > > > [  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
> > > > [  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
> > > > [  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
> > > > [  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
> > > > [  159.898376] pc : []lr : []psr: 6093
> > > > [  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
> > > > [  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
> > > > [  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : 
> > > > c8e440ac
> > > > [  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : 
> > > > c8e44038
> > > > [  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  
> > > > Segment none
> > > > [  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
> > > > [  159.941436] Register r0 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  159.952253] Register r1 information: non-slab/vmalloc memory
> > > > [  159.957988] Register r2 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  159.968791] Register r3 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  159.979592] Register r4 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  159.990391] Register r5 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  160.001187] Register r6 information: non-paged memory
> > > > [  160.006303] Register r7 information: non-paged memory
> > > > [  160.011415] Register r8 information: non-slab/vmalloc memory
> > > > [  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 
> > > > pointer offset 0 size 32
> > > > [  160.025718] Register r10 information: non-slab/vmalloc memory
> > > > [  160.031530] Register r11 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  160.042422] Register r12 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
> > > > [  160.058955] Stack: (0xc8e44018 to 0xc8e46000)
> > > 
> > > No backtrace? No Code: line? I'm guessing there was an attempt to ftrace
> > > a function involving the ftrace tracing infrastructure, which is why 8KB
> > > of stack has been gobbled up. It could be
> > > copy_from_kernel_nofault_allowed() but it would be useful to have at
> > > least some extract of the backtrace showing the recursive cycle to
> > > confirm, otherwise there is nothing in your report to confirm. As I'm
> > > not a ftrace user myself, this isn't something I'd test for, so having
> > > a full report would be useful.
> > 
> > Is not having a backtrace related to ftrace_return_address() returning
> > NULL, as Arnd pointed out in
> > https://lore.kernel.org/linux-arm-kernel/36cd10de-c51c-40ff-90e8-714954060...@app.fastmail.com/
> > ?
> 
> Unlikely - the stack and code lines are also missing. I think the
> submitter truncated the oops which is highly likely given that it
> would've dumped all 8kB of the stack in hex, and the trace and
> code lines would be after that.

sorry for causing additional friction by my imprecise description.
Indeed, this was the whole oops 

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Ilkka Naulapää
yeah, the cache_from_obj tracing bug (without panic) has been
displayed quite some time now - maybe even since 6.7.x or so. I could
try checking a few versions back for this and try bisecting it if I
can find when this started.

--Ilkka

On Tue, May 28, 2024 at 1:31 AM Steven Rostedt  wrote:
>
> On Fri, 24 May 2024 12:50:08 +0200
> "Linux regression tracking (Thorsten Leemhuis)"  
> wrote:
>
> > > - Affected Versions: Before kernel version 6.8.10, the bug caused a
> > > quick display of a kernel trace dump before the shutdown/reboot
> > > completed. Starting from version 6.8.10 and continuing into version
> > > 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> > > preventing the shutdown or reboot from completing and leaving the
> > > machine stuck.
>
> You state "Before kernel version 6.8.10, the bug caused ...". Does that
> mean that a bug was happening before v6.8.10? But did not cause a panic?
>
> I just noticed your second screen shot from your report, and it has:
>
>  "cache_from_obj: Wrong slab cache, tracefs_inode_cache but object is from 
> inode_cache"
>
> So somehow an tracefs_inode was allocated from the inode_cache and is
> now being freed by the tracefs_inode logic? Did this happen before
> 6.8.10? If so, this code could just be triggering an issue from an
> unrelated bug.
>
> Thanks,
>
> -- Steve



Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Ilkka Naulapää
I tried 6.10-rc1 and it still ends up to panic

--Ilkka


On Tue, May 28, 2024 at 12:44 AM Steven Rostedt  wrote:
>
> On Mon, 27 May 2024 20:14:42 +0200
> Greg KH  wrote:
>
> > On Mon, May 27, 2024 at 07:40:21PM +0300, Ilkka Naulapää wrote:
> > > Hi Steven,
> > >
> > > I took some time and bisected the 6.8.9 - 6.8.10 and git gave the
> > > panic inducing commit:
> > >
> > > 414fb08628143 (tracefs: Reset permissions on remount if permissions are 
> > > options)
> > >
> > > I reverted that commit to 6.9.2 and now it only serves the trace but
> > > the panic is gone. But I can live with it.
> >
> > Steven, should we revert that?
> >
> > Or is there some other change that we should take to resolve this?
> >
>
> Before we revert it (as it may be a bug in mainline), Ilkka, can you
> test v6.10-rc1?  If it exists there, it will let me know whether or not
> I missed something.
>
> Thanks,
>
> -- Steve



Re: (subset) [PATCH v2 0/2] Add Samsung Galaxy Note 3 support

2024-05-27 Thread Bjorn Andersson


On Thu, 14 Mar 2024 20:00:13 +0100, Luca Weiss wrote:
> Add the dts for "hlte" which is a phablet from 2013.
> 
> 

Applied, thanks!

[2/2] ARM: dts: qcom: msm8974: Add Samsung Galaxy Note 3
  commit: b4f6c63bf34d8da1b769483bb1f4a603c53896ce

Best regards,
-- 
Bjorn Andersson 



Re: (subset) [PATCH 0/2] Add basic APR sound support for SC7280 SoC

2024-05-27 Thread Bjorn Andersson


On Fri, 10 May 2024 14:27:07 +0200, Luca Weiss wrote:
> Validated on Fairphone 5 (QCM6490) smartphone by using DisplayPort over
> USB-C audio, connected to a TV, with a basic UCM to enable
> 'DISPLAY_PORT_RX Audio Mixer MultiMedia1':
> https://gitlab.com/postmarketOS/pmaports/-/tree/master/device/testing/device-fairphone-fp5/ucm
> 
> Unfortunately all the device-specific things can't be enabled yet
> upstream as detailed in the second patch, but the SoC parts should be
> good to go.
> 
> [...]

Applied, thanks!

[1/2] arm64: dts: qcom: sc7280: Add APR nodes for sound
  commit: f44da5d8722de348ff2eb8b206c69b52809c1772

Best regards,
-- 
Bjorn Andersson 



Re: (subset) [PATCH v3 0/2] Add samsung-milletwifi

2024-05-27 Thread Bjorn Andersson


On Mon, 19 Feb 2024 22:43:15 +0100, Bryant Mairs wrote:
> This series adds support for samsung-milletwifi, the smaller cousin
> to samsung-matisselte. I've used the manufacturer's naming convention
> for consistency.
> 
> Hardware currently supported:
> - Display
> - Cover detection
> - Physical buttons
> - Touchscreen and touchkeys
> - Accelerometer
> 
> [...]

Applied, thanks!

[2/2] ARM: dts: qcom: Add support for Samsung Galaxy Tab 4 8.0 Wi-Fi
  commit: 49b9981a0ecae2bbb298d8b0c2b8058220038691

Best regards,
-- 
Bjorn Andersson 



Re: [PATCH v4 0/4] MSM8976 MDSS/GPU/WCNSS support

2024-05-27 Thread Bjorn Andersson


On Wed, 08 May 2024 18:34:33 +0200, Adam Skladowski wrote:
> This patch series provide support for display subsystem, gpu
> and also adds wireless connectivity subsystem support.
> 
> Changes since v3
> 
> 1. Minor styling fixes
> 2. Converted qcom,ipc into mailbox on wcnss patch
> 
> [...]

Applied, thanks!

[1/4] arm64: dts: qcom: msm8976: Add IOMMU nodes
  commit: 418c2ffd7df9bfc25c21172bd881b78d7569fb4d
[2/4] arm64: dts: qcom: msm8976: Add MDSS nodes
  commit: b0516dbf8e218dede2fd2837ca82dccd9cdcdafc
[3/4] arm64: dts: qcom: msm8976: Add Adreno GPU
  commit: 00e67d8e80f06bb848a3dd516d06e2f040b7d8f2
[4/4] arm64: dts: qcom: msm8976: Add WCNSS node
  commit: 45878973229a93f0f42aa048ac8c6223af010082

Best regards,
-- 
Bjorn Andersson 



Re: [PATCH v2 4/5] arm64: dts: qcom: sa8775p: add ADSP, CDSP and GPDSP nodes

2024-05-27 Thread Tengfei Fan




On 5/27/2024 4:43 PM, Bartosz Golaszewski wrote:

From: Tengfei Fan 

Add nodes for remoteprocs: ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 for
SA8775p SoCs.

Signed-off-by: Tengfei Fan 
Co-developed-by: Bartosz Golaszewski 
Signed-off-by: Bartosz Golaszewski 
---
  arch/arm64/boot/dts/qcom/sa8775p.dtsi | 332 ++
  1 file changed, 332 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sa8775p.dtsi 
b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
index 31de73594839..5c0b61a5624b 100644
--- a/arch/arm64/boot/dts/qcom/sa8775p.dtsi
+++ b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
@@ -10,6 +10,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  
@@ -544,6 +545,121 @@ cpucp_fw_mem: cpucp-fw@db20 {

};
};
  
+	smp2p-adsp {

+   compatible = "qcom,smp2p";
+   qcom,smem = <443>, <429>;
+   interrupts-extended = < IPCC_CLIENT_LPASS
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_LPASS IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <2>;
+
+   smp2p_adsp_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_adsp_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-cdsp0 {
+   compatible = "qcom,smp2p";
+   qcom,smem = <94>, <432>;
+   interrupts-extended = < IPCC_CLIENT_CDSP
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_CDSP IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <5>;
+
+   smp2p_cdsp0_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_cdsp0_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-cdsp1 {
+   compatible = "qcom,smp2p";
+   qcom,smem = <617>, <616>;
+   interrupts-extended = < IPCC_CLIENT_NSP1
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_NSP1 IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <12>;
+
+   smp2p_cdsp1_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_cdsp1_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-gpdsp0 {
+   compatible = "qcom,smp2p";
+   qcom,smem = <617>, <616>;
+   interrupts-extended = < IPCC_CLIENT_GPDSP0
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_GPDSP0 IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <17>;
+
+   smp2p_gpdsp0_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_gpdsp0_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-gpdsp1 {
+   compatible = "qcom,smp2p";
+   qcom,smem = <617>, <616>;
+   interrupts-extended = < IPCC_CLIENT_GPDSP1
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_GPDSP1 IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <18>;
+
+   smp2p_gpdsp1_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_gpdsp1_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";

Re: [PATCH v2 3/5] remoteproc: qcom_q6v5_pas: Add support for SA8775p ADSP, CDSP and GPDSP

2024-05-27 Thread Tengfei Fan




On 5/27/2024 4:43 PM, Bartosz Golaszewski wrote:

From: Tengfei Fan 

Add support for PIL loading on ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 on
SA8775p SoCs.

Signed-off-by: Tengfei Fan 
Co-developed-by: Bartosz Golaszewski 
Signed-off-by: Bartosz Golaszewski 
---
  drivers/remoteproc/qcom_q6v5_pas.c | 92 ++
  1 file changed, 92 insertions(+)

diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
b/drivers/remoteproc/qcom_q6v5_pas.c
index 54d8005d40a3..16053aa99298 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -820,6 +820,23 @@ static const struct adsp_data adsp_resource_init = {
.ssctl_id = 0x14,
  };
  
+static const struct adsp_data sa8775p_adsp_resource = {

+   .crash_reason_smem = 423,
+   .firmware_name = "adsp.mdt",
+   .pas_id = 1,
+   .minidump_id = 5,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "lcx",
+   "lmx",
+   NULL
+   },
+   .load_state = "adsp",
+   .ssr_name = "lpass",
+   .sysmon_name = "adsp",
+   .ssctl_id = 0x14,
+};
+
  static const struct adsp_data sdm845_adsp_resource_init = {
.crash_reason_smem = 423,
.firmware_name = "adsp.mdt",
@@ -933,6 +950,42 @@ static const struct adsp_data cdsp_resource_init = {
.ssctl_id = 0x17,
  };
  
+static const struct adsp_data sa8775p_cdsp0_resource = {

+   .crash_reason_smem = 601,
+   .firmware_name = "cdsp0.mdt",
+   .pas_id = 18,
+   .minidump_id = 7,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "cx",
+   "mxc",
+   "nsp0",


s/nsp0/nsp/


+   NULL
+   },
+   .load_state = "cdsp",
+   .ssr_name = "cdsp",
+   .sysmon_name = "cdsp",
+   .ssctl_id = 0x17,
+};
+
+static const struct adsp_data sa8775p_cdsp1_resource = {
+   .crash_reason_smem = 633,
+   .firmware_name = "cdsp1.mdt",
+   .pas_id = 30,
+   .minidump_id = 20,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "cx",
+   "mxc",
+   "nsp1",


s/nsp1/nsp/


+   NULL
+   },
+   .load_state = "nsp",
+   .ssr_name = "cdsp1",
+   .sysmon_name = "cdsp1",
+   .ssctl_id = 0x20,
+};
+
  static const struct adsp_data sdm845_cdsp_resource_init = {
.crash_reason_smem = 601,
.firmware_name = "cdsp.mdt",
@@ -1074,6 +1127,40 @@ static const struct adsp_data sm8350_cdsp_resource = {
.ssctl_id = 0x17,
  };
  
+static const struct adsp_data sa8775p_gpdsp0_resource = {

+   .crash_reason_smem = 640,
+   .firmware_name = "gpdsp0.mdt",
+   .pas_id = 39,
+   .minidump_id = 21,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "cx",
+   "mxc",
+   NULL
+   },
+   .load_state = "gpdsp0",
+   .ssr_name = "gpdsp0",
+   .sysmon_name = "gpdsp0",
+   .ssctl_id = 0x21,
+};
+
+static const struct adsp_data sa8775p_gpdsp1_resource = {
+   .crash_reason_smem = 641,
+   .firmware_name = "gpdsp1.mdt",
+   .pas_id = 40,
+   .minidump_id = 22,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "cx",
+   "mxc",
+   NULL
+   },
+   .load_state = "gpdsp1",
+   .ssr_name = "gpdsp1",
+   .sysmon_name = "gpdsp1",
+   .ssctl_id = 0x22,
+};
+
  static const struct adsp_data mpss_resource_init = {
.crash_reason_smem = 421,
.firmware_name = "modem.mdt",
@@ -1315,6 +1402,11 @@ static const struct of_device_id adsp_of_match[] = {
{ .compatible = "qcom,qcs404-adsp-pas", .data = _resource_init },
{ .compatible = "qcom,qcs404-cdsp-pas", .data = _resource_init },
{ .compatible = "qcom,qcs404-wcss-pas", .data = _resource_init },
+   { .compatible = "qcom,sa8775p-adsp-pas", .data = 
_adsp_resource},
+   { .compatible = "qcom,sa8775p-cdsp0-pas", .data = 
_cdsp0_resource},
+   { .compatible = "qcom,sa8775p-cdsp1-pas", .data = 
_cdsp1_resource},
+   { .compatible = "qcom,sa8775p-gpdsp0-pas", .data = 
_gpdsp0_resource},
+   { .compatible = "qcom,sa8775p-gpdsp1-pas", .data = 
_gpdsp1_resource},
{ .compatible = "qcom,sc7180-adsp-pas", .data = _adsp_resource},
{ .compatible = "qcom,sc7180-mpss-pas", .data = _resource_init},
{ .compatible = "qcom,sc7280-adsp-pas", .data = _adsp_resource},



--
Thx and BRs,
Tengfei Fan



Re: [PATCH v2 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP

2024-05-27 Thread Tengfei Fan




On 5/27/2024 4:56 PM, Krzysztof Kozlowski wrote:

On 27/05/2024 10:43, Bartosz Golaszewski wrote:

From: Bartosz Golaszewski 

Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and
GPDSP1 on the SA8775p SoC.

Co-developed-by: Tengfei Fan 


Missing SoB.


Signed-off-by: Bartosz Golaszewski 


...



+
+allOf:
+  - $ref: /schemas/remoteproc/qcom,pas-common.yaml#
+
+  - if:
+  properties:
+compatible:
+  enum:
+- qcom,sa8775p-adsp-pas
+then:
+  properties:
+power-domains:
+  items:
+- description: LCX power domain
+- description: LMX power domain
+power-domain-names:
+  items:
+- const: lcx
+- const: lmx
+
+  - if:
+  properties:
+compatible:
+  enum:
+- qcom,sa8775p-cdsp-pas


cdsp0


+then:
+  properties:
+power-domains:
+  items:
+- description: CX power domain
+- description: MXC power domain
+- description: NSP0 power domain
+power-domain-names:
+  items:
+- const: cx
+- const: mxc
+- const: nsp0


Shouldn't this be just nsp, so both cdsp0 and cdsp1 entries can be
unified? That's the power domain from the device point of view, so the
device expects to be in some NSP domain, not explicitly NSPn.


Both cdsp0 and cdsp1 entries can uniformly use nsp.




Best regards,
Krzysztof



--
Thx and BRs,
Tengfei Fan



Re: [PATCH v2] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-27 Thread Tatsuya S
On Mon, May 27, 2024 at 07:49:14PM GMT, Steven Rostedt wrote:
> On Mon, 27 May 2024 18:44:56 +0900
> Tatsuya S  wrote:
> 
> > On setting set_ftrace_pid, a extra entry generated by ftrace_pid_func()
> > is shown on stack trace(CONFIG_UNWINDER_FRAME_POINTER=y).
> > 
> > [004] .68.459382: 
> >  => 0xa00090af
> >  => ksys_read
> >  => __x64_sys_read
> >  => x64_sys_call
> >  => do_syscall_64
> >  => entry_SYSCALL_64_after_hwframe  
> > 
> > To resolve this issue, increment skip count
> > in function_stack_trace_call() if pids are set.
> 
> Just a note, this isn't a "fix" but simply an improvement in output.
> I'm happy to take this (after testing and more reviewing), but it will
> be for the next merge window, and with a different subject line.
> 
>   "ftrace: Hide one more entry in stack trace when ftrace_pid is enabled"
> 
> Or something like that.
> 
> Thanks,
> 
> -- Steve
I will send patch with new subject line.

Thank you.

Tatsuya



Re: [PATCH v2 2/2] selftests/user_events: Add non-spacing separator check

2024-05-27 Thread Google
On Tue, 23 Apr 2024 16:23:38 +
Beau Belgrave  wrote:

> The ABI documentation indicates that field separators do not need a
> space between them, only a ';'. When no spacing is used, the register
> must work. Any subsequent register, with or without spaces, must match
> and not return -EADDRINUSE.
> 
> Add a non-spacing separator case to our self-test register case to ensure
> it works going forward.
> 

Looks good to me.

Acked-by: Masami Hiramatsu (Google) 

Thanks!

> Signed-off-by: Beau Belgrave 
> ---
>  tools/testing/selftests/user_events/ftrace_test.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/tools/testing/selftests/user_events/ftrace_test.c 
> b/tools/testing/selftests/user_events/ftrace_test.c
> index dcd7509fe2e0..0bb46793dcd4 100644
> --- a/tools/testing/selftests/user_events/ftrace_test.c
> +++ b/tools/testing/selftests/user_events/ftrace_test.c
> @@ -261,6 +261,12 @@ TEST_F(user, register_events) {
>   ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, ));
>   ASSERT_EQ(0, reg.write_index);
>  
> + /* Register without separator spacing should still match */
> + reg.enable_bit = 29;
> + reg.name_args = (__u64)"__test_event u32 field1;u32 field2";
> + ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, ));
> + ASSERT_EQ(0, reg.write_index);
> +
>   /* Multiple registers to same name but different args should fail */
>   reg.enable_bit = 29;
>   reg.name_args = (__u64)"__test_event u32 field1;";
> @@ -288,6 +294,8 @@ TEST_F(user, register_events) {
>   ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, ));
>   unreg.disable_bit = 30;
>   ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, ));
> + unreg.disable_bit = 29;
> + ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, ));
>  
>   /* Delete should have been auto-done after close and unregister */
>   close(self->data_fd);
> -- 
> 2.34.1
> 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH v2] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-27 Thread Steven Rostedt
On Mon, 27 May 2024 18:44:56 +0900
Tatsuya S  wrote:

> On setting set_ftrace_pid, a extra entry generated by ftrace_pid_func()
> is shown on stack trace(CONFIG_UNWINDER_FRAME_POINTER=y).
> 
> [004] .68.459382: 
>  => 0xa00090af
>  => ksys_read
>  => __x64_sys_read
>  => x64_sys_call
>  => do_syscall_64
>  => entry_SYSCALL_64_after_hwframe  
> 
> To resolve this issue, increment skip count
> in function_stack_trace_call() if pids are set.

Just a note, this isn't a "fix" but simply an improvement in output.
I'm happy to take this (after testing and more reviewing), but it will
be for the next merge window, and with a different subject line.

  "ftrace: Hide one more entry in stack trace when ftrace_pid is enabled"

Or something like that.

Thanks,

-- Steve



Re: [PATCH 2/2] ring-buffer: Fix a race between readers and resize checks

2024-05-27 Thread Steven Rostedt
On Mon, 27 May 2024 11:36:55 +0200
Petr Pavlu  wrote:

> >>  static void rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
> >>  {
> >> @@ -2200,8 +2205,13 @@ int ring_buffer_resize(struct trace_buffer *buffer, 
> >> unsigned long size,
> >> */
> >>synchronize_rcu();
> >>for_each_buffer_cpu(buffer, cpu) {
> >> +  unsigned long flags;
> >> +
> >>cpu_buffer = buffer->buffers[cpu];
> >> +  raw_spin_lock_irqsave(_buffer->reader_lock, flags);
> >>rb_check_pages(cpu_buffer);
> >> +  raw_spin_unlock_irqrestore(_buffer->reader_lock,
> >> + flags);  
> > 
> > Putting my RT hat on, I really don't like the above fix. The
> > rb_check_pages() iterates all subbuffers which makes the time interrupts
> > are disabled non-deterministic.  
> 
> I see, this applies also to the same rb_check_pages() validation invoked
> from ring_buffer_read_finish().
> 
> > 
> > Instead, I would rather have something where we disable readers while we do
> > the check, and re-enable them.
> > 
> > raw_spin_lock_irqsave(_buffer->reader_lock, flags);
> > cpu_buffer->read_disabled++;
> > raw_spin_unlock_irqrestore(_buffer->reader_lock, 
> > flags);
> > 
> > // Also, don't put flags on a new line. We are allow to go 100 characters 
> > now.  
> 
> Noted.
> 
> > 
> > 
> > rb_check_pages(cpu_buffer);
> > raw_spin_lock_irqsave(_buffer->reader_lock, flags);
> > cpu_buffer->read_disabled--;
> > raw_spin_unlock_irqrestore(_buffer->reader_lock, 
> > flags);
> > 
> > Or something like that. Yes, that also requires creating a new
> > "read_disabled" field in the ring_buffer_per_cpu code.  
> 
> I think this would work but I'm personally not immediately sold on this
> approach. If I understand the idea correctly, readers should then check
> whether cpu_buffer->read_disabled is set and bail out with some error if
> that is the case. The rb_check_pages() function is only a self-check
> code and as such, I feel it doesn't justify disrupting readers with
> a new error condition and adding more complex locking.

Honestly, this code was never made for more than one reader per
cpu_buffer. I'm perfectly fine if all check_pages() causes other
readers to the same per_cpu buffer to get -EBUSY.

Do you really see this being a problem? What use case is there for
hitting the check_pages() and reading the same cpu buffer at the same
time?

-- Steve



Re: [PATCH 0/3] tracing: Fix some selftest issues

2024-05-27 Thread Steven Rostedt
On Sun, 26 May 2024 19:10:57 +0900
"Masami Hiramatsu (Google)"  wrote:

> Hi,
> 
> Here is a series of some fixes/improvements for the test modules and boot
> time selftest of kprobe events. I found a WARNING message with some boot 
> time selftest configuration, which came from the combination of embedded
> kprobe generate API tests module and ftrace boot-time selftest. So the main
> problem is that the test module should not be built-in. But I also think
> this WARNING message is useless (because there are warning messages already)
> and the cleanup code is redundant. This series fixes those issues.

Note, when I enable trace tests as builtin instead of modules, I just
disable the bootup self tests when it detects this. This helps with
doing tests via config options than having to add user space code that
loads modules.

Could you do something similar?

-- Steve


> 
> Thank you,
> 
> ---
> 
> Masami Hiramatsu (Google) (3):
>   tracing: Build event generation tests only as modules
>   tracing/kprobe: Remove unneeded WARN_ON_ONCE() in selftests
>   tracing/kprobe: Remove cleanup code unrelated to selftest
> 



Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Steven Rostedt
On Fri, 24 May 2024 12:50:08 +0200
"Linux regression tracking (Thorsten Leemhuis)"  
wrote:

> > - Affected Versions: Before kernel version 6.8.10, the bug caused a
> > quick display of a kernel trace dump before the shutdown/reboot
> > completed. Starting from version 6.8.10 and continuing into version
> > 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> > preventing the shutdown or reboot from completing and leaving the
> > machine stuck.

You state "Before kernel version 6.8.10, the bug caused ...". Does that
mean that a bug was happening before v6.8.10? But did not cause a panic?

I just noticed your second screen shot from your report, and it has:

 "cache_from_obj: Wrong slab cache, tracefs_inode_cache but object is from 
inode_cache"

So somehow an tracefs_inode was allocated from the inode_cache and is
now being freed by the tracefs_inode logic? Did this happen before
6.8.10? If so, this code could just be triggering an issue from an
unrelated bug.

Thanks,

-- Steve



Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Steven Rostedt
On Mon, 27 May 2024 20:14:42 +0200
Greg KH  wrote:

> On Mon, May 27, 2024 at 07:40:21PM +0300, Ilkka Naulapää wrote:
> > Hi Steven,
> > 
> > I took some time and bisected the 6.8.9 - 6.8.10 and git gave the
> > panic inducing commit:
> > 
> > 414fb08628143 (tracefs: Reset permissions on remount if permissions are 
> > options)
> > 
> > I reverted that commit to 6.9.2 and now it only serves the trace but
> > the panic is gone. But I can live with it.  
> 
> Steven, should we revert that?
> 
> Or is there some other change that we should take to resolve this?
> 

Before we revert it (as it may be a bug in mainline), Ilkka, can you
test v6.10-rc1?  If it exists there, it will let me know whether or not
I missed something.

Thanks,

-- Steve



Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Greg KH
On Mon, May 27, 2024 at 07:40:21PM +0300, Ilkka Naulapää wrote:
> Hi Steven,
> 
> I took some time and bisected the 6.8.9 - 6.8.10 and git gave the
> panic inducing commit:
> 
> 414fb08628143 (tracefs: Reset permissions on remount if permissions are 
> options)
> 
> I reverted that commit to 6.9.2 and now it only serves the trace but
> the panic is gone. But I can live with it.

Steven, should we revert that?

Or is there some other change that we should take to resolve this?

thanks,

greg k-h



Re: [PATCH] ipvs: Avoid unnecessary calls to skb_is_gso_sctp

2024-05-27 Thread Julian Anastasov

Hello,

On Thu, 23 May 2024, Ismael Luceno wrote:

> In the context of the SCTP SNAT/DNAT handler, these calls can only
> return true.
> 
> Ref: e10d3ba4d434 ("ipvs: Fix checksumming on GSO of SCTP packets")

checkpatch.pl prefers to see the "commit" word:

Ref: commit e10d3ba4d434 ("ipvs: Fix checksumming on GSO of SCTP packets")

> Signed-off-by: Ismael Luceno 

Looks good to me for nf-next, thanks!

Acked-by: Julian Anastasov 

> CC: Pablo Neira Ayuso 
> CC: Michal Kubeček 
> CC: Simon Horman 
> CC: Julian Anastasov 
> CC: lvs-de...@vger.kernel.org
> CC: netfilter-de...@vger.kernel.org
> CC: net...@vger.kernel.org
> CC: coret...@netfilter.org
> ---
>  net/netfilter/ipvs/ip_vs_proto_sctp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c 
> b/net/netfilter/ipvs/ip_vs_proto_sctp.c
> index 1e689c714127..83e452916403 100644
> --- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
> +++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
> @@ -126,7 +126,7 @@ sctp_snat_handler(struct sk_buff *skb, struct 
> ip_vs_protocol *pp,
>   if (sctph->source != cp->vport || payload_csum ||
>   skb->ip_summed == CHECKSUM_PARTIAL) {
>   sctph->source = cp->vport;
> - if (!skb_is_gso(skb) || !skb_is_gso_sctp(skb))
> + if (!skb_is_gso(skb))
>   sctp_nat_csum(skb, sctph, sctphoff);
>   } else {
>   skb->ip_summed = CHECKSUM_UNNECESSARY;
> @@ -175,7 +175,7 @@ sctp_dnat_handler(struct sk_buff *skb, struct 
> ip_vs_protocol *pp,
>   (skb->ip_summed == CHECKSUM_PARTIAL &&
>!(skb_dst(skb)->dev->features & NETIF_F_SCTP_CRC))) {
>   sctph->dest = cp->dport;
> - if (!skb_is_gso(skb) || !skb_is_gso_sctp(skb))
> + if (!skb_is_gso(skb))
>   sctp_nat_csum(skb, sctph, sctphoff);
>   } else if (skb->ip_summed != CHECKSUM_PARTIAL) {
>   skb->ip_summed = CHECKSUM_UNNECESSARY;
> -- 
> 2.44.0

Regards

--
Julian Anastasov 

Re: [PATCH v2] sched/rt: Clean up usage of rt_task()

2024-05-27 Thread Qais Yousef
On 05/23/24 11:45, Steven Rostedt wrote:
> On Wed, 15 May 2024 23:05:36 +0100
> Qais Yousef  wrote:
> > diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
> > index df3aca89d4f5..5cb88b748ad6 100644
> > --- a/include/linux/sched/deadline.h
> > +++ b/include/linux/sched/deadline.h
> > @@ -10,8 +10,6 @@
> >  
> >  #include 
> >  
> > -#define MAX_DL_PRIO0
> > -
> >  static inline int dl_prio(int prio)
> >  {
> > if (unlikely(prio < MAX_DL_PRIO))
> > @@ -19,6 +17,10 @@ static inline int dl_prio(int prio)
> > return 0;
> >  }
> >  
> > +/*
> > + * Returns true if a task has a priority that belongs to DL class. 
> > PI-boosted
> > + * tasks will return true. Use dl_policy() to ignore PI-boosted tasks.
> > + */
> >  static inline int dl_task(struct task_struct *p)
> >  {
> > return dl_prio(p->prio);
> > diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
> > index ab83d85e1183..6ab43b4f72f9 100644
> > --- a/include/linux/sched/prio.h
> > +++ b/include/linux/sched/prio.h
> > @@ -14,6 +14,7 @@
> >   */
> >  
> >  #define MAX_RT_PRIO100
> > +#define MAX_DL_PRIO0
> >  
> >  #define MAX_PRIO   (MAX_RT_PRIO + NICE_WIDTH)
> >  #define DEFAULT_PRIO   (MAX_RT_PRIO + NICE_WIDTH / 2)
> > diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
> > index b2b9e6eb9683..a055dd68a77c 100644
> > --- a/include/linux/sched/rt.h
> > +++ b/include/linux/sched/rt.h
> > @@ -7,18 +7,43 @@
> >  struct task_struct;
> >  
> >  static inline int rt_prio(int prio)
> > +{
> > +   if (unlikely(prio < MAX_RT_PRIO && prio >= MAX_DL_PRIO))
> > +   return 1;
> > +   return 0;
> > +}
> > +
> > +static inline int realtime_prio(int prio)
> >  {
> > if (unlikely(prio < MAX_RT_PRIO))
> > return 1;
> > return 0;
> >  }
> 
> I'm thinking we should change the above to bool (separate patch), as
> returning an int may give one the impression that it returns the actual
> priority number. Having it return bool will clear that up.
> 
> In fact, if we are touching theses functions, might as well change all of
> them to bool when returning true/false. Just to make it easier to
> understand what they are doing.

I can add a patch on top, sure.

> 
> >  
> > +/*
> > + * Returns true if a task has a priority that belongs to RT class. 
> > PI-boosted
> > + * tasks will return true. Use rt_policy() to ignore PI-boosted tasks.
> > + */
> >  static inline int rt_task(struct task_struct *p)
> >  {
> > return rt_prio(p->prio);
> >  }
> >  
> > -static inline bool task_is_realtime(struct task_struct *tsk)
> > +/*
> > + * Returns true if a task has a priority that belongs to RT or DL classes.
> > + * PI-boosted tasks will return true. Use realtime_task_policy() to ignore
> > + * PI-boosted tasks.
> > + */
> > +static inline int realtime_task(struct task_struct *p)
> > +{
> > +   return realtime_prio(p->prio);
> > +}
> > +
> > +/*
> > + * Returns true if a task has a policy that belongs to RT or DL classes.
> > + * PI-boosted tasks will return false.
> > + */
> > +static inline bool realtime_task_policy(struct task_struct *tsk)
> >  {
> > int policy = tsk->policy;
> >  
> 
> 
> 
> > diff --git a/kernel/trace/trace_sched_wakeup.c 
> > b/kernel/trace/trace_sched_wakeup.c
> > index 0469a04a355f..19d737742e29 100644
> > --- a/kernel/trace/trace_sched_wakeup.c
> > +++ b/kernel/trace/trace_sched_wakeup.c
> > @@ -545,7 +545,7 @@ probe_wakeup(void *ignore, struct task_struct *p)
> >  *  - wakeup_dl handles tasks belonging to sched_dl class only.
> >  */
> > if (tracing_dl || (wakeup_dl && !dl_task(p)) ||
> > -   (wakeup_rt && !dl_task(p) && !rt_task(p)) ||
> > +   (wakeup_rt && !realtime_task(p)) ||
> > (!dl_task(p) && (p->prio >= wakeup_prio || p->prio >= 
> > current->prio)))
> > return;
> >  
> 
> Reviewed-by: Steven Rostedt (Google) 

Thanks!

--
Qais Yousef

> 
> 



Re: [PATCH v2] sched/rt: Clean up usage of rt_task()

2024-05-27 Thread Qais Yousef
On 05/21/24 13:00, Sebastian Andrzej Siewior wrote:
> On 2024-05-15 23:05:36 [+0100], Qais Yousef wrote:
> > rt_task() checks if a task has RT priority. But depends on your
> > dictionary, this could mean it belongs to RT class, or is a 'realtime'
> > task, which includes RT and DL classes.
> > 
> > Since this has caused some confusion already on discussion [1], it
> > seemed a clean up is due.
> > 
> > I define the usage of rt_task() to be tasks that belong to RT class.
> > Make sure that it returns true only for RT class and audit the users and
> > replace the ones required the old behavior with the new realtime_task()
> > which returns true for RT and DL classes. Introduce similar
> > realtime_prio() to create similar distinction to rt_prio() and update
> > the users that required the old behavior to use the new function.
> > 
> > Move MAX_DL_PRIO to prio.h so it can be used in the new definitions.
> > 
> > Document the functions to make it more obvious what is the difference
> > between them. PI-boosted tasks is a factor that must be taken into
> > account when choosing which function to use.
> > 
> > Rename task_is_realtime() to realtime_task_policy() as the old name is
> > confusing against the new realtime_task().
> 
> I *think* everyone using rt_task() means to include DL tasks. And
> everyone means !SCHED-people since they know when the difference matters.

yes, this makes sense

> 
> > No functional changes were intended.
> > 
> > [1] 
> > https://lore.kernel.org/lkml/20240506100509.gl40...@noisy.programming.kicks-ass.net/
> > 
> > Reviewed-by: Phil Auld 
> > Signed-off-by: Qais Yousef 
> > ---
> > 
> > Changes since v1:
> > 
> > * Use realtime_task_policy() instead task_has_realtime_policy() (Peter)
> > * Improve commit message readability about replace some rt_task()
> >   users.
> > 
> > v1 discussion: 
> > https://lore.kernel.org/lkml/20240514234112.792989-1-qyou...@layalina.io/
> > 
> >  fs/select.c   |  2 +-
> 
> fs/bcachefs/six.c
> six_owner_running() has rt_task(). But imho should have realtime_task()
> to consider DL. But I think it is way worse that it has its own locking
> rather than using what everyone else but then again it wouldn't be the
> new hot thing…

I think I missed this one. Converted now. Thanks!

> 
> >  include/linux/ioprio.h|  2 +-
> >  include/linux/sched/deadline.h|  6 --
> >  include/linux/sched/prio.h|  1 +
> >  include/linux/sched/rt.h  | 27 ++-
> >  kernel/locking/rtmutex.c  |  4 ++--
> >  kernel/locking/rwsem.c|  4 ++--
> >  kernel/locking/ww_mutex.h |  2 +-
> >  kernel/sched/core.c   |  6 +++---
> >  kernel/time/hrtimer.c |  6 +++---
> >  kernel/trace/trace_sched_wakeup.c |  2 +-
> >  mm/page-writeback.c   |  4 ++--
> >  mm/page_alloc.c   |  2 +-
> >  13 files changed, 48 insertions(+), 20 deletions(-)
> …
> > diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> > index 70625dff62ce..08b95e0a41ab 100644
> > --- a/kernel/time/hrtimer.c
> > +++ b/kernel/time/hrtimer.c
> > @@ -1996,7 +1996,7 @@ static void __hrtimer_init_sleeper(struct 
> > hrtimer_sleeper *sl,
> >  * expiry.
> >  */
> > if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > -   if (task_is_realtime(current) && !(mode & HRTIMER_MODE_SOFT))
> > +   if (realtime_task_policy(current) && !(mode & 
> > HRTIMER_MODE_SOFT))
> > mode |= HRTIMER_MODE_HARD;
> > }
> >  
> > @@ -2096,7 +2096,7 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum 
> > hrtimer_mode mode,
> > u64 slack;
> >  
> > slack = current->timer_slack_ns;
> > -   if (rt_task(current))
> > +   if (realtime_task(current))
> > slack = 0;
> >  
> > hrtimer_init_sleeper_on_stack(, clockid, mode);
> > @@ -2301,7 +2301,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires, u64 
> > delta,
> >  * Override any slack passed by the user if under
> >  * rt contraints.
> >  */
> > -   if (rt_task(current))
> > +   if (realtime_task(current))
> > delta = 0;
> 
> I know this is just converting what is already here but…
> __hrtimer_init_sleeper() looks at the policy to figure out if the task
> is realtime do decide if should expire in HARD-IRQ context. This is
> correct, a boosted task should not sleep.
> 
> hrtimer_nanosleep() + schedule_hrtimeout_range_clock() is looking at
> priority to decide if slack should be removed. This should also look at
> policy since a boosted task shouldn't sleep.

I have to admit I never dug deep enough into this code. Happy to convert these
users. I'll add that as a separate patch as this is somewhat changing behavior
which this patch intends to do a clean up only.

> 
> In order to be PI-boosted you need to acquire a lock and the only lock
> you can sleep while acquired without generating a warning is a mutex_t
> (or equivalent sleeping lock) on PREEMPT_RT. 


Re: [PATCH v2 2/3] arm64: dts: qcom: pm7250b: Add a TCPM description

2024-05-27 Thread Bjorn Andersson
On Fri, Mar 29, 2024 at 01:26:20PM GMT, Luca Weiss wrote:
> Type-C port management functionality lives inside of the PMIC block on
> pm7250b.
> 
> The Type-C port management logic controls orientation detection,
> vbus/vconn sense and to send/receive Type-C Power Domain messages.
> 

pm7250b is found in devices where USB Type-C port management is
performed in firmware, presumably using this hardware block.

As such, it seems reasonable to leave this node disabled and only enable
it on the targets that doesn't do this in firmware.

Regards,
Bjorn

> Reviewed-by: Bryan O'Donoghue 
> Reviewed-by: Konrad Dybcio 
> Signed-off-by: Luca Weiss 
> ---
>  arch/arm64/boot/dts/qcom/pm7250b.dtsi | 39 
> +++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/pm7250b.dtsi 
> b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
> index 4faed25a787f..0205c2669093 100644
> --- a/arch/arm64/boot/dts/qcom/pm7250b.dtsi
> +++ b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
> @@ -51,6 +51,45 @@ pm7250b_vbus: usb-vbus-regulator@1100 {
>   status = "disabled";
>   };
>  
> + pm7250b_typec: typec@1500 {
> + compatible = "qcom,pm7250b-typec", "qcom,pm8150b-typec";
> + reg = <0x1500>,
> +   <0x1700>;
> + interrupts =  IRQ_TYPE_EDGE_RISING>,
> +  ,
> +   IRQ_TYPE_EDGE_RISING>,
> +  ,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +  ,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>;
> + interrupt-names = "or-rid-detect-change",
> +   "vpd-detect",
> +   "cc-state-change",
> +   "vconn-oc",
> +   "vbus-change",
> +   "attach-detach",
> +   "legacy-cable-detect",
> +   "try-snk-src-detect",
> +   "sig-tx",
> +   "sig-rx",
> +   "msg-tx",
> +   "msg-rx",
> +   "msg-tx-failed",
> +   "msg-tx-discarded",
> +   "msg-rx-discarded",
> +   "fr-swap";
> + vdd-vbus-supply = <_vbus>;
> + };
> +
>   pm7250b_temp: temp-alarm@2400 {
>   compatible = "qcom,spmi-temp-alarm";
>   reg = <0x2400>;
> 
> -- 
> 2.44.0
> 



Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Ilkka Naulapää
Hi Steven,

I took some time and bisected the 6.8.9 - 6.8.10 and git gave the
panic inducing commit:

414fb08628143 (tracefs: Reset permissions on remount if permissions are options)

I reverted that commit to 6.9.2 and now it only serves the trace but
the panic is gone. But I can live with it.

--Ilkka

On Sun, May 26, 2024 at 8:42 PM Ilkka Naulapää  wrote:
>
> hi,
>
> I took 6.9.2 and applied that 0bcfd9aa4dafa to it. Now the kernel is
> serving me both problems; the trace and the panic as the pic shows.
>
> > To understand this, did you do anything with tracing? Before shutting down,
> > is there anything in /sys/kernel/tracing/instances directory?
> > Were any of the files/directories permissions in /sys/kernel/tracing 
> > changed?
>
> And to answer your question, I did not do any tracing or so and the
> /sys/kernel/tracing is empty.
> Just plain boot-up, no matter if in full desktop or in bare rescue
> mode, ends up the same way.
>
> --Ilkka
>
> On Fri, May 24, 2024 at 8:19 PM Steven Rostedt  wrote:
> >
> > On Fri, 24 May 2024 12:50:08 +0200
> > "Linux regression tracking (Thorsten Leemhuis)"  
> > wrote:
> >
> > > > - Affected Versions: Before kernel version 6.8.10, the bug caused a
> > > > quick display of a kernel trace dump before the shutdown/reboot
> > > > completed. Starting from version 6.8.10 and continuing into version
> > > > 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> > > > preventing the shutdown or reboot from completing and leaving the
> > > > machine stuck.
> >
> > Ah, I bet it was this commit: baa23a8d4360d ("tracefs: Reset permissions on
> > remount if permissions are options"), which added a "iput" callback to the
> > dentry without calling iput, leaving stale inodes around.
> >
> > This is fixed with:
> >
> >   0bcfd9aa4dafa ("tracefs: Clear EVENT_INODE flag in tracefs_drop_inode()")
> >
> > Try adding just that patch. It will at least make it go back to what was
> > happening before 6.8.10 (I hope!).
> >
> > -- Steve



Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Russell King (Oracle)
On Mon, May 27, 2024 at 02:28:59PM +0200, Uwe Kleine-König wrote:
> On Mon, May 27, 2024 at 08:56:16AM +0100, Russell King (Oracle) wrote:
> > On Mon, May 27, 2024 at 09:43:41AM +0200, Thorsten Scherer wrote:
> > > Hello,
> > > 
> > > in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe pointed 
> > > me to
> > > this thread.  With the proposed code change applied the procedure
> > > 
> > > # set to some known good (randomly guessed) filter function and 
> > > enable function_graph
> > > echo mtdblock_open > set_ftrace_filter
> > > echo function_graph > current_tracer
> > > 
> > > # walk available filter funcs
> > > cat available_filter_functions | while read f; do echo $f | tee -a 
> > > set_ftrace_filter; sleep 1; done
> > > 
> > > produces the following output
> > > 
> > > [  159.832173] Insufficient stack space to handle exception!
> > > [  159.832241] Task stack: [0xc8e44000..0xc8e46000]
> > > [  159.842701] IRQ stack:  [0xc880..0xc8802000]
> > > [  159.847712] Overflow stack: [0xc1934000..0xc1935000]
> > > [  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
> > > [  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
> > > industrialio_triggered_buffer kfifo_buf capture_events_irq capture_events 
> > > iio_trig_hrtimer industrialio_sw_trigger industrialio_configfs dm_mod
> > > [  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
> > > [  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
> > > [  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
> > > [  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
> > > [  159.898376] pc : []lr : []psr: 6093
> > > [  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
> > > [  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
> > > [  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : c8e440ac
> > > [  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : c8e44038
> > > [  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  
> > > Segment none
> > > [  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
> > > [  159.941436] Register r0 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  159.952253] Register r1 information: non-slab/vmalloc memory
> > > [  159.957988] Register r2 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  159.968791] Register r3 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  159.979592] Register r4 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  159.990391] Register r5 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  160.001187] Register r6 information: non-paged memory
> > > [  160.006303] Register r7 information: non-paged memory
> > > [  160.011415] Register r8 information: non-slab/vmalloc memory
> > > [  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 
> > > pointer offset 0 size 32
> > > [  160.025718] Register r10 information: non-slab/vmalloc memory
> > > [  160.031530] Register r11 information: 2-page vmalloc region starting 
> > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  160.042422] Register r12 information: 2-page vmalloc region starting 
> > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
> > > [  160.058955] Stack: (0xc8e44018 to 0xc8e46000)
> > 
> > No backtrace? No Code: line? I'm guessing there was an attempt to ftrace
> > a function involving the ftrace tracing infrastructure, which is why 8KB
> > of stack has been gobbled up. It could be
> > copy_from_kernel_nofault_allowed() but it would be useful to have at
> > least some extract of the backtrace showing the recursive cycle to
> > confirm, otherwise there is nothing in your report to confirm. As I'm
> > not a ftrace user myself, this isn't something I'd test for, so having
> > a full report would be useful.
> 
> Is not having a backtrace related to ftrace_return_address() returning
> NULL, as Arnd pointed out in
> https://lore.kernel.org/linux-arm-kernel/36cd10de-c51c-40ff-90e8-714954060...@app.fastmail.com/
> ?

Unlikely - the stack and code lines are also missing. I think the
submitter truncated the oops which is highly likely given that it
would've dumped all 8kB of the stack in hex, and the trace and
code lines would be after that.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!



Re: [PATCH v2] tracing/probes: fix error check in parse_btf_field()

2024-05-27 Thread Google
On Mon, 27 May 2024 11:43:52 +0200
Carlos López  wrote:

> btf_find_struct_member() might return NULL or an error via the
> ERR_PTR() macro. However, its caller in parse_btf_field() only checks
> for the NULL condition. Fix this by using IS_ERR() and returning the
> error up the stack.
> 

Thanks for updating! This version looks good to me.
Let me pick this to probes/fixes.

Thank you,


> Fixes: c440adfbe3025 ("tracing/probes: Support BTF based data structure field 
> access")
> Signed-off-by: Carlos López 
> ---
> v2: added call to trace_probe_log_err()
> 
>  kernel/trace/trace_probe.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 5e263c141574..39877c80d6cb 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -554,6 +554,10 @@ static int parse_btf_field(char *fieldname, const struct 
> btf_type *type,
>   anon_offs = 0;
>   field = btf_find_struct_member(ctx->btf, type, 
> fieldname,
>  _offs);
> + if (IS_ERR(field)) {
> + trace_probe_log_err(ctx->offset, BAD_BTF_TID);
> + return PTR_ERR(field);
> + }
>   if (!field) {
>   trace_probe_log_err(ctx->offset, NO_BTF_FIELD);
>   return -ENOENT;
> -- 
> 2.35.3
> 


-- 
Masami Hiramatsu (Google) 



Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Uwe Kleine-König
On Mon, May 27, 2024 at 08:56:16AM +0100, Russell King (Oracle) wrote:
> On Mon, May 27, 2024 at 09:43:41AM +0200, Thorsten Scherer wrote:
> > Hello,
> > 
> > in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe pointed me 
> > to
> > this thread.  With the proposed code change applied the procedure
> > 
> > # set to some known good (randomly guessed) filter function and enable 
> > function_graph
> > echo mtdblock_open > set_ftrace_filter
> > echo function_graph > current_tracer
> > 
> > # walk available filter funcs
> > cat available_filter_functions | while read f; do echo $f | tee -a 
> > set_ftrace_filter; sleep 1; done
> > 
> > produces the following output
> > 
> > [  159.832173] Insufficient stack space to handle exception!
> > [  159.832241] Task stack: [0xc8e44000..0xc8e46000]
> > [  159.842701] IRQ stack:  [0xc880..0xc8802000]
> > [  159.847712] Overflow stack: [0xc1934000..0xc1935000]
> > [  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
> > [  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
> > industrialio_triggered_buffer kfifo_buf capture_events_irq capture_events 
> > iio_trig_hrtimer industrialio_sw_trigger industrialio_configfs dm_mod
> > [  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
> > [  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
> > [  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
> > [  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
> > [  159.898376] pc : []lr : []psr: 6093
> > [  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
> > [  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
> > [  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : c8e440ac
> > [  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : c8e44038
> > [  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  
> > Segment none
> > [  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
> > [  159.941436] Register r0 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  159.952253] Register r1 information: non-slab/vmalloc memory
> > [  159.957988] Register r2 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  159.968791] Register r3 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  159.979592] Register r4 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  159.990391] Register r5 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  160.001187] Register r6 information: non-paged memory
> > [  160.006303] Register r7 information: non-paged memory
> > [  160.011415] Register r8 information: non-slab/vmalloc memory
> > [  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 
> > pointer offset 0 size 32
> > [  160.025718] Register r10 information: non-slab/vmalloc memory
> > [  160.031530] Register r11 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  160.042422] Register r12 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
> > [  160.058955] Stack: (0xc8e44018 to 0xc8e46000)
> 
> No backtrace? No Code: line? I'm guessing there was an attempt to ftrace
> a function involving the ftrace tracing infrastructure, which is why 8KB
> of stack has been gobbled up. It could be
> copy_from_kernel_nofault_allowed() but it would be useful to have at
> least some extract of the backtrace showing the recursive cycle to
> confirm, otherwise there is nothing in your report to confirm. As I'm
> not a ftrace user myself, this isn't something I'd test for, so having
> a full report would be useful.

Is not having a backtrace related to ftrace_return_address() returning
NULL, as Arnd pointed out in
https://lore.kernel.org/linux-arm-kernel/36cd10de-c51c-40ff-90e8-714954060...@app.fastmail.com/
?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |


signature.asc
Description: PGP signature


Re: [PATCH 01/12] soc: qcom: add firmware name helper

2024-05-27 Thread Dmitry Baryshkov
On Thu, 23 May 2024 at 01:48, Bjorn Andersson  wrote:
>
> On Tue, May 21, 2024 at 03:08:31PM +0200, Dmitry Baryshkov wrote:
> > On Tue, 21 May 2024 at 13:20, Kalle Valo  wrote:
> > >
> > > Dmitry Baryshkov  writes:
> > >
> > > > On Tue, 21 May 2024 at 12:52,  wrote:
> > > >>
> > > >> On 21/05/2024 11:45, Dmitry Baryshkov wrote:
> > > >> > Qualcomm platforms have different sets of the firmware files, which
> > > >> > differ from platform to platform (and from board to board, due to the
> > > >> > embedded signatures). Rather than listing all the firmware files,
> > > >> > including full paths, in the DT, provide a way to determine firmware
> > > >> > path based on the root DT node compatible.
> > > >>
> > > >> Ok this looks quite over-engineered but necessary to handle the legacy,
> > > >> but I really think we should add a way to look for a board-specific 
> > > >> path
> > > >> first and fallback to those SoC specific paths.
> > > >
> > > > Again, CONFIG_FW_LOADER_USER_HELPER => delays.
> > >
> > > To me this also looks like very over-engineered, can you elaborate more
> > > why this is needed? Concrete examples would help to understand better.
> >
> > Sure. During the meeting last week Arnd suggested evaluating if we can
> > drop firmware-name from the board DT files. Several reasons for that:
> > - DT should describe the hardware, not the Linux-firmware locations
> > - having firmware name in DT complicates updating the tree to use
> > different firmware API (think of mbn vs mdt vs any other format)
> > - If the DT gets supplied by the vendor (e.g. for
> > SystemReady-certified devices), there should be a sync between the
> > vendor's DT, linux kernel and the rootfs. Dropping firmware names from
> > DT solves that by removing one piece of the equation
> >
> > Now for the complexity of the solution. Each SoC family has their own
> > firmware set. This includes firmware for the DSPs, for modem, WiFi
> > bits, GPU shader, etc.
> > For the development boards these devices are signed by the testing key
> > and the actual signature is not validated against the root of trust
> > certificate.
> > For the end-user devices the signature is actually validated against
> > the bits fused to the SoC during manufacturing process. CA certificate
> > (and thus the fuses) differ from vendor to vendor (and from the device
> > to device)
> >
> > Not all of the firmware files are a part of the public linux-firmware
> > tree. However we need to support the rootfs bundled with the firmware
> > for different platforms (both public and vendor). The non-signed files
> > come from the Adreno GPU and can be shared between platforms. All
> > other files are SoC-specific and in some cases device-specific.
> >
> > So for example the SDM845 db845c (open device) loads following firmware 
> > files:
> > Not signed:
> > - qcom/a630_sqe.fw
> > - qcom/a630_gmu.bin
> >
> > Signed, will work for any non-secured sdm845 device:
> > - qcom/sdm845/a630_zap.mbn
> > - qcom/sdm845/adsp.mbn
> > - qcom/sdm845/cdsp.mbn
> > - qcom/sdm485/mba.mbn
> > - qcom/sdm845/modem.mbn
> > - qcom/sdm845/wlanmdsp.mbn (loaded via TQFTP)
> > - qcom/venus-5.2/venus.mbn
> >
> > Signed, works only for DB845c.
> > - qcom/sdm845/Thundercomm/db845c/slpi.mbn
> >
> > In comparison, the SDM845 Pixel-3 phone (aka blueline) should load the
> > following firmware files:
> > - qcom/a630_sqe.fw (the same, non-signed file)
> > - qcom/a630_gmu.bin (the same, non-signed file)
> > - qcom/sdm845/Google/blueline/a630_zap.mbn
>
> How do you get from "a630_zap.mbn" to this? By extending the lookup
> table for every target, or what am I missing?

More or less so. Matching the root OF node gives us the firmware
location, then it gets prepended to all firmware targets. Not an ideal
solution, as there is no fallback support, but at least it gives us
some points to discuss (and to decide whether to move to some
particular direction or to abandon the idea completely, making Arnd
unhappy again).

>
> Regards,
> Bjorn
>
> > - qcom/sdm845/Google/blueline/adsp.mbn
> > - qcom/sdm845/Google/blueline/cdsp.mbn
> > - qcom/sdm845/Google/blueline/ipa_fws.mbn
> > - qcom/sdm845/Google/blueline/mba.mbn
> > - qcom/sdm845/Google/blueline/modem.mbn
> > - qcom/sdm845/Google/blueline/venus.mbn
> > - qcom/sdm845/Google/blueline/wlanmdsp.mbn
> > - qcom/sdm845/Google/blueline/slpi.mbn
> >
> > The Lenovo Yoga C630 WoS laptop (SDM850 is a variant of SDM845) uses
> > another set of files:
> > - qcom/a630_sqe.fw (the same, non-signed file)
> > - qcom/a630_gmu.bin (the same, non-signed file)
> > - qcom/sdm850/LENOVO/81JL/qcdxkmsuc850.mbn
> > - qcom/sdm850/LENOVO/81JL/qcadsp850.mbn
> > - qcom/sdm850/LENOVO/81JL/qccdsp850.mbn
> > - qcom/sdm850/LENOVO/81JL/ipa_fws.elf
> > - qcom/sdm850/LENOVO/81JL/qcdsp1v2850.mbn
> > - qcom/sdm850/LENOVO/81JL/qcdsp2850.mbn
> > - qcom/sdm850/LENOVO/81JL/qcvss850.mbn
> > - qcom/sdm850/LENOVO/81JL/wlanmdsp.mbn
> > - qcom/sdm850/LENOVO/81JL/qcslpi850.mbn
> >
> > If we look at one of the recent 

Re: [PATCH 2/2] ring-buffer: Fix a race between readers and resize checks

2024-05-27 Thread Petr Pavlu
do
>>  echo 16 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
>>  echo 8 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
>>  done &
>>  while true; do
>>  for i in /sys/kernel/tracing/per_cpu/*; do
>>  timeout 0.1 cat $i/trace_pipe; sleep 0.2
>>  done
>>  done
>>
>> To fix the problem, make sure ring_buffer_resize() doesn't invoke
>> rb_check_pages() concurrently with a reader operating on the same
>> ring_buffer_per_cpu by taking its cpu_buffer->reader_lock.
> 
> Definitely a bug. Thanks for catching it. But...
> 
>>
>> Fixes: 659f451ff213 ("ring-buffer: Add integrity check at end of iter read")
>> Signed-off-by: Petr Pavlu 
>> ---
>>  kernel/trace/ring_buffer.c | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>> index 0ae569eae55a..967655591719 100644
>> --- a/kernel/trace/ring_buffer.c
>> +++ b/kernel/trace/ring_buffer.c
>> @@ -1449,6 +1449,11 @@ static void rb_check_bpage(struct ring_buffer_per_cpu 
>> *cpu_buffer,
>>   *
>>   * As a safety measure we check to make sure the data pages have not
>>   * been corrupted.
>> + *
>> + * Callers of this function need to guarantee that the list of pages 
>> doesn't get
>> + * modified during the check. In particular, if it's possible that the 
>> function
>> + * is invoked with concurrent readers which can swap in a new reader page 
>> then
>> + * the caller should take cpu_buffer->reader_lock.
>>   */
>>  static void rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
>>  {
>> @@ -2200,8 +2205,13 @@ int ring_buffer_resize(struct trace_buffer *buffer, 
>> unsigned long size,
>>   */
>>  synchronize_rcu();
>>  for_each_buffer_cpu(buffer, cpu) {
>> +unsigned long flags;
>> +
>>  cpu_buffer = buffer->buffers[cpu];
>> +raw_spin_lock_irqsave(_buffer->reader_lock, flags);
>>  rb_check_pages(cpu_buffer);
>> +raw_spin_unlock_irqrestore(_buffer->reader_lock,
>> +   flags);
> 
> Putting my RT hat on, I really don't like the above fix. The
> rb_check_pages() iterates all subbuffers which makes the time interrupts
> are disabled non-deterministic.

I see, this applies also to the same rb_check_pages() validation invoked
from ring_buffer_read_finish().

> 
> Instead, I would rather have something where we disable readers while we do
> the check, and re-enable them.
> 
>   raw_spin_lock_irqsave(_buffer->reader_lock, flags);
>   cpu_buffer->read_disabled++;
>   raw_spin_unlock_irqrestore(_buffer->reader_lock, 
> flags);
> 
> // Also, don't put flags on a new line. We are allow to go 100 characters now.

Noted.

> 
> 
>   rb_check_pages(cpu_buffer);
>   raw_spin_lock_irqsave(_buffer->reader_lock, flags);
>   cpu_buffer->read_disabled--;
>   raw_spin_unlock_irqrestore(_buffer->reader_lock, 
> flags);
> 
> Or something like that. Yes, that also requires creating a new
> "read_disabled" field in the ring_buffer_per_cpu code.

I think this would work but I'm personally not immediately sold on this
approach. If I understand the idea correctly, readers should then check
whether cpu_buffer->read_disabled is set and bail out with some error if
that is the case. The rb_check_pages() function is only a self-check
code and as such, I feel it doesn't justify disrupting readers with
a new error condition and adding more complex locking.

I've been considering how to approach this RT issue differently. One
obvious approach would be to drop the rb_check_pages() validation but
that is probably not desirable.

Another option could be to make the check less thorough and walk only
a part of the list which is bounded by some constant, typically one
would want to check the part where some change was just made. In case of
a smaller list, it should be still possible to traverse it completely.

This is an idea that I'm currently looking at.

> 
> That said, I'm going to accept these patches as is (moving flags onto the
> same line). But would like the above code for the next merge window as it
> would then not affect RT.
> 
> I'll accept these patches because it does fix the bug now.

Thanks,
Petr



Re: [PATCH v2 5/5] arm64: dts: qcom: sa8775p-ride: enable remoteprocs

2024-05-27 Thread Dmitry Baryshkov
On Mon, May 27, 2024 at 10:43:52AM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski 
> 
> Enable all remoteproc nodes on the sa8775p-ride board and point to the
> appropriate firmware files.
> 
> Signed-off-by: Bartosz Golaszewski 
> ---
>  arch/arm64/boot/dts/qcom/sa8775p-ride.dts | 25 +
>  1 file changed, 25 insertions(+)
> 

Reviewed-by: Dmitry Baryshkov 


-- 
With best wishes
Dmitry



Re: [PATCH v2 4/5] arm64: dts: qcom: sa8775p: add ADSP, CDSP and GPDSP nodes

2024-05-27 Thread Dmitry Baryshkov
On Mon, May 27, 2024 at 10:43:51AM +0200, Bartosz Golaszewski wrote:
> From: Tengfei Fan 
> 
> Add nodes for remoteprocs: ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 for
> SA8775p SoCs.
> 
> Signed-off-by: Tengfei Fan 
> Co-developed-by: Bartosz Golaszewski 
> Signed-off-by: Bartosz Golaszewski 
> ---
>  arch/arm64/boot/dts/qcom/sa8775p.dtsi | 332 
> ++
>  1 file changed, 332 insertions(+)
> 

With nsp0 vs nsp1 vs nsp sorted out:


Reviewed-by: Dmitry Baryshkov 


-- 
With best wishes
Dmitry



Re: [PATCH v2 3/5] remoteproc: qcom_q6v5_pas: Add support for SA8775p ADSP, CDSP and GPDSP

2024-05-27 Thread Dmitry Baryshkov
On Mon, May 27, 2024 at 10:43:50AM +0200, Bartosz Golaszewski wrote:
> From: Tengfei Fan 
> 
> Add support for PIL loading on ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 on
> SA8775p SoCs.
> 
> Signed-off-by: Tengfei Fan 
> Co-developed-by: Bartosz Golaszewski 
> Signed-off-by: Bartosz Golaszewski 
> ---
>  drivers/remoteproc/qcom_q6v5_pas.c | 92 
> ++
>  1 file changed, 92 insertions(+)
> 
> diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
> b/drivers/remoteproc/qcom_q6v5_pas.c
> index 54d8005d40a3..16053aa99298 100644
> --- a/drivers/remoteproc/qcom_q6v5_pas.c
> +++ b/drivers/remoteproc/qcom_q6v5_pas.c
> @@ -820,6 +820,23 @@ static const struct adsp_data adsp_resource_init = {
>   .ssctl_id = 0x14,
>  };
>  
> +static const struct adsp_data sa8775p_adsp_resource = {
> + .crash_reason_smem = 423,
> + .firmware_name = "adsp.mdt",

mbn please.

Other than that LGTM

> + .pas_id = 1,
> + .minidump_id = 5,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "lcx",
> + "lmx",
> + NULL
> + },
> + .load_state = "adsp",
> + .ssr_name = "lpass",
> + .sysmon_name = "adsp",
> + .ssctl_id = 0x14,
> +};
> +
>  static const struct adsp_data sdm845_adsp_resource_init = {
>   .crash_reason_smem = 423,
>   .firmware_name = "adsp.mdt",
> @@ -933,6 +950,42 @@ static const struct adsp_data cdsp_resource_init = {
>   .ssctl_id = 0x17,
>  };
>  
> +static const struct adsp_data sa8775p_cdsp0_resource = {
> + .crash_reason_smem = 601,
> + .firmware_name = "cdsp0.mdt",
> + .pas_id = 18,
> + .minidump_id = 7,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "cx",
> + "mxc",
> + "nsp0",
> + NULL
> + },
> + .load_state = "cdsp",
> + .ssr_name = "cdsp",
> + .sysmon_name = "cdsp",
> + .ssctl_id = 0x17,
> +};
> +
> +static const struct adsp_data sa8775p_cdsp1_resource = {
> + .crash_reason_smem = 633,
> + .firmware_name = "cdsp1.mdt",
> + .pas_id = 30,
> + .minidump_id = 20,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "cx",
> + "mxc",
> + "nsp1",
> + NULL
> + },
> + .load_state = "nsp",
> + .ssr_name = "cdsp1",
> + .sysmon_name = "cdsp1",
> + .ssctl_id = 0x20,
> +};
> +
>  static const struct adsp_data sdm845_cdsp_resource_init = {
>   .crash_reason_smem = 601,
>   .firmware_name = "cdsp.mdt",
> @@ -1074,6 +1127,40 @@ static const struct adsp_data sm8350_cdsp_resource = {
>   .ssctl_id = 0x17,
>  };
>  
> +static const struct adsp_data sa8775p_gpdsp0_resource = {
> + .crash_reason_smem = 640,
> + .firmware_name = "gpdsp0.mdt",
> + .pas_id = 39,
> + .minidump_id = 21,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "cx",
> + "mxc",
> + NULL
> + },
> + .load_state = "gpdsp0",
> + .ssr_name = "gpdsp0",
> + .sysmon_name = "gpdsp0",
> + .ssctl_id = 0x21,
> +};
> +
> +static const struct adsp_data sa8775p_gpdsp1_resource = {
> + .crash_reason_smem = 641,
> + .firmware_name = "gpdsp1.mdt",
> + .pas_id = 40,
> + .minidump_id = 22,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "cx",
> + "mxc",
> + NULL
> + },
> + .load_state = "gpdsp1",
> + .ssr_name = "gpdsp1",
> + .sysmon_name = "gpdsp1",
> + .ssctl_id = 0x22,
> +};
> +
>  static const struct adsp_data mpss_resource_init = {
>   .crash_reason_smem = 421,
>   .firmware_name = "modem.mdt",
> @@ -1315,6 +1402,11 @@ static const struct of_device_id adsp_of_match[] = {
>   { .compatible = "qcom,qcs404-adsp-pas", .data = _resource_init },
>   { .compatible = "qcom,qcs404-cdsp-pas", .data = _resource_init },
>   { .compatible = "qcom,qcs404-wcss-pas", .data = _resource_init },
> + { .compatible = "qcom,sa8775p-adsp-pas", .data = 
> _adsp_resource},
> + { .compatible = "qcom,sa8775p-cdsp0-pas", .data = 
> _cdsp0_resource},
> + { .compatible = "qcom,sa8775p-cdsp1-pas", .data = 
> _cdsp1_resource},
> + { .compatible = "qcom,sa8775p-gpdsp0-pas", .data = 
> _gpdsp0_resource},
> + { .compatible = "qcom,sa8775p-gpdsp1-pas", .data = 
> _gpdsp1_resource},
>   { .compatible = "qcom,sc7180-adsp-pas", .data = _adsp_resource},
>   { .compatible = "qcom,sc7180-mpss-pas", .data = _resource_init},
>   { .compatible = "qcom,sc7280-adsp-pas", .data = _adsp_resource},
> 
> -- 
> 2.43.0
> 

-- 
With best wishes
Dmitry



Re: [PATCH v2 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP

2024-05-27 Thread Krzysztof Kozlowski
On 27/05/2024 10:43, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski 
> 
> Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and
> GPDSP1 on the SA8775p SoC.
> 
> Co-developed-by: Tengfei Fan 

Missing SoB.

> Signed-off-by: Bartosz Golaszewski 

...


> +
> +allOf:
> +  - $ref: /schemas/remoteproc/qcom,pas-common.yaml#
> +
> +  - if:
> +  properties:
> +compatible:
> +  enum:
> +- qcom,sa8775p-adsp-pas
> +then:
> +  properties:
> +power-domains:
> +  items:
> +- description: LCX power domain
> +- description: LMX power domain
> +power-domain-names:
> +  items:
> +- const: lcx
> +- const: lmx
> +
> +  - if:
> +  properties:
> +compatible:
> +  enum:
> +- qcom,sa8775p-cdsp-pas

cdsp0

> +then:
> +  properties:
> +power-domains:
> +  items:
> +- description: CX power domain
> +- description: MXC power domain
> +- description: NSP0 power domain
> +power-domain-names:
> +  items:
> +- const: cx
> +- const: mxc
> +- const: nsp0

Shouldn't this be just nsp, so both cdsp0 and cdsp1 entries can be
unified? That's the power domain from the device point of view, so the
device expects to be in some NSP domain, not explicitly NSPn.



Best regards,
Krzysztof




Re: [PATCH] tools/virtio: pipe assertion in vring_test.c

2024-05-27 Thread Yunseong Kim



On 5/27/24 4:52 오후, Michael S. Tsirkin wrote:
> On Mon, May 27, 2024 at 04:13:31PM +0900, ysk...@gmail.com wrote:
>> From: Yunseong Kim 
>>
>> The virtio_device need to fail checking when create the geust/host pipe.
> 
> typo

Thank you for code review Michael.

Sorry, there was a typo in my message.

I'll fix it and send you patch version 2.

>>
>> Signed-off-by: Yunseong Kim 
> 
> 
> I guess ... 
> 
>> ---
>>  tools/virtio/vringh_test.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/virtio/vringh_test.c b/tools/virtio/vringh_test.c
>> index 98ff808d6f0c..b1af8807c02a 100644
>> --- a/tools/virtio/vringh_test.c
>> +++ b/tools/virtio/vringh_test.c
>> @@ -161,8 +161,8 @@ static int parallel_test(u64 features,
>>  host_map = mmap(NULL, mapsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>>  guest_map = mmap(NULL, mapsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 
>> 0);
>>  
>> -pipe(to_guest);
>> -pipe(to_host);
>> +assert(pipe(to_guest) == 0);
>> +assert(pipe(to_host) == 0);
> 
> 
> I don't like == 0, prefer ! .
> Also, calling pipe outside assert is preferable, since in theory
> assert can be compiled out.
> Not an issue here but people tend to copy/paste text.

I agree, it's uncomfortable even if I did it.

I'll fix it as you suggested and send it to patch 2.

Thank you!


Warm Regards,

Yunseong Kim


>>  CPU_ZERO(_set);
>>  find_cpus(_cpu, _cpu);
>> -- 
>> 2.34.1
> 



Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Russell King (Oracle)
On Mon, May 27, 2024 at 09:43:41AM +0200, Thorsten Scherer wrote:
> Hello,
> 
> in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe pointed me to
> this thread.  With the proposed code change applied the procedure
> 
> # set to some known good (randomly guessed) filter function and enable 
> function_graph
> echo mtdblock_open > set_ftrace_filter
> echo function_graph > current_tracer
> 
> # walk available filter funcs
> cat available_filter_functions | while read f; do echo $f | tee -a 
> set_ftrace_filter; sleep 1; done
> 
> produces the following output
> 
> [  159.832173] Insufficient stack space to handle exception!
> [  159.832241] Task stack: [0xc8e44000..0xc8e46000]
> [  159.842701] IRQ stack:  [0xc880..0xc8802000]
> [  159.847712] Overflow stack: [0xc1934000..0xc1935000]
> [  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
> [  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
> industrialio_triggered_buffer kfifo_buf capture_events_irq capture_events 
> iio_trig_hrtimer industrialio_sw_trigger industrialio_configfs dm_mod
> [  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
> [  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
> [  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
> [  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
> [  159.898376] pc : []lr : []psr: 6093
> [  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
> [  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
> [  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : c8e440ac
> [  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : c8e44038
> [  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment 
> none
> [  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
> [  159.941436] Register r0 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  159.952253] Register r1 information: non-slab/vmalloc memory
> [  159.957988] Register r2 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  159.968791] Register r3 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  159.979592] Register r4 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  159.990391] Register r5 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  160.001187] Register r6 information: non-paged memory
> [  160.006303] Register r7 information: non-paged memory
> [  160.011415] Register r8 information: non-slab/vmalloc memory
> [  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 
> pointer offset 0 size 32
> [  160.025718] Register r10 information: non-slab/vmalloc memory
> [  160.031530] Register r11 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  160.042422] Register r12 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
> [  160.058955] Stack: (0xc8e44018 to 0xc8e46000)

No backtrace? No Code: line? I'm guessing there was an attempt to ftrace
a function involving the ftrace tracing infrastructure, which is why 8KB
of stack has been gobbled up. It could be
copy_from_kernel_nofault_allowed() but it would be useful to have at
least some extract of the backtrace showing the recursive cycle to
confirm, otherwise there is nothing in your report to confirm. As I'm
not a ftrace user myself, this isn't something I'd test for, so having
a full report would be useful.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!



Re: [PATCH] tools/virtio: pipe assertion in vring_test.c

2024-05-27 Thread Michael S. Tsirkin
On Mon, May 27, 2024 at 04:13:31PM +0900, ysk...@gmail.com wrote:
> From: Yunseong Kim 
> 
> The virtio_device need to fail checking when create the geust/host pipe.

typo

> 
> Signed-off-by: Yunseong Kim 


I guess ... 

> ---
>  tools/virtio/vringh_test.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/virtio/vringh_test.c b/tools/virtio/vringh_test.c
> index 98ff808d6f0c..b1af8807c02a 100644
> --- a/tools/virtio/vringh_test.c
> +++ b/tools/virtio/vringh_test.c
> @@ -161,8 +161,8 @@ static int parallel_test(u64 features,
>   host_map = mmap(NULL, mapsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>   guest_map = mmap(NULL, mapsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 
> 0);
>  
> - pipe(to_guest);
> - pipe(to_host);
> + assert(pipe(to_guest) == 0);
> + assert(pipe(to_host) == 0);


I don't like == 0, prefer ! .
Also, calling pipe outside assert is preferable, since in theory
assert can be compiled out.
Not an issue here but people tend to copy/paste text.

>   CPU_ZERO(_set);
>   find_cpus(_cpu, _cpu);
> -- 
> 2.34.1




Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Thorsten Scherer
Hello,

in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe pointed me to
this thread.  With the proposed code change applied the procedure

# set to some known good (randomly guessed) filter function and enable 
function_graph
echo mtdblock_open > set_ftrace_filter
echo function_graph > current_tracer

# walk available filter funcs
cat available_filter_functions | while read f; do echo $f | tee -a 
set_ftrace_filter; sleep 1; done

produces the following output

[  159.832173] Insufficient stack space to handle exception!
[  159.832241] Task stack: [0xc8e44000..0xc8e46000]
[  159.842701] IRQ stack:  [0xc880..0xc8802000]
[  159.847712] Overflow stack: [0xc1934000..0xc1935000]
[  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
[  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
industrialio_triggered_buffer kfifo_buf capture_events_irq capture_events 
iio_trig_hrtimer industrialio_sw_trigger industrialio_configfs dm_mod
[  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
[  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
[  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
[  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
[  159.898376] pc : []lr : []psr: 6093
[  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
[  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
[  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : c8e440ac
[  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : c8e44038
[  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment 
none
[  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
[  159.941436] Register r0 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  159.952253] Register r1 information: non-slab/vmalloc memory
[  159.957988] Register r2 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  159.968791] Register r3 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  159.979592] Register r4 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  159.990391] Register r5 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  160.001187] Register r6 information: non-paged memory
[  160.006303] Register r7 information: non-paged memory
[  160.011415] Register r8 information: non-slab/vmalloc memory
[  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 pointer 
offset 0 size 32
[  160.025718] Register r10 information: non-slab/vmalloc memory
[  160.031530] Register r11 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  160.042422] Register r12 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
[  160.058955] Stack: (0xc8e44018 to 0xc8e46000)

and points me to copy_from_kernel_nofault_allowed.

Disassembly of section .text:

c010eb8c :
c010eb8c:   e1a0c00dmov ip, sp
c010eb90:   e92dd800push{fp, ip, lr, pc}
c010eb94:   e24cb004sub fp, ip, #4
c010eb98:   e52de004push{lr}; (str lr, [sp, 
#-4]!)
c010eb9c:   ebfffb4fbl  c010d8e0 <__gnu_mcount_nc>
c010eba0:   e35004bfcmp r0, #-1090519040; 0xbf00
c010eba4:   3a03bcc c010ebb8 

c010eba8:   e091addsr0, r0, r1
c010ebac:   33a1movcc   r0, #1
c010ebb0:   23a0movcs   r0, #0
c010ebb4:   e89da800ldm sp, {fp, sp, pc}
c010ebb8:   e3a0mov r0, #0
c010ebbc:   e89da800ldm sp, {fp, sp, pc}


In the time given I couldn't make sense of the issue, so I am passing this
hopefully useful information to you.

Best regards
Thorsten

#regzbot introduced: 953f534a7ed6b725d4f101d2949393acc9262880 ^

[1] 
https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76/



Re: [PATCH] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-26 Thread Tatsuya S
On Mon, May 27, 2024 at 08:17:19AM GMT, Masami Hiramatsu wrote:
> On Sun, 26 May 2024 22:51:53 +0800
> kernel test robot  wrote:
> 
> > Hi Tatsuya,
> > 
> > kernel test robot noticed the following build warnings:
> > 
> > [auto build test WARNING on linus/master]
> > [also build test WARNING on rostedt-trace/for-next v6.9 next-20240523]
> > [cannot apply to rostedt-trace/for-next-urgent]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch#_base_tree_information]
> > 
> > url:
> > https://github.com/intel-lab-lkp/linux/commits/Tatsuya-S/ftrace-Fix-stack-trace-entry-generated-by-ftrace_pid_func/20240526-193149
> > base:   linus/master
> > patch link:
> > https://lore.kernel.org/r/20240526112658.46740-1-tatsuya.s2862%40gmail.com
> > patch subject: [PATCH] ftrace: Fix stack trace entry generated by 
> > ftrace_pid_func()
> > config: x86_64-buildonly-randconfig-002-20240526 
> > (https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/config)
> > compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 
> > 617a15a9eac96088ae5e9134248d8236e34b91b1)
> > reproduce (this is a W=1 build): 
> > (https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/reproduce)
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new 
> > version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot 
> > | Closes: 
> > https://lore.kernel.org/oe-kbuild-all/202405262232.l4xh8q6o-...@intel.com/
> > 
> > All warnings (new ones prefixed by >>):
> > 
> > >> kernel/trace/ftrace.c:102:6: warning: no previous prototype for function 
> > >> 'ftrace_pids_enabled' [-Wmissing-prototypes]
> >  102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
> >  |  ^
> >kernel/trace/ftrace.c:102:1: note: declare 'static' if the function is 
> > not intended to be used outside of this translation unit
> >  102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
> >  | ^
> >  | static 
> >1 warning generated.
> 
> This is because the prototype in linux/ftrace.h is placed in the 
> #ifdef CONFIG_DYNAMIC_FTRACE block. The prototype needs to be moved
> outside of the block.
> 
> Thank you,
> 
> > 
> > 
> > vim +/ftrace_pids_enabled +102 kernel/trace/ftrace.c
> > 
> >101  
> >  > 102  bool ftrace_pids_enabled(struct ftrace_ops *ops)
> >103  {
> >104  struct trace_array *tr;
> >105  
> >106  if (!(ops->flags & FTRACE_OPS_FL_PID) || !ops->private)
> >107  return false;
> >108  
> >109  tr = ops->private;
> >110  
> >111  return tr->function_pids != NULL || 
> > tr->function_no_pids != NULL;
> >112  }
> >113  
> > 
> > -- 
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests/wiki
> 
> 
> -- 
> Masami Hiramatsu (Google) 

I understand. Thank you.



Re: [PATCH 00/20] function_graph: Allow multiple users for function graph tracing

2024-05-26 Thread Steven Rostedt
On Mon, 27 May 2024 09:37:47 +0900
Masami Hiramatsu (Google)  wrote:

> > The diff between this and Masami's last update is at the end of this email. 
> >  
> 
> Thanks for update the patches!
> I think your changes are good. I just have some comments and replied.
> So, the plan is to push this series in the tracing/for-next? I will
> port my fprobe part on it and run some tests.

Yeah. I'll probably push it after -rc2 comes out and base it on top of
that. I usually wait till rc2 as it tends to be more stable than rc1.

I'll send an updated series later this week that addresses your comments.

-- Steve



Re: [PATCH 04/20] function_graph: Allow multiple users to attach to function graph

2024-05-26 Thread Steven Rostedt
On Mon, 27 May 2024 09:34:36 +0900
Masami Hiramatsu (Google)  wrote:

> > @@ -110,11 +253,13 @@ void ftrace_graph_stop(void)
> >  /* Add a function return address to the trace stack on thread info.*/
> >  static int
> >  ftrace_push_return_trace(unsigned long ret, unsigned long func,
> > -unsigned long frame_pointer, unsigned long *retp)
> > +unsigned long frame_pointer, unsigned long *retp,
> > +int fgraph_idx)  
> 
> We do not need this fgraph_idx parameter anymore because this removed
> reuse-frame check.

Agreed. Will remove.

> 
> >  {
> > struct ftrace_ret_stack *ret_stack;
> > unsigned long long calltime;
> > -   int index;
> > +   unsigned long val;
> > +   int offset;
> >  
> > if (unlikely(ftrace_graph_is_dead()))
> > return -EBUSY;
> > @@ -124,24 +269,57 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> > long func,
> >  
> > BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));
> >  
> > +   /* Set val to "reserved" with the delta to the new fgraph frame */
> > +   val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> > +
> > /*
> >  * We must make sure the ret_stack is tested before we read
> >  * anything else.
> >  */
> > smp_rmb();
> >  
> > -   /* The return trace stack is full */
> > -   if (current->curr_ret_stack >= SHADOW_STACK_MAX_INDEX) {
> > +   /*
> > +* Check if there's room on the shadow stack to fit a fraph frame
> > +* and a bitmap word.
> > +*/
> > +   if (current->curr_ret_stack + FGRAPH_FRAME_OFFSET + 1 >= 
> > SHADOW_STACK_MAX_OFFSET) {
> > atomic_inc(>trace_overrun);
> > return -EBUSY;
> > }
> >  
> > calltime = trace_clock_local();
> >  
> > -   index = current->curr_ret_stack;
> > -   RET_STACK_INC(current->curr_ret_stack);
> > -   ret_stack = RET_STACK(current, index);
> > +   offset = READ_ONCE(current->curr_ret_stack);
> > +   ret_stack = RET_STACK(current, offset);
> > +   offset += FGRAPH_FRAME_OFFSET;
> > +
> > +   /* ret offset = FGRAPH_FRAME_OFFSET ; type = reserved */
> > +   current->ret_stack[offset] = val;
> > +   ret_stack->ret = ret;
> > +   /*
> > +* The unwinders expect curr_ret_stack to point to either zero
> > +* or an offset where to find the next ret_stack. Even though the
> > +* ret stack might be bogus, we want to write the ret and the
> > +* offset to find the ret_stack before we increment the stack point.
> > +* If an interrupt comes in now before we increment the curr_ret_stack
> > +* it may blow away what we wrote. But that's fine, because the
> > +* offset will still be correct (even though the 'ret' won't be).
> > +* What we worry about is the offset being correct after we increment
> > +* the curr_ret_stack and before we update that offset, as if an
> > +* interrupt comes in and does an unwind stack dump, it will need
> > +* at least a correct offset!
> > +*/
> > barrier();
> > +   WRITE_ONCE(current->curr_ret_stack, offset + 1);
> > +   /*
> > +* This next barrier is to ensure that an interrupt coming in
> > +* will not corrupt what we are about to write.
> > +*/
> > +   barrier();
> > +
> > +   /* Still keep it reserved even if an interrupt came in */
> > +   current->ret_stack[offset] = val;
> > +
> > ret_stack->ret = ret;
> > ret_stack->func = func;
> > ret_stack->calltime = calltime;
> > @@ -151,7 +329,7 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> > long func,
> >  #ifdef HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
> > ret_stack->retp = retp;
> >  #endif
> > -   return 0;
> > +   return offset;
> >  }
> >  
> >  /*
> > @@ -168,49 +346,67 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> > long func,
> >  # define MCOUNT_INSN_SIZE 0
> >  #endif
> >  
> > +/* If the caller does not use ftrace, call this function. */
> >  int function_graph_enter(unsigned long ret, unsigned long func,
> >  unsigned long frame_pointer, unsigned long *retp)
> >  {
> > struct ftrace_graph_ent trace;
> > +   unsigned long bitmap = 0;
> > +   int offset;
> > +   int i;
> >  
> > trace.func = func;
> > trace.depth = ++current->curr_ret_depth;
> >  
> > -   if (ftrace_push_return_trace(ret, func, frame_pointer, retp))
> > +   offset = ftrace_push_return_trace(ret, func, frame_pointer, retp, 0);
> > +   if (offset < 0)
> > goto out;
> >  
> > -   /* Only trace if the calling function expects to */
> > -   if (!fgraph_array[0]->entryfunc())
> > +   for (i = 0; i < fgraph_array_cnt; i++) {
> > +   struct fgraph_ops *gops = fgraph_array[i];
> > +
> > +   if (gops == _stub)
> > +   continue;
> > +
> > +   if (gops->entryfunc())
> > +   bitmap |= BIT(i);
> > +   }
> > +
> > +   if (!bitmap)
> > goto out_ret;
> >  
> > +   /*
> > +* Since this function uses fgraph_idx = 0 as a tail-call 

Re: [PATCH 00/20] function_graph: Allow multiple users for function graph tracing

2024-05-26 Thread Google
On Fri, 24 May 2024 22:36:52 -0400
Steven Rostedt  wrote:

> [
>   Resend for some of you as I missed a comma in the Cc and
>   quilt died sending this out.
> ]
> 
> This is a continuation of the function graph multi user code.
> I wrote a proof of concept back in 2019 of this code[1] and
> Masami started cleaning it up. I started from Masami's work v10
> that can be found here:
> 
>  
> https://lore.kernel.org/linux-trace-kernel/171509088006.162236.7227326999861366050.stgit@devnote2/
> 
> This is *only* the code that allows multiple users of function
> graph tracing. This is not the fprobe work that Masami is working
> to add on top of it. As Masami took my proof of concept, there
> was still several things I disliked about that code. Instead of
> having Masami clean it up even more, I decided to take over on just
> my code and change it up a bit.
> 
> The biggest changes from where Masami left off is mostly renaming more
> variables, macros, and function names. I fixed up the current comments
> and added more to make the code a bit more understandable.
> 
> At the end of the series, I added two patches to optimize the entry
> and exit. On entry, there was a loop that iterated the 16 elements
> of the fgraph_array[] looking for any that may have a gops registered
> to it. It's quite a waste to do that loop if there's only one
> registered user. To fix that, I added a fgraph_array_bitmask that has
> its bits set that correspond to the elements of the array. Then
> a simple for_each_set_bit() is used for the iteration. I do the same
> thing at the exit callback of the function where it iterates over the
> bits of the bitmap saved on the ret_stack.
> 
> I also noticed that Masami added code to handle tail calls in the
> unwinder and had that in one of my patches. I took that code out
> and made it a separate patch with Masami as the author.
> 
> The diff between this and Masami's last update is at the end of this email.

Thanks for update the patches!
I think your changes are good. I just have some comments and replied.
So, the plan is to push this series in the tracing/for-next? I will
port my fprobe part on it and run some tests.

Thank you,

> 
> Based on Linus commit: 0eb03c7e8e2a4cc3653eb5eeb2d2001182071215
> 
> [1] https://lore.kernel.org/all/20190525031633.811342...@goodmis.org/
> 
> Masami Hiramatsu (Google) (3):
>   function_graph: Handle tail calls for stack unwinding
>   function_graph: Use a simple LRU for fgraph_array index number
>   ftrace: Add multiple fgraph storage selftest
> 
> Steven Rostedt (Google) (2):
>   function_graph: Use for_each_set_bit() in __ftrace_return_to_handler()
>   function_graph: Use bitmask to loop on fgraph entry
> 
> Steven Rostedt (VMware) (15):
>   function_graph: Convert ret_stack to a series of longs
>   fgraph: Use BUILD_BUG_ON() to make sure we have structures divisible by 
> long
>   function_graph: Add an array structure that will allow multiple 
> callbacks
>   function_graph: Allow multiple users to attach to function graph
>   function_graph: Remove logic around ftrace_graph_entry and return
>   ftrace/function_graph: Pass fgraph_ops to function graph callbacks
>   ftrace: Allow function_graph tracer to be enabled in instances
>   ftrace: Allow ftrace startup flags to exist without dynamic ftrace
>   function_graph: Have the instances use their own ftrace_ops for 
> filtering
>   function_graph: Add "task variables" per task for fgraph_ops
>   function_graph: Move set_graph_function tests to shadow stack global var
>   function_graph: Move graph depth stored data to shadow stack global var
>   function_graph: Move graph notrace bit to shadow stack global var
>   function_graph: Implement fgraph_reserve_data() and 
> fgraph_retrieve_data()
>   function_graph: Add selftest for passing local variables
> 
> 
>  arch/arm64/kernel/ftrace.c   |  21 +-
>  arch/loongarch/kernel/ftrace_dyn.c   |  15 +-
>  arch/powerpc/kernel/trace/ftrace.c   |   3 +-
>  arch/riscv/kernel/ftrace.c   |  15 +-
>  arch/x86/kernel/ftrace.c |  19 +-
>  include/linux/ftrace.h   |  42 +-
>  include/linux/sched.h|   2 +-
>  include/linux/trace_recursion.h  |  39 --
>  kernel/trace/fgraph.c| 994 
> ---
>  kernel/trace/ftrace.c|  11 +-
>  kernel/trace/ftrace_internal.h   |   2 -
>  kernel/trace/trace.h |  94 +++-
>  kernel/trace/trace_functions.c   |   8 +
>  kernel/trace/trace_functions_graph.c |  96 ++--
>  kernel/trace/trace_irqsoff.c |  10 +-
>  kernel/trace/trace_sched_wakeup.c|  10 +-
>  kernel/trace/trace_selftest.c| 259 -
>  17 files changed, 1330 insertions(+), 310 deletions(-)
> 
> 
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 3313e4b83aa2..1aae521e5997 100644
> --- a/kernel/trace/fgraph.c
> +++ 

Re: [PATCH 04/20] function_graph: Allow multiple users to attach to function graph

2024-05-26 Thread Google
On Fri, 24 May 2024 22:36:56 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (VMware)" 
> 
> Allow for multiple users to attach to function graph tracer at the same
> time. Only 16 simultaneous users can attach to the tracer. This is because
> there's an array that stores the pointers to the attached fgraph_ops. When
> a function being traced is entered, each of the ftrace_ops entryfunc is
> called and if it returns non zero, its index into the array will be added
> to the shadow stack.
> 
> On exit of the function being traced, the shadow stack will contain the
> indexes of the ftrace_ops on the array that want their retfunc to be
> called.
> 
> Because a function may sleep for a long time (if a task sleeps itself),
> the return of the function may be literally days later. If the ftrace_ops
> is removed, its place on the array is replaced with a ftrace_ops that
> contains the stub functions and that will be called when the function
> finally returns.
> 
> If another ftrace_ops is added that happens to get the same index into the
> array, its return function may be called. But that's actually the way
> things current work with the old function graph tracer. If one tracer is
> removed and another is added, the new one will get the return calls of the
> function traced by the previous one, thus this is not a regression. This
> can be fixed by adding a counter to each time the array item is updated and
> save that on the shadow stack as well, such that it won't be called if the
> index saved does not match the index on the array.
> 
> Note, being able to filter functions when both are called is not completely
> handled yet, but that shouldn't be too hard to manage.
> 
> Co-developed with Masami Hiramatsu:
> Link: 
> https://lore.kernel.org/linux-trace-kernel/171509096221.162236.8806372072523195752.stgit@devnote2
> 

Thanks for update this. I have some comments below.

> Signed-off-by: Steven Rostedt (VMware) 
> Signed-off-by: Masami Hiramatsu (Google) 
> Signed-off-by: Steven Rostedt (Google) 
> ---
[...]

> @@ -110,11 +253,13 @@ void ftrace_graph_stop(void)
>  /* Add a function return address to the trace stack on thread info.*/
>  static int
>  ftrace_push_return_trace(unsigned long ret, unsigned long func,
> -  unsigned long frame_pointer, unsigned long *retp)
> +  unsigned long frame_pointer, unsigned long *retp,
> +  int fgraph_idx)

We do not need this fgraph_idx parameter anymore because this removed
reuse-frame check.

>  {
>   struct ftrace_ret_stack *ret_stack;
>   unsigned long long calltime;
> - int index;
> + unsigned long val;
> + int offset;
>  
>   if (unlikely(ftrace_graph_is_dead()))
>   return -EBUSY;
> @@ -124,24 +269,57 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> long func,
>  
>   BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));
>  
> + /* Set val to "reserved" with the delta to the new fgraph frame */
> + val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> +
>   /*
>* We must make sure the ret_stack is tested before we read
>* anything else.
>*/
>   smp_rmb();
>  
> - /* The return trace stack is full */
> - if (current->curr_ret_stack >= SHADOW_STACK_MAX_INDEX) {
> + /*
> +  * Check if there's room on the shadow stack to fit a fraph frame
> +  * and a bitmap word.
> +  */
> + if (current->curr_ret_stack + FGRAPH_FRAME_OFFSET + 1 >= 
> SHADOW_STACK_MAX_OFFSET) {
>   atomic_inc(>trace_overrun);
>   return -EBUSY;
>   }
>  
>   calltime = trace_clock_local();
>  
> - index = current->curr_ret_stack;
> - RET_STACK_INC(current->curr_ret_stack);
> - ret_stack = RET_STACK(current, index);
> + offset = READ_ONCE(current->curr_ret_stack);
> + ret_stack = RET_STACK(current, offset);
> + offset += FGRAPH_FRAME_OFFSET;
> +
> + /* ret offset = FGRAPH_FRAME_OFFSET ; type = reserved */
> + current->ret_stack[offset] = val;
> + ret_stack->ret = ret;
> + /*
> +  * The unwinders expect curr_ret_stack to point to either zero
> +  * or an offset where to find the next ret_stack. Even though the
> +  * ret stack might be bogus, we want to write the ret and the
> +  * offset to find the ret_stack before we increment the stack point.
> +  * If an interrupt comes in now before we increment the curr_ret_stack
> +  * it may blow away what we wrote. But that's fine, because the
> +  * offset will still be correct (even though the 'ret' won't be).
> +  * What we worry about is the offset being correct after we increment
> +  * the curr_ret_stack and before we update that offset, as if an
> +  * interrupt comes in and does an unwind stack dump, it will need
> +  * at least a correct offset!
> +  */
>   barrier();
> + WRITE_ONCE(current->curr_ret_stack, offset + 1);
> + /*
> +  

Re: [PATCH 20/20] function_graph: Use bitmask to loop on fgraph entry

2024-05-26 Thread Steven Rostedt
On Mon, 27 May 2024 09:09:49 +0900
Masami Hiramatsu (Google)  wrote:

> > Note, we do not care about races. If a bit is set before the gops is
> > assigned, it only wastes time looking at the element and ignoring it (as
> > it did before this bitmask is added).  
> 
> This is OK because anyway we check gops == _stub.
> By the way, shouldn't we also make "if (gops == _stub)"
> check unlikely()?

Yeah, I'll add the unlikely() here too.

> 
> This change looks good to me.
> 
> Reviewed-by: Masami Hiramatsu (Google) 

Thanks for the review Masami!

-- Steve




Re: [PATCH 19/20] function_graph: Use for_each_set_bit() in __ftrace_return_to_handler()

2024-05-26 Thread Steven Rostedt
On Mon, 27 May 2024 09:04:34 +0900
Masami Hiramatsu (Google)  wrote:

> > bitmap = get_bitmap_bits(current, offset);
> > -   for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) {
> > +
> > +   for_each_set_bit(i, , sizeof(bitmap) * BITS_PER_BYTE) {
> > struct fgraph_ops *gops = fgraph_array[i];
> >  
> > -   if (!(bitmap & BIT(i)))
> > -   continue;
> > if (gops == _stub)  
> 
> Ah, nit: maybe this is unlikely()?

Ah probably. I'll update it.

Thanks,

-- Steve




Re: [PATCH 20/20] function_graph: Use bitmask to loop on fgraph entry

2024-05-26 Thread Google
On Fri, 24 May 2024 22:37:12 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> Instead of looping through all the elements of fgraph_array[] to see if
> there's an gops attached to one and then calling its gops->func(). Create
> a fgraph_array_bitmask that sets bits when an index in the array is
> reserved (via the simple lru algorithm). Then only the bits set in this
> bitmask needs to be looked at where only elements in the array that have
> ops registered need to be looked at.
> 
> Note, we do not care about races. If a bit is set before the gops is
> assigned, it only wastes time looking at the element and ignoring it (as
> it did before this bitmask is added).

This is OK because anyway we check gops == _stub.
By the way, shouldn't we also make "if (gops == _stub)"
check unlikely()?

This change looks good to me.

Reviewed-by: Masami Hiramatsu (Google) 

Thank you,

> 
> Signed-off-by: Steven Rostedt (Google) 
> ---
>  kernel/trace/fgraph.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 5e8e13ffcfb6..1aae521e5997 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -173,6 +173,7 @@ DEFINE_STATIC_KEY_FALSE(kill_ftrace_graph);
>  int ftrace_graph_active;
>  
>  static struct fgraph_ops *fgraph_array[FGRAPH_ARRAY_SIZE];
> +static unsigned long fgraph_array_bitmask;
>  
>  /* LRU index table for fgraph_array */
>  static int fgraph_lru_table[FGRAPH_ARRAY_SIZE];
> @@ -197,6 +198,8 @@ static int fgraph_lru_release_index(int idx)
>  
>   fgraph_lru_table[fgraph_lru_last] = idx;
>   fgraph_lru_last = (fgraph_lru_last + 1) % FGRAPH_ARRAY_SIZE;
> +
> + clear_bit(idx, _array_bitmask);
>   return 0;
>  }
>  
> @@ -211,6 +214,8 @@ static int fgraph_lru_alloc_index(void)
>  
>   fgraph_lru_table[fgraph_lru_next] = -1;
>   fgraph_lru_next = (fgraph_lru_next + 1) % FGRAPH_ARRAY_SIZE;
> +
> + set_bit(idx, _array_bitmask);
>   return idx;
>  }
>  
> @@ -632,7 +637,8 @@ int function_graph_enter(unsigned long ret, unsigned long 
> func,
>   if (offset < 0)
>   goto out;
>  
> - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) {
> + for_each_set_bit(i, _array_bitmask,
> +  sizeof(fgraph_array_bitmask) * BITS_PER_BYTE) {
>   struct fgraph_ops *gops = fgraph_array[i];
>   int save_curr_ret_stack;
>  
> -- 
> 2.43.0
> 
> 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH 19/20] function_graph: Use for_each_set_bit() in __ftrace_return_to_handler()

2024-05-26 Thread Google
On Fri, 24 May 2024 22:37:11 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> Instead of iterating through the entire fgraph_array[] and seeing if one
> of the bitmap bits are set to know to call the array's retfunc() function,
> use for_each_set_bit() on the bitmap itself. This will only iterate for
> the number of set bits.
> 
> Signed-off-by: Steven Rostedt (Google) 
> ---
>  kernel/trace/fgraph.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 4d503b3e45ad..5e8e13ffcfb6 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -827,11 +827,10 @@ static unsigned long __ftrace_return_to_handler(struct 
> fgraph_ret_regs *ret_regs
>  #endif
>  
>   bitmap = get_bitmap_bits(current, offset);
> - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) {
> +
> + for_each_set_bit(i, , sizeof(bitmap) * BITS_PER_BYTE) {
>   struct fgraph_ops *gops = fgraph_array[i];
>  
> - if (!(bitmap & BIT(i)))
> - continue;
>   if (gops == _stub)

Ah, nit: maybe this is unlikely()?

Thank you,


-- 
Masami Hiramatsu (Google) 



Re: [PATCH 19/20] function_graph: Use for_each_set_bit() in __ftrace_return_to_handler()

2024-05-26 Thread Google
On Fri, 24 May 2024 22:37:11 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> Instead of iterating through the entire fgraph_array[] and seeing if one
> of the bitmap bits are set to know to call the array's retfunc() function,
> use for_each_set_bit() on the bitmap itself. This will only iterate for
> the number of set bits.
> 

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) 

Thanks,

> Signed-off-by: Steven Rostedt (Google) 
> ---
>  kernel/trace/fgraph.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 4d503b3e45ad..5e8e13ffcfb6 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -827,11 +827,10 @@ static unsigned long __ftrace_return_to_handler(struct 
> fgraph_ret_regs *ret_regs
>  #endif
>  
>   bitmap = get_bitmap_bits(current, offset);
> - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) {
> +
> + for_each_set_bit(i, , sizeof(bitmap) * BITS_PER_BYTE) {
>   struct fgraph_ops *gops = fgraph_array[i];
>  
> - if (!(bitmap & BIT(i)))
> - continue;
>   if (gops == _stub)
>   continue;
>  
> -- 
> 2.43.0
> 
> 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-26 Thread Google
On Sun, 26 May 2024 22:51:53 +0800
kernel test robot  wrote:

> Hi Tatsuya,
> 
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on linus/master]
> [also build test WARNING on rostedt-trace/for-next v6.9 next-20240523]
> [cannot apply to rostedt-trace/for-next-urgent]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:
> https://github.com/intel-lab-lkp/linux/commits/Tatsuya-S/ftrace-Fix-stack-trace-entry-generated-by-ftrace_pid_func/20240526-193149
> base:   linus/master
> patch link:
> https://lore.kernel.org/r/20240526112658.46740-1-tatsuya.s2862%40gmail.com
> patch subject: [PATCH] ftrace: Fix stack trace entry generated by 
> ftrace_pid_func()
> config: x86_64-buildonly-randconfig-002-20240526 
> (https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/config)
> compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 
> 617a15a9eac96088ae5e9134248d8236e34b91b1)
> reproduce (this is a W=1 build): 
> (https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version 
> of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot 
> | Closes: 
> https://lore.kernel.org/oe-kbuild-all/202405262232.l4xh8q6o-...@intel.com/
> 
> All warnings (new ones prefixed by >>):
> 
> >> kernel/trace/ftrace.c:102:6: warning: no previous prototype for function 
> >> 'ftrace_pids_enabled' [-Wmissing-prototypes]
>  102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
>  |  ^
>kernel/trace/ftrace.c:102:1: note: declare 'static' if the function is not 
> intended to be used outside of this translation unit
>  102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
>  | ^
>  | static 
>1 warning generated.

This is because the prototype in linux/ftrace.h is placed in the 
#ifdef CONFIG_DYNAMIC_FTRACE block. The prototype needs to be moved
outside of the block.

Thank you,

> 
> 
> vim +/ftrace_pids_enabled +102 kernel/trace/ftrace.c
> 
>101
>  > 102bool ftrace_pids_enabled(struct ftrace_ops *ops)
>103{
>104struct trace_array *tr;
>105
>106if (!(ops->flags & FTRACE_OPS_FL_PID) || !ops->private)
>107return false;
>108
>109tr = ops->private;
>110
>111return tr->function_pids != NULL || 
> tr->function_no_pids != NULL;
>112}
>113
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki


-- 
Masami Hiramatsu (Google) 



Re: [PATCH] tracing/probes: fix error check in parse_btf_field()

2024-05-26 Thread Google
On Sun, 26 May 2024 14:27:56 +0200
Carlos López  wrote:

> 
> Hi,
> 
> On 26/5/24 12:17, Masami Hiramatsu (Google) wrote:
> > On Sat, 25 May 2024 20:21:32 +0200
> > Carlos López  wrote:
> > 
> >> btf_find_struct_member() might return NULL or an error via the
> >> ERR_PTR() macro. However, its caller in parse_btf_field() only checks
> >> for the NULL condition. Fix this by using IS_ERR() and returning the
> >> error up the stack.
> >>
> > 
> > Thanks for finding it!
> > I think this requires new error message for error_log file.
> > Can you add the log as
> > 
> > trace_probe_log_err(ctx->offset, BTF_ERROR);
> > 
> > And define BTF_ERROR in ERRORS@kernel/trace/trace_probe.h ?
> 
> Sounds good, but should we perhaps reuse BAD_BTF_TID?
> 
> ```
> C(BAD_BTF_TID,"Failed to get BTF type info."),\
> ```
> 
> `btf_find_struct_member()` fails if `type` is not a struct or if it runs
> OOM while allocating the anon stack, so it seems appropriate.

Good point, it sounds reasonable.

Thanks!

> 
> Best,
> Carlos
> 
> > Thank you,
> > 
> >> Fixes: c440adfbe3025 ("tracing/probes: Support BTF based data structure 
> >> field access")
> >> Signed-off-by: Carlos López 
> >> ---
> >>   kernel/trace/trace_probe.c | 2 ++
> >>   1 file changed, 2 insertions(+)
> >>
> >> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> >> index 5e263c141574..5417e9712157 100644
> >> --- a/kernel/trace/trace_probe.c
> >> +++ b/kernel/trace/trace_probe.c
> >> @@ -554,6 +554,8 @@ static int parse_btf_field(char *fieldname, const 
> >> struct btf_type *type,
> >>anon_offs = 0;
> >>field = btf_find_struct_member(ctx->btf, type, 
> >> fieldname,
> >>   _offs);
> >> +  if (IS_ERR(field))
> >> +  return PTR_ERR(field);
> >>if (!field) {
> >>trace_probe_log_err(ctx->offset, NO_BTF_FIELD);
> >>return -ENOENT;
> >> -- 
> >> 2.35.3
> >>
> > 
> > 
> 
> -- 
> Carlos López
> Security Engineer
> SUSE Software Solutions


-- 
Masami Hiramatsu (Google) 



Re: [PATCH] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-26 Thread kernel test robot
Hi Tatsuya,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on rostedt-trace/for-next v6.9 next-20240523]
[cannot apply to rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Tatsuya-S/ftrace-Fix-stack-trace-entry-generated-by-ftrace_pid_func/20240526-193149
base:   linus/master
patch link:
https://lore.kernel.org/r/20240526112658.46740-1-tatsuya.s2862%40gmail.com
patch subject: [PATCH] ftrace: Fix stack trace entry generated by 
ftrace_pid_func()
config: x86_64-buildonly-randconfig-002-20240526 
(https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/config)
compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 
617a15a9eac96088ae5e9134248d8236e34b91b1)
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202405262232.l4xh8q6o-...@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/trace/ftrace.c:102:6: warning: no previous prototype for function 
>> 'ftrace_pids_enabled' [-Wmissing-prototypes]
 102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
 |  ^
   kernel/trace/ftrace.c:102:1: note: declare 'static' if the function is not 
intended to be used outside of this translation unit
 102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
 | ^
 | static 
   1 warning generated.


vim +/ftrace_pids_enabled +102 kernel/trace/ftrace.c

   101  
 > 102  bool ftrace_pids_enabled(struct ftrace_ops *ops)
   103  {
   104  struct trace_array *tr;
   105  
   106  if (!(ops->flags & FTRACE_OPS_FL_PID) || !ops->private)
   107  return false;
   108  
   109  tr = ops->private;
   110  
   111  return tr->function_pids != NULL || tr->function_no_pids != 
NULL;
   112  }
   113  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



Re: [PATCH] tracing/probes: fix error check in parse_btf_field()

2024-05-26 Thread Carlos López



Hi,

On 26/5/24 12:17, Masami Hiramatsu (Google) wrote:

On Sat, 25 May 2024 20:21:32 +0200
Carlos López  wrote:


btf_find_struct_member() might return NULL or an error via the
ERR_PTR() macro. However, its caller in parse_btf_field() only checks
for the NULL condition. Fix this by using IS_ERR() and returning the
error up the stack.



Thanks for finding it!
I think this requires new error message for error_log file.
Can you add the log as

trace_probe_log_err(ctx->offset, BTF_ERROR);

And define BTF_ERROR in ERRORS@kernel/trace/trace_probe.h ?


Sounds good, but should we perhaps reuse BAD_BTF_TID?

```
C(BAD_BTF_TID,  "Failed to get BTF type info."),\
```

`btf_find_struct_member()` fails if `type` is not a struct or if it runs
OOM while allocating the anon stack, so it seems appropriate.

Best,
Carlos


Thank you,


Fixes: c440adfbe3025 ("tracing/probes: Support BTF based data structure field 
access")
Signed-off-by: Carlos López 
---
  kernel/trace/trace_probe.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 5e263c141574..5417e9712157 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -554,6 +554,8 @@ static int parse_btf_field(char *fieldname, const struct 
btf_type *type,
anon_offs = 0;
field = btf_find_struct_member(ctx->btf, type, 
fieldname,
   _offs);
+   if (IS_ERR(field))
+   return PTR_ERR(field);
if (!field) {
trace_probe_log_err(ctx->offset, NO_BTF_FIELD);
return -ENOENT;
--
2.35.3






--
Carlos López
Security Engineer
SUSE Software Solutions



Re: [PATCH] tracing/probes: fix error check in parse_btf_field()

2024-05-26 Thread Google
On Sat, 25 May 2024 20:21:32 +0200
Carlos López  wrote:

> btf_find_struct_member() might return NULL or an error via the
> ERR_PTR() macro. However, its caller in parse_btf_field() only checks
> for the NULL condition. Fix this by using IS_ERR() and returning the
> error up the stack.
> 

Thanks for finding it!
I think this requires new error message for error_log file.
Can you add the log as

trace_probe_log_err(ctx->offset, BTF_ERROR);

And define BTF_ERROR in ERRORS@kernel/trace/trace_probe.h ?

Thank you,

> Fixes: c440adfbe3025 ("tracing/probes: Support BTF based data structure field 
> access")
> Signed-off-by: Carlos López 
> ---
>  kernel/trace/trace_probe.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 5e263c141574..5417e9712157 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -554,6 +554,8 @@ static int parse_btf_field(char *fieldname, const struct 
> btf_type *type,
>   anon_offs = 0;
>   field = btf_find_struct_member(ctx->btf, type, 
> fieldname,
>  _offs);
> + if (IS_ERR(field))
> + return PTR_ERR(field);
>   if (!field) {
>   trace_probe_log_err(ctx->offset, NO_BTF_FIELD);
>   return -ENOENT;
> -- 
> 2.35.3
> 


-- 
Masami Hiramatsu (Google) 



Re: [RFC PATCH 00/20] Introduce the famfs shared-memory file system

2024-05-25 Thread Dave Chinner
On Fri, May 24, 2024 at 09:55:48AM +0200, Miklos Szeredi wrote:
> On Fri, 24 May 2024 at 02:47, John Groves  wrote:
> 
> > Apologies, but I'm short on time at the moment - going into a long holiday
> > weekend in the US with family plans. I should be focused again by middle of
> > next week.
> 
> NP.
> 
> Obviously I'll need to test it before anything is merged, other than
> that this is not urgent at all...
> 
> > But can you check /proc/cmdline to see of the memmap arg got through without
> > getting mangled? The '$' tends to get fubar'd. You might need \$, or I've 
> > seen
> > the need for \\\$. If it's un-mangled, there should be a dax device.
> 
> /proc/cmdline shows the option correctly:
> 
> root@kvm:~# cat /proc/cmdline
> root=/dev/vda console=hvc0 memmap=4G$4G
> 
> > If that doesn't work, it's worth trying '!' instead, which I think would 
> > give
> > you a pmem device - if the arg gets through (but ! is less likely to get
> > horked). That pmem device can be converted to devdax...
> 
> That doesn't work either.  No device created in /dev  (dax or pmem).

I think you need to do some ndctl magic to get the memory to be
namespaced correctly for the correct devices to appear.

https://docs.pmem.io/ndctl-user-guide/managing-namespaces

IIRC, need to set the type to pmem and the mode to fsdax, devdax or
raw to get the relevant device nodes to be created for the range..

Cheers,

Dave.

-- 
Dave Chinner
da...@fromorbit.com



Re: [PATCH RFC 1/2] dt-bindings: soc: qcom,smsm: Allow specifying mboxes instead of qcom,ipc

2024-05-25 Thread Krzysztof Kozlowski
On 24/05/2024 19:55, Luca Weiss wrote:
> On Donnerstag, 23. Mai 2024 08:19:11 MESZ Krzysztof Kozlowski wrote:
>> On 23/05/2024 08:16, Luca Weiss wrote:
>>> On Donnerstag, 23. Mai 2024 08:02:13 MESZ Krzysztof Kozlowski wrote:
 On 22/05/2024 19:34, Luca Weiss wrote:
> On Mittwoch, 22. Mai 2024 08:49:43 MESZ Krzysztof Kozlowski wrote:
>> On 21/05/2024 22:35, Luca Weiss wrote:
>>> On Dienstag, 21. Mai 2024 10:58:07 MESZ Krzysztof Kozlowski wrote:
 On 20/05/2024 17:11, Luca Weiss wrote:
> Hi Krzysztof
>
> Ack, sounds good.
>
> Maybe also from you, any opinion between these two binding styles?
>
> So first using index of mboxes for the numbering, where for the known
> usages the first element (and sometimes the 3rd - ipc-2) are empty <>.
>
> The second variant is using mbox-names to get the correct channel-mbox
> mapping.
>
> -   qcom,ipc-1 = < 8 13>;
> -   qcom,ipc-2 = < 8 9>;
> -   qcom,ipc-3 = < 8 19>;
> +   mboxes = <0>, < 13>, < 9>, < 19>;
>
> vs.
>
> -   qcom,ipc-1 = < 8 13>;
> -   qcom,ipc-2 = < 8 9>;
> -   qcom,ipc-3 = < 8 19>;
> +   mboxes = < 13>, < 9>, < 19>;
> +   mbox-names = "ipc-1", "ipc-2", "ipc-3";

 Sorry, don't get, ipc-1 is the first mailbox, so why would there be <0>
 in first case?
>>>
>>> Actually not, ipc-0 would be permissible by the driver, used for the 
>>> 0th host
>>>
>>> e.g. from:
>>>
>>> /* Iterate over all hosts to check whom wants a kick */
>>> for (host = 0; host < smsm->num_hosts; host++) {
>>> hostp = >hosts[host];
>>>
>>> Even though no mailbox is specified in any upstream dts for this 0th 
>>> host I
>>> didn't want the bindings to restrict that, that's why in the first 
>>> example
>>> there's an empty element (<0>) for the 0th smsm host
>>>
 Anyway, the question is if you need to know that some
 mailbox is missing. But then it is weird to name them "ipc-1" etc.
>>>
>>> In either case we'd just query the mbox (either by name or index) and 
>>> then
>>> see if it's there? Not quite sure I understand the sentence..
>>> Pretty sure either binding would work the same way.
>>
>> The question is: does the driver care only about having some mailboxes
>> or the driver cares about each specific mailbox? IOW, is skipping ipc-0
>> important for the driver?
>
> There's nothing special from driver side about any mailbox. Some SoCs have
> a mailbox for e.g. hosts 1&2&3, some have only 1&3, and apq8064 even has
> 1&2&3&4.
>
> And if the driver doesn't find a mailbox for a host, it just ignores it
> but then of course it can't 'ring' the mailbox for that host when 
> necessary.
>
> Not sure how much more I can add here, to be fair I barely understand what
> this driver is doing myself apart from the obvious.

 From what you said, it looks like it is enough to just list mailboxes,
 e.g. for ipc-1, ipc-2 and ipc-4 (so no ipc-0 and ipc-3):
>>>
>>> No, for sure we need also the possibility to list ipc-3.
>>
>> ? You can list it, what's the problem>
> 
> Maybe we're talking past each other...
> 
> You asked why this wouldn't work:
> 
>   e.g. for ipc-1, ipc-2 and ipc-4 (so no ipc-0 and ipc-3):
>   mboxes = < 13>, < 9>, < 19>;
> 
> How would we know that the 3rd mailbox ( 19) is for the 4th host
> (previous ipc-4)?
> 
> 1. If we use mboxes with indexes we'd need to have <0> values for
> "smsm hosts" where we don't have a mailbox for - this is at least
> for the 2nd smsm host (qcom,ipc-2) for a bunch of SoCs.
> 
> 2. If we use mboxes with mbox-names then we could skip that since we
> can directly specify which "smsm host" a given mailbox is for.
> 
> My only question really is whether 1. or 2. is a better idea.
> 
> Is this clearer now or still not?


So again, does the driver care about missing entry? If so, why?

Best regards,
Krzysztof




Re: [PATCH v10 00/36] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph

2024-05-25 Thread Google
On Fri, 24 May 2024 18:41:56 -0400
Steven Rostedt  wrote:

> On Tue,  7 May 2024 23:08:00 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > Steven Rostedt (VMware) (15):
> >   function_graph: Convert ret_stack to a series of longs
> >   fgraph: Use BUILD_BUG_ON() to make sure we have structures divisible 
> > by long
> >   function_graph: Add an array structure that will allow multiple 
> > callbacks
> >   function_graph: Allow multiple users to attach to function graph
> >   function_graph: Remove logic around ftrace_graph_entry and return
> >   ftrace/function_graph: Pass fgraph_ops to function graph callbacks
> >   ftrace: Allow function_graph tracer to be enabled in instances
> >   ftrace: Allow ftrace startup flags exist without dynamic ftrace
> >   function_graph: Have the instances use their own ftrace_ops for 
> > filtering
> >   function_graph: Add "task variables" per task for fgraph_ops
> >   function_graph: Move set_graph_function tests to shadow stack global 
> > var
> >   function_graph: Move graph depth stored data to shadow stack global 
> > var
> >   function_graph: Move graph notrace bit to shadow stack global var
> >   function_graph: Implement fgraph_reserve_data() and 
> > fgraph_retrieve_data()
> >   function_graph: Add selftest for passing local variables
> 
> Hi Masami,
> 
> While reviewing these patches, I realized there's several things I dislike
> about the patches I wrote. So I took these patches and started cleaning
> them up a little. Mostly renaming functions and adding comments.

Thanks for cleaning up the patches!!

> 
> As this is a major change to the function graph tracer, and I feel nervous
> about building something on top of this, how about I take over these
> patches and push them out for the next merge window. I'm hoping to get them
> into linux-next by v6.10-rc2 (I spent the day working on them, and it's
> mostly minor tweaks).

OK.

> Then I can push it out to 6.11 and get some good testing against it. Then
> we can add your stuff on top and get that merged in 6.12.

Yeah, it is reasonable plan. I also concerns about the stability. Especially,
this involves fprobe side changes too. If we introduce both at once, it may
mess up many things.

> 
> If all goes well, I'm hoping to get a series on just these patches (and
> your selftest addition) by tonight.
> 
> Thoughts?

I agree with you.

Thank you,

> 
> -- Steve


-- 
Masami Hiramatsu (Google) 



Re: [PATCH v10 07/36] function_graph: Allow multiple users to attach to function graph

2024-05-25 Thread Google
On Fri, 24 May 2024 21:32:08 -0400
Steven Rostedt  wrote:

> On Tue,  7 May 2024 23:09:22 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > @@ -109,6 +244,21 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> > long func,
> > if (!current->ret_stack)
> > return -EBUSY;
> >  
> > +   /*
> > +* At first, check whether the previous fgraph callback is pushed by
> > +* the fgraph on the same function entry.
> > +* But if @func is the self tail-call function, we also need to ensure
> > +* the ret_stack is not for the previous call by checking whether the
> > +* bit of @fgraph_idx is set or not.
> > +*/
> > +   ret_stack = get_ret_stack(current, current->curr_ret_stack, );
> > +   if (ret_stack && ret_stack->func == func &&
> > +   get_fgraph_type(current, offset + FGRAPH_FRAME_OFFSET) == 
> > FGRAPH_TYPE_BITMAP &&
> > +   !is_fgraph_index_set(current, offset + FGRAPH_FRAME_OFFSET, 
> > fgraph_idx))
> > +   return offset + FGRAPH_FRAME_OFFSET;
> > +
> > +   val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> > +
> > BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));
> 
> I'm trying to figure out what the above is trying to do. This gets called
> once in function_graph_enter() (or function_graph_enter_ops()). What
> exactly are you trying to catch here?

Aah, good catch! This was originally for catching the self tail-call case with
multiple fgraph callback on the same function, but it was my misread.
In later patch ([12/36]), we introduced function_graph_enter_ops() so that
we can skip checking hash table and directly pass the fgraph_ops to user
callback. I thought this function_graph_enter_ops() is used even if multiple
fgraph is set on the same function. In this case, we always need to check the
stack can be reused(pushed by other fgraph_ops on the same function) or not.
But as we discussed, the function_graph_enter_ops() is used only when only
one fgraph is set on the function (if there are multiple fgraphs are set on
the same function, use function_graph_enter() ), we are sure that 
ftrace_push_return_trace() is called only once on hooking the function entry.
Thus we don't need to reuse it.

> 
> Is it from this email:
> 
>   
> https://lore.kernel.org/all/20231110105154.df937bf9f200a0c16806c...@kernel.org/
> 
> As that's the last version before you added the above code.
> 
> But you also noticed it may not be needed, but triggered a crash without it
> in v3:
> 
>   
> https://lore.kernel.org/all/20231205234511.3839128259dfec153ea7d...@kernel.org/
> 
> I removed this code in my version and it runs just fine. Perhaps there was
> another bug that this was hiding that you fixed in later versions?

No problem. I think we can remove this block safely.

Thank you,

> 
> -- Steve
> 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH] x86/paravirt: Disable virt spinlock when CONFIG_PARAVIRT_SPINLOCKS disabled

2024-05-25 Thread Chen Yu
On 2024-05-23 at 09:30:59 -0700, Dave Hansen wrote:
> On 5/16/24 06:02, Chen Yu wrote:
> > Performance drop is reported when running encode/decode workload and
> > BenchSEE cache sub-workload.
> > Bisect points to commit ce0a1b608bfc ("x86/paravirt: Silence unused
> > native_pv_lock_init() function warning"). When CONFIG_PARAVIRT_SPINLOCKS
> > is disabled the virt_spin_lock_key is set to true on bare-metal.
> > The qspinlock degenerates to test-and-set spinlock, which decrease the
> > performance on bare-metal.
> > 
> > Fix this by disabling virt_spin_lock_key if CONFIG_PARAVIRT_SPINLOCKS
> > is not set, or it is on bare-metal.
> 
> This is missing some background:
> 
> The kernel can change spinlock behavior when running as a guest.  But
> this guest-friendly behavior causes performance problems on bare metal.
> So there's a 'virt_spin_lock_key' static key to switch between the two
> modes.
> 
> The static key is always enabled by default (run in guest mode) and
> should be disabled for bare metal (and in some guests that want native
> behavior).
> 
> ... then describe the regression and the fix
>
Thanks Juergen for your review.

And thanks Dave for the write up, I'll refine the log according to your 
suggestion. 

> > diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> > index 5358d43886ad..ee51c0949ed8 100644
> > --- a/arch/x86/kernel/paravirt.c
> > +++ b/arch/x86/kernel/paravirt.c
> > @@ -55,7 +55,7 @@ DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
> >  
> >  void __init native_pv_lock_init(void)
> >  {
> > -   if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) &&
> > +   if (!IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) ||
> > !boot_cpu_has(X86_FEATURE_HYPERVISOR))
> > static_branch_disable(_spin_lock_key);
> >  }
> This gets used at a single site:
> 
> if (pv_enabled())
> goto pv_queue;
> 
> if (virt_spin_lock(lock))
> return;
> 
> which is logically:
> 
>   if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS))
>   goto ...; // don't look at virt_spin_lock_key
> 
>   if (virt_spin_lock_key)
>   return; // On virt, but non-paravirt.  Did Test-and-Set
>   // spinlock.
>

Thanks for the description in detail, my original change might break the
"X86_FEATURE_HYPERVISOR + NO_CONFIG_PARAVIRT_SPINLOCKS " case that, the guest
can not fall into test-and-set.
 
> So I _think_ Arnd was trying to optimize native_pv_lock_init() away when
> it's going to get skipped over anyway by the 'goto'.
> 
> But this took me at least 30 minutes of scratching my head and trying to
> untangle the whole thing.  It's all far too subtle for my taste, and all
> of that to save a few bytes of init text in a configuration that's
> probably not even used very often (PARAVIRT=y, but PARAVIRT_SPINLOCKS=n).
> 
> Let's just keep it simple.  How about the attached patch?

Yes, this one works, I'll refine it.

thanks,
Chenyu 



Re: [PATCH v10 07/36] function_graph: Allow multiple users to attach to function graph

2024-05-24 Thread Steven Rostedt
On Tue,  7 May 2024 23:09:22 +0900
"Masami Hiramatsu (Google)"  wrote:

> @@ -109,6 +244,21 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> long func,
>   if (!current->ret_stack)
>   return -EBUSY;
>  
> + /*
> +  * At first, check whether the previous fgraph callback is pushed by
> +  * the fgraph on the same function entry.
> +  * But if @func is the self tail-call function, we also need to ensure
> +  * the ret_stack is not for the previous call by checking whether the
> +  * bit of @fgraph_idx is set or not.
> +  */
> + ret_stack = get_ret_stack(current, current->curr_ret_stack, );
> + if (ret_stack && ret_stack->func == func &&
> + get_fgraph_type(current, offset + FGRAPH_FRAME_OFFSET) == 
> FGRAPH_TYPE_BITMAP &&
> + !is_fgraph_index_set(current, offset + FGRAPH_FRAME_OFFSET, 
> fgraph_idx))
> + return offset + FGRAPH_FRAME_OFFSET;
> +
> + val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> +
>   BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));

I'm trying to figure out what the above is trying to do. This gets called
once in function_graph_enter() (or function_graph_enter_ops()). What
exactly are you trying to catch here?

Is it from this email:

  
https://lore.kernel.org/all/20231110105154.df937bf9f200a0c16806c...@kernel.org/

As that's the last version before you added the above code.

But you also noticed it may not be needed, but triggered a crash without it
in v3:

  
https://lore.kernel.org/all/20231205234511.3839128259dfec153ea7d...@kernel.org/

I removed this code in my version and it runs just fine. Perhaps there was
another bug that this was hiding that you fixed in later versions?

-- Steve



Re: [PATCH v10 00/36] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph

2024-05-24 Thread Steven Rostedt
On Tue,  7 May 2024 23:08:00 +0900
"Masami Hiramatsu (Google)"  wrote:

> Steven Rostedt (VMware) (15):
>   function_graph: Convert ret_stack to a series of longs
>   fgraph: Use BUILD_BUG_ON() to make sure we have structures divisible by 
> long
>   function_graph: Add an array structure that will allow multiple 
> callbacks
>   function_graph: Allow multiple users to attach to function graph
>   function_graph: Remove logic around ftrace_graph_entry and return
>   ftrace/function_graph: Pass fgraph_ops to function graph callbacks
>   ftrace: Allow function_graph tracer to be enabled in instances
>   ftrace: Allow ftrace startup flags exist without dynamic ftrace
>   function_graph: Have the instances use their own ftrace_ops for 
> filtering
>   function_graph: Add "task variables" per task for fgraph_ops
>   function_graph: Move set_graph_function tests to shadow stack global var
>   function_graph: Move graph depth stored data to shadow stack global var
>   function_graph: Move graph notrace bit to shadow stack global var
>   function_graph: Implement fgraph_reserve_data() and 
> fgraph_retrieve_data()
>   function_graph: Add selftest for passing local variables

Hi Masami,

While reviewing these patches, I realized there's several things I dislike
about the patches I wrote. So I took these patches and started cleaning
them up a little. Mostly renaming functions and adding comments.

As this is a major change to the function graph tracer, and I feel nervous
about building something on top of this, how about I take over these
patches and push them out for the next merge window. I'm hoping to get them
into linux-next by v6.10-rc2 (I spent the day working on them, and it's
mostly minor tweaks).

Then I can push it out to 6.11 and get some good testing against it. Then
we can add your stuff on top and get that merged in 6.12.

If all goes well, I'm hoping to get a series on just these patches (and
your selftest addition) by tonight.

Thoughts?

-- Steve



Re: [PATCH RFC 1/2] dt-bindings: soc: qcom,smsm: Allow specifying mboxes instead of qcom,ipc

2024-05-24 Thread Luca Weiss
On Donnerstag, 23. Mai 2024 08:19:11 MESZ Krzysztof Kozlowski wrote:
> On 23/05/2024 08:16, Luca Weiss wrote:
> > On Donnerstag, 23. Mai 2024 08:02:13 MESZ Krzysztof Kozlowski wrote:
> >> On 22/05/2024 19:34, Luca Weiss wrote:
> >>> On Mittwoch, 22. Mai 2024 08:49:43 MESZ Krzysztof Kozlowski wrote:
>  On 21/05/2024 22:35, Luca Weiss wrote:
> > On Dienstag, 21. Mai 2024 10:58:07 MESZ Krzysztof Kozlowski wrote:
> >> On 20/05/2024 17:11, Luca Weiss wrote:
> >>> Hi Krzysztof
> >>>
> >>> Ack, sounds good.
> >>>
> >>> Maybe also from you, any opinion between these two binding styles?
> >>>
> >>> So first using index of mboxes for the numbering, where for the known
> >>> usages the first element (and sometimes the 3rd - ipc-2) are empty <>.
> >>>
> >>> The second variant is using mbox-names to get the correct channel-mbox
> >>> mapping.
> >>>
> >>> -   qcom,ipc-1 = < 8 13>;
> >>> -   qcom,ipc-2 = < 8 9>;
> >>> -   qcom,ipc-3 = < 8 19>;
> >>> +   mboxes = <0>, < 13>, < 9>, < 19>;
> >>>
> >>> vs.
> >>>
> >>> -   qcom,ipc-1 = < 8 13>;
> >>> -   qcom,ipc-2 = < 8 9>;
> >>> -   qcom,ipc-3 = < 8 19>;
> >>> +   mboxes = < 13>, < 9>, < 19>;
> >>> +   mbox-names = "ipc-1", "ipc-2", "ipc-3";
> >>
> >> Sorry, don't get, ipc-1 is the first mailbox, so why would there be <0>
> >> in first case?
> >
> > Actually not, ipc-0 would be permissible by the driver, used for the 
> > 0th host
> >
> > e.g. from:
> >
> > /* Iterate over all hosts to check whom wants a kick */
> > for (host = 0; host < smsm->num_hosts; host++) {
> > hostp = >hosts[host];
> >
> > Even though no mailbox is specified in any upstream dts for this 0th 
> > host I
> > didn't want the bindings to restrict that, that's why in the first 
> > example
> > there's an empty element (<0>) for the 0th smsm host
> >
> >> Anyway, the question is if you need to know that some
> >> mailbox is missing. But then it is weird to name them "ipc-1" etc.
> >
> > In either case we'd just query the mbox (either by name or index) and 
> > then
> > see if it's there? Not quite sure I understand the sentence..
> > Pretty sure either binding would work the same way.
> 
>  The question is: does the driver care only about having some mailboxes
>  or the driver cares about each specific mailbox? IOW, is skipping ipc-0
>  important for the driver?
> >>>
> >>> There's nothing special from driver side about any mailbox. Some SoCs have
> >>> a mailbox for e.g. hosts 1&2&3, some have only 1&3, and apq8064 even has
> >>> 1&2&3&4.
> >>>
> >>> And if the driver doesn't find a mailbox for a host, it just ignores it
> >>> but then of course it can't 'ring' the mailbox for that host when 
> >>> necessary.
> >>>
> >>> Not sure how much more I can add here, to be fair I barely understand what
> >>> this driver is doing myself apart from the obvious.
> >>
> >> From what you said, it looks like it is enough to just list mailboxes,
> >> e.g. for ipc-1, ipc-2 and ipc-4 (so no ipc-0 and ipc-3):
> > 
> > No, for sure we need also the possibility to list ipc-3.
> 
> ? You can list it, what's the problem>

Maybe we're talking past each other...

You asked why this wouldn't work:

  e.g. for ipc-1, ipc-2 and ipc-4 (so no ipc-0 and ipc-3):
  mboxes = < 13>, < 9>, < 19>;

How would we know that the 3rd mailbox ( 19) is for the 4th host
(previous ipc-4)?

1. If we use mboxes with indexes we'd need to have <0> values for
"smsm hosts" where we don't have a mailbox for - this is at least
for the 2nd smsm host (qcom,ipc-2) for a bunch of SoCs.

2. If we use mboxes with mbox-names then we could skip that since we
can directly specify which "smsm host" a given mailbox is for.

My only question really is whether 1. or 2. is a better idea.

Is this clearer now or still not?


> 
> > 
> > And my point is that I'm not sure if any platform will ever need ipc-0, but
> > the code to use that if it ever exists is there - the driver always
> > tries getting an mbox (currently just syscon of course) for every host
> > from 0 to n.
> > 
> > These are the current (non-mbox-API) mboxes provided to smsm:
> > 
> > $ git grep qcom,ipc- arch/
> > arch/arm/boot/dts/qcom/qcom-apq8064.dtsi:   qcom,ipc-1 = < 
> > 8 4>;
> > arch/arm/boot/dts/qcom/qcom-apq8064.dtsi:   qcom,ipc-2 = < 
> > 8 14>;
> > arch/arm/boot/dts/qcom/qcom-apq8064.dtsi:   qcom,ipc-3 = < 
> > 8 23>;
> > arch/arm/boot/dts/qcom/qcom-apq8064.dtsi:   qcom,ipc-4 = 
> > <_sic_non_secure 0x4094 0>;
> > arch/arm/boot/dts/qcom/qcom-msm8974.dtsi:   qcom,ipc-1 = < 
> > 8 13>;
> > arch/arm/boot/dts/qcom/qcom-msm8974.dtsi:   qcom,ipc-2 = < 
> > 8 9>;
> > 

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-24 Thread Steven Rostedt
On Fri, 24 May 2024 12:50:08 +0200
"Linux regression tracking (Thorsten Leemhuis)"  
wrote:

> > - Affected Versions: Before kernel version 6.8.10, the bug caused a
> > quick display of a kernel trace dump before the shutdown/reboot
> > completed. Starting from version 6.8.10 and continuing into version
> > 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> > preventing the shutdown or reboot from completing and leaving the
> > machine stuck.

Ah, I bet it was this commit: baa23a8d4360d ("tracefs: Reset permissions on
remount if permissions are options"), which added a "iput" callback to the
dentry without calling iput, leaving stale inodes around.

This is fixed with:

  0bcfd9aa4dafa ("tracefs: Clear EVENT_INODE flag in tracefs_drop_inode()")

Try adding just that patch. It will at least make it go back to what was
happening before 6.8.10 (I hope!).

-- Steve



Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-24 Thread Steven Rostedt
On Fri, 24 May 2024 12:50:08 +0200
"Linux regression tracking (Thorsten Leemhuis)"  
wrote:

> [CCing a few people]
> 

Thanks for the Cc.

> On 24.05.24 12:31, Ilkka Naulapää wrote:
> > 
> > I have encountered a critical bug in the Linux vanilla kernel that
> > leads to a kernel panic during the shutdown or reboot process. The
> > issue arises after all services, including `journald`, have been
> > stopped. As a result, the machine fails to complete the shutdown or
> > reboot procedure, effectively causing the system to hang and not shut
> > down or reboot.  

To understand this, did you do anything with tracing? Before shutting down,
is there anything in /sys/kernel/tracing/instances directory?
Were any of the files/directories permissions in /sys/kernel/tracing changed?

> 
> Thx for the report. Not my area of expertise, so take this with a gain
> of salt. But given the versions your mention in your report and the
> screenshot that mentioned tracefs_free_inode I suspect this is caused by
> baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are
> options"). A few fixes for it will soon hit mainline and are meant to be
> backported to affected stable trees:
> 
> https://lore.kernel.org/all/20240523212406.254317...@goodmis.org/
> https://lore.kernel.org/all/20240523174419.1e588...@gandalf.local.home/
> 
> You might want to try them – or recheck once they hit the stable trees
> you are about. If they don't work, please report back.

There's been quite a bit of updates in this code, but this looks new to me.
I have more fixes that were just pulled by Linus today.

  https://git.kernel.org/torvalds/c/0eb03c7e8e2a4cc3653eb5eeb2d2001182071215

I'm not sure how relevant that is for this. But if you can reproduce it
with that commit, then this is a new bug.

-- Steve



Re: [PATCH v10 03/36] x86: tracing: Add ftrace_regs definition in the header

2024-05-24 Thread Steven Rostedt
On Fri, 24 May 2024 10:37:54 +0900
Masami Hiramatsu (Google)  wrote:
> > >  
> > >  #ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> > >  struct ftrace_regs {
> > > + /*
> > > +  * On the x86_64, the ftrace_regs saves;
> > > +  * rax, rcx, rdx, rdi, rsi, r8, r9, rbp, rip and rsp.
> > > +  * Also orig_ax is used for passing direct trampoline address.
> > > +  * x86_32 doesn't support ftrace_regs.  
> > 
> > Should add a comment that if fregs->regs.cs is set, then all of the pt_regs
> > is valid.  
> 
> But what about rbx and r1*? Only regs->cs should be care for pt_regs?
> Or, did you mean "the ftrace_regs is valid"?

Yeah, on x86_64 ftrace_regs uses regs.cs to denote if it is valid or not:

static __always_inline struct pt_regs *
arch_ftrace_get_regs(struct ftrace_regs *fregs)
{
/* Only when FL_SAVE_REGS is set, cs will be non zero */
if (!fregs->regs.cs)
return NULL;
return >regs;
}


-- Steve



Re: How to properly fix reading user pointers in bpf in android kernel 4.9?

2024-05-24 Thread Bagas Sanjaya
[also Cc: bpf maintainers and get_maintainer output]

On Thu, May 23, 2024 at 07:52:22PM +0300, Marcel wrote:
> This seems that it was a long standing problem with the Linux kernel in 
> general. bpf_probe_read should have worked for both kernel and user pointers 
> but it fails with access error when reading an user one instead. 
> 
> I know there's a patch upstream that fixes this by introducing new helpers 
> for reading kernel and userspace pointers and I tried to back port them back 
> to my kernel but with no success. Tools like bcc fail to use them and instead 
> they report that the arguments sent to the helpers are invalid. I assume this 
> is due to the arguments ARG_CONST_STACK_SIZE and ARG_PTR_TO_RAW_STACK handle 
> data different in the 4.9 android version and the upstream version but I'm 
> not sure that this is the cause. I left the patch I did below and with a link 
> to the kernel I'm working on and maybe someone can take a look and give me an 
> hand (the patch isn't applied yet)

What upstream patch? Has it already been in mainline?

> 
> 
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 744b4763b80e..de94c13b7193 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -559,6 +559,43 @@ enum bpf_func_id {
> */
> BPF_FUNC_probe_read_user,
>  
> +   /**
> +   * int bpf_probe_read_kernel(void *dst, int size, void *src)
> +   * Read a kernel pointer safely.
> +   * Return: 0 on success or negative error
> +   */
> +   BPF_FUNC_probe_read_kernel,
> +
> + /**
> +  * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
> +  * Copy a NUL terminated string from user unsafe address. In case 
> the string
> +  * length is smaller than size, the target is not padded with 
> further NUL
> +  * bytes. In case the string length is larger than size, just 
> count-1
> +  * bytes are copied and the last byte is set to NUL.
> +  * @dst: destination address
> +  * @size: maximum number of bytes to copy, including the trailing 
> NUL
> +  * @unsafe_ptr: unsafe address
> +  * Return:
> +  *   > 0 length of the string including the trailing NUL on success
> +  *   < 0 error
> +  */
> + BPF_FUNC_probe_read_user_str,
> +
> + /**
> +  * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
> +  * Copy a NUL terminated string from unsafe address. In case the 
> string
> +  * length is smaller than size, the target is not padded with 
> further NUL
> +  * bytes. In case the string length is larger than size, just 
> count-1
> +  * bytes are copied and the last byte is set to NUL.
> +  * @dst: destination address
> +  * @size: maximum number of bytes to copy, including the trailing 
> NUL
> +  * @unsafe_ptr: unsafe address
> +  * Return:
> +  *   > 0 length of the string including the trailing NUL on success
> +  *   < 0 error
> +  */
> + BPF_FUNC_probe_read_kernel_str,
> +
>   __BPF_FUNC_MAX_ID,
>  };
>  
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index a1e37a5d8c88..3478ca744a45 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -94,7 +94,7 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
>   .arg3_type  = ARG_ANYTHING,
>  };
>  
> -BPF_CALL_3(bpf_probe_read_user, void *, dst, u32, size, const void *, 
> unsafe_ptr)
> +BPF_CALL_3(bpf_probe_read_user, void *, dst, u32, size, const void  __user 
> *, unsafe_ptr)
>  {
>   int ret;
>  
> @@ -115,6 +115,27 @@ static const struct bpf_func_proto 
> bpf_probe_read_user_proto = {
>  };
>  
>  
> +BPF_CALL_3(bpf_probe_read_kernel, void *, dst, u32, size, const void *, 
> unsafe_ptr)
> +{
> + int ret;
> +
> + ret = probe_kernel_read(dst, unsafe_ptr, size);
> + if (unlikely(ret < 0))
> + memset(dst, 0, size);
> +
> + return ret;
> +}
> +
> +static const struct bpf_func_proto bpf_probe_read_kernel_proto = {
> + .func   = bpf_probe_read_kernel,
> + .gpl_only   = true,
> + .ret_type   = RET_INTEGER,
> + .arg1_type  = ARG_PTR_TO_RAW_STACK,
> + .arg2_type  = ARG_CONST_STACK_SIZE,
> + .arg3_type  = ARG_ANYTHING,
> +};
> +
> +
>  BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src,
>  u32, size)
>  {
> @@ -487,6 +508,69 @@ static const struct bpf_func_proto 
> bpf_probe_read_str_proto = {
>   .arg3_type  = ARG_ANYTHING,
>  };
>  
> +
> +
> +BPF_CALL_3(bpf_probe_read_user_str, void *, dst, u32, size,
> +const void __user *, unsafe_ptr)
> +{
> + int ret;
> +
> + /*
> +  * The strncpy_from_unsafe() call will likely not fill the entire
> +  * buffer, but that's okay in this circumstance as we're probing
> +  * 

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-24 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing a few people]

On 24.05.24 12:31, Ilkka Naulapää wrote:
> 
> I have encountered a critical bug in the Linux vanilla kernel that
> leads to a kernel panic during the shutdown or reboot process. The
> issue arises after all services, including `journald`, have been
> stopped. As a result, the machine fails to complete the shutdown or
> reboot procedure, effectively causing the system to hang and not shut
> down or reboot.

Thx for the report. Not my area of expertise, so take this with a gain
of salt. But given the versions your mention in your report and the
screenshot that mentioned tracefs_free_inode I suspect this is caused by
baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are
options"). A few fixes for it will soon hit mainline and are meant to be
backported to affected stable trees:

https://lore.kernel.org/all/20240523212406.254317...@goodmis.org/
https://lore.kernel.org/all/20240523174419.1e588...@gandalf.local.home/

You might want to try them – or recheck once they hit the stable trees
you are about. If they don't work, please report back.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

> Here are the details of the issue:
> 
> - Affected Versions: Before kernel version 6.8.10, the bug caused a
> quick display of a kernel trace dump before the shutdown/reboot
> completed. Starting from version 6.8.10 and continuing into version
> 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> preventing the shutdown or reboot from completing and leaving the
> machine stuck.
> 
> - Symptoms:
>   - In normal shutdown/reboot scenarios, the kernel trace dump briefly
> appears as the last message on the screen.
>   - In rescue mode, the kernel panic message is displayed. Normally it
> is not shown.
> 
> Since `journald` is stopped before this issue occurs, no textual logs
> are available. However, I have captured two pictures illustrating
> these related issues, which I am attaching to this email for your
> reference. Also added my custom kernel config.
> 
> Thank you for your attention to this matter. Please let me know if any
> additional information is required to assist in diagnosing and
> resolving this bug.
> 
> Best regards,
> 
> Ilkka Naulapää



Re: [PATCH] livepatch: introduce klp_func called interface

2024-05-24 Thread zhang warden



> On May 23, 2024, at 22:22, Dan Carpenter  wrote:
> 
> Always run your patches through checkpatch.
> 
> So this patch is so that testers can see if a function has been called?
> Can you not get the same information from gcov or ftrace?
> 
> There are style issues with the patch, but it's not so important until
> the design is agreed on.
> 
> regards,
> dan carpenter

Hi, Dan.

This patch have format issues as Markus said. A newer version of this patch is 
sent which is checked by ./scripts/checkpatch.pl

Thanks for your suggestions.

Regards,
Wardenjohn




Re: [PATCH] livepatch: introduce klp_func called interface

2024-05-24 Thread zhang warden



> On May 21, 2024, at 16:04, Petr Mladek  wrote:
> 
> Another motivation to use ftrace for testing is that it does not
> affect the performance in production.
> 
> We should keep klp_ftrace_handler() as fast as possible so that we
> could livepatch also performance sensitive functions.
> 

How about using unlikely() for branch testing? If we use unlikely, maybe there 
is no negative effect to klp_ftrace_handler() once this function is called.

Regards,
Wardenjohn




Re: [RFC PATCH 00/20] Introduce the famfs shared-memory file system

2024-05-24 Thread Miklos Szeredi
On Fri, 24 May 2024 at 02:47, John Groves  wrote:

> Apologies, but I'm short on time at the moment - going into a long holiday
> weekend in the US with family plans. I should be focused again by middle of
> next week.

NP.

Obviously I'll need to test it before anything is merged, other than
that this is not urgent at all...

> But can you check /proc/cmdline to see of the memmap arg got through without
> getting mangled? The '$' tends to get fubar'd. You might need \$, or I've seen
> the need for \\\$. If it's un-mangled, there should be a dax device.

/proc/cmdline shows the option correctly:

root@kvm:~# cat /proc/cmdline
root=/dev/vda console=hvc0 memmap=4G$4G

> If that doesn't work, it's worth trying '!' instead, which I think would give
> you a pmem device - if the arg gets through (but ! is less likely to get
> horked). That pmem device can be converted to devdax...

That doesn't work either.  No device created in /dev  (dax or pmem).

free(1) does show that the reserved memory is gone in both cases, so
something does happen.

Attaching my .config as well.

Thanks,
Miklos


.config
Description: Binary data


Re: [PATCH v10 03/36] x86: tracing: Add ftrace_regs definition in the header

2024-05-23 Thread Google
On Thu, 23 May 2024 19:14:59 -0400
Steven Rostedt  wrote:

> On Tue,  7 May 2024 23:08:35 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > From: Masami Hiramatsu (Google) 
> > 
> > Add ftrace_regs definition for x86_64 in the ftrace header to
> > clarify what register will be accessible from ftrace_regs.
> > 
> > Signed-off-by: Masami Hiramatsu (Google) 
> > ---
> >  Changes in v3:
> >   - Add rip to be saved.
> >  Changes in v2:
> >   - Newly added.
> > ---
> >  arch/x86/include/asm/ftrace.h |6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> > index cf88cc8cc74d..c88bf47f46da 100644
> > --- a/arch/x86/include/asm/ftrace.h
> > +++ b/arch/x86/include/asm/ftrace.h
> > @@ -36,6 +36,12 @@ static inline unsigned long ftrace_call_adjust(unsigned 
> > long addr)
> >  
> >  #ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> >  struct ftrace_regs {
> > +   /*
> > +* On the x86_64, the ftrace_regs saves;
> > +* rax, rcx, rdx, rdi, rsi, r8, r9, rbp, rip and rsp.
> > +* Also orig_ax is used for passing direct trampoline address.
> > +* x86_32 doesn't support ftrace_regs.
> 
> Should add a comment that if fregs->regs.cs is set, then all of the pt_regs
> is valid.

But what about rbx and r1*? Only regs->cs should be care for pt_regs?
Or, did you mean "the ftrace_regs is valid"?

> And x86_32 does support ftrace_regs, it just doesn't support
> having a subset of it.

Oh, thanks. I'll update the comment about x86_32.

Thank you,

> 
> -- Steve
> 
> 
> > +*/
> > struct pt_regs  regs;
> >  };
> >  
> 
> 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH v10 01/36] tracing: Add a comment about ftrace_regs definition

2024-05-23 Thread Google
On Thu, 23 May 2024 19:10:31 -0400
Steven Rostedt  wrote:

> On Tue,  7 May 2024 23:08:12 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > From: Masami Hiramatsu (Google) 
> > 
> > To clarify what will be expected on ftrace_regs, add a comment to the
> > architecture independent definition of the ftrace_regs.
> > 
> > Signed-off-by: Masami Hiramatsu (Google) 
> > Acked-by: Mark Rutland 
> > ---
> >  Changes in v8:
> >   - Update that the saved registers depends on the context.
> >  Changes in v3:
> >   - Add instruction pointer
> >  Changes in v2:
> >   - newly added.
> > ---
> >  include/linux/ftrace.h |   26 ++
> >  1 file changed, 26 insertions(+)
> > 
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index 54d53f345d14..b81f1afa82a1 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -118,6 +118,32 @@ extern int ftrace_enabled;
> >  
> >  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> >  
> > +/**
> > + * ftrace_regs - ftrace partial/optimal register set
> > + *
> > + * ftrace_regs represents a group of registers which is used at the
> > + * function entry and exit. There are three types of registers.
> > + *
> > + * - Registers for passing the parameters to callee, including the stack
> > + *   pointer. (e.g. rcx, rdx, rdi, rsi, r8, r9 and rsp on x86_64)
> > + * - Registers for passing the return values to caller.
> > + *   (e.g. rax and rdx on x86_64)
> > + * - Registers for hooking the function call and return including the
> > + *   frame pointer (the frame pointer is architecture/config dependent)
> > + *   (e.g. rip, rbp and rsp for x86_64)
> > + *
> > + * Also, architecture dependent fields can be used for internal process.
> > + * (e.g. orig_ax on x86_64)
> > + *
> > + * On the function entry, those registers will be restored except for
> > + * the stack pointer, so that user can change the function parameters
> > + * and instruction pointer (e.g. live patching.)
> > + * On the function exit, only registers which is used for return values
> > + * are restored.
> 
> I wonder if we should also add a note about some architectures in some
> circumstances may store all pt_regs in ftrace_regs. For example, if an
> architecture supports FTRACE_WITH_REGS, it may pass the pt_regs within the
> ftrace_regs. If that is the case, then ftrace_get_regs() called on it will
> return a pointer to a valid pt_regs, or NULL if it is not supported or the
> ftrace_regs does not have a all the registers.

Agreed. That case also should be noted. Thanks for pointing!


> 
> -- Steve
> 
> 
> > + *
> > + * NOTE: user *must not* access regs directly, only do it via APIs, because
> > + * the member can be changed according to the architecture.
> > + */
> >  struct ftrace_regs {
> > struct pt_regs  regs;
> >  };
> 


-- 
Masami Hiramatsu (Google) 



Re: [RFC PATCH 00/20] Introduce the famfs shared-memory file system

2024-05-23 Thread John Groves
On 24/05/23 03:57PM, Miklos Szeredi wrote:
> [trimming CC list]
> 
> On Thu, 23 May 2024 at 04:49, John Groves  wrote:
> 
> > - memmap=! will reserve a pretend pmem device at 
> > 
> > - memmap=$ will reserve a pretend dax device at 
> > 
> 
> Doesn't get me a /dev/dax or /dev/pmem
> 
> Complete qemu command line:
> 
> qemu-kvm -s -serial none -parallel none -kernel
> /home/mszeredi/git/linux/arch/x86/boot/bzImage -drive
> format=raw,file=/home/mszeredi/root_fs,index=0,if=virtio -drive
> format=raw,file=/home/mszeredi/images/ubd1,index=1,if=virtio -chardev
> stdio,id=virtiocon0,signal=off -device virtio-serial -device
> virtconsole,chardev=virtiocon0 -cpu host -m 8G -net user -net
> nic,model=virtio -fsdev local,security_model=none,id=fsdev0,path=/home
> -device virtio-9p-pci,fsdev=fsdev0,mount_tag=hostshare -device
> virtio-rng-pci -smp 4 -append 'root=/dev/vda console=hvc0
> memmap=4G$4G'
> 
> root@kvm:~/famfs# scripts/chk_efi.sh
> This system is neither Ubuntu nor Fedora. It is identified as debian.
> /sys/firmware/efi not found; probably not efi
>  not found; probably nof efi
> /boot/efi/EFI not found; probably not efi
> /boot/efi/EFI/BOOT not found; probably not efi
> /boot/efi/EFI/ not found; probably not efi
> /boot/efi/EFI//grub.cfg not found; probably nof efi
> Probably not efi; errs=6
> 
> Thanks,
> Miklos


Apologies, but I'm short on time at the moment - going into a long holiday
weekend in the US with family plans. I should be focused again by middle of
next week.

But can you check /proc/cmdline to see of the memmap arg got through without
getting mangled? The '$' tends to get fubar'd. You might need \$, or I've seen
the need for \\\$. If it's un-mangled, there should be a dax device.

If that doesn't work, it's worth trying '!' instead, which I think would give
you a pmem device - if the arg gets through (but ! is less likely to get
horked). That pmem device can be converted to devdax...

Regards,
John




Re: [PATCH] uprobes: prevent mutex_lock() under rcu_read_lock()

2024-05-23 Thread Google
On Mon, 20 May 2024 22:30:17 -0700
Andrii Nakryiko  wrote:

> Recent changes made uprobe_cpu_buffer preparation lazy, and moved it
> deeper into __uprobe_trace_func(). This is problematic because
> __uprobe_trace_func() is called inside rcu_read_lock()/rcu_read_unlock()
> block, which then calls prepare_uprobe_buffer() -> uprobe_buffer_get() ->
> mutex_lock(>mutex), leading to a splat about using mutex under
> non-sleepable RCU:
> 
>   BUG: sleeping function called from invalid context at 
> kernel/locking/mutex.c:585
>in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 98231, name: 
> stress-ng-sigq
>preempt_count: 0, expected: 0
>RCU nest depth: 1, expected: 0
>...
>Call Trace:
> 
> dump_stack_lvl+0x3d/0xe0
> __might_resched+0x24c/0x270
> ? prepare_uprobe_buffer+0xd5/0x1d0
> __mutex_lock+0x41/0x820
> ? ___perf_sw_event+0x206/0x290
> ? __perf_event_task_sched_in+0x54/0x660
> ? __perf_event_task_sched_in+0x54/0x660
> prepare_uprobe_buffer+0xd5/0x1d0
> __uprobe_trace_func+0x4a/0x140
> uprobe_dispatcher+0x135/0x280
> ? uprobe_dispatcher+0x94/0x280
> uprobe_notify_resume+0x650/0xec0
> ? atomic_notifier_call_chain+0x21/0x110
> ? atomic_notifier_call_chain+0xf8/0x110
> irqentry_exit_to_user_mode+0xe2/0x1e0
> asm_exc_int3+0x35/0x40
>RIP: 0033:0x7f7e1d4da390
>Code: 33 04 00 0f 1f 80 00 00 00 00 f3 0f 1e fa b9 01 00 00 00 e9 b2 fc ff 
> ff 66 90 f3 0f 1e fa 31 c9 e9 a5 fc ff ff 0f 1f 44 00 00  0f 1e fa b8 27 
> 00 00 00 0f 05 c3 0f 1f 40 00 f3 0f 1e fa b8 6e
>RSP: 002b:7ffd2abc3608 EFLAGS: 0246
>RAX:  RBX: 76d325f1 RCX: 
>RDX: 76d325f1 RSI: 000a RDI: 7ffd2abc3690
>RBP: 000a R08: 00017fb7 R09: 00017fb7
>R10: 00017fb7 R11: 0246 R12: 00017ff2
>R13: 7ffd2abc3610 R14:  R15: 7ffd2abc3780
> 
> 
> Luckily, it's easy to fix by moving prepare_uprobe_buffer() to be called
> slightly earlier: into uprobe_trace_func() and uretprobe_trace_func(), outside
> of RCU locked section. This still keeps this buffer preparation lazy and helps
> avoid the overhead when it's not needed. E.g., if there is only BPF uprobe
> handler installed on a given uprobe, buffer won't be initialized.
> 
> Note, the other user of prepare_uprobe_buffer(), __uprobe_perf_func(), is not
> affected, as it doesn't prepare buffer under RCU read lock.
> 

Oops, good catch! This looks good to me. Let me pick it.
Let me add a simple uprobe test in ftracetest so that this error can
detect in selftests. (I could reproduced it.)

Thank you,

> Fixes: 1b8f85defbc8 ("uprobes: prepare uprobe args buffer lazily")
> Reported-by: Breno Leitao 
> Signed-off-by: Andrii Nakryiko 
> ---
>  kernel/trace/trace_uprobe.c | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index 8541fa1494ae..c98e3b3386ba 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -970,19 +970,17 @@ static struct uprobe_cpu_buffer 
> *prepare_uprobe_buffer(struct trace_uprobe *tu,
>  
>  static void __uprobe_trace_func(struct trace_uprobe *tu,
>   unsigned long func, struct pt_regs *regs,
> - struct uprobe_cpu_buffer **ucbp,
> + struct uprobe_cpu_buffer *ucb,
>   struct trace_event_file *trace_file)
>  {
>   struct uprobe_trace_entry_head *entry;
>   struct trace_event_buffer fbuffer;
> - struct uprobe_cpu_buffer *ucb;
>   void *data;
>   int size, esize;
>   struct trace_event_call *call = trace_probe_event_call(>tp);
>  
>   WARN_ON(call != trace_file->event_call);
>  
> - ucb = prepare_uprobe_buffer(tu, regs, ucbp);
>   if (WARN_ON_ONCE(ucb->dsize > PAGE_SIZE))
>   return;
>  
> @@ -1014,13 +1012,16 @@ static int uprobe_trace_func(struct trace_uprobe *tu, 
> struct pt_regs *regs,
>struct uprobe_cpu_buffer **ucbp)
>  {
>   struct event_file_link *link;
> + struct uprobe_cpu_buffer *ucb;
>  
>   if (is_ret_probe(tu))
>   return 0;
>  
> + ucb = prepare_uprobe_buffer(tu, regs, ucbp);
> +
>   rcu_read_lock();
>   trace_probe_for_each_link_rcu(link, >tp)
> - __uprobe_trace_func(tu, 0, regs, ucbp, link->file);
> + __uprobe_trace_func(tu, 0, regs, ucb, link->file);
>   rcu_read_unlock();
>  
>   return 0;
> @@ -1031,10 +1032,13 @@ static void uretprobe_trace_func(struct trace_uprobe 
> *tu, unsigned long func,
>struct uprobe_cpu_buffer **ucbp)
>  {
>   struct event_file_link *link;
> + struct uprobe_cpu_buffer *ucb;
> +
> + ucb = prepare_uprobe_buffer(tu, regs, ucbp);
>  
>   rcu_read_lock();
>   

Re: [PATCH v10 03/36] x86: tracing: Add ftrace_regs definition in the header

2024-05-23 Thread Steven Rostedt
On Tue,  7 May 2024 23:08:35 +0900
"Masami Hiramatsu (Google)"  wrote:

> From: Masami Hiramatsu (Google) 
> 
> Add ftrace_regs definition for x86_64 in the ftrace header to
> clarify what register will be accessible from ftrace_regs.
> 
> Signed-off-by: Masami Hiramatsu (Google) 
> ---
>  Changes in v3:
>   - Add rip to be saved.
>  Changes in v2:
>   - Newly added.
> ---
>  arch/x86/include/asm/ftrace.h |6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> index cf88cc8cc74d..c88bf47f46da 100644
> --- a/arch/x86/include/asm/ftrace.h
> +++ b/arch/x86/include/asm/ftrace.h
> @@ -36,6 +36,12 @@ static inline unsigned long ftrace_call_adjust(unsigned 
> long addr)
>  
>  #ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
>  struct ftrace_regs {
> + /*
> +  * On the x86_64, the ftrace_regs saves;
> +  * rax, rcx, rdx, rdi, rsi, r8, r9, rbp, rip and rsp.
> +  * Also orig_ax is used for passing direct trampoline address.
> +  * x86_32 doesn't support ftrace_regs.

Should add a comment that if fregs->regs.cs is set, then all of the pt_regs
is valid. And x86_32 does support ftrace_regs, it just doesn't support
having a subset of it.

-- Steve


> +  */
>   struct pt_regs  regs;
>  };
>  




Re: [PATCH v10 01/36] tracing: Add a comment about ftrace_regs definition

2024-05-23 Thread Steven Rostedt
On Tue,  7 May 2024 23:08:12 +0900
"Masami Hiramatsu (Google)"  wrote:

> From: Masami Hiramatsu (Google) 
> 
> To clarify what will be expected on ftrace_regs, add a comment to the
> architecture independent definition of the ftrace_regs.
> 
> Signed-off-by: Masami Hiramatsu (Google) 
> Acked-by: Mark Rutland 
> ---
>  Changes in v8:
>   - Update that the saved registers depends on the context.
>  Changes in v3:
>   - Add instruction pointer
>  Changes in v2:
>   - newly added.
> ---
>  include/linux/ftrace.h |   26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 54d53f345d14..b81f1afa82a1 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -118,6 +118,32 @@ extern int ftrace_enabled;
>  
>  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
>  
> +/**
> + * ftrace_regs - ftrace partial/optimal register set
> + *
> + * ftrace_regs represents a group of registers which is used at the
> + * function entry and exit. There are three types of registers.
> + *
> + * - Registers for passing the parameters to callee, including the stack
> + *   pointer. (e.g. rcx, rdx, rdi, rsi, r8, r9 and rsp on x86_64)
> + * - Registers for passing the return values to caller.
> + *   (e.g. rax and rdx on x86_64)
> + * - Registers for hooking the function call and return including the
> + *   frame pointer (the frame pointer is architecture/config dependent)
> + *   (e.g. rip, rbp and rsp for x86_64)
> + *
> + * Also, architecture dependent fields can be used for internal process.
> + * (e.g. orig_ax on x86_64)
> + *
> + * On the function entry, those registers will be restored except for
> + * the stack pointer, so that user can change the function parameters
> + * and instruction pointer (e.g. live patching.)
> + * On the function exit, only registers which is used for return values
> + * are restored.

I wonder if we should also add a note about some architectures in some
circumstances may store all pt_regs in ftrace_regs. For example, if an
architecture supports FTRACE_WITH_REGS, it may pass the pt_regs within the
ftrace_regs. If that is the case, then ftrace_get_regs() called on it will
return a pointer to a valid pt_regs, or NULL if it is not supported or the
ftrace_regs does not have a all the registers.

-- Steve


> + *
> + * NOTE: user *must not* access regs directly, only do it via APIs, because
> + * the member can be changed according to the architecture.
> + */
>  struct ftrace_regs {
>   struct pt_regs  regs;
>  };




Re: [PATCH v2 1/1] x86/vector: Fix vector leak during CPU offline

2024-05-23 Thread Thomas Gleixner
On Wed, May 22 2024 at 15:02, Dongli Zhang wrote:
> The absence of IRQD_MOVE_PCNTXT prevents immediate effectiveness of
> interrupt affinity reconfiguration via procfs. Instead, the change is
> deferred until the next instance of the interrupt being triggered on the
> original CPU.
>
> When the interrupt next triggers on the original CPU, the new affinity is
> enforced within __irq_move_irq(). A vector is allocated from the new CPU,
> but if the old vector on the original CPU remains online, it is not
> immediately reclaimed. Instead, apicd->move_in_progress is flagged, and the
> reclaiming process is delayed until the next trigger of the interrupt on
> the new CPU.
>
> Upon the subsequent triggering of the interrupt on the new CPU,
> irq_complete_move() adds a task to the old CPU's vector_cleanup list if it
> remains online. Subsequently, the timer on the old CPU iterates over its
> vector_cleanup list, reclaiming old vectors.
>
> However, a rare scenario arises if the old CPU is outgoing before the
> interrupt triggers again on the new CPU. The irq_force_complete_move() may
> not have the chance to be invoked on the outgoing CPU to reclaim the old
> apicd->prev_vector. This is because the interrupt isn't currently affine to
> the outgoing CPU, and irq_needs_fixup() returns false. Even though
> __vector_schedule_cleanup() is later called on the new CPU, it doesn't
> reclaim apicd->prev_vector; instead, it simply resets both
> apicd->move_in_progress and apicd->prev_vector to 0.
>
> As a result, the vector remains unreclaimed in vector_matrix, leading to a
> CPU vector leak.
>
> To address this issue, move the invocation of irq_force_complete_move()
> before the irq_needs_fixup() call to reclaim apicd->prev_vector, if the
> interrupt is currently or used to affine to the outgoing CPU. Additionally,
> reclaim the vector in __vector_schedule_cleanup() as well, following a
> warning message, although theoretically it should never see
> apicd->move_in_progress with apicd->prev_cpu pointing to an offline CPU.

Nice change log!



Re: [GIT PULL v2] virtio: features, fixes, cleanups

2024-05-23 Thread pr-tracker-bot
The pull request you sent on Thu, 23 May 2024 02:00:17 -0400:

> https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/2ef32ad2241340565c35baf77fc95053c84eeeb0

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html



Re: [PATCH] x86/paravirt: Disable virt spinlock when CONFIG_PARAVIRT_SPINLOCKS disabled

2024-05-23 Thread Dave Hansen
On 5/23/24 11:39, Jürgen Groß wrote:
>>
>> Let's just keep it simple.  How about the attached patch?
> 
> Simple indeed. The attachment is empty. 

Let's try this again.diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 5358d43886ad..c193c9e60a1b 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -55,8 +55,7 @@ DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
 
 void __init native_pv_lock_init(void)
 {
-	if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) &&
-	!boot_cpu_has(X86_FEATURE_HYPERVISOR))
+	if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
 		static_branch_disable(_spin_lock_key);
 }
 


  1   2   3   4   5   6   7   8   9   10   >