date:20240229

Re: [PATCH] ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment

2024-02-29 Thread linke

> So basically you are worried about read-tearing?
> 
> That wasn't mentioned in the change log.

Yes. Sorry for making this confused, I am not very familiar with this and
still learning.

> Funny part is, if the above timestamp read did a tear, then this would
> definitely not match, and would return the correct value. That is, the
> buffer is not empty because the only way for this to get corrupted is if
> something is in the process of writing to it.

I agree with you here.

commit = rb_page_commit(commit_page);

But if commit_page above is the result of a torn read, the commit field
read by rb_page_commit() may not represent a valid value. 

In this case, READ_ONCE() is only needed for the commit_page.

Re: [PATCH v11 2/4] dt-bindings: remoteproc: add Tightly Coupled Memory (TCM) bindings

2024-02-29 Thread Tanmay Shah

Thanks for reviews.

Ack to all comments, I will address in next revision.


Tanmay

On 2/29/24 3:59 AM, Krzysztof Kozlowski wrote:
> On 19/02/2024 18:44, Tanmay Shah wrote:
> > From: Radhey Shyam Pandey 
> > 
> > Introduce bindings for TCM memory address space on AMD-xilinx Zynq
> > UltraScale+ platform. It will help in defining TCM in device-tree
> > and make it's access platform agnostic and data-driven.
> > 
> > Tightly-coupled memories(TCMs) are low-latency memory that provides
> > predictable instruction execution and predictable data load/store
> > timing. Each Cortex-R5F processor contains two 64-bit wide 64 KB memory
> > banks on the ATCM and BTCM ports, for a total of 128 KB of memory.
> > 
> > The TCM resources(reg, reg-names and power-domain) are documented for
> > each TCM in the R5 node. The reg and reg-names are made as required
> > properties as we don't want to hardcode TCM addresses for future
> > platforms and for zu+ legacy implementation will ensure that the
> > old dts w/o reg/reg-names works and stable ABI is maintained.
> > 
> > It also extends the examples for TCM split and lockstep modes.
> > 
> > Signed-off-by: Radhey Shyam Pandey 
> > Signed-off-by: Tanmay Shah 
> > ---
> > 
> > Changes in v11:
> >   - Fix yamllint warning and reduce indentation as needed
> > 
> >  .../remoteproc/xlnx,zynqmp-r5fss.yaml | 192 --
> >  1 file changed, 170 insertions(+), 22 deletions(-)
> > 
> > diff --git 
> > a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml 
> > b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
> > index 78aac69f1060..77030edf41fa 100644
> > --- a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
> > +++ b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
> > @@ -20,9 +20,21 @@ properties:
> >compatible:
> >  const: xlnx,zynqmp-r5fss
> >  
> > +  "#address-cells":
> > +const: 2
> > +
> > +  "#size-cells":
> > +const: 2
> > +
> > +  ranges:
> > +description: |
> > +  Standard ranges definition providing address translations for
> > +  local R5F TCM address spaces to bus addresses.
> > +
> >xlnx,cluster-mode:
> >  $ref: /schemas/types.yaml#/definitions/uint32
> >  enum: [0, 1, 2]
> > +default: 1
> >  description: |
> >The RPU MPCore can operate in split mode (Dual-processor 
> > performance), Safety
> >lock-step mode(Both RPU cores execute the same code in lock-step,
> > @@ -37,7 +49,7 @@ properties:
> >2: single cpu mode
> >  
> >  patternProperties:
> > -  "^r5f-[a-f0-9]+$":
> > +  "^r5f@[0-9a-f]+$":
> >  type: object
> >  description: |
> >The RPU is located in the Low Power Domain of the Processor 
> > Subsystem.
> > @@ -54,9 +66,6 @@ patternProperties:
> >compatible:
> >  const: xlnx,zynqmp-r5f
> >  
> > -  power-domains:
> > -maxItems: 1
>
> Why power-domains are being dropped? This should have widest constraints
> if you later customize it.
>
> > -
> >mboxes:
> >  minItems: 1
> >  items:
> > @@ -101,35 +110,174 @@ patternProperties:
> >  
> >  required:
> >- compatible
> > -  - power-domains
>
> Don't drop power domains.
>
>
> >  
> > -unevaluatedProperties: false
> > +allOf:
>
> allOf block goes after required:
>
> > +  - if:
> > +  properties:
> > +xlnx,cluster-mode:
> > +  enum:
> > +- 1
> > +then:
> > +  patternProperties:
> > +"^r5f@[0-9a-f]+$":
> > +  type: object
> > +
> > +  properties:
> > +reg:
>
> reg is missing in your patternProperties earlier.
>
>
>
> Best regards,
> Krzysztof
>

Re: [PATCH 1/4] asm-generic/page.h: apply page shift to PFN instead of VA in pfn_to_virt

2024-02-29 Thread Yan Zhao

On Thu, Feb 29, 2024 at 02:34:35PM +0100, Linus Walleij wrote:
> On Wed, Jan 31, 2024 at 7:27 AM Yan Zhao  wrote:
> 
> > Apply the page shift to PFN to get physical address for final VA.
> > The macro __va should take physical address instead of PFN as input.
> >
> > Fixes: 2d78057f0dd4 ("asm-generic/page.h: Make pfn accessors static 
> > inlines")
> > Signed-off-by: Yan Zhao 
> 
> My bug, obviously. :(
> Reviewed-by: Linus Walleij 
> 
> I thought this was already applied with the other fixes, but maybe it
> was missed?
>
Hi Linus,
The other 3 in csky/hexagon/openrisc should have been applied in
https://lore.kernel.org/all/20240202140550.9886-1-yan.y.z...@intel.com/.

This one in asm-generic is not, because Arnd said he is going to remove
the header as a whole soon. 

I explained it in the change log :)
"The pfn_to_virt() in asm-generic/page.h is not touched in this patch as
it's referenced by page_to_virt() in that header while the whole header is
going to be removed as a whole due to no one including it."

Thanks
Yan

Re: [PATCH] fprobe: Fix to allocate entry_data_size buffer with rethook instances

2024-02-29 Thread Google

On Thu, 29 Feb 2024 22:58:54 +0100
Jiri Olsa  wrote:

> On Thu, Feb 29, 2024 at 08:22:47PM +0900, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) 
> > 
> > Fix to allocate fprobe::entry_data_size buffer with rethook instances.
> > If fprobe doesn't allocate entry_data_size buffer for each rethook instance,
> > fprobe entry handler can cause a buffer overrun when storing entry data in
> > entry handler.
> > 
> > Reported-by: Jiri Olsa 
> 
> Tested-by: Jiri Olsa 

Thanks for testing!
Let me pick this to probe/fixes.

Thank you,

> 
> thanks,
> jirka
> 
> > Fixes: 4bbd93455659 ("kprobes: kretprobe scalability improvement")
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Masami Hiramatsu (Google) 
> > ---
> >  kernel/trace/fprobe.c |   14 ++
> >  1 file changed, 6 insertions(+), 8 deletions(-)
> > 
> > diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> > index 6cd2a4e3afb8..9ff018245840 100644
> > --- a/kernel/trace/fprobe.c
> > +++ b/kernel/trace/fprobe.c
> > @@ -189,9 +189,6 @@ static int fprobe_init_rethook(struct fprobe *fp, int 
> > num)
> >  {
> > int size;
> >  
> > -   if (num <= 0)
> > -   return -EINVAL;
> > -
> > if (!fp->exit_handler) {
> > fp->rethook = NULL;
> > return 0;
> > @@ -199,15 +196,16 @@ static int fprobe_init_rethook(struct fprobe *fp, int 
> > num)
> >  
> > /* Initialize rethook if needed */
> > if (fp->nr_maxactive)
> > -   size = fp->nr_maxactive;
> > +   num = fp->nr_maxactive;
> > else
> > -   size = num * num_possible_cpus() * 2;
> > -   if (size <= 0)
> > +   num *= num_possible_cpus() * 2;
> > +   if (num <= 0)
> > return -EINVAL;
> >  
> > +   size = sizeof(struct fprobe_rethook_node) + fp->entry_data_size;
> > +
> > /* Initialize rethook */
> > -   fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler,
> > -   sizeof(struct fprobe_rethook_node), size);
> > +   fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, size, num);
> > if (IS_ERR(fp->rethook))
> > return PTR_ERR(fp->rethook);
> >  
> > 


-- 
Masami Hiramatsu (Google)

Re: [PATCH] fprobe: Fix to allocate entry_data_size buffer with rethook instances

2024-02-29 Thread Jiri Olsa

On Thu, Feb 29, 2024 at 08:22:47PM +0900, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) 
> 
> Fix to allocate fprobe::entry_data_size buffer with rethook instances.
> If fprobe doesn't allocate entry_data_size buffer for each rethook instance,
> fprobe entry handler can cause a buffer overrun when storing entry data in
> entry handler.
> 
> Reported-by: Jiri Olsa 

Tested-by: Jiri Olsa 

thanks,
jirka

> Fixes: 4bbd93455659 ("kprobes: kretprobe scalability improvement")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) 
> ---
>  kernel/trace/fprobe.c |   14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 6cd2a4e3afb8..9ff018245840 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -189,9 +189,6 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
>  {
>   int size;
>  
> - if (num <= 0)
> - return -EINVAL;
> -
>   if (!fp->exit_handler) {
>   fp->rethook = NULL;
>   return 0;
> @@ -199,15 +196,16 @@ static int fprobe_init_rethook(struct fprobe *fp, int 
> num)
>  
>   /* Initialize rethook if needed */
>   if (fp->nr_maxactive)
> - size = fp->nr_maxactive;
> + num = fp->nr_maxactive;
>   else
> - size = num * num_possible_cpus() * 2;
> - if (size <= 0)
> + num *= num_possible_cpus() * 2;
> + if (num <= 0)
>   return -EINVAL;
>  
> + size = sizeof(struct fprobe_rethook_node) + fp->entry_data_size;
> +
>   /* Initialize rethook */
> - fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler,
> - sizeof(struct fprobe_rethook_node), size);
> + fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, size, num);
>   if (IS_ERR(fp->rethook))
>   return PTR_ERR(fp->rethook);
>  
>

[PATCH AUTOSEL 5.4 4/6] parisc/ftrace: add missing CONFIG_DYNAMIC_FTRACE check

2024-02-29 Thread Sasha Levin

From: Max Kellermann 

[ Upstream commit 250f5402e636a5cec9e0e95df252c3d54307210f ]

Fixes a bug revealed by -Wmissing-prototypes when
CONFIG_FUNCTION_GRAPH_TRACER is enabled but not CONFIG_DYNAMIC_FTRACE:

 arch/parisc/kernel/ftrace.c:82:5: error: no previous prototype for 
'ftrace_enable_ftrace_graph_caller' [-Werror=missing-prototypes]
82 | int ftrace_enable_ftrace_graph_caller(void)
   | ^
 arch/parisc/kernel/ftrace.c:88:5: error: no previous prototype for 
'ftrace_disable_ftrace_graph_caller' [-Werror=missing-prototypes]
88 | int ftrace_disable_ftrace_graph_caller(void)
   | ^~

Signed-off-by: Max Kellermann 
Signed-off-by: Helge Deller 
Signed-off-by: Sasha Levin 
---
 arch/parisc/kernel/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index b836fc61a24f4..f3a5c5e480cf0 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -80,7 +80,7 @@ void notrace __hot ftrace_function_trampoline(unsigned long 
parent,
 #endif
 }
 
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+#if defined(CONFIG_DYNAMIC_FTRACE) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
 int ftrace_enable_ftrace_graph_caller(void)
 {
return 0;
-- 
2.43.0

[PATCH AUTOSEL 5.10 6/8] parisc/ftrace: add missing CONFIG_DYNAMIC_FTRACE check

2024-02-29 Thread Sasha Levin

From: Max Kellermann 

[ Upstream commit 250f5402e636a5cec9e0e95df252c3d54307210f ]

Fixes a bug revealed by -Wmissing-prototypes when
CONFIG_FUNCTION_GRAPH_TRACER is enabled but not CONFIG_DYNAMIC_FTRACE:

 arch/parisc/kernel/ftrace.c:82:5: error: no previous prototype for 
'ftrace_enable_ftrace_graph_caller' [-Werror=missing-prototypes]
82 | int ftrace_enable_ftrace_graph_caller(void)
   | ^
 arch/parisc/kernel/ftrace.c:88:5: error: no previous prototype for 
'ftrace_disable_ftrace_graph_caller' [-Werror=missing-prototypes]
88 | int ftrace_disable_ftrace_graph_caller(void)
   | ^~

Signed-off-by: Max Kellermann 
Signed-off-by: Helge Deller 
Signed-off-by: Sasha Levin 
---
 arch/parisc/kernel/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 63e3ecb9da812..8538425cc43e0 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -81,7 +81,7 @@ void notrace __hot ftrace_function_trampoline(unsigned long 
parent,
 #endif
 }
 
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+#if defined(CONFIG_DYNAMIC_FTRACE) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
 int ftrace_enable_ftrace_graph_caller(void)
 {
return 0;
-- 
2.43.0

[PATCH AUTOSEL 5.15 7/9] parisc/ftrace: add missing CONFIG_DYNAMIC_FTRACE check

2024-02-29 Thread Sasha Levin

From: Max Kellermann 

[ Upstream commit 250f5402e636a5cec9e0e95df252c3d54307210f ]

Fixes a bug revealed by -Wmissing-prototypes when
CONFIG_FUNCTION_GRAPH_TRACER is enabled but not CONFIG_DYNAMIC_FTRACE:

 arch/parisc/kernel/ftrace.c:82:5: error: no previous prototype for 
'ftrace_enable_ftrace_graph_caller' [-Werror=missing-prototypes]
82 | int ftrace_enable_ftrace_graph_caller(void)
   | ^
 arch/parisc/kernel/ftrace.c:88:5: error: no previous prototype for 
'ftrace_disable_ftrace_graph_caller' [-Werror=missing-prototypes]
88 | int ftrace_disable_ftrace_graph_caller(void)
   | ^~

Signed-off-by: Max Kellermann 
Signed-off-by: Helge Deller 
Signed-off-by: Sasha Levin 
---
 arch/parisc/kernel/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 0a1e75af5382d..44d70fc30aae5 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -81,7 +81,7 @@ void notrace __hot ftrace_function_trampoline(unsigned long 
parent,
 #endif
 }
 
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+#if defined(CONFIG_DYNAMIC_FTRACE) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
 int ftrace_enable_ftrace_graph_caller(void)
 {
return 0;
-- 
2.43.0

[PATCH AUTOSEL 6.1 08/12] parisc/ftrace: add missing CONFIG_DYNAMIC_FTRACE check

2024-02-29 Thread Sasha Levin

From: Max Kellermann 

[ Upstream commit 250f5402e636a5cec9e0e95df252c3d54307210f ]

Fixes a bug revealed by -Wmissing-prototypes when
CONFIG_FUNCTION_GRAPH_TRACER is enabled but not CONFIG_DYNAMIC_FTRACE:

 arch/parisc/kernel/ftrace.c:82:5: error: no previous prototype for 
'ftrace_enable_ftrace_graph_caller' [-Werror=missing-prototypes]
82 | int ftrace_enable_ftrace_graph_caller(void)
   | ^
 arch/parisc/kernel/ftrace.c:88:5: error: no previous prototype for 
'ftrace_disable_ftrace_graph_caller' [-Werror=missing-prototypes]
88 | int ftrace_disable_ftrace_graph_caller(void)
   | ^~

Signed-off-by: Max Kellermann 
Signed-off-by: Helge Deller 
Signed-off-by: Sasha Levin 
---
 arch/parisc/kernel/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 4d392e4ed3584..ac7253891d5ed 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -78,7 +78,7 @@ void notrace __hot ftrace_function_trampoline(unsigned long 
parent,
 #endif
 }
 
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+#if defined(CONFIG_DYNAMIC_FTRACE) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
 int ftrace_enable_ftrace_graph_caller(void)
 {
static_key_enable(&ftrace_graph_enable.key);
-- 
2.43.0

[PATCH AUTOSEL 6.6 10/22] parisc/ftrace: add missing CONFIG_DYNAMIC_FTRACE check

2024-02-29 Thread Sasha Levin

From: Max Kellermann 

[ Upstream commit 250f5402e636a5cec9e0e95df252c3d54307210f ]

Fixes a bug revealed by -Wmissing-prototypes when
CONFIG_FUNCTION_GRAPH_TRACER is enabled but not CONFIG_DYNAMIC_FTRACE:

 arch/parisc/kernel/ftrace.c:82:5: error: no previous prototype for 
'ftrace_enable_ftrace_graph_caller' [-Werror=missing-prototypes]
82 | int ftrace_enable_ftrace_graph_caller(void)
   | ^
 arch/parisc/kernel/ftrace.c:88:5: error: no previous prototype for 
'ftrace_disable_ftrace_graph_caller' [-Werror=missing-prototypes]
88 | int ftrace_disable_ftrace_graph_caller(void)
   | ^~

Signed-off-by: Max Kellermann 
Signed-off-by: Helge Deller 
Signed-off-by: Sasha Levin 
---
 arch/parisc/kernel/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index d1defb9ede70c..621a4b386ae4f 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -78,7 +78,7 @@ asmlinkage void notrace __hot 
ftrace_function_trampoline(unsigned long parent,
 #endif
 }
 
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+#if defined(CONFIG_DYNAMIC_FTRACE) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
 int ftrace_enable_ftrace_graph_caller(void)
 {
static_key_enable(&ftrace_graph_enable.key);
-- 
2.43.0

[PATCH AUTOSEL 6.7 11/24] parisc/ftrace: add missing CONFIG_DYNAMIC_FTRACE check

2024-02-29 Thread Sasha Levin

From: Max Kellermann 

[ Upstream commit 250f5402e636a5cec9e0e95df252c3d54307210f ]

Fixes a bug revealed by -Wmissing-prototypes when
CONFIG_FUNCTION_GRAPH_TRACER is enabled but not CONFIG_DYNAMIC_FTRACE:

 arch/parisc/kernel/ftrace.c:82:5: error: no previous prototype for 
'ftrace_enable_ftrace_graph_caller' [-Werror=missing-prototypes]
82 | int ftrace_enable_ftrace_graph_caller(void)
   | ^
 arch/parisc/kernel/ftrace.c:88:5: error: no previous prototype for 
'ftrace_disable_ftrace_graph_caller' [-Werror=missing-prototypes]
88 | int ftrace_disable_ftrace_graph_caller(void)
   | ^~

Signed-off-by: Max Kellermann 
Signed-off-by: Helge Deller 
Signed-off-by: Sasha Levin 
---
 arch/parisc/kernel/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index d1defb9ede70c..621a4b386ae4f 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -78,7 +78,7 @@ asmlinkage void notrace __hot 
ftrace_function_trampoline(unsigned long parent,
 #endif
 }
 
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+#if defined(CONFIG_DYNAMIC_FTRACE) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
 int ftrace_enable_ftrace_graph_caller(void)
 {
static_key_enable(&ftrace_graph_enable.key);
-- 
2.43.0

Re: [PATCH v3 1/7] remoteproc: Add TEE support

2024-02-29 Thread Arnaud POULIQUEN

Hello Mathieu,

On 2/29/24 17:19, Mathieu Poirier wrote:
> Good morning,
> 
> On Wed, Feb 28, 2024 at 09:20:28AM +0100, Arnaud POULIQUEN wrote:
>> Hello Mathieu,
>>
>>
>> On 2/23/24 19:27, Mathieu Poirier wrote:
>>> On Wed, Feb 14, 2024 at 06:21:21PM +0100, Arnaud Pouliquen wrote:
 From: Arnaud Pouliquen 

 Add a remoteproc TEE (Trusted Execution Environment) driver
 that will be probed by the TEE bus. If the associated Trusted
 application is supported on secure part this device offers a client
 interface to load a firmware in the secure part.
 This firmware could be authenticated and decrypted by the secure
 trusted application.

 Signed-off-by: Arnaud Pouliquen 
 ---
 update from V2
 - Use 'tee_rproc' prefix for all functions
 - rename tee_rproc_get_loaded_rsc_table to tee_rproc_find_loaded_rsc_table
 - redefine fonction to better match with the rproc_ops structure format
- replace 'struct tee_rproc' parameter by 'struct rproc' parameter
- rename 'rproc_tee_get_rsc_table()' to tee_rproc_get_loaded_rsc_table()
  and rework it to remove the cached_table management.
- introduce tee_rproc_get_context() to get the tee_rproc struct from the
   rproc struct
- rename tee_rproc_get_loaded_rsc_table() to 
 tee_rproc_find_loaded_rsc_table()
 - remove useless check on tee_rproc_ctx structure in tee_rproc_register()
   and tee_rproc_unregister()
 - fix test on the return of tee_rproc_ctx = devm_kzalloc()
 - remove useless includes and unused tee_rproc_mem structure.
 ---
  drivers/remoteproc/Kconfig  |   9 +
  drivers/remoteproc/Makefile |   1 +
  drivers/remoteproc/tee_remoteproc.c | 397 
  include/linux/tee_remoteproc.h  | 102 +++
  4 files changed, 509 insertions(+)
  create mode 100644 drivers/remoteproc/tee_remoteproc.c
  create mode 100644 include/linux/tee_remoteproc.h

 diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
 index 48845dc8fa85..85299606806c 100644
 --- a/drivers/remoteproc/Kconfig
 +++ b/drivers/remoteproc/Kconfig
 @@ -365,6 +365,15 @@ config XLNX_R5_REMOTEPROC
  
  It's safe to say N if not interested in using RPU r5f cores.
  
 +
 +config TEE_REMOTEPROC
 +  tristate "trusted firmware support by a TEE application"
 +  depends on OPTEE
 +  help
 +Support for trusted remote processors firmware. The firmware
 +authentication and/or decryption are managed by a trusted application.
 +This can be either built-in or a loadable module.
 +
  endif # REMOTEPROC
  
  endmenu
 diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
 index 91314a9b43ce..fa8daebce277 100644
 --- a/drivers/remoteproc/Makefile
 +++ b/drivers/remoteproc/Makefile
 @@ -36,6 +36,7 @@ obj-$(CONFIG_RCAR_REMOTEPROC)+= rcar_rproc.o
  obj-$(CONFIG_ST_REMOTEPROC)   += st_remoteproc.o
  obj-$(CONFIG_ST_SLIM_REMOTEPROC)  += st_slim_rproc.o
  obj-$(CONFIG_STM32_RPROC) += stm32_rproc.o
 +obj-$(CONFIG_TEE_REMOTEPROC)  += tee_remoteproc.o
  obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)+= ti_k3_dsp_remoteproc.o
  obj-$(CONFIG_TI_K3_R5_REMOTEPROC) += ti_k3_r5_remoteproc.o
  obj-$(CONFIG_XLNX_R5_REMOTEPROC)  += xlnx_r5_remoteproc.o
 diff --git a/drivers/remoteproc/tee_remoteproc.c 
 b/drivers/remoteproc/tee_remoteproc.c
 new file mode 100644
 index ..ac727e062d00
 --- /dev/null
 +++ b/drivers/remoteproc/tee_remoteproc.c
 @@ -0,0 +1,397 @@
 +// SPDX-License-Identifier: GPL-2.0-or-later
 +/*
 + * Copyright (C) STMicroelectronics 2023 - All Rights Reserved
 + * Author: Arnaud Pouliquen 
 + */
 +
 +#include 
 +#include 
 +#include 
 +#include 
 +#include 
 +#include 
 +#include 
 +
 +#include "remoteproc_internal.h"
 +
 +#define MAX_TEE_PARAM_ARRY_MEMBER 4
 +
 +/*
 + * Authentication of the firmware and load in the remote processor memory
 + *
 + * [in]  params[0].value.a:   unique 32bit identifier of the remote 
 processor
 + * [in]params[1].memref:  buffer containing the image of the 
 buffer
 + */
 +#define TA_RPROC_FW_CMD_LOAD_FW   1
 +
 +/*
 + * Start the remote processor
 + *
 + * [in]  params[0].value.a:   unique 32bit identifier of the remote 
 processor
 + */
 +#define TA_RPROC_FW_CMD_START_FW  2
 +
 +/*
 + * Stop the remote processor
 + *
 + * [in]  params[0].value.a:   unique 32bit identifier of the remote 
 processor
 + */
 +#define TA_RPROC_FW_CMD_STOP_FW   3
 +
 +/*
 + * Return the address of the resource table, or 0 if not found
 + * No check is

Re: [PATCH v3 7/7] remoteproc: stm32: Add support of an OP-TEE TA to load the firmware

2024-02-29 Thread Arnaud POULIQUEN




On 2/23/24 19:37, Mathieu Poirier wrote:
> On Fri, Feb 23, 2024 at 02:54:13PM +0100, Arnaud POULIQUEN wrote:
>> Hello Mathieu,
>>
>> On 2/22/24 20:02, Mathieu Poirier wrote:
>>> Hi,
>>>
>>> On Wed, Feb 14, 2024 at 06:21:27PM +0100, Arnaud Pouliquen wrote:
 The new TEE remoteproc device is used to manage remote firmware in a
 secure, trusted context. The 'st,stm32mp1-m4-tee' compatibility is
 introduced to delegate the loading of the firmware to the trusted
 execution context. In such cases, the firmware should be signed and
 adhere to the image format defined by the TEE.

 A new "to_attach" field is introduced to differentiate the use cases
 "firmware loaded by the boot stage" and "firmware loaded by the TEE".

 Signed-off-by: Arnaud Pouliquen 
 ---
 V2 to V3 update:
 - remove stm32_rproc_tee_elf_sanity_check(), stm32_rproc_tee_elf_load()
   stm32_rproc_tee_elf_find_loaded_rsc_table() and  stm32_rproc_tee_start() 
 that are bnow unused
 - use new rproc::alt_boot field to sepcify that the alternate fboot method 
 is used
 - use stm32_rproc::to_attach field to differenciate attch mode from 
 remoteproc tee boot mode.
 - remove the used of stm32_rproc::fw_loaded
 ---
  drivers/remoteproc/stm32_rproc.c | 85 +---
  1 file changed, 79 insertions(+), 6 deletions(-)

 diff --git a/drivers/remoteproc/stm32_rproc.c 
 b/drivers/remoteproc/stm32_rproc.c
 index fcc0001e2657..9cfcf66462e0 100644
 --- a/drivers/remoteproc/stm32_rproc.c
 +++ b/drivers/remoteproc/stm32_rproc.c
 @@ -20,6 +20,7 @@
  #include 
  #include 
  #include 
 +#include 
  #include 
  
  #include "remoteproc_internal.h"
 @@ -49,6 +50,9 @@
  #define M4_STATE_STANDBY  4
  #define M4_STATE_CRASH5
  
 +/* Remote processor unique identifier aligned with the Trusted Execution 
 Environment definitions */
 +#define STM32_MP1_M4_PROC_ID0
 +
  struct stm32_syscon {
struct regmap *map;
u32 reg;
 @@ -90,6 +94,8 @@ struct stm32_rproc {
struct stm32_mbox mb[MBOX_NB_MBX];
struct workqueue_struct *workqueue;
bool hold_boot_smc;
 +  bool to_attach;
 +  struct tee_rproc *trproc;
void __iomem *rsc_va;
  };
  
 @@ -253,10 +259,30 @@ static int stm32_rproc_release(struct rproc *rproc)
return err;
}
}
 +  ddata->to_attach = false;
  
return err;
  }
  
 +static int stm32_rproc_tee_attach(struct rproc *rproc)
 +{
 +  /* Nothing to do, remote proc already started by the secured context. */
 +  return 0;
 +}
 +
 +static int stm32_rproc_tee_stop(struct rproc *rproc)
 +{
 +  int err;
 +
 +  stm32_rproc_request_shutdown(rproc);
 +
 +  err = tee_rproc_stop(rproc);
 +  if (err)
 +  return err;
 +
 +  return stm32_rproc_release(rproc);
 +}
 +
  static int stm32_rproc_prepare(struct rproc *rproc)
  {
struct device *dev = rproc->dev.parent;
 @@ -637,10 +663,14 @@ stm32_rproc_get_loaded_rsc_table(struct rproc 
 *rproc, size_t *table_sz)
  {
struct stm32_rproc *ddata = rproc->priv;
struct device *dev = rproc->dev.parent;
 +  struct tee_rproc *trproc = ddata->trproc;
phys_addr_t rsc_pa;
u32 rsc_da;
int err;
  
 +  if (trproc && !ddata->to_attach)
 +  return tee_rproc_get_loaded_rsc_table(rproc, table_sz);
 +
>>>
>>> Why do we need a flag at all?  Why can't 
>>> st_rproc_tee_ops::get_loaded_rsc_table
>>> be set to tee_rproc_get_loaded_rsc_table()?
>>
>>
>> This function is used to retrieve the address of the resource table in 3 
>> cases
>> - attach to a firmware started by the boot loader (U-boot).
>> - load of the firmware by OP-TEE.
>> - crash recovery on a signed firmware started by the boot loader.
>>
>> The flag is used to differentiate the attch from the other uses cases
>> For instance we support this use case.
>> 1) attach to the firmware on boot
>> 2) crash during runtime
>>   2a) stop the firmware by OP-TEE( ddata->to_attach set to 0)
>>   2b) load the firmware by OP-TEE
>>   2c) get the loaded resource table from OP-TEE (we can not guaranty
>>   that the firmware loaded on recovery is the same)
>>   2d) restart the firmware by OP-TEE
> 
> This is not maintainable and needs to be broken down into smaller
> building blocks.  The introduction of tee_rproc_parse_fw() should help dealing
> with some of the complexity.

The use cases I mentioned are supported by the legacy, if firmware is not
authenticated by a Trusted Application.
No problem to addressed this in a second step.
I will remove this constrain from this series in next version.

Regards,
Arnaud

> 
>>
>>>
/* The resource table has already been mapped, nothing to

[RESEND PATCH v2] virtio: uapi: Drop __packed attribute in linux/virtio_pci.h:

2024-02-29 Thread Suzuki K Poulose

Commit 92792ac752aa ("virtio-pci: Introduce admin command sending function")
added "__packed" structures to UAPI header linux/virtio_pci.h. This triggers
build failures in the consumer userspace applications without proper 
"definition"
of __packed (e.g., kvmtool build fails).

Moreover, the structures are already packed well, and doesn't need explicit
packing, similar to the rest of the structures in all virtio_* headers. Remove
the __packed attribute.

Fixes: 92792ac752aa ("virtio-pci: Introduce admin command sending function")
Cc: Feng Liu 
Cc: Michael S. Tsirkin 
Cc: Yishai Hadas 
Cc: Alex Williamson 
Cc: Jean-Philippe Brucker 
Reviewed-by: Jean-Philippe Brucker 
Acked-by: Michael S. Tsirkin 
Signed-off-by: Suzuki K Poulose 
---
Changes since v1:
  - Fix description for the "Fixes" tag format
  - Collect Tags from Jean-Philippe and Michael
 ---
 include/uapi/linux/virtio_pci.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index ef3810dee7ef..a8208492e822 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -240,7 +240,7 @@ struct virtio_pci_cfg_cap {
 #define VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ   0x5
 #define VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO0x6
 
-struct __packed virtio_admin_cmd_hdr {
+struct virtio_admin_cmd_hdr {
__le16 opcode;
/*
 * 1 - SR-IOV
@@ -252,20 +252,20 @@ struct __packed virtio_admin_cmd_hdr {
__le64 group_member_id;
 };
 
-struct __packed virtio_admin_cmd_status {
+struct virtio_admin_cmd_status {
__le16 status;
__le16 status_qualifier;
/* Unused, reserved for future extensions. */
__u8 reserved2[4];
 };
 
-struct __packed virtio_admin_cmd_legacy_wr_data {
+struct virtio_admin_cmd_legacy_wr_data {
__u8 offset; /* Starting offset of the register(s) to write. */
__u8 reserved[7];
__u8 registers[];
 };
 
-struct __packed virtio_admin_cmd_legacy_rd_data {
+struct virtio_admin_cmd_legacy_rd_data {
__u8 offset; /* Starting offset of the register(s) to read. */
 };
 
@@ -275,7 +275,7 @@ struct __packed virtio_admin_cmd_legacy_rd_data {
 
 #define VIRTIO_ADMIN_CMD_MAX_NOTIFY_INFO 4
 
-struct __packed virtio_admin_cmd_notify_info_data {
+struct virtio_admin_cmd_notify_info_data {
__u8 flags; /* 0 = end of list, 1 = owner device, 2 = member device */
__u8 bar; /* BAR of the member or the owner device */
__u8 padding[6];
-- 
2.34.1

Re: [PATCH v3 1/7] remoteproc: Add TEE support

2024-02-29 Thread Mathieu Poirier

Good morning,

On Wed, Feb 28, 2024 at 09:20:28AM +0100, Arnaud POULIQUEN wrote:
> Hello Mathieu,
> 
> 
> On 2/23/24 19:27, Mathieu Poirier wrote:
> > On Wed, Feb 14, 2024 at 06:21:21PM +0100, Arnaud Pouliquen wrote:
> >> From: Arnaud Pouliquen 
> >>
> >> Add a remoteproc TEE (Trusted Execution Environment) driver
> >> that will be probed by the TEE bus. If the associated Trusted
> >> application is supported on secure part this device offers a client
> >> interface to load a firmware in the secure part.
> >> This firmware could be authenticated and decrypted by the secure
> >> trusted application.
> >>
> >> Signed-off-by: Arnaud Pouliquen 
> >> ---
> >> update from V2
> >> - Use 'tee_rproc' prefix for all functions
> >> - rename tee_rproc_get_loaded_rsc_table to tee_rproc_find_loaded_rsc_table
> >> - redefine fonction to better match with the rproc_ops structure format
> >>- replace 'struct tee_rproc' parameter by 'struct rproc' parameter
> >>- rename 'rproc_tee_get_rsc_table()' to tee_rproc_get_loaded_rsc_table()
> >>  and rework it to remove the cached_table management.
> >>- introduce tee_rproc_get_context() to get the tee_rproc struct from the
> >>   rproc struct
> >>- rename tee_rproc_get_loaded_rsc_table() to 
> >> tee_rproc_find_loaded_rsc_table()
> >> - remove useless check on tee_rproc_ctx structure in tee_rproc_register()
> >>   and tee_rproc_unregister()
> >> - fix test on the return of tee_rproc_ctx = devm_kzalloc()
> >> - remove useless includes and unused tee_rproc_mem structure.
> >> ---
> >>  drivers/remoteproc/Kconfig  |   9 +
> >>  drivers/remoteproc/Makefile |   1 +
> >>  drivers/remoteproc/tee_remoteproc.c | 397 
> >>  include/linux/tee_remoteproc.h  | 102 +++
> >>  4 files changed, 509 insertions(+)
> >>  create mode 100644 drivers/remoteproc/tee_remoteproc.c
> >>  create mode 100644 include/linux/tee_remoteproc.h
> >>
> >> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> >> index 48845dc8fa85..85299606806c 100644
> >> --- a/drivers/remoteproc/Kconfig
> >> +++ b/drivers/remoteproc/Kconfig
> >> @@ -365,6 +365,15 @@ config XLNX_R5_REMOTEPROC
> >>  
> >>  It's safe to say N if not interested in using RPU r5f cores.
> >>  
> >> +
> >> +config TEE_REMOTEPROC
> >> +  tristate "trusted firmware support by a TEE application"
> >> +  depends on OPTEE
> >> +  help
> >> +Support for trusted remote processors firmware. The firmware
> >> +authentication and/or decryption are managed by a trusted application.
> >> +This can be either built-in or a loadable module.
> >> +
> >>  endif # REMOTEPROC
> >>  
> >>  endmenu
> >> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> >> index 91314a9b43ce..fa8daebce277 100644
> >> --- a/drivers/remoteproc/Makefile
> >> +++ b/drivers/remoteproc/Makefile
> >> @@ -36,6 +36,7 @@ obj-$(CONFIG_RCAR_REMOTEPROC)+= rcar_rproc.o
> >>  obj-$(CONFIG_ST_REMOTEPROC)   += st_remoteproc.o
> >>  obj-$(CONFIG_ST_SLIM_REMOTEPROC)  += st_slim_rproc.o
> >>  obj-$(CONFIG_STM32_RPROC) += stm32_rproc.o
> >> +obj-$(CONFIG_TEE_REMOTEPROC)  += tee_remoteproc.o
> >>  obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)+= ti_k3_dsp_remoteproc.o
> >>  obj-$(CONFIG_TI_K3_R5_REMOTEPROC) += ti_k3_r5_remoteproc.o
> >>  obj-$(CONFIG_XLNX_R5_REMOTEPROC)  += xlnx_r5_remoteproc.o
> >> diff --git a/drivers/remoteproc/tee_remoteproc.c 
> >> b/drivers/remoteproc/tee_remoteproc.c
> >> new file mode 100644
> >> index ..ac727e062d00
> >> --- /dev/null
> >> +++ b/drivers/remoteproc/tee_remoteproc.c
> >> @@ -0,0 +1,397 @@
> >> +// SPDX-License-Identifier: GPL-2.0-or-later
> >> +/*
> >> + * Copyright (C) STMicroelectronics 2023 - All Rights Reserved
> >> + * Author: Arnaud Pouliquen 
> >> + */
> >> +
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +
> >> +#include "remoteproc_internal.h"
> >> +
> >> +#define MAX_TEE_PARAM_ARRY_MEMBER 4
> >> +
> >> +/*
> >> + * Authentication of the firmware and load in the remote processor memory
> >> + *
> >> + * [in]  params[0].value.a:   unique 32bit identifier of the remote 
> >> processor
> >> + * [in]params[1].memref:  buffer containing the image of the 
> >> buffer
> >> + */
> >> +#define TA_RPROC_FW_CMD_LOAD_FW   1
> >> +
> >> +/*
> >> + * Start the remote processor
> >> + *
> >> + * [in]  params[0].value.a:   unique 32bit identifier of the remote 
> >> processor
> >> + */
> >> +#define TA_RPROC_FW_CMD_START_FW  2
> >> +
> >> +/*
> >> + * Stop the remote processor
> >> + *
> >> + * [in]  params[0].value.a:   unique 32bit identifier of the remote 
> >> processor
> >> + */
> >> +#define TA_RPROC_FW_CMD_STOP_FW   3
> >> +
> >> +/*
> >> + * Return the address of the resource table, or 0 if not found
> >> + * No check is done to verify that the address returned is accessible by
> >> +

Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-02-29 Thread kernel test robot

Hi Yunjian,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:
https://github.com/intel-lab-lkp/linux/commits/Yunjian-Wang/xsk-Remove-non-zero-dma_page-check-in-xp_assign_dev/20240228-190840
base:   net-next/main
patch link:
https://lore.kernel.org/r/1709118356-133960-1-git-send-email-wangyunjian%40huawei.com
patch subject: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support
config: microblaze-randconfig-r131-20240229 
(https://download.01.org/0day-ci/archive/20240229/202402292345.a49gfjlj-...@intel.com/config)
compiler: microblaze-linux-gcc (GCC) 13.2.0
reproduce: 
(https://download.01.org/0day-ci/archive/20240229/202402292345.a49gfjlj-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202402292345.a49gfjlj-...@intel.com/

All errors (new ones prefixed by >>):

   drivers/vhost/net.c: In function 'vhost_net_buf_peek_len':
>> drivers/vhost/net.c:206:41: error: initialization of 'struct xdp_desc *' 
>> from incompatible pointer type 'struct xdp_frame *' 
>> [-Werror=incompatible-pointer-types]
 206 | struct xdp_desc *desc = tun_ptr_to_xdp_desc(ptr);
 | ^~~
   cc1: some warnings being treated as errors


vim +206 drivers/vhost/net.c

   198  
   199  static int vhost_net_buf_peek_len(void *ptr)
   200  {
   201  if (tun_is_xdp_frame(ptr)) {
   202  struct xdp_frame *xdpf = tun_ptr_to_xdp(ptr);
   203  
   204  return xdpf->len;
   205  } else if (tun_is_xdp_desc_frame(ptr)) {
 > 206  struct xdp_desc *desc = tun_ptr_to_xdp_desc(ptr);
   207  
   208  return desc->len;
   209  }
   210  
   211  return __skb_array_len_with_tag(ptr);
   212  }
   213  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Re: [PATCH] ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment

2024-02-29 Thread Steven Rostedt

On Thu, 29 Feb 2024 20:32:26 +0800
linke  wrote:

> Hi Steven, sorry for the late reply.
> 
> > 
> > Now the reason for the above READ_ONCE() is because the variables *are*
> > going to be used again. We do *not* want the compiler to play any games
> > with that.
> >   
> 
> I don't think it is because the variables are going to be used again. 
> Compiler optimizations barely do bad things in single thread programs. It
> is because cpu_buffer->commit_page may change concurrently and should be
> accessed atomically.

So basically you are worried about read-tearing?

That wasn't mentioned in the change log.

> 
>   /* Make sure commit page didn't change */
>   curr_commit_page = READ_ONCE(cpu_buffer->commit_page);
>   curr_commit_ts = READ_ONCE(curr_commit_page->page->time_stamp);
> 
>   /* If the commit page changed, then there's more data */
>   if (curr_commit_page != commit_page ||
>   curr_commit_ts != commit_ts)
>   return 0;
> 
> This code read cpu_buffer->commit_page and time_stamp again to check
> whether commit page changed. It shows that cpu_buffer->commit_page and 
> time_stamp may be changed by other threads.
> 
> commit_page = cpu_buffer->commit_page;
> commit_ts = commit_page->page->time_stamp;
> 
> So the commit_page and time_stamp above is read while other threads may
> change it. I think it is a data race if it is not atomic. Thus it is 
> necessary to use READ_ONCE() here.

Funny part is, if the above timestamp read did a tear, then this would
definitely not match, and would return the correct value. That is, the
buffer is not empty because the only way for this to get corrupted is if
something is in the process of writing to it.

Now we could add a comment stating this.

So, I don't even think the reading of the commit_page is needed (it's long
size so it should not tear, and if it does, I consider that more a bug in
the compiler).

Please explain why READ_ONCE() is needed, and what exactly is it "fixing".
That is, what breaks if it's not there?

-- Steve

Re: [PATCH v2 4/6] virtiofs: support bounce buffer backed by scattered pages

2024-02-29 Thread Brian Foster

On Wed, Feb 28, 2024 at 10:41:24PM +0800, Hou Tao wrote:
> From: Hou Tao 
> 
> When reading a file kept in virtiofs from kernel (e.g., insmod a kernel
> module), if the cache of virtiofs is disabled, the read buffer will be
> passed to virtiofs through out_args[0].value instead of pages. Because
> virtiofs can't get the pages for the read buffer, virtio_fs_argbuf_new()
> will create a bounce buffer for the read buffer by using kmalloc() and
> copy the read buffer into bounce buffer. If the read buffer is large
> (e.g., 1MB), the allocation will incur significant stress on the memory
> subsystem.
> 
> So instead of allocating bounce buffer by using kmalloc(), allocate a
> bounce buffer which is backed by scattered pages. The original idea is
> to use vmap(), but the use of GFP_ATOMIC is no possible for vmap(). To
> simplify the copy operations in the bounce buffer, use a bio_vec flex
> array to represent the argbuf. Also add an is_flat field in struct
> virtio_fs_argbuf to distinguish between kmalloc-ed and scattered bounce
> buffer.
> 
> Signed-off-by: Hou Tao 
> ---
>  fs/fuse/virtio_fs.c | 163 
>  1 file changed, 149 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> index f10fff7f23a0f..ffea684bd100d 100644
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
...
> @@ -408,42 +425,143 @@ static void virtio_fs_request_dispatch_work(struct 
> work_struct *work)
>   }
>  }
>  
...  
>  static void virtio_fs_argbuf_copy_from_in_arg(struct virtio_fs_argbuf 
> *argbuf,
> unsigned int offset,
> const void *src, unsigned int len)
>  {
> - memcpy(argbuf->buf + offset, src, len);
> + struct iov_iter iter;
> + unsigned int copied;
> +
> + if (argbuf->is_flat) {
> + memcpy(argbuf->f.buf + offset, src, len);
> + return;
> + }
> +
> + iov_iter_bvec(&iter, ITER_DEST, argbuf->s.bvec,
> +   argbuf->s.nr, argbuf->s.size);
> + iov_iter_advance(&iter, offset);

Hi Hou,

Just a random comment, but it seems a little inefficient to reinit and
readvance the iter like this on every copy/call. It looks like offset is
already incremented in the callers of the argbuf copy helpers. Perhaps
iov_iter could be lifted into the callers and passed down, or even just
include it in the argbuf structure and init it at alloc time?

Brian

> +
> + copied = _copy_to_iter(src, len, &iter);
> + WARN_ON_ONCE(copied != len);
>  }
>  
>  static unsigned int
> @@ -451,15 +569,32 @@ virtio_fs_argbuf_out_args_offset(struct 
> virtio_fs_argbuf *argbuf,
>const struct fuse_args *args)
>  {
>   unsigned int num_in = args->in_numargs - args->in_pages;
> + unsigned int offset = fuse_len_args(num_in,
> + (struct fuse_arg *)args->in_args);
>  
> - return fuse_len_args(num_in, (struct fuse_arg *)args->in_args);
> + if (argbuf->is_flat)
> + return offset;
> + return round_up(offset, PAGE_SIZE);
>  }
>  
>  static void virtio_fs_argbuf_copy_to_out_arg(struct virtio_fs_argbuf *argbuf,
>unsigned int offset, void *dst,
>unsigned int len)
>  {
> - memcpy(dst, argbuf->buf + offset, len);
> + struct iov_iter iter;
> + unsigned int copied;
> +
> + if (argbuf->is_flat) {
> + memcpy(dst, argbuf->f.buf + offset, len);
> + return;
> + }
> +
> + iov_iter_bvec(&iter, ITER_SOURCE, argbuf->s.bvec,
> +   argbuf->s.nr, argbuf->s.size);
> + iov_iter_advance(&iter, offset);
> +
> + copied = _copy_from_iter(dst, len, &iter);
> + WARN_ON_ONCE(copied != len);
>  }
>  
>  /*
> @@ -1154,7 +1289,7 @@ static unsigned int sg_init_fuse_args(struct 
> scatterlist *sg,
>   len = fuse_len_args(numargs - argpages, args);
>   if (len)
>   total_sgs += virtio_fs_argbuf_setup_sg(req->argbuf, *len_used,
> -len, &sg[total_sgs]);
> +&len, &sg[total_sgs]);
>  
>   if (argpages)
>   total_sgs += sg_init_fuse_pages(&sg[total_sgs],
> @@ -1199,7 +1334,7 @@ static int virtio_fs_enqueue_req(struct virtio_fs_vq 
> *fsvq,
>   }
>  
>   /* Use a bounce buffer since stack args cannot be mapped */
> - req->argbuf = virtio_fs_argbuf_new(args, GFP_ATOMIC);
> + req->argbuf = virtio_fs_argbuf_new(args, GFP_ATOMIC, true);
>   if (!req->argbuf) {
>   ret = -ENOMEM;
>   goto out;
> -- 
> 2.29.2
> 
>

[mhiramat:topic/fprobe-on-fgraph] [function_graph] ab712273e5: WARNING:at_kernel/trace/trace.c:#run_tracer_selftest

2024-02-29 Thread kernel test robot

race.c:2210) 
[ 9.254800][ T1] ? register_trace_event (kernel/trace/trace_output.c:783) 
[ 9.254910][ T1] ? init_graph_tracefs 
(kernel/trace/trace_functions_graph.c:1449) 
[ 9.255706][ T1] init_graph_trace (kernel/trace/trace_functions_graph.c:1463) 
[ 9.256448][ T1] do_one_initcall (init/main.c:1236) 
[ 9.257185][ T1] ? parse_args (kernel/params.c:183) 
[ 9.257891][ T1] ? rdinit_setup (init/main.c:1282) 
[ 9.258602][ T1] ? rdinit_setup (init/main.c:1282) 
[ 9.258918][ T1] do_initcalls (init/main.c:1297 init/main.c:1314) 
[ 9.259617][ T1] ? rdinit_setup (init/main.c:1282) 
[ 9.260332][ T1] kernel_init_freeable (init/main.c:1555) 
[ 9.261149][ T1] ? kernel_init (init/main.c:1443) 
[ 9.261870][ T1] ? rest_init (init/main.c:1433) 
[ 9.262567][ T1] kernel_init (init/main.c:1443) 
[ 9.262911][ T1] ? schedule_tail (kernel/sched/core.c:5338) 
[ 9.263644][ T1] ret_from_fork (arch/x86/kernel/process.c:153) 
[ 9.264345][ T1] ? rest_init (init/main.c:1433) 
[ 9.265046][ T1] ret_from_fork_asm (arch/x86/entry/entry_32.S:741) 
[ 9.265798][ T1] entry_INT80_32 (arch/x86/entry/entry_32.S:947) 
[9.266571][T1] irq event stamp: 5037117
[ 9.266911][ T1] hardirqs last enabled at (5037127): console_unlock 
(kernel/printk/printk.c:341 kernel/printk/printk.c:2706 
kernel/printk/printk.c:3038) 
[ 9.268327][ T1] hardirqs last disabled at (5037134): console_unlock 
(kernel/printk/printk.c:339 kernel/printk/printk.c:2706 
kernel/printk/printk.c:3038) 
[ 9.269742][ T1] softirqs last enabled at (5036722): __do_softirq 
(arch/x86/include/asm/preempt.h:26 kernel/softirq.c:400 kernel/softirq.c:582) 
[ 9.270910][ T1] softirqs last disabled at (5036717): do_softirq_own_stack 
(arch/x86/kernel/irq_32.c:57 arch/x86/kernel/irq_32.c:147) 
[9.272383][T1] ---[ end trace  ]---
[9.273787][T1] pinctrl core: initialized pinctrl subsystem
[9.275831][T1] workqueue: round-robin CPU selection forced, expect 
performance impact
[9.277324][T1]


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240229/202402292204.c1f77860-oliver.s...@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Re: [PATCH 1/4] asm-generic/page.h: apply page shift to PFN instead of VA in pfn_to_virt

2024-02-29 Thread Linus Walleij

On Wed, Jan 31, 2024 at 7:27 AM Yan Zhao  wrote:

> Apply the page shift to PFN to get physical address for final VA.
> The macro __va should take physical address instead of PFN as input.
>
> Fixes: 2d78057f0dd4 ("asm-generic/page.h: Make pfn accessors static inlines")
> Signed-off-by: Yan Zhao 

My bug, obviously. :(
Reviewed-by: Linus Walleij 

I thought this was already applied with the other fixes, but maybe it
was missed?

Yours,
Linus Walleij

RE: [PATCH net-next v2 2/3] vhost_net: Call peek_len when using xdp

2024-02-29 Thread wangyunjian



> -Original Message-
> From: Paolo Abeni [mailto:pab...@redhat.com]
> Sent: Thursday, February 29, 2024 6:49 PM
> To: wangyunjian ; m...@redhat.com;
> willemdebruijn.ker...@gmail.com; jasow...@redhat.com; k...@kernel.org;
> bj...@kernel.org; magnus.karls...@intel.com; maciej.fijalkow...@intel.com;
> jonathan.le...@gmail.com; da...@davemloft.net
> Cc: b...@vger.kernel.org; net...@vger.kernel.org;
> linux-kernel@vger.kernel.org; k...@vger.kernel.org;
> virtualizat...@lists.linux.dev; xudingke ; liwei (DT)
> 
> Subject: Re: [PATCH net-next v2 2/3] vhost_net: Call peek_len when using xdp
> 
> On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> > If TUN supports AF_XDP TX zero-copy, the XDP program will enqueue
> > packets to the XDP ring and wake up the vhost worker. This requires
> > the vhost worker to call peek_len(), which can be used to consume XDP
> > descriptors.
> >
> > Signed-off-by: Yunjian Wang 
> > ---
> >  drivers/vhost/net.c | 17 -
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
> > f2ed7167c848..077e74421558 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -207,6 +207,11 @@ static int vhost_net_buf_peek_len(void *ptr)
> > return __skb_array_len_with_tag(ptr);  }
> >
> > +static bool vhost_sock_xdp(struct socket *sock) {
> > +   return sock_flag(sock->sk, SOCK_XDP); }
> > +
> >  static int vhost_net_buf_peek(struct vhost_net_virtqueue *nvq)  {
> > struct vhost_net_buf *rxq = &nvq->rxq; @@ -214,6 +219,13 @@ static
> > int vhost_net_buf_peek(struct vhost_net_virtqueue *nvq)
> > if (!vhost_net_buf_is_empty(rxq))
> > goto out;
> >
> > +   if (ptr_ring_empty(nvq->rx_ring)) {
> > +   struct socket *sock = vhost_vq_get_backend(&nvq->vq);
> > +   /* Call peek_len to consume XSK descriptors, when using xdp */
> > +   if (vhost_sock_xdp(sock) && sock->ops->peek_len)
> > +   sock->ops->peek_len(sock);
> 
> This really looks like a socket API misuse. Why can't you use ptr-ring 
> primitives
> to consume XSK descriptors? peek_len could be constified some day, this code
> will prevent such (good) thing.

Thank you for your suggestion. I will consider that with Patch 3/3.

> 
> Cheers,
> 
> Paolo

RE: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-02-29 Thread wangyunjian

> -Original Message-
> From: Paolo Abeni [mailto:pab...@redhat.com]
> Sent: Thursday, February 29, 2024 7:13 PM
> To: wangyunjian ; m...@redhat.com;
> willemdebruijn.ker...@gmail.com; jasow...@redhat.com; k...@kernel.org;
> bj...@kernel.org; magnus.karls...@intel.com; maciej.fijalkow...@intel.com;
> jonathan.le...@gmail.com; da...@davemloft.net
> Cc: b...@vger.kernel.org; net...@vger.kernel.org;
> linux-kernel@vger.kernel.org; k...@vger.kernel.org;
> virtualizat...@lists.linux.dev; xudingke ; liwei (DT)
> 
> Subject: Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support
> 
> On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> > @@ -2661,6 +2776,54 @@ static int tun_ptr_peek_len(void *ptr)
> > }
> >  }
> >
> > +static void tun_peek_xsk(struct tun_file *tfile) {
> > +   struct xsk_buff_pool *pool;
> > +   u32 i, batch, budget;
> > +   void *frame;
> > +
> > +   if (!ptr_ring_empty(&tfile->tx_ring))
> > +   return;
> > +
> > +   spin_lock(&tfile->pool_lock);
> > +   pool = tfile->xsk_pool;
> > +   if (!pool) {
> > +   spin_unlock(&tfile->pool_lock);
> > +   return;
> > +   }
> > +
> > +   if (tfile->nb_descs) {
> > +   xsk_tx_completed(pool, tfile->nb_descs);
> > +   if (xsk_uses_need_wakeup(pool))
> > +   xsk_set_tx_need_wakeup(pool);
> > +   }
> > +
> > +   spin_lock(&tfile->tx_ring.producer_lock);
> > +   budget = min_t(u32, tfile->tx_ring.size, TUN_XDP_BATCH);
> > +
> > +   batch = xsk_tx_peek_release_desc_batch(pool, budget);
> > +   if (!batch) {
> 
> This branch looks like an unneeded "optimization". The generic loop below
> should have the same effect with no measurable perf delta - and smaller code.
> Just remove this.

OK, I will update it, thanks.

> 
> > +   tfile->nb_descs = 0;
> > +   spin_unlock(&tfile->tx_ring.producer_lock);
> > +   spin_unlock(&tfile->pool_lock);
> > +   return;
> > +   }
> > +
> > +   tfile->nb_descs = batch;
> > +   for (i = 0; i < batch; i++) {
> > +   /* Encode the XDP DESC flag into lowest bit for consumer to 
> > differ
> > +* XDP desc from XDP buffer and sk_buff.
> > +*/
> > +   frame = tun_xdp_desc_to_ptr(&pool->tx_descs[i]);
> > +   /* The budget must be less than or equal to tx_ring.size,
> > +* so enqueuing will not fail.
> > +*/
> > +   __ptr_ring_produce(&tfile->tx_ring, frame);
> > +   }
> > +   spin_unlock(&tfile->tx_ring.producer_lock);
> > +   spin_unlock(&tfile->pool_lock);
> 
> More related to the general design: it looks wrong. What if
> get_rx_bufs() will fail (ENOBUF) after successful peeking? With no more
> incoming packets, later peek will return 0 and it looks like that the
> half-processed packets will stay in the ring forever???

The vhost_net_rx_peek_head_len function obtains the packet length
but does not consume it. The packet is still in the ring. The later peek
will reuse it.

> 
> I think the 'ring produce' part should be moved into tun_do_read().

Thank you for your suggestion. I will consider that.

> 
> Cheers,
> 
> Paolo

RE: [PATCH net-next v2 1/3] xsk: Remove non-zero 'dma_page' check in xp_assign_dev

2024-02-29 Thread wangyunjian

> -Original Message-
> From: Paolo Abeni [mailto:pab...@redhat.com]
> Sent: Thursday, February 29, 2024 6:43 PM
> To: wangyunjian ; m...@redhat.com;
> willemdebruijn.ker...@gmail.com; jasow...@redhat.com; k...@kernel.org;
> bj...@kernel.org; magnus.karls...@intel.com; maciej.fijalkow...@intel.com;
> jonathan.le...@gmail.com; da...@davemloft.net
> Cc: b...@vger.kernel.org; net...@vger.kernel.org;
> linux-kernel@vger.kernel.org; k...@vger.kernel.org;
> virtualizat...@lists.linux.dev; xudingke ; liwei (DT)
> 
> Subject: Re: [PATCH net-next v2 1/3] xsk: Remove non-zero 'dma_page' check in
> xp_assign_dev
> 
> On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> > Now dma mappings are used by the physical NICs. However the vNIC maybe
> > do not need them. So remove non-zero 'dma_page' check in
> > xp_assign_dev.
> >
> > Signed-off-by: Yunjian Wang 
> > ---
> >  net/xdp/xsk_buff_pool.c | 7 ---
> >  1 file changed, 7 deletions(-)
> >
> > diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index
> > ce60ecd48a4d..a5af75b1f43c 100644
> > --- a/net/xdp/xsk_buff_pool.c
> > +++ b/net/xdp/xsk_buff_pool.c
> > @@ -219,16 +219,9 @@ int xp_assign_dev(struct xsk_buff_pool *pool,
> > if (err)
> > goto err_unreg_pool;
> >
> > -   if (!pool->dma_pages) {
> > -   WARN(1, "Driver did not DMA map zero-copy buffers");
> > -   err = -EINVAL;
> > -   goto err_unreg_xsk;
> > -   }
> 
> This would unconditionally remove an otherwise valid check for most NIC. What
> about let the driver declare it wont need DMA map with a
> (pool?) flag.

This check is redundant. The NIC's driver determines whether a DMA map is 
required.
If the NIC'driver requires the DMA map, it uses the xsk_pool_dma_map function, 
which
initializes the DMA map and performs a check.

Thanks

> 
> Cheers,
> 
> Paolo

Re: [PATCH] ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment

2024-02-29 Thread linke

Hi Steven, sorry for the late reply.

> 
> Now the reason for the above READ_ONCE() is because the variables *are*
> going to be used again. We do *not* want the compiler to play any games
> with that.
> 

I don't think it is because the variables are going to be used again. 
Compiler optimizations barely do bad things in single thread programs. It
is because cpu_buffer->commit_page may change concurrently and should be
accessed atomically.

/* Make sure commit page didn't change */
curr_commit_page = READ_ONCE(cpu_buffer->commit_page);
curr_commit_ts = READ_ONCE(curr_commit_page->page->time_stamp);

/* If the commit page changed, then there's more data */
if (curr_commit_page != commit_page ||
curr_commit_ts != commit_ts)
return 0;

This code read cpu_buffer->commit_page and time_stamp again to check
whether commit page changed. It shows that cpu_buffer->commit_page and 
time_stamp may be changed by other threads.

commit_page = cpu_buffer->commit_page;
commit_ts = commit_page->page->time_stamp;

So the commit_page and time_stamp above is read while other threads may
change it. I think it is a data race if it is not atomic. Thus it is 
necessary to use READ_ONCE() here.

Re: [PATCH] ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment

2024-02-29 Thread linke

Hi Steven, sorry for the late reply.

> 
> Now the reason for the above READ_ONCE() is because the variables *are*
> going to be used again. We do *not* want the compiler to play any games
> with that.
> 

I don't think it is because the variables are going to be used again. 
Compiler optimizations barely do bad things in single thread programs. It
is because cpu_buffer->commit_page may change concurrently and should be
accessed atomically.

/* Make sure commit page didn't change */
curr_commit_page = READ_ONCE(cpu_buffer->commit_page);
curr_commit_ts = READ_ONCE(curr_commit_page->page->time_stamp);

/* If the commit page changed, then there's more data */
if (curr_commit_page != commit_page ||
curr_commit_ts != commit_ts)
return 0;

This code read cpu_buffer->commit_page and time_stamp again to check
whether commit page changed. It shows that cpu_buffer->commit_page and 
time_stamp may be changed by other threads.

commit_page = cpu_buffer->commit_page;
commit_ts = commit_page->page->time_stamp;

So the commit_page and time_stamp above is read while other threads may
change it. I think it is a data race if it is not atomic. Thus it is 
necessary to use READ_ONCE() here.

[PATCH v3 2/2] riscv: Fix text patching when IPI are used

2024-02-29 Thread Alexandre Ghiti

For now, we use stop_machine() to patch the text and when we use IPIs for
remote icache flushes (which is emitted in patch_text_nosync()), the system
hangs.

So instead, make sure every CPU executes the stop_machine() patching
function and emit a local icache flush there.

Co-developed-by: Björn Töpel 
Signed-off-by: Björn Töpel 
Signed-off-by: Alexandre Ghiti 
Reviewed-by: Andrea Parri 
---
 arch/riscv/include/asm/patch.h |  1 +
 arch/riscv/kernel/ftrace.c | 44 ++
 arch/riscv/kernel/patch.c  | 16 +
 3 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/arch/riscv/include/asm/patch.h b/arch/riscv/include/asm/patch.h
index e88b52d39eac..9f5d6e14c405 100644
--- a/arch/riscv/include/asm/patch.h
+++ b/arch/riscv/include/asm/patch.h
@@ -6,6 +6,7 @@
 #ifndef _ASM_RISCV_PATCH_H
 #define _ASM_RISCV_PATCH_H
 
+int patch_insn_write(void *addr, const void *insn, size_t len);
 int patch_text_nosync(void *addr, const void *insns, size_t len);
 int patch_text_set_nosync(void *addr, u8 c, size_t len);
 int patch_text(void *addr, u32 *insns, int ninsns);
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index f5aa24d9e1c1..4f4987a6d83d 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -75,8 +76,7 @@ static int __ftrace_modify_call(unsigned long hook_pos, 
unsigned long target,
make_call_t0(hook_pos, target, call);
 
/* Replace the auipc-jalr pair at once. Return -EPERM on write error. */
-   if (patch_text_nosync
-   ((void *)hook_pos, enable ? call : nops, MCOUNT_INSN_SIZE))
+   if (patch_insn_write((void *)hook_pos, enable ? call : nops, 
MCOUNT_INSN_SIZE))
return -EPERM;
 
return 0;
@@ -88,7 +88,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long 
addr)
 
make_call_t0(rec->ip, addr, call);
 
-   if (patch_text_nosync((void *)rec->ip, call, MCOUNT_INSN_SIZE))
+   if (patch_insn_write((void *)rec->ip, call, MCOUNT_INSN_SIZE))
return -EPERM;
 
return 0;
@@ -99,7 +99,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace 
*rec,
 {
unsigned int nops[2] = {NOP4, NOP4};
 
-   if (patch_text_nosync((void *)rec->ip, nops, MCOUNT_INSN_SIZE))
+   if (patch_insn_write((void *)rec->ip, nops, MCOUNT_INSN_SIZE))
return -EPERM;
 
return 0;
@@ -134,6 +134,42 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 
return ret;
 }
+
+struct ftrace_modify_param {
+   int command;
+   atomic_t cpu_count;
+};
+
+static int __ftrace_modify_code(void *data)
+{
+   struct ftrace_modify_param *param = data;
+
+   if (atomic_inc_return(¶m->cpu_count) == num_online_cpus()) {
+   ftrace_modify_all_code(param->command);
+   /*
+* Make sure the patching store is effective *before* we
+* increment the counter which releases all waiting CPUs
+* by using the release variant of atomic increment. The
+* release pairs with the call to local_flush_icache_all()
+* on the waiting CPU.
+*/
+   atomic_inc_return_release(¶m->cpu_count);
+   } else {
+   while (atomic_read(¶m->cpu_count) <= num_online_cpus())
+   cpu_relax();
+   }
+
+   local_flush_icache_all();
+
+   return 0;
+}
+
+void arch_ftrace_update_code(int command)
+{
+   struct ftrace_modify_param param = { command, ATOMIC_INIT(0) };
+
+   stop_machine(__ftrace_modify_code, ¶m, cpu_online_mask);
+}
 #endif
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
index 0b5c16dfe3f4..9a1bce1adf5a 100644
--- a/arch/riscv/kernel/patch.c
+++ b/arch/riscv/kernel/patch.c
@@ -188,7 +188,7 @@ int patch_text_set_nosync(void *addr, u8 c, size_t len)
 }
 NOKPROBE_SYMBOL(patch_text_set_nosync);
 
-static int patch_insn_write(void *addr, const void *insn, size_t len)
+int patch_insn_write(void *addr, const void *insn, size_t len)
 {
size_t patched = 0;
size_t size;
@@ -232,15 +232,23 @@ static int patch_text_cb(void *data)
if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
for (i = 0; ret == 0 && i < patch->ninsns; i++) {
len = GET_INSN_LENGTH(patch->insns[i]);
-   ret = patch_text_nosync(patch->addr + i * len,
-   &patch->insns[i], len);
+   ret = patch_insn_write(patch->addr + i * len, 
&patch->insns[i], len);
}
-   atomic_inc(&patch->cpu_count);
+   /*
+* Make sure the patching store is effective *before* we
+* increment the counter which releases all waiting CPUs
+

[PATCH v3 1/2] riscv: Remove superfluous smp_mb()

2024-02-29 Thread Alexandre Ghiti

This memory barrier is not needed and not documented so simply remove
it.

Suggested-by: Andrea Parri 
Signed-off-by: Alexandre Ghiti 
Reviewed-by: Andrea Parri 
---
 arch/riscv/kernel/patch.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
index 37e87fdcf6a0..0b5c16dfe3f4 100644
--- a/arch/riscv/kernel/patch.c
+++ b/arch/riscv/kernel/patch.c
@@ -239,7 +239,6 @@ static int patch_text_cb(void *data)
} else {
while (atomic_read(&patch->cpu_count) <= num_online_cpus())
cpu_relax();
-   smp_mb();
}
 
return ret;
-- 
2.39.2

[PATCH v3 0/2] riscv: fix patching with IPI

2024-02-29 Thread Alexandre Ghiti

patch 1 removes a useless memory barrier and patch 2 actually fixes the
issue with IPI in the patching code.

Changes in v3:
- Remove wrong cleanup as noted by Samuel
- Enhance comment about usage of release semantics as suggested by
  Andrea
- Add RBs from Andrea

Changes in v2:
- Add patch 1 and then remove the memory barrier from patch 2 as
  suggested by Andrea
- Convert atomic_inc into an atomic_inc with release semantics as
  suggested by Andrea

Alexandre Ghiti (2):
  riscv: Remove superfluous smp_mb()
  riscv: Fix text patching when IPI are used

 arch/riscv/include/asm/patch.h |  1 +
 arch/riscv/kernel/ftrace.c | 44 ++
 arch/riscv/kernel/patch.c  | 17 +
 3 files changed, 53 insertions(+), 9 deletions(-)

-- 
2.39.2

Re: [PATCH] fprobe: Fix to allocate entry_data_size buffer with rethook instances

2024-02-29 Thread Google

On Thu, 29 Feb 2024 20:22:47 +0900
"Masami Hiramatsu (Google)"  wrote:

> From: Masami Hiramatsu (Google) 
> 
> Fix to allocate fprobe::entry_data_size buffer with rethook instances.
> If fprobe doesn't allocate entry_data_size buffer for each rethook instance,
> fprobe entry handler can cause a buffer overrun when storing entry data in
> entry handler.
> 

Oops, missed a URL.

> Reported-by: Jiri Olsa 

Closes: https://lore.kernel.org/all/Zd9eBn2FTQzYyg7L@krava/

Thanks,

> Fixes: 4bbd93455659 ("kprobes: kretprobe scalability improvement")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) 
> ---
>  kernel/trace/fprobe.c |   14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 6cd2a4e3afb8..9ff018245840 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -189,9 +189,6 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
>  {
>   int size;
>  
> - if (num <= 0)
> - return -EINVAL;
> -
>   if (!fp->exit_handler) {
>   fp->rethook = NULL;
>   return 0;
> @@ -199,15 +196,16 @@ static int fprobe_init_rethook(struct fprobe *fp, int 
> num)
>  
>   /* Initialize rethook if needed */
>   if (fp->nr_maxactive)
> - size = fp->nr_maxactive;
> + num = fp->nr_maxactive;
>   else
> - size = num * num_possible_cpus() * 2;
> - if (size <= 0)
> + num *= num_possible_cpus() * 2;
> + if (num <= 0)
>   return -EINVAL;
>  
> + size = sizeof(struct fprobe_rethook_node) + fp->entry_data_size;
> +
>   /* Initialize rethook */
> - fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler,
> - sizeof(struct fprobe_rethook_node), size);
> + fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, size, num);
>   if (IS_ERR(fp->rethook))
>   return PTR_ERR(fp->rethook);
>  
> 
> 


-- 
Masami Hiramatsu (Google)

[PATCH] fprobe: Fix to allocate entry_data_size buffer with rethook instances

2024-02-29 Thread Masami Hiramatsu (Google)

From: Masami Hiramatsu (Google) 

Fix to allocate fprobe::entry_data_size buffer with rethook instances.
If fprobe doesn't allocate entry_data_size buffer for each rethook instance,
fprobe entry handler can cause a buffer overrun when storing entry data in
entry handler.

Reported-by: Jiri Olsa 
Fixes: 4bbd93455659 ("kprobes: kretprobe scalability improvement")
Cc: sta...@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) 
---
 kernel/trace/fprobe.c |   14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 6cd2a4e3afb8..9ff018245840 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -189,9 +189,6 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
 {
int size;
 
-   if (num <= 0)
-   return -EINVAL;
-
if (!fp->exit_handler) {
fp->rethook = NULL;
return 0;
@@ -199,15 +196,16 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
 
/* Initialize rethook if needed */
if (fp->nr_maxactive)
-   size = fp->nr_maxactive;
+   num = fp->nr_maxactive;
else
-   size = num * num_possible_cpus() * 2;
-   if (size <= 0)
+   num *= num_possible_cpus() * 2;
+   if (num <= 0)
return -EINVAL;
 
+   size = sizeof(struct fprobe_rethook_node) + fp->entry_data_size;
+
/* Initialize rethook */
-   fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler,
-   sizeof(struct fprobe_rethook_node), size);
+   fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, size, num);
if (IS_ERR(fp->rethook))
return PTR_ERR(fp->rethook);

Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-02-29 Thread Paolo Abeni

On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> @@ -2661,6 +2776,54 @@ static int tun_ptr_peek_len(void *ptr)
>   }
>  }
>  
> +static void tun_peek_xsk(struct tun_file *tfile)
> +{
> + struct xsk_buff_pool *pool;
> + u32 i, batch, budget;
> + void *frame;
> +
> + if (!ptr_ring_empty(&tfile->tx_ring))
> + return;
> +
> + spin_lock(&tfile->pool_lock);
> + pool = tfile->xsk_pool;
> + if (!pool) {
> + spin_unlock(&tfile->pool_lock);
> + return;
> + }
> +
> + if (tfile->nb_descs) {
> + xsk_tx_completed(pool, tfile->nb_descs);
> + if (xsk_uses_need_wakeup(pool))
> + xsk_set_tx_need_wakeup(pool);
> + }
> +
> + spin_lock(&tfile->tx_ring.producer_lock);
> + budget = min_t(u32, tfile->tx_ring.size, TUN_XDP_BATCH);
> +
> + batch = xsk_tx_peek_release_desc_batch(pool, budget);
> + if (!batch) {

This branch looks like an unneeded "optimization". The generic loop
below should have the same effect with no measurable perf delta - and
smaller code. Just remove this.

> + tfile->nb_descs = 0;
> + spin_unlock(&tfile->tx_ring.producer_lock);
> + spin_unlock(&tfile->pool_lock);
> + return;
> + }
> +
> + tfile->nb_descs = batch;
> + for (i = 0; i < batch; i++) {
> + /* Encode the XDP DESC flag into lowest bit for consumer to 
> differ
> +  * XDP desc from XDP buffer and sk_buff.
> +  */
> + frame = tun_xdp_desc_to_ptr(&pool->tx_descs[i]);
> + /* The budget must be less than or equal to tx_ring.size,
> +  * so enqueuing will not fail.
> +  */
> + __ptr_ring_produce(&tfile->tx_ring, frame);
> + }
> + spin_unlock(&tfile->tx_ring.producer_lock);
> + spin_unlock(&tfile->pool_lock);

More related to the general design: it looks wrong. What if
get_rx_bufs() will fail (ENOBUF) after successful peeking? With no more
incoming packets, later peek will return 0 and it looks like that the
half-processed packets will stay in the ring forever???

I think the 'ring produce' part should be moved into tun_do_read().

Cheers,

Paolo

Re: [PATCH net-next v2 2/3] vhost_net: Call peek_len when using xdp

2024-02-29 Thread Paolo Abeni

On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> If TUN supports AF_XDP TX zero-copy, the XDP program will enqueue
> packets to the XDP ring and wake up the vhost worker. This requires
> the vhost worker to call peek_len(), which can be used to consume
> XDP descriptors.
> 
> Signed-off-by: Yunjian Wang 
> ---
>  drivers/vhost/net.c | 17 -
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index f2ed7167c848..077e74421558 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -207,6 +207,11 @@ static int vhost_net_buf_peek_len(void *ptr)
>   return __skb_array_len_with_tag(ptr);
>  }
>  
> +static bool vhost_sock_xdp(struct socket *sock)
> +{
> + return sock_flag(sock->sk, SOCK_XDP);
> +}
> +
>  static int vhost_net_buf_peek(struct vhost_net_virtqueue *nvq)
>  {
>   struct vhost_net_buf *rxq = &nvq->rxq;
> @@ -214,6 +219,13 @@ static int vhost_net_buf_peek(struct vhost_net_virtqueue 
> *nvq)
>   if (!vhost_net_buf_is_empty(rxq))
>   goto out;
>  
> + if (ptr_ring_empty(nvq->rx_ring)) {
> + struct socket *sock = vhost_vq_get_backend(&nvq->vq);
> + /* Call peek_len to consume XSK descriptors, when using xdp */
> + if (vhost_sock_xdp(sock) && sock->ops->peek_len)
> + sock->ops->peek_len(sock);

This really looks like a socket API misuse. Why can't you use ptr-ring
primitives to consume XSK descriptors? peek_len could be constified
some day, this code will prevent such (good) thing.

Cheers,

Paolo

Re: [PATCH net-next v2 1/3] xsk: Remove non-zero 'dma_page' check in xp_assign_dev

2024-02-29 Thread Paolo Abeni

On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> Now dma mappings are used by the physical NICs. However the vNIC
> maybe do not need them. So remove non-zero 'dma_page' check in
> xp_assign_dev.
> 
> Signed-off-by: Yunjian Wang 
> ---
>  net/xdp/xsk_buff_pool.c | 7 ---
>  1 file changed, 7 deletions(-)
> 
> diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
> index ce60ecd48a4d..a5af75b1f43c 100644
> --- a/net/xdp/xsk_buff_pool.c
> +++ b/net/xdp/xsk_buff_pool.c
> @@ -219,16 +219,9 @@ int xp_assign_dev(struct xsk_buff_pool *pool,
>   if (err)
>   goto err_unreg_pool;
>  
> - if (!pool->dma_pages) {
> - WARN(1, "Driver did not DMA map zero-copy buffers");
> - err = -EINVAL;
> - goto err_unreg_xsk;
> - }

This would unconditionally remove an otherwise valid check for most
NIC. What about let the driver declare it wont need DMA map with a
(pool?) flag.

Cheers,

Paolo

Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-02-29 Thread kernel test robot

Hi Yunjian,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:
https://github.com/intel-lab-lkp/linux/commits/Yunjian-Wang/xsk-Remove-non-zero-dma_page-check-in-xp_assign_dev/20240228-190840
base:   net-next/main
patch link:
https://lore.kernel.org/r/1709118356-133960-1-git-send-email-wangyunjian%40huawei.com
patch subject: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support
config: i386-randconfig-012-20240229 
(https://download.01.org/0day-ci/archive/20240229/202402291828.g9c5tw50-...@intel.com/config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240229/202402291828.g9c5tw50-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202402291828.g9c5tw50-...@intel.com/

All errors (new ones prefixed by >>):

   drivers/vhost/net.c: In function 'vhost_net_buf_peek_len':
>> drivers/vhost/net.c:206:27: error: initialization from incompatible pointer 
>> type [-Werror=incompatible-pointer-types]
  struct xdp_desc *desc = tun_ptr_to_xdp_desc(ptr);
  ^~~
   cc1: some warnings being treated as errors


vim +206 drivers/vhost/net.c

   198  
   199  static int vhost_net_buf_peek_len(void *ptr)
   200  {
   201  if (tun_is_xdp_frame(ptr)) {
   202  struct xdp_frame *xdpf = tun_ptr_to_xdp(ptr);
   203  
   204  return xdpf->len;
   205  } else if (tun_is_xdp_desc_frame(ptr)) {
 > 206  struct xdp_desc *desc = tun_ptr_to_xdp_desc(ptr);
   207  
   208  return desc->len;
   209  }
   210  
   211  return __skb_array_len_with_tag(ptr);
   212  }
   213  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Re: [PATCH v7 0/3] vduse: add support for networking devices

2024-02-29 Thread Maxime Coquelin


Hello Michael,

On 2/1/24 09:40, Michael S. Tsirkin wrote:

On Thu, Feb 01, 2024 at 09:34:11AM +0100, Maxime Coquelin wrote:

Hi Jason,

It looks like all patches got acked by you.
Any blocker to queue the series for next release?

Thanks,
Maxime


I think it's good enough at this point. Will put it in
linux-next shortly.



I fetched linux-next and it seems the series is not in yet.
Is there anything to be reworked on my side?

Thanks,
Maxime

Fwd: [PATCH] tcp: Add skb addr and sock addr to arguments of tracepoint tcp_probe.

2024-02-29 Thread yuanli fu

-- Forwarded message -
发件人： yuanli fu 
Date: 2024年2月29日周四 16:27
Subject: Re: [PATCH] tcp: Add skb addr and sock addr to arguments of
tracepoint tcp_probe.
To: Jason Xing 


Jason Xing  于2024年2月29日周四 15:30写道：
>
> On Thu, Feb 29, 2024 at 1:33 PM fuyuanli  wrote:
> >
> > It is useful to expose skb addr and sock addr to user in tracepoint
> > tcp_probe, so that we can get more information while monitoring
> > receiving of tcp data, by ebpf or other ways.
> >
> > For example, we need to identify a packet by seq and end_seq when
> > calculate transmit latency between lay 2 and lay 4 by ebpf, but which is
> > not available in tcp_probe, so we can only use kprobe hooking
> > tcp_rcv_esatblised to get them. But we can use tcp_probe directly if skb
> > addr and sock addr are available, which is more efficient.
> >
> > Signed-off-by: fuyuanli 
>
> Please target 'net-next' in the title of your v2 patch.
Got it, thanks.
>
> > ---
> >  include/trace/events/tcp.h | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> > index 7b1ddffa3dfc..096c15f64b92 100644
> > --- a/include/trace/events/tcp.h
> > +++ b/include/trace/events/tcp.h
> > @@ -258,6 +258,8 @@ TRACE_EVENT(tcp_probe,
> > __field(__u32, srtt)
> > __field(__u32, rcv_wnd)
> > __field(__u64, sock_cookie)
> > +   __field(const void *, skbaddr)
> > +   __field(const void *, skaddr)
> > ),
> >
> > TP_fast_assign(
> > @@ -285,6 +287,9 @@ TRACE_EVENT(tcp_probe,
> > __entry->ssthresh = tcp_current_ssthresh(sk);
> > __entry->srtt = tp->srtt_us >> 3;
> > __entry->sock_cookie = sock_gen_cookie(sk);
> > +
> > +   __entry->skbaddr = skb;
> > +   __entry->skaddr = sk;
> > ),
> >
> > TP_printk("family=%s src=%pISpc dest=%pISpc mark=%#x data_len=%d 
> > snd_nxt=%#x snd_una=%#x snd_cwnd=%u ssthresh=%u snd_wnd=%u srtt=%u 
> > rcv_wnd=%u sock_cookie=%llx",
>
> If they are useful, at least you should printk those two addresses
> like what trace_kfree_skb() does.
>
Got it, I will add to printk
> May I ask how it could be useful if there is no more function printing
> such information in the receive path?
We can get seq and end_seq by skbaddr in netif_receive_skb, so latency
between l2->l4 can be calculated.
>
> Thanks,
> Jason
> > --
> > 2.17.1
> >
> >
>
thanks
fuyuanli

Re: [PATCH v11 2/4] dt-bindings: remoteproc: add Tightly Coupled Memory (TCM) bindings

2024-02-29 Thread Krzysztof Kozlowski

On 19/02/2024 18:44, Tanmay Shah wrote:
> From: Radhey Shyam Pandey 
> 
> Introduce bindings for TCM memory address space on AMD-xilinx Zynq
> UltraScale+ platform. It will help in defining TCM in device-tree
> and make it's access platform agnostic and data-driven.
> 
> Tightly-coupled memories(TCMs) are low-latency memory that provides
> predictable instruction execution and predictable data load/store
> timing. Each Cortex-R5F processor contains two 64-bit wide 64 KB memory
> banks on the ATCM and BTCM ports, for a total of 128 KB of memory.
> 
> The TCM resources(reg, reg-names and power-domain) are documented for
> each TCM in the R5 node. The reg and reg-names are made as required
> properties as we don't want to hardcode TCM addresses for future
> platforms and for zu+ legacy implementation will ensure that the
> old dts w/o reg/reg-names works and stable ABI is maintained.
> 
> It also extends the examples for TCM split and lockstep modes.
> 
> Signed-off-by: Radhey Shyam Pandey 
> Signed-off-by: Tanmay Shah 
> ---
> 
> Changes in v11:
>   - Fix yamllint warning and reduce indentation as needed
> 
>  .../remoteproc/xlnx,zynqmp-r5fss.yaml | 192 --
>  1 file changed, 170 insertions(+), 22 deletions(-)
> 
> diff --git 
> a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml 
> b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
> index 78aac69f1060..77030edf41fa 100644
> --- a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
> +++ b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
> @@ -20,9 +20,21 @@ properties:
>compatible:
>  const: xlnx,zynqmp-r5fss
>  
> +  "#address-cells":
> +const: 2
> +
> +  "#size-cells":
> +const: 2
> +
> +  ranges:
> +description: |
> +  Standard ranges definition providing address translations for
> +  local R5F TCM address spaces to bus addresses.
> +
>xlnx,cluster-mode:
>  $ref: /schemas/types.yaml#/definitions/uint32
>  enum: [0, 1, 2]
> +default: 1
>  description: |
>The RPU MPCore can operate in split mode (Dual-processor performance), 
> Safety
>lock-step mode(Both RPU cores execute the same code in lock-step,
> @@ -37,7 +49,7 @@ properties:
>2: single cpu mode
>  
>  patternProperties:
> -  "^r5f-[a-f0-9]+$":
> +  "^r5f@[0-9a-f]+$":
>  type: object
>  description: |
>The RPU is located in the Low Power Domain of the Processor Subsystem.
> @@ -54,9 +66,6 @@ patternProperties:
>compatible:
>  const: xlnx,zynqmp-r5f
>  
> -  power-domains:
> -maxItems: 1

Why power-domains are being dropped? This should have widest constraints
if you later customize it.

> -
>mboxes:
>  minItems: 1
>  items:
> @@ -101,35 +110,174 @@ patternProperties:
>  
>  required:
>- compatible
> -  - power-domains

Don't drop power domains.


>  
> -unevaluatedProperties: false
> +allOf:

allOf block goes after required:

> +  - if:
> +  properties:
> +xlnx,cluster-mode:
> +  enum:
> +- 1
> +then:
> +  patternProperties:
> +"^r5f@[0-9a-f]+$":
> +  type: object
> +
> +  properties:
> +reg:

reg is missing in your patternProperties earlier.



Best regards,
Krzysztof

Re: [PATCH net-next v5] virtio_net: Support RX hash XDP hint

2024-02-29 Thread Liang Chen

On Tue, Feb 27, 2024 at 4:42 AM John Fastabend  wrote:
>
> Jason Wang wrote:
> > On Fri, Feb 23, 2024 at 9:42 AM Xuan Zhuo  
> > wrote:
> > >
> > > On Fri, 09 Feb 2024 13:57:25 +0100, Paolo Abeni  wrote:
> > > > On Fri, 2024-02-09 at 18:39 +0800, Liang Chen wrote:
> > > > > On Wed, Feb 7, 2024 at 10:27 PM Paolo Abeni  wrote:
> > > > > >
> > > > > > On Wed, 2024-02-07 at 10:54 +0800, Liang Chen wrote:
> > > > > > > On Tue, Feb 6, 2024 at 6:44 PM Paolo Abeni  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Sat, 2024-02-03 at 10:56 +0800, Liang Chen wrote:
> > > > > > > > > On Sat, Feb 3, 2024 at 12:20 AM Jesper Dangaard Brouer 
> > > > > > > > >  wrote:
> > > > > > > > > > On 02/02/2024 13.11, Liang Chen wrote:
> > > > > > > > [...]
> > > > > > > > > > > @@ -1033,6 +1039,16 @@ static void put_xdp_frags(struct 
> > > > > > > > > > > xdp_buff *xdp)
> > > > > > > > > > >   }
> > > > > > > > > > >   }
> > > > > > > > > > >
> > > > > > > > > > > +static void virtnet_xdp_save_rx_hash(struct 
> > > > > > > > > > > virtnet_xdp_buff *virtnet_xdp,
> > > > > > > > > > > +  struct net_device *dev,
> > > > > > > > > > > +  struct 
> > > > > > > > > > > virtio_net_hdr_v1_hash *hdr_hash)
> > > > > > > > > > > +{
> > > > > > > > > > > + if (dev->features & NETIF_F_RXHASH) {
> > > > > > > > > > > + virtnet_xdp->hash_value = 
> > > > > > > > > > > hdr_hash->hash_value;
> > > > > > > > > > > + virtnet_xdp->hash_report = 
> > > > > > > > > > > hdr_hash->hash_report;
> > > > > > > > > > > + }
> > > > > > > > > > > +}
> > > > > > > > > > > +
> > > > > > > > > >
> > > > > > > > > > Would it be possible to store a pointer to hdr_hash in 
> > > > > > > > > > virtnet_xdp_buff,
> > > > > > > > > > with the purpose of delaying extracting this, until and 
> > > > > > > > > > only if XDP
> > > > > > > > > > bpf_prog calls the kfunc?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > That seems to be the way v1 works,
> > > > > > > > > https://lore.kernel.org/all/20240122102256.261374-1-liangchen.li...@gmail.com/
> > > > > > > > > . But it was pointed out that the inline header may be 
> > > > > > > > > overwritten by
> > > > > > > > > the xdp prog, so the hash is copied out to maintain its 
> > > > > > > > > integrity.
> > > > > > > >
> > > > > > > > Why? isn't XDP supposed to get write access only to the pkt
> > > > > > > > contents/buffer?
> > > > > > > >
> > > > > > >
> > > > > > > Normally, an XDP program accesses only the packet data. However,
> > > > > > > there's also an XDP RX Metadata area, referenced by the data_meta
> > > > > > > pointer. This pointer can be adjusted with bpf_xdp_adjust_meta to
> > > > > > > point somewhere ahead of the data buffer, thereby granting the XDP
> > > > > > > program access to the virtio header located immediately before the
> > > > > >
> > > > > > AFAICS bpf_xdp_adjust_meta() does not allow moving the meta_data 
> > > > > > before
> > > > > > xdp->data_hard_start:
> > > > > >
> > > > > > https://elixir.bootlin.com/linux/latest/source/net/core/filter.c#L4210
> > > > > >
> > > > > > and virtio net set such field after the virtio_net_hdr:
> > > > > >
> > > > > > https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L1218
> > > > > > https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L1420
> > > > > >
> > > > > > I don't see how the virtio hdr could be touched? Possibly even more
> > > > > > important: if such thing is possible, I think is should be somewhat
> > > > > > denied (for the same reason an H/W nic should prevent XDP from
> > > > > > modifying its own buffer descriptor).
> > > > >
> > > > > Thank you for highlighting this concern. The header layout differs
> > > > > slightly between small and mergeable mode. Taking 'mergeable mode' as
> > > > > an example, after calling xdp_prepare_buff the layout of xdp_buff
> > > > > would be as depicted in the diagram below,
> > > > >
> > > > >   buf
> > > > >|
> > > > >v
> > > > > +--+--+-+
> > > > > | xdp headroom | virtio header| packet  |
> > > > > | (256 bytes)  | (20 bytes)   | content |
> > > > > +--+--+-+
> > > > > ^ ^
> > > > > | |
> > > > >  data_hard_startdata
> > > > >   data_meta
> > > > >
> > > > > If 'bpf_xdp_adjust_meta' repositions the 'data_meta' pointer a little
> > > > > towards 'data_hard_start', it would point to the inline header, thus
> > > > > potentially allowing the XDP program to access the inline header.
>
> Fairly late to the thread sorry. Given above layout does it make sense to
> just delay extraction to the kfunc as suggested above? Sure the XDP program
> could smash the ent

38 matches

Mail list logo