subject:"RE"

Re: [PATCH] rv: Update rv_en(dis)able_monitor doc to match kernel-doc

2024-05-28 Thread Daniel Bristot de Oliveira

On 5/20/24 07:42, Yang Li wrote:
> The patch updates the function documentation comment for
> rv_en(dis)able_monitor to adhere to the kernel-doc specification.
> 
> Signed-off-by: Yang Li 

Acked-by: Daniel Bristot de Oliveira 

Thanks
-- Daniel

Re: [PATCH] [v5] kallsyms: rework symbol lookup return codes

2024-05-28 Thread Geert Uytterhoeven

Hi Arnd,

On Thu, Apr 4, 2024 at 4:52 PM Arnd Bergmann  wrote:
> From: Arnd Bergmann 
>
> Building with W=1 in some configurations produces a false positive
> warning for kallsyms:
>
> kernel/kallsyms.c: In function '__sprint_symbol.isra':
> kernel/kallsyms.c:503:17: error: 'strcpy' source argument is the same as 
> destination [-Werror=restrict]
>   503 | strcpy(buffer, name);
>   | ^~~~
>
> This originally showed up while building with -O3, but later started
> happening in other configurations as well, depending on inlining
> decisions. The underlying issue is that the local 'name' variable is
> always initialized to the be the same as 'buffer' in the called functions
> that fill the buffer, which gcc notices while inlining, though it could
> see that the address check always skips the copy.
>
> The calling conventions here are rather unusual, as all of the internal
> lookup functions (bpf_address_lookup, ftrace_mod_address_lookup,
> ftrace_func_address_lookup, module_address_lookup and
> kallsyms_lookup_buildid) already use the provided buffer and either return
> the address of that buffer to indicate success, or NULL for failure,
> but the callers are written to also expect an arbitrary other buffer
> to be returned.
>
> Rework the calling conventions to return the length of the filled buffer
> instead of its address, which is simpler and easier to follow as well
> as avoiding the warning. Leave only the kallsyms_lookup() calling conventions
> unchanged, since that is called from 16 different functions and
> adapting this would be a much bigger change.
>
> Link: https://lore.kernel.org/all/20200107214042.855757-1-a...@arndb.de/
> Link: https://lore.kernel.org/lkml/20240326130647.7bfb1...@gandalf.local.home/
> Reviewed-by: Luis Chamberlain 
> Acked-by: Steven Rostedt (Google) 
> Signed-off-by: Arnd Bergmann 
> ---
> v5: fix ftrace_mod_address_lookup return value,
> rebased on top of 2e114248e086 ("bpf: Replace deprecated strncpy with 
> strscpy")
> v4: fix string length
> v3: use strscpy() instead of strlcpy()
> v2: complete rewrite after the first patch was rejected (in 2020). This
> is now one of only two warnings that are in the way of enabling
> -Wextra/-Wrestrict by default.

Aha, commit 06bb7fc0feee32d9 ("kbuild: turn on -Wrestrict by default")
still made v6.10-rc1, without this one...

> Signed-off-by: Arnd Bergmann 

Thanks, this fixes

kernel/kallsyms.c: In function ‘__sprint_symbol.constprop’:
kernel/kallsyms.c:492:17: warning: ‘strcpy’ source argument is the
same as destination [-Werror=restrict]
  492 | strcpy(buffer, name);
  | ^~~~

I am seeing with shmobile_defconfig and gcc version 11.4.0 (Ubuntu
11.4.0-1ubuntu1~22.04).

Tested-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [linus:master] [mm] d99e3140a4: BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report

2024-05-28 Thread Miaohe Lin

On 2024/5/28 15:43, David Hildenbrand wrote:
> Am 28.05.24 um 09:11 schrieb kernel test robot:
>>
>>
>> Hello,
>>
>> kernel test robot noticed 
>> "BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report" on:
>>
>> commit: d99e3140a4d33e26066183ff727d8f02f56bec64 ("mm: turn 
>> folio_test_hugetlb into a PageType")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> [test failed on linus/master  c760b3725e52403dc1b28644fb09c47a83cacea6]
>> [test failed on linux-next/master 3689b0ef08b70e4e03b82ebd37730a03a672853a]
>>
>> in testcase: trinity
>> version: trinity-i386-abe9de86-1_20230429
>> with following parameters:
>>
>> runtime: 300s
>> group: group-04
>> nr_groups: 5
>>
>>
>>
>> compiler: gcc-13
>> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>>
>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>
>>
>> we noticed this issue does not always happen. we also noticed there are
>> different random KCSAN issues for both this commit and its parent. but below
>> 4 only happen on this commit with not small rate and keep clean on parent.
>>
> 
> Likely that's just a page_type check racing against concurrent
> mapcount changes.
> 
> In __folio_rmap_sanity_checks() we check
> VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
> 
> To make sure we don't get hugetlb folios in the wrong rmap code path. That
> can easily race with concurrent mapcount changes, just like any other
> page_type checks that end up in folio_test_type/page_has_type e.g., from
> PFN walkers.
> 
> Load tearing in these functions shouldn't really result in false positives
> (what we care about), but READ_ONCE shouldn't hurt or make a difference.
> 
> 
> From b03dc9bf27571442d886d8da624a4e4f737433f2 Mon Sep 17 00:00:00 2001
> From: David Hildenbrand 
> Date: Tue, 28 May 2024 09:37:20 +0200
> Subject: [PATCH] mm: read page_type using READ_ONCE
> 
> KCSAN complains about possible data races: while we check for a
> page_type -- for example for sanity checks -- we might concurrently
> modify the mapcount that overlays page_type.
> 
> Let's use READ_ONCE to avoid laod tearing (shouldn't make a difference)
> and to make KCSAN happy.
> 
> Note: nothing should really be broken besides wrong KCSAN complaints.
> 
> Reported-by: kernel test robot 
> Closes: https://lore.kernel.org/oe-lkp/202405281431.c46a3be9-...@intel.com
> Signed-off-by: David Hildenbrand 

LGTM. Thanks for fixing.

Reviewed-by: Miaohe Lin 
Thanks.
.

Re: [PATCH v2 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP

2024-05-28 Thread Krzysztof Kozlowski

On 28/05/2024 09:26, Bartosz Golaszewski wrote:
> On Mon, May 27, 2024 at 10:56 AM Krzysztof Kozlowski  wrote:
>>
>> On 27/05/2024 10:43, Bartosz Golaszewski wrote:
>>> From: Bartosz Golaszewski 
>>>
>>> Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and
>>> GPDSP1 on the SA8775p SoC.
>>>
>>> Co-developed-by: Tengfei Fan 
>>
>> Missing SoB.
>>
> 
> Does it though? The patch never passed through Tengfei's hands, I just
> wanted to give him credit for the work modifying the sm8550-pas
> bindings.

Then what was co-developed by Tengfei? If nothing, then indeed no SoB
but also no Co-developed-by. Docs are clear here: every Co-developed-by
*must* be followed by SoB.

Best regards,
Krzysztof

Re: [linus:master] [mm] d99e3140a4: BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report

2024-05-28 Thread Oscar Salvador

On Tue, May 28, 2024 at 09:43:39AM +0200, David Hildenbrand wrote:
> Likely that's just a page_type check racing against concurrent
> mapcount changes.
> 
> In __folio_rmap_sanity_checks() we check
>   VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);

Yeah, and that "collides" with

last = atomic_add_negative(-1, >_mapcount)

from __folio_remove_rmap.

> To make sure we don't get hugetlb folios in the wrong rmap code path. That
> can easily race with concurrent mapcount changes, just like any other
> page_type checks that end up in folio_test_type/page_has_type e.g., from
> PFN walkers.
> 
> Load tearing in these functions shouldn't really result in false positives
> (what we care about), but READ_ONCE shouldn't hurt or make a difference.
> 
> 
> From b03dc9bf27571442d886d8da624a4e4f737433f2 Mon Sep 17 00:00:00 2001
> From: David Hildenbrand 
> Date: Tue, 28 May 2024 09:37:20 +0200
> Subject: [PATCH] mm: read page_type using READ_ONCE
> 
> KCSAN complains about possible data races: while we check for a
> page_type -- for example for sanity checks -- we might concurrently
> modify the mapcount that overlays page_type.
> 
> Let's use READ_ONCE to avoid laod tearing (shouldn't make a difference)
> and to make KCSAN happy.
> 
> Note: nothing should really be broken besides wrong KCSAN complaints.
> 
> Reported-by: kernel test robot 
> Closes: https://lore.kernel.org/oe-lkp/202405281431.c46a3be9-...@intel.com
> Signed-off-by: David Hildenbrand 

Reviewed-by: Oscar Salvador 

Thanks!

-- 
Oscar Salvador
SUSE Labs

Re: [linus:master] [mm] d99e3140a4: BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report

2024-05-28 Thread David Hildenbrand


Am 28.05.24 um 09:11 schrieb kernel test robot:



Hello,

kernel test robot noticed 
"BUG:KCSAN:data-race_in_folio_remove_rmap_ptes/print_report" on:

commit: d99e3140a4d33e26066183ff727d8f02f56bec64 ("mm: turn folio_test_hugetlb into 
a PageType")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master  c760b3725e52403dc1b28644fb09c47a83cacea6]
[test failed on linux-next/master 3689b0ef08b70e4e03b82ebd37730a03a672853a]

in testcase: trinity
version: trinity-i386-abe9de86-1_20230429
with following parameters:

runtime: 300s
group: group-04
nr_groups: 5



compiler: gcc-13
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


we noticed this issue does not always happen. we also noticed there are
different random KCSAN issues for both this commit and its parent. but below
4 only happen on this commit with not small rate and keep clean on parent.



Likely that's just a page_type check racing against concurrent
mapcount changes.

In __folio_rmap_sanity_checks() we check
VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);

To make sure we don't get hugetlb folios in the wrong rmap code path. That
can easily race with concurrent mapcount changes, just like any other
page_type checks that end up in folio_test_type/page_has_type e.g., from
PFN walkers.

Load tearing in these functions shouldn't really result in false positives
(what we care about), but READ_ONCE shouldn't hurt or make a difference.


From b03dc9bf27571442d886d8da624a4e4f737433f2 Mon Sep 17 00:00:00 2001
From: David Hildenbrand 
Date: Tue, 28 May 2024 09:37:20 +0200
Subject: [PATCH] mm: read page_type using READ_ONCE

KCSAN complains about possible data races: while we check for a
page_type -- for example for sanity checks -- we might concurrently
modify the mapcount that overlays page_type.

Let's use READ_ONCE to avoid laod tearing (shouldn't make a difference)
and to make KCSAN happy.

Note: nothing should really be broken besides wrong KCSAN complaints.

Reported-by: kernel test robot 
Closes: https://lore.kernel.org/oe-lkp/202405281431.c46a3be9-...@intel.com
Signed-off-by: David Hildenbrand 
---
 include/linux/page-flags.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 104078afe0b1..e46ccbb9aa58 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -955,9 +955,9 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
 #define PG_slab0x1000
 
 #define PageType(page, flag)		\

-   ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
+   ((READ_ONCE(page->page_type) & (PAGE_TYPE_BASE | flag)) == 
PAGE_TYPE_BASE)
 #define folio_test_type(folio, flag)   \
-   ((folio->page.page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
+   ((READ_ONCE(folio->page.page_type) & (PAGE_TYPE_BASE | flag))  == 
PAGE_TYPE_BASE)
 
 static inline int page_type_has_type(unsigned int page_type)

 {
@@ -966,7 +966,7 @@ static inline int page_type_has_type(unsigned int page_type)
 
 static inline int page_has_type(const struct page *page)

 {
-   return page_type_has_type(page->page_type);
+   return page_type_has_type(READ_ONCE(page->page_type));
 }
 
 #define FOLIO_TYPE_OPS(lname, fname)	\

--
2.45.1


--
Thanks,

David / dhildenb

Re: [PATCH v2 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP

2024-05-28 Thread Bartosz Golaszewski

On Mon, May 27, 2024 at 10:56 AM Krzysztof Kozlowski  wrote:
>
> On 27/05/2024 10:43, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski 
> >
> > Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and
> > GPDSP1 on the SA8775p SoC.
> >
> > Co-developed-by: Tengfei Fan 
>
> Missing SoB.
>

Does it though? The patch never passed through Tengfei's hands, I just
wanted to give him credit for the work modifying the sm8550-pas
bindings.

Bart

Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Thorsten Scherer

Hello,

On Mon, May 27, 2024 at 01:51:49PM +0100, Russell King (Oracle) wrote:
> On Mon, May 27, 2024 at 02:28:59PM +0200, Uwe Kleine-König wrote:
> > On Mon, May 27, 2024 at 08:56:16AM +0100, Russell King (Oracle) wrote:
> > > On Mon, May 27, 2024 at 09:43:41AM +0200, Thorsten Scherer wrote:
> > > > Hello,
> > > > 
> > > > in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe 
> > > > pointed me to
> > > > this thread.  With the proposed code change applied the procedure
> > > > 
> > > > # set to some known good (randomly guessed) filter function and 
> > > > enable function_graph
> > > > echo mtdblock_open > set_ftrace_filter
> > > > echo function_graph > current_tracer
> > > > 
> > > > # walk available filter funcs
> > > > cat available_filter_functions | while read f; do echo $f | tee -a 
> > > > set_ftrace_filter; sleep 1; done
> > > > 
> > > > produces the following output
> > > > 
> > > > [  159.832173] Insufficient stack space to handle exception!
> > > > [  159.832241] Task stack: [0xc8e44000..0xc8e46000]
> > > > [  159.842701] IRQ stack:  [0xc880..0xc8802000]
> > > > [  159.847712] Overflow stack: [0xc1934000..0xc1935000]
> > > > [  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
> > > > [  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
> > > > industrialio_triggered_buffer kfifo_buf capture_events_irq 
> > > > capture_events iio_trig_hrtimer industrialio_sw_trigger 
> > > > industrialio_configfs dm_mod
> > > > [  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
> > > > [  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
> > > > [  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
> > > > [  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
> > > > [  159.898376] pc : []lr : []psr: 6093
> > > > [  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
> > > > [  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
> > > > [  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : 
> > > > c8e440ac
> > > > [  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : 
> > > > c8e44038
> > > > [  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  
> > > > Segment none
> > > > [  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
> > > > [  159.941436] Register r0 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  159.952253] Register r1 information: non-slab/vmalloc memory
> > > > [  159.957988] Register r2 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  159.968791] Register r3 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  159.979592] Register r4 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  159.990391] Register r5 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  160.001187] Register r6 information: non-paged memory
> > > > [  160.006303] Register r7 information: non-paged memory
> > > > [  160.011415] Register r8 information: non-slab/vmalloc memory
> > > > [  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 
> > > > pointer offset 0 size 32
> > > > [  160.025718] Register r10 information: non-slab/vmalloc memory
> > > > [  160.031530] Register r11 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  160.042422] Register r12 information: 2-page vmalloc region starting 
> > > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > > [  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
> > > > [  160.058955] Stack: (0xc8e44018 to 0xc8e46000)
> > > 
> > > No backtrace? No Code: line? I'm guessing there was an attempt to ftrace
> > > a function involving the ftrace tracing infrastructure, which is why 8KB
> > > of stack has been gobbled up. It could be
> > > copy_from_kernel_nofault_allowed() but it would be useful to have at
> > > least some extract of the backtrace showing the recursive cycle to
> > > confirm, otherwise there is nothing in your report to confirm. As I'm
> > > not a ftrace user myself, this isn't something I'd test for, so having
> > > a full report would be useful.
> > 
> > Is not having a backtrace related to ftrace_return_address() returning
> > NULL, as Arnd pointed out in
> > https://lore.kernel.org/linux-arm-kernel/36cd10de-c51c-40ff-90e8-714954060...@app.fastmail.com/
> > ?
> 
> Unlikely - the stack and code lines are also missing. I think the
> submitter truncated the oops which is highly likely given that it
> would've dumped all 8kB of the stack in hex, and the trace and
> code lines would be after that.

sorry for causing additional friction by my imprecise description.
Indeed, this was the whole oops

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Ilkka Naulapää

yeah, the cache_from_obj tracing bug (without panic) has been
displayed quite some time now - maybe even since 6.7.x or so. I could
try checking a few versions back for this and try bisecting it if I
can find when this started.

--Ilkka

On Tue, May 28, 2024 at 1:31 AM Steven Rostedt  wrote:
>
> On Fri, 24 May 2024 12:50:08 +0200
> "Linux regression tracking (Thorsten Leemhuis)"  
> wrote:
>
> > > - Affected Versions: Before kernel version 6.8.10, the bug caused a
> > > quick display of a kernel trace dump before the shutdown/reboot
> > > completed. Starting from version 6.8.10 and continuing into version
> > > 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> > > preventing the shutdown or reboot from completing and leaving the
> > > machine stuck.
>
> You state "Before kernel version 6.8.10, the bug caused ...". Does that
> mean that a bug was happening before v6.8.10? But did not cause a panic?
>
> I just noticed your second screen shot from your report, and it has:
>
>  "cache_from_obj: Wrong slab cache, tracefs_inode_cache but object is from 
> inode_cache"
>
> So somehow an tracefs_inode was allocated from the inode_cache and is
> now being freed by the tracefs_inode logic? Did this happen before
> 6.8.10? If so, this code could just be triggering an issue from an
> unrelated bug.
>
> Thanks,
>
> -- Steve

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Ilkka Naulapää

I tried 6.10-rc1 and it still ends up to panic

--Ilkka


On Tue, May 28, 2024 at 12:44 AM Steven Rostedt  wrote:
>
> On Mon, 27 May 2024 20:14:42 +0200
> Greg KH  wrote:
>
> > On Mon, May 27, 2024 at 07:40:21PM +0300, Ilkka Naulapää wrote:
> > > Hi Steven,
> > >
> > > I took some time and bisected the 6.8.9 - 6.8.10 and git gave the
> > > panic inducing commit:
> > >
> > > 414fb08628143 (tracefs: Reset permissions on remount if permissions are 
> > > options)
> > >
> > > I reverted that commit to 6.9.2 and now it only serves the trace but
> > > the panic is gone. But I can live with it.
> >
> > Steven, should we revert that?
> >
> > Or is there some other change that we should take to resolve this?
> >
>
> Before we revert it (as it may be a bug in mainline), Ilkka, can you
> test v6.10-rc1?  If it exists there, it will let me know whether or not
> I missed something.
>
> Thanks,
>
> -- Steve

Re: (subset) [PATCH v2 0/2] Add Samsung Galaxy Note 3 support

2024-05-27 Thread Bjorn Andersson



On Thu, 14 Mar 2024 20:00:13 +0100, Luca Weiss wrote:
> Add the dts for "hlte" which is a phablet from 2013.
> 
> 

Applied, thanks!

[2/2] ARM: dts: qcom: msm8974: Add Samsung Galaxy Note 3
  commit: b4f6c63bf34d8da1b769483bb1f4a603c53896ce

Best regards,
-- 
Bjorn Andersson

Re: (subset) [PATCH 0/2] Add basic APR sound support for SC7280 SoC

2024-05-27 Thread Bjorn Andersson



On Fri, 10 May 2024 14:27:07 +0200, Luca Weiss wrote:
> Validated on Fairphone 5 (QCM6490) smartphone by using DisplayPort over
> USB-C audio, connected to a TV, with a basic UCM to enable
> 'DISPLAY_PORT_RX Audio Mixer MultiMedia1':
> https://gitlab.com/postmarketOS/pmaports/-/tree/master/device/testing/device-fairphone-fp5/ucm
> 
> Unfortunately all the device-specific things can't be enabled yet
> upstream as detailed in the second patch, but the SoC parts should be
> good to go.
> 
> [...]

Applied, thanks!

[1/2] arm64: dts: qcom: sc7280: Add APR nodes for sound
  commit: f44da5d8722de348ff2eb8b206c69b52809c1772

Best regards,
-- 
Bjorn Andersson

Re: (subset) [PATCH v3 0/2] Add samsung-milletwifi

2024-05-27 Thread Bjorn Andersson



On Mon, 19 Feb 2024 22:43:15 +0100, Bryant Mairs wrote:
> This series adds support for samsung-milletwifi, the smaller cousin
> to samsung-matisselte. I've used the manufacturer's naming convention
> for consistency.
> 
> Hardware currently supported:
> - Display
> - Cover detection
> - Physical buttons
> - Touchscreen and touchkeys
> - Accelerometer
> 
> [...]

Applied, thanks!

[2/2] ARM: dts: qcom: Add support for Samsung Galaxy Tab 4 8.0 Wi-Fi
  commit: 49b9981a0ecae2bbb298d8b0c2b8058220038691

Best regards,
-- 
Bjorn Andersson

Re: [PATCH v4 0/4] MSM8976 MDSS/GPU/WCNSS support

2024-05-27 Thread Bjorn Andersson



On Wed, 08 May 2024 18:34:33 +0200, Adam Skladowski wrote:
> This patch series provide support for display subsystem, gpu
> and also adds wireless connectivity subsystem support.
> 
> Changes since v3
> 
> 1. Minor styling fixes
> 2. Converted qcom,ipc into mailbox on wcnss patch
> 
> [...]

Applied, thanks!

[1/4] arm64: dts: qcom: msm8976: Add IOMMU nodes
  commit: 418c2ffd7df9bfc25c21172bd881b78d7569fb4d
[2/4] arm64: dts: qcom: msm8976: Add MDSS nodes
  commit: b0516dbf8e218dede2fd2837ca82dccd9cdcdafc
[3/4] arm64: dts: qcom: msm8976: Add Adreno GPU
  commit: 00e67d8e80f06bb848a3dd516d06e2f040b7d8f2
[4/4] arm64: dts: qcom: msm8976: Add WCNSS node
  commit: 45878973229a93f0f42aa048ac8c6223af010082

Best regards,
-- 
Bjorn Andersson

Re: [PATCH v2 4/5] arm64: dts: qcom: sa8775p: add ADSP, CDSP and GPDSP nodes

2024-05-27 Thread Tengfei Fan





On 5/27/2024 4:43 PM, Bartosz Golaszewski wrote:

From: Tengfei Fan 

Add nodes for remoteprocs: ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 for
SA8775p SoCs.

Signed-off-by: Tengfei Fan 
Co-developed-by: Bartosz Golaszewski 
Signed-off-by: Bartosz Golaszewski 
---
  arch/arm64/boot/dts/qcom/sa8775p.dtsi | 332 ++
  1 file changed, 332 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sa8775p.dtsi 
b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
index 31de73594839..5c0b61a5624b 100644
--- a/arch/arm64/boot/dts/qcom/sa8775p.dtsi
+++ b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
@@ -10,6 +10,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  
@@ -544,6 +545,121 @@ cpucp_fw_mem: cpucp-fw@db20 {

};
};
  
+	smp2p-adsp {

+   compatible = "qcom,smp2p";
+   qcom,smem = <443>, <429>;
+   interrupts-extended = < IPCC_CLIENT_LPASS
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_LPASS IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <2>;
+
+   smp2p_adsp_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_adsp_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-cdsp0 {
+   compatible = "qcom,smp2p";
+   qcom,smem = <94>, <432>;
+   interrupts-extended = < IPCC_CLIENT_CDSP
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_CDSP IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <5>;
+
+   smp2p_cdsp0_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_cdsp0_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-cdsp1 {
+   compatible = "qcom,smp2p";
+   qcom,smem = <617>, <616>;
+   interrupts-extended = < IPCC_CLIENT_NSP1
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_NSP1 IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <12>;
+
+   smp2p_cdsp1_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_cdsp1_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-gpdsp0 {
+   compatible = "qcom,smp2p";
+   qcom,smem = <617>, <616>;
+   interrupts-extended = < IPCC_CLIENT_GPDSP0
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_GPDSP0 IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <17>;
+
+   smp2p_gpdsp0_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_gpdsp0_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-gpdsp1 {
+   compatible = "qcom,smp2p";
+   qcom,smem = <617>, <616>;
+   interrupts-extended = < IPCC_CLIENT_GPDSP1
+IPCC_MPROC_SIGNAL_SMP2P
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = < IPCC_CLIENT_GPDSP1 IPCC_MPROC_SIGNAL_SMP2P>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <18>;
+
+   smp2p_gpdsp1_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   smp2p_gpdsp1_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";

Re: [PATCH v2 3/5] remoteproc: qcom_q6v5_pas: Add support for SA8775p ADSP, CDSP and GPDSP

2024-05-27 Thread Tengfei Fan





On 5/27/2024 4:43 PM, Bartosz Golaszewski wrote:

From: Tengfei Fan 

Add support for PIL loading on ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 on
SA8775p SoCs.

Signed-off-by: Tengfei Fan 
Co-developed-by: Bartosz Golaszewski 
Signed-off-by: Bartosz Golaszewski 
---
  drivers/remoteproc/qcom_q6v5_pas.c | 92 ++
  1 file changed, 92 insertions(+)

diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
b/drivers/remoteproc/qcom_q6v5_pas.c
index 54d8005d40a3..16053aa99298 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -820,6 +820,23 @@ static const struct adsp_data adsp_resource_init = {
.ssctl_id = 0x14,
  };
  
+static const struct adsp_data sa8775p_adsp_resource = {

+   .crash_reason_smem = 423,
+   .firmware_name = "adsp.mdt",
+   .pas_id = 1,
+   .minidump_id = 5,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "lcx",
+   "lmx",
+   NULL
+   },
+   .load_state = "adsp",
+   .ssr_name = "lpass",
+   .sysmon_name = "adsp",
+   .ssctl_id = 0x14,
+};
+
  static const struct adsp_data sdm845_adsp_resource_init = {
.crash_reason_smem = 423,
.firmware_name = "adsp.mdt",
@@ -933,6 +950,42 @@ static const struct adsp_data cdsp_resource_init = {
.ssctl_id = 0x17,
  };
  
+static const struct adsp_data sa8775p_cdsp0_resource = {

+   .crash_reason_smem = 601,
+   .firmware_name = "cdsp0.mdt",
+   .pas_id = 18,
+   .minidump_id = 7,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "cx",
+   "mxc",
+   "nsp0",


s/nsp0/nsp/


+   NULL
+   },
+   .load_state = "cdsp",
+   .ssr_name = "cdsp",
+   .sysmon_name = "cdsp",
+   .ssctl_id = 0x17,
+};
+
+static const struct adsp_data sa8775p_cdsp1_resource = {
+   .crash_reason_smem = 633,
+   .firmware_name = "cdsp1.mdt",
+   .pas_id = 30,
+   .minidump_id = 20,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "cx",
+   "mxc",
+   "nsp1",


s/nsp1/nsp/


+   NULL
+   },
+   .load_state = "nsp",
+   .ssr_name = "cdsp1",
+   .sysmon_name = "cdsp1",
+   .ssctl_id = 0x20,
+};
+
  static const struct adsp_data sdm845_cdsp_resource_init = {
.crash_reason_smem = 601,
.firmware_name = "cdsp.mdt",
@@ -1074,6 +1127,40 @@ static const struct adsp_data sm8350_cdsp_resource = {
.ssctl_id = 0x17,
  };
  
+static const struct adsp_data sa8775p_gpdsp0_resource = {

+   .crash_reason_smem = 640,
+   .firmware_name = "gpdsp0.mdt",
+   .pas_id = 39,
+   .minidump_id = 21,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "cx",
+   "mxc",
+   NULL
+   },
+   .load_state = "gpdsp0",
+   .ssr_name = "gpdsp0",
+   .sysmon_name = "gpdsp0",
+   .ssctl_id = 0x21,
+};
+
+static const struct adsp_data sa8775p_gpdsp1_resource = {
+   .crash_reason_smem = 641,
+   .firmware_name = "gpdsp1.mdt",
+   .pas_id = 40,
+   .minidump_id = 22,
+   .auto_boot = true,
+   .proxy_pd_names = (char*[]){
+   "cx",
+   "mxc",
+   NULL
+   },
+   .load_state = "gpdsp1",
+   .ssr_name = "gpdsp1",
+   .sysmon_name = "gpdsp1",
+   .ssctl_id = 0x22,
+};
+
  static const struct adsp_data mpss_resource_init = {
.crash_reason_smem = 421,
.firmware_name = "modem.mdt",
@@ -1315,6 +1402,11 @@ static const struct of_device_id adsp_of_match[] = {
{ .compatible = "qcom,qcs404-adsp-pas", .data = _resource_init },
{ .compatible = "qcom,qcs404-cdsp-pas", .data = _resource_init },
{ .compatible = "qcom,qcs404-wcss-pas", .data = _resource_init },
+   { .compatible = "qcom,sa8775p-adsp-pas", .data = 
_adsp_resource},
+   { .compatible = "qcom,sa8775p-cdsp0-pas", .data = 
_cdsp0_resource},
+   { .compatible = "qcom,sa8775p-cdsp1-pas", .data = 
_cdsp1_resource},
+   { .compatible = "qcom,sa8775p-gpdsp0-pas", .data = 
_gpdsp0_resource},
+   { .compatible = "qcom,sa8775p-gpdsp1-pas", .data = 
_gpdsp1_resource},
{ .compatible = "qcom,sc7180-adsp-pas", .data = _adsp_resource},
{ .compatible = "qcom,sc7180-mpss-pas", .data = _resource_init},
{ .compatible = "qcom,sc7280-adsp-pas", .data = _adsp_resource},



--
Thx and BRs,
Tengfei Fan

Re: [PATCH v2 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP

2024-05-27 Thread Tengfei Fan





On 5/27/2024 4:56 PM, Krzysztof Kozlowski wrote:

On 27/05/2024 10:43, Bartosz Golaszewski wrote:

From: Bartosz Golaszewski 

Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and
GPDSP1 on the SA8775p SoC.

Co-developed-by: Tengfei Fan 


Missing SoB.


Signed-off-by: Bartosz Golaszewski 


...



+
+allOf:
+  - $ref: /schemas/remoteproc/qcom,pas-common.yaml#
+
+  - if:
+  properties:
+compatible:
+  enum:
+- qcom,sa8775p-adsp-pas
+then:
+  properties:
+power-domains:
+  items:
+- description: LCX power domain
+- description: LMX power domain
+power-domain-names:
+  items:
+- const: lcx
+- const: lmx
+
+  - if:
+  properties:
+compatible:
+  enum:
+- qcom,sa8775p-cdsp-pas


cdsp0


+then:
+  properties:
+power-domains:
+  items:
+- description: CX power domain
+- description: MXC power domain
+- description: NSP0 power domain
+power-domain-names:
+  items:
+- const: cx
+- const: mxc
+- const: nsp0


Shouldn't this be just nsp, so both cdsp0 and cdsp1 entries can be
unified? That's the power domain from the device point of view, so the
device expects to be in some NSP domain, not explicitly NSPn.


Both cdsp0 and cdsp1 entries can uniformly use nsp.




Best regards,
Krzysztof



--
Thx and BRs,
Tengfei Fan

Re: [PATCH v2] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-27 Thread Tatsuya S

On Mon, May 27, 2024 at 07:49:14PM GMT, Steven Rostedt wrote:
> On Mon, 27 May 2024 18:44:56 +0900
> Tatsuya S  wrote:
> 
> > On setting set_ftrace_pid, a extra entry generated by ftrace_pid_func()
> > is shown on stack trace(CONFIG_UNWINDER_FRAME_POINTER=y).
> > 
> > [004] .68.459382: 
> >  => 0xa00090af
> >  => ksys_read
> >  => __x64_sys_read
> >  => x64_sys_call
> >  => do_syscall_64
> >  => entry_SYSCALL_64_after_hwframe  
> > 
> > To resolve this issue, increment skip count
> > in function_stack_trace_call() if pids are set.
> 
> Just a note, this isn't a "fix" but simply an improvement in output.
> I'm happy to take this (after testing and more reviewing), but it will
> be for the next merge window, and with a different subject line.
> 
>   "ftrace: Hide one more entry in stack trace when ftrace_pid is enabled"
> 
> Or something like that.
> 
> Thanks,
> 
> -- Steve
I will send patch with new subject line.

Thank you.

Tatsuya

Re: [PATCH v2 2/2] selftests/user_events: Add non-spacing separator check

2024-05-27 Thread Google

On Tue, 23 Apr 2024 16:23:38 +
Beau Belgrave  wrote:

> The ABI documentation indicates that field separators do not need a
> space between them, only a ';'. When no spacing is used, the register
> must work. Any subsequent register, with or without spaces, must match
> and not return -EADDRINUSE.
> 
> Add a non-spacing separator case to our self-test register case to ensure
> it works going forward.
> 

Looks good to me.

Acked-by: Masami Hiramatsu (Google) 

Thanks!

> Signed-off-by: Beau Belgrave 
> ---
>  tools/testing/selftests/user_events/ftrace_test.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/tools/testing/selftests/user_events/ftrace_test.c 
> b/tools/testing/selftests/user_events/ftrace_test.c
> index dcd7509fe2e0..0bb46793dcd4 100644
> --- a/tools/testing/selftests/user_events/ftrace_test.c
> +++ b/tools/testing/selftests/user_events/ftrace_test.c
> @@ -261,6 +261,12 @@ TEST_F(user, register_events) {
>   ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, ));
>   ASSERT_EQ(0, reg.write_index);
>  
> + /* Register without separator spacing should still match */
> + reg.enable_bit = 29;
> + reg.name_args = (__u64)"__test_event u32 field1;u32 field2";
> + ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, ));
> + ASSERT_EQ(0, reg.write_index);
> +
>   /* Multiple registers to same name but different args should fail */
>   reg.enable_bit = 29;
>   reg.name_args = (__u64)"__test_event u32 field1;";
> @@ -288,6 +294,8 @@ TEST_F(user, register_events) {
>   ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, ));
>   unreg.disable_bit = 30;
>   ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, ));
> + unreg.disable_bit = 29;
> + ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, ));
>  
>   /* Delete should have been auto-done after close and unregister */
>   close(self->data_fd);
> -- 
> 2.34.1
> 


-- 
Masami Hiramatsu (Google)

Re: [PATCH v2] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-27 Thread Steven Rostedt

On Mon, 27 May 2024 18:44:56 +0900
Tatsuya S  wrote:

> On setting set_ftrace_pid, a extra entry generated by ftrace_pid_func()
> is shown on stack trace(CONFIG_UNWINDER_FRAME_POINTER=y).
> 
> [004] .68.459382: 
>  => 0xa00090af
>  => ksys_read
>  => __x64_sys_read
>  => x64_sys_call
>  => do_syscall_64
>  => entry_SYSCALL_64_after_hwframe  
> 
> To resolve this issue, increment skip count
> in function_stack_trace_call() if pids are set.

Just a note, this isn't a "fix" but simply an improvement in output.
I'm happy to take this (after testing and more reviewing), but it will
be for the next merge window, and with a different subject line.

  "ftrace: Hide one more entry in stack trace when ftrace_pid is enabled"

Or something like that.

Thanks,

-- Steve

Re: [PATCH 2/2] ring-buffer: Fix a race between readers and resize checks

2024-05-27 Thread Steven Rostedt

On Mon, 27 May 2024 11:36:55 +0200
Petr Pavlu  wrote:

> >>  static void rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
> >>  {
> >> @@ -2200,8 +2205,13 @@ int ring_buffer_resize(struct trace_buffer *buffer, 
> >> unsigned long size,
> >> */
> >>synchronize_rcu();
> >>for_each_buffer_cpu(buffer, cpu) {
> >> +  unsigned long flags;
> >> +
> >>cpu_buffer = buffer->buffers[cpu];
> >> +  raw_spin_lock_irqsave(_buffer->reader_lock, flags);
> >>rb_check_pages(cpu_buffer);
> >> +  raw_spin_unlock_irqrestore(_buffer->reader_lock,
> >> + flags);  
> > 
> > Putting my RT hat on, I really don't like the above fix. The
> > rb_check_pages() iterates all subbuffers which makes the time interrupts
> > are disabled non-deterministic.  
> 
> I see, this applies also to the same rb_check_pages() validation invoked
> from ring_buffer_read_finish().
> 
> > 
> > Instead, I would rather have something where we disable readers while we do
> > the check, and re-enable them.
> > 
> > raw_spin_lock_irqsave(_buffer->reader_lock, flags);
> > cpu_buffer->read_disabled++;
> > raw_spin_unlock_irqrestore(_buffer->reader_lock, 
> > flags);
> > 
> > // Also, don't put flags on a new line. We are allow to go 100 characters 
> > now.  
> 
> Noted.
> 
> > 
> > 
> > rb_check_pages(cpu_buffer);
> > raw_spin_lock_irqsave(_buffer->reader_lock, flags);
> > cpu_buffer->read_disabled--;
> > raw_spin_unlock_irqrestore(_buffer->reader_lock, 
> > flags);
> > 
> > Or something like that. Yes, that also requires creating a new
> > "read_disabled" field in the ring_buffer_per_cpu code.  
> 
> I think this would work but I'm personally not immediately sold on this
> approach. If I understand the idea correctly, readers should then check
> whether cpu_buffer->read_disabled is set and bail out with some error if
> that is the case. The rb_check_pages() function is only a self-check
> code and as such, I feel it doesn't justify disrupting readers with
> a new error condition and adding more complex locking.

Honestly, this code was never made for more than one reader per
cpu_buffer. I'm perfectly fine if all check_pages() causes other
readers to the same per_cpu buffer to get -EBUSY.

Do you really see this being a problem? What use case is there for
hitting the check_pages() and reading the same cpu buffer at the same
time?

-- Steve

Re: [PATCH 0/3] tracing: Fix some selftest issues

2024-05-27 Thread Steven Rostedt

On Sun, 26 May 2024 19:10:57 +0900
"Masami Hiramatsu (Google)"  wrote:

> Hi,
> 
> Here is a series of some fixes/improvements for the test modules and boot
> time selftest of kprobe events. I found a WARNING message with some boot 
> time selftest configuration, which came from the combination of embedded
> kprobe generate API tests module and ftrace boot-time selftest. So the main
> problem is that the test module should not be built-in. But I also think
> this WARNING message is useless (because there are warning messages already)
> and the cleanup code is redundant. This series fixes those issues.

Note, when I enable trace tests as builtin instead of modules, I just
disable the bootup self tests when it detects this. This helps with
doing tests via config options than having to add user space code that
loads modules.

Could you do something similar?

-- Steve


> 
> Thank you,
> 
> ---
> 
> Masami Hiramatsu (Google) (3):
>   tracing: Build event generation tests only as modules
>   tracing/kprobe: Remove unneeded WARN_ON_ONCE() in selftests
>   tracing/kprobe: Remove cleanup code unrelated to selftest
>

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Steven Rostedt

On Fri, 24 May 2024 12:50:08 +0200
"Linux regression tracking (Thorsten Leemhuis)"  
wrote:

> > - Affected Versions: Before kernel version 6.8.10, the bug caused a
> > quick display of a kernel trace dump before the shutdown/reboot
> > completed. Starting from version 6.8.10 and continuing into version
> > 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> > preventing the shutdown or reboot from completing and leaving the
> > machine stuck.

You state "Before kernel version 6.8.10, the bug caused ...". Does that
mean that a bug was happening before v6.8.10? But did not cause a panic?

I just noticed your second screen shot from your report, and it has:

 "cache_from_obj: Wrong slab cache, tracefs_inode_cache but object is from 
inode_cache"

So somehow an tracefs_inode was allocated from the inode_cache and is
now being freed by the tracefs_inode logic? Did this happen before
6.8.10? If so, this code could just be triggering an issue from an
unrelated bug.

Thanks,

-- Steve

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Steven Rostedt

On Mon, 27 May 2024 20:14:42 +0200
Greg KH  wrote:

> On Mon, May 27, 2024 at 07:40:21PM +0300, Ilkka Naulapää wrote:
> > Hi Steven,
> > 
> > I took some time and bisected the 6.8.9 - 6.8.10 and git gave the
> > panic inducing commit:
> > 
> > 414fb08628143 (tracefs: Reset permissions on remount if permissions are 
> > options)
> > 
> > I reverted that commit to 6.9.2 and now it only serves the trace but
> > the panic is gone. But I can live with it.  
> 
> Steven, should we revert that?
> 
> Or is there some other change that we should take to resolve this?
> 

Before we revert it (as it may be a bug in mainline), Ilkka, can you
test v6.10-rc1?  If it exists there, it will let me know whether or not
I missed something.

Thanks,

-- Steve

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Greg KH

On Mon, May 27, 2024 at 07:40:21PM +0300, Ilkka Naulapää wrote:
> Hi Steven,
> 
> I took some time and bisected the 6.8.9 - 6.8.10 and git gave the
> panic inducing commit:
> 
> 414fb08628143 (tracefs: Reset permissions on remount if permissions are 
> options)
> 
> I reverted that commit to 6.9.2 and now it only serves the trace but
> the panic is gone. But I can live with it.

Steven, should we revert that?

Or is there some other change that we should take to resolve this?

thanks,

greg k-h

Re: [PATCH] ipvs: Avoid unnecessary calls to skb_is_gso_sctp

2024-05-27 Thread Julian Anastasov


Hello,

On Thu, 23 May 2024, Ismael Luceno wrote:

> In the context of the SCTP SNAT/DNAT handler, these calls can only
> return true.
> 
> Ref: e10d3ba4d434 ("ipvs: Fix checksumming on GSO of SCTP packets")

checkpatch.pl prefers to see the "commit" word:

Ref: commit e10d3ba4d434 ("ipvs: Fix checksumming on GSO of SCTP packets")

> Signed-off-by: Ismael Luceno 

Looks good to me for nf-next, thanks!

Acked-by: Julian Anastasov 

> CC: Pablo Neira Ayuso 
> CC: Michal Kubeček 
> CC: Simon Horman 
> CC: Julian Anastasov 
> CC: lvs-de...@vger.kernel.org
> CC: netfilter-de...@vger.kernel.org
> CC: net...@vger.kernel.org
> CC: coret...@netfilter.org
> ---
>  net/netfilter/ipvs/ip_vs_proto_sctp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c 
> b/net/netfilter/ipvs/ip_vs_proto_sctp.c
> index 1e689c714127..83e452916403 100644
> --- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
> +++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
> @@ -126,7 +126,7 @@ sctp_snat_handler(struct sk_buff *skb, struct 
> ip_vs_protocol *pp,
>   if (sctph->source != cp->vport || payload_csum ||
>   skb->ip_summed == CHECKSUM_PARTIAL) {
>   sctph->source = cp->vport;
> - if (!skb_is_gso(skb) || !skb_is_gso_sctp(skb))
> + if (!skb_is_gso(skb))
>   sctp_nat_csum(skb, sctph, sctphoff);
>   } else {
>   skb->ip_summed = CHECKSUM_UNNECESSARY;
> @@ -175,7 +175,7 @@ sctp_dnat_handler(struct sk_buff *skb, struct 
> ip_vs_protocol *pp,
>   (skb->ip_summed == CHECKSUM_PARTIAL &&
>!(skb_dst(skb)->dev->features & NETIF_F_SCTP_CRC))) {
>   sctph->dest = cp->dport;
> - if (!skb_is_gso(skb) || !skb_is_gso_sctp(skb))
> + if (!skb_is_gso(skb))
>   sctp_nat_csum(skb, sctph, sctphoff);
>   } else if (skb->ip_summed != CHECKSUM_PARTIAL) {
>   skb->ip_summed = CHECKSUM_UNNECESSARY;
> -- 
> 2.44.0

Regards

--
Julian Anastasov

Re: [PATCH v2] sched/rt: Clean up usage of rt_task()

2024-05-27 Thread Qais Yousef

On 05/23/24 11:45, Steven Rostedt wrote:
> On Wed, 15 May 2024 23:05:36 +0100
> Qais Yousef  wrote:
> > diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
> > index df3aca89d4f5..5cb88b748ad6 100644
> > --- a/include/linux/sched/deadline.h
> > +++ b/include/linux/sched/deadline.h
> > @@ -10,8 +10,6 @@
> >  
> >  #include 
> >  
> > -#define MAX_DL_PRIO0
> > -
> >  static inline int dl_prio(int prio)
> >  {
> > if (unlikely(prio < MAX_DL_PRIO))
> > @@ -19,6 +17,10 @@ static inline int dl_prio(int prio)
> > return 0;
> >  }
> >  
> > +/*
> > + * Returns true if a task has a priority that belongs to DL class. 
> > PI-boosted
> > + * tasks will return true. Use dl_policy() to ignore PI-boosted tasks.
> > + */
> >  static inline int dl_task(struct task_struct *p)
> >  {
> > return dl_prio(p->prio);
> > diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
> > index ab83d85e1183..6ab43b4f72f9 100644
> > --- a/include/linux/sched/prio.h
> > +++ b/include/linux/sched/prio.h
> > @@ -14,6 +14,7 @@
> >   */
> >  
> >  #define MAX_RT_PRIO100
> > +#define MAX_DL_PRIO0
> >  
> >  #define MAX_PRIO   (MAX_RT_PRIO + NICE_WIDTH)
> >  #define DEFAULT_PRIO   (MAX_RT_PRIO + NICE_WIDTH / 2)
> > diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
> > index b2b9e6eb9683..a055dd68a77c 100644
> > --- a/include/linux/sched/rt.h
> > +++ b/include/linux/sched/rt.h
> > @@ -7,18 +7,43 @@
> >  struct task_struct;
> >  
> >  static inline int rt_prio(int prio)
> > +{
> > +   if (unlikely(prio < MAX_RT_PRIO && prio >= MAX_DL_PRIO))
> > +   return 1;
> > +   return 0;
> > +}
> > +
> > +static inline int realtime_prio(int prio)
> >  {
> > if (unlikely(prio < MAX_RT_PRIO))
> > return 1;
> > return 0;
> >  }
> 
> I'm thinking we should change the above to bool (separate patch), as
> returning an int may give one the impression that it returns the actual
> priority number. Having it return bool will clear that up.
> 
> In fact, if we are touching theses functions, might as well change all of
> them to bool when returning true/false. Just to make it easier to
> understand what they are doing.

I can add a patch on top, sure.

> 
> >  
> > +/*
> > + * Returns true if a task has a priority that belongs to RT class. 
> > PI-boosted
> > + * tasks will return true. Use rt_policy() to ignore PI-boosted tasks.
> > + */
> >  static inline int rt_task(struct task_struct *p)
> >  {
> > return rt_prio(p->prio);
> >  }
> >  
> > -static inline bool task_is_realtime(struct task_struct *tsk)
> > +/*
> > + * Returns true if a task has a priority that belongs to RT or DL classes.
> > + * PI-boosted tasks will return true. Use realtime_task_policy() to ignore
> > + * PI-boosted tasks.
> > + */
> > +static inline int realtime_task(struct task_struct *p)
> > +{
> > +   return realtime_prio(p->prio);
> > +}
> > +
> > +/*
> > + * Returns true if a task has a policy that belongs to RT or DL classes.
> > + * PI-boosted tasks will return false.
> > + */
> > +static inline bool realtime_task_policy(struct task_struct *tsk)
> >  {
> > int policy = tsk->policy;
> >  
> 
> 
> 
> > diff --git a/kernel/trace/trace_sched_wakeup.c 
> > b/kernel/trace/trace_sched_wakeup.c
> > index 0469a04a355f..19d737742e29 100644
> > --- a/kernel/trace/trace_sched_wakeup.c
> > +++ b/kernel/trace/trace_sched_wakeup.c
> > @@ -545,7 +545,7 @@ probe_wakeup(void *ignore, struct task_struct *p)
> >  *  - wakeup_dl handles tasks belonging to sched_dl class only.
> >  */
> > if (tracing_dl || (wakeup_dl && !dl_task(p)) ||
> > -   (wakeup_rt && !dl_task(p) && !rt_task(p)) ||
> > +   (wakeup_rt && !realtime_task(p)) ||
> > (!dl_task(p) && (p->prio >= wakeup_prio || p->prio >= 
> > current->prio)))
> > return;
> >  
> 
> Reviewed-by: Steven Rostedt (Google) 

Thanks!

--
Qais Yousef

> 
>

Re: [PATCH v2] sched/rt: Clean up usage of rt_task()

2024-05-27 Thread Qais Yousef

On 05/21/24 13:00, Sebastian Andrzej Siewior wrote:
> On 2024-05-15 23:05:36 [+0100], Qais Yousef wrote:
> > rt_task() checks if a task has RT priority. But depends on your
> > dictionary, this could mean it belongs to RT class, or is a 'realtime'
> > task, which includes RT and DL classes.
> > 
> > Since this has caused some confusion already on discussion [1], it
> > seemed a clean up is due.
> > 
> > I define the usage of rt_task() to be tasks that belong to RT class.
> > Make sure that it returns true only for RT class and audit the users and
> > replace the ones required the old behavior with the new realtime_task()
> > which returns true for RT and DL classes. Introduce similar
> > realtime_prio() to create similar distinction to rt_prio() and update
> > the users that required the old behavior to use the new function.
> > 
> > Move MAX_DL_PRIO to prio.h so it can be used in the new definitions.
> > 
> > Document the functions to make it more obvious what is the difference
> > between them. PI-boosted tasks is a factor that must be taken into
> > account when choosing which function to use.
> > 
> > Rename task_is_realtime() to realtime_task_policy() as the old name is
> > confusing against the new realtime_task().
> 
> I *think* everyone using rt_task() means to include DL tasks. And
> everyone means !SCHED-people since they know when the difference matters.

yes, this makes sense

> 
> > No functional changes were intended.
> > 
> > [1] 
> > https://lore.kernel.org/lkml/20240506100509.gl40...@noisy.programming.kicks-ass.net/
> > 
> > Reviewed-by: Phil Auld 
> > Signed-off-by: Qais Yousef 
> > ---
> > 
> > Changes since v1:
> > 
> > * Use realtime_task_policy() instead task_has_realtime_policy() (Peter)
> > * Improve commit message readability about replace some rt_task()
> >   users.
> > 
> > v1 discussion: 
> > https://lore.kernel.org/lkml/20240514234112.792989-1-qyou...@layalina.io/
> > 
> >  fs/select.c   |  2 +-
> 
> fs/bcachefs/six.c
> six_owner_running() has rt_task(). But imho should have realtime_task()
> to consider DL. But I think it is way worse that it has its own locking
> rather than using what everyone else but then again it wouldn't be the
> new hot thing…

I think I missed this one. Converted now. Thanks!

> 
> >  include/linux/ioprio.h|  2 +-
> >  include/linux/sched/deadline.h|  6 --
> >  include/linux/sched/prio.h|  1 +
> >  include/linux/sched/rt.h  | 27 ++-
> >  kernel/locking/rtmutex.c  |  4 ++--
> >  kernel/locking/rwsem.c|  4 ++--
> >  kernel/locking/ww_mutex.h |  2 +-
> >  kernel/sched/core.c   |  6 +++---
> >  kernel/time/hrtimer.c |  6 +++---
> >  kernel/trace/trace_sched_wakeup.c |  2 +-
> >  mm/page-writeback.c   |  4 ++--
> >  mm/page_alloc.c   |  2 +-
> >  13 files changed, 48 insertions(+), 20 deletions(-)
> …
> > diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> > index 70625dff62ce..08b95e0a41ab 100644
> > --- a/kernel/time/hrtimer.c
> > +++ b/kernel/time/hrtimer.c
> > @@ -1996,7 +1996,7 @@ static void __hrtimer_init_sleeper(struct 
> > hrtimer_sleeper *sl,
> >  * expiry.
> >  */
> > if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > -   if (task_is_realtime(current) && !(mode & HRTIMER_MODE_SOFT))
> > +   if (realtime_task_policy(current) && !(mode & 
> > HRTIMER_MODE_SOFT))
> > mode |= HRTIMER_MODE_HARD;
> > }
> >  
> > @@ -2096,7 +2096,7 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum 
> > hrtimer_mode mode,
> > u64 slack;
> >  
> > slack = current->timer_slack_ns;
> > -   if (rt_task(current))
> > +   if (realtime_task(current))
> > slack = 0;
> >  
> > hrtimer_init_sleeper_on_stack(, clockid, mode);
> > @@ -2301,7 +2301,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires, u64 
> > delta,
> >  * Override any slack passed by the user if under
> >  * rt contraints.
> >  */
> > -   if (rt_task(current))
> > +   if (realtime_task(current))
> > delta = 0;
> 
> I know this is just converting what is already here but…
> __hrtimer_init_sleeper() looks at the policy to figure out if the task
> is realtime do decide if should expire in HARD-IRQ context. This is
> correct, a boosted task should not sleep.
> 
> hrtimer_nanosleep() + schedule_hrtimeout_range_clock() is looking at
> priority to decide if slack should be removed. This should also look at
> policy since a boosted task shouldn't sleep.

I have to admit I never dug deep enough into this code. Happy to convert these
users. I'll add that as a separate patch as this is somewhat changing behavior
which this patch intends to do a clean up only.

> 
> In order to be PI-boosted you need to acquire a lock and the only lock
> you can sleep while acquired without generating a warning is a mutex_t
> (or equivalent sleeping lock) on PREEMPT_RT.

Re: [PATCH v2 2/3] arm64: dts: qcom: pm7250b: Add a TCPM description

2024-05-27 Thread Bjorn Andersson

On Fri, Mar 29, 2024 at 01:26:20PM GMT, Luca Weiss wrote:
> Type-C port management functionality lives inside of the PMIC block on
> pm7250b.
> 
> The Type-C port management logic controls orientation detection,
> vbus/vconn sense and to send/receive Type-C Power Domain messages.
> 

pm7250b is found in devices where USB Type-C port management is
performed in firmware, presumably using this hardware block.

As such, it seems reasonable to leave this node disabled and only enable
it on the targets that doesn't do this in firmware.

Regards,
Bjorn

> Reviewed-by: Bryan O'Donoghue 
> Reviewed-by: Konrad Dybcio 
> Signed-off-by: Luca Weiss 
> ---
>  arch/arm64/boot/dts/qcom/pm7250b.dtsi | 39 
> +++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/pm7250b.dtsi 
> b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
> index 4faed25a787f..0205c2669093 100644
> --- a/arch/arm64/boot/dts/qcom/pm7250b.dtsi
> +++ b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
> @@ -51,6 +51,45 @@ pm7250b_vbus: usb-vbus-regulator@1100 {
>   status = "disabled";
>   };
>  
> + pm7250b_typec: typec@1500 {
> + compatible = "qcom,pm7250b-typec", "qcom,pm8150b-typec";
> + reg = <0x1500>,
> +   <0x1700>;
> + interrupts =  IRQ_TYPE_EDGE_RISING>,
> +  ,
> +   IRQ_TYPE_EDGE_RISING>,
> +  ,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +  ,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>,
> +   IRQ_TYPE_EDGE_RISING>;
> + interrupt-names = "or-rid-detect-change",
> +   "vpd-detect",
> +   "cc-state-change",
> +   "vconn-oc",
> +   "vbus-change",
> +   "attach-detach",
> +   "legacy-cable-detect",
> +   "try-snk-src-detect",
> +   "sig-tx",
> +   "sig-rx",
> +   "msg-tx",
> +   "msg-rx",
> +   "msg-tx-failed",
> +   "msg-tx-discarded",
> +   "msg-rx-discarded",
> +   "fr-swap";
> + vdd-vbus-supply = <_vbus>;
> + };
> +
>   pm7250b_temp: temp-alarm@2400 {
>   compatible = "qcom,spmi-temp-alarm";
>   reg = <0x2400>;
> 
> -- 
> 2.44.0
>

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-27 Thread Ilkka Naulapää

Hi Steven,

I took some time and bisected the 6.8.9 - 6.8.10 and git gave the
panic inducing commit:

414fb08628143 (tracefs: Reset permissions on remount if permissions are options)

I reverted that commit to 6.9.2 and now it only serves the trace but
the panic is gone. But I can live with it.

--Ilkka

On Sun, May 26, 2024 at 8:42 PM Ilkka Naulapää  wrote:
>
> hi,
>
> I took 6.9.2 and applied that 0bcfd9aa4dafa to it. Now the kernel is
> serving me both problems; the trace and the panic as the pic shows.
>
> > To understand this, did you do anything with tracing? Before shutting down,
> > is there anything in /sys/kernel/tracing/instances directory?
> > Were any of the files/directories permissions in /sys/kernel/tracing 
> > changed?
>
> And to answer your question, I did not do any tracing or so and the
> /sys/kernel/tracing is empty.
> Just plain boot-up, no matter if in full desktop or in bare rescue
> mode, ends up the same way.
>
> --Ilkka
>
> On Fri, May 24, 2024 at 8:19 PM Steven Rostedt  wrote:
> >
> > On Fri, 24 May 2024 12:50:08 +0200
> > "Linux regression tracking (Thorsten Leemhuis)"  
> > wrote:
> >
> > > > - Affected Versions: Before kernel version 6.8.10, the bug caused a
> > > > quick display of a kernel trace dump before the shutdown/reboot
> > > > completed. Starting from version 6.8.10 and continuing into version
> > > > 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> > > > preventing the shutdown or reboot from completing and leaving the
> > > > machine stuck.
> >
> > Ah, I bet it was this commit: baa23a8d4360d ("tracefs: Reset permissions on
> > remount if permissions are options"), which added a "iput" callback to the
> > dentry without calling iput, leaving stale inodes around.
> >
> > This is fixed with:
> >
> >   0bcfd9aa4dafa ("tracefs: Clear EVENT_INODE flag in tracefs_drop_inode()")
> >
> > Try adding just that patch. It will at least make it go back to what was
> > happening before 6.8.10 (I hope!).
> >
> > -- Steve

Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Russell King (Oracle)

On Mon, May 27, 2024 at 02:28:59PM +0200, Uwe Kleine-König wrote:
> On Mon, May 27, 2024 at 08:56:16AM +0100, Russell King (Oracle) wrote:
> > On Mon, May 27, 2024 at 09:43:41AM +0200, Thorsten Scherer wrote:
> > > Hello,
> > > 
> > > in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe pointed 
> > > me to
> > > this thread.  With the proposed code change applied the procedure
> > > 
> > > # set to some known good (randomly guessed) filter function and 
> > > enable function_graph
> > > echo mtdblock_open > set_ftrace_filter
> > > echo function_graph > current_tracer
> > > 
> > > # walk available filter funcs
> > > cat available_filter_functions | while read f; do echo $f | tee -a 
> > > set_ftrace_filter; sleep 1; done
> > > 
> > > produces the following output
> > > 
> > > [  159.832173] Insufficient stack space to handle exception!
> > > [  159.832241] Task stack: [0xc8e44000..0xc8e46000]
> > > [  159.842701] IRQ stack:  [0xc880..0xc8802000]
> > > [  159.847712] Overflow stack: [0xc1934000..0xc1935000]
> > > [  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
> > > [  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
> > > industrialio_triggered_buffer kfifo_buf capture_events_irq capture_events 
> > > iio_trig_hrtimer industrialio_sw_trigger industrialio_configfs dm_mod
> > > [  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
> > > [  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
> > > [  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
> > > [  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
> > > [  159.898376] pc : []lr : []psr: 6093
> > > [  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
> > > [  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
> > > [  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : c8e440ac
> > > [  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : c8e44038
> > > [  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  
> > > Segment none
> > > [  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
> > > [  159.941436] Register r0 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  159.952253] Register r1 information: non-slab/vmalloc memory
> > > [  159.957988] Register r2 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  159.968791] Register r3 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  159.979592] Register r4 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  159.990391] Register r5 information: 2-page vmalloc region starting at 
> > > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  160.001187] Register r6 information: non-paged memory
> > > [  160.006303] Register r7 information: non-paged memory
> > > [  160.011415] Register r8 information: non-slab/vmalloc memory
> > > [  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 
> > > pointer offset 0 size 32
> > > [  160.025718] Register r10 information: non-slab/vmalloc memory
> > > [  160.031530] Register r11 information: 2-page vmalloc region starting 
> > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  160.042422] Register r12 information: 2-page vmalloc region starting 
> > > at 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > > [  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
> > > [  160.058955] Stack: (0xc8e44018 to 0xc8e46000)
> > 
> > No backtrace? No Code: line? I'm guessing there was an attempt to ftrace
> > a function involving the ftrace tracing infrastructure, which is why 8KB
> > of stack has been gobbled up. It could be
> > copy_from_kernel_nofault_allowed() but it would be useful to have at
> > least some extract of the backtrace showing the recursive cycle to
> > confirm, otherwise there is nothing in your report to confirm. As I'm
> > not a ftrace user myself, this isn't something I'd test for, so having
> > a full report would be useful.
> 
> Is not having a backtrace related to ftrace_return_address() returning
> NULL, as Arnd pointed out in
> https://lore.kernel.org/linux-arm-kernel/36cd10de-c51c-40ff-90e8-714954060...@app.fastmail.com/
> ?

Unlikely - the stack and code lines are also missing. I think the
submitter truncated the oops which is highly likely given that it
would've dumped all 8kB of the stack in hex, and the trace and
code lines would be after that.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2] tracing/probes: fix error check in parse_btf_field()

2024-05-27 Thread Google

On Mon, 27 May 2024 11:43:52 +0200
Carlos López  wrote:

> btf_find_struct_member() might return NULL or an error via the
> ERR_PTR() macro. However, its caller in parse_btf_field() only checks
> for the NULL condition. Fix this by using IS_ERR() and returning the
> error up the stack.
> 

Thanks for updating! This version looks good to me.
Let me pick this to probes/fixes.

Thank you,


> Fixes: c440adfbe3025 ("tracing/probes: Support BTF based data structure field 
> access")
> Signed-off-by: Carlos López 
> ---
> v2: added call to trace_probe_log_err()
> 
>  kernel/trace/trace_probe.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 5e263c141574..39877c80d6cb 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -554,6 +554,10 @@ static int parse_btf_field(char *fieldname, const struct 
> btf_type *type,
>   anon_offs = 0;
>   field = btf_find_struct_member(ctx->btf, type, 
> fieldname,
>  _offs);
> + if (IS_ERR(field)) {
> + trace_probe_log_err(ctx->offset, BAD_BTF_TID);
> + return PTR_ERR(field);
> + }
>   if (!field) {
>   trace_probe_log_err(ctx->offset, NO_BTF_FIELD);
>   return -ENOENT;
> -- 
> 2.35.3
> 


-- 
Masami Hiramatsu (Google)

Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Uwe Kleine-König

On Mon, May 27, 2024 at 08:56:16AM +0100, Russell King (Oracle) wrote:
> On Mon, May 27, 2024 at 09:43:41AM +0200, Thorsten Scherer wrote:
> > Hello,
> > 
> > in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe pointed me 
> > to
> > this thread.  With the proposed code change applied the procedure
> > 
> > # set to some known good (randomly guessed) filter function and enable 
> > function_graph
> > echo mtdblock_open > set_ftrace_filter
> > echo function_graph > current_tracer
> > 
> > # walk available filter funcs
> > cat available_filter_functions | while read f; do echo $f | tee -a 
> > set_ftrace_filter; sleep 1; done
> > 
> > produces the following output
> > 
> > [  159.832173] Insufficient stack space to handle exception!
> > [  159.832241] Task stack: [0xc8e44000..0xc8e46000]
> > [  159.842701] IRQ stack:  [0xc880..0xc8802000]
> > [  159.847712] Overflow stack: [0xc1934000..0xc1935000]
> > [  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
> > [  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
> > industrialio_triggered_buffer kfifo_buf capture_events_irq capture_events 
> > iio_trig_hrtimer industrialio_sw_trigger industrialio_configfs dm_mod
> > [  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
> > [  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
> > [  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
> > [  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
> > [  159.898376] pc : []lr : []psr: 6093
> > [  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
> > [  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
> > [  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : c8e440ac
> > [  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : c8e44038
> > [  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  
> > Segment none
> > [  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
> > [  159.941436] Register r0 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  159.952253] Register r1 information: non-slab/vmalloc memory
> > [  159.957988] Register r2 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  159.968791] Register r3 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  159.979592] Register r4 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  159.990391] Register r5 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  160.001187] Register r6 information: non-paged memory
> > [  160.006303] Register r7 information: non-paged memory
> > [  160.011415] Register r8 information: non-slab/vmalloc memory
> > [  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 
> > pointer offset 0 size 32
> > [  160.025718] Register r10 information: non-slab/vmalloc memory
> > [  160.031530] Register r11 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  160.042422] Register r12 information: 2-page vmalloc region starting at 
> > 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> > [  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
> > [  160.058955] Stack: (0xc8e44018 to 0xc8e46000)
> 
> No backtrace? No Code: line? I'm guessing there was an attempt to ftrace
> a function involving the ftrace tracing infrastructure, which is why 8KB
> of stack has been gobbled up. It could be
> copy_from_kernel_nofault_allowed() but it would be useful to have at
> least some extract of the backtrace showing the recursive cycle to
> confirm, otherwise there is nothing in your report to confirm. As I'm
> not a ftrace user myself, this isn't something I'd test for, so having
> a full report would be useful.

Is not having a backtrace related to ftrace_return_address() returning
NULL, as Arnd pointed out in
https://lore.kernel.org/linux-arm-kernel/36cd10de-c51c-40ff-90e8-714954060...@app.fastmail.com/
?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |


signature.asc
Description: PGP signature

Re: [PATCH 01/12] soc: qcom: add firmware name helper

2024-05-27 Thread Dmitry Baryshkov

On Thu, 23 May 2024 at 01:48, Bjorn Andersson  wrote:
>
> On Tue, May 21, 2024 at 03:08:31PM +0200, Dmitry Baryshkov wrote:
> > On Tue, 21 May 2024 at 13:20, Kalle Valo  wrote:
> > >
> > > Dmitry Baryshkov  writes:
> > >
> > > > On Tue, 21 May 2024 at 12:52,  wrote:
> > > >>
> > > >> On 21/05/2024 11:45, Dmitry Baryshkov wrote:
> > > >> > Qualcomm platforms have different sets of the firmware files, which
> > > >> > differ from platform to platform (and from board to board, due to the
> > > >> > embedded signatures). Rather than listing all the firmware files,
> > > >> > including full paths, in the DT, provide a way to determine firmware
> > > >> > path based on the root DT node compatible.
> > > >>
> > > >> Ok this looks quite over-engineered but necessary to handle the legacy,
> > > >> but I really think we should add a way to look for a board-specific 
> > > >> path
> > > >> first and fallback to those SoC specific paths.
> > > >
> > > > Again, CONFIG_FW_LOADER_USER_HELPER => delays.
> > >
> > > To me this also looks like very over-engineered, can you elaborate more
> > > why this is needed? Concrete examples would help to understand better.
> >
> > Sure. During the meeting last week Arnd suggested evaluating if we can
> > drop firmware-name from the board DT files. Several reasons for that:
> > - DT should describe the hardware, not the Linux-firmware locations
> > - having firmware name in DT complicates updating the tree to use
> > different firmware API (think of mbn vs mdt vs any other format)
> > - If the DT gets supplied by the vendor (e.g. for
> > SystemReady-certified devices), there should be a sync between the
> > vendor's DT, linux kernel and the rootfs. Dropping firmware names from
> > DT solves that by removing one piece of the equation
> >
> > Now for the complexity of the solution. Each SoC family has their own
> > firmware set. This includes firmware for the DSPs, for modem, WiFi
> > bits, GPU shader, etc.
> > For the development boards these devices are signed by the testing key
> > and the actual signature is not validated against the root of trust
> > certificate.
> > For the end-user devices the signature is actually validated against
> > the bits fused to the SoC during manufacturing process. CA certificate
> > (and thus the fuses) differ from vendor to vendor (and from the device
> > to device)
> >
> > Not all of the firmware files are a part of the public linux-firmware
> > tree. However we need to support the rootfs bundled with the firmware
> > for different platforms (both public and vendor). The non-signed files
> > come from the Adreno GPU and can be shared between platforms. All
> > other files are SoC-specific and in some cases device-specific.
> >
> > So for example the SDM845 db845c (open device) loads following firmware 
> > files:
> > Not signed:
> > - qcom/a630_sqe.fw
> > - qcom/a630_gmu.bin
> >
> > Signed, will work for any non-secured sdm845 device:
> > - qcom/sdm845/a630_zap.mbn
> > - qcom/sdm845/adsp.mbn
> > - qcom/sdm845/cdsp.mbn
> > - qcom/sdm485/mba.mbn
> > - qcom/sdm845/modem.mbn
> > - qcom/sdm845/wlanmdsp.mbn (loaded via TQFTP)
> > - qcom/venus-5.2/venus.mbn
> >
> > Signed, works only for DB845c.
> > - qcom/sdm845/Thundercomm/db845c/slpi.mbn
> >
> > In comparison, the SDM845 Pixel-3 phone (aka blueline) should load the
> > following firmware files:
> > - qcom/a630_sqe.fw (the same, non-signed file)
> > - qcom/a630_gmu.bin (the same, non-signed file)
> > - qcom/sdm845/Google/blueline/a630_zap.mbn
>
> How do you get from "a630_zap.mbn" to this? By extending the lookup
> table for every target, or what am I missing?

More or less so. Matching the root OF node gives us the firmware
location, then it gets prepended to all firmware targets. Not an ideal
solution, as there is no fallback support, but at least it gives us
some points to discuss (and to decide whether to move to some
particular direction or to abandon the idea completely, making Arnd
unhappy again).

>
> Regards,
> Bjorn
>
> > - qcom/sdm845/Google/blueline/adsp.mbn
> > - qcom/sdm845/Google/blueline/cdsp.mbn
> > - qcom/sdm845/Google/blueline/ipa_fws.mbn
> > - qcom/sdm845/Google/blueline/mba.mbn
> > - qcom/sdm845/Google/blueline/modem.mbn
> > - qcom/sdm845/Google/blueline/venus.mbn
> > - qcom/sdm845/Google/blueline/wlanmdsp.mbn
> > - qcom/sdm845/Google/blueline/slpi.mbn
> >
> > The Lenovo Yoga C630 WoS laptop (SDM850 is a variant of SDM845) uses
> > another set of files:
> > - qcom/a630_sqe.fw (the same, non-signed file)
> > - qcom/a630_gmu.bin (the same, non-signed file)
> > - qcom/sdm850/LENOVO/81JL/qcdxkmsuc850.mbn
> > - qcom/sdm850/LENOVO/81JL/qcadsp850.mbn
> > - qcom/sdm850/LENOVO/81JL/qccdsp850.mbn
> > - qcom/sdm850/LENOVO/81JL/ipa_fws.elf
> > - qcom/sdm850/LENOVO/81JL/qcdsp1v2850.mbn
> > - qcom/sdm850/LENOVO/81JL/qcdsp2850.mbn
> > - qcom/sdm850/LENOVO/81JL/qcvss850.mbn
> > - qcom/sdm850/LENOVO/81JL/wlanmdsp.mbn
> > - qcom/sdm850/LENOVO/81JL/qcslpi850.mbn
> >
> > If we look at one of the recent

Re: [PATCH 2/2] ring-buffer: Fix a race between readers and resize checks

2024-05-27 Thread Petr Pavlu

do
>>  echo 16 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
>>  echo 8 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
>>  done &
>>  while true; do
>>  for i in /sys/kernel/tracing/per_cpu/*; do
>>  timeout 0.1 cat $i/trace_pipe; sleep 0.2
>>  done
>>  done
>>
>> To fix the problem, make sure ring_buffer_resize() doesn't invoke
>> rb_check_pages() concurrently with a reader operating on the same
>> ring_buffer_per_cpu by taking its cpu_buffer->reader_lock.
> 
> Definitely a bug. Thanks for catching it. But...
> 
>>
>> Fixes: 659f451ff213 ("ring-buffer: Add integrity check at end of iter read")
>> Signed-off-by: Petr Pavlu 
>> ---
>>  kernel/trace/ring_buffer.c | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>> index 0ae569eae55a..967655591719 100644
>> --- a/kernel/trace/ring_buffer.c
>> +++ b/kernel/trace/ring_buffer.c
>> @@ -1449,6 +1449,11 @@ static void rb_check_bpage(struct ring_buffer_per_cpu 
>> *cpu_buffer,
>>   *
>>   * As a safety measure we check to make sure the data pages have not
>>   * been corrupted.
>> + *
>> + * Callers of this function need to guarantee that the list of pages 
>> doesn't get
>> + * modified during the check. In particular, if it's possible that the 
>> function
>> + * is invoked with concurrent readers which can swap in a new reader page 
>> then
>> + * the caller should take cpu_buffer->reader_lock.
>>   */
>>  static void rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
>>  {
>> @@ -2200,8 +2205,13 @@ int ring_buffer_resize(struct trace_buffer *buffer, 
>> unsigned long size,
>>   */
>>  synchronize_rcu();
>>  for_each_buffer_cpu(buffer, cpu) {
>> +unsigned long flags;
>> +
>>  cpu_buffer = buffer->buffers[cpu];
>> +raw_spin_lock_irqsave(_buffer->reader_lock, flags);
>>  rb_check_pages(cpu_buffer);
>> +raw_spin_unlock_irqrestore(_buffer->reader_lock,
>> +   flags);
> 
> Putting my RT hat on, I really don't like the above fix. The
> rb_check_pages() iterates all subbuffers which makes the time interrupts
> are disabled non-deterministic.

I see, this applies also to the same rb_check_pages() validation invoked
from ring_buffer_read_finish().

> 
> Instead, I would rather have something where we disable readers while we do
> the check, and re-enable them.
> 
>   raw_spin_lock_irqsave(_buffer->reader_lock, flags);
>   cpu_buffer->read_disabled++;
>   raw_spin_unlock_irqrestore(_buffer->reader_lock, 
> flags);
> 
> // Also, don't put flags on a new line. We are allow to go 100 characters now.

Noted.

> 
> 
>   rb_check_pages(cpu_buffer);
>   raw_spin_lock_irqsave(_buffer->reader_lock, flags);
>   cpu_buffer->read_disabled--;
>   raw_spin_unlock_irqrestore(_buffer->reader_lock, 
> flags);
> 
> Or something like that. Yes, that also requires creating a new
> "read_disabled" field in the ring_buffer_per_cpu code.

I think this would work but I'm personally not immediately sold on this
approach. If I understand the idea correctly, readers should then check
whether cpu_buffer->read_disabled is set and bail out with some error if
that is the case. The rb_check_pages() function is only a self-check
code and as such, I feel it doesn't justify disrupting readers with
a new error condition and adding more complex locking.

I've been considering how to approach this RT issue differently. One
obvious approach would be to drop the rb_check_pages() validation but
that is probably not desirable.

Another option could be to make the check less thorough and walk only
a part of the list which is bounded by some constant, typically one
would want to check the part where some change was just made. In case of
a smaller list, it should be still possible to traverse it completely.

This is an idea that I'm currently looking at.

> 
> That said, I'm going to accept these patches as is (moving flags onto the
> same line). But would like the above code for the next merge window as it
> would then not affect RT.
> 
> I'll accept these patches because it does fix the bug now.

Thanks,
Petr

Re: [PATCH v2 5/5] arm64: dts: qcom: sa8775p-ride: enable remoteprocs

2024-05-27 Thread Dmitry Baryshkov

On Mon, May 27, 2024 at 10:43:52AM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski 
> 
> Enable all remoteproc nodes on the sa8775p-ride board and point to the
> appropriate firmware files.
> 
> Signed-off-by: Bartosz Golaszewski 
> ---
>  arch/arm64/boot/dts/qcom/sa8775p-ride.dts | 25 +
>  1 file changed, 25 insertions(+)
> 

Reviewed-by: Dmitry Baryshkov 


-- 
With best wishes
Dmitry

Re: [PATCH v2 4/5] arm64: dts: qcom: sa8775p: add ADSP, CDSP and GPDSP nodes

2024-05-27 Thread Dmitry Baryshkov

On Mon, May 27, 2024 at 10:43:51AM +0200, Bartosz Golaszewski wrote:
> From: Tengfei Fan 
> 
> Add nodes for remoteprocs: ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 for
> SA8775p SoCs.
> 
> Signed-off-by: Tengfei Fan 
> Co-developed-by: Bartosz Golaszewski 
> Signed-off-by: Bartosz Golaszewski 
> ---
>  arch/arm64/boot/dts/qcom/sa8775p.dtsi | 332 
> ++
>  1 file changed, 332 insertions(+)
> 

With nsp0 vs nsp1 vs nsp sorted out:


Reviewed-by: Dmitry Baryshkov 


-- 
With best wishes
Dmitry

Re: [PATCH v2 3/5] remoteproc: qcom_q6v5_pas: Add support for SA8775p ADSP, CDSP and GPDSP

2024-05-27 Thread Dmitry Baryshkov

On Mon, May 27, 2024 at 10:43:50AM +0200, Bartosz Golaszewski wrote:
> From: Tengfei Fan 
> 
> Add support for PIL loading on ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 on
> SA8775p SoCs.
> 
> Signed-off-by: Tengfei Fan 
> Co-developed-by: Bartosz Golaszewski 
> Signed-off-by: Bartosz Golaszewski 
> ---
>  drivers/remoteproc/qcom_q6v5_pas.c | 92 
> ++
>  1 file changed, 92 insertions(+)
> 
> diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
> b/drivers/remoteproc/qcom_q6v5_pas.c
> index 54d8005d40a3..16053aa99298 100644
> --- a/drivers/remoteproc/qcom_q6v5_pas.c
> +++ b/drivers/remoteproc/qcom_q6v5_pas.c
> @@ -820,6 +820,23 @@ static const struct adsp_data adsp_resource_init = {
>   .ssctl_id = 0x14,
>  };
>  
> +static const struct adsp_data sa8775p_adsp_resource = {
> + .crash_reason_smem = 423,
> + .firmware_name = "adsp.mdt",

mbn please.

Other than that LGTM

> + .pas_id = 1,
> + .minidump_id = 5,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "lcx",
> + "lmx",
> + NULL
> + },
> + .load_state = "adsp",
> + .ssr_name = "lpass",
> + .sysmon_name = "adsp",
> + .ssctl_id = 0x14,
> +};
> +
>  static const struct adsp_data sdm845_adsp_resource_init = {
>   .crash_reason_smem = 423,
>   .firmware_name = "adsp.mdt",
> @@ -933,6 +950,42 @@ static const struct adsp_data cdsp_resource_init = {
>   .ssctl_id = 0x17,
>  };
>  
> +static const struct adsp_data sa8775p_cdsp0_resource = {
> + .crash_reason_smem = 601,
> + .firmware_name = "cdsp0.mdt",
> + .pas_id = 18,
> + .minidump_id = 7,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "cx",
> + "mxc",
> + "nsp0",
> + NULL
> + },
> + .load_state = "cdsp",
> + .ssr_name = "cdsp",
> + .sysmon_name = "cdsp",
> + .ssctl_id = 0x17,
> +};
> +
> +static const struct adsp_data sa8775p_cdsp1_resource = {
> + .crash_reason_smem = 633,
> + .firmware_name = "cdsp1.mdt",
> + .pas_id = 30,
> + .minidump_id = 20,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "cx",
> + "mxc",
> + "nsp1",
> + NULL
> + },
> + .load_state = "nsp",
> + .ssr_name = "cdsp1",
> + .sysmon_name = "cdsp1",
> + .ssctl_id = 0x20,
> +};
> +
>  static const struct adsp_data sdm845_cdsp_resource_init = {
>   .crash_reason_smem = 601,
>   .firmware_name = "cdsp.mdt",
> @@ -1074,6 +1127,40 @@ static const struct adsp_data sm8350_cdsp_resource = {
>   .ssctl_id = 0x17,
>  };
>  
> +static const struct adsp_data sa8775p_gpdsp0_resource = {
> + .crash_reason_smem = 640,
> + .firmware_name = "gpdsp0.mdt",
> + .pas_id = 39,
> + .minidump_id = 21,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "cx",
> + "mxc",
> + NULL
> + },
> + .load_state = "gpdsp0",
> + .ssr_name = "gpdsp0",
> + .sysmon_name = "gpdsp0",
> + .ssctl_id = 0x21,
> +};
> +
> +static const struct adsp_data sa8775p_gpdsp1_resource = {
> + .crash_reason_smem = 641,
> + .firmware_name = "gpdsp1.mdt",
> + .pas_id = 40,
> + .minidump_id = 22,
> + .auto_boot = true,
> + .proxy_pd_names = (char*[]){
> + "cx",
> + "mxc",
> + NULL
> + },
> + .load_state = "gpdsp1",
> + .ssr_name = "gpdsp1",
> + .sysmon_name = "gpdsp1",
> + .ssctl_id = 0x22,
> +};
> +
>  static const struct adsp_data mpss_resource_init = {
>   .crash_reason_smem = 421,
>   .firmware_name = "modem.mdt",
> @@ -1315,6 +1402,11 @@ static const struct of_device_id adsp_of_match[] = {
>   { .compatible = "qcom,qcs404-adsp-pas", .data = _resource_init },
>   { .compatible = "qcom,qcs404-cdsp-pas", .data = _resource_init },
>   { .compatible = "qcom,qcs404-wcss-pas", .data = _resource_init },
> + { .compatible = "qcom,sa8775p-adsp-pas", .data = 
> _adsp_resource},
> + { .compatible = "qcom,sa8775p-cdsp0-pas", .data = 
> _cdsp0_resource},
> + { .compatible = "qcom,sa8775p-cdsp1-pas", .data = 
> _cdsp1_resource},
> + { .compatible = "qcom,sa8775p-gpdsp0-pas", .data = 
> _gpdsp0_resource},
> + { .compatible = "qcom,sa8775p-gpdsp1-pas", .data = 
> _gpdsp1_resource},
>   { .compatible = "qcom,sc7180-adsp-pas", .data = _adsp_resource},
>   { .compatible = "qcom,sc7180-mpss-pas", .data = _resource_init},
>   { .compatible = "qcom,sc7280-adsp-pas", .data = _adsp_resource},
> 
> -- 
> 2.43.0
> 

-- 
With best wishes
Dmitry

Re: [PATCH v2 1/5] dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP

2024-05-27 Thread Krzysztof Kozlowski

On 27/05/2024 10:43, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski 
> 
> Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and
> GPDSP1 on the SA8775p SoC.
> 
> Co-developed-by: Tengfei Fan 

Missing SoB.

> Signed-off-by: Bartosz Golaszewski 

...


> +
> +allOf:
> +  - $ref: /schemas/remoteproc/qcom,pas-common.yaml#
> +
> +  - if:
> +  properties:
> +compatible:
> +  enum:
> +- qcom,sa8775p-adsp-pas
> +then:
> +  properties:
> +power-domains:
> +  items:
> +- description: LCX power domain
> +- description: LMX power domain
> +power-domain-names:
> +  items:
> +- const: lcx
> +- const: lmx
> +
> +  - if:
> +  properties:
> +compatible:
> +  enum:
> +- qcom,sa8775p-cdsp-pas

cdsp0

> +then:
> +  properties:
> +power-domains:
> +  items:
> +- description: CX power domain
> +- description: MXC power domain
> +- description: NSP0 power domain
> +power-domain-names:
> +  items:
> +- const: cx
> +- const: mxc
> +- const: nsp0

Shouldn't this be just nsp, so both cdsp0 and cdsp1 entries can be
unified? That's the power domain from the device point of view, so the
device expects to be in some NSP domain, not explicitly NSPn.



Best regards,
Krzysztof

Re: [PATCH] tools/virtio: pipe assertion in vring_test.c

2024-05-27 Thread Yunseong Kim




On 5/27/24 4:52 오후, Michael S. Tsirkin wrote:
> On Mon, May 27, 2024 at 04:13:31PM +0900, ysk...@gmail.com wrote:
>> From: Yunseong Kim 
>>
>> The virtio_device need to fail checking when create the geust/host pipe.
> 
> typo

Thank you for code review Michael.

Sorry, there was a typo in my message.

I'll fix it and send you patch version 2.

>>
>> Signed-off-by: Yunseong Kim 
> 
> 
> I guess ... 
> 
>> ---
>>  tools/virtio/vringh_test.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/virtio/vringh_test.c b/tools/virtio/vringh_test.c
>> index 98ff808d6f0c..b1af8807c02a 100644
>> --- a/tools/virtio/vringh_test.c
>> +++ b/tools/virtio/vringh_test.c
>> @@ -161,8 +161,8 @@ static int parallel_test(u64 features,
>>  host_map = mmap(NULL, mapsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>>  guest_map = mmap(NULL, mapsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 
>> 0);
>>  
>> -pipe(to_guest);
>> -pipe(to_host);
>> +assert(pipe(to_guest) == 0);
>> +assert(pipe(to_host) == 0);
> 
> 
> I don't like == 0, prefer ! .
> Also, calling pipe outside assert is preferable, since in theory
> assert can be compiled out.
> Not an issue here but people tend to copy/paste text.

I agree, it's uncomfortable even if I did it.

I'll fix it as you suggested and send it to patch 2.

Thank you!


Warm Regards,

Yunseong Kim


>>  CPU_ZERO(_set);
>>  find_cpus(_cpu, _cpu);
>> -- 
>> 2.34.1
>

Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Russell King (Oracle)

On Mon, May 27, 2024 at 09:43:41AM +0200, Thorsten Scherer wrote:
> Hello,
> 
> in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe pointed me to
> this thread.  With the proposed code change applied the procedure
> 
> # set to some known good (randomly guessed) filter function and enable 
> function_graph
> echo mtdblock_open > set_ftrace_filter
> echo function_graph > current_tracer
> 
> # walk available filter funcs
> cat available_filter_functions | while read f; do echo $f | tee -a 
> set_ftrace_filter; sleep 1; done
> 
> produces the following output
> 
> [  159.832173] Insufficient stack space to handle exception!
> [  159.832241] Task stack: [0xc8e44000..0xc8e46000]
> [  159.842701] IRQ stack:  [0xc880..0xc8802000]
> [  159.847712] Overflow stack: [0xc1934000..0xc1935000]
> [  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
> [  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
> industrialio_triggered_buffer kfifo_buf capture_events_irq capture_events 
> iio_trig_hrtimer industrialio_sw_trigger industrialio_configfs dm_mod
> [  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
> [  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
> [  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
> [  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
> [  159.898376] pc : []lr : []psr: 6093
> [  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
> [  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
> [  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : c8e440ac
> [  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : c8e44038
> [  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment 
> none
> [  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
> [  159.941436] Register r0 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  159.952253] Register r1 information: non-slab/vmalloc memory
> [  159.957988] Register r2 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  159.968791] Register r3 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  159.979592] Register r4 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  159.990391] Register r5 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  160.001187] Register r6 information: non-paged memory
> [  160.006303] Register r7 information: non-paged memory
> [  160.011415] Register r8 information: non-slab/vmalloc memory
> [  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 
> pointer offset 0 size 32
> [  160.025718] Register r10 information: non-slab/vmalloc memory
> [  160.031530] Register r11 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  160.042422] Register r12 information: 2-page vmalloc region starting at 
> 0xc8e44000 allocated at kernel_clone+0xa8/0x408
> [  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
> [  160.058955] Stack: (0xc8e44018 to 0xc8e46000)

No backtrace? No Code: line? I'm guessing there was an attempt to ftrace
a function involving the ftrace tracing infrastructure, which is why 8KB
of stack has been gobbled up. It could be
copy_from_kernel_nofault_allowed() but it would be useful to have at
least some extract of the backtrace showing the recursive cycle to
confirm, otherwise there is nothing in your report to confirm. As I'm
not a ftrace user myself, this isn't something I'd test for, so having
a full report would be useful.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH] tools/virtio: pipe assertion in vring_test.c

2024-05-27 Thread Michael S. Tsirkin

On Mon, May 27, 2024 at 04:13:31PM +0900, ysk...@gmail.com wrote:
> From: Yunseong Kim 
> 
> The virtio_device need to fail checking when create the geust/host pipe.

typo

> 
> Signed-off-by: Yunseong Kim 


I guess ... 

> ---
>  tools/virtio/vringh_test.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/virtio/vringh_test.c b/tools/virtio/vringh_test.c
> index 98ff808d6f0c..b1af8807c02a 100644
> --- a/tools/virtio/vringh_test.c
> +++ b/tools/virtio/vringh_test.c
> @@ -161,8 +161,8 @@ static int parallel_test(u64 features,
>   host_map = mmap(NULL, mapsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>   guest_map = mmap(NULL, mapsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 
> 0);
>  
> - pipe(to_guest);
> - pipe(to_host);
> + assert(pipe(to_guest) == 0);
> + assert(pipe(to_host) == 0);


I don't like == 0, prefer ! .
Also, calling pipe outside assert is preferable, since in theory
assert can be compiled out.
Not an issue here but people tend to copy/paste text.

>   CPU_ZERO(_set);
>   find_cpus(_cpu, _cpu);
> -- 
> 2.34.1

Re: ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

2024-05-27 Thread Thorsten Scherer

Hello,

in the context of a panic on an i.MX25 based v6.9 kernel [1] Uwe pointed me to
this thread.  With the proposed code change applied the procedure

# set to some known good (randomly guessed) filter function and enable 
function_graph
echo mtdblock_open > set_ftrace_filter
echo function_graph > current_tracer

# walk available filter funcs
cat available_filter_functions | while read f; do echo $f | tee -a 
set_ftrace_filter; sleep 1; done

produces the following output

[  159.832173] Insufficient stack space to handle exception!
[  159.832241] Task stack: [0xc8e44000..0xc8e46000]
[  159.842701] IRQ stack:  [0xc880..0xc8802000]
[  159.847712] Overflow stack: [0xc1934000..0xc1935000]
[  159.852726] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
[  159.858273] Modules linked in: capture_events_imxgpt ti_ads7950 
industrialio_triggered_buffer kfifo_buf capture_events_irq capture_events 
iio_trig_hrtimer industrialio_sw_trigger industrialio_configfs dm_mod
[  159.876948] CPU: 0 PID: 199 Comm: sh Not tainted 6.9.0 #3
[  159.882412] Hardware name: Freescale i.MX25 (Device Tree Support)
[  159.888547] PC is at prepare_ftrace_return+0x4/0x7c
[  159.893520] LR is at ftrace_graph_caller+0x1c/0x28
[  159.898376] pc : []lr : []psr: 6093
[  159.904690] sp : c8e44018  ip : c8e44018  fp : c8e4403c
[  159.909959] r10: c0c09e78  r9 : c35e9bc0  r8 : c010d9bc
[  159.915227] r7 : 0001  r6 : 0004  r5 : c8e44064  r4 : c8e440ac
[  159.921800] r3 : c8e44030  r2 : c8e4403c  r1 : c010eb9c  r0 : c8e44038
[  159.928376] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment 
none
[  159.935652] Control: 0005317f  Table: 83074000  DAC: 0051
[  159.941436] Register r0 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  159.952253] Register r1 information: non-slab/vmalloc memory
[  159.957988] Register r2 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  159.968791] Register r3 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  159.979592] Register r4 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  159.990391] Register r5 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  160.001187] Register r6 information: non-paged memory
[  160.006303] Register r7 information: non-paged memory
[  160.011415] Register r8 information: non-slab/vmalloc memory
[  160.017139] Register r9 information: slab kmalloc-32 start c35e9bc0 pointer 
offset 0 size 32
[  160.025718] Register r10 information: non-slab/vmalloc memory
[  160.031530] Register r11 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  160.042422] Register r12 information: 2-page vmalloc region starting at 
0xc8e44000 allocated at kernel_clone+0xa8/0x408
[  160.053315] Process sh (pid: 199, stack limit = 0x68fc3abb)
[  160.058955] Stack: (0xc8e44018 to 0xc8e46000)

and points me to copy_from_kernel_nofault_allowed.

Disassembly of section .text:

c010eb8c :
c010eb8c:   e1a0c00dmov ip, sp
c010eb90:   e92dd800push{fp, ip, lr, pc}
c010eb94:   e24cb004sub fp, ip, #4
c010eb98:   e52de004push{lr}; (str lr, [sp, 
#-4]!)
c010eb9c:   ebfffb4fbl  c010d8e0 <__gnu_mcount_nc>
c010eba0:   e35004bfcmp r0, #-1090519040; 0xbf00
c010eba4:   3a03bcc c010ebb8 

c010eba8:   e091addsr0, r0, r1
c010ebac:   33a1movcc   r0, #1
c010ebb0:   23a0movcs   r0, #0
c010ebb4:   e89da800ldm sp, {fp, sp, pc}
c010ebb8:   e3a0mov r0, #0
c010ebbc:   e89da800ldm sp, {fp, sp, pc}


In the time given I couldn't make sense of the issue, so I am passing this
hopefully useful information to you.

Best regards
Thorsten

#regzbot introduced: 953f534a7ed6b725d4f101d2949393acc9262880 ^

[1] 
https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76/

Re: [PATCH] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-26 Thread Tatsuya S

On Mon, May 27, 2024 at 08:17:19AM GMT, Masami Hiramatsu wrote:
> On Sun, 26 May 2024 22:51:53 +0800
> kernel test robot  wrote:
> 
> > Hi Tatsuya,
> > 
> > kernel test robot noticed the following build warnings:
> > 
> > [auto build test WARNING on linus/master]
> > [also build test WARNING on rostedt-trace/for-next v6.9 next-20240523]
> > [cannot apply to rostedt-trace/for-next-urgent]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch#_base_tree_information]
> > 
> > url:
> > https://github.com/intel-lab-lkp/linux/commits/Tatsuya-S/ftrace-Fix-stack-trace-entry-generated-by-ftrace_pid_func/20240526-193149
> > base:   linus/master
> > patch link:
> > https://lore.kernel.org/r/20240526112658.46740-1-tatsuya.s2862%40gmail.com
> > patch subject: [PATCH] ftrace: Fix stack trace entry generated by 
> > ftrace_pid_func()
> > config: x86_64-buildonly-randconfig-002-20240526 
> > (https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/config)
> > compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 
> > 617a15a9eac96088ae5e9134248d8236e34b91b1)
> > reproduce (this is a W=1 build): 
> > (https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/reproduce)
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new 
> > version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot 
> > | Closes: 
> > https://lore.kernel.org/oe-kbuild-all/202405262232.l4xh8q6o-...@intel.com/
> > 
> > All warnings (new ones prefixed by >>):
> > 
> > >> kernel/trace/ftrace.c:102:6: warning: no previous prototype for function 
> > >> 'ftrace_pids_enabled' [-Wmissing-prototypes]
> >  102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
> >  |  ^
> >kernel/trace/ftrace.c:102:1: note: declare 'static' if the function is 
> > not intended to be used outside of this translation unit
> >  102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
> >  | ^
> >  | static 
> >1 warning generated.
> 
> This is because the prototype in linux/ftrace.h is placed in the 
> #ifdef CONFIG_DYNAMIC_FTRACE block. The prototype needs to be moved
> outside of the block.
> 
> Thank you,
> 
> > 
> > 
> > vim +/ftrace_pids_enabled +102 kernel/trace/ftrace.c
> > 
> >101  
> >  > 102  bool ftrace_pids_enabled(struct ftrace_ops *ops)
> >103  {
> >104  struct trace_array *tr;
> >105  
> >106  if (!(ops->flags & FTRACE_OPS_FL_PID) || !ops->private)
> >107  return false;
> >108  
> >109  tr = ops->private;
> >110  
> >111  return tr->function_pids != NULL || 
> > tr->function_no_pids != NULL;
> >112  }
> >113  
> > 
> > -- 
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests/wiki
> 
> 
> -- 
> Masami Hiramatsu (Google) 

I understand. Thank you.

Re: [PATCH 00/20] function_graph: Allow multiple users for function graph tracing

2024-05-26 Thread Steven Rostedt

On Mon, 27 May 2024 09:37:47 +0900
Masami Hiramatsu (Google)  wrote:

> > The diff between this and Masami's last update is at the end of this email. 
> >  
> 
> Thanks for update the patches!
> I think your changes are good. I just have some comments and replied.
> So, the plan is to push this series in the tracing/for-next? I will
> port my fprobe part on it and run some tests.

Yeah. I'll probably push it after -rc2 comes out and base it on top of
that. I usually wait till rc2 as it tends to be more stable than rc1.

I'll send an updated series later this week that addresses your comments.

-- Steve

Re: [PATCH 04/20] function_graph: Allow multiple users to attach to function graph

2024-05-26 Thread Steven Rostedt

On Mon, 27 May 2024 09:34:36 +0900
Masami Hiramatsu (Google)  wrote:

> > @@ -110,11 +253,13 @@ void ftrace_graph_stop(void)
> >  /* Add a function return address to the trace stack on thread info.*/
> >  static int
> >  ftrace_push_return_trace(unsigned long ret, unsigned long func,
> > -unsigned long frame_pointer, unsigned long *retp)
> > +unsigned long frame_pointer, unsigned long *retp,
> > +int fgraph_idx)  
> 
> We do not need this fgraph_idx parameter anymore because this removed
> reuse-frame check.

Agreed. Will remove.

> 
> >  {
> > struct ftrace_ret_stack *ret_stack;
> > unsigned long long calltime;
> > -   int index;
> > +   unsigned long val;
> > +   int offset;
> >  
> > if (unlikely(ftrace_graph_is_dead()))
> > return -EBUSY;
> > @@ -124,24 +269,57 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> > long func,
> >  
> > BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));
> >  
> > +   /* Set val to "reserved" with the delta to the new fgraph frame */
> > +   val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> > +
> > /*
> >  * We must make sure the ret_stack is tested before we read
> >  * anything else.
> >  */
> > smp_rmb();
> >  
> > -   /* The return trace stack is full */
> > -   if (current->curr_ret_stack >= SHADOW_STACK_MAX_INDEX) {
> > +   /*
> > +* Check if there's room on the shadow stack to fit a fraph frame
> > +* and a bitmap word.
> > +*/
> > +   if (current->curr_ret_stack + FGRAPH_FRAME_OFFSET + 1 >= 
> > SHADOW_STACK_MAX_OFFSET) {
> > atomic_inc(>trace_overrun);
> > return -EBUSY;
> > }
> >  
> > calltime = trace_clock_local();
> >  
> > -   index = current->curr_ret_stack;
> > -   RET_STACK_INC(current->curr_ret_stack);
> > -   ret_stack = RET_STACK(current, index);
> > +   offset = READ_ONCE(current->curr_ret_stack);
> > +   ret_stack = RET_STACK(current, offset);
> > +   offset += FGRAPH_FRAME_OFFSET;
> > +
> > +   /* ret offset = FGRAPH_FRAME_OFFSET ; type = reserved */
> > +   current->ret_stack[offset] = val;
> > +   ret_stack->ret = ret;
> > +   /*
> > +* The unwinders expect curr_ret_stack to point to either zero
> > +* or an offset where to find the next ret_stack. Even though the
> > +* ret stack might be bogus, we want to write the ret and the
> > +* offset to find the ret_stack before we increment the stack point.
> > +* If an interrupt comes in now before we increment the curr_ret_stack
> > +* it may blow away what we wrote. But that's fine, because the
> > +* offset will still be correct (even though the 'ret' won't be).
> > +* What we worry about is the offset being correct after we increment
> > +* the curr_ret_stack and before we update that offset, as if an
> > +* interrupt comes in and does an unwind stack dump, it will need
> > +* at least a correct offset!
> > +*/
> > barrier();
> > +   WRITE_ONCE(current->curr_ret_stack, offset + 1);
> > +   /*
> > +* This next barrier is to ensure that an interrupt coming in
> > +* will not corrupt what we are about to write.
> > +*/
> > +   barrier();
> > +
> > +   /* Still keep it reserved even if an interrupt came in */
> > +   current->ret_stack[offset] = val;
> > +
> > ret_stack->ret = ret;
> > ret_stack->func = func;
> > ret_stack->calltime = calltime;
> > @@ -151,7 +329,7 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> > long func,
> >  #ifdef HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
> > ret_stack->retp = retp;
> >  #endif
> > -   return 0;
> > +   return offset;
> >  }
> >  
> >  /*
> > @@ -168,49 +346,67 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> > long func,
> >  # define MCOUNT_INSN_SIZE 0
> >  #endif
> >  
> > +/* If the caller does not use ftrace, call this function. */
> >  int function_graph_enter(unsigned long ret, unsigned long func,
> >  unsigned long frame_pointer, unsigned long *retp)
> >  {
> > struct ftrace_graph_ent trace;
> > +   unsigned long bitmap = 0;
> > +   int offset;
> > +   int i;
> >  
> > trace.func = func;
> > trace.depth = ++current->curr_ret_depth;
> >  
> > -   if (ftrace_push_return_trace(ret, func, frame_pointer, retp))
> > +   offset = ftrace_push_return_trace(ret, func, frame_pointer, retp, 0);
> > +   if (offset < 0)
> > goto out;
> >  
> > -   /* Only trace if the calling function expects to */
> > -   if (!fgraph_array[0]->entryfunc())
> > +   for (i = 0; i < fgraph_array_cnt; i++) {
> > +   struct fgraph_ops *gops = fgraph_array[i];
> > +
> > +   if (gops == _stub)
> > +   continue;
> > +
> > +   if (gops->entryfunc())
> > +   bitmap |= BIT(i);
> > +   }
> > +
> > +   if (!bitmap)
> > goto out_ret;
> >  
> > +   /*
> > +* Since this function uses fgraph_idx = 0 as a tail-call

Re: [PATCH 00/20] function_graph: Allow multiple users for function graph tracing

2024-05-26 Thread Google

On Fri, 24 May 2024 22:36:52 -0400
Steven Rostedt  wrote:

> [
>   Resend for some of you as I missed a comma in the Cc and
>   quilt died sending this out.
> ]
> 
> This is a continuation of the function graph multi user code.
> I wrote a proof of concept back in 2019 of this code[1] and
> Masami started cleaning it up. I started from Masami's work v10
> that can be found here:
> 
>  
> https://lore.kernel.org/linux-trace-kernel/171509088006.162236.7227326999861366050.stgit@devnote2/
> 
> This is *only* the code that allows multiple users of function
> graph tracing. This is not the fprobe work that Masami is working
> to add on top of it. As Masami took my proof of concept, there
> was still several things I disliked about that code. Instead of
> having Masami clean it up even more, I decided to take over on just
> my code and change it up a bit.
> 
> The biggest changes from where Masami left off is mostly renaming more
> variables, macros, and function names. I fixed up the current comments
> and added more to make the code a bit more understandable.
> 
> At the end of the series, I added two patches to optimize the entry
> and exit. On entry, there was a loop that iterated the 16 elements
> of the fgraph_array[] looking for any that may have a gops registered
> to it. It's quite a waste to do that loop if there's only one
> registered user. To fix that, I added a fgraph_array_bitmask that has
> its bits set that correspond to the elements of the array. Then
> a simple for_each_set_bit() is used for the iteration. I do the same
> thing at the exit callback of the function where it iterates over the
> bits of the bitmap saved on the ret_stack.
> 
> I also noticed that Masami added code to handle tail calls in the
> unwinder and had that in one of my patches. I took that code out
> and made it a separate patch with Masami as the author.
> 
> The diff between this and Masami's last update is at the end of this email.

Thanks for update the patches!
I think your changes are good. I just have some comments and replied.
So, the plan is to push this series in the tracing/for-next? I will
port my fprobe part on it and run some tests.

Thank you,

> 
> Based on Linus commit: 0eb03c7e8e2a4cc3653eb5eeb2d2001182071215
> 
> [1] https://lore.kernel.org/all/20190525031633.811342...@goodmis.org/
> 
> Masami Hiramatsu (Google) (3):
>   function_graph: Handle tail calls for stack unwinding
>   function_graph: Use a simple LRU for fgraph_array index number
>   ftrace: Add multiple fgraph storage selftest
> 
> Steven Rostedt (Google) (2):
>   function_graph: Use for_each_set_bit() in __ftrace_return_to_handler()
>   function_graph: Use bitmask to loop on fgraph entry
> 
> Steven Rostedt (VMware) (15):
>   function_graph: Convert ret_stack to a series of longs
>   fgraph: Use BUILD_BUG_ON() to make sure we have structures divisible by 
> long
>   function_graph: Add an array structure that will allow multiple 
> callbacks
>   function_graph: Allow multiple users to attach to function graph
>   function_graph: Remove logic around ftrace_graph_entry and return
>   ftrace/function_graph: Pass fgraph_ops to function graph callbacks
>   ftrace: Allow function_graph tracer to be enabled in instances
>   ftrace: Allow ftrace startup flags to exist without dynamic ftrace
>   function_graph: Have the instances use their own ftrace_ops for 
> filtering
>   function_graph: Add "task variables" per task for fgraph_ops
>   function_graph: Move set_graph_function tests to shadow stack global var
>   function_graph: Move graph depth stored data to shadow stack global var
>   function_graph: Move graph notrace bit to shadow stack global var
>   function_graph: Implement fgraph_reserve_data() and 
> fgraph_retrieve_data()
>   function_graph: Add selftest for passing local variables
> 
> 
>  arch/arm64/kernel/ftrace.c   |  21 +-
>  arch/loongarch/kernel/ftrace_dyn.c   |  15 +-
>  arch/powerpc/kernel/trace/ftrace.c   |   3 +-
>  arch/riscv/kernel/ftrace.c   |  15 +-
>  arch/x86/kernel/ftrace.c |  19 +-
>  include/linux/ftrace.h   |  42 +-
>  include/linux/sched.h|   2 +-
>  include/linux/trace_recursion.h  |  39 --
>  kernel/trace/fgraph.c| 994 
> ---
>  kernel/trace/ftrace.c|  11 +-
>  kernel/trace/ftrace_internal.h   |   2 -
>  kernel/trace/trace.h |  94 +++-
>  kernel/trace/trace_functions.c   |   8 +
>  kernel/trace/trace_functions_graph.c |  96 ++--
>  kernel/trace/trace_irqsoff.c |  10 +-
>  kernel/trace/trace_sched_wakeup.c|  10 +-
>  kernel/trace/trace_selftest.c| 259 -
>  17 files changed, 1330 insertions(+), 310 deletions(-)
> 
> 
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 3313e4b83aa2..1aae521e5997 100644
> --- a/kernel/trace/fgraph.c
> +++

Re: [PATCH 04/20] function_graph: Allow multiple users to attach to function graph

2024-05-26 Thread Google

On Fri, 24 May 2024 22:36:56 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (VMware)" 
> 
> Allow for multiple users to attach to function graph tracer at the same
> time. Only 16 simultaneous users can attach to the tracer. This is because
> there's an array that stores the pointers to the attached fgraph_ops. When
> a function being traced is entered, each of the ftrace_ops entryfunc is
> called and if it returns non zero, its index into the array will be added
> to the shadow stack.
> 
> On exit of the function being traced, the shadow stack will contain the
> indexes of the ftrace_ops on the array that want their retfunc to be
> called.
> 
> Because a function may sleep for a long time (if a task sleeps itself),
> the return of the function may be literally days later. If the ftrace_ops
> is removed, its place on the array is replaced with a ftrace_ops that
> contains the stub functions and that will be called when the function
> finally returns.
> 
> If another ftrace_ops is added that happens to get the same index into the
> array, its return function may be called. But that's actually the way
> things current work with the old function graph tracer. If one tracer is
> removed and another is added, the new one will get the return calls of the
> function traced by the previous one, thus this is not a regression. This
> can be fixed by adding a counter to each time the array item is updated and
> save that on the shadow stack as well, such that it won't be called if the
> index saved does not match the index on the array.
> 
> Note, being able to filter functions when both are called is not completely
> handled yet, but that shouldn't be too hard to manage.
> 
> Co-developed with Masami Hiramatsu:
> Link: 
> https://lore.kernel.org/linux-trace-kernel/171509096221.162236.8806372072523195752.stgit@devnote2
> 

Thanks for update this. I have some comments below.

> Signed-off-by: Steven Rostedt (VMware) 
> Signed-off-by: Masami Hiramatsu (Google) 
> Signed-off-by: Steven Rostedt (Google) 
> ---
[...]

> @@ -110,11 +253,13 @@ void ftrace_graph_stop(void)
>  /* Add a function return address to the trace stack on thread info.*/
>  static int
>  ftrace_push_return_trace(unsigned long ret, unsigned long func,
> -  unsigned long frame_pointer, unsigned long *retp)
> +  unsigned long frame_pointer, unsigned long *retp,
> +  int fgraph_idx)

We do not need this fgraph_idx parameter anymore because this removed
reuse-frame check.

>  {
>   struct ftrace_ret_stack *ret_stack;
>   unsigned long long calltime;
> - int index;
> + unsigned long val;
> + int offset;
>  
>   if (unlikely(ftrace_graph_is_dead()))
>   return -EBUSY;
> @@ -124,24 +269,57 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> long func,
>  
>   BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));
>  
> + /* Set val to "reserved" with the delta to the new fgraph frame */
> + val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> +
>   /*
>* We must make sure the ret_stack is tested before we read
>* anything else.
>*/
>   smp_rmb();
>  
> - /* The return trace stack is full */
> - if (current->curr_ret_stack >= SHADOW_STACK_MAX_INDEX) {
> + /*
> +  * Check if there's room on the shadow stack to fit a fraph frame
> +  * and a bitmap word.
> +  */
> + if (current->curr_ret_stack + FGRAPH_FRAME_OFFSET + 1 >= 
> SHADOW_STACK_MAX_OFFSET) {
>   atomic_inc(>trace_overrun);
>   return -EBUSY;
>   }
>  
>   calltime = trace_clock_local();
>  
> - index = current->curr_ret_stack;
> - RET_STACK_INC(current->curr_ret_stack);
> - ret_stack = RET_STACK(current, index);
> + offset = READ_ONCE(current->curr_ret_stack);
> + ret_stack = RET_STACK(current, offset);
> + offset += FGRAPH_FRAME_OFFSET;
> +
> + /* ret offset = FGRAPH_FRAME_OFFSET ; type = reserved */
> + current->ret_stack[offset] = val;
> + ret_stack->ret = ret;
> + /*
> +  * The unwinders expect curr_ret_stack to point to either zero
> +  * or an offset where to find the next ret_stack. Even though the
> +  * ret stack might be bogus, we want to write the ret and the
> +  * offset to find the ret_stack before we increment the stack point.
> +  * If an interrupt comes in now before we increment the curr_ret_stack
> +  * it may blow away what we wrote. But that's fine, because the
> +  * offset will still be correct (even though the 'ret' won't be).
> +  * What we worry about is the offset being correct after we increment
> +  * the curr_ret_stack and before we update that offset, as if an
> +  * interrupt comes in and does an unwind stack dump, it will need
> +  * at least a correct offset!
> +  */
>   barrier();
> + WRITE_ONCE(current->curr_ret_stack, offset + 1);
> + /*
> +

Re: [PATCH 20/20] function_graph: Use bitmask to loop on fgraph entry

2024-05-26 Thread Steven Rostedt

On Mon, 27 May 2024 09:09:49 +0900
Masami Hiramatsu (Google)  wrote:

> > Note, we do not care about races. If a bit is set before the gops is
> > assigned, it only wastes time looking at the element and ignoring it (as
> > it did before this bitmask is added).  
> 
> This is OK because anyway we check gops == _stub.
> By the way, shouldn't we also make "if (gops == _stub)"
> check unlikely()?

Yeah, I'll add the unlikely() here too.

> 
> This change looks good to me.
> 
> Reviewed-by: Masami Hiramatsu (Google) 

Thanks for the review Masami!

-- Steve

Re: [PATCH 19/20] function_graph: Use for_each_set_bit() in __ftrace_return_to_handler()

2024-05-26 Thread Steven Rostedt

On Mon, 27 May 2024 09:04:34 +0900
Masami Hiramatsu (Google)  wrote:

> > bitmap = get_bitmap_bits(current, offset);
> > -   for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) {
> > +
> > +   for_each_set_bit(i, , sizeof(bitmap) * BITS_PER_BYTE) {
> > struct fgraph_ops *gops = fgraph_array[i];
> >  
> > -   if (!(bitmap & BIT(i)))
> > -   continue;
> > if (gops == _stub)  
> 
> Ah, nit: maybe this is unlikely()?

Ah probably. I'll update it.

Thanks,

-- Steve

Re: [PATCH 20/20] function_graph: Use bitmask to loop on fgraph entry

2024-05-26 Thread Google

On Fri, 24 May 2024 22:37:12 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> Instead of looping through all the elements of fgraph_array[] to see if
> there's an gops attached to one and then calling its gops->func(). Create
> a fgraph_array_bitmask that sets bits when an index in the array is
> reserved (via the simple lru algorithm). Then only the bits set in this
> bitmask needs to be looked at where only elements in the array that have
> ops registered need to be looked at.
> 
> Note, we do not care about races. If a bit is set before the gops is
> assigned, it only wastes time looking at the element and ignoring it (as
> it did before this bitmask is added).

This is OK because anyway we check gops == _stub.
By the way, shouldn't we also make "if (gops == _stub)"
check unlikely()?

This change looks good to me.

Reviewed-by: Masami Hiramatsu (Google) 

Thank you,

> 
> Signed-off-by: Steven Rostedt (Google) 
> ---
>  kernel/trace/fgraph.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 5e8e13ffcfb6..1aae521e5997 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -173,6 +173,7 @@ DEFINE_STATIC_KEY_FALSE(kill_ftrace_graph);
>  int ftrace_graph_active;
>  
>  static struct fgraph_ops *fgraph_array[FGRAPH_ARRAY_SIZE];
> +static unsigned long fgraph_array_bitmask;
>  
>  /* LRU index table for fgraph_array */
>  static int fgraph_lru_table[FGRAPH_ARRAY_SIZE];
> @@ -197,6 +198,8 @@ static int fgraph_lru_release_index(int idx)
>  
>   fgraph_lru_table[fgraph_lru_last] = idx;
>   fgraph_lru_last = (fgraph_lru_last + 1) % FGRAPH_ARRAY_SIZE;
> +
> + clear_bit(idx, _array_bitmask);
>   return 0;
>  }
>  
> @@ -211,6 +214,8 @@ static int fgraph_lru_alloc_index(void)
>  
>   fgraph_lru_table[fgraph_lru_next] = -1;
>   fgraph_lru_next = (fgraph_lru_next + 1) % FGRAPH_ARRAY_SIZE;
> +
> + set_bit(idx, _array_bitmask);
>   return idx;
>  }
>  
> @@ -632,7 +637,8 @@ int function_graph_enter(unsigned long ret, unsigned long 
> func,
>   if (offset < 0)
>   goto out;
>  
> - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) {
> + for_each_set_bit(i, _array_bitmask,
> +  sizeof(fgraph_array_bitmask) * BITS_PER_BYTE) {
>   struct fgraph_ops *gops = fgraph_array[i];
>   int save_curr_ret_stack;
>  
> -- 
> 2.43.0
> 
> 


-- 
Masami Hiramatsu (Google)

Re: [PATCH 19/20] function_graph: Use for_each_set_bit() in __ftrace_return_to_handler()

2024-05-26 Thread Google

On Fri, 24 May 2024 22:37:11 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> Instead of iterating through the entire fgraph_array[] and seeing if one
> of the bitmap bits are set to know to call the array's retfunc() function,
> use for_each_set_bit() on the bitmap itself. This will only iterate for
> the number of set bits.
> 
> Signed-off-by: Steven Rostedt (Google) 
> ---
>  kernel/trace/fgraph.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 4d503b3e45ad..5e8e13ffcfb6 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -827,11 +827,10 @@ static unsigned long __ftrace_return_to_handler(struct 
> fgraph_ret_regs *ret_regs
>  #endif
>  
>   bitmap = get_bitmap_bits(current, offset);
> - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) {
> +
> + for_each_set_bit(i, , sizeof(bitmap) * BITS_PER_BYTE) {
>   struct fgraph_ops *gops = fgraph_array[i];
>  
> - if (!(bitmap & BIT(i)))
> - continue;
>   if (gops == _stub)

Ah, nit: maybe this is unlikely()?

Thank you,


-- 
Masami Hiramatsu (Google)

Re: [PATCH 19/20] function_graph: Use for_each_set_bit() in __ftrace_return_to_handler()

2024-05-26 Thread Google

On Fri, 24 May 2024 22:37:11 -0400
Steven Rostedt  wrote:

> From: "Steven Rostedt (Google)" 
> 
> Instead of iterating through the entire fgraph_array[] and seeing if one
> of the bitmap bits are set to know to call the array's retfunc() function,
> use for_each_set_bit() on the bitmap itself. This will only iterate for
> the number of set bits.
> 

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) 

Thanks,

> Signed-off-by: Steven Rostedt (Google) 
> ---
>  kernel/trace/fgraph.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 4d503b3e45ad..5e8e13ffcfb6 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -827,11 +827,10 @@ static unsigned long __ftrace_return_to_handler(struct 
> fgraph_ret_regs *ret_regs
>  #endif
>  
>   bitmap = get_bitmap_bits(current, offset);
> - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) {
> +
> + for_each_set_bit(i, , sizeof(bitmap) * BITS_PER_BYTE) {
>   struct fgraph_ops *gops = fgraph_array[i];
>  
> - if (!(bitmap & BIT(i)))
> - continue;
>   if (gops == _stub)
>   continue;
>  
> -- 
> 2.43.0
> 
> 


-- 
Masami Hiramatsu (Google)

Re: [PATCH] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-26 Thread Google

On Sun, 26 May 2024 22:51:53 +0800
kernel test robot  wrote:

> Hi Tatsuya,
> 
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on linus/master]
> [also build test WARNING on rostedt-trace/for-next v6.9 next-20240523]
> [cannot apply to rostedt-trace/for-next-urgent]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:
> https://github.com/intel-lab-lkp/linux/commits/Tatsuya-S/ftrace-Fix-stack-trace-entry-generated-by-ftrace_pid_func/20240526-193149
> base:   linus/master
> patch link:
> https://lore.kernel.org/r/20240526112658.46740-1-tatsuya.s2862%40gmail.com
> patch subject: [PATCH] ftrace: Fix stack trace entry generated by 
> ftrace_pid_func()
> config: x86_64-buildonly-randconfig-002-20240526 
> (https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/config)
> compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 
> 617a15a9eac96088ae5e9134248d8236e34b91b1)
> reproduce (this is a W=1 build): 
> (https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version 
> of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot 
> | Closes: 
> https://lore.kernel.org/oe-kbuild-all/202405262232.l4xh8q6o-...@intel.com/
> 
> All warnings (new ones prefixed by >>):
> 
> >> kernel/trace/ftrace.c:102:6: warning: no previous prototype for function 
> >> 'ftrace_pids_enabled' [-Wmissing-prototypes]
>  102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
>  |  ^
>kernel/trace/ftrace.c:102:1: note: declare 'static' if the function is not 
> intended to be used outside of this translation unit
>  102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
>  | ^
>  | static 
>1 warning generated.

This is because the prototype in linux/ftrace.h is placed in the 
#ifdef CONFIG_DYNAMIC_FTRACE block. The prototype needs to be moved
outside of the block.

Thank you,

> 
> 
> vim +/ftrace_pids_enabled +102 kernel/trace/ftrace.c
> 
>101
>  > 102bool ftrace_pids_enabled(struct ftrace_ops *ops)
>103{
>104struct trace_array *tr;
>105
>106if (!(ops->flags & FTRACE_OPS_FL_PID) || !ops->private)
>107return false;
>108
>109tr = ops->private;
>110
>111return tr->function_pids != NULL || 
> tr->function_no_pids != NULL;
>112}
>113
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki


-- 
Masami Hiramatsu (Google)

Re: [PATCH] tracing/probes: fix error check in parse_btf_field()

2024-05-26 Thread Google

On Sun, 26 May 2024 14:27:56 +0200
Carlos López  wrote:

> 
> Hi,
> 
> On 26/5/24 12:17, Masami Hiramatsu (Google) wrote:
> > On Sat, 25 May 2024 20:21:32 +0200
> > Carlos López  wrote:
> > 
> >> btf_find_struct_member() might return NULL or an error via the
> >> ERR_PTR() macro. However, its caller in parse_btf_field() only checks
> >> for the NULL condition. Fix this by using IS_ERR() and returning the
> >> error up the stack.
> >>
> > 
> > Thanks for finding it!
> > I think this requires new error message for error_log file.
> > Can you add the log as
> > 
> > trace_probe_log_err(ctx->offset, BTF_ERROR);
> > 
> > And define BTF_ERROR in ERRORS@kernel/trace/trace_probe.h ?
> 
> Sounds good, but should we perhaps reuse BAD_BTF_TID?
> 
> ```
> C(BAD_BTF_TID,"Failed to get BTF type info."),\
> ```
> 
> `btf_find_struct_member()` fails if `type` is not a struct or if it runs
> OOM while allocating the anon stack, so it seems appropriate.

Good point, it sounds reasonable.

Thanks!

> 
> Best,
> Carlos
> 
> > Thank you,
> > 
> >> Fixes: c440adfbe3025 ("tracing/probes: Support BTF based data structure 
> >> field access")
> >> Signed-off-by: Carlos López 
> >> ---
> >>   kernel/trace/trace_probe.c | 2 ++
> >>   1 file changed, 2 insertions(+)
> >>
> >> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> >> index 5e263c141574..5417e9712157 100644
> >> --- a/kernel/trace/trace_probe.c
> >> +++ b/kernel/trace/trace_probe.c
> >> @@ -554,6 +554,8 @@ static int parse_btf_field(char *fieldname, const 
> >> struct btf_type *type,
> >>anon_offs = 0;
> >>field = btf_find_struct_member(ctx->btf, type, 
> >> fieldname,
> >>   _offs);
> >> +  if (IS_ERR(field))
> >> +  return PTR_ERR(field);
> >>if (!field) {
> >>trace_probe_log_err(ctx->offset, NO_BTF_FIELD);
> >>return -ENOENT;
> >> -- 
> >> 2.35.3
> >>
> > 
> > 
> 
> -- 
> Carlos López
> Security Engineer
> SUSE Software Solutions


-- 
Masami Hiramatsu (Google)

Re: [PATCH] ftrace: Fix stack trace entry generated by ftrace_pid_func()

2024-05-26 Thread kernel test robot

Hi Tatsuya,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on rostedt-trace/for-next v6.9 next-20240523]
[cannot apply to rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Tatsuya-S/ftrace-Fix-stack-trace-entry-generated-by-ftrace_pid_func/20240526-193149
base:   linus/master
patch link:
https://lore.kernel.org/r/20240526112658.46740-1-tatsuya.s2862%40gmail.com
patch subject: [PATCH] ftrace: Fix stack trace entry generated by 
ftrace_pid_func()
config: x86_64-buildonly-randconfig-002-20240526 
(https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/config)
compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 
617a15a9eac96088ae5e9134248d8236e34b91b1)
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240526/202405262232.l4xh8q6o-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202405262232.l4xh8q6o-...@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/trace/ftrace.c:102:6: warning: no previous prototype for function 
>> 'ftrace_pids_enabled' [-Wmissing-prototypes]
 102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
 |  ^
   kernel/trace/ftrace.c:102:1: note: declare 'static' if the function is not 
intended to be used outside of this translation unit
 102 | bool ftrace_pids_enabled(struct ftrace_ops *ops)
 | ^
 | static 
   1 warning generated.


vim +/ftrace_pids_enabled +102 kernel/trace/ftrace.c

   101  
 > 102  bool ftrace_pids_enabled(struct ftrace_ops *ops)
   103  {
   104  struct trace_array *tr;
   105  
   106  if (!(ops->flags & FTRACE_OPS_FL_PID) || !ops->private)
   107  return false;
   108  
   109  tr = ops->private;
   110  
   111  return tr->function_pids != NULL || tr->function_no_pids != 
NULL;
   112  }
   113  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Re: [PATCH] tracing/probes: fix error check in parse_btf_field()

2024-05-26 Thread Carlos López




Hi,

On 26/5/24 12:17, Masami Hiramatsu (Google) wrote:

On Sat, 25 May 2024 20:21:32 +0200
Carlos López  wrote:


btf_find_struct_member() might return NULL or an error via the
ERR_PTR() macro. However, its caller in parse_btf_field() only checks
for the NULL condition. Fix this by using IS_ERR() and returning the
error up the stack.



Thanks for finding it!
I think this requires new error message for error_log file.
Can you add the log as

trace_probe_log_err(ctx->offset, BTF_ERROR);

And define BTF_ERROR in ERRORS@kernel/trace/trace_probe.h ?


Sounds good, but should we perhaps reuse BAD_BTF_TID?

```
C(BAD_BTF_TID,  "Failed to get BTF type info."),\
```

`btf_find_struct_member()` fails if `type` is not a struct or if it runs
OOM while allocating the anon stack, so it seems appropriate.

Best,
Carlos


Thank you,


Fixes: c440adfbe3025 ("tracing/probes: Support BTF based data structure field 
access")
Signed-off-by: Carlos López 
---
  kernel/trace/trace_probe.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 5e263c141574..5417e9712157 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -554,6 +554,8 @@ static int parse_btf_field(char *fieldname, const struct 
btf_type *type,
anon_offs = 0;
field = btf_find_struct_member(ctx->btf, type, 
fieldname,
   _offs);
+   if (IS_ERR(field))
+   return PTR_ERR(field);
if (!field) {
trace_probe_log_err(ctx->offset, NO_BTF_FIELD);
return -ENOENT;
--
2.35.3






--
Carlos López
Security Engineer
SUSE Software Solutions

Re: [PATCH] tracing/probes: fix error check in parse_btf_field()

2024-05-26 Thread Google

On Sat, 25 May 2024 20:21:32 +0200
Carlos López  wrote:

> btf_find_struct_member() might return NULL or an error via the
> ERR_PTR() macro. However, its caller in parse_btf_field() only checks
> for the NULL condition. Fix this by using IS_ERR() and returning the
> error up the stack.
> 

Thanks for finding it!
I think this requires new error message for error_log file.
Can you add the log as

trace_probe_log_err(ctx->offset, BTF_ERROR);

And define BTF_ERROR in ERRORS@kernel/trace/trace_probe.h ?

Thank you,

> Fixes: c440adfbe3025 ("tracing/probes: Support BTF based data structure field 
> access")
> Signed-off-by: Carlos López 
> ---
>  kernel/trace/trace_probe.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 5e263c141574..5417e9712157 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -554,6 +554,8 @@ static int parse_btf_field(char *fieldname, const struct 
> btf_type *type,
>   anon_offs = 0;
>   field = btf_find_struct_member(ctx->btf, type, 
> fieldname,
>  _offs);
> + if (IS_ERR(field))
> + return PTR_ERR(field);
>   if (!field) {
>   trace_probe_log_err(ctx->offset, NO_BTF_FIELD);
>   return -ENOENT;
> -- 
> 2.35.3
> 


-- 
Masami Hiramatsu (Google)

Re: [RFC PATCH 00/20] Introduce the famfs shared-memory file system

2024-05-25 Thread Dave Chinner

On Fri, May 24, 2024 at 09:55:48AM +0200, Miklos Szeredi wrote:
> On Fri, 24 May 2024 at 02:47, John Groves  wrote:
> 
> > Apologies, but I'm short on time at the moment - going into a long holiday
> > weekend in the US with family plans. I should be focused again by middle of
> > next week.
> 
> NP.
> 
> Obviously I'll need to test it before anything is merged, other than
> that this is not urgent at all...
> 
> > But can you check /proc/cmdline to see of the memmap arg got through without
> > getting mangled? The '$' tends to get fubar'd. You might need \$, or I've 
> > seen
> > the need for \\\$. If it's un-mangled, there should be a dax device.
> 
> /proc/cmdline shows the option correctly:
> 
> root@kvm:~# cat /proc/cmdline
> root=/dev/vda console=hvc0 memmap=4G$4G
> 
> > If that doesn't work, it's worth trying '!' instead, which I think would 
> > give
> > you a pmem device - if the arg gets through (but ! is less likely to get
> > horked). That pmem device can be converted to devdax...
> 
> That doesn't work either.  No device created in /dev  (dax or pmem).

I think you need to do some ndctl magic to get the memory to be
namespaced correctly for the correct devices to appear.

https://docs.pmem.io/ndctl-user-guide/managing-namespaces

IIRC, need to set the type to pmem and the mode to fsdax, devdax or
raw to get the relevant device nodes to be created for the range..

Cheers,

Dave.

-- 
Dave Chinner
da...@fromorbit.com

Re: [PATCH RFC 1/2] dt-bindings: soc: qcom,smsm: Allow specifying mboxes instead of qcom,ipc

2024-05-25 Thread Krzysztof Kozlowski

On 24/05/2024 19:55, Luca Weiss wrote:
> On Donnerstag, 23. Mai 2024 08:19:11 MESZ Krzysztof Kozlowski wrote:
>> On 23/05/2024 08:16, Luca Weiss wrote:
>>> On Donnerstag, 23. Mai 2024 08:02:13 MESZ Krzysztof Kozlowski wrote:
 On 22/05/2024 19:34, Luca Weiss wrote:
> On Mittwoch, 22. Mai 2024 08:49:43 MESZ Krzysztof Kozlowski wrote:
>> On 21/05/2024 22:35, Luca Weiss wrote:
>>> On Dienstag, 21. Mai 2024 10:58:07 MESZ Krzysztof Kozlowski wrote:
 On 20/05/2024 17:11, Luca Weiss wrote:
> Hi Krzysztof
>
> Ack, sounds good.
>
> Maybe also from you, any opinion between these two binding styles?
>
> So first using index of mboxes for the numbering, where for the known
> usages the first element (and sometimes the 3rd - ipc-2) are empty <>.
>
> The second variant is using mbox-names to get the correct channel-mbox
> mapping.
>
> -   qcom,ipc-1 = < 8 13>;
> -   qcom,ipc-2 = < 8 9>;
> -   qcom,ipc-3 = < 8 19>;
> +   mboxes = <0>, < 13>, < 9>, < 19>;
>
> vs.
>
> -   qcom,ipc-1 = < 8 13>;
> -   qcom,ipc-2 = < 8 9>;
> -   qcom,ipc-3 = < 8 19>;
> +   mboxes = < 13>, < 9>, < 19>;
> +   mbox-names = "ipc-1", "ipc-2", "ipc-3";

 Sorry, don't get, ipc-1 is the first mailbox, so why would there be <0>
 in first case?
>>>
>>> Actually not, ipc-0 would be permissible by the driver, used for the 
>>> 0th host
>>>
>>> e.g. from:
>>>
>>> /* Iterate over all hosts to check whom wants a kick */
>>> for (host = 0; host < smsm->num_hosts; host++) {
>>> hostp = >hosts[host];
>>>
>>> Even though no mailbox is specified in any upstream dts for this 0th 
>>> host I
>>> didn't want the bindings to restrict that, that's why in the first 
>>> example
>>> there's an empty element (<0>) for the 0th smsm host
>>>
 Anyway, the question is if you need to know that some
 mailbox is missing. But then it is weird to name them "ipc-1" etc.
>>>
>>> In either case we'd just query the mbox (either by name or index) and 
>>> then
>>> see if it's there? Not quite sure I understand the sentence..
>>> Pretty sure either binding would work the same way.
>>
>> The question is: does the driver care only about having some mailboxes
>> or the driver cares about each specific mailbox? IOW, is skipping ipc-0
>> important for the driver?
>
> There's nothing special from driver side about any mailbox. Some SoCs have
> a mailbox for e.g. hosts 1&2&3, some have only 1&3, and apq8064 even has
> 1&2&3&4.
>
> And if the driver doesn't find a mailbox for a host, it just ignores it
> but then of course it can't 'ring' the mailbox for that host when 
> necessary.
>
> Not sure how much more I can add here, to be fair I barely understand what
> this driver is doing myself apart from the obvious.

 From what you said, it looks like it is enough to just list mailboxes,
 e.g. for ipc-1, ipc-2 and ipc-4 (so no ipc-0 and ipc-3):
>>>
>>> No, for sure we need also the possibility to list ipc-3.
>>
>> ? You can list it, what's the problem>
> 
> Maybe we're talking past each other...
> 
> You asked why this wouldn't work:
> 
>   e.g. for ipc-1, ipc-2 and ipc-4 (so no ipc-0 and ipc-3):
>   mboxes = < 13>, < 9>, < 19>;
> 
> How would we know that the 3rd mailbox ( 19) is for the 4th host
> (previous ipc-4)?
> 
> 1. If we use mboxes with indexes we'd need to have <0> values for
> "smsm hosts" where we don't have a mailbox for - this is at least
> for the 2nd smsm host (qcom,ipc-2) for a bunch of SoCs.
> 
> 2. If we use mboxes with mbox-names then we could skip that since we
> can directly specify which "smsm host" a given mailbox is for.
> 
> My only question really is whether 1. or 2. is a better idea.
> 
> Is this clearer now or still not?


So again, does the driver care about missing entry? If so, why?

Best regards,
Krzysztof

Re: [PATCH v10 00/36] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph

2024-05-25 Thread Google

On Fri, 24 May 2024 18:41:56 -0400
Steven Rostedt  wrote:

> On Tue,  7 May 2024 23:08:00 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > Steven Rostedt (VMware) (15):
> >   function_graph: Convert ret_stack to a series of longs
> >   fgraph: Use BUILD_BUG_ON() to make sure we have structures divisible 
> > by long
> >   function_graph: Add an array structure that will allow multiple 
> > callbacks
> >   function_graph: Allow multiple users to attach to function graph
> >   function_graph: Remove logic around ftrace_graph_entry and return
> >   ftrace/function_graph: Pass fgraph_ops to function graph callbacks
> >   ftrace: Allow function_graph tracer to be enabled in instances
> >   ftrace: Allow ftrace startup flags exist without dynamic ftrace
> >   function_graph: Have the instances use their own ftrace_ops for 
> > filtering
> >   function_graph: Add "task variables" per task for fgraph_ops
> >   function_graph: Move set_graph_function tests to shadow stack global 
> > var
> >   function_graph: Move graph depth stored data to shadow stack global 
> > var
> >   function_graph: Move graph notrace bit to shadow stack global var
> >   function_graph: Implement fgraph_reserve_data() and 
> > fgraph_retrieve_data()
> >   function_graph: Add selftest for passing local variables
> 
> Hi Masami,
> 
> While reviewing these patches, I realized there's several things I dislike
> about the patches I wrote. So I took these patches and started cleaning
> them up a little. Mostly renaming functions and adding comments.

Thanks for cleaning up the patches!!

> 
> As this is a major change to the function graph tracer, and I feel nervous
> about building something on top of this, how about I take over these
> patches and push them out for the next merge window. I'm hoping to get them
> into linux-next by v6.10-rc2 (I spent the day working on them, and it's
> mostly minor tweaks).

OK.

> Then I can push it out to 6.11 and get some good testing against it. Then
> we can add your stuff on top and get that merged in 6.12.

Yeah, it is reasonable plan. I also concerns about the stability. Especially,
this involves fprobe side changes too. If we introduce both at once, it may
mess up many things.

> 
> If all goes well, I'm hoping to get a series on just these patches (and
> your selftest addition) by tonight.
> 
> Thoughts?

I agree with you.

Thank you,

> 
> -- Steve


-- 
Masami Hiramatsu (Google)

Re: [PATCH v10 07/36] function_graph: Allow multiple users to attach to function graph

2024-05-25 Thread Google

On Fri, 24 May 2024 21:32:08 -0400
Steven Rostedt  wrote:

> On Tue,  7 May 2024 23:09:22 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > @@ -109,6 +244,21 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> > long func,
> > if (!current->ret_stack)
> > return -EBUSY;
> >  
> > +   /*
> > +* At first, check whether the previous fgraph callback is pushed by
> > +* the fgraph on the same function entry.
> > +* But if @func is the self tail-call function, we also need to ensure
> > +* the ret_stack is not for the previous call by checking whether the
> > +* bit of @fgraph_idx is set or not.
> > +*/
> > +   ret_stack = get_ret_stack(current, current->curr_ret_stack, );
> > +   if (ret_stack && ret_stack->func == func &&
> > +   get_fgraph_type(current, offset + FGRAPH_FRAME_OFFSET) == 
> > FGRAPH_TYPE_BITMAP &&
> > +   !is_fgraph_index_set(current, offset + FGRAPH_FRAME_OFFSET, 
> > fgraph_idx))
> > +   return offset + FGRAPH_FRAME_OFFSET;
> > +
> > +   val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> > +
> > BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));
> 
> I'm trying to figure out what the above is trying to do. This gets called
> once in function_graph_enter() (or function_graph_enter_ops()). What
> exactly are you trying to catch here?

Aah, good catch! This was originally for catching the self tail-call case with
multiple fgraph callback on the same function, but it was my misread.
In later patch ([12/36]), we introduced function_graph_enter_ops() so that
we can skip checking hash table and directly pass the fgraph_ops to user
callback. I thought this function_graph_enter_ops() is used even if multiple
fgraph is set on the same function. In this case, we always need to check the
stack can be reused(pushed by other fgraph_ops on the same function) or not.
But as we discussed, the function_graph_enter_ops() is used only when only
one fgraph is set on the function (if there are multiple fgraphs are set on
the same function, use function_graph_enter() ), we are sure that 
ftrace_push_return_trace() is called only once on hooking the function entry.
Thus we don't need to reuse it.

> 
> Is it from this email:
> 
>   
> https://lore.kernel.org/all/20231110105154.df937bf9f200a0c16806c...@kernel.org/
> 
> As that's the last version before you added the above code.
> 
> But you also noticed it may not be needed, but triggered a crash without it
> in v3:
> 
>   
> https://lore.kernel.org/all/20231205234511.3839128259dfec153ea7d...@kernel.org/
> 
> I removed this code in my version and it runs just fine. Perhaps there was
> another bug that this was hiding that you fixed in later versions?

No problem. I think we can remove this block safely.

Thank you,

> 
> -- Steve
> 

-- 
Masami Hiramatsu (Google)

Re: [PATCH] x86/paravirt: Disable virt spinlock when CONFIG_PARAVIRT_SPINLOCKS disabled

2024-05-25 Thread Chen Yu

On 2024-05-23 at 09:30:59 -0700, Dave Hansen wrote:
> On 5/16/24 06:02, Chen Yu wrote:
> > Performance drop is reported when running encode/decode workload and
> > BenchSEE cache sub-workload.
> > Bisect points to commit ce0a1b608bfc ("x86/paravirt: Silence unused
> > native_pv_lock_init() function warning"). When CONFIG_PARAVIRT_SPINLOCKS
> > is disabled the virt_spin_lock_key is set to true on bare-metal.
> > The qspinlock degenerates to test-and-set spinlock, which decrease the
> > performance on bare-metal.
> > 
> > Fix this by disabling virt_spin_lock_key if CONFIG_PARAVIRT_SPINLOCKS
> > is not set, or it is on bare-metal.
> 
> This is missing some background:
> 
> The kernel can change spinlock behavior when running as a guest.  But
> this guest-friendly behavior causes performance problems on bare metal.
> So there's a 'virt_spin_lock_key' static key to switch between the two
> modes.
> 
> The static key is always enabled by default (run in guest mode) and
> should be disabled for bare metal (and in some guests that want native
> behavior).
> 
> ... then describe the regression and the fix
>
Thanks Juergen for your review.

And thanks Dave for the write up, I'll refine the log according to your 
suggestion. 

> > diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> > index 5358d43886ad..ee51c0949ed8 100644
> > --- a/arch/x86/kernel/paravirt.c
> > +++ b/arch/x86/kernel/paravirt.c
> > @@ -55,7 +55,7 @@ DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
> >  
> >  void __init native_pv_lock_init(void)
> >  {
> > -   if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) &&
> > +   if (!IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) ||
> > !boot_cpu_has(X86_FEATURE_HYPERVISOR))
> > static_branch_disable(_spin_lock_key);
> >  }
> This gets used at a single site:
> 
> if (pv_enabled())
> goto pv_queue;
> 
> if (virt_spin_lock(lock))
> return;
> 
> which is logically:
> 
>   if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS))
>   goto ...; // don't look at virt_spin_lock_key
> 
>   if (virt_spin_lock_key)
>   return; // On virt, but non-paravirt.  Did Test-and-Set
>   // spinlock.
>

Thanks for the description in detail, my original change might break the
"X86_FEATURE_HYPERVISOR + NO_CONFIG_PARAVIRT_SPINLOCKS " case that, the guest
can not fall into test-and-set.
 
> So I _think_ Arnd was trying to optimize native_pv_lock_init() away when
> it's going to get skipped over anyway by the 'goto'.
> 
> But this took me at least 30 minutes of scratching my head and trying to
> untangle the whole thing.  It's all far too subtle for my taste, and all
> of that to save a few bytes of init text in a configuration that's
> probably not even used very often (PARAVIRT=y, but PARAVIRT_SPINLOCKS=n).
> 
> Let's just keep it simple.  How about the attached patch?

Yes, this one works, I'll refine it.

thanks,
Chenyu

Re: [PATCH v10 07/36] function_graph: Allow multiple users to attach to function graph

2024-05-24 Thread Steven Rostedt

On Tue,  7 May 2024 23:09:22 +0900
"Masami Hiramatsu (Google)"  wrote:

> @@ -109,6 +244,21 @@ ftrace_push_return_trace(unsigned long ret, unsigned 
> long func,
>   if (!current->ret_stack)
>   return -EBUSY;
>  
> + /*
> +  * At first, check whether the previous fgraph callback is pushed by
> +  * the fgraph on the same function entry.
> +  * But if @func is the self tail-call function, we also need to ensure
> +  * the ret_stack is not for the previous call by checking whether the
> +  * bit of @fgraph_idx is set or not.
> +  */
> + ret_stack = get_ret_stack(current, current->curr_ret_stack, );
> + if (ret_stack && ret_stack->func == func &&
> + get_fgraph_type(current, offset + FGRAPH_FRAME_OFFSET) == 
> FGRAPH_TYPE_BITMAP &&
> + !is_fgraph_index_set(current, offset + FGRAPH_FRAME_OFFSET, 
> fgraph_idx))
> + return offset + FGRAPH_FRAME_OFFSET;
> +
> + val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET;
> +
>   BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long));

I'm trying to figure out what the above is trying to do. This gets called
once in function_graph_enter() (or function_graph_enter_ops()). What
exactly are you trying to catch here?

Is it from this email:

  
https://lore.kernel.org/all/20231110105154.df937bf9f200a0c16806c...@kernel.org/

As that's the last version before you added the above code.

But you also noticed it may not be needed, but triggered a crash without it
in v3:

  
https://lore.kernel.org/all/20231205234511.3839128259dfec153ea7d...@kernel.org/

I removed this code in my version and it runs just fine. Perhaps there was
another bug that this was hiding that you fixed in later versions?

-- Steve

Re: [PATCH v10 00/36] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph

2024-05-24 Thread Steven Rostedt

On Tue,  7 May 2024 23:08:00 +0900
"Masami Hiramatsu (Google)"  wrote:

> Steven Rostedt (VMware) (15):
>   function_graph: Convert ret_stack to a series of longs
>   fgraph: Use BUILD_BUG_ON() to make sure we have structures divisible by 
> long
>   function_graph: Add an array structure that will allow multiple 
> callbacks
>   function_graph: Allow multiple users to attach to function graph
>   function_graph: Remove logic around ftrace_graph_entry and return
>   ftrace/function_graph: Pass fgraph_ops to function graph callbacks
>   ftrace: Allow function_graph tracer to be enabled in instances
>   ftrace: Allow ftrace startup flags exist without dynamic ftrace
>   function_graph: Have the instances use their own ftrace_ops for 
> filtering
>   function_graph: Add "task variables" per task for fgraph_ops
>   function_graph: Move set_graph_function tests to shadow stack global var
>   function_graph: Move graph depth stored data to shadow stack global var
>   function_graph: Move graph notrace bit to shadow stack global var
>   function_graph: Implement fgraph_reserve_data() and 
> fgraph_retrieve_data()
>   function_graph: Add selftest for passing local variables

Hi Masami,

While reviewing these patches, I realized there's several things I dislike
about the patches I wrote. So I took these patches and started cleaning
them up a little. Mostly renaming functions and adding comments.

As this is a major change to the function graph tracer, and I feel nervous
about building something on top of this, how about I take over these
patches and push them out for the next merge window. I'm hoping to get them
into linux-next by v6.10-rc2 (I spent the day working on them, and it's
mostly minor tweaks).

Then I can push it out to 6.11 and get some good testing against it. Then
we can add your stuff on top and get that merged in 6.12.

If all goes well, I'm hoping to get a series on just these patches (and
your selftest addition) by tonight.

Thoughts?

-- Steve

Re: [PATCH RFC 1/2] dt-bindings: soc: qcom,smsm: Allow specifying mboxes instead of qcom,ipc

2024-05-24 Thread Luca Weiss

On Donnerstag, 23. Mai 2024 08:19:11 MESZ Krzysztof Kozlowski wrote:
> On 23/05/2024 08:16, Luca Weiss wrote:
> > On Donnerstag, 23. Mai 2024 08:02:13 MESZ Krzysztof Kozlowski wrote:
> >> On 22/05/2024 19:34, Luca Weiss wrote:
> >>> On Mittwoch, 22. Mai 2024 08:49:43 MESZ Krzysztof Kozlowski wrote:
>  On 21/05/2024 22:35, Luca Weiss wrote:
> > On Dienstag, 21. Mai 2024 10:58:07 MESZ Krzysztof Kozlowski wrote:
> >> On 20/05/2024 17:11, Luca Weiss wrote:
> >>> Hi Krzysztof
> >>>
> >>> Ack, sounds good.
> >>>
> >>> Maybe also from you, any opinion between these two binding styles?
> >>>
> >>> So first using index of mboxes for the numbering, where for the known
> >>> usages the first element (and sometimes the 3rd - ipc-2) are empty <>.
> >>>
> >>> The second variant is using mbox-names to get the correct channel-mbox
> >>> mapping.
> >>>
> >>> -   qcom,ipc-1 = < 8 13>;
> >>> -   qcom,ipc-2 = < 8 9>;
> >>> -   qcom,ipc-3 = < 8 19>;
> >>> +   mboxes = <0>, < 13>, < 9>, < 19>;
> >>>
> >>> vs.
> >>>
> >>> -   qcom,ipc-1 = < 8 13>;
> >>> -   qcom,ipc-2 = < 8 9>;
> >>> -   qcom,ipc-3 = < 8 19>;
> >>> +   mboxes = < 13>, < 9>, < 19>;
> >>> +   mbox-names = "ipc-1", "ipc-2", "ipc-3";
> >>
> >> Sorry, don't get, ipc-1 is the first mailbox, so why would there be <0>
> >> in first case?
> >
> > Actually not, ipc-0 would be permissible by the driver, used for the 
> > 0th host
> >
> > e.g. from:
> >
> > /* Iterate over all hosts to check whom wants a kick */
> > for (host = 0; host < smsm->num_hosts; host++) {
> > hostp = >hosts[host];
> >
> > Even though no mailbox is specified in any upstream dts for this 0th 
> > host I
> > didn't want the bindings to restrict that, that's why in the first 
> > example
> > there's an empty element (<0>) for the 0th smsm host
> >
> >> Anyway, the question is if you need to know that some
> >> mailbox is missing. But then it is weird to name them "ipc-1" etc.
> >
> > In either case we'd just query the mbox (either by name or index) and 
> > then
> > see if it's there? Not quite sure I understand the sentence..
> > Pretty sure either binding would work the same way.
> 
>  The question is: does the driver care only about having some mailboxes
>  or the driver cares about each specific mailbox? IOW, is skipping ipc-0
>  important for the driver?
> >>>
> >>> There's nothing special from driver side about any mailbox. Some SoCs have
> >>> a mailbox for e.g. hosts 1&2&3, some have only 1&3, and apq8064 even has
> >>> 1&2&3&4.
> >>>
> >>> And if the driver doesn't find a mailbox for a host, it just ignores it
> >>> but then of course it can't 'ring' the mailbox for that host when 
> >>> necessary.
> >>>
> >>> Not sure how much more I can add here, to be fair I barely understand what
> >>> this driver is doing myself apart from the obvious.
> >>
> >> From what you said, it looks like it is enough to just list mailboxes,
> >> e.g. for ipc-1, ipc-2 and ipc-4 (so no ipc-0 and ipc-3):
> > 
> > No, for sure we need also the possibility to list ipc-3.
> 
> ? You can list it, what's the problem>

Maybe we're talking past each other...

You asked why this wouldn't work:

  e.g. for ipc-1, ipc-2 and ipc-4 (so no ipc-0 and ipc-3):
  mboxes = < 13>, < 9>, < 19>;

How would we know that the 3rd mailbox ( 19) is for the 4th host
(previous ipc-4)?

1. If we use mboxes with indexes we'd need to have <0> values for
"smsm hosts" where we don't have a mailbox for - this is at least
for the 2nd smsm host (qcom,ipc-2) for a bunch of SoCs.

2. If we use mboxes with mbox-names then we could skip that since we
can directly specify which "smsm host" a given mailbox is for.

My only question really is whether 1. or 2. is a better idea.

Is this clearer now or still not?


> 
> > 
> > And my point is that I'm not sure if any platform will ever need ipc-0, but
> > the code to use that if it ever exists is there - the driver always
> > tries getting an mbox (currently just syscon of course) for every host
> > from 0 to n.
> > 
> > These are the current (non-mbox-API) mboxes provided to smsm:
> > 
> > $ git grep qcom,ipc- arch/
> > arch/arm/boot/dts/qcom/qcom-apq8064.dtsi:   qcom,ipc-1 = < 
> > 8 4>;
> > arch/arm/boot/dts/qcom/qcom-apq8064.dtsi:   qcom,ipc-2 = < 
> > 8 14>;
> > arch/arm/boot/dts/qcom/qcom-apq8064.dtsi:   qcom,ipc-3 = < 
> > 8 23>;
> > arch/arm/boot/dts/qcom/qcom-apq8064.dtsi:   qcom,ipc-4 = 
> > <_sic_non_secure 0x4094 0>;
> > arch/arm/boot/dts/qcom/qcom-msm8974.dtsi:   qcom,ipc-1 = < 
> > 8 13>;
> > arch/arm/boot/dts/qcom/qcom-msm8974.dtsi:   qcom,ipc-2 = < 
> > 8 9>;
> >

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-24 Thread Steven Rostedt

On Fri, 24 May 2024 12:50:08 +0200
"Linux regression tracking (Thorsten Leemhuis)"  
wrote:

> > - Affected Versions: Before kernel version 6.8.10, the bug caused a
> > quick display of a kernel trace dump before the shutdown/reboot
> > completed. Starting from version 6.8.10 and continuing into version
> > 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> > preventing the shutdown or reboot from completing and leaving the
> > machine stuck.

Ah, I bet it was this commit: baa23a8d4360d ("tracefs: Reset permissions on
remount if permissions are options"), which added a "iput" callback to the
dentry without calling iput, leaving stale inodes around.

This is fixed with:

  0bcfd9aa4dafa ("tracefs: Clear EVENT_INODE flag in tracefs_drop_inode()")

Try adding just that patch. It will at least make it go back to what was
happening before 6.8.10 (I hope!).

-- Steve

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-24 Thread Steven Rostedt

On Fri, 24 May 2024 12:50:08 +0200
"Linux regression tracking (Thorsten Leemhuis)"  
wrote:

> [CCing a few people]
> 

Thanks for the Cc.

> On 24.05.24 12:31, Ilkka Naulapää wrote:
> > 
> > I have encountered a critical bug in the Linux vanilla kernel that
> > leads to a kernel panic during the shutdown or reboot process. The
> > issue arises after all services, including `journald`, have been
> > stopped. As a result, the machine fails to complete the shutdown or
> > reboot procedure, effectively causing the system to hang and not shut
> > down or reboot.  

To understand this, did you do anything with tracing? Before shutting down,
is there anything in /sys/kernel/tracing/instances directory?
Were any of the files/directories permissions in /sys/kernel/tracing changed?

> 
> Thx for the report. Not my area of expertise, so take this with a gain
> of salt. But given the versions your mention in your report and the
> screenshot that mentioned tracefs_free_inode I suspect this is caused by
> baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are
> options"). A few fixes for it will soon hit mainline and are meant to be
> backported to affected stable trees:
> 
> https://lore.kernel.org/all/20240523212406.254317...@goodmis.org/
> https://lore.kernel.org/all/20240523174419.1e588...@gandalf.local.home/
> 
> You might want to try them – or recheck once they hit the stable trees
> you are about. If they don't work, please report back.

There's been quite a bit of updates in this code, but this looks new to me.
I have more fixes that were just pulled by Linus today.

  https://git.kernel.org/torvalds/c/0eb03c7e8e2a4cc3653eb5eeb2d2001182071215

I'm not sure how relevant that is for this. But if you can reproduce it
with that commit, then this is a new bug.

-- Steve

Re: [PATCH v10 03/36] x86: tracing: Add ftrace_regs definition in the header

2024-05-24 Thread Steven Rostedt

On Fri, 24 May 2024 10:37:54 +0900
Masami Hiramatsu (Google)  wrote:
> > >  
> > >  #ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> > >  struct ftrace_regs {
> > > + /*
> > > +  * On the x86_64, the ftrace_regs saves;
> > > +  * rax, rcx, rdx, rdi, rsi, r8, r9, rbp, rip and rsp.
> > > +  * Also orig_ax is used for passing direct trampoline address.
> > > +  * x86_32 doesn't support ftrace_regs.  
> > 
> > Should add a comment that if fregs->regs.cs is set, then all of the pt_regs
> > is valid.  
> 
> But what about rbx and r1*? Only regs->cs should be care for pt_regs?
> Or, did you mean "the ftrace_regs is valid"?

Yeah, on x86_64 ftrace_regs uses regs.cs to denote if it is valid or not:

static __always_inline struct pt_regs *
arch_ftrace_get_regs(struct ftrace_regs *fregs)
{
/* Only when FL_SAVE_REGS is set, cs will be non zero */
if (!fregs->regs.cs)
return NULL;
return >regs;
}


-- Steve

Re: How to properly fix reading user pointers in bpf in android kernel 4.9?

2024-05-24 Thread Bagas Sanjaya

[also Cc: bpf maintainers and get_maintainer output]

On Thu, May 23, 2024 at 07:52:22PM +0300, Marcel wrote:
> This seems that it was a long standing problem with the Linux kernel in 
> general. bpf_probe_read should have worked for both kernel and user pointers 
> but it fails with access error when reading an user one instead. 
> 
> I know there's a patch upstream that fixes this by introducing new helpers 
> for reading kernel and userspace pointers and I tried to back port them back 
> to my kernel but with no success. Tools like bcc fail to use them and instead 
> they report that the arguments sent to the helpers are invalid. I assume this 
> is due to the arguments ARG_CONST_STACK_SIZE and ARG_PTR_TO_RAW_STACK handle 
> data different in the 4.9 android version and the upstream version but I'm 
> not sure that this is the cause. I left the patch I did below and with a link 
> to the kernel I'm working on and maybe someone can take a look and give me an 
> hand (the patch isn't applied yet)

What upstream patch? Has it already been in mainline?

> 
> 
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 744b4763b80e..de94c13b7193 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -559,6 +559,43 @@ enum bpf_func_id {
> */
> BPF_FUNC_probe_read_user,
>  
> +   /**
> +   * int bpf_probe_read_kernel(void *dst, int size, void *src)
> +   * Read a kernel pointer safely.
> +   * Return: 0 on success or negative error
> +   */
> +   BPF_FUNC_probe_read_kernel,
> +
> + /**
> +  * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
> +  * Copy a NUL terminated string from user unsafe address. In case 
> the string
> +  * length is smaller than size, the target is not padded with 
> further NUL
> +  * bytes. In case the string length is larger than size, just 
> count-1
> +  * bytes are copied and the last byte is set to NUL.
> +  * @dst: destination address
> +  * @size: maximum number of bytes to copy, including the trailing 
> NUL
> +  * @unsafe_ptr: unsafe address
> +  * Return:
> +  *   > 0 length of the string including the trailing NUL on success
> +  *   < 0 error
> +  */
> + BPF_FUNC_probe_read_user_str,
> +
> + /**
> +  * int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
> +  * Copy a NUL terminated string from unsafe address. In case the 
> string
> +  * length is smaller than size, the target is not padded with 
> further NUL
> +  * bytes. In case the string length is larger than size, just 
> count-1
> +  * bytes are copied and the last byte is set to NUL.
> +  * @dst: destination address
> +  * @size: maximum number of bytes to copy, including the trailing 
> NUL
> +  * @unsafe_ptr: unsafe address
> +  * Return:
> +  *   > 0 length of the string including the trailing NUL on success
> +  *   < 0 error
> +  */
> + BPF_FUNC_probe_read_kernel_str,
> +
>   __BPF_FUNC_MAX_ID,
>  };
>  
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index a1e37a5d8c88..3478ca744a45 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -94,7 +94,7 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
>   .arg3_type  = ARG_ANYTHING,
>  };
>  
> -BPF_CALL_3(bpf_probe_read_user, void *, dst, u32, size, const void *, 
> unsafe_ptr)
> +BPF_CALL_3(bpf_probe_read_user, void *, dst, u32, size, const void  __user 
> *, unsafe_ptr)
>  {
>   int ret;
>  
> @@ -115,6 +115,27 @@ static const struct bpf_func_proto 
> bpf_probe_read_user_proto = {
>  };
>  
>  
> +BPF_CALL_3(bpf_probe_read_kernel, void *, dst, u32, size, const void *, 
> unsafe_ptr)
> +{
> + int ret;
> +
> + ret = probe_kernel_read(dst, unsafe_ptr, size);
> + if (unlikely(ret < 0))
> + memset(dst, 0, size);
> +
> + return ret;
> +}
> +
> +static const struct bpf_func_proto bpf_probe_read_kernel_proto = {
> + .func   = bpf_probe_read_kernel,
> + .gpl_only   = true,
> + .ret_type   = RET_INTEGER,
> + .arg1_type  = ARG_PTR_TO_RAW_STACK,
> + .arg2_type  = ARG_CONST_STACK_SIZE,
> + .arg3_type  = ARG_ANYTHING,
> +};
> +
> +
>  BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src,
>  u32, size)
>  {
> @@ -487,6 +508,69 @@ static const struct bpf_func_proto 
> bpf_probe_read_str_proto = {
>   .arg3_type  = ARG_ANYTHING,
>  };
>  
> +
> +
> +BPF_CALL_3(bpf_probe_read_user_str, void *, dst, u32, size,
> +const void __user *, unsafe_ptr)
> +{
> + int ret;
> +
> + /*
> +  * The strncpy_from_unsafe() call will likely not fill the entire
> +  * buffer, but that's okay in this circumstance as we're probing
> +  *

Re: Bug in Kernel 6.8.x, 6.9.x Causing Trace/Panic During Shutdown/Reboot

2024-05-24 Thread Linux regression tracking (Thorsten Leemhuis)

[CCing a few people]

On 24.05.24 12:31, Ilkka Naulapää wrote:
> 
> I have encountered a critical bug in the Linux vanilla kernel that
> leads to a kernel panic during the shutdown or reboot process. The
> issue arises after all services, including `journald`, have been
> stopped. As a result, the machine fails to complete the shutdown or
> reboot procedure, effectively causing the system to hang and not shut
> down or reboot.

Thx for the report. Not my area of expertise, so take this with a gain
of salt. But given the versions your mention in your report and the
screenshot that mentioned tracefs_free_inode I suspect this is caused by
baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are
options"). A few fixes for it will soon hit mainline and are meant to be
backported to affected stable trees:

https://lore.kernel.org/all/20240523212406.254317...@goodmis.org/
https://lore.kernel.org/all/20240523174419.1e588...@gandalf.local.home/

You might want to try them – or recheck once they hit the stable trees
you are about. If they don't work, please report back.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

> Here are the details of the issue:
> 
> - Affected Versions: Before kernel version 6.8.10, the bug caused a
> quick display of a kernel trace dump before the shutdown/reboot
> completed. Starting from version 6.8.10 and continuing into version
> 6.9.0 and 6.9.1, this issue has escalated to a kernel panic,
> preventing the shutdown or reboot from completing and leaving the
> machine stuck.
> 
> - Symptoms:
>   - In normal shutdown/reboot scenarios, the kernel trace dump briefly
> appears as the last message on the screen.
>   - In rescue mode, the kernel panic message is displayed. Normally it
> is not shown.
> 
> Since `journald` is stopped before this issue occurs, no textual logs
> are available. However, I have captured two pictures illustrating
> these related issues, which I am attaching to this email for your
> reference. Also added my custom kernel config.
> 
> Thank you for your attention to this matter. Please let me know if any
> additional information is required to assist in diagnosing and
> resolving this bug.
> 
> Best regards,
> 
> Ilkka Naulapää

Re: [PATCH] livepatch: introduce klp_func called interface

2024-05-24 Thread zhang warden




> On May 23, 2024, at 22:22, Dan Carpenter  wrote:
> 
> Always run your patches through checkpatch.
> 
> So this patch is so that testers can see if a function has been called?
> Can you not get the same information from gcov or ftrace?
> 
> There are style issues with the patch, but it's not so important until
> the design is agreed on.
> 
> regards,
> dan carpenter

Hi, Dan.

This patch have format issues as Markus said. A newer version of this patch is 
sent which is checked by ./scripts/checkpatch.pl

Thanks for your suggestions.

Regards,
Wardenjohn

Re: [PATCH] livepatch: introduce klp_func called interface

2024-05-24 Thread zhang warden

> On May 21, 2024, at 16:04, Petr Mladek  wrote:
> 
> Another motivation to use ftrace for testing is that it does not
> affect the performance in production.
> 
> We should keep klp_ftrace_handler() as fast as possible so that we
> could livepatch also performance sensitive functions.
> 

How about using unlikely() for branch testing? If we use unlikely, maybe there 
is no negative effect to klp_ftrace_handler() once this function is called.

Regards,
Wardenjohn

Re: [RFC PATCH 00/20] Introduce the famfs shared-memory file system

2024-05-24 Thread Miklos Szeredi

On Fri, 24 May 2024 at 02:47, John Groves  wrote:

> Apologies, but I'm short on time at the moment - going into a long holiday
> weekend in the US with family plans. I should be focused again by middle of
> next week.

NP.

Obviously I'll need to test it before anything is merged, other than
that this is not urgent at all...

> But can you check /proc/cmdline to see of the memmap arg got through without
> getting mangled? The '$' tends to get fubar'd. You might need \$, or I've seen
> the need for \\\$. If it's un-mangled, there should be a dax device.

/proc/cmdline shows the option correctly:

root@kvm:~# cat /proc/cmdline
root=/dev/vda console=hvc0 memmap=4G$4G

> If that doesn't work, it's worth trying '!' instead, which I think would give
> you a pmem device - if the arg gets through (but ! is less likely to get
> horked). That pmem device can be converted to devdax...

That doesn't work either.  No device created in /dev  (dax or pmem).

free(1) does show that the reserved memory is gone in both cases, so
something does happen.

Attaching my .config as well.

Thanks,
Miklos

.config
Description: Binary data

Re: [PATCH v10 03/36] x86: tracing: Add ftrace_regs definition in the header

2024-05-23 Thread Google

On Thu, 23 May 2024 19:14:59 -0400
Steven Rostedt  wrote:

> On Tue,  7 May 2024 23:08:35 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > From: Masami Hiramatsu (Google) 
> > 
> > Add ftrace_regs definition for x86_64 in the ftrace header to
> > clarify what register will be accessible from ftrace_regs.
> > 
> > Signed-off-by: Masami Hiramatsu (Google) 
> > ---
> >  Changes in v3:
> >   - Add rip to be saved.
> >  Changes in v2:
> >   - Newly added.
> > ---
> >  arch/x86/include/asm/ftrace.h |6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> > index cf88cc8cc74d..c88bf47f46da 100644
> > --- a/arch/x86/include/asm/ftrace.h
> > +++ b/arch/x86/include/asm/ftrace.h
> > @@ -36,6 +36,12 @@ static inline unsigned long ftrace_call_adjust(unsigned 
> > long addr)
> >  
> >  #ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> >  struct ftrace_regs {
> > +   /*
> > +* On the x86_64, the ftrace_regs saves;
> > +* rax, rcx, rdx, rdi, rsi, r8, r9, rbp, rip and rsp.
> > +* Also orig_ax is used for passing direct trampoline address.
> > +* x86_32 doesn't support ftrace_regs.
> 
> Should add a comment that if fregs->regs.cs is set, then all of the pt_regs
> is valid.

But what about rbx and r1*? Only regs->cs should be care for pt_regs?
Or, did you mean "the ftrace_regs is valid"?

> And x86_32 does support ftrace_regs, it just doesn't support
> having a subset of it.

Oh, thanks. I'll update the comment about x86_32.

Thank you,

> 
> -- Steve
> 
> 
> > +*/
> > struct pt_regs  regs;
> >  };
> >  
> 
> 


-- 
Masami Hiramatsu (Google)

Re: [PATCH v10 01/36] tracing: Add a comment about ftrace_regs definition

2024-05-23 Thread Google

On Thu, 23 May 2024 19:10:31 -0400
Steven Rostedt  wrote:

> On Tue,  7 May 2024 23:08:12 +0900
> "Masami Hiramatsu (Google)"  wrote:
> 
> > From: Masami Hiramatsu (Google) 
> > 
> > To clarify what will be expected on ftrace_regs, add a comment to the
> > architecture independent definition of the ftrace_regs.
> > 
> > Signed-off-by: Masami Hiramatsu (Google) 
> > Acked-by: Mark Rutland 
> > ---
> >  Changes in v8:
> >   - Update that the saved registers depends on the context.
> >  Changes in v3:
> >   - Add instruction pointer
> >  Changes in v2:
> >   - newly added.
> > ---
> >  include/linux/ftrace.h |   26 ++
> >  1 file changed, 26 insertions(+)
> > 
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index 54d53f345d14..b81f1afa82a1 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -118,6 +118,32 @@ extern int ftrace_enabled;
> >  
> >  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> >  
> > +/**
> > + * ftrace_regs - ftrace partial/optimal register set
> > + *
> > + * ftrace_regs represents a group of registers which is used at the
> > + * function entry and exit. There are three types of registers.
> > + *
> > + * - Registers for passing the parameters to callee, including the stack
> > + *   pointer. (e.g. rcx, rdx, rdi, rsi, r8, r9 and rsp on x86_64)
> > + * - Registers for passing the return values to caller.
> > + *   (e.g. rax and rdx on x86_64)
> > + * - Registers for hooking the function call and return including the
> > + *   frame pointer (the frame pointer is architecture/config dependent)
> > + *   (e.g. rip, rbp and rsp for x86_64)
> > + *
> > + * Also, architecture dependent fields can be used for internal process.
> > + * (e.g. orig_ax on x86_64)
> > + *
> > + * On the function entry, those registers will be restored except for
> > + * the stack pointer, so that user can change the function parameters
> > + * and instruction pointer (e.g. live patching.)
> > + * On the function exit, only registers which is used for return values
> > + * are restored.
> 
> I wonder if we should also add a note about some architectures in some
> circumstances may store all pt_regs in ftrace_regs. For example, if an
> architecture supports FTRACE_WITH_REGS, it may pass the pt_regs within the
> ftrace_regs. If that is the case, then ftrace_get_regs() called on it will
> return a pointer to a valid pt_regs, or NULL if it is not supported or the
> ftrace_regs does not have a all the registers.

Agreed. That case also should be noted. Thanks for pointing!


> 
> -- Steve
> 
> 
> > + *
> > + * NOTE: user *must not* access regs directly, only do it via APIs, because
> > + * the member can be changed according to the architecture.
> > + */
> >  struct ftrace_regs {
> > struct pt_regs  regs;
> >  };
> 


-- 
Masami Hiramatsu (Google)

Re: [RFC PATCH 00/20] Introduce the famfs shared-memory file system

2024-05-23 Thread John Groves

On 24/05/23 03:57PM, Miklos Szeredi wrote:
> [trimming CC list]
> 
> On Thu, 23 May 2024 at 04:49, John Groves  wrote:
> 
> > - memmap=! will reserve a pretend pmem device at 
> > 
> > - memmap=$ will reserve a pretend dax device at 
> > 
> 
> Doesn't get me a /dev/dax or /dev/pmem
> 
> Complete qemu command line:
> 
> qemu-kvm -s -serial none -parallel none -kernel
> /home/mszeredi/git/linux/arch/x86/boot/bzImage -drive
> format=raw,file=/home/mszeredi/root_fs,index=0,if=virtio -drive
> format=raw,file=/home/mszeredi/images/ubd1,index=1,if=virtio -chardev
> stdio,id=virtiocon0,signal=off -device virtio-serial -device
> virtconsole,chardev=virtiocon0 -cpu host -m 8G -net user -net
> nic,model=virtio -fsdev local,security_model=none,id=fsdev0,path=/home
> -device virtio-9p-pci,fsdev=fsdev0,mount_tag=hostshare -device
> virtio-rng-pci -smp 4 -append 'root=/dev/vda console=hvc0
> memmap=4G$4G'
> 
> root@kvm:~/famfs# scripts/chk_efi.sh
> This system is neither Ubuntu nor Fedora. It is identified as debian.
> /sys/firmware/efi not found; probably not efi
>  not found; probably nof efi
> /boot/efi/EFI not found; probably not efi
> /boot/efi/EFI/BOOT not found; probably not efi
> /boot/efi/EFI/ not found; probably not efi
> /boot/efi/EFI//grub.cfg not found; probably nof efi
> Probably not efi; errs=6
> 
> Thanks,
> Miklos

Apologies, but I'm short on time at the moment - going into a long holiday
weekend in the US with family plans. I should be focused again by middle of
next week.

But can you check /proc/cmdline to see of the memmap arg got through without
getting mangled? The '$' tends to get fubar'd. You might need \$, or I've seen
the need for \\\$. If it's un-mangled, there should be a dax device.

If that doesn't work, it's worth trying '!' instead, which I think would give
you a pmem device - if the arg gets through (but ! is less likely to get
horked). That pmem device can be converted to devdax...

Regards,
John

Re: [PATCH] uprobes: prevent mutex_lock() under rcu_read_lock()

2024-05-23 Thread Google

On Mon, 20 May 2024 22:30:17 -0700
Andrii Nakryiko  wrote:

> Recent changes made uprobe_cpu_buffer preparation lazy, and moved it
> deeper into __uprobe_trace_func(). This is problematic because
> __uprobe_trace_func() is called inside rcu_read_lock()/rcu_read_unlock()
> block, which then calls prepare_uprobe_buffer() -> uprobe_buffer_get() ->
> mutex_lock(>mutex), leading to a splat about using mutex under
> non-sleepable RCU:
> 
>   BUG: sleeping function called from invalid context at 
> kernel/locking/mutex.c:585
>in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 98231, name: 
> stress-ng-sigq
>preempt_count: 0, expected: 0
>RCU nest depth: 1, expected: 0
>...
>Call Trace:
> 
> dump_stack_lvl+0x3d/0xe0
> __might_resched+0x24c/0x270
> ? prepare_uprobe_buffer+0xd5/0x1d0
> __mutex_lock+0x41/0x820
> ? ___perf_sw_event+0x206/0x290
> ? __perf_event_task_sched_in+0x54/0x660
> ? __perf_event_task_sched_in+0x54/0x660
> prepare_uprobe_buffer+0xd5/0x1d0
> __uprobe_trace_func+0x4a/0x140
> uprobe_dispatcher+0x135/0x280
> ? uprobe_dispatcher+0x94/0x280
> uprobe_notify_resume+0x650/0xec0
> ? atomic_notifier_call_chain+0x21/0x110
> ? atomic_notifier_call_chain+0xf8/0x110
> irqentry_exit_to_user_mode+0xe2/0x1e0
> asm_exc_int3+0x35/0x40
>RIP: 0033:0x7f7e1d4da390
>Code: 33 04 00 0f 1f 80 00 00 00 00 f3 0f 1e fa b9 01 00 00 00 e9 b2 fc ff 
> ff 66 90 f3 0f 1e fa 31 c9 e9 a5 fc ff ff 0f 1f 44 00 00  0f 1e fa b8 27 
> 00 00 00 0f 05 c3 0f 1f 40 00 f3 0f 1e fa b8 6e
>RSP: 002b:7ffd2abc3608 EFLAGS: 0246
>RAX:  RBX: 76d325f1 RCX: 
>RDX: 76d325f1 RSI: 000a RDI: 7ffd2abc3690
>RBP: 000a R08: 00017fb7 R09: 00017fb7
>R10: 00017fb7 R11: 0246 R12: 00017ff2
>R13: 7ffd2abc3610 R14:  R15: 7ffd2abc3780
> 
> 
> Luckily, it's easy to fix by moving prepare_uprobe_buffer() to be called
> slightly earlier: into uprobe_trace_func() and uretprobe_trace_func(), outside
> of RCU locked section. This still keeps this buffer preparation lazy and helps
> avoid the overhead when it's not needed. E.g., if there is only BPF uprobe
> handler installed on a given uprobe, buffer won't be initialized.
> 
> Note, the other user of prepare_uprobe_buffer(), __uprobe_perf_func(), is not
> affected, as it doesn't prepare buffer under RCU read lock.
> 

Oops, good catch! This looks good to me. Let me pick it.
Let me add a simple uprobe test in ftracetest so that this error can
detect in selftests. (I could reproduced it.)

Thank you,

> Fixes: 1b8f85defbc8 ("uprobes: prepare uprobe args buffer lazily")
> Reported-by: Breno Leitao 
> Signed-off-by: Andrii Nakryiko 
> ---
>  kernel/trace/trace_uprobe.c | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index 8541fa1494ae..c98e3b3386ba 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -970,19 +970,17 @@ static struct uprobe_cpu_buffer 
> *prepare_uprobe_buffer(struct trace_uprobe *tu,
>  
>  static void __uprobe_trace_func(struct trace_uprobe *tu,
>   unsigned long func, struct pt_regs *regs,
> - struct uprobe_cpu_buffer **ucbp,
> + struct uprobe_cpu_buffer *ucb,
>   struct trace_event_file *trace_file)
>  {
>   struct uprobe_trace_entry_head *entry;
>   struct trace_event_buffer fbuffer;
> - struct uprobe_cpu_buffer *ucb;
>   void *data;
>   int size, esize;
>   struct trace_event_call *call = trace_probe_event_call(>tp);
>  
>   WARN_ON(call != trace_file->event_call);
>  
> - ucb = prepare_uprobe_buffer(tu, regs, ucbp);
>   if (WARN_ON_ONCE(ucb->dsize > PAGE_SIZE))
>   return;
>  
> @@ -1014,13 +1012,16 @@ static int uprobe_trace_func(struct trace_uprobe *tu, 
> struct pt_regs *regs,
>struct uprobe_cpu_buffer **ucbp)
>  {
>   struct event_file_link *link;
> + struct uprobe_cpu_buffer *ucb;
>  
>   if (is_ret_probe(tu))
>   return 0;
>  
> + ucb = prepare_uprobe_buffer(tu, regs, ucbp);
> +
>   rcu_read_lock();
>   trace_probe_for_each_link_rcu(link, >tp)
> - __uprobe_trace_func(tu, 0, regs, ucbp, link->file);
> + __uprobe_trace_func(tu, 0, regs, ucb, link->file);
>   rcu_read_unlock();
>  
>   return 0;
> @@ -1031,10 +1032,13 @@ static void uretprobe_trace_func(struct trace_uprobe 
> *tu, unsigned long func,
>struct uprobe_cpu_buffer **ucbp)
>  {
>   struct event_file_link *link;
> + struct uprobe_cpu_buffer *ucb;
> +
> + ucb = prepare_uprobe_buffer(tu, regs, ucbp);
>  
>   rcu_read_lock();
>

Re: [PATCH v10 03/36] x86: tracing: Add ftrace_regs definition in the header

2024-05-23 Thread Steven Rostedt

On Tue,  7 May 2024 23:08:35 +0900
"Masami Hiramatsu (Google)"  wrote:

> From: Masami Hiramatsu (Google) 
> 
> Add ftrace_regs definition for x86_64 in the ftrace header to
> clarify what register will be accessible from ftrace_regs.
> 
> Signed-off-by: Masami Hiramatsu (Google) 
> ---
>  Changes in v3:
>   - Add rip to be saved.
>  Changes in v2:
>   - Newly added.
> ---
>  arch/x86/include/asm/ftrace.h |6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> index cf88cc8cc74d..c88bf47f46da 100644
> --- a/arch/x86/include/asm/ftrace.h
> +++ b/arch/x86/include/asm/ftrace.h
> @@ -36,6 +36,12 @@ static inline unsigned long ftrace_call_adjust(unsigned 
> long addr)
>  
>  #ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
>  struct ftrace_regs {
> + /*
> +  * On the x86_64, the ftrace_regs saves;
> +  * rax, rcx, rdx, rdi, rsi, r8, r9, rbp, rip and rsp.
> +  * Also orig_ax is used for passing direct trampoline address.
> +  * x86_32 doesn't support ftrace_regs.

Should add a comment that if fregs->regs.cs is set, then all of the pt_regs
is valid. And x86_32 does support ftrace_regs, it just doesn't support
having a subset of it.

-- Steve


> +  */
>   struct pt_regs  regs;
>  };
>

Re: [PATCH v10 01/36] tracing: Add a comment about ftrace_regs definition

2024-05-23 Thread Steven Rostedt

On Tue,  7 May 2024 23:08:12 +0900
"Masami Hiramatsu (Google)"  wrote:

> From: Masami Hiramatsu (Google) 
> 
> To clarify what will be expected on ftrace_regs, add a comment to the
> architecture independent definition of the ftrace_regs.
> 
> Signed-off-by: Masami Hiramatsu (Google) 
> Acked-by: Mark Rutland 
> ---
>  Changes in v8:
>   - Update that the saved registers depends on the context.
>  Changes in v3:
>   - Add instruction pointer
>  Changes in v2:
>   - newly added.
> ---
>  include/linux/ftrace.h |   26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 54d53f345d14..b81f1afa82a1 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -118,6 +118,32 @@ extern int ftrace_enabled;
>  
>  #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
>  
> +/**
> + * ftrace_regs - ftrace partial/optimal register set
> + *
> + * ftrace_regs represents a group of registers which is used at the
> + * function entry and exit. There are three types of registers.
> + *
> + * - Registers for passing the parameters to callee, including the stack
> + *   pointer. (e.g. rcx, rdx, rdi, rsi, r8, r9 and rsp on x86_64)
> + * - Registers for passing the return values to caller.
> + *   (e.g. rax and rdx on x86_64)
> + * - Registers for hooking the function call and return including the
> + *   frame pointer (the frame pointer is architecture/config dependent)
> + *   (e.g. rip, rbp and rsp for x86_64)
> + *
> + * Also, architecture dependent fields can be used for internal process.
> + * (e.g. orig_ax on x86_64)
> + *
> + * On the function entry, those registers will be restored except for
> + * the stack pointer, so that user can change the function parameters
> + * and instruction pointer (e.g. live patching.)
> + * On the function exit, only registers which is used for return values
> + * are restored.

I wonder if we should also add a note about some architectures in some
circumstances may store all pt_regs in ftrace_regs. For example, if an
architecture supports FTRACE_WITH_REGS, it may pass the pt_regs within the
ftrace_regs. If that is the case, then ftrace_get_regs() called on it will
return a pointer to a valid pt_regs, or NULL if it is not supported or the
ftrace_regs does not have a all the registers.

-- Steve


> + *
> + * NOTE: user *must not* access regs directly, only do it via APIs, because
> + * the member can be changed according to the architecture.
> + */
>  struct ftrace_regs {
>   struct pt_regs  regs;
>  };

Re: [PATCH v2 1/1] x86/vector: Fix vector leak during CPU offline

2024-05-23 Thread Thomas Gleixner

On Wed, May 22 2024 at 15:02, Dongli Zhang wrote:
> The absence of IRQD_MOVE_PCNTXT prevents immediate effectiveness of
> interrupt affinity reconfiguration via procfs. Instead, the change is
> deferred until the next instance of the interrupt being triggered on the
> original CPU.
>
> When the interrupt next triggers on the original CPU, the new affinity is
> enforced within __irq_move_irq(). A vector is allocated from the new CPU,
> but if the old vector on the original CPU remains online, it is not
> immediately reclaimed. Instead, apicd->move_in_progress is flagged, and the
> reclaiming process is delayed until the next trigger of the interrupt on
> the new CPU.
>
> Upon the subsequent triggering of the interrupt on the new CPU,
> irq_complete_move() adds a task to the old CPU's vector_cleanup list if it
> remains online. Subsequently, the timer on the old CPU iterates over its
> vector_cleanup list, reclaiming old vectors.
>
> However, a rare scenario arises if the old CPU is outgoing before the
> interrupt triggers again on the new CPU. The irq_force_complete_move() may
> not have the chance to be invoked on the outgoing CPU to reclaim the old
> apicd->prev_vector. This is because the interrupt isn't currently affine to
> the outgoing CPU, and irq_needs_fixup() returns false. Even though
> __vector_schedule_cleanup() is later called on the new CPU, it doesn't
> reclaim apicd->prev_vector; instead, it simply resets both
> apicd->move_in_progress and apicd->prev_vector to 0.
>
> As a result, the vector remains unreclaimed in vector_matrix, leading to a
> CPU vector leak.
>
> To address this issue, move the invocation of irq_force_complete_move()
> before the irq_needs_fixup() call to reclaim apicd->prev_vector, if the
> interrupt is currently or used to affine to the outgoing CPU. Additionally,
> reclaim the vector in __vector_schedule_cleanup() as well, following a
> warning message, although theoretically it should never see
> apicd->move_in_progress with apicd->prev_cpu pointing to an offline CPU.

Nice change log!

Re: [GIT PULL v2] virtio: features, fixes, cleanups

2024-05-23 Thread pr-tracker-bot

The pull request you sent on Thu, 23 May 2024 02:00:17 -0400:

> https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/2ef32ad2241340565c35baf77fc95053c84eeeb0

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

Re: [PATCH] x86/paravirt: Disable virt spinlock when CONFIG_PARAVIRT_SPINLOCKS disabled

2024-05-23 Thread Dave Hansen

On 5/23/24 11:39, Jürgen Groß wrote:
>>
>> Let's just keep it simple.  How about the attached patch?
> 
> Simple indeed. The attachment is empty. 

Let's try this again.diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 5358d43886ad..c193c9e60a1b 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -55,8 +55,7 @@ DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
 
 void __init native_pv_lock_init(void)
 {
-	if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) &&
-	!boot_cpu_has(X86_FEATURE_HYPERVISOR))
+	if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
 		static_branch_disable(_spin_lock_key);
 }

Re: [PATCH] x86/paravirt: Disable virt spinlock when CONFIG_PARAVIRT_SPINLOCKS disabled

2024-05-23 Thread Jürgen Groß


On 23.05.24 18:30, Dave Hansen wrote:

On 5/16/24 06:02, Chen Yu wrote:

Performance drop is reported when running encode/decode workload and
BenchSEE cache sub-workload.
Bisect points to commit ce0a1b608bfc ("x86/paravirt: Silence unused
native_pv_lock_init() function warning"). When CONFIG_PARAVIRT_SPINLOCKS
is disabled the virt_spin_lock_key is set to true on bare-metal.
The qspinlock degenerates to test-and-set spinlock, which decrease the
performance on bare-metal.

Fix this by disabling virt_spin_lock_key if CONFIG_PARAVIRT_SPINLOCKS
is not set, or it is on bare-metal.


This is missing some background:

The kernel can change spinlock behavior when running as a guest.  But
this guest-friendly behavior causes performance problems on bare metal.
So there's a 'virt_spin_lock_key' static key to switch between the two
modes.

The static key is always enabled by default (run in guest mode) and
should be disabled for bare metal (and in some guests that want native
behavior).

... then describe the regression and the fix


diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 5358d43886ad..ee51c0949ed8 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -55,7 +55,7 @@ DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
  
  void __init native_pv_lock_init(void)

  {
-   if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) &&
+   if (!IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) ||
!boot_cpu_has(X86_FEATURE_HYPERVISOR))
static_branch_disable(_spin_lock_key);
  }

This gets used at a single site:

 if (pv_enabled())
 goto pv_queue;

 if (virt_spin_lock(lock))
 return;

which is logically:

if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS))
goto ...; // don't look at virt_spin_lock_key

if (virt_spin_lock_key)
return; // On virt, but non-paravirt.  Did Test-and-Set
// spinlock.

So I _think_ Arnd was trying to optimize native_pv_lock_init() away when
it's going to get skipped over anyway by the 'goto'.

But this took me at least 30 minutes of scratching my head and trying to
untangle the whole thing.  It's all far too subtle for my taste, and all
of that to save a few bytes of init text in a configuration that's
probably not even used very often (PARAVIRT=y, but PARAVIRT_SPINLOCKS=n).

Let's just keep it simple.  How about the attached patch?


Simple indeed. The attachment is empty. :-p


Juergen

Re: [PATCH] riscv: Fix early ftrace nop patching

2024-05-23 Thread patchwork-bot+linux-riscv

Hello:

This patch was applied to riscv/linux.git (for-next)
by Palmer Dabbelt :

On Thu, 23 May 2024 13:51:34 +0200 you wrote:
> Commit c97bf629963e ("riscv: Fix text patching when IPI are used")
> converted ftrace_make_nop() to use patch_insn_write() which does not
> emit any icache flush relying entirely on __ftrace_modify_code() to do
> that.
> 
> But we missed that ftrace_make_nop() was called very early directly when
> converting mcount calls into nops (actually on riscv it converts 2B nops
> emitted by the compiler into 4B nops).
> 
> [...]

Here is the summary with links:
  - riscv: Fix early ftrace nop patching
https://git.kernel.org/riscv/c/6ca445d8af0e

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Re: [PATCH v2 1/2] drivers: remoteproc: xlnx: add attach detach support

2024-05-23 Thread Tanmay Shah

t device_node *np;
>> >>   int tcm_bank_count;
>> >>   struct mem_bank_data **tcm_banks;
>> >>   struct rproc *rproc;
>> >> + u32 rsc_tbl_size;
>> >>   u32 pm_domain_id;
>> >>   struct mbox_info *ipi;
>> >>  };
>> >> @@ -621,10 +638,19 @@ static int zynqmp_r5_rproc_prepare(struct rproc 
>> >> *rproc)
>> >>  {
>> >>   int ret;
>> >>  
>> >> - ret = add_tcm_banks(rproc);
>> >> - if (ret) {
>> >> - dev_err(>dev, "failed to get TCM banks, err %d\n", ret);
>> >> - return ret;
>> >> + /**
>> > 
>> > Using "/**" is for comments that will endup in the documentation, which I 
>> > don't
>> > think is needed here.  Please correct throughout the patch.
>> 
>> Thanks. Ack, I will use only /* format.
>> 
>> > 
>> >> +  * For attach/detach use case, Firmware is already loaded so
>> >> +  * TCM isn't really needed at all. Also, for security TCM can be
>> >> +  * locked in such case and linux may not have access at all.
>> >> +  * So avoid adding TCM banks. TCM power-domains requested during attach
>> >> +  * callback.
>> >> +  */
>> >> + if (rproc->state != RPROC_DETACHED) {
>> >> + ret = add_tcm_banks(rproc);
>> >> + if (ret) {
>> >> + dev_err(>dev, "failed to get TCM banks, err 
>> >> %d\n", ret);
>> >> + return ret;
>> >> + }
>> >>   }
>> >>  
>> >>   ret = add_mem_regions_carveout(rproc);
>> >> @@ -662,6 +688,123 @@ static int zynqmp_r5_rproc_unprepare(struct rproc 
>> >> *rproc)
>> >>   return 0;
>> >>  }
>> >>  
>> >> +static struct resource_table *zynqmp_r5_get_loaded_rsc_table(struct 
>> >> rproc *rproc,
>> >> +  size_t *size)
>> >> +{
>> >> + struct zynqmp_r5_core *r5_core;
>> >> +
>> >> + r5_core = rproc->priv;
>> >> +
>> >> + *size = r5_core->rsc_tbl_size;
>> >> +
>> >> + return r5_core->rsc_tbl_va;
>> >> +}
>> >> +
>> >> +static int zynqmp_r5_get_rsc_table_va(struct zynqmp_r5_core *r5_core)
>> >> +{
>> >> + struct device *dev = r5_core->dev;
>> >> + struct rsc_tbl_data *rsc_data_va;
>> >> + struct resource_table *rsc_addr;
>> >> + struct resource res_mem;
>> >> + struct device_node *np;
>> >> + int ret;
>> >> +
>> >> + /**
>> >> +  * It is expected from remote processor firmware to provide resource
>> >> +  * table address via struct rsc_tbl_data data structure.
>> >> +  * Start address of first entry under "memory-region" property list
>> >> +  * contains that data structure which holds resource table address, size
>> >> +  * and some magic number to validate correct resource table entry.
>> >> +  */
>> >> + np = of_parse_phandle(r5_core->np, "memory-region", 0);
>> >> + if (!np) {
>> >> + dev_err(dev, "failed to get memory region dev node\n");
>> >> + return -EINVAL;
>> >> + }
>> >> +
>> >> + ret = of_address_to_resource(np, 0, _mem);
>> >> + if (ret) {
>> >> + dev_err(dev, "failed to get memory-region resource addr\n");
>> >> + return -EINVAL;
>> >> + }
>> >> +
>> >> + rsc_data_va = (struct rsc_tbl_data *)devm_ioremap_wc(dev, res_mem.start,
>> >> +  sizeof(struct 
>> >> rsc_tbl_data));
>> > 
>> > There is no point in holding memory until the driver is unloaded.  Please 
>> > use
>> > ioremap_wc() and free at the end of the function.
>> > 
>> 
>> Ack.
>> 
>> >> + if (!rsc_data_va) {
>> >> + dev_err(dev, "failed to map resource table data address\n");
>> >> + return -EIO;
>> >> + }
>> >> +
>> >> + /**
>> >> +  * If RSC_TBL_XLNX_MAGIC number and its complement isn't found then
>> >> +  * do not consider resource table address valid and don't attach
>> >> +  */
>> >> + if (rsc_data_va->magic_num != RS

Re: [PATCH v2 1/2] drivers: remoteproc: xlnx: add attach detach support

2024-05-23 Thread Mathieu Poirier

ret = add_tcm_banks(rproc);
> >> -  if (ret) {
> >> -  dev_err(>dev, "failed to get TCM banks, err %d\n", ret);
> >> -  return ret;
> >> +  /**
> > 
> > Using "/**" is for comments that will endup in the documentation, which I 
> > don't
> > think is needed here.  Please correct throughout the patch.
> 
> Thanks. Ack, I will use only /* format.
> 
> > 
> >> +   * For attach/detach use case, Firmware is already loaded so
> >> +   * TCM isn't really needed at all. Also, for security TCM can be
> >> +   * locked in such case and linux may not have access at all.
> >> +   * So avoid adding TCM banks. TCM power-domains requested during attach
> >> +   * callback.
> >> +   */
> >> +  if (rproc->state != RPROC_DETACHED) {
> >> +  ret = add_tcm_banks(rproc);
> >> +  if (ret) {
> >> +  dev_err(>dev, "failed to get TCM banks, err 
> >> %d\n", ret);
> >> +  return ret;
> >> +  }
> >>}
> >>  
> >>ret = add_mem_regions_carveout(rproc);
> >> @@ -662,6 +688,123 @@ static int zynqmp_r5_rproc_unprepare(struct rproc 
> >> *rproc)
> >>return 0;
> >>  }
> >>  
> >> +static struct resource_table *zynqmp_r5_get_loaded_rsc_table(struct rproc 
> >> *rproc,
> >> +   size_t *size)
> >> +{
> >> +  struct zynqmp_r5_core *r5_core;
> >> +
> >> +  r5_core = rproc->priv;
> >> +
> >> +  *size = r5_core->rsc_tbl_size;
> >> +
> >> +  return r5_core->rsc_tbl_va;
> >> +}
> >> +
> >> +static int zynqmp_r5_get_rsc_table_va(struct zynqmp_r5_core *r5_core)
> >> +{
> >> +  struct device *dev = r5_core->dev;
> >> +  struct rsc_tbl_data *rsc_data_va;
> >> +  struct resource_table *rsc_addr;
> >> +  struct resource res_mem;
> >> +  struct device_node *np;
> >> +  int ret;
> >> +
> >> +  /**
> >> +   * It is expected from remote processor firmware to provide resource
> >> +   * table address via struct rsc_tbl_data data structure.
> >> +   * Start address of first entry under "memory-region" property list
> >> +   * contains that data structure which holds resource table address, size
> >> +   * and some magic number to validate correct resource table entry.
> >> +   */
> >> +  np = of_parse_phandle(r5_core->np, "memory-region", 0);
> >> +  if (!np) {
> >> +  dev_err(dev, "failed to get memory region dev node\n");
> >> +  return -EINVAL;
> >> +  }
> >> +
> >> +  ret = of_address_to_resource(np, 0, _mem);
> >> +  if (ret) {
> >> +  dev_err(dev, "failed to get memory-region resource addr\n");
> >> +  return -EINVAL;
> >> +  }
> >> +
> >> +  rsc_data_va = (struct rsc_tbl_data *)devm_ioremap_wc(dev, res_mem.start,
> >> +   sizeof(struct 
> >> rsc_tbl_data));
> > 
> > There is no point in holding memory until the driver is unloaded.  Please 
> > use
> > ioremap_wc() and free at the end of the function.
> > 
> 
> Ack.
> 
> >> +  if (!rsc_data_va) {
> >> +  dev_err(dev, "failed to map resource table data address\n");
> >> +  return -EIO;
> >> +  }
> >> +
> >> +  /**
> >> +   * If RSC_TBL_XLNX_MAGIC number and its complement isn't found then
> >> +   * do not consider resource table address valid and don't attach
> >> +   */
> >> +  if (rsc_data_va->magic_num != RSC_TBL_XLNX_MAGIC ||
> >> +  rsc_data_va->comp_magic_num != ~RSC_TBL_XLNX_MAGIC) {
> >> +  dev_dbg(dev, "invalid magic number, won't attach\n");
> >> +  return -EINVAL;
> >> +  }
> >> +
> >> +  rsc_addr = (struct resource_table *)ioremap_wc(rsc_data_va->rsc_tbl,
> >> + 
> >> rsc_data_va->rsc_tbl_size);
> 
> [1] Here typecast is done.
> 
> >> +  if (!rsc_addr) {
> >> +  dev_err(dev, "failed to get rsc_addr\n");
> >> +  return -EINVAL;
> >> +  }
> >> +
> >> +  /**
> >> +   * As of now resource table version 1 is expected. Don't fail to attach
> >> +   * but w

Re: [PATCH] x86/paravirt: Disable virt spinlock when CONFIG_PARAVIRT_SPINLOCKS disabled

2024-05-23 Thread Dave Hansen

On 5/16/24 06:02, Chen Yu wrote:
> Performance drop is reported when running encode/decode workload and
> BenchSEE cache sub-workload.
> Bisect points to commit ce0a1b608bfc ("x86/paravirt: Silence unused
> native_pv_lock_init() function warning"). When CONFIG_PARAVIRT_SPINLOCKS
> is disabled the virt_spin_lock_key is set to true on bare-metal.
> The qspinlock degenerates to test-and-set spinlock, which decrease the
> performance on bare-metal.
> 
> Fix this by disabling virt_spin_lock_key if CONFIG_PARAVIRT_SPINLOCKS
> is not set, or it is on bare-metal.

This is missing some background:

The kernel can change spinlock behavior when running as a guest.  But
this guest-friendly behavior causes performance problems on bare metal.
So there's a 'virt_spin_lock_key' static key to switch between the two
modes.

The static key is always enabled by default (run in guest mode) and
should be disabled for bare metal (and in some guests that want native
behavior).

... then describe the regression and the fix

> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> index 5358d43886ad..ee51c0949ed8 100644
> --- a/arch/x86/kernel/paravirt.c
> +++ b/arch/x86/kernel/paravirt.c
> @@ -55,7 +55,7 @@ DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
>  
>  void __init native_pv_lock_init(void)
>  {
> - if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) &&
> + if (!IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) ||
>   !boot_cpu_has(X86_FEATURE_HYPERVISOR))
>   static_branch_disable(_spin_lock_key);
>  }
This gets used at a single site:

if (pv_enabled())
goto pv_queue;

if (virt_spin_lock(lock))
return;

which is logically:

if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS))
goto ...; // don't look at virt_spin_lock_key

if (virt_spin_lock_key)
return; // On virt, but non-paravirt.  Did Test-and-Set
// spinlock.

So I _think_ Arnd was trying to optimize native_pv_lock_init() away when
it's going to get skipped over anyway by the 'goto'.

But this took me at least 30 minutes of scratching my head and trying to
untangle the whole thing.  It's all far too subtle for my taste, and all
of that to save a few bytes of init text in a configuration that's
probably not even used very often (PARAVIRT=y, but PARAVIRT_SPINLOCKS=n).

Let's just keep it simple.  How about the attached patch?

Re: [PATCH v2] sched/rt: Clean up usage of rt_task()

2024-05-23 Thread Steven Rostedt

On Wed, 15 May 2024 23:05:36 +0100
Qais Yousef  wrote:
> diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
> index df3aca89d4f5..5cb88b748ad6 100644
> --- a/include/linux/sched/deadline.h
> +++ b/include/linux/sched/deadline.h
> @@ -10,8 +10,6 @@
>  
>  #include 
>  
> -#define MAX_DL_PRIO  0
> -
>  static inline int dl_prio(int prio)
>  {
>   if (unlikely(prio < MAX_DL_PRIO))
> @@ -19,6 +17,10 @@ static inline int dl_prio(int prio)
>   return 0;
>  }
>  
> +/*
> + * Returns true if a task has a priority that belongs to DL class. PI-boosted
> + * tasks will return true. Use dl_policy() to ignore PI-boosted tasks.
> + */
>  static inline int dl_task(struct task_struct *p)
>  {
>   return dl_prio(p->prio);
> diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
> index ab83d85e1183..6ab43b4f72f9 100644
> --- a/include/linux/sched/prio.h
> +++ b/include/linux/sched/prio.h
> @@ -14,6 +14,7 @@
>   */
>  
>  #define MAX_RT_PRIO  100
> +#define MAX_DL_PRIO  0
>  
>  #define MAX_PRIO (MAX_RT_PRIO + NICE_WIDTH)
>  #define DEFAULT_PRIO (MAX_RT_PRIO + NICE_WIDTH / 2)
> diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
> index b2b9e6eb9683..a055dd68a77c 100644
> --- a/include/linux/sched/rt.h
> +++ b/include/linux/sched/rt.h
> @@ -7,18 +7,43 @@
>  struct task_struct;
>  
>  static inline int rt_prio(int prio)
> +{
> + if (unlikely(prio < MAX_RT_PRIO && prio >= MAX_DL_PRIO))
> + return 1;
> + return 0;
> +}
> +
> +static inline int realtime_prio(int prio)
>  {
>   if (unlikely(prio < MAX_RT_PRIO))
>   return 1;
>   return 0;
>  }

I'm thinking we should change the above to bool (separate patch), as
returning an int may give one the impression that it returns the actual
priority number. Having it return bool will clear that up.

In fact, if we are touching theses functions, might as well change all of
them to bool when returning true/false. Just to make it easier to
understand what they are doing.

>  
> +/*
> + * Returns true if a task has a priority that belongs to RT class. PI-boosted
> + * tasks will return true. Use rt_policy() to ignore PI-boosted tasks.
> + */
>  static inline int rt_task(struct task_struct *p)
>  {
>   return rt_prio(p->prio);
>  }
>  
> -static inline bool task_is_realtime(struct task_struct *tsk)
> +/*
> + * Returns true if a task has a priority that belongs to RT or DL classes.
> + * PI-boosted tasks will return true. Use realtime_task_policy() to ignore
> + * PI-boosted tasks.
> + */
> +static inline int realtime_task(struct task_struct *p)
> +{
> + return realtime_prio(p->prio);
> +}
> +
> +/*
> + * Returns true if a task has a policy that belongs to RT or DL classes.
> + * PI-boosted tasks will return false.
> + */
> +static inline bool realtime_task_policy(struct task_struct *tsk)
>  {
>   int policy = tsk->policy;
>  



> diff --git a/kernel/trace/trace_sched_wakeup.c 
> b/kernel/trace/trace_sched_wakeup.c
> index 0469a04a355f..19d737742e29 100644
> --- a/kernel/trace/trace_sched_wakeup.c
> +++ b/kernel/trace/trace_sched_wakeup.c
> @@ -545,7 +545,7 @@ probe_wakeup(void *ignore, struct task_struct *p)
>*  - wakeup_dl handles tasks belonging to sched_dl class only.
>*/
>   if (tracing_dl || (wakeup_dl && !dl_task(p)) ||
> - (wakeup_rt && !dl_task(p) && !rt_task(p)) ||
> + (wakeup_rt && !realtime_task(p)) ||
>   (!dl_task(p) && (p->prio >= wakeup_prio || p->prio >= 
> current->prio)))
>   return;
>  

Reviewed-by: Steven Rostedt (Google)

Re: [RFC PATCH 0/5] vsock/virtio: Add support for multi-devices

2024-05-23 Thread Michael S. Tsirkin

On Fri, May 17, 2024 at 10:46:02PM +0800, Xuewei Niu wrote:
>  include/linux/virtio_vsock.h|   2 +-
>  include/net/af_vsock.h  |  25 ++-
>  include/uapi/linux/virtio_vsock.h   |   1 +
>  include/uapi/linux/vm_sockets.h |  14 ++
>  net/vmw_vsock/af_vsock.c| 116 +--
>  net/vmw_vsock/virtio_transport.c| 255 ++--
>  net/vmw_vsock/virtio_transport_common.c |  16 +-
>  net/vmw_vsock/vsock_loopback.c  |   4 +-
>  8 files changed, 352 insertions(+), 81 deletions(-)

As any change to virtio device/driver interface, this has to
go through the virtio TC. Please subscribe at
virtio-comment+subscr...@lists.linux.dev and then
contact the TC at virtio-comm...@lists.linux.dev

You will likely eventually need to write a spec draft document, too.

-- 
MST

Re: [PATCH] livepatch: introduce klp_func called interface

2024-05-23 Thread Dan Carpenter

On Sun, May 19, 2024 at 03:43:43PM +0800, Wardenjohn wrote:
> Livepatch module usually used to modify kernel functions.
> If the patched function have bug, it may cause serious result
> such as kernel crash.
> 
> This commit introduce a read only interface of livepatch
> sysfs interface. If a livepatch function is called, this
> sysfs interface "called" of the patched function will
> set to be 1.
> 
> /sys/kernel/livepatchcalled
> 
> This value "called" is quite necessary for kernel stability assurance for 
> livepatching
> module of a running system. Testing process is important before a livepatch 
> module
> apply to a production system. With this interface, testing process can easily
> find out which function is successfully called. Any testing process can make 
> sure they
> have successfully cover all the patched function that changed with the help 
> of this interface.
> ---

Always run your patches through checkpatch.

So this patch is so that testers can see if a function has been called?
Can you not get the same information from gcov or ftrace?

There are style issues with the patch, but it's not so important until
the design is agreed on.

regards,
dan carpenter

Re: [PATCH] riscv: Fix early ftrace nop patching

2024-05-23 Thread Conor Dooley

On Thu, May 23, 2024 at 01:51:34PM +0200, Alexandre Ghiti wrote:
> Commit c97bf629963e ("riscv: Fix text patching when IPI are used")
> converted ftrace_make_nop() to use patch_insn_write() which does not
> emit any icache flush relying entirely on __ftrace_modify_code() to do
> that.
> 
> But we missed that ftrace_make_nop() was called very early directly when
> converting mcount calls into nops (actually on riscv it converts 2B nops
> emitted by the compiler into 4B nops).
> 
> This caused crashes on multiple HW as reported by Conor and Björn since
> the booting core could have half-patched instructions in its icache
> which would trigger an illegal instruction trap: fix this by emitting a
> local flush icache when early patching nops.
> 
> Fixes: c97bf629963e ("riscv: Fix text patching when IPI are used")
> Signed-off-by: Alexandre Ghiti 

Reported-by: Conor Dooley 
Tested-by: Conor Dooley 

Thanks for the quick fix Alex :)


signature.asc
Description: PGP signature

Re: [RFC PATCH 00/20] Introduce the famfs shared-memory file system

2024-05-23 Thread Miklos Szeredi

[trimming CC list]

On Thu, 23 May 2024 at 04:49, John Groves  wrote:

> - memmap=! will reserve a pretend pmem device at 
> 
> - memmap=$ will reserve a pretend dax device at 

Doesn't get me a /dev/dax or /dev/pmem

Complete qemu command line:

qemu-kvm -s -serial none -parallel none -kernel
/home/mszeredi/git/linux/arch/x86/boot/bzImage -drive
format=raw,file=/home/mszeredi/root_fs,index=0,if=virtio -drive
format=raw,file=/home/mszeredi/images/ubd1,index=1,if=virtio -chardev
stdio,id=virtiocon0,signal=off -device virtio-serial -device
virtconsole,chardev=virtiocon0 -cpu host -m 8G -net user -net
nic,model=virtio -fsdev local,security_model=none,id=fsdev0,path=/home
-device virtio-9p-pci,fsdev=fsdev0,mount_tag=hostshare -device
virtio-rng-pci -smp 4 -append 'root=/dev/vda console=hvc0
memmap=4G$4G'

root@kvm:~/famfs# scripts/chk_efi.sh
This system is neither Ubuntu nor Fedora. It is identified as debian.
/sys/firmware/efi not found; probably not efi
 not found; probably nof efi
/boot/efi/EFI not found; probably not efi
/boot/efi/EFI/BOOT not found; probably not efi
/boot/efi/EFI/ not found; probably not efi
/boot/efi/EFI//grub.cfg not found; probably nof efi
Probably not efi; errs=6

Thanks,
Miklos

Re: [PATCH] riscv: Fix early ftrace nop patching

2024-05-23 Thread Björn Töpel

Alexandre Ghiti  writes:

> Commit c97bf629963e ("riscv: Fix text patching when IPI are used")
> converted ftrace_make_nop() to use patch_insn_write() which does not
> emit any icache flush relying entirely on __ftrace_modify_code() to do
> that.
>
> But we missed that ftrace_make_nop() was called very early directly when
> converting mcount calls into nops (actually on riscv it converts 2B nops
> emitted by the compiler into 4B nops).
>
> This caused crashes on multiple HW as reported by Conor and Björn since
> the booting core could have half-patched instructions in its icache
> which would trigger an illegal instruction trap: fix this by emitting a
> local flush icache when early patching nops.
>
> Fixes: c97bf629963e ("riscv: Fix text patching when IPI are used")
> Signed-off-by: Alexandre Ghiti 

Nice!

I've manged to reproduce the crash on the VisionFive2 board (however
only triggered when CONFIG_RELOCATABLE=y), and can verify that this fix
solves the issue.

Reviewed-by: Björn Töpel 
Tested-by: Björn Töpel

Re: [RFC PATCH 0/5] vsock/virtio: Add support for multi-devices

2024-05-23 Thread Stefano Garzarella

 on whether the
   feature is negotiated, since guest and host may not support it.
3. Re-work the patch order for bisectability (more detail on patches 3/4)
4. Do we really need the order or just a default device?
5. Check if we can add some tests in tools/testing/vsock
6. When we agree on the RFC, we should discuss the spec changes in the
   virtio ML before sending a non-RFC series on Linux

These are the main things, but I left other comments in the patches.

Thanks,
Stefano

Re: [RFC PATCH 5/5] vsock: Add an ioctl request to get all CIDs

2024-05-23 Thread Stefano Garzarella


On Fri, May 17, 2024 at 10:46:07PM GMT, Xuewei Niu wrote:

The new request is called `IOCTL_VM_SOCKETS_GET_LOCAL_CIDS`. And the old
one, `IOCTL_VM_SOCKETS_GET_LOCAL_CID` is retained.

For the transport that supports multi-devices:

* `IOCTL_VM_SOCKETS_GET_LOCAL_CID` returns "-1";


What about returning the default CID (lower prio)?

* `IOCTL_VM_SOCKETS_GET_LOCAL_CIDS` returns a vector of CIDS. The usage 
is

shown as following.

```
struct vsock_local_cids local_cids;
if ((ret = ioctl(fd, IOCTL_VM_SOCKETS_GET_LOCAL_CIDS, _cids))) {
   perror("failed to get cids");
   exit(1);
}
for (i = 0; i
---
include/net/af_vsock.h  |  7 +++
include/uapi/linux/vm_sockets.h |  8 
net/vmw_vsock/af_vsock.c| 19 +++
3 files changed, 34 insertions(+)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 25f7dc3d602d..2febc816e388 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -264,4 +264,11 @@ static inline bool vsock_msgzerocopy_allow(const struct 
vsock_transport *t)
{
return t->msgzerocopy_allow && t->msgzerocopy_allow();
}
+
+/ IOCTL /
+/* Type of return value of IOCTL_VM_SOCKETS_GET_LOCAL_CIDS. */
+struct vsock_local_cids {
+   int nr;
+   unsigned int data[MAX_VSOCK_NUM];
+};
#endif /* __AF_VSOCK_H__ */
diff --git a/include/uapi/linux/vm_sockets.h b/include/uapi/linux/vm_sockets.h
index 36ca5023293a..01f73fb7af5a 100644
--- a/include/uapi/linux/vm_sockets.h
+++ b/include/uapi/linux/vm_sockets.h
@@ -195,8 +195,16 @@ struct sockaddr_vm {

#define MAX_VSOCK_NUM 16


Okay, now I see why you need this in the UAPI, but pleace try to follow
other defines.

What about VM_SOCKETS_MAX_DEVS ?



+/* Return actual context id if the transport not support vsock
+ * multi-devices. Otherwise, return `-1U`.
+ */
+
#define IOCTL_VM_SOCKETS_GET_LOCAL_CID  _IO(7, 0xb9)

+/* Only available in transports that support multiple devices. */
+
+#define IOCTL_VM_SOCKETS_GET_LOCAL_CIDS _IOR(7, 0xba, struct 
vsock_local_cids)
+
/* MSG_ZEROCOPY notifications are encoded in the standard error format,
 * sock_extended_err. See Documentation/networking/msg_zerocopy.rst in
 * kernel source tree for more details.
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 3b34be802bf2..2ea2ff52f15b 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -2454,6 +2454,7 @@ static long vsock_dev_do_ioctl(struct file *filp,
u32 __user *p = ptr;
u32 cid = VMADDR_CID_ANY;
int retval = 0;
+   struct vsock_local_cids local_cids;

switch (cmd) {
case IOCTL_VM_SOCKETS_GET_LOCAL_CID:
@@ -2469,6 +2470,24 @@ static long vsock_dev_do_ioctl(struct file *filp,
retval = -EFAULT;
break;

+   case IOCTL_VM_SOCKETS_GET_LOCAL_CIDS:
+   if (!transport_g2h || !transport_g2h->get_local_cids)
+   goto fault;
+
+   rcu_read_lock();
+   local_cids.nr = transport_g2h->get_local_cids(local_cids.data);
+   rcu_read_unlock();
+
+   if (local_cids.nr < 0 ||
+   copy_to_user(p, _cids, sizeof(local_cids)))
+   goto fault;
+
+   break;
+
+fault:
+   retval = -EFAULT;
+   break;
+
default:
retval = -ENOIOCTLCMD;
}
--
2.34.1

Re: [RFC PATCH 4/5] vsock: seqpacket_allow adapts to multi-devices

2024-05-23 Thread Stefano Garzarella


On Fri, May 17, 2024 at 10:46:06PM GMT, Xuewei Niu wrote:

Adds a new argument, named "src_cid", to let them know which `virtio_vsock`
to be selected.

Signed-off-by: Xuewei Niu 
---
include/net/af_vsock.h   |  2 +-
net/vmw_vsock/af_vsock.c | 15 +--
net/vmw_vsock/virtio_transport.c |  4 ++--
net/vmw_vsock/vsock_loopback.c   |  4 ++--
4 files changed, 18 insertions(+), 7 deletions(-)


Same for this.



diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 0151296a0bc5..25f7dc3d602d 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -143,7 +143,7 @@ struct vsock_transport {
 int flags);
int (*seqpacket_enqueue)(struct vsock_sock *vsk, struct msghdr *msg,
 size_t len);
-   bool (*seqpacket_allow)(u32 remote_cid);
+   bool (*seqpacket_allow)(u32 src_cid, u32 remote_cid);
u32 (*seqpacket_has_data)(struct vsock_sock *vsk);

/* Notification. */
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index da06ddc940cd..3b34be802bf2 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -470,10 +470,12 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct 
vsock_sock *psk)
{
const struct vsock_transport *new_transport;
struct sock *sk = sk_vsock(vsk);
-   unsigned int remote_cid = vsk->remote_addr.svm_cid;
+   unsigned int src_cid, remote_cid;
__u8 remote_flags;
int ret;

+   remote_cid = vsk->remote_addr.svm_cid;
+
/* If the packet is coming with the source and destination CIDs higher
 * than VMADDR_CID_HOST, then a vsock channel where all the packets are
 * forwarded to the host should be established. Then the host will
@@ -527,8 +529,17 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct 
vsock_sock *psk)
return -ENODEV;

if (sk->sk_type == SOCK_SEQPACKET) {
+   if (vsk->local_addr.svm_cid == VMADDR_CID_ANY) {
+   if (new_transport->get_default_cid)
+   src_cid = new_transport->get_default_cid();
+   else
+   src_cid = new_transport->get_local_cid();
+   } else {
+   src_cid = vsk->local_addr.svm_cid;
+   }
+
if (!new_transport->seqpacket_allow ||
-   !new_transport->seqpacket_allow(remote_cid)) {
+   !new_transport->seqpacket_allow(src_cid, remote_cid)) {
module_put(new_transport->module);
return -ESOCKTNOSUPPORT;
}
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 998b22e5ce36..0bddcbd906a2 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -615,14 +615,14 @@ static struct virtio_transport virtio_transport = {
.can_msgzerocopy = virtio_transport_can_msgzerocopy,
};

-static bool virtio_transport_seqpacket_allow(u32 remote_cid)
+static bool virtio_transport_seqpacket_allow(u32 src_cid, u32 remote_cid)
{
struct virtio_vsock *vsock;
bool seqpacket_allow;

seqpacket_allow = false;
rcu_read_lock();
-   vsock = rcu_dereference(the_virtio_vsock);
+   vsock = virtio_transport_get_virtio_vsock(src_cid);
if (vsock)
seqpacket_allow = vsock->seqpacket_allow;
rcu_read_unlock();
diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
index 6dea6119f5b2..b94358f5bb2c 100644
--- a/net/vmw_vsock/vsock_loopback.c
+++ b/net/vmw_vsock/vsock_loopback.c
@@ -46,7 +46,7 @@ static int vsock_loopback_cancel_pkt(struct vsock_sock *vsk)
return 0;
}

-static bool vsock_loopback_seqpacket_allow(u32 remote_cid);
+static bool vsock_loopback_seqpacket_allow(u32 src_cid, u32 remote_cid);
static bool vsock_loopback_msgzerocopy_allow(void)
{
return true;
@@ -104,7 +104,7 @@ static struct virtio_transport loopback_transport = {
.send_pkt = vsock_loopback_send_pkt,
};

-static bool vsock_loopback_seqpacket_allow(u32 remote_cid)
+static bool vsock_loopback_seqpacket_allow(u32 src_cid, u32 remote_cid)
{
return true;
}
--
2.34.1

Re: [RFC PATCH 3/5] vsock/virtio: can_msgzerocopy adapts to multi-devices

2024-05-23 Thread Stefano Garzarella


On Fri, May 17, 2024 at 10:46:05PM GMT, Xuewei Niu wrote:

Adds a new argument, named "cid", to let them know which `virtio_vsock` to
be selected.

Signed-off-by: Xuewei Niu 
---
include/linux/virtio_vsock.h| 2 +-
net/vmw_vsock/virtio_transport.c| 5 ++---
net/vmw_vsock/virtio_transport_common.c | 6 +++---
3 files changed, 6 insertions(+), 7 deletions(-)


Every commit in linux must be working to support bisection. So these 
changes should be made before adding multi-device support.




diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index c82089dee0c8..21bfd5e0c2e7 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -168,7 +168,7 @@ struct virtio_transport {
 * extra checks and can perform zerocopy transmission by
 * default.
 */
-   bool (*can_msgzerocopy)(int bufs_num);
+   bool (*can_msgzerocopy)(u32 cid, int bufs_num);
};

ssize_t
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 93d25aeafb83..998b22e5ce36 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -521,14 +521,13 @@ static void virtio_vsock_rx_done(struct virtqueue *vq)
queue_work(virtio_vsock_workqueue, >rx_work);
}

-static bool virtio_transport_can_msgzerocopy(int bufs_num)
+static bool virtio_transport_can_msgzerocopy(u32 cid, int bufs_num)
{
struct virtio_vsock *vsock;
bool res = false;

rcu_read_lock();
-
-   vsock = rcu_dereference(the_virtio_vsock);
+   vsock = virtio_transport_get_virtio_vsock(cid);
if (vsock) {
struct virtqueue *vq = vsock->vqs[VSOCK_VQ_TX];

diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c
index bed75a41419e..e7315d7b9af1 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -39,7 +39,7 @@ virtio_transport_get_ops(struct vsock_sock *vsk)

static bool virtio_transport_can_zcopy(const struct virtio_transport *t_ops,
   struct virtio_vsock_pkt_info *info,
-  size_t pkt_len)
+  size_t pkt_len, unsigned int cid)
{
struct iov_iter *iov_iter;

@@ -62,7 +62,7 @@ static bool virtio_transport_can_zcopy(const struct 
virtio_transport *t_ops,
int pages_to_send = iov_iter_npages(iov_iter, MAX_SKB_FRAGS);

/* +1 is for packet header. */
-   return t_ops->can_msgzerocopy(pages_to_send + 1);
+   return t_ops->can_msgzerocopy(cid, pages_to_send + 1);
}

return true;
@@ -375,7 +375,7 @@ static int virtio_transport_send_pkt_info(struct vsock_sock 
*vsk,
info->msg->msg_flags &= ~MSG_ZEROCOPY;

if (info->msg->msg_flags & MSG_ZEROCOPY)
-   can_zcopy = virtio_transport_can_zcopy(t_ops, info, 
pkt_len);
+   can_zcopy = virtio_transport_can_zcopy(t_ops, info, 
pkt_len, src_cid);

if (can_zcopy)
max_skb_len = min_t(u32, VIRTIO_VSOCK_MAX_PKT_BUF_SIZE,
--
2.34.1

Re: [RFC PATCH 2/5] vsock/virtio: Add support for multi-devices

2024-05-23 Thread Stefano Garzarella


On Fri, May 17, 2024 at 10:46:04PM GMT, Xuewei Niu wrote:

The maximum number of devices is limited by `MAX_VSOCK_NUM`.

Extends `vsock_transport` struct with 4 methods to support multi-devices:

* `get_virtio_vsock()`: It receives a CID, and returns a struct of virtio
 vsock. This method is designed to select a vsock device by its CID.
* `get_default_cid()`: It receives nothing, returns the default CID of the
 first vsock device registered to the kernel.
* `get_local_cids()`: It returns a vector of vsock devices' CIDs.
* `compare_order()`: It receives two different CIDs, named "left" and
 "right" respectively. It returns "-1" while the "left" is behind the
 "right". Otherwise, return "1".

`get_local_cid()` is retained, but returns "-1" if the transport supports
multi-devices.

Replaces the single instance of `virtio_vsock` with a list, named
`virtio_vsock_list`. The devices are inserted into the list when probing.

The kernel will deny devices from being registered if there are conflicts
existing in CIDs or orders.

Signed-off-by: Xuewei Niu 
---
include/net/af_vsock.h  |  16 ++
include/uapi/linux/vm_sockets.h |   6 +
net/vmw_vsock/af_vsock.c|  82 ++--
net/vmw_vsock/virtio_transport.c| 246 ++--
net/vmw_vsock/virtio_transport_common.c |  10 +-
5 files changed, 293 insertions(+), 67 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 535701efc1e5..0151296a0bc5 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -174,6 +174,22 @@ struct vsock_transport {

/* Addressing. */
u32 (*get_local_cid)(void);
+   /* Held rcu read lock by the caller. */


We should also explain why the rcu is needed.


+   struct virtio_vsock *(*get_virtio_vsock)(unsigned int cid);


af_vsock supports several transports (i.e. HyperV, VMCI, VIRTIO/VHOST,
loobpack), so we need to be generic here.

In addition, the pointer returned by this function is never used, so
why we need this?


+   unsigned int (*get_default_cid)(void);
+   /* Get an list containing all the CIDs of registered vsock.   Return
+* the length of the list.
+*
+* Held rcu read lock by the caller.
+*/
+   int (*get_local_cids)(unsigned int *local_cids);


Why int? get_local_cid() returns an u32, we should do the same.

In addition, can we remove get_local_cid() and implement 
get_local_cids() for all the transports?



+   /* Compare the order of two devices.  Given the guest CIDs of two
+* different devices, returns -1 while the left is behind the right.
+* Otherwise, return 1.
+*
+* Held rcu read lock by the caller.
+*/
+   int (*compare_order)(unsigned int left, unsigned int right);


Please check better the type for CIDs all over the place.



/* Read a single skb */
int (*read_skb)(struct vsock_sock *, skb_read_actor_t);
diff --git a/include/uapi/linux/vm_sockets.h b/include/uapi/linux/vm_sockets.h
index ed07181d4eff..36ca5023293a 100644
--- a/include/uapi/linux/vm_sockets.h
+++ b/include/uapi/linux/vm_sockets.h
@@ -189,6 +189,12 @@ struct sockaddr_vm {
   sizeof(__u8)];
};

+/* The maximum number of vsock devices.  Each vsock device has an exclusive
+ * context id.
+ */
+
+#define MAX_VSOCK_NUM 16


This is used internally in AF_VSOCK, I don't think we should expose it
in the UAPI.



+
#define IOCTL_VM_SOCKETS_GET_LOCAL_CID  _IO(7, 0xb9)

/* MSG_ZEROCOPY notifications are encoded in the standard error format,
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 54ba7316f808..da06ddc940cd 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -234,19 +234,45 @@ static void __vsock_remove_connected(struct vsock_sock 
*vsk)

static struct sock *__vsock_find_bound_socket(struct sockaddr_vm *addr)
{
-   struct vsock_sock *vsk;
+   struct vsock_sock *vsk, *any_vsk = NULL;

+   rcu_read_lock();


Why the rcu is needed?

	list_for_each_entry(vsk, vsock_bound_sockets(addr), bound_table) 
	{

+   /* The highest priority: full match. */
if (vsock_addr_equals_addr(addr, >local_addr))
-   return sk_vsock(vsk);
+   goto out;

-   if (addr->svm_port == vsk->local_addr.svm_port &&
-   (vsk->local_addr.svm_cid == VMADDR_CID_ANY ||
-addr->svm_cid == VMADDR_CID_ANY))
-   return sk_vsock(vsk);
+   /* Port match */
+   if (addr->svm_port == vsk->local_addr.svm_port) {
+   /* The second priority: local cid is VMADDR_CID_ANY. */
+   if (vsk->local_addr.svm_cid == VMADDR_CID_ANY)
+   goto out;
+
+   /* The third priority: local cid isn't VMADDR_CID_ANY. 
*/
+   if (addr->svm_cid ==

Re: [RFC PATCH 1/5] vsock/virtio: Extend virtio-vsock spec with an "order" field

2024-05-23 Thread Stefano Garzarella


As Alyssa suggested, we should discuss spec changes in the virtio ML.
BTW as long as this is an RFC, it's fine. Just be sure, though, to 
remember to merge the change in the specification first versus the 
patches in Linux.
So I recommend that you don't send a non-RFC set into Linux until you 
have agreed on the changes to the specification.


On Fri, May 17, 2024 at 10:46:03PM GMT, Xuewei Niu wrote:

The "order" field determines the location of the device in the linked list,
the device with CID 4, having a smallest order, is in the first place, and
so forth.


Do we really need an order, or would it suffice to just indicate the 
device to be used by default? (as the default gateway in networking)




Rules:

* It doesn’t have to be continuous;
* It cannot exist conflicts;
* It is optional for the mode of a single device, but is required for the
 mode of multiple devices.


We should also add a feature to support this new field.



Signed-off-by: Xuewei Niu 
---
include/uapi/linux/virtio_vsock.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/virtio_vsock.h 
b/include/uapi/linux/virtio_vsock.h
index 64738838bee5..b62ec7d2ab1e 100644
--- a/include/uapi/linux/virtio_vsock.h
+++ b/include/uapi/linux/virtio_vsock.h
@@ -43,6 +43,7 @@

struct virtio_vsock_config {
__le64 guest_cid;
+   __le64 order;


Do we really need 64 bits for the order?


} __attribute__((packed));

enum virtio_vsock_event_id {
--
2.34.1

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 2410299 matches

Mail list logo