[PATCH v2] ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment

2024-03-01 Thread linke li
In function ring_buffer_iter_empty(), cpu_buffer->commit_page is read
while other threads may change it. This may cause the time_stamp read
on the next line to come from a different page. Use READ_ONCE() to avoid
having to reason about compiler optimizations now and in the future.

Signed-off-by: linke li 
---
v1 -> v2: only add READ_ONCE() to the read of cpu_buffer->commit_page; make the
change log clearer

 kernel/trace/ring_buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 0699027b4f4c..c7203a436d3c 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4337,7 +4337,7 @@ int ring_buffer_iter_empty(struct ring_buffer_iter *iter)
cpu_buffer = iter->cpu_buffer;
reader = cpu_buffer->reader_page;
head_page = cpu_buffer->head_page;
-   commit_page = cpu_buffer->commit_page;
+   commit_page = READ_ONCE(cpu_buffer->commit_page);
commit_ts = commit_page->page->time_stamp;
 
/*
-- 
2.39.3 (Apple Git-145)
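
The effect of the one-line change can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel's implementation: READ_ONCE_SKETCH, page_sketch, cpu_buffer_sketch and commit_ts_snapshot are hypothetical names, the macro assumes the GCC/Clang __typeof__ extension, and the structure layout is simplified compared to the real ring buffer.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's READ_ONCE(): the volatile access
 * tells the compiler to perform exactly one load of x and not to re-read it.
 * Assumes the GCC/Clang __typeof__ extension.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Simplified, hypothetical versions of the structures in the patch. */
struct page_sketch {
	uint64_t time_stamp;
};

struct cpu_buffer_sketch {
	struct page_sketch *commit_page;	/* may be updated by a writer */
};

/*
 * Take one snapshot of the pointer, then dereference only the snapshot,
 * so the time stamp is guaranteed to come from the page that was read.
 */
static uint64_t commit_ts_snapshot(struct cpu_buffer_sketch *cpu_buffer)
{
	struct page_sketch *commit_page =
		READ_ONCE_SKETCH(cpu_buffer->commit_page);

	return commit_page->time_stamp;
}
```

The sketch only shows the single-load pattern; it makes no claim about memory ordering, which READ_ONCE() by itself does not provide either.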




Re: [PATCH v5 3/6] LoongArch: KVM: Add cpucfg area for kvm hypervisor

2024-03-01 Thread maobibo




On 2024/2/27 6:19 PM, WANG Xuerui wrote:
On 2/27/24 18:12, maobibo wrote:
On 2024/2/27 5:10 PM, WANG Xuerui wrote:
On 2/27/24 11:14, maobibo wrote:
On 2024/2/27 4:02 AM, Jiaxun Yang wrote:
On 2024/2/26 8:04 AM, maobibo wrote:
On 2024/2/26 2:12 PM, Huacai Chen wrote:
On Mon, Feb 26, 2024 at 10:04 AM maobibo wrote:
On 2024/2/24 5:13 PM, Huacai Chen wrote:

Hi, Bibo,

On Thu, Feb 22, 2024 at 11:28 AM Bibo Mao wrote:

The cpucfg instruction can be used to get processor features. It raises
a trap exception when executed in VM mode, so it can also be used
to provide CPU features to the VM. On real hardware, cpucfg area 0 - 20
is used. Here one dedicated area, 0x40000000 -- 0x400000ff, is used
for the KVM hypervisor to provide PV features, and the area can be extended
for other hypervisors in the future. This area will never be used by
real HW; it is only used by software.
After reading and thinking, I find that the hypercall method which is
used in our production kernel is better than this cpucfg method,
because hypercall is simpler and more straightforward, and we don't
have to worry about conflicting with the real hardware.
No, I do not think so. cpucfg is simpler than hypercall; hypercall only
takes effect when the system runs in guest mode. In some scenarios, such
as TCG mode, hypercall is an illegal instruction, whereas cpucfg still works.

Nearly all architectures use hypercall except x86 for its historical
Only x86 supports multiple hypervisors, and multiple hypervisors exist
on x86 only. That is an advantage, not a historical reason.


I do believe that all of this should not be exposed to guest user space
for security reasons.
Can you add PLV checking when cpucfg 0x40000000-0x400000FF is
emulated? If the access comes from user mode, the return value is zero; if it
comes from kernel mode, the emulated value is returned. This avoids leaking information.
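
The PLV check proposed here can be sketched as follows. This is only an illustration of the idea under discussion, not code from the patch: every name is hypothetical, the window constants assume the 0x40000000 - 0x400000FF software range, PLV 0 is taken to be kernel mode, and indices outside the window are simply answered with zero, whereas a real handler would forward hardware-defined words to the hardware values.

```c
#include <stdint.h>

/* Assumed software-defined CPUCFG window (hypothetical constant names). */
#define CPUCFG_PV_BASE_SKETCH	0x40000000u
#define CPUCFG_PV_SIZE_SKETCH	0x100u

/* Assumed privilege levels: PLV 0 is kernel mode, PLV 3 is user mode. */
#define PLV_KERN_SKETCH		0u
#define PLV_USER_SKETCH		3u

/*
 * Return the value a guest would read from CPUCFG word "index" when
 * running at privilege level "plv". "pv_value" is the feature word the
 * hypervisor wants to advertise inside the PV window.
 */
static uint32_t emulate_pv_cpucfg_sketch(uint32_t index, unsigned int plv,
					 uint32_t pv_value)
{
	int in_pv_window = index >= CPUCFG_PV_BASE_SKETCH &&
			   index - CPUCFG_PV_BASE_SKETCH < CPUCFG_PV_SIZE_SKETCH;

	if (!in_pv_window)
		return 0;	/* simplification: treated as an undefined word */

	if (plv != PLV_KERN_SKETCH)
		return 0;	/* hide PV features from guest user space */

	return pv_value;
}
```

Whether such privilege-dependent behavior is compatible with the architecture manual is exactly what the rest of the thread debates.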


I've suggested this approach in another reply [1], but I've rechecked 
the manual, and it turns out this behavior is not permitted by the 
current wording. See LoongArch Reference Manual v1.10, Volume 1, 
Section 2.2.10.5 "CPUCFG":


 > CPUCFG 访问未定义的配置字将读回全 0 值。
 >
 > Reads of undefined CPUCFG configuration words shall return all-zeroes.


This sentence mentions no distinction based on privilege modes, so it 
can only mean the behavior applies universally regardless of 
privilege modes.


I think if you want to make CPUCFG behavior PLV-dependent, you may 
have to ask the LoongArch spec editors, internally or in public, for 
a new spec revision.
No, CPUCFG behavior for CPUCFG0-CPUCFG21 is unchanged; the only difference
is that the area starting at CPUCFG 0x40000000 can be defined by software,
since it is used by software.


The 0x40000000 range is not mentioned in the manuals. I know you've
confirmed privately with the HW team, but this needs to be properly
documented for public projects to properly rely on.


(There are already multiple third-party LoongArch implementers as of 
late 2023, so any ISA-level change like this would best be 
coordinated, to minimize surprises.)

According to Vol 4-23 of the document
https://www.intel.com/content/dam/develop/external/us/en/documents/335592-sdm-vol-4.pdf

there is one line: "MSR address range between 40000000H - 400000FFH is
marked as a specially reserved range. All existing and
future processors will not implement any features using any MSR in
this range."


Thanks for providing this info, now at least we know why it's this
specific range of 0x400000XX that's chosen.




It only says that the range is reserved; it does not specify detailed software
behavior. Software behavior is defined by the hypervisor, for example:
https://github.com/MicrosoftDocs/Virtualization-Documentation/blob/main/tlfs/Requirements%20for%20Implementing%20the%20Microsoft%20Hypervisor%20Interface.pdf 


https://kb.vmware.com/s/article/1009458

If the hypercall method is used, there should also be an ABI document, as on aarch64:
https://documentation-service.arm.com/static/6013e5faeee5236980d08619


Yes proper documentation of public API surface is always necessary 
*before* doing real work. Because right now the hypercall provider is 
Linux KVM, maybe we can document the existing and planned hypercall 
usage and ABI in the kernel docs along with code changes.



Yes, I will add hypercall documentation to the kernel docs.

The SMC calling convention is an ABI between the OS and secure firmware. LoongArch
has no secure mode, so such a document is not necessary. The hypercall
calling convention is tied to the hypervisor software, not to the CPU
hardware vendor. Each hypervisor may have its own hypercall calling
convention, just like syscall ABIs.


Regards
Bibo Mao




Re: [PATCH RFC ftrace] Chose RCU Tasks based on TASKS_RCU rather than PREEMPTION

2024-03-01 Thread Paul E. McKenney
On Fri, Mar 01, 2024 at 03:30:01PM -0500, Steven Rostedt wrote:
> On Fri, 1 Mar 2024 12:25:10 -0800
> "Paul E. McKenney"  wrote:
> 
> > > That would work for me.  If there are no objections, I will make this
> > > change.  
> > 
> > But I did check the latency of synchronize_rcu_tasks_rude() (about 100ms)
> > and synchronize_rcu() (about 20ms).  This is on a 80-hardware-thread
> > x86 system that is being flooded with calls to one or the other of
> > these two functions, but is otherwise idle.  So adding that unnecessary
> > synchronize_rcu() adds about 20% to that synchronization delay.
> > 
> > Which might still be OK, but...  In the immortal words of MS-DOS,
> > "Are you sure?".  ;-)
> 
> It's just safe to keep it. It's definitely not a fast path.

OK, you got it!  ;-)

Thanx, Paul



[PATCH 18/20] sh: kprobes: Remove unneeded kprobe_opcode_t casts

2024-03-01 Thread Geert Uytterhoeven
There is no need to cast a kprobe_opcode_t pointer to a kprobe_opcode_t
pointer.

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/kernel/kprobes.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/sh/kernel/kprobes.c b/arch/sh/kernel/kprobes.c
index d8c2e399d6e50794..49c4ffd782d6d6c5 100644
--- a/arch/sh/kernel/kprobes.c
+++ b/arch/sh/kernel/kprobes.c
@@ -39,7 +39,7 @@ static DEFINE_PER_CPU(struct kprobe, saved_next_opcode2);
 
 int __kprobes arch_prepare_kprobe(struct kprobe *p)
 {
-   kprobe_opcode_t opcode = *(kprobe_opcode_t *) (p->addr);
+   kprobe_opcode_t opcode = *p->addr;
 
if (OPCODE_RTE(opcode))
return -EFAULT; /* Bad breakpoint */
@@ -248,7 +248,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
p = get_kprobe(addr);
if (!p) {
/* Not one of ours: let kernel handle it */
-   if (*(kprobe_opcode_t *)addr != BREAKPOINT_INSTRUCTION) {
+   if (*addr != BREAKPOINT_INSTRUCTION) {
/*
 * The breakpoint instruction was removed right
 * after we hit it. Another cpu has removed
-- 
2.34.1
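
Why the casts are redundant can be seen in a small standalone sketch. The types below are hypothetical stand-ins (the real kprobe_opcode_t and struct kprobe live in kernel headers); the only assumption that matters is that the addr member already has the pointer type the cast converts to.

```c
#include <stdint.h>

/* Hypothetical stand-ins for kprobe_opcode_t and struct kprobe. */
typedef uint16_t kprobe_opcode_sketch_t;

struct kprobe_addr_sketch {
	kprobe_opcode_sketch_t *addr;	/* already a pointer to the opcode type */
};

/* Old style: casts a pointer to the type it already has. */
static kprobe_opcode_sketch_t read_with_cast(const struct kprobe_addr_sketch *p)
{
	return *(kprobe_opcode_sketch_t *)(p->addr);
}

/* New style: plain dereference, same result, less noise. */
static kprobe_opcode_sketch_t read_without_cast(const struct kprobe_addr_sketch *p)
{
	return *p->addr;
}
```

Dropping such a cast also lets the compiler diagnose the code if the member's type ever changes, which an explicit cast would silently hide.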




[PATCH 20/20] [RFC] sh: dma: Remove unused functionality

2024-03-01 Thread Geert Uytterhoeven
dma_extend(), get_dma_info_by_name(), register_chan_caps(), and
request_dma_bycap() are unused.  Remove them, and all related code.

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/drivers/dma/dma-api.c | 116 --
 arch/sh/include/asm/dma.h |   7 --
 2 files changed, 123 deletions(-)

diff --git a/arch/sh/drivers/dma/dma-api.c b/arch/sh/drivers/dma/dma-api.c
index f49097fa634c36d4..87e5a892887360f5 100644
--- a/arch/sh/drivers/dma/dma-api.c
+++ b/arch/sh/drivers/dma/dma-api.c
@@ -41,21 +41,6 @@ struct dma_info *get_dma_info(unsigned int chan)
 }
 EXPORT_SYMBOL(get_dma_info);
 
-struct dma_info *get_dma_info_by_name(const char *dmac_name)
-{
-   struct dma_info *info;
-
-   list_for_each_entry(info, _dmac_list, list) {
-   if (dmac_name && (strcmp(dmac_name, info->name) != 0))
-   continue;
-   else
-   return info;
-   }
-
-   return NULL;
-}
-EXPORT_SYMBOL(get_dma_info_by_name);
-
 static unsigned int get_nr_channels(void)
 {
struct dma_info *info;
@@ -101,66 +86,6 @@ int get_dma_residue(unsigned int chan)
 }
 EXPORT_SYMBOL(get_dma_residue);
 
-static int search_cap(const char **haystack, const char *needle)
-{
-   const char **p;
-
-   for (p = haystack; *p; p++)
-   if (strcmp(*p, needle) == 0)
-   return 1;
-
-   return 0;
-}
-
-/**
- * request_dma_bycap - Allocate a DMA channel based on its capabilities
- * @dmac: List of DMA controllers to search
- * @caps: List of capabilities
- *
- * Search all channels of all DMA controllers to find a channel which
- * matches the requested capabilities. The result is the channel
- * number if a match is found, or %-ENODEV if no match is found.
- *
- * Note that not all DMA controllers export capabilities, in which
- * case they can never be allocated using this API, and so
- * request_dma() must be used specifying the channel number.
- */
-int request_dma_bycap(const char **dmac, const char **caps, const char *dev_id)
-{
-   unsigned int found = 0;
-   struct dma_info *info;
-   const char **p;
-   int i;
-
-   BUG_ON(!dmac || !caps);
-
-   list_for_each_entry(info, _dmac_list, list)
-   if (strcmp(*dmac, info->name) == 0) {
-   found = 1;
-   break;
-   }
-
-   if (!found)
-   return -ENODEV;
-
-   for (i = 0; i < info->nr_channels; i++) {
-   struct dma_channel *channel = &info->channels[i];
-
-   if (unlikely(!channel->caps))
-   continue;
-
-   for (p = caps; *p; p++) {
-   if (!search_cap(channel->caps, *p))
-   break;
-   if (request_dma(channel->chan, dev_id) == 0)
-   return channel->chan;
-   }
-   }
-
-   return -EINVAL;
-}
-EXPORT_SYMBOL(request_dma_bycap);
-
 int request_dma(unsigned int chan, const char *dev_id)
 {
struct dma_channel *channel = { 0 };
@@ -213,35 +138,6 @@ void dma_wait_for_completion(unsigned int chan)
 }
 EXPORT_SYMBOL(dma_wait_for_completion);
 
-int register_chan_caps(const char *dmac, struct dma_chan_caps *caps)
-{
-   struct dma_info *info;
-   unsigned int found = 0;
-   int i;
-
-   list_for_each_entry(info, _dmac_list, list)
-   if (strcmp(dmac, info->name) == 0) {
-   found = 1;
-   break;
-   }
-
-   if (unlikely(!found))
-   return -ENODEV;
-
-   for (i = 0; i < info->nr_channels; i++, caps++) {
-   struct dma_channel *channel;
-
-   if ((info->first_channel_nr + i) != caps->ch_num)
-   return -EINVAL;
-
-   channel = &info->channels[i];
-   channel->caps = caps->caplist;
-   }
-
-   return 0;
-}
-EXPORT_SYMBOL(register_chan_caps);
-
 void dma_configure_channel(unsigned int chan, unsigned long flags)
 {
struct dma_info *info = get_dma_info(chan);
@@ -267,18 +163,6 @@ int dma_xfer(unsigned int chan, unsigned long from,
 }
 EXPORT_SYMBOL(dma_xfer);
 
-int dma_extend(unsigned int chan, unsigned long op, void *param)
-{
-   struct dma_info *info = get_dma_info(chan);
-   struct dma_channel *channel = get_dma_channel(chan);
-
-   if (info->ops->extend)
-   return info->ops->extend(channel, op, param);
-
-   return -ENOSYS;
-}
-EXPORT_SYMBOL(dma_extend);
-
 static int dma_proc_show(struct seq_file *m, void *v)
 {
struct dma_info *info = v;
diff --git a/arch/sh/include/asm/dma.h b/arch/sh/include/asm/dma.h
index c8bee3f985a29393..6b6d409956d17f09 100644
--- a/arch/sh/include/asm/dma.h
+++ b/arch/sh/include/asm/dma.h
@@ -56,7 +56,6 @@ struct dma_ops {
int (*get_residue)(struct dma_channel *chan);
int (*xfer)(struct dma_channel *chan);
int 

[PATCH 16/20] sh: kprobes: Merge arch_copy_kprobe() into arch_prepare_kprobe()

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/kprobes.c:52:16: warning: no previous prototype for 
'arch_copy_kprobe' [-Wmissing-prototypes]

Although SH kprobes support was only merged in v2.6.28, it missed the
earlier removal of the arch_copy_kprobe() callback in v2.6.15.

Based on the powerpc part of commit 49a2a1b83ba6fa40 ("[PATCH] kprobes:
changed from using spinlock to mutex").

Fixes: d39f5450146ff39f ("sh: Add kprobes support.")
Signed-off-by: Geert Uytterhoeven 
---
Compile-tested only.
---
 arch/sh/kernel/kprobes.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/sh/kernel/kprobes.c b/arch/sh/kernel/kprobes.c
index aed1ea8e2c2f063b..74051b8ddf3e7bf9 100644
--- a/arch/sh/kernel/kprobes.c
+++ b/arch/sh/kernel/kprobes.c
@@ -44,17 +44,12 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
if (OPCODE_RTE(opcode))
return -EFAULT; /* Bad breakpoint */
 
+   memcpy(p->ainsn.insn, p->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
p->opcode = opcode;
 
return 0;
 }
 
-void __kprobes arch_copy_kprobe(struct kprobe *p)
-{
-   memcpy(p->ainsn.insn, p->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
-   p->opcode = *p->addr;
-}
-
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
*p->addr = BREAKPOINT_INSTRUCTION;
-- 
2.34.1
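
The merged function does two things in one place: it saves the original opcode and copies the probed instruction into the probe's private slot. A minimal userspace sketch of that shape follows; all names, the slot size and the rejected opcode value are hypothetical, and the real function works on struct kprobe and uses OPCODE_RTE() rather than a plain comparison.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types and constants. */
typedef uint16_t opcode_sketch_t;

#define MAX_INSN_SIZE_SKETCH	16
#define OPCODE_REJECT_SKETCH	0x002b	/* assumed "bad breakpoint" opcode */

struct kprobe_sketch {
	opcode_sketch_t *addr;				/* probed location */
	opcode_sketch_t opcode;				/* saved original opcode */
	opcode_sketch_t insn[MAX_INSN_SIZE_SKETCH];	/* private copy */
};

/* Validate the probed opcode, then copy the instruction and save the opcode. */
static int prepare_kprobe_sketch(struct kprobe_sketch *p)
{
	opcode_sketch_t opcode = *p->addr;

	if (opcode == OPCODE_REJECT_SKETCH)
		return -1;	/* refuse to probe this instruction */

	memcpy(p->insn, p->addr,
	       MAX_INSN_SIZE_SKETCH * sizeof(opcode_sketch_t));
	p->opcode = opcode;

	return 0;
}
```

Note that the copy always reads MAX_INSN_SIZE_SKETCH elements, so in this sketch the buffer behind addr must be at least that large.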




[PATCH 05/20] sh: return_address: Add missing #include

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/return_address.c:49:7: warning: no previous prototype for 
‘return_address’ [-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/kernel/return_address.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/sh/kernel/return_address.c b/arch/sh/kernel/return_address.c
index 8838094c9ff9444f..2ce22f11eab37839 100644
--- a/arch/sh/kernel/return_address.c
+++ b/arch/sh/kernel/return_address.c
@@ -7,7 +7,9 @@
  */
 #include 
 #include 
+
 #include 
+#include 
 
 #ifdef CONFIG_DWARF_UNWINDER
 
-- 
2.34.1




[PATCH 12/20] sh: dma: Remove unused dmac_search_free_channel()

2024-03-01 Thread Geert Uytterhoeven
arch/sh/drivers/dma/dma-api.c:164:5: warning: no previous prototype for 
'dmac_search_free_channel' [-Wmissing-prototypes]

dmac_search_free_channel() never had a user in upstream, remove it.

Signed-off-by: Geert Uytterhoeven 
---
dma_extend(), get_dma_info_by_name(), register_chan_caps(), and
request_dma_bycap() are also unused, but don't trigger warnings
---
 arch/sh/drivers/dma/dma-api.c | 27 ---
 1 file changed, 27 deletions(-)

diff --git a/arch/sh/drivers/dma/dma-api.c b/arch/sh/drivers/dma/dma-api.c
index 89cd4a3b4ccafbe2..f49097fa634c36d4 100644
--- a/arch/sh/drivers/dma/dma-api.c
+++ b/arch/sh/drivers/dma/dma-api.c
@@ -161,33 +161,6 @@ int request_dma_bycap(const char **dmac, const char **caps, const char *dev_id)
 }
 EXPORT_SYMBOL(request_dma_bycap);
 
-int dmac_search_free_channel(const char *dev_id)
-{
-   struct dma_channel *channel = { 0 };
-   struct dma_info *info = get_dma_info(0);
-   int i;
-
-   for (i = 0; i < info->nr_channels; i++) {
-   channel = &info->channels[i];
-   if (unlikely(!channel))
-   return -ENODEV;
-
-   if (atomic_read(&channel->busy) == 0)
-   break;
-   }
-
-   if (info->ops->request) {
-   int result = info->ops->request(channel);
-   if (result)
-   return result;
-
-   atomic_set(&channel->busy, 1);
-   return channel->chan;
-   }
-
-   return -ENOSYS;
-}
-
 int request_dma(unsigned int chan, const char *dev_id)
 {
struct dma_channel *channel = { 0 };
-- 
2.34.1




[PATCH 00/20] sh: Fix missing prototypes

2024-03-01 Thread Geert Uytterhoeven
Hi all,

This patch series fixes several "no previous prototype for "
warnings when building a kernel for SuperH.

Known issues:
  - The various warnings about cache functions are not yet fixed, but
I didn't want to hold off the rest of this series,
  - sdk7786_defconfig needs "[PATCH/RFC] locking/spinlocks: Make __raw_*
lock ops static" [1],
  - Probably there are more warnings to fix, I didn't build all
defconfigs.

This has been boot-tested on landisk and on qemu/rts7751r2d.

Thanks for your comments!

[1] 
https://lore.kernel.org/linux-sh/c395b02613572131568bc1fd1bc456d20d1a5426.1709325647.git.geert+rene...@glider.be

Geert Uytterhoeven (20):
  sh: pgtable: Fix missing prototypes
  sh: fpu: Add missing forward declarations
  sh: syscall: Add missing forward declaration for sys_cacheflush()
  sh: tlb: Add missing forward declaration for handle_tlbmiss()
  sh: return_address: Add missing #include 
  sh: traps: Add missing #include 
  sh: hw_breakpoint: Add missing forward declaration for
arch_bp_generic_fields()
  sh: boot: Add proper forward declarations
  sh: ftrace: Fix missing prototypes
  sh: nommu: Add missing #include 
  sh: math-emu: Add missing #include 
  sh: dma: Remove unused dmac_search_free_channel()
  sh: sh2a: Add missing #include 
  sh: sh7786: Remove unused sh7786_usb_use_exclock()
  sh: smp: Fix missing prototypes
  sh: kprobes: Merge arch_copy_kprobe() into arch_prepare_kprobe()
  sh: kprobes: Make trampoline_probe_handler() static
  sh: kprobes: Remove unneeded kprobe_opcode_t casts
  sh: dwarf: Make dwarf_lookup_fde() static
  [RFC] sh: dma: Remove unused functionality

 arch/sh/boot/compressed/cache.c |   3 +
 arch/sh/boot/compressed/cache.h |  10 ++
 arch/sh/boot/compressed/misc.c  |   8 +-
 arch/sh/boot/compressed/misc.h  |   9 ++
 arch/sh/drivers/dma/dma-api.c   | 143 
 arch/sh/include/asm/dma.h   |   7 --
 arch/sh/include/asm/fpu.h   |   3 +
 arch/sh/include/asm/ftrace.h|  10 ++
 arch/sh/include/asm/hw_breakpoint.h |   2 +
 arch/sh/include/asm/syscalls.h  |   1 +
 arch/sh/include/asm/tlb.h   |   4 +
 arch/sh/kernel/cpu/sh2a/opcode_helper.c |   2 +
 arch/sh/kernel/cpu/sh4a/setup-sh7786.c  |  14 ---
 arch/sh/kernel/dwarf.c  |   2 +-
 arch/sh/kernel/kprobes.c|  13 +--
 arch/sh/kernel/return_address.c |   2 +
 arch/sh/kernel/smp.c|   4 +-
 arch/sh/kernel/traps.c  |  10 +-
 arch/sh/kernel/traps_32.c   |   1 +
 arch/sh/math-emu/math.c |   2 +
 arch/sh/mm/nommu.c  |   2 +
 arch/sh/mm/pgtable.c|   4 +-
 arch/sh/mm/tlbex_32.c   |   1 +
 23 files changed, 68 insertions(+), 189 deletions(-)
 create mode 100644 arch/sh/boot/compressed/cache.h
 create mode 100644 arch/sh/boot/compressed/misc.h

-- 
2.34.1

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
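
For readers unfamiliar with the warning this series addresses: GCC and Clang emit "no previous prototype" under -Wmissing-prototypes when a non-static function is defined without a declaration in scope. The sketch below shows the two fixes used throughout the series, with hypothetical function names; in real code the prototype would live in a header that both the definition and its callers include.

```c
/*
 * Fix 1: the function is used from other files, so it keeps external
 * linkage and gets a prototype (normally placed in a shared header)
 * that is visible before the definition.
 */
int demo_add(int a, int b);

int demo_add(int a, int b)
{
	return a + b;
}

/*
 * Fix 2: the function is only used in this file, so it is made static.
 * A static function needs no separate prototype and cannot clash with
 * symbols in other files.
 */
static int demo_double(int x)
{
	return demo_add(x, x);
}
```

Compiling a file like this with -Wmissing-prototypes should produce no warning for either function; removing the prototype line while leaving demo_add() non-static brings the warning back.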



[PATCH 15/20] sh: smp: Fix missing prototypes

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/smp.c:173:17: warning: no previous prototype for 
'start_secondary' [-Wmissing-prototypes]
arch/sh/kernel/smp.c:324:5: warning: no previous prototype for 
'setup_profiling_timer' [-Wmissing-prototypes]

Make start_secondary() static, as it is only used in this file.
Include  to fix the other warning.

There are no users outside this file, so make it static.

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/kernel/smp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c
index 5cf35a774dc70082..b3ea50aabba3d7f2 100644
--- a/arch/sh/kernel/smp.c
+++ b/arch/sh/kernel/smp.c
@@ -21,6 +21,8 @@
 #include 
 #include 
 #include 
+#include 
+
 #include 
 #include 
 #include 
@@ -170,7 +172,7 @@ void native_play_dead(void)
 }
 #endif
 
-asmlinkage void start_secondary(void)
+static asmlinkage void start_secondary(void)
 {
unsigned int cpu = smp_processor_id();
struct mm_struct *mm = _mm;
-- 
2.34.1




[PATCH 08/20] sh: boot: Add proper forward declarations

2024-03-01 Thread Geert Uytterhoeven
arch/sh/boot/compressed/cache.c:2:5: warning: no previous prototype for 
‘cache_control’ [-Wmissing-prototypes]
arch/sh/boot/compressed/misc.c:115:6: warning: no previous prototype for 
‘ftrace_stub’ [-Wmissing-prototypes]
arch/sh/boot/compressed/misc.c:118:6: warning: no previous prototype for 
‘arch_ftrace_ops_list_func’ [-Wmissing-prototypes]
arch/sh/boot/compressed/misc.c:128:6: warning: no previous prototype for 
‘decompress_kernel’ [-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/boot/compressed/cache.c |  3 +++
 arch/sh/boot/compressed/cache.h | 10 ++
 arch/sh/boot/compressed/misc.c  |  8 +++-
 arch/sh/boot/compressed/misc.h  |  9 +
 4 files changed, 25 insertions(+), 5 deletions(-)
 create mode 100644 arch/sh/boot/compressed/cache.h
 create mode 100644 arch/sh/boot/compressed/misc.h

diff --git a/arch/sh/boot/compressed/cache.c b/arch/sh/boot/compressed/cache.c
index 31e04ff4841ed084..95c1e73ccbb7e011 100644
--- a/arch/sh/boot/compressed/cache.c
+++ b/arch/sh/boot/compressed/cache.c
@@ -1,4 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
+
+#include "cache.h"
+
 int cache_control(unsigned int command)
 {
volatile unsigned int *p = (volatile unsigned int *) 0x8000;
diff --git a/arch/sh/boot/compressed/cache.h b/arch/sh/boot/compressed/cache.h
new file mode 100644
index ..b622b68c87f59b97
--- /dev/null
+++ b/arch/sh/boot/compressed/cache.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef CACHE_H
+#define CACHE_H
+
+#define CACHE_ENABLE   0
+#define CACHE_DISABLE  1
+
+int cache_control(unsigned int command);
+
+#endif /* CACHE_H */
diff --git a/arch/sh/boot/compressed/misc.c b/arch/sh/boot/compressed/misc.c
index ca05c99a3d5b488d..5178150ca6650dcf 100644
--- a/arch/sh/boot/compressed/misc.c
+++ b/arch/sh/boot/compressed/misc.c
@@ -16,6 +16,9 @@
 #include 
 #include 
 
+#include "cache.h"
+#include "misc.h"
+
 /*
  * gzip declarations
  */
@@ -26,11 +29,6 @@
 #undef memcpy
 #define memzero(s, n) memset ((s), 0, (n))
 
-/* cache.c */
-#define CACHE_ENABLE  0
-#define CACHE_DISABLE 1
-int cache_control(unsigned int command);
-
 extern char input_data[];
 extern int input_len;
 static unsigned char *output;
diff --git a/arch/sh/boot/compressed/misc.h b/arch/sh/boot/compressed/misc.h
new file mode 100644
index ..2b4534faa3052857
--- /dev/null
+++ b/arch/sh/boot/compressed/misc.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef MISC_H
+#define MISC_H
+
+void arch_ftrace_ops_list_func(void);
+void decompress_kernel(void);
+void ftrace_stub(void);
+
+#endif /* MISC_H */
-- 
2.34.1




[PATCH 06/20] sh: traps: Add missing #include

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/traps_32.c:735:6: warning: no previous prototype for 
‘per_cpu_trap_init’ [-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/kernel/traps_32.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/sh/kernel/traps_32.c b/arch/sh/kernel/traps_32.c
index 6cdda3a621a1e577..8cd4b05df75c3e07 100644
--- a/arch/sh/kernel/traps_32.c
+++ b/arch/sh/kernel/traps_32.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
-- 
2.34.1




[PATCH 07/20] sh: hw_breakpoint: Add missing forward declaration for arch_bp_generic_fields()

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/hw_breakpoint.c:135:5: warning: no previous prototype for 
‘arch_bp_generic_fields’ [-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/include/asm/hw_breakpoint.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/sh/include/asm/hw_breakpoint.h b/arch/sh/include/asm/hw_breakpoint.h
index 361a0f57bdebda6e..74a438cea6559efd 100644
--- a/arch/sh/include/asm/hw_breakpoint.h
+++ b/arch/sh/include/asm/hw_breakpoint.h
@@ -52,6 +52,8 @@ struct pmu;
 
 /* arch/sh/kernel/hw_breakpoint.c */
 extern int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw);
+extern int arch_bp_generic_fields(int sh_len, int sh_type, int *gen_len,
+ int *gen_type);
 extern int hw_breakpoint_arch_parse(struct perf_event *bp,
const struct perf_event_attr *attr,
struct arch_hw_breakpoint *hw);
-- 
2.34.1




[PATCH 01/20] sh: pgtable: Fix missing prototypes

2024-03-01 Thread Geert Uytterhoeven
arch/sh/mm/pgtable.c:12:6: warning: no previous prototype for 'pgd_ctor' 
[-Wmissing-prototypes]
arch/sh/mm/pgtable.c:34:8: warning: no previous prototype for 'pgd_alloc' 
[-Wmissing-prototypes]
arch/sh/mm/pgtable.c:39:6: warning: no previous prototype for 'pgd_free' 
[-Wmissing-prototypes]
arch/sh/mm/pgtable.c:45:6: warning: no previous prototype for 'pud_populate' 
[-Wmissing-prototypes]
arch/sh/mm/pgtable.c:50:8: warning: no previous prototype for 'pmd_alloc_one' 
[-Wmissing-prototypes]
arch/sh/mm/pgtable.c:55:6: warning: no previous prototype for 'pmd_free' 
[-Wmissing-prototypes]

Make pgd_ctor() static, as it is only used in this file.
Include  to fix the other warnings.

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/mm/pgtable.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/sh/mm/pgtable.c b/arch/sh/mm/pgtable.c
index cf7ce4b5735954bf..3a4085ea0161fe56 100644
--- a/arch/sh/mm/pgtable.c
+++ b/arch/sh/mm/pgtable.c
@@ -2,12 +2,14 @@
 #include 
 #include 
 
+#include 
+
 static struct kmem_cache *pgd_cachep;
 #if PAGETABLE_LEVELS > 2
 static struct kmem_cache *pmd_cachep;
 #endif
 
-void pgd_ctor(void *x)
+static void pgd_ctor(void *x)
 {
pgd_t *pgd = x;
 
-- 
2.34.1




[PATCH 13/20] sh: sh2a: Add missing #include

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/cpu/sh2a/opcode_helper.c:34:14: warning: no previous prototype 
for 'instruction_size' [-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/kernel/cpu/sh2a/opcode_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/sh/kernel/cpu/sh2a/opcode_helper.c b/arch/sh/kernel/cpu/sh2a/opcode_helper.c
index c509081d90b9affe..fcf53f5827eb286c 100644
--- a/arch/sh/kernel/cpu/sh2a/opcode_helper.c
+++ b/arch/sh/kernel/cpu/sh2a/opcode_helper.c
@@ -8,6 +8,8 @@
  */
 #include 
 
+#include 
+
 /*
  * Instructions on SH are generally fixed at 16-bits, however, SH-2A
  * introduces some 32-bit instructions. Since there are no real
-- 
2.34.1




[PATCH 09/20] sh: ftrace: Fix missing prototypes

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/ftrace.c:130:6: warning: no previous prototype for 
‘arch_ftrace_nmi_enter’ [-Wmissing-prototypes]
arch/sh/kernel/ftrace.c:140:6: warning: no previous prototype for 
‘arch_ftrace_nmi_exit’ [-Wmissing-prototypes]
arch/sh/kernel/ftrace.c:316:6: warning: no previous prototype for 
‘prepare_ftrace_return’ [-Wmissing-prototypes]

Fix this by moving existing forward declarations to , and
adding the missing forward declaration for prepare_ftrace_return().

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/include/asm/ftrace.h | 10 ++
 arch/sh/kernel/traps.c   | 10 ++
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/sh/include/asm/ftrace.h b/arch/sh/include/asm/ftrace.h
index b1c1dc0cc261d1db..1c10e106639098fc 100644
--- a/arch/sh/include/asm/ftrace.h
+++ b/arch/sh/include/asm/ftrace.h
@@ -33,6 +33,8 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
return addr;
 }
 
+void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr);
+
 #endif /* __ASSEMBLY__ */
 #endif /* CONFIG_FUNCTION_TRACER */
 
@@ -43,6 +45,14 @@ extern void *return_address(unsigned int);
 
 #define ftrace_return_address(n) return_address(n)
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+extern void arch_ftrace_nmi_enter(void);
+extern void arch_ftrace_nmi_exit(void);
+#else
+static inline void arch_ftrace_nmi_enter(void) { }
+static inline void arch_ftrace_nmi_exit(void) { }
+#endif
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __ASM_SH_FTRACE_H */
diff --git a/arch/sh/kernel/traps.c b/arch/sh/kernel/traps.c
index 01884054aeb2bd30..4339c4cafa79ce2a 100644
--- a/arch/sh/kernel/traps.c
+++ b/arch/sh/kernel/traps.c
@@ -15,6 +15,8 @@
 
 #include 
 #include   /* print_modules */
+
+#include 
 #include 
 #include 
 
@@ -170,14 +172,6 @@ BUILD_TRAP_HANDLER(bug)
force_sig(SIGTRAP);
 }
 
-#ifdef CONFIG_DYNAMIC_FTRACE
-extern void arch_ftrace_nmi_enter(void);
-extern void arch_ftrace_nmi_exit(void);
-#else
-static inline void arch_ftrace_nmi_enter(void) { }
-static inline void arch_ftrace_nmi_exit(void) { }
-#endif
-
 BUILD_TRAP_HANDLER(nmi)
 {
TRAP_HANDLER_DECL;
-- 
2.34.1
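
The declarations moved into the header follow a common kernel idiom: real prototypes when a feature is configured in, empty static inline stubs otherwise, so callers never need their own #ifdef. A minimal sketch of the idiom, with a hypothetical DEMO_CONFIG_HOOKS macro standing in for a Kconfig symbol and hypothetical function names:

```c
/*
 * Hypothetical stand-in for a Kconfig symbol such as CONFIG_DYNAMIC_FTRACE.
 * It is deliberately left undefined here, so the stubs below are used.
 */
/* #define DEMO_CONFIG_HOOKS 1 */

#ifdef DEMO_CONFIG_HOOKS
/* Feature enabled: the real functions are defined in another file. */
void demo_nmi_enter(void);
void demo_nmi_exit(void);
#else
/* Feature disabled: empty stubs that the compiler can optimize away. */
static inline void demo_nmi_enter(void) { }
static inline void demo_nmi_exit(void) { }
#endif

/* The caller is written once and contains no conditional compilation. */
static int demo_handle_nmi(int work)
{
	int result;

	demo_nmi_enter();
	result = work * 2;
	demo_nmi_exit();

	return result;
}
```

Keeping both variants in one header also means every file that includes it sees a prototype, which is what silences -Wmissing-prototypes for the real definitions.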




[PATCH 14/20] sh: sh7786: Remove unused sh7786_usb_use_exclock()

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/cpu/sh4a/setup-sh7786.c:411:13: warning: no previous prototype 
for 'sh7786_usb_use_exclock' [-Wmissing-prototypes]

Upstream never had a user of sh7786_usb_use_exclock(), remove it.

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/kernel/cpu/sh4a/setup-sh7786.c | 14 --
 1 file changed, 14 deletions(-)

diff --git a/arch/sh/kernel/cpu/sh4a/setup-sh7786.c b/arch/sh/kernel/cpu/sh4a/setup-sh7786.c
index 74620f30b19badbd..c048842d8a589866 100644
--- a/arch/sh/kernel/cpu/sh4a/setup-sh7786.c
+++ b/arch/sh/kernel/cpu/sh4a/setup-sh7786.c
@@ -400,20 +400,6 @@ static struct platform_device *sh7786_devices[] __initdata = {
_ohci_device,
 };
 
-/*
- * Please call this function if your platform board
- * use external clock for USB
- * */
-#define USBCTL0 0xffe70858
-#define CLOCK_MODE_MASK 0xff7f
-#define EXT_CLOCK_MODE  0x0080
-
-void __init sh7786_usb_use_exclock(void)
-{
-   u32 val = __raw_readl(USBCTL0) & CLOCK_MODE_MASK;
-   __raw_writel(val | EXT_CLOCK_MODE, USBCTL0);
-}
-
 #define USBINITREG1 0xffe70094
 #define USBINITREG2 0xffe7009c
 #define USBINITVAL1 0x00ff0040
-- 
2.34.1




[PATCH 11/20] sh: math-emu: Add missing #include

2024-03-01 Thread Geert Uytterhoeven
arch/sh/math-emu/math.c:492:5: warning: no previous prototype for 'do_fpu_inst' 
[-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/math-emu/math.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/sh/math-emu/math.c b/arch/sh/math-emu/math.c
index cdaef6501d764a0c..b65703e065735663 100644
--- a/arch/sh/math-emu/math.c
+++ b/arch/sh/math-emu/math.c
@@ -15,6 +15,8 @@
 #include 
 
 #include 
+
+#include 
 #include 
 #include 
 
-- 
2.34.1




[PATCH 10/20] sh: nommu: Add missing #include

2024-03-01 Thread Geert Uytterhoeven
arch/sh/mm/nommu.c:76:13: warning: no previous prototype for 
'kmap_coherent_init' [-Wmissing-prototypes]
arch/sh/mm/nommu.c:80:7: warning: no previous prototype for 'kmap_coherent' 
[-Wmissing-prototypes]
arch/sh/mm/nommu.c:86:6: warning: no previous prototype for 'kunmap_coherent' 
[-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/mm/nommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/sh/mm/nommu.c b/arch/sh/mm/nommu.c
index 78c4b6e6d33ba3af..fa3dc9428a737ffe 100644
--- a/arch/sh/mm/nommu.c
+++ b/arch/sh/mm/nommu.c
@@ -10,6 +10,8 @@
 #include 
 #include 
 #include 
+
+#include 
 #include 
 #include 
 #include 
-- 
2.34.1




[PATCH 02/20] sh: fpu: Add missing forward declarations

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/cpu/sh4/fpu.c:389:6: warning: no previous prototype for 
‘float_raise’ [-Wmissing-prototypes]
arch/sh/kernel/cpu/sh4/fpu.c:394:5: warning: no previous prototype for 
‘float_rounding_mode’ [-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/include/asm/fpu.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/sh/include/asm/fpu.h b/arch/sh/include/asm/fpu.h
index 04584be8986c418a..0379f4cce5ed25fb 100644
--- a/arch/sh/include/asm/fpu.h
+++ b/arch/sh/include/asm/fpu.h
@@ -64,6 +64,9 @@ static inline void clear_fpu(struct task_struct *tsk, struct pt_regs *regs)
preempt_enable();
 }
 
+void float_raise(unsigned int flags);
+int float_rounding_mode(void);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __ASM_SH_FPU_H */
-- 
2.34.1




[PATCH 17/20] sh: kprobes: Make trampoline_probe_handler() static

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/kprobes.c:299:15: warning: no previous prototype for 
'trampoline_probe_handler' [-Wmissing-prototypes]

There are no users outside this file, so make it static.

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/kernel/kprobes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sh/kernel/kprobes.c b/arch/sh/kernel/kprobes.c
index 74051b8ddf3e7bf9..d8c2e399d6e50794 100644
--- a/arch/sh/kernel/kprobes.c
+++ b/arch/sh/kernel/kprobes.c
@@ -296,7 +296,7 @@ static void __used kretprobe_trampoline_holder(void)
 /*
  * Called when we hit the probe point at __kretprobe_trampoline
  */
-int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
+static int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 {
regs->pc = __kretprobe_trampoline_handler(regs, NULL);
 
-- 
2.34.1
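
For readers unfamiliar with -Wmissing-prototypes, here is a minimal
user-space sketch of the two fixes used in this series: declare the
function before its definition (normally through a shared header), or
make it static when nothing outside the file uses it. All names below
are hypothetical stand-ins, not the arch/sh functions, and the return
values are made up for illustration.

```c
/*
 * Build with: gcc -Wmissing-prototypes -c demo.c
 * (the file name and all identifiers are hypothetical)
 */

/* Fix 1: prior declarations, which would normally live in a header. */
int demo_rounding_mode(void);
int demo_run(void);

/* Non-static definition: no warning, because a prototype precedes it. */
int demo_rounding_mode(void)
{
	return 1;
}

/* Fix 2: no users outside this file, so internal linkage is enough. */
static int demo_trampoline_handler(int pc)
{
	return pc + 4;
}

/* Calls both helpers so that neither is reported as unused. */
int demo_run(void)
{
	return demo_rounding_mode() + demo_trampoline_handler(0);
}
```

Removing one of the prototypes, or the static keyword, should bring the
warning back for the affected function.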




[PATCH 04/20] sh: tlb: Add missing forward declaration for handle_tlbmiss()

2024-03-01 Thread Geert Uytterhoeven
arch/sh/mm/tlbex_32.c:22:1: warning: no previous prototype for ‘handle_tlbmiss’ 
[-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/include/asm/tlb.h | 4 
 arch/sh/mm/tlbex_32.c | 1 +
 2 files changed, 5 insertions(+)

diff --git a/arch/sh/include/asm/tlb.h b/arch/sh/include/asm/tlb.h
index aeb8915e92549609..ddf324bfb9a09721 100644
--- a/arch/sh/include/asm/tlb.h
+++ b/arch/sh/include/asm/tlb.h
@@ -24,6 +24,10 @@ static inline void tlb_unwire_entry(void)
BUG();
 }
 #endif /* CONFIG_CPU_SH4 */
+
+asmlinkage int handle_tlbmiss(struct pt_regs *regs, unsigned long error_code,
+ unsigned long address);
+
 #endif /* CONFIG_MMU */
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_SH_TLB_H */
diff --git a/arch/sh/mm/tlbex_32.c b/arch/sh/mm/tlbex_32.c
index 1c53868632ee4c69..7d58578c15f4ef55 100644
--- a/arch/sh/mm/tlbex_32.c
+++ b/arch/sh/mm/tlbex_32.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Called with interrupts disabled.
-- 
2.34.1




[PATCH 03/20] sh: syscall: Add missing forward declaration for sys_cacheflush()

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/sys_sh.c:58:16: warning: no previous prototype for 
‘sys_cacheflush’ [-Wmissing-prototypes]

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/include/asm/syscalls.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/sh/include/asm/syscalls.h b/arch/sh/include/asm/syscalls.h
index 387105316d2882fe..39240e06e8aa5f6b 100644
--- a/arch/sh/include/asm/syscalls.h
+++ b/arch/sh/include/asm/syscalls.h
@@ -8,6 +8,7 @@ asmlinkage int old_mmap(unsigned long addr, unsigned long len,
 asmlinkage long sys_mmap2(unsigned long addr, unsigned long len,
  unsigned long prot, unsigned long flags,
  unsigned long fd, unsigned long pgoff);
+asmlinkage int sys_cacheflush(unsigned long addr, unsigned long len, int op);
 
 #include 
 
-- 
2.34.1




[PATCH 19/20] sh: dwarf: Make dwarf_lookup_fde() static

2024-03-01 Thread Geert Uytterhoeven
arch/sh/kernel/dwarf.c:347:19: warning: no previous prototype for 
'dwarf_lookup_fde' [-Wmissing-prototypes]

There are no users outside this file, so make it static.

Signed-off-by: Geert Uytterhoeven 
---
 arch/sh/kernel/dwarf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sh/kernel/dwarf.c b/arch/sh/kernel/dwarf.c
index bf8682e718303051..45c8ae20d10957db 100644
--- a/arch/sh/kernel/dwarf.c
+++ b/arch/sh/kernel/dwarf.c
@@ -344,7 +344,7 @@ static struct dwarf_cie *dwarf_lookup_cie(unsigned long 
cie_ptr)
  * dwarf_lookup_fde - locate the FDE that covers pc
  * @pc: the program counter
  */
-struct dwarf_fde *dwarf_lookup_fde(unsigned long pc)
+static struct dwarf_fde *dwarf_lookup_fde(unsigned long pc)
 {
struct rb_node **rb_node = _root.rb_node;
struct dwarf_fde *fde = NULL;
-- 
2.34.1




Re: [PATCH RFC ftrace] Chose RCU Tasks based on TASKS_RCU rather than PREEMPTION

2024-03-01 Thread Steven Rostedt
On Fri, 1 Mar 2024 12:25:10 -0800
"Paul E. McKenney"  wrote:

> > That would work for me.  If there are no objections, I will make this
> > change.  
> 
> But I did check the latency of synchronize_rcu_tasks_rude() (about 100ms)
> and synchronize_rcu() (about 20ms).  This is on a 80-hardware-thread
> x86 system that is being flooded with calls to one or the other of
> these two functions, but is otherwise idle.  So adding that unnecessary
> synchronize_rcu() adds about 20% to that synchronization delay.
> 
> Which might still be OK, but...  In the immortal words of MS-DOS,
> "Are you sure?".  ;-)

It's just safe to keep it. It's definitely not a fast path.

-- Steve



Re: [PATCH RFC ftrace] Chose RCU Tasks based on TASKS_RCU rather than PREEMPTION

2024-03-01 Thread Paul E. McKenney
On Wed, Feb 28, 2024 at 01:16:04PM -0800, Paul E. McKenney wrote:
> On Wed, Feb 28, 2024 at 03:22:36PM -0500, Steven Rostedt wrote:
> > On Wed, 28 Feb 2024 11:38:29 -0800
> > "Paul E. McKenney"  wrote:
> > 
> > > The advent of CONFIG_PREEMPT_AUTO, AKA lazy preemption, will mean that
> > > even kernels built with CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY
> > > might see the occasional preemption, and that this preemption just might
> > > happen within a trampoline.
> > > 
> > > Therefore, update ftrace_shutdown() to invoke synchronize_rcu_tasks()
> > > based on CONFIG_TASKS_RCU instead of CONFIG_PREEMPTION.
> > > 
> > > Only build tested.
> > > 
> > > Signed-off-by: Paul E. McKenney 
> > > Cc: Steven Rostedt 
> > > Cc: Masami Hiramatsu 
> > > Cc: Mark Rutland 
> > > Cc: Mathieu Desnoyers 
> > > Cc: Ankur Arora 
> > > Cc: Thomas Gleixner 
> > > Cc: 
> > > 
> > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > > index 2da4eaa2777d6..c9e6c69cf3446 100644
> > > --- a/kernel/trace/ftrace.c
> > > +++ b/kernel/trace/ftrace.c
> > > @@ -3156,7 +3156,7 @@ int ftrace_shutdown(struct ftrace_ops *ops, int 
> > > command)
> > >* synchronize_rcu_tasks() will wait for those tasks to
> > >* execute and either schedule voluntarily or enter user space.
> > >*/
> > > - if (IS_ENABLED(CONFIG_PREEMPTION))
> > > + if (IS_ENABLED(CONFIG_TASKS_RCU))
> > >   synchronize_rcu_tasks();
> > 
> > What happens if CONFIG_TASKS_RCU is not enabled? Does
> > synchronize_rcu_tasks() do anything? Or is it just a synchronize_rcu()?
> 
> It is just a synchronize_rcu().
> 
> > If that's the case, perhaps just remove the if statement and make it:
> > 
> > synchronize_rcu_tasks();
> > 
> > Not sure an extra synchronize_rcu() will hurt (especially after doing a
> > synchronize_rcu_tasks_rude() just beforehand)!
> 
> That would work for me.  If there are no objections, I will make this
> change.

But I did check the latency of synchronize_rcu_tasks_rude() (about 100ms)
and synchronize_rcu() (about 20ms).  This is on a 80-hardware-thread
x86 system that is being flooded with calls to one or the other of
these two functions, but is otherwise idle.  So adding that unnecessary
synchronize_rcu() adds about 20% to that synchronization delay.

Which might still be OK, but...  In the immortal words of MS-DOS,
"Are you sure?".  ;-)

Thanx, Paul
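
To put the numbers above side by side, here is a small user-space model
of the two options for a kernel built without CONFIG_TASKS_RCU, where
(per the thread) synchronize_rcu_tasks() is just synchronize_rcu(). The
model_* functions are hypothetical stand-ins that only add up the
latencies quoted above. They are not the kernel primitives, and the
figures come from a single 80-hardware-thread x86 test system.

```c
#include <stdbool.h>

/* Latencies quoted in the thread, in milliseconds. */
enum {
	RUDE_MS = 100,	/* synchronize_rcu_tasks_rude() */
	RCU_MS  = 20,	/* synchronize_rcu() */
};

/* Hypothetical accounting stand-ins: they only accumulate modeled time. */
static unsigned int elapsed_ms;

static void model_synchronize_rcu_tasks_rude(void)
{
	elapsed_ms += RUDE_MS;
}

/* Without CONFIG_TASKS_RCU this is modeled as a plain synchronize_rcu(). */
static void model_synchronize_rcu_tasks(void)
{
	elapsed_ms += RCU_MS;
}

/*
 * Modeled delay of the tail of ftrace_shutdown() on a !CONFIG_TASKS_RCU
 * kernel.
 * unconditional == false: keep the IS_ENABLED() check, so the call is skipped.
 * unconditional == true:  drop the check and always call
 *                         synchronize_rcu_tasks().
 */
unsigned int model_shutdown_ms(bool unconditional)
{
	elapsed_ms = 0;
	model_synchronize_rcu_tasks_rude();
	if (unconditional)
		model_synchronize_rcu_tasks();
	return elapsed_ms;
}

/* Extra delay of the unconditional variant, as a percentage of the guarded one. */
unsigned int model_added_delay_percent(void)
{
	unsigned int guarded = model_shutdown_ms(false);
	unsigned int always = model_shutdown_ms(true);

	return (always - guarded) * 100 / guarded;
}
```

With these inputs the model gives 100 ms for the guarded variant and
120 ms for the unconditional one, which is the 20% figure mentioned above.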



Re: [PATCH 3/3] dt-bindings: remoteproc: Add Arm remoteproc

2024-03-01 Thread Krzysztof Kozlowski
On 01/03/2024 17:42, abdellatif.elkhl...@arm.com wrote:
> From: Abdellatif El Khlifi 
> 
> introduce the bindings for Arm remoteproc support.
> 
> Signed-off-by: Abdellatif El Khlifi 
> ---
>  .../bindings/remoteproc/arm,rproc.yaml| 69 +++
>  MAINTAINERS   |  1 +

Fix order of patches - bindings are always before the user (see
submitting bindings doc).

>  2 files changed, 70 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml
> 
> diff --git a/Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml 
> b/Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml
> new file mode 100644
> index ..322197158059
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml
> @@ -0,0 +1,69 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/remoteproc/arm,rproc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Arm Remoteproc Devices

That's quite generic... does it apply to all ARM designs?

> +
> +maintainers:
> +  - Abdellatif El Khlifi 
> +
> +description: |
> +  Some Arm heterogeneous System-On-Chips feature remote processors that can
> +  be controlled with a reset control register and a reset status register to
> +  start or stop the processor.
> +
> +  This document defines the bindings for these remote processors.

Drop last sentence.

> +
> +properties:
> +  compatible:
> +enum:
> +  - arm,corstone1000-extsys
> +
> +  reg:
> +minItems: 2
> +maxItems: 2
> +description: |
> +  Address and size in bytes of the reset control register
> +  and the reset status register.
> +  Expects the registers to be in the order as above.
> +  Should contain an entry for each value in 'reg-names'.

Entirely redundant sentences... instead of all this, just list the items
with descriptions.

> +
> +  reg-names:
> +description: |
> +  Required names for each of the reset registers defined in
> +  the 'reg' property. Expects the names from the following
> +  list, in the specified order, each representing the corresponding
> +  reset register.

Really, drop.

> +items:
> +  - const: reset-control
> +  - const: reset-status
> +
> +  firmware-name:
> +description: |

Do not need '|' unless you need to preserve formatting.

> +  Default name of the firmware to load to the remote processor.
> +
> +required:
> +  - compatible
> +  - reg
> +  - reg-names
> +  - firmware-name
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +extsys0: remoteproc@1a010310 {

Drop label, not used.

> +compatible = "arm,corstone1000-extsys";

Use 4 spaces for example indentation.

> +reg = <0x1a010310 0x4>, <0x1a010314 0x4>;
> +reg-names = "reset-control", "reset-status";
> +firmware-name = "es0_flashfw.elf";
> +};
> +
> +extsys1: remoteproc@1a010318 {
> +compatible = "arm,corstone1000-extsys";

These are the same examples, so keep only one.

> +reg = <0x1a010318 0x4>, <0x1a01031c 0x4>;
> +reg-names = "reset-control", "reset-status";
> +firmware-name = "es1_flashfw.elf";
> +};

Best regards,
Krzysztof




Re: [PATCH 2/3] arm64: dts: Add corstone1000 external system device node

2024-03-01 Thread Krzysztof Kozlowski
On 01/03/2024 17:42, abdellatif.elkhl...@arm.com wrote:
> From: Abdellatif El Khlifi 
> 
> add device tree node for the external system core in Corstone-1000
> 
> Signed-off-by: Abdellatif El Khlifi 
> ---
>  arch/arm64/boot/dts/arm/corstone1000.dtsi | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/boot/dts/arm/corstone1000.dtsi 
> b/arch/arm64/boot/dts/arm/corstone1000.dtsi
> index 6ad7829f9e28..67df642363e9 100644
> --- a/arch/arm64/boot/dts/arm/corstone1000.dtsi
> +++ b/arch/arm64/boot/dts/arm/corstone1000.dtsi
> @@ -1,6 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0 OR MIT
>  /*
> - * Copyright (c) 2022, Arm Limited. All rights reserved.
> + * Copyright 2022, 2024, Arm Limited and/or its affiliates 
> 
>   * Copyright (c) 2022, Linaro Limited. All rights reserved.
>   *
>   */
> @@ -157,5 +157,13 @@ mhu_seh1: mailbox@1b83 {
>   secure-status = "okay"; /* secure-world-only */
>   status = "disabled";
>   };
> +
> + extsys0: remoteproc@1a010310 {

Looks not really ordered.

> + compatible = "arm,corstone1000-extsys";
> + reg = <0x1a010310 0x4>,
> + <0x1a010314 0X4>;

And this needs alignment.


Best regards,
Krzysztof




Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-03-01 Thread Willem de Bruijn
Maciej Fijalkowski wrote:
> On Wed, Feb 28, 2024 at 07:05:56PM +0800, Yunjian Wang wrote:
> > This patch set allows TUN to support the AF_XDP Tx zero-copy feature,
> > which can significantly reduce CPU utilization for XDP programs.
> 
> Why no Rx ZC support though? What will happen if I try rxdrop xdpsock
> against tun with this patch? You clearly allow for that.

This is AF_XDP receive zerocopy, right?

The naming is always confusing with tun, but even though from a tun
PoV this happens on ndo_start_xmit, it is the AF_XDP equivalent to
tun_put_user.

So the implementation is more like other device's Rx ZC.

I would have preferred that name, but I think Jason asked for this
and given tun's weird status, there is something to be said for either.



[PATCH v12 3/4] dts: zynqmp: add properties for TCM in remoteproc

2024-03-01 Thread Tanmay Shah
Add properties as per the new bindings in the zynqmp remoteproc node
to represent TCM address and size.

This patch also adds an alternative remoteproc node to represent the
remoteproc cluster in split mode. By default, lockstep mode is
enabled, and users should disable it before using the split mode
dts. The two device-tree nodes can't be used simultaneously; one
of them must be disabled. For the zcu102-1.0 and zcu102-1.1 boards,
the remoteproc split mode dts node is enabled and the lockstep mode
dts node is disabled.

Signed-off-by: Tanmay Shah 
---
 .../boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts  |  8 +++
 arch/arm64/boot/dts/xilinx/zynqmp.dtsi| 65 +--
 2 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts 
b/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts
index c8f71a1aec89..495ca94b45db 100644
--- a/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts
+++ b/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts
@@ -14,6 +14,14 @@ / {
compatible = "xlnx,zynqmp-zcu102-rev1.0", "xlnx,zynqmp-zcu102", 
"xlnx,zynqmp";
 };
 
+_split {
+   status = "okay";
+};
+
+_lockstep {
+   status = "disabled";
+};
+
  {
#address-cells = <1>;
#size-cells = <1>;
diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi 
b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
index eaba466804bc..c8a7fd0f3a1e 100644
--- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
+++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
@@ -248,19 +248,74 @@ fpga_full: fpga-full {
ranges;
};
 
-   remoteproc {
+   rproc_lockstep: remoteproc@ffe0 {
compatible = "xlnx,zynqmp-r5fss";
xlnx,cluster-mode = <1>;
 
-   r5f-0 {
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   ranges = <0x0 0x0 0x0 0xffe0 0x0 0x1>,
+<0x0 0x2 0x0 0xffe2 0x0 0x1>,
+<0x0 0x1 0x0 0xffe1 0x0 0x1>,
+<0x0 0x3 0x0 0xffe3 0x0 0x1>;
+
+   r5f@0 {
+   compatible = "xlnx,zynqmp-r5f";
+   reg = <0x0 0x0 0x0 0x1>,
+ <0x0 0x2 0x0 0x1>,
+ <0x0 0x1 0x0 0x1>,
+ <0x0 0x3 0x0 0x1>;
+   reg-names = "atcm0", "btcm0", "atcm1", "btcm1";
+   power-domains = <_firmware PD_RPU_0>,
+   <_firmware PD_R5_0_ATCM>,
+   <_firmware PD_R5_0_BTCM>,
+   <_firmware PD_R5_1_ATCM>,
+   <_firmware PD_R5_1_BTCM>;
+   memory-region = <_0_fw_image>;
+   };
+
+   r5f@1 {
+   compatible = "xlnx,zynqmp-r5f";
+   reg = <0x1 0x0 0x0 0x1>, <0x1 0x2 0x0 0x1>;
+   reg-names = "atcm0", "btcm0";
+   power-domains = <_firmware PD_RPU_1>,
+   <_firmware PD_R5_1_ATCM>,
+   <_firmware PD_R5_1_BTCM>;
+   memory-region = <_1_fw_image>;
+   };
+   };
+
+   rproc_split: remoteproc-split@ffe0 {
+   status = "disabled";
+   compatible = "xlnx,zynqmp-r5fss";
+   xlnx,cluster-mode = <0>;
+
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   ranges = <0x0 0x0 0x0 0xffe0 0x0 0x1>,
+<0x0 0x2 0x0 0xffe2 0x0 0x1>,
+<0x1 0x0 0x0 0xffe9 0x0 0x1>,
+<0x1 0x2 0x0 0xffeb 0x0 0x1>;
+
+   r5f@0 {
compatible = "xlnx,zynqmp-r5f";
-   power-domains = <_firmware PD_RPU_0>;
+   reg = <0x0 0x0 0x0 0x1>, <0x0 0x2 0x0 0x1>;
+   reg-names = "atcm0", "btcm0";
+   power-domains = <_firmware PD_RPU_0>,
+   <_firmware PD_R5_0_ATCM>,
+   <_firmware PD_R5_0_BTCM>;
memory-region = <_0_fw_image>;
};
 
-   r5f-1 {
+   r5f@1 {
compatible = "xlnx,zynqmp-r5f";
-   power-domains = <_firmware PD_RPU_1>;
+   reg = <0x1 0x0 0x0 0x1>, <0x1 0x2 0x0 0x1>;
+   reg-names = "atcm0", "btcm0";
+   power-domains = <_firmware PD_RPU_1>,
+   <_firmware PD_R5_1_ATCM>,
+   <_firmware PD_R5_1_BTCM>;
memory-region = <_1_fw_image>;
};
};
-- 
2.25.1




[PATCH v12 2/4] dt-bindings: remoteproc: add Tightly Coupled Memory (TCM) bindings

2024-03-01 Thread Tanmay Shah
From: Radhey Shyam Pandey 

Introduce bindings for the TCM memory address space on the AMD-Xilinx Zynq
UltraScale+ platform. This helps define TCM in the device tree and
makes its access platform-agnostic and data-driven.

Tightly-coupled memories (TCMs) are low-latency memories that provide
predictable instruction execution and predictable data load/store
timing. Each Cortex-R5F processor contains two 64-bit wide 64 KB memory
banks on the ATCM and BTCM ports, for a total of 128 KB of memory.

The TCM resources (reg, reg-names and power-domains) are documented for
each TCM in the R5 node. The reg and reg-names properties are made
required because we don't want to hardcode TCM addresses for future
platforms. For the zu+ legacy implementation, the driver will ensure that
an old dts without reg/reg-names keeps working and a stable ABI is
maintained.

It also extends the examples for TCM split and lockstep modes.

Signed-off-by: Radhey Shyam Pandey 
Signed-off-by: Tanmay Shah 
---

Changes in v12:
  - add "reg", "reg-names" and "power-domains" in pattern properties
  - add "reg" and "reg-names" in required list
  - keep "power-domains" in required list as it was before the change

Changes in v11:
  - Fix yamllint warning and reduce indentation as needed

 .../remoteproc/xlnx,zynqmp-r5fss.yaml | 188 --
 1 file changed, 168 insertions(+), 20 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml 
b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
index 78aac69f1060..dc6ce308688f 100644
--- a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
+++ b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
@@ -20,9 +20,21 @@ properties:
   compatible:
 const: xlnx,zynqmp-r5fss
 
+  "#address-cells":
+const: 2
+
+  "#size-cells":
+const: 2
+
+  ranges:
+description: |
+  Standard ranges definition providing address translations for
+  local R5F TCM address spaces to bus addresses.
+
   xlnx,cluster-mode:
 $ref: /schemas/types.yaml#/definitions/uint32
 enum: [0, 1, 2]
+default: 1
 description: |
   The RPU MPCore can operate in split mode (Dual-processor performance), 
Safety
   lock-step mode(Both RPU cores execute the same code in lock-step,
@@ -37,7 +49,7 @@ properties:
   2: single cpu mode
 
 patternProperties:
-  "^r5f-[a-f0-9]+$":
+  "^r5f@[0-9a-f]+$":
 type: object
 description: |
   The RPU is located in the Low Power Domain of the Processor Subsystem.
@@ -54,8 +66,17 @@ patternProperties:
   compatible:
 const: xlnx,zynqmp-r5f
 
+  reg:
+minItems: 1
+maxItems: 4
+
+  reg-names:
+minItems: 1
+maxItems: 4
+
   power-domains:
-maxItems: 1
+minItems: 2
+maxItems: 5
 
   mboxes:
 minItems: 1
@@ -101,35 +122,162 @@ patternProperties:
 
 required:
   - compatible
+  - reg
+  - reg-names
   - power-domains
 
-unevaluatedProperties: false
-
 required:
   - compatible
+  - "#address-cells"
+  - "#size-cells"
+  - ranges
+
+allOf:
+  - if:
+  properties:
+xlnx,cluster-mode:
+  enum:
+- 1
+then:
+  patternProperties:
+"^r5f@[0-9a-f]+$":
+  type: object
+
+  properties:
+reg:
+  minItems: 1
+  items:
+- description: ATCM internal memory
+- description: BTCM internal memory
+- description: extra ATCM memory in lockstep mode
+- description: extra BTCM memory in lockstep mode
+
+reg-names:
+  minItems: 1
+  items:
+- const: atcm0
+- const: btcm0
+- const: atcm1
+- const: btcm1
+
+else:
+  patternProperties:
+"^r5f@[0-9a-f]+$":
+  type: object
+
+  properties:
+reg:
+  minItems: 1
+  items:
+- description: ATCM internal memory
+- description: BTCM internal memory
+
+reg-names:
+  minItems: 1
+  items:
+- const: atcm0
+- const: btcm0
+
+power-domains:
+  maxItems: 3
 
 additionalProperties: false
 
 examples:
   - |
-remoteproc {
-compatible = "xlnx,zynqmp-r5fss";
-xlnx,cluster-mode = <1>;
-
-r5f-0 {
-compatible = "xlnx,zynqmp-r5f";
-power-domains = <_firmware 0x7>;
-memory-region = <_0_fw_image>, <>, 
<>, <>;
-mboxes = <_mailbox_rpu0 0>, <_mailbox_rpu0 1>;
-mbox-names = "tx", "rx";
+#include 
+
+// Split mode configuration
+soc {
+#address-cells = <2>;
+#size-cells = <2>;
+
+remoteproc@ffe0 {
+compatible = "xlnx,zynqmp-r5fss";
+xlnx,cluster-mode = <0>;
+
+

[PATCH v12 4/4] remoteproc: zynqmp: parse TCM from device tree

2024-03-01 Thread Tanmay Shah
ZynqMP TCM information was hardcoded in the driver. Now that ZynqMP TCM
information is available in the device tree, parse it in the driver
as per the new bindings.

Signed-off-by: Tanmay Shah 
---

Changes in v12:
  - None

Changes in v11:
  - Remove redundant initialization of the variable
  - return correct error code if memory allocation failed

 drivers/remoteproc/xlnx_r5_remoteproc.c | 112 ++--
 1 file changed, 107 insertions(+), 5 deletions(-)

diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c 
b/drivers/remoteproc/xlnx_r5_remoteproc.c
index 42b0384d34f2..d4a22caebaad 100644
--- a/drivers/remoteproc/xlnx_r5_remoteproc.c
+++ b/drivers/remoteproc/xlnx_r5_remoteproc.c
@@ -74,8 +74,8 @@ struct mbox_info {
 };
 
 /*
- * Hardcoded TCM bank values. This will be removed once TCM bindings are
- * accepted for system-dt specifications and upstreamed in linux kernel
+ * Hardcoded TCM bank values. This will stay in driver to maintain backward
+ * compatibility with device-tree that does not have TCM information.
  */
 static const struct mem_bank_data zynqmp_tcm_banks_split[] = {
{0xffe0UL, 0x0, 0x1UL, PD_R5_0_ATCM, "atcm0"}, /* TCM 64KB each 
*/
@@ -757,6 +757,103 @@ static struct zynqmp_r5_core 
*zynqmp_r5_add_rproc_core(struct device *cdev)
return ERR_PTR(ret);
 }
 
+static int zynqmp_r5_get_tcm_node_from_dt(struct zynqmp_r5_cluster *cluster)
+{
+   int i, j, tcm_bank_count, ret, tcm_pd_idx, pd_count;
+   struct of_phandle_args out_args;
+   struct zynqmp_r5_core *r5_core;
+   struct platform_device *cpdev;
+   struct mem_bank_data *tcm;
+   struct device_node *np;
+   struct resource *res;
+   u64 abs_addr, size;
+   struct device *dev;
+
+   for (i = 0; i < cluster->core_count; i++) {
+   r5_core = cluster->r5_cores[i];
+   dev = r5_core->dev;
+   np = r5_core->np;
+
+   pd_count = of_count_phandle_with_args(np, "power-domains",
+ "#power-domain-cells");
+
+   if (pd_count <= 0) {
+   dev_err(dev, "invalid power-domains property, %d\n", 
pd_count);
+   return -EINVAL;
+   }
+
+   /* First entry in power-domains list is for r5 core, rest for 
TCM. */
+   tcm_bank_count = pd_count - 1;
+
+   if (tcm_bank_count <= 0) {
+   dev_err(dev, "invalid TCM count %d\n", tcm_bank_count);
+   return -EINVAL;
+   }
+
+   r5_core->tcm_banks = devm_kcalloc(dev, tcm_bank_count,
+ sizeof(struct mem_bank_data 
*),
+ GFP_KERNEL);
+   if (!r5_core->tcm_banks)
+   return -ENOMEM;
+
+   r5_core->tcm_bank_count = tcm_bank_count;
+   for (j = 0, tcm_pd_idx = 1; j < tcm_bank_count; j++, 
tcm_pd_idx++) {
+   tcm = devm_kzalloc(dev, sizeof(struct mem_bank_data),
+  GFP_KERNEL);
+   if (!tcm)
+   return -ENOMEM;
+
+   r5_core->tcm_banks[j] = tcm;
+
+   /* Get power-domains id of TCM. */
+   ret = of_parse_phandle_with_args(np, "power-domains",
+"#power-domain-cells",
+tcm_pd_idx, _args);
+   if (ret) {
+   dev_err(r5_core->dev,
+   "failed to get tcm %d pm domain, ret 
%d\n",
+   tcm_pd_idx, ret);
+   return ret;
+   }
+   tcm->pm_domain_id = out_args.args[0];
+   of_node_put(out_args.np);
+
+   /* Get TCM address without translation. */
+   ret = of_property_read_reg(np, j, _addr, );
+   if (ret) {
+   dev_err(dev, "failed to get reg property\n");
+   return ret;
+   }
+
+   /*
+* Remote processor can address only 32 bits
+* so convert 64-bits into 32-bits. This will discard
+* any unwanted upper 32-bits.
+*/
+   tcm->da = (u32)abs_addr;
+   tcm->size = (u32)size;
+
+   cpdev = to_platform_device(dev);
+   res = platform_get_resource(cpdev, IORESOURCE_MEM, j);
+   if (!res) {
+   dev_err(dev, "failed to get tcm resource\n");
+   return -EINVAL;
+   }
+
+ 

[PATCH v12 1/4] remoteproc: zynqmp: fix lockstep mode memory region

2024-03-01 Thread Tanmay Shah
In lockstep mode, R5 core0 uses the TCM of R5 core1. The following is the
lockstep mode memory region as per the hardware reference manual.

|  *TCM*             |   *R5 View* | *Linux view* |
| R5_0 ATCM (128 KB) | 0x0000_0000 | 0xFFE0_0000  |
| R5_0 BTCM (128 KB) | 0x0002_0000 | 0xFFE2_0000  |

However, the driver shouldn't model it as above because the R5 core0 TCM
and core1 TCM have different power-domains mapped to them.
Hence, the TCM address space in lockstep mode should be modeled as 64 KB
regions only, where each region has its own power-domain, as follows:

|  *TCM*             |   *R5 View* | *Linux view* |
| R5_0 ATCM0 (64 KB) | 0x0000_0000 | 0xFFE0_0000  |
| R5_0 BTCM0 (64 KB) | 0x0002_0000 | 0xFFE2_0000  |
| R5_0 ATCM1 (64 KB) | 0x0001_0000 | 0xFFE1_0000  |
| R5_0 BTCM1 (64 KB) | 0x0003_0000 | 0xFFE3_0000  |

This makes driver maintenance easy and makes the design robust for future
platforms as well.
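
The per-bank layout described above can be sketched as a small lookup,
shown here as a self-contained user-space fragment rather than driver
code. struct tcm_bank and tcm_da_to_linux() are hypothetical names (the
driver's own type is struct mem_bank_data), and the addresses are written
out in full as this sketch reads the table, so treat them as assumptions
to check against the reference manual.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bank descriptor; the driver's struct mem_bank_data differs. */
struct tcm_bank {
	uint32_t da;		/* start address as seen by the R5 core */
	uint64_t addr;		/* start address as seen by Linux */
	uint32_t size;		/* bank size in bytes */
	const char *name;	/* matches the reg-names entry */
};

/* Lockstep mode modeled as four 64 KB banks (assumed values, see lead-in). */
static const struct tcm_bank lockstep_banks[] = {
	{ 0x00000000u, 0xffe00000ULL, 0x10000u, "atcm0" },
	{ 0x00020000u, 0xffe20000ULL, 0x10000u, "btcm0" },
	{ 0x00010000u, 0xffe10000ULL, 0x10000u, "atcm1" },
	{ 0x00030000u, 0xffe30000ULL, 0x10000u, "btcm1" },
};

/*
 * Translate an R5-view address into the Linux-view address.
 * Returns 0 when no bank covers the address.
 */
uint64_t tcm_da_to_linux(uint32_t da)
{
	size_t i;

	for (i = 0; i < sizeof(lockstep_banks) / sizeof(lockstep_banks[0]); i++) {
		const struct tcm_bank *b = &lockstep_banks[i];

		if (da >= b->da && da - b->da < b->size)
			return b->addr + (da - b->da);
	}
	return 0;
}
```

Keeping one entry per 64 KB bank means each entry can carry its own
power-domain, which is the point of the change described above.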

Signed-off-by: Tanmay Shah 
---

Changes in v12:
  - None

 drivers/remoteproc/xlnx_r5_remoteproc.c | 145 ++--
 1 file changed, 12 insertions(+), 133 deletions(-)

diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c 
b/drivers/remoteproc/xlnx_r5_remoteproc.c
index 4395edea9a64..42b0384d34f2 100644
--- a/drivers/remoteproc/xlnx_r5_remoteproc.c
+++ b/drivers/remoteproc/xlnx_r5_remoteproc.c
@@ -84,12 +84,12 @@ static const struct mem_bank_data zynqmp_tcm_banks_split[] 
= {
{0xffebUL, 0x2, 0x1UL, PD_R5_1_BTCM, "btcm1"},
 };
 
-/* In lockstep mode cluster combines each 64KB TCM and makes 128KB TCM */
+/* In lockstep mode cluster uses each 64KB TCM from second core as well */
 static const struct mem_bank_data zynqmp_tcm_banks_lockstep[] = {
-   {0xffe0UL, 0x0, 0x2UL, PD_R5_0_ATCM, "atcm0"}, /* TCM 128KB 
each */
-   {0xffe2UL, 0x2, 0x2UL, PD_R5_0_BTCM, "btcm0"},
-   {0, 0, 0, PD_R5_1_ATCM, ""},
-   {0, 0, 0, PD_R5_1_BTCM, ""},
+   {0xffe0UL, 0x0, 0x1UL, PD_R5_0_ATCM, "atcm0"}, /* TCM 64KB each 
*/
+   {0xffe2UL, 0x2, 0x1UL, PD_R5_0_BTCM, "btcm0"},
+   {0xffe1UL, 0x1, 0x1UL, PD_R5_1_ATCM, "atcm1"},
+   {0xffe3UL, 0x3, 0x1UL, PD_R5_1_BTCM, "btcm1"},
 };
 
 /**
@@ -540,14 +540,14 @@ static int tcm_mem_map(struct rproc *rproc,
 }
 
 /*
- * add_tcm_carveout_split_mode()
+ * add_tcm_banks()
  * @rproc: single R5 core's corresponding rproc instance
  *
- * allocate and add remoteproc carveout for TCM memory in split mode
+ * allocate and add remoteproc carveout for TCM memory
  *
  * return 0 on success, otherwise non-zero value on failure
  */
-static int add_tcm_carveout_split_mode(struct rproc *rproc)
+static int add_tcm_banks(struct rproc *rproc)
 {
struct rproc_mem_entry *rproc_mem;
struct zynqmp_r5_core *r5_core;
@@ -580,10 +580,10 @@ static int add_tcm_carveout_split_mode(struct rproc 
*rproc)
 ZYNQMP_PM_REQUEST_ACK_BLOCKING);
if (ret < 0) {
dev_err(dev, "failed to turn on TCM 0x%x", 
pm_domain_id);
-   goto release_tcm_split;
+   goto release_tcm;
}
 
-   dev_dbg(dev, "TCM carveout split mode %s addr=%llx, da=0x%x, 
size=0x%lx",
+   dev_dbg(dev, "TCM carveout %s addr=%llx, da=0x%x, size=0x%lx",
bank_name, bank_addr, da, bank_size);
 
rproc_mem = rproc_mem_entry_init(dev, NULL, bank_addr,
@@ -593,7 +593,7 @@ static int add_tcm_carveout_split_mode(struct rproc *rproc)
if (!rproc_mem) {
ret = -ENOMEM;
zynqmp_pm_release_node(pm_domain_id);
-   goto release_tcm_split;
+   goto release_tcm;
}
 
rproc_add_carveout(rproc, rproc_mem);
@@ -601,7 +601,7 @@ static int add_tcm_carveout_split_mode(struct rproc *rproc)
 
return 0;
 
-release_tcm_split:
+release_tcm:
/* If failed, Turn off all TCM banks turned on before */
for (i--; i >= 0; i--) {
pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
@@ -610,127 +610,6 @@ static int add_tcm_carveout_split_mode(struct rproc 
*rproc)
return ret;
 }
 
-/*
- * add_tcm_carveout_lockstep_mode()
- * @rproc: single R5 core's corresponding rproc instance
- *
- * allocate and add remoteproc carveout for TCM memory in lockstep mode
- *
- * return 0 on success, otherwise non-zero value on failure
- */
-static int add_tcm_carveout_lockstep_mode(struct rproc *rproc)
-{
-   struct rproc_mem_entry *rproc_mem;
-   struct zynqmp_r5_core *r5_core;
-   int i, num_banks, ret;
-   phys_addr_t bank_addr;
-   size_t bank_size = 0;
-   struct device *dev;
-   u32 pm_domain_id;
-   char *bank_name;
-   u32 da;
-
-   r5_core = rproc->priv;
-   dev = r5_core->dev;
-
-   /* Go through zynqmp banks for r5 node */
-   

[PATCH v12 0/4] add zynqmp TCM bindings

2024-03-01 Thread Tanmay Shah
Tightly-Coupled Memories (TCMs) are low-latency memories that provide
predictable instruction execution and predictable data load/store
timing. Each Cortex-R5F processor contains two exclusive 64 KB memory
banks on the ATCM and BTCM ports, for a total of 128 KB of memory.
In lockstep mode, both 128 KB memories are accessible to the cluster.

As per the ZynqMP UltraScale+ Technical Reference Manual UG1085, the
following is the address space of the TCM memory. The bindings in this
patch series introduce properties to accommodate this address space with
address translation between the Linux and Cortex-R5 views.

|  *Mode*            |   *R5 View*  | *Linux view* |  Notes               |
| ---                | ---          | ---          | ---                  |
| *Split Mode*       | *start addr* | *start addr* |                      |
| R5_0 ATCM (64 KB)  | 0x0000_0000  | 0xFFE0_0000  |                      |
| R5_0 BTCM (64 KB)  | 0x0002_0000  | 0xFFE2_0000  |                      |
| R5_1 ATCM (64 KB)  | 0x0000_0000  | 0xFFE9_0000  | alias of 0xFFE1_0000 |
| R5_1 BTCM (64 KB)  | 0x0002_0000  | 0xFFEB_0000  | alias of 0xFFE3_0000 |
| *Lockstep Mode*    |              |              |                      |
| R5_0 ATCM (128 KB) | 0x0000_0000  | 0xFFE0_0000  |                      |
| R5_0 BTCM (128 KB) | 0x0002_0000  | 0xFFE2_0000  |                      |

References:
UG1085 TCM address space:
https://docs.xilinx.com/r/en-US/ug1085-zynq-ultrascale-trm/Tightly-Coupled-Memory-Address-Map

Changes in v12:
  - add "reg", "reg-names" and "power-domains" in pattern properties
  - add "reg" and "reg-names" in required list
  - keep "power-domains" in required list as it was before the change

Changes in v11:
  - Fix yamllint warning and reduce indentation as needed
  - Remove redundant initialization of the variable
  - Return correct error code if memory allocation failed

Changes in v10:
  - Add new patch (1/4) to series that changes hardcoded TCM addresses in
    lockstep mode and removes separate handling of TCM in lockstep and
    split mode
  - modify number of "reg", "reg-names" and "power-domains" entries
based on cluster mode
  - Add extra optional atcm and btcm in "reg" property for lockstep mode
  - Add "reg-names" for extra optional atcm and btcm for lockstep mode
  - Drop previous Ack as bindings has new change
  - Add individual tcm regions via "reg" and "reg-names" for lockstep mode
  - Add each tcm's power-domains in lockstep mode
  - Drop previous Ack as new change in dts patchset
  - Remove redundant changes in driver to handle TCM in lockstep mode

Changes in v9:
  - Fix rproc lockstep dts
  - Introduce new API to request and release core1 TCM power-domains in
lockstep mode. This will be used during prepare -> add_tcm_banks
callback to enable TCM in lockstep mode.
  - Parse TCM from device-tree in lockstep mode and split mode in
uniform way.
  - Fix TCM representation in device-tree in lockstep mode.
  - Fix comments as suggested

Changes in v8:
  - Remove use of pm_domains framework
  - Remove checking of pm_domain_id validation to power on/off tcm
  - Remove spurious change
  - parse power-domains property from device-tree and use EEMI calls
to power on/off TCM instead of using pm domains framework

Changes in v7:
  - %s/pm_dev1/pm_dev_core0/r
  - %s/pm_dev_link1/pm_dev_core0_link/r
  - %s/pm_dev2/pm_dev_core1/r
  - %s/pm_dev_link2/pm_dev_core1_link/r
  - remove pm_domain_id check to move next patch
  - add comment about how 1st entry in pm domain list is used
  - fix loop when jump to fail_add_pm_domains loop
  - move checking of pm_domain_id from previous patch
  - fix mem_bank_data memory allocation

Changes in v6:
  - Introduce new node entry for r5f cluster split mode dts and
keep it disabled by default.
  - Keep remoteproc lockstep mode enabled by default to maintain
    backward compatibility.
  - Enable split mode only for zcu102 board to demo split mode use
  - Remove spurious change
  - Handle errors in add_pm_domains function
  - Remove redundant code to handle errors from remove_pm_domains
  - Missing . at the end of the commit message
  - remove redundant initialization of variables
  - remove fail_tcm label and relevant code to free memory
acquired using devm_* API, as it is freed automatically when the device is released
  - add extra check to see if "reg" property is supported or not

Changes in v5:
  - maintain Rob's Ack on bindings patch as no changes in bindings
  - split previous patch into multiple patches
  - Use pm domain framework to turn on/off TCM
  - Add support of parsing TCM information from device-tree
  - maintain backward compatibility with previous bindings without
TCM information available in device-tree

This patch series continues previous effort to upstream ZynqMP
TCM bindings:
Previous v4 version link:
https://lore.kernel.org/all/20230829181900.2561194-1-tanmay.s...@amd.com/

Previous v3 version link:

Re: [PATCH] ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment

2024-03-01 Thread Steven Rostedt
On Fri, 1 Mar 2024 11:37:54 -0500
Mathieu Desnoyers  wrote:

> On 2024-03-01 10:49, Steven Rostedt wrote:
> > On Fri, 1 Mar 2024 13:37:18 +0800
> > linke  wrote:
> >   
> >>> So basically you are worried about read-tearing?
> >>>
> >>> That wasn't mentioned in the change log.  
> >>
> >> Yes. Sorry for making this confused, I am not very familiar with this and
> >> still learning.  
> > 
> > No problem. We all have to learn this anyway.
> >   
> >>  
> >>> Funny part is, if the above timestamp read did a tear, then this would
> >>> definitely not match, and would return the correct value. That is, the
> >>> buffer is not empty because the only way for this to get corrupted is if
> >>> something is in the process of writing to it.  
> >>
> >> I agree with you here.
> >>
> >>commit = rb_page_commit(commit_page);
> >>
> >> But if commit_page above is the result of a torn read, the commit field
> >> read by rb_page_commit() may not represent a valid value.  
> > 
> > But commit_page is a word length, and I will argue that any compiler that
> > tears "long" words is broken. ;-)  
> 
> [ For those tuning in, we are discussing ring_buffer_iter_empty()
>"commit_page = cpu_buffer->commit_page;" racy load. ]
> 
> I counter-argue that real-world compilers *are* broken based on your
> personal definition, but we have to deal with them, as documented
> in Documentation/memory-barriers.txt (see below).
> 
> What is the added overhead of using a READ_ONCE() there ? Why are
> we wasting effort trying to guess the compiler behavior if the
> real-world performance impact is insignificant ?
> 
> Quote from memory-barriers.txt explaining the purpose of {READ,WRITE}_ONCE():
> 
> "(*) For aligned memory locations whose size allows them to be accessed
>   with a single memory-reference instruction, prevents "load tearing"
>   and "store tearing," in which a single large access is replaced by
>   multiple smaller accesses."
> 
> I agree that {READ,WRITE}_ONCE() are really not needed at initialization,
> when there are demonstrably no concurrent accesses to the data.
> 
> But trying to eliminate {READ,WRITE}_ONCE() on concurrently accessed fields
> just adds complexity, prevents static analyzers from properly understanding
> the code and reporting issues, and just obfuscates the code.
> 
> Thanks,
> 
> Mathieu
> 
> >   
> >>
> >> In this case, READ_ONCE() is only needed for the commit_page.  
> > 
> > But we can at least keep the READ_ONCE() on the commit_page just because it
> > is used in the next instruction.
> 

And here I did state that READ_ONCE() does have another use case. So
there's no argument about adding it here.

-- Steve
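
The point debated above (snapshot the pointer once, then derive every
dependent value from that snapshot) can be sketched in user-space C. This is
only an illustration: READ_ONCE_SKETCH and the two structs are hypothetical
stand-ins, not the kernel's definitions, and the macro relies on the GNU C
__typeof__ extension.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough user-space approximation of READ_ONCE(): the volatile access asks
 * the compiler to emit exactly one load of the object, so it may neither
 * split the load nor repeat it later. (Assumption: GCC or Clang.)
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, heavily simplified stand-ins for the ring buffer types. */
struct page_sketch {
	uint64_t time_stamp;
	unsigned long commit;
};

struct cpu_buffer_sketch {
	struct page_sketch *commit_page;	/* a writer may update this */
};

/*
 * Load the pointer once and read both fields through the local copy, so
 * time_stamp and commit are guaranteed to come from the same page.
 */
static void snapshot_commit(struct cpu_buffer_sketch *cpu_buffer,
			    uint64_t *ts, unsigned long *commit)
{
	struct page_sketch *commit_page =
		READ_ONCE_SKETCH(cpu_buffer->commit_page);

	assert(commit_page);
	*ts = commit_page->time_stamp;
	*commit = commit_page->commit;
}
```

Calling snapshot_commit() on a buffer whose commit_page points at a page
with time_stamp 42 and commit 7 yields exactly those two values, regardless
of what a concurrent writer later stores into commit_page.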




[RESEND PATCH v5 5/5] input/touchscreen: imagis: add support for IST3032C

2024-03-01 Thread Karel Balej
From: Karel Balej 

IST3032C is a touchscreen chip used for instance in the
samsung,coreprimevelte smartphone, with which this was tested. Add the
chip specific information to the driver.

Reviewed-by: Markuss Broks 
Signed-off-by: Karel Balej 
---

Notes:
v4:
* Change the WHOAMI definition position to preserve alphanumerical order
  of the definitions.
* Add Markuss' Reviewed-by trailer.

 drivers/input/touchscreen/imagis.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/input/touchscreen/imagis.c b/drivers/input/touchscreen/imagis.c
index 9af8a6332ae6..e1fafa561ee3 100644
--- a/drivers/input/touchscreen/imagis.c
+++ b/drivers/input/touchscreen/imagis.c
@@ -11,6 +11,8 @@
 #include 
 #include 
 
+#define IST3032C_WHOAMI			0x32c
+
 #define IST3038B_REG_STATUS		0x20
 #define IST3038B_REG_CHIPID		0x30
 #define IST3038B_WHOAMI			0x30380b
@@ -363,6 +365,13 @@ static int imagis_resume(struct device *dev)
 
 static DEFINE_SIMPLE_DEV_PM_OPS(imagis_pm_ops, imagis_suspend, imagis_resume);
 
+static const struct imagis_properties imagis_3032c_data = {
+   .interrupt_msg_cmd = IST3038C_REG_INTR_MESSAGE,
+   .touch_coord_cmd = IST3038C_REG_TOUCH_COORD,
+   .whoami_cmd = IST3038C_REG_CHIPID,
+   .whoami_val = IST3032C_WHOAMI,
+};
+
 static const struct imagis_properties imagis_3038b_data = {
.interrupt_msg_cmd = IST3038B_REG_STATUS,
.touch_coord_cmd = IST3038B_REG_STATUS,
@@ -380,6 +389,7 @@ static const struct imagis_properties imagis_3038c_data = {
 
 #ifdef CONFIG_OF
 static const struct of_device_id imagis_of_match[] = {
+	{ .compatible = "imagis,ist3032c", .data = &imagis_3032c_data },
 	{ .compatible = "imagis,ist3038b", .data = &imagis_3038b_data },
 	{ .compatible = "imagis,ist3038c", .data = &imagis_3038c_data },
 	{ },
-- 
2.44.0
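
The per-chip match data used by this series can be sketched outside the
kernel as a plain lookup table. This is a minimal illustration only: the
struct is reduced to two fields, lookup_props() and check_chip_id() are
hypothetical helpers standing in for device_get_match_data() and the probe
check, and only register values visible in the patches are used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced, hypothetical version of struct imagis_properties. */
struct chip_props {
	unsigned int whoami_cmd;	/* register to read the chip ID from */
	unsigned int whoami_val;	/* expected chip ID */
};

struct match_entry {
	const char *compatible;
	const struct chip_props *data;
};

/* IST3032C reuses the IST3038C chip ID register: 0x40001000 | BIT(31). */
static const struct chip_props props_3032c = { 0xC0001000u, 0x32c };
static const struct chip_props props_3038b = { 0x30, 0x30380b };

static const struct match_entry match_table[] = {
	{ "imagis,ist3032c", &props_3032c },
	{ "imagis,ist3038b", &props_3038b },
	{ NULL, NULL },		/* sentinel, like the empty of_device_id */
};

/* Return the properties for a compatible string, or NULL if unknown. */
static const struct chip_props *lookup_props(const char *compatible)
{
	const struct match_entry *entry;

	for (entry = match_table; entry->compatible; entry++) {
		if (strcmp(entry->compatible, compatible) == 0)
			return entry->data;
	}
	return NULL;
}

/* Mirror the probe-time check: 0 if the ID matches, -1 otherwise. */
static int check_chip_id(const char *compatible, unsigned int chip_id)
{
	const struct chip_props *props = lookup_props(compatible);

	if (!props || chip_id != props->whoami_val)
		return -1;
	return 0;
}
```

Supporting another variant in this scheme only requires a new chip_props
instance and one more table row, which is the shape of the IST3032C patch.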




[RESEND PATCH v5 4/5] dt-bindings: input/touchscreen: imagis: add compatible for IST3032C

2024-03-01 Thread Karel Balej
From: Karel Balej 

IST3032C is a touchscreen IC which seems mostly compatible with IST3038C
except that it reports a different chip ID value.

Acked-by: Rob Herring 
Signed-off-by: Karel Balej 
---

Notes:
v5:
- Add Rob's trailer.
v4:
- Reword commit description to mention how this IC differs from the
  already supported.

 .../devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml   | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml b/Documentation/devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml
index b5372c4eae56..2af71cbcc97d 100644
--- a/Documentation/devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml
+++ b/Documentation/devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml
@@ -18,6 +18,7 @@ properties:
 
   compatible:
 enum:
+  - imagis,ist3032c
   - imagis,ist3038b
   - imagis,ist3038c
 
-- 
2.44.0




[RESEND PATCH v5 3/5] input/touchscreen: imagis: Add support for Imagis IST3038B

2024-03-01 Thread Karel Balej
From: Markuss Broks 

Imagis IST3038B is another variant of Imagis IST3038 IC, which has
a different register interface from IST3038C (possibly firmware defined).
This should also work for IST3044B (though untested), however other
variants using this interface/protocol(IST3026, IST3032, IST3026B,
IST3032B) have a different format for coordinates, and they'd need
additional effort to be supported by this driver.

Signed-off-by: Markuss Broks 
Signed-off-by: Karel Balej 
---

Notes:
v4:
* Sort the definitions in alphanumerical order.

 drivers/input/touchscreen/imagis.c | 58 --
 1 file changed, 47 insertions(+), 11 deletions(-)

diff --git a/drivers/input/touchscreen/imagis.c b/drivers/input/touchscreen/imagis.c
index e67fd3011027..9af8a6332ae6 100644
--- a/drivers/input/touchscreen/imagis.c
+++ b/drivers/input/touchscreen/imagis.c
@@ -11,9 +11,13 @@
 #include 
 #include 
 
+#define IST3038B_REG_STATUS		0x20
+#define IST3038B_REG_CHIPID		0x30
+#define IST3038B_WHOAMI			0x30380b
+
 #define IST3038C_HIB_ACCESS		(0x800B << 16)
 #define IST3038C_DIRECT_ACCESS		BIT(31)
-#define IST3038C_REG_CHIPID		0x40001000
+#define IST3038C_REG_CHIPID		(0x40001000 | IST3038C_DIRECT_ACCESS)
 #define IST3038C_REG_HIB_BASE		0x3100
 #define IST3038C_REG_TOUCH_STATUS	(IST3038C_REG_HIB_BASE | IST3038C_HIB_ACCESS)
 #define IST3038C_REG_TOUCH_COORD	(IST3038C_REG_HIB_BASE | IST3038C_HIB_ACCESS | 0x8)
@@ -31,8 +35,17 @@
 #define IST3038C_FINGER_COUNT_SHIFT	12
 #define IST3038C_FINGER_STATUS_MASK	GENMASK(9, 0)
 
+struct imagis_properties {
+   unsigned int interrupt_msg_cmd;
+   unsigned int touch_coord_cmd;
+   unsigned int whoami_cmd;
+   unsigned int whoami_val;
+   bool protocol_b;
+};
+
 struct imagis_ts {
struct i2c_client *client;
+   const struct imagis_properties *tdata;
struct input_dev *input_dev;
struct touchscreen_properties prop;
struct regulator_bulk_data supplies[2];
@@ -84,8 +97,7 @@ static irqreturn_t imagis_interrupt(int irq, void *dev_id)
int i;
int error;
 
-	error = imagis_i2c_read_reg(ts, IST3038C_REG_INTR_MESSAGE,
-				    &intr_message);
+	error = imagis_i2c_read_reg(ts, ts->tdata->interrupt_msg_cmd, &intr_message);
 	if (error) {
 		dev_err(&ts->client->dev,
 			"failed to read the interrupt message: %d\n", error);
@@ -104,9 +116,13 @@ static irqreturn_t imagis_interrupt(int irq, void *dev_id)
finger_pressed = intr_message & IST3038C_FINGER_STATUS_MASK;
 
for (i = 0; i < finger_count; i++) {
-		error = imagis_i2c_read_reg(ts,
-					    IST3038C_REG_TOUCH_COORD + (i * 4),
-					    &finger_status);
+		if (ts->tdata->protocol_b)
+			error = imagis_i2c_read_reg(ts,
+						    ts->tdata->touch_coord_cmd, &finger_status);
+		else
+			error = imagis_i2c_read_reg(ts,
+						    ts->tdata->touch_coord_cmd + (i * 4),
+						    &finger_status);
 		if (error) {
 			dev_err(&ts->client->dev,
 				"failed to read coordinates for finger %d: %d\n",
@@ -261,6 +277,12 @@ static int imagis_probe(struct i2c_client *i2c)
 
ts->client = i2c;
 
+   ts->tdata = device_get_match_data(dev);
+   if (!ts->tdata) {
+   dev_err(dev, "missing chip data\n");
+   return -EINVAL;
+   }
+
error = imagis_init_regulators(ts);
if (error) {
dev_err(dev, "regulator init error: %d\n", error);
@@ -279,15 +301,13 @@ static int imagis_probe(struct i2c_client *i2c)
return error;
}
 
-	error = imagis_i2c_read_reg(ts,
-			IST3038C_REG_CHIPID | IST3038C_DIRECT_ACCESS,
-			&chip_id);
+	error = imagis_i2c_read_reg(ts, ts->tdata->whoami_cmd, &chip_id);
if (error) {
dev_err(dev, "chip ID read failure: %d\n", error);
return error;
}
 
-   if (chip_id != IST3038C_WHOAMI) {
+   if (chip_id != ts->tdata->whoami_val) {
dev_err(dev, "unknown chip ID: 0x%x\n", chip_id);
return -EINVAL;
}
@@ -343,9 +363,25 @@ static int imagis_resume(struct device *dev)
 
 static DEFINE_SIMPLE_DEV_PM_OPS(imagis_pm_ops, imagis_suspend, imagis_resume);
 
+static const struct imagis_properties imagis_3038b_data = {
+   .interrupt_msg_cmd = IST3038B_REG_STATUS,
+   .touch_coord_cmd = IST3038B_REG_STATUS,
+   .whoami_cmd = IST3038B_REG_CHIPID,
+   .whoami_val = IST3038B_WHOAMI,
+   .protocol_b = true,
+};
+
+static const struct 

[RESEND PATCH v5 2/5] dt-bindings: input/touchscreen: Add compatible for IST3038B

2024-03-01 Thread Karel Balej
From: Markuss Broks 

Imagis IST3038B is a variant (firmware?) of Imagis IST3038 IC
differing from IST3038C in its register interface. Add the
compatible for it to the IST3038C bindings.

Signed-off-by: Markuss Broks 
Acked-by: Conor Dooley 
[bal...@matfyz.cz: elaborate chip differences in the commit message]
Signed-off-by: Karel Balej 
---

Notes:
v4:
* Mention how the chip is different in terms of the programming model in
  the commit message.
* Add Conor's trailer.

 .../devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml   | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml b/Documentation/devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml
index 0d6b033fd5fb..b5372c4eae56 100644
--- a/Documentation/devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml
+++ b/Documentation/devicetree/bindings/input/touchscreen/imagis,ist3038c.yaml
@@ -18,6 +18,7 @@ properties:
 
   compatible:
 enum:
+  - imagis,ist3038b
   - imagis,ist3038c
 
   reg:
-- 
2.44.0




[RESEND PATCH v5 0/5] input/touchscreen: imagis: add support for IST3032C

2024-03-01 Thread Karel Balej
From: Karel Balej 

Hello,

this patch series generalizes the Imagis touchscreen driver to support
other Imagis chips, namely IST3038B and IST3032C.

The motivation for IST3032C is the samsung,coreprimevelte smartphone
with which this series has been tested. However, support for this
device is not yet in-tree; the effort is happening at [1]. A preliminary
version of the regulator driver needed to use the touchscreen on this
phone can be found here [2].

Note that this is a prerequisite for (at least a part of) this series
[3] which among other things implements support for touch keys for
Imagis touchscreens that have them.

[1] https://lore.kernel.org/all/20240110-pxa1908-lkml-v8-0-fea768a59...@skole.hr/
[2] https://lore.kernel.org/all/20240211094609.2223-1-kar...@gimli.ms.mff.cuni.cz/
[3] https://lore.kernel.org/all/20240120-b4-imagis-keys-v2-0-d7fc16f2e...@skole.hr/

Best regards,
K. B.
---
v5:
- Rebase to v6.8-rc3.
- v4: 
https://lore.kernel.org/all/20240120191940.3631-1-kar...@gimli.ms.mff.cuni.cz/
v4:
- Rebase to v6.7.
- v3: 
https://lore.kernel.org/all/20231202125948.10345-1-kar...@gimli.ms.mff.cuni.cz/
- Address feedback and add trailers.
v3:
- Rebase to v6.7-rc3.
- v2: 
https://lore.kernel.org/all/20231003133440.4696-1-kar...@gimli.ms.mff.cuni.cz/
v2:
- Do not rename the driver.
- Do not hardcode voltage required by the IST3032C.
- Use Markuss' series which generalizes the driver. Link to the original
  series: 
https://lore.kernel.org/all/20220504152406.8730-1-markuss.br...@gmail.com/
- Separate bindings into separate patch.
- v1: https://lore.kernel.org/all/20230926173531.18715-1-bal...@matfyz.cz/

Karel Balej (2):
  dt-bindings: input/touchscreen: imagis: add compatible for IST3032C
  input/touchscreen: imagis: add support for IST3032C

Markuss Broks (3):
  input/touchscreen: imagis: Correct the maximum touch area value
  dt-bindings: input/touchscreen: Add compatible for IST3038B
  input/touchscreen: imagis: Add support for Imagis IST3038B

 .../input/touchscreen/imagis,ist3038c.yaml|  2 +
 drivers/input/touchscreen/imagis.c| 70 +++
 2 files changed, 60 insertions(+), 12 deletions(-)

-- 
2.44.0




[RESEND PATCH v5 1/5] input/touchscreen: imagis: Correct the maximum touch area value

2024-03-01 Thread Karel Balej
From: Markuss Broks 

As specified in downstream IST3038B driver and proved by testing,
the correct maximum reported value of touch area is 16.

Signed-off-by: Markuss Broks 
Signed-off-by: Karel Balej 
---
 drivers/input/touchscreen/imagis.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/touchscreen/imagis.c b/drivers/input/touchscreen/imagis.c
index 07111ca24455..e67fd3011027 100644
--- a/drivers/input/touchscreen/imagis.c
+++ b/drivers/input/touchscreen/imagis.c
@@ -210,7 +210,7 @@ static int imagis_init_input_dev(struct imagis_ts *ts)
 
input_set_capability(input_dev, EV_ABS, ABS_MT_POSITION_X);
input_set_capability(input_dev, EV_ABS, ABS_MT_POSITION_Y);
-   input_set_abs_params(input_dev, ABS_MT_TOUCH_MAJOR, 0, 255, 0, 0);
+   input_set_abs_params(input_dev, ABS_MT_TOUCH_MAJOR, 0, 16, 0, 0);
 
 	touchscreen_parse_properties(input_dev, true, &ts->prop);
if (!ts->prop.max_x || !ts->prop.max_y) {
-- 
2.44.0




[PATCH 3/3] dt-bindings: remoteproc: Add Arm remoteproc

2024-03-01 Thread abdellatif . elkhlifi
From: Abdellatif El Khlifi 

introduce the bindings for Arm remoteproc support.

Signed-off-by: Abdellatif El Khlifi 
---
 .../bindings/remoteproc/arm,rproc.yaml| 69 +++
 MAINTAINERS   |  1 +
 2 files changed, 70 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml

diff --git a/Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml b/Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml
new file mode 100644
index ..322197158059
--- /dev/null
+++ b/Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml
@@ -0,0 +1,69 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/remoteproc/arm,rproc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Arm Remoteproc Devices
+
+maintainers:
+  - Abdellatif El Khlifi 
+
+description: |
+  Some Arm heterogeneous System-On-Chips feature remote processors that can
+  be controlled with a reset control register and a reset status register to
+  start or stop the processor.
+
+  This document defines the bindings for these remote processors.
+
+properties:
+  compatible:
+enum:
+  - arm,corstone1000-extsys
+
+  reg:
+minItems: 2
+maxItems: 2
+description: |
+  Address and size in bytes of the reset control register
+  and the reset status register.
+  Expects the registers to be in the order as above.
+  Should contain an entry for each value in 'reg-names'.
+
+  reg-names:
+description: |
+  Required names for each of the reset registers defined in
+  the 'reg' property. Expects the names from the following
+  list, in the specified order, each representing the corresponding
+  reset register.
+items:
+  - const: reset-control
+  - const: reset-status
+
+  firmware-name:
+description: |
+  Default name of the firmware to load to the remote processor.
+
+required:
+  - compatible
+  - reg
+  - reg-names
+  - firmware-name
+
+additionalProperties: false
+
+examples:
+  - |
+extsys0: remoteproc@1a010310 {
+compatible = "arm,corstone1000-extsys";
+reg = <0x1a010310 0x4>, <0x1a010314 0x4>;
+reg-names = "reset-control", "reset-status";
+firmware-name = "es0_flashfw.elf";
+};
+
+extsys1: remoteproc@1a010318 {
+compatible = "arm,corstone1000-extsys";
+reg = <0x1a010318 0x4>, <0x1a01031c 0x4>;
+reg-names = "reset-control", "reset-status";
+firmware-name = "es1_flashfw.elf";
+};
diff --git a/MAINTAINERS b/MAINTAINERS
index 54d6a40feea5..eddaa3841a65 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1768,6 +1768,7 @@ ARM REMOTEPROC DRIVER
 M: Abdellatif El Khlifi 
 L: linux-remotep...@vger.kernel.org
 S: Maintained
+F: Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml
 F: drivers/remoteproc/arm_rproc.c
 
 ARM SMC WATCHDOG DRIVER
-- 
2.25.1




[PATCH 2/3] arm64: dts: Add corstone1000 external system device node

2024-03-01 Thread abdellatif . elkhlifi
From: Abdellatif El Khlifi 

add device tree node for the external system core in Corstone-1000

Signed-off-by: Abdellatif El Khlifi 
---
 arch/arm64/boot/dts/arm/corstone1000.dtsi | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/arm/corstone1000.dtsi b/arch/arm64/boot/dts/arm/corstone1000.dtsi
index 6ad7829f9e28..67df642363e9 100644
--- a/arch/arm64/boot/dts/arm/corstone1000.dtsi
+++ b/arch/arm64/boot/dts/arm/corstone1000.dtsi
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0 OR MIT
 /*
- * Copyright (c) 2022, Arm Limited. All rights reserved.
+ * Copyright 2022, 2024, Arm Limited and/or its affiliates 

  * Copyright (c) 2022, Linaro Limited. All rights reserved.
  *
  */
@@ -157,5 +157,13 @@ mhu_seh1: mailbox@1b83 {
secure-status = "okay"; /* secure-world-only */
status = "disabled";
};
+
+   extsys0: remoteproc@1a010310 {
+   compatible = "arm,corstone1000-extsys";
+   reg = <0x1a010310 0x4>,
+   <0x1a010314 0X4>;
+   reg-names = "reset-control", "reset-status";
+   firmware-name = "es_flashfw.elf";
+   };
};
 };
-- 
2.25.1




[PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-01 Thread abdellatif . elkhlifi
From: Abdellatif El Khlifi 

introduce remoteproc support for Arm remote processors

The supported remote processors are those that come with a reset
control register and a reset status register. The driver allows
switching the remote processor on or off.

The current use case is Corstone-1000 External System (Cortex-M3).

The driver can be extended to support other remote processors
controlled with a reset control register and a reset status register.

The driver also supports control of multiple remote processors at the
same time.

Signed-off-by: Abdellatif El Khlifi 
---
 MAINTAINERS|   6 +
 drivers/remoteproc/Kconfig |  18 ++
 drivers/remoteproc/Makefile|   1 +
 drivers/remoteproc/arm_rproc.c | 395 +
 4 files changed, 420 insertions(+)
 create mode 100644 drivers/remoteproc/arm_rproc.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 8d1052fa6a69..54d6a40feea5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1764,6 +1764,12 @@ S:   Maintained
 F: Documentation/devicetree/bindings/interrupt-controller/arm,vic.yaml
 F: drivers/irqchip/irq-vic.c
 
+ARM REMOTEPROC DRIVER
+M: Abdellatif El Khlifi 
+L: linux-remotep...@vger.kernel.org
+S: Maintained
+F: drivers/remoteproc/arm_rproc.c
+
 ARM SMC WATCHDOG DRIVER
 M: Julius Werner 
 R: Evan Benn 
diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 48845dc8fa85..57fbac454a5d 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -365,6 +365,24 @@ config XLNX_R5_REMOTEPROC
 
  It's safe to say N if not interested in using RPU r5f cores.
 
+config ARM_REMOTEPROC
+   tristate "Arm remoteproc support"
+   depends on HAS_IOMEM && ARM64
+   default n
+   help
+ Say y here to support Arm remote processors via the remote
+ processor framework.
+
+	  The supported processors are those that come with a reset control register
+	  and a reset status register. The design can be extended to support different
+	  processors meeting these requirements.
+	  The driver also supports control of multiple remote cores at the same time.
+ Supported remote cores:
+ Corstone-1000 External System (Cortex-M3)
+
+ It's safe to say N here.
+
 endif # REMOTEPROC
 
 endmenu
diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
index 91314a9b43ce..73126310835b 100644
--- a/drivers/remoteproc/Makefile
+++ b/drivers/remoteproc/Makefile
@@ -39,3 +39,4 @@ obj-$(CONFIG_STM32_RPROC) += stm32_rproc.o
 obj-$(CONFIG_TI_K3_DSP_REMOTEPROC) += ti_k3_dsp_remoteproc.o
 obj-$(CONFIG_TI_K3_R5_REMOTEPROC)  += ti_k3_r5_remoteproc.o
 obj-$(CONFIG_XLNX_R5_REMOTEPROC)   += xlnx_r5_remoteproc.o
+obj-$(CONFIG_ARM_REMOTEPROC)   += arm_rproc.o
diff --git a/drivers/remoteproc/arm_rproc.c b/drivers/remoteproc/arm_rproc.c
new file mode 100644
index ..6afa78ae7ad3
--- /dev/null
+++ b/drivers/remoteproc/arm_rproc.c
@@ -0,0 +1,395 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright 2024 Arm Limited and/or its affiliates 

+ *
+ * Authors:
+ *   Abdellatif El Khlifi 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "remoteproc_internal.h"
+
+/**
+ * struct arm_rproc_reset_cfg - remote processor reset configuration
+ * @ctrl_reg: address of the control register
+ * @state_reg: address of the reset status register
+ */
+struct arm_rproc_reset_cfg {
+   void __iomem *ctrl_reg;
+   void __iomem *state_reg;
+};
+
+struct arm_rproc;
+
+/**
+ * struct arm_rproc_dcfg - Arm remote processor configuration
+ * @stop: stop callback function
+ * @start: start callback function
+ */
+struct arm_rproc_dcfg {
+   int (*stop)(struct rproc *rproc);
+   int (*start)(struct rproc *rproc);
+};
+
+/**
+ * struct arm_rproc - Arm remote processor instance
+ * @rproc: rproc handler
+ * @core_dcfg: device configuration pointer
+ * @reset_cfg: reset configuration registers
+ */
+struct arm_rproc {
+   struct rproc*rproc;
+   const struct arm_rproc_dcfg *core_dcfg;
+   struct arm_rproc_reset_cfg  reset_cfg;
+};
+
+/* Definitions for Arm Corstone-1000 External System */
+
+#define EXTSYS_RST_CTRL_CPUWAIT		BIT(0)
+#define EXTSYS_RST_CTRL_RST_REQ		BIT(1)
+
+#define EXTSYS_RST_ACK_MASK		GENMASK(2, 1)
+#define EXTSYS_RST_ST_RST_ACK(x)	\
+	((u8)(FIELD_GET(EXTSYS_RST_ACK_MASK, (x))))
+
+#define EXTSYS_RST_ACK_NO_RESET_REQ	(0x0)
+#define EXTSYS_RST_ACK_NOT_COMPLETE	(0x1)
+#define EXTSYS_RST_ACK_COMPLETE		(0x2)
+#define EXTSYS_RST_ACK_RESERVED		(0x3)
+
+#define EXTSYS_RST_ACK_POLL_TRIES	(3)
+#define 

[PATCH 0/3] remoteproc: introduce Arm remoteproc support

2024-03-01 Thread abdellatif . elkhlifi
From: Abdellatif El Khlifi 

Some Arm heterogeneous System-On-Chips feature remote processors that can
be controlled with a reset control register and a reset status register to
start or stop the processor.

This patchset adds support for these processors by providing the
following:

1) A remoteproc driver that retrieves the reset register addresses from
the DT, registers a new rproc device with the remoteproc subsystem and
provides the start and stop operations for switching the remote
processor on or off.

The start and stop operations are provided as a data config selected on
DT node match. Currently we are providing support for Corstone-1000
External System (Cortex-M3) [1] as a remote processor. The driver can be
extended to support other remote processors by adding a data config and
a custom implementation of the start and stop operations.

2) DT bindings

3) Support control of multiple remote processors at the same time

[1]: https://developer.arm.com/documentation/102360//Overview-of-Corstone-1000/Corstone-1000

Cheers,
Abdellatif

Abdellatif El Khlifi (3):
  remoteproc: Add Arm remoteproc driver
  arm64: dts: Add corstone1000 external system device node
  dt-bindings: remoteproc: Add Arm remoteproc

 .../bindings/remoteproc/arm,rproc.yaml|  69 +++
 MAINTAINERS   |   7 +
 arch/arm64/boot/dts/arm/corstone1000.dtsi |  10 +-
 drivers/remoteproc/Kconfig|  18 +
 drivers/remoteproc/Makefile   |   1 +
 drivers/remoteproc/arm_rproc.c| 395 ++
 6 files changed, 499 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml
 create mode 100644 drivers/remoteproc/arm_rproc.c


base-commit: 8b46dc5cfa5ffea279aed0fc05dc4b1c39a51517
-- 
2.25.1
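
The reset control/status handshake described in this cover letter could look
roughly like the sketch below. The bit positions, acknowledge codes and poll
count come from the defines in patch 1/3; the sequence itself (set the reset
request bit, then poll the acknowledge field a bounded number of times) is
an assumption, since the driver body is not shown here, and request_reset()
is a hypothetical helper operating on plain memory instead of MMIO.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout as given by the EXTSYS_* defines in patch 1/3. */
#define RST_CTRL_CPUWAIT	(1u << 0)
#define RST_CTRL_RST_REQ	(1u << 1)
#define RST_ACK_SHIFT		1
#define RST_ACK_MASK		(0x3u << RST_ACK_SHIFT)	/* GENMASK(2, 1) */

#define RST_ACK_NO_RESET_REQ	0x0
#define RST_ACK_NOT_COMPLETE	0x1
#define RST_ACK_COMPLETE	0x2
#define RST_ACK_RESERVED	0x3

#define RST_ACK_POLL_TRIES	3

/* Extract the two-bit acknowledge field from a status register value. */
static uint8_t rst_ack(uint32_t status)
{
	return (uint8_t)((status & RST_ACK_MASK) >> RST_ACK_SHIFT);
}

/*
 * Hypothetical stop sequence: request a reset through the control
 * register, then poll the status register a bounded number of times.
 * Returns 0 once the reset is reported complete, -1 otherwise.
 */
static int request_reset(volatile uint32_t *ctrl_reg,
			 const volatile uint32_t *status_reg)
{
	int i;

	assert(ctrl_reg && status_reg);
	*ctrl_reg |= RST_CTRL_RST_REQ;

	for (i = 0; i < RST_ACK_POLL_TRIES; i++) {
		if (rst_ack(*status_reg) == RST_ACK_COMPLETE)
			return 0;
	}
	return -1;
}
```

A real driver would access the registers with readl()/writel() on the
ioremapped addresses and would likely sleep between polls; both are left
out to keep the sketch self-contained.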




Re: [PATCH] ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment

2024-03-01 Thread Mathieu Desnoyers

On 2024-03-01 10:49, Steven Rostedt wrote:
> On Fri, 1 Mar 2024 13:37:18 +0800
> linke  wrote:
>
>>> So basically you are worried about read-tearing?
>>>
>>> That wasn't mentioned in the change log.
>>
>> Yes. Sorry for making this confused, I am not very familiar with this and
>> still learning.
>
> No problem. We all have to learn this anyway.
>
>>> Funny part is, if the above timestamp read did a tear, then this would
>>> definitely not match, and would return the correct value. That is, the
>>> buffer is not empty because the only way for this to get corrupted is if
>>> something is in the process of writing to it.
>>
>> I agree with you here.
>>
>>	commit = rb_page_commit(commit_page);
>>
>> But if commit_page above is the result of a torn read, the commit field
>> read by rb_page_commit() may not represent a valid value.
>
> But commit_page is a word length, and I will argue that any compiler that
> tears "long" words is broken. ;-)

[ For those tuning in, we are discussing ring_buffer_iter_empty()
  "commit_page = cpu_buffer->commit_page;" racy load. ]

I counter-argue that real-world compilers *are* broken based on your
personal definition, but we have to deal with them, as documented
in Documentation/memory-barriers.txt (see below).

What is the added overhead of using a READ_ONCE() there ? Why are
we wasting effort trying to guess the compiler behavior if the
real-world performance impact is insignificant ?

Quote from memory-barriers.txt explaining the purpose of {READ,WRITE}_ONCE():

"(*) For aligned memory locations whose size allows them to be accessed
 with a single memory-reference instruction, prevents "load tearing"
 and "store tearing," in which a single large access is replaced by
 multiple smaller accesses."

I agree that {READ,WRITE}_ONCE() are really not needed at initialization,
when there are demonstrably no concurrent accesses to the data.

But trying to eliminate {READ,WRITE}_ONCE() on concurrently accessed fields
just adds complexity, prevents static analyzers from properly understanding
the code and reporting issues, and just obfuscates the code.

Thanks,

Mathieu





>> In this case, READ_ONCE() is only needed for the commit_page.
>
> But we can at least keep the READ_ONCE() on the commit_page just because it
> is used in the next instruction.
>
> -- Steve


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




[ANNOUNCE] 5.10.210-rt102

2024-03-01 Thread Luis Claudio R. Goncalves
Hello RT-list!

I'm pleased to announce the 5.10.210-rt102 stable release.

This release is just an update to the new stable 5.10.210 version and no
RT-specific changes have been performed.

You can get this release via the git tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git

  branch: v5.10-rt
  Head SHA1: 2e4f63341da86cf080b22925a8f4f8cb746b8e25

Or to build 5.10.210-rt102 directly, the following patches should be applied:

  https://www.kernel.org/pub/linux/kernel/v5.x/linux-5.10.tar.xz

  https://www.kernel.org/pub/linux/kernel/v5.x/patch-5.10.210.xz

  
https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patch-5.10.210-rt102.patch.xz

Signing key fingerprint:

  9354 0649 9972 8D31 D464  D140 F394 A423 F8E6 7C26

All keys used for the above files and repositories can be found on the
following git repository:

   git://git.kernel.org/pub/scm/docs/kernel/pgpkeys.git

Enjoy!
Luis




Re: [PATCH] ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment

2024-03-01 Thread Steven Rostedt
On Fri, 1 Mar 2024 13:37:18 +0800
linke  wrote:

> > So basically you are worried about read-tearing?
> > 
> > That wasn't mentioned in the change log.  
> 
> Yes. Sorry for making this confused, I am not very familiar with this and
> still learning.

No problem. We all have to learn this anyway.

> 
> > Funny part is, if the above timestamp read did a tear, then this would
> > definitely not match, and would return the correct value. That is, the
> > buffer is not empty because the only way for this to get corrupted is if
> > something is in the process of writing to it.  
> 
> I agree with you here.
> 
>   commit = rb_page_commit(commit_page);
> 
> But if commit_page above is the result of a torn read, the commit field
> read by rb_page_commit() may not represent a valid value. 

But commit_page is a word length, and I will argue that any compiler that
tears "long" words is broken. ;-)

> 
> In this case, READ_ONCE() is only needed for the commit_page.

But we can at least keep the READ_ONCE() on the commit_page just because it
is used in the next instruction.

-- Steve



Re: [PATCH v2 3/6] virtiofs: factor out more common methods for argbuf

2024-03-01 Thread Miklos Szeredi
On Wed, 28 Feb 2024 at 15:41, Hou Tao  wrote:
>
> From: Hou Tao 
>
> Factor out more common methods for bounce buffer of fuse args:
>
> 1) virtio_fs_argbuf_setup_sg: set-up sgs for bounce buffer
> 2) virtio_fs_argbuf_copy_from_in_arg: copy each in-arg to bounce buffer
> 3) virtio_fs_argbuf_out_args_offset: calc the start offset of out-arg
> 4) virtio_fs_argbuf_copy_to_out_arg: copy bounce buffer to each out-arg
>
> These methods will be used to implement bounce buffer backed by
> scattered pages which are allocated separately.

Why is req->argbuf not changed to being typed?

Thanks,
Miklos



Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-03-01 Thread Maciej Fijalkowski
On Wed, Feb 28, 2024 at 07:05:56PM +0800, Yunjian Wang wrote:
> This patch set allows TUN to support the AF_XDP Tx zero-copy feature,
> which can significantly reduce CPU utilization for XDP programs.

Why no Rx ZC support though? What will happen if I try rxdrop xdpsock
against tun with this patch? You clearly allow for that.

> 
> Since commit fc72d1d54dd9 ("tuntap: XDP transmission"), the pointer
> ring has been utilized to queue different types of pointers by encoding
> the type into the lower bits. Therefore, we introduce a new flag,
> TUN_XDP_DESC_FLAG(0x2UL), which allows us to enqueue XDP descriptors
> and differentiate them from XDP buffers and sk_buffs. Additionally, a
> spin lock is added for enabling and disabling operations on the xsk pool.
> 
> The performance testing was performed on an Intel E5-2620 2.40GHz machine.
> Traffic was generated/sent through TUN (testpmd txonly with AF_XDP)
> to the VM (testpmd rxonly in guest).
> 
> +------+---------+---------+---------+
> |      |   copy  |zero-copy| speedup |
> +------+---------+---------+---------+
> | UDP  |   Mpps  |   Mpps  |    %    |
> | 64   |   2.5   |   4.0   |   60%   |
> | 512  |   2.1   |   3.6   |   71%   |
> | 1024 |   1.9   |   3.3   |   73%   |
> +------+---------+---------+---------+
> 
> Signed-off-by: Yunjian Wang 
> ---
>  drivers/net/tun.c  | 177 +++--
>  drivers/vhost/net.c|   4 +
>  include/linux/if_tun.h |  32 
>  3 files changed, 208 insertions(+), 5 deletions(-)
> 
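
The low-bit encoding described in the quoted commit message can be sketched in userspace C as follows. The 0x2UL value is the one quoted for TUN_XDP_DESC_FLAG; the 0x1UL value for the existing XDP-frame tag and all SKETCH_/sketch_ names are assumptions made for illustration, not the driver's actual helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tag values; names are hypothetical, 0x1UL is an assumed existing tag. */
#define SKETCH_XDP_FLAG       0x1UL
#define SKETCH_XDP_DESC_FLAG  0x2UL
#define SKETCH_FLAG_MASK      (SKETCH_XDP_FLAG | SKETCH_XDP_DESC_FLAG)

/* Store a type tag in the low bits of a sufficiently aligned pointer. */
static void *sketch_tag(void *ptr, uintptr_t flag)
{
	/* Tagging only works if the low bits of the pointer are zero. */
	assert(((uintptr_t)ptr & SKETCH_FLAG_MASK) == 0);
	return (void *)((uintptr_t)ptr | flag);
}

static bool sketch_is_xdp_desc(const void *ptr)
{
	return ((uintptr_t)ptr & SKETCH_XDP_DESC_FLAG) != 0;
}

static bool sketch_is_xdp_frame(const void *ptr)
{
	return ((uintptr_t)ptr & SKETCH_XDP_FLAG) != 0;
}

/* Strip the tag bits to recover the original pointer. */
static void *sketch_untag(void *ptr)
{
	return (void *)((uintptr_t)ptr & ~(uintptr_t)SKETCH_FLAG_MASK);
}
```

An untagged pointer (both low bits clear) would then be treated as an sk_buff, which is how a single ring can carry three kinds of entries.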



Re: [PATCH v2 1/6] fuse: limit the length of ITER_KVEC dio by max_pages

2024-03-01 Thread Miklos Szeredi
On Wed, 28 Feb 2024 at 15:40, Hou Tao  wrote:

> So instead of limiting both the values of max_read and max_write in
> kernel, capping the maximal length of kvec iter IO by using max_pages in
> fuse_direct_io() just like it does for ubuf/iovec iter IO. Now the max
> value for max_pages is 256, so on host with 4KB page size, the maximal
> size passed to kmalloc() in copy_args_to_argbuf() is about 1MB+40B. The
> allocation of 2MB of physically contiguous memory will still incur
> significant stress on the memory subsystem, but the warning is fixed.
> Additionally, the requirement for huge physically contiguous memory will
> be removed in the following patch.

So the issue will be fixed properly by the following patches?

In that case this patch could be omitted, right?

Thanks,
Miklos
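
The size arithmetic in the quoted paragraph can be checked with a small C sketch. It assumes that large kmalloc() requests are rounded up to the next power-of-two size, which is what the 2MB figure in the quoted text implies; the 40-byte header is taken from the "1MB+40B" figure, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next power of two. n must be non-zero and small
 * enough that the result fits in size_t. */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	assert(n != 0);
	while (p < n)
		p <<= 1;
	return p;
}

/* Request size for a maximum-sized kvec DIO: payload plus a 40-byte
 * header (the header size comes from the quoted "1MB+40B" figure). */
static size_t argbuf_request_size(size_t max_pages, size_t page_size)
{
	return max_pages * page_size + 40;
}
```

With max_pages = 256 and a 4KB page size the request is 1048616 bytes, and rounding that up to a power of two gives 2097152 bytes (2MB).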



Re: [PATCH 1/4] iommu: constify pointer to bus_type

2024-03-01 Thread Joerg Roedel
On Fri, Feb 16, 2024 at 03:40:24PM +0100, Krzysztof Kozlowski wrote:

Applied all, thanks.



[PATCH v2] drm/qxl: fix NULL dereference in qxl_add_mode

2024-03-01 Thread Aleksandr Burakov
Return value of a function 'drm_cvt_mode' is dereferenced without
checking for NULL but drm_mode_create() in drm_cvt_mode() may
return NULL value in case of memory allocation error.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 1b043677d4be ("drm/qxl: add qxl_add_mode helper function")
Signed-off-by: Aleksandr Burakov 
---
v2: case with false value of 'preferred' is now taken into account
 drivers/gpu/drm/qxl/qxl_display.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index a152a7c6db21..d6dece7a0ed2 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -236,6 +236,9 @@ static int qxl_add_mode(struct drm_connector *connector,
 		return 0;
 
 	mode = drm_cvt_mode(dev, width, height, 60, false, false, false);
+	if (!mode)
+		return 0;
+
 	if (preferred)
 		mode->type |= DRM_MODE_TYPE_PREFERRED;
 	mode->hdisplay = width;
-- 
2.25.1




Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-03-01 Thread Michael S. Tsirkin
On Fri, Mar 01, 2024 at 11:45:52AM +, wangyunjian wrote:
> > -----Original Message-----
> > From: Paolo Abeni [mailto:pab...@redhat.com]
> > Sent: Thursday, February 29, 2024 7:13 PM
> > To: wangyunjian ; m...@redhat.com;
> > willemdebruijn.ker...@gmail.com; jasow...@redhat.com; k...@kernel.org;
> > bj...@kernel.org; magnus.karls...@intel.com; maciej.fijalkow...@intel.com;
> > jonathan.le...@gmail.com; da...@davemloft.net
> > Cc: b...@vger.kernel.org; net...@vger.kernel.org;
> > linux-kernel@vger.kernel.org; k...@vger.kernel.org;
> > virtualizat...@lists.linux.dev; xudingke ; liwei (DT)
> > 
> > Subject: Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support
> > 
> > On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> > > @@ -2661,6 +2776,54 @@ static int tun_ptr_peek_len(void *ptr)
> > >   }
> > >  }
> > >
> > > > +static void tun_peek_xsk(struct tun_file *tfile) {
> > > > +	struct xsk_buff_pool *pool;
> > > > +	u32 i, batch, budget;
> > > > +	void *frame;
> > > > +
> > > > +	if (!ptr_ring_empty(&tfile->tx_ring))
> > > > +		return;
> > > > +
> > > > +	spin_lock(&tfile->pool_lock);
> > > > +	pool = tfile->xsk_pool;
> > > > +	if (!pool) {
> > > > +		spin_unlock(&tfile->pool_lock);
> > > > +		return;
> > > > +	}
> > > > +
> > > > +	if (tfile->nb_descs) {
> > > > +		xsk_tx_completed(pool, tfile->nb_descs);
> > > > +		if (xsk_uses_need_wakeup(pool))
> > > > +			xsk_set_tx_need_wakeup(pool);
> > > > +	}
> > > > +
> > > > +	spin_lock(&tfile->tx_ring.producer_lock);
> > > > +	budget = min_t(u32, tfile->tx_ring.size, TUN_XDP_BATCH);
> > > > +
> > > > +	batch = xsk_tx_peek_release_desc_batch(pool, budget);
> > > > +	if (!batch) {
> > 
> > This branch looks like an unneeded "optimization". The generic loop below
> > > should have the same effect with no measurable perf delta - and smaller code.
> > Just remove this.
> > 
> > > + tfile->nb_descs = 0;
> > > + spin_unlock(>tx_ring.producer_lock);
> > > + spin_unlock(>pool_lock);
> > > + return;
> > > + }
> > > +
> > > + tfile->nb_descs = batch;
> > > + for (i = 0; i < batch; i++) {
> > > + /* Encode the XDP DESC flag into lowest bit for consumer to 
> > > differ
> > > +  * XDP desc from XDP buffer and sk_buff.
> > > +  */
> > > + frame = tun_xdp_desc_to_ptr(>tx_descs[i]);
> > > + /* The budget must be less than or equal to tx_ring.size,
> > > +  * so enqueuing will not fail.
> > > +  */
> > > + __ptr_ring_produce(>tx_ring, frame);
> > > + }
> > > + spin_unlock(>tx_ring.producer_lock);
> > > + spin_unlock(>pool_lock);
> > 
> > > More related to the general design: it looks wrong. What if
> > > get_rx_bufs() fails (ENOBUF) after successful peeking? With no more
> > > incoming packets, a later peek will return 0 and it looks like the
> > > half-processed packets will stay in the ring forever???
> > 
> > I think the 'ring produce' part should be moved into tun_do_read().
> 
> Currently, vhost-net obtains a batch of descriptors/sk_buffs from the
> ptr_ring, enqueues the batch to the virtqueue's queue, and then consumes
> the descriptors/sk_buffs from the virtqueue's queue in sequence. As a
> result, TUN does not know whether the batch of descriptors has been used
> up, and thus does not know when to return it.
> 
> So, I think it's reasonable that when vhost-net checks ptr_ring is empty,
> it calls peek_len to get new xsk's descs and return the descriptors.
> 
> Thanks

What you need to think about is that if you peek, another call
in parallel can get the same value at the same time.
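
A minimal single-threaded C sketch of that concern, assuming a simple array-backed ring (the ring_sketch type and its helpers are hypothetical and are not the kernel's ptr_ring API): peeking does not remove the entry, so two callers that both peek before either consumes will see the same pointer.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 4

/* Hypothetical array-backed ring; a NULL slot means "empty". */
struct ring_sketch {
	void *slots[SKETCH_RING_SIZE];
	size_t head;	/* next slot to consume */
};

/* Peek returns the head element without removing it. */
static void *ring_peek(const struct ring_sketch *r)
{
	assert(r->head < SKETCH_RING_SIZE);
	return r->slots[r->head];
}

/* Consume returns the head element and removes it from the ring. */
static void *ring_consume(struct ring_sketch *r)
{
	void *p = r->slots[r->head];

	if (p) {
		r->slots[r->head] = NULL;
		r->head = (r->head + 1) % SKETCH_RING_SIZE;
	}
	return p;
}
```

Calling ring_peek() twice in a row returns the same pointer both times; only ring_consume() advances the head. Any state derived from a peek therefore needs its own serialization if two contexts can reach it at once.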


> > 
> > Cheers,
> > 
> > Paolo
> 




RE: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support

2024-03-01 Thread wangyunjian
> -----Original Message-----
> From: Paolo Abeni [mailto:pab...@redhat.com]
> Sent: Thursday, February 29, 2024 7:13 PM
> To: wangyunjian ; m...@redhat.com;
> willemdebruijn.ker...@gmail.com; jasow...@redhat.com; k...@kernel.org;
> bj...@kernel.org; magnus.karls...@intel.com; maciej.fijalkow...@intel.com;
> jonathan.le...@gmail.com; da...@davemloft.net
> Cc: b...@vger.kernel.org; net...@vger.kernel.org;
> linux-kernel@vger.kernel.org; k...@vger.kernel.org;
> virtualizat...@lists.linux.dev; xudingke ; liwei (DT)
> 
> Subject: Re: [PATCH net-next v2 3/3] tun: AF_XDP Tx zero-copy support
> 
> On Wed, 2024-02-28 at 19:05 +0800, Yunjian Wang wrote:
> > @@ -2661,6 +2776,54 @@ static int tun_ptr_peek_len(void *ptr)
> > }
> >  }
> >
> > +static void tun_peek_xsk(struct tun_file *tfile) {
> > +	struct xsk_buff_pool *pool;
> > +	u32 i, batch, budget;
> > +	void *frame;
> > +
> > +	if (!ptr_ring_empty(&tfile->tx_ring))
> > +		return;
> > +
> > +	spin_lock(&tfile->pool_lock);
> > +	pool = tfile->xsk_pool;
> > +	if (!pool) {
> > +		spin_unlock(&tfile->pool_lock);
> > +		return;
> > +	}
> > +
> > +	if (tfile->nb_descs) {
> > +		xsk_tx_completed(pool, tfile->nb_descs);
> > +		if (xsk_uses_need_wakeup(pool))
> > +			xsk_set_tx_need_wakeup(pool);
> > +	}
> > +
> > +	spin_lock(&tfile->tx_ring.producer_lock);
> > +	budget = min_t(u32, tfile->tx_ring.size, TUN_XDP_BATCH);
> > +
> > +	batch = xsk_tx_peek_release_desc_batch(pool, budget);
> > +	if (!batch) {
> 
> This branch looks like an unneeded "optimization". The generic loop below
> should have the same effect with no measurable perf delta - and smaller code.
> Just remove this.
> 
> > +		tfile->nb_descs = 0;
> > +		spin_unlock(&tfile->tx_ring.producer_lock);
> > +		spin_unlock(&tfile->pool_lock);
> > +		return;
> > +	}
> > +
> > +	tfile->nb_descs = batch;
> > +	for (i = 0; i < batch; i++) {
> > +		/* Encode the XDP DESC flag into lowest bit for consumer to differ
> > +		 * XDP desc from XDP buffer and sk_buff.
> > +		 */
> > +		frame = tun_xdp_desc_to_ptr(&pool->tx_descs[i]);
> > +		/* The budget must be less than or equal to tx_ring.size,
> > +		 * so enqueuing will not fail.
> > +		 */
> > +		__ptr_ring_produce(&tfile->tx_ring, frame);
> > +	}
> > +	spin_unlock(&tfile->tx_ring.producer_lock);
> > +	spin_unlock(&tfile->pool_lock);
> 
> More related to the general design: it looks wrong. What if
> get_rx_bufs() fails (ENOBUF) after successful peeking? With no more
> incoming packets, a later peek will return 0 and it looks like the
> half-processed packets will stay in the ring forever???
> 
> I think the 'ring produce' part should be moved into tun_do_read().

Currently, vhost-net obtains a batch of descriptors/sk_buffs from the
ptr_ring, enqueues the batch to the virtqueue's queue, and then consumes
the descriptors/sk_buffs from the virtqueue's queue in sequence. As a
result, TUN does not know whether the batch of descriptors has been used
up, and thus does not know when to return it.

So, I think it's reasonable that when vhost-net checks ptr_ring is empty,
it calls peek_len to get new xsk's descs and return the descriptors.

Thanks
> 
> Cheers,
> 
> Paolo



Re: [PATCH] drm/qxl: fix NULL dereference in qxl_add_mode

2024-03-01 Thread Gerd Hoffmann
On Fri, Mar 01, 2024 at 11:55:11AM +0300, Aleksandr Burakov wrote:
> Return value of a function 'drm_cvt_mode' is dereferenced without
> checking for NULL but drm_mode_create() in drm_cvt_mode() may
> return NULL value in case of memory allocation error.
> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 
> Fixes: 1b043677d4be ("drm/qxl: add qxl_add_mode helper function")
> Signed-off-by: Aleksandr Burakov 
> ---
>  drivers/gpu/drm/qxl/qxl_display.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
> index a152a7c6db21..447532c29e02 100644
> --- a/drivers/gpu/drm/qxl/qxl_display.c
> +++ b/drivers/gpu/drm/qxl/qxl_display.c
> @@ -236,8 +236,10 @@ static int qxl_add_mode(struct drm_connector *connector,
>   return 0;
>  
>   mode = drm_cvt_mode(dev, width, height, 60, false, false, false);
> - if (preferred)
> + if (preferred && mode)
>   mode->type |= DRM_MODE_TYPE_PREFERRED;
> + else
> + return 0;
>   mode->hdisplay = width;

That doesn't fix the NULL pointer dereference in case "preferred" is
false.

I'd suggest "if (!mode) return 0" instead.
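
The suggested control flow can be sketched in standalone C. Everything named *_sketch is a hypothetical stand-in (cvt_mode_sketch for drm_cvt_mode(), SKETCH_TYPE_PREFERRED for DRM_MODE_TYPE_PREFERRED), and returning 1 on success is an assumption about the surrounding function rather than something shown in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_TYPE_PREFERRED 0x1u	/* hypothetical flag value */

struct mode_sketch {
	unsigned int type;
	int hdisplay;
	int vdisplay;
};

/* Stand-in for drm_cvt_mode(): returns NULL when "allocation" fails. */
static struct mode_sketch *cvt_mode_sketch(bool fail)
{
	return fail ? NULL : calloc(1, sizeof(struct mode_sketch));
}

/* Returns the number of modes added: 0 on failure, 1 on success. */
static int add_mode_sketch(int width, int height, bool preferred, bool fail,
			   struct mode_sketch **out)
{
	struct mode_sketch *mode = cvt_mode_sketch(fail);

	assert(out != NULL);
	if (!mode)	/* single early exit, independent of 'preferred' */
		return 0;

	if (preferred)
		mode->type |= SKETCH_TYPE_PREFERRED;
	mode->hdisplay = width;
	mode->vdisplay = height;
	*out = mode;
	return 1;
}
```

With the check placed before the 'preferred' test, both the preferred and non-preferred paths are protected, and a successfully allocated non-preferred mode is still filled in rather than dropped by the else branch.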




[PATCH] drm/qxl: fix NULL dereference in qxl_add_mode

2024-03-01 Thread Aleksandr Burakov
Return value of a function 'drm_cvt_mode' is dereferenced without
checking for NULL but drm_mode_create() in drm_cvt_mode() may
return NULL value in case of memory allocation error.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 1b043677d4be ("drm/qxl: add qxl_add_mode helper function")
Signed-off-by: Aleksandr Burakov 
---
 drivers/gpu/drm/qxl/qxl_display.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index a152a7c6db21..447532c29e02 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -236,8 +236,10 @@ static int qxl_add_mode(struct drm_connector *connector,
 		return 0;
 
 	mode = drm_cvt_mode(dev, width, height, 60, false, false, false);
-	if (preferred)
+	if (preferred && mode)
 		mode->type |= DRM_MODE_TYPE_PREFERRED;
+	else
+		return 0;
 	mode->hdisplay = width;
 	mode->vdisplay = height;
 	drm_mode_set_name(mode);
-- 
2.25.1