Re: [PATCH net-next 16/24] net: netkit, veth, tun, virt*: Use nested-BH locking for XDP redirect.

2024-01-12 Thread Sebastian Andrzej Siewior
On 2023-12-18 09:52:05 [+0100], Daniel Borkmann wrote:
> Hi Sebastian,
Hi Daniel,

> Please exclude netkit from this set given it does not support XDP, but
> instead only accepts tc BPF typed programs.

okay, thank you.

> Thanks,
> Daniel

Sebastian



[PATCH net-next 16/24] net: netkit, veth, tun, virt*: Use nested-BH locking for XDP redirect.

2023-12-15 Thread Sebastian Andrzej Siewior
The per-CPU variables used during bpf_prog_run_xdp() invocation and
later during xdp_do_redirect() rely on disabled BH for their protection.
Without locking in local_bh_disable() on PREEMPT_RT these data structure
require explicit locking.

This is a follow-up on the previous change which introduced
bpf_run_lock.redirect_lock and uses it now within drivers.

The simple way is to acquire the lock before bpf_prog_run_xdp() is
invoked and hold it until the end of function.
This does not always work because some drivers (cpsw, atlantic) invoke
xdp_do_flush() in the same context.
Acquiring the lock in bpf_prog_run_xdp() and dropping in
xdp_do_redirect() (without touching drivers) does not work because not
all driver, which use bpf_prog_run_xdp(), do support XDP_REDIRECT (and
invoke xdp_do_redirect()).

Ideally the minimal locking scope would be bpf_prog_run_xdp() +
xdp_do_redirect() and everything else (error recovery, DMA unmapping,
free/ alloc of memory, …) would happen outside of the locked section.

Cc: "K. Y. Srinivasan" 
Cc: "Michael S. Tsirkin" 
Cc: Alexei Starovoitov 
Cc: Andrii Nakryiko 
Cc: Dexuan Cui 
Cc: Haiyang Zhang 
Cc: Hao Luo 
Cc: Jesper Dangaard Brouer 
Cc: Jiri Olsa 
Cc: John Fastabend 
Cc: Juergen Gross 
Cc: KP Singh 
Cc: Martin KaFai Lau 
Cc: Nikolay Aleksandrov 
Cc: Song Liu 
Cc: Stanislav Fomichev 
Cc: Stefano Stabellini 
Cc: Wei Liu 
Cc: Willem de Bruijn 
Cc: Xuan Zhuo 
Cc: Yonghong Song 
Cc: b...@vger.kernel.org
Cc: virtualizat...@lists.linux.dev
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/net/hyperv/netvsc_bpf.c |  1 +
 drivers/net/netkit.c| 13 +++
 drivers/net/tun.c   | 28 +--
 drivers/net/veth.c  | 40 -
 drivers/net/virtio_net.c|  1 +
 drivers/net/xen-netfront.c  |  1 +
 6 files changed, 52 insertions(+), 32 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_bpf.c b/drivers/net/hyperv/netvsc_bpf.c
index 4a9522689fa4f..55f8ca92ca199 100644
--- a/drivers/net/hyperv/netvsc_bpf.c
+++ b/drivers/net/hyperv/netvsc_bpf.c
@@ -58,6 +58,7 @@ u32 netvsc_run_xdp(struct net_device *ndev, struct 
netvsc_channel *nvchan,
 
memcpy(xdp->data, data, len);
 
+   guard(local_lock_nested_bh)(_run_lock.redirect_lock);
act = bpf_prog_run_xdp(prog, xdp);
 
switch (act) {
diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c
index 39171380ccf29..fbcf78477bda8 100644
--- a/drivers/net/netkit.c
+++ b/drivers/net/netkit.c
@@ -80,8 +80,15 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct 
net_device *dev)
netkit_prep_forward(skb, !net_eq(dev_net(dev), dev_net(peer)));
skb->dev = peer;
entry = rcu_dereference(nk->active);
-   if (entry)
-   ret = netkit_run(entry, skb, ret);
+   if (entry) {
+   scoped_guard(local_lock_nested_bh, _run_lock.redirect_lock) 
{
+   ret = netkit_run(entry, skb, ret);
+   if (ret == NETKIT_REDIRECT) {
+   dev_sw_netstats_tx_add(dev, 1, len);
+   skb_do_redirect(skb);
+   }
+   }
+   }
switch (ret) {
case NETKIT_NEXT:
case NETKIT_PASS:
@@ -95,8 +102,6 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct 
net_device *dev)
}
break;
case NETKIT_REDIRECT:
-   dev_sw_netstats_tx_add(dev, 1, len);
-   skb_do_redirect(skb);
break;
case NETKIT_DROP:
default:
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index afa5497f7c35c..fe0d31f11e4b6 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1708,16 +1708,18 @@ static struct sk_buff *tun_build_skb(struct tun_struct 
*tun,
xdp_init_buff(, buflen, >xdp_rxq);
xdp_prepare_buff(, buf, pad, len, false);
 
-   act = bpf_prog_run_xdp(xdp_prog, );
-   if (act == XDP_REDIRECT || act == XDP_TX) {
-   get_page(alloc_frag->page);
-   alloc_frag->offset += buflen;
-   }
-   err = tun_xdp_act(tun, xdp_prog, , act);
-   if (err < 0) {
-   if (act == XDP_REDIRECT || act == XDP_TX)
-   put_page(alloc_frag->page);
-   goto out;
+   scoped_guard(local_lock_nested_bh, _run_lock.redirect_lock) 
{
+   act = bpf_prog_run_xdp(xdp_prog, );
+   if (act == XDP_REDIRECT || act == XDP_TX) {
+   get_page(alloc_frag->page);
+   alloc_frag->offset += buflen;
+   }
+   err = tun_xdp_act(tun, xdp_prog, , act);
+   if (err < 0) {
+  

[PATCH REPOST v2 1/2] x86/xen: Allow to retry if cpu_initialize_context() failed.

2022-02-09 Thread Sebastian Andrzej Siewior
From: Boris Ostrovsky 

If memory allocation in cpu_initialize_context() fails then it will
bring up the VCPU and leave with the corresponding CPU bit set in
xen_cpu_initialized_map.
The following (presumably successful) CPU bring up will BUG in
xen_pv_cpu_up() because nothing for that VCPU would be initialized.

Clear the CPU bits, that were set in cpu_initialize_context() in case
the memory allocation fails.

[ bigeasy: Creating a patch from Boris' email. ]

Signed-off-by: Boris Ostrovsky 
Signed-off-by: Sebastian Andrzej Siewior 
Link: https://lore.kernel.org/r/20211206152034.2150770-2-bige...@linutronix.de
---
 arch/x86/xen/smp_pv.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 4a6019238ee7d..57c3f9332ec94 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -260,8 +260,11 @@ cpu_initialize_context(unsigned int cpu, struct 
task_struct *idle)
return 0;
 
ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
-   if (ctxt == NULL)
+   if (ctxt == NULL) {
+   cpumask_clear_cpu(cpu, xen_cpu_initialized_map);
+   cpumask_clear_cpu(cpu, cpu_callout_mask);
return -ENOMEM;
+   }
 
gdt = get_cpu_gdt_rw(cpu);
 
-- 
2.34.1




[PATCH REPOST v2 2/2] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again.

2022-02-09 Thread Sebastian Andrzej Siewior
From: "Longpeng(Mike)" 

A CPU will not show up in virtualized environment which includes an
Enclave. The VM splits its resources into a primary VM and a Enclave
VM. While the Enclave is active, the hypervisor will ignore all requests
to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
The kernel will wait up to ten seconds for CPU to show up
(do_boot_cpu()) and then rollback the hotplug state back to
CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.

After the Enclave VM terminates, the primary VM can bring up the CPU
again.

Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.

[bigeasy: Rewrite commit description.]

Signed-off-by: Longpeng(Mike) 
Signed-off-by: Sebastian Andrzej Siewior 
Link: https://lore.kernel.org/r/20210901051143.2752-1-longpe...@huawei.com
Reviewed-by: Valentin Schneider 
Tested-by: Dongli Zhang 
Reviewed-by: Henry Wang 
Link: https://lore.kernel.org/r/20211122154714.xaoxok3fpk5bg...@linutronix.de
Link: https://lore.kernel.org/r/20211206152034.2150770-3-bige...@linutronix.de
---
 kernel/smpboot.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index f6bc0bc8a2aab..b9f54544e7499 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
 */
return -EAGAIN;
 
+   case CPU_UP_PREPARE:
+   /*
+* Timeout while waiting for the CPU to show up. Allow to try
+* again later.
+*/
+   return 0;
+
default:
 
/* Should not happen.  Famous last words. */
-- 
2.34.1




[PATCH REPOST v2 0/2 v2] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again.

2022-02-09 Thread Sebastian Andrzej Siewior
This is a another repost of the previous patch (#2) and adding Boris
(Ostrovsky)'s suggestion regarding the XEN bits.
The previous posts can be found at
   https://lore.kernel.org/all/20211206152034.2150770-1-bige...@linutronix.de/
   https://lore.kernel.org/all/20211122154714.xaoxok3fpk5bg...@linutronix.de/

Sebastian




Re: [PATCH 0/2 v2] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again.

2021-12-14 Thread Sebastian Andrzej Siewior
On 2021-12-06 16:20:32 [+0100], To linux-ker...@vger.kernel.org wrote:
> This is a repost of the previous patch (#2) and adding Boris
> (Ostrovsky)'s suggestion regarding the XEN bits.
> The previous post can be found at
>https://lore.kernel.org/all/20211122154714.xaoxok3fpk5bg...@linutronix.de/

ping.

Sebastian



[PATCH 2/2] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again.

2021-12-06 Thread Sebastian Andrzej Siewior
From: "Longpeng(Mike)" 

A CPU will not show up in virtualized environment which includes an
Enclave. The VM splits its resources into a primary VM and a Enclave
VM. While the Enclave is active, the hypervisor will ignore all requests
to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
The kernel will wait up to ten seconds for CPU to show up
(do_boot_cpu()) and then rollback the hotplug state back to
CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.

After the Enclave VM terminates, the primary VM can bring up the CPU
again.

Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.

[bigeasy: Rewrite commit description.]

Signed-off-by: Longpeng(Mike) 
Signed-off-by: Sebastian Andrzej Siewior 
Link: https://lore.kernel.org/r/20210901051143.2752-1-longpe...@huawei.com
Reviewed-by: Valentin Schneider 
Tested-by: Dongli Zhang 
Reviewed-by: Henry Wang 
Link: https://lore.kernel.org/r/20211122154714.xaoxok3fpk5bg...@linutronix.de
---
 kernel/smpboot.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index f6bc0bc8a2aab..b9f54544e7499 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
 */
return -EAGAIN;
 
+   case CPU_UP_PREPARE:
+   /*
+* Timeout while waiting for the CPU to show up. Allow to try
+* again later.
+*/
+   return 0;
+
default:
 
/* Should not happen.  Famous last words. */
-- 
2.34.1




[PATCH 0/2 v2] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again.

2021-12-06 Thread Sebastian Andrzej Siewior
This is a repost of the previous patch (#2) and adding Boris
(Ostrovsky)'s suggestion regarding the XEN bits.
The previous post can be found at
   https://lore.kernel.org/all/20211122154714.xaoxok3fpk5bg...@linutronix.de/

Sebastian





[PATCH 1/2] x86/xen: Allow to retry if cpu_initialize_context() failed.

2021-12-06 Thread Sebastian Andrzej Siewior
From: Boris Ostrovsky 

If memory allocation in cpu_initialize_context() fails then it will
bring up the VCPU and leave with the corresponding CPU bit set in
xen_cpu_initialized_map.
The following (presumably successful) CPU bring up will BUG in
xen_pv_cpu_up() because nothing for that VCPU would be initialized.

Clear the CPU bits, that were set in cpu_initialize_context() in case
the memory allocation fails.

[ bigeasy: Creating a patch from Boris' email. ]

Signed-off-by: Boris Ostrovsky 
Signed-off-by: Sebastian Andrzej Siewior 
---
 arch/x86/xen/smp_pv.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 6a8f3b53ab834..86368fcef4667 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -277,8 +277,11 @@ cpu_initialize_context(unsigned int cpu, struct 
task_struct *idle)
return 0;
 
ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
-   if (ctxt == NULL)
+   if (ctxt == NULL) {
+   cpumask_clear_cpu(cpu, xen_cpu_initialized_map);
+   cpumask_clear_cpu(cpu, cpu_callout_mask);
return -ENOMEM;
+   }
 
gdt = get_cpu_gdt_rw(cpu);
 
-- 
2.34.1




Re: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again.

2021-12-06 Thread Sebastian Andrzej Siewior
On 2021-11-24 21:17:34 [-0500], Boris Ostrovsky wrote:
> 
> On 11/24/21 5:54 PM, Thomas Gleixner wrote:
> > Any comment from XEN folks?
> 
> 
> If memory allocation in cpu_initialize_context() fails we will not be
> able to bring up the VCPU because xen_cpu_initialized_map bit at the
> top of that routine will already have been set. We will BUG in
> xen_pv_cpu_up() on second (presumably successful) attempt because
> nothing for that VCPU would be initialized. This can in principle be
> fixed by moving allocation to the top of the routine and freeing
> context if the bit in the bitmap is already set.
> 
> 
> Having said that, allocation really should not fail: for PV guests we
> first bring max number of VCPUs up and then offline them down to
> however many need to run. I think if we fail allocation during boot we
> are going to have a really bad day anyway.
> 

So can we keep the patch as-is or are some changes needed?

> -boris

Sebastian



[PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again.

2021-11-22 Thread Sebastian Andrzej Siewior
From: "Longpeng(Mike)" 

A CPU will not show up in virtualized environment which includes an
Enclave. The VM splits its resources into a primary VM and a Enclave
VM. While the Enclave is active, the hypervisor will ignore all requests
to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
The kernel will wait up to ten seconds for CPU to show up
(do_boot_cpu()) and then rollback the hotplug state back to
CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.

After the Enclave VM terminates, the primary VM can bring up the CPU
again.

Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.

[bigeasy: Rewrite commit description.]

Signed-off-by: Longpeng(Mike) 
Signed-off-by: Sebastian Andrzej Siewior 
Link: https://lore.kernel.org/r/20210901051143.2752-1-longpe...@huawei.com
---

For XEN: this changes the behaviour as it allows to invoke
cpu_initialize_context() again should it have have earlier. I *think*
this is okay and would to bring up the CPU again should the memory
allocation in cpu_initialize_context() fail.

 kernel/smpboot.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index f6bc0bc8a2aab..34958d7fe2c1c 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
 */
return -EAGAIN;
 
+   case CPU_UP_PREPARE:
+   /*
+* Timeout while waiting for the CPU to show up. Allow to try
+* again later.
+*/
+   return 0;
+
default:
 
/* Should not happen.  Famous last words. */
-- 
2.33.1




[Xen-devel] [PATCH 22/34] xen: Use CONFIG_PREEMPTION

2019-10-15 Thread Sebastian Andrzej Siewior
From: Thomas Gleixner 

CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT.
Both PREEMPT and PREEMPT_RT require the same functionality which today
depends on CONFIG_PREEMPT.

Switch the preempt anand xen-ops code over to use CONFIG_PREEMPTION.

Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/xen/preempt.c | 4 ++--
 include/xen/xen-ops.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/preempt.c b/drivers/xen/preempt.c
index 8b9919c26095d..70650b248de5d 100644
--- a/drivers/xen/preempt.c
+++ b/drivers/xen/preempt.c
@@ -8,7 +8,7 @@
 #include 
 #include 
 
-#ifndef CONFIG_PREEMPT
+#ifndef CONFIG_PREEMPTION
 
 /*
  * Some hypercalls issued by the toolstack can take many 10s of
@@ -37,4 +37,4 @@ asmlinkage __visible void xen_maybe_preempt_hcall(void)
__this_cpu_write(xen_in_preemptible_hcall, true);
}
 }
-#endif /* CONFIG_PREEMPT */
+#endif /* CONFIG_PREEMPTION */
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index d89969aa9942c..095be1d66f31c 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -215,7 +215,7 @@ bool xen_running_on_version_or_later(unsigned int major, 
unsigned int minor);
 void xen_efi_runtime_setup(void);
 
 
-#ifdef CONFIG_PREEMPT
+#ifdef CONFIG_PREEMPTION
 
 static inline void xen_preemptible_hcall_begin(void)
 {
@@ -239,6 +239,6 @@ static inline void xen_preemptible_hcall_end(void)
__this_cpu_write(xen_in_preemptible_hcall, false);
 }
 
-#endif /* CONFIG_PREEMPT */
+#endif /* CONFIG_PREEMPTION */
 
 #endif /* INCLUDE_XEN_OPS_H */
-- 
2.23.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel