Re: [External] : Re: XCP-ng, OpenBSD and network interface changes

2021-02-01 Thread Denis Fondras
On Mon, Feb 01, 2021 at 01:49:09PM +0100, Alexandr Nedvedicky wrote:
> Hello Denis,
> 
> I think we need to refresh expected value in 'flags'
> with every loop iteration.  does diff below help?
> 

Thank you, but it does not help. I get the same panic, and the same panic again
if I test with loop++ > 10.

With loop++ > 100 there is no more panic, but I get:

xnf0 detached
xen0: failed to attach "device/vif/"


> regards
> sashan
> 
> 8<---8<---8<--8<
> diff --git a/sys/dev/pv/xen.c b/sys/dev/pv/xen.c
> index 11ce4ca99cd..c93e68614b4 100644
> --- a/sys/dev/pv/xen.c
> +++ b/sys/dev/pv/xen.c
> @@ -1202,20 +1202,22 @@ xen_grant_table_remove(struct xen_softc *sc, grant_ref_t ref)
>   flags = (ge->ge_table[ref].flags & ~(GTF_reading|GTF_writing)) |
>   (ge->ge_table[ref].domid << 16);
>   loop = 0;
>   while (atomic_cas_uint(ptr, flags, GTF_invalid) != flags) {
>   if (loop++ > 10) {
>   panic("grant table reference %u is held "
>   "by domain %d: frame %#x flags %#x",
>   ref + ge->ge_start, ge->ge_table[ref].domid,
>   ge->ge_table[ref].frame, ge->ge_table[ref].flags);
>   }
> + flags = (ge->ge_table[ref].flags & ~(GTF_reading|GTF_writing)) |
> + (ge->ge_table[ref].domid << 16);
>  #if (defined(__amd64__) || defined(__i386__))
>   __asm volatile("pause": : : "memory");
>  #endif
>   }
>   ge->ge_table[ref].frame = 0x;
>  }
>  
>  int
>  xen_bus_dmamap_create(bus_dma_tag_t t, bus_size_t size, int nsegments,
>  bus_size_t maxsegsz, bus_size_t boundary, int flags, bus_dmamap_t *dmamp)
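
For readers following the thread, the idea in the diff above (recompute the
expected value on every pass of the compare-and-swap loop) can be modelled in
user space. This is only a sketch under stated assumptions: it uses C11
atomics instead of the kernel's atomic_cas_uint(), try_invalidate() is a
hypothetical name, the entry is treated as one 32-bit word with flags in the
low half and domid in the high half, and the GTF_* bit values are taken from
Xen's public grant_table.h.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Bit values as in Xen's public grant_table.h. */
#define GTF_invalid	0U
#define GTF_reading	(1U << 3)
#define GTF_writing	(1U << 4)

/*
 * Hypothetical user-space model of the retry loop.
 * Returns the number of failed attempts before the entry was invalidated,
 * or -1 if more than 'limit' attempts failed.
 */
static int
try_invalidate(_Atomic uint32_t *entry, int limit)
{
	int loop = 0;

	for (;;) {
		/* Refresh the expected value from the live entry on every pass. */
		uint32_t expected =
		    atomic_load(entry) & ~(GTF_reading | GTF_writing);

		/* Swap in GTF_invalid only if the entry equals 'expected'. */
		if (atomic_compare_exchange_strong(entry, &expected, GTF_invalid))
			return loop;
		if (loop++ > limit)
			return -1;
	}
}
```

In this model an entry word of 0x19 never matches the expected value 0x01, no
matter how often the expected value is recomputed, because the in-use bits are
masked out of the expected value each time. That is one possible reading of
why the refreshed-flags diff still panics; it is not a confirmed diagnosis.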



Re: XCP-ng, OpenBSD and network interface changes

2021-02-01 Thread Mike Belopuhov
On Sun, Jan 31, 2021 at 2:59 PM Denis Fondras wrote:

> I am using XCP-ng with the latest OpenBSD snapshot.
>
> Whenever I make a hardware change in networking on the VM (connect or
> disconnect an interface, change associated network), the VM panics:
>
> openbsd# panic: grant table reference 5912 is held by domain 0: frame 0x1f1a4 flags 0x19
> Stopped at   db_enter+0x10: popq %rbp
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *349758  65579      0     0x14000      0x200    0  xenwatch
> db_enter() at db_enter+0x10
> panic(81da7541) at panic+0x12a
> xen_bus_dmamap_unload(820ede50,800e9380) at xen_bus_dmamap_unload+0x138
> xnf_tx_ring_destroy(80162000) at xnf_tx_ring_destroy+0x104
> xnf_detach(80162000,0) at xnf_detach+0x55
> config_detach(80162000,0) at config_detach+0x140
> xen_hotplug(8012e200) at xen_hotplug+0x181
> taskq_thread(800dde00) at taskq_thread+0x66
> end trace frame: 0x0, count: 7
> https://www.openbsd.org/ddb.html describes the minimum info required in
> bug reports. Insufficient info makes it difficult to find and fix bugs.
> ddb>
>
> If I apply the following patch, it obviously does not panic and seems to
> work correctly:
>
>
Hi Denis,

This is not a real fix, unfortunately; you're just ignoring the issue.
Somehow the grant table reference is not released when we perform the
detach.
You can try increasing the number of iterations to 1 (or more), for example,
and see if this is a timing issue.

Cheers,
Mike
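
For reference, the flags value 0x19 reported in the panic can be decoded
against the bit definitions in Xen's public grant_table.h. The following is a
minimal user-space sketch, not code from xen.c: grant_in_use() and
expected_word() are hypothetical helper names, and the 32-bit layout (flags in
the low half, domid in the high half) is an assumption based on the expression
used in xen_grant_table_remove().

```c
#include <stdint.h>

/* Bit values as in Xen's public grant_table.h. */
#define GTF_permit_access	(1U << 0)
#define GTF_readonly		(1U << 2)
#define GTF_reading		(1U << 3)
#define GTF_writing		(1U << 4)

/* Nonzero if the remote domain still has the granted frame in use. */
static int
grant_in_use(uint16_t flags)
{
	return (flags & (GTF_reading | GTF_writing)) != 0;
}

/*
 * The word the removal loop expects to find in the grant table:
 * in-use bits masked off, domid shifted into the high half.
 */
static uint32_t
expected_word(uint16_t flags, uint16_t domid)
{
	return (flags & ~(GTF_reading | GTF_writing)) |
	    ((uint32_t)domid << 16);
}
```

With flags 0x19 and domid 0, GTF_permit_access, GTF_reading and GTF_writing
are all set, and the expected word works out to 0x01. That would be consistent
with the backend in domain 0 still holding the grant at detach time.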


> Index: xen.c
> ===
> RCS file: /cvs/src/sys/dev/pv/xen.c,v
> retrieving revision 1.97
> diff -u -p -r1.97 xen.c
> --- xen.c   29 Jun 2020 06:50:52 -  1.97
> +++ xen.c   31 Jan 2021 13:13:07 -
> @@ -1204,7 +1204,7 @@ xen_grant_table_remove(struct xen_softc
> loop = 0;
> while (atomic_cas_uint(ptr, flags, GTF_invalid) != flags) {
> if (loop++ > 10) {
> -   panic("grant table reference %u is held "
> +   printf("grant table reference %u is held "
> "by domain %d: frame %#x flags %#x",
> ref + ge->ge_start, ge->ge_table[ref].domid,
> ge->ge_table[ref].frame,
> ge->ge_table[ref].flags);
>
> Can someone give me a clue on what _atomic_cas_uint() is ?
>
> Thank you in advance.
>
> Denis

XCP-ng, OpenBSD and network interface changes

2021-01-31 Thread Denis Fondras
I am using XCP-ng with the latest OpenBSD snapshot.

Whenever I make a hardware change in networking on the VM (connect or
disconnect an interface, change associated network), the VM panics:

openbsd# panic: grant table reference 5912 is held by domain 0: frame 0x1f1a4 flags 0x19
Stopped at   db_enter+0x10: popq %rbp
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*349758  65579      0     0x14000      0x200    0  xenwatch
db_enter() at db_enter+0x10
panic(81da7541) at panic+0x12a
xen_bus_dmamap_unload(820ede50,800e9380) at xen_bus_dmamap_unload+0x138
xnf_tx_ring_destroy(80162000) at xnf_tx_ring_destroy+0x104
xnf_detach(80162000,0) at xnf_detach+0x55
config_detach(80162000,0) at config_detach+0x140
xen_hotplug(8012e200) at xen_hotplug+0x181
taskq_thread(800dde00) at taskq_thread+0x66
end trace frame: 0x0, count: 7
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb>

If I apply the following patch, it obviously does not panic and seems to work
correctly:

Index: xen.c
===
RCS file: /cvs/src/sys/dev/pv/xen.c,v
retrieving revision 1.97
diff -u -p -r1.97 xen.c
--- xen.c   29 Jun 2020 06:50:52 -  1.97
+++ xen.c   31 Jan 2021 13:13:07 -
@@ -1204,7 +1204,7 @@ xen_grant_table_remove(struct xen_softc 
loop = 0;
while (atomic_cas_uint(ptr, flags, GTF_invalid) != flags) {
if (loop++ > 10) {
-   panic("grant table reference %u is held "
+   printf("grant table reference %u is held "
"by domain %d: frame %#x flags %#x",
ref + ge->ge_start, ge->ge_table[ref].domid,
ge->ge_table[ref].frame, ge->ge_table[ref].flags);

Can someone give me a clue about what _atomic_cas_uint() is?
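
As background to this question: compare-and-swap primitives of this kind
atomically compare a word with an expected value, store a new value only if
they match, and hand back the value that was there before the attempt. The
sketch below models that contract with C11 atomics in user space. cas_uint()
is a hypothetical stand-in written for illustration, not the kernel's
implementation.

```c
#include <stdatomic.h>

/*
 * Illustrative stand-in for a compare-and-swap primitive.
 * If *p equals 'expected', store 'desired' into *p.
 * In both cases, return the value *p held before the call.
 */
static unsigned int
cas_uint(_Atomic unsigned int *p, unsigned int expected, unsigned int desired)
{
	/* On failure, C11 writes the observed value back into 'expected'. */
	atomic_compare_exchange_strong(p, &expected, desired);
	return expected;
}
```

A caller detects success by checking that the returned value equals the
expected value it passed in, which is the same shape as the
`while (atomic_cas_uint(ptr, flags, GTF_invalid) != flags)` test in the patch.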

Thank you in advance.

Denis

OpenBSD 6.8-current (GENERIC) #9: Sun Jan 31 14:08:42 CET 2021
r...@openbsd.lab.ledeuns.net:/sys/arch/amd64/compile/GENERIC
real mem = 1052770304 (1004MB)
avail mem = 1005694976 (959MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xeb01f (11 entries)
bios0: vendor Xen version "4.13" date 01/21/2021
bios0: Xen HVM domU
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S5
acpi0: tables DSDT FACP APIC HPET WAET
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 11, 48 pins, remapped
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-2407 v2 @ 2.40GHz, 2394.83 MHz, 06-3e-04
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,ACPI,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 100MHz
acpihpet0 at acpi0: 6250 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpipci0 at acpi0 PCI0
acpicmos0 at acpi0
"ACPI0007" at acpi0 not configured
acpicpu0 at acpi0: C1(@1 halt!)
cpu0: using VERW MDS workaround (except on vmm entry)
pvbus0 at mainbus0: Hyper-V 0.0, Xen 4.13
xen0 at pvbus0: features 0x2705, 64 grant table frames, event channel 2
xbf0 at xen0 backend 0 channel 6: disk
scsibus1 at xbf0: 1 targets
sd0 at scsibus1 targ 0 lun 0: 
sd0: 10240MB, 512 bytes/sector, 20971520 sectors
xbf1 at xen0 backend 0 channel 7: cdrom
xbf1: timed out waiting for backend to connect
xnf0 at xen0 backend 0 channel 7: address 76:88:23:28:25:f4
xnf1 at xen0 backend 0 channel 8: address 62:36:ed:68:46:3c
xnf2 at xen0 backend 0 channel 9: address be:04:e2:f3:7d:75
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
pciide0: channel 0 disabled (no drives)
atapiscsi0 at pciide0 channel 1 drive 1
scsibus2 at atapiscsi0: 2 targets
cd0 at scsibus2 targ 0 lun 0:  removable
cd0(pciide0:1:1): using PIO mode 4, DMA mode 2
uhci0 at pci0 dev 1 function 2 "Intel 82371SB USB" rev 0x01: apic 1 int 23
piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x01: SMBus disabled
vga1 at pci0 dev 2 function 0 "Cirrus Logic CL-GD5446" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
xspd0 at pci0 dev 3 function 0 "XenSource Platform Device" rev 0x01
isa0 at pcib0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
com0 at