Re: vmm(4): swap in log(9) for printf(9) [vmx 3/3]

2021-11-28 Thread Mark Kettenis
> From: Dave Voutila 
> Date: Sun, 28 Nov 2021 22:51:59 -0500
> 
> The last vmm diff I'll be sending tonight...promise! This swaps out
> usage of printf(9) outside the autoconf(4) functions.
> 
> The reason for this change is printf(9) could acquire a sleepable
> lock.

Huh?

/*
 * printf: print a message to the console and the log
 */
int
printf(const char *fmt, ...)
{
	va_list ap;
	int retval;

	va_start(ap, fmt);
	mtx_enter(&print_mutex);
	retval = kprintf(fmt, TOCONS | TOLOG, NULL, NULL, ap);
	mtx_leave(&print_mutex);
	va_end(ap);
	if (!panicstr)
		logwakeup();

	return(retval);
}

The guts of the code run while holding a mutex, which means they
can't sleep.  And logwakeup() doesn't sleep either.



Re: vmm(4): bump vmclear spinout [vmx 1/3]

2021-11-28 Thread Mike Larkin
On Sun, Nov 28, 2021 at 10:32:47PM -0500, Dave Voutila wrote:
> Smallest of the VMX/VMCS stability diffs. This bumps the spinout to be
> the same number of ticks used by the mplock debug. This is needed on
> older/slower hosts.
>
> ok?
>
> -dv
>
> diff e8c587551f20ba6fdaa0f483ea768aade9f66f7d 
> 981a8cfd4e1dfe412e9c72fb5b47e7e46813bfbb
> blob - a7b21ec75899c81f076143fbe59f14279334ea09
> blob + e335a1dc5e8a400b4bbf49cac2ec8853dffcdae3
> --- sys/arch/amd64/amd64/vmm.c
> +++ sys/arch/amd64/amd64/vmm.c
> @@ -1373,7 +1373,7 @@ vmclear_on_cpu(struct cpu_info *ci)
>  static int
>  vmx_remote_vmclear(struct cpu_info *ci, struct vcpu *vcpu)
>  {
> - int ret = 0, nticks = 10;
> + int ret = 0, nticks = 200000000;
>
>   mtx_enter(&ci->ci_vmcs_mtx);
>   atomic_swap_ulong(&ci->ci_vmcs_pa, vcpu->vc_control_pa);

ok mlarkin



vmm(4): swap in log(9) for printf(9) [vmx 3/3]

2021-11-28 Thread Dave Voutila
The last vmm diff I'll be sending tonight...promise! This swaps out
usage of printf(9) outside the autoconf(4) functions.

The reason for this change is printf(9) could acquire a sleepable
lock. For VMX-based hosts, this can corrupt the active VMCS as the
scheduler may decide to migrate the process to another cpu core without
the same VMCS active. log(9) doesn't sleep.

Users of custom kernels built with VMM_DEBUG will now see vmm(4) debug
logging spam syslog instead of the system message buffer.

ok?

-dv

diff 707f6a4f4311a686c185880828f53a6ad5ee54e9 
e8c587551f20ba6fdaa0f483ea768aade9f66f7d
blob - 76c08badb82e3a90c5b7ab75c1b349fdb3229737
blob + a7b21ec75899c81f076143fbe59f14279334ea09
--- sys/arch/amd64/amd64/vmm.c
+++ sys/arch/amd64/amd64/vmm.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 

@@ -47,14 +48,18 @@
 void *l1tf_flush_region;

 #ifdef VMM_DEBUG
-#define DPRINTF(x...)  do { printf(x); } while(0)
+#define DPRINTF(x...)  do { log(LOG_DEBUG, x); } while(0)
+#define DEBUG_ADD(x...)	do { addlog(x); } while(0)
 #else
 #define DPRINTF(x...)
+#define DEBUG_ADD(x...)
 #endif /* VMM_DEBUG */

+#define ERROR(x...)	do { log(LOG_ERR, x); } while(0)
+
 #define DEVNAME(s)  ((s)->sc_dev.dv_xname)

-#define CTRL_DUMP(x,y,z) printf(" %s: Can set:%s Can clear:%s\n", #z , \
+#define CTRL_DUMP(x,y,z) addlog(" %s: Can set:%s Can clear:%s\n", #z , \
vcpu_vmx_check_cap(x, IA32_VMX_##y ##_CTLS, \
IA32_VMX_##z, 1) ? "Yes" : "No", \
vcpu_vmx_check_cap(x, IA32_VMX_##y ##_CTLS, \
@@ -657,7 +662,7 @@ vm_resetcpu(struct vm_resetcpu_params *vrp)
vm->vm_id, vcpu->vc_id);

if (vcpu_reset_regs(vcpu, &vrp->vrp_init_state)) {
-   printf("%s: failed\n", __func__);
+   ERROR("%s: failed\n", __func__);
 #ifdef VMM_DEBUG
dump_vcpu(vcpu);
 #endif /* VMM_DEBUG */
@@ -705,9 +710,7 @@ vm_intr_pending(struct vm_intr_params *vip)
if (vcpu == NULL)
return (ENOENT);

-   rw_enter_write(&vcpu->vc_lock);
vcpu->vc_intr = vip->vip_intr;
-   rw_exit_write(&vcpu->vc_lock);

return (0);
 }
@@ -1002,7 +1005,7 @@ vmx_mprotect_ept(vm_map_t vm_map, paddr_t sgpa, paddr_
ret = uvm_fault(vm_map, addr, VM_FAULT_WIRE,
PROT_READ | PROT_WRITE | PROT_EXEC);
if (ret)
-   printf("%s: uvm_fault returns %d, GPA=0x%llx\n",
+   ERROR("%s: uvm_fault returns %d, GPA=0x%llx\n",
__func__, ret, (uint64_t)addr);

pte = vmx_pmap_find_pte_ept(pmap, addr);
@@ -1183,7 +1186,7 @@ vmm_start(void)
for (i = 10; (!(ci->ci_flags & CPUF_VMM)) && i>0;i--)
delay(10);
if (!(ci->ci_flags & CPUF_VMM)) {
-   printf("%s: failed to enter VMM mode\n",
+   ERROR("%s: failed to enter VMM mode\n",
ci->ci_dev->dv_xname);
ret = EIO;
}
@@ -1193,7 +1196,7 @@ vmm_start(void)
/* Start VMM on this CPU */
start_vmm_on_cpu(self);
if (!(self->ci_flags & CPUF_VMM)) {
-   printf("%s: failed to enter VMM mode\n",
+   ERROR("%s: failed to enter VMM mode\n",
self->ci_dev->dv_xname);
ret = EIO;
}
@@ -1231,7 +1234,7 @@ vmm_stop(void)
for (i = 10; (ci->ci_flags & CPUF_VMM) && i>0 ;i--)
delay(10);
if (ci->ci_flags & CPUF_VMM) {
-   printf("%s: failed to exit VMM mode\n",
+   ERROR("%s: failed to exit VMM mode\n",
ci->ci_dev->dv_xname);
ret = EIO;
}
@@ -1241,7 +1244,7 @@ vmm_stop(void)
/* Stop VMM on this CPU */
stop_vmm_on_cpu(self);
if (self->ci_flags & CPUF_VMM) {
-   printf("%s: failed to exit VMM mode\n",
+   ERROR("%s: failed to exit VMM mode\n",
self->ci_dev->dv_xname);
ret = EIO;
}
@@ -1379,7 +1382,7 @@ vmx_remote_vmclear(struct cpu_info *ci, struct vcpu *v
while (ci->ci_vmcs_pa != VMX_VMCS_PA_CLEAR) {
CPU_BUSY_CYCLE();
if (--nticks <= 0) {
-   printf("%s: spun out\n", __func__);
+   ERROR("%s: spun out\n", __func__);
ret = 1;
break;
}
@@ -1525,7 +1528,7 @@ vm_create(struct vm_create_params *vcp, struct proc *p
strncpy(vm->vm_name, vcp->vcp_name, VMM_MAX_NAME_LEN - 1);

if (vm_impl_init(vm, p)) {
-   printf("failed to init arch-specific features for vm %p\n", vm);
+   ERROR("failed to init arch-specific features for vm %p\n", vm);

vmm(4): restore vmcs after sleep points [vmx 2/3]

2021-11-28 Thread Dave Voutila
This diff removes instability from VMX-based hosts by either removing
the possibility of the process sleeping while the VMCS is active or
reloading it if we had no choice.

A mutex is added to help guard the VMCS state so testing with witness
has helped verify the diff.

The rwlock on the cpu originally used in the remote vmclear routine is
changed to a mutex accordingly.

This diff does not remove possible calls to printf(9) via the DPRINTF
macro as that's part of the next diff.

One area of note: in vmx_load_pdptes() there's an XXX to call out that,
because of the printf(9) call when km_alloc fails, the VMCS is
potentially no longer valid. The upcoming diff swapping printf(9) for
log(9) will remove that.

ok?

-dv


diff refs/heads/master refs/heads/vmm-vmx-02
blob - cc189771bc76745cd391e9b2a58f46dd92aa32ce
blob + d7c6c592b12b236d14ade863fd1e75d8effc179c
--- sys/arch/amd64/amd64/cpu.c
+++ sys/arch/amd64/amd64/cpu.c
@@ -790,7 +790,7 @@ cpu_init_vmm(struct cpu_info *ci)
&ci->ci_vmxon_region_pa))
panic("Can't locate VMXON region in phys mem");
ci->ci_vmcs_pa = VMX_VMCS_PA_CLEAR;
-   rw_init(&ci->ci_vmcs_lock, "vmcslock");
+   mtx_init(&ci->ci_vmcs_mtx, IPL_MPFLOOR);
}
 }
 #endif /* NVMM > 0 */
blob - 2535558cce5f07f4d6a150ce53b100524801755a
blob + 76c08badb82e3a90c5b7ab75c1b349fdb3229737
--- sys/arch/amd64/amd64/vmm.c
+++ sys/arch/amd64/amd64/vmm.c
@@ -810,11 +810,13 @@ vm_rwregs(struct vm_rwregs_params *vrwp, int dir)

rw_enter_write(&vcpu->vc_lock);
if (vmm_softc->mode == VMM_MODE_VMX ||
-   vmm_softc->mode == VMM_MODE_EPT)
+   vmm_softc->mode == VMM_MODE_EPT) {
+   mtx_enter(&vcpu->vc_vmx_mtx);
ret = (dir == 0) ?
vcpu_readregs_vmx(vcpu, vrwp->vrwp_mask, vrs) :
vcpu_writeregs_vmx(vcpu, vrwp->vrwp_mask, 1, vrs);
-   else if (vmm_softc->mode == VMM_MODE_SVM ||
+   mtx_leave(&vcpu->vc_vmx_mtx);
+   } else if (vmm_softc->mode == VMM_MODE_SVM ||
vmm_softc->mode == VMM_MODE_RVI)
ret = (dir == 0) ?
vcpu_readregs_svm(vcpu, vrwp->vrwp_mask, vrs) :
@@ -1370,7 +1372,7 @@ vmx_remote_vmclear(struct cpu_info *ci, struct vcpu *v
 {
int ret = 0, nticks = 10;

-   rw_enter_write(&ci->ci_vmcs_lock);
+   mtx_enter(&ci->ci_vmcs_mtx);
atomic_swap_ulong(&ci->ci_vmcs_pa, vcpu->vc_control_pa);
x86_send_ipi(ci, X86_IPI_VMCLEAR_VMM);

@@ -1383,7 +1385,7 @@ vmx_remote_vmclear(struct cpu_info *ci, struct vcpu *v
}
}
atomic_swap_uint(&vcpu->vc_vmx_vmcs_state, VMCS_CLEARED);
-   rw_exit_write(&ci->ci_vmcs_lock);
+   mtx_leave(&ci->ci_vmcs_mtx);

return (ret);
 }
@@ -1785,6 +1787,7 @@ vcpu_reload_vmcs_vmx(struct vcpu *vcpu)
struct cpu_info *ci, *last_ci;

rw_assert_wrlock(&vcpu->vc_lock);
+   MUTEX_ASSERT_LOCKED(&vcpu->vc_vmx_mtx);

ci = curcpu();
last_ci = vcpu->vc_last_pcpu;
@@ -1838,6 +1841,8 @@ vcpu_readregs_vmx(struct vcpu *vcpu, uint64_t regmask,
struct vcpu_segment_info *sregs = vrs->vrs_sregs;
struct vmx_msr_store *msr_store;

+   MUTEX_ASSERT_LOCKED(&vcpu->vc_vmx_mtx);
+
 #ifdef VMM_DEBUG
/* VMCS should be loaded... */
paddr_t pa = 0ULL;
@@ -2113,6 +2118,8 @@ vcpu_writeregs_vmx(struct vcpu *vcpu, uint64_t regmask
struct vcpu_segment_info *sregs = vrs->vrs_sregs;
struct vmx_msr_store *msr_store;

+   MUTEX_ASSERT_LOCKED(&vcpu->vc_vmx_mtx);
+
if (loadvmcs) {
if (vcpu_reload_vmcs_vmx(vcpu))
return (EINVAL);
@@ -2744,6 +2751,7 @@ vcpu_reset_regs_vmx(struct vcpu *vcpu, struct vcpu_reg
struct vmx_msr_store *msr_store;

rw_assert_wrlock(&vcpu->vc_lock);
+   mtx_enter(&vcpu->vc_vmx_mtx);

cr0 = vrs->vrs_crs[VCPU_REGS_CR0];

@@ -3028,12 +3036,25 @@ vcpu_reset_regs_vmx(struct vcpu *vcpu, struct vcpu_reg
IA32_VMX_ACTIVATE_SECONDARY_CONTROLS, 1)) {
if (vcpu_vmx_check_cap(vcpu, IA32_VMX_PROCBASED2_CTLS,
IA32_VMX_ENABLE_VPID, 1)) {
+
+   /* We need to drop the mutex before acquiring a vpid */
+   vcpu->vc_last_pcpu = curcpu();
+   mtx_leave(&vcpu->vc_vmx_mtx);
+
if (vmm_alloc_vpid()) {
DPRINTF("%s: could not allocate VPID\n",
__func__);
ret = EINVAL;
+   goto exit_no_mtx;
+   }
+
+   mtx_enter(&vcpu->vc_vmx_mtx);
+   if (vcpu_reload_vmcs_vmx(vcpu)) {
+   printf("%s: failed to reload vmcs\n", __func__);
+   ret = EINVAL;
goto exit;
}
+
if (vmwrite(VMCS_GUEST_VPID, vpid)) {
 

vmm(4): bump vmclear spinout [vmx 1/3]

2021-11-28 Thread Dave Voutila
Smallest of the VMX/VMCS stability diffs. This bumps the spinout to be
the same number of ticks used by the mplock debug. This is needed on
older/slower hosts.

ok?

-dv

diff e8c587551f20ba6fdaa0f483ea768aade9f66f7d 
981a8cfd4e1dfe412e9c72fb5b47e7e46813bfbb
blob - a7b21ec75899c81f076143fbe59f14279334ea09
blob + e335a1dc5e8a400b4bbf49cac2ec8853dffcdae3
--- sys/arch/amd64/amd64/vmm.c
+++ sys/arch/amd64/amd64/vmm.c
@@ -1373,7 +1373,7 @@ vmclear_on_cpu(struct cpu_info *ci)
 static int
 vmx_remote_vmclear(struct cpu_info *ci, struct vcpu *vcpu)
 {
-   int ret = 0, nticks = 10;
+   int ret = 0, nticks = 200000000;

mtx_enter(&ci->ci_vmcs_mtx);
atomic_swap_ulong(&ci->ci_vmcs_pa, vcpu->vc_control_pa);



mpsafe dwge(4)

2021-11-28 Thread Jonathan Matthew
This applies our normal strategies for making network drivers mpsafe, and
also writes to GMAC_TX_POLL_DEMAND once per call to dwge_start() rather than
once per packet, and returns rx slots once per interrupt rather than once
per packet.

I've tested this on a rockpro64, where it makes tcpbench etc. a bit faster.
I think I have an armv7 board with dwge(4) somewhere, but I haven't tested it
there yet.

ok?

Index: if_dwge.c
===================================================================
RCS file: /cvs/src/sys/dev/fdt/if_dwge.c,v
retrieving revision 1.12
diff -u -p -r1.12 if_dwge.c
--- if_dwge.c   24 Oct 2021 17:52:26 -  1.12
+++ if_dwge.c   28 Nov 2021 09:36:56 -
@@ -234,6 +234,7 @@ struct dwge_softc {
bus_space_tag_t sc_iot;
bus_space_handle_t  sc_ioh;
bus_dma_tag_t   sc_dmat;
+   void*sc_ih;
 
struct arpcom   sc_ac;
 #define sc_lladdr  sc_ac.ac_enaddr
@@ -247,7 +248,6 @@ struct dwge_softc {
struct dwge_buf *sc_txbuf;
struct dwge_desc*sc_txdesc;
int sc_tx_prod;
-   int sc_tx_cnt;
int sc_tx_cons;
 
struct dwge_dmamem  *sc_rxring;
@@ -289,7 +289,7 @@ uint32_t dwge_read(struct dwge_softc *, 
 void   dwge_write(struct dwge_softc *, bus_addr_t, uint32_t);
 
 intdwge_ioctl(struct ifnet *, u_long, caddr_t);
-void   dwge_start(struct ifnet *);
+void   dwge_start(struct ifqueue *);
 void   dwge_watchdog(struct ifnet *);
 
 intdwge_media_change(struct ifnet *);
@@ -312,7 +312,7 @@ voiddwge_rx_proc(struct dwge_softc *);
 void   dwge_up(struct dwge_softc *);
 void   dwge_down(struct dwge_softc *);
 void   dwge_iff(struct dwge_softc *);
-intdwge_encap(struct dwge_softc *, struct mbuf *, int *);
+intdwge_encap(struct dwge_softc *, struct mbuf *, int *, int *);
 
 void   dwge_reset(struct dwge_softc *);
 void   dwge_stop_dma(struct dwge_softc *);
@@ -422,8 +422,9 @@ dwge_attach(struct device *parent, struc
ifp = >sc_ac.ac_if;
ifp->if_softc = sc;
ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
+   ifp->if_xflags = IFXF_MPSAFE;
ifp->if_ioctl = dwge_ioctl;
-   ifp->if_start = dwge_start;
+   ifp->if_qstart = dwge_start;
ifp->if_watchdog = dwge_watchdog;
ifq_set_maxlen(&ifp->if_snd, DWGE_NTXDESC - 1);
bcopy(sc->sc_dev.dv_xname, ifp->if_xname, IFNAMSIZ);
@@ -535,8 +536,10 @@ dwge_attach(struct device *parent, struc
dwge_write(sc, GMAC_MMC_TX_INT_MSK, 0x);
dwge_write(sc, GMAC_MMC_IPC_INT_MSK, 0x);
 
-   fdt_intr_establish(faa->fa_node, IPL_NET, dwge_intr, sc,
-   sc->sc_dev.dv_xname);
+   sc->sc_ih = fdt_intr_establish(faa->fa_node, IPL_NET | IPL_MPSAFE,
+   dwge_intr, sc, sc->sc_dev.dv_xname);
+   if (sc->sc_ih == NULL)
+   printf("%s: can't establish interrupt\n", sc->sc_dev.dv_xname);
 }
 
 void
@@ -612,11 +615,12 @@ dwge_lladdr_write(struct dwge_softc *sc)
 }
 
 void
-dwge_start(struct ifnet *ifp)
+dwge_start(struct ifqueue *ifq)
 {
+   struct ifnet *ifp = ifq->ifq_if;
struct dwge_softc *sc = ifp->if_softc;
struct mbuf *m;
-   int error, idx;
+   int error, idx, left, used;
 
if (!(ifp->if_flags & IFF_RUNNING))
return;
@@ -628,27 +632,29 @@ dwge_start(struct ifnet *ifp)
return;
 
idx = sc->sc_tx_prod;
-   while ((sc->sc_txdesc[idx].sd_status & TDES0_OWN) == 0) {
-   m = ifq_deq_begin(&ifp->if_snd);
-   if (m == NULL)
+   left = sc->sc_tx_cons;
+   if (left <= idx)
+   left += DWGE_NTXDESC;
+   left -= idx;
+   used = 0;
+
+   for (;;) {
+   if (used + DWGE_NTXSEGS + 1 > left) {
+   ifq_set_oactive(ifq);
break;
+   }
 
-   error = dwge_encap(sc, m, &idx);
-   if (error == ENOBUFS) {
-   ifq_deq_rollback(&ifp->if_snd, m);
-   ifq_set_oactive(&ifp->if_snd);
+   m = ifq_dequeue(ifq);
+   if (m == NULL)
break;
-   }
+
+   error = dwge_encap(sc, m, &idx, &used);
if (error == EFBIG) {
-   ifq_deq_commit(&ifp->if_snd, m);
m_freem(m); /* give up: drop it */
ifp->if_oerrors++;
continue;
}
 
-   /* Now we are committed to transmit the packet. */
-   ifq_deq_commit(&ifp->if_snd, m);
-
 #if NBPFILTER > 0
if (ifp->if_bpf)
bpf_mtap(ifp->if_bpf, m, BPF_DIRECTION_OUT);
@@ -660,6 +666,8 @@ dwge_start(struct ifnet *ifp)
 
/* Set a timeout in case the chip goes out to lunch. */
ifp->if_timer = 5;
+
+   dwge_write(sc, 

Re: vmm(4) testers wanted: VMX stability fixes

2021-11-28 Thread Dave Voutila


Dave Voutila  writes:

> Greetings tech@,
>
> I'm looking for testers of the following diff which addresses spurious
> VMCS corruption on Intel VMX hosts. If since upgrading to 7.0 or
> following -current you've had guests die and errors about failure to
> vmresume, please try this diff. I've had it running with no issues on
> some hardware mlarkin has that was showing frequent issues, as well as
> on my own systems.
>

Friendly bump. Have had some testers, but looking for last call as I
carve this up into the 3-4 diffs I'm going to look to commit.

> If you do, please compile with WITNESS support. I'm temporarily using
> mutex(9) to help identify any possible sleeping locks when working with
> VMCS. I might swap back to rwlock(9) usage before commit as there's no
> reason we need to spin...it's just for leveraging WITNESS to see
> if/where we try acquiring a sleepable lock when we don't expect to.
>
> -- AMD Folks
>
> Even if you run on AMD SVM hardware, please try and look for any
> regressions. The only changes you should see are all vmm(4) messages
> after probe/attach are sent to syslog and not directly to the console
> via printf(9).
>
> -- What's changed? This diff is huge :(
>
> The simplest change is bumping up the vmclear ipi spin counter to a
> higher value (matching the one used by MP_LOCKDEBUG), as slower
> hardware (like mlarkin's) can spin out when the host is under heavy
> load. This will be committed on its own, but is worth including in this
> diff for testing.
>
> The major changes address sleep points in vmm(4) in critical paths
> related to VMCS state. Sleeping on any locks, such as the remote vmclear
> via ipi, may result in the scheduler stealing the process and moving it
> to a free cpu. Before any subsequent vmresume/vmwrite/vmread
> instruction, the VMCS needs to be remotely cleared and then reloaded,
> otherwise those instructions will fail.
>
> The diff either removes possible sleep points (swapping out printf(9)
> for log(9)) or performs a proper vmclear/vmptrld dance where we can't
> remove the sleep point (e.g. uvm_fault(9), km_alloc(9)).
>
> This diff is not ready for OK as it will need to be split into 2-3
> commits to keep manageable, but for testing really needs to be used in
> concert.
>
> THANKS!
>
> -dv
>
>
> diff refs/heads/master refs/heads/vmresume-vmx
> blob - cc189771bc76745cd391e9b2a58f46dd92aa32ce
> blob + d7c6c592b12b236d14ade863fd1e75d8effc179c
> --- sys/arch/amd64/amd64/cpu.c
> +++ sys/arch/amd64/amd64/cpu.c
> @@ -790,7 +790,7 @@ cpu_init_vmm(struct cpu_info *ci)
>   &ci->ci_vmxon_region_pa))
>   panic("Can't locate VMXON region in phys mem");
>   ci->ci_vmcs_pa = VMX_VMCS_PA_CLEAR;
> - rw_init(&ci->ci_vmcs_lock, "vmcslock");
> + mtx_init(&ci->ci_vmcs_mtx, IPL_MPFLOOR);
>   }
>  }
>  #endif /* NVMM > 0 */
> blob - 2535558cce5f07f4d6a150ce53b100524801755a
> blob + e335a1dc5e8a400b4bbf49cac2ec8853dffcdae3
> --- sys/arch/amd64/amd64/vmm.c
> +++ sys/arch/amd64/amd64/vmm.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>
> @@ -47,14 +48,18 @@
>  void *l1tf_flush_region;
>
>  #ifdef VMM_DEBUG
> -#define DPRINTF(x...)do { printf(x); } while(0)
> +#define DPRINTF(x...)do { log(LOG_DEBUG, x); } while(0)
> +#define DEBUG_ADD(x...)  do { addlog(x); } while(0)
>  #else
>  #define DPRINTF(x...)
> +#define DEBUG_ADD(x...)
>  #endif /* VMM_DEBUG */
>
> +#define ERROR(x...)  do { log(LOG_ERR, x); } while(0)
> +
>  #define DEVNAME(s)  ((s)->sc_dev.dv_xname)
>
> -#define CTRL_DUMP(x,y,z) printf(" %s: Can set:%s Can clear:%s\n", #z , \
> +#define CTRL_DUMP(x,y,z) addlog(" %s: Can set:%s Can clear:%s\n", #z , \
>   vcpu_vmx_check_cap(x, IA32_VMX_##y ##_CTLS, \
>   IA32_VMX_##z, 1) ? "Yes" : "No", \
>   vcpu_vmx_check_cap(x, IA32_VMX_##y ##_CTLS, \
> @@ -657,7 +662,7 @@ vm_resetcpu(struct vm_resetcpu_params *vrp)
>   vm->vm_id, vcpu->vc_id);
>
>   if (vcpu_reset_regs(vcpu, >vrp_init_state)) {
> - printf("%s: failed\n", __func__);
> + ERROR("%s: failed\n", __func__);
>  #ifdef VMM_DEBUG
>   dump_vcpu(vcpu);
>  #endif /* VMM_DEBUG */
> @@ -705,9 +710,7 @@ vm_intr_pending(struct vm_intr_params *vip)
>   if (vcpu == NULL)
>   return (ENOENT);
>
> - rw_enter_write(&vcpu->vc_lock);
>   vcpu->vc_intr = vip->vip_intr;
> - rw_exit_write(&vcpu->vc_lock);
>
>   return (0);
>  }
> @@ -810,11 +813,13 @@ vm_rwregs(struct vm_rwregs_params *vrwp, int dir)
>
>   rw_enter_write(&vcpu->vc_lock);
>   if (vmm_softc->mode == VMM_MODE_VMX ||
> - vmm_softc->mode == VMM_MODE_EPT)
> + vmm_softc->mode == VMM_MODE_EPT) {
> + mtx_enter(&vcpu->vc_vmx_mtx);
>   ret = (dir == 0) ?
>   vcpu_readregs_vmx(vcpu, vrwp->vrwp_mask, vrs) :
>   vcpu_writeregs_vmx(vcpu, vrwp->vrwp_mask, 1, vrs);

Re: slaacd(8): prevent crash when interface disappears

2021-11-28 Thread Florian Obser
On 2021-11-28 05:13 UTC, Klemens Nanni  wrote:
> On Thu, Nov 18, 2021 at 09:02:00AM +0100, Florian Obser wrote:
>> This might be the crash kn@ was seeing once in a blue moon.
>
> I somewhat doubt it, since slaacd crashed on my notebook using trunk(4)
> over em(4) and athn(4), and none of these interfaces gets destroyed.
>
> I toggle the autoconf, set IPs, etc. but once trunk0 has its interface
> index, it'll stay the same until reboot.

Oh, but this is not just about autoconf interfaces, it's about *all*
interfaces. Look at the RTM_NEWADDR case in handle_route_message().

If it gets a NULL if_name for *any* interface it happily passes this to
update_iface() -> get_flags() -> strlcpy -> boom. It doesn't yet know if
it's an autoconf interface, it currently tries to figure that out.

You can probably crash slaacd like this:

while :; do ifconfig vether0 inet 2001:db8::23/64; ifconfig vether0 destroy; 
done

The route socket is restricted to AF_INET6, so you need to fiddle around
with IPv6 addresses...

-- 
I'm not entirely sure you are real.



Re: slaacd(8): prevent crash when interface disappears

2021-11-28 Thread Klemens Nanni
On Thu, Nov 18, 2021 at 09:02:00AM +0100, Florian Obser wrote:
> This is split in two for easier review and I also intend to commit it
> like this.
> 
> The first diff shuffles setting of if_index around so that it's
> available in all switch cases and uses it consistently instead of
> ifm->ifm_index.
> 
> That way we can copy the if_name == NULL check into each case block
> preventing crashes when an interface disappears at just the wrong
> moment.
> 
> OK?
> 
> (btw. I found this because I transmogrified slaacd into gelatod and kn@
> reported a crash while running in debug mode so I could spot what's
> wrong with it. Much harder to find in slaacd...)
> 
> commit 2880598cb424e5d889ecdbf06d9d72d777b11569
> Author: Florian Obser 
> Date:   Wed Nov 17 20:13:55 2021 +0100
> 
> normalize if_index handling in route messages

OK kn

> commit e8882f2329db1789914358c1c6b56ddec74fc35f
> Author: Florian Obser 
> Date:   Wed Nov 17 20:17:29 2021 +0100
> 
> Make sure the interface still exists before updating it.
> 
> When we get a route message, for example an address being added
> (RTM_NEWADDR, but the problem exists with most of the route messages)
> and the interface gets unplugged at just the right moment
> if_nametoindex(3) will return NULL. We will pass NULL through
> update_iface() to get_xflags() which will then crash because we
> dereference the NULL pointer there.

OK kn

> This might be the crash kn@ was seeing once in a blue moon.

I somewhat doubt it, since slaacd crashed on my notebook using trunk(4)
over em(4) and athn(4), and none of these interfaces gets destroyed.

I toggle the autoconf, set IPs, etc. but once trunk0 has its interface
index, it'll stay the same until reboot.