Re: About virtio-scsi and\or scsi.
On Wed, Jul 29, 2015 at 4:53 PM, Eliezer Croitoru elie...@ngtech.co.il wrote:

> I am testing a couple of VMs under KVM, and from my tests it seems that there might not be support for hot-plug of virtio disks or virtio-scsi disks in FreeBSD?

Hot plug of VirtIO block devices is not supported, but that is more because of a lack of PCI hot plug support. Hot plugging of disks to an existing VirtIO SCSI adapter is supported.

> I wanted to make sure I correctly understand the situation FreeBSD is in right now. If anyone knows, please reply.
>
> Thanks,
> Eliezer

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [CFT] Paravirtualized KVM clock
On Wed, Jan 21, 2015 at 3:15 PM, Peter Jeremy pe...@rulingia.com wrote:

> On 2015-Jan-04 11:56:14 -0600, Bryan Venteicher bry...@daemoninthecloset.org wrote:
>> For the last few weeks, I've been working on adding support for KVM clock in the projects/paravirt branch. Currently, a KVM VM guest will end up selecting either the HPET or ACPI as the timecounter source. Unfortunately, this is very costly since every timecounter fetch causes a VM exit. KVM clock allows the guest to use the TSC instead; it is very similar to the existing Xen timer.
>
> A somewhat late response, but have you looked at https://github.com/blitz/freebsd/commit/cdc5f872b3e48cc0dda031fc7d6bdedc65c3148f ? I've been running this[*] on a Google Compute Engine instance for about 6 months without problems.

A goal of my work was to put a bit of infrastructure in place so FreeBSD can support pvops across a variety of hypervisors. KVMCLOCK happens to be about the easiest to implement, and has a decent performance win for many situations.

I think that commit is broken on SMP guests: CPU_FOREACH() does not switch the current CPU, so it just keeps writing to the MSR on the BSP.

> [*] I had to patch out the test for KVM_FEATURE_CLOCKSOURCE_STABLE_BIT but I think that's a GCE issue.
>
> --
> Peter Jeremy
Re: DigitalOcean offers VMs with FreeBSD!
On Thu, Jan 15, 2015 at 9:44 AM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:

> On Thu, Jan 15, 2015 at 06:28:23PM +0300, Lev Serebryakov wrote:
>> On 15.01.2015 14:29, Lev Serebryakov wrote:
>>> https://www.digitalocean.com/company/blog/presenting-freebsd-how-we-made-it-happen/
>>>
>>> I didn't see this news on mailing lists :)
>>
>> But there are some threads about FreeBSD being way slower than Linux in these virtual installations: https://news.ycombinator.com/item?id=487
>
> Maybe IOPS quotas? Can you test with dd and a custom kernel with MAXPHYS=1048576?

What's the value of kern.timecounter.hardware? It will likely be either HPET or ACPI, which means there is a VM exit whenever the guest reads from the emulated timecounter hardware.

That's why I have some WIP to add support for KVMCLOCK [1]. I hope to merge those changes to HEAD in a week and to STABLE shortly after. In the meantime, a not completely foolproof workaround is to use the TSC-low timecounter source.

[1] - https://lists.freebsd.org/pipermail/freebsd-arch/2015-January/016587.html
[CFT] Paravirtualized KVM clock
(uint64_t delta, uint32_t mul_frac, int shift)
-{
-	uint64_t product;
-
-	if (shift < 0)
-		delta >>= -shift;
-	else
-		delta <<= shift;
-
-#if defined(__i386__)
-	{
-		uint32_t tmp1, tmp2;
-
-		/**
-		 * For i386, the formula looks like:
-		 *
-		 *   lower = (mul_frac * (delta & UINT_MAX)) >> 32
-		 *   upper = mul_frac * (delta >> 32)
-		 *   product = lower + upper
-		 */
-		__asm__ (
-			"mul  %5       ; "
-			"mov  %4,%%eax ; "
-			"mov  %%edx,%4 ; "
-			"mul  %5       ; "
-			"xor  %5,%5    ; "
-			"add  %4,%%eax ; "
-			"adc  %5,%%edx ; "
-			: "=A" (product), "=r" (tmp1), "=r" (tmp2)
-			: "a" ((uint32_t)delta), "1" ((uint32_t)(delta >> 32)),
-			  "2" (mul_frac) );
-	}
-#elif defined(__amd64__)
-	{
-		unsigned long tmp;
-
-		__asm__ (
-			"mulq %[mul_frac] ; shrd $32, %[hi], %[lo]"
-			: [lo]"=a" (product), [hi]"=d" (tmp)
-			: "0" (delta), [mul_frac]"rm"((uint64_t)mul_frac));
-	}
-#else
-#error "xentimer: unsupported architecture"
-#endif
-
-	return (product);
-}
-
-static uint64_t
-get_nsec_offset(struct vcpu_time_info *tinfo)
-{
-
-	return (scale_delta(rdtsc() - tinfo->tsc_timestamp,
-	    tinfo->tsc_to_system_mul, tinfo->tsc_shift));
-}
-
-/*
- * Read the current hypervisor system uptime value from Xen.
- * See <xen/interface/xen.h> for a description of how this works.
- */
-static uint32_t
-xen_fetch_vcpu_tinfo(struct vcpu_time_info *dst, struct vcpu_time_info *src)
-{
-
-	do {
-		dst->version = src->version;
-		rmb();
-		dst->tsc_timestamp = src->tsc_timestamp;
-		dst->system_time = src->system_time;
-		dst->tsc_to_system_mul = src->tsc_to_system_mul;
-		dst->tsc_shift = src->tsc_shift;
-		rmb();
-	} while ((src->version & 1) | (dst->version ^ src->version));
-
-	return (dst->version);
-}
-
 /**
  * \brief Get the current time, in nanoseconds, since the hypervisor booted.
  *
  * \param vcpu		vcpu_info structure to fetch the time from.
  *
- * \note This function returns the current CPU's idea of this value, unless
- *       it happens to be less than another CPU's previously determined value.
  */
 static uint64_t
 xen_fetch_vcpu_time(struct vcpu_info *vcpu)
 {
-	struct vcpu_time_info dst;
-	struct vcpu_time_info *src;
-	uint32_t pre_version;
-	uint64_t now;
-	volatile uint64_t last;
-
-	src = &vcpu->time;
-
-	do {
-		pre_version = xen_fetch_vcpu_tinfo(&dst, src);
-		barrier();
-		now = dst.system_time + get_nsec_offset(&dst);
-		barrier();
-	} while (pre_version != src->version);
+	struct pvclock_vcpu_time_info *time;
 
-	/*
-	 * Enforce a monotonically increasing clock time across all
-	 * VCPUs.  If our time is too old, use the last time and return.
-	 * Otherwise, try to update the last time.
-	 */
-	do {
-		last = xen_timer_last_time;
-		if (last > now) {
-			now = last;
-			break;
-		}
-	} while (!atomic_cmpset_64(&xen_timer_last_time, last, now));
+	time = (struct pvclock_vcpu_time_info *) &vcpu->time;
 
-	return (now);
+	return (pvclock_get_timecount(time));
 }
 
 static uint32_t
@@ -302,15 +192,11 @@ static void
 xen_fetch_wallclock(struct timespec *ts)
 {
 	shared_info_t *src = HYPERVISOR_shared_info;
-	uint32_t version = 0;
+	struct pvclock_wall_clock *wc;
 
-	do {
-		version = src->wc_version;
-		rmb();
-		ts->tv_sec = src->wc_sec;
-		ts->tv_nsec = src->wc_nsec;
-		rmb();
-	} while ((src->wc_version & 1) | (version ^ src->wc_version));
+	wc = (struct pvclock_wall_clock *) &src->wc_version;
+
+	pvclock_get_wallclock(wc, ts);
 }
 
 static void
@@ -574,7 +460,7 @@ xentimer_resume(device_t dev)
 	}
 
 	/* Reset the last uptime value */
-	xen_timer_last_time = 0;
+	pvclock_resume();
 
 	/* Reset the RTC clock */
 	inittodr(time_second);
diff --git a/sys/i386/include/pvclock.h b/sys/i386/include/pvclock.h
new file mode 100644
index 000..f01fac6
--- /dev/null
+++ b/sys/i386/include/pvclock.h
@@ -0,0 +1,6 @@
+/*-
+ * This file is in the public domain.
+ */
+/* $FreeBSD$ */
+
+#include <x86/pvclock.h>
diff --git a/sys/kern/subr_param.c b/sys/kern/subr_param.c
index 95f3250..5332055 100644
--- a/sys/kern/subr_param.c
+++ b/sys/kern/subr_param.c
@@ -159,6 +159,8 @@ static const char *const vm_guest_sysctl_names[] = {
 	"xen",
 	"hv",
 	"vmware",
+	"bhyve",
+	"kvm",
 	NULL
 };
 CTASSERT(nitems(vm_guest_sysctl_names) - 1 == VM_LAST);
diff --git a/sys/sys/systm.h b/sys/sys/systm.h
index d3833d0..50a49d2 100644
--- a/sys/sys/systm.h
+++ b/sys/sys/systm.h
@@ -73,7 +73,7 @@ extern int vm_guest;		/* Running as virtual machine guest? */
  * Keep in sync with vm_guest_sysctl_names[].
  */
 enum VM_GUEST { VM_GUEST_NO = 0, VM_GUEST_VM, VM_GUEST_XEN, VM_GUEST_HV,
-		VM_GUEST_VMWARE, VM_LAST };
+		VM_GUEST_VMWARE, VM_GUEST_BHYVE, VM_GUEST_KVM, VM_LAST };
 
 #if defined(WITNESS) || defined(INVARIANTS)
 void	kassert_panic(const char *fmt, ...)  __printflike(1, 2);
diff --git a/sys/x86/include/hypervisor.h b/sys/x86/include/hypervisor.h
new file mode 100644
index 000..d5d30eb
--- /dev/null
+++ b/sys/x86/include/hypervisor.h
@@ -0,0 +1,56 @@
+/*-
+ * Copyright (c) 2014 Bryan Venteicher bry...@freebsd.org
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
Re: [CFT] Paravirtualized KVM clock
On Sun, Jan 4, 2015 at 8:01 PM, Jim Harris jim.har...@gmail.com wrote:

> On Sun, Jan 4, 2015 at 12:00 PM, Adrian Chadd adr...@freebsd.org wrote:
>> ... so, out of pure curiosity - what's making the benchmark go faster? Is it the userland side of things calling clock methods, or something in the kernel, or both?
>
> Most likely GEOM statistics gathering in the kernel, but Bryan would have to confirm.

Yes - that's the main source. A similar issue exists in the network stack with BPF. I haven't looked at or thought too much about whether it makes sense / is possible to use kvmclock in userland too (I think kib@ added fast gettimeofday & friends support a few years back).

> I intermittently saw this same kind of massive slowdown in nvme(4) performance a couple of years back due to a bug in the TSC self-check code which has since been fixed. The bug would result in falling back to HPET, and all of the clock calls from the GEOM code for each I/O would kill performance.
>
>> -adrian
>>
>> On 4 January 2015 at 09:56, Bryan Venteicher bry...@daemoninthecloset.org wrote:
>>> For the last few weeks, I've been working on adding support for KVM clock in the projects/paravirt branch. Currently, a KVM VM guest will end up selecting either the HPET or ACPI as the timecounter source. Unfortunately, this is very costly since every timecounter fetch causes a VM exit. KVM clock allows the guest to use the TSC instead; it is very similar to the existing Xen timer.
>>>
>>> The performance difference between HPET/ACPI and KVMCLOCK can be dramatic: a simple disk benchmark goes from 10K IOPs to 100K IOPs.
>>>
>>> The patch is attached or available at [1]. I'd appreciate any testing.
>>>
>>> Also as a part of this, I've tried to generalize a bit of our existing hypervisor guest code, with the eventual goal of being able to support more invasive PV operations. The patch series is viewable in Phabricator.
>>>
>>> https://reviews.freebsd.org/D1429 - paravirt: Generalize parts of the XEN timer code into pvclock
>>> https://reviews.freebsd.org/D1430 - paravirt: Add interface to calculate the TSC frequency from pvclock
>>> https://reviews.freebsd.org/D1431 - paravirt: Add simple hypervisor registration and detection interface
>>> https://reviews.freebsd.org/D1432 - paravirt: Add detection of bhyve using new hypervisor interface
>>> https://reviews.freebsd.org/D1433 - paravirt: Add detection of VMware using new hypervisor interface
>>> https://reviews.freebsd.org/D1434 - paravirt: Add detection of KVM using new hypervisor interface
>>> https://reviews.freebsd.org/D1435 - paravirt: Add KVM clock timecounter support
>>>
>>> My current plan is to MFC this series to 10-STABLE, and commit a self-contained KVM clock to the other stable branches.
>>>
>>> [1] - https://people.freebsd.org/~bryanv/patches/kvm_clock-1.patch

___
freebsd-a...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to freebsd-arch-unsubscr...@freebsd.org
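The TSC-to-nanoseconds conversion that the pvclock generalization shares between the Xen timer and KVM clock (scale_delta() in the patch) can be sketched in portable C roughly as follows. This is only an illustration, not the committed code: the function name pvclock_scale_delta_sketch is made up here, and the sketch assumes mul_frac is a 32-bit fixed-point fraction and that shift is a small signed power-of-two adjustment, as the i386 comment in the patch suggests.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, portable stand-in for scale_delta(): returns
 * ((delta shifted by 'shift') * mul_frac) >> 32 without inline assembly.
 * It follows the formula from the i386 comment in the patch:
 *   lower   = (mul_frac * (delta & UINT_MAX)) >> 32
 *   upper   = mul_frac * (delta >> 32)
 *   product = lower + upper
 */
static uint64_t
pvclock_scale_delta_sketch(uint64_t delta, uint32_t mul_frac, int shift)
{
	uint64_t lower, upper;

	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	assert(shift > -64 && shift < 64);

	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/* Low 32 bits of delta contribute only their top half. */
	lower = ((delta & 0xffffffffULL) * mul_frac) >> 32;
	/* High 32 bits of delta contribute in full. */
	upper = (delta >> 32) * mul_frac;

	return (lower + upper);
}
```

As a worked case under those assumptions: with a 2 GHz TSC each tick is 0.5 ns, which corresponds to mul_frac = 0x80000000 and shift = 0, so a delta of 2,000,000,000 ticks scales to 1,000,000,000 ns.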
Re: Bug in virtio-net
On Mon, Dec 8, 2014 at 5:34 PM, Shawn Webb latt...@gmail.com wrote:

> I was running Poudriere in bhyve. I got this kernel panic. I'm on a new 11-CURRENT as of this morning. Would this be a NULL pointer deref?
>
> `uname -a`: FreeBSD 11.0-CURRENT FreeBSD 11.0-CURRENT #1 b5310d8(hardened/current/master)-dirty: Mon Dec 8 12:58:12 UTC 2014 shawn@pkg-build-01:/usr/obj/usr/src/sys/LATT-SEC amd64
>
> This bhyve VM is at r275606. The host is at r275575.
>
> Thanks,
> Shawn
>
> Kern panic backtrace:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x0
> fault code              = supervisor read instruction, page not present
> instruction pointer     = 0x20:0x0
> stack pointer           = 0x28:0xfe0469a0c830
> frame pointer           = 0x28:0xfe0469a0c8b0
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 12 (irq267: virtio_pci0)
> [ thread pid 12 tid 100040 ]
> Stopped at 0:KDB: reentering
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0469a0bd90
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe0469a0be40
> kdb_reenter() at kdb_reenter+0x33/frame 0xfe0469a0be50
> trap() at trap+0x54/frame 0xfe0469a0c060
> calltrap() at calltrap+0x8/frame 0xfe0469a0c060
> --- trap 0xc, rip = 0x80e06033, rsp = 0xfe0469a0c120, rbp = 0xfe0469a0c1c0 ---
> db_read_bytes() at db_read_bytes+0x53/frame 0xfe0469a0c1c0
> db_get_value() at db_get_value+0x38/frame 0xfe0469a0c210
> db_disasm() at db_disasm+0x23/frame 0xfe0469a0c330
> db_trap() at db_trap+0xc0/frame 0xfe0469a0c3c0
> kdb_trap() at kdb_trap+0x191/frame 0xfe0469a0c460
> trap_fatal() at trap_fatal+0x34c/frame 0xfe0469a0c4c0
> trap_pfault() at trap_pfault+0x33c/frame 0xfe0469a0c560
> trap() at trap+0x45e/frame 0xfe0469a0c770
> calltrap() at calltrap+0x8/frame 0xfe0469a0c770
> --- trap 0xc, rip = 0, rsp = 0xfe0469a0c830, rbp = 0xfe0469a0c8b0 ---
> uart_sab82532_class() at 0/frame 0xfe0469a0c8b0
> ether_input() at ether_input+0x26/frame 0xfe0469a0c8d0
> vtnet_rxq_eof() at vtnet_rxq_eof+0x7be/frame 0xfe0469a0c9a0
> vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x94/frame 0xfe0469a0c9e0
> intr_event_execute_handlers() at intr_event_execute_handlers+0x1b8/frame 0xfe0469a0ca20
> ithread_loop() at ithread_loop+0x96/frame 0xfe0469a0ca70
> fork_exit() at fork_exit+0x9a/frame 0xfe0469a0cab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe0469a0cab0
> --- trap 0, rip = 0, rsp = 0xfe0469a0cb70, rbp = 0 ---

I doubt this has anything to do with vtnet. My guess is that netisr_proto[NETISR_ETHER].np_handler(m) is NULL for some reason. Do you have a dump?

> *** error reading from address 0 ***
> KDB: reentering
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0469a0c100
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe0469a0c1b0
> kdb_reenter() at kdb_reenter+0x33/frame 0xfe0469a0c1c0
> db_get_value() at db_get_value+0x52/frame 0xfe0469a0c210
> db_disasm() at db_disasm+0x23/frame 0xfe0469a0c330
> db_trap() at db_trap+0xc0/frame 0xfe0469a0c3c0
> kdb_trap() at kdb_trap+0x191/frame 0xfe0469a0c460
> trap_fatal() at trap_fatal+0x34c/frame 0xfe0469a0c4c0
> trap_pfault() at trap_pfault+0x33c/frame 0xfe0469a0c560
> trap() at trap+0x45e/frame 0xfe0469a0c770
> calltrap() at calltrap+0x8/frame 0xfe0469a0c770
> --- trap 0xc, rip = 0, rsp = 0xfe0469a0c830, rbp = 0xfe0469a0c8b0 ---
> uart_sab82532_class() at 0/frame 0xfe0469a0c8b0
> ether_input() at ether_input+0x26/frame 0xfe0469a0c8d0
> vtnet_rxq_eof() at vtnet_rxq_eof+0x7be/frame 0xfe0469a0c9a0
> vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x94/frame 0xfe0469a0c9e0
> intr_event_execute_handlers() at intr_event_execute_handlers+0x1b8/frame 0xfe0469a0ca20
> ithread_loop() at ithread_loop+0x96/frame 0xfe0469a0ca70
> fork_exit() at fork_exit+0x9a/frame 0xfe0469a0cab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe0469a0cab0
> --- trap 0, rip = 0, rsp = 0xfe0469a0cb70, rbp = 0 ---
Re: dhclient sucks cpu usage...
----- Original Message -----
> On 10.06.2014 07:03, Bryan Venteicher wrote:
>> Hi,
>>
>> ----- Original Message -----
>>> So, after finding out that nc has a stupidly small buffer size (2k even though there is space for 16k), I was still not getting good performance using nc between machines, so I decided to generate some flame graphs to try to identify issues... (Thanks to whoever included a full set of modules, including dtraceall, on memstick!)
>>>
>>> So, the first one is: https://www.funkthat.com/~jmg/em.stack.svg
>>>
>>> As I was browsing around, em_handle_que was consuming quite a bit of CPU for only doing ~50MB/sec over gige. Running top -SH shows me that the taskqueue for em was consuming about 50% CPU... Also pretty high for only 50MB/sec...
>>>
>>> Looking closer, you'll see that bpf_mtap is consuming ~3.18% (under ether_nh_input). I know I'm not running tcpdump or anything, but I think dhclient uses bpf to be able to inject packets and listen in on them, so I kill off dhclient, and instantly the taskqueue thread for em drops down to 40% CPU... (the transfer rate only marginally improves, if it does)
>>>
>>> I decide to run another flame graph w/o dhclient running: https://www.funkthat.com/~jmg/em.stack.nodhclient.svg and now _rxeof drops from 17.22% to 11.94%, pretty significant...
>>>
>>> So, if you care about performance, don't run dhclient...
>>
>> Yes, I've noticed the same issue. It can absolutely kill performance in a VM guest. It is much more pronounced on only some of my systems, and I hadn't tracked it down yet. I wonder if this is fallout from the callout work, or if there was some bpf change.
>>
>> I've been using the kludgey workaround patch below.
>
> Hm, pretty interesting. dhclient should set up a proper filter (and it looks like it does so:
>
> 13:10 [0] m@ptichko s netstat -B
>   Pid  Netif    Flags      Recv  Drop  Match  Sblen  Hblen  Command
>  1224    em0  -ifs--l  41225922     0     11      0      0  dhclient
>
> ), see the match count. And BPF itself adds the cost of a read rwlock (+ bpf_filter() calls for each consumer on the interface). It should not introduce significant performance penalties.

It will be a bit before I'm able to capture that. Here's a Flamegraph from earlier in the year showing an absurd amount of time spent in bpf_mtap(): http://people.freebsd.org/~bryanv/vtnet/vtnet-bpf-10.svg

>> diff --git a/sys/net/bpf.c b/sys/net/bpf.c
>> index cb3ed27..9751986 100644
>> --- a/sys/net/bpf.c
>> +++ b/sys/net/bpf.c
>> @@ -2013,9 +2013,11 @@ bpf_gettime(struct bintime *bt, int tstype, struct mbuf *m)
>>  			return (BPF_TSTAMP_EXTERN);
>>  		}
>>  	}
>> +#if 0
>>  	if (quality == BPF_TSTAMP_NORMAL)
>>  		binuptime(bt);
>>  	else
>> +#endif
>
> bpf_gettime() is called IFF the packet filter matches some traffic. Can you show your netstat -B output?
>
>>  		getbinuptime(bt);
>>
>>  	return (quality);
>>
>>> --
>>> John-Mark Gurney    Voice: +1 415 225 5579
>>> All that I will do, has been done, All that I have, has not.

___
freebsd-...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: dhclient sucks cpu usage...
Hi,

----- Original Message -----
> So, after finding out that nc has a stupidly small buffer size (2k even though there is space for 16k), I was still not getting good performance using nc between machines, so I decided to generate some flame graphs to try to identify issues... (Thanks to whoever included a full set of modules, including dtraceall, on memstick!)
>
> So, the first one is: https://www.funkthat.com/~jmg/em.stack.svg
>
> As I was browsing around, em_handle_que was consuming quite a bit of CPU for only doing ~50MB/sec over gige. Running top -SH shows me that the taskqueue for em was consuming about 50% CPU... Also pretty high for only 50MB/sec...
>
> Looking closer, you'll see that bpf_mtap is consuming ~3.18% (under ether_nh_input). I know I'm not running tcpdump or anything, but I think dhclient uses bpf to be able to inject packets and listen in on them, so I kill off dhclient, and instantly the taskqueue thread for em drops down to 40% CPU... (the transfer rate only marginally improves, if it does)
>
> I decide to run another flame graph w/o dhclient running: https://www.funkthat.com/~jmg/em.stack.nodhclient.svg and now _rxeof drops from 17.22% to 11.94%, pretty significant...
>
> So, if you care about performance, don't run dhclient...

Yes, I've noticed the same issue. It can absolutely kill performance in a VM guest. It is much more pronounced on only some of my systems, and I hadn't tracked it down yet. I wonder if this is fallout from the callout work, or if there was some bpf change.

I've been using the kludgey workaround patch below.

diff --git a/sys/net/bpf.c b/sys/net/bpf.c
index cb3ed27..9751986 100644
--- a/sys/net/bpf.c
+++ b/sys/net/bpf.c
@@ -2013,9 +2013,11 @@ bpf_gettime(struct bintime *bt, int tstype, struct mbuf *m)
 			return (BPF_TSTAMP_EXTERN);
 		}
 	}
+#if 0
 	if (quality == BPF_TSTAMP_NORMAL)
 		binuptime(bt);
 	else
+#endif
 		getbinuptime(bt);
 
 	return (quality);

> --
> John-Mark Gurney    Voice: +1 415 225 5579
> All that I will do, has been done, All that I have, has not.
Re: BUG: some drivers return ENOBUFS when the mbuf is actually queued
On Wed, Jun 4, 2014 at 8:49 AM, Luigi Rizzo ri...@iet.unipi.it wrote:

> Hi,
> if I read the code correctly, there are a few network device drivers (igb, ixgbe, i40e, vtnet, vmxnet) where ifp->if_transmit(ifp, m) can return ENOBUFS even when 'm' has _not_ been dropped:
>
>   e1000/if_igb.c :: igb_mq_start() can return ENOBUFS from igb_xmit()
>   ixgbe/ixgbe_main.c :: ixgbe_mq_start_locked() can return ENOBUFS from ixgbe_xmit() (similar for i40)
>   virtio/network/if_vtnet.c :: vtnet_txq_mq_start can return ENOBUFS if virtqueue_full()
>
> In all these cases, the error comes from a later attempt to transfer mbufs from the buf_ring to the NIC ring.
>
> All drivers using if_transmit() seem correct, as well as a bunch of others (cxgbe, sfxge, mxge ...) that reassign if_transmit and that I checked for correctness.
>
> I think that when the current buffer has been queued, returning ENOBUFS is extremely confusing and should not be done. I would also argue that the return value from ifp->if_transmit(ifp, m) should only tell what happened to 'm', not other things such as the status of the queue.
>
> Any objections if I fix the above drivers?

No objection for vtnet and vmxnet.

> cheers
> luigi
>
> (For those curious: I found this issue when using emulated netmap mode on top of a standard driver. The netmap emulation code assumes that ENOBUFS indicates that the driver has m_free()'d the mbuf, same as happens on Linux, and the bug was causing panics in my system.)
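The return-value convention argued for above can be illustrated with a small user-space model. This is a toy sketch, not driver code: the toy_txq, toy_pkt, toy_drain and toy_transmit names are hypothetical, the fixed-size array merely stands in for a buf_ring, and setting the freed flag stands in for m_freem(). The point it tries to show is that ENOBUFS is returned only when the packet was actually dropped; a full hardware ring after a successful enqueue still yields 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_SWQ_SLOTS 4		/* software queue size (stand-in for a buf_ring) */

struct toy_pkt {
	int freed;		/* set when the "driver" drops the packet */
};

struct toy_txq {
	struct toy_pkt *swq[TOY_SWQ_SLOTS];	/* queued, not yet on the NIC ring */
	size_t swq_len;
	size_t hw_free;		/* free descriptors on the pretend NIC ring */
	size_t hw_sent;		/* packets handed to the pretend NIC */
};

/* Move as many queued packets as possible onto the pretend NIC ring. */
static void
toy_drain(struct toy_txq *q)
{
	size_t i, n = 0;

	while (n < q->swq_len && q->hw_free > 0) {
		q->hw_free--;
		q->hw_sent++;
		n++;
	}
	/* Shift the packets that did not fit to the front of the queue. */
	for (i = n; i < q->swq_len; i++)
		q->swq[i - n] = q->swq[i];
	q->swq_len -= n;
}

/*
 * Returns ENOBUFS only if 'm' was dropped (and freed).  If 'm' was queued,
 * the result is 0 even when the NIC ring is currently full.
 */
static int
toy_transmit(struct toy_txq *q, struct toy_pkt *m)
{
	assert(q != NULL && m != NULL);

	if (q->swq_len == TOY_SWQ_SLOTS) {
		m->freed = 1;		/* stand-in for m_freem(m) */
		return (ENOBUFS);
	}
	q->swq[q->swq_len++] = m;
	toy_drain(q);			/* a full NIC ring is not an error for 'm' */
	return (0);
}
```

With this convention a caller such as the netmap emulation code mentioned above can treat any nonzero return as "the packet is gone" without risking a use-after-free or a double free.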
Re: device vtnet - device virtio_net?
----- Original Message -----
> Hi,
>
> GENERIC has
>
> # VirtIO support
> device		virtio			# Generic VirtIO bus (required)
> device		virtio_pci		# VirtIO PCI device
> device		vtnet			# VirtIO Ethernet device
> device		virtio_blk		# VirtIO Block device
> device		virtio_scsi		# VirtIO SCSI device
> device		virtio_balloon		# VirtIO Memory Balloon device
>
> Maybe it's just my OCD kicking in, but why is vtnet not named virtio_net? That would be consistent with the other virtio device names.

That's what I picked some 3 years ago and it is too late to change it. I believe my thinking at the time was to match most other Ethernet drivers: the module name is if_vtnet, so use vtnet in the kernel config.

> Cheers,
> Jos
>
> --
> Jos Backus
> jos at catnook.com
Re: vtnet broken on -CURRENT when using VirtualBox
----- Original Message -----
> Hi list,
>
> I'm observing a 100%-reproducible panic in the following setup:
>
> Host system: FreeBSD 9.1-RELEASE-p7 amd64
>
> $ pkg info | grep virtualbox
> virtualbox-ose-4.2.18_1        A general-purpose full virtualizer for x86 hardware
> virtualbox-ose-kmod-4.2.18     VirtualBox kernel module for FreeBSD
>
> System in a virtual machine: FreeBSD-CURRENT SVN rev 259064.
> The virtual machine is created with a virtio host-only adapter.
>
> When trying to ssh into the VM, the system in the VM panics with the following message:
>
> panic: vtnet_txq_offload: mbuf 0xc309e900 TSO without checksum offload
> KDB: stack backtrace:
> db_trace_self_wrapper(c0b4fd4d,a6461,65393030,6039,c13a29c0,...) at db_trace_self_wrapper+0x2d/frame 0xc23f85a0
> kdb_backtrace(c0b4b145,c0c29a7c,c0b9b43d,c23f865c,c23f865c,...) at kdb_backtrace+0x30/frame 0xc23f8608
> vpanic(c0c29918,100,c0b9b43d,c23f865c,c23f865c,...) at vpanic+0x80/frame 0xc23f862c
> kassert_panic(c0b9b43d,c0b9b466,c309e900,8b1,c0dad504,...) at kassert_panic+0xe9/frame 0xc23f8650
> vtnet_txq_mq_start_locked(c2e02810,0,c0b9b369,8ea,c2e02810,...) at vtnet_txq_mq_start_locked+0x62b/frame 0xc23f8808
> vtnet_txq_mq_start(c2cf7800,c309e900,6,c23f89e0,c23f8866,...) at vtnet_txq_mq_start+0x76/frame 0xc23f8834
> ether_output(c2cf7800,c309e900,c23f89e0,c23f89d0,c36639d8,...) at ether_output+0x64b/frame 0xc23f
> ip_output(c309e900,0,c23f89d0,0,0,...) at ip_output+0x173f/frame 0xc23f8938
> tcp_output(c36665e0,c342f400,32c,1,c36639d8,...) at tcp_output+0x1cbf/frame 0xc23f8a9c
> tcp_usr_send(c3410d40,0,c342f400,0,0,...) at tcp_usr_send+0x346/frame 0xc23f8ad0
> sosend_generic(c3410d40,0,c23f8c10,0,0,...) at sosend_generic+0x3b3/frame 0xc23f8b40
> soo_write(c3142f50,c23f8c10,c2cf0d00,0,c3108620,...) at soo_write+0x5d/frame 0xc23f8b70
> dofilewrite(c3142f50,c23f8c10,,,0,...) at dofilewrite+0x86/frame 0xc23f8ba8
> kern_writev(c3108620,3,c23f8c10,0,28c4d608,...) at kern_writev+0x96/frame 0xc23f8bf0
> sys_write(c3108620,c23f8cc8,c23f8c98,c076b3a4,c0c36e90,...) at sys_write+0x5c/frame 0xc23f8c40
> syscall(c23f8d08) at syscall+0x2de/frame 0xc23f8cfc
> Xint0x80_syscall() at Xint0x80_syscall+0x21/frame 0xc23f8cfc
> --- syscall (4, FreeBSD ELF32, sys_write), eip = 0x2840dd77, esp = 0xbfbfb328, ebp = 0xbfbfb348 ---
> KDB: enter: panic
> [ thread pid 1570 tid 100065 ]
> Stopped at      kdb_enter+0x3d: movl    $0,kdb_why
> db>
>
> Please help me to debug this.

I suspect I know what is wrong. What's the output of `ifconfig vtnetX`?

> --
> Regards,
> Ilya Bakulin
amd64 minidump slowness
Hi,

At $JOB, we have machines with 400GB of RAM where even the smallest 15GB amd64 minidump takes well over an hour. The major cause of the slowness is that in minidumpsys(), blk_write() is called PAGE_SIZE at a time. This causes blk_write() to poll the console for the Ctrl-C abort once per page.

The attached patch changes blk_write() to be called with a run of physically contiguous pages. This reduced the dump time by over an order of magnitude. Of course, blk_write() could also be changed to poll the console less frequently (like only on every IO).

If anybody else dumps on machines with lots of RAM, it would be nice to know the difference this patch makes. I've got a second set of patches that further reduces the dump time by over half that I'll try to clean up soon.

http://people.freebsd.org/~bryanv/patches/minidump.patch

commit 25f9e82e4ac93e71c6cf06fe2faa1899967db725
Author: Bryan Venteicher bryanventeic...@gmail.com
Date:   Sun Sep 29 13:56:42 2013 -0500

    Call blk_write() with a run of physically contiguous pages

    Previously, blk_write() was being called one page at a time, which
    would cause it to poll the console for every page. This change makes
    dumping an order of magnitude faster, and is especially useful on
    large memory machines.

diff --git a/sys/amd64/amd64/minidump_machdep.c b/sys/amd64/amd64/minidump_machdep.c
index f14c539..26b2b31 100644
--- a/sys/amd64/amd64/minidump_machdep.c
+++ b/sys/amd64/amd64/minidump_machdep.c
@@ -221,7 +221,8 @@ minidumpsys(struct dumperinfo *di)
 	vm_offset_t va;
 	int error;
 	uint64_t bits;
-	uint64_t *pml4, *pdp, *pd, *pt, pa;
+	uint64_t *pml4, *pdp, *pd, *pt, start_pa, pa;
+	size_t sz;
 	int i, ii, j, k, n, bit;
 	int retry_count;
 	struct minidumphdr mdhdr;
@@ -412,18 +413,29 @@ minidumpsys(struct dumperinfo *di)
 	}
 
 	/* Dump memory chunks */
-	/* XXX cluster it up and use blk_dump() */
-	for (i = 0; i < vm_page_dump_size / sizeof(*vm_page_dump); i++) {
+	for (i = 0, start_pa = 0, sz = 0;
+	    i < vm_page_dump_size / sizeof(*vm_page_dump); i++) {
 		bits = vm_page_dump[i];
 		while (bits) {
 			bit = bsfq(bits);
 			pa = (((uint64_t)i * sizeof(*vm_page_dump) * NBBY) + bit) * PAGE_SIZE;
-			error = blk_write(di, 0, pa, PAGE_SIZE);
-			if (error)
-				goto fail;
+			if (sz == 0 || start_pa + sz == pa) {
+				if (sz == 0)
+					start_pa = pa;
+				sz += PAGE_SIZE;
+			} else {
+				error = blk_write(di, 0, start_pa, sz);
+				if (error)
+					goto fail;
+				start_pa = pa;
+				sz = PAGE_SIZE;
+			}
 			bits &= ~(1ul << bit);
 		}
 	}
+	error = blk_write(di, 0, start_pa, sz);
+	if (error)
+		goto fail;
 
 	error = blk_flush(di);
 	if (error)
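The run-coalescing idea in the patch can be tried outside the kernel with a small stand-alone sketch. Everything here is hypothetical scaffolding rather than the minidump code: dump_runs_sketch, run_writer_t, run_log and log_run are invented names, the callback stands in for blk_write(), and a 4096-byte page and 64-bit bitmap words are assumed. Unlike the patch, the sketch skips the final write when no page was set at all, which is a choice made here for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uint64_t)4096)	/* assumed page size */

/* Callback standing in for blk_write(): one call per contiguous run. */
typedef int (*run_writer_t)(void *arg, uint64_t start_pa, size_t sz);

/* Small recorder used to observe how many runs were written. */
struct run_log {
	size_t calls;
	uint64_t first_pa;
	size_t total;
};

static int
log_run(void *arg, uint64_t start_pa, size_t sz)
{
	struct run_log *log = arg;

	if (log->calls == 0)
		log->first_pa = start_pa;
	log->calls++;
	log->total += sz;
	return (0);
}

/*
 * Walk a page bitmap and call write_run() once per run of physically
 * contiguous pages instead of once per page.
 */
static int
dump_runs_sketch(const uint64_t *bitmap, size_t nwords, run_writer_t write_run,
    void *arg)
{
	uint64_t bits, pa, start_pa = 0;
	size_t i, sz = 0;
	unsigned bit;
	int error;

	assert(bitmap != NULL && write_run != NULL);

	for (i = 0; i < nwords; i++) {
		bits = bitmap[i];
		for (bit = 0; bits != 0; bit++, bits >>= 1) {
			if ((bits & 1) == 0)
				continue;
			pa = ((uint64_t)i * 64 + bit) * SKETCH_PAGE_SIZE;
			if (sz != 0 && start_pa + sz == pa) {
				/* Page extends the current run. */
				sz += SKETCH_PAGE_SIZE;
				continue;
			}
			if (sz != 0) {
				/* Gap found: flush the finished run. */
				error = write_run(arg, start_pa, sz);
				if (error != 0)
					return (error);
			}
			start_pa = pa;
			sz = SKETCH_PAGE_SIZE;
		}
	}
	/* Flush the last run, if any page was set. */
	if (sz != 0)
		return (write_run(arg, start_pa, sz));
	return (0);
}
```

For example, a bitmap word with pages 0 to 3 and page 10 set produces two callback invocations (one 16 KB run and one 4 KB run) rather than five per-page calls, and a run that crosses a word boundary is still written as a single run.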
Re: [CFT] VMware vmxnet3 ethernet driver
----- Original Message -----
> Regarding Bryan Venteicher's message of 27.08.2013 06:18 (localtime):
> ... snip
>> The intr usage is higher than the other drivers you compared against because if_vmx does the off-level processing in ithreads whereas the others do it in a taskqueue.
>>
>> BTW: if_vmx can do LRO as well. I don't think the emulated e1000 can, but I bet the e1000e does.
>
> if_vmx - if_vmx: 1.32 GBits/sec, load: 10-45%Sys 40-48%Intr
> if_vmxJumbo - if_vmxJumbo: 5.01 GBits/sec, load: 10-45%Sys 40-48%Intr
>
> Please find attached the different outputs of dev.vmx.X (the mtu9000 run was only 3.47GBits/sec in that case, took the numbers anyway)

Thanks for the sysctl output.

dev.vmx.0.txq0.ringfull: 133479
dev.vmx.0.txq0.hstats.tso_packets: 564986
dev.vmx.0.txq0.hstats.ucast_packets: 570604

Relative to the number of packets transmitted, a really high percentage of the time we find the Tx queue too full to hold the next frame to transmit. I haven't been able to recreate this, but I recently made a commit [1] that might help alleviate it.

[1] http://svnweb.freebsd.org/base?view=revision&revision=255055

> wbr,
> -Harry
Re: [CFT] VMware vmxnet3 ethernet driver
----- Original Message -----
> Regarding Bryan Venteicher's message of 27.08.2013 06:18 (localtime):
>> ...
>>> It seems if_vmx doesn't support jumbo frames. If I set mtu 9000, I get »vmx0: cannot populate Rx queue 0«; I have no problems using jumbo frames with vmxnet3.
>>
>> This could fail for two reasons: either it could not allocate an mbuf cluster, or the call to bus_dmamap_load_mbuf_sg() failed. For the former, you should check vmstat -z. For the latter, the behavior of bus_dmamap_load_mbuf_sg() changed between 9.1 and 9.2, and I know it was broken for a while. I don't recall exactly when I fixed it (I think shortly after I made the original announcement). Could you retry with the files from HEAD @ [1]? Also, there are new sysctl oids (dev.vmx.X.mbuf_load_failed, dev.vmx.X.mgetcl_failed) for these errors.
>>
>> I just compiled the driver on 9.2-RC2 with the sources from HEAD and was able to change the MTU to 9000.
>>
>> [1] - http://svnweb.freebsd.org/base/head/sys/dev/vmware/vmxnet3/
>
> Thanks a lot for your ongoing work! I can confirm that with the recent if_vmx.c from head, compiled for 9.2-RC3, setting mtu to 9000 works as expected :-)
>
>>> I took an oldish host (4x2,8GHz Core2[LGA775]) with recent software: ESXi 5.1U1 and FreeBSD-9.2-RC2. Two guests are connected to one MTU9000 VMware Software Switch.
>>
>> I've got a few performance things to still look at. What's the sysctl dev.vmx.X output for the if_vmx-if_vmx tests?
>
> Just repeated the simple if_vmx iperf bench; results vary slightly from one standard 10sec run to the next, but Intr usage is still noticeably high:

The intr usage is higher than the other drivers you compared against because if_vmx does the off-level processing in ithreads whereas the others do it in a taskqueue.

BTW: if_vmx can do LRO as well. I don't think the emulated e1000 can, but I bet the e1000e does.

> if_vmx - if_vmx: 1.32 GBits/sec, load: 10-45%Sys 40-48%Intr
> if_vmxJumbo - if_vmxJumbo: 5.01 GBits/sec, load: 10-45%Sys 40-48%Intr
>
> Please find attached the different outputs of dev.vmx.X (the mtu9000 run was only 3.47GBits/sec in that case, took the numbers anyway)
>
> wbr,
> -Harry
Re: [CFT] VMware vmxnet3 ethernet driver
- Original Message -
Regarding Bryan Venteicher's message of 05.08.2013 02:12 (localtime):

Hi, I've ported the OpenBSD vmxnet3 ethernet driver to FreeBSD. I did a lot of cleanup, bug fixes, new features, etc (+2000 new lines) along the way so there is not much of a resemblance left. The driver is in good enough shape that I'd like additional testers. A patch against -CURRENT is at [1]. Alternatively, the driver and a Makefile are at [2]; this should compile at least as far back as 9.1. I can look at 8-STABLE if there is interest. Obviously, besides reports of 'it works', I'm interested in performance vs the emulated e1000, and (for those using it) the VMware tools vmxnet3 driver. Hopefully it is no worse :)

Hello Bryan, thanks a lot for your hard work! It seems if_vmx doesn't support jumbo frames. If I set mtu 9000, I get »vmx0: cannot populate Rx queue 0«, I have no problems using jumbo frames with vmxnet3.

This could fail for two reasons: either an mbuf cluster could not be allocated, or the call to bus_dmamap_load_mbuf_sg() failed. For the former, you should check vmstat -z. For the latter, the behavior of bus_dmamap_load_mbuf_sg() changed between 9.1 and 9.2, and I know it was broken for a while. I don't recall exactly when I fixed it (I think shortly after I made the original announcement). Could you retry with the files from HEAD @ [1]? Also, there are new sysctl oids (dev.vmx.X.mbuf_load_failed and dev.vmx.X.mgetcl_failed) for these errors.

I just compiled the driver on 9.2-RC2 with the sources from HEAD and was able to change the MTU to 9000.

[1] - http://svnweb.freebsd.org/base/head/sys/dev/vmware/vmxnet3/

I took an oldish host (4x2,8GHz Core2[LGA775]) with recent software: ESXi 5.1U1 and FreeBSD-9.2-RC2. Two guests are connected to one MTU9000 VMware Software Switch.

I've got a few performance things to still look at. What's the sysctl dev.vmx.X output for the if_vmx-if_vmx tests?

Simple iperf (standard TCP) results:

vmxnet3jumbo - vmxnet3jumbo 5.3Gbits/sec, load: 40-60%Sys 0.5-2%Intr
vmxnet3 - vmxnet3 1.85 GBits/sec, load: 60-80%Sys 0-0.8%Intr
if_vmx - if_vmx 1.51 GBits/sec, load: 10-45%Sys 40-48%Intr !!!
if_vmxjumbo - if_vmxjumbo not possible
if_em(e1000) - if_em(e1000) 1.23 GBits/sec, load: 80-60%Sys 0.5-8%Intr
if_em(e1000)jumbo - if_em(e1000)jumbo 2.27Gbits/sec, load: 40-30%Sys 0.5-5%Intr
if_igb(e1000e)jumbo - if_igb(e1000e)jumbo 5.03 Gbits/s, load: 70-60%Sys 0.5%Intr
if_igb(e1000e) - if_igb(e1000e) 1.39 Gbits/s, load: 60-80%Sys 0.5%Intr
if_igb(e1000e) - if_igb(e1000e), both hw.em.[rt]xd=4096 1.66 Gbits/s, load: 65-90%Sys 0.5%Intr
if_igb(e1000e)jumbo - if_igb(e1000e)jumbo, both hw.em.[rt]xd=4096 4.81 Gbits/s, load: 65%Sys 0.5%Intr

Conclusion: if_vmx performs well compared to the regular emulated nics at standard MTU, but it is behind the tuned e1000e nic emulation and can't reach vmxnet3 performance at regular mtu. If one needs throughput, the missing jumbo frame support in if_vmx is a show stopper. e1000e is preferable to e1000, even though it cannot officially be chosen when FreeBSD is selected as the guest OS (edit the .vmx file and set ethernet0.virtualDev = e1000e, and don't forget to set hw.em.enable_msix=0 in loader.conf, even though the driver that attaches to e1000e is if_igb!)

Thanks, -Harry
Re: [CFT] VMware vmxnet3 ethernet driver
- Original Message -
it'd be nice if we could get vmware to just support the drivers in tree.. by which I mean, just submit patches.. why do they need to have it out of tree?

I agree. But they are all unfriendly licensed. The FF had a discussion about getting them relicensed to something more suitable, but that went nowhere over the past year. It is unfortunate that this issue of vendor-supplied, out-of-tree drivers is still around. Linux should have taught companies how foolish this is.
Re: [CFT] VMware vmxnet3 ethernet driver
- Original Message -
Perhaps not, but they do support FreeBSD. I've started several support cases with FreeBSD-specific problems and they've fixed all so far.

Yes, it is not a blackhole of support. At $JOB, we got caught by the FreeBSD-specific issue of the busted timer, which was fixed. But they've been less helpful in other regards, and have more or less said FreeBSD isn't high on their priority list because it isn't Linux.

Are you aiming at completely replacing VMware tools, or just the device drivers?

I'd like as much as possible to work out of the box. vmxnet3 is as far as my current interests go. OpenBSD has a vmt device that apparently does (at least the important bits of) what vmtoolsd does; I'll look at that more closely at some point. I have no intention of preventing people from using VMware's tools if they desire, nor of breaking existing users.

-- Joel
Re: [net] protecting interfaces from races between control and data ?
- Original Message -
I am slightly unclear about what mechanisms we use to prevent races between an interface being reconfigured (up/down/multicast setting, etc, all causing reinitialization of the rx and tx rings) and i) packets from the host stack being sent out; ii) interrupts from the network card being processed. I think in the old times IFF_DRV_RUNNING was used for this purpose, but now it is not enough. Acquiring the core lock in the NIC does not seem enough, either, because newer drivers, especially multiqueue ones, have per-queue rx and tx locks.

What I've done in my drivers is:
* Lock the core mutex
* Clear IFF_DRV_RUNNING
* Lock/unlock each queue's lock

The various Rx/Tx queue functions check for IFF_DRV_RUNNING after (re)acquiring their queue lock. See vtnet_stop_rendezvous() at [1] for an example.

Does anyone know if there is a generic mechanism, or does each driver reimplement its own way?

We desperately need a saner ifnet/driver interface. I think andre@ had some previous work in this area (and additional plans as well?). IMO, there's a lot to like in what DragonflyBSD has done in this area.

[1] - http://svnweb.freebsd.org/base/user/bryanv/vtnetmq/sys/dev/virtio/network/if_vtnet.c?revision=252451&view=markup

thanks
luigi
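The three steps above (lock the core mutex, clear the running flag, lock/unlock each queue lock) can be sketched in userland C. This is only an illustrative model, not the if_vtnet code: struct softc, struct queue, NQUEUES, softc_init(), queue_transmit() and softc_stop() are hypothetical names, pthread mutexes stand in for the kernel mutexes, and an atomic bool stands in for IFF_DRV_RUNNING.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NQUEUES 4	/* hypothetical queue count for this sketch */

/* Hypothetical stand-ins for a driver softc and its per-queue state. */
struct queue {
	pthread_mutex_t lock;
	int packets_sent;
};

struct softc {
	pthread_mutex_t core_lock;
	atomic_bool running;	/* plays the role of IFF_DRV_RUNNING */
	struct queue q[NQUEUES];
};

static void
softc_init(struct softc *sc)
{
	pthread_mutex_init(&sc->core_lock, NULL);
	atomic_init(&sc->running, true);
	for (int i = 0; i < NQUEUES; i++) {
		pthread_mutex_init(&sc->q[i].lock, NULL);
		sc->q[i].packets_sent = 0;
	}
}

/* Data path: re-check the running flag after taking the queue lock. */
static bool
queue_transmit(struct softc *sc, int qi)
{
	bool sent = false;

	pthread_mutex_lock(&sc->q[qi].lock);
	if (atomic_load(&sc->running)) {
		sc->q[qi].packets_sent++;
		sent = true;
	}
	pthread_mutex_unlock(&sc->q[qi].lock);
	return (sent);
}

/* Control path: the stop "rendezvous" described in the message. */
static void
softc_stop(struct softc *sc)
{
	pthread_mutex_lock(&sc->core_lock);
	atomic_store(&sc->running, false);
	/*
	 * Taking and dropping each queue lock waits out any data-path
	 * caller that saw the flag set before it was cleared.
	 */
	for (int i = 0; i < NQUEUES; i++) {
		pthread_mutex_lock(&sc->q[i].lock);
		pthread_mutex_unlock(&sc->q[i].lock);
	}
	/* From here on no queue function will touch the rings. */
	pthread_mutex_unlock(&sc->core_lock);
}
```

After softc_stop() returns, any later queue_transmit() call sees the cleared flag and backs off, so the control path could reinitialize the rings while still holding the core lock.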
Re: [net] protecting interfaces from races between control and data ?
- Original Message -
On Mon, Aug 5, 2013 at 8:19 PM, Adrian Chadd adr...@freebsd.org wrote:

No, brian said two things:
* the flag, protected by the core lock
* per-queue flags

I see no mention of per-queue flags in his email. This is the relevant part:

Right, I just use the IFF_DRV_RUNNING flag. I think Adrian meant 'per-queue locks' here?

What I've done in my drivers is:
* Lock the core mutex
* Clear IFF_DRV_RUNNING
* Lock/unlock each queue's lock

The various Rx/Tx queue functions check for IFF_DRV_RUNNING after (re)acquiring their queue lock. See vtnet_stop_rendezvous() at [1] for an example.

[1] http://svnweb.freebsd.org/base/user/bryanv/vtnetmq/sys/dev/virtio/network/if_vtnet.c?revision=252451&view=markup

-adrian

--
Prof. Luigi RIZZO, ri...@iet.unipi.it . Dip. di Ing. dell'Informazione
http://www.iet.unipi.it/~luigi/ . Universita` di Pisa
TEL +39-050-2211611 . via Diotisalvi 2
Mobile +39-338-6809875 . 56122 PISA (Italy)
Re: [CFT] VMware vmxnet3 ethernet driver
- Original Message -
I have ~100 FreeBSD 8/9 VMs in my vSphere 5.1 environment, all using the VMware tools package from VMware. Everything has been running great for years. (we skipped vSphere 5.0). Why should I use this vmxnet driver instead of the VMware tools driver or the emulated e1000?

They are out of tree and subject to rotting. I had to use the patches at [1] to even get them to compile on 9.1 and -current. I don't think VMware puts much engineering effort behind them; there was a compiler warning about a silly bug like: if (foo) ; do_something();

vmxnet3 has modern features (LRO, IPv6 checksum offloading, etc.) that the emulated e1000 lacks. In my test setup, e1000 tops out at 30MB/sec but vmxnet3 goes to 50MB/sec. I'd like to hear others' experiences.

[1] - http://ogris.de/vmware/

-- Joel
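For readers unfamiliar with the "if (foo) ; do_something();" class of bug mentioned above, here is a minimal sketch of why it matters. It does not reproduce the VMware code; buggy(), fixed(), do_something() and the calls counter are made up for illustration.

```c
#include <assert.h>

static int calls;	/* counts how often do_something() ran */

static void
do_something(void)
{
	calls++;
}

/* Buggy: the lone semicolon is the entire body of the if. */
static void
buggy(int foo)
{
	if (foo)
		;
	do_something();		/* runs regardless of foo */
}

/* Presumably intended: the call is guarded by the condition. */
static void
fixed(int foo)
{
	if (foo)
		do_something();
}
```

Calling buggy(0) still increments calls, whereas fixed(0) leaves it unchanged. The stray semicolon silently turns a conditional call into an unconditional one, which is why compilers can warn about an empty if body.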
[CFT] VMware vmxnet3 ethernet driver
Hi,

I've ported the OpenBSD vmxnet3 ethernet driver to FreeBSD. I did a lot of cleanup, bug fixes, new features, etc (+2000 new lines) along the way so there is not much of a resemblance left.

The driver is in good enough shape that I'd like additional testers. A patch against -CURRENT is at [1]. Alternatively, the driver and a Makefile are at [2]; this should compile at least as far back as 9.1. I can look at 8-STABLE if there is interest.

Obviously, besides reports of 'it works', I'm interested in performance vs the emulated e1000, and (for those using it) the VMware tools vmxnet3 driver. Hopefully it is no worse :)

The driver supports most VMXNET3 features: IPv4/IPv6 checksum offload, TSO, LRO, VLAN tag offload. AFAIK, the only notable missing feature is multiqueue; 3/4 of the code needed is already in the driver, but I don't have time to do the final bit of work.

Most of the development was done on QEMU 1.5, but the driver was also tested on VMware Fusion and VMware ESXi.

[1] - http://people.freebsd.org/~bryanv/vmware/if_vmx.patch
[2] - http://people.freebsd.org/~bryanv/vmware/files/
Re: Problem with current in vmware
On Tuesday, July 30, 2013 5:25:06 am Alexander Yerenkow wrote:

Hello all. I have panics in vmware with vmwaretools installed (they are the suspected culprit). It seems that memory ballooning (or using more memory across all VMs than the host has) produces some kind of weird behavior in FreeBSD. These VMs haven't been shut down yet; is there something I can do to help investigate this? Panic screens:
http://gits.kiev.ua/FreeBSD/panic1.png
http://gits.kiev.ua/FreeBSD/panic2.png

Looks like their code needs to be updated to work with locking changes in HEAD. Attilio is probably the best person to ask.

This highlights why we should move away from the poorly supported, out of tree, unfriendly licensed VMware tools. I have a port of the vmxnet3 driver from OpenBSD [1] that I intend to commit in time for 10. Next, I hope to look at the OpenBSD vmt [2] VMware tools driver.

The balloon is a bit trickier. AFAIK, OpenBSD doesn't have a driver for easy porting. The VMware tools driver for FreeBSD is GPL licensed, and VMware has shown no interest/ability to relicense their tools. Likely, the best way forward is to port their CDDL licensed Solaris driver.

[1] - http://svnweb.freebsd.org/base/projects/vmxnet/sys/dev/vmware/vmxnet3/
[2] - http://www.openbsd.org/cgi-bin/man.cgi?query=vmt&apropos=0&sektion=0&manpath=OpenBSD+Current&arch=i386&format=html

-- John Baldwin
Re: VirtIO in GENERIC
On Mon, Dec 17, 2012 at 1:17 AM, Bryan Venteicher bry...@freebsd.org wrote:
On Sun, Dec 16, 2012 at 11:06 PM, Jim Harris jimhar...@freebsd.org wrote:
On Sun, Dec 16, 2012 at 6:53 PM, Andrew Thompson thom...@freebsd.org wrote:
On 17 December 2012 13:17, Bryan Venteicher bry...@freebsd.org wrote:

There have been lots of requests to have VirtIO in GENERIC for i386 and amd64. Does anybody have any issues or concerns with this or the patch at [1]? This also removes the kludge that was introduced in r239009. I've compiled LINT for i386 and amd64 so hopefully there won't be any surprise breakages.

[1] http://people.freebsd.org/~bryanv/patches/virtio.generic.patch

It would be great to have the drivers enabled. You do not need the sys/conf/files changes, the common and arch files are combined.

Removing the virtio files from sys/conf/files ensures these drivers can only be specified in x86 kernel configuration files. r239009 added these lines to sys/conf/files, but Bryan's patch does it more correctly. The only question I have is about the GENERIC changes where device virtio is added - it says it is required, but should this instead say it's required for any of the other drivers in this section?

Yes, that wording could be improved; will update the patch in the morning.

Hmm .. on second thought, I think 'required' is sufficiently clear on its own that it applies only to this section. Other nearby sections (USB, sound) use the word also.

For the time being, I still intend to add VirtIO only to i386 and amd64 GENERIC. ARM and PPC64 can join the club later once I have a chance to test/debug them on QEMU. I'd like to commit this this weekend if nobody raises any objections.

-Jim
Re: VirtIO in GENERIC
On Mon, Dec 17, 2012 at 12:06 AM, Andrew Thompson thom...@freebsd.org wrote:
On 17 December 2012 18:06, Jim Harris jimhar...@freebsd.org wrote:
On Sun, Dec 16, 2012 at 6:53 PM, Andrew Thompson thom...@freebsd.org wrote:
On 17 December 2012 13:17, Bryan Venteicher bry...@freebsd.org wrote:

There have been lots of requests to have VirtIO in GENERIC for i386 and amd64. Does anybody have any issues or concerns with this or the patch at [1]? This also removes the kludge that was introduced in r239009. I've compiled LINT for i386 and amd64 so hopefully there won't be any surprise breakages.

[1] http://people.freebsd.org/~bryanv/patches/virtio.generic.patch

It would be great to have the drivers enabled. You do not need the sys/conf/files changes, the common and arch files are combined.

Removing the virtio files from sys/conf/files ensures these drivers can only be specified in x86 kernel configuration files. r239009 added these lines to sys/conf/files, but Bryan's patch does it more correctly.

Yes, I think the patch is correct for what I intended - support for x86 only (for now).

Linux supports virtio on ARM so I don't think it's necessarily x86 MD. I guess it can be moved back later.

I think VirtIO on ARM (on QEMU) effectively requires VirtIO-MMIO, which we don't support yet. And virtio_pci is probably missing some bus_space_barriers() required for non-x86. Both are on my TODO, but nobody has prodded me about either yet.

Andrew
VirtIO in GENERIC
There have been lots of requests to have VirtIO in GENERIC for i386 and amd64. Does anybody have any issues or concerns with this or the patch at [1]? This also removes the kludge that was introduced in r239009.

I've compiled LINT for i386 and amd64 so hopefully there won't be any surprise breakages.

[1] http://people.freebsd.org/~bryanv/patches/virtio.generic.patch
Re: VirtIO in GENERIC
On Sun, Dec 16, 2012 at 11:06 PM, Jim Harris jimhar...@freebsd.org wrote:
On Sun, Dec 16, 2012 at 6:53 PM, Andrew Thompson thom...@freebsd.org wrote:
On 17 December 2012 13:17, Bryan Venteicher bry...@freebsd.org wrote:

There have been lots of requests to have VirtIO in GENERIC for i386 and amd64. Does anybody have any issues or concerns with this or the patch at [1]? This also removes the kludge that was introduced in r239009. I've compiled LINT for i386 and amd64 so hopefully there won't be any surprise breakages.

[1] http://people.freebsd.org/~bryanv/patches/virtio.generic.patch

It would be great to have the drivers enabled. You do not need the sys/conf/files changes, the common and arch files are combined.

Removing the virtio files from sys/conf/files ensures these drivers can only be specified in x86 kernel configuration files. r239009 added these lines to sys/conf/files, but Bryan's patch does it more correctly. The only question I have is about the GENERIC changes where device virtio is added - it says it is required, but should this instead say it's required for any of the other drivers in this section?

Yes, that wording could be improved; will update the patch in the morning.

-Jim
Re: [head tinderbox] failure on i386/i386
Hi, - Original Message - From: FreeBSD Tinderbox tinder...@freebsd.org To: FreeBSD Tinderbox tinder...@freebsd.org, curr...@freebsd.org, i...@freebsd.org Sent: Friday, October 12, 2012 6:11:27 AM Subject: [head tinderbox] failure on i386/i386 TB --- 2012-10-12 04:50:01 - tinderbox 2.9 running on freebsd-current.sentex.ca TB --- 2012-10-12 04:50:01 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 d...@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC amd64 TB --- 2012-10-12 04:50:01 - starting HEAD tinderbox run for i386/i386 TB --- 2012-10-12 04:50:01 - cleaning the object tree TB --- 2012-10-12 04:50:01 - checking out /src from svn://svn.freebsd.org/base/head TB --- 2012-10-12 04:50:01 - cd /tinderbox/HEAD/i386/i386 TB --- 2012-10-12 04:50:01 - /usr/local/bin/svn cleanup /src TB --- 2012-10-12 04:53:23 - /usr/local/bin/svn update /src TB --- 2012-10-12 04:53:42 - At svn revision 241478 [SNIP] TB --- 2012-10-12 10:54:26 - /usr/bin/make -B buildkernel KERNCONF=XEN Kernel build for XEN started on Fri Oct 12 10:54:26 UTC 2012 stage 1: configuring the kernel stage 2.1: cleaning up the object tree stage 2.2: rebuilding the object tree stage 2.3: build tools stage 3.1: making dependencies stage 3.2: building everything [...] objcopy --only-keep-debug virtio_balloon.ko.debug virtio_balloon.ko.symbols objcopy --strip-debug --add-gnu-debuglink=virtio_balloon.ko.symbols virtio_balloon.ko.debug virtio_balloon.ko === virtio/scsi (all) cc -O2 -pipe -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc -DHAVE_KERNEL_OPTION_HEADERS -include /obj/i386.i386/src/sys/XEN/opt_global.h -I. 
-I@ -I@/contrib/altq -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common -g -I/obj/i386.i386/src/sys/XEN -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-sse -msoft-float -ffreestanding -fstack-protector -std=iso9899:1999 -fstack-protector -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -c /src/sys/modules/virtio/scsi/../../../dev/virtio/scsi/virtio_scsi.c cc1: warnings being treated as errors /src/sys/modules/virtio/scsi/../../../dev/virtio/scsi/virtio_scsi.c: In function 'vtscsi_sg_append_scsi_buf': /src/sys/modules/virtio/scsi/../../../dev/virtio/scsi/virtio_scsi.c:974: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] /src/sys/modules/virtio/scsi/../../../dev/virtio/scsi/virtio_scsi.c:982: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] *** [virtio_scsi.o] Error code 1 I cannot seem to recreate this locally, but I think these need to be casted through uintptr? 
diff --git a/sys/dev/virtio/scsi/virtio_scsi.c b/sys/dev/virtio/scsi/virtio_scsi.c
index f2e1412..79bc988 100644
--- a/sys/dev/virtio/scsi/virtio_scsi.c
+++ b/sys/dev/virtio/scsi/virtio_scsi.c
@@ -971,7 +971,7 @@ vtscsi_sg_append_scsi_buf(struct vtscsi_softc *sc, struct sglist *sg,
 			    csio->data_ptr, csio->dxfer_len);
 		else
 			error = sglist_append_phys(sg,
-			    (vm_paddr_t) csio->data_ptr, csio->dxfer_len);
+			    (vm_paddr_t)(uintptr_t) csio->data_ptr, csio->dxfer_len);
 	} else {
 
 		for (i = 0; i < csio->sglist_cnt && error == 0; i++) {
@@ -979,7 +979,7 @@ vtscsi_sg_append_scsi_buf(struct vtscsi_softc *sc, struct sglist *sg,
 
 			if ((ccbh->flags & CAM_SG_LIST_PHYS) == 0)
 				error = sglist_append(sg,
-				    (void *) dseg->ds_addr, dseg->ds_len);
+				    (void *)(uintptr_t) dseg->ds_addr, dseg->ds_len);
 			else
 				error = sglist_append_phys(sg,
 				    (vm_paddr_t) dseg->ds_addr, dseg->ds_len);

That being said, compiling VirtIO for a XEN kernel probably doesn't make any sense.

Bryan

Stop in /src/sys/modules/virtio/scsi.
*** [all] Error code 1
Stop in /src/sys/modules/virtio.
*** [all] Error code 1
Stop in /src/sys/modules.
*** [modules-all] Error code 1
Stop in /obj/i386.i386/src/sys/XEN.
*** [buildkernel] Error code 1
Stop in /src.
*** Error code 1
Stop in /src.
TB --- 2012-10-12 11:11:27 - WARNING: /usr/bin/make returned exit code 1
TB --- 2012-10-12 11:11:27 - ERROR: failed to build XEN kernel
TB --- 2012-10-12 11:11:27 - 17474.50 user 2374.09 system 22886.60 real
http://tinderbox.freebsd.org/tinderbox-head-HEAD-i386-i386.full
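The casts proposed in the patch above go through uintptr_t, an integer type that can hold a pointer value, so that no pointer is converted directly to or from an integer of a different width. A small userland sketch of the same idea follows. It assumes, as the warnings suggest but the thread does not state, that the address type is wider than a pointer in that kernel configuration; paddr_t, ptr_to_paddr() and paddr_to_ptr() are hypothetical names, with a 64-bit paddr_t standing in for vm_paddr_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for vm_paddr_t; assumed 64-bit in this sketch. */
typedef uint64_t paddr_t;

/*
 * Pointer to address: convert to uintptr_t first (same width as a
 * pointer), then widen integer-to-integer, so no pointer is ever cast
 * directly to an integer of a different size.
 */
static paddr_t
ptr_to_paddr(void *p)
{
	return ((paddr_t)(uintptr_t)p);
}

/* Address to pointer: narrow integer-to-integer, then convert. */
static void *
paddr_to_ptr(paddr_t a)
{
	return ((void *)(uintptr_t)a);
}
```

On a 32-bit target the inner cast is what silences the "cast from pointer to integer of different size" diagnostics; on a 64-bit target it is a no-op. Either way, a pointer round-tripped through both helpers compares equal to the original.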
Re: Awful FreeBSD 9 block IO performance in KVM
- Original Message -
From: Dieter BSD dieter...@engineer.com
To: hack...@freebsd.org, curr...@freebsd.org
Sent: Sunday, July 22, 2012 1:19:32 AM
Subject: Re: Awful FreeBSD 9 block IO performance in KVM

da0: 3.300MB/s transfers
da0: Command Queueing enabled
root@freebsd:/root # dd if=/dev/zero of=/dev/da1 bs=16384 count=262144
4294967296 bytes transferred in 615.840721 secs (6974153 bytes/sec)

1) Does a larger block size (bs=1m) help?
2) That's roughly the speed I'd expect without queueing. Is it really making effective use of queueing, or is something limiting queueing to one transfer at a time?

The likely fix here is basically to do vtblk_startio() in a separate kproc that vtblk_strategy() enqueues bios to. This has been on my todo list for a while, but I haven't had the time.

Also, the use of bioq_disksort() probably doesn't gain much for virtualized disks, but I never found much of a difference in my testing.
deadlkres() panic
On a recent -current, I got the following panic from deadlkres:

Assertion wchan != NULL failed at /usr/src-nfs/sys/kern/subr_sleepqueue.c:680
Tracing pid 0 tid 100058 td 0xff00024bf7a0
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x176
sleepq_type() at sleepq_type+0x56
deadlkres() at deadlkres+0x224
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xff8074976d30, rbp = 0 ---

(Hand transcribed, doadump() hung)

deadlkres() came across a TD_IS_SLEEPING()'ing thread that was not on a sleepqueue (ie, td->td_wchan == NULL). I don't think this is an invalid state for a thread to be in:

After adding itself to a sleepq and setting a timeout, the thread calls sleepq_timedwait_sig(). sleepq_catch_signals() determines there is a signal pending, so it removes the thread from the sleepq via sleepq_resume_thread(). Returning to sleepq_timedwait_sig(), in the call to sleepq_check_timeout(), the thread is unable to cancel the timeout because it is already firing (likely waiting on thread_lock()). So the thread calls TD_SET_SLEEPING() followed by mi_switch(). deadlkres() then picks up thread_lock(), finding that td is TD_IS_SLEEPING() && !TD_ON_SLEEPQ().

The attached patch takes care of the panic for me.

--- /usr/src-nfs/sys/kern/kern_clock.c 2010-06-30 03:38:25.0 -0500
+++ kern_clock.c 2010-07-01 02:19:39.048697991 -0500
@@ -232,7 +232,8 @@
 					panic("%s: possible deadlock detected for %p, blocked for %d ticks\n",
 					    __func__, td, tticks);
 				}
-			} else if (TD_IS_SLEEPING(td)) {
+			} else if (TD_IS_SLEEPING(td) &&
+			    TD_ON_SLEEPQ(td)) {
 
 				/* Handle ticks wrap-up. */
 				if (ticks < td->td_blktick) {
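The condition the patch introduces can be modelled in isolation. The sketch below is a simplified userland model, not the kernel's struct thread: struct thread_model, the TDI_SLEEPING value, and the helper names are assumptions made for illustration. It treats "sleeping" as an inhibitor bit and "on a sleepqueue" as a non-NULL wait channel, matching the td->td_wchan == NULL description in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TDI_SLEEPING 0x0002	/* hypothetical flag value for this sketch */

/* Simplified, hypothetical model of the two thread fields involved. */
struct thread_model {
	int inhibitors;		/* sleeping when TDI_SLEEPING is set */
	const void *wchan;	/* non-NULL while on a sleepqueue */
};

static bool
td_is_sleeping(const struct thread_model *td)
{
	return ((td->inhibitors & TDI_SLEEPING) != 0);
}

static bool
td_on_sleepq(const struct thread_model *td)
{
	return (td->wchan != NULL);
}

/*
 * Patched condition: only inspect the sleepqueue of a sleeping thread
 * that is actually still on one.  A thread that was pulled off its
 * sleepq by a pending signal but has not yet switched out is skipped.
 */
static bool
should_check_sleep_time(const struct thread_model *td)
{
	return (td_is_sleeping(td) && td_on_sleepq(td));
}
```

With only the first half of the condition, a thread with the sleeping bit set and a NULL wait channel would still be examined, which corresponds to the path where the wchan != NULL assertion fired.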