Re: svn commit: r348737 - head/sys/kern
On Thu, Jun 6, 2019 at 2:02 PM John Baldwin wrote: > > On 6/6/19 11:21 AM, Ian Lepore wrote: > > On Thu, 2019-06-06 at 12:04 -0600, Alan Somers wrote: > >> On Thu, Jun 6, 2019 at 12:01 PM John Baldwin wrote: > >>> > >>> On 6/6/19 10:39 AM, Alan Somers wrote: > On Thu, Jun 6, 2019 at 11:35 AM Rodney W. Grimes > wrote: > > > >> Author: asomers > >> Date: Thu Jun 6 15:04:50 2019 > >> New Revision: 348737 > >> URL: https://svnweb.freebsd.org/changeset/base/348737 > >> > >> Log: > >> Add a testing facility to manually reclaim a vnode > >> > >> Add the debug.try_reclaim_vnode sysctl. When a pathname is > >> written to it, it > >> will be reclaimed, as long as it isn't already or doomed. > >> The purpose is to > >> gain test coverage for vnode reclamation, which is > >> otherwise hard to > >> achieve. > >> > >> Add the debug.ftry_reclaim_vnode sysctl. It does the same > >> thing, except > >> that its argument is a file descriptor instead of a > >> pathname. > > > > Should not this all be wrapped in some #ifdef or other > > protection, > > is it really a good idea to have this on every single box > > running > > FreeBSD? > > I initially thought so too, but kib thought that it could be > useful > for debugging problems in the field. The potential downside is > limited, because only root can write to the sysctls, and the > worse-case damage is similar to a "umount -f". > >>> > >>> A compromise might be to stick this in a kernel module instead of > >>> in the > >>> base kernel. You could still kldload it in the field for debugging > >>> but > >>> not necessarily have it directly available out of the box. > >>> > >>> -- > >>> John Baldwin > >> > >> If we already had such a module, it would make sense to put these > >> sysctls in there. But I don't want to create an entire module for > >> just a few dozen LOC. Nor do I want to mediate a bike shed. So > >> let's > >> vote. kib already registered a vote for making them available all of > >> the time. 
rgrimes voted to guard them by INVARIANTS. Anybody else > >> who cares can reply to this thread. I'll count the votes in 24 > >> hours. > >> -Alan > >> > > > > If our new policy is to remove sysctls that aren't used often "because > > something bad might happen" (without any requirement for the complainer > > to elaborate on just what might happen or why it's so much worse than > > the damage a root user could do with any other sysctl), I think several > > people could be employed full time doing that removal work. Or we > > could all just get on with doing some real work. > > What I find a bit different about this case is when it's a debugging > knob. For that sort of thing, kernel modules are a pretty decent way > to inject new functionality into the system that is rarely needed. A > while back I had a problem with resume on a laptop seemingly not > unsticking all of the processes that had been paused via stop_all and had > a hacky kernel module with a magic sysctl that would try to unstick things. > That worked better as a module that I only loaded if needed. Similar for > a hacky kernel module at a previous job (killsmi.ko) that would write to > the appropriate ICH register to disable all SMIs when loaded, etc. > > -- > John Baldwin It's been two weeks, and the vote tally is: * Unconditional: 3 * Module: 2 * Don't care/Get on with bigger problems: 2 Unconditional wins the vote. Though if rgrimes goes to the trouble of writing the module, I'll review it. -Alan ___ svn-src-head@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
svn commit: r349220 - in head: share/man/man9 sys/kern sys/sys
Author: mav
Date: Thu Jun 20 01:15:33 2019
New Revision: 349220
URL: https://svnweb.freebsd.org/changeset/base/349220

Log:
  Add wakeup_any(), cheaper wakeup_one() for taskqueue(9).

  wakeup_one() and the underlying sleepq_signal() spend additional time
  trying to be fair, waking the thread with the highest priority that has
  slept the longest. But in the case of taskqueue there are many absolutely
  identical threads, and any fairness between them is quite pointless. It
  even makes things worse, since round-robin wakeups not only make previous
  CPU affinity in the scheduler quite useless, but also hide CPU bottlenecks
  from the user, when a sequential workload with one request at a time looks
  evenly distributed between multiple threads.

  This change adds a new SLEEPQ_UNFAIR flag to sleepq_signal(), making it
  wake up the thread that went to sleep last, but is no longer in context
  switch (to avoid immediate spinning on the thread lock). On top of that a
  new wakeup_any() function is added, equivalent to wakeup_one(), but
  setting the flag. On top of that taskqueue(9) is switched to wakeup_any()
  to wake up its threads.

  As a result, on a 72-core Xeon v4 machine a sequential ZFS write to 12
  ZVOLs with 16KB block size spends 34% less time in wakeup_any() and
  descendants than it was spending in wakeup_one(), and total write
  throughput increased by ~10% with the same CPU usage as before.

  Reviewed by:	markj, mmacy
  MFC after:	2 weeks
  Sponsored by:	iXsystems, Inc.
Differential Revision:https://reviews.freebsd.org/D20669 Modified: head/share/man/man9/Makefile head/share/man/man9/sleep.9 head/share/man/man9/sleepqueue.9 head/sys/kern/kern_synch.c head/sys/kern/subr_sleepqueue.c head/sys/kern/subr_taskqueue.c head/sys/sys/queue.h head/sys/sys/sleepqueue.h head/sys/sys/systm.h Modified: head/share/man/man9/Makefile == --- head/share/man/man9/MakefileThu Jun 20 00:23:51 2019 (r349219) +++ head/share/man/man9/MakefileThu Jun 20 01:15:33 2019 (r349220) @@ -1880,7 +1880,8 @@ MLINKS+=sleep.9 msleep.9 \ sleep.9 tsleep.9 \ sleep.9 tsleep_sbt.9 \ sleep.9 wakeup.9 \ - sleep.9 wakeup_one.9 + sleep.9 wakeup_one.9 \ + sleep.9 wakeup_any.9 MLINKS+=sleepqueue.9 init_sleepqueues.9 \ sleepqueue.9 sleepq_abort.9 \ sleepqueue.9 sleepq_add.9 \ Modified: head/share/man/man9/sleep.9 == --- head/share/man/man9/sleep.9 Thu Jun 20 00:23:51 2019(r349219) +++ head/share/man/man9/sleep.9 Thu Jun 20 01:15:33 2019(r349220) @@ -25,7 +25,7 @@ .\" .\" $FreeBSD$ .\" -.Dd March 4, 2018 +.Dd June 19, 2019 .Dt SLEEP 9 .Os .Sh NAME @@ -38,7 +38,9 @@ .Nm pause_sbt , .Nm tsleep , .Nm tsleep_sbt , -.Nm wakeup +.Nm wakeup , +.Nm wakeup_one , +.Nm wakeup_any .Nd wait for events .Sh SYNOPSIS .In sys/param.h @@ -70,6 +72,8 @@ .Fn wakeup "void *chan" .Ft void .Fn wakeup_one "void *chan" +.Ft void +.Fn wakeup_any "void *chan" .Sh DESCRIPTION The functions .Fn tsleep , @@ -79,8 +83,9 @@ The functions .Fn pause_sig , .Fn pause_sbt , .Fn wakeup , +.Fn wakeup_one , and -.Fn wakeup_one +.Fn wakeup_any handle event-based thread blocking. If a thread must wait for an external event, it is put to sleep by @@ -252,9 +257,10 @@ function is a wrapper around .Fn tsleep that suspends execution of the current thread for the indicated timeout. The thread can not be awakened early by signals or calls to -.Fn wakeup +.Fn wakeup , +.Fn wakeup_one or -.Fn wakeup_one . +.Fn wakeup_any . The .Fn pause_sig function is a variant of @@ -263,8 +269,8 @@ which can be awakened early by signals. 
.Pp The .Fn wakeup_one -function makes the first thread in the queue that is sleeping on the -parameter +function makes the first highest priority thread in the queue that is +sleeping on the parameter .Fa chan runnable. This reduces the load when a large number of threads are sleeping on @@ -292,6 +298,16 @@ to pay particular attention to ensure that no other threads wait on the same .Fa chan . +.Pp +The +.Fn wakeup_any +function is similar to +.Fn wakeup_one , +except that it makes runnable last thread on the queue (sleeping less), +ignoring fairness. +It can be used when threads sleeping on the +.Fa chan +are known to be identical and there is no reason to be fair. .Pp If the timeout given by .Fa timo Modified: head/share/man/man9/sleepqueue.9 == --- head/share/man/man9/sleepqueue.9Thu Jun 20 00:23:51 2019 (r349219) +++ head/share/man/man9/sleepqueue.9Thu Jun 20 01:15:33 2019 (r349220) @@ -22,7 +22,7 @@ .\" .\" $FreeBSD$ .\" -.Dd September 22, 2014 +.Dd June 19, 2019 .Dt SLEEPQUEUE 9 .Os .Sh NAME @@ -290,7 +290,8 @@ and functions. The .Fn sleepq_signal
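To make the difference between the two wakeup policies described in the log concrete, here is a minimal userland sketch. It is not the kernel code: struct sleeper, pick_fair() and pick_unfair() are hypothetical names, the "lower value means higher priority" convention is an assumption, and the "no longer in context switch" check mentioned in the log is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a sleeping thread; not the kernel's struct thread. */
struct sleeper {
	int id;		/* illustrative thread id */
	int prio;	/* lower value = higher priority (assumed convention) */
};

/*
 * Fair choice, in the spirit of wakeup_one(): scan the whole queue, which
 * is kept in arrival order (oldest first), and return the index of the
 * highest-priority sleeper.  On a tie the earlier entry wins, i.e. the one
 * that has slept longest.  Cost is linear in the queue length.
 */
static int
pick_fair(const struct sleeper *q, size_t n)
{
	size_t i, best;

	if (n == 0)
		return (-1);
	best = 0;
	for (i = 1; i < n; i++) {
		if (q[i].prio < q[best].prio)
			best = i;
	}
	return ((int)best);
}

/*
 * Unfair choice, in the spirit of wakeup_any(): take the sleeper that
 * queued most recently without looking at the others.
 */
static int
pick_unfair(const struct sleeper *q, size_t n)
{
	(void)q;
	return (n == 0 ? -1 : (int)(n - 1));
}
```

With identical worker threads the result of the scan does not matter, which is why the constant-time choice is attractive for taskqueue(9).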
svn commit: r349218 - head/sys/vm
Author: markj
Date: Wed Jun 19 21:36:00 2019
New Revision: 349218
URL: https://svnweb.freebsd.org/changeset/base/349218

Log:
  Group vm_page_activate()'s definition with other related functions.

  No functional change intended.

  MFC after:	3 days

Modified:
  head/sys/vm/vm_page.c

Modified: head/sys/vm/vm_page.c
==============================================================================
--- head/sys/vm/vm_page.c	Wed Jun 19 21:10:13 2019	(r349217)
+++ head/sys/vm/vm_page.c	Wed Jun 19 21:36:00 2019	(r349218)
@@ -3401,35 +3401,6 @@ vm_page_requeue(vm_page_t m)
 }
 
 /*
- *	vm_page_activate:
- *
- *	Put the specified page on the active list (if appropriate).
- *	Ensure that act_count is at least ACT_INIT but do not otherwise
- *	mess with it.
- *
- *	The page must be locked.
- */
-void
-vm_page_activate(vm_page_t m)
-{
-
-	vm_page_assert_locked(m);
-
-	if (vm_page_wired(m) || (m->oflags & VPO_UNMANAGED) != 0)
-		return;
-	if (vm_page_queue(m) == PQ_ACTIVE) {
-		if (m->act_count < ACT_INIT)
-			m->act_count = ACT_INIT;
-		return;
-	}
-
-	vm_page_dequeue(m);
-	if (m->act_count < ACT_INIT)
-		m->act_count = ACT_INIT;
-	vm_page_enqueue(m, PQ_ACTIVE);
-}
-
-/*
  *	vm_page_free_prep:
  *
  *	Prepares the given page to be put on the free list,
@@ -3677,6 +3648,35 @@ vm_page_unwire_noq(vm_page_t m)
 		return (true);
 	} else
 		return (false);
+}
+
+/*
+ *	vm_page_activate:
+ *
+ *	Put the specified page on the active list (if appropriate).
+ *	Ensure that act_count is at least ACT_INIT but do not otherwise
+ *	mess with it.
+ *
+ *	The page must be locked.
+ */
+void
+vm_page_activate(vm_page_t m)
+{
+
+	vm_page_assert_locked(m);
+
+	if (vm_page_wired(m) || (m->oflags & VPO_UNMANAGED) != 0)
+		return;
+	if (vm_page_queue(m) == PQ_ACTIVE) {
+		if (m->act_count < ACT_INIT)
+			m->act_count = ACT_INIT;
+		return;
+	}
+
+	vm_page_dequeue(m);
+	if (m->act_count < ACT_INIT)
+		m->act_count = ACT_INIT;
+	vm_page_enqueue(m, PQ_ACTIVE);
 }
 
 /*
svn commit: r349217 - head/stand/libsa/zfs
Author: mmacy
Date: Wed Jun 19 21:10:13 2019
New Revision: 349217
URL: https://svnweb.freebsd.org/changeset/base/349217

Log:
  Tell loader to ignore newer features enabled on the root pool.

  There are many new features in ZoF. Most, if not all, do not affect
  read-only usage. Encryption in particular is enabled at the pool level
  but used at the dataset level. The loader obviously will not be able to
  boot if the boot dataset is encrypted, but should not care if some other
  dataset in the root pool is encrypted.

  Reviewed by:	allanjude
  MFC after:	1 week

Modified:
  head/stand/libsa/zfs/zfsimpl.c

Modified: head/stand/libsa/zfs/zfsimpl.c
==============================================================================
--- head/stand/libsa/zfs/zfsimpl.c	Wed Jun 19 20:29:02 2019	(r349216)
+++ head/stand/libsa/zfs/zfsimpl.c	Wed Jun 19 21:10:13 2019	(r349217)
@@ -64,6 +64,12 @@ static const char *features_for_read[] = {
 	"org.illumos:skein",
 	"org.zfsonlinux:large_dnode",
 	"com.joyent:multi_vdev_crash_dump",
+	"com.delphix:spacemap_histogram",
+	"com.delphix:zpool_checkpoint",
+	"com.delphix:spacemap_v2",
+	"com.datto:encryption",
+	"org.zfsonlinux:allocation_classes",
+	"com.datto:resilver_defer",
 	NULL
 };
 
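As a rough illustration of how a NULL-terminated allow-list like features_for_read[] can be consulted, here is a small sketch. The helper name feature_is_supported() is hypothetical and the array below is only a subset of the entries in the diff; this is not a copy of what zfsimpl.c does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A subset of the entries from the diff; NULL terminates the list. */
static const char *features_for_read[] = {
	"org.zfsonlinux:large_dnode",
	"com.datto:encryption",
	"com.datto:resilver_defer",
	NULL
};

/*
 * Hypothetical helper: report whether a feature name found on the pool is
 * one the reader knows it can tolerate.
 */
static bool
feature_is_supported(const char *name)
{
	for (size_t i = 0; features_for_read[i] != NULL; i++) {
		if (strcmp(name, features_for_read[i]) == 0)
			return (true);
	}
	return (false);
}
```

Under this scheme, adding a feature string to the list is all it takes for a reader to stop rejecting pools that have the feature enabled.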
svn commit: r349202 - head/share/mk
Author: bdrewery
Date: Wed Jun 19 19:19:37 2019
New Revision: 349202
URL: https://svnweb.freebsd.org/changeset/base/349202

Log:
  Follow-up r349065: Fix .TARGET flag ambiguity with PROGS which broke MK_TESTS.

  X-MFC-With:	r349065
  Sponsored by:	DellEMC

Modified:
  head/share/mk/bsd.sys.mk

Modified: head/share/mk/bsd.sys.mk
==============================================================================
--- head/share/mk/bsd.sys.mk	Wed Jun 19 18:47:44 2019	(r349201)
+++ head/share/mk/bsd.sys.mk	Wed Jun 19 19:19:37 2019	(r349202)
@@ -234,7 +234,6 @@ DEBUG_FILES_CFLAGS?= -g
 .if ${MK_WARNS} != "no"
 CFLAGS+=	${CWARNFLAGS:M*} ${CWARNFLAGS.${COMPILER_TYPE}}
 CFLAGS+=	${CWARNFLAGS.${.IMPSRC:T}}
-CFLAGS+=	${CWARNFLAGS.${.TARGET:T}}
 .endif
 
 CFLAGS+=	${CFLAGS.${COMPILER_TYPE}}
@@ -245,14 +244,23 @@ AFLAGS+=	${AFLAGS.${.TARGET:T}}
 ACFLAGS+=	${ACFLAGS.${.IMPSRC:T}}
 ACFLAGS+=	${ACFLAGS.${.TARGET:T}}
 CFLAGS+=	${CFLAGS.${.IMPSRC:T}}
-CFLAGS+=	${CFLAGS.${.TARGET:T}}
 CXXFLAGS+=	${CXXFLAGS.${.IMPSRC:T}}
-CXXFLAGS+=	${CXXFLAGS.${.TARGET:T}}
 
 LDFLAGS+=	${LDFLAGS.${LINKER_TYPE}}
+
+# Only allow .TARGET when not using PROGS as it has the same syntax
+# per PROG which is ambiguous with this syntax. This is only needed
+# for PROG_VARS vars.
+.if !defined(_RECURSING_PROGS)
+.if ${MK_WARNS} != "no"
+CFLAGS+=	${CWARNFLAGS.${.TARGET:T}}
+.endif
+CFLAGS+=	${CFLAGS.${.TARGET:T}}
+CXXFLAGS+=	${CXXFLAGS.${.TARGET:T}}
 LDFLAGS+=	${LDFLAGS.${.TARGET:T}}
 LDADD+=		${LDADD.${.TARGET:T}}
 LIBADD+=	${LIBADD.${.TARGET:T}}
+.endif
 
 .if defined(SRCTOP)
 # Prevent rebuilding during install to support read-only objdirs.
svn commit: r349201 - head/stand/efi/libefi
Author: bcran Date: Wed Jun 19 18:47:44 2019 New Revision: 349201 URL: https://svnweb.freebsd.org/changeset/base/349201 Log: efinet: Defer exclusively opening the network handles Don't commit to exclusive access to the network device handle by efinet until the loader has decided to load something through the network. This allows for the possibility of other users of the network device. Submitted by: scottph Reviewed by: tsoome, emaste Tested by:tsoome, bcran Differential Revision:https://reviews.freebsd.org/D20642 Modified: head/stand/efi/libefi/efinet.c Modified: head/stand/efi/libefi/efinet.c == --- head/stand/efi/libefi/efinet.c Wed Jun 19 16:44:07 2019 (r349200) +++ head/stand/efi/libefi/efinet.c Wed Jun 19 18:47:44 2019 (r349201) @@ -108,7 +108,25 @@ efinet_match(struct netif *nif, void *machdep_hint) static int efinet_probe(struct netif *nif, void *machdep_hint) { + EFI_SIMPLE_NETWORK *net; + EFI_HANDLE h; + EFI_STATUS status; + h = nif->nif_driver->netif_ifs[nif->nif_unit].dif_private; + /* +* Open the network device in exclusive mode. Without this +* we will be racing with the UEFI network stack. It will +* pull packets off the network leading to lost packets. +*/ + status = BS->OpenProtocol(h, _guid, (void **), + IH, NULL, EFI_OPEN_PROTOCOL_EXCLUSIVE); + if (status != EFI_SUCCESS) { + printf("Unable to open network interface %d for " + "exclusive access: %lu\n", nif->nif_unit, + EFI_ERROR_CODE(status)); + return (efi_status_to_errno(status)); + } + return (0); } @@ -269,7 +287,6 @@ efinet_dev_init() struct netif_dif *dif; struct netif_stats *stats; EFI_DEVICE_PATH *devpath, *node; - EFI_SIMPLE_NETWORK *net; EFI_HANDLE *handles, *handles2; EFI_STATUS status; UINTN sz; @@ -304,19 +321,6 @@ efinet_dev_init() if (DevicePathType(node) != MESSAGING_DEVICE_PATH || DevicePathSubType(node) != MSG_MAC_ADDR_DP) continue; - - /* -* Open the network device in exclusive mode. Without this -* we will be racing with the UEFI network stack. 
It will -* pull packets off the network leading to lost packets. -*/ - status = BS->OpenProtocol(handles[i], _guid, (void **), - IH, NULL, EFI_OPEN_PROTOCOL_EXCLUSIVE); - if (status != EFI_SUCCESS) { - printf("Unable to open network interface %d for " - "exclusive access: %lu\n", i, - EFI_ERROR_CODE(status)); - } handles2[nifs] = handles[i]; nifs++; ___ svn-src-head@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
svn commit: r349196 - head/usr.sbin/bhyve
Author: markj
Date: Wed Jun 19 16:09:20 2019
New Revision: 349196
URL: https://svnweb.freebsd.org/changeset/base/349196

Log:
  Make zlib encoding messages idempotent.

  Otherwise duplicate messages can trigger a reinitialization of the
  compression stream while the update thread is running. Also ensure
  that the stream is initialized before the update thread may attempt
  to use it.

  PR:		238333
  Reviewed by:	cem, rgrimes
  MFC after:	3 days
  Sponsored by:	The FreeBSD Foundation
  Differential Revision:	https://reviews.freebsd.org/D20673

Modified:
  head/usr.sbin/bhyve/rfb.c

Modified: head/usr.sbin/bhyve/rfb.c
==============================================================================
--- head/usr.sbin/bhyve/rfb.c	Wed Jun 19 15:36:02 2019	(r349195)
+++ head/usr.sbin/bhyve/rfb.c	Wed Jun 19 16:09:20 2019	(r349196)
@@ -273,8 +273,10 @@ rfb_recv_set_encodings_msg(struct rfb_softc *rc, int c
 			rc->enc_raw_ok = true;
 			break;
 		case RFB_ENCODING_ZLIB:
-			rc->enc_zlib_ok = true;
-			deflateInit(&rc->zstream, Z_BEST_SPEED);
+			if (!rc->enc_zlib_ok) {
+				deflateInit(&rc->zstream, Z_BEST_SPEED);
+				rc->enc_zlib_ok = true;
+			}
 			break;
 		case RFB_ENCODING_RESIZE:
 			rc->enc_resize_ok = true;
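The pattern in this change (initialize only once, and set the "ready" flag only after initialization) can be sketched without zlib. Everything below is a stand-in: struct fake_softc, fake_stream_init() and handle_zlib_encoding() are hypothetical names, and the sketch is single-threaded, so it says nothing about the memory ordering a real update thread would need.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the softc fields touched by the diff. */
struct fake_softc {
	bool enc_zlib_ok;	/* set once the stream is ready for use */
	int init_calls;		/* how many times the stream was initialized */
};

/* Stand-in for deflateInit(); it only records that it ran. */
static void
fake_stream_init(struct fake_softc *sc)
{
	sc->init_calls++;
}

/*
 * Handle a "client supports zlib" message.  Repeated messages are no-ops,
 * and the flag is raised only after the stream has been set up.
 */
static void
handle_zlib_encoding(struct fake_softc *sc)
{
	if (!sc->enc_zlib_ok) {
		fake_stream_init(sc);
		sc->enc_zlib_ok = true;
	}
}
```

Calling handle_zlib_encoding() twice on the same softc leaves init_calls at 1, which is the property the commit is after for duplicate messages.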
svn commit: r349195 - in head/sys: geom geom/concat geom/eli geom/journal geom/mirror geom/multipath geom/part geom/raid geom/raid3 geom/stripe kern
Author: mav Date: Wed Jun 19 15:36:02 2019 New Revision: 349195 URL: https://svnweb.freebsd.org/changeset/base/349195 Log: Use sbuf_cat() in GEOM confxml generation. When it comes to megabytes of text, difference between sbuf_printf() and sbuf_cat() becomes substantial. MFC after:2 weeks Sponsored by: iXsystems, Inc. Modified: head/sys/geom/concat/g_concat.c head/sys/geom/eli/g_eli.c head/sys/geom/geom_disk.c head/sys/geom/geom_dump.c head/sys/geom/geom_int.h head/sys/geom/journal/g_journal.c head/sys/geom/mirror/g_mirror.c head/sys/geom/multipath/g_multipath.c head/sys/geom/part/g_part_apm.c head/sys/geom/part/g_part_bsd64.c head/sys/geom/part/g_part_gpt.c head/sys/geom/part/g_part_mbr.c head/sys/geom/raid/g_raid.c head/sys/geom/raid3/g_raid3.c head/sys/geom/stripe/g_stripe.c head/sys/kern/kern_uuid.c Modified: head/sys/geom/concat/g_concat.c == --- head/sys/geom/concat/g_concat.c Wed Jun 19 15:26:52 2019 (r349194) +++ head/sys/geom/concat/g_concat.c Wed Jun 19 15:36:02 2019 (r349195) @@ -1004,24 +1004,24 @@ g_concat_dumpconf(struct sbuf *sb, const char *indent, sbuf_printf(sb, "%s", indent); switch (sc->sc_type) { case G_CONCAT_TYPE_AUTOMATIC: - sbuf_printf(sb, "AUTOMATIC"); + sbuf_cat(sb, "AUTOMATIC"); break; case G_CONCAT_TYPE_MANUAL: - sbuf_printf(sb, "MANUAL"); + sbuf_cat(sb, "MANUAL"); break; default: - sbuf_printf(sb, "UNKNOWN"); + sbuf_cat(sb, "UNKNOWN"); break; } - sbuf_printf(sb, "\n"); + sbuf_cat(sb, "\n"); sbuf_printf(sb, "%sTotal=%u, Online=%u\n", indent, sc->sc_ndisks, g_concat_nvalid(sc)); sbuf_printf(sb, "%s", indent); if (sc->sc_provider != NULL && sc->sc_provider->error == 0) - sbuf_printf(sb, "UP"); + sbuf_cat(sb, "UP"); else - sbuf_printf(sb, "DOWN"); - sbuf_printf(sb, "\n"); + sbuf_cat(sb, "DOWN"); + sbuf_cat(sb, "\n"); } } Modified: head/sys/geom/eli/g_eli.c == --- head/sys/geom/eli/g_eli.c Wed Jun 19 15:26:52 2019(r349194) +++ head/sys/geom/eli/g_eli.c Wed Jun 19 15:36:02 2019(r349195) @@ -1328,17 +1328,17 @@ g_eli_dumpconf(struct sbuf *sb, 
const char *indent, st (uintmax_t)sc->sc_ekeys_allocated); sbuf_printf(sb, "%s", indent); if (sc->sc_flags == 0) - sbuf_printf(sb, "NONE"); + sbuf_cat(sb, "NONE"); else { int first = 1; #define ADD_FLAG(flag, name) do {\ if (sc->sc_flags & (flag)) {\ if (!first) \ - sbuf_printf(sb, ", "); \ + sbuf_cat(sb, ", "); \ else\ first = 0; \ - sbuf_printf(sb, name); \ + sbuf_cat(sb, name); \ } \ } while (0) ADD_FLAG(G_ELI_FLAG_SUSPEND, "SUSPEND"); @@ -1358,7 +1358,7 @@ g_eli_dumpconf(struct sbuf *sb, const char *indent, st ADD_FLAG(G_ELI_FLAG_AUTORESIZE, "AUTORESIZE"); #undef ADD_FLAG } - sbuf_printf(sb, "\n"); + sbuf_cat(sb, "\n"); if (!(sc->sc_flags & G_ELI_FLAG_ONETIME)) { sbuf_printf(sb, "%s%u\n", indent, @@ -1368,16 +1368,16 @@ g_eli_dumpconf(struct sbuf *sb, const char *indent, st sbuf_printf(sb, "%s", indent); switch (sc->sc_crypto) { case G_ELI_CRYPTO_HW: - sbuf_printf(sb, "hardware"); + sbuf_cat(sb, "hardware"); break; case G_ELI_CRYPTO_SW: - sbuf_printf(sb, "software"); + sbuf_cat(sb, "software"); break; default: - sbuf_printf(sb, "UNKNOWN"); + sbuf_cat(sb, "UNKNOWN"); break; } - sbuf_printf(sb, "\n"); + sbuf_cat(sb, "\n"); if (sc->sc_flags & G_ELI_FLAG_AUTH) { sbuf_printf(sb, "%s%s\n", Modified: head/sys/geom/geom_disk.c
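For readers wondering why the two calls differ in cost, here is a userland sketch of a formatted append next to a plain append. struct tinybuf, tb_printf() and tb_cat() are hypothetical; the kernel's sbuf(9) is considerably richer (growth, drains, error state), so treat this only as an illustration of the format-parsing overhead.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size buffer; must start out zero-initialized. */
struct tinybuf {
	char data[256];
	size_t len;
};

/* Formatted append: the format string is parsed on every call. */
static int
tb_printf(struct tinybuf *tb, const char *fmt, ...)
{
	size_t room = sizeof(tb->data) - tb->len;
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(tb->data + tb->len, room, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= room) {
		tb->data[tb->len] = '\0';	/* undo a partial write */
		return (-1);
	}
	tb->len += (size_t)n;
	return (0);
}

/* Plain append: one strlen() and one memcpy(), no format parsing. */
static int
tb_cat(struct tinybuf *tb, const char *s)
{
	size_t n = strlen(s);

	if (n >= sizeof(tb->data) - tb->len)
		return (-1);
	memcpy(tb->data + tb->len, s, n + 1);	/* copy the NUL as well */
	tb->len += n;
	return (0);
}
```

For constant strings without conversions both produce the same bytes. The plain append also cannot misinterpret a stray '%' in its argument.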
Re: svn commit: r349184 - head/sys/amd64/vmm/intel
> Author: scottl > Date: Wed Jun 19 06:41:07 2019 > New Revision: 349184 > URL: https://svnweb.freebsd.org/changeset/base/349184 > > Log: > Implement VT-d capability detection on chipsets that have multiple > translation units with differing capabilities > > From the author via Bugzilla: > --- If you had read the full bug report you would also know: https://reviews.freebsd.org/D19001 existed and that some code cleanup had occurred since this bug was created. The review was pending approval by bhyve maintainer(s). > When an attempt is made to passthrough a PCI device to a bhyve VM > (causing initialisation of IOMMU) on certain Intel chipsets using > VT-d the PCI bus stops working entirely. This issue occurs on the > E3-1275 v5 processor on C236 chipset and has also been encountered > by others on the forums with different hardware in the Skylake > series. > > The chipset has two VT-d translation units. The issue is caused by > an attempt to use the VT-d device-IOTLB capability that is > supported by only the first unit for devices attached to the > second unit which lacks that capability. Only the capabilities of > the first unit are checked and are assumed to be the same for all > units. > > Attached is a patch to rectify this issue by determining which > unit is responsible for the device being added to a domain and > then checking that unit's device-IOTLB capability. In addition to > this a few fixes have been made to other instances where the first > unit's capabilities are assumed for all units for domains they > share. In these cases a mutual set of capabilities is determined. > The patch should hopefully fix any bugs for current/future > hardware with multiple translation units supporting different > capabilities. > > A description is on the forums at > https://forums.freebsd.org/threads/pci-passthrough-bhyve-usb-xhci.65235 > The thread includes observations by other users of the bug > occurring, and description as well as confirmation of the fix. 
> I'd also like to thank Ordoban for their help. > > --- > Personally tested on a Skylake laptop, Skylake Xeon server, and > a Xeon-D-1541, passing through XHCI and NVMe functions. Passthru > is hit-or-miss to the point of being unusable without this > patch. > > PR: 229852 > Submitted by: cal...@aitchison.org > MFC after: 1 week > > Modified: > head/sys/amd64/vmm/intel/vtd.c > > Modified: head/sys/amd64/vmm/intel/vtd.c > == > --- head/sys/amd64/vmm/intel/vtd.cWed Jun 19 03:33:00 2019 > (r349183) > +++ head/sys/amd64/vmm/intel/vtd.cWed Jun 19 06:41:07 2019 > (r349184) > @@ -51,6 +51,8 @@ __FBSDID("$FreeBSD$"); > * Architecture Spec, September 2008. > */ > > +#define VTD_DRHD_INCLUDE_PCI_ALL(Flags) (((Flags) >> 0) & 0x1) > + > /* Section 10.4 "Register Descriptions" */ > struct vtdmap { > volatile uint32_t version; > @@ -116,10 +118,11 @@ struct domain { > static SLIST_HEAD(, domain) domhead; > > #define DRHD_MAX_UNITS 8 > -static int drhd_num; > -static struct vtdmap *vtdmaps[DRHD_MAX_UNITS]; > -static int max_domains; > -typedef int (*drhd_ident_func_t)(void); > +static ACPI_DMAR_HARDWARE_UNIT *drhds[DRHD_MAX_UNITS]; > +static int drhd_num; > +static struct vtdmap *vtdmaps[DRHD_MAX_UNITS]; > +static int max_domains; > +typedef int (*drhd_ident_func_t)(void); > > static uint64_t root_table[PAGE_SIZE / sizeof(uint64_t)] __aligned(4096); > static uint64_t ctx_tables[256][PAGE_SIZE / sizeof(uint64_t)] > __aligned(4096); > @@ -175,6 +178,69 @@ domain_id(void) > return (id); > } > > +static struct vtdmap * > +vtd_device_scope(uint16_t rid) > +{ > + int i, remaining, pathremaining; > + char *end, *pathend; > + struct vtdmap *vtdmap; > + ACPI_DMAR_HARDWARE_UNIT *drhd; > + ACPI_DMAR_DEVICE_SCOPE *device_scope; > + ACPI_DMAR_PCI_PATH *path; > + > + for (i = 0; i < drhd_num; i++) { > + drhd = drhds[i]; > + > + if (VTD_DRHD_INCLUDE_PCI_ALL(drhd->Flags)) { > + /* > + * From Intel VT-d arch spec, version 3.0: > + * If a DRHD structure with INCLUDE_PCI_ALL flag Set is > reported 
> + * for a Segment, it must be enumerated by BIOS after > all other > + * DRHD structures for the same Segment. > + */ > + vtdmap = vtdmaps[i]; > + return(vtdmap); > + } > + > + end = (char *)drhd + drhd->Header.Length; > + remaining = drhd->Header.Length - > sizeof(ACPI_DMAR_HARDWARE_UNIT); > + while (remaining > sizeof(ACPI_DMAR_DEVICE_SCOPE)) { > + device_scope =
svn commit: r349192 - head/sys/netinet/tcp_stacks
Author: jtl Date: Wed Jun 19 13:55:00 2019 New Revision: 349192 URL: https://svnweb.freebsd.org/changeset/base/349192 Log: Add the ability to limit how much the code will fragment the RACK send map in response to SACKs. The default behavior is unchanged; however, the limit can be activated by changing the new net.inet.tcp.rack.split_limit sysctl. Submitted by: Peter Lei Reported by: jtl Reviewed by: lstewart (earlier version) Security: CVE-2019-5599 Modified: head/sys/netinet/tcp_stacks/rack.c head/sys/netinet/tcp_stacks/tcp_rack.h Modified: head/sys/netinet/tcp_stacks/rack.c == --- head/sys/netinet/tcp_stacks/rack.c Wed Jun 19 13:33:34 2019 (r349191) +++ head/sys/netinet/tcp_stacks/rack.c Wed Jun 19 13:55:00 2019 (r349192) @@ -1,5 +1,5 @@ /*- - * Copyright (c) 2016-2018 Netflix, Inc. + * Copyright (c) 2016-2019 Netflix, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -202,6 +202,7 @@ static int32_t rack_always_send_oldest = 0; static int32_t rack_sack_block_limit = 128; static int32_t rack_use_sack_filter = 1; static int32_t rack_tlp_threshold_use = TLP_USE_TWO_ONE; +static uint32_t rack_map_split_limit = 0; /* unlimited by default */ /* Rack specific counters */ counter_u64_t rack_badfr; @@ -227,6 +228,8 @@ counter_u64_t rack_to_arm_tlp; counter_u64_t rack_to_alloc; counter_u64_t rack_to_alloc_hard; counter_u64_t rack_to_alloc_emerg; +counter_u64_t rack_alloc_limited_conns; +counter_u64_t rack_split_limited; counter_u64_t rack_sack_proc_all; counter_u64_t rack_sack_proc_short; @@ -260,6 +263,8 @@ static void rack_ack_received(struct tcpcb *tp, struct tcp_rack *rack, struct tcphdr *th, uint16_t nsegs, uint16_t type, int32_t recovery); static struct rack_sendmap *rack_alloc(struct tcp_rack *rack); +static struct rack_sendmap *rack_alloc_limit(struct tcp_rack *rack, +uint8_t limit_type); static struct rack_sendmap * rack_check_recovery_mode(struct tcpcb *tp, uint32_t 
tsused); @@ -444,6 +449,8 @@ sysctl_rack_clear(SYSCTL_HANDLER_ARGS) counter_u64_zero(rack_sack_proc_short); counter_u64_zero(rack_sack_proc_restart); counter_u64_zero(rack_to_alloc); + counter_u64_zero(rack_alloc_limited_conns); + counter_u64_zero(rack_split_limited); counter_u64_zero(rack_find_high); counter_u64_zero(rack_runt_sacks); counter_u64_zero(rack_used_tlpmethod); @@ -621,6 +628,11 @@ rack_init_sysctls() OID_AUTO, "pktdelay", CTLFLAG_RW, _pkt_delay, 1, "Extra RACK time (in ms) besides reordering thresh"); + SYSCTL_ADD_U32(_sysctl_ctx, + SYSCTL_CHILDREN(rack_sysctl_root), + OID_AUTO, "split_limit", CTLFLAG_RW, + _map_split_limit, 0, + "Is there a limit on the number of map split entries (0=unlimited)"); SYSCTL_ADD_S32(_sysctl_ctx, SYSCTL_CHILDREN(rack_sysctl_root), OID_AUTO, "inc_var", CTLFLAG_RW, @@ -756,7 +768,19 @@ rack_init_sysctls() SYSCTL_CHILDREN(rack_sysctl_root), OID_AUTO, "allocemerg", CTLFLAG_RD, _to_alloc_emerg, - "Total alocations done from emergency cache"); + "Total allocations done from emergency cache"); + rack_alloc_limited_conns = counter_u64_alloc(M_WAITOK); + SYSCTL_ADD_COUNTER_U64(_sysctl_ctx, + SYSCTL_CHILDREN(rack_sysctl_root), + OID_AUTO, "alloc_limited_conns", CTLFLAG_RD, + _alloc_limited_conns, + "Connections with allocations dropped due to limit"); + rack_split_limited = counter_u64_alloc(M_WAITOK); + SYSCTL_ADD_COUNTER_U64(_sysctl_ctx, + SYSCTL_CHILDREN(rack_sysctl_root), + OID_AUTO, "split_limited", CTLFLAG_RD, + _split_limited, + "Split allocations dropped due to limit"); rack_sack_proc_all = counter_u64_alloc(M_WAITOK); SYSCTL_ADD_COUNTER_U64(_sysctl_ctx, SYSCTL_CHILDREN(rack_sysctl_root), @@ -1120,10 +1144,11 @@ rack_alloc(struct tcp_rack *rack) { struct rack_sendmap *rsm; - counter_u64_add(rack_to_alloc, 1); - rack->r_ctl.rc_num_maps_alloced++; rsm = uma_zalloc(rack_zone, M_NOWAIT); if (rsm) { +alloc_done: + counter_u64_add(rack_to_alloc, 1); + rack->r_ctl.rc_num_maps_alloced++; return (rsm); } if (rack->rc_free_cnt) { @@ 
-1131,14 +1156,46 @@ rack_alloc(struct tcp_rack *rack) rsm = TAILQ_FIRST(>r_ctl.rc_free); TAILQ_REMOVE(>r_ctl.rc_free, rsm, r_next); rack->rc_free_cnt--; - return (rsm); + goto alloc_done; } return (NULL); }
svn commit: r349190 - head/sys/kern
Author: mav
Date: Wed Jun 19 13:30:50 2019
New Revision: 349190
URL: https://svnweb.freebsd.org/changeset/base/349190

Log:
  Fix typo in r349178.

  Reported by:	ae
  MFC after:	1 week

Modified:
  head/sys/kern/subr_sbuf.c

Modified: head/sys/kern/subr_sbuf.c
==============================================================================
--- head/sys/kern/subr_sbuf.c	Wed Jun 19 13:19:36 2019	(r349189)
+++ head/sys/kern/subr_sbuf.c	Wed Jun 19 13:30:50 2019	(r349190)
@@ -342,7 +342,7 @@ sbuf_setpos(struct sbuf *s, ssize_t pos)
 }
 
 /*
- * Drain into a counter.  Counts amount of data without prodicing output.
+ * Drain into a counter.  Counts amount of data without producing output.
  * Useful for cases like sysctl, where user may first request only size.
  * This allows to avoid pointless allocation/freeing of large buffers.
  */
svn commit: r349188 - head/stand/ofw/libofw
Author: luporl
Date: Wed Jun 19 11:37:43 2019
New Revision: 349188
URL: https://svnweb.freebsd.org/changeset/base/349188

Log:
  [PPC] Fix loader input with newer QEMU versions

  At least since version 4.0.0, QEMU became bug-compatible with PowerVM's
  vty, by inserting a \0 after every \r. As this confuses loader's
  interpreter and as a \0 coming from the console doesn't seem reasonable,
  it's now being filtered at OFW console input.

  Reviewed by:	jhibbits
  MFC after:	2 weeks
  Differential Revision:	https://reviews.freebsd.org/D20676

Modified:
  head/stand/ofw/libofw/ofw_console.c

Modified: head/stand/ofw/libofw/ofw_console.c
==============================================================================
--- head/stand/ofw/libofw/ofw_console.c	Wed Jun 19 11:22:09 2019	(r349187)
+++ head/stand/ofw/libofw/ofw_console.c	Wed Jun 19 11:37:43 2019	(r349188)
@@ -97,7 +97,11 @@ ofw_cons_getchar()
 		return l;
 	}
 
-	if (OF_read(stdin, &ch, 1) > 0)
+	/* At least since version 4.0.0, QEMU became bug-compatible
+	 * with PowerVM's vty, by inserting a \0 after every \r.
+	 * As this confuses loader's interpreter and as a \0 coming
+	 * from the console doesn't seem reasonable, it's filtered here. */
+	if (OF_read(stdin, &ch, 1) > 0 && ch != '\0')
 		return (ch);
 
 	return (-1);
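The effect of the extra check is easy to see with an in-memory console. In the sketch below, struct fake_console and fake_read() are hypothetical stand-ins for the Open Firmware stdin handle and OF_read(); only the shape of the final condition is taken from the diff.

```c
#include <stddef.h>

/* Hypothetical in-memory console standing in for OF_read() on stdin. */
struct fake_console {
	const char *buf;
	size_t len;
	size_t pos;
};

/* Read one byte; returns 1 if a byte was read and 0 if none is available. */
static int
fake_read(struct fake_console *c, char *ch)
{
	if (c->pos >= c->len)
		return (0);
	*ch = c->buf[c->pos++];
	return (1);
}

/*
 * Return the next character, or -1 for "nothing to deliver".  A NUL byte
 * is consumed and reported as -1, the same as an empty console.
 */
static int
cons_getchar(struct fake_console *c)
{
	char ch;

	if (fake_read(c, &ch) > 0 && ch != '\0')
		return ((unsigned char)ch);
	return (-1);
}
```

Feeding it the bytes 'a', '\r', '\0', 'b' yields 'a', '\r', -1, 'b': the caller sees the NUL only as a poll that returned nothing.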
svn commit: r349187 - head/share/misc
Author: sevan (doc committer) Date: Wed Jun 19 11:22:09 2019 New Revision: 349187 URL: https://svnweb.freebsd.org/changeset/base/349187 Log: Whitespace Modified: head/share/misc/bsd-family-tree Modified: head/share/misc/bsd-family-tree == --- head/share/misc/bsd-family-tree Wed Jun 19 08:49:24 2019 (r349186) +++ head/share/misc/bsd-family-tree Wed Jun 19 11:22:09 2019 (r349187) @@ -372,7 +372,7 @@ FreeBSD 5.2 | | | | | | 10.13| ||OpenBSD 6.1 | | FreeBSD | | | ||| DragonFly 5.0.0 | 11.1 FreeBSD| | ||| | - | |10.4 | | ||OpenBSD 6.2 DragonFly 5.0.1 + | |10.4 | | ||OpenBSD 6.2 DragonFly 5.0.1 | | | | ||| | | `--. | | | NetBSD | DragonFly 5.0.2 || | | | 7.1.1 | | @@ -381,7 +381,7 @@ FreeBSD 5.2 | | | || | | | 7.1.2 `--.| || | | ||| || | | `-. OpenBSD 6.3 | - || | *--NetBSD | | DragonFly 5.2.0 + || | *--NetBSD | |DragonFly 5.2.0 || | | 8.0 | || || | | | | |DragonFly 5.2.1 || | | | | || ___ svn-src-head@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
svn commit: r349186 - head/sys/net
Author: zec
Date: Wed Jun 19 08:49:24 2019
New Revision: 349186
URL: https://svnweb.freebsd.org/changeset/base/349186

Log:
  V_ip6_forwarding and V_ipforwarding have been defined in ip6_var.h /
  ip_var.h since at least 2008, so make use of those definitions here.

  MFC after: 3 days

Modified:
  head/sys/net/iflib.c

Modified: head/sys/net/iflib.c
==============================================================================
--- head/sys/net/iflib.c	Wed Jun 19 08:39:19 2019	(r349185)
+++ head/sys/net/iflib.c	Wed Jun 19 08:49:24 2019	(r349186)
@@ -2688,10 +2688,10 @@ iflib_get_ip_forwarding(struct lro_ctrl *lc, bool *v4,
 {
 	CURVNET_SET(lc->ifp->if_vnet);
 #if defined(INET6)
-	*v6 = VNET(ip6_forwarding);
+	*v6 = V_ip6_forwarding;
 #endif
 #if defined(INET)
-	*v4 = VNET(ipforwarding);
+	*v4 = V_ipforwarding;
 #endif
 	CURVNET_RESTORE();
 }
svn commit: r349185 - head/sys/net
Author: zec
Date: Wed Jun 19 08:39:19 2019
New Revision: 349185
URL: https://svnweb.freebsd.org/changeset/base/349185

Log:
  Evaluating htons() at compile time is more efficient than doing ntohs()
  at runtime. This change removes a dependency on a barrel shifter pass
  before branch resolution, while reducing the instruction stream size by
  9 bytes on amd64.

  MFC after: 3 days

Modified:
  head/sys/net/iflib.c

Modified: head/sys/net/iflib.c
==============================================================================
--- head/sys/net/iflib.c	Wed Jun 19 06:41:07 2019	(r349184)
+++ head/sys/net/iflib.c	Wed Jun 19 08:39:19 2019	(r349185)
@@ -2705,18 +2705,16 @@ static bool
 iflib_check_lro_possible(struct mbuf *m, bool v4_forwarding, bool v6_forwarding)
 {
 	struct ether_header *eh;
-	uint16_t eh_type;
 
 	eh = mtod(m, struct ether_header *);
-	eh_type = ntohs(eh->ether_type);
-	switch (eh_type) {
+	switch (eh->ether_type) {
 #if defined(INET6)
-		case ETHERTYPE_IPV6:
-			return !v6_forwarding;
+		case htons(ETHERTYPE_IPV6):
+			return (!v6_forwarding);
 #endif
 #if defined (INET)
-		case ETHERTYPE_IP:
-			return !v4_forwarding;
+		case htons(ETHERTYPE_IP):
+			return (!v4_forwarding);
 #endif
 	}
 
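The idea in r349185, swapping the constants at compile time instead of swapping the packet field at run time, can be sketched in portable userland C. This is an illustration under stated assumptions, not the iflib code: a userland htons() is not guaranteed to be usable in a case label, so the sketch defines its own hypothetical HTONS_CONST macro, and it relies on the __BYTE_ORDER__ macros that GCC and Clang predefine. The EX_ETHERTYPE_* names, classify_wire_ethertype and load_wire16 are likewise made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* EtherType values in host order (0x0800 is IPv4, 0x86dd is IPv6). */
#define EX_ETHERTYPE_IP		0x0800
#define EX_ETHERTYPE_IPV6	0x86dd

/* Byte swap written as an integer constant expression. */
#define SWAP16_CONST(x) \
	((uint16_t)((((x) & 0xffu) << 8) | (((x) >> 8) & 0xffu)))

/* Assumption: the compiler predefines __BYTE_ORDER__ (GCC and Clang do). */
#if !defined(__BYTE_ORDER__)
#error "this sketch needs the __BYTE_ORDER__ predefined macro"
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define HTONS_CONST(x)	((uint16_t)(x))
#else
#define HTONS_CONST(x)	SWAP16_CONST(x)
#endif

/* Loads the 16-bit type field from two wire bytes without converting it. */
static uint16_t
load_wire16(const unsigned char *bytes)
{
	uint16_t v;

	assert(bytes != NULL);
	memcpy(&v, bytes, sizeof(v));
	return (v);
}

/*
 * Returns 4 for IPv4, 6 for IPv6 and 0 otherwise. wire_type is still in
 * network byte order; only the case labels were converted, at compile time.
 */
static int
classify_wire_ethertype(uint16_t wire_type)
{
	switch (wire_type) {
	case HTONS_CONST(EX_ETHERTYPE_IP):
		return (4);
	case HTONS_CONST(EX_ETHERTYPE_IPV6):
		return (6);
	default:
		return (0);
	}
}
```

The field read from the frame is compared as-is, so no byte swap remains on the run-time path. Whether that yields a measurable gain depends on the compiler and target; the 9-byte figure in the log applies to the committed amd64 change, not to this sketch.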
svn commit: r349184 - head/sys/amd64/vmm/intel
Author: scottl
Date: Wed Jun 19 06:41:07 2019
New Revision: 349184
URL: https://svnweb.freebsd.org/changeset/base/349184

Log:
  Implement VT-d capability detection on chipsets that have multiple
  translation units with differing capabilities

  From the author via Bugzilla:
  ---
  When an attempt is made to passthrough a PCI device to a bhyve VM
  (causing initialisation of IOMMU) on certain Intel chipsets using
  VT-d the PCI bus stops working entirely. This issue occurs on the
  E3-1275 v5 processor on C236 chipset and has also been encountered
  by others on the forums with different hardware in the Skylake
  series.

  The chipset has two VT-d translation units. The issue is caused by
  an attempt to use the VT-d device-IOTLB capability that is
  supported by only the first unit for devices attached to the second
  unit which lacks that capability. Only the capabilities of the first
  unit are checked and are assumed to be the same for all units.

  Attached is a patch to rectify this issue by determining which unit
  is responsible for the device being added to a domain and then
  checking that unit's device-IOTLB capability. In addition to this a
  few fixes have been made to other instances where the first unit's
  capabilities are assumed for all units for domains they share. In
  these cases a mutual set of capabilities is determined. The patch
  should hopefully fix any bugs for current/future hardware with
  multiple translation units supporting different capabilities.

  A description is on the forums at
  https://forums.freebsd.org/threads/pci-passthrough-bhyve-usb-xhci.65235
  The thread includes observations by other users of the bug
  occurring, and description as well as confirmation of the fix.
  I'd also like to thank Ordoban for their help.
  ---

  Personally tested on a Skylake laptop, Skylake Xeon server, and
  a Xeon-D-1541, passing through XHCI and NVMe functions. Passthru
  is hit-or-miss to the point of being unusable without this
  patch.

  PR: 229852
  Submitted by: cal...@aitchison.org
  MFC after: 1 week

Modified:
  head/sys/amd64/vmm/intel/vtd.c

Modified: head/sys/amd64/vmm/intel/vtd.c
==============================================================================
--- head/sys/amd64/vmm/intel/vtd.c	Wed Jun 19 03:33:00 2019	(r349183)
+++ head/sys/amd64/vmm/intel/vtd.c	Wed Jun 19 06:41:07 2019	(r349184)
@@ -51,6 +51,8 @@ __FBSDID("$FreeBSD$");
  * Architecture Spec, September 2008.
  */
 
+#define VTD_DRHD_INCLUDE_PCI_ALL(Flags) (((Flags) >> 0) & 0x1)
+
 /* Section 10.4 "Register Descriptions" */
 struct vtdmap {
 	volatile uint32_t	version;
@@ -116,10 +118,11 @@ struct domain {
 static SLIST_HEAD(, domain) domhead;
 
 #define	DRHD_MAX_UNITS	8
-static int		drhd_num;
-static struct vtdmap	*vtdmaps[DRHD_MAX_UNITS];
-static int		max_domains;
-typedef int		(*drhd_ident_func_t)(void);
+static ACPI_DMAR_HARDWARE_UNIT	*drhds[DRHD_MAX_UNITS];
+static int			drhd_num;
+static struct vtdmap		*vtdmaps[DRHD_MAX_UNITS];
+static int			max_domains;
+typedef int			(*drhd_ident_func_t)(void);
 
 static uint64_t root_table[PAGE_SIZE / sizeof(uint64_t)] __aligned(4096);
 static uint64_t ctx_tables[256][PAGE_SIZE / sizeof(uint64_t)] __aligned(4096);
@@ -175,6 +178,69 @@ domain_id(void)
 	return (id);
 }
 
+static struct vtdmap *
+vtd_device_scope(uint16_t rid)
+{
+	int i, remaining, pathremaining;
+	char *end, *pathend;
+	struct vtdmap *vtdmap;
+	ACPI_DMAR_HARDWARE_UNIT *drhd;
+	ACPI_DMAR_DEVICE_SCOPE *device_scope;
+	ACPI_DMAR_PCI_PATH *path;
+
+	for (i = 0; i < drhd_num; i++) {
+		drhd = drhds[i];
+
+		if (VTD_DRHD_INCLUDE_PCI_ALL(drhd->Flags)) {
+			/*
+			 * From Intel VT-d arch spec, version 3.0:
+			 * If a DRHD structure with INCLUDE_PCI_ALL flag Set is reported
+			 * for a Segment, it must be enumerated by BIOS after all other
+			 * DRHD structures for the same Segment.
+			 */
+			vtdmap = vtdmaps[i];
+			return(vtdmap);
+		}
+
+		end = (char *)drhd + drhd->Header.Length;
+		remaining = drhd->Header.Length - sizeof(ACPI_DMAR_HARDWARE_UNIT);
+		while (remaining > sizeof(ACPI_DMAR_DEVICE_SCOPE)) {
+			device_scope = (ACPI_DMAR_DEVICE_SCOPE *)(end - remaining);
+			remaining -= device_scope->Length;
+
+			switch (device_scope->EntryType){
+				/* 0x01 and 0x02 are PCI device entries */
+				case 0x01:
+				case 0x02:
+					break;
+				default: