Re: fix snmpd reporting of IP-MIB::ipForwarding.0
On Wed, Oct 07, 2015 at 10:39:17PM +0100, Stuart Henderson wrote: > IP-MIB::ipForwarding.0 should return one of these values: > > ipForwarding OBJECT-TYPE > SYNTAX INTEGER { > forwarding(1),-- acting as a router > notForwarding(2) -- NOT acting as a router >} > > Currently we directly return the value of sysctl net.inet.ip.forwarding > which is incorrect. > > OK to fix it like this? > See one comment comment below, otherwise OK. > Index: mib.c > === > RCS file: /cvs/src/usr.sbin/snmpd/mib.c,v > retrieving revision 1.76 > diff -u -p -r1.76 mib.c > --- mib.c 10 Jun 2015 10:03:59 - 1.76 > +++ mib.c 7 Oct 2015 21:34:27 - > @@ -2984,7 +2984,7 @@ mib_ipforwarding(struct oid *oid, struct > if (sysctl(mib, sizeofa(mib), , , NULL, 0) == -1) > return (-1); > > - *elm = ber_add_integer(*elm, v); We don't handle types in snmpd very well, but I'd at least like to have a comment: /* ipForwarding: forwarding(1), notForwarding(2) */ > + *elm = ber_add_integer(*elm, v ? 1 : 2); > > return (0); > } > --
Re: Bulkget & snmpd
On 2015/10/08 01:02, Stuart Henderson wrote: > > $ snmpbulkget -Cn2 -Cr4 -v2c -c public 127.0.0.1 SNMPv2-MIB::sysDescr > > IP-MIB::ipForwarding IF-MIB::ifName IF-MIB::ifInOctets > > SNMPv2-MIB::sysDescr.0 = STRING: some host description > > IP-MIB::ipForwarding.0 = INTEGER: notForwarding(2) > > IF-MIB::ifName.1 = STRING: em0 > > IF-MIB::ifName.2 = STRING: em1 > > IF-MIB::ifName.3 = STRING: enc0 > > IF-MIB::ifName.4 = STRING: lo0 > > IF-MIB::ifInOctets.1 = Counter32: 4019922211 > > IF-MIB::ifInOctets.2 = Counter32: 0 > > IF-MIB::ifInOctets.3 = Counter32: 0 > > IF-MIB::ifInOctets.4 = Counter32: 1433448428 > > > > (the answers don't have to be in the same order as the query, I have > > just formatted them that way for the example to make it clearer). > > O. And now I find Gerhard Roth's post > > https://marc.info/?l=openbsd-tech=143375327425321=2 > > In a nutshell: the mps_getbulkreq() calls subsequent to the first one > link the elements to the *first* element of the previous call, not the > last element. Gerhard's diff passes in the address of the last element > so it can be linked correctly. > > I'll look at it again tomorrow but that looks correct for the -Cr case, > nonreps and I think also len counting are still a problem but I'll look > at those again on top of Gerhard's diff. Gerhard's diff (repeated below) is correct for -Cr. Any OKs to commit it? $ snmpbulkget -Cr4 -v2c -c public 127.0.0.1 SNMPv2-MIB::sysDescr IP-MIB::ipForwarding IF-MIB::ifName IF-MIB::ifInOctets before: SNMPv2-MIB::sysDescr.0 = STRING: sym IP-MIB::ipForwarding.0 = INTEGER: 0 IF-MIB::ifName.1 = STRING: em0 IF-MIB::ifInOctets.1 = Counter32: 2305271272 IF-MIB::ifInOctets.2 = Counter32: 0 IF-MIB::ifInOctets.3 = Counter32: 0 IF-MIB::ifInOctets.4 = Counter32: 2091665909 after: SNMPv2-MIB::sysDescr.0 = STRING: sym SNMPv2-MIB::sysObjectID.0 = OID: OPENBSD-BASE-MIB::openBSDDefaultObjectID SNMPv2-MIB::sysUpTime.0 = Timeticks: (10069) 0:01:40.69 SNMPv2-MIB::sysContact.0 = STRING: r...@symphytum.spacehopper.org IP-MIB::ipForwarding.0 = INTEGER: 0 IP-MIB::ipDefaultTTL.0 = INTEGER: 64 IP-MIB::ipInReceives.0 = Counter32: 11634146 IP-MIB::ipInHdrErrors.0 = Counter32: 2 IF-MIB::ifName.1 = STRING: em0 IF-MIB::ifName.2 = STRING: em1 IF-MIB::ifName.3 = STRING: enc0 IF-MIB::ifName.4 = STRING: lo0 IF-MIB::ifInOctets.1 = Counter32: 2305182489 IF-MIB::ifInOctets.2 = Counter32: 0 IF-MIB::ifInOctets.3 = Counter32: 0 IF-MIB::ifInOctets.4 = Counter32: 2091496232 Index: mps.c === RCS file: /cvs/src/usr.sbin/snmpd/mps.c,v retrieving revision 1.21 diff -u -p -r1.21 mps.c --- mps.c 18 Jul 2015 16:54:43 - 1.21 +++ mps.c 8 Oct 2015 07:37:54 - @@ -300,7 +300,7 @@ fail: int mps_getbulkreq(struct snmp_message *msg, struct ber_element **root, -struct ber_oid *o, int max) +struct ber_element **end, struct ber_oid *o, int max) { struct ber_element *c, *d, *e; size_t len; @@ -308,14 +308,17 @@ mps_getbulkreq(struct snmp_message *msg, j = max; c = *root; + *end = NULL; for (d = NULL, len = 0; j > 0; j--) { e = ber_add_sequence(NULL); if (c == NULL) c = e; ret = mps_getnextreq(msg, e, o); - if (ret == 1) + if (ret == 1) { + *root = c; return (1); + } if (ret == -1) { ber_free_elements(e); if (d == NULL) @@ -330,6 +333,7 @@ mps_getbulkreq(struct snmp_message *msg, if (d != NULL) ber_link_elements(d, e); d = e; + *end = d; } *root = c; Index: snmpd.h === RCS file: /cvs/src/usr.sbin/snmpd/snmpd.h,v retrieving revision 1.61 diff -u -p -r1.61 snmpd.h --- snmpd.h 5 Oct 2015 15:29:14 - 1.61 +++ snmpd.h 8 Oct 2015 07:37:54 - @@ -397,6 +397,7 @@ struct snmp_message { struct ber_element *sm_c; struct ber_element *sm_next; struct ber_element *sm_last; + struct ber_element *sm_end; u_int8_t sm_data[READ_BUF_SIZE]; size_t sm_datalen; @@ -639,7 +640,7 @@ int mps_getreq(struct snmp_message *, int mps_getnextreq(struct snmp_message *, struct ber_element *, struct ber_oid *); int mps_getbulkreq(struct snmp_message *, struct ber_element **, - struct ber_oid *, int); + struct ber_element **, struct ber_oid *, int); int mps_setreq(struct snmp_message *, struct ber_element *, struct ber_oid *); int mps_set(struct ber_oid *, void *, long long); Index: snmpe.c
Re: unlock bge(4)
On 05/10/15(Mon) 12:42, Jonathan Matthew wrote: > This brings bge(4) up to about where em(4) is. It involves a few different > changes, notably: > - per-ring refill timeouts, to ensure we don't try to refill a ring from the > timeout and an interrupt at the same time > - removing the list of tx dma maps and just assigning a map to use based > on the current ring slot number (this is why it's more - than +) > - using atomics to adjust bge_txcnt, saving those adjustments until the > end of the tx/txeof loops, and only acting on the adjusted value > - not adding any mutexes > > I've tested it on amd64 with these: > > bge0 at pci1 dev 0 function 0 "Broadcom BCM5721" rev 0x21, BCM5750 C1 > (0x4201): msi, address 00:18:f3:d1:80:64 > > bge0 at pci2 dev 2 function 0 "Broadcom BCM5703X" rev 0x02, BCM5702/5703 A2 > (0x1002): apic 3 int 1, address 00:09:3d:00:84:d1 > > and on sparc64: > > bge0 at pci7 dev 4 function 0 "Broadcom BCM5714" rev 0xa3, BCM5715 A3 > (0x9003): ivec 0x795, address 00:14:4f:00:5a:5a > > On the sparc64 box (a v245), with if_input_process unlocked, this gets me > 10-20% > more pps or about 100mbps more in tcpbench (550, up from 450). > > ok? ok mpi@ > > Index: if_bgereg.h > === > RCS file: /cvs/src/sys/dev/pci/if_bgereg.h,v > retrieving revision 1.127 > diff -u -p -u -p -r1.127 if_bgereg.h > --- if_bgereg.h 11 Sep 2015 13:02:28 - 1.127 > +++ if_bgereg.h 30 Sep 2015 11:14:09 - > @@ -2830,11 +2830,6 @@ struct bge_type { > #define BGE_TIMEOUT 10 > #define BGE_TXCONS_UNSET0x /* impossible value */ > > -struct txdmamap_pool_entry { > - bus_dmamap_t dmamap; > - SLIST_ENTRY(txdmamap_pool_entry) link; > -}; > - > #define ASF_ENABLE 1 > #define ASF_NEW_HANDSHAKE 2 > #define ASF_STACKUP 4 > @@ -2934,11 +2929,11 @@ struct bge_softc { > int bge_txcnt; > struct timeout bge_timeout; > struct timeout bge_rxtimeout; > + struct timeout bge_rxtimeout_jumbo; > u_int32_t bge_rx_discards; > u_int32_t bge_tx_discards; > u_int32_t bge_rx_inerrors; > u_int32_t bge_rx_overruns; > u_int32_t bge_tx_collisions; > - SLIST_HEAD(, txdmamap_pool_entry) txdma_list; > - struct txdmamap_pool_entry *txdma[BGE_TX_RING_CNT]; > + bus_dmamap_tbge_txdma[BGE_TX_RING_CNT]; > }; > Index: if_bge.c > === > RCS file: /cvs/src/sys/dev/pci/if_bge.c,v > retrieving revision 1.369 > diff -u -p -u -p -r1.369 if_bge.c > --- if_bge.c 19 Jul 2015 06:28:12 - 1.369 > +++ if_bge.c 5 Oct 2015 01:06:21 - > @@ -141,7 +141,7 @@ void bge_tick(void *); > void bge_stats_update(struct bge_softc *); > void bge_stats_update_regs(struct bge_softc *); > int bge_cksum_pad(struct mbuf *); > -int bge_encap(struct bge_softc *, struct mbuf *, u_int32_t *); > +int bge_encap(struct bge_softc *, struct mbuf *, int *); > int bge_compact_dma_runt(struct mbuf *); > > int bge_intr(void *); > @@ -1262,20 +1262,31 @@ uncreate: > return (1); > } > > +/* > + * When the refill timeout for a ring is active, that ring is so empty > + * that no more packets can be received on it, so the interrupt handler > + * will not attempt to refill it, meaning we don't need to protect against > + * interrupts here. > + */ > + > void > bge_rxtick(void *arg) > { > struct bge_softc *sc = arg; > - int s; > > - s = splnet(); > if (ISSET(sc->bge_flags, BGE_RXRING_VALID) && > if_rxr_inuse(>bge_std_ring) <= 8) > bge_fill_rx_ring_std(sc); > +} > + > +void > +bge_rxtick_jumbo(void *arg) > +{ > + struct bge_softc *sc = arg; > + > if (ISSET(sc->bge_flags, BGE_JUMBO_RXRING_VALID) && > if_rxr_inuse(>bge_jumbo_ring) <= 8) > bge_fill_rx_ring_jumbo(sc); > - splx(s); > } > > void > @@ -1410,7 +1421,7 @@ bge_fill_rx_ring_jumbo(struct bge_softc >* that now, then try again later. >*/ > if (if_rxr_inuse(>bge_jumbo_ring) <= 8) > - timeout_add(>bge_rxtimeout, 1); > + timeout_add(>bge_rxtimeout_jumbo, 1); > } > > void > @@ -1446,7 +1457,6 @@ void > bge_free_tx_ring(struct bge_softc *sc) > { > int i; > - struct txdmamap_pool_entry *dma; > > if (!(sc->bge_flags & BGE_TXRING_VALID)) > return; > @@ -1455,18 +1465,12 @@ bge_free_tx_ring(struct bge_softc *sc) > if (sc->bge_cdata.bge_tx_chain[i] != NULL) { > m_freem(sc->bge_cdata.bge_tx_chain[i]); > sc->bge_cdata.bge_tx_chain[i] = NULL; > - SLIST_INSERT_HEAD(>txdma_list, sc->txdma[i], > - link); >
link state change madness
Recent NFS-related rtisvalid(9) regressions turns out to be related to the use of DOWN RTF_CLONED route entries. Such entries are DOWN because they are cloned from a DOWN RTF_CLONING entry. While investigating this race I figured out that we have different code paths messing with link-state change. Diff below unify them all. As a result two #ifdef chunks dies because dohook() is now called. ok? Index: net/if.c === RCS file: /cvs/src/sys/net/if.c,v retrieving revision 1.385 diff -u -p -r1.385 if.c --- net/if.c5 Oct 2015 19:05:09 - 1.385 +++ net/if.c8 Oct 2015 09:42:01 - @@ -137,13 +137,14 @@ int if_getgroupmembers(caddr_t); intif_getgroupattribs(caddr_t); intif_setgroupattribs(caddr_t); +void if_linkstate(void *); + intif_clone_list(struct if_clonereq *); struct if_clone*if_clone_lookup(const char *, int *); intif_group_egress_build(void); void if_watchdog_task(void *); -void if_link_state_change_task(void *); void if_input_process(void *); @@ -408,7 +409,7 @@ if_attachsetup(struct ifnet *ifp) timeout_set(ifp->if_slowtimo, if_slowtimo, ifp); if_slowtimo(ifp); - task_set(ifp->if_linkstatetask, if_link_state_change_task, ifp); + task_set(ifp->if_linkstatetask, if_linkstate, ifp); if_idxmap_insert(ifp); KASSERT(if_get(0) == NULL); @@ -1385,7 +1386,6 @@ if_downall(void) /* * Mark an interface down and notify protocols of * the transition. - * NOTE: must be called at splsoftnet or equivalent. */ void if_down(struct ifnet *ifp) @@ -1400,24 +1400,13 @@ if_down(struct ifnet *ifp) pfctlinput(PRC_IFDOWN, ifa->ifa_addr); } IFQ_PURGE(>if_snd); -#if NCARP > 0 - if (ifp->if_carp) - carp_carpdev_state(ifp); -#endif -#if NBRIDGE > 0 - if (ifp->if_bridgeport) - bstp_ifstate(ifp); -#endif - rt_ifmsg(ifp); -#ifndef SMALL_KERNEL - rt_if_track(ifp); -#endif + + if_linkstate(ifp); } /* * Mark an interface up and notify protocols of * the transition. - * NOTE: must be called at splsoftnet or equivalent. */ void if_up(struct ifnet *ifp) @@ -1426,42 +1415,24 @@ if_up(struct ifnet *ifp) ifp->if_flags |= IFF_UP; microtime(>if_lastchange); -#if NCARP > 0 - if (ifp->if_carp) - carp_carpdev_state(ifp); -#endif -#if NBRIDGE > 0 - if (ifp->if_bridgeport) - bstp_ifstate(ifp); -#endif - rt_ifmsg(ifp); + #ifdef INET6 /* Userland expects the kernel to set ::1 on lo0. */ if (ifp == lo0ifp) in6_ifattach(ifp); #endif -#ifndef SMALL_KERNEL - rt_if_track(ifp); -#endif -} -/* - * Schedule a link state change task. - */ -void -if_link_state_change(struct ifnet *ifp) -{ - /* put the routing table update task on systq */ - task_add(systq, ifp->if_linkstatetask); + if_linkstate(ifp); } /* - * Process a link state change. + * Notify userland, the routing table and hooks owner of + * a link-state transition. */ void -if_link_state_change_task(void *arg) +if_linkstate(void *xifp) { - struct ifnet *ifp = arg; + struct ifnet *ifp = xifp; int s; s = splsoftnet(); @@ -1471,6 +1442,15 @@ if_link_state_change_task(void *arg) #endif dohooks(ifp->if_linkstatehooks, 0); splx(s); +} + +/* + * Schedule a link state change task. + */ +void +if_link_state_change(struct ifnet *ifp) +{ + task_add(systq, ifp->if_linkstatetask); } /*
tame signify
Without mucking about in the internals, here are some toplevel tame calls. Index: signify.c === RCS file: /cvs/src/usr.bin/signify/signify.c,v retrieving revision 1.100 diff -u -p -r1.100 signify.c --- signify.c 16 Jan 2015 06:16:12 - 1.100 +++ signify.c 8 Oct 2015 15:11:05 - @@ -663,6 +663,7 @@ main(int argc, char **argv) VERIFY } verb = NONE; + tame("stdio rpath wpath cpath tty", NULL); rounds = 42; @@ -721,6 +722,25 @@ main(int argc, char **argv) } argc -= optind; argv += optind; + + switch (verb) { + case GENERATE: + case SIGN: + /* keep it all */ + break; + case CHECK: + tame("stdio rpath", NULL); + break; + case VERIFY: + if (embedded) + tame("stdio rpath wpath cpath", NULL); + else + tame("stdio rpath", NULL); + break; + default: + tame("stdio", NULL); + break; + } #ifndef VERIFYONLY if (verb == CHECK) {
Re: tame signify
Ted Unangst wrote: > Without mucking about in the internals, here are some toplevel tame calls. check return values. ok, ok. in the fairly common verify case of piping msgfile to - (as in patching), we can cut things down a bit more as well. Index: signify.c === RCS file: /cvs/src/usr.bin/signify/signify.c,v retrieving revision 1.100 diff -u -p -r1.100 signify.c --- signify.c 16 Jan 2015 06:16:12 - 1.100 +++ signify.c 8 Oct 2015 15:29:35 - @@ -663,6 +663,8 @@ main(int argc, char **argv) VERIFY } verb = NONE; + if (tame("stdio rpath wpath cpath tty", NULL) == -1) + err(1, "tame"); rounds = 42; @@ -721,6 +723,30 @@ main(int argc, char **argv) } argc -= optind; argv += optind; + + switch (verb) { + case GENERATE: + case SIGN: + /* keep it all */ + break; + case CHECK: + if (tame("stdio rpath", NULL) == -1) + err(1, "tame"); + break; + case VERIFY: + if (embedded && (!msgfile || strcmp(msgfile, "-") != 0)) { + if (tame("stdio rpath wpath cpath", NULL) == -1) + err(1, "tame"); + } else { + if (tame("stdio rpath", NULL) == -1) + err(1, "tame"); + } + break; + default: + if (tame("stdio", NULL) == -1) + err(1, "tame"); + break; + } #ifndef VERIFYONLY if (verb == CHECK) {
ip6_output bcopy->memcpy
Hello - This diff converts some malloc/bcopy to malloc/memcpy. netinet6 didn't get the bcopy->memcpy overhaul like netinet did. Index: netinet6/ip6_output.c === RCS file: /cvs/src/sys/netinet6/ip6_output.c,v retrieving revision 1.189 diff -u -p -r1.189 ip6_output.c --- netinet6/ip6_output.c 23 Sep 2015 08:49:46 - 1.189 +++ netinet6/ip6_output.c 8 Oct 2015 15:09:01 - @@ -2189,7 +2189,7 @@ do {\ dst->type = malloc(hlen, M_IP6OPT, canwait);\ if (dst->type == NULL && canwait == M_NOWAIT)\ goto bad;\ - bcopy(src->type, dst->type, hlen);\ + memcpy(dst->type, src->type, hlen);\ }\ } while (/*CONSTCOND*/ 0) @@ -2211,7 +2211,7 @@ copypktopts(struct ip6_pktopts *dst, str M_IP6OPT, canwait); if (dst->ip6po_nexthop == NULL) goto bad; - bcopy(src->ip6po_nexthop, dst->ip6po_nexthop, + memcpy(dst->ip6po_nexthop, src->ip6po_nexthop, src->ip6po_nexthop->sa_len); } PKTOPT_EXTHDRCPY(ip6po_hbh); @@ -2847,7 +2847,7 @@ ip6_setpktopt(int optname, u_char *buf, opt->ip6po_nexthop = malloc(*buf, M_IP6OPT, M_NOWAIT); if (opt->ip6po_nexthop == NULL) return (ENOBUFS); - bcopy(buf, opt->ip6po_nexthop, *buf); + memcpy(opt->ip6po_nexthop, buf, *buf); break; case IPV6_2292HOPOPTS: @@ -2882,7 +2882,7 @@ ip6_setpktopt(int optname, u_char *buf, opt->ip6po_hbh = malloc(hbhlen, M_IP6OPT, M_NOWAIT); if (opt->ip6po_hbh == NULL) return (ENOBUFS); - bcopy(hbh, opt->ip6po_hbh, hbhlen); + memcpy(opt->ip6po_hbh, hbh, hbhlen); break; } @@ -2945,7 +2945,7 @@ ip6_setpktopt(int optname, u_char *buf, *newdest = malloc(destlen, M_IP6OPT, M_NOWAIT); if (*newdest == NULL) return (ENOBUFS); - bcopy(dest, *newdest, destlen); + memcpy(*newdest, dest, destlen); break; } @@ -2986,7 +2986,7 @@ ip6_setpktopt(int optname, u_char *buf, opt->ip6po_rthdr = malloc(rthlen, M_IP6OPT, M_NOWAIT); if (opt->ip6po_rthdr == NULL) return (ENOBUFS); - bcopy(rth, opt->ip6po_rthdr, rthlen); + memcpy(opt->ip6po_rthdr, rth, rthlen); break; }
fix ksh histfile owner
ksh does a little dance to try and gift history files to their original owner if it's somehow running as a different user. this of course only works as root, and is probably a terrible idea. ksh should simply refuse to open a history file that's owned by somebody else. Index: history.c === RCS file: /cvs/src/bin/ksh/history.c,v retrieving revision 1.45 diff -u -p -r1.45 history.c --- history.c 8 Oct 2015 15:54:59 - 1.45 +++ history.c 8 Oct 2015 16:00:10 - @@ -619,6 +619,7 @@ hist_init(Source *s) unsigned char *base; int lines; int fd; + struct stat sb; if (Flag(FTALKING) == 0) return; @@ -636,6 +637,10 @@ hist_init(Source *s) /* we have a file and are interactive */ if ((fd = open(hname, O_RDWR|O_CREAT|O_APPEND, 0600)) < 0) return; + if (fstat(fd, ) == -1 || sb.st_uid != getuid()) { + close(fd); + return; + } histfd = savefd(fd); if (histfd != fd) @@ -732,7 +737,6 @@ hist_shrink(unsigned char *oldbase, int { int fd; charnfile[1024]; - struct stat statb; unsigned char *nbase = oldbase; int nbytes = oldbytes; @@ -759,11 +763,6 @@ hist_shrink(unsigned char *oldbase, int unlink(nfile); return 1; } - /* -* worry about who owns this file -*/ - if (fstat(histfd, ) >= 0) - fchown(fd, statb.st_uid, statb.st_gid); close(fd); /*
Re: unlocking em - unable to fill any rx descriptors
On Thu, Oct 08, 2015 at 01:20 +0200, Hrvoje Popovski wrote: > Hi all, > > i have fairly simple setup with receiver connected to em2 and sender > connected to em3. Both em are Intel I350. Setup is without pf with these > sysctls: > > kern.pool_debug=1 >net.inet.ip.forwarding=1 > net.inet.ip.ifq.maxlen=8192 > ddb.console=1 > > with if_em.c revisions 1.307 and 1.306 i can trigger > em2: unable to fill any rx descriptors > when doing ifconfig em2 down/up (receiver side) while generating > traffic. i can't trigger this with ifconfig em3 down/up (sender side) or > destroying bridge and enabling it. this is reproducible. > > with bridged setup when doing ifconfig em2 down/up i'm getting rx > descriptors log and bridge stops bridging traffic until doing this: > stop generating traffic > ifconfig em2 down > ifconfig em3 down > ifconfig bridge0 destroy > ifconfig em2 up > ifconfig em3 up > sh netstart bridge0 > start generating traffic > > > with routed setup when doing ifconfig em2 down/up traffic is not > forwarded until > stop generating traffic > ifconfig em2 down > ifconfig em2 up > start generating traffic > > > with if_em.c revisions 1.305 and if_em.h revision 1.57 i can't trigger > rx descriptors log and bridge starts to bridge traffic almost instantly > when doing ifconfig em2 up > > > i am willing to debug this further but i don't know how... > Is it possible that em_reset_hw is called a bit too early while we don't have full control of the driver? Also it appears to me that while we're mi_switch'ing in the sched_barrier/sleep_finish another process can start running em_init since mi_switch will unlock all KERNEL_LOCKs that the process doing em_stop might hold. Now of course KASSERT would have fired if IFF_RUNNING would have been set, but perhaps we just haven't hit this yet. Shouldn't we do rwlock protection to make sure that while em_stop is not finished no other process can do em_stop and/or em_init? diff --git sys/dev/pci/if_em.c sys/dev/pci/if_em.c index f2ddb32..7b24dce 100644 --- sys/dev/pci/if_em.c +++ sys/dev/pci/if_em.c @@ -1559,13 +1559,14 @@ em_stop(void *arg, int softonly) timeout_del(>timer_handle); timeout_del(>tx_fifo_timer_handle); - if (!softonly) { + if (!softonly) em_disable_intr(sc); - em_reset_hw(>hw); - } intr_barrier(sc->sc_intrhand); + if (!softonly) + em_reset_hw(>hw); + KASSERT((ifp->if_flags & IFF_RUNNING) == 0); em_free_transmit_structures(sc);
Re: fix ksh histfile owner
> ksh does a little dance to try and gift history files to their original owner > if it's somehow running as a different user. this of course only works as > root, and is probably a terrible idea. When I found this with tame, I got worried about the implications of how ksh is opening the file in the first place. No *chown call should be present in ksh! > ksh should simply refuse to open a history file that's owned by somebody else. good policy. ok deraadt
mg(1) onlywind() line number anomaly
If you open mg with only the *scratch* buffer loaded (no file names given on command line), and press enter 5 times, take a note of the line number in the status bar then open theo mode (M-x theo). Then move back to the *scratch* buffer via switch-to-buffer (C-x b). You'll notice the line number of the *scratch* buffer is now 1, though the cursor is on line 5. theo mode is initally opened splitting the screen in two, then onlywind() is called. The bug is in onlywind(). This diff fixes this behavior. ok? -lum Index: window.c === RCS file: /cvs/src/usr.bin/mg/window.c,v retrieving revision 1.33 diff -u -p -u -p -r1.33 window.c --- window.c25 Mar 2015 20:53:31 - 1.33 +++ window.c8 Oct 2015 19:49:34 - @@ -167,6 +167,8 @@ onlywind(int f, int n) wp->w_bufp->b_doto = wp->w_doto; wp->w_bufp->b_markp = wp->w_markp; wp->w_bufp->b_marko = wp->w_marko; + wp->w_bufp->b_dotline = wp->w_dotline; + wp->w_bufp->b_markline = wp->w_markline; } free(wp); } @@ -178,6 +180,8 @@ onlywind(int f, int n) wp->w_bufp->b_doto = wp->w_doto; wp->w_bufp->b_markp = wp->w_markp; wp->w_bufp->b_marko = wp->w_marko; + wp->w_bufp->b_dotline = wp->w_dotline; + wp->w_bufp->b_markline = wp->w_markline; } free(wp); }
vattr_null() doesn't initialize va_filerev
Filesystem implementations depend on vattr_null() to initialize the fields in struct vattr, which is true for all the fields except va_filerev. It therefore is not set to VNOVAL as expected by the file system, but contains whatever was there on the stack. This causes VOP_GETATTR() on cd9660 and msdosfs vnodes to yield garbage for va_filerev. Index: vfs_subr.c === RCS file: /cvs/src/sys/kern/vfs_subr.c,v retrieving revision 1.235 diff -u -r1.235 vfs_subr.c --- vfs_subr.c 8 Oct 2015 08:41:58 - 1.235 +++ vfs_subr.c 8 Oct 2015 19:33:44 - @@ -305,7 +305,7 @@ vap->va_atime.tv_sec = vap->va_atime.tv_nsec = vap->va_mtime.tv_sec = vap->va_mtime.tv_nsec = vap->va_ctime.tv_sec = vap->va_ctime.tv_nsec = - vap->va_flags = vap->va_gen = VNOVAL; + vap->va_flags = vap->va_gen = vap->va_filerev = VNOVAL; vap->va_vaflags = 0; } cheers, natano
[patch] regress test fix: libc/db
Implement max file size constant in libc/db/dbtest regression test. Some /bin files read for testing are larger than SIZE_MAX causing tests to fail. Also change error for file too large from E2BIG to EFBIG. Feedback is very appreciated. Index: dbtest.c === RCS file: /cvs/src/regress/lib/libc/db/dbtest.c,v retrieving revision 1.13 diff -u -p -u -p -r1.13 dbtest.c --- dbtest.c6 Feb 2015 23:21:58 - 1.13 +++ dbtest.c9 Oct 2015 03:34:48 - @@ -45,6 +45,8 @@ #include +#define FILE_SIZE_MAX (512 * 1024) + enum S { COMMAND, COMPARE, GET, PUT, REMOVE, SEQ, SEQFLAG, KEY, DATA }; voidcompare(DBT *, DBT *); @@ -685,8 +687,8 @@ rfile(name, lenp) if ((fd = open(name, O_RDONLY, 0)) < 0 || fstat(fd, )) dberr("%s: %s\n", name, strerror(errno)); - if (sb.st_size > (off_t)SIZE_MAX) - dberr("%s: %s\n", name, strerror(E2BIG)); + if (sb.st_size > FILE_SIZE_MAX) + dberr("%s: %s\n", name, strerror(EFBIG)); if ((p = (void *)malloc((u_int)sb.st_size)) == NULL) dberr("%s", strerror(errno)); (void)read(fd, p, (int)sb.st_size);
[patch] regress test fix: systrace/id
Attached is a patch for the systrace/id regress test: Updated the id.policy used to allow the new pledge syscall This is my first time working with the regress tests. I want to make sure I'm on the right track so any tips are appreciated. Is there interest in additional regress test work? I have additional regress fixes if this one is on the right track. Thanks, Kevin Index: id.policy === RCS file: /cvs/src/regress/bin/systrace/id/id.policy,v retrieving revision 1.5 diff -u -p -u -p -r1.5 id.policy --- id.policy 13 Sep 2015 17:08:04 - 1.5 +++ id.policy 9 Oct 2015 05:13:00 - @@ -28,6 +28,7 @@ Policy: /usr/bin/id, Emulation: native native-mprotect: permit native-mquery: permit native-munmap: permit + native-pledge: permit native-pread: permit native-read: permit native-sendsyslog: permit
Re: Unlocking ix(4) a bit further
> Date: Wed, 30 Sep 2015 14:30:11 +0200 (CEST) > From: Mark Kettenis> > Since people seemed to like my diff for em(4), here is one for ix(4). > In addition to unlocking the rx completion path, this one also uses > intr_barrier() and removes the rx mutex that claudio@ put in recently. > > I don't have any ix(4) hardware. So the only guarantee is that this > compiles ;). Based on the experience with em(4), here is an updated diff. Index: if_ix.c === RCS file: /home/cvs/src/sys/dev/pci/if_ix.c,v retrieving revision 1.125 diff -u -p -r1.125 if_ix.c --- if_ix.c 11 Sep 2015 12:09:10 - 1.125 +++ if_ix.c 8 Oct 2015 18:43:36 - @@ -210,8 +210,6 @@ ixgbe_attach(struct device *parent, stru sc->osdep.os_sc = sc; sc->osdep.os_pa = *pa; - mtx_init(>rx_mtx, IPL_NET); - /* Set up the timer callout */ timeout_set(>timer, ixgbe_local_timer, sc); timeout_set(>rx_refill, ixgbe_rxrefill, sc); @@ -529,10 +527,7 @@ ixgbe_watchdog(struct ifnet * ifp) /* * The timer is set to 5 every time ixgbe_start() queues a packet. -* Then ixgbe_txeof() keeps resetting to 5 as long as it cleans at -* least one descriptor. -* Finally, anytime all descriptors are clean the timer is -* set to 0. +* Anytime all descriptors are clean the timer is set to 0. */ for (i = 0; i < sc->num_queues; i++, txr++) { if (txr->watchdog_timer == 0 || --txr->watchdog_timer) @@ -864,7 +859,7 @@ ixgbe_intr(void *arg) struct tx_ring *txr = sc->tx_rings; struct ixgbe_hw *hw = >hw; uint32_t reg_eicr; - int i, refill = 0, was_active = 0; + int i, refill = 0; reg_eicr = IXGBE_READ_REG(>hw, IXGBE_EICR); if (reg_eicr == 0) { @@ -874,11 +869,8 @@ ixgbe_intr(void *arg) if (ISSET(ifp->if_flags, IFF_RUNNING)) { ixgbe_rxeof(que); - refill = 1; - - if (ISSET(ifp->if_flags, IFF_OACTIVE)) - was_active = 1; ixgbe_txeof(txr); + refill = 1; } if (refill) { @@ -894,6 +886,8 @@ ixgbe_intr(void *arg) if (reg_eicr & IXGBE_EICR_LSC) { KERNEL_LOCK(); ixgbe_update_link_status(sc); + if (!IFQ_IS_EMPTY(>if_snd)) + ixgbe_start(ifp); KERNEL_UNLOCK(); } @@ -915,12 +909,6 @@ ixgbe_intr(void *arg) } } - if (was_active) { - KERNEL_LOCK(); - ixgbe_start(ifp); - KERNEL_UNLOCK(); - } - /* Check for fan failure */ if ((hw->device_id == IXGBE_DEV_ID_82598AT) && (reg_eicr & IXGBE_EICR_GPI_SDP1)) { @@ -1063,17 +1051,9 @@ ixgbe_encap(struct tx_ring *txr, struct cmd_type_len |= IXGBE_ADVTXD_DCMD_VLE; #endif - /* -* Force a cleanup if number of TX descriptors -* available is below the threshold. If it fails -* to get above, then abort transmit. -*/ - if (txr->tx_avail <= IXGBE_TX_CLEANUP_THRESHOLD) { - ixgbe_txeof(txr); - /* Make sure things have improved */ - if (txr->tx_avail <= IXGBE_TX_OP_THRESHOLD) - return (ENOBUFS); - } + /* Check that we have least the minimal number of TX descriptors. */ + if (txr->tx_avail <= IXGBE_TX_OP_THRESHOLD) + return (ENOBUFS); /* * Important to capture the first descriptor @@ -1137,8 +1117,6 @@ ixgbe_encap(struct tx_ring *txr, struct txd->read.cmd_type_len |= htole32(IXGBE_TXD_CMD_EOP | IXGBE_TXD_CMD_RS); - txr->tx_avail -= map->dm_nsegs; - txr->next_avail_desc = i; txbuf->m_head = m_head; /* @@ -1156,6 +1134,11 @@ ixgbe_encap(struct tx_ring *txr, struct txbuf = >tx_buffers[first]; txbuf->eop_index = last; + membar_producer(); + + atomic_sub_int(>tx_avail, map->dm_nsegs); + txr->next_avail_desc = i; + ++txr->tx_packets; return (0); @@ -1323,6 +1306,10 @@ ixgbe_stop(void *arg) /* reprogram the RAR[0] in case user changed it. */ ixgbe_set_rar(>hw, 0, sc->hw.mac.addr, 0, IXGBE_RAH_AV); + intr_barrier(sc->tag); + + KASSERT((ifp->if_flags & IFF_RUNNING) == 0); + /* Should we really clear all structures on stop? */ ixgbe_free_transmit_structures(sc); ixgbe_free_receive_structures(sc); @@ -2203,11 +2190,13 @@ ixgbe_tx_ctx_setup(struct tx_ring *txr, tx_buffer->m_head = NULL; tx_buffer->eop_index = -1; + membar_producer(); + /* We've consumed the first desc, adjust counters */ if (++ctxd == sc->num_tx_desc) ctxd = 0;
Re: unlocking em - unable to fill any rx descriptors
> Date: Thu, 8 Oct 2015 18:12:26 +0200 > From: Mike Belopuhov> > Is it possible that em_reset_hw is called a bit too early > while we don't have full control of the driver? I don't think that's a problem. > Also it appears to me that while we're mi_switch'ing in the > sched_barrier/sleep_finish another process can start running > em_init since mi_switch will unlock all KERNEL_LOCKs that the > process doing em_stop might hold. Now of course KASSERT would > have fired if IFF_RUNNING would have been set, but perhaps we > just haven't hit this yet. Shouldn't we do rwlock protection > to make sure that while em_stop is not finished no other > process can do em_stop and/or em_init? I certainly couldn't completely convince myself that such a race couldn't happen. That's why I put that KASSERT there. Adding an rwlockmight indeed make sense. But I wonder if the locking should be added in the layer above such that we don't have to replicate it in all the drivers. > diff --git sys/dev/pci/if_em.c sys/dev/pci/if_em.c > index f2ddb32..7b24dce 100644 > --- sys/dev/pci/if_em.c > +++ sys/dev/pci/if_em.c > @@ -1559,13 +1559,14 @@ em_stop(void *arg, int softonly) > timeout_del(>timer_handle); > timeout_del(>tx_fifo_timer_handle); > > - if (!softonly) { > + if (!softonly) > em_disable_intr(sc); > - em_reset_hw(>hw); > - } > > intr_barrier(sc->sc_intrhand); > > + if (!softonly) > + em_reset_hw(>hw); > + > KASSERT((ifp->if_flags & IFF_RUNNING) == 0); > > em_free_transmit_structures(sc); >