Re: fix snmpd reporting of IP-MIB::ipForwarding.0

2015-10-08 Thread Reyk Floeter
On Wed, Oct 07, 2015 at 10:39:17PM +0100, Stuart Henderson wrote:
> IP-MIB::ipForwarding.0 should return one of these values:
> 
> ipForwarding OBJECT-TYPE
> SYNTAX INTEGER {
> forwarding(1),-- acting as a router
> notForwarding(2)  -- NOT acting as a router
>}
> 
> Currently we directly return the value of sysctl net.inet.ip.forwarding
> which is incorrect.
> 
> OK to fix it like this?
> 

See one comment comment below, otherwise OK.

> Index: mib.c
> ===
> RCS file: /cvs/src/usr.sbin/snmpd/mib.c,v
> retrieving revision 1.76
> diff -u -p -r1.76 mib.c
> --- mib.c 10 Jun 2015 10:03:59 -  1.76
> +++ mib.c 7 Oct 2015 21:34:27 -
> @@ -2984,7 +2984,7 @@ mib_ipforwarding(struct oid *oid, struct
>   if (sysctl(mib, sizeofa(mib), , , NULL, 0) == -1)
>   return (-1);
>  
> - *elm = ber_add_integer(*elm, v);

We don't handle types in snmpd very well, but I'd at least like to
have a comment:

/* ipForwarding: forwarding(1), notForwarding(2) */

> + *elm = ber_add_integer(*elm, v ? 1 : 2);
>  
>   return (0);
>  }
> 

-- 



Re: Bulkget & snmpd

2015-10-08 Thread Stuart Henderson
On 2015/10/08 01:02, Stuart Henderson wrote:
> > $ snmpbulkget -Cn2 -Cr4 -v2c -c public 127.0.0.1 SNMPv2-MIB::sysDescr 
> > IP-MIB::ipForwarding IF-MIB::ifName IF-MIB::ifInOctets
> > SNMPv2-MIB::sysDescr.0 = STRING: some host description
> > IP-MIB::ipForwarding.0 = INTEGER: notForwarding(2)
> > IF-MIB::ifName.1 = STRING: em0
> > IF-MIB::ifName.2 = STRING: em1
> > IF-MIB::ifName.3 = STRING: enc0
> > IF-MIB::ifName.4 = STRING: lo0
> > IF-MIB::ifInOctets.1 = Counter32: 4019922211
> > IF-MIB::ifInOctets.2 = Counter32: 0
> > IF-MIB::ifInOctets.3 = Counter32: 0
> > IF-MIB::ifInOctets.4 = Counter32: 1433448428
> > 
> > (the answers don't have to be in the same order as the query, I have
> > just formatted them that way for the example to make it clearer).
> 
> O. And now I find Gerhard Roth's post
> 
> https://marc.info/?l=openbsd-tech=143375327425321=2
> 
> In a nutshell: the mps_getbulkreq() calls subsequent to the first one
> link the elements to the *first* element of the previous call, not the
> last element. Gerhard's diff passes in the address of the last element
> so it can be linked correctly.
> 
> I'll look at it again tomorrow but that looks correct for the -Cr case,
> nonreps and I think also len counting are still a problem but I'll look
> at those again on top of Gerhard's diff.

Gerhard's diff (repeated below) is correct for -Cr. Any OKs to commit it?

$ snmpbulkget -Cr4 -v2c -c public 127.0.0.1 SNMPv2-MIB::sysDescr 
IP-MIB::ipForwarding IF-MIB::ifName IF-MIB::ifInOctets

before:

SNMPv2-MIB::sysDescr.0 = STRING: sym
IP-MIB::ipForwarding.0 = INTEGER: 0
IF-MIB::ifName.1 = STRING: em0
IF-MIB::ifInOctets.1 = Counter32: 2305271272
IF-MIB::ifInOctets.2 = Counter32: 0
IF-MIB::ifInOctets.3 = Counter32: 0
IF-MIB::ifInOctets.4 = Counter32: 2091665909

after:

SNMPv2-MIB::sysDescr.0 = STRING: sym
SNMPv2-MIB::sysObjectID.0 = OID: OPENBSD-BASE-MIB::openBSDDefaultObjectID
SNMPv2-MIB::sysUpTime.0 = Timeticks: (10069) 0:01:40.69
SNMPv2-MIB::sysContact.0 = STRING: r...@symphytum.spacehopper.org
IP-MIB::ipForwarding.0 = INTEGER: 0
IP-MIB::ipDefaultTTL.0 = INTEGER: 64
IP-MIB::ipInReceives.0 = Counter32: 11634146
IP-MIB::ipInHdrErrors.0 = Counter32: 2
IF-MIB::ifName.1 = STRING: em0
IF-MIB::ifName.2 = STRING: em1
IF-MIB::ifName.3 = STRING: enc0
IF-MIB::ifName.4 = STRING: lo0
IF-MIB::ifInOctets.1 = Counter32: 2305182489
IF-MIB::ifInOctets.2 = Counter32: 0
IF-MIB::ifInOctets.3 = Counter32: 0
IF-MIB::ifInOctets.4 = Counter32: 2091496232

Index: mps.c
===
RCS file: /cvs/src/usr.sbin/snmpd/mps.c,v
retrieving revision 1.21
diff -u -p -r1.21 mps.c
--- mps.c   18 Jul 2015 16:54:43 -  1.21
+++ mps.c   8 Oct 2015 07:37:54 -
@@ -300,7 +300,7 @@ fail:
 
 int
 mps_getbulkreq(struct snmp_message *msg, struct ber_element **root,
-struct ber_oid *o, int max)
+struct ber_element **end, struct ber_oid *o, int max)
 {
struct ber_element *c, *d, *e;
size_t len;
@@ -308,14 +308,17 @@ mps_getbulkreq(struct snmp_message *msg,
 
j = max;
c = *root;
+   *end = NULL;
 
for (d = NULL, len = 0; j > 0; j--) {
e = ber_add_sequence(NULL);
if (c == NULL)
c = e;
ret = mps_getnextreq(msg, e, o);
-   if (ret == 1)
+   if (ret == 1) {
+   *root = c;
return (1);
+   }
if (ret == -1) {
ber_free_elements(e);
if (d == NULL)
@@ -330,6 +333,7 @@ mps_getbulkreq(struct snmp_message *msg,
if (d != NULL)
ber_link_elements(d, e);
d = e;
+   *end = d;
}
 
*root = c;
Index: snmpd.h
===
RCS file: /cvs/src/usr.sbin/snmpd/snmpd.h,v
retrieving revision 1.61
diff -u -p -r1.61 snmpd.h
--- snmpd.h 5 Oct 2015 15:29:14 -   1.61
+++ snmpd.h 8 Oct 2015 07:37:54 -
@@ -397,6 +397,7 @@ struct snmp_message {
struct ber_element  *sm_c;
struct ber_element  *sm_next;
struct ber_element  *sm_last;
+   struct ber_element  *sm_end;
 
u_int8_t sm_data[READ_BUF_SIZE];
size_t   sm_datalen;
@@ -639,7 +640,7 @@ int  mps_getreq(struct snmp_message *, 
 int mps_getnextreq(struct snmp_message *, struct ber_element *,
struct ber_oid *);
 int mps_getbulkreq(struct snmp_message *, struct ber_element **,
-   struct ber_oid *, int);
+   struct ber_element **, struct ber_oid *, int);
 int mps_setreq(struct snmp_message *, struct ber_element *,
struct ber_oid *);
 int mps_set(struct ber_oid *, void *, long long);
Index: snmpe.c

Re: unlock bge(4)

2015-10-08 Thread Martin Pieuchot
On 05/10/15(Mon) 12:42, Jonathan Matthew wrote:
> This brings bge(4) up to about where em(4) is.  It involves a few different
> changes, notably:
> - per-ring refill timeouts, to ensure we don't try to refill a ring from the
> timeout and an interrupt at the same time
> - removing the list of tx dma maps and just assigning a map to use based
> on the current ring slot number (this is why it's more - than +)
> - using atomics to adjust bge_txcnt, saving those adjustments until the
> end of the tx/txeof loops, and only acting on the adjusted value
> - not adding any mutexes
> 
> I've tested it on amd64 with these:
> 
> bge0 at pci1 dev 0 function 0 "Broadcom BCM5721" rev 0x21, BCM5750 C1 
> (0x4201): msi, address 00:18:f3:d1:80:64
> 
> bge0 at pci2 dev 2 function 0 "Broadcom BCM5703X" rev 0x02, BCM5702/5703 A2 
> (0x1002): apic 3 int 1, address 00:09:3d:00:84:d1
> 
> and on sparc64:
> 
> bge0 at pci7 dev 4 function 0 "Broadcom BCM5714" rev 0xa3, BCM5715 A3 
> (0x9003): ivec 0x795, address 00:14:4f:00:5a:5a
> 
> On the sparc64 box (a v245), with if_input_process unlocked, this gets me 
> 10-20%
> more pps or about 100mbps more in tcpbench (550, up from 450).
> 
> ok?

ok mpi@

> 
> Index: if_bgereg.h
> ===
> RCS file: /cvs/src/sys/dev/pci/if_bgereg.h,v
> retrieving revision 1.127
> diff -u -p -u -p -r1.127 if_bgereg.h
> --- if_bgereg.h   11 Sep 2015 13:02:28 -  1.127
> +++ if_bgereg.h   30 Sep 2015 11:14:09 -
> @@ -2830,11 +2830,6 @@ struct bge_type {
>  #define  BGE_TIMEOUT 10
>  #define  BGE_TXCONS_UNSET0x  /* impossible value */
>  
> -struct txdmamap_pool_entry {
> - bus_dmamap_t dmamap;
> - SLIST_ENTRY(txdmamap_pool_entry) link;
> -};
> -
>  #define  ASF_ENABLE  1
>  #define  ASF_NEW_HANDSHAKE   2
>  #define  ASF_STACKUP 4
> @@ -2934,11 +2929,11 @@ struct bge_softc {
>   int bge_txcnt;
>   struct timeout  bge_timeout;
>   struct timeout  bge_rxtimeout;
> + struct timeout  bge_rxtimeout_jumbo;
>   u_int32_t   bge_rx_discards;
>   u_int32_t   bge_tx_discards;
>   u_int32_t   bge_rx_inerrors;
>   u_int32_t   bge_rx_overruns;
>   u_int32_t   bge_tx_collisions;
> - SLIST_HEAD(, txdmamap_pool_entry) txdma_list;
> - struct txdmamap_pool_entry *txdma[BGE_TX_RING_CNT];
> + bus_dmamap_tbge_txdma[BGE_TX_RING_CNT];
>  };
> Index: if_bge.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_bge.c,v
> retrieving revision 1.369
> diff -u -p -u -p -r1.369 if_bge.c
> --- if_bge.c  19 Jul 2015 06:28:12 -  1.369
> +++ if_bge.c  5 Oct 2015 01:06:21 -
> @@ -141,7 +141,7 @@ void bge_tick(void *);
>  void bge_stats_update(struct bge_softc *);
>  void bge_stats_update_regs(struct bge_softc *);
>  int bge_cksum_pad(struct mbuf *);
> -int bge_encap(struct bge_softc *, struct mbuf *, u_int32_t *);
> +int bge_encap(struct bge_softc *, struct mbuf *, int *);
>  int bge_compact_dma_runt(struct mbuf *);
>  
>  int bge_intr(void *);
> @@ -1262,20 +1262,31 @@ uncreate:
>   return (1);
>  }
>  
> +/*
> + * When the refill timeout for a ring is active, that ring is so empty
> + * that no more packets can be received on it, so the interrupt handler
> + * will not attempt to refill it, meaning we don't need to protect against
> + * interrupts here.
> + */
> +
>  void
>  bge_rxtick(void *arg)
>  {
>   struct bge_softc *sc = arg;
> - int s;
>  
> - s = splnet();
>   if (ISSET(sc->bge_flags, BGE_RXRING_VALID) &&
>   if_rxr_inuse(>bge_std_ring) <= 8)
>   bge_fill_rx_ring_std(sc);
> +}
> +
> +void
> +bge_rxtick_jumbo(void *arg)
> +{
> + struct bge_softc *sc = arg;
> +
>   if (ISSET(sc->bge_flags, BGE_JUMBO_RXRING_VALID) &&
>   if_rxr_inuse(>bge_jumbo_ring) <= 8)
>   bge_fill_rx_ring_jumbo(sc);
> - splx(s);
>  }
>  
>  void
> @@ -1410,7 +1421,7 @@ bge_fill_rx_ring_jumbo(struct bge_softc 
>* that now, then try again later.
>*/
>   if (if_rxr_inuse(>bge_jumbo_ring) <= 8)
> - timeout_add(>bge_rxtimeout, 1);
> + timeout_add(>bge_rxtimeout_jumbo, 1);
>  }
>  
>  void
> @@ -1446,7 +1457,6 @@ void
>  bge_free_tx_ring(struct bge_softc *sc)
>  {
>   int i;
> - struct txdmamap_pool_entry *dma;
>  
>   if (!(sc->bge_flags & BGE_TXRING_VALID))
>   return;
> @@ -1455,18 +1465,12 @@ bge_free_tx_ring(struct bge_softc *sc)
>   if (sc->bge_cdata.bge_tx_chain[i] != NULL) {
>   m_freem(sc->bge_cdata.bge_tx_chain[i]);
>   sc->bge_cdata.bge_tx_chain[i] = NULL;
> - SLIST_INSERT_HEAD(>txdma_list, sc->txdma[i],
> - link);
> 

link state change madness

2015-10-08 Thread Martin Pieuchot
Recent NFS-related rtisvalid(9) regressions turns out to be related
to the use of DOWN RTF_CLONED route entries.  Such entries are DOWN
because they are cloned from a DOWN RTF_CLONING entry.

While investigating this race I figured out that we have different
code paths messing with link-state change.  Diff below unify them
all.  As a result two #ifdef chunks dies because dohook() is now
called.

ok?


Index: net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.385
diff -u -p -r1.385 if.c
--- net/if.c5 Oct 2015 19:05:09 -   1.385
+++ net/if.c8 Oct 2015 09:42:01 -
@@ -137,13 +137,14 @@ int   if_getgroupmembers(caddr_t);
 intif_getgroupattribs(caddr_t);
 intif_setgroupattribs(caddr_t);
 
+void   if_linkstate(void *);
+
 intif_clone_list(struct if_clonereq *);
 struct if_clone*if_clone_lookup(const char *, int *);
 
 intif_group_egress_build(void);
 
 void   if_watchdog_task(void *);
-void   if_link_state_change_task(void *);
 
 void   if_input_process(void *);
 
@@ -408,7 +409,7 @@ if_attachsetup(struct ifnet *ifp)
timeout_set(ifp->if_slowtimo, if_slowtimo, ifp);
if_slowtimo(ifp);
 
-   task_set(ifp->if_linkstatetask, if_link_state_change_task, ifp);
+   task_set(ifp->if_linkstatetask, if_linkstate, ifp);
 
if_idxmap_insert(ifp);
KASSERT(if_get(0) == NULL);
@@ -1385,7 +1386,6 @@ if_downall(void)
 /*
  * Mark an interface down and notify protocols of
  * the transition.
- * NOTE: must be called at splsoftnet or equivalent.
  */
 void
 if_down(struct ifnet *ifp)
@@ -1400,24 +1400,13 @@ if_down(struct ifnet *ifp)
pfctlinput(PRC_IFDOWN, ifa->ifa_addr);
}
IFQ_PURGE(>if_snd);
-#if NCARP > 0
-   if (ifp->if_carp)
-   carp_carpdev_state(ifp);
-#endif
-#if NBRIDGE > 0
-   if (ifp->if_bridgeport)
-   bstp_ifstate(ifp);
-#endif
-   rt_ifmsg(ifp);
-#ifndef SMALL_KERNEL
-   rt_if_track(ifp);
-#endif
+
+   if_linkstate(ifp);
 }
 
 /*
  * Mark an interface up and notify protocols of
  * the transition.
- * NOTE: must be called at splsoftnet or equivalent.
  */
 void
 if_up(struct ifnet *ifp)
@@ -1426,42 +1415,24 @@ if_up(struct ifnet *ifp)
 
ifp->if_flags |= IFF_UP;
microtime(>if_lastchange);
-#if NCARP > 0
-   if (ifp->if_carp)
-   carp_carpdev_state(ifp);
-#endif
-#if NBRIDGE > 0
-   if (ifp->if_bridgeport)
-   bstp_ifstate(ifp);
-#endif
-   rt_ifmsg(ifp);
+
 #ifdef INET6
/* Userland expects the kernel to set ::1 on lo0. */
if (ifp == lo0ifp)
in6_ifattach(ifp);
 #endif
-#ifndef SMALL_KERNEL
-   rt_if_track(ifp);
-#endif
-}
 
-/*
- * Schedule a link state change task.
- */
-void
-if_link_state_change(struct ifnet *ifp)
-{
-   /* put the routing table update task on systq */
-   task_add(systq, ifp->if_linkstatetask);
+   if_linkstate(ifp);
 }
 
 /*
- * Process a link state change.
+ * Notify userland, the routing table and hooks owner of
+ * a link-state transition.
  */
 void
-if_link_state_change_task(void *arg)
+if_linkstate(void *xifp)
 {
-   struct ifnet *ifp = arg;
+   struct ifnet *ifp = xifp;
int s;
 
s = splsoftnet();
@@ -1471,6 +1442,15 @@ if_link_state_change_task(void *arg)
 #endif
dohooks(ifp->if_linkstatehooks, 0);
splx(s);
+}
+
+/*
+ * Schedule a link state change task.
+ */
+void
+if_link_state_change(struct ifnet *ifp)
+{
+   task_add(systq, ifp->if_linkstatetask);
 }
 
 /*



tame signify

2015-10-08 Thread Ted Unangst
Without mucking about in the internals, here are some toplevel tame calls.


Index: signify.c
===
RCS file: /cvs/src/usr.bin/signify/signify.c,v
retrieving revision 1.100
diff -u -p -r1.100 signify.c
--- signify.c   16 Jan 2015 06:16:12 -  1.100
+++ signify.c   8 Oct 2015 15:11:05 -
@@ -663,6 +663,7 @@ main(int argc, char **argv)
VERIFY
} verb = NONE;
 
+   tame("stdio rpath wpath cpath tty", NULL);
 
rounds = 42;
 
@@ -721,6 +722,25 @@ main(int argc, char **argv)
}
argc -= optind;
argv += optind;
+
+   switch (verb) {
+   case GENERATE:
+   case SIGN:
+   /* keep it all */
+   break;
+   case CHECK:
+   tame("stdio rpath", NULL);
+   break;
+   case VERIFY:
+   if (embedded)
+   tame("stdio rpath wpath cpath", NULL);
+   else
+   tame("stdio rpath", NULL);
+   break;
+   default:
+   tame("stdio", NULL);
+   break;
+   }
 
 #ifndef VERIFYONLY
if (verb == CHECK) {



Re: tame signify

2015-10-08 Thread Ted Unangst
Ted Unangst wrote:
> Without mucking about in the internals, here are some toplevel tame calls.

check return values. ok, ok.

in the fairly common verify case of piping msgfile to - (as in patching), we
can cut things down a bit more as well.


Index: signify.c
===
RCS file: /cvs/src/usr.bin/signify/signify.c,v
retrieving revision 1.100
diff -u -p -r1.100 signify.c
--- signify.c   16 Jan 2015 06:16:12 -  1.100
+++ signify.c   8 Oct 2015 15:29:35 -
@@ -663,6 +663,8 @@ main(int argc, char **argv)
VERIFY
} verb = NONE;
 
+   if (tame("stdio rpath wpath cpath tty", NULL) == -1)
+   err(1, "tame");
 
rounds = 42;
 
@@ -721,6 +723,30 @@ main(int argc, char **argv)
}
argc -= optind;
argv += optind;
+
+   switch (verb) {
+   case GENERATE:
+   case SIGN:
+   /* keep it all */
+   break;
+   case CHECK:
+   if (tame("stdio rpath", NULL) == -1)
+   err(1, "tame");
+   break;
+   case VERIFY:
+   if (embedded && (!msgfile || strcmp(msgfile, "-") != 0)) {
+   if (tame("stdio rpath wpath cpath", NULL) == -1)
+   err(1, "tame");
+   } else {
+   if (tame("stdio rpath", NULL) == -1)
+   err(1, "tame");
+   }
+   break;
+   default:
+   if (tame("stdio", NULL) == -1)
+   err(1, "tame");
+   break;
+   }
 
 #ifndef VERIFYONLY
if (verb == CHECK) {



ip6_output bcopy->memcpy

2015-10-08 Thread David Hill
Hello -

This diff converts some malloc/bcopy to malloc/memcpy.  netinet6
didn't get the bcopy->memcpy overhaul like netinet did.

Index: netinet6/ip6_output.c
===
RCS file: /cvs/src/sys/netinet6/ip6_output.c,v
retrieving revision 1.189
diff -u -p -r1.189 ip6_output.c
--- netinet6/ip6_output.c   23 Sep 2015 08:49:46 -  1.189
+++ netinet6/ip6_output.c   8 Oct 2015 15:09:01 -
@@ -2189,7 +2189,7 @@ do {\
dst->type = malloc(hlen, M_IP6OPT, canwait);\
if (dst->type == NULL && canwait == M_NOWAIT)\
goto bad;\
-   bcopy(src->type, dst->type, hlen);\
+   memcpy(dst->type, src->type, hlen);\
}\
 } while (/*CONSTCOND*/ 0)
 
@@ -2211,7 +2211,7 @@ copypktopts(struct ip6_pktopts *dst, str
M_IP6OPT, canwait);
if (dst->ip6po_nexthop == NULL)
goto bad;
-   bcopy(src->ip6po_nexthop, dst->ip6po_nexthop,
+   memcpy(dst->ip6po_nexthop, src->ip6po_nexthop,
src->ip6po_nexthop->sa_len);
}
PKTOPT_EXTHDRCPY(ip6po_hbh);
@@ -2847,7 +2847,7 @@ ip6_setpktopt(int optname, u_char *buf, 
opt->ip6po_nexthop = malloc(*buf, M_IP6OPT, M_NOWAIT);
if (opt->ip6po_nexthop == NULL)
return (ENOBUFS);
-   bcopy(buf, opt->ip6po_nexthop, *buf);
+   memcpy(opt->ip6po_nexthop, buf, *buf);
break;
 
case IPV6_2292HOPOPTS:
@@ -2882,7 +2882,7 @@ ip6_setpktopt(int optname, u_char *buf, 
opt->ip6po_hbh = malloc(hbhlen, M_IP6OPT, M_NOWAIT);
if (opt->ip6po_hbh == NULL)
return (ENOBUFS);
-   bcopy(hbh, opt->ip6po_hbh, hbhlen);
+   memcpy(opt->ip6po_hbh, hbh, hbhlen);
 
break;
}
@@ -2945,7 +2945,7 @@ ip6_setpktopt(int optname, u_char *buf, 
*newdest = malloc(destlen, M_IP6OPT, M_NOWAIT);
if (*newdest == NULL)
return (ENOBUFS);
-   bcopy(dest, *newdest, destlen);
+   memcpy(*newdest, dest, destlen);
 
break;
}
@@ -2986,7 +2986,7 @@ ip6_setpktopt(int optname, u_char *buf, 
opt->ip6po_rthdr = malloc(rthlen, M_IP6OPT, M_NOWAIT);
if (opt->ip6po_rthdr == NULL)
return (ENOBUFS);
-   bcopy(rth, opt->ip6po_rthdr, rthlen);
+   memcpy(opt->ip6po_rthdr, rth, rthlen);
break;
}
 



fix ksh histfile owner

2015-10-08 Thread Ted Unangst
ksh does a little dance to try and gift history files to their original owner
if it's somehow running as a different user. this of course only works as
root, and is probably a terrible idea.

ksh should simply refuse to open a history file that's owned by somebody else.


Index: history.c
===
RCS file: /cvs/src/bin/ksh/history.c,v
retrieving revision 1.45
diff -u -p -r1.45 history.c
--- history.c   8 Oct 2015 15:54:59 -   1.45
+++ history.c   8 Oct 2015 16:00:10 -
@@ -619,6 +619,7 @@ hist_init(Source *s)
unsigned char   *base;
int lines;
int fd;
+   struct stat sb;
 
if (Flag(FTALKING) == 0)
return;
@@ -636,6 +637,10 @@ hist_init(Source *s)
/* we have a file and are interactive */
if ((fd = open(hname, O_RDWR|O_CREAT|O_APPEND, 0600)) < 0)
return;
+   if (fstat(fd, ) == -1 || sb.st_uid != getuid()) {
+   close(fd);
+   return;
+   }
 
histfd = savefd(fd);
if (histfd != fd)
@@ -732,7 +737,6 @@ hist_shrink(unsigned char *oldbase, int 
 {
int fd;
charnfile[1024];
-   struct  stat statb;
unsigned char *nbase = oldbase;
int nbytes = oldbytes;
 
@@ -759,11 +763,6 @@ hist_shrink(unsigned char *oldbase, int 
unlink(nfile);
return 1;
}
-   /*
-*  worry about who owns this file
-*/
-   if (fstat(histfd, ) >= 0)
-   fchown(fd, statb.st_uid, statb.st_gid);
close(fd);
 
/*



Re: unlocking em - unable to fill any rx descriptors

2015-10-08 Thread Mike Belopuhov
On Thu, Oct 08, 2015 at 01:20 +0200, Hrvoje Popovski wrote:
> Hi all,
> 
> i have fairly simple setup with receiver connected to em2 and sender
> connected to em3. Both em are Intel I350. Setup is without pf with these
> sysctls:
> 
> kern.pool_debug=1
>net.inet.ip.forwarding=1
> net.inet.ip.ifq.maxlen=8192
> ddb.console=1
> 
> with if_em.c revisions 1.307 and 1.306 i can trigger
> em2: unable to fill any rx descriptors
> when doing ifconfig em2 down/up (receiver side) while generating
> traffic. i can't trigger this with ifconfig em3 down/up (sender side) or
> destroying bridge and enabling it. this is reproducible.
> 
> with bridged setup when doing ifconfig em2 down/up i'm getting rx
> descriptors log and bridge stops bridging traffic until doing this:
> stop generating traffic
> ifconfig em2 down
> ifconfig em3 down
> ifconfig bridge0 destroy
> ifconfig em2 up
> ifconfig em3 up
> sh netstart bridge0
> start generating traffic
> 
> 
> with routed setup when doing ifconfig em2 down/up traffic is not
> forwarded until
> stop generating traffic
> ifconfig em2 down
> ifconfig em2 up
> start generating traffic
> 
> 
> with if_em.c revisions 1.305 and if_em.h revision 1.57 i can't trigger
> rx descriptors log and bridge starts to bridge traffic almost instantly
> when doing ifconfig em2 up
> 
> 
> i am willing to debug this further but i don't know how...
> 

Is it possible that em_reset_hw is called a bit too early
while we don't have full control of the driver?

Also it appears to me that while we're mi_switch'ing in the
sched_barrier/sleep_finish another process can start running
em_init since mi_switch will unlock all KERNEL_LOCKs that the
process doing em_stop might hold.  Now of course KASSERT would
have fired if IFF_RUNNING would have been set, but perhaps we
just haven't hit this yet.  Shouldn't we do rwlock protection
to make sure that while em_stop is not finished no other
process can do em_stop and/or em_init?

diff --git sys/dev/pci/if_em.c sys/dev/pci/if_em.c
index f2ddb32..7b24dce 100644
--- sys/dev/pci/if_em.c
+++ sys/dev/pci/if_em.c
@@ -1559,13 +1559,14 @@ em_stop(void *arg, int softonly)
timeout_del(>timer_handle);
timeout_del(>tx_fifo_timer_handle);
 
-   if (!softonly) {
+   if (!softonly)
em_disable_intr(sc);
-   em_reset_hw(>hw);
-   }
 
intr_barrier(sc->sc_intrhand);
 
+   if (!softonly)
+   em_reset_hw(>hw);
+
KASSERT((ifp->if_flags & IFF_RUNNING) == 0);
 
em_free_transmit_structures(sc);



Re: fix ksh histfile owner

2015-10-08 Thread Theo de Raadt
> ksh does a little dance to try and gift history files to their original owner
> if it's somehow running as a different user. this of course only works as
> root, and is probably a terrible idea.

When I found this with tame, I got worried about the implications of
how ksh is opening the file in the first place.  No *chown call should
be present in ksh!  

> ksh should simply refuse to open a history file that's owned by somebody else.

good policy.

ok deraadt



mg(1) onlywind() line number anomaly

2015-10-08 Thread Mark Lumsden
If you open mg with only the *scratch* buffer loaded (no file names
given on command line), and press enter 5 times, take a note of the
line number in the status bar then open theo mode (M-x theo). Then
move back to the *scratch* buffer via switch-to-buffer (C-x b). You'll
notice the line number of the *scratch* buffer is now 1, though the
cursor is on line 5. theo mode is initally opened splitting the screen
in two, then onlywind() is called. The bug is in onlywind(). This diff
fixes this behavior. ok? 

-lum

Index: window.c
===
RCS file: /cvs/src/usr.bin/mg/window.c,v
retrieving revision 1.33
diff -u -p -u -p -r1.33 window.c
--- window.c25 Mar 2015 20:53:31 -  1.33
+++ window.c8 Oct 2015 19:49:34 -
@@ -167,6 +167,8 @@ onlywind(int f, int n)
wp->w_bufp->b_doto = wp->w_doto;
wp->w_bufp->b_markp = wp->w_markp;
wp->w_bufp->b_marko = wp->w_marko;
+   wp->w_bufp->b_dotline = wp->w_dotline;
+   wp->w_bufp->b_markline = wp->w_markline;
}
free(wp);
}
@@ -178,6 +180,8 @@ onlywind(int f, int n)
wp->w_bufp->b_doto = wp->w_doto;
wp->w_bufp->b_markp = wp->w_markp;
wp->w_bufp->b_marko = wp->w_marko;
+   wp->w_bufp->b_dotline = wp->w_dotline;
+   wp->w_bufp->b_markline = wp->w_markline;
}
free(wp);
}



vattr_null() doesn't initialize va_filerev

2015-10-08 Thread Martin Natano
Filesystem implementations depend on vattr_null() to initialize the
fields in struct vattr, which is true for all the fields except
va_filerev. It therefore is not set to VNOVAL as expected by the file
system, but contains whatever was there on the stack. This causes
VOP_GETATTR() on cd9660 and msdosfs vnodes to yield garbage for
va_filerev.

Index: vfs_subr.c
===
RCS file: /cvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.235
diff -u -r1.235 vfs_subr.c
--- vfs_subr.c  8 Oct 2015 08:41:58 -   1.235
+++ vfs_subr.c  8 Oct 2015 19:33:44 -
@@ -305,7 +305,7 @@
vap->va_atime.tv_sec = vap->va_atime.tv_nsec =
vap->va_mtime.tv_sec = vap->va_mtime.tv_nsec =
vap->va_ctime.tv_sec = vap->va_ctime.tv_nsec =
-   vap->va_flags = vap->va_gen = VNOVAL;
+   vap->va_flags = vap->va_gen = vap->va_filerev = VNOVAL;
vap->va_vaflags = 0;
 }

cheers,
natano



[patch] regress test fix: libc/db

2015-10-08 Thread Kevin Reay
Implement max file size constant in libc/db/dbtest regression test.
Some /bin files read for testing are larger than SIZE_MAX causing tests
to fail. Also change error for file too large from E2BIG to EFBIG.

Feedback is very appreciated.

Index: dbtest.c
===
RCS file: /cvs/src/regress/lib/libc/db/dbtest.c,v
retrieving revision 1.13
diff -u -p -u -p -r1.13 dbtest.c
--- dbtest.c6 Feb 2015 23:21:58 -   1.13
+++ dbtest.c9 Oct 2015 03:34:48 -
@@ -45,6 +45,8 @@

 #include 

+#define FILE_SIZE_MAX (512 * 1024)
+
 enum S { COMMAND, COMPARE, GET, PUT, REMOVE, SEQ, SEQFLAG, KEY, DATA };

 voidcompare(DBT *, DBT *);
@@ -685,8 +687,8 @@ rfile(name, lenp)
if ((fd = open(name, O_RDONLY, 0)) < 0 ||
fstat(fd, ))
dberr("%s: %s\n", name, strerror(errno));
-   if (sb.st_size > (off_t)SIZE_MAX)
-   dberr("%s: %s\n", name, strerror(E2BIG));
+   if (sb.st_size > FILE_SIZE_MAX)
+   dberr("%s: %s\n", name, strerror(EFBIG));
if ((p = (void *)malloc((u_int)sb.st_size)) == NULL)
dberr("%s", strerror(errno));
(void)read(fd, p, (int)sb.st_size);


[patch] regress test fix: systrace/id

2015-10-08 Thread Kevin Reay
Attached is a patch for the systrace/id regress test:

Updated the id.policy used to allow the new pledge syscall

This is my first time working with the regress tests. I want to make
sure I'm on the right track so any tips are appreciated. Is there
interest in additional regress test work? I have additional regress
fixes if this one is on the right track.

Thanks,
Kevin

Index: id.policy
===
RCS file: /cvs/src/regress/bin/systrace/id/id.policy,v
retrieving revision 1.5
diff -u -p -u -p -r1.5 id.policy
--- id.policy   13 Sep 2015 17:08:04 -  1.5
+++ id.policy   9 Oct 2015 05:13:00 -
@@ -28,6 +28,7 @@ Policy: /usr/bin/id, Emulation: native
native-mprotect: permit
native-mquery: permit
native-munmap: permit
+   native-pledge: permit
native-pread: permit
native-read: permit
native-sendsyslog: permit


Re: Unlocking ix(4) a bit further

2015-10-08 Thread Mark Kettenis
> Date: Wed, 30 Sep 2015 14:30:11 +0200 (CEST)
> From: Mark Kettenis 
> 
> Since people seemed to like my diff for em(4), here is one for ix(4).
> In addition to unlocking the rx completion path, this one also uses
> intr_barrier() and removes the rx mutex that claudio@ put in recently.
> 
> I don't have any ix(4) hardware.  So the only guarantee is that this
> compiles ;).

Based on the experience with em(4), here is an updated diff.

Index: if_ix.c
===
RCS file: /home/cvs/src/sys/dev/pci/if_ix.c,v
retrieving revision 1.125
diff -u -p -r1.125 if_ix.c
--- if_ix.c 11 Sep 2015 12:09:10 -  1.125
+++ if_ix.c 8 Oct 2015 18:43:36 -
@@ -210,8 +210,6 @@ ixgbe_attach(struct device *parent, stru
sc->osdep.os_sc = sc;
sc->osdep.os_pa = *pa;
 
-   mtx_init(>rx_mtx, IPL_NET);
-
/* Set up the timer callout */
timeout_set(>timer, ixgbe_local_timer, sc);
timeout_set(>rx_refill, ixgbe_rxrefill, sc);
@@ -529,10 +527,7 @@ ixgbe_watchdog(struct ifnet * ifp)
 
/*
 * The timer is set to 5 every time ixgbe_start() queues a packet.
-* Then ixgbe_txeof() keeps resetting to 5 as long as it cleans at
-* least one descriptor.
-* Finally, anytime all descriptors are clean the timer is
-* set to 0.
+* Anytime all descriptors are clean the timer is set to 0.
 */
for (i = 0; i < sc->num_queues; i++, txr++) {
if (txr->watchdog_timer == 0 || --txr->watchdog_timer)
@@ -864,7 +859,7 @@ ixgbe_intr(void *arg)
struct tx_ring  *txr = sc->tx_rings;
struct ixgbe_hw *hw = >hw;
uint32_t reg_eicr;
-   int  i, refill = 0, was_active = 0;
+   int  i, refill = 0;
 
reg_eicr = IXGBE_READ_REG(>hw, IXGBE_EICR);
if (reg_eicr == 0) {
@@ -874,11 +869,8 @@ ixgbe_intr(void *arg)
 
if (ISSET(ifp->if_flags, IFF_RUNNING)) {
ixgbe_rxeof(que);
-   refill = 1;
-
-   if (ISSET(ifp->if_flags, IFF_OACTIVE))
-   was_active = 1;
ixgbe_txeof(txr);
+   refill = 1;
}
 
if (refill) {
@@ -894,6 +886,8 @@ ixgbe_intr(void *arg)
if (reg_eicr & IXGBE_EICR_LSC) {
KERNEL_LOCK();
ixgbe_update_link_status(sc);
+   if (!IFQ_IS_EMPTY(>if_snd))
+   ixgbe_start(ifp);
KERNEL_UNLOCK();
}
 
@@ -915,12 +909,6 @@ ixgbe_intr(void *arg)
}
}
 
-   if (was_active) {
-   KERNEL_LOCK();
-   ixgbe_start(ifp);
-   KERNEL_UNLOCK();
-   }
-
/* Check for fan failure */
if ((hw->device_id == IXGBE_DEV_ID_82598AT) &&
(reg_eicr & IXGBE_EICR_GPI_SDP1)) {
@@ -1063,17 +1051,9 @@ ixgbe_encap(struct tx_ring *txr, struct 
cmd_type_len |= IXGBE_ADVTXD_DCMD_VLE;
 #endif
 
-   /*
-* Force a cleanup if number of TX descriptors
-* available is below the threshold. If it fails
-* to get above, then abort transmit.
-*/
-   if (txr->tx_avail <= IXGBE_TX_CLEANUP_THRESHOLD) {
-   ixgbe_txeof(txr);
-   /* Make sure things have improved */
-   if (txr->tx_avail <= IXGBE_TX_OP_THRESHOLD)
-   return (ENOBUFS);
-   }
+   /* Check that we have least the minimal number of TX descriptors. */
+   if (txr->tx_avail <= IXGBE_TX_OP_THRESHOLD)
+   return (ENOBUFS);
 
/*
 * Important to capture the first descriptor
@@ -1137,8 +1117,6 @@ ixgbe_encap(struct tx_ring *txr, struct 
 
txd->read.cmd_type_len |=
htole32(IXGBE_TXD_CMD_EOP | IXGBE_TXD_CMD_RS);
-   txr->tx_avail -= map->dm_nsegs;
-   txr->next_avail_desc = i;
 
txbuf->m_head = m_head;
/*
@@ -1156,6 +1134,11 @@ ixgbe_encap(struct tx_ring *txr, struct 
txbuf = >tx_buffers[first];
txbuf->eop_index = last;
 
+   membar_producer();
+
+   atomic_sub_int(>tx_avail, map->dm_nsegs);
+   txr->next_avail_desc = i;
+
++txr->tx_packets;
return (0);
 
@@ -1323,6 +1306,10 @@ ixgbe_stop(void *arg)
/* reprogram the RAR[0] in case user changed it. */
ixgbe_set_rar(>hw, 0, sc->hw.mac.addr, 0, IXGBE_RAH_AV);
 
+   intr_barrier(sc->tag);
+
+   KASSERT((ifp->if_flags & IFF_RUNNING) == 0);
+
/* Should we really clear all structures on stop? */
ixgbe_free_transmit_structures(sc);
ixgbe_free_receive_structures(sc);
@@ -2203,11 +2190,13 @@ ixgbe_tx_ctx_setup(struct tx_ring *txr, 
tx_buffer->m_head = NULL;
tx_buffer->eop_index = -1;
 
+   membar_producer();
+
/* We've consumed the first desc, adjust counters */
if (++ctxd == sc->num_tx_desc)
ctxd = 0;

Re: unlocking em - unable to fill any rx descriptors

2015-10-08 Thread Mark Kettenis
> Date: Thu, 8 Oct 2015 18:12:26 +0200
> From: Mike Belopuhov 
> 
> Is it possible that em_reset_hw is called a bit too early
> while we don't have full control of the driver?

I don't think that's a problem.

> Also it appears to me that while we're mi_switch'ing in the
> sched_barrier/sleep_finish another process can start running
> em_init since mi_switch will unlock all KERNEL_LOCKs that the
> process doing em_stop might hold.  Now of course KASSERT would
> have fired if IFF_RUNNING would have been set, but perhaps we
> just haven't hit this yet.  Shouldn't we do rwlock protection
> to make sure that while em_stop is not finished no other
> process can do em_stop and/or em_init?

I certainly couldn't completely convince myself that such a race
couldn't happen.  That's why I put that KASSERT there.  Adding an
rwlockmight indeed make sense.  But I wonder if the locking should be
added in the layer above such that we don't have to replicate it in
all the drivers.

> diff --git sys/dev/pci/if_em.c sys/dev/pci/if_em.c
> index f2ddb32..7b24dce 100644
> --- sys/dev/pci/if_em.c
> +++ sys/dev/pci/if_em.c
> @@ -1559,13 +1559,14 @@ em_stop(void *arg, int softonly)
>   timeout_del(>timer_handle);
>   timeout_del(>tx_fifo_timer_handle);
>  
> - if (!softonly) {
> + if (!softonly)
>   em_disable_intr(sc);
> - em_reset_hw(>hw);
> - }
>  
>   intr_barrier(sc->sc_intrhand);
>  
> + if (!softonly)
> + em_reset_hw(>hw);
> +
>   KASSERT((ifp->if_flags & IFF_RUNNING) == 0);
>  
>   em_free_transmit_structures(sc);
>