AUDIO_DEV_SOUND

2017-05-15 Thread Jan Stary
While /dev/sound is no more, the

  #define AUDIO_DEV_SOUND   0   /* minor of /dev/sound0 */

apparently cannot be just removed from dev/audio.c
- with the diff below, all audio applications complain
that the default audio device cannot be opened. Why is that?
That comment seems misleading at any rate.

Jan


Index: dev/audio.c
===
RCS file: /cvs/src/sys/dev/audio.c,v
retrieving revision 1.163
diff -u -p -r1.163 audio.c
--- dev/audio.c 3 May 2017 06:56:54 -   1.163
+++ dev/audio.c 15 May 2017 08:08:32 -
@@ -50,7 +50,6 @@
 #define DEVNAME(sc)((sc)->dev.dv_xname)
 #define AUDIO_UNIT(n)  (minor(n) & 0x0f)
 #define AUDIO_DEV(n)   (minor(n) & 0xf0)
-#define AUDIO_DEV_SOUND0   /* minor of /dev/sound0 */
 #define AUDIO_DEV_MIXER0x10/* minor of /dev/mixer0 */
 #define AUDIO_DEV_AUDIO0x80/* minor of /dev/audio0 */
 #define AUDIO_DEV_AUDIOCTL 0xc0/* minor of /dev/audioctl */
@@ -1136,7 +1135,6 @@ audio_detach(struct device *self, int fl
 * close uses device_lookup, it returns EXIO and does nothing
 */
mn = self->dv_unit;
-   vdevgone(maj, mn | AUDIO_DEV_SOUND, mn | AUDIO_DEV_SOUND, VCHR);
vdevgone(maj, mn | AUDIO_DEV_AUDIO, mn | AUDIO_DEV_AUDIO, VCHR);
vdevgone(maj, mn | AUDIO_DEV_AUDIOCTL, mn | AUDIO_DEV_AUDIOCTL, VCHR);
vdevgone(maj, mn | AUDIO_DEV_MIXER, mn | AUDIO_DEV_MIXER, VCHR);
@@ -1607,7 +1605,6 @@ audioopen(dev_t dev, int flags, int mode
error = ENXIO;
else {
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_open(sc, flags);
break;
@@ -1633,7 +1630,6 @@ audioclose(dev_t dev, int flags, int ifm
if (sc == NULL)
return ENXIO;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_close(sc);
break;
@@ -1658,7 +1654,6 @@ audioread(dev_t dev, struct uio *uio, in
if (sc == NULL)
return ENXIO;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_read(sc, uio, ioflag);
break;
@@ -1683,7 +1678,6 @@ audiowrite(dev_t dev, struct uio *uio, i
if (sc == NULL)
return ENXIO;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_write(sc, uio, ioflag);
break;
@@ -1708,7 +1702,6 @@ audioioctl(dev_t dev, u_long cmd, caddr_
if (sc == NULL)
return ENXIO;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_ioctl(sc, cmd, addr);
break;
@@ -1743,7 +1736,6 @@ audiopoll(dev_t dev, int events, struct 
if (sc == NULL)
return POLLERR;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
revents = audio_poll(sc, events, p);
break;



commit regarding boot-time rng

2017-05-15 Thread Theo de Raadt
I wanted to highlight this commit.

On some architectures we use a system bootloader rather than our own
bootloader.  In this particular case an octeon bootloader would be
hard to write, becuase it needs native drivers for some very
complicated peripherals.

So the solution is to have the kernel self-seed the entropy buffer
very very early in locore.

Another possibile solution would be to find the vendor bootloader,
and teach them about our ELF randomdata segment.  Convince then to
fill that buffer if they find it in a kernel.  That would slowly
lead other operating systems towards this strategy...


>CVSROOT:   /cvs
>Module name:   src
>Changes by:v...@cvs.openbsd.org2017/05/09 09:11:33
>
>Modified files:
>   sys/arch/octeon/conf: ld.script 
>   sys/arch/octeon/octeon: locore.S 
>
>Log message:
>Mix bits from the built-in RNG with the randomdata section at boot time.
>This should improve considerably the quality of early entropy and
>stack protector guard data on octeon.
>
>Suggested by and OK deraadt@
>OK kettenis@, jasper@
>
>



Re: Atomic copyin(9)/copyout(9) for amd64

2017-05-15 Thread Miod Vallat
>   So I implemented a new function called
> copyin_futex(9), which is all we really need.

But it is not specific to futex - in fact, it could be used in syscall()
as well.

Better call it fuword() or aligned_fuword() since it has the extra
alignment requirement that fuword() didn't have.



Enable NET_LOCK(), take 3

2017-05-15 Thread Martin Pieuchot
Here's the diff that got reverted before release.  My plan is to enable
it and then get rid of the remaining splsoftnet() to be able to fix the
remaining XXXSMP.

ok?

Index: kern/uipc_socket.c
===
RCS file: /cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.182
diff -u -p -r1.182 uipc_socket.c
--- kern/uipc_socket.c  2 Apr 2017 23:40:08 -   1.182
+++ kern/uipc_socket.c  15 May 2017 09:35:52 -
@@ -1038,10 +1038,12 @@ sorflush(struct socket *so)
 {
struct sockbuf *sb = >so_rcv;
struct protosw *pr = so->so_proto;
+   sa_family_t af = pr->pr_domain->dom_family;
struct sockbuf asb;
 
sb->sb_flags |= SB_NOINTR;
-   sblock(sb, M_WAITOK, NULL);
+   sblock(sb, M_WAITOK,
+   (af != PF_LOCAL && af != PF_ROUTE) ?  : NULL);
socantrcvmore(so);
sbunlock(sb);
asb = *sb;
Index: kern/uipc_socket2.c
===
RCS file: /cvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.75
diff -u -p -r1.75 uipc_socket2.c
--- kern/uipc_socket2.c 17 Mar 2017 17:19:16 -  1.75
+++ kern/uipc_socket2.c 15 May 2017 09:35:52 -
@@ -299,7 +299,11 @@ soassertlocked(struct socket *so)
 int
 sosleep(struct socket *so, void *ident, int prio, const char *wmesg, int timo)
 {
-   return tsleep(ident, prio, wmesg, timo);
+   if ((so->so_proto->pr_domain->dom_family != PF_LOCAL) &&
+   (so->so_proto->pr_domain->dom_family != PF_ROUTE)) {
+   return rwsleep(ident, , prio, wmesg, timo);
+   } else
+   return tsleep(ident, prio, wmesg, timo);
 }
 
 /*
Index: net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.495
diff -u -p -r1.495 if.c
--- net/if.c9 May 2017 09:31:07 -   1.495
+++ net/if.c15 May 2017 09:35:52 -
@@ -230,6 +230,12 @@ struct taskq   *softnettq;
 struct task if_input_task_locked = TASK_INITIALIZER(if_netisr, NULL);
 
 /*
+ * Serialize socket operations to ensure no new sleeping points
+ * are introduced in IP output paths.
+ */
+struct rwlock netlock = RWLOCK_INITIALIZER("netlock");
+
+/*
  * Network interface utility routines.
  */
 void
@@ -1146,7 +1152,10 @@ if_clone_create(const char *name, int rd
if (ifunit(name) != NULL)
return (EEXIST);
 
+   /* XXXSMP breaks atomicity */
+   rw_exit_write();
ret = (*ifc->ifc_create)(ifc, unit);
+   rw_enter_write();
 
if (ret != 0 || (ifp = ifunit(name)) == NULL)
return (ret);
@@ -1188,7 +1197,10 @@ if_clone_destroy(const char *name)
splx(s);
}
 
+   /* XXXSMP breaks atomicity */
+   rw_exit_write();
ret = (*ifc->ifc_destroy)(ifp);
+   rw_enter_write();
 
return (ret);
 }
Index: net/if_pflow.c
===
RCS file: /cvs/src/sys/net/if_pflow.c,v
retrieving revision 1.75
diff -u -p -r1.75 if_pflow.c
--- net/if_pflow.c  17 Mar 2017 17:19:16 -  1.75
+++ net/if_pflow.c  15 May 2017 09:35:52 -
@@ -463,9 +463,12 @@ pflowioctl(struct ifnet *ifp, u_long cmd
sc->sc_gcounter=pflowstats.pflow_flows;
/* send templates on startup */
if (sc->sc_version == PFLOW_PROTO_10) {
+   /* XXXSMP breaks atomicity */
+   rw_exit_write();
s = splnet();
pflow_sendout_ipfix_tmpl(sc);
splx(s);
+   rw_enter_write();
}
} else
ifp->if_flags &= ~IFF_RUNNING;
@@ -505,6 +508,8 @@ pflowioctl(struct ifnet *ifp, u_long cmd
sizeof(pflowr
return (error);
 
+   /* XXXSMP breaks atomicity */
+   rw_exit_write();
s = splnet();
error = pflow_set(sc, );
splx(s);
@@ -522,6 +527,7 @@ pflowioctl(struct ifnet *ifp, u_long cmd
} else
ifp->if_flags &= ~IFF_RUNNING;
 
+   rw_enter_write();
break;
 
default:
Index: net/pf.c
===
RCS file: /cvs/src/sys/net/pf.c,v
retrieving revision 1.1021
diff -u -p -r1.1021 pf.c
--- net/pf.c5 May 2017 16:30:39 -   1.1021
+++ net/pf.c15 May 2017 09:35:52 -
@@ -1154,26 +1154,20 @@ pf_state_export(struct pfsync_state *sp,
 /* END state table stuff */
 
 void
-pf_purge_expired_rules(int locked)
+pf_purge_expired_rules(void)
 {
struct pf_rule  *r;
 
+   NET_ASSERT_LOCKED();
+
if (SLIST_EMPTY(_rule_gcl))
return;
 
-   if (!locked)
-  

Re: Atomic copyin(9)/copyout(9) for amd64

2017-05-15 Thread Martin Pieuchot
On 14/05/17(Sun) 22:05, Mark Kettenis wrote:
> > Date: Fri, 12 May 2017 13:34:38 +
> > From: Visa Hankala 
> > 
> > On Mon, May 01, 2017 at 06:02:24PM +0200, Mark Kettenis wrote:
> > > The futex(2) syscall needs to be able to atomically copy the futex in
> > > and out of userland.  The current implementation uses copyin(9) and
> > > copyout(9) for that.  The futex is a 32-bit integer, and currently our
> > > copyin(9) and copyout(9) don't guarantee an atomic 32-bit access.
> > > Previously mpi@ and I discussed implementing new interfaces that do
> > > guarantee the required atomicity.  However, it oocurred to me that we
> > > could simply change our copyin implementations such that they
> > > guarantee atomicity of a properly aligned 32-bit copyin and copyout.
> > > 
> > > The i386 version of these calls uses "rep movsl", which means it is
> > > already atomic.  At least that is how I interpret 8.2.4 in Volume 3A
> > > of the Intel SDM.  The diff below makes the amd64 version safe as
> > > well.  This does introduce a few additional instructions in the loop.
> > > Apparently modern Intel CPUs optimize the string loops.  If we can
> > > rely on the hardware to turn 32-bit moves into 64-bit moves, we could
> > > simplify the code by using "rep movsl" instead of "rep movsq".
> > 
> > I submit to this approach. OK visa@
> 
> So I'm partly backtracking here...
> 
> I tried to convert a few more architectures, and things get a bit
> messier than I envisioned on architectures that have an optimized copy
> function and care about alignment.  Especially when copyin(9) is
> implemented in assembly.  So I implemented a new function called
> copyin_futex(9), which is all we really need.  The implementations for
> powerpc and sparc64 below have been (lightly) tested.  The regression
> test that mpi@ created passes and doesn't crash the kernel!  I have an
> (untested) implementation for hppa as well.  The amd64 and i386
> implementation can simply be a wrapper around copyin(9).  The same
> holds for all out non-MULTIPROCESSOR architectures.  So my idea is to
> let kern/sys_futex.c provide that wrapper if MULTIPROCESSOR isn't
> defined and add an #ifdef MULTIPROCESSOR around all MD copyin_futex()
> implementations.
> 
> Thoughts?

As I said I'm also find with a custom function.  The !MULTIPROCESSOR
defines make sense to me.

I don't have a better suggestion than miod@ for the name :)

> Index: arch/powerpc/powerpc/pmap.c
> ===
> RCS file: /cvs/src/sys/arch/powerpc/powerpc/pmap.c,v
> retrieving revision 1.166
> diff -u -p -r1.166 pmap.c
> --- arch/powerpc/powerpc/pmap.c   19 Oct 2016 08:28:20 -  1.166
> +++ arch/powerpc/powerpc/pmap.c   14 May 2017 19:39:54 -
> @@ -1792,6 +1802,30 @@ copyout(const void *kaddr, void *udaddr,
>   kaddr += l;
>   len -= l;
>   }
> + curpcb->pcb_onfault = oldh;
> + return 0;
> +}
> +
> +int
> +copyin_futex(const uint32_t *udaddr, uint32_t *kaddr)
> +{
> + volatile uint32_t *p;
> + u_int32_t oldsr;
> + faultbuf env;
> + void *oldh = curpcb->pcb_onfault;
> +
> + if ((u_int)udaddr & 0x3)
> + return EFAULT;
> +
> + p = PPC_USER_ADDR + ((u_int)udaddr & ~PPC_SEGMENT_MASK);
> + oldsr = pmap_setusr(curpcb->pcb_pm, (vaddr_t)udaddr);
> + if (setfault()) {
> + pmap_popusr(oldsr);
> + curpcb->pcb_onfault = oldh;
> + return EFAULT;
> + }
> + *kaddr = *p;
> + pmap_popusr(oldsr);
>   curpcb->pcb_onfault = oldh;
>   return 0;
>  }
> Index: arch/sparc64/sparc64/locore.s
> ===
> RCS file: /cvs/src/sys/arch/sparc64/sparc64/locore.s,v
> retrieving revision 1.185
> diff -u -p -r1.185 locore.s
> --- arch/sparc64/sparc64/locore.s 30 Apr 2017 16:45:45 -  1.185
> +++ arch/sparc64/sparc64/locore.s 14 May 2017 19:39:54 -
> @@ -5968,6 +5968,21 @@ Lcopyout_done:
>   retl! New instr
>clr%o0 ! return 0
>  
> +ENTRY(copyin_futex)
> + andcc   %o0, 0x3, %g0
> + bnz,pn  %xcc, Lcopyfault
> +  nop
> + GET_CPCB(%o3)
> + set Lcopyfault, %o4
> + membar  #Sync
> + stx %o4, [%o3 + PCB_ONFAULT]
> + lduwa   [%o0] ASI_AIUS, %o2
> + stw %o2, [%o1]
> + membar  #Sync
> + stx %g0, [%o3 + PCB_ONFAULT]
> + retl
> +  clr%o0
> +
>  ! Copyin or copyout fault.  Clear cpcb->pcb_onfault and return EFAULT.
>  ! Note that although we were in bcopy, there is no state to clean up;
>  ! the only special thing is that we have to return to [g7 + 8] rather than
> Index: kern/sys_futex.c
> ===
> RCS file: /cvs/src/sys/kern/sys_futex.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 sys_futex.c
> --- kern/sys_futex.c  30 Apr 2017 10:10:21 -  

NFS is IPL_SOFTNET free

2017-05-15 Thread Martin Pieuchot
Outside of USB, no code is executed in a softnet interrupt context.  So
what's protecting NFS data structures is the KERNEL_LOCK().

But more importantly, since r1.114 of nfs_socket.c, the 'softnet' thread
is no longer executing NFS code.  So all these splsoftnet() can disappear.

Ok?

Index: nfs/nfs_socket.c
===
RCS file: /cvs/src/sys/nfs/nfs_socket.c,v
retrieving revision 1.115
diff -u -p -r1.115 nfs_socket.c
--- nfs/nfs_socket.c8 May 2017 09:11:20 -   1.115
+++ nfs/nfs_socket.c15 May 2017 11:11:27 -
@@ -394,7 +394,7 @@ nfs_reconnect(struct nfsreq *rep)
 {
struct nfsreq *rp;
struct nfsmount *nmp = rep->r_nmp;
-   int s, error;
+   int error;
 
nfs_disconnect(nmp);
while ((error = nfs_connect(nmp, rep)) != 0) {
@@ -407,12 +407,10 @@ nfs_reconnect(struct nfsreq *rep)
 * Loop through outstanding request list and fix up all requests
 * on old socket.
 */
-   s = splsoftnet();
TAILQ_FOREACH(rp, >nm_reqsq, r_chain) {
rp->r_flags |= R_MUSTRESEND;
rp->r_rexmit = 0;
}
-   splx(s);
return (0);
 }
 
@@ -737,7 +735,7 @@ nfs_reply(struct nfsreq *myrep)
struct mbuf *nam;
u_int32_t rxid, *tl, t1;
caddr_t cp2;
-   int s, error;
+   int error;
 
/*
 * Loop around until we get our own reply
@@ -790,7 +788,6 @@ nfsmout:
 * Loop through the request list to match up the reply
 * Iff no match, just drop the datagram
 */
-   s = splsoftnet();
TAILQ_FOREACH(rep, >nm_reqsq, r_chain) {
if (rep->r_mrep == NULL && rxid == rep->r_xid) {
/* Found it.. */
@@ -820,7 +817,6 @@ nfsmout:
break;
}
}
-   splx(s);
/*
 * If not matched to a request, drop it.
 * If it's mine, get out.
@@ -854,7 +850,7 @@ nfs_request(struct vnode *vp, int procnu
struct nfsmount *nmp;
struct timeval tv;
caddr_t cp2;
-   int t1, i, s, error = 0;
+   int t1, i, error = 0;
int trylater_delay;
struct nfsreq *rep;
int  mrest_len;
@@ -911,7 +907,6 @@ tryagain:
 * Chain request into list of outstanding requests. Be sure
 * to put it LAST so timer finds oldest requests first.
 */
-   s = splsoftnet();
if (TAILQ_EMPTY(>nm_reqsq))
timeout_add(>nm_rtimeout, nfs_ticks);
TAILQ_INSERT_TAIL(>nm_reqsq, rep, r_chain);
@@ -924,7 +919,6 @@ tryagain:
if (nmp->nm_so && (nmp->nm_sotype != SOCK_DGRAM ||
(nmp->nm_flag & NFSMNT_DUMBTIMR) ||
nmp->nm_sent < nmp->nm_cwnd)) {
-   splx(s);
if (nmp->nm_soflags & PR_CONNREQUIRED)
error = nfs_sndlock(>nm_flag, rep);
if (!error) {
@@ -939,7 +933,6 @@ tryagain:
rep->r_flags |= R_SENT;
}
} else {
-   splx(s);
rep->r_rtt = -1;
}
 
@@ -952,11 +945,9 @@ tryagain:
/*
 * RPC done, unlink the request.
 */
-   s = splsoftnet();
TAILQ_REMOVE(>nm_reqsq, rep, r_chain);
if (TAILQ_EMPTY(>nm_reqsq))
timeout_del(>nm_rtimeout);
-   splx(s);
 
/*
 * Decrement the outstanding request count.
Index: nfs/nfs_syscalls.c
===
RCS file: /cvs/src/sys/nfs/nfs_syscalls.c,v
retrieving revision 1.107
diff -u -p -r1.107 nfs_syscalls.c
--- nfs/nfs_syscalls.c  22 Feb 2017 11:42:46 -  1.107
+++ nfs/nfs_syscalls.c  15 May 2017 11:11:27 -
@@ -229,7 +229,7 @@ nfssvc_addsock(struct file *fp, struct m
struct nfssvc_sock *slp;
struct socket *so;
struct nfssvc_sock *tslp;
-   int error, s;
+   int error;
 
so = (struct socket *)fp->f_data;
tslp = NULL;
@@ -286,12 +286,10 @@ nfssvc_addsock(struct file *fp, struct m
slp->ns_nam = mynam;
fp->f_count++;
slp->ns_fp = fp;
-   s = splsoftnet();
so->so_upcallarg = (caddr_t)slp;
so->so_upcall = nfsrv_rcv;
slp->ns_flag = (SLP_VALID | SLP_NEEDQ);
nfsrv_wakenfsd(slp);
-   splx(s);
return (0);
 }
 
@@ -309,11 +307,10 @@ nfssvc_nfsd(struct nfsd *nfsd)
int *solockp;
struct nfsrv_descript *nd = NULL;
struct mbuf *mreq;
-   int error = 0, cacherep, s, sotype;
+   int error = 0, cacherep, sotype;
 
cacherep = RC_DOIT;
 
-   s = splsoftnet();
TAILQ_INSERT_TAIL(_head, nfsd, nfsd_chain);
nfs_numnfsd++;
 
@@ -357,8 +354,6 @@ loop:
goto loop;
}
 
-   splx(s);
-
so = slp->ns_so;

gif(4) vs splnet()

2017-05-15 Thread Martin Pieuchot
Similar to gre(4), the global list of interfaces needs to be protected
by the NET_LOCK().

ok?

Index: net/if_gif.c
===
RCS file: /cvs/src/sys/net/if_gif.c,v
retrieving revision 1.94
diff -u -p -r1.94 if_gif.c
--- net/if_gif.c4 May 2017 15:00:24 -   1.94
+++ net/if_gif.c15 May 2017 13:10:13 -
@@ -129,9 +129,9 @@ gif_clone_create(struct if_clone *ifc, i
 #if NBPFILTER > 0
bpfattach(>gif_if.if_bpf, >gif_if, DLT_LOOP, sizeof(u_int32_t));
 #endif
-   s = splnet();
+   NET_LOCK(s);
LIST_INSERT_HEAD(_softc_list, sc, gif_list);
-   splx(s);
+   NET_UNLOCK(s);
 
return (0);
 }
@@ -142,9 +142,9 @@ gif_clone_destroy(struct ifnet *ifp)
struct gif_softc *sc = ifp->if_softc;
int s;
 
-   s = splnet();
+   NET_LOCK(s);
LIST_REMOVE(sc, gif_list);
-   splx(s);
+   NET_UNLOCK(s);
 
if_detach(ifp);
 
@@ -323,7 +323,6 @@ gif_ioctl(struct ifnet *ifp, u_long cmd,
int error = 0, size;
struct sockaddr *dst, *src;
struct sockaddr *sa;
-   int s;
struct gif_softc *sc2;
 
switch (cmd) {
@@ -466,10 +465,8 @@ gif_ioctl(struct ifnet *ifp, u_long cmd,
bcopy((caddr_t)dst, (caddr_t)sa, dst->sa_len);
sc->gif_pdst = sa;
 
-   s = splnet();
ifp->if_flags |= IFF_RUNNING;
if_up(ifp); /* send up RTM_IFINFO */
-   splx(s);
 
error = 0;
break;



Re: splsoftnet() in SO_SPLICE

2017-05-15 Thread Alexander Bluhm
On Mon, May 15, 2017 at 01:20:02PM +0200, Martin Pieuchot wrote:
> Here's the last of the splsoftnet(), is it ok if I replace it with a
> KERNEL_LOCK() to make it clear this needs protection?

I think it needs the socket lock.  somove() is modifying the
so_splicelen value.  somove() calls soassertlocked().

bluhm

> 
> Index: kern/uipc_socket.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_socket.c,v
> retrieving revision 1.182
> diff -u -p -r1.182 uipc_socket.c
> --- kern/uipc_socket.c2 Apr 2017 23:40:08 -   1.182
> +++ kern/uipc_socket.c15 May 2017 11:18:21 -
> @@ -1860,12 +1862,12 @@ sogetopt(struct socket *so, int level, i
>   case SO_SPLICE:
>   {
>   off_t len;
> - int s = splsoftnet();
>  
> + KERNEL_LOCK();
>   m->m_len = sizeof(off_t);
>   len = so->so_sp ? so->so_sp->ssp_len : 0;
>   memcpy(mtod(m, off_t *), , sizeof(off_t));
> - splx(s);
> + KERNEL_UNLOCK();
>   break;
>   }
>  #endif /* SOCKET_SPLICE */



Re: Enable NET_LOCK(), take 3

2017-05-15 Thread Alexander Bluhm
On Mon, May 15, 2017 at 11:49:18AM +0200, Martin Pieuchot wrote:
> Here's the diff that got reverted before release.  My plan is to enable
> it and then get rid of the remaining splsoftnet() to be able to fix the
> remaining XXXSMP.
> 
> ok?

This diff looks familiar.  Put it in so we can fix fallout.

OK bluhm@

> 
> Index: kern/uipc_socket.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_socket.c,v
> retrieving revision 1.182
> diff -u -p -r1.182 uipc_socket.c
> --- kern/uipc_socket.c2 Apr 2017 23:40:08 -   1.182
> +++ kern/uipc_socket.c15 May 2017 09:35:52 -
> @@ -1038,10 +1038,12 @@ sorflush(struct socket *so)
>  {
>   struct sockbuf *sb = >so_rcv;
>   struct protosw *pr = so->so_proto;
> + sa_family_t af = pr->pr_domain->dom_family;
>   struct sockbuf asb;
>  
>   sb->sb_flags |= SB_NOINTR;
> - sblock(sb, M_WAITOK, NULL);
> + sblock(sb, M_WAITOK,
> + (af != PF_LOCAL && af != PF_ROUTE) ?  : NULL);
>   socantrcvmore(so);
>   sbunlock(sb);
>   asb = *sb;
> Index: kern/uipc_socket2.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_socket2.c,v
> retrieving revision 1.75
> diff -u -p -r1.75 uipc_socket2.c
> --- kern/uipc_socket2.c   17 Mar 2017 17:19:16 -  1.75
> +++ kern/uipc_socket2.c   15 May 2017 09:35:52 -
> @@ -299,7 +299,11 @@ soassertlocked(struct socket *so)
>  int
>  sosleep(struct socket *so, void *ident, int prio, const char *wmesg, int 
> timo)
>  {
> - return tsleep(ident, prio, wmesg, timo);
> + if ((so->so_proto->pr_domain->dom_family != PF_LOCAL) &&
> + (so->so_proto->pr_domain->dom_family != PF_ROUTE)) {
> + return rwsleep(ident, , prio, wmesg, timo);
> + } else
> + return tsleep(ident, prio, wmesg, timo);
>  }
>  
>  /*
> Index: net/if.c
> ===
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.495
> diff -u -p -r1.495 if.c
> --- net/if.c  9 May 2017 09:31:07 -   1.495
> +++ net/if.c  15 May 2017 09:35:52 -
> @@ -230,6 +230,12 @@ struct taskq *softnettq;
>  struct task if_input_task_locked = TASK_INITIALIZER(if_netisr, NULL);
>  
>  /*
> + * Serialize socket operations to ensure no new sleeping points
> + * are introduced in IP output paths.
> + */
> +struct rwlock netlock = RWLOCK_INITIALIZER("netlock");
> +
> +/*
>   * Network interface utility routines.
>   */
>  void
> @@ -1146,7 +1152,10 @@ if_clone_create(const char *name, int rd
>   if (ifunit(name) != NULL)
>   return (EEXIST);
>  
> + /* XXXSMP breaks atomicity */
> + rw_exit_write();
>   ret = (*ifc->ifc_create)(ifc, unit);
> + rw_enter_write();
>  
>   if (ret != 0 || (ifp = ifunit(name)) == NULL)
>   return (ret);
> @@ -1188,7 +1197,10 @@ if_clone_destroy(const char *name)
>   splx(s);
>   }
>  
> + /* XXXSMP breaks atomicity */
> + rw_exit_write();
>   ret = (*ifc->ifc_destroy)(ifp);
> + rw_enter_write();
>  
>   return (ret);
>  }
> Index: net/if_pflow.c
> ===
> RCS file: /cvs/src/sys/net/if_pflow.c,v
> retrieving revision 1.75
> diff -u -p -r1.75 if_pflow.c
> --- net/if_pflow.c17 Mar 2017 17:19:16 -  1.75
> +++ net/if_pflow.c15 May 2017 09:35:52 -
> @@ -463,9 +463,12 @@ pflowioctl(struct ifnet *ifp, u_long cmd
>   sc->sc_gcounter=pflowstats.pflow_flows;
>   /* send templates on startup */
>   if (sc->sc_version == PFLOW_PROTO_10) {
> + /* XXXSMP breaks atomicity */
> + rw_exit_write();
>   s = splnet();
>   pflow_sendout_ipfix_tmpl(sc);
>   splx(s);
> + rw_enter_write();
>   }
>   } else
>   ifp->if_flags &= ~IFF_RUNNING;
> @@ -505,6 +508,8 @@ pflowioctl(struct ifnet *ifp, u_long cmd
>   sizeof(pflowr
>   return (error);
>  
> + /* XXXSMP breaks atomicity */
> + rw_exit_write();
>   s = splnet();
>   error = pflow_set(sc, );
>   splx(s);
> @@ -522,6 +527,7 @@ pflowioctl(struct ifnet *ifp, u_long cmd
>   } else
>   ifp->if_flags &= ~IFF_RUNNING;
>  
> + rw_enter_write();
>   break;
>  
>   default:
> Index: net/pf.c
> ===
> RCS file: /cvs/src/sys/net/pf.c,v
> retrieving revision 1.1021
> diff -u -p -r1.1021 pf.c
> --- net/pf.c  5 May 2017 16:30:39 -   1.1021
> +++ net/pf.c  15 May 2017 09:35:52 -
> @@ -1154,26 

Re: splsoftnet() in SO_SPLICE

2017-05-15 Thread Alexander Bluhm
On Mon, May 15, 2017 at 02:32:07PM +0200, Martin Pieuchot wrote:
> On 15/05/17(Mon) 13:35, Alexander Bluhm wrote:
> > On Mon, May 15, 2017 at 01:20:02PM +0200, Martin Pieuchot wrote:
> > > Here's the last of the splsoftnet(), is it ok if I replace it with a
> > > KERNEL_LOCK() to make it clear this needs protection?
> > 
> > I think it needs the socket lock.  somove() is modifying the
> > so_splicelen value.  somove() calls soassertlocked().
> 
> Fine, updated diff below.

OK bluhm@

> 
> Index: kern/uipc_socket.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_socket.c,v
> retrieving revision 1.183
> diff -u -p -r1.183 uipc_socket.c
> --- kern/uipc_socket.c15 May 2017 12:26:00 -  1.183
> +++ kern/uipc_socket.c15 May 2017 12:31:39 -
> @@ -1862,12 +1862,13 @@ sogetopt(struct socket *so, int level, i
>   case SO_SPLICE:
>   {
>   off_t len;
> - int s = splsoftnet();
> + int s;
>  
> + s = solock(so);
>   m->m_len = sizeof(off_t);
>   len = so->so_sp ? so->so_sp->ssp_len : 0;
>   memcpy(mtod(m, off_t *), , sizeof(off_t));
> - splx(s);
> + sounlock(s);
>   break;
>   }
>  #endif /* SOCKET_SPLICE */



mpw(4) vs splnet()

2017-05-15 Thread Martin Pieuchot
if_input() does not need to be protected, ok?

Index: net/if_mpw.c
===
RCS file: /cvs/src/sys/net/if_mpw.c,v
retrieving revision 1.21
diff -u -p -r1.21 if_mpw.c
--- net/if_mpw.c4 May 2017 15:00:24 -   1.21
+++ net/if_mpw.c15 May 2017 12:58:25 -
@@ -268,7 +268,6 @@ mpw_output(struct ifnet *ifp, struct mbu
struct mbuf_list ml = MBUF_LIST_INITIALIZER();
struct ether_header *eh, ehc;
struct shim_hdr *shim;
-   int s;
 
if (sc->sc_type == IMR_TYPE_NONE) {
m_freem(m);
@@ -324,10 +323,7 @@ mpw_output(struct ifnet *ifp, struct mbu
}
 
ml_enqueue(, m);
-
-   s = splnet();
if_input(ifp, );
-   splx(s);
 
return (0);
 }



Get rid of splsoftassert(IPL_SOFTNET)

2017-05-15 Thread Martin Pieuchot
I insisted we keep this macro to differentiate which code has been
audited and really need the NET_LOCK() until we were ready to switch.

This time has come, so I'd like to replace all of them.  This will
allow us to get rid of the SPL dance altogether and turn the NET_LOCK()
free of any argument.

This requires the NET_LOCK() diff sent earlier, ok?

diff --git sys/net/bfd.c sys/net/bfd.c
index 560bab52728..69600b73359 100644
--- sys/net/bfd.c
+++ sys/net/bfd.c
@@ -236,8 +236,6 @@ bfd_clear_task(void *arg)
struct rtentry *rt = (struct rtentry *)arg;
struct bfd_config *bfd;
 
-   splsoftassert(IPL_SOFTNET);
-
if ((bfd = bfd_lookup(rt)) == NULL)
return;
 
diff --git sys/net/bridgestp.c sys/net/bridgestp.c
index dec7b86836c..38271cb4770 100644
--- sys/net/bridgestp.c
+++ sys/net/bridgestp.c
@@ -1584,7 +1584,7 @@ bstp_notify_rtage(struct bstp_port *bp, int pending)
 {
int age = 0;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
switch (bp->bp_protover) {
case BSTP_PROTO_STP:
diff --git sys/net/if.c sys/net/if.c
index 41e2d9a2747..5ef1b5717d2 100644
--- sys/net/if.c
+++ sys/net/if.c
@@ -410,7 +410,7 @@ if_attachsetup(struct ifnet *ifp)
 {
unsigned long ifidx;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
TAILQ_INIT(>if_groups);
 
@@ -1143,7 +1143,7 @@ if_clone_create(const char *name, int rdomain)
struct ifnet *ifp;
int unit, ret;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
ifc = if_clone_lookup(name, );
if (ifc == NULL)
@@ -1177,7 +1177,7 @@ if_clone_destroy(const char *name)
struct ifnet *ifp;
int ret;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
ifc = if_clone_lookup(name, NULL);
if (ifc == NULL)
@@ -1516,7 +1516,7 @@ if_downall(void)
 void
 if_down(struct ifnet *ifp)
 {
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
ifp->if_flags &= ~IFF_UP;
getmicrotime(>if_lastchange);
@@ -1532,7 +1532,7 @@ if_down(struct ifnet *ifp)
 void
 if_up(struct ifnet *ifp)
 {
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
ifp->if_flags |= IFF_UP;
getmicrotime(>if_lastchange);
diff --git sys/net/if_bridge.c sys/net/if_bridge.c
index 377f6f5d074..f167098a521 100644
--- sys/net/if_bridge.c
+++ sys/net/if_bridge.c
@@ -1469,7 +1469,7 @@ bridge_ipsec(struct bridge_softc *sc, struct ifnet *ifp,
break;
}
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
tdb = gettdb(ifp->if_rdomain, spi, , proto);
if (tdb != NULL && (tdb->tdb_flags & TDBF_INVALID) == 0 &&
diff --git sys/net/if_pfsync.c sys/net/if_pfsync.c
index 9a1f3c44589..8757fd0ae30 100644
--- sys/net/if_pfsync.c
+++ sys/net/if_pfsync.c
@@ -1180,7 +1180,7 @@ pfsync_update_net_tdb(struct pfsync_tdb *pt)
 {
struct tdb  *tdb;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
/* check for invalid values */
if (ntohl(pt->spi) <= SPI_RESERVED_MAX ||
@@ -1686,7 +1686,7 @@ pfsync_insert_state(struct pf_state *st)
 {
struct pfsync_softc *sc = pfsyncif;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
if (ISSET(st->rule.ptr->rule_flag, PFRULE_NOSYNC) ||
st->key[PF_SK_WIRE]->proto == IPPROTO_PFSYNC) {
@@ -1716,7 +1716,7 @@ pfsync_defer(struct pf_state *st, struct mbuf *m)
struct pfsync_softc *sc = pfsyncif;
struct pfsync_deferral *pd;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
if (!sc->sc_defer ||
ISSET(st->state_flags, PFSTATE_NOSYNC) ||
@@ -1756,7 +1756,7 @@ pfsync_undefer(struct pfsync_deferral *pd, int drop)
struct pfsync_softc *sc = pfsyncif;
struct pf_pdesc pdesc;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
TAILQ_REMOVE(>sc_deferrals, pd, pd_entry);
sc->sc_deferred--;
@@ -1821,7 +1821,7 @@ pfsync_deferred(struct pf_state *st, int drop)
struct pfsync_softc *sc = pfsyncif;
struct pfsync_deferral *pd;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
TAILQ_FOREACH(pd, >sc_deferrals, pd_entry) {
 if (pd->pd_st == st) {
@@ -1840,7 +1840,7 @@ pfsync_update_state(struct pf_state *st)
struct pfsync_softc *sc = pfsyncif;
int sync = 0;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
if (sc == NULL || !ISSET(sc->sc_if.if_flags, IFF_RUNNING))
return;
@@ -2009,7 +2009,7 @@ pfsync_delete_state(struct pf_state *st)
 {
struct pfsync_softc *sc = pfsyncif;
 
-   splsoftassert(IPL_SOFTNET);
+   NET_ASSERT_LOCKED();
 
if (sc == NULL || !ISSET(sc->sc_if.if_flags, IFF_RUNNING))
return;
@@ -2056,7 +2056,7 @@ 

Re: rasops: fix virtual console initialization

2017-05-15 Thread Stefan Sperling
On Sun, May 14, 2017 at 08:44:54PM -0500, joshua stein wrote:
> Only copy the console buffer contents to a rasops virtual console if
> it's the first one (visible).
> 
> Otherwise all of the other virtual consoles initialize with the
> trailing end of the kernel messages and then getty writes over them,
> like this: https://imgur.com/a/mAXKf

I have seen this problem, too. OK.

> Index: dev/rasops/rasops.c
> ===
> RCS file: /cvs/src/sys/dev/rasops/rasops.c,v
> retrieving revision 1.44
> diff -u -p -u -p -r1.44 rasops.c
> --- dev/rasops/rasops.c   15 Dec 2016 19:18:41 -  1.44
> +++ dev/rasops/rasops.c   15 May 2017 01:40:22 -
> @@ -1398,7 +1398,7 @@ rasops_alloc_screen(void *v, void **cook
>   scr->rs_crow = -1;
>   scr->rs_ccol = -1;
>  
> - if (ri->ri_bs) {
> + if (ri->ri_bs && scr->rs_visible) {
>   memcpy(scr->rs_bs, ri->ri_bs, ri->ri_rows * ri->ri_cols *
>   sizeof(struct wsdisplay_charcell));
>   } else {
> 



splsoftnet() in SO_SPLICE

2017-05-15 Thread Martin Pieuchot
Here's the last of the splsoftnet(), is it ok if I replace it with a
KERNEL_LOCK() to make it clear this needs protection?

Index: kern/uipc_socket.c
===
RCS file: /cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.182
diff -u -p -r1.182 uipc_socket.c
--- kern/uipc_socket.c  2 Apr 2017 23:40:08 -   1.182
+++ kern/uipc_socket.c  15 May 2017 11:18:21 -
@@ -1860,12 +1862,12 @@ sogetopt(struct socket *so, int level, i
case SO_SPLICE:
{
off_t len;
-   int s = splsoftnet();
 
+   KERNEL_LOCK();
m->m_len = sizeof(off_t);
len = so->so_sp ? so->so_sp->ssp_len : 0;
memcpy(mtod(m, off_t *), , sizeof(off_t));
-   splx(s);
+   KERNEL_UNLOCK();
break;
}
 #endif /* SOCKET_SPLICE */



Re: [patch] ND_COMPUTER_RTIME is not uniformly distributed

2017-05-15 Thread Matthew Martin
ping

On Sun, May 07, 2017 at 06:59:12PM -0500, Matthew Martin wrote:
> RFC 4861 specifies ReachableTime "should be a uniformly distributed
> random value between MIN_RANDOM_FACTOR and MAX_RANDOM_FACTOR times
> BaseReachableTime milliseconds." I think the author intended to do the
> multiplication by (x>>10) outside the mask, but it's still missing a -1.
> 
> - Matthew Martin
> 
> 
> 
> diff --git nd6.h nd6.h
> index 4274cd4dd07..8eea089a40c 100644
> --- nd6.h
> +++ nd6.h
> @@ -162,8 +162,8 @@ structllinfo_nd6 {
>  #define MIN_RANDOM_FACTOR512 /* 1024 * 0.5 */
>  #define MAX_RANDOM_FACTOR1536/* 1024 * 1.5 */
>  #define ND_COMPUTE_RTIME(x) \
> - (((MIN_RANDOM_FACTOR * (x >> 10)) + (arc4random() & \
> - ((MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR) * (x >> 10 /1000)
> + (((arc4random() & (MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR - 1)) \
> + + MIN_RANDOM_FACTOR) * (x >> 10) / 1000)
>  
>  TAILQ_HEAD(nd_drhead, nd_defrouter);
>  struct   nd_defrouter {



Re: splsoftnet() in SO_SPLICE

2017-05-15 Thread Martin Pieuchot
On 15/05/17(Mon) 13:35, Alexander Bluhm wrote:
> On Mon, May 15, 2017 at 01:20:02PM +0200, Martin Pieuchot wrote:
> > Here's the last of the splsoftnet(), is it ok if I replace it with a
> > KERNEL_LOCK() to make it clear this needs protection?
> 
> I think it needs the socket lock.  somove() is modifying the
> so_splicelen value.  somove() calls soassertlocked().

Fine, updated diff below.

Index: kern/uipc_socket.c
===
RCS file: /cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.183
diff -u -p -r1.183 uipc_socket.c
--- kern/uipc_socket.c  15 May 2017 12:26:00 -  1.183
+++ kern/uipc_socket.c  15 May 2017 12:31:39 -
@@ -1862,12 +1862,13 @@ sogetopt(struct socket *so, int level, i
case SO_SPLICE:
{
off_t len;
-   int s = splsoftnet();
+   int s;
 
+   s = solock(so);
m->m_len = sizeof(off_t);
len = so->so_sp ? so->so_sp->ssp_len : 0;
memcpy(mtod(m, off_t *), , sizeof(off_t));
-   splx(s);
+   sounlock(s);
break;
}
 #endif /* SOCKET_SPLICE */



Most of the splnet() in /sys/net are wrong

2017-05-15 Thread Martin Pieuchot
They all need to be audited and possibly changed to a solution that work
with the NET_LOCK(). Just saying,  enc(4) is a nightmare.

Here's a diff fro gre(4).  The 'softnet' iterates over ``gre_softc_list'' 
so we need to make sure it isn't modified when this happen.  We need the
NET_LOCK() were splnet() was used.

ok?

Index: net/if_gre.c
===
RCS file: /cvs/src/sys/net/if_gre.c,v
retrieving revision 1.84
diff -u -p -r1.84 if_gre.c
--- net/if_gre.c24 Jan 2017 10:08:30 -  1.84
+++ net/if_gre.c15 May 2017 12:45:25 -
@@ -165,9 +165,9 @@ gre_clone_create(struct if_clone *ifc, i
 #if NBPFILTER > 0
bpfattach(>sc_if.if_bpf, >sc_if, DLT_LOOP, sizeof(u_int32_t));
 #endif
-   s = splnet();
+   NET_LOCK(s);
LIST_INSERT_HEAD(_softc_list, sc, sc_list);
-   splx(s);
+   NET_UNLOCK(s);
 
return (0);
 }
@@ -178,11 +178,11 @@ gre_clone_destroy(struct ifnet *ifp)
struct gre_softc *sc = ifp->if_softc;
int s;
 
-   s = splnet();
timeout_del(>sc_ka_snd);
timeout_del(>sc_ka_hold);
+   NET_LOCK(s);
LIST_REMOVE(sc, sc_list);
-   splx(s);
+   NET_UNLOCK(s);
 
if_detach(ifp);
 



pflog(4) vs splnet()

2017-05-15 Thread Martin Pieuchot
Kill unused global list of softc and protect the global array of
interface by the NET_LOCK().

ok?

Index: net/if_pflog.c
===
RCS file: /cvs/src/sys/net/if_pflog.c,v
retrieving revision 1.78
diff -u -p -r1.78 if_pflog.c
--- net/if_pflog.c  24 Jan 2017 10:08:30 -  1.78
+++ net/if_pflog.c  15 May 2017 13:19:39 -
@@ -83,7 +83,6 @@ int   pflog_clone_create(struct if_clone *
 intpflog_clone_destroy(struct ifnet *);
 void   pflog_bpfcopy(const void *, void *, size_t);
 
-LIST_HEAD(, pflog_softc)   pflogif_list;
 struct if_clonepflog_cloner =
 IF_CLONE_INITIALIZER("pflog", pflog_clone_create, pflog_clone_destroy);
 
@@ -94,7 +93,6 @@ struct mbuf*pflog_mhdr = NULL, *pflog_
 void
 pflogattach(int npflog)
 {
-   LIST_INIT(_list);
if (pflog_mhdr == NULL)
if ((pflog_mhdr = m_get(M_DONTWAIT, MT_HEADER)) == NULL)
panic("pflogattach: no mbuf");
@@ -110,6 +108,8 @@ pflogifs_resize(size_t n)
struct ifnet**p;
int   i;
 
+   NET_ASSERT_LOCKED();
+
if (n > SIZE_MAX / sizeof(*p))
return (EINVAL);
if (n == 0)
@@ -161,14 +161,13 @@ pflog_clone_create(struct if_clone *ifc,
bpfattach(>sc_if.if_bpf, ifp, DLT_PFLOG, PFLOG_HDRLEN);
 #endif
 
-   s = splnet();
-   LIST_INSERT_HEAD(_list, pflogif, sc_list);
+   NET_LOCK(s);
if (unit + 1 > npflogifs && pflogifs_resize(unit + 1) != 0) {
-   splx(s);
+   NET_UNLOCK(s);
return (ENOMEM);
}
pflogifs[unit] = ifp;
-   splx(s);
+   NET_UNLOCK(s);
 
return (0);
 }
@@ -179,15 +178,13 @@ pflog_clone_destroy(struct ifnet *ifp)
struct pflog_softc  *pflogif = ifp->if_softc;
int  s, i;
 
-   s = splnet();
+   NET_LOCK(s);
pflogifs[pflogif->sc_unit] = NULL;
-   LIST_REMOVE(pflogif, sc_list);
-
for (i = npflogifs; i > 0 && pflogifs[i - 1] == NULL; i--)
; /* nothing */
if (i < npflogifs)
pflogifs_resize(i); /* error harmless here */
-   splx(s);
+   NET_UNLOCK(s);
 
if_detach(ifp);
free(pflogif, M_DEVBUF, 0);
Index: net/if_pflog.h
===
RCS file: /cvs/src/sys/net/if_pflog.h,v
retrieving revision 1.26
diff -u -p -r1.26 if_pflog.h
--- net/if_pflog.h  12 Feb 2015 01:24:10 -  1.26
+++ net/if_pflog.h  15 May 2017 13:21:13 -
@@ -64,7 +64,6 @@ struct pfloghdr {
 struct pflog_softc {
struct ifnetsc_if;  /* the interface */
int sc_unit;
-   LIST_ENTRY(pflog_softc) sc_list;
 };
 
 #if NPFLOG > 0



Re: Atomic copyin(9)/copyout(9) for amd64

2017-05-15 Thread Visa Hankala
On Sun, May 14, 2017 at 10:05:28PM +0200, Mark Kettenis wrote:
> I tried to convert a few more architectures, and things get a bit
> messier than I envisioned on architectures that have an optimized copy
> function and care about alignment.  Especially when copyin(9) is
> implemented in assembly.  So I implemented a new function called
> copyin_futex(9), which is all we really need.  The implementations for
> powerpc and sparc64 below have been (lightly) tested.  The regression
> test that mpi@ created passes and doesn't crash the kernel!  I have an
> (untested) implementation for hppa as well.  The amd64 and i386
> implementation can simply be a wrapper around copyin(9).  The same
> holds for all out non-MULTIPROCESSOR architectures.  So my idea is to
> let kern/sys_futex.c provide that wrapper if MULTIPROCESSOR isn't
> defined and add an #ifdef MULTIPROCESSOR around all MD copyin_futex()
> implementations.
> 
> Thoughts?

This direction is much better than the previous one in my opinion.



Re: AUDIO_DEV_SOUND

2017-05-15 Thread Alexandre Ratchov
On Mon, May 15, 2017 at 10:14:48AM +0200, Jan Stary wrote:
> While /dev/sound is no more, the
> 
>   #define AUDIO_DEV_SOUND 0   /* minor of /dev/sound0 */
> 
> apparently cannot be just removed from dev/audio.c
> - with the diff below, all audio applications complain
> that the default audio device cannot be opened. Why is that?
> That comment seems misleading at any rate.

$ ls -al /dev/audio[0-9]*
crw-rw-rw-  1 root  wheel   42,   0 Jan 16 05:59 /dev/audio0
crw-rw-rw-  1 root  wheel   42,   1 May 15 08:25 /dev/audio1
crw-rw-rw-  1 root  wheel   42,   2 Nov 12  2016 /dev/audio2
crw-rw-rw-  1 root  wheel   42,   3 Sep  9  2016 /dev/audio3
crw-rw-rw-  1 root  wheel   42,   4 Sep  8  2016 /dev/audio4

The devices with minor numbers 0..15 are the ones we still use.  So
according to the sources:

#define AUDIO_DEV_AUDIO 0x80/* minor of /dev/audio0 */

is the one to be removed (then AUDIO_DEV_SOUND should be renamed to
AUDIO_DEV_AUDIO, and comment fixed, to avoid confusion).

> ===
> RCS file: /cvs/src/sys/dev/audio.c,v
> retrieving revision 1.163
> diff -u -p -r1.163 audio.c
> --- dev/audio.c   3 May 2017 06:56:54 -   1.163
> +++ dev/audio.c   15 May 2017 08:08:32 -
> @@ -50,7 +50,6 @@
>  #define DEVNAME(sc)  ((sc)->dev.dv_xname)
>  #define AUDIO_UNIT(n)(minor(n) & 0x0f)
>  #define AUDIO_DEV(n) (minor(n) & 0xf0)
> -#define AUDIO_DEV_SOUND  0   /* minor of /dev/sound0 */
>  #define AUDIO_DEV_MIXER  0x10/* minor of /dev/mixer0 */
>  #define AUDIO_DEV_AUDIO  0x80/* minor of /dev/audio0 */
^
wrong comments!



Re: pf: percpu anchor stacks

2017-05-15 Thread Alexandr Nedvedicky
Hello,

> Now *is* the time to commit the first step, the refactoring.  Once
> that's done we can discuss the introduction of the context.
> 
> Could you come up with such diff?

first of all: I have not managed to finish the re-factoring step yet, work
is still in progress. I stole some cycles from other projects, but it was
not enough apparently. Must try harder next time...


> > > Does this pass pfctl regress tests?
> > 
> > I'm about to run those tests for OpenBSD.
> 
> Did you manage to do that?

I have some update on testing of final patch. I've used pf_forward tests to
make sure the code I'm changing gets executed. I'm still working on
testcase, which covers deeper anchor tree with once-rules.

the pf_forward tests show no harm caused by my changes, though I saw some
failures:

Makefile:217 'run-regress-udp-inet-RTT_IN'
Makefile:217 'run-regress-udp-inet6-ECO_IN'
Makefile:217 'run-regress-udp-inet6-ECO_OUT'
Makefile:217 'run-regress-udp-inet6-RDR_IN'
Makefile:217 'run-regress-udp-inet6-RDR_OUT'
Makefile:217 'run-regress-udp-inet6-RTT_IN'
Makefile:215 'run-regress-udp-inet6-RPT_OUT'
Makefile:257 'run-regress-traceroute-udp-inet6-AF_IN'

I could see same failures in baseline (tree _without_ my changes). I took a
closer look to find out what's going on there. I took a tcpdump at ECO:
#
# tcpdump -i vnet1 running on ECO (192.168.214.188, 192.168.3.20)
#
13:27:31.712955 192.168.1.10.42707 > 192.168.214.188.echo: udp 3
13:27:31.713616 192.168.3.20.echo > 192.168.1.10.42707: udp 3
13:27:31.714693 192.168.1.10 > 192.168.3.20: icmp: 192.168.1.10
udp port 42707 unreachable
#
# output above shows we get answer from .3.20 instead of .214.188
# looks as a kind of yet another bug.
#

There are multiple IP addresses bound to ECO IN/OUT interface. However
UDP socket at ECO always answers using primary IP address bound to ECO
interface. The answer triggers ICMP port unreachable at SRC (192.168.1.10)

> > >  - s/test_status/action/ as it's done everywhere else?
> > 
> > I've opted to test_status, because it's something different to 'action'
> > as we use it in current code.
> 
> I agree with you for test_status.  What about naming the enum and use it
> instead of 'int' for the variable?  This implicitly documents the possible
> option and allow the compiler to check for missing cases in switch.

I'm attaching updated final patch, which accepts your suggestion.

thanks and
regards
sasha

8<---8<---8<--8<
diff -r d1adecdc78cc src/sys/net/pf.c
--- a/src/sys/net/pf.c  Fri May 12 00:09:06 2017 +0200
+++ b/src/sys/net/pf.c  Mon May 15 13:36:45 2017 +0200
@@ -119,12 +119,54 @@ u_char pf_tcp_secret[16];
 int pf_tcp_secret_init;
 int pf_tcp_iss_off;
 
-struct pf_anchor_stackframe {
-   struct pf_ruleset   *rs;
-   struct pf_rule  *r;
-   struct pf_anchor_node   *parent;
-   struct pf_anchor*child;
-} pf_anchor_stack[64];
+enum pf_test_status {
+   PF_TEST_FAIL = -1,
+   PF_TEST_OK,
+   PF_TEST_QUICK
+};
+
+struct pf_test_ctx {
+   enum pf_test_status   test_status;
+   struct pf_pdesc  *pd;
+   struct pf_rule_actionsact;
+   u_int8_t  icmpcode;
+   u_int8_t  icmptype;
+   int   icmp_dir;
+   int   state_icmp;
+   int   tag;
+   u_short   reason;
+   struct pf_rule_item  *ri;
+   struct pf_src_node   *sns[PF_SN_MAX];
+   struct pf_rule_slist  rules;
+   struct pf_rule   *nr;
+   struct pf_rule  **rm;
+   struct pf_rule   *a;
+   struct pf_rule  **am;
+   struct pf_ruleset   **rsm;
+   struct pf_ruleset*arsm;
+   struct pf_ruleset*aruleset;
+   struct tcphdr*th;
+   int   depth;
+};
+
+#definePF_ANCHOR_STACK_MAX 64
+
+/*
+ * Cannot fold into pf_pdesc directly, unknown storage size outside pf.c.
+ * Keep in sync with union pf_headers in pflog_bpfcopy() in if_pflog.c.
+ */
+union pf_headers {
+   struct tcphdr   tcp;
+   struct udphdr   udp;
+   struct icmp icmp;
+#ifdef INET6
+   struct icmp6_hdricmp6;
+   struct mld_hdr  mld;
+   struct nd_neighbor_solicit nd_ns;
+#endif /* INET6 */
+};
+
+
 
 struct pool pf_src_tree_pl, pf_rule_pl, pf_queue_pl;
 struct pool pf_state_pl, pf_state_key_pl, pf_state_item_pl;
@@ -211,11 +253,8 @@ struct pf_state*pf_find_state(struct p
   

Re: Most of the splnet() in /sys/net are wrong

2017-05-15 Thread Alexander Bluhm
On Mon, May 15, 2017 at 03:06:08PM +0200, Martin Pieuchot wrote:
> They all need to be audited and possibly changed to a solution that work
> with the NET_LOCK(). Just saying,  enc(4) is a nightmare.
> 
> Here's a diff fro gre(4).  The 'softnet' iterates over ``gre_softc_list'' 
> so we need to make sure it isn't modified when this happen.  We need the
> NET_LOCK() were splnet() was used.
> 
> ok?

OK bluhm@

> 
> Index: net/if_gre.c
> ===
> RCS file: /cvs/src/sys/net/if_gre.c,v
> retrieving revision 1.84
> diff -u -p -r1.84 if_gre.c
> --- net/if_gre.c  24 Jan 2017 10:08:30 -  1.84
> +++ net/if_gre.c  15 May 2017 12:45:25 -
> @@ -165,9 +165,9 @@ gre_clone_create(struct if_clone *ifc, i
>  #if NBPFILTER > 0
>   bpfattach(>sc_if.if_bpf, >sc_if, DLT_LOOP, sizeof(u_int32_t));
>  #endif
> - s = splnet();
> + NET_LOCK(s);
>   LIST_INSERT_HEAD(_softc_list, sc, sc_list);
> - splx(s);
> + NET_UNLOCK(s);
>  
>   return (0);
>  }
> @@ -178,11 +178,11 @@ gre_clone_destroy(struct ifnet *ifp)
>   struct gre_softc *sc = ifp->if_softc;
>   int s;
>  
> - s = splnet();
>   timeout_del(>sc_ka_snd);
>   timeout_del(>sc_ka_hold);
> + NET_LOCK(s);
>   LIST_REMOVE(sc, sc_list);
> - splx(s);
> + NET_UNLOCK(s);
>  
>   if_detach(ifp);
>  



Re: [patch] ND_COMPUTER_RTIME is not uniformly distributed

2017-05-15 Thread Mike Belopuhov
On Sun, May 07, 2017 at 18:59 -0500, Matthew Martin wrote:
> RFC 4861 specifies ReachableTime "should be a uniformly distributed
> random value between MIN_RANDOM_FACTOR and MAX_RANDOM_FACTOR times
> BaseReachableTime milliseconds." I think the author intended to do the
> multiplication by (x>>10) outside the mask, but it's still missing a -1.
> 
> - Matthew Martin
> 
> 
> 
> diff --git nd6.h nd6.h
> index 4274cd4dd07..8eea089a40c 100644
> --- nd6.h
> +++ nd6.h
> @@ -162,8 +162,8 @@ structllinfo_nd6 {
>  #define MIN_RANDOM_FACTOR512 /* 1024 * 0.5 */
>  #define MAX_RANDOM_FACTOR1536/* 1024 * 1.5 */
>  #define ND_COMPUTE_RTIME(x) \
> - (((MIN_RANDOM_FACTOR * (x >> 10)) + (arc4random() & \
> - ((MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR) * (x >> 10 /1000)
> + (((arc4random() & (MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR - 1)) \
> + + MIN_RANDOM_FACTOR) * (x >> 10) / 1000)
>  
>  TAILQ_HEAD(nd_drhead, nd_defrouter);
>  struct   nd_defrouter {
> 

Hmm, how about we use an arc4random_uniform here?  It will generate
numbers less than the upper bound specified as an argument so I don't
need to -1 there.

I don't think they wanted to do a multiplication by (x>>10) outside
the mask and in this case as well we want arc4random_uniform to operate
on an extended range.

Does this look good?

diff --git sys/netinet6/nd6.h sys/netinet6/nd6.h
index 4274cd4dd07..01c0fe2c7af 100644
--- sys/netinet6/nd6.h
+++ sys/netinet6/nd6.h
@@ -160,12 +160,12 @@ structllinfo_nd6 {
 #define REACHABLE_TIME 3   /* msec */
 #define RETRANS_TIMER  1000/* msec */
 #define MIN_RANDOM_FACTOR  512 /* 1024 * 0.5 */
 #define MAX_RANDOM_FACTOR  1536/* 1024 * 1.5 */
 #define ND_COMPUTE_RTIME(x) \
-   (((MIN_RANDOM_FACTOR * (x >> 10)) + (arc4random() & \
-   ((MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR) * (x >> 10 /1000)
+   ((arc4random_uniform((MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR) * \
+   (x >> 10)) + MIN_RANDOM_FACTOR * (x >> 10)) / 1000)
 
 TAILQ_HEAD(nd_drhead, nd_defrouter);
 struct nd_defrouter {
TAILQ_ENTRY(nd_defrouter) dr_entry;
struct  in6_addr rtaddr;



Re: Atomic copyin(9)/copyout(9) for amd64

2017-05-15 Thread Mark Kettenis
> From: Miod Vallat 
> Date: Mon, 15 May 2017 06:06:22 + (UTC)
> 
> >   So I implemented a new function called
> > copyin_futex(9), which is all we really need.
> 
> But it is not specific to futex - in fact, it could be used in syscall()
> as well.
> 
> Better call it fuword() or aligned_fuword() since it has the extra
> alignment requirement that fuword() didn't have.

Heh, we had a bikeshed about the name already.  To quote myself:

> Problem with the classic fuword(9) API is that it return -1 on error
> (when copyin(9) would return EFAULT).  So you can't distinguish
> between the futex being -1 and an invalid address.  FreeBSD has
> fueword32(9), which doesn't suffer from this problem (but returns -1
> instead of EFAULT upon error).

So in the end we settled on copyin32() in order not to introduce an
interface that has slightly different semantics than the FreeBSD one.

The reason I changed this to copyin_futex() is that the in this
proposal the call is only truly atomic for MULTIPROCESSOR kernels.
This is good enough for futex(2), but I'm not absolutely confident
that it is good enough for all possible use cases.

That said, if there is consensus that copyin32() is better than
copyin_futex() I'll happily change the name.

Cheers,

Mark



pf.conf.5: mention the inversion (!) operator

2017-05-15 Thread Michal Mazurek
- mention the inversion operator for "some parameters"
- mention the inversion operator for "received-on" to match "tagged"
- don't wrap a short line
- use spaces, not tabs inside a literal block
- quote the inversion operator when describing BNF syntax (easy to miss):
- "label" string | "tag" string | [ ! ] "tagged" string |
+ "label" string | "tag" string | [ "!" ] "tagged" string |


Index: share/man/man5/pf.conf.5
===
RCS file: /cvs/src/share/man/man5/pf.conf.5,v
retrieving revision 1.558
diff -u -p -r1.558 pf.conf.5
--- share/man/man5/pf.conf.515 May 2017 11:24:37 -  1.558
+++ share/man/man5/pf.conf.515 May 2017 17:30:30 -
@@ -131,6 +131,9 @@ matching attributes.
 Certain parameters can be expressed as lists, in which case
 .Xr pfctl 8
 generates all needed rule combinations.
+It's also possible to invert some parameters by specifying the
+.Cm !\&
+operator.
 .Pp
 By default
 .Xr pf 4
@@ -638,12 +641,17 @@ For example, the following rule will dro
 .It Cm prio Ar number
 Only match packets which have the given queueing priority assigned.
 .Pp
-.It Cm received-on Ar interface
+.It Oo Cm \&! Oc Ns Cm received-on Ar interface
 Only match packets which were received on the specified
 .Cm interface
 (or interface group).
 .Cm any
 will match any existing interface except loopback ones.
+Inverse interface matching can also be done by specifying the
+.Cm !\&
+operator before the
+.Cm received-on
+keyword.
 .Pp
 .It Cm rtable Ar number
 Used to select an alternate routing table for the routing lookup.
@@ -733,8 +741,7 @@ to specify that packets must already
 be tagged with the given
 .Ar string
 in order to match the rule.
-Inverse tag matching can also be done
-by specifying the
+Inverse tag matching can also be done by specifying the
 .Cm !\&
 operator before the
 .Cm tagged
@@ -2690,22 +2697,22 @@ filteropt  = user | group | flags | 
  ( "no" | "keep" | "modulate" | "synproxy" ) "state"
  [ "(" state-opts ")" ] | "scrub" "(" scrubopts ")" |
  "fragment" | "allow-opts" | "once" |
-"divert-packet" "port" port | "divert-reply" |
-"divert-to" host "port" port |
- "label" string | "tag" string | [ ! ] "tagged" string |
+ "divert-packet" "port" port | "divert-reply" |
+ "divert-to" host "port" port |
+ "label" string | "tag" string | [ "!" ] "tagged" string |
  "set prio" ( number | "(" number [ [ "," ] number ] ")" ) |
  "set queue" ( string | "(" string [ [ "," ] string ] ")" ) |
  "rtable" number | "probability" number"%" | "prio" number |
-"af-to" af "from" ( redirhost | "{" redirhost-list "}" )
-[ "to" ( redirhost | "{" redirhost-list "}" ) ] |
-"binat-to" ( redirhost | "{" redirhost-list "}" )
-[ portspec ] [ pooltype ] |
-"rdr-to" ( redirhost | "{" redirhost-list "}" )
-[ portspec ] [ pooltype ] |
-"nat-to" ( redirhost | "{" redirhost-list "}" )
-[ portspec ] [ pooltype ] [ "static-port" ] |
-[ route ] | [ "set tos" tos ] |
-[ [ "!" ] "received-on" ( interface-name | interface-group ) ]
+ "af-to" af "from" ( redirhost | "{" redirhost-list "}" )
+ [ "to" ( redirhost | "{" redirhost-list "}" ) ] |
+ "binat-to" ( redirhost | "{" redirhost-list "}" )
+ [ portspec ] [ pooltype ] |
+ "rdr-to" ( redirhost | "{" redirhost-list "}" )
+ [ portspec ] [ pooltype ] |
+ "nat-to" ( redirhost | "{" redirhost-list "}" )
+ [ portspec ] [ pooltype ] [ "static-port" ] |
+ [ route ] | [ "set tos" tos ] |
+ [ [ "!" ] "received-on" ( interface-name | interface-group ) ]
 
 scrubopts  = scrubopt [ [ "," ] scrubopts ]
 scrubopt   = "no-df" | "min-ttl" number | "max-mss" number |

-- 
Michal Mazurek



event(3): mention bufferevent_setwatermark

2017-05-15 Thread Anton Lindqvist
Hi,
The bufferevent_setwatermark function is not mentioned in event(3).
Maybe the function deserves to be documented under the "BUFFERED EVENTS"
section but I know too little about the API to determine if that would
be useful. Some docs regarding the function can however be found in the
event.h header.

Spotted while reading the tmux source.

Index: event.3
===
RCS file: /cvs/src/lib/libevent/event.3,v
retrieving revision 1.52
diff -u -p -r1.52 event.3
--- event.3 17 Jul 2016 11:21:07 -  1.52
+++ event.3 15 May 2017 17:32:36 -
@@ -67,7 +67,8 @@
 .Nm bufferevent_read ,
 .Nm bufferevent_enable ,
 .Nm bufferevent_disable ,
-.Nm bufferevent_settimeout
+.Nm bufferevent_settimeout ,
+.Nm bufferevent_setwatermark
 .Nd execute a function when a specific event occurs
 .Sh SYNOPSIS
 .In sys/time.h
@@ -154,6 +155,8 @@
 .Fn "bufferevent_disable" "struct bufferevent *bufev" "short event"
 .Ft void
 .Fn "bufferevent_settimeout" "struct bufferevent *bufev" "int timeout_read" 
"int timeout_write"
+.Ft void
+.Fn "bufferevent_setwatermark" "struct bufferevent *bufev" "short events" 
"size_t lowmark" "size_t highmark"
 .Sh DESCRIPTION
 The
 .Nm event



Re: pf queue definition: bandwidth resolution problem

2017-05-15 Thread Mike Belopuhov
On Sun, May 14, 2017 at 21:08 +, Carl Mascott wrote:
> OK, I was indeed missing something. Thanks, Theo and Mike.
> 
> I think I see another very minor problem, though.
> Let the pf BW be 9,999,999.
> Shift right 3 digits (divide by 1000) : yields 9,999K.
> (If we were doing floating point arithmetic, would yield 9,999.999K.)
> You (Mike) don't round this, presumably to avoid overflow to 10,000 (5 
> digits).
> However, since (i < 3) it could have been rounded and scaled again, to 10M.
> Slightly more accurate but not a big deal.
>

You're right, initially I had a "<= " there but then I got
sidetracked with a hypothetical problem that right now seems
like my imagination.  I've double checked numbers and indeed,
it should be "if (rtmp <= )".

> 
> 
> On Sun, 5/14/17, Theo Buehler  wrote:
> 
>  Subject: Re: pf queue definition: bandwidth resolution problem
>  To: tech@openbsd.org
>  Date: Sunday, May 14, 2017, 4:35 PM
>  
>  On Sun, May 14, 2017 at 08:29:18PM +, Carl
>  Mascott wrote:
>  > It looks to me like you
>  are rounding on each iteration of the for-loop:
>  > 
>  > +    for (i = 0;
>  rate >  && i <= 3; i++) {
>  > +        rtmp = rate / 1000;
>  > +        if (rtmp < )
>  
>  This is only true in the last
>  iteration.
>  
>  > +            rtmp
>  += (rate % 1000) / 500;
>  > +       
>  rate = rtmp;
>  > +    }
>  > 
>  > Am I missing
>  something?
>  
>  
> 



Displaying flow queue in the systat

2017-05-15 Thread Mike Belopuhov
Here are some bits to display flow queues alongside H-FSC ones.
It's a bit hackish in a way I switch the "bandwidth" field to
the "bandwidth or flows" and then use node->qstats.data.period
because I'm too lazy to change the pfctl_queue_node to include
a union... This will require changes in the whole file instead
of just an XXX comment.  Does it bother anybody?

I also make use of a presently empty field "SCH" to display the
queue management policy (flow or fifo) which is not strictly a
scheduler, but it will become descriptive when I'll [hopefully]
hook up FQ-CoDel to HFSC so that it would be an HFSC class with
its queue managed by the FQ-CoDel.  This will distinguish such
queues from the regular HFSC ones that use a FIFO queue.

OK?

diff --git usr.bin/systat/pftop.c usr.bin/systat/pftop.c
index 673a69df6a6..d19affeae90 100644
--- usr.bin/systat/pftop.c
+++ usr.bin/systat/pftop.c
@@ -146,11 +146,11 @@ field_def fields[] = {
{"RATE", 5, 8, 1, FLD_ALIGN_RIGHT, -1, 0, 0, 0},
{"AVG", 5, 8, 1, FLD_ALIGN_RIGHT, -1, 0, 0, 0},
{"PEAK", 5, 8, 1, FLD_ALIGN_RIGHT, -1, 0, 0, 0},
{"ANCHOR", 6, 16, 1, FLD_ALIGN_LEFT, -1, 0, 0},
{"QUEUE", 15, 30, 1, FLD_ALIGN_LEFT, -1, 0, 0, 0},
-   {"BW", 4, 5, 1, FLD_ALIGN_RIGHT, -1, 0, 0, 0},
+   {"BW/FL", 4, 5, 1, FLD_ALIGN_RIGHT, -1, 0, 0, 0},
{"SCH", 3, 4, 1, FLD_ALIGN_LEFT, -1, 0, 0, 0},
{"DROP_P", 6, 8, 1, FLD_ALIGN_RIGHT, -1, 0, 0, 0},
{"DROP_B", 6, 8, 1, FLD_ALIGN_RIGHT, -1, 0, 0, 0},
{"QLEN", 4, 4, 1, FLD_ALIGN_RIGHT, -1, 0, 0, 0},
{"BORROW", 4, 6, 1, FLD_ALIGN_RIGHT, -1, 0, 0, 0},
@@ -1621,16 +1621,28 @@ print_queue_node(struct pfctl_queue_node *node)
tbprintf(" on %s ", node->qs.ifname);
print_fld_tb(FLD_QUEUE);
 
// XXX: missing min, max, burst
tb_start();
-   rate = node->qs.linkshare.m2.absolute;
-   for (i = 0; rate >= 1000 && i <= 3; i++)
-   rate /= 1000;
-   tbprintf("%u%c", rate, unit[i]);
+   if (node->qs.flags & PFQS_FLOWQUEUE)
+   /*
+* XXX We're abusing the fact that 'flows' in
+* the fqcodel_stats structure is at the same
+* spot as the 'period' in hfsc_class_stats.
+*/
+   tbprintf("%u", node->qstats.data.period);
+   else {
+   rate = node->qs.linkshare.m2.absolute;
+   for (i = 0; rate >= 1000 && i <= 3; i++)
+   rate /= 1000;
+   tbprintf("%u%c", rate, unit[i]);
+   }
print_fld_tb(FLD_BANDW);
 
+   print_fld_str(FLD_SCHED, node->qs.flags & PFQS_FLOWQUEUE ?
+   "flow" : "fifo");
+
if (node->qstats.valid && node->qstats_last.valid)
interval = calc_interval(>qstats.timestamp,
>qstats_last.timestamp);
else
interval = 0;



Re: multipath / route priority support for ospf6d

2017-05-15 Thread Florian Riehm
On 05/12/17 18:07, Florian Riehm wrote:
> Hi,
> 
> our QA reports issues with the ospf6d since the kernel uses more multipath
> routes.
> It exits after certain topology changes with:
> rde_send_change_kroute: no valid nexthop found
> 
> Since the kernel uses more multipath routes, the lack of multipath support in
> ospf6d became a problem.
> 
> The attached patch ports the multipath support from ospfd to ospf6d.
> It bases on the following ospfd commits:
> cvs diff -D "2007-09-24" -D "2007-09-26"
> cvs diff -r1.65 -r1.67 kroute.c
> 
> A similar patch was suggested by Manues Guesdon a couple of years ago.
> 
> This patch doesn't fix all problems I am seeing, but it improves the situation
> a lot. A second patch porting priority support will follow and fix further
> issues. I decided to split my fixes into two parts to make review easier.
> 
Sorry,

Mailclient messed up whitespaces.

Fixed diff below.

friehm

Index: kroute.c
===
RCS file: /openbsd/src/usr.sbin/ospf6d/kroute.c,v
retrieving revision 1.50
diff -u -p -r1.50 kroute.c
--- kroute.c27 Dec 2016 17:18:56 -  1.50
+++ kroute.c11 May 2017 20:48:49 -
@@ -59,6 +59,8 @@ void  kr_redist_remove(struct kroute_node
 intkr_redist_eval(struct kroute *, struct kroute *);
 void   kr_redistribute(struct kroute_node *);
 intkroute_compare(struct kroute_node *, struct kroute_node *);
+intkr_change_fib(struct kroute_node *, struct kroute *, int, int);
+intkr_delete_fib(struct kroute_node *);
 
 struct kroute_node *kroute_find(const struct in6_addr *, u_int8_t);
 struct kroute_node *kroute_matchgw(struct kroute_node *,
@@ -141,18 +143,105 @@ kr_init(int fs)
 }
 
 int
-kr_change(struct kroute *kroute)
+kr_change_fib(struct kroute_node *kr, struct kroute *kroute, int krcount,
+int action)
+{
+   int  i;
+   struct kroute_node  *kn, *nkn;
+
+   if (action == RTM_ADD) {
+   /*
+* First remove all stale multipath routes.
+* This step must be skipped when the action is RTM_CHANGE
+* because it is already a single path route that will be
+* changed.
+*/
+   for (kn = kr; kn != NULL; kn = nkn) {
+   for (i = 0; i < krcount; i++) {
+   if (kn->r.scope == kroute[i].scope &&
+   IN6_ARE_ADDR_EQUAL(>r.nexthop,
+   [i].nexthop))
+   break;
+   }
+   nkn = kn->next;
+   if (i == krcount) {
+   /* stale route */
+   if (kr_delete_fib(kn) == -1)
+   log_warnx("kr_delete_fib failed");
+   /*
+* if head element was removed we need to adjust
+* the head
+*/
+   if (kr == kn)
+   kr = nkn;
+   }
+   }
+   }
+
+   /*
+* now add or change the route
+*/
+   for (i = 0; i < krcount; i++) {
+   /* nexthop ::1 -> ignore silently */
+   if (IN6_IS_ADDR_LOOPBACK([i].nexthop))
+   continue;
+
+   if (action == RTM_ADD && kr) {
+   for (kn = kr; kn != NULL; kn = kn->next) {
+   if (kn->r.scope == kroute[i].scope &&
+   IN6_ARE_ADDR_EQUAL(>r.nexthop,
+   [i].nexthop))
+   break;
+   }
+
+   if (kn != NULL)
+   /* nexthop already present, skip it */
+   continue;
+   } else
+   /* modify first entry */
+   kn = kr;
+
+   /* send update */
+   if (send_rtmsg(kr_state.fd, action, [i]) == -1)
+   return (-1);
+
+   /* create new entry unless we are changing the first entry */
+   if (action == RTM_ADD)
+   if ((kn = calloc(1, sizeof(*kn))) == NULL)
+   fatal(NULL);
+
+   kn->r.prefix = kroute[i].prefix;
+   kn->r.prefixlen = kroute[i].prefixlen;
+   kn->r.nexthop = kroute[i].nexthop;
+   kn->r.scope = kroute[i].scope;
+   kn->r.flags = kroute[i].flags | F_OSPFD_INSERTED;
+   kn->r.ext_tag = kroute[i].ext_tag;
+   rtlabel_unref(kn->r.rtlabel);   /* for RTM_CHANGE */
+   kn->r.rtlabel = kroute[i].rtlabel;
+
+   if (action == RTM_ADD)
+

Re: AUDIO_DEV_SOUND

2017-05-15 Thread Jan Stary
On May 15 16:36:41, a...@caoua.org wrote:
> On Mon, May 15, 2017 at 10:14:48AM +0200, Jan Stary wrote:
> > While /dev/sound is no more, the
> > 
> >   #define AUDIO_DEV_SOUND   0   /* minor of /dev/sound0 */
> > 
> > apparently cannot be just removed from dev/audio.c
> > - with the diff below, all audio applications complain
> > that the default audio device cannot be opened. Why is that?
> > That comment seems misleading at any rate.
> 
> $ ls -al /dev/audio[0-9]*
> crw-rw-rw-  1 root  wheel   42,   0 Jan 16 05:59 /dev/audio0
> crw-rw-rw-  1 root  wheel   42,   1 May 15 08:25 /dev/audio1
> crw-rw-rw-  1 root  wheel   42,   2 Nov 12  2016 /dev/audio2
> crw-rw-rw-  1 root  wheel   42,   3 Sep  9  2016 /dev/audio3
> crw-rw-rw-  1 root  wheel   42,   4 Sep  8  2016 /dev/audio4
> 
> The devices with minor numbers 0..15 are the ones we still use.  So
> according to the sources:
> 
> #define AUDIO_DEV_AUDIO   0x80/* minor of /dev/audio0 */
> 
> is the one to be removed (then AUDIO_DEV_SOUND should be renamed to
> AUDIO_DEV_AUDIO, and comment fixed, to avoid confusion).

diff below

Jan

> > ===
> > RCS file: /cvs/src/sys/dev/audio.c,v
> > retrieving revision 1.163
> > diff -u -p -r1.163 audio.c
> > --- dev/audio.c 3 May 2017 06:56:54 -   1.163
> > +++ dev/audio.c 15 May 2017 08:08:32 -
> > @@ -50,7 +50,6 @@
> >  #define DEVNAME(sc)((sc)->dev.dv_xname)
> >  #define AUDIO_UNIT(n)  (minor(n) & 0x0f)
> >  #define AUDIO_DEV(n)   (minor(n) & 0xf0)
> > -#define AUDIO_DEV_SOUND0   /* minor of /dev/sound0 */
> >  #define AUDIO_DEV_MIXER0x10/* minor of /dev/mixer0 */
> >  #define AUDIO_DEV_AUDIO0x80/* minor of /dev/audio0 */
>   ^
>   wrong comments!

Index: dev/audio.c
===
RCS file: /cvs/src/sys/dev/audio.c,v
retrieving revision 1.163
diff -u -p -r1.163 audio.c
--- dev/audio.c 3 May 2017 06:56:54 -   1.163
+++ dev/audio.c 15 May 2017 16:19:28 -
@@ -50,9 +50,8 @@
 #define DEVNAME(sc)((sc)->dev.dv_xname)
 #define AUDIO_UNIT(n)  (minor(n) & 0x0f)
 #define AUDIO_DEV(n)   (minor(n) & 0xf0)
-#define AUDIO_DEV_SOUND0   /* minor of /dev/sound0 */
+#define AUDIO_DEV_AUDIO0   /* minor of /dev/audio0 */
 #define AUDIO_DEV_MIXER0x10/* minor of /dev/mixer0 */
-#define AUDIO_DEV_AUDIO0x80/* minor of /dev/audio0 */
 #define AUDIO_DEV_AUDIOCTL 0xc0/* minor of /dev/audioctl */
 #define AUDIO_BUFSZ65536   /* buffer size in bytes */
 
@@ -1136,7 +1135,6 @@ audio_detach(struct device *self, int fl
 * close uses device_lookup, it returns EXIO and does nothing
 */
mn = self->dv_unit;
-   vdevgone(maj, mn | AUDIO_DEV_SOUND, mn | AUDIO_DEV_SOUND, VCHR);
vdevgone(maj, mn | AUDIO_DEV_AUDIO, mn | AUDIO_DEV_AUDIO, VCHR);
vdevgone(maj, mn | AUDIO_DEV_AUDIOCTL, mn | AUDIO_DEV_AUDIOCTL, VCHR);
vdevgone(maj, mn | AUDIO_DEV_MIXER, mn | AUDIO_DEV_MIXER, VCHR);
@@ -1607,7 +1605,6 @@ audioopen(dev_t dev, int flags, int mode
error = ENXIO;
else {
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_open(sc, flags);
break;
@@ -1633,7 +1630,6 @@ audioclose(dev_t dev, int flags, int ifm
if (sc == NULL)
return ENXIO;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_close(sc);
break;
@@ -1658,7 +1654,6 @@ audioread(dev_t dev, struct uio *uio, in
if (sc == NULL)
return ENXIO;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_read(sc, uio, ioflag);
break;
@@ -1683,7 +1678,6 @@ audiowrite(dev_t dev, struct uio *uio, i
if (sc == NULL)
return ENXIO;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_write(sc, uio, ioflag);
break;
@@ -1708,7 +1702,6 @@ audioioctl(dev_t dev, u_long cmd, caddr_
if (sc == NULL)
return ENXIO;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
error = audio_ioctl(sc, cmd, addr);
break;
@@ -1743,7 +1736,6 @@ audiopoll(dev_t dev, int events, struct 
if (sc == NULL)
return POLLERR;
switch (AUDIO_DEV(dev)) {
-   case AUDIO_DEV_SOUND:
case AUDIO_DEV_AUDIO:
   

Switch relayd to libtls

2017-05-15 Thread Claudio Jeker
Promised this some time ago. This diff switches relayd from libssl to
libtls. I decided to only do the refactoring so no new features have been
added. I have tested the various modes of relayd and they still behave the
same as before. This is a huge diff and needs some major testing (at least
the regress test is passing so that is a good start).

Any feedback towards the diff welcome. I will ignore feature requests
since those are out of scope for now. Once this is in I will start adding
support for OCSP and later on SNI.
-- 
:wq Claudio

Index: Makefile
===
RCS file: /cvs/src/usr.sbin/relayd/Makefile,v
retrieving revision 1.30
diff -u -p -r1.30 Makefile
--- Makefile28 Sep 2016 15:03:03 -  1.30
+++ Makefile15 May 2017 19:51:17 -
@@ -3,13 +3,13 @@
 PROG=  relayd
 SRCS=  parse.y
 SRCS+= agentx.c ca.c carp.c check_icmp.c check_script.c \
-   check_tcp.c config.c control.c hce.c log.c name2id.c \
-   pfe.c pfe_filter.c pfe_route.c proc.c \
+   check_tcp.c check_tls.c config.c control.c hce.c log.c \
+   name2id.c pfe.c pfe_filter.c pfe_route.c proc.c \
relay.c relay_http.c relay_udp.c relayd.c \
shuffle.c snmp.c ssl.c util.c
 MAN=   relayd.8 relayd.conf.5
 
-LDADD= -levent -lssl -lcrypto -lutil
+LDADD= -levent -ltls -lssl -lcrypto -lutil
 DPADD= ${LIBEVENT} ${LIBSSL} ${LIBCRYPTO} ${LIBUTIL}
 CFLAGS+=   -Wall -I${.CURDIR} -I${.CURDIR}/../snmpd
 CFLAGS+=   -Wstrict-prototypes -Wmissing-prototypes
Index: boguskeys.h
===
RCS file: boguskeys.h
diff -N boguskeys.h
--- /dev/null   1 Jan 1970 00:00:00 -
+++ boguskeys.h 26 Mar 2017 19:25:38 -
@@ -0,0 +1,200 @@
+/* $OpenBSD$   */
+
+/*
+ * Placed in the public domain by Claudio Jeker 
+ * on March 26, 2017.
+ */
+
+/* Bogus private key since the private key is privseped away */
+const char bogus_1024[] = "-BEGIN RSA PRIVATE KEY-\n"
+"MIICXQIBAAKBgQDXEA8QOA7tgvV0UN50pAf34b0vKD95svTuFNuCn7esdTUly/hF\n"
+"wDckkEznfbGj6o1otpMVaPNwRhhwikF7x9IWPjXw7sfbgvQoa2gkMUMkUr/X49KA\n"
+"7Uu0xqOaKn/IM4yA/ZaTuL99zdn8EBCRyrDVF8iDnVTPMrsLTyg2bE1qhwIDAQAB\n"
+"AoGAHvv/T5TkAbAWcPWdtyxSwZHSUdL4oi34P7zdi0o7iiswxwtF77aruybXDZr8\n"
+"VuNaEDYNps4CFLDkoIIqwQye5bWktBLL9Bv0ZDmR8u1PkQPjwRblg7jPtk46aiWQ\n"
+"9NEVkr2V1GUrzAPDcC23R5PKx//PveTiwrfmo6j+sWkxTxECQQD+9LevrDATY1nt\n"
+"Ce9R1KnduwueeDRRByS+8or8dyGXUR1wZjm2M4pBpigTfPQiSA9O6nixV3xpwNUN\n"
+"G9XpWGO1AkEA1/GE9ZPBWHOut+WYSerq76gZeIaH3tF3FnnLBLzw8+ePf0qm0h4q\n"
+"i2dl/EQV9LH7q0Rf7k2yXgHeo5dK4OkyywJBAM49+kWSvcVBTmJw8fa5WLw0bf7A\n"
+"cFnHtJL+sy3t1O+KP41INJFOeh4HIk45e2gr8K4/AGk9QzhtNCuJg+5igS0CQQDM\n"
+"AyW2TW2w/znmC0ehLgvfd1T5BUCARizYUyB2zXpnNDHh9Mk+YbmYEovLlReZIj2+\n"
+"RM7M+SK2pdWNgHYBns+ZAkAT7fZsAeOxNjM7h2kA0AriUvc2IuDqVGiFKAFCVacF\n"
+"mSQSIplSJU117YTqbVGf++SEj/WFYOTS8G+jjBuMr1d9\n"
+"-END RSA PRIVATE KEY-\n";
+const char bogus_2048[] = "-BEGIN RSA PRIVATE KEY-\n"
+"MIIEpAIBAAKCAQEA2qsShCATc5n25suEmB+1zaxbrVbSqaEWZ+qizKTLlybJ0TOD\n"
+"Nl/6lo9hIZ+gTqf0GwJRTUwtkjlovrn5p8IWtZUceG0S+ijh7DybzGCVlOFN0JRx\n"
+"z+zTr9eNPkvrJLwYavSzV4BpjelBKManE8sHA6pqXCDi5PfJ0iKfWtHQk2S5ukWA\n"
+"WE33ANLQVW0ATPvtNpHSacNIzWEW7h/66sPJu+iNcekx/Q+1kI+0Msf7m1HFN464\n"
+"eRrm3kqncPsJ6o3Kbu2aoJFk7oO6+HfSyXmxLywuUgyPnqW6zN8pj7jaq33fmHzo\n"
+"3s95Nbk/cYOtYHPzaT5eQWXy120aZ/scuY1laQIDAQABAoIBAClEP6pPo1wdokrL\n"
+"/an30geOj36W9AqvK9tQnIiiUQmleFDSt+B7HH9tb5c42Lf/WkH+nflIdxExZGMa\n"
+"FdNi/YYnLchMTViIfppmlcBsOc5u9pB2c0QaHZkBxNYM3cOA+9qzc2UABuuRKYrY\n"
+"co95sUkv0AKy8h7j5GKTxh8NmZ82+YRkkkMkk7bvXhGppR+jiqeQ4KsZbYWFPAG9\n"
+"WJA+sFVn8WS0oMePfqmeIPY+BiddU0ITn02Hafn9jBhhXI5LKbiiwC8sFDICxPSm\n"
+"moDpmexe1s7jNuSxueEM5XPQP7v2QmnH9KDxDcPEC5Lz8qFa6wkiLBpQ9CRmPlDV\n"
+"pEfF8kECgYEA/5jiItxbt+kEDMm6GuAGy02Zq/9Eb3u1J7szjvvGrL6L0S5FcDic\n"
+"S8M5A5hTvbxQfohr6AEzqog5IQ2EiyxghIfYOs5E+rYVnN4py1ErzR3LoC45bIiO\n"
+"tRbgYGMqFzD+uGaePpCwz/Ptn9KqCoH4hhfCJPMgOSNUvh8EAJfh7HcCgYEA2wNK\n"
+"Y53qfMjsd1qGYMM3J6QtTJWrteejSspouyKAlCD1RHKKzhmOxa5GPkG4NSYi1hij\n"
+"nRywxGvFOm0eoYYMUhPdjdC4Txp646l3HNdEZMWv+NN47+vaHX+KvTyq1xis46JB\n"
+"Y5SK+57RmS7sEQqUVwuqJiuPR1YoM2daBqiUFR8CgYEAymyLE67PGLz7TyFoOaaY\n"
+"2uQPQ098JIqlstyofaHa+65Azx7FMZYz+jCXc8hs8cQ1P7DNPMXO5EzUad/py8sO\n"
+"eYeYcSIxMRmJzl2IXhRgCyeAv9A7/D++PZ7rfoqqqAlOgj4LL2OqFFeMJtpRftbm\n"
+"O1SPlnHSYE4h7BxmMA4ZiAsCgYAeG0Cxmvat+qzO52nLiWpej6oOehClq9b9o/9r\n"
+"oh2Mv08X/qroFAlVUVSkoEIjRD/LsI1lPplqFuqA0plAWP3+lm6BXSzI6vnzq8sM\n"
+"8uaa97Xt/ZwFVyWfonW+98UAVosFq7tTZgsI9dcYOKQI36xuntLf9mL2yngyQMXW\n"
+"XnwkvwKBgQCCoZxoF0o6QWbEowJf/BrozjYa2D0tVokxRr7kfVXt9TTQez5LQ1u4\n"
+"/w6oCEKldPe/6tzO12i9BITmAmoZzswO/ms7J3cRnvoLWM1tPHh3zrGZgIaMdTyv\n"
+

Re: pf queue definition: bandwidth resolution problem

2017-05-15 Thread Theo Buehler
On Mon, May 15, 2017 at 09:28:57PM +0200, Mike Belopuhov wrote:
> On Sun, May 14, 2017 at 21:08 +, Carl Mascott wrote:
> > OK, I was indeed missing something. Thanks, Theo and Mike.
> > 
> > I think I see another very minor problem, though.
> > Let the pf BW be 9,999,999.
> > Shift right 3 digits (divide by 1000) : yields 9,999K.
> > (If we were doing floating point arithmetic, would yield 9,999.999K.)
> > You (Mike) don't round this, presumably to avoid overflow to 10,000 (5 
> > digits).
> > However, since (i < 3) it could have been rounded and scaled again, to 10M.
> > Slightly more accurate but not a big deal.
> >
> 
> You're right, initially I had a "<= " there but then I got
> sidetracked with a hypothetical problem that right now seems
> like my imagination.  I've double checked numbers and indeed,
> it should be "if (rtmp <= )".

Ah, indeed.



Re: [patch] ND_COMPUTER_RTIME is not uniformly distributed

2017-05-15 Thread Matthew Martin
On Mon, May 15, 2017 at 03:49:55PM +0200, Mike Belopuhov wrote:
> On Sun, May 07, 2017 at 18:59 -0500, Matthew Martin wrote:
> > RFC 4861 specifies ReachableTime "should be a uniformly distributed
> > random value between MIN_RANDOM_FACTOR and MAX_RANDOM_FACTOR times
> > BaseReachableTime milliseconds." I think the author intended to do the
> > multiplication by (x>>10) outside the mask, but it's still missing a -1.
> > 
> > - Matthew Martin
> > 
> > 
> > 
> > diff --git nd6.h nd6.h
> > index 4274cd4dd07..8eea089a40c 100644
> > --- nd6.h
> > +++ nd6.h
> > @@ -162,8 +162,8 @@ struct  llinfo_nd6 {
> >  #define MIN_RANDOM_FACTOR  512 /* 1024 * 0.5 */
> >  #define MAX_RANDOM_FACTOR  1536/* 1024 * 1.5 */
> >  #define ND_COMPUTE_RTIME(x) \
> > -   (((MIN_RANDOM_FACTOR * (x >> 10)) + (arc4random() & \
> > -   ((MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR) * (x >> 10 /1000)
> > +   (((arc4random() & (MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR - 1)) \
> > +   + MIN_RANDOM_FACTOR) * (x >> 10) / 1000)
> >  
> >  TAILQ_HEAD(nd_drhead, nd_defrouter);
> >  struct nd_defrouter {
> > 
> 
> Hmm, how about we use an arc4random_uniform here?  It will generate
> numbers less than the upper bound specified as an argument so I don't
> need to -1 there.
> 
> I don't think they wanted to do a multiplication by (x>>10) outside
> the mask and in this case as well we want arc4random_uniform to operate
> on an extended range.
> 
> Does this look good?

Sure. In the past[1] tb@ had raised concern about using
arc4random_uniform in the kernel since it can loop indefinitely. If
that's not a concern here, the only reason to possibly not go with
arc4random_uniform is for portability with other projects that use the
KAME code which appears to have arc4random but not arc4random_uniform.

1: http://marc.info/?l=openbsd-tech=144947731311613=2

> diff --git sys/netinet6/nd6.h sys/netinet6/nd6.h
> index 4274cd4dd07..01c0fe2c7af 100644
> --- sys/netinet6/nd6.h
> +++ sys/netinet6/nd6.h
> @@ -160,12 +160,12 @@ struct  llinfo_nd6 {
>  #define REACHABLE_TIME   3   /* msec */
>  #define RETRANS_TIMER1000/* msec */
>  #define MIN_RANDOM_FACTOR512 /* 1024 * 0.5 */
>  #define MAX_RANDOM_FACTOR1536/* 1024 * 1.5 */
>  #define ND_COMPUTE_RTIME(x) \
> - (((MIN_RANDOM_FACTOR * (x >> 10)) + (arc4random() & \
> - ((MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR) * (x >> 10 /1000)
> + ((arc4random_uniform((MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR) * \
> + (x >> 10)) + MIN_RANDOM_FACTOR * (x >> 10)) / 1000)
>  
>  TAILQ_HEAD(nd_drhead, nd_defrouter);
>  struct   nd_defrouter {
>   TAILQ_ENTRY(nd_defrouter) dr_entry;
>   struct  in6_addr rtaddr;



Re: pflog(4) vs splnet()

2017-05-15 Thread Alexander Bluhm
On Mon, May 15, 2017 at 03:23:02PM +0200, Martin Pieuchot wrote:
> Kill unused global list of softc and protect the global array of
> interface by the NET_LOCK().
> 
> ok?

OK bluhm@

> 
> Index: net/if_pflog.c
> ===
> RCS file: /cvs/src/sys/net/if_pflog.c,v
> retrieving revision 1.78
> diff -u -p -r1.78 if_pflog.c
> --- net/if_pflog.c24 Jan 2017 10:08:30 -  1.78
> +++ net/if_pflog.c15 May 2017 13:19:39 -
> @@ -83,7 +83,6 @@ int pflog_clone_create(struct if_clone *
>  int  pflog_clone_destroy(struct ifnet *);
>  void pflog_bpfcopy(const void *, void *, size_t);
>  
> -LIST_HEAD(, pflog_softc) pflogif_list;
>  struct if_clone  pflog_cloner =
>  IF_CLONE_INITIALIZER("pflog", pflog_clone_create, pflog_clone_destroy);
>  
> @@ -94,7 +93,6 @@ struct mbuf  *pflog_mhdr = NULL, *pflog_
>  void
>  pflogattach(int npflog)
>  {
> - LIST_INIT(_list);
>   if (pflog_mhdr == NULL)
>   if ((pflog_mhdr = m_get(M_DONTWAIT, MT_HEADER)) == NULL)
>   panic("pflogattach: no mbuf");
> @@ -110,6 +108,8 @@ pflogifs_resize(size_t n)
>   struct ifnet**p;
>   int   i;
>  
> + NET_ASSERT_LOCKED();
> +
>   if (n > SIZE_MAX / sizeof(*p))
>   return (EINVAL);
>   if (n == 0)
> @@ -161,14 +161,13 @@ pflog_clone_create(struct if_clone *ifc,
>   bpfattach(>sc_if.if_bpf, ifp, DLT_PFLOG, PFLOG_HDRLEN);
>  #endif
>  
> - s = splnet();
> - LIST_INSERT_HEAD(_list, pflogif, sc_list);
> + NET_LOCK(s);
>   if (unit + 1 > npflogifs && pflogifs_resize(unit + 1) != 0) {
> - splx(s);
> + NET_UNLOCK(s);
>   return (ENOMEM);
>   }
>   pflogifs[unit] = ifp;
> - splx(s);
> + NET_UNLOCK(s);
>  
>   return (0);
>  }
> @@ -179,15 +178,13 @@ pflog_clone_destroy(struct ifnet *ifp)
>   struct pflog_softc  *pflogif = ifp->if_softc;
>   int  s, i;
>  
> - s = splnet();
> + NET_LOCK(s);
>   pflogifs[pflogif->sc_unit] = NULL;
> - LIST_REMOVE(pflogif, sc_list);
> -
>   for (i = npflogifs; i > 0 && pflogifs[i - 1] == NULL; i--)
>   ; /* nothing */
>   if (i < npflogifs)
>   pflogifs_resize(i); /* error harmless here */
> - splx(s);
> + NET_UNLOCK(s);
>  
>   if_detach(ifp);
>   free(pflogif, M_DEVBUF, 0);
> Index: net/if_pflog.h
> ===
> RCS file: /cvs/src/sys/net/if_pflog.h,v
> retrieving revision 1.26
> diff -u -p -r1.26 if_pflog.h
> --- net/if_pflog.h12 Feb 2015 01:24:10 -  1.26
> +++ net/if_pflog.h15 May 2017 13:21:13 -
> @@ -64,7 +64,6 @@ struct pfloghdr {
>  struct pflog_softc {
>   struct ifnetsc_if;  /* the interface */
>   int sc_unit;
> - LIST_ENTRY(pflog_softc) sc_list;
>  };
>  
>  #if NPFLOG > 0



Re: pf: percpu anchor stacks

2017-05-15 Thread Mike Belopuhov
On Mon, May 15, 2017 at 15:19 +0200, Alexandr Nedvedicky wrote:
> Hello,
> 
> > Now *is* the time to commit the first step, the refactoring.  Once
> > that's done we can discuss the introduction of the context.
> > 
> > Could you come up with such diff?
> 
> first of all: I have not managed to finish the re-factoring step yet, work
> is still in progress. I stole some cycles from other projects, but it was
> not enough apparently. Must try harder next time...
> 
> 
> > > > Does this pass pfctl regress tests?
> > > 
> > > I'm about to run those tests for OpenBSD.
> > 
> > Did you manage to do that?
> 
> I have some update on testing of final patch. I've used pf_forward tests 
> to
> make sure the code I'm changing gets executed. I'm still working on
> testcase, which covers deeper anchor tree with once-rules.
> 
> the pf_forward tests show no harm caused by my changes, though I saw some
> failures:
> 
>   Makefile:217 'run-regress-udp-inet-RTT_IN'
>   Makefile:217 'run-regress-udp-inet6-ECO_IN'
>   Makefile:217 'run-regress-udp-inet6-ECO_OUT'
>   Makefile:217 'run-regress-udp-inet6-RDR_IN'
>   Makefile:217 'run-regress-udp-inet6-RDR_OUT'
>   Makefile:217 'run-regress-udp-inet6-RTT_IN'
>   Makefile:215 'run-regress-udp-inet6-RPT_OUT'
>   Makefile:257 'run-regress-traceroute-udp-inet6-AF_IN'
> 
> I could see same failures in baseline (tree _without_ my changes). I took 
> a
> closer look to find out what's going on there. I took a tcpdump at ECO:
>   #
>   # tcpdump -i vnet1 running on ECO (192.168.214.188, 192.168.3.20)
>   #
>   13:27:31.712955 192.168.1.10.42707 > 192.168.214.188.echo: udp 3
>   13:27:31.713616 192.168.3.20.echo > 192.168.1.10.42707: udp 3
>   13:27:31.714693 192.168.1.10 > 192.168.3.20: icmp: 192.168.1.10
>   udp port 42707 unreachable
>   #
>   # output above shows we get answer from .3.20 instead of .214.188
>   # looks as a kind of yet another bug.
>   #
> 
> There are multiple IP addresses bound to ECO IN/OUT interface. However
> UDP socket at ECO always answers using primary IP address bound to ECO
> interface. The answer triggers ICMP port unreachable at SRC (192.168.1.10)
> 
> > > >  - s/test_status/action/ as it's done everywhere else?
> > > 
> > > I've opted to test_status, because it's something different to 
> > > 'action'
> > > as we use it in current code.
> > 
> > I agree with you for test_status.  What about naming the enum and use it
> > instead of 'int' for the variable?  This implicitly documents the possible
> > option and allow the compiler to check for missing cases in switch.
> 
> I'm attaching updated final patch, which accepts your suggestion.
> 
> thanks and
> regards
> sasha
> 

I think you can go ahead with your change.  OK mikeb