Re: if attach/detach netlocks

2016-12-28 Thread Alexander Bluhm
On Fri, Dec 23, 2016 at 12:09:32AM +0100, Martin Pieuchot wrote:
> On 22/12/16(Thu) 20:45, Mike Belopuhov wrote:
> > I think this is what is required here.  Works here, but YMMV.
> 
> splnet() in a pseudo-driver seems completely wrong, you could get rid of
> it.

Yes, but that is another issue.  Can we get the netlock splasserts
fixed first?  This diff looks good to me.

bluhm

> > diff --git sys/net/if_vxlan.c sys/net/if_vxlan.c
> > index e9bc1cb8305..dfb71cf9467 100644
> > --- sys/net/if_vxlan.c
> > +++ sys/net/if_vxlan.c
> > @@ -178,13 +178,15 @@ int
> >  vxlan_clone_destroy(struct ifnet *ifp)
> >  {
> > struct vxlan_softc  *sc = ifp->if_softc;
> > int  s;
> >  
> > +   NET_LOCK(s);
> > s = splnet();
> > vxlan_multicast_cleanup(ifp);
> > splx(s);
> > +   NET_UNLOCK(s);
> >  
> > vxlan_enable--;
> > LIST_REMOVE(sc, sc_entry);
> >  
> > ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY);



Re: Build kernels with -ffreestanding?

2016-12-28 Thread Mark Kettenis
> Date: Wed, 28 Dec 2016 22:59:18 +0100 (CET)
> From: Mark Kettenis 
> 
> > Date: Wed, 28 Dec 2016 08:29:05 +0100
> > From: Martin Pieuchot 
> > 
> > On 28/12/16(Wed) 01:05, Jeremie Courreges-Anglas wrote:
> > > Mark Kettenis  writes:
> > > 
> > > >> Date: Sat, 24 Dec 2016 00:08:35 +0100 (CET)
> > > >> From: Mark Kettenis 
> > > >> 
> > > >> We already do this on some architectures, but not on amd64 for
> > > >> example.  The main reason is that this disables memcpy() optimizations
> > > >> that have a measurable impact on the network stack performance.
> > > >> 
> > > >> We can get those optimizations back by doing:
> > > >> 
> > > >> #define memcpy(d, s, n) __builtin_memcpy((d), (s), (n))
> > > >> 
> > > >> I verified that gcc still does proper bounds checking on
> > > >> __builtin_memcpy(), so we don't lose that.
> > > >> 
> > > >> The nice thing about this solution is that we can choose explicitly
> > > >> which optimizations we want.  And as you can see the kernel makefile
> > > >> gets simpler ;).
> > > >> 
> > > >> Of course the real reason why I'm looking into this is that clang
> > > >> makes it really hard to build kernels without -ffreestanding.
> > > >> 
> > > >> The diff below implements this strategy, and enabled the optimizations
> > > >> for memcpy() and memset().  We can add others if we think there is a
> > > >> benefit.  I've tested the diff on amd64.  We may need to put an #undef
> > > >> memcpy somewhere for platforms that use the generic C code for memcpy.
> > > >> 
> > > >> Thoughts?
> > > >
> > > > So those #undefs are necessary.  New diff below.  Tested on armv7,
> > > > hppa and sparc64 now as well.
> > > 
> > > I think this is the way to go; can't help tests on other archs, though.
> > > ok jca@ fwiw
> > 
> > For the archives, Hrvoje Popovski measured a performance impact when using
> > a kernel with this diff to forward packets.  I guess we're missing some
> > defines.
> 
> The most likely candidate is memmove.  Here is a diff that adds it.

Scrap that; memcmp needs this too.  New diff below.

Index: arch/amd64/conf/Makefile.amd64
===
RCS file: /cvs/src/sys/arch/amd64/conf/Makefile.amd64,v
retrieving revision 1.74
diff -u -p -r1.74 Makefile.amd64
--- arch/amd64/conf/Makefile.amd64  29 Nov 2016 09:08:34 -  1.74
+++ arch/amd64/conf/Makefile.amd64  28 Dec 2016 22:19:20 -
@@ -29,9 +29,7 @@ CWARNFLAGS=   -Werror -Wall -Wimplicit-fun
 
 CMACHFLAGS=-mcmodel=kernel -mno-red-zone -mno-sse2 -mno-sse -mno-3dnow \
-mno-mmx -msoft-float -fno-omit-frame-pointer
-CMACHFLAGS+=   -fno-builtin-printf -fno-builtin-snprintf \
-   -fno-builtin-vsnprintf -fno-builtin-log \
-   -fno-builtin-log2 -fno-builtin-malloc ${NOPIE_FLAGS}
+CMACHFLAGS+=   -ffreestanding ${NOPIE_FLAGS}
 .if ${IDENT:M-DNO_PROPOLICE}
 CMACHFLAGS+=   -fno-stack-protector
 .endif
Index: lib/libkern/memcmp.c
===
RCS file: /cvs/src/sys/lib/libkern/memcmp.c,v
retrieving revision 1.6
diff -u -p -r1.6 memcmp.c
--- lib/libkern/memcmp.c10 Jun 2014 04:16:57 -  1.6
+++ lib/libkern/memcmp.c28 Dec 2016 22:19:21 -
@@ -34,6 +34,8 @@
 
 #include 
 
+#undef memcmp
+
 /*
  * Compare memory regions.
  */
Index: lib/libkern/memcpy.c
===
RCS file: /cvs/src/sys/lib/libkern/memcpy.c,v
retrieving revision 1.3
diff -u -p -r1.3 memcpy.c
--- lib/libkern/memcpy.c12 Jun 2013 16:44:22 -  1.3
+++ lib/libkern/memcpy.c28 Dec 2016 22:19:21 -
@@ -32,6 +32,8 @@
 #include 
 #include 
 
+#undef memcpy
+
 /*
  * This is designed to be small, not fast.
  */
Index: lib/libkern/memmove.c
===
RCS file: /cvs/src/sys/lib/libkern/memmove.c,v
retrieving revision 1.1
diff -u -p -r1.1 memmove.c
--- lib/libkern/memmove.c   11 Jun 2013 18:04:41 -  1.1
+++ lib/libkern/memmove.c   28 Dec 2016 22:19:21 -
@@ -32,6 +32,8 @@
 #include 
 #include 
 
+#undef memmove
+
 /*
  * This is designed to be small, not fast.
  */
Index: lib/libkern/memset.c
===
RCS file: /cvs/src/sys/lib/libkern/memset.c,v
retrieving revision 1.7
diff -u -p -r1.7 memset.c
--- lib/libkern/memset.c10 Jun 2014 04:16:57 -  1.7
+++ lib/libkern/memset.c28 Dec 2016 22:19:21 -
@@ -39,6 +39,8 @@
 #include 
 #include 
 
+#undef memset
+
 #define	wsize	sizeof(u_int)
 #define	wmask	(wsize - 1)
 
Index: sys/systm.h
===
RCS file: /cvs/src/sys/sys/systm.h,v
retrieving revision 1.120
diff -u -p -r1.120 systm.h
--- sys/systm.h 19 Dec 2016 08:36:50 -  1.120
+++ sys/systm.h 

Re: Build kernels with -ffreestanding?

2016-12-28 Thread Mark Kettenis
> Date: Wed, 28 Dec 2016 08:29:05 +0100
> From: Martin Pieuchot 
> 
> On 28/12/16(Wed) 01:05, Jeremie Courreges-Anglas wrote:
> > Mark Kettenis  writes:
> > 
> > >> Date: Sat, 24 Dec 2016 00:08:35 +0100 (CET)
> > >> From: Mark Kettenis 
> > >> 
> > >> We already do this on some architectures, but not on amd64 for
> > >> example.  The main reason is that this disables memcpy() optimizations
> > >> that have a measurable impact on the network stack performance.
> > >> 
> > >> We can get those optimizations back by doing:
> > >> 
> > >> #define memcpy(d, s, n) __builtin_memcpy((d), (s), (n))
> > >> 
> > >> I verified that gcc still does proper bounds checking on
> > >> __builtin_memcpy(), so we don't lose that.
> > >> 
> > >> The nice thing about this solution is that we can choose explicitly
> > >> which optimizations we want.  And as you can see the kernel makefile
> > >> gets simpler ;).
> > >> 
> > >> Of course the real reason why I'm looking into this is that clang
> > >> makes it really hard to build kernels without -ffreestanding.
> > >> 
> > >> The diff below implements this strategy, and enabled the optimizations
> > >> for memcpy() and memset().  We can add others if we think there is a
> > >> benefit.  I've tested the diff on amd64.  We may need to put an #undef
> > >> memcpy somewhere for platforms that use the generic C code for memcpy.
> > >> 
> > >> Thoughts?
> > >
> > > So those #undefs are necessary.  New diff below.  Tested on armv7,
> > > hppa and sparc64 now as well.
> > 
> > I think this is the way to go; can't help tests on other archs, though.
> > ok jca@ fwiw
> 
> For the archives, Hrvoje Popovski measured a performance impact when using
> a kernel with this diff to forward packets.  I guess we're missing some
> defines.

The most likely candidate is memmove.  Here is a diff that adds it.

Index: arch/amd64/conf/Makefile.amd64
===
RCS file: /cvs/src/sys/arch/amd64/conf/Makefile.amd64,v
retrieving revision 1.74
diff -u -p -r1.74 Makefile.amd64
--- arch/amd64/conf/Makefile.amd64  29 Nov 2016 09:08:34 -  1.74
+++ arch/amd64/conf/Makefile.amd64  28 Dec 2016 21:48:52 -
@@ -29,9 +29,7 @@ CWARNFLAGS=   -Werror -Wall -Wimplicit-fun
 
 CMACHFLAGS=-mcmodel=kernel -mno-red-zone -mno-sse2 -mno-sse -mno-3dnow \
-mno-mmx -msoft-float -fno-omit-frame-pointer
-CMACHFLAGS+=   -fno-builtin-printf -fno-builtin-snprintf \
-   -fno-builtin-vsnprintf -fno-builtin-log \
-   -fno-builtin-log2 -fno-builtin-malloc ${NOPIE_FLAGS}
+CMACHFLAGS+=   -ffreestanding ${NOPIE_FLAGS}
 .if ${IDENT:M-DNO_PROPOLICE}
 CMACHFLAGS+=   -fno-stack-protector
 .endif
Index: lib/libkern/memcpy.c
===
RCS file: /cvs/src/sys/lib/libkern/memcpy.c,v
retrieving revision 1.3
diff -u -p -r1.3 memcpy.c
--- lib/libkern/memcpy.c12 Jun 2013 16:44:22 -  1.3
+++ lib/libkern/memcpy.c28 Dec 2016 21:48:53 -
@@ -32,6 +32,8 @@
 #include 
 #include 
 
+#undef memcpy
+
 /*
  * This is designed to be small, not fast.
  */
Index: lib/libkern/memmove.c
===
RCS file: /cvs/src/sys/lib/libkern/memmove.c,v
retrieving revision 1.1
diff -u -p -r1.1 memmove.c
--- lib/libkern/memmove.c   11 Jun 2013 18:04:41 -  1.1
+++ lib/libkern/memmove.c   28 Dec 2016 21:48:53 -
@@ -32,6 +32,8 @@
 #include 
 #include 
 
+#undef memmove
+
 /*
  * This is designed to be small, not fast.
  */
Index: lib/libkern/memset.c
===
RCS file: /cvs/src/sys/lib/libkern/memset.c,v
retrieving revision 1.7
diff -u -p -r1.7 memset.c
--- lib/libkern/memset.c10 Jun 2014 04:16:57 -  1.7
+++ lib/libkern/memset.c28 Dec 2016 21:48:53 -
@@ -39,6 +39,8 @@
 #include 
 #include 
 
+#undef memset
+
 #define	wsize	sizeof(u_int)
 #define	wmask	(wsize - 1)
 
Index: sys/systm.h
===
RCS file: /cvs/src/sys/sys/systm.h,v
retrieving revision 1.120
diff -u -p -r1.120 systm.h
--- sys/systm.h 19 Dec 2016 08:36:50 -  1.120
+++ sys/systm.h 28 Dec 2016 21:48:53 -
@@ -330,6 +330,10 @@ extern int (*mountroot)(void);
 
 #include 
 
+#define memcpy(d, s, n)	__builtin_memcpy((d), (s), (n))
+#define memmove(d, s, n)	__builtin_memmove((d), (s), (n))
+#define memset(b, c, n)	__builtin_memset((b), (c), (n))
+
 #if defined(DDB) || defined(KGDB)
 /* debugger entry points */
 void   Debugger(void); /* in DDB only */
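
For readers following the thread, the interplay between the #define in
systm.h and the #undef in the libkern files can be sketched in userland.
This is only an illustration under assumptions: kcopy, struct hdr and
hdr_copy_dst are hypothetical names that do not appear in the diff, and
the sketch relies on gcc or clang providing __builtin_memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Prototype first, while kcopy is still an ordinary identifier. */
void	*kcopy(void *, const void *, size_t);

/*
 * Same shape as the systm.h hunk: route calls to the compiler builtin
 * so that small constant-size copies can still be expanded inline when
 * -ffreestanding has disabled implicit builtin recognition.
 */
#define kcopy(d, s, n)	__builtin_memcpy((d), (s), (n))

struct hdr {			/* hypothetical 8-byte header */
	uint32_t	src;
	uint32_t	dst;
};

/* Goes through the macro; the size is a compile-time constant. */
uint32_t
hdr_copy_dst(const struct hdr *in)
{
	struct hdr	out;

	assert(in != NULL);
	kcopy(&out, in, sizeof(out));
	return (out.dst);
}

/*
 * The generic C implementation has to drop the macro first.  Without
 * the #undef, the definition below would be rewritten by the
 * preprocessor into __builtin_memcpy((void *dst), ...) and would not
 * compile.
 */
#undef kcopy

void *
kcopy(void *dst, const void *src, size_t n)
{
	unsigned char		*d = dst;
	const unsigned char	*s = src;

	while (n-- > 0)
		*d++ = *s++;
	return (dst);
}
```

Callers that see the macro get the builtin; only the file that provides
the fallback needs the #undef, which is why the diff touches each
libkern source file separately.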



Re: BFD: route get and route monitor

2016-12-28 Thread Peter Hessler
On 2016 Dec 23 (Fri) at 16:57:27 +0100 (+0100), Hrvoje Popovski wrote:
:On 21.12.2016. 23:15, Sebastian Benoit wrote:
:>> Hi,
:>>
:>> it seems that bfd is working with Force10 S4810 and Extreme Networks
:>> x460 switches. I can test it with cisco c6k5 if you want?
:> 
:> Hei,
:> 
:> i'm sure phessler (who might not read this for a couple of days) is happy
:> about any test you can do.
:> 
:> And thanks for doing these tests!
:> 
:> /Benno
:
:Hi,
:
:no bfd for me on Cisco c6k5. Will upgrade and report back.
:
:Tnx for bfd, really great feature ...
:
:

Many thanks for the testing.  Can you get some packet captures of the
failing bfd with that Cisco and send them to me offline?  I'd really
like to see what they are doing.

Thanks!

-- 
What the world *really* needs is a good Automatic Bicycle Sharpener.



pf refrag route

2016-12-28 Thread Alexander Bluhm
Hi,

In pf_refragment6() use the valid route from pf_route6() instead
of calling rtalloc() again.

ok?

bluhm

Index: net/pf.c
===
RCS file: /cvs/src/sys/net/pf.c,v
retrieving revision 1.1008
diff -u -p -r1.1008 pf.c
--- net/pf.c28 Dec 2016 15:36:15 -  1.1008
+++ net/pf.c28 Dec 2016 15:51:35 -
@@ -6003,7 +6003,7 @@ pf_route6(struct pf_pdesc *pd, struct pf
 * use pf_refragment6() here to turn it back to fragments.
 */
if ((mtag = m_tag_find(m0, PACKET_TAG_PF_REASSEMBLED, NULL))) {
-   (void) pf_refragment6(&m0, mtag, dst, ifp);
+   (void) pf_refragment6(&m0, mtag, dst, ifp, rt);
} else if ((u_long)m0->m_pkthdr.len <= ifp->if_mtu) {
ifp->if_output(ifp, m0, sin6tosa(dst), rt);
} else {
@@ -6925,7 +6925,7 @@ done:
struct m_tag*mtag;
 
if ((mtag = m_tag_find(pd.m, PACKET_TAG_PF_REASSEMBLED, NULL)))
-   action = pf_refragment6(&pd.m, mtag, NULL, NULL);
+   action = pf_refragment6(&pd.m, mtag, NULL, NULL, NULL);
}
 #endif /* INET6 */
if (s && action != PF_DROP) {
Index: net/pf_norm.c
===
RCS file: /cvs/src/sys/net/pf_norm.c,v
retrieving revision 1.197
diff -u -p -r1.197 pf_norm.c
--- net/pf_norm.c   22 Nov 2016 19:29:54 -  1.197
+++ net/pf_norm.c   28 Dec 2016 15:51:35 -
@@ -687,11 +687,10 @@ fail:
 
 int
 pf_refragment6(struct mbuf **m0, struct m_tag *mtag, struct sockaddr_in6 *dst,
-struct ifnet *ifp)
+struct ifnet *ifp, struct rtentry *rt)
 {
struct mbuf *m = *m0, *t;
struct pf_fragment_tag  *ftag = (struct pf_fragment_tag *)(mtag + 1);
-   struct rtentry  *rt = NULL;
u_int32_tmtu;
u_int16_thdrlen, extoff, maxlen;
u_int8_t proto;
@@ -748,15 +747,6 @@ pf_refragment6(struct mbuf **m0, struct 
action = PF_DROP;
}
 
-   if (ifp != NULL) {
-   rt = rtalloc(sin6tosa(dst), RT_RESOLVE,
-   m->m_pkthdr.ph_rtableid);
-   if (rt == NULL) {
-   ip6stat.ip6s_noroute++;
-   error = -1;
-   }
-   }
-
for (t = m; m; m = t) {
t = m->m_nextpkt;
m->m_nextpkt = NULL;
@@ -774,7 +764,6 @@ pf_refragment6(struct mbuf **m0, struct 
m_freem(m);
}
}
-   rtfree(rt);
 
return (action);
 }
Index: net/pfvar.h
===
RCS file: /cvs/src/sys/net/pfvar.h,v
retrieving revision 1.445
diff -u -p -r1.445 pfvar.h
--- net/pfvar.h 22 Nov 2016 19:29:54 -  1.445
+++ net/pfvar.h 28 Dec 2016 15:51:36 -
@@ -1681,7 +1681,7 @@ int   pf_match_uid(u_int8_t, uid_t, uid_t,
 intpf_match_gid(u_int8_t, gid_t, gid_t, gid_t);
 
 intpf_refragment6(struct mbuf **, struct m_tag *mtag,
-   struct sockaddr_in6 *, struct ifnet *);
+   struct sockaddr_in6 *, struct ifnet *, struct rtentry *);
 void   pf_normalize_init(void);
 intpf_normalize_ip(struct pf_pdesc *, u_short *);
 intpf_normalize_ip6(struct pf_pdesc *, u_short *);
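
The ownership idea behind the diff, where the caller resolves the route
once and the callee only borrows it, can be sketched outside the kernel.
Everything here is an assumption made for illustration: struct route,
route_lookup(), route_release(), send_fragments() and forward() are
hypothetical stand-ins, not the pf or rtalloc(9) interfaces.

```c
#include <assert.h>
#include <stddef.h>

struct route {			/* hypothetical stand-in for a route entry */
	int	refcnt;
};

struct route	table_entry;	/* the single route in this toy table */
int		route_lookups;	/* counts how often the table is consulted */

/* Take a reference on the route; stands in for a routing table lookup. */
struct route *
route_lookup(void)
{
	route_lookups++;
	table_entry.refcnt++;
	return (&table_entry);
}

/* Drop a reference; tolerates NULL like many release functions do. */
void
route_release(struct route *rt)
{
	if (rt != NULL)
		rt->refcnt--;
}

/*
 * Callee: borrows the caller's route.  It neither looks the route up
 * again nor releases it, so there is no second lookup and no risk of a
 * double release.
 */
int
send_fragments(int nfrags, struct route *rt)
{
	int	i, sent = 0;

	if (rt == NULL)
		return (0);
	for (i = 0; i < nfrags; i++)
		sent++;		/* placeholder for handing a fragment out */
	return (sent);
}

/* Caller: owns the reference for the whole operation. */
int
forward(int nfrags)
{
	struct route	*rt = route_lookup();
	int		 sent;

	sent = send_fragments(nfrags, rt);
	route_release(rt);
	return (sent);
}
```

The point of the sketch is only that the lookup count stays at one per
forwarded packet and that the reference is released by the function
that acquired it.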



Re: pf state key link

2016-12-28 Thread Alexandr Nedvedicky
Hello,

On Fri, Dec 23, 2016 at 04:21:09PM +0100, Alexander Bluhm wrote:
> Hi,
> 
> Christiano Haesbaert has sent me this diff.
> 
> They are setting pkt_sk to NULL if pkt_sk->reverse is not
> pf_state_key_isvalid(), but the chunk that creates the pkt_sk->reverse
> link actually depends on pkt_sk != NULL.
> 
> I think it is correct.
> 
> ok?

looks correct to me too

OK sashan@

regards
sasha



passwd(1): clear memory used for password input

2016-12-28 Thread Ingve Skåra
passwd(1) does not clear the memory used for the second password
input. Use explicit_bzero(3) to zero the memory when we're done with
it. Utilities like bioctl(8) and signify(1) already do this.

Index: local_passwd.c
===
RCS file: /cvs/src/usr.bin/passwd/local_passwd.c,v
retrieving revision 1.52
diff -u -p -u -r1.52 local_passwd.c
--- local_passwd.c  2 Sep 2016 18:06:43 -   1.52
+++ local_passwd.c  28 Dec 2016 08:13:07 -
@@ -203,9 +203,12 @@ getnewpasswd(struct passwd *pw, login_ca
continue;
p = readpassphrase("Retype new password:", repeat, 
sizeof(repeat),
RPP_ECHO_OFF);
-   if (p != NULL && strcmp(newpass, p) == 0)
+   if (p != NULL && strcmp(newpass, p) == 0) {
+   explicit_bzero(repeat, sizeof(repeat));
break;
+   }
(void)printf("Mismatch; try again, EOF to quit.\n");
+   explicit_bzero(repeat, sizeof(repeat));
explicit_bzero(newpass, sizeof(newpass));
}
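
As an illustration of the pattern in the diff, where the retyped buffer
is cleared on both the match and the mismatch path, here is a small
userland sketch.  The helper names wipe() and confirm_and_wipe() are
hypothetical; wipe() is a portable stand-in for explicit_bzero(3), which
is not available in every libc, and the passwords are passed in by the
caller instead of being read with readpassphrase(3).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for explicit_bzero(3) so the sketch builds where libc lacks
 * it.  Writing through a volatile pointer keeps the compiler from
 * discarding the stores as dead.
 */
static void
wipe(void *buf, size_t len)
{
	volatile unsigned char	*p = buf;

	while (len-- > 0)
		*p++ = 0;
}

/*
 * Hypothetical helper: compare a password with its retyped copy and
 * wipe the retyped buffer whether or not they match.  Returns 1 on a
 * match and 0 otherwise.
 */
int
confirm_and_wipe(const char *newpass, char *repeat, size_t repeatsz)
{
	int	match;

	assert(newpass != NULL && repeat != NULL);
	match = (strcmp(newpass, repeat) == 0);
	wipe(repeat, repeatsz);
	return (match);
}
```

Doing the wipe once after the comparison, instead of once per branch,
is a design choice of this sketch; the diff keeps the existing loop
structure and adds a call on each path.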
 



Re: Build kernels with -ffreestanding?

2016-12-28 Thread Reyk Floeter

>> On 28.12.2016 at 08:29, Martin Pieuchot wrote:
>> 
>> On 28/12/16(Wed) 01:05, Jeremie Courreges-Anglas wrote:
>> Mark Kettenis  writes:
>> 
>>>> Date: Sat, 24 Dec 2016 00:08:35 +0100 (CET)
>>>> From: Mark Kettenis 
>>>> 
>>>> We already do this on some architectures, but not on amd64 for
>>>> example.  The main reason is that this disables memcpy() optimizations
>>>> that have a measurable impact on the network stack performance.
>>>> 
>>>> We can get those optimizations back by doing:
>>>> 
>>>> #define memcpy(d, s, n) __builtin_memcpy((d), (s), (n))
>>>> 
>>>> I verified that gcc still does proper bounds checking on
>>>> __builtin_memcpy(), so we don't lose that.
>>>> 
>>>> The nice thing about this solution is that we can choose explicitly
>>>> which optimizations we want.  And as you can see the kernel makefile
>>>> gets simpler ;).
>>>> 
>>>> Of course the real reason why I'm looking into this is that clang
>>>> makes it really hard to build kernels without -ffreestanding.
>>>> 
>>>> The diff below implements this strategy, and enabled the optimizations
>>>> for memcpy() and memset().  We can add others if we think there is a
>>>> benefit.  I've tested the diff on amd64.  We may need to put an #undef
>>>> memcpy somewhere for platforms that use the generic C code for memcpy.
>>>> 
>>>> Thoughts?
>>> 
>>> So those #undefs are necessary.  New diff below.  Tested on armv7,
>>> hppa and sparc64 now as well.
>> 
>> I think this is the way to go; can't help tests on other archs, though.
>> ok jca@ fwiw
> 
> For the archives, Hrvoje Popovski measured a performance impact when using
> a kernel with this diff to forward packets.  I guess we're missing some
> defines.

I'm late to the game - but does this diff remove all the other optimizations
as well (e.g. bcopy, memcmp, memmove, strchr, ...)?  I did some performance
testing when I added them for amd64 in libc and it made a noticeable
difference - not just for memcpy+memset.

Reyk