Re: IPv6 DoS sysctl man page additions

2014-04-22 Thread Mike Belopuhov
On 19 April 2014 13:20, Loganaden Velvindron lo...@elandsys.com wrote:
 On Sat, Apr 19, 2014 at 04:04:30AM -0700, Loganaden Velvindron wrote:
 Hi All,

 I'm taking a short break from playing with pf statistics.

 There were 4 sysctls added from KAME, but the man pages weren't updated
 accordingly.

 (Adapted from the NetBSD man page changes)

 Feedback welcomed.



 Removed trailing spaces and use set to instead of set it to based
 on feedback from sthen@



OK mikeb



Re: iked + isakmpd on the same machine

2014-04-22 Thread Mike Belopuhov
On 22 April 2014 17:13, Philipp
e1c1bac6253dc54a1e89ddc046585...@posteo.net wrote:
 It happened! A remote peer *requires* IKEv2 - and I've to do that on a
 machine running isakmpd with somewhat 25+ IKEv1 peers.

 First hurdle: I cannot bind iked to a certain (carp) IP-address. Mad
 workaround: start isakmpd (with Listen-on) first.
 Second hurdle: iked loads its SAs and eventually does this by creating a
 new empty SADB, effectivly killing all
 the SAs isakmpd loaded into the kernel before?

 Is there a diff sleeping out there for tackling the first hurdle?

 For the second one, I've to refrain from testing this in live further more.
 First to reconstruct my Frankenstein-Lab.

 Cheers for any thoughts beside mad, bro? :-)


more like it's not supported and is not supposed to work.
it's like running nginx and apache at the same time but
worse since there are kernel tentacles involved as well
(as you might have figured out already) that will likely
prevent you from doing that on the same box but different
ip addresses.

cheers,
mike



Re: iked + isakmpd on the same machine

2014-04-24 Thread Mike Belopuhov
On 24 April 2014 12:12, Philipp
e1c1bac6253dc54a1e89ddc046585...@posteo.net wrote:
 Am 22.04.2014 17:28 schrieb Mike Belopuhov:

 more like it's not supported and is not supposed to work.

 not supposed as in 'not wanted'?


not supposed.


 it's like running nginx and apache at the same time but

 Quite frankly: I'm doing that in some locations ;-)


not on the same port (80) though.  ikev2 and isakmp both use
same udp ports (500 and 4500).


 worse since there are kernel tentacles involved as well
 (as you might have figured out already) that will likely

 That's somehow the main problem, the two daemons are not
 trying to share the pfkey2 ioctls outcome.

i don't see it like that.

 So, I can wait til iked supports ikev1, too.

there are no current plans to implement ikev1 support that
i'm aware of.

 Using a different machine will be quite painful at the moment.
 Rock+hard place.


 prevent you from doing that on the same box but different
 ip addresses.

 Nevertheless I'd say that a Listen-on style directive for iked
 would a useful thing[tm], e.g. to default the srcid to that.


perhaps.  currently i believe srcid will default to local if
specified.

 Cheers.




Re: Kill in_localaddr()

2014-04-24 Thread Mike Belopuhov
On 24 April 2014 16:41, Martin Pieuchot mpieuc...@nolizard.org wrote:
 in_localaddr() is used only once in our tree and only if the sysctl
 net.inet.ip.mtudisc is set to 0.

 It is used to optimize the size of the MSS if the forward address
 correspond to a host on one of our subnets.  Since it's an
 optimization for a special case that's not enabled by default, I'd
 like to  kill it to remove one usage of the global list of IPv4
 addresses.

 While here get rid of the #ifdef RTV_MTU, it is here.

 ok?


OK



Re: iked + isakmpd on the same machine

2014-04-24 Thread Mike Belopuhov
On 24 April 2014 20:25, Chris Cappuccio ch...@nmedia.net wrote:
 Mike Belopuhov [m...@belopuhov.com] wrote:

 more like it's not supported and is not supposed to work.
 it's like running nginx and apache at the same time

 hey, nginx and httpd run concurrently quite fine on
 different IP addresses, same box :)

i meant using the same port numbers of course.



Re: iked + isakmpd on the same machine

2014-04-24 Thread Mike Belopuhov
On 24 April 2014 22:25, Alexander Hall alexan...@beard.se wrote:
 On 04/24/14 21:53, Stuart Henderson wrote:

 On 2014/04/24 20:30, Mike Belopuhov wrote:

 On 24 April 2014 20:25, Chris Cappuccio ch...@nmedia.net wrote:

 Mike Belopuhov [m...@belopuhov.com] wrote:


 more like it's not supported and is not supposed to work.
 it's like running nginx and apache at the same time


 hey, nginx and httpd run concurrently quite fine on
 different IP addresses, same box :)


 i meant using the same port numbers of course.


 they can do that fine too! :) just have one hand-off the relevant
 requests to the other.


 If they bind to separate IP addresses that is obviously not a problem, even
 for the same port numbers.

yes. that's precisely what i meant:  you can't bind to the same ipaddr:port
pair twice.   why do i have to chew it and spit it out for you.  it was clear
what i meant from the start.



snmpd: add backend from bgpd to support multiple routing tables

2014-04-28 Thread Mike Belopuhov
This adds ktable code from bgpd to fetch, store and perform lookups
in multiple routing tables.  Currently it doesn't do anything useful
but it's a prerequisite for any future work in this direction.

OK to get this in?

diff --git usr.sbin/snmpd/kroute.c usr.sbin/snmpd/kroute.c
index e157b25..e19f924 100644
--- usr.sbin/snmpd/kroute.c
+++ usr.sbin/snmpd/kroute.c
@@ -45,10 +45,13 @@
 
 #include snmpd.h
 
 extern struct snmpd*env;
 
+struct ktable  **krt;
+u_intkrt_size;
+
 struct {
struct event ks_ev;
u_long   ks_iflastchange;
u_long   ks_nroutes;/* 4 billions enough? */
int  ks_fd;
@@ -77,24 +80,32 @@ struct kif_node {
 
 intkroute_compare(struct kroute_node *, struct kroute_node *);
 intkroute6_compare(struct kroute6_node *, struct kroute6_node *);
 intkif_compare(struct kif_node *, struct kif_node *);
 
-struct kroute_node *kroute_find(in_addr_t, u_int8_t, u_int8_t);
+voidktable_init(void);
+int ktable_new(u_int, u_int);
+voidktable_free(u_int);
+int ktable_exists(u_int, u_int *);
+struct ktable  *ktable_get(u_int);
+int ktable_update(u_int);
+
+struct kroute_node *kroute_find(struct ktable *, in_addr_t, u_int8_t,
+   u_int8_t);
 struct kroute_node *kroute_matchgw(struct kroute_node *,
struct sockaddr_in *);
-int kroute_insert(struct kroute_node *);
-int kroute_remove(struct kroute_node *);
-voidkroute_clear(void);
+int kroute_insert(struct ktable *, struct kroute_node *);
+int kroute_remove(struct ktable *, struct kroute_node *);
+voidkroute_clear(struct ktable *);
 
-struct kroute6_node*kroute6_find(const struct in6_addr *, u_int8_t,
-u_int8_t);
+struct kroute6_node*kroute6_find(struct ktable *, const struct in6_addr *,
+   u_int8_t, u_int8_t);
 struct kroute6_node*kroute6_matchgw(struct kroute6_node *,
struct sockaddr_in6 *);
-int kroute6_insert(struct kroute6_node *);
-int kroute6_remove(struct kroute6_node *);
-voidkroute6_clear(void);
+int kroute6_insert(struct ktable *, struct kroute6_node *);
+int kroute6_remove(struct ktable *, struct kroute6_node *);
+voidkroute6_clear(struct ktable *);
 
 struct kif_arp *karp_find(struct sockaddr *, u_short);
 int karp_insert(struct kif_node *, struct kif_arp *);
 int karp_remove(struct kif_node *, struct kif_arp *);
 
@@ -121,23 +132,21 @@ void  if_newaddr(u_short, struct sockaddr *, 
struct sockaddr *,
struct sockaddr *);
 void   if_deladdr(u_short, struct sockaddr *, struct sockaddr *,
struct sockaddr *);
 void   if_announce(void *);
 
-intfetchtable(void);
+intfetchtable(struct ktable *);
 intfetchifs(u_short);
-intfetcharp(void);
+intfetcharp(struct ktable *);
 void   dispatch_rtmsg(int, short, void *);
 intrtmsg_process(char *, int);
-intdispatch_rtmsg_addr(struct rt_msghdr *,
+intdispatch_rtmsg_addr(struct ktable *, struct rt_msghdr *,
struct sockaddr *[RTAX_MAX]);
 
-RB_HEAD(kroute_tree, kroute_node)  krt;
 RB_PROTOTYPE(kroute_tree, kroute_node, entry, kroute_compare)
 RB_GENERATE(kroute_tree, kroute_node, entry, kroute_compare)
 
-RB_HEAD(kroute6_tree, kroute6_node)krt6;
 RB_PROTOTYPE(kroute6_tree, kroute6_node, entry, kroute6_compare)
 RB_GENERATE(kroute6_tree, kroute6_node, entry, kroute6_compare)
 
 RB_HEAD(kif_tree, kif_node)kit;
 RB_PROTOTYPE(kif_tree, kif_node, entry, kif_compare)
@@ -149,10 +158,11 @@ RB_GENERATE(ka_tree, kif_addr, node, ka_compare)
 
 void
 kr_init(void)
 {
int opt = 0, rcvbuf, default_rcvbuf;
+   unsigned inttid = RTABLE_ANY;
socklen_t   optlen;
 
if ((kr_state.ks_ifd = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
fatal(kr_init: ioctl socket);
 
@@ -179,31 +189,166 @@ kr_init(void)
setsockopt(kr_state.ks_fd, SOL_SOCKET, SO_RCVBUF,
rcvbuf, sizeof(rcvbuf)) == -1  errno == ENOBUFS;
rcvbuf /= 2)
;   /* nothing */
 
-   RB_INIT(krt);
-   RB_INIT(krt6);
+   if (setsockopt(kr_state.ks_fd, AF_ROUTE, ROUTE_TABLEFILTER, tid,
+   sizeof(tid)) == -1)
+   log_warn(kr_init: setsockopt AF_ROUTE ROUTE_TABLEFILTER);
+
RB_INIT(kit);
RB_INIT(kat);
 
if 

Re: data modified on freelist, tmpfs-related?

2014-04-30 Thread Mike Belopuhov
On 30 April 2014 15:55, Mark Kettenis mark.kette...@xs4all.nl wrote:
 Date: Wed, 30 Apr 2014 15:38:39 +0200 (CEST)
 From: Mark Kettenis mark.kette...@xs4all.nl

  Date: Wed, 30 Apr 2014 13:39:20 +0100
  From: Stuart Henderson st...@openbsd.org
 
  Seen when running e2fsprogs regression tests with /tmp on tmpfs

 I'm not surprised; tmpfs contains some serious bugs.  I recommend not
 using it until those are fixed.

 Which means, I'd like somebody else besides espie@ to comment on my
 uvm_aobj.c list manipulation hack diff.


Diff made sense to me when I looked at it, but I would rather hide
direct pointer access :/  Perhaps LIST_SWAP does a tiny bit more,
but it's cleaner and perhaps can be useful in the future.



Re: data modified on freelist, tmpfs-related?

2014-04-30 Thread Mike Belopuhov
On 30 April 2014 16:55, Mark Kettenis mark.kette...@xs4all.nl wrote:
 From: Mike Belopuhov m...@belopuhov.com
 Date: Wed, 30 Apr 2014 16:00:45 +0200

 On 30 April 2014 15:55, Mark Kettenis mark.kette...@xs4all.nl wrote:
  Date: Wed, 30 Apr 2014 15:38:39 +0200 (CEST)
  From: Mark Kettenis mark.kette...@xs4all.nl
 
   Date: Wed, 30 Apr 2014 13:39:20 +0100
   From: Stuart Henderson st...@openbsd.org
  
   Seen when running e2fsprogs regression tests with /tmp on tmpfs
 
  I'm not surprised; tmpfs contains some serious bugs.  I recommend not
  using it until those are fixed.
 
  Which means, I'd like somebody else besides espie@ to comment on my
  uvm_aobj.c list manipulation hack diff.
 

 Diff made sense to me when I looked at it, but I would rather hide
 direct pointer access :/  Perhaps LIST_SWAP does a tiny bit more,
 but it's cleaner and perhaps can be useful in the future.

 I'm not comfortable with introducing more sys/queue.h APIs.  So
 perhaps we should just punt on the optimization and remove/insert all
 list items.  Removing the trap comments that pedro set up...


I agree.



Re: Annoying emacs variable in if_spppsubr.c

2014-05-02 Thread Mike Belopuhov
On 2 May 2014 12:09, Jérémie Courrèges-Anglas j...@wxcvbn.org wrote:

 This one is bugging me each time I start my Emacs session (because Emacs
 now asks confirmation for most variables).  This one would be useful
 only with hilit19.el (obsolete) from editors/emacs21... if the size of
 the file was actually under 12 bytes (it's not anymore).

 ok to kill it?


yes, please.



Re: m-m_pkthdr.rcvif and ip6_input()

2014-05-12 Thread Mike Belopuhov
On 12 May 2014 15:12, Martin Pieuchot mpieuc...@nolizard.org wrote:
 Like the previous diffs, it reduces the number of m-m_pkthdr.rcvif
 occurrences, this time in ip6_input().  Should be no functional change.

 Ok?


OK



Re: Remove p2p loopback hack in nd6_rtrequest()

2014-05-13 Thread Mike Belopuhov
On Mon, May 12, 2014 at 12:48 +0200, Martin Pieuchot wrote:
 On 07/05/14(Wed) 12:46, Martin Pieuchot wrote:
  Diff below stops abusing nd6_rtrequest() for loopback interfaces, which
  means we can remove the special hack below and reduce the differences
  with arp_rtrequest().
  
  This diff introduces two changes in the inet6 routing table, but they
  should not matter.  The first one is that the gateway of the link-local
  entry for loopback interfaces is no longer a buggy link-layer address:
  
  -fe80::1%lo0link#6 UHL  
  00 - 4 lo0  
  +fe80::1%lo0fe80::1%lo0UHL  
  00 - 4 lo0
  
  The second one is that every route to network associated to the
  loopback interface will now have the mtu of this interface:
  
  -ff02::/16  ::1UGRS 
  00 - 8 lo0  
  +ff02::/16  ::1UGRS 
  00 33192 8 lo0  
  
  Both changes are consistent with the IPv4 behavior, ok?
 
 Anyone?
 

OK

  
  
  Index: netinet6/in6.c
  ===
  RCS file: /home/ncvs/src/sys/netinet6/in6.c,v
  retrieving revision 1.136
  diff -u -p -r1.136 in6.c
  --- netinet6/in6.c  5 May 2014 11:44:33 -   1.136
  +++ netinet6/in6.c  7 May 2014 10:29:24 -
  @@ -1399,7 +1399,7 @@ in6_ifinit(struct ifnet *ifp, struct in6
  /* Add ownaddr as loopback rtentry, if necessary (ex. on p2p link). */
  if (newhost) {
  /* set the rtrequest function to create llinfo */
  -   if ((ifp-if_flags  IFF_POINTOPOINT) == 0)
  +   if ((ifp-if_flags  (IFF_LOOPBACK | IFF_POINTOPOINT)) == 0)
  ia6-ia_ifa.ifa_rtrequest = nd6_rtrequest;
   
  rt_ifa_addloop((ia6-ia_ifa));
  Index: netinet6/nd6.c
  ===
  RCS file: /home/ncvs/src/sys/netinet6/nd6.c,v
  retrieving revision 1.116
  diff -u -p -r1.116 nd6.c
  --- netinet6/nd6.c  7 May 2014 08:14:59 -   1.116
  +++ netinet6/nd6.c  7 May 2014 10:29:24 -
  @@ -1059,20 +1059,14 @@ nd6_rtrequest(int req, struct rtentry *r
   #endif
  /* FALLTHROUGH */
  case RTM_RESOLVE:
  -   if ((ifp-if_flags  (IFF_POINTOPOINT | IFF_LOOPBACK)) == 0) {
  -   /*
  -* Address resolution isn't necessary for a point to
  -* point link, so we can skip this test for a p2p link.
  -*/
  -   if (gate-sa_family != AF_LINK ||
  -   gate-sa_len  sizeof(null_sdl)) {
  -   log(LOG_DEBUG, %s: bad gateway value: %s\n,
  -   __func__, ifp-if_xname);
  -   break;
  -   }
  -   SDL(gate)-sdl_type = ifp-if_type;
  -   SDL(gate)-sdl_index = ifp-if_index;
  +   if (gate-sa_family != AF_LINK ||
  +   gate-sa_len  sizeof(null_sdl)) {
  +   log(LOG_DEBUG, %s: bad gateway value: %s\n,
  +   __func__, ifp-if_xname);
  +   break;
  }
  +   SDL(gate)-sdl_type = ifp-if_type;
  +   SDL(gate)-sdl_index = ifp-if_index;
  if (ln != NULL)
  break;  /* This happens on a route change */
  /*
  
 



Re: snmpd: add backend from bgpd to support multiple routing tables

2014-05-13 Thread Mike Belopuhov
On Mon, Apr 28, 2014 at 14:20 +0200, Mike Belopuhov wrote:
 This adds ktable code from bgpd to fetch, store and perform lookups
 in multiple routing tables.  Currently it doesn't do anything useful
 but it's a prerequisite for any future work in this direction.
 
 OK to get this in?
 

Any objections?

 diff --git usr.sbin/snmpd/kroute.c usr.sbin/snmpd/kroute.c
 index e157b25..e19f924 100644
 --- usr.sbin/snmpd/kroute.c
 +++ usr.sbin/snmpd/kroute.c
 @@ -45,10 +45,13 @@
  
  #include snmpd.h
  
  extern struct snmpd  *env;
  
 +struct ktable**krt;
 +u_int  krt_size;
 +
  struct {
   struct event ks_ev;
   u_long   ks_iflastchange;
   u_long   ks_nroutes;/* 4 billions enough? */
   int  ks_fd;
 @@ -77,24 +80,32 @@ struct kif_node {
  
  int  kroute_compare(struct kroute_node *, struct kroute_node *);
  int  kroute6_compare(struct kroute6_node *, struct kroute6_node *);
  int  kif_compare(struct kif_node *, struct kif_node *);
  
 -struct kroute_node   *kroute_find(in_addr_t, u_int8_t, u_int8_t);
 +void  ktable_init(void);
 +int   ktable_new(u_int, u_int);
 +void  ktable_free(u_int);
 +int   ktable_exists(u_int, u_int *);
 +struct ktable*ktable_get(u_int);
 +int   ktable_update(u_int);
 +
 +struct kroute_node   *kroute_find(struct ktable *, in_addr_t, u_int8_t,
 + u_int8_t);
  struct kroute_node   *kroute_matchgw(struct kroute_node *,
   struct sockaddr_in *);
 -int   kroute_insert(struct kroute_node *);
 -int   kroute_remove(struct kroute_node *);
 -void  kroute_clear(void);
 +int   kroute_insert(struct ktable *, struct kroute_node *);
 +int   kroute_remove(struct ktable *, struct kroute_node *);
 +void  kroute_clear(struct ktable *);
  
 -struct kroute6_node  *kroute6_find(const struct in6_addr *, u_int8_t,
 -  u_int8_t);
 +struct kroute6_node  *kroute6_find(struct ktable *, const struct in6_addr *,
 + u_int8_t, u_int8_t);
  struct kroute6_node  *kroute6_matchgw(struct kroute6_node *,
   struct sockaddr_in6 *);
 -int   kroute6_insert(struct kroute6_node *);
 -int   kroute6_remove(struct kroute6_node *);
 -void  kroute6_clear(void);
 +int   kroute6_insert(struct ktable *, struct kroute6_node *);
 +int   kroute6_remove(struct ktable *, struct kroute6_node *);
 +void  kroute6_clear(struct ktable *);
  
  struct kif_arp   *karp_find(struct sockaddr *, u_short);
  int   karp_insert(struct kif_node *, struct kif_arp *);
  int   karp_remove(struct kif_node *, struct kif_arp *);
  
 @@ -121,23 +132,21 @@ voidif_newaddr(u_short, struct sockaddr *, 
 struct sockaddr *,
   struct sockaddr *);
  void if_deladdr(u_short, struct sockaddr *, struct sockaddr *,
   struct sockaddr *);
  void if_announce(void *);
  
 -int  fetchtable(void);
 +int  fetchtable(struct ktable *);
  int  fetchifs(u_short);
 -int  fetcharp(void);
 +int  fetcharp(struct ktable *);
  void dispatch_rtmsg(int, short, void *);
  int  rtmsg_process(char *, int);
 -int  dispatch_rtmsg_addr(struct rt_msghdr *,
 +int  dispatch_rtmsg_addr(struct ktable *, struct rt_msghdr *,
   struct sockaddr *[RTAX_MAX]);
  
 -RB_HEAD(kroute_tree, kroute_node)krt;
  RB_PROTOTYPE(kroute_tree, kroute_node, entry, kroute_compare)
  RB_GENERATE(kroute_tree, kroute_node, entry, kroute_compare)
  
 -RB_HEAD(kroute6_tree, kroute6_node)  krt6;
  RB_PROTOTYPE(kroute6_tree, kroute6_node, entry, kroute6_compare)
  RB_GENERATE(kroute6_tree, kroute6_node, entry, kroute6_compare)
  
  RB_HEAD(kif_tree, kif_node)  kit;
  RB_PROTOTYPE(kif_tree, kif_node, entry, kif_compare)
 @@ -149,10 +158,11 @@ RB_GENERATE(ka_tree, kif_addr, node, ka_compare)
  
  void
  kr_init(void)
  {
   int opt = 0, rcvbuf, default_rcvbuf;
 + unsigned inttid = RTABLE_ANY;
   socklen_t   optlen;
  
   if ((kr_state.ks_ifd = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
   fatal(kr_init: ioctl socket);
  
 @@ -179,31 +189,166 @@ kr_init(void)
   setsockopt(kr_state.ks_fd, SOL_SOCKET, SO_RCVBUF,
   rcvbuf, sizeof(rcvbuf)) == -1  errno == ENOBUFS;
   rcvbuf /= 2)
   ;   /* nothing */
  
 - RB_INIT(krt);
 - RB_INIT(krt6);
 + if (setsockopt(kr_state.ks_fd, AF_ROUTE, ROUTE_TABLEFILTER, tid,
 + sizeof(tid)) == -1)
 + log_warn(kr_init: setsockopt AF_ROUTE

Re: MI MTU size for lo(4)

2014-05-13 Thread Mike Belopuhov
On 13 May 2014 15:45, Claudio Jeker cje...@diehard.n-r-g.com wrote:
 With KAME the MTU size of the loopback interface became strange and is
 actually dependend on the architecture. I see no point in all this
 just go back to the way it was long long long ago and just use 32k as the
 MTU. AFAIK all of this was only done to test large IPv6 packets but why
 MHLEN and MLEN were added is still beyond my imagination.


In fact this comes from this commit in NetBSD originally:
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/net/if_loop.c#rev1.20

Comment says Add MHLEN + MLEN extra space to LOMTU for IP and
transport headers.  This doesn't make me understand why this
is needed however.  So unless this clues you in, I'm OK with
setting it to 32k everywhere.



Re: MI MTU size for lo(4)

2014-05-13 Thread Mike Belopuhov
On 13 May 2014 16:05, Mike Belopuhov m...@belopuhov.com wrote:
 On 13 May 2014 15:45, Claudio Jeker cje...@diehard.n-r-g.com wrote:
 With KAME the MTU size of the loopback interface became strange and is
 actually dependend on the architecture. I see no point in all this
 just go back to the way it was long long long ago and just use 32k as the
 MTU. AFAIK all of this was only done to test large IPv6 packets but why
 MHLEN and MLEN were added is still beyond my imagination.


 In fact this comes from this commit in NetBSD originally:
 http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/net/if_loop.c#rev1.20

 Comment says Add MHLEN + MLEN extra space to LOMTU for IP and
 transport headers.  This doesn't make me understand why this
 is needed however.  So unless this clues you in, I'm OK with
 setting it to 32k everywhere.

And purely for your amusement: FreeBSD have decreased MTU from
64k to 16k in '95:

http://svnweb.freebsd.org/base/head/sys/net/if_loop.c?r1=6875r2=6876

Perhaps (and I'm somewhat speculating here) because of TCP
performance problems, like so:

http://marc.info/?l=netbsd-current-usersm=102684186303106w=2
(fixed diff is in the next mail though)



Re: PRU_BIND in raw ip

2014-06-02 Thread Mike Belopuhov
On 28 May 2014 13:36, Martin Pieuchot mpieuc...@nolizard.org wrote:
 On 28/05/14(Wed) 09:30, Jérémie Courrèges-Anglas wrote:
 Martin Pieuchot mpieuc...@nolizard.org writes:

  Diff below replace in_iawithaddr() + in_broadcast() - ifa_ifwithaddr(),
  that does the same for IPv4 since broadcast addresses are added to the
  tree.

 This prevents listeners to bind on 255.255.255.255, something allowed
 with the current code.  Thoughts?

 You're right, here's an updated diff that keeps this behavior.


OK



Re: Remove a global variable in ip_input

2014-06-04 Thread Mike Belopuhov
On 4 June 2014 12:30, Martin Pieuchot mpieuc...@nolizard.org wrote:
 ok?


sure



Re: in_pcbbind() and in_broadcast/in_iawithaddr

2014-06-04 Thread Mike Belopuhov
On 3 June 2014 09:18, Martin Pieuchot mpieuc...@nolizard.org wrote:
 On 02/06/14(Mon) 15:45, Martin Pieuchot wrote:
 This diff is similar to the one that has been committed to handle the
 SOCK_RAW binding.  I'd like to stop using in_iawithaddr() *and*
 in_broadcast().  Since these functions are just doing an iteration on
 all the addresses present in the RB-tree (or equivalent), let's use
 ifa_ifwithaddr() instead.

 This diff should not introduce any behavior change concerning SOCK_DGRAM
 and binding to multicast addresses.

 As pointed out by jca@ this diff breaks on loopback.  This is because
 the loopback IPv4 addresses are abusing the dstaddr field to be able to
 create a route to their address.  Hopefully this hack can be removed
 once the local route diff is in, but in the meantime let's use the less
 intuitive but equivalent idiom:

 sin-sin_addr.s_addr != ia-ia_addr.s_addr

 Update diff below, is this one ok?


OK



Re: pf anchor references

2014-06-05 Thread Mike Belopuhov
On Mon, Jun 02, 2014 at 17:51 +0200, Mike Belopuhov wrote:
 Hi,
 
 I've been chasing some bugs in the pfctl anchor code for a couple
 of weeks and I'm not astonished at how loose the handling is in
 general.  Lot's of rules and checks are being violated by some
 code paths while honoured by others.  The problem I have is that
 it's truly a rabbit's hole: almost every bug I hit is a feature
 in disguise.
 
 One thing that I particularly hate is called an anchor reference.
 A good example of this is a ruleset like this:
 
 anchor foo {
 block all
 }
 anchor foo
 
 If you believe it should be a syntax error let me disappoint you.
 The second 'anchor foo' is just a reference to the anchor named
 foo defined before:
 
 # echo -e 'anchor foo {\n block all\n}\nanchor foo' | pfctl -f -
 # pfctl -a '*' -sr
 anchor foo all {
   block drop all
 }
 anchor foo all {
   block drop all
 }
 # echo -e 'pass all' | pfctl -a 'foo' -f -
 # pfctl -a '*' -sr
 anchor foo all {
   pass all flags S/SA
 }
 anchor foo all {
   pass all flags S/SA
 }
 
 And to demonstrate a reference specified by an absolute path we can
 add anchor /foo inside foo:
 
 # echo -e 'anchor /foo' | pfctl -a 'foo' -f -
 
 This obviously gets us an endless loop if we decide to print out
 contents recursively (pfctl -a '*' -sr).  Thankfully this doesn't
 affect ruleset evaluation in the kernel.
 
 The basic question I have is why do we need this dangerous and (at
 least in my eyes) pretty useless mechanism?
 
 Any opinions on this?  Does anyone make any sensible use of it?
 Should we try to simplify this by removing ambiguous functionality?
 
 Cheers,
 Mike

While I've got a few responses on tech supporting me, I should
probably ask misc@ as well for additional scrutiny.  And besides,
I've come up with an example that might simplify ruleset creation
for say multiple customer vlans or rdomains.  Does anyone use it
like that?  This looks like an anchor template that we want to
specify but not use directly in the main ruleset, but it plays
somewhat nicely with quick anchors.

(The ruleset below was tested and works correctly)

Cheers,
Mike


block all

anchor customer1 quick on rdomain 1 {
anchor /allow-egress
anchor /allow-ssh
anchor /allow-icmp-echo
}

anchor customer2 quick on rdomain 2 {
anchor /allow-egress
anchor /allow-ssh
}

pass in quick on rdomain 0 proto tcp to (self) port ssh
pass out quick on rdomain 0

# end of ruleset evaluation
block quick

anchor allow-ssh {
pass in proto tcp to (self) port ssh
}

anchor allow-icmp-echo {
pass in inet proto icmp to (self) icmp-type echoreq code 0
}

anchor allow-egress {
pass out
}



Re: pfctl: stricter redirect specs

2014-06-24 Thread Mike Belopuhov
On Tue, Jun 24, 2014 at 15:07 +0200, Mike Belopuhov wrote:
 I have carefully tested that and do not expect any unrelated
 fallout.  And for the reasons stated above I don't believe
 anyone is using this since it's largely error prone.
 

and a regress chunk that avoids using such combination.

note that this diff basically finishes previous attemts at
modifying tests to match present syntax.

diff --git regress/sbin/pfctl/pf48.in regress/sbin/pfctl/pf48.in
index 540711a..a0dd143 100644
--- regress/sbin/pfctl/pf48.in
+++ regress/sbin/pfctl/pf48.in
@@ -1,12 +1,12 @@
 table  regress  { 1.2.3.4 !5.6.7.8 10/8 lo0 }
 table regress.1 const { ::1 fe80::/64 }
 table regress.a { 1.2.3.4 !5.6.7.8 } { ::1 ::2 ::3 } file /dev/null const 
{ 4.3.2.1 }
-match out on lo0 from  regress.1 to regress.2 nat-to lo0:0
+match out on lo0 inet from  regress.1 to regress.2 nat-to lo0:0
 match out on !lo0 inet from !regress.1  to regress.2 nat-to lo0:0
 match in on lo0 inet6 from regress.1 to regress.2 rdr-to lo0:0
-match in on !lo0 from ! regress.1  to regress.2 rdr-to lo0:0
+match in on !lo0 inet6 from ! regress.1  to regress.2 rdr-to lo0:0
 match in from { regress.1 !regress.2 } to any
 match out from any to { !regress.1, regress.2 }
 pass in from regress to any
 pass out from any to regress 
 pass in from { regress.1 regress.2 } to any
diff --git regress/sbin/pfctl/pf48.loaded regress/sbin/pfctl/pf48.loaded
index 68b9854..cfdc8a2 100644
--- regress/sbin/pfctl/pf48.loaded
+++ regress/sbin/pfctl/pf48.loaded
@@ -1,7 +1,7 @@
-@0 match out on lo0 inet6 from regress.1:2 to regress.2:* nat-to ::1
-  [ Skip steps: d=2 p=end da=4 sp=end dp=end ]
+@0 match out on lo0 inet from regress.1:2 to regress.2:* nat-to 127.0.0.1
+  [ Skip steps: d=2 f=2 p=end da=4 sp=end dp=end ]
   [ queue: qname= qid=0 pqname= pqid=0 ]
   [ Evaluations: 0 Packets: 0 Bytes: 0   States: 0 
]
 @1 match out on ! lo0 inet from ! regress.1:2 to regress.2:* nat-to 
127.0.0.1
   [ Skip steps: p=end da=4 sp=end dp=end ]
   [ queue: qname= qid=0 pqname= pqid=0 ]
diff --git regress/sbin/pfctl/pf48.ok regress/sbin/pfctl/pf48.ok
index 5f6e6ab..aeccfcf 100644
--- regress/sbin/pfctl/pf48.ok
+++ regress/sbin/pfctl/pf48.ok
@@ -1,9 +1,9 @@
 table regress { 1.2.3.4 !5.6.7.8 10.0.0.0/8 ::1 fe80::1 127.0.0.1 }
 table regress.1 const { ::1 fe80::/64 }
 table regress.a const { 1.2.3.4 !5.6.7.8 ::1 ::2 ::3 } file /dev/null { 
4.3.2.1 }
-match out on lo0 inet6 from regress.1 to regress.2 nat-to ::1
+match out on lo0 inet from regress.1 to regress.2 nat-to 127.0.0.1
 match out on ! lo0 inet from ! regress.1 to regress.2 nat-to 127.0.0.1
 match in on lo0 inet6 from regress.1 to regress.2 rdr-to ::1
 match in on ! lo0 inet6 from ! regress.1 to regress.2 rdr-to ::1
 match in from regress.1 to any
 match in from ! regress.2 to any
diff --git regress/sbin/pfctl/pf48.optimized regress/sbin/pfctl/pf48.optimized
index 63309a7..90e2036 100644
--- regress/sbin/pfctl/pf48.optimized
+++ regress/sbin/pfctl/pf48.optimized
@@ -1,7 +1,7 @@
-@0 match out on lo0 inet6 from regress.1:2 to regress.2:* nat-to ::1
-  [ Skip steps: d=2 p=end da=4 sp=end dp=end ]
+@0 match out on lo0 inet from regress.1:2 to regress.2:* nat-to 127.0.0.1
+  [ Skip steps: d=2 f=2 p=end da=4 sp=end dp=end ]
   [ queue: qname= qid=0 pqname= pqid=0 ]
   [ Evaluations: 0 Packets: 0 Bytes: 0   States: 0 
]
 @1 match out on ! lo0 inet from ! regress.1:2 to regress.2:* nat-to 
127.0.0.1
   [ Skip steps: p=end da=4 sp=end dp=end ]
   [ queue: qname= qid=0 pqname= pqid=0 ]
diff --git regress/sbin/pfctl/pf98.in regress/sbin/pfctl/pf98.in
index a18a491..a224f9e 100644
--- regress/sbin/pfctl/pf98.in
+++ regress/sbin/pfctl/pf98.in
@@ -1,4 +1,4 @@
 # Test rule order processing should pass (require-order no longer required)
 pass in on lo100 all
-match out on lo0 all nat-to lo0
+match out on lo0 inet6 all nat-to lo0
 



pfctl: better af-to translation specs handling

2014-06-25 Thread Mike Belopuhov
Hi,

collapse_redirspec is one of my pet peeve since the af-to support.
Unfortunately we didn't put much effort in making it work well back
then, but now it is time for a change!  Improving upon the last
diff here's a collapsed version of the collapse_redirspec, so to
speak.  Instead of having two loops: first counts addresses, the
second one performs an action, here's a merged version that has all
the checks done once and in the right order.

No changes in the regress test output (all tests pass) and the
gain is that we no longer need to explicitely set the address
family for af-to rules, it will be deduced in most of the cases.
For example, previously you'd have to do:

  pass in inet af-to inet6 from ::1

While it's clear that you're translating IPv4 to the IPv6.  Now
it's possible to omit the 'inet' part:

  pass in af-to inet6 from ::1

Which is in line with how nat-to and rdr-to target addresses
affect selection of the address family for the whole rule, i.e.:

  % echo 'pass in rdr-to ::1' | pfctl -vnf -
  pass in inet6 all flags S/SA rdr-to ::1

(Of course af-to rule still needs the 'inet6' part, but that
will require some other changes as well...)

OK?

P.S. This diff moves chunks around quite a bit and might require
applying to tell the difference.

diff --git sbin/pfctl/parse.y sbin/pfctl/parse.y
index ce80f8e..fabe210 100644
--- sbin/pfctl/parse.y
+++ sbin/pfctl/parse.y
@@ -2025,26 +2025,21 @@ filter_opt  : USER uids {
filter_opts.nat.rdr = $2;
memcpy(filter_opts.nat.pool_opts, $3,
sizeof(filter_opts.nat.pool_opts));
}
| AFTO af FROM redirpool pool_opts {
if (filter_opts.nat.rdr) {
yyerror(cannot respecify af-to);
YYERROR;
}
if ($2 == 0) {
-   yyerror(no address family specified);
-   YYERROR;
-   }
-   if ($4-host-af  $4-host-af != $2) {
-   yyerror(af-to addresses must be in the 
-   target address family);
+   yyerror(no target address family specified);
YYERROR;
}
filter_opts.nat.af = $2;
filter_opts.nat.rdr = $4;
memcpy(filter_opts.nat.pool_opts, $5,
sizeof(filter_opts.nat.pool_opts));
filter_opts.rdr.rdr =
calloc(1, sizeof(struct redirection));
bzero(filter_opts.rdr.pool_opts,
sizeof(filter_opts.rdr.pool_opts));
@@ -3482,21 +3477,21 @@ redirspec   : host optweight{
n-weight = $2;
}
$$ = $1;
}
| '{' optnl redir_host_list '}' { $$ = $3; }
;
 
 redir_host_list: host optweight optnl  {
if ($1-addr.type != PF_ADDR_ADDRMASK) {
free($1);
-   yyerror(only addresses can be listed for
+   yyerror(only addresses can be listed for 
redirection pools );
YYERROR;
}
if ($2  0) {
struct node_host*n;
for (n = $1; n != NULL; n = n-next)
n-weight = $2;
}
$$ = $1;
}
@@ -4288,118 +4283,139 @@ expand_queue(char *qname, struct node_if *interfaces, 
struct queue_opts *opts)
 
FREE_LIST(struct node_if, interfaces);
return (0);
 }
 
 int
 collapse_redirspec(struct pf_pool *rpool, struct pf_rule *r,
 struct redirspec *rs, u_int8_t allow_if)
 {
struct pf_opt_tbl *tbl = NULL;
-   struct node_host *h;
+   struct node_host *h, *hprev = NULL;
struct pf_rule_addr ra;
-   int af = 0, i = 0;
+   int af = 0, naddr = 0;
 
if (!rs || !rs-rdr || rs-rdr-host == NULL) {
rpool-addr.type = PF_ADDR_NONE;
return (0);
}
 
if (r-rule_flag  PFRULE_AFTO)
r-naf = rs-af;
 
-   /* count matching addresses */
for (h = rs-rdr-host; h != NULL; h = h-next) {
-   if (h-af) {
-   /* check that af-to target address has correct af */
-   if (rs-af  rs-af != h-af) {
-   yyerror(af mismatch in %s spec,
-   allow_if ? routing : translation);
-  

Re: ANONCVS MIRROR MAINTAINERS.. YOU NEED TO READ THIS!

2014-06-26 Thread Mike Belopuhov
On 26 June 2014 08:53, patrick keshishian sids...@boxsoft.com wrote:
 On Wed, Jun 25, 2014 at 10:01:06PM -0700, patrick keshishian wrote:
 On Thu, Jun 26, 2014 at 06:37:00AM +0200, Alexander Hall wrote:
  On 06/25/14 20:52, Bob Beck wrote:
  If you or someone you love runs an anoncvs server, they need to see this.
  
  As you know we recently added commitid support to cvs, and we had
  you update your cvsync binary.
  
  Unfortunately, the fix wasn't quite right.  We ran into problems
  with the synching of commitid files. naddy managed to cook
  a correct fix for us.
  
  anoncvs.ca (the fanout machine) has been fixed - again. What you
  need to do is to update your cvsync to the latest one that was
  just committed into ports (cvsync-0.25.0pre0 or later). Remove your
  scanfile (if any).
  
  Once you do that, re-run cvsync and you should be back in business.
 
  Is it known whether this requires a fixed upstream cvsync first, or if
  it is ok to apply the fix and your repo will become fine whenever the
  upstream is fixed?
 
  Anyone with insight, please feel free to chime in.


 The commit message suggests, at least, the server must be
 updated:

 | Add full support for commitid and bump protocol version
 | Old clients will receive updates with commitid stripped out.


 It seems, running updated cvsync against an un-updated
 cvsyncd (I assume) results in failure:

 Connecting to anoncvs1.usa.openbsd.org port 
 Connected to 149.20.54.217 port 
 Running...
 Updating (collection openbsd-src/rcs)
 Updater(RCS): UPDATE(delta): error
 Updater(RCS): UPDATE: 
 /cvs/anoncvs/cvs/obsd/src/lib/libc/string/timingsafe_bcmp.3,v
 Updater: RCS Error
 Socket Error: recv: 2 residue 2
 Receiver Error: recv
 Mux(SEND) Error: not running: 0
 DirScan: RCS Error
 Mux(SEND) Error: not running: 1
 FileScan(RCS): UPDATE /cvs/anoncvs/cvs/obsd/src/sbin/pfctl/parse.y,v
 FileScan: RCS Error
 Failed



removing offending files locally and restarting cvsync helps.



Re: PF Once rules are not removed from main anchor

2014-07-02 Thread Mike Belopuhov
On 21 June 2014 15:36, Alexandr Nedvedicky alexandr.nedvedi...@oracle.com 
wrote:
 Hello,

 I'm not sure it is the right place to submit patches. Let me know if there is
 better/more appropriate address for this.

 during our testing we've found the once rules are not removed,
 when used in main anchor.


Correct.  I've addedd the ruleset pointer check to prevent that
shortly before the release since it caused panics...

 during debugging we found the rules in main anchor have member anchor set to
 NULL (pf_rule::anchor). This makes pf_purge_rule() function to bail out
 to early without removing the rule from ruleset.

 patch below fixed problem for us.


However, this solution is not correct for us.  Perhaps you have some
other changes in your tree to make it work.

 Index: pf_ioctl.c
 ===
 RCS file: /cvs/src/sys/net/pf_ioctl.c,v
 retrieving revision 1.272
 diff -u -r1.272 pf_ioctl.c
 --- pf_ioctl.c  22 Apr 2014 14:41:03 -  1.272
 +++ pf_ioctl.c  20 Jun 2014 14:26:22 -
 @@ -312,7 +312,7 @@
  {
 u_int32_tnr;

 -   if (ruleset == NULL || ruleset-anchor == NULL)
 +   if (ruleset == NULL)
 return;


ruleset-anchor is useless since nothing really checks for it if
ruleset is NULL.

 pf_rm_rule(ruleset-rules.active.ptr, rule);
 @@ -325,7 +325,10 @@
 ruleset-rules.active.ticket++;

 pf_calc_skip_steps(ruleset-rules.active.ptr);
 -   pf_remove_if_empty_ruleset(ruleset);
 +
 +   if (ruleset != pf_main_ruleset) {
 +   pf_remove_if_empty_ruleset(ruleset);
 +   }
  }


You don't need to check ruleset against pf_main_ruleset since
the first thing pf_remove_if_empty_ruleset does is bail if
ruleset == pf_main_ruleset.  This bit confused me quite a bit.

What really is a problem for us is that when pf_purge_rule is
called on a main ruleset from pf_test_rule the first argument
(ruleset) is NULL and not pf_main_ruleset, which would be
sensible.  The only other user of it, AFAICT, is pflog_packet
but it checks for ruleset-anchor before it does it's thing.
I don't see any reason why we can't start our rule set iteration
with 'ruleset' set to pf_main_ruleset (regress tests agree).

OK?

diff --git sys/net/pf.c sys/net/pf.c
index 71f85d1..562901d 100644
--- sys/net/pf.c
+++ sys/net/pf.c
@@ -3163,10 +3163,11 @@ pf_test_rule(struct pf_pdesc *pd, struct pf_rule **rm, 
struct pf_state **sm,
}
break;
 #endif /* INET6 */
}
 
+   ruleset = pf_main_ruleset;
r = TAILQ_FIRST(pf_main_ruleset.rules.active.ptr);
while (r != NULL) {
r-evaluations++;
PF_TEST_ATTRIB((pfi_kif_match(r-kif, pd-kif) == r-ifnot),
r-skip[PF_SKIP_IFP].ptr);
diff --git sys/net/pf_ioctl.c sys/net/pf_ioctl.c
index 5ad762c..86e987f 100644
--- sys/net/pf_ioctl.c
+++ sys/net/pf_ioctl.c
@@ -308,24 +308,19 @@ pf_rm_rule(struct pf_rulequeue *rulequeue, struct pf_rule 
*rule)
 }
 
 void
 pf_purge_rule(struct pf_ruleset *ruleset, struct pf_rule *rule)
 {
-   u_int32_tnr;
+   u_int32_tnr = 0;
 
-   if (ruleset == NULL || ruleset-anchor == NULL)
-   return;
+   KASSERT(ruleset != NULL  rule != NULL);
 
pf_rm_rule(ruleset-rules.active.ptr, rule);
ruleset-rules.active.rcount--;
-
-   nr = 0;
TAILQ_FOREACH(rule, ruleset-rules.active.ptr, entries)
rule-nr = nr++;
-
ruleset-rules.active.ticket++;
-
pf_calc_skip_steps(ruleset-rules.active.ptr);
pf_remove_if_empty_ruleset(ruleset);
 }
 
 u_int16_t



Re: [PATCH] rdomain support on rc.d

2014-07-11 Thread Mike Belopuhov
On 11 July 2014 10:29, Antoine Jacoutot ajacou...@bsdfrog.org wrote:
 On Thu, Jul 10, 2014 at 06:51:01PM +0200, Loďc BLOT wrote:
 Hello all,
 I use rdomains to split routing domains per company and also separate
 administration interfaces from routing interfaces on my routers (sshd,
 bacula, postfix and puppetd running on a dedicated rdomain)

 Actually there is a problem with rdomains, we need to modify /etc/rc.d
 scripts to add rdomain execution environment to the specified service.
 If rc.subr have support to rdomains, we can let the rc.d scripts clean.

 To resolve those rdomain issues, I created a patch and I added a new
 variable we could use on rc.conf(.local), ${_name}_rdomain. (This
 variable needs a signed integer and use an existing rdomain, this is
 checked by rc.subr.

 I want to contribute to OpenBSD and I give you this patch. If you have
 any suggestions to improve it, tell me.

 I don't use rdomain so someone knowledgeable should comment here.
 But it does look like a nice idea.


having something like this would be really cool.  in case you'll be
tweaking the code, make sure that the route -T exec printf check
is preserved.  i would use true in this test however.

as far as i can tell the daemon_rdomain bit that goes into the rc
script is fine, however i'm not quite sure how can i start two
daemons in different rdomains via rc.conf.local.  looks like this
diff doesn't handle this and allows only one instance in the
${_name}_rdomain rdomain.  but sometimes you want multiple, say
sshd in rdomain 0 and 1.  daemon_rdomain flag allows me to go and
create another rc.d/sshd-rdomain-1 script and stuff daemon_rdomain=1
in there.  but then i'd have to add it to the pkg_scripts...  this
is a minor issue that i see.  perhaps ${_name}_rdomain should list
multiple values, like sshd_rdomain=0,1,2,3.



pf: fixup pf_step_into_anchor to save current anchor rule pointer (2)

2014-07-22 Thread Mike Belopuhov
Hi,

This is a second diff and it makes sure that pf_step_into_anchor always
saves a pointer to the rule that owns the anchor on the pf anchor stack.
There's no reason why we should check for depth here.  As a side effect
this makes sure that the correct nested anchor gets it's counter bumped
instead of the top most.

For the save/restore symmetry pf_step_out_of_anchor is made to always
restore previous value of the anchor rule.  depth == 0 means what we a
at the top (main ruleset).

OK?

diff --git sys/net/pf.c sys/net/pf.c
index 9832e47..8708417 100644
--- sys/net/pf.c
+++ sys/net/pf.c
@@ -2666,11 +2666,11 @@ pf_step_into_anchor(int *depth, struct pf_ruleset **rs,
if (*depth = sizeof(pf_anchor_stack) /
sizeof(pf_anchor_stack[0])) {
log(LOG_ERR, pf_step_into_anchor: stack overflow\n);
*r = TAILQ_NEXT(*r, entries);
return;
-   } else if (*depth == 0  a != NULL)
+   } else if (a != NULL)
*a = *r;
f = pf_anchor_stack + (*depth)++;
f-rs = *rs;
f-r = *r;
if ((*r)-anchor_wildcard) {
@@ -2711,10 +2711,12 @@ pf_step_out_of_anchor(int *depth, struct pf_ruleset 
**rs,
}
}
(*depth)--;
if (*depth == 0  a != NULL)
*a = NULL;
+   else if (a != NULL)
+   *a = f-r;
*rs = f-rs;
if (*match  *depth) {
*match = *depth;
if (f-r-quick)
quick = 1;



pf: once for match rules?

2014-07-22 Thread Mike Belopuhov
Hi,

Before I send a diff for pfctl to disable once on match rules,
I've decided to try and see how much work is it to make it actually
work.  Turns out that I need to extend pf_rule_item by 3 pointers
to track the match rule ruleset, anchor rule and the ruleset it
belongs to.

Here's what this means in practice.  Consider a ruleset:

 block drop all
 match out log proto tcp to port 22 once
 anchor foo all {
   match out log proto tcp to port 22 once
   anchor bar all {
 match out log proto tcp to port 22 once
 pass out quick proto tcp to port 22 once
   }
 }

Once we send a packet to port 22 the ruleset collapses to just:

 block drop all

Thoughts?

diff --git sys/net/pf.c sys/net/pf.c
index 9f0e2d6..5679a40 100644
--- sys/net/pf.c
+++ sys/net/pf.c
@@ -3279,15 +3279,16 @@ pf_test_rule(struct pf_pdesc *pd, struct pf_rule **rm, 
struct pf_state **sm,
PR_NOWAIT)) == NULL) {
REASON_SET(reason, PFRES_MEMORY);
goto cleanup;
}
ri-r = r;
+   ri-ar = a;
+   ri-rs = ruleset;
+   ri-ars = aruleset;
/* order is irrelevant */
SLIST_INSERT_HEAD(rules, ri, entry);
pf_rule_to_actions(r, act);
-   if (r-rule_flag  PFRULE_AFTO)
-   pd-naf = r-naf;
if (pf_get_transaddr(r, pd, sns, nr) == -1) {
REASON_SET(reason, PFRES_TRANSLATE);
goto cleanup;
}
if (r-log || act.log  PF_LOG_MATCHES) {
@@ -3428,10 +3429,12 @@ pf_test_rule(struct pf_pdesc *pd, struct pf_rule **rm, 
struct pf_state **sm,
virtual_type, icmp_dir);
}
} else {
while ((ri = SLIST_FIRST(rules))) {
SLIST_REMOVE_HEAD(rules, entry);
+   if (ri-r-rule_flag  PFRULE_ONCE)
+   pf_purge_rule(ri-rs, ri-r, ri-ars, ri-ar);
pool_put(pf_rule_item_pl, ri);
}
}
 
/* copy back packet headers if needed */
@@ -3454,10 +3457,23 @@ pf_test_rule(struct pf_pdesc *pd, struct pf_rule **rm, 
struct pf_state **sm,
}
 #endif
 
if (r-rule_flag  PFRULE_ONCE)
pf_purge_rule(ruleset, r, aruleset, a);
+   if (*sm) {
+   SLIST_FOREACH(ri, (*sm)-match_rules, entry) {
+   if (ri-r-rule_flag  PFRULE_ONCE)
+   /*
+* We can be sure that pf_purge_rule won't
+* pool_put the rule because when *sm != NULL
+* STATE_INC_COUNTERS has increased states_cur.
+* pf_rule_item's and rules will be g/c'ed by
+* pf_free_state.
+*/
+   pf_purge_rule(ri-rs, ri-r, ri-ars, ri-ar);
+   }
+   }
 
 #if INET  INET6
if (rewrite  skw-af != sks-af)
return (PF_AFRT);
 #endif /* INET  INET6 */
diff --git sys/net/pfvar.h sys/net/pfvar.h
index a0d94f7..49af7b4 100644
--- sys/net/pfvar.h
+++ sys/net/pfvar.h
@@ -691,10 +691,13 @@ struct pf_threshold {
 };
 
 struct pf_rule_item {
SLIST_ENTRY(pf_rule_item)entry;
struct pf_rule  *r;
+   struct pf_rule  *ar;
+   struct pf_ruleset   *rs;
+   struct pf_ruleset   *ars;
 };
 
 SLIST_HEAD(pf_rule_slist, pf_rule_item);
 
 enum pf_sn_types { PF_SN_NONE, PF_SN_NAT, PF_SN_RDR, PF_SN_ROUTE, PF_SN_MAX };



Re: pf: once for match rules?

2014-08-12 Thread Mike Belopuhov
On Tue, Jul 22, 2014 at 19:03 +0200, Mike Belopuhov wrote:
 Hi,
 
 Before I send a diff for pfctl to disable once on match rules,
 I've decided to try and see how much work is it to make it actually
 work.  Turns out that I need to extend pf_rule_item by 3 pointers
 to track the match rule ruleset, anchor rule and the ruleset it
 belongs to.
 
 Here's what this means in practice.  Consider a ruleset:
 
  block drop all
  match out log proto tcp to port 22 once
  anchor foo all {
match out log proto tcp to port 22 once
anchor bar all {
  match out log proto tcp to port 22 once
  pass out quick proto tcp to port 22 once
}
  }
 
 Once we send a packet to port 22 the ruleset collapses to just:
 
  block drop all
 
 Thoughts?

Henning thinks it's a bit of an overkill.  Any other opinions?

 
 diff --git sys/net/pf.c sys/net/pf.c
 index 9f0e2d6..5679a40 100644
 --- sys/net/pf.c
 +++ sys/net/pf.c
 @@ -3279,15 +3279,16 @@ pf_test_rule(struct pf_pdesc *pd, struct pf_rule 
 **rm, struct pf_state **sm,
   PR_NOWAIT)) == NULL) {
   REASON_SET(reason, PFRES_MEMORY);
   goto cleanup;
   }
   ri-r = r;
 + ri-ar = a;
 + ri-rs = ruleset;
 + ri-ars = aruleset;
   /* order is irrelevant */
   SLIST_INSERT_HEAD(rules, ri, entry);
   pf_rule_to_actions(r, act);
 - if (r-rule_flag  PFRULE_AFTO)
 - pd-naf = r-naf;
   if (pf_get_transaddr(r, pd, sns, nr) == -1) {
   REASON_SET(reason, PFRES_TRANSLATE);
   goto cleanup;
   }
   if (r-log || act.log  PF_LOG_MATCHES) {
 @@ -3428,10 +3429,12 @@ pf_test_rule(struct pf_pdesc *pd, struct pf_rule 
 **rm, struct pf_state **sm,
   virtual_type, icmp_dir);
   }
   } else {
   while ((ri = SLIST_FIRST(rules))) {
   SLIST_REMOVE_HEAD(rules, entry);
 + if (ri-r-rule_flag  PFRULE_ONCE)
 + pf_purge_rule(ri-rs, ri-r, ri-ars, ri-ar);
   pool_put(pf_rule_item_pl, ri);
   }
   }
  
   /* copy back packet headers if needed */
 @@ -3454,10 +3457,23 @@ pf_test_rule(struct pf_pdesc *pd, struct pf_rule 
 **rm, struct pf_state **sm,
   }
  #endif
  
   if (r-rule_flag  PFRULE_ONCE)
   pf_purge_rule(ruleset, r, aruleset, a);
 + if (*sm) {
 + SLIST_FOREACH(ri, (*sm)-match_rules, entry) {
 + if (ri-r-rule_flag  PFRULE_ONCE)
 + /*
 +  * We can be sure that pf_purge_rule won't
 +  * pool_put the rule because when *sm != NULL
 +  * STATE_INC_COUNTERS has increased states_cur.
 +  * pf_rule_item's and rules will be g/c'ed by
 +  * pf_free_state.
 +  */
 + pf_purge_rule(ri-rs, ri-r, ri-ars, ri-ar);
 + }
 + }
  
  #if INET  INET6
   if (rewrite  skw-af != sks-af)
   return (PF_AFRT);
  #endif /* INET  INET6 */
 diff --git sys/net/pfvar.h sys/net/pfvar.h
 index a0d94f7..49af7b4 100644
 --- sys/net/pfvar.h
 +++ sys/net/pfvar.h
 @@ -691,10 +691,13 @@ struct pf_threshold {
  };
  
  struct pf_rule_item {
   SLIST_ENTRY(pf_rule_item)entry;
   struct pf_rule  *r;
 + struct pf_rule  *ar;
 + struct pf_ruleset   *rs;
 + struct pf_ruleset   *ars;
  };
  
  SLIST_HEAD(pf_rule_slist, pf_rule_item);
  
  enum pf_sn_types { PF_SN_NONE, PF_SN_NAT, PF_SN_RDR, PF_SN_ROUTE, PF_SN_MAX 
 };



Re: Kill MRT_{ADD,DEL}_BW_UPCALL

2014-08-13 Thread Mike Belopuhov
On 13 August 2014 10:56, Martin Pieuchot mpieuc...@nolizard.org wrote:
 Our multicast routing code is insert your adjective and for the most
 part unused.  We discussed with claudio@ during t2k13 to rewrite only
 the parts that people currently use, any volunteer?

 In the meantime, I'd like to kill the obviously unused parts of it.  So
 here's a first diff that remove the bandwidth monitoring interface.
 Nothing use it in base and a quick search on codesearch.debian.net
 reveals that only net/xorp picks it if it finds the defines.

 Ok?


OK



Re: Fix pppoe(4) with rdomain != 0

2014-08-13 Thread Mike Belopuhov
OK

On 13 August 2014 11:56, Martin Pieuchot mpieuc...@nolizard.org wrote:
 ok?

 Index: net/if_pppoe.c
 ===
 RCS file: /home/ncvs/src/sys/net/if_pppoe.c,v
 retrieving revision 1.40
 diff -u -p -r1.40 if_pppoe.c
 --- net/if_pppoe.c  12 Jul 2014 18:44:22 -  1.40
 +++ net/if_pppoe.c  13 Aug 2014 09:56:16 -
 @@ -1398,6 +1398,9 @@ pppoe_send_padt(struct ifnet *outgoing_i
 memcpy(eh-ether_dhost, dest, ETHER_ADDR_LEN);

 m0-m_flags = ~(M_BCAST|M_MCAST);
 +   /* encapsulated packet is forced into rdomain of physical interface */
 +   m0-m_pkthdr.ph_rtableid = outgoing_if-if_rdomain;
 +
 return (outgoing_if-if_output(outgoing_if, m0, dst, NULL));
  }





[regress] convert aes testcase from /dev/crypto

2014-08-13 Thread Mike Belopuhov
in order to deprecate crypto(4) interface regress tests need to be
converted.

this aes test case actually uses ecb vectors, therefore no chaining
is required and the code looks very simple.

OK?

diff --git regress/sys/crypto/aes/Makefile regress/sys/crypto/aes/Makefile
index 459aedb..826d98c 100644
--- regress/sys/crypto/aes/Makefile
+++ regress/sys/crypto/aes/Makefile
@@ -1,9 +1,13 @@
 #   $OpenBSD: Makefile,v 1.2 2014/01/18 05:54:52 martynas Exp $
 
-PROG=   aestest
+DIR=   ${.CURDIR}/../../../../sys
+
+CFLAGS+=   -I${DIR}
 
+PROG=   aestest
+SRCS=  aestest.c
 CDIAGFLAGS=-Wall
 #CDIAGFLAGS+=  -Werror
 CDIAGFLAGS+=   -Wpointer-arith
 CDIAGFLAGS+=   -Wno-uninitialized
 CDIAGFLAGS+=   -Wstrict-prototypes
@@ -12,9 +16,12 @@ CDIAGFLAGS+= -Wunused
 CDIAGFLAGS+=   -Wsign-compare
 CDIAGFLAGS+=   -Wshadow
 
 REGRESS_ROOT_TARGETS=  run-regress-${PROG}
 
+.PATH: ${DIR}/crypto
+SRCS+= rijndael.c
+
 run-regress-${PROG}: ${PROG}
-   ${SUDO} ./${PROG} ${.CURDIR}/vectors/*.txt
+   ./${PROG} ${.CURDIR}/vectors/*.txt
 
 .include bsd.regress.mk
diff --git regress/sys/crypto/aes/aestest.c regress/sys/crypto/aes/aestest.c
index 2437c38..720dbc1 100644
--- regress/sys/crypto/aes/aestest.c
+++ regress/sys/crypto/aes/aestest.c
@@ -24,117 +24,39 @@
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
 /*
- * Test crypto(4) AES with test vectors provided by Dr Brian Gladman:
- * http://fp.gladman.plus.com/AES/
+ * Test kernel AES implementation with test vectors provided by
+ * Dr Brian Gladman:  http://fp.gladman.plus.com/AES/
  */
 
-#include sys/types.h
 #include sys/param.h
-#include sys/ioctl.h
-#include sys/sysctl.h
-#include crypto/cryptodev.h
+#include crypto/rijndael.h
 #include err.h
-#include fcntl.h
 #include stdio.h
 #include stdlib.h
 #include string.h
 #include unistd.h
 #include ctype.h
 
 static int
-syscrypt(const unsigned char *key, size_t klen, const unsigned char *in,
+docrypt(const unsigned char *key, size_t klen, const unsigned char *in,
 unsigned char *out, size_t len, int do_encrypt)
 {
-   struct session_op session;
-   struct crypt_op cryp;
-   int cryptodev_fd = -1, fd = -1;
-   u_char iv[32];
-
-   /*
-* Kludge; the kernel doesn't support ECB encryption so we
-* use a all-zero IV and encrypt a single block only, so the
-* result should be the same.
-*/
-   bzero(iv, sizeof(iv));
-
-   if ((cryptodev_fd = open(/dev/crypto, O_RDWR, 0))  0) {
-   warn(/dev/crypto);
-   goto err;
-   }
-   if (ioctl(cryptodev_fd, CRIOGET, fd) == -1) {
-   warn(CRIOGET failed);
-   goto err;
-   }
-   memset(session, 0, sizeof(session));
-   session.cipher = CRYPTO_AES_CBC;
-   session.key = (caddr_t) key;
-   session.keylen = klen;
-   if (ioctl(fd, CIOCGSESSION, session) == -1) {
-   warn(CIOCGSESSION);
-   goto err;
-   }
-   memset(cryp, 0, sizeof(cryp));
-   cryp.ses = session.ses;
-   cryp.op = do_encrypt ? COP_ENCRYPT : COP_DECRYPT;
-   cryp.flags = 0;
-   cryp.len = len;
-   cryp.src = (caddr_t) in;
-   cryp.dst = (caddr_t) out;
-   cryp.iv = (caddr_t) iv;
-   cryp.mac = 0;
-   if (ioctl(fd, CIOCCRYPT, cryp) == -1) {
-   warn(CIOCCRYPT);
-   goto err;
-   }
-   if (ioctl(fd, CIOCFSESSION, session.ses) == -1) {
-   warn(CIOCFSESSION);
-   goto err;
-   }
-   close(fd);
-   close(cryptodev_fd);
-   return (0);
-
-err:
-   if (fd != -1)
-   close(fd);
-   if (cryptodev_fd != -1)
-   close(cryptodev_fd);
-   return (-1);
-}
-
-static int
-getallowsoft(void)
-{
-   int mib[2], old;
-   size_t olen;
-
-   olen = sizeof(old);
-
-   mib[0] = CTL_KERN;
-   mib[1] = KERN_CRYPTODEVALLOWSOFT;
-   if (sysctl(mib, 2, old, olen, NULL, 0)  0)
-   err(1, sysctl failed);
-
-   return old;
-}
-
-static void
-setallowsoft(int new)
-{
-   int mib[2], old;
-   size_t olen, nlen;
-
-   olen = nlen = sizeof(new);
-
-   mib[0] = CTL_KERN;
-   mib[1] = KERN_CRYPTODEVALLOWSOFT;
-
-   if (sysctl(mib, 2, old, olen, new, nlen)  0)
-   err(1, sysctl failed);
+   rijndael_ctx ctx;
+   int error = 0;
+
+   memset(ctx, 0, sizeof(ctx));
+   error = rijndael_set_key(ctx, key, klen * 8);
+   if (error)
+   return -1;
+   if (do_encrypt)
+   rijndael_encrypt(ctx, in, out);
+   else
+   rijndael_decrypt(ctx, in, out);
+   return 0;
 }
 
 static int
 match(unsigned char *a, unsigned char *b, size_t len)
 {
@@ -221,21 +143,21 @@ do_tests(const char *filename, int test_num, u_char *key, 
u_int keylen,
 {
char result[32];
int fail = 0;
 
/* Encrypt test */

[regress] convert aes-ctr test from /dev/crypto

2014-08-13 Thread Mike Belopuhov
this test is converted the same way jsing@ has recently converted
an xts test by pulling in xform.c code.

OK?

diff --git regress/sys/crypto/aesctr/Makefile regress/sys/crypto/aesctr/Makefile
index 31ae500..7310dbc 100644
--- regress/sys/crypto/aesctr/Makefile
+++ regress/sys/crypto/aesctr/Makefile
@@ -1,10 +1,29 @@
 #   $OpenBSD: Makefile,v 1.1 2005/05/25 05:47:53 markus Exp $
 
+DIR=   ${.CURDIR}/../../../../sys
+
+CFLAGS+=   -I${DIR}
+
 PROG=   aesctr
+SRCS=  aesctr.c
+
+CDIAGFLAGS=-Wall
+CDIAGFLAGS+=   -Werror
+CDIAGFLAGS+=   -Wpointer-arith
+CDIAGFLAGS+=   -Wno-uninitialized
+CDIAGFLAGS+=   -Wstrict-prototypes
+CDIAGFLAGS+=   -Wmissing-prototypes
+CDIAGFLAGS+=   -Wunused
+CDIAGFLAGS+=   -Wsign-compare
+CDIAGFLAGS+=   -Wshadow
 
 REGRESS_ROOT_TARGETS=  run-regress-${PROG}
 
+.PATH: ${DIR}/crypto
+SRCS+= cast.c ecb_enc.c ecb3_enc.c gmac.c rijndael.c set_key.c
+SRCS+= xform.c
+
 run-regress-${PROG}: ${PROG}
-   ${SUDO} ./${PROG}
+   ./${PROG}
 
 .include bsd.regress.mk
diff --git regress/sys/crypto/aesctr/aesctr.c regress/sys/crypto/aesctr/aesctr.c
index 4cc1a6e..3a0b4d1 100644
--- regress/sys/crypto/aesctr/aesctr.c
+++ regress/sys/crypto/aesctr/aesctr.c
@@ -14,17 +14,13 @@
  * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
 
-#include sys/types.h
 #include sys/param.h
-#include sys/ioctl.h
-#include sys/sysctl.h
-#include crypto/cryptodev.h
+#include crypto/rijndael.h
 #include err.h
-#include fcntl.h
 #include stdio.h
 #include stdlib.h
 #include string.h
 #include unistd.h
 #include limits.h
@@ -128,92 +124,67 @@ struct {
B4 07 DF 86 65 69 FD 07 F4 8C C0 B5 83 D6 07 1F
/*1E C0 E6 B8*/,
},
 };
 
-static int
-syscrypt(const unsigned char *key, size_t klen, const unsigned char *iv,
-const unsigned char *in, unsigned char *out, size_t len, int encrypt)
-{
-   struct session_op session;
-   struct crypt_op cryp;
-   int cryptodev_fd = -1, fd = -1;
+/* Stubs */
 
-   if ((cryptodev_fd = open(/dev/crypto, O_RDWR, 0))  0) {
-   warn(/dev/crypto);
-   goto err;
-   }
-   if (ioctl(cryptodev_fd, CRIOGET, fd) == -1) {
-   warn(CRIOGET failed);
-   goto err;
-   }
-   memset(session, 0, sizeof(session));
-   session.cipher = CRYPTO_AES_CTR;
-   session.key = (caddr_t) key;
-   session.keylen = klen;
-   if (ioctl(fd, CIOCGSESSION, session) == -1) {
-   warn(CIOCGSESSION);
-   goto err;
-   }
-   memset(cryp, 0, sizeof(cryp));
-   cryp.ses = session.ses;
-   cryp.op = encrypt ? COP_ENCRYPT : COP_DECRYPT;
-   cryp.flags = 0;
-   cryp.len = len;
-   cryp.src = (caddr_t) in;
-   cryp.dst = (caddr_t) out;
-   cryp.iv = (caddr_t) iv;
-   cryp.mac = 0;
-   if (ioctl(fd, CIOCCRYPT, cryp) == -1) {
-   warn(CIOCCRYPT);
-   goto err;
-   }
-   if (ioctl(fd, CIOCFSESSION, session.ses) == -1) {
-   warn(CIOCFSESSION);
-   goto err;
-   }
-   close(fd);
-   close(cryptodev_fd);
-   return (0);
+u_int32_t deflate_global(u_int8_t *, u_int32_t, int, u_int8_t **);
 
-err:
-   if (fd != -1)
-   close(fd);
-   if (cryptodev_fd != -1)
-   close(cryptodev_fd);
-   return (-1);
+u_int32_t
+deflate_global(u_int8_t *data, u_int32_t size, int comp, u_int8_t **out)
+{
+   return 0;
 }
 
-static int
-getallowsoft(void)
+void   explicit_bzero(void *, size_t);
+
+void
+explicit_bzero(void *b, size_t len)
 {
-   int mib[2], old;
-   size_t olen;
+   bzero(b, len);
+}
 
-   olen = sizeof(old);
+/* Definitions from /sys/crypto/xform.c */
 
-   mib[0] = CTL_KERN;
-   mib[1] = KERN_CRYPTODEVALLOWSOFT;
-   if (sysctl(mib, 2, old, olen, NULL, 0)  0)
-   err(1, sysctl failed);
+#define AESCTR_NONCESIZE   4
+#define AESCTR_IVSIZE  8
+#define AESCTR_BLOCKSIZE   16
 
-   return old;
-}
+struct aes_ctr_ctx {
+   u_int32_t   ac_ek[4*(AES_MAXROUNDS + 1)];
+   u_int8_tac_block[AESCTR_BLOCKSIZE];
+   int ac_nr;
+};
 
-static void
-setallowsoft(int new)
-{
-   int mib[2], old;
-   size_t olen, nlen;
+int  aes_ctr_setkey(void *, u_int8_t *, int);
+void aes_ctr_encrypt(caddr_t, u_int8_t *);
+void aes_ctr_decrypt(caddr_t, u_int8_t *);
+void aes_ctr_reinit(caddr_t, u_int8_t *);
 
-   olen = nlen = sizeof(new);
+static int
+docrypt(const unsigned char *key, size_t klen, const unsigned char *iv,
+const unsigned char *in, unsigned char *out, size_t len, int encrypt)
+{
+   u_int8_t block[AESCTR_BLOCKSIZE];
+   struct aes_ctr_ctx ctx;
+   int error = 0;
+   size_t i;
 
-   mib[0] = CTL_KERN;
-   mib[1] = KERN_CRYPTODEVALLOWSOFT;
+ 

[regress] convert enc (3des) test from /dev/crypto

2014-08-13 Thread Mike Belopuhov
this one with a bit of cheating however (manual cbc implementation).

OK?

diff --git regress/sys/crypto/enc/Makefile regress/sys/crypto/enc/Makefile
index cc29b32..8725f0c 100644
--- regress/sys/crypto/enc/Makefile
+++ regress/sys/crypto/enc/Makefile
@@ -1,12 +1,21 @@
 #   $OpenBSD: Makefile,v 1.5 2010/10/15 10:39:12 jsg Exp $
 
+DIR=   ${.CURDIR}/../../../../sys
+
+CFLAGS+=   -I${DIR}
+
 PROG=   des3
+SRCS=  des3.c
 LDADD=-lcrypto
 DPADD=${LIBCRYPTO}
 
 REGRESS_ROOT_TARGETS=  run-regress-${PROG}
 
+.PATH: ${DIR}/crypto
+SRCS+= cast.c ecb_enc.c ecb3_enc.c gmac.c rijndael.c set_key.c
+SRCS+= xform.c
+
 run-regress-${PROG}: ${PROG}
-   ${SUDO} ./${PROG}
+   ./${PROG}
 
 .include bsd.regress.mk
diff --git regress/sys/crypto/enc/des3.c regress/sys/crypto/enc/des3.c
index 024418d..fe67872 100644
--- regress/sys/crypto/enc/des3.c
+++ regress/sys/crypto/enc/des3.c
@@ -22,105 +22,73 @@
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
-#include sys/types.h
 #include sys/param.h
-#include sys/ioctl.h
-#include sys/sysctl.h
-#include crypto/cryptodev.h
 #include openssl/des.h
 #include err.h
 #include fcntl.h
 #include stdio.h
 #include stdlib.h
 #include string.h
 #include unistd.h
 
-static int
-syscrypt(const unsigned char *key, size_t klen, const unsigned char *iv,
-const unsigned char *in, unsigned char *out, size_t len, int encrypt)
-{
-   struct session_op session;
-   struct crypt_op cryp;
-   int cryptodev_fd = -1, fd = -1;
-
-   if ((cryptodev_fd = open(/dev/crypto, O_RDWR, 0))  0) {
-   warn(/dev/crypto);
-   goto err;
-   }
-   if (ioctl(cryptodev_fd, CRIOGET, fd) == -1) {
-   warn(CRIOGET failed);
-   goto err;
-   }
-   memset(session, 0, sizeof(session));
-   session.cipher = CRYPTO_3DES_CBC;
-   session.key = (caddr_t) key;
-   session.keylen = klen;
-   if (ioctl(fd, CIOCGSESSION, session) == -1) {
-   warn(CIOCGSESSION);
-   goto err;
-   }
-   memset(cryp, 0, sizeof(cryp));
-   cryp.ses = session.ses;
-   cryp.op = encrypt ? COP_ENCRYPT : COP_DECRYPT;
-   cryp.flags = 0;
-   cryp.len = len;
-   cryp.src = (caddr_t) in;
-   cryp.dst = (caddr_t) out;
-   cryp.iv = (caddr_t) iv;
-   cryp.mac = 0;
-   if (ioctl(fd, CIOCCRYPT, cryp) == -1) {
-   warn(CIOCCRYPT);
-   goto err;
-   }
-   if (ioctl(fd, CIOCFSESSION, session.ses) == -1) {
-   warn(CIOCFSESSION);
-   goto err;
-   }
-   close(fd);
-   close(cryptodev_fd);
-   return (0);
+/* Stubs */
 
-err:
-   if (fd != -1)
-   close(fd);
-   if (cryptodev_fd != -1)
-   close(cryptodev_fd);
-   return (-1);
-}
+u_int32_t deflate_global(u_int8_t *, u_int32_t, int, u_int8_t **);
 
-static int
-getallowsoft(void)
+u_int32_t
+deflate_global(u_int8_t *data, u_int32_t size, int comp, u_int8_t **out)
 {
-   int mib[2], old;
-   size_t olen;
-
-   olen = sizeof(old);
-
-   mib[0] = CTL_KERN;
-   mib[1] = KERN_CRYPTODEVALLOWSOFT;
-   if (sysctl(mib, 2, old, olen, NULL, 0)  0)
-   err(1, sysctl failed);
-
-   return old;
+   return 0;
 }
 
-static void
-setallowsoft(int new)
+void   explicit_bzero(void *, size_t);
+
+void
+explicit_bzero(void *b, size_t len)
 {
-   int mib[2], old;
-   size_t olen, nlen;
+   bzero(b, len);
+}
 
-   olen = nlen = sizeof(new);
 
-   mib[0] = CTL_KERN;
-   mib[1] = KERN_CRYPTODEVALLOWSOFT;
+/* Simulate CBC mode */
 
-   if (sysctl(mib, 2, old, olen, new, nlen)  0)
-   err(1, sysctl failed);
+static int
+docrypt(const unsigned char *key, size_t klen, const unsigned char *iv0,
+const unsigned char *in, unsigned char *out, size_t len, int encrypt)
+{
+   u_int8_t block[8], iv[8], iv2[8], *ivp = iv, *nivp;
+   u_int8_t ctx[384];
+   int i, j, error = 0;
+
+   memcpy(iv, iv0, 8);
+   memset(ctx, 0, sizeof(ctx));
+   error = des3_setkey(ctx, key, klen);
+   if (error)
+   return -1;
+   for (i = 0; i  len / 8; i ++) {
+   bcopy(in, block, 8);
+   in += 8;
+   if (encrypt) {
+   for (j = 0; j  8; j++)
+   block[j] ^= ivp[j];
+   des3_encrypt(ctx, block);
+   memcpy(ivp, block, 8);
+   } else {
+   nivp = ivp == iv ? iv2 : iv;
+   memcpy(nivp, block, 8);
+   des3_decrypt(ctx, block);
+   for (j = 0; j  8; j++)
+   block[j] ^= ivp[j];
+   ivp = nivp;
+   }
+   

/dev/crypto removal (1/3): unlink pseudo device

2014-08-18 Thread Mike Belopuhov
first step is to unlink crypto(4) pseudo device from the architecture
dependant character device tables and kernel config files.

OK?

diff --git sys/arch/alpha/alpha/conf.c sys/arch/alpha/alpha/conf.c
index 83cea34..7d103af 100644
--- sys/arch/alpha/alpha/conf.c
+++ sys/arch/alpha/alpha/conf.c
@@ -190,11 +190,11 @@ struct cdevsw cdevsw[] =
 #endif
cdev_bio_init(NBIO,bio),/* 53: ioctl tunnel */
cdev_notdef(),
cdev_ptm_init(NPTY,ptm),/* 55: pseudo-tty ptm device */
cdev_hotplug_init(NHOTPLUG,hotplug), /* 56: devices hot plugging */
-   cdev_crypto_init(NCRYPTO,crypto), /* 57: /dev/crypto */
+   cdev_notdef(),  /* 57: was: /dev/crypto */
cdev_bktr_init(NBKTR,bktr), /* 58: Bt848 video capture device */
cdev_radio_init(NRADIO,radio), /* 59: generic radio I/O */
cdev_mouse_init(NWSMUX, wsmux), /* 60: ws multiplexor */
cdev_vscsi_init(NVSCSI, vscsi), /* 61: vscsi */
cdev_notdef(),
diff --git sys/arch/alpha/conf/GENERIC sys/arch/alpha/conf/GENERIC
index 7b0c790..dbc8bdc 100644
--- sys/arch/alpha/conf/GENERIC
+++ sys/arch/alpha/conf/GENERIC
@@ -422,8 +422,7 @@ option ONEWIREVERBOSE
 owid*  at onewire? # ID
 owsbm* at onewire? # Smart Battery Monitor
 owtemp* at onewire?# Temperature
 owctr* at onewire? # Counter device
 
-pseudo-device  crypto  1
 pseudo-device  hotplug 1   # devices hot plugging
 pseudo-device  wsmux   2   # mouse  keyboard multiplexor
diff --git sys/arch/amd64/amd64/conf.c sys/arch/amd64/amd64/conf.c
index 59d4c9b..5da5820 100644
--- sys/arch/amd64/amd64/conf.c
+++ sys/arch/amd64/amd64/conf.c
@@ -38,12 +38,10 @@
 #include sys/tty.h
 #include sys/vnode.h
 
 #include machine/conf.h
 
-#include inet.h
-
 #include wd.h
 bdev_decl(wd);
 #include fdc.h
 #include fd.h
 bdev_decl(fd);
@@ -254,11 +252,11 @@ struct cdevsw cdevsw[] =
cdev_tty_init(NUCOM,ucom),  /* 66: USB tty */
cdev_mouse_init(NWSKBD, wskbd), /* 67: keyboards */
cdev_mouse_init(NWSMOUSE,   /* 68: mice */
wsmouse),
cdev_mouse_init(NWSMUX, wsmux), /* 69: ws multiplexor */
-   cdev_crypto_init(NCRYPTO,crypto), /* 70: /dev/crypto */
+   cdev_notdef(),  /* 70: was: /dev/crypto */
cdev_tty_init(NCZ,cztty),   /* 71: Cyclades-Z serial port */
 #ifdef USER_PCICONF
cdev_pci_init(NPCI,pci),/* 72: PCI user */
 #else
cdev_notdef(),
diff --git sys/arch/amd64/conf/GENERIC sys/arch/amd64/conf/GENERIC
index 09fd692..3177dd8 100644
--- sys/arch/amd64/conf/GENERIC
+++ sys/arch/amd64/conf/GENERIC
@@ -600,11 +600,10 @@ pseudo-device pctr1
 pseudo-device  nvram   1
 pseudo-device  hotplug 1   # devices hot plugging
 
 # mouse  keyboard multiplexor pseudo-devices
 pseudo-device  wsmux   2
-pseudo-device  crypto  1
 
 # Virtio devices
 virtio*at pci? # Virtio PCI device
 vioblk*at virtio?  # Virtio block device
 vio*   at virtio?  # Virtio network device
diff --git sys/arch/arm/arm/conf.c sys/arch/arm/arm/conf.c
index d381885..8f12732 100644
--- sys/arch/arm/arm/conf.c
+++ sys/arch/arm/arm/conf.c
@@ -53,12 +53,10 @@
 #include sys/conf.h
 #include sys/vnode.h
 
 #include machine/conf.h
 
-#include inet.h
-
 /*
  * From this point, these need to be MI foo.h files.
  */
 
 /*
@@ -321,11 +319,11 @@ struct cdevsw cdevsw[] = {
cdev_lkm_dummy(),   /* 42: reserved */
cdev_lkm_dummy(),   /* 43: reserved */
cdev_lkm_dummy(),   /* 44: reserved */
cdev_lkm_dummy(),   /* 45: reserved */
cdev_pf_init(NPF,pf),   /* 46: packet filter */
-   cdev_crypto_init(NCRYPTO,crypto),   /* 47: /dev/crypto */
+   cdev_notdef(),  /* 47: was: /dev/crypto */
cdev_lkm_dummy(),   /* 48: reserved */
cdev_lkm_dummy(),   /* 49: reserved */
cdev_systrace_init(NSYSTRACE,systrace), /* 50: system call tracing */
cdev_notdef(),  /* 51: reserved */
cdev_bio_init(NBIO,bio),/* 52: ioctl tunnel */
diff --git sys/arch/armish/conf/GENERIC sys/arch/armish/conf/GENERIC
index e72ff41..6bf908c 100644
--- sys/arch/armish/conf/GENERIC
+++ sys/arch/armish/conf/GENERIC
@@ -195,7 +195,6 @@ owsbm*  at onewire? # Smart Battery Monitor
 owtemp* at onewire?# Temperature
 owctr* at onewire? # Counter device
 
 # mouse  keyboard multiplexor pseudo-devices
 pseudo-device  wsmux   2
-pseudo-device  crypto  1
 pseudo-device  hotplug 1   # devices hot plugging
diff --git sys/arch/hppa/hppa/conf.c sys/arch/hppa/hppa/conf.c

/dev/crypto removal (2/3): remove kernel support

2014-08-18 Thread Mike Belopuhov
this removes /dev/crypto device interface and public key functionality
that is only usable via /dev/crypto.

OK?

diff --git sys/conf/files sys/conf/files
index 3941639..9af78cc 100644
--- sys/conf/files
+++ sys/conf/files
@@ -870,11 +870,10 @@ file crypto/blf.c (inet  ipsec) | crypto 
| vnd
 file crypto/cast.c (inet  ipsec) | crypto
 file crypto/ecb_enc.c  (inet  ipsec) | crypto
 file crypto/set_key.c  (inet  ipsec) | crypto
 file crypto/ecb3_enc.c (inet  ipsec) | crypto
 file crypto/crypto.c   (inet  ipsec) | crypto
-file crypto/cryptodev.c((inet  ipsec) | crypto)   
needs-flag
 file crypto/criov.c(inet  ipsec) | crypto
 file crypto/cryptosoft.c   (inet  ipsec) | crypto
 file crypto/xform.c(inet  ipsec) | crypto
 file crypto/xform_ipcomp.c (inet  ipsec) | crypto
 file crypto/arc4.c 
diff --git sys/crypto/crypto.c sys/crypto/crypto.c
index f62752d..4a11ae3 100644
--- sys/crypto/crypto.c
+++ sys/crypto/crypto.c
@@ -280,39 +280,10 @@ crypto_get_driverid(u_int8_t flags)
 /*
  * Register a crypto driver. It should be called once for each algorithm
  * supported by the driver.
  */
 int
-crypto_kregister(u_int32_t driverid, int *kalg,
-int (*kprocess)(struct cryptkop *))
-{
-   int s, i;
-
-   if (driverid = crypto_drivers_num || kalg  == NULL ||
-   crypto_drivers == NULL)
-   return EINVAL;
-
-   s = splvm();
-
-   for (i = 0; i = CRK_ALGORITHM_MAX; i++) {
-   /*
-* XXX Do some performance testing to determine
-* placing.  We probably need an auxiliary data
-* structure that describes relative performances.
-*/
-
-   crypto_drivers[driverid].cc_kalg[i] = kalg[i];
-   }
-
-   crypto_drivers[driverid].cc_kprocess = kprocess;
-
-   splx(s);
-   return 0;
-}
-
-/* Register a crypto driver. */
-int
 crypto_register(u_int32_t driverid, int *alg,
 int (*newses)(u_int32_t *, struct cryptoini *),
 int (*freeses)(u_int64_t), int (*process)(struct cryptop *))
 {
int s, i;
@@ -410,71 +381,10 @@ crypto_dispatch(struct cryptop *crp)
}
 
return 0;
 }
 
-int
-crypto_kdispatch(struct cryptkop *krp)
-{
-   if (crypto_taskq) {
-   task_set(krp-krp_task, (void (*))crypto_kinvoke, krp, NULL);
-   task_add(crypto_taskq, krp-krp_task);
-   } else {
-   crypto_kinvoke(krp);
-   }
-
-   return 0;
-}
-
-/*
- * Dispatch an asymmetric crypto request to the appropriate crypto devices.
- */
-int
-crypto_kinvoke(struct cryptkop *krp)
-{
-   extern int cryptodevallowsoft;
-   u_int32_t hid;
-   int error;
-   int s;
-
-   /* Sanity checks. */
-   if (krp == NULL || krp-krp_callback == NULL)
-   return (EINVAL);
-
-   s = splvm();
-   for (hid = 0; hid  crypto_drivers_num; hid++) {
-   if ((crypto_drivers[hid].cc_flags  CRYPTOCAP_F_SOFTWARE) 
-   cryptodevallowsoft == 0)
-   continue;
-   if (crypto_drivers[hid].cc_kprocess == NULL)
-   continue;
-   if ((crypto_drivers[hid].cc_kalg[krp-krp_op] 
-   CRYPTO_ALG_FLAG_SUPPORTED) == 0)
-   continue;
-   break;
-   }
-
-   if (hid == crypto_drivers_num) {
-   krp-krp_status = ENODEV;
-   crypto_kdone(krp);
-   splx(s);
-   return (0);
-   }
-
-   krp-krp_hid = hid;
-
-   crypto_drivers[hid].cc_koperations++;
-
-   error = crypto_drivers[hid].cc_kprocess(krp);
-   if (error) {
-   krp-krp_status = error;
-   crypto_kdone(krp);
-   }
-   splx(s);
-   return (0);
-}
-
 /*
  * Dispatch a crypto request to the appropriate crypto devices.
  */
 int
 crypto_invoke(struct cryptop *crp)
@@ -624,40 +534,5 @@ crypto_done(struct cryptop *crp)
task_set(crp-crp_task, (void (*))crp-crp_callback,
crp, NULL);
task_add(crypto_taskq, crp-crp_task);
}
 }
-
-/*
- * Invoke the callback on behalf of the driver.
- */
-void
-crypto_kdone(struct cryptkop *krp)
-{
-   task_set(krp-krp_task, (void (*))krp-krp_callback, krp, NULL);
-   task_add(crypto_taskq, krp-krp_task);
-}
-
-int
-crypto_getfeat(int *featp)
-{
-   extern int cryptodevallowsoft, userasymcrypto;
-   int hid, kalg, feat = 0;
-
-   if (userasymcrypto == 0)
-   goto out; 
-   for (hid = 0; hid  crypto_drivers_num; hid++) {
-   if ((crypto_drivers[hid].cc_flags  CRYPTOCAP_F_SOFTWARE) 
-   cryptodevallowsoft == 0) {
-   continue;
-   }
-   if 

/dev/crypto removal (3/3): userland bits

2014-08-18 Thread Mike Belopuhov
please note that the commented out example usage in etc/MAKEDEV.common
remains till someone feels the need to change it.

OK?

diff --git etc/MAKEDEV.common etc/MAKEDEV.common
index bfcd943..b656d46 100644
--- etc/MAKEDEV.common
+++ etc/MAKEDEV.common
@@ -131,11 +131,10 @@ target(all, wd, 0, 1, 2, 3)dnl
 target(all, xd, 0, 1, 2, 3)dnl
 target(all, systrace)dnl
 target(all, pctr)dnl
 target(all, pctr0)dnl
 target(all, pf)dnl
-twrget(all, cry, crypto)dnl
 target(all, apm)dnl
 target(all, acpi)dnl
 twrget(all, tth, ttyh, 0, 1)dnl
 target(all, ttyA, 0, 1)dnl
 target(all, ttyB, 0, 1, 2, 3, 4, 5)dnl
@@ -446,12 +445,10 @@ __devitem(fdesc, fd, fd/* nodes, fd)dnl
 _mkdev(fdesc, fd, {-RMlist[${#RMlist[*]}]=;mkdir -p fd;rm -f n=0
while [ $n -lt 64 ];do M fd/$n c major_fdesc_c $n;n=Add($n, 1);done
MKlist[${#MKlist[*]}]=;chmod 555 fd-})dnl
 __devitem(oppr, openprom,PROM settings,openprom)dnl
 _cdev(oppr, openprom, 70, 0)dnl
-__devitem(cry, crypto, Hardware crypto access driver,crypto)dnl
-_mkdev(cry, crypto, {-M crypto c major_cry_c-} 0)dnl
 __devitem(pf, pf*, Packet Filter)dnl
 _mkdev(pf, {-pf*-}, {-M pf c major_pf_c 0 600-})dnl
 __devitem(bpf, bpf*, Berkeley Packet Filter)dnl
 _mkdev(bpf, {-bpf*-}, {-M bpf$U c major_bpf_c $U 600-}, 600)dnl
 _mkdev(tun, {-tun*-}, {-M tun$U c major_tun_c $U 600-}, 600)dnl
diff --git etc/etc.alpha/MAKEDEV etc/etc.alpha/MAKEDEV
index 2dd9859..050aa4e 100644
--- etc/etc.alpha/MAKEDEV
+++ etc/etc.alpha/MAKEDEV
@@ -68,11 +68,10 @@
 # Special purpose devices:
 #  audio*  Audio devices
 #  bio ioctl tunnel pseudo-device
 #  bktr*   Video frame grabbers
 #  bpf*Berkeley Packet Filter
-#  crypto  Hardware crypto access driver
 #  diskmap Disk mapper
 #  fd  fd/* nodes
 #  fuseUserland Filesystem
 #  hotplug devices hot plugging
 #  lkm Loadable kernel modules interface
@@ -324,14 +323,10 @@ fd)
 
 diskmap)
M diskmap c 63 0 640 operator
;;
 
-crypto)
-   M crypto c 57 0
-   ;;
-
 bpf*)
M bpf$U c 11 $U 600
;;
 
 bktr*)
@@ -539,11 +534,11 @@ all)
R local wscons pci0 pci1 pci2 pci3 uall rmidi0 rmidi1 rmidi2
R rmidi3 rmidi4 rmidi5 rmidi6 rmidi7 tuner0 radio0 speaker
R video0 video1 uk0 random lpa0 lpa1 lpa2 lpt0 lpt1 lpt2 lkm
R tty00 tty01 tty02 tty03 tty04 tty05 tty06 tty07 tty08 tty09
R tty0a tty0b ttyc0 ttyc1 ttyc2 ttyc3 ttyc4 ttyc5 ttyc6 ttyc7
-   R ttyB0 ttyB1 ttyB2 ttyB3 ttyB4 ttyB5 crypto pf systrace wd0
+   R ttyB0 ttyB1 ttyB2 ttyB3 ttyB4 ttyB5 pf systrace wd0
R wd1 wd2 wd3 std st0 st1 fd
;;
 
 wd*|sd*)
case $i in
diff --git etc/etc.amd64/MAKEDEV etc/etc.amd64/MAKEDEV
index 2971021..4a0d05c 100644
--- etc/etc.amd64/MAKEDEV
+++ etc/etc.amd64/MAKEDEV
@@ -68,11 +68,10 @@
 #  apm Power Management Interface
 #  audio*  Audio devices
 #  bio ioctl tunnel pseudo-device
 #  bktr*   Video frame grabbers
 #  bpf*Berkeley Packet Filter
-#  crypto  Hardware crypto access driver
 #  diskmap Disk mapper
 #  drm*Direct Rendering Manager
 #  fd  fd/* nodes
 #  fuseUserland Filesystem
 #  gpio*   General Purpose Input/Output
@@ -344,14 +343,10 @@ drm*)
 
 diskmap)
M diskmap c 90 0 640 operator
;;
 
-crypto)
-   M crypto c 70 0
-   ;;
-
 bpf*)
M bpf$U c 23 $U 600
;;
 
 bktr*)
@@ -565,11 +560,11 @@ all)
R wscons pci0 pci1 pci2 pci3 uall rmidi0 rmidi1 rmidi2 rmidi3
R rmidi4 rmidi5 rmidi6 rmidi7 tuner0 radio0 speaker video0
R video1 uk0 random lpa0 lpa1 lpa2 lpt0 lpt1 lpt2 lkm tty00
R tty01 tty02 tty03 tty04 tty05 tty06 tty07 tty08 tty09 tty0a
R tty0b ttyc0 ttyc1 ttyc2 ttyc3 ttyc4 ttyc5 ttyc6 ttyc7 apm
-   R crypto pf pctr systrace wd0 wd1 wd2 wd3 std st0 st1 fd
+   R pf pctr systrace wd0 wd1 wd2 wd3 std st0 st1 fd
;;
 
 wd*|sd*)
case $i in
wd*) dodisk wd $U 0 3 $U 0;;
diff --git etc/etc.armish/MAKEDEV etc/etc.armish/MAKEDEV
index 7a3a21e..25fd4d4 100644
--- etc/etc.armish/MAKEDEV
+++ etc/etc.armish/MAKEDEV
@@ -62,11 +62,10 @@
 #  apm Power management device
 #  audio*  Audio devices
 #  bio ioctl tunnel pseudo-device
 #  bktr*   Video frame grabbers
 #  bpf*Berkeley Packet Filter
-#  crypto  Hardware crypto access driver
 #  diskmap Disk mapper
 #  fd  fd/* nodes
 #  fuseUserland Filesystem
 #  hotplug devices hot plugging
 #  lkm Loadable kernel modules interface
@@ -302,14 +301,10 @@ fd)
 
 diskmap)
M diskmap c 102 0 640 operator
;;
 
-crypto)
-   M crypto c 47 0
-   ;;
-
 bpf*)
M bpf$U c 22 $U 600
;;
 
 bktr*)
@@ -481,11 +476,11 @@ all)
R bpf5 bpf6 bpf7 bpf8 bpf9 pty0 diskmap vscsi0 ch0 audio0
R audio1 audio2 fuse pppx hotplug ptm local wscons pci0 pci1
R pci2 pci3 uall rmidi0 rmidi1 rmidi2 rmidi3 rmidi4 rmidi5
  

Re: /dev/crypto removal (2/3): remove kernel support

2014-08-18 Thread Mike Belopuhov
On Mon, Aug 18, 2014 at 16:03 +0200, Mike Belopuhov wrote:
 this removes /dev/crypto device interface and public key functionality
 that is only usable via /dev/crypto.
 
 OK?
 

minor correction: preserve #ifdef _KERNEL in the cryptodev.h.  while
there are no userland programs including it atm leaving it for the
symmetry with cryptosoft.c seems like a good idea.

diff --git sys/conf/files sys/conf/files
index 3941639..9af78cc 100644
--- sys/conf/files
+++ sys/conf/files
@@ -870,11 +870,10 @@ file crypto/blf.c (inet  ipsec) | crypto 
| vnd
 file crypto/cast.c (inet  ipsec) | crypto
 file crypto/ecb_enc.c  (inet  ipsec) | crypto
 file crypto/set_key.c  (inet  ipsec) | crypto
 file crypto/ecb3_enc.c (inet  ipsec) | crypto
 file crypto/crypto.c   (inet  ipsec) | crypto
-file crypto/cryptodev.c((inet  ipsec) | crypto)   
needs-flag
 file crypto/criov.c(inet  ipsec) | crypto
 file crypto/cryptosoft.c   (inet  ipsec) | crypto
 file crypto/xform.c(inet  ipsec) | crypto
 file crypto/xform_ipcomp.c (inet  ipsec) | crypto
 file crypto/arc4.c 
diff --git sys/crypto/crypto.c sys/crypto/crypto.c
index f62752d..4a11ae3 100644
--- sys/crypto/crypto.c
+++ sys/crypto/crypto.c
@@ -280,39 +280,10 @@ crypto_get_driverid(u_int8_t flags)
 /*
  * Register a crypto driver. It should be called once for each algorithm
  * supported by the driver.
  */
 int
-crypto_kregister(u_int32_t driverid, int *kalg,
-int (*kprocess)(struct cryptkop *))
-{
-   int s, i;
-
-   if (driverid = crypto_drivers_num || kalg  == NULL ||
-   crypto_drivers == NULL)
-   return EINVAL;
-
-   s = splvm();
-
-   for (i = 0; i = CRK_ALGORITHM_MAX; i++) {
-   /*
-* XXX Do some performance testing to determine
-* placing.  We probably need an auxiliary data
-* structure that describes relative performances.
-*/
-
-   crypto_drivers[driverid].cc_kalg[i] = kalg[i];
-   }
-
-   crypto_drivers[driverid].cc_kprocess = kprocess;
-
-   splx(s);
-   return 0;
-}
-
-/* Register a crypto driver. */
-int
 crypto_register(u_int32_t driverid, int *alg,
 int (*newses)(u_int32_t *, struct cryptoini *),
 int (*freeses)(u_int64_t), int (*process)(struct cryptop *))
 {
int s, i;
@@ -410,71 +381,10 @@ crypto_dispatch(struct cryptop *crp)
}
 
return 0;
 }
 
-int
-crypto_kdispatch(struct cryptkop *krp)
-{
-   if (crypto_taskq) {
-   task_set(krp-krp_task, (void (*))crypto_kinvoke, krp, NULL);
-   task_add(crypto_taskq, krp-krp_task);
-   } else {
-   crypto_kinvoke(krp);
-   }
-
-   return 0;
-}
-
-/*
- * Dispatch an asymmetric crypto request to the appropriate crypto devices.
- */
-int
-crypto_kinvoke(struct cryptkop *krp)
-{
-   extern int cryptodevallowsoft;
-   u_int32_t hid;
-   int error;
-   int s;
-
-   /* Sanity checks. */
-   if (krp == NULL || krp-krp_callback == NULL)
-   return (EINVAL);
-
-   s = splvm();
-   for (hid = 0; hid  crypto_drivers_num; hid++) {
-   if ((crypto_drivers[hid].cc_flags  CRYPTOCAP_F_SOFTWARE) 
-   cryptodevallowsoft == 0)
-   continue;
-   if (crypto_drivers[hid].cc_kprocess == NULL)
-   continue;
-   if ((crypto_drivers[hid].cc_kalg[krp-krp_op] 
-   CRYPTO_ALG_FLAG_SUPPORTED) == 0)
-   continue;
-   break;
-   }
-
-   if (hid == crypto_drivers_num) {
-   krp-krp_status = ENODEV;
-   crypto_kdone(krp);
-   splx(s);
-   return (0);
-   }
-
-   krp-krp_hid = hid;
-
-   crypto_drivers[hid].cc_koperations++;
-
-   error = crypto_drivers[hid].cc_kprocess(krp);
-   if (error) {
-   krp-krp_status = error;
-   crypto_kdone(krp);
-   }
-   splx(s);
-   return (0);
-}
-
 /*
  * Dispatch a crypto request to the appropriate crypto devices.
  */
 int
 crypto_invoke(struct cryptop *crp)
@@ -624,40 +534,5 @@ crypto_done(struct cryptop *crp)
task_set(crp-crp_task, (void (*))crp-crp_callback,
crp, NULL);
task_add(crypto_taskq, crp-crp_task);
}
 }
-
-/*
- * Invoke the callback on behalf of the driver.
- */
-void
-crypto_kdone(struct cryptkop *krp)
-{
-   task_set(krp-krp_task, (void (*))krp-krp_callback, krp, NULL);
-   task_add(crypto_taskq, krp-krp_task);
-}
-
-int
-crypto_getfeat(int *featp)
-{
-   extern int cryptodevallowsoft, userasymcrypto;
-   int hid, kalg, feat = 0;
-
-   if (userasymcrypto == 0)
-   goto out; 
-   for (hid

/dev/crypto removal (3bis): DTYPE_CRYPTO

2014-08-18 Thread Mike Belopuhov
I don't know if we recycle them somehow, but just in case...

diff --git sys/sys/file.h sys/sys/file.h
index d98118e..64c0f31 100644
--- sys/sys/file.h
+++ sys/sys/file.h
@@ -67,11 +67,11 @@ struct file {
short   f_flag; /* see fcntl.h */
 #defineDTYPE_VNODE 1   /* file */
 #defineDTYPE_SOCKET2   /* communications endpoint */
 #defineDTYPE_PIPE  3   /* pipe */
 #defineDTYPE_KQUEUE4   /* event queue */
-#defineDTYPE_CRYPTO5   /* crypto */
+/* was define  DTYPE_CRYPTO5 */
 #defineDTYPE_SYSTRACE  6   /* system call tracing */
short   f_type; /* descriptor type */
longf_count;/* reference count */
longf_msgcount; /* references from message queue */
struct  ucred *f_cred;  /* credentials associated with descriptor */



Re: pf: once for match rules?

2014-08-20 Thread Mike Belopuhov
On Tue, Aug 12, 2014 at 18:26 +0200, Mike Belopuhov wrote:
 On Tue, Jul 22, 2014 at 19:03 +0200, Mike Belopuhov wrote:
  Hi,
  
  Before I send a diff for pfctl to disable once on match rules,
  I've decided to try and see how much work is it to make it actually
  work.  Turns out that I need to extend pf_rule_item by 3 pointers
  to track the match rule ruleset, anchor rule and the ruleset it
  belongs to.
  
  Here's what this means in practice.  Consider a ruleset:
  
   block drop all
   match out log proto tcp to port 22 once
   anchor foo all {
 match out log proto tcp to port 22 once
 anchor bar all {
   match out log proto tcp to port 22 once
   pass out quick proto tcp to port 22 once
 }
   }
  
  Once we send a packet to port 22 the ruleset collapses to just:
  
   block drop all
  
  Thoughts?
 
 Henning thinks it's a bit of an overkill.  Any other opinions?
 

here we go then.  OK?

diff --git sbin/pfctl/parse.y sbin/pfctl/parse.y
index c277b8d..61c2646 100644
--- sbin/pfctl/parse.y
+++ sbin/pfctl/parse.y
@@ -1488,12 +1488,18 @@ pfrule  : action dir logquick interface af 
proto fromto
if ($8.marker  FOM_SETPRIO) {
r.set_prio[0] = $8.set_prio[0];
r.set_prio[1] = $8.set_prio[1];
r.scrub_flags |= PFSTATE_SETPRIO;
}
-   if ($8.marker  FOM_ONCE)
+   if ($8.marker  FOM_ONCE) {
+   if (r.action == PF_MATCH) {
+   yyerror(can't specify once for 
+   match rules);
+   YYERROR;
+   }
r.rule_flag |= PFRULE_ONCE;
+   }
if ($8.marker  FOM_AFTO)
r.rule_flag |= PFRULE_AFTO;
r.af = $5;
 
if ($8.tag)



Re: let vlan(4) mtu be limited by the parents hardmtu instead of current mtu

2014-08-27 Thread Mike Belopuhov
On 27 August 2014 13:17, David Gwynne da...@gwynne.id.au wrote:
 On Tue, Aug 26, 2014 at 09:11:14PM -0400, Brad Smith wrote:
 On 20/08/14 8:03 PM, David Gwynne wrote:
 sthen@ says this is likely a bit optimistic. while most of our drivers 
 unconditionally configure their max mru, there's some stupid ones that 
 still interpret the configured mtu as a what the mru should be.
 
 dlg

 oce(4) and ix(4) need to be fixed.

 this might fix ix(4). anyone able to test?


ix and oce (internally though) calculate various paramethers
related to flow control, internal buffer sizes and so on based
on the maximum frame size.  therefore if i set max frame size
to 16k but always use 1,5k all these calculations will be off.
and by killing SIOCSIFMTU you make it impossible for the driver
to readjust them.

we have no idea at the moment how will this affect nic's behavior.

pieces of such black magic such as the code under
/* Hardware Packet Buffer  Flow Control setup */ comment must
be studied before making such changes.

and as i've mentioned to dlg, ix diff also switches it to using
16k cluster pool no matter what you actually need.  this can be
changed, but needs testing (i'm doing it now).



Re: let vlan(4) mtu be limited by the parents hardmtu instead of current mtu

2014-08-27 Thread Mike Belopuhov
On 27 August 2014 13:23, David Gwynne da...@gwynne.id.au wrote:
 On Tue, Aug 26, 2014 at 09:11:14PM -0400, Brad Smith wrote:
 On 20/08/14 8:03 PM, David Gwynne wrote:
 sthen@ says this is likely a bit optimistic. while most of our drivers 
 unconditionally configure their max mru, there's some stupid ones that 
 still interpret the configured mtu as a what the mru should be.
 
 dlg

 oce(4) and ix(4) need to be fixed.

 this might fix oce. can anyone test?


works fine here.  i think we can go with this one.  OK mikeb



Re: bge(4) Jumbo support for newer chipsets

2014-08-27 Thread Mike Belopuhov
On 27 August 2014 08:25, Brad Smith b...@comstyle.com wrote:
 Looking for some testing of the following diff to add Jumbo support for the
 BCM5714 / BCM5780 and BCM5717 / BCM5719 / BCM5720 / BCM57765 / BCM57766
 chipsets.



i have tested this on Broadcom BCM5719 rev 0x01, unknown BCM5719 (0x5719001),
APE firmware NCSI 1.1.15.0  and Broadcom BCM5714 rev 0xa3, BCM5715
A3 (0x9003).

it works, however i'm not strictly a fan of switching the cluster pool to
larger one for 5714.  wasting another 8k page (on sparc for example) for
every rx cluster in 90% cases sounds kinda wrong to me.  but ymmv.

apart from that there's a deficiency in the diff itself.  you probably want
to change MCLBYTES in bge_rxrinfo to bge_rx_std_len otherwise statistics
look wrong.

i'm certainly OK with !5714 part of the diff.



Re: bge(4) Jumbo support for newer chipsets

2014-08-28 Thread Mike Belopuhov
On 28 August 2014 12:32, David Gwynne da...@gwynne.id.au wrote:

 On 28 Aug 2014, at 3:02 am, Mike Belopuhov m...@belopuhov.com wrote:

 On 27 August 2014 08:25, Brad Smith b...@comstyle.com wrote:
 Looking for some testing of the following diff to add Jumbo support for the
 BCM5714 / BCM5780 and BCM5717 / BCM5719 / BCM5720 / BCM57765 / BCM57766
 chipsets.



 i have tested this on Broadcom BCM5719 rev 0x01, unknown BCM5719 
 (0x5719001),
 APE firmware NCSI 1.1.15.0  and Broadcom BCM5714 rev 0xa3, BCM5715
 A3 (0x9003).

 it works, however i'm not strictly a fan of switching the cluster pool to
 larger one for 5714.  wasting another 8k page (on sparc for example) for
 every rx cluster in 90% cases sounds kinda wrong to me.  but ymmv.

 this is what MCLGETI was invented to solve though. comparing pre mclgeti to 
 what this does:


that doesn't make my point invalid though.

 a 5714 right now without jumbos would have 512 rings entries with 2048 bytes 
 on each. 2048 * 512 is a 1024k of ram. if we bumped the std ring up to jumbos 
 by default, 9216 * 512 would eat 4608k of ram.


your calculation is a bit off.  it's not 9216 * 512, in case of sparc64 it's
8k * 2 * 512 which is 8M.

but my concern is different:  you ask uvm to do more work for every cluster
since now you need 2 consequent pages of memory for one cluster instead of
just one that fits 8k/2k = 4 clusters.

 my boxes with bge with mclgeti generally sit around 40 clusters, but 
 sometimes end up around 80. 80 * 9216 is 720k. we can have jumbos and still 
 be ahead.

 if you compare the nics with split rings: 512 * 2048 + 256 * 9216 is ~3.3M. 
 the same chip with mclgeti and only doing a 1500 byte workload would be 80 * 
 2048 + 17 * 9216, or 300k.


 apart from that there's a deficiency in the diff itself.  you probably want
 to change MCLBYTES in bge_rxrinfo to bge_rx_std_len otherwise statistics
 look wrong.

 yeah.

 i have tested both 1500 and 9000 mtus on a 5714 and it is working well. as 
 you say, 5719 seems to be fine too, but ive only tested it with mtu 1500. ill 
 test 9k tomorrow.

 it needs tests on older chips too though.



Re: minphys woes

2014-08-29 Thread Mike Belopuhov
On 29 August 2014 11:26, Stefan Fritsch s...@sfritsch.de wrote:
 On Fri, 29 Aug 2014, Miod Vallat wrote:
  sc-sc_xfer_max is computed according to the host's capabilities. What I
  want to simulate with this diff is a host adapter that can only cope with
  transfers  64k == MAXPHYS.

 Back to your original problem, you might want to print the sc_link
 struct as well the scsi_adapter struct it points to, when you detect a
 transfer larger than MAXPHYS. It has likely been overriden or reset to
 NULL by mistake at some point.

 OK, I will try that. Will take a bit until I have time, though.

 But I have also read the code and I could not find a place in the path
 from bread() to the scsi adapter cmd function where the minphys function
 is called.


correct me if i'm wrong, but what happens is that bread being a block
read reads up to MAXBSIZE which is conveniently set to 64k and you can't
create a filesystem with a larger block size.

physio (raw device io) however doesn't go through bread and need to know
how split the provided buffer in separate transactions hence minphys.



Re: newfs.8

2014-08-29 Thread Mike Belopuhov
On 29 August 2014 08:19, Jason McIntyre j...@kerhand.co.uk wrote:

 is this correct? i'm not a user myself, but the text states that
 special, for mount_mfs, is typically that of the primary swap area.
 how would you even define the swap area without a disklabel?

 jmc


sort of yes.  mount_mfs(8) says this:

 [...] The special file is only used
 to read the disk label which provides a set of configuration parameters
 for the memory based file system.  The special file is typically that of
 the primary swap area, since that is where the file system will be backed
 up when free memory gets low and the memory supporting the file system
 has to be paged.  If the keyword ``swap'' is used instead of a special
 file name, default configuration parameters will be used.  (This option
 is useful when trying to use mount_mfs on a machine without any disks.)

in reality it fakes up a disklabel and proceeds.  number of XXX's
around this code is also mildly amusing...



Re: newfs.8

2014-08-29 Thread Mike Belopuhov
On 29 August 2014 13:44, Jason McIntyre j...@kerhand.co.uk wrote:
 On Fri, Aug 29, 2014 at 01:39:57PM +0200, Mike Belopuhov wrote:
 On 29 August 2014 08:19, Jason McIntyre j...@kerhand.co.uk wrote:
 
  is this correct? i'm not a user myself, but the text states that
  special, for mount_mfs, is typically that of the primary swap area.
  how would you even define the swap area without a disklabel?
 
  jmc
 

 sort of yes.  mount_mfs(8) says this:

  [...] The special file is only used
  to read the disk label which provides a set of configuration parameters
  for the memory based file system.  The special file is typically that of
  the primary swap area, since that is where the file system will be 
 backed
  up when free memory gets low and the memory supporting the file system
  has to be paged.  If the keyword ``swap'' is used instead of a special
  file name, default configuration parameters will be used.  (This option
  is useful when trying to use mount_mfs on a machine without any disks.)

 in reality it fakes up a disklabel and proceeds.  number of XXX's
 around this code is also mildly amusing...


 so the diff posted was correct and should be committed?
 jmc


yes, i believe so.  OK mikeb



Re: reduce the number of missed PCB cache with tcpbench -su

2014-08-31 Thread Mike Belopuhov
Daniel,  don't reply anything to Damien just yet.  Can you please run
a simple test on Monday.  Try tcpbench -u -n 2 ip (as in multi-
connection test) without your patch and then with the patch and see
if behavior is changed.

Thanks

On 29 August 2014 18:01, Damien Miller d...@mindrot.org wrote:
 On Fri, 29 Aug 2014, Daniel Jakots wrote:

 Hi,

 When running tcpbench -su, a lot of them are counted as missed PCB
 cache.
 ...
 + n = recvfrom(fd, ptb-dummybuf, ptb-dummybuf_len, 0,
 + (struct sockaddr *)ss, slen);
 + if (n  0  connect(fd, (const struct sockaddr *)ss, slen))
 + warn(fail to connect);

 What's the benefit of this? I've never seen an application do this;
 wouldn't it be better to improve the PCB cache so it caught this case
 (which seems the usual way UDP applications behave) instead?

 -d




Re: minphys woes

2014-09-01 Thread Mike Belopuhov
On 29 August 2014 22:39, Stefan Fritsch s...@sfritsch.de wrote:
 On Fri, 29 Aug 2014, Mike Belopuhov wrote:
 correct me if i'm wrong, but what happens is that bread being a block
 read reads up to MAXBSIZE which is conveniently set to 64k and you can't
 create a filesystem with a larger block size.

 physio (raw device io) however doesn't go through bread and need to know
 how split the provided buffer in separate transactions hence minphys.

 Yes, that seems to be what happens. But if every adapter needs to support
 transfers of MAXBSIZE == MAXPHYS anyway, there would be no need for the
 adapter to be able to override the default minphys function with its own.
 And adapters that only support smaller transfers would need to have logic
 in their driver to be able to split the transfer into smaller chunks.


i believe that if you start digging you realise that (at least at some point)
the least common denominator is (or was) 64k meaning that even the shittiest
controller on vax can do 64k.  which means that we don't have code for a
filesystem or buffercache to probe the controller for a supported transfer
size.

 I think it makes more sense to have that logic in one place to be used by
 all drivers.

perhaps, but unless the filesystem will issue breads of larger blocks the
only benefit would be physio itself which doesn't really justify the change.



Re: reduce the number of missed PCB cache with tcpbench -su

2014-09-01 Thread Mike Belopuhov
On 29 August 2014 18:01, Damien Miller d...@mindrot.org wrote:
 What's the benefit of this?

This creates a UDP PCB per connection.  Otherwise we always rely on
matching the wildcard PCB.

 I've never seen an application do this;

I doubt that.  However, things like NTP or DNS servers usually expect
requests from everyone so they don't do it.  Now I have actually
identified one use case that this diff breaks: multiple senders won't
work.  Which is kind of annoying.

 wouldn't it be better to improve the PCB cache so it caught this case
 (which seems the usual way UDP applications behave) instead?


Definitely!  Using something like a prefix tree (mickey advocates hash
array mapped tries) would eliminate this second lookup and also solve
problems with attacking PCB hash with collisions and resizing the hash
table.

 -d




Re: minphys woes

2014-09-01 Thread Mike Belopuhov
On 1 September 2014 13:06, Stefan Fritsch s...@sfritsch.de wrote:
 On Mon, 1 Sep 2014, Mike Belopuhov wrote:

 On 29 August 2014 22:39, Stefan Fritsch s...@sfritsch.de wrote:
  Yes, that seems to be what happens. But if every adapter needs to support
  transfers of MAXBSIZE == MAXPHYS anyway, there would be no need for the
  adapter to be able to override the default minphys function with its own.
  And adapters that only support smaller transfers would need to have logic
  in their driver to be able to split the transfer into smaller chunks.
 

 i believe that if you start digging you realise that (at least at some point)
 the least common denominator is (or was) 64k meaning that even the shittiest
 controller on vax can do 64k.  which means that we don't have code for a
 filesystem or buffercache to probe the controller for a supported transfer
 size.

 That's possible. But what is really strange is, why does openbsd then have
 an infrastructure to set different max transfer sizes for physio on a
 per-adapter basis? This makes no sense. Either the drivers have to support
 64k transfers, in which case most of the minphys infrastructure is
 useless, or they don't have to. In the latter case the minphys
 infrastructure would need to be used in all code paths.


i haven't found a controller that does less than MAXPHYS.
perhaps they meant to improve the situation but stopped short.

  I think it makes more sense to have that logic in one place to be used by
  all drivers.

 perhaps, but unless the filesystem will issue breads of larger blocks the
 only benefit would be physio itself which doesn't really justify the change.

 You lost me there. Why should this depend on how many requests some
 filesystem makes with larger blocks? If there is the possibility that
 filesystems do such requests, someone needs to do the splitting of the
 requests. The question is who. The driver or the block layer.



well, the filesystem doesn't issue requests for more than 64k at a time.
newfs won't allow you to build a 128k block file filesystem.

 BTW, for my use case I don't actually want to limit the block size, but
 rather the number of DMA segments. But the most reasonable way to do that
 seemed to set minphys to max_segments * pagesize. If we change how these
 things work, we could take the number of DMA segments into account.

can't help you with this one.



Re: bge(4) Jumbo support for newer chipsets

2014-09-02 Thread Mike Belopuhov
On 2 September 2014 03:54, Brad Smith b...@comstyle.com wrote:
 On Wed, Aug 27, 2014 at 02:25:27AM -0400, Brad Smith wrote:
 Looking for some testing of the following diff to add Jumbo support for the
 BCM5714 / BCM5780 and BCM5717 / BCM5719 / BCM5720 / BCM57765 / BCM57766
 chipsets.

 Here is an updated diff with bge_rxrinfo() being fixed.


get it in.  OK mikeb



Re: splnet() and SIOCSIFADDR

2014-09-03 Thread Mike Belopuhov
On 3 September 2014 15:53, Martin Pieuchot mpieuc...@nolizard.org wrote:
 On 03/09/14(Wed) 15:25, Martin Pieuchot wrote:
 Drivers that need a splnet() protection inside their SIOCSIFADDR
 generally raise the spl level themselves, so we should not need
 to do that in in{6,}_ifinit().  One exception to this rule is,
 as always, carp(4)...

 So the diff below moves the spl dance inside carp's SIOCSIFADDR
 handler, it's a baby step, so we can take care of carp_set_addr{6,}
 later.

 mikeb@ pointed out that a spl dance was missing from the SIOCDIFADDR_IN6
 ioctl, updated diff below also fixes that.


i think we're ready to step on a carefully placed landmine. OK mikeb



Re: RTFREE - rtfree

2014-10-08 Thread Mike Belopuhov
On 8 October 2014 12:24, Martin Pieuchot mpieuc...@nolizard.org wrote:
 Diff below kills the macro and use the fonction instead since they are
 equivalent.  It also replaces some 0 - NULL where it applies.  It does
 not include the manpage bits, I'll deal with that afterward.

 I'd appreciate reviews and oks.


although syn_cache_put doesn't really require nullification,
i'm ok with the diff.



Re: improving OpenBSD's gmac.c...

2014-10-08 Thread Mike Belopuhov
On 8 October 2014 00:48, John-Mark Gurney j...@funkthat.com wrote:
 Christian Weisgerber wrote this message on Tue, Oct 07, 2014 at 23:08 +0200:
 John-Mark Gurney:

  So, as I was working on FreeBSD's implementation of gmac.c, I noticed
  that I was able to get a significant speed up by using a mask instead
  of an if branch in ghash_gfmul in gmac.c from OpenBSD...
 
  Add a mask var and replace the code between the comments
  update Z and update V w/:
  mask = !!(x[i  3]  (1  (~i  7)));
  mask = ~(mask - 1);
 
  z[0] ^= v[0]  mask;
  z[1] ^= v[1]  mask;
  z[2] ^= v[2]  mask;
  z[3] ^= v[3]  mask;
 
  And you should see a nice performance increase...

 I tried this on a Soekris net6501-50 and the performance increase
 was around 1.3%.  (I set up an ESP transport association with
 AES-128-GMAC and pushed UDP traffic with tcpbench over it.)

 Yeh, I benchmarked the raw algo in userland, not part of IPsec.  I
 forget the resulting perf increase, but it was well more than 10-20%.

 A look at the generated amd64 assembly code shows that the change
 indeed removes a branch.  What's pretty shocking is that this code

 mul = v[3]  1;
 ...
 v[0] = (v[0]  1) ^ (0xe100 * mul);

 is turned into an actual imul instruction by GCC.  I used the same
 masking approach to get rid of the multiplication, but the improvement
 was minuscule (1%).

 Hmm. interesting...  In my code I switched both to using the and
 operator...

 I also have code which switches the registers to 64bits so that on
 amd64, we make better uses of registers, and on i386, the compilers
 are good at breaking down the 64bit registers to 32bit w/o extra
 work...

  I also have an implementation of ghash that does a 4 bit lookup table
  version with the table split between cache lines in p4 at:
  https://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/opencrypto/sys/opencrypto/gfmult.cREV=4

 I'll have to look at this, but haven't there been increasing
 misgivings about table implementations for GHASH because of timing
 attacks?

 Well, the code avoids one issue but introduces another issue...  Each
 table lookup accesses all four lines (assuming 64 byte cache lines), so
 there isn't a problem there...  The issue intrroduded is that the first
 64 bits of a cache line are faster to access than the remaining bits...

 For more info, look at the NSS bug:
 https://bugzilla.mozilla.org/show_bug.cgi?id=868948

 Though considering that the AES implementation that FreeBSD is still
 using is the standard table based AES, there are also timing issues
 there too...  Yes, no excuse for opening up additional windows...

 I have thought about introducing an option of slow but secure and
 fast but insecure against local attackers...  Since we are already
 on the fast but insecure against local attackers path, I figured I'll
 take the extra performance...

 As with all things, there is a trade off between speed and security...
 As most IPsec gateways don't have a local user trying to target the
 key, this is less of an issue...  And even if they did, it'd probably
 be easier to use a local root exploit to get the key than try to mount
 a timing attack to extract a key that might change in an hour or a day...

 --
   John-Mark Gurney  Voice: +1 415 225 5579

  All that I will do, has been done, All that I have, has not.


Hi,

I've talked to Theo and it looks like we'll be importing your GF2
multiplication library as is.  I think we should concentrate on
making table version of gmac.c work better.

Cheers,
Mike



Re: A system without interface?

2014-10-14 Thread Mike Belopuhov
On 14 October 2014 11:01, Martin Pieuchot mpieuc...@nolizard.org wrote:
 On 08/10/14(Wed) 14:29, Martin Pieuchot wrote:
 I'm looking after the uses of the global list of interface.  These ones
 are pointless, you always have at least one interface on your system.

 Ok?

 Anyone?


looks good to me. ok mikeb



Re: pool page colouring

2014-10-28 Thread Mike Belopuhov
On 28 October 2014 17:02, Ted Unangst t...@tedunangst.com wrote:
 On Tue, Oct 28, 2014 at 16:49, David Gwynne wrote:
 when i shuffled the locking in pools around, page colouring was
 left behind.

 page colouring is where you offset items within a page if you have
 enough slack space. the previous implementation simply incremented
 the colour so each new page got the next offset. i didnt do this
 because the page and its items are now initted outside the lock,
 so maintaining that curcolour iterator wasnt as easy.

 this sidesteps the curcolor maintenance by just having each page
 randomly pick a colour when it's set up.

 Would it make more sense to use the page address to pick the color?


Does it actually still make sense to keep page coloring?  Is there still
benefit on modern hardware?



Re: pool page colouring

2014-10-29 Thread Mike Belopuhov
On 29 October 2014 22:52, Ted Unangst t...@tedunangst.com wrote:
 On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:


 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.

 there. i fixed it.


so is there any performance difference?



Re: Multipath for HOST p2p routes

2014-11-04 Thread Mike Belopuhov
On 4 November 2014 12:51, Martin Pieuchot mpieuc...@nolizard.org wrote:
 How are we suppose to support configuration with multiple p2p interfaces
 pointing to the same destination address?  Right now only one route to
 host is added.

 Diff below replaces a hack that move a host route from one p2p interface
 to another by always installing MPATH routes.  It assumes that multiples
 p2p interfaces configured to the same destination address will have
 different local addresses.

 ok?


After discussing it with Martin it sounds like a good idea.
There's some semi-unrelated cleanup in in_ifinit that he has
promised to commit separately.  OK mikeb for both changes.



Re: Kill in_iawithaddr()

2014-11-04 Thread Mike Belopuhov
On 4 November 2014 12:52, Martin Pieuchot mpieuc...@nolizard.org wrote:
 This function is just a wrapper around ifa_ifwithaddr() and I'd prefer
 to have less function iterating over the global list of interfaces.

 ok?



what's not immediately apparent is that it also makes sure that the
address that ifa_ifwithaddr returns is not a broadcast address. oh god.

ok for the cleanup



Re: network pool names

2014-11-04 Thread Mike Belopuhov
On 4 November 2014 13:23, Martin Pieuchot mpieuc...@nolizard.org wrote:
 Remove pl suffix, ok?


ok with a syncache instead of syn



Re: pool page colouring

2014-11-04 Thread Mike Belopuhov
On 5 November 2014 01:12, Mike Belopuhov m...@belopuhov.com wrote:

 well, first of all, right now this is a rather theoretical gain.  we
 need to test it
 to understand if it makes things easier.

err. i meant to say go faster not easier.



Re: pool page colouring

2014-11-04 Thread Mike Belopuhov
On 5 November 2014 00:38, David Gwynne da...@gwynne.id.au wrote:

 On 30 Oct 2014, at 07:52, Ted Unangst t...@tedunangst.com wrote:

 On Wed, Oct 29, 2014 at 07:25, David Gwynne wrote:


 i dunno. im fine with either removing colouring altogether or setting it
 from something else completely. i just want a decision to be made cos
 right now ph_color isnt set, which is a bug.

 there. i fixed it.

 looks like we were both ignorant and wrong. mikeb@ points out this from the 
 original slab paper:

 4.1. Impact of Buffer Address Distribution on Cache
 Utilization

 The address distribution of mid-size buffers can
 affect the system’s overall cache utilization. In par-
 ticular, power-of-two allocators - where all buffers
 are 2 n bytes and are 2 n -byte aligned - are pes-
 simal.* Suppose, for example, that every inode
 (∼ 300 bytes) is assigned a 512-byte buffer, 512-byte
 aligned, and that only the first dozen fields of an
 inode (48 bytes) are frequently referenced. Then
 the majority of inode-related memory traffic will be
 at addresses between 0 and 47 modulo 512. Thus
 the cache lines near 512-byte boundaries will be
 heavily loaded while the rest lie fallow. In effect
 only 9% (48/512) of the cache will be usable by
 inodes. Fully-associative caches would not suffer
 this problem, but current hardware trends are toward
 simpler rather than more complex caches.

 4.3. Slab Coloring

 The slab allocator incorporates a simple coloring
 scheme that distributes buffers evenly throughout
 the cache, resulting in excellent cache utilization
 and bus balance. The concept is simple: each time
 a new slab is created, the buffer addresses start at a
 slightly different offset (color) from the slab base
 (which is always page-aligned). For example, for a
 cache of 200-byte objects with 8-byte alignment, the
 first slab’s buffers would be at addresses 0, 200,
 400, ... relative to the slab base. The next slab’s
 buffers would be at offsets 8, 208, 408, ... and so
 on. The maximum slab color is determined by the
 amount of unused space in the slab.


 we run on enough different machines that i think we should consider this.


well, first of all, right now this is a rather theoretical gain.  we
need to test it
to understand if it makes things easier.  to see cache statistics we can use
performance counters, however current pctr code might be a bit out of date.

 so the question is if we do bring colouring back, how do we calculate it?
 arc4random? mask bits off ph_magic? atomic_inc something in the pool?
 read a counter from the pool? shift bits off the page address?

the way i read it is that you have a per-pool running value pr_color that you
increment by the item alignment or native cache line size modulo space
available for every page you are getting from uvm.  however i can see that
it might entail a problem by locating a page header (or was it page boundary?
don't have the code at hand) using simple math.



Re: iked responds with esp over external ips.

2014-11-05 Thread Mike Belopuhov
On 4 November 2014 17:06, Martin Larsson martin.larss...@gmail.com wrote:
 Hello!

 I've setup a tunnel between OpenBSD 5.6 using iked and an openwrt router
 running strongswan.
 The tunnel works great with ping and other traffic but traffic between the
 two external ip's dies.

 This is a site-to-site connection and nothing fancy.

 iked.conf on OpenBSD.
 ikev2 esp from $10.11.12.0/24 to $194.168.4.0/24 peer $tcgw srcid sippan.se

 # ipsecctl -sa
 FLOWS:
 flow esp in from 192.168.4.0/24 to 10.11.12.0/24 peer 82.17.12.21 srcid
 FQDN/sippan.se dstid FQDN/sswan.sippan.se type use
 flow esp out from 10.11.12.0/24 to 192.168.4.0/24 peer 82.17.12.21 srcid
 FQDN/sippan.se dstid FQDN/sswan.sippan.se type require
 flow esp out from ::/0 to ::/0 type deny

 SAD:
 esp tunnel from 82.17.12.21 to 130.51.23.4 spi 0x67483925 auth hmac-sha1
 enc aes
 esp tunnel from 130.51.23.4 to 82.17.12.21 spi 0xcf1f39d1 auth hmac-sha1
 enc aes

 # netstat -nr
 Routing tables

 Internet:
 DestinationGatewayFlags   Refs  Use   Mtu  Prio
 Iface
 default130.51.23.4 UGS   10 30430256 - 8 em0
 10/8   link#5 UC 10 - 4
 vether0
 10.11.12.13fe:e1:ba:d0:d6:1c  UHLl   01 - 1 lo0
 10.255.255.255 link#5 UHLc   3  570 - 4
 vether0
 82.17.12.21  130.51.23.4 UGHD   0 30430251 - L  56 em0
 127/8  127.0.0.1  UGRS   00 32768 8 lo0
 127.0.0.1  127.0.0.1  UH 16 32768 4 lo0
 194.48.213.128/27  link#1 UC 10 - 4 em0
 130.51.23.4 00:00:cd:19:95:16  UHLc   20 - 4 em0
 130.51.23.4 00:02:b3:aa:cc:c3  HLl00 - 1 lo0
 224/4  127.0.0.1  URS00 32768 8 lo0

 Internet6:
 -removed, dont use it-

 Encap:
 Source Port  DestinationPort  Proto
 SA(Address/Proto/Type/Direction)
 192.168.4/24   0 10.11.12/240 0
 82.17.12.21/esp/use/in
 10.11.12/240 192.168.4/24   0 0
 82.17.12.21/esp/require/out
 default0 default
  0 0 none/esp/deny/out

 # tcpdump on openbsd while trying to connect with ssh to the external ip of
 the OpenBSD host from the exernal ip of the other end.

 # tcpdump host 82.17.12.21
 tcpdump: listening on em0, link-type EN10MB
 tcpdump: WARNING: compensating for unaligned libpcap packets
 16:49:55.539903 egget.priv.lamest.se.54158  loller.sippan.se.ssh: S
 2729317717:2729317717(0) win 8192 mss 1460,nop,wscale 8,nop,nop,sackOK
 (DF)
 16:49:55.539932 loller.sippan.se.ssh  egget.priv.lamest.se.54158: S
 2317435827:2317435827(0) ack 2729317718 win 16384 mss
 1240,nop,nop,sackOK,nop,wscale 3
 16:49:55.545936 egget.priv.lamest.se.54158  loller.sippan.se.ssh: . ack 1
 win 256 (DF)
 16:49:55.553927 esp loller.sippan.se  egget.priv.lamest.se spi 0xcf1f39d1
 seq 190 len 100
 16:50:01.553883 esp loller.sippan.se  egget.priv.lamest.se spi 0xcf1f39d1
 seq 191 len 100
 16:50:05.977468 esp egget.priv.lamest.se  loller.sippan.se spi 0x67483925
 seq 127 len 84 (DF)
 16:50:05.977519 esp loller.sippan.se  egget.priv.lamest.se spi 0xcf1f39d1
 seq 192 len 84


 # tcpdump on enc0 while trying ssh and https
 tcpdump: listening on enc0, link-type ENC
 tcpdump: WARNING: compensating for unaligned libpcap packets
 17:01:01.578622 (authentic,confidential): SPI 0xc31749f4:
 loller.sippan.se.ssh  egget.priv.lamest.se.54158: R
 2317435850:2317435850(0) ack 2729317718 win 0 (encap)
 17:01:05.786123 (authentic,confidential): SPI 0xc31749f4:
 loller.sippan.se.ssh  egget.priv.lamest.se.54792: P
 3813334764:3813334785(21) ack 2711749548 win 2170 (encap)
 17:01:05.968654 (authentic,confidential): SPI 0xc31749f4:
 loller.sippan.se.https  egget.priv.lamest.se.54793: P
 3540908942:3540909100(158) ack 1840586787 win 2170 (encap)
 17:01:06.265543 (authentic,confidential): SPI 0xc31749f4:
 loller.sippan.se.https  egget.priv.lamest.se.54793: . ack 1 win 2170
 (encap)
 17:01:06.876165 (authentic,confidential): SPI 0xc31749f4:
 loller.sippan.se.https  egget.priv.lamest.se.54793: . ack 1 win 2170
 (encap)
 17:01:08.095189 (authentic,confidential): SPI 0xc31749f4:
 loller.sippan.se.https  egget.priv.lamest.se.54793: . ack 1 win 2170
 (encap)
 17:01:10.459116 (authentic,confidential): SPI 0xc31749f4:
 loller.sippan.se.https  egget.priv.lamest.se.54793: . ack 1 win 2170
 (encap)

 So it appears that OpenBSD tries to send back traffic with ESP when it
 shouldn't.

 I'd also like to add that the exact same setup works with with isakmpd.

 Best regards
 Martin



This is a known issue.  The reason why it happens is not
strictly documented.  The change to the stack was introduced
as part of the cleanup long time ago.  Supposedly it's there
to support TCP MD5 signatures, but I didn't verify that yet.

There's this code in the 

Re: iked responds with esp over external ips.

2014-11-05 Thread Mike Belopuhov
On 5 November 2014 13:28, Mike Belopuhov m...@belopuhov.com wrote:
 On 4 November 2014 17:06, Martin Larsson martin.larss...@gmail.com wrote:
 Hello!

 I've setup a tunnel between OpenBSD 5.6 using iked and an openwrt router
 running strongswan.
 The tunnel works great with ping and other traffic but traffic between the
 two external ip's dies.

 This is a site-to-site connection and nothing fancy.

 iked.conf on OpenBSD.
 ikev2 esp from $10.11.12.0/24 to $194.168.4.0/24 peer $tcgw srcid sippan.se

 # ipsecctl -sa
 FLOWS:
 flow esp in from 192.168.4.0/24 to 10.11.12.0/24 peer 82.17.12.21 srcid
 FQDN/sippan.se dstid FQDN/sswan.sippan.se type use
 flow esp out from 10.11.12.0/24 to 192.168.4.0/24 peer 82.17.12.21 srcid
 FQDN/sippan.se dstid FQDN/sswan.sippan.se type require
 flow esp out from ::/0 to ::/0 type deny

 SAD:
 esp tunnel from 82.17.12.21 to 130.51.23.4 spi 0x67483925 auth hmac-sha1
 enc aes
 esp tunnel from 130.51.23.4 to 82.17.12.21 spi 0xcf1f39d1 auth hmac-sha1
 enc aes

 # netstat -nr
 Routing tables

 Internet:
 DestinationGatewayFlags   Refs  Use   Mtu  Prio
 Iface
 default130.51.23.4 UGS   10 30430256 - 8 em0
 10/8   link#5 UC 10 - 4
 vether0
 10.11.12.13fe:e1:ba:d0:d6:1c  UHLl   01 - 1 lo0
 10.255.255.255 link#5 UHLc   3  570 - 4
 vether0
 82.17.12.21  130.51.23.4 UGHD   0 30430251 - L  56 em0
 127/8  127.0.0.1  UGRS   00 32768 8 lo0
 127.0.0.1  127.0.0.1  UH 16 32768 4 lo0
 194.48.213.128/27  link#1 UC 10 - 4 em0
 130.51.23.4 00:00:cd:19:95:16  UHLc   20 - 4 em0
 130.51.23.4 00:02:b3:aa:cc:c3  HLl00 - 1 lo0
 224/4  127.0.0.1  URS00 32768 8 lo0

 Internet6:
 -removed, dont use it-

 Encap:
 Source Port  DestinationPort  Proto
 SA(Address/Proto/Type/Direction)
 192.168.4/24   0 10.11.12/240 0
 82.17.12.21/esp/use/in
 10.11.12/240 192.168.4/24   0 0
 82.17.12.21/esp/require/out
 default0 default
  0 0 none/esp/deny/out

 # tcpdump on openbsd while trying to connect with ssh to the external ip of
 the OpenBSD host from the exernal ip of the other end.

 # tcpdump host 82.17.12.21
 tcpdump: listening on em0, link-type EN10MB
 tcpdump: WARNING: compensating for unaligned libpcap packets
 16:49:55.539903 egget.priv.lamest.se.54158  loller.sippan.se.ssh: S
 2729317717:2729317717(0) win 8192 mss 1460,nop,wscale 8,nop,nop,sackOK
 (DF)
 16:49:55.539932 loller.sippan.se.ssh  egget.priv.lamest.se.54158: S
 2317435827:2317435827(0) ack 2729317718 win 16384 mss
 1240,nop,nop,sackOK,nop,wscale 3
 16:49:55.545936 egget.priv.lamest.se.54158  loller.sippan.se.ssh: . ack 1
 win 256 (DF)
 16:49:55.553927 esp loller.sippan.se  egget.priv.lamest.se spi 0xcf1f39d1
 seq 190 len 100
 16:50:01.553883 esp loller.sippan.se  egget.priv.lamest.se spi 0xcf1f39d1
 seq 191 len 100
 16:50:05.977468 esp egget.priv.lamest.se  loller.sippan.se spi 0x67483925
 seq 127 len 84 (DF)
 16:50:05.977519 esp loller.sippan.se  egget.priv.lamest.se spi 0xcf1f39d1
 seq 192 len 84


There's another interesting behavior evident in this trace: first few (usually
1 or 2) packets (SYN-ACK from loller here) are not encrypted.  That's because
we update our PCB on input due to the presence of a matching SA in a very agile
manner: first it doesn't really match, but we make sure next time it does
(funny logic in the ipsp_spd_inp that coupled with the default IPsec policy).



Re: 5.6 Icmp6 checksum / pf

2014-11-10 Thread Mike Belopuhov
On Sun, Nov 09, 2014 at 10:17 +0100, Bastien Durel wrote:
 Hi,
 
 I recently upgraded to 5.6, and got problems with icmpv6
 
 I have a gif tunnel for IPv6:
 [root@fremen root]# ifconfig gif0 
   
   
   
 gif0: flags=8051UP,POINTOPOINT,RUNNING,MULTICAST mtu 1280
 description: Sixxs
 priority: 0
 groups: gif egress
 tunnel: inet 78.194.94.73 - 212.100.184.146
 inet6 fe80::200:24ff:fecf:42ac%gif0 - prefixlen 64 scopeid 0x10
 inet6 2001:6f8:202:19c::2 - 2001:6f8:202:19c::1 prefixlen 128
 
 When I initiate a ping from this interface, it works as intended:
 
 08:59:38.376107 2001:6f8:202:19c::2  2001:41d0:8:91a::1: icmp6: echo request 
 (id:392e seq:0) [icmp6 cksum ok] (len 16, hlim 64)
 08:59:38.410385 2001:41d0:8:91a::1  2001:6f8:202:19c::2: icmp6: echo reply 
 (id:392e seq:0) [icmp6 cksum ok] (len 16, hlim 57)
 08:59:39.389383 2001:6f8:202:19c::2  2001:41d0:8:91a::1: icmp6: echo request 
 (id:392e seq:1) [icmp6 cksum ok] (len 16, hlim 64)
 08:59:39.421397 2001:41d0:8:91a::1  2001:6f8:202:19c::2: icmp6: echo reply 
 (id:392e seq:1) [icmp6 cksum ok] (len 16, hlim 57)
 
 But when a ping from outside reached it, the echo reply is sent with a
 bad (0) checksum, and the packet is dropped by te other side:
 
 09:40:28.852102 2001:41d0:8:91a::1  2001:6f8:202:19c::2: icmp6: echo request 
 (id:6c10 seq:1) [icmp6 cksum ok] (len 64, hlim 57)
 09:40:28.852251 2001:6f8:202:19c::2  2001:41d0:8:91a::1: icmp6: echo reply 
 (id:6c10 seq:1) [bad icmp6 cksum 0! - 2d93] (len 64, hlim 64)
 09:40:29.860327 2001:41d0:8:91a::1  2001:6f8:202:19c::2: icmp6: echo request 
 (id:6c10 seq:2) [icmp6 cksum ok] (len 64, hlim 57)
 09:40:29.860432 2001:6f8:202:19c::2  2001:41d0:8:91a::1: icmp6: echo reply 
 (id:6c10 seq:2) [bad icmp6 cksum 0! - 8a71] (len 64, hlim 64)
 
 It works correctly with this pf rule disabled:
 pass in on gif0 reply-to ( gif0 2001:6f8:202:19c::1 ) keep state
 
 What's the correct way to handle correct return-path ?
 
 Regards,
 
 -- 
 Bastien Durel
 

hi,

looks like it's caused by the forgotten in6_proto_cksum_out call
in the pf_route6 after possible pf_map_addr/PF_ACPY manipulations.

bastien, can you please verify that the diff below fixes your
issue?

diff --git sys/net/pf.c sys/net/pf.c
index 979f44d..4c934db 100644
--- sys/net/pf.c
+++ sys/net/pf.c
@@ -5777,20 +5777,22 @@ pf_route6(struct mbuf **m, struct pf_rule *r, int dir, 
struct ifnet *oifp,
goto bad;
else if (m0 == NULL)
goto done;
if (m0-m_len  sizeof(struct ip6_hdr)) {
DPFPRINTF(LOG_ERR,
pf_route6: m0-m_len  sizeof(struct ip6_hdr));
goto bad;
}
}
 
+   in6_proto_cksum_out(m0, ifp);
+
/*
 * If the packet is too large for the outgoing interface,
 * send back an icmp6 error.
 */
if (IN6_IS_SCOPE_EMBED(dst-sin6_addr))
dst-sin6_addr.s6_addr16[1] = htons(ifp-if_index);
if ((u_long)m0-m_pkthdr.len = ifp-if_mtu) {
nd6_output(ifp, ifp, m0, dst, NULL);
} else {
in6_ifstat_inc(ifp, ifs6_in_toobig);



Re: improving OpenBSD's gmac.c...

2014-11-12 Thread Mike Belopuhov
On 10 October 2014 02:39, Damien Miller d...@mindrot.org wrote:
 On Thu, 9 Oct 2014, Christian Weisgerber wrote:

 John-Mark Gurney:

  I also have an implementation of ghash that does a 4 bit lookup table
  version with the table split between cache lines in p4 at:
  https://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/opencrypto/sys/opencrypto/gfmult.cREV=4
 
  This also has a version with does 4 blocks at a time getting a
  further speed up...

 FWIW, I did a quick  dirty merge of this into the OpenBSD tree and
 the speed of my test (net6501-50, tcpbench -u over esp aes-128-gmac)
 almost doubled.

 isn't this likely to make it more likely to be subject to timing
 attacks?


then how is this different to our table based aes implementation?
and it's the same C code as in openssl which also uses table based
gcm implementation.

what countermeasures can be applied to the table lookup code
to fight these attacks?



Re: divert-to with sockets bound to any

2013-06-19 Thread Mike Belopuhov
On 19 June 2013 20:20, Reyk Floeter r...@openbsd.org wrote:
 On Wed, Jun 19, 2013 at 08:00:01PM +0200, Reyk Floeter wrote:
 OK?


 I forgot the in6_pcblookup_listen() case, updated diff below.

 Reyk


it boils down to the pcb lookup magic as i thought;  ok mikeb.



Stop calling IPsec and pf under splnet

2013-07-12 Thread Mike Belopuhov
Hi,

As it was pointed out by dhill there are some rogue splnets in
the tcp_input that shouldn't be there really.  The only reason
they're still there is to match overzealous splnets in bridge_
broadcast.  bridge_ifenqueue is the only function call in there
that requires splnet protection since it's dealing with send
queues.  Narrowing the range of the splnet protection allows us
to remove all splnet protection of the IPsec SPD and TDB code.
This as well removes the only pf_test call done under IPL_NET.

Below are essentially two diffs that are rather hard to separate.
I've tested the diff with the gif-to-ethernet IPsec bridge but
some additional IPsec and bridge testing won't hurt.

mpi@ has provided some feedback already, so I'm really looking
for OK's on this.

diff --git sys/net/if_bridge.c sys/net/if_bridge.c
index 41d7b67..0ca2710 100644
--- sys/net/if_bridge.c
+++ sys/net/if_bridge.c
@@ -969,12 +969,10 @@ bridge_output(struct ifnet *ifp, struct mbuf *m, struct 
sockaddr *sa,
return (ENOBUFS);
}
eh = mtod(m, struct ether_header *);
dst = (struct ether_addr *)eh-ether_dhost[0];
 
-   s = splnet();
-
/*
 * If bridge is down, but original output interface is up,
 * go ahead and send out that interface.  Otherwise the packet
 * is dropped below.
 */
@@ -1007,11 +1005,10 @@ bridge_output(struct ifnet *ifp, struct mbuf *m, struct 
sockaddr *sa,
 */
if ((mtag = m_tag_find(m, PACKET_TAG_IPSEC_OUT_CRYPTO_NEEDED,
NULL)) != NULL) {
ipsp_skipcrypto_unmark((struct tdb_ident *)(mtag + 1));
m_freem(m);
-   splx(s);
return (0);
}
 #endif /* IPSEC */
bridge_span(sc, NULL, m);
 
@@ -1074,22 +1071,24 @@ bridge_output(struct ifnet *ifp, struct mbuf *m, struct 
sockaddr *sa,
m1-m_pkthdr.len = len;
}
mc = m1;
}
 
+   s = splnet();
error = bridge_ifenqueue(sc, dst_if, mc);
+   splx(s);
if (error)
continue;
}
if (!used)
m_freem(m);
-   splx(s);
return (0);
}
 
 sendunicast:
bridge_span(sc, NULL, m);
+   s = splnet();
if ((dst_if-if_flags  IFF_RUNNING) == 0) {
m_freem(m);
splx(s);
return (ENETDOWN);
}
@@ -1251,13 +1250,11 @@ bridgeintr_frame(struct bridge_softc *sc, struct mbuf 
*m)
 * If the packet is a multicast or broadcast OR if we don't
 * know any better, forward it to all interfaces.
 */
if ((m-m_flags  (M_BCAST | M_MCAST)) || dst_if == NULL) {
sc-sc_if.if_imcasts++;
-   s = splnet();
bridge_broadcast(sc, src_if, eh, m);
-   splx(s);
return;
}
 
/*
 * At this point, we're dealing with a unicast frame going to a
@@ -1496,13 +1493,11 @@ bridge_broadcast(struct bridge_softc *sc, struct ifnet 
*ifp,
 struct ether_header *eh, struct mbuf *m)
 {
struct bridge_iflist *p;
struct mbuf *mc;
struct ifnet *dst_if;
-   int len, used = 0;
-
-   splassert(IPL_NET);
+   int len, s, used = 0;
 
TAILQ_FOREACH(p, sc-sc_iflist, next) {
/*
 * Don't retransmit out of the same interface where
 * the packet was received from.
@@ -1587,11 +1582,13 @@ bridge_broadcast(struct bridge_softc *sc, struct ifnet 
*ifp,
len += ETHER_VLAN_ENCAP_LEN;
 #endif
if ((len - ETHER_HDR_LEN)  dst_if-if_mtu)
bridge_fragment(sc, dst_if, eh, mc);
else {
+   s = splnet();
bridge_ifenqueue(sc, dst_if, mc);
+   splx(s);
}
}
 
if (!used)
m_freem(m);
@@ -1643,11 +1640,11 @@ bridge_span(struct bridge_softc *sc, struct 
ether_header *eh,
 struct mbuf *morig)
 {
struct bridge_iflist *p;
struct ifnet *ifp;
struct mbuf *mc, *m;
-   int error;
+   int s, error;
 
if (TAILQ_EMPTY(sc-sc_spanlist))
return;
 
m = m_copym2(morig, 0, M_COPYALL, M_NOWAIT);
@@ -1679,11 +1676,13 @@ bridge_span(struct bridge_softc *sc, struct 
ether_header *eh,
if (mc == NULL) {
sc-sc_if.if_oerrors++;
continue;
}
 
+   s = splnet();
error = bridge_ifenqueue(sc, ifp, mc);
+   splx(s);
if (error)

ix(4) driver update to latest FreeBSD/Intel source code

2013-07-12 Thread Mike Belopuhov
Hi,

The following diff updates most of the ix(4) driver to what
FreeBSD and Intel have today.  Most importantly it introduces
support for the Ethernet flow control.

Please test and report.  OK's are welcome as well.

http://theapt.org/~mike/ix.diff
http://theapt.org/~mike/ix-w.diff  (less whitespace changes)



Re: include netinet/in_var.h in dev

2013-08-06 Thread Mike Belopuhov
On 6 August 2013 03:54, Alexander Bluhm alexander.bl...@gmx.net wrote:
 Hi,

 For an upcoming change in in6_var.h I would like to minimize the
 impact.  Most network drivers include netinet/in_var.h, but apparently
 they don't have to.  Can we remove these includes?

 compiled on amd64 and i386

 ok?

 bluhm


should be fine.  ok mikeb



Re: Insert new IPv4 addresses at only one place

2013-08-07 Thread Mike Belopuhov
On 7 August 2013 15:07, Martin Pieuchot mpieuc...@nolizard.org wrote:
 Diff below deduplicate and move the code adding a new address to the
 global list into in_ifinit(), there's no functional change.

 While here add a comment about why we always delete addresses from
 the tree during update.

 ok?


OK mikeb



Re: Constify the null sockaddr in arp_rtrequest()

2013-08-08 Thread Mike Belopuhov
On 8 August 2013 12:35, Martin Pieuchot mpieuc...@nolizard.org wrote:
 arp_rtrequest() uses a default static sockaddr_dl which is only used
 read-only: it is copied by rt_setgate().

 I'd like to constify this structure to make it clear no value can be
 trashed if code using it is run in parallel.

 Also remove another reference to the name of the variable to make it
 clear it is used only once.

 ok?


ok mikeb



Re: tedu netatm

2013-08-09 Thread Mike Belopuhov
On 9 August 2013 09:36, Martin Pieuchot mpieuc...@nolizard.org wrote:
 It's me again :) With a freshly updated and tested diff to tedu netatm.
 I got no objection since I raised the issue 5 months ago [0], so I'm now
 looking for oks.

 [0] http://marc.info/?l=openbsd-techm=136335787207091w=2


i think it should go the way of doo doo.



ray(4) removal

2013-08-13 Thread Mike Belopuhov
to make mpi's life a tad easier and also lose some weight,
i'd like to move rat(4) to the attic.  mpi, kettenis, jsg
and henning agree.  i'll commit the diff if noone objects.

henning has also suggested to remove the pre-wifi era cnw(4).
if there's interest i can cook a diff for that as well, but
it doesn't seem to get in the way at the moment.

Index: arch/amd64/conf/GENERIC
===
RCS file: /cvs/openbsd/src/sys/arch/amd64/conf/GENERIC,v
retrieving revision 1.343
diff -u -p -r1.343 GENERIC
--- arch/amd64/conf/GENERIC 12 Aug 2013 04:11:52 -  1.343
+++ arch/amd64/conf/GENERIC 13 Aug 2013 13:10:57 -
@@ -472,7 +472,6 @@ wi* at pcmcia?  # WaveLAN IEEE 802.11DS
 an*at pci? # Aironet IEEE 802.11DS
 an*at pcmcia?  # Aironet IEEE 802.11DS
 cnw*   at pcmcia?  # Xircom Netwave
-ray*   at pcmcia?  # Raylink Aviator2.4/Pro 802.11FH
 iwi*   at pci? # Intel PRO/Wireless 2200BG/2915ABG
 wpi*   at pci? # Intel PRO/Wireless 3945ABG
 iwn*   at pci? # Intel WiFi Link 4965/5000/1000/6000
Index: arch/amd64/conf/RAMDISK
===
RCS file: /cvs/openbsd/src/sys/arch/amd64/conf/RAMDISK,v
retrieving revision 1.55
diff -u -p -r1.55 RAMDISK
--- arch/amd64/conf/RAMDISK 15 Mar 2012 20:06:02 -  1.55
+++ arch/amd64/conf/RAMDISK 13 Aug 2013 13:10:57 -
@@ -217,7 +217,6 @@ nfe*at pci? # NVIDIA nForce Ethernet
 #wi*   at pcmcia?  # WaveLAN IEEE 802.11DS
 #an*   at pcmcia?  # Aironet IEEE 802.11DS
 #cnw*  at pcmcia?  # Xircom Netwave
-#ray*  at pcmcia?  # Raylink Aviator2.4/Pro 802.11FH
 
 # Media Independent Interface (mii) drivers
 #exphy*at mii? # 3Com internal PHYs
Index: arch/amd64/conf/RAMDISK_CD
===
RCS file: /cvs/openbsd/src/sys/arch/amd64/conf/RAMDISK_CD,v
retrieving revision 1.125
diff -u -p -r1.125 RAMDISK_CD
--- arch/amd64/conf/RAMDISK_CD  5 Apr 2013 05:45:09 -   1.125
+++ arch/amd64/conf/RAMDISK_CD  13 Aug 2013 13:10:57 -
@@ -291,7 +291,6 @@ wi* at pci? # WaveLAN IEEE 802.11DS
 wi*at pcmcia?  # WaveLAN IEEE 802.11DS
 an*at pcmcia?  # Aironet IEEE 802.11DS
 #cnw*  at pcmcia?  # Xircom Netwave
-#ray*  at pcmcia?  # Raylink Aviator2.4/Pro 802.11FH
 ipw*   at pci? # Intel PRO/Wireless 2100
 iwi*   at pci? # Intel PRO/Wireless 2200BG/2915ABG
 wpi*   at pci? # Intel PRO/Wireless 3945ABG
Index: arch/i386/conf/GENERIC
===
RCS file: /cvs/openbsd/src/sys/arch/i386/conf/GENERIC,v
retrieving revision 1.748
diff -u -p -r1.748 GENERIC
--- arch/i386/conf/GENERIC  12 Aug 2013 04:11:52 -  1.748
+++ arch/i386/conf/GENERIC  13 Aug 2013 13:10:57 -
@@ -612,7 +612,6 @@ an* at pci? # Aironet IEEE 802.11DS
 an*at isapnp?  # Aironet IEEE 802.11DS
 an*at pcmcia?  # Aironet IEEE 802.11DS
 #cnw*  at pcmcia?  # Xircom Netwave
-ray*   at pcmcia?  # Raylink Aviator2.4/Pro 802.11FH
 ipw*   at pci? # Intel PRO/Wireless 2100
 iwi*   at pci? # Intel PRO/Wireless 2200BG/2915ABG
 wpi*   at pci? # Intel PRO/Wireless 3945ABG
Index: arch/i386/conf/RAMDISK
===
RCS file: /cvs/openbsd/src/sys/arch/i386/conf/RAMDISK,v
retrieving revision 1.180
diff -u -p -r1.180 RAMDISK
--- arch/i386/conf/RAMDISK  14 Jun 2013 01:00:24 -  1.180
+++ arch/i386/conf/RAMDISK  13 Aug 2013 13:10:57 -
@@ -238,7 +238,6 @@ wi* at pci? # WaveLAN IEEE 802.11DS
 wi*at pcmcia?  # WaveLAN IEEE 802.11DS
 #an*   at pcmcia?  # Aironet IEEE 802.11DS
 #cnw*  at pcmcia?  # Xircom Netwave
-#ray*  at pcmcia?  # Raylink Aviator2.4/Pro 802.11FH
 
 # Media Independent Interface (mii) drivers
 exphy* at mii? # 3Com internal PHYs
Index: arch/i386/conf/RAMDISKB
===
RCS file: /cvs/openbsd/src/sys/arch/i386/conf/RAMDISKB,v
retrieving revision 1.121
diff -u -p -r1.121 RAMDISKB
--- arch/i386/conf/RAMDISKB 15 Mar 2012 20:06:03 -  1.121
+++ arch/i386/conf/RAMDISKB 13 Aug 2013 13:10:57 -
@@ -247,7 +247,6 @@ nfe*at pci? # NVIDIA nForce 
Ethernet
 #wi*   at pcmcia?  # WaveLAN IEEE 802.11DS
 

Re: rtsold ioctl sysctl

2013-08-15 Thread Mike Belopuhov
On Thu, Aug 15, 2013 at 00:39 +0200, Alexander Bluhm wrote:
 Hi,
 
 I would like to replace the obsolete ioctl(SIOCGDRLST_IN6) interface
 with sysctl(net.inet6.icmp6.nd6_drlist) in rtsold.  Code copied
 from ndp.
 
 ok?
 

looks good to me.  OK



Re: remove obsolete nd6 ioctls

2013-08-15 Thread Mike Belopuhov
On 15 August 2013 17:34, Alexander Bluhm alexander.bl...@gmx.net wrote:
 Hi,

 After converting the last user of ioctl(SIOCGDRLST_IN6) to sysctl,
 I would like to remove dead kernel ioctl code.

 Is it save to just delete this?

 ok?

 bluhm


if ports are fine with it, i'm fine as well (:



Re: threaded prof signals

2013-08-16 Thread Mike Belopuhov
On 16 August 2013 09:23, Ted Unangst t...@tedunangst.com wrote:
 Actually, here's my concern. There's only one timeout for the process.
 What happens when two threads are running on two CPUs? Is there a
 guarantee that cpu0 will both set and execute the timeout before cpu1
 sets it, or is there a race where cpu1 will fail to send the signal to
 its thread because cpu0 has already processed the timeout?


softclock and hence the timeout processing is done only by the cpu0
currently.



Re: Stop using static variables in ICMP

2013-08-19 Thread Mike Belopuhov
On 9 August 2013 11:04, Martin Pieuchot mpieuc...@nolizard.org wrote:
 This is the last episode from the first season of the serie, move
 your variables to the stack.  Like in the previous episodes, this
 one will let us execute the various icmp functions in parallel
 without risk of trashing a value.

 It also reduces the difference with the icmp6 code adding a redirect
 route.

 ok?


OK



Re: src/sbin/ifconfig: missing include

2013-08-19 Thread Mike Belopuhov
On 19 August 2013 12:52, David Coppa dco...@gmail.com wrote:

 This misses util.h:

 cc -O2 -pipe  -fno-pie -Wall   -DINET6 -c /usr/src/sbin/ifconfig/ifconfig.c
 /usr/src/sbin/ifconfig/ifconfig.c: In function 'setifwpakey':
 /usr/src/sbin/ifconfig/ifconfig.c:1759: warning: implicit declaration of 
 function 'pkcs5_pbkdf2'

 OK?


OK



bge: call if_link_state_change when state is actually different

2013-08-23 Thread Mike Belopuhov
hi,

bge(4) is the last driver in the tree that is willing to call
if_link_state_change whenever, while others do so only when
the link state does change.

there should be no real change in functionality.  ok?


diff --git sys/dev/pci/if_bge.c sys/dev/pci/if_bge.c
index 5cd56e2..233ccab 100644
--- sys/dev/pci/if_bge.c
+++ sys/dev/pci/if_bge.c
@@ -4512,22 +4512,27 @@ bge_link_upd(struct bge_softc *sc)
if (BGE_ASICREV(sc-bge_chipid) == 
BGE_ASICREV_BCM5704)
BGE_CLRBIT(sc, BGE_MAC_MODE,
BGE_MACMODE_TBI_SEND_CFGS);
CSR_WRITE_4(sc, BGE_MAC_STS, 0x);
status = CSR_READ_4(sc, BGE_MAC_MODE);
-   ifp-if_link_state =
-   (status  BGE_MACMODE_HALF_DUPLEX) ?
+   link = (status  BGE_MACMODE_HALF_DUPLEX) ?
LINK_STATE_HALF_DUPLEX :
LINK_STATE_FULL_DUPLEX;
-   if_link_state_change(ifp);
ifp-if_baudrate = IF_Gbps(1);
+   if (ifp-if_link_state != link) {
+   ifp-if_link_state = link;
+   if_link_state_change(ifp);
+   }
}
} else if (BGE_STS_BIT(sc, BGE_STS_LINK)) {
BGE_STS_CLRBIT(sc, BGE_STS_LINK);
-   ifp-if_link_state = LINK_STATE_DOWN;
-   if_link_state_change(ifp);
+   link = LINK_STATE_DOWN;
ifp-if_baudrate = 0;
+   if (ifp-if_link_state != link) {
+   ifp-if_link_state = link;
+   if_link_state_change(ifp);
+   }
}
} else if (BGE_STS_BIT(sc, BGE_STS_AUTOPOLL)) {
/*
 * Some broken BCM chips have BGE_STATFLAG_LINKSTATE_CHANGED bit
 * in status word always set. Workaround this bug by reading



defer routing table updates on link state changes

2013-08-26 Thread Mike Belopuhov
hi,

in order to make our life a bit easier and prevent rogue
accesses to the routing table from the hardware interrupt
context violating all kinds of spl assumptions we would
like if_link_state_change that is called by network device
drivers in their interrupt service routines to defer its
work to the process context or thereabouts.

i did some testing here, but wouldn't mind if someone
tries this diff in gre/vlan/ospf/anything-weird setups.
making sure that hot-plugging/unplugging usb interfaces
doesn't produce any undesirable effects would be superb
as well.

please note that a token (an interface index) is passed
to the workq in order to make sure that if the interface
would be gone by the time syswq goes around to run the
task it would just fall through.

ok?

diff --git sys/net/if.c sys/net/if.c
index 6dafd0d..5b6800a 100644
--- sys/net/if.c
+++ sys/net/if.c
@@ -79,10 +79,11 @@
 #include sys/protosw.h
 #include sys/kernel.h
 #include sys/ioctl.h
 #include sys/domain.h
 #include sys/sysctl.h
+#include sys/workq.h
 
 #include net/if.h
 #include net/if_dl.h
 #include net/if_media.h
 #include net/if_types.h
@@ -151,10 +152,12 @@ int   if_clone_list(struct if_clonereq *);
 struct if_clone*if_clone_lookup(const char *, int *);
 
 void   if_congestion_clear(void *);
 intif_group_egress_build(void);
 
+void   if_link_state_change_task(void *, void *);
+
 intifai_cmp(struct ifaddr_item *,  struct ifaddr_item *);
 void   ifa_item_insert(struct sockaddr *, struct ifaddr *, struct ifnet *);
 void   ifa_item_remove(struct sockaddr *, struct ifaddr *, struct ifnet *);
 #ifndef SMALL_KERNEL
 void   ifa_print_rb(void);
@@ -1106,21 +1109,39 @@ if_up(struct ifnet *ifp)
 
m_clinitifp(ifp);
 }
 
 /*
- * Process a link state change.
- * NOTE: must be called at splsoftnet or equivalent.
+ * Schedule a link state change task.
  */
 void
 if_link_state_change(struct ifnet *ifp)
 {
-   rt_ifmsg(ifp);
+   /* try to put the routing table update task on syswq */
+   workq_add_task(NULL, 0, if_link_state_change_task,
+   (void *)((unsigned long)ifp-if_index), NULL);
+}
+
+/*
+ * Process a link state change.
+ */
+void
+if_link_state_change_task(void *arg, void *unused)
+{
+   unsigned int index = (unsigned long)arg;
+   struct ifnet *ifp;
+   int s;
+
+   s = splsoftnet();
+   if ((ifp = if_get(index)) != NULL) {
+   rt_ifmsg(ifp);
 #ifndef SMALL_KERNEL
-   rt_if_track(ifp);
+   rt_if_track(ifp);
 #endif
-   dohooks(ifp-if_linkstatehooks, 0);
+   dohooks(ifp-if_linkstatehooks, 0);
+   }
+   splx(s);
 }
 
 /*
  * Handle interface watchdog timer routines.  Called
  * from softclock, we decrement timers (if set) and



Re: defer routing table updates on link state changes

2013-08-27 Thread Mike Belopuhov
On 27 August 2013 13:39, Martin Pieuchot mpieuc...@nolizard.org wrote:
 I think that's the right approach but the current code generating
 interfaces indexes is too clever from my point of view, it tries
 to reuse the last index if possible.  This could lead to some
 funny races if we detach and reattach an interface before the
 task get scheduled.

 So I'm ok with your diff if we avoid to reuse the last index too
 soon.  Diff below does that.


i'm ok with this change.



Re: Remove unused argument from *rtrequest()

2013-08-27 Thread Mike Belopuhov
On 27 August 2013 15:58, Martin Pieuchot mpieuc...@nolizard.org wrote:
 In order to define a proper API for our routine table, I'd like to turn
 the struct rt_addrinfo into a private type (ie: only used in route.c
 and rtsock.c).

 This type is used by a lost of code in our network stack to add or delete
 a route but also in the various *rtrequest() functions.  However in
 these functions the argument is never used!  So the diff below kills it.

 ok?


ok



Re: defer routing table updates on link state changes

2013-09-02 Thread Mike Belopuhov
On Mon, Aug 26, 2013 at 13:36 +0200, Mike Belopuhov wrote:
 hi,
 
 in order to make our life a bit easier and prevent rogue
 accesses to the routing table from the hardware interrupt
 context violating all kinds of spl assumptions we would
 like if_link_state_change that is called by network device
 drivers in their interrupt service routines to defer its
 work to the process context or thereabouts.
 
 i did some testing here, but wouldn't mind if someone
 tries this diff in gre/vlan/ospf/anything-weird setups.
 making sure that hot-plugging/unplugging usb interfaces
 doesn't produce any undesirable effects would be superb
 as well.
 
 please note that a token (an interface index) is passed
 to the workq in order to make sure that if the interface
 would be gone by the time syswq goes around to run the
 task it would just fall through.
 
 ok?
 

i've got an ok from mpi, anyone else would like to test
and/or comment?

 diff --git sys/net/if.c sys/net/if.c
 index 6dafd0d..5b6800a 100644
 --- sys/net/if.c
 +++ sys/net/if.c
 @@ -79,10 +79,11 @@
  #include sys/protosw.h
  #include sys/kernel.h
  #include sys/ioctl.h
  #include sys/domain.h
  #include sys/sysctl.h
 +#include sys/workq.h
  
  #include net/if.h
  #include net/if_dl.h
  #include net/if_media.h
  #include net/if_types.h
 @@ -151,10 +152,12 @@ int if_clone_list(struct if_clonereq *);
  struct if_clone  *if_clone_lookup(const char *, int *);
  
  void if_congestion_clear(void *);
  int  if_group_egress_build(void);
  
 +void if_link_state_change_task(void *, void *);
 +
  int  ifai_cmp(struct ifaddr_item *,  struct ifaddr_item *);
  void ifa_item_insert(struct sockaddr *, struct ifaddr *, struct ifnet *);
  void ifa_item_remove(struct sockaddr *, struct ifaddr *, struct ifnet *);
  #ifndef SMALL_KERNEL
  void ifa_print_rb(void);
 @@ -1106,21 +1109,39 @@ if_up(struct ifnet *ifp)
  
   m_clinitifp(ifp);
  }
  
  /*
 - * Process a link state change.
 - * NOTE: must be called at splsoftnet or equivalent.
 + * Schedule a link state change task.
   */
  void
  if_link_state_change(struct ifnet *ifp)
  {
 - rt_ifmsg(ifp);
 + /* try to put the routing table update task on syswq */
 + workq_add_task(NULL, 0, if_link_state_change_task,
 + (void *)((unsigned long)ifp-if_index), NULL);
 +}
 +
 +/*
 + * Process a link state change.
 + */
 +void
 +if_link_state_change_task(void *arg, void *unused)
 +{
 + unsigned int index = (unsigned long)arg;
 + struct ifnet *ifp;
 + int s;
 +
 + s = splsoftnet();
 + if ((ifp = if_get(index)) != NULL) {
 + rt_ifmsg(ifp);
  #ifndef SMALL_KERNEL
 - rt_if_track(ifp);
 + rt_if_track(ifp);
  #endif
 - dohooks(ifp-if_linkstatehooks, 0);
 + dohooks(ifp-if_linkstatehooks, 0);
 + }
 + splx(s);
  }
  
  /*
   * Handle interface watchdog timer routines.  Called
   * from softclock, we decrement timers (if set) and



Re: ix(4): enable checksum offload

2013-09-09 Thread Mike Belopuhov
On 9 September 2013 21:48, Brad Smith b...@comstyle.com wrote:
 Here is a diff to enable the checksum offload support for ix(4).

 Looking for any testing.


last time i checked this broke ospf traffic.  please make sure at least
ip/tcp, ip/udp, ip/icmp, ip/ip, ip/gre, ip/esp, ip/ah and ip/ospf work fine
with this.



Re: em(4): enable checksum offload

2013-09-09 Thread Mike Belopuhov
On 9 September 2013 21:44, Brad Smith b...@comstyle.com wrote:
 Since I have been asked to send out these diffs again here is a diff to enable
 the checksum offload support for em(4).

 Looking for any testing.



tx checksum offloading will not work on 75, 76, 80, i350.



Re: unknown products found in Dell Optiplex 9020

2013-09-09 Thread Mike Belopuhov
On 9 September 2013 22:46, STeve Andre' and...@msu.edu wrote:
 On 09/09/13 07:45, Paul de Weerd wrote:

 Found a couple of unknown Intel products in a Dell Optiplex 9020:

 vendor Intel, unknown product 0x153a (class network subclass ethernet,
 rev 0x04) at pci0 dev 25 function 0 not configured
 vendor Intel, unknown product 0x8c22 (class serial bus subclass SMBus,
 rev 0x04) at pci0 dev 31 function 3 not configured
 vendor Intel, unknown product 0x8c31 (class serial bus subclass USB, rev
 0x04) at pci0 dev 20 function 0 not configured
 vendor Intel, unknown product 0x8c3a (class communications subclass
 miscellaneous, rev 0x04) at pci0 dev 22 function 0 not configured
 vendor Intel, unknown product 0x8c3d (class communications subclass
 serial, rev 0x04) at pci0 dev 22 function 3 not configured

 The included diff adds these to pcidevs, resulting in the following
 (none of these devices are supported by actual drivers yet):

 Intel C220 xHCI USB rev 0x04 at pci0 dev 20 function 0 not configured
 Intel C220 MEI controller rev 0x04 at pci0 dev 22 function 0 not
 configured
 Intel C220 KT controller rev 0x04 at pci0 dev 22 function 3 not
 configured
 Intel I217-LM rev 0x04 at pci0 dev 25 function 0 not configured


 The 217-LM is the Listening Monitor, by the NSA.


 Intel C220 SMBus controller rev 0x04 at pci0 dev 31 function 3 not
 configured

 Note that the diff below also adds I217-V, which got me confused
 while I was figuring out what I had in this machine.  That part is not
 in the machine (as I understand it, it's basically a simplified
 version of the same NIC, lacking some enterprise features).  Full
 dmesg after the diff.


 The 217-V was made by GCHQ.


yes, these two come with an additional non-optional tx offload mode
we don't support in our stack yet.



Re: defer routing table updates on link state changes

2013-09-12 Thread Mike Belopuhov
On 12 September 2013 17:18, Martin Pieuchot mpieuc...@nolizard.org wrote:
 FWIW it would be interesting to modify tun(4) so that it doesn't
 need to detach/reattach itself when switching between mode, this
 would allow us to stop reusing the last index.


this definitely makes a lot of sense.



Re: defer routing table updates on link state changes

2013-09-12 Thread Mike Belopuhov
On 12 September 2013 17:31, Reyk Floeter r...@openbsd.org wrote:
 On Thu, Sep 12, 2013 at 05:18:39PM +0200, Martin Pieuchot wrote:
  For example, you have to query the IfIndex via SNMP to get further
  information, like the ifName or statistics, and most monitoring
  systems would save interface information based on the index - they
  would not recognize that tun0 with if_index 10 is the same interface
  (from an OpenBSD point of view) as tun0 with an if_index 11.  It is
  not guaranteed in OpenBSD, but we shouldn't make it worse.

 All our interface drivers are associated to one index when they are
 attached and it does not change.

 The only thing is that tun(4) is special because internally when it
 switches from L3 to L2 or vice versa it detaches and reattaches itself.
 That's why this hack of reusing the last index is needed.


 It also matters if you create destroy and re-create any other cloner
 interface (vlan, ...).


it makes no sense whatsoever, reyk.  those indices can be easily
stolen and nobody guarantees that if you create vlan10, vlan11,
then destroy vlan10, create vlan12 and vlan10 that vlan10 will
have the same index as before.  in fact it might be a different
interface created for a different purpose days after.  who knows?
if snmp client relies on this behavior, it's broken since we have
never made any provisions regarding how we use those indices.

 But if I understand correctly what you're saying about querying the
 IfIndex, I find it even more dangerous to reuse the last index,
 because I can create a configuration where I change an interface
 while keeping the same index (with usb interfaces for example).


 I think that's nonsense.


looks like you misunderstand the problem we're dealing with here.

 FWIW it would be interesting to modify tun(4) so that it doesn't
 need to detach/reattach itself when switching between mode, this
 would allow us to stop reusing the last index.


 Or you could simply rewrite tun(4)?

 Isn't there any other way to do what you want without stopping to
 reuse the index?  SNMP simply expects that if_indexes are fairly
 static, linear, and without holes.  Why should we change that in
 OpenBSD?  Is there any security reason to randomize the indexes -
 No.

 Reyk


or snmp can simply stop assuming things.  if_index wasn't created
for snmp in the first place.



Re: defer routing table updates on link state changes

2013-09-12 Thread Mike Belopuhov
On 12 September 2013 18:28, Mike Belopuhov m...@belopuhov.com wrote:
 On 12 September 2013 18:14, Reyk Floeter r...@openbsd.org wrote:
 On Thu, Sep 12, 2013 at 05:53:42PM +0200, Mike Belopuhov wrote:
 looks like you misunderstand the problem we're dealing with here.


 Sure, I do.  You're trying to push one thing and you don't want to
 hear the concerns about a specific detail of it.


 with all respect, i think you don't.  otherwise you wouldn't be asking
 the questions you're asking.

 we do hear your concerns, but since even before the change if_index
 was not static at all the way you seem to be implying snmp requires
 it, i don't see a situation drastically changing.  if you create all
 the interfaces on startup or before you start snmpd and don't destroy/
 re-create them nothing is changed.


perhaps truly persistent snmp ifindexes could be implemented in the
snmpd itself via a new configuration option, i.e.

interface vlan1 index 1.

you don't necessarily have to use the same ones system is using for
it's own purposes.



Re: defer routing table updates on link state changes

2013-09-12 Thread Mike Belopuhov
On 12 September 2013 18:14, Reyk Floeter r...@openbsd.org wrote:
 On Thu, Sep 12, 2013 at 05:53:42PM +0200, Mike Belopuhov wrote:
 looks like you misunderstand the problem we're dealing with here.


 Sure, I do.  You're trying to push one thing and you don't want to
 hear the concerns about a specific detail of it.


with all respect, i think you don't.  otherwise you wouldn't be asking
the questions you're asking.

we do hear your concerns, but since even before the change if_index
was not static at all the way you seem to be implying snmp requires
it, i don't see a situation drastically changing.  if you create all
the interfaces on startup or before you start snmpd and don't destroy/
re-create them nothing is changed.

  FWIW it would be interesting to modify tun(4) so that it doesn't
  need to detach/reattach itself when switching between mode, this
  would allow us to stop reusing the last index.
 
 
  Or you could simply rewrite tun(4)?
 
  Isn't there any other way to do what you want without stopping to
  reuse the index?  SNMP simply expects that if_indexes are fairly
  static, linear, and without holes.  Why should we change that in
  OpenBSD?  Is there any security reason to randomize the indexes -
  No.
 
  Reyk
 

 or snmp can simply stop assuming things.  if_index wasn't created
 for snmp in the first place.

 Of course, everyone else is wrong, let's change the world!  IfIndex is
 used by SNMP since at least 1988 (RFC 1066) and many many tools have
 adopted it expecting this behaviour.  Anyway, just go ahead and do the
 stuff.  I don't care, it is not a big issue for snmpd.  But I still
 don't see the point of changing the semantics instead of finding
 another way to do what you want.  Unless there is a security issue or
 similar with if_indexes and changing it would actually improve
 something.  Blah.

 Reyk

no need to get upset.  mpi's change does improve things.  we want to
make full use of if_index' initial design and use it as a reference
to the interface in the mp network stack .  it has nothing to do with
a badly designed protocol from the eighties.



Re: defer routing table updates on link state changes

2013-09-12 Thread Mike Belopuhov
On 12 September 2013 19:07, Reyk Floeter r...@openbsd.org wrote:
 On Thu, Sep 12, 2013 at 06:59:13PM +0200, Mike Belopuhov wrote:
  Ok, let's stop this.  I don't think you read what I replied before.  I
  didn't say that we're static with if_indexes, just that we shouldn't
  make it worse.
 

 or implement persistent indices in the snmpd itself maybe?


 Maybe.  But this would create another layer of abstraction.  And
 if_index is not just used by SNMP.

  I give up, but please read my next comment below.
 
Isn't there any other way to do what you want without stopping to
reuse the index?  SNMP simply expects that if_indexes are fairly
static, linear, and without holes.  Why should we change that in
OpenBSD?  Is there any security reason to randomize the indexes -
No.
  
   or snmp can simply stop assuming things.  if_index wasn't created
   for snmp in the first place.
 
  Actually, I think this assumption is wrong.  I researched a little bit
  in BSD history:
 
  - RFC 1066 from August 1988 is one of the early SNMP RFC that mention 
  IfIndex
 
  - 4.3BSD-Tahoe from June 1988 doesn't have if_index, I also didn't find
in other early BSDs.
 
  - 4.3BSD-Reno from June 1990 does have it.  You can even find a
new comment /*  XXX fast fix for SNMP, going away soon */ on top of 
  if.h.
 
  So it seems that if_index was added _for_ SNMP.
 

 i believe this comment refers to the inclusion of sys/time.h.


 Yes, I know.  But see:

 1. 1988-06: 4.3BSD-Tahoe without if_index
 2. 1988-08: SNMP
 3. 1990-06: 4.3BSD-Reno with if_index, mentioning SNMP in if.h.

 I don't have the commit history, but this might indicate that there
 was some work to support SNMP and many new fields in struct ifnet have
 been added for SNMP.

 http://minnie.tuhs.org/cgi-bin/utree.pl?file1=4.3BSD-Reno/src/sys/net/if.hfile2=4.3BSD-Tahoe/usr/src/sys/net/if.hprint=1

 reyk

either way, we need to move forward on this.  we want to use if_index
for the purpose of looking up the interface w/o a pointer to the ifnet.
should we implement additional indices for that or snmp problem will
be dealt with?  currently if you reuse an index and reuse the memory
for the ifnet you may suddenly pick a new interface and perform an
operation on it that might not necessarily make sense in the context.



Re: defer routing table updates on link state changes

2013-09-12 Thread Mike Belopuhov
On 12 September 2013 18:48, Reyk Floeter r...@openbsd.org wrote:
 On Thu, Sep 12, 2013 at 06:28:15PM +0200, Mike Belopuhov wrote:
  Sure, I do.  You're trying to push one thing and you don't want to
  hear the concerns about a specific detail of it.
 

 with all respect, i think you don't.  otherwise you wouldn't be asking
 the questions you're asking.

 we do hear your concerns, but since even before the change if_index
 was not static at all the way you seem to be implying snmp requires
 it, i don't see a situation drastically changing.  if you create all
 the interfaces on startup or before you start snmpd and don't destroy/
 re-create them nothing is changed.


 Ok, let's stop this.  I don't think you read what I replied before.  I
 didn't say that we're static with if_indexes, just that we shouldn't
 make it worse.


or implement persistent indices in the snmpd itself maybe?

 I give up, but please read my next comment below.

   Isn't there any other way to do what you want without stopping to
   reuse the index?  SNMP simply expects that if_indexes are fairly
   static, linear, and without holes.  Why should we change that in
   OpenBSD?  Is there any security reason to randomize the indexes -
   No.
 
  or snmp can simply stop assuming things.  if_index wasn't created
  for snmp in the first place.

 Actually, I think this assumption is wrong.  I researched a little bit
 in BSD history:

 - RFC 1066 from August 1988 is one of the early SNMP RFC that mention IfIndex

 - 4.3BSD-Tahoe from June 1988 doesn't have if_index, I also didn't find
   in other early BSDs.

 - 4.3BSD-Reno from June 1990 does have it.  You can even find a
   new comment /*  XXX fast fix for SNMP, going away soon */ on top of if.h.

 So it seems that if_index was added _for_ SNMP.


i believe this comment refers to the inclusion of sys/time.h.

  Of course, everyone else is wrong, let's change the world!  IfIndex is
  used by SNMP since at least 1988 (RFC 1066) and many many tools have
  adopted it expecting this behaviour.  Anyway, just go ahead and do the
  stuff.  I don't care, it is not a big issue for snmpd.  But I still
  don't see the point of changing the semantics instead of finding
  another way to do what you want.  Unless there is a security issue or
  similar with if_indexes and changing it would actually improve
  something.  Blah.

 no need to get upset.  mpi's change does improve things.  we want to
 make full use of if_index' initial design and use it as a reference
 to the interface in the mp network stack .  it has nothing to do with
 a badly designed protocol from the eighties.

 Oh, come on.  SNMP is as badly designed as many other things that
 we're using every day.

sure.

 Do you suggest to rm snmpd because the
 protocol is badly designed?


no, i don't.  i merely suggest that if_index should not be used if
persistent ifindex'es are required.



Re: enc interface errno

2013-09-27 Thread Mike Belopuhov
On 27 September 2013 15:24, Alexander Bluhm alexander.bl...@gmx.net wrote:
 Hi,

 The error return codes for the enc interface seem quite inconsistent.
 Always return the appropriate errno.

 ok?

 bluhm


OK



Re: 5.4 html Security Improvements section

2013-10-09 Thread Mike Belopuhov
On 9 October 2013 19:51, Alexey E. Suslikov alexey.susli...@gmail.com wrote:
 * Added AES-XTS support to aesni crypto(4) driver on amd64.
 Allows softraid(4) to benefit from the AES-NI instructions on
 newer Intel CPUs

not at the moment, though.



Re: pf dropping window updates and acks

2013-10-11 Thread Mike Belopuhov
On Fri, Oct 11, 2013 at 12:09 +0200, Gerhard Roth wrote:
 In January bluhm@ introduced 'data_end' to pf.c:tcp_track_full().
 Now this breaks the handling of non-data packets. They may be rejected
 because the SEQ_GEQ(src-seqhi, data_end) check fails.
 
 The patch below should fix this.
 

Makes sense to me.  OK mikeb

 Gerhard
 
 
 
 Index: sys/net/pf.c
 ===
 RCS file: /cvs/src/sys/net/pf.c,v
 retrieving revision 1.841
 diff -u -p -u -p -r1.841 pf.c
 --- sys/net/pf.c  9 Oct 2013 09:32:01 -   1.841
 +++ sys/net/pf.c  11 Oct 2013 09:57:20 -
 @@ -3940,7 +3940,7 @@ pf_tcp_track_full(struct pf_pdesc *pd, s
   if (seq == end) {
   /* Ease sequencing restrictions on no data packets */
   seq = src-seqlo;
 - end = seq;
 + data_end = end = seq;
   }
  
   ackskew = dst-seqlo - ack;
 



defer routing table updates on link state changes (again)

2013-10-19 Thread Mike Belopuhov
hi,

since mpi's if_index diff is now in, this should probably
go in as well.  it has received some testing in the meantime.

original description:

in order to make our life a bit easier and prevent rogue
accesses to the routing table from the hardware interrupt
context violating all kinds of spl assumptions we would
like if_link_state_change that is called by network device
drivers in their interrupt service routines to defer its
work to the process context or thereabouts.

please note that a token (an interface index) is passed
to the workq in order to make sure that if the interface
would be gone by the time syswq goes around to run the
task it would just fall through.

ok?

diff --git sys/net/if.c sys/net/if.c
index 6dafd0d..5b6800a 100644
--- sys/net/if.c
+++ sys/net/if.c
@@ -79,10 +79,11 @@
 #include sys/protosw.h
 #include sys/kernel.h
 #include sys/ioctl.h
 #include sys/domain.h
 #include sys/sysctl.h
+#include sys/workq.h
 
 #include net/if.h
 #include net/if_dl.h
 #include net/if_media.h
 #include net/if_types.h
@@ -151,10 +152,12 @@ int   if_clone_list(struct if_clonereq *);
 struct if_clone*if_clone_lookup(const char *, int *);
 
 void   if_congestion_clear(void *);
 intif_group_egress_build(void);
 
+void   if_link_state_change_task(void *, void *);
+
 intifai_cmp(struct ifaddr_item *,  struct ifaddr_item *);
 void   ifa_item_insert(struct sockaddr *, struct ifaddr *, struct ifnet *);
 void   ifa_item_remove(struct sockaddr *, struct ifaddr *, struct ifnet *);
 #ifndef SMALL_KERNEL
 void   ifa_print_rb(void);
@@ -1106,21 +1109,39 @@ if_up(struct ifnet *ifp)
 
m_clinitifp(ifp);
 }
 
 /*
- * Process a link state change.
- * NOTE: must be called at splsoftnet or equivalent.
+ * Schedule a link state change task.
  */
 void
 if_link_state_change(struct ifnet *ifp)
 {
-   rt_ifmsg(ifp);
+   /* try to put the routing table update task on syswq */
+   workq_add_task(NULL, 0, if_link_state_change_task,
+   (void *)((unsigned long)ifp-if_index), NULL);
+}
+
+/*
+ * Process a link state change.
+ */
+void
+if_link_state_change_task(void *arg, void *unused)
+{
+   unsigned int index = (unsigned long)arg;
+   struct ifnet *ifp;
+   int s;
+
+   s = splsoftnet();
+   if ((ifp = if_get(index)) != NULL) {
+   rt_ifmsg(ifp);
 #ifndef SMALL_KERNEL
-   rt_if_track(ifp);
+   rt_if_track(ifp);
 #endif
-   dohooks(ifp-if_linkstatehooks, 0);
+   dohooks(ifp-if_linkstatehooks, 0);
+   }
+   splx(s);
 }
 
 /*
  * Handle interface watchdog timer routines.  Called
  * from softclock, we decrement timers (if set) and



Re: Make bioctl(4) print cache policy

2013-10-22 Thread Mike Belopuhov
On 22 October 2013 15:22, Mark Kettenis mark.kette...@xs4all.nl wrote:
 Diff below makes bioctl(4) print the cache policy for that's currently
 in effect for RAID volumes.  It only prints the state (WB for
 write-back, WT for write-through) if the RAID controller driver fills
 in the details in response to a BIOCVOL ioctl.  This diff adds such
 support to mfi(4).

 # bioctl mfi0
 Volume  Status   Size Device
  mfi0 0 Online72746008576 sd0 RAID1 WB
   0 Online73407820800 1:0.0   noencl HITACHI HUS153073VLS300 
 A598
   1 Online73407820800 1:1.0   noencl HITACHI HUS153073VLS300 
 A598

 ok?



mpii specs do not use terms write-back or write-through, they rather
talk about write cache being enabled or disabled.  they obviously
mean write-back cache.  i believe for the bioctl user it's more
important to know if actual writes are deferred or not, so i would
rather say Write Cache Enabled/Disabled or Write Cache On/Off or
something similar rather then if it's WB or WT.  oterwise people will
have to look in the man page for what WB or WT stands for and then
search wikipedia for what does it mean..

just my 2 cents.



convert crypto queue to the task(9) api

2013-10-30 Thread Mike Belopuhov
Tested on amd64 SP and MP, i386 SP so far.  sparc64 MP test
is in progress.  I've also tested the crypto(4) interface
(doesn't use queue) so softraid should work as well.

ok?

diff --git sys/crypto/crypto.c sys/crypto/crypto.c
index 7df0c435..fbdcd97 100644
--- sys/crypto/crypto.c
+++ sys/crypto/crypto.c
@@ -34,11 +34,11 @@ struct cryptocap *crypto_drivers = NULL;
 int crypto_drivers_num = 0;
 
 struct pool cryptop_pool;
 struct pool cryptodesc_pool;
 
-struct workq *crypto_workq;
+struct taskq *crypto_taskq;
 
 /*
  * Create a new session.
  */
 int
@@ -414,26 +414,26 @@ crypto_dispatch(struct cryptop *crp)
hid = (crp-crp_sid  32)  0x;
if (hid  crypto_drivers_num)
crypto_drivers[hid].cc_queued++;
splx(s);
 
-   if (crypto_workq) {
-   workq_queue_task(crypto_workq, crp-crp_wqt, 0,
-   (workq_fn)crypto_invoke, crp, NULL);
+   if (crypto_taskq) {
+   task_set(crp-crp_task, (void (*))crypto_invoke, crp, NULL);
+   task_add(crypto_taskq, crp-crp_task);
} else {
crypto_invoke(crp);
}
 
return 0;
 }
 
 int
 crypto_kdispatch(struct cryptkop *krp)
 {
-   if (crypto_workq) {
-   workq_queue_task(crypto_workq, krp-krp_wqt, 0,
-   (workq_fn)crypto_kinvoke, krp, NULL);
+   if (crypto_taskq) {
+   task_set(krp-krp_task, (void (*))crypto_kinvoke, krp, NULL);
+   task_add(crypto_taskq, krp-krp_task);
} else {
crypto_kinvoke(krp);
}
 
return 0;
@@ -616,11 +616,11 @@ crypto_getreq(int num)
 }
 
 void
 crypto_init(void)
 {
-   crypto_workq = workq_create(crypto, 1, IPL_HIGH);
+   crypto_taskq = taskq_create(crypto, 1, IPL_HIGH);
 
pool_init(cryptop_pool, sizeof(struct cryptop), 0, 0,
0, cryptop, NULL);
pool_init(cryptodesc_pool, sizeof(struct cryptodesc), 0, 0,
0, cryptodesc, NULL);
@@ -635,23 +635,24 @@ crypto_done(struct cryptop *crp)
crp-crp_flags |= CRYPTO_F_DONE;
if (crp-crp_flags  CRYPTO_F_NOQUEUE) {
/* not from the crypto queue, wakeup the userland process */
crp-crp_callback(crp);
} else {
-   workq_queue_task(crypto_workq, crp-crp_wqt, 0,
-   (workq_fn)crp-crp_callback, crp, NULL);
+   task_set(crp-crp_task, (void (*))crp-crp_callback,
+   crp, NULL);
+   task_add(crypto_taskq, crp-crp_task);
}
 }
 
 /*
  * Invoke the callback on behalf of the driver.
  */
 void
 crypto_kdone(struct cryptkop *krp)
 {
-   workq_queue_task(crypto_workq, krp-krp_wqt, 0,
-   (workq_fn)krp-krp_callback, krp, NULL);
+   task_set(krp-krp_task, (void (*))krp-krp_callback, krp, NULL);
+   task_add(crypto_taskq, krp-krp_task);
 }
 
 int
 crypto_getfeat(int *featp)
 {
diff --git sys/crypto/cryptodev.h sys/crypto/cryptodev.h
index 4f87046..4f732f9 100644
--- sys/crypto/cryptodev.h
+++ sys/crypto/cryptodev.h
@@ -51,11 +51,11 @@
 
 #ifndef _CRYPTO_CRYPTO_H_
 #define _CRYPTO_CRYPTO_H_
 
 #include sys/ioccom.h
-#include sys/workq.h
+#include sys/task.h
 
 /* Some initial values */
 #define CRYPTO_DRIVERS_INITIAL 4
 #define CRYPTO_DRIVERS_MAX 128
 #define CRYPTO_SW_SESSIONS 32
@@ -158,11 +158,11 @@ struct cryptodesc {
struct cryptodesc *crd_next;
 };
 
 /* Structure describing complete operation */
 struct cryptop {
-   struct workq_task crp_wqt;
+   struct task crp_task;
 
u_int64_t   crp_sid;/* Session ID */
int crp_ilen;   /* Input data total length */
int crp_olen;   /* Result total length */
int crp_alloctype;  /* Type of buf to allocate if needed */
@@ -228,11 +228,11 @@ struct crypt_kop {
 #define CRF_DSA_SIGN   (1  CRK_DSA_SIGN)
 #define CRF_DSA_VERIFY (1  CRK_DSA_VERIFY)
 #define CRF_DH_COMPUTE_KEY (1  CRK_DH_COMPUTE_KEY)
 
 struct cryptkop {
-   struct workq_task krp_wqt;
+   struct task krp_task;
 
u_int   krp_op; /* ie. CRK_MOD_EXP or other */
u_int   krp_status; /* return status */
u_short krp_iparams;/* # of input parameters */
u_short krp_oparams;/* # of output parameters */



Re: convert crypto queue to the task(9) api

2013-10-30 Thread Mike Belopuhov
On Wed, Oct 30, 2013 at 14:58 +0100, Mike Belopuhov wrote:
 Tested on amd64 SP and MP, i386 SP so far.  sparc64 MP test
 is in progress.  I've also tested the crypto(4) interface
 (doesn't use queue) so softraid should work as well.
 

sparc64 test is done.

on a side note, that IPL_HIGH can be safely converted to
IPL_VM since crypto code itself uses splvm (which is also
rather pointless because hw crypto uses IPL_NET and mbuf
allocation code will bump IPL itself anyways).  i have
just tested IPL_VM taskq on the aforementioned machines.

 ok?
 
 diff --git sys/crypto/crypto.c sys/crypto/crypto.c
 index 7df0c435..fbdcd97 100644
 --- sys/crypto/crypto.c
 +++ sys/crypto/crypto.c
 @@ -34,11 +34,11 @@ struct cryptocap *crypto_drivers = NULL;
  int crypto_drivers_num = 0;
  
  struct pool cryptop_pool;
  struct pool cryptodesc_pool;
  
 -struct workq *crypto_workq;
 +struct taskq *crypto_taskq;
  
  /*
   * Create a new session.
   */
  int
 @@ -414,26 +414,26 @@ crypto_dispatch(struct cryptop *crp)
   hid = (crp-crp_sid  32)  0x;
   if (hid  crypto_drivers_num)
   crypto_drivers[hid].cc_queued++;
   splx(s);
  
 - if (crypto_workq) {
 - workq_queue_task(crypto_workq, crp-crp_wqt, 0,
 - (workq_fn)crypto_invoke, crp, NULL);
 + if (crypto_taskq) {
 + task_set(crp-crp_task, (void (*))crypto_invoke, crp, NULL);
 + task_add(crypto_taskq, crp-crp_task);
   } else {
   crypto_invoke(crp);
   }
  
   return 0;
  }
  
  int
  crypto_kdispatch(struct cryptkop *krp)
  {
 - if (crypto_workq) {
 - workq_queue_task(crypto_workq, krp-krp_wqt, 0,
 - (workq_fn)crypto_kinvoke, krp, NULL);
 + if (crypto_taskq) {
 + task_set(krp-krp_task, (void (*))crypto_kinvoke, krp, NULL);
 + task_add(crypto_taskq, krp-krp_task);
   } else {
   crypto_kinvoke(krp);
   }
  
   return 0;
 @@ -616,11 +616,11 @@ crypto_getreq(int num)
  }
  
  void
  crypto_init(void)
  {
 - crypto_workq = workq_create(crypto, 1, IPL_HIGH);
 + crypto_taskq = taskq_create(crypto, 1, IPL_HIGH);
  
   pool_init(cryptop_pool, sizeof(struct cryptop), 0, 0,
   0, cryptop, NULL);
   pool_init(cryptodesc_pool, sizeof(struct cryptodesc), 0, 0,
   0, cryptodesc, NULL);
 @@ -635,23 +635,24 @@ crypto_done(struct cryptop *crp)
   crp-crp_flags |= CRYPTO_F_DONE;
   if (crp-crp_flags  CRYPTO_F_NOQUEUE) {
   /* not from the crypto queue, wakeup the userland process */
   crp-crp_callback(crp);
   } else {
 - workq_queue_task(crypto_workq, crp-crp_wqt, 0,
 - (workq_fn)crp-crp_callback, crp, NULL);
 + task_set(crp-crp_task, (void (*))crp-crp_callback,
 + crp, NULL);
 + task_add(crypto_taskq, crp-crp_task);
   }
  }
  
  /*
   * Invoke the callback on behalf of the driver.
   */
  void
  crypto_kdone(struct cryptkop *krp)
  {
 - workq_queue_task(crypto_workq, krp-krp_wqt, 0,
 - (workq_fn)krp-krp_callback, krp, NULL);
 + task_set(krp-krp_task, (void (*))krp-krp_callback, krp, NULL);
 + task_add(crypto_taskq, krp-krp_task);
  }
  
  int
  crypto_getfeat(int *featp)
  {
 diff --git sys/crypto/cryptodev.h sys/crypto/cryptodev.h
 index 4f87046..4f732f9 100644
 --- sys/crypto/cryptodev.h
 +++ sys/crypto/cryptodev.h
 @@ -51,11 +51,11 @@
  
  #ifndef _CRYPTO_CRYPTO_H_
  #define _CRYPTO_CRYPTO_H_
  
  #include sys/ioccom.h
 -#include sys/workq.h
 +#include sys/task.h
  
  /* Some initial values */
  #define CRYPTO_DRIVERS_INITIAL   4
  #define CRYPTO_DRIVERS_MAX   128
  #define CRYPTO_SW_SESSIONS   32
 @@ -158,11 +158,11 @@ struct cryptodesc {
   struct cryptodesc *crd_next;
  };
  
  /* Structure describing complete operation */
  struct cryptop {
 - struct workq_task crp_wqt;
 + struct task crp_task;
  
   u_int64_t   crp_sid;/* Session ID */
   int crp_ilen;   /* Input data total length */
   int crp_olen;   /* Result total length */
   int crp_alloctype;  /* Type of buf to allocate if needed */
 @@ -228,11 +228,11 @@ struct crypt_kop {
  #define CRF_DSA_SIGN (1  CRK_DSA_SIGN)
  #define CRF_DSA_VERIFY   (1  CRK_DSA_VERIFY)
  #define CRF_DH_COMPUTE_KEY   (1  CRK_DH_COMPUTE_KEY)
  
  struct cryptkop {
 - struct workq_task krp_wqt;
 + struct task krp_task;
  
   u_int   krp_op; /* ie. CRK_MOD_EXP or other */
   u_int   krp_status; /* return status */
   u_short krp_iparams;/* # of input parameters */
   u_short krp_oparams;/* # of output parameters */



Re: IPv6 routing header type 0

2013-11-14 Thread Mike Belopuhov
On 14 November 2013 18:52, Henning Brauer lists-openbsdt...@bsws.de wrote:
 * Theo de Raadt dera...@cvs.openbsd.org [2013-11-14 18:47]:
  it is the status quo *right now*

 Look, you can't call something the status quo when a commit was made 1
 month ago, to a REAL status quo that existed for 10 years when itojun
 made the change...  and immediately after this recent commit we
 started arguying about the change.

 Go find out what status quo means.

 let's not get into this, leads us nowhere.

   It *was* being filtered, until a month ago.
  no news here - we had discussed that change at the time. pretty sure
  you were in the loop even.
 No, I was not in the loop before it was commited, and many others were
 not either.  Apparently only you and mikeb were, and you did NOT push
 for more people to know about the change before it was commited.

 mikeb? bluhm worked on  comitted that otoh.
 i'm still pretty damn sure you were Cc'd; won't dig for old mail just
 to prove it; don't see the point, doesn't change anything now anyway.


we have discussed that with bluhm in berlin and initially i had the same
opinion: leave the check in the stack, but he has convinced me that it's
rather pf's job to do it.  i'm not against bringing it back and his diff
looks fine to me, esp. since it avoids double check that was there before.

 The non-pf RH0 filtering case is worthwhile.

 and here we disagree.

 --
 Henning Brauer, h...@bsws.de, henn...@openbsd.org
 BS Web Services GmbH, http://bsws.de, Full-Service ISP
 Secure Hosting, Mail and DNS Services. Dedicated Servers, Root to Fully 
 Managed
 Henning Brauer Consulting, http://henningbrauer.com/




  1   2   3   4   5   6   7   8   9   10   >