Re: AVL tree

2011-05-20 Thread Ariane van der Steldt
On Fri, May 20, 2011 at 11:23:47AM +0200, Mike Belopuhov wrote:
> On Fri, May 20, 2011 at 4:04 AM, Ariane van der Steldt  
> wrote:
> > AVL trees have a difference of max 1 between the longest and shortest
> > path to a leaf, whereas RB-trees have at max the longest path be double
> > the length of the shortest path.
> > I.e. work case lookups require traversal of
> > ? ?ln(size)/ln(2) elements for AVL trees,
> > 2 * ln(size)/ln(2) elements for RB trees.
> >
> > However, RB trees are slightly more efficient with insertion/removal.
> >
> > The better balancing of AVL trees can make a big difference in lookup
> > time compared to RB trees, for trees containing millions of items.
> >
> >> how are you supposed to choose between them?
> >
> > Basically it's a trade off. Is lookup more important, or insert/remove?
> > And ofcourse, if you still don't know, just take what strikes your
> > fancy. :D
> >
> 
> i know about the difference of lookup and insert/remove speed, but
> is there a significant difference in practice (in openbsd use cases)?

I suspect that the PF state table is mostly lookups, therefor AVL would
improve performance there. That said without having looked at the code
though.

For UVMs allocators, the case is harder to pin, since they use a lot of
lookup and traversal (where AVL may benefit) while also requiring a lot
of modifications (where RB may be better). The choice is a lot harder to
pin down.

This is without taking cache coherency on multiple CPUs into account,
which may change the picture again (a write/rotate would lead to cache
invalidation on all other cpus).

> > But I'm mostly interested because I tend to use trees left, right and
> > center and those macros lead to code bloat. If I can use a tree without
> > macros, I'm happy.
> 
> so maybe it makes sense to have a non macro implementation in libkern
> and use it in uvm or whenever needed?

Or just in sys/kern.
-- 
Ariane



Re: Panic in sr_crypto_rw with kern.bufcachepercent >= 75

2011-05-20 Thread Marco Peereboom
On Fri, May 20, 2011 at 02:45:04PM +0200, Mike Belopuhov wrote:
> On Fri, May 20, 2011 at 06:24 -0600, David Coppa wrote:
> > Hi all,
> > 
> > OpenBSD-current snapshot dated 16-May-2011:
> > I get an always-reproducible panic with softraid crypto and
> > kern.bufcachepercent >= 75, when untarring a tarball of the
> > complete source tree.
> > 
> > The disk layout is the following, with softraid crypto for
> > all but /:
> > 
> > /dev/sd0a on / type ffs (local)
> > /dev/sd2f on /home type ffs (local, nodev, nosuid)
> > /dev/sd2e on /usr type ffs (local, nodev)
> > /dev/sd2d on /var type ffs (local, nodev, nosuid)
> > /dev/sd3i on /mnt type msdos (local)
> > 
> > Here's the trace:
> > 
> > 
> > panic: sr_crypto_rw: no crypto op
> 
> hi,
> 
> although i'm not sure that this is a best solution, i can't
> see why we should panic here.  sr_scsi_cmd seems to cope with
> stuffups just fine.

This is correct.  Please commit.

> 
> Index: dev/softraid_crypto.c
> ===
> RCS file: /home/cvs/src/sys/dev/softraid_crypto.c,v
> retrieving revision 1.65
> diff -u -p -r1.65 softraid_crypto.c
> --- dev/softraid_crypto.c 6 Apr 2011 03:14:51 -   1.65
> +++ dev/softraid_crypto.c 20 May 2011 12:42:12 -
> @@ -1115,7 +1115,7 @@ sr_crypto_rw(struct sr_workunit *wu)
>   if (wu->swu_xs->flags & SCSI_DATA_OUT) {
>   crp = sr_crypto_getcryptop(wu, 1);
>   if (crp == NULL)
> - panic("sr_crypto_rw: no crypto op");
> + return (1);
>   crp->crp_callback = sr_crypto_write;
>   crp->crp_opaque = wu;
>   s = splvm();



Re: Panic in sr_crypto_rw with kern.bufcachepercent >= 75

2011-05-20 Thread Marco Peereboom
Can I get the orignal diff one more time?

On Fri, May 20, 2011 at 04:51:02PM +0200, David Coppa wrote:
> On Fri, May 20, 2011 at 2:45 PM, Mike Belopuhov  wrote:
> > On Fri, May 20, 2011 at 06:24 -0600, David Coppa wrote:
> >> Hi all,
> >>
> >> OpenBSD-current snapshot dated 16-May-2011:
> >> I get an always-reproducible panic with softraid crypto and
> >> kern.bufcachepercent >= 75, when untarring a tarball of the
> >> complete source tree.
> >>
> >> The disk layout is the following, with softraid crypto for
> >> all but /:
> >>
> >> /dev/sd0a on / type ffs (local)
> >> /dev/sd2f on /home type ffs (local, nodev, nosuid)
> >> /dev/sd2e on /usr type ffs (local, nodev)
> >> /dev/sd2d on /var type ffs (local, nodev, nosuid)
> >> /dev/sd3i on /mnt type msdos (local)
> >>
> >> Here's the trace:
> >>
> >>
> >> panic: sr_crypto_rw: no crypto op
> >
> > hi,
> >
> > although i'm not sure that this is a best solution, i can't
> > see why we should panic here. ?sr_scsi_cmd seems to cope with
> > stuffups just fine.
> >
> > Index: dev/softraid_crypto.c
> > ===
> > RCS file: /home/cvs/src/sys/dev/softraid_crypto.c,v
> > retrieving revision 1.65
> > diff -u -p -r1.65 softraid_crypto.c
> > --- dev/softraid_crypto.c ? ? ? 6 Apr 2011 03:14:51 - ? ? ? 1.65
> > +++ dev/softraid_crypto.c ? ? ? 20 May 2011 12:42:12 -
> > @@ -1115,7 +1115,7 @@ sr_crypto_rw(struct sr_workunit *wu)
> > ? ? ? ?if (wu->swu_xs->flags & SCSI_DATA_OUT) {
> > ? ? ? ? ? ? ? ?crp = sr_crypto_getcryptop(wu, 1);
> > ? ? ? ? ? ? ? ?if (crp == NULL)
> > - ? ? ? ? ? ? ? ? ? ? ? panic("sr_crypto_rw: no crypto op");
> > + ? ? ? ? ? ? ? ? ? ? ? return (1);
> > ? ? ? ? ? ? ? ?crp->crp_callback = sr_crypto_write;
> > ? ? ? ? ? ? ? ?crp->crp_opaque = wu;
> > ? ? ? ? ? ? ? ?s = splvm();
> >
> 
> It now survives to several untarrings of src.tar with kern.bufcachepercent=90.
> 
> cheers,
> David



Re: Allow ipsecctl-like grouping in iked(8)

2011-05-20 Thread Vadim Zhukov
On 20 May 2011 c. 18:54:02 Reyk Floeter wrote:
> hi,
>
> On Thu, May 19, 2011 at 11:06:44PM +0400, Vadim Zhukov wrote:
> > This patch allows ipsecctl-like flow grouping along with current
> > behavior. It allows to write many-to-many policies in a more
> > compact way, see an example:
> >
> >   ikev2 esp \
> > from { 1.2.3.4, 5.6.7.8 } to { 3.4.5.6, 4.5.6.7} \
> > from 7.8.9.0 to { 0.1.2.3, 2.3.4.5 } \
> > ...
> >
> > will create six flows:
> >
> >   1.2.3.4 -> 3.4.5.6
> >   1.2.3.4 -> 4.5.6.7
> >   5.6.7.8 -> 3.4.5.6
> >   5.6.7.8 -> 4.5.6.7
> >   7.8.9.0 -> 0.1.2.3
> >   7.8.9.0 -> 2.3.4.5
> >
> > Please note that you're still limited by MAX_IMSGSIZE, which is
> > about 7 flows, depending on arch. This will be addressed in
> > another patch.
>
> The difference between ipsec.conf and iked.conf and IKEv1 and IKEv2 is
> that an IKEv2 SA can have multiple flows, a.k.a. traffic selectors, in
> a single exchange while IKEv1 can only have one.
>
> So in ipsec.conf the code would expand it into multiple independent
> ike policies that would lead to multiple independent IKEv1 exchanges
> and SAs.
>
> With IKEv2 this would expand into a single IKEv2 SA with many traffic
> selectors that would have to fit into a single IKEv2 UDP packet
> (IKE_AUTH) which can lead to IP fragmentation and all kinds of
> problems.  The current code does not prevent multiple traffic
> selectors as it is totally legal in IKEv2, but I would be careful with
> using macros to generate _many_ traffic selectors, or flows, in a
> policy.  Maybe this is just a documentation issue or you need some
> additional length checking here to avoid/warn about excessive
> fragmentation (considering the 64k IPv4 limit as well).

Thank you (again :) ) for reviewing the diffs.

Then we need to warn user on parsing/configuring if length of resulting
IKE_AUTH packet will grow more than, say, 1400 bytes, yes?

I thought about generating new policy(-ies) for each "{ host, ... }"
construction, but this will effectively disallow multiple selectors
specification in such rules. This will not cause any harm to existing
setups, but looks ugly, and definitely is not intuitive for IKEv2 user.
>From the other side, if iked will support IKEv1 fully in future,
creating multiple policies will be a consistent behavior. What do you
think? I just want such rules compacting functionality, and trust you a
lot more in choosing underlying mechanism. :)

--
  Best wishes,
Vadim Zhukov

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?



Re: Split iked(8) policy creation imsg

2011-05-20 Thread Vadim Zhukov
On 20 May 2011 c. 18:45:05 Reyk Floeter wrote:
> Hi!
>
> On Fri, May 20, 2011 at 03:54:03PM +0400, Vadim Zhukov wrote:
> > This patch splits off IMSG_CFG_POLICY into four messages:
> >
> >   IMSG_CFG_POLICY_BEGIN
> >   IMSG_CFG_POLICY_PROPOSAL
> >   IMSG_CFG_POLICY_FLOW
> >   IMSG_CFG_POLICY_COMMIT
> >
> > Each new policy should start with IMSG_CFG_POLICY_BEGIN, then
> > proceed any number of proposals and flows, and then commit policy. I
> > decided not to use ticket system because I see no race sources.
> >
> > Now we can have long, long policies in your iked.conf. Previously we
> > were limited by maximum imsg packet size (MAX_IMSGSIZE) which is 16
> > kilobytes currently.
>
> Thanks for you diff.  See comments below.  It is appreciated and I can
> adjust it but you can see my comments for modifications below.

Thank you for reviewing and commenting! I'll rework the diff soon and
send new version.

--
  Best wishes,
Vadim Zhukov

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?



Re: Allow ipsecctl-like grouping in iked(8)

2011-05-20 Thread Reyk Floeter
hi,

On Thu, May 19, 2011 at 11:06:44PM +0400, Vadim Zhukov wrote:
> This patch allows ipsecctl-like flow grouping along with current
> behavior. It allows to write many-to-many policies in a more
> compact way, see an example:
> 
>   ikev2 esp \
> from { 1.2.3.4, 5.6.7.8 } to { 3.4.5.6, 4.5.6.7} \
> from 7.8.9.0 to { 0.1.2.3, 2.3.4.5 } \
> ...
> 
> will create six flows:
> 
>   1.2.3.4 -> 3.4.5.6
>   1.2.3.4 -> 4.5.6.7
>   5.6.7.8 -> 3.4.5.6
>   5.6.7.8 -> 4.5.6.7
>   7.8.9.0 -> 0.1.2.3
>   7.8.9.0 -> 2.3.4.5
> 
> Please note that you're still limited by MAX_IMSGSIZE, which is
> about 7 flows, depending on arch. This will be addressed in
> another patch.
> 

The difference between ipsec.conf and iked.conf and IKEv1 and IKEv2 is
that an IKEv2 SA can have multiple flows, a.k.a. traffic selectors, in
a single exchange while IKEv1 can only have one.

So in ipsec.conf the code would expand it into multiple independent
ike policies that would lead to multiple independent IKEv1 exchanges
and SAs.

With IKEv2 this would expand into a single IKEv2 SA with many traffic
selectors that would have to fit into a single IKEv2 UDP packet
(IKE_AUTH) which can lead to IP fragmentation and all kinds of
problems.  The current code does not prevent multiple traffic
selectors as it is totally legal in IKEv2, but I would be careful with
using macros to generate _many_ traffic selectors, or flows, in a
policy.  Maybe this is just a documentation issue or you need some
additional length checking here to avoid/warn about excessive
fragmentation (considering the 64k IPv4 limit as well).

Reyk

> -- 
>   Best wishes,
> Vadim Zhukov
> 
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
> 
> 
> Index: iked.conf.5
> ===
> RCS file: /cvs/src/sbin/iked/iked.conf.5,v
> retrieving revision 1.13
> diff -u -p -r1.13 iked.conf.5
> --- iked.conf.5   21 Jan 2011 12:34:11 -  1.13
> +++ iked.conf.5   19 May 2011 17:18:54 -
> @@ -318,6 +318,29 @@ For a list of all port name to number ma
>  .Xr ipsecctl 8 ,
>  see the file
>  .Pa /etc/services .
> +.Pp
> +Traffic selector can contain more than one address for both
> +.Ic from
> +and
> +.Ic to
> +clauses, for example:
> +.Bd -literal -offset indent
> +ikev2 esp \e
> + from { 172.16.0.0/24, 172.16.4.0/24, 172.16.9.0/24 } \e
> + to   { 192.168.0.0/20, 10.0.0.0/8 } \e
> + peer 1.2.3.4 local 5.6.7.8
> +.Ed
> +.Pp
> +In this case, six actual selectors will be created, from each
> +address in
> +.Ic from
> +part to each address in
> +.Ic to
> +part.
> +Commas between addresses are optional.
> +.Pp
> +You can combine multiple addresses and multiple traffic selectors
> +features in one statement.
>  .It Ic local Ar localip Ic peer Ar remote
>  The
>  .Ic local
> @@ -821,9 +844,13 @@ and any other connection will be matched
>  .Sq catch all
>  policy.
>  .Bd -literal -offset indent
> -ikev2 quick esp from 10.10.10.0/24 to 10.20.20.0/24 \e
> +ikev2 quick esp \e
> + from { 10.10.10.0/24, 10.11.10.0/24 } \e
> + to 10.20.20.0/24 \e
>   peer 192.168.1.34
> -ikev2 "catch all" esp from 10.0.1.0/24 to 10.0.2.0/24 \e
> +ikev2 "catch all" esp \e
> + from 10.0.1.0/24 to 10.0.2.0/24 \e
> + from 10.0.31.0/24 to 10.0.35.0/24 \e
>   peer any
>  ikev2 "subnet" esp from 10.0.3.0/24 to 10.0.4.0/24 \e
>   peer 192.168.1.0/24
> Index: parse.y
> ===
> RCS file: /cvs/src/sbin/iked/parse.y,v
> retrieving revision 1.21
> diff -u -p -r1.21 parse.y
> --- parse.y   18 Apr 2011 08:45:43 -  1.21
> +++ parse.y   19 May 2011 17:18:54 -
> @@ -258,6 +258,8 @@ struct ipsec_hosts {
>   struct ipsec_addr_wrap  *dst;
>   u_int16_tsport;
>   u_int16_tdport;
> + struct ipsec_hosts  *next;
> + struct ipsec_hosts  *tail;
>  };
>  
>  struct ipsec_filters {
> @@ -293,6 +295,8 @@ intget_id_type(char *);
>  u_int8_t  x2i(unsigned char *);
>  int   parsekey(unsigned char *, size_t, struct iked_auth *);
>  int   parsekeyfile(char *, struct iked_auth *);
> +struct ipsec_hosts   *expand_hosts(struct ipsec_addr_wrap *, u_int16_t,
> +  struct ipsec_addr_wrap *, u_int16_t);
>  
>  struct ipsec_transforms *ipsec_transforms;
>  struct ipsec_filters *ipsec_filters;
> @@ -345,7 +349,7 @@ typedef struct {
>  %type  portval af
>  %type   peers
>  %type anyhost
> -%typehost host_spec
> +%typehost host_list host_spec
>  %type ids
>  %type  id
>  %type  transforms
> @@ -490,51 +494,20 @@ hosts_list  : hosts

Re: Panic in sr_crypto_rw with kern.bufcachepercent >= 75

2011-05-20 Thread David Coppa
On Fri, May 20, 2011 at 2:45 PM, Mike Belopuhov  wrote:
> On Fri, May 20, 2011 at 06:24 -0600, David Coppa wrote:
>> Hi all,
>>
>> OpenBSD-current snapshot dated 16-May-2011:
>> I get an always-reproducible panic with softraid crypto and
>> kern.bufcachepercent >= 75, when untarring a tarball of the
>> complete source tree.
>>
>> The disk layout is the following, with softraid crypto for
>> all but /:
>>
>> /dev/sd0a on / type ffs (local)
>> /dev/sd2f on /home type ffs (local, nodev, nosuid)
>> /dev/sd2e on /usr type ffs (local, nodev)
>> /dev/sd2d on /var type ffs (local, nodev, nosuid)
>> /dev/sd3i on /mnt type msdos (local)
>>
>> Here's the trace:
>>
>>
>> panic: sr_crypto_rw: no crypto op
>
> hi,
>
> although i'm not sure that this is a best solution, i can't
> see why we should panic here.  sr_scsi_cmd seems to cope with
> stuffups just fine.
>
> Index: dev/softraid_crypto.c
> ===
> RCS file: /home/cvs/src/sys/dev/softraid_crypto.c,v
> retrieving revision 1.65
> diff -u -p -r1.65 softraid_crypto.c
> --- dev/softraid_crypto.c   6 Apr 2011 03:14:51 -   1.65
> +++ dev/softraid_crypto.c   20 May 2011 12:42:12 -
> @@ -1115,7 +1115,7 @@ sr_crypto_rw(struct sr_workunit *wu)
>if (wu->swu_xs->flags & SCSI_DATA_OUT) {
>crp = sr_crypto_getcryptop(wu, 1);
>if (crp == NULL)
> -   panic("sr_crypto_rw: no crypto op");
> +   return (1);
>crp->crp_callback = sr_crypto_write;
>crp->crp_opaque = wu;
>s = splvm();
>

It now survives to several untarrings of src.tar with
kern.bufcachepercent=90.

cheers,
David



Re: Split iked(8) policy creation imsg

2011-05-20 Thread Reyk Floeter
Hi!

On Fri, May 20, 2011 at 03:54:03PM +0400, Vadim Zhukov wrote:
> This patch splits off IMSG_CFG_POLICY into four messages:
> 
>   IMSG_CFG_POLICY_BEGIN
>   IMSG_CFG_POLICY_PROPOSAL
>   IMSG_CFG_POLICY_FLOW
>   IMSG_CFG_POLICY_COMMIT
> 
> Each new policy should start with IMSG_CFG_POLICY_BEGIN, then proceed
> any number of proposals and flows, and then commit policy. I decided
> not to use ticket system because I see no race sources.
> 
> Now we can have long, long policies in your iked.conf. Previously we were
> limited by maximum imsg packet size (MAX_IMSGSIZE) which is 16 kilobytes
> currently.
> 

Thanks for you diff.  See comments below.  It is appreciated and I can
adjust it but you can see my comments for modifications below.

reyk

> Index: ikev2.c
> ===
> RCS file: /cvs/src/sbin/iked/ikev2.c,v
> retrieving revision 1.55
> diff -u -p -r1.55 ikev2.c
> --- ikev2.c   9 May 2011 11:15:18 -   1.55
> +++ ikev2.c   20 May 2011 09:48:06 -
> @@ -133,8 +133,14 @@ ikev2_dispatch_parent(int fd, struct pri
>   return (config_getsocket(env, imsg, ikev2_msg_cb));
>   case IMSG_PFKEY_SOCKET:
>   return (config_getpfkey(env, imsg));
> - case IMSG_CFG_POLICY:
> - return (config_getpolicy(env, imsg));
> + case IMSG_CFG_POLICY_BEGIN:
> + return (config_getpolicy_begin(env, imsg));
> + case IMSG_CFG_POLICY_PROPOSAL:
> + return (config_getpolicy_proposal(env, imsg));
> + case IMSG_CFG_POLICY_FLOW:
> + return (config_getpolicy_flow(env, imsg));
> + case IMSG_CFG_POLICY_COMMIT:
> + return (config_getpolicy_commit(env, imsg));

The naming should be simplified just to

IMSG_CFG_POLICY
IMSG_CFG_PROPOSAL
IMSG_CFG_FLOW

same for the functions

config_getpolicy
config_getproposal
config_getflow

The commit step is not required because the existing IMSG_COMPILE
finishes the policy configuration ones, calculates the skip steps and
starts operation.

See more comments below. There are some hints to make it fully async
below (not reviewing the code in detail).

>   case IMSG_CFG_USER:
>   return (config_getuser(env, imsg));
>   case IMSG_COMPILE:
> Index: config.c
> ===
> RCS file: /cvs/src/sbin/iked/config.c,v
> retrieving revision 1.12
> diff -u -p -r1.12 config.c
> --- config.c  9 May 2011 11:15:18 -   1.12
> +++ config.c  20 May 2011 09:48:06 -
> @@ -45,6 +45,10 @@
>  #include "iked.h"
>  #include "ikev2.h"
>  
> +
> +static struct iked_policy *newpol = NULL;
> +
> +

Please don't use a global variable here.  Just add the policy to the
global list in config_getpolicy() and reference to the policy by id
in the config_getproposal() and config_getflow() calls.

>  struct iked_sa *
>  config_new_sa(struct iked *env, int initiator)
>  {
> @@ -584,122 +588,167 @@ config_setpolicy(struct iked *env, struc
>  {
>   struct iked_proposal*prop;
>   struct iked_flow*flow;
> - struct iked_transform   *xform;
> - size_t   size, iovcnt, j, c = 0;
>   struct iovec iov[IOV_MAX];
> + unsigned int i;
>  
> - iovcnt = 1;
> - size = sizeof(*pol);
> - TAILQ_FOREACH(prop, &pol->pol_proposals, prop_entry) {
> - size += (prop->prop_nxforms * sizeof(*xform)) +
> - (sizeof(*prop));
> - iovcnt += prop->prop_nxforms + 1;
> + if (env->sc_opts & IKED_OPT_NOACTION) {
> + print_policy(pol);
> + return (0);
>   }
>  
> - size += pol->pol_nflows * sizeof(*flow);
> - iovcnt += pol->pol_nflows;
> -
> - if (iovcnt > IOV_MAX) {
> - log_warn("%s: too many proposals/flows", __func__);
> + if (proc_compose_imsg(env, id, IMSG_CFG_POLICY_BEGIN, -1,
> + pol, sizeof(*pol)) == -1)
>   return (-1);
> - }
> -
> - iov[c].iov_base = pol;
> - iov[c++].iov_len = sizeof(*pol);
>  
>   TAILQ_FOREACH(prop, &pol->pol_proposals, prop_entry) {
> - iov[c].iov_base = prop;
> - iov[c++].iov_len = sizeof(*prop);
> -
> - for (j = 0; j < prop->prop_nxforms; j++) {
> - xform = prop->prop_xforms + j;
> + iov[0].iov_base = prop;
> + iov[0].iov_len = sizeof(*prop);
>  
> - iov[c].iov_base = xform;
> - iov[c++].iov_len = sizeof(*xform);
> + for (i = 1; i <= prop->prop_nxforms; i++) {
> + if (i >= IOV_MAX) {
> + log_warn("%s: too many proposals", __func__);
> + return (-1);
> + }
> + iov[i].iov_base = prop->prop_xforms + (i - 1);
> + iov[i].iov_len = sizeof(struct iked_transform);
>   }
> +
> + if (proc_composev_im

Re: Panic in sr_crypto_rw with kern.bufcachepercent >= 75

2011-05-20 Thread Mike Belopuhov
On Fri, May 20, 2011 at 06:24 -0600, David Coppa wrote:
> Hi all,
> 
> OpenBSD-current snapshot dated 16-May-2011:
> I get an always-reproducible panic with softraid crypto and
> kern.bufcachepercent >= 75, when untarring a tarball of the
> complete source tree.
> 
> The disk layout is the following, with softraid crypto for
> all but /:
> 
> /dev/sd0a on / type ffs (local)
> /dev/sd2f on /home type ffs (local, nodev, nosuid)
> /dev/sd2e on /usr type ffs (local, nodev)
> /dev/sd2d on /var type ffs (local, nodev, nosuid)
> /dev/sd3i on /mnt type msdos (local)
> 
> Here's the trace:
> 
> 
> panic: sr_crypto_rw: no crypto op

hi,

although i'm not sure that this is a best solution, i can't
see why we should panic here.  sr_scsi_cmd seems to cope with
stuffups just fine.

Index: dev/softraid_crypto.c
===
RCS file: /home/cvs/src/sys/dev/softraid_crypto.c,v
retrieving revision 1.65
diff -u -p -r1.65 softraid_crypto.c
--- dev/softraid_crypto.c   6 Apr 2011 03:14:51 -   1.65
+++ dev/softraid_crypto.c   20 May 2011 12:42:12 -
@@ -1115,7 +1115,7 @@ sr_crypto_rw(struct sr_workunit *wu)
if (wu->swu_xs->flags & SCSI_DATA_OUT) {
crp = sr_crypto_getcryptop(wu, 1);
if (crp == NULL)
-   panic("sr_crypto_rw: no crypto op");
+   return (1);
crp->crp_callback = sr_crypto_write;
crp->crp_opaque = wu;
s = splvm();



Panic in sr_crypto_rw with kern.bufcachepercent >= 75

2011-05-20 Thread David Coppa
Hi all,

OpenBSD-current snapshot dated 16-May-2011:
I get an always-reproducible panic with softraid crypto and
kern.bufcachepercent >= 75, when untarring a tarball of the
complete source tree.

The disk layout is the following, with softraid crypto for
all but /:

/dev/sd0a on / type ffs (local)
/dev/sd2f on /home type ffs (local, nodev, nosuid)
/dev/sd2e on /usr type ffs (local, nodev)
/dev/sd2d on /var type ffs (local, nodev, nosuid)
/dev/sd3i on /mnt type msdos (local)

Here's the trace:


panic: sr_crypto_rw: no crypto op
Stopped at Debugger+0x4: popl %ebp
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
IF RUNNING SMP, USE 'mach ddbcpu <#>' AND 'trace' ON OTHER PROCESSORS, TOO.
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb{0}> trace
Debugger(d08e41c2,dc777d78,d08b9e8f,dc777d78,0) at Debugger+0x4
panic(d08b9e8f,1,0,d1fcf488,d2016000) at panic+0x5d
sr_crypto_rw(d1fcf488,0,74,d03e7cfa,d1fa7240) at sr_crypto_rw+0x93
sr_scsi_cmd(d6c75178,0,478b7e0,0,20) at sr_scsi_cmd+0x17f
sdstart(d6c75178,d1fcf488,d2016c40,d1fd2c00,d1fd2c84) at sdstart+0x1b4
scsi_ioh_runqueue(d2016c2c,d1fd2c84,d6793b90,dc777e5c,d1fa7000) at 
scsi_ioh_runqueue+0x59
scsi_xsh_runqueue(d1fd2c00,d6793b90,0,d0578683,20) at scsi_xsh_runqueue+0x11d
sdstrategy(d6793b90,4a007000,0,fb530b2,0) at sdstrategy+0x175
spec_strategy(dc777ec8,d08c4f07,dc777eec,1001b4,d6793b90) at spec_strategy+0x3d
VOP_STRATEGY(d6793b90,1,4,d6793b90,d6849d7c) at VOP_STRATEGY+0x2c
bwrite(d6793b90,d08c4323,dc777f2c,d041964e,d6793b90) at bwrite+0xcc
VOP_BWRITE(d6793b90,d6df027c,dc777f5c,0,d6dd41dc) at VOP_BWRITE+0x29
ffs_sync(dc777f4c,0,d6dd41dc,d6e53000,3) at ffs_sync+0x126
VOP_FSYNC(d6dd41dc,d6e53000,3,d6df027c,d03d2f23) at VOP_FSYNC+0x35
sched_sync(d6df027c) at sched_sync+0xaf
Bad frame pointer: 0xd0baae88
ddb{0}> mach ddbcpu 1
Stopped at Debugger+0x4: popl %ebp
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
IF RUNNING SMP, USE 'mach ddbcpu <#>' AND 'trace' ON OTHER PROCESSORS, TOO.
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb{1}> trace
Debugger(d1e9bc00,d083cb6c,0,296,d1e9bc34) at Debugger+0x4
i386_ipi_handler(b0,d03e0020,d09e,d1e90010,df3b0010) at 
i386_ipi_handler+0x5f
Xintripi() at Xintripi+0x49
--- interrupt ---
__mp_lock(d0a35ce4,7fff,dc751f18,d0202630,0) at __mp_lock+0x52
i386_softintlock(0,d03e0020,d0a3,10,d1e90010) at i386_softintlock+0x12
Xintrltimer() at Xintrltimer+0x50
--- interrupt ---
cpu_idle_cycle(d1e9bc00) at cpu_idle_cycle+0xf
Bad frame pointer: 0xd0baae48
ddb{1}> ps
   PID   PPID   PGRP   UID   S   FLAGS   WAITCOMMAND
  1177  10944   1177  1000   2   0x200   tar
 10944  1  10944  1000   3   0x280   pause   ksh
 32405  1  32405 0   3   0x280   ttyin   getty
 22382  1  22382 0   3   0x280   ttyin   getty
 16249  1  16249 0   3   0x280   ttyin   getty
  4339  1   4339 0   3   0x280   ttyin   getty
 27948  1  27948 0   3   0x280   select  cron
 22709  1  22709 0   3   0x280   kqread  apmd
  3666  1   366699   3   0x280   pollaucat
 27290  1  27290 0   3   0x280   select  sshd
 28487   1837   183774   3   0x280   bpf pflogd
  1837  1   1837 0   3   0x280   netio   pflogd
 14290  22465  2246573   3   0x280   pollsyslogd
 22465  1  22465 0   3   0x288   netio   syslogd
 11486  0  0 0   3   0x2100200   bored   srdis
16  0  0 0   3   0x2100200   aiodonedaiodoned
*   15  0  0 0   7   0x2100200   update
14  0  0 0   3   0x2100200   cleaner cleaner
13  0  0 0   30x100200   reaper  reaper
12  0  0 0   2   0x2100200   pagedaemon
11  0  0 0   2   0x2100200   crypto
10  0  0 0   3   0x2100200   pftmpfpurge
 9  0  0 0   3   0x2100200   usbtsk  usbtask
 8  0  0 0   3   0x2100200   usbatsk usbatsk
 7  0  0 0   3   0x2100200   bored   intelrel
 6  0  0 0   3   0x2100200   acpi0   acpi0
 5  0  0 0   7  0x40100200   idle1
 4  0  0 0   3   0x2100200   bored   syswq
 3  0  0 0   3  0x40100200   idle0
 2  0  0 0   3   0x2100200   kmalloc kmthread
 1  0  1 0   3   0x280   waitinit
 0 -1  0 0   3   0x2080200   scheduler   swapper
ddb{1}> boot reboot
softraid0: could not allocate metadata scratch area
splassert: assertwaitok: want -1 have 1


With kern.bufcachepercent=70 the system seems stable
(tested by untarring src.tar a lot of times).

dmesg below...

Cheers!
David


OpenBSD 4.9-current (GENERIC.MP) #113: Mon May 16 11:52:0

Split iked(8) policy creation imsg

2011-05-20 Thread Vadim Zhukov
Hello all.

This patch splits off IMSG_CFG_POLICY into four messages:

  IMSG_CFG_POLICY_BEGIN
  IMSG_CFG_POLICY_PROPOSAL
  IMSG_CFG_POLICY_FLOW
  IMSG_CFG_POLICY_COMMIT

Each new policy should start with IMSG_CFG_POLICY_BEGIN, then proceed
any number of proposals and flows, and then commit policy. I decided
not to use ticket system because I see no race sources.

Now we can have long, long policies in your iked.conf. Previously we were
limited by maximum imsg packet size (MAX_IMSGSIZE) which is 16 kilobytes
currently.

This patch complements previous one:
http://marc.info/?l=openbsd-tech&m=130583217730436&w=2
Those patches are totally independent and may be applied in any order and
combination.

Both patches were tested on i386.

-- 
  Best wishes,
Vadim Zhukov

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?


Index: ikev2.c
===
RCS file: /cvs/src/sbin/iked/ikev2.c,v
retrieving revision 1.55
diff -u -p -r1.55 ikev2.c
--- ikev2.c 9 May 2011 11:15:18 -   1.55
+++ ikev2.c 20 May 2011 09:48:06 -
@@ -133,8 +133,14 @@ ikev2_dispatch_parent(int fd, struct pri
return (config_getsocket(env, imsg, ikev2_msg_cb));
case IMSG_PFKEY_SOCKET:
return (config_getpfkey(env, imsg));
-   case IMSG_CFG_POLICY:
-   return (config_getpolicy(env, imsg));
+   case IMSG_CFG_POLICY_BEGIN:
+   return (config_getpolicy_begin(env, imsg));
+   case IMSG_CFG_POLICY_PROPOSAL:
+   return (config_getpolicy_proposal(env, imsg));
+   case IMSG_CFG_POLICY_FLOW:
+   return (config_getpolicy_flow(env, imsg));
+   case IMSG_CFG_POLICY_COMMIT:
+   return (config_getpolicy_commit(env, imsg));
case IMSG_CFG_USER:
return (config_getuser(env, imsg));
case IMSG_COMPILE:
Index: config.c
===
RCS file: /cvs/src/sbin/iked/config.c,v
retrieving revision 1.12
diff -u -p -r1.12 config.c
--- config.c9 May 2011 11:15:18 -   1.12
+++ config.c20 May 2011 09:48:06 -
@@ -45,6 +45,10 @@
 #include "iked.h"
 #include "ikev2.h"
 
+
+static struct iked_policy *newpol = NULL;
+
+
 struct iked_sa *
 config_new_sa(struct iked *env, int initiator)
 {
@@ -584,122 +588,167 @@ config_setpolicy(struct iked *env, struc
 {
struct iked_proposal*prop;
struct iked_flow*flow;
-   struct iked_transform   *xform;
-   size_t   size, iovcnt, j, c = 0;
struct iovec iov[IOV_MAX];
+   unsigned int i;
 
-   iovcnt = 1;
-   size = sizeof(*pol);
-   TAILQ_FOREACH(prop, &pol->pol_proposals, prop_entry) {
-   size += (prop->prop_nxforms * sizeof(*xform)) +
-   (sizeof(*prop));
-   iovcnt += prop->prop_nxforms + 1;
+   if (env->sc_opts & IKED_OPT_NOACTION) {
+   print_policy(pol);
+   return (0);
}
 
-   size += pol->pol_nflows * sizeof(*flow);
-   iovcnt += pol->pol_nflows;
-
-   if (iovcnt > IOV_MAX) {
-   log_warn("%s: too many proposals/flows", __func__);
+   if (proc_compose_imsg(env, id, IMSG_CFG_POLICY_BEGIN, -1,
+   pol, sizeof(*pol)) == -1)
return (-1);
-   }
-
-   iov[c].iov_base = pol;
-   iov[c++].iov_len = sizeof(*pol);
 
TAILQ_FOREACH(prop, &pol->pol_proposals, prop_entry) {
-   iov[c].iov_base = prop;
-   iov[c++].iov_len = sizeof(*prop);
-
-   for (j = 0; j < prop->prop_nxforms; j++) {
-   xform = prop->prop_xforms + j;
+   iov[0].iov_base = prop;
+   iov[0].iov_len = sizeof(*prop);
 
-   iov[c].iov_base = xform;
-   iov[c++].iov_len = sizeof(*xform);
+   for (i = 1; i <= prop->prop_nxforms; i++) {
+   if (i >= IOV_MAX) {
+   log_warn("%s: too many proposals", __func__);
+   return (-1);
+   }
+   iov[i].iov_base = prop->prop_xforms + (i - 1);
+   iov[i].iov_len = sizeof(struct iked_transform);
}
+
+   if (proc_composev_imsg(env, id, IMSG_CFG_POLICY_PROPOSAL, -1,
+   iov, i) == -1)
+   return (-1);
}
 
RB_FOREACH(flow, iked_flows, &pol->pol_flows) {
-   iov[c].iov_base = flow;
-   iov[c++].iov_len = sizeof(*flow);
+   if (proc_compose_imsg(env, id, IMSG_CFG_POLICY_FLOW, -1,
+   flow, sizeof(struct iked_flow)) == -1)
+   return (-1);
}
 
-   if (env->sc_opts & IKED_OPT_NOACT

Re: set skip on

2011-05-20 Thread Henning Brauer
* Stuart Henderson  [2011-05-19 11:50]:
> On 2011/05/19 11:26, Claudio Jeker wrote:
> > There is a bigger problem with 'set skip on lo', it is only evaluated
> > during load. So if you create a lo1 afterwards the set skip will not
> > trigger. This is very annoying especially with qemu and tun interfaces.
> 
> Right, I noticed this during testing, and this at least deserves a
> mention (independent of my other diff).
> 
> Changing this behaviour could be a problem though, I think it would
> need to be checked before state lookup, and we don't want to walk the
> groups of all interfaces on the system for every packet.

nah. we get calls from the interface subsystem when interfaces show up
or go. just a few lines of code missing to deal with skip.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting



Re: set skip on

2011-05-20 Thread Henning Brauer
* Reyk Floeter  [2011-05-19 11:47]:
> On Thu, May 19, 2011 at 11:26:59AM +0200, Claudio Jeker wrote:
> > To be honest I'm not sure who will do a 'set skip on sis' or
> > 'set skip on em'.
> 
> I would ;-)
> 
> Sometimes you have machines with different types of physical
> interfaces where one type is used for internal stuff like a dedicated
> pfsync or management link, eg. 
> 
> set skip on em
> block in on ix0
> 
> ...or soekrises with vr(4) and a quad sis(4) in one box.

you are abusing a coincidence, in this case, pretty much.

but really, you can just put a "group vr" in hostname.vr* if you want
this. or use a more descriptive group name to begin with :)

> > I think the very important bit is this:
> > > Hmmm, looking further, it seems ordinary rules only match on the
> > > interface name or group as well (in pfi_kif_match()), so maybe
> > > you're just plain right after all. :-)
> > set skip is currently special and works in a not so expected way so it is
> > better to make it work like all other users of interface names and people
> > needing 'set skip on em' should add a 'group em' line to their
> > hostname.em* files.
> "ifconfig em" also works, so i think it would be less special and
> confusing if "set skip on em" would just work without extra config
> magic.

I disagree.

the ifconfig em is a backwards compat hack that doesn't make much
sense at all.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting



Re: set skip on

2011-05-20 Thread Henning Brauer
* Claudio Jeker  [2011-05-19 13:20]:
> On Thu, May 19, 2011 at 10:49:59AM +0100, Stuart Henderson wrote:
> > On 2011/05/19 11:26, Claudio Jeker wrote:
> > > There is a bigger problem with 'set skip on lo', it is only evaluated
> > > during load. So if you create a lo1 afterwards the set skip will not
> > > trigger. This is very annoying especially with qemu and tun interfaces.
> > 
> > Right, I noticed this during testing, and this at least deserves a
> > mention (independent of my other diff).
> > 
> > Changing this behaviour could be a problem though, I think it would
> > need to be checked before state lookup, and we don't want to walk the
> > groups of all interfaces on the system for every packet.
> > 
> 
> Since the PFI_IFLAG_SKIP is set on a kif it should be possible to
> inherit the flag from a group when an interface is created.
> This would have no impact on runtime execution time.
> 
> I would really like to have set skip pick up interfaces at runtime because
> I get constantly burned by it.

again, spot on.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting



Re: set skip on

2011-05-20 Thread Henning Brauer
* Stuart Henderson  [2011-05-19 11:21]:
> > Note that the default ruleset does include a 'set skip on lo' but
> > that's fine since lo* interfaces are by default added to the "lo"
> > group.
> >
> > If people get bitten by this change, they could either add
> > an interface-name-matching group to each interface or we do that
> > automatically, as is done for vlan's, lo's etc.
> 
> What does anyone else think about this (making em0 a member
> of em automatically, etc.)? I don't think it's really all that
> important, there are usually better grouping criteria than
> "what driver supports the device" and people have to change
> their ruleset this release anyway.

we have specifically decided to NOT do this. what prupose does a group
of all interface of a certain driver have? I see zero for teh regular
drivers. the ones where it makes sense are the clonables like ppp and
for those we have automatic base class group, i. e. all tun interfaces
end up in the tun group by default.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting



Re: set skip on

2011-05-20 Thread Henning Brauer
* Claudio Jeker  [2011-05-19 11:29]:
> There is a bigger problem with 'set skip on lo', it is only evaluated
> during load. So if you create a lo1 afterwards the set skip will not
> trigger. This is very annoying especially with qemu and tun interfaces.
> 
> To be honest I'm not sure who will do a 'set skip on sis' or
> 'set skip on em'. Normaly you want to filter on your physical interfaces
> and not just skip them all. For pseudo-devices like lo, tun, vlan, etc. a
> group is created automatically.
> 
> I think the very important bit is this:
> > Hmmm, looking further, it seems ordinary rules only match on the
> > interface name or group as well (in pfi_kif_match()), so maybe
> > you're just plain right after all. :-)
> 
> set skip is currently special and works in a not so expected way so it is
> better to make it work like all other users of interface names and people
> needing 'set skip on em' should add a 'group em' line to their
> hostname.em* files.

spot on.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting



Re: set skip on

2011-05-20 Thread Henning Brauer
* Alexander Hall  [2011-05-19 10:25]:
> On 05/18/11 23:31, Stuart Henderson wrote:
> > "set skip" in PF has a slightly unexpected behaviour; rather
> > than skipping by interface group, it matches on the non-numeric
> > part of an interface name.
> 
> I think the prefix match test is a common behaviour so I think you
> should keep that.

no, that is just leftover.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting



rdr-to ::1

2011-05-20 Thread Camiel Dobbelaar
inet6 pf rules that "rdr-to ::1" do not work currently.  Matching
packets just disappear and the counter "packets that violated scope
rules" from a "netstat -s -p ip6" gets incremented.

It came up before on misc@:
http://marc.info/?t=12680425912&r=1&w=2

The attached diff removes the check (let's call it check #1) that drops
the packet.  This is just to point out where the problem is because the
BSD's have diverged here:

FreeBSD _replaced_ check #1 with another check (#2) in this diff from
2004 (from kame):
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet6/ip6_input.c.diff?r1=1.68;r2=1.69

NetBSD _replaced_ check #1 it with something totally different
(attributed to jinmei@kame)
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/ip6_input.c?rev=1.81&content-type=text/x-cvsweb-markup

Check #2 was _added_ to OpenBSD in 2006 (attributed to jinmei@kame):
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/netinet6/ip6_input.c.diff?r1=1.72;r2=1.73

Basically, check #1 is gone in FreeBSD and NetBSD and the diff syncs us
closer to FreeBSD.

I'm unsure if it's the right thing to do though for a few reasons. The
FreeBSD diff has a second part that I cannot yet tell is related or not.
 Or maybe the NetBSD diff could be better.  And OpenBSD also seems to
have other checks in this area.

I'll spend some more time on this, but maybe there's an IPv6 guru that
can lend a hand?  :-)

--
Cam




Index: ip6_input.c
===
RCS file: /cvs/src/sys/netinet6/ip6_input.c,v
retrieving revision 1.99
diff -u -r1.99 ip6_input.c
--- ip6_input.c 3 Apr 2011 13:56:05 -   1.99
+++ ip6_input.c 20 May 2011 09:30:14 -
@@ -270,7 +270,6 @@
in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_addrerr);
goto bad;
}
-
if (IN6_IS_ADDR_MC_INTFACELOCAL(&ip6->ip6_dst) &&
!(m->m_flags & M_LOOP)) {
/*
@@ -340,19 +339,6 @@
ip6 = mtod(m, struct ip6_hdr *);
srcrt = !IN6_ARE_ADDR_EQUAL(&odst, &ip6->ip6_dst);
 #endif
-
-   if (IN6_IS_ADDR_LOOPBACK(&ip6->ip6_src) ||
-   IN6_IS_ADDR_LOOPBACK(&ip6->ip6_dst)) {
-   if (m->m_pkthdr.rcvif->if_flags & IFF_LOOPBACK) {
-   ours = 1;
-   deliverifp = m->m_pkthdr.rcvif;
-   goto hbhcheck;
-   } else {
-   ip6stat.ip6s_badscope++;
-   in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_addrerr);
-   goto bad;
-   }
-   }

/* drop packets if interface ID portion is already filled */
if ((m->m_pkthdr.rcvif->if_flags & IFF_LOOPBACK) == 0) {



Re: AVL tree

2011-05-20 Thread Michael Pounov
On Fri, 20 May 2011 05:01:41 +0200
Ariane van der Steldt  wrote:

> On Thu, May 19, 2011 at 08:28:09PM +0300, Michael Pounov wrote:
> > Add AVL tree implementation and merge few RB tree related macros.
> > 
> > If you have comments or any claims, please send me feedback
> > and I will fix them. 
> > 
> >  Inline patch::
> > --- /usr/src/sys/sys/tree.h Mon Mar  2 11:42:55 2009
> > +++ tree.h  Thu May 19 20:16:36 2011
> > @@ -730,9 +730,367 @@
> >  (x) != NULL;   \
> >  (x) = name##_RB_NEXT(x))
> >  
> > +#define RB_FOREACH_FROM(x, name, y)
> > \
> 
> This macro modifies parameter y, which I do not expect from using it.
> Please keep y its original value. Also, I want to be able to use an
> rvalue for y.
> 
> > +   for ((x) = (y); \
> > +   ((x) != NULL) && ((y) = name##_RB_NEXT(x), (x) != NULL);\
> > +(x) = (y))
> > +
> > +#define RB_FOREACH_SAFE(x, name, head, y)  \
> 
> (Again, don't modify the value of y.)
> 
> > +   for ((x) = RB_MIN(name, head);  \
> > +   ((x) != NULL) && ((y) = name##_RB_NEXT(x), (x) != NULL);\
> > +(x) = (y))
> > +
> 
> Is there a specific reason your wrote the above 2 macros? The pattern
> where the for loop is written out is quite common and well understood by
> developers.
> 
> >  #define RB_FOREACH_REVERSE(x, name, head)  \
> > for ((x) = RB_MAX(name, head);  \
> >  (x) != NULL;   \
> >  (x) = name##_RB_PREV(x))
> > +
> > +#define RB_FOREACH_REVERSE_FROM(x, name, y)
> > \
> > +   for ((x) = (y); \
> > +   ((x) != NULL) && ((y) = name##_RB_PREV(x), (x) != NULL);\
> > +(x) = (y))
> > +
> > +#define RB_FOREACH_REVERSE_SAFE(x, name, head, y)  \
> > +   for ((x) = RB_MAX(name, head);  \
> > +   ((x) != NULL) && ((y) = name##_RB_PREV(x), (x) != NULL);\
> > +(x) = (y))
> > +
> > +/* Macros that define a avl tree */
> > +#define AVL_MAX_HEIGHT 42
> 
> No. Trees do not have a max height. On a modern amd64, I can
> create code which will be able to fit more than 2 ** 42 items in the
> tree (provided I had the physmem in my machine).
> 
> > +#define AVL_HEAD(name, type)   
> > \
> > +struct name {  
> > \
> > +   struct type *avlh_root; /* root of the tree */  \
> > +   int avlh_height;\
> > +   long avlh_count;\
> 

This tree implementation use height for ancestors temporary array
because we have not a pointer to the parent.

You may use all memory when set new height 2** = all stored elements in 
tree!
:)

> size_t, not long. K&R guarantee it'll fit for every future version C.
> 
> Why is there a count? For most trees in the system, we don't care how
> large they are. And we don't care for the height either.
> 
> > +}
> > +
> > +#define AVL_INITIALIZER(root)  
> > \
> > +   { NULL, 0, 0 }
> > +
> > +#define AVL_INIT(root, height) do {
> > \
> > +   (root)->avlh_root = NULL;   \
> > +   (root)->avlh_height = !height ? AVL_MAX_HEIGHT : height;\
> 
> height? This is not the current height of the tree... Don't like it; if
> the items in the tree are constrained in some way, other code will have
> dealt with it.
> 
> > +   (root)->avlh_count = 0; \
> > +} while (0)
> > +
> > +#define AVL_ENTRY(type)
> > \
> > +struct {   \
> > +   struct type *avle_left; /* left element */  \
> > +   struct type *avle_right;/* right element */ \
> > +   int avle_height;/* node color */\
> 
> Comment copy-pasted from RB tree.
> 
> > +}
> > +
> > +#define AVL_LEFT(elm, field)   (elm)->field.avle_left
> > +#define AVL_RIGHT(elm, field)  (elm)->field.avle_right
> > +#define AVL_HEIGHT(elm, field) (elm)->field.avle_height
> > +#define AVL_ROOT(head) (head)->avlh_root
> > +#define AVL_EMPTY(head)(AVL_ROOT(head) == NULL)
> > +#define AVL_ROOT_HEIGHT(head)  (head)->avlh_height
> 
> How about this instead:  (AVL_EMPTY(head) ? 0 : AVL_HEIGHT(AVL_ROOT(head)))
> 
> > +#define AVL_ROOT_COUNT(head)   (head)->avlh_count
> > +
> > +/* Generates pro

Re: AVL tree

2011-05-20 Thread Mike Belopuhov
On Fri, May 20, 2011 at 4:04 AM, Ariane van der Steldt 
wrote:
>
> AVL trees have a difference of max 1 between the longest and shortest
> path to a leaf, whereas RB-trees have at max the longest path be double
> the length of the shortest path.
> I.e. work case lookups require traversal of
>ln(size)/ln(2) elements for AVL trees,
> 2 * ln(size)/ln(2) elements for RB trees.
>
> However, RB trees are slightly more efficient with insertion/removal.
>
> The better balancing of AVL trees can make a big difference in lookup
> time compared to RB trees, for trees containing millions of items.
>
>> how are you supposed to choose between them?
>
> Basically it's a trade off. Is lookup more important, or insert/remove?
> And ofcourse, if you still don't know, just take what strikes your
> fancy. :D
>

i know about the difference of lookup and insert/remove speed, but
is there a significant difference in practice (in openbsd use cases)?

> But I'm mostly interested because I tend to use trees left, right and
> center and those macros lead to code bloat. If I can use a tree without
> macros, I'm happy.

so maybe it makes sense to have a non macro implementation in libkern
and use it in uvm or whenever needed?

> --
> Ariane



Re: AVL tree

2011-05-20 Thread Michael Pounov
I test this implementation with 1000 entry inserted by random()
;)
average seek time with random search of 1 is from 2 to 3 seconds

On Fri, 20 May 2011 04:04:07 +0200
Ariane van der Steldt  wrote:

> On Thu, May 19, 2011 at 07:33:22PM +0200, Mike Belopuhov wrote:
> > On Thu, May 19, 2011 at 7:25 PM, Michael Pounov  wrote:
> > > Not see differences in results with performance from RB tree vs AVL,
> 
> Which tree size did you use to test that? Which insert order (yes, this
> is relevant: with the right ordering you can hit the worst or best case
> of a tree algorithm). Did you test lookup time or update time or both
> (combined or split)?
> 
> > >  but right solution for problems when we have choice between algorithms.
> > >
> > 
> > if you don't see any difference then what's the point in having them
> > both available?
> 
> AVL trees have a difference of max 1 between the longest and shortest
> path to a leaf, whereas RB-trees have at max the longest path be double
> the length of the shortest path.
> I.e. work case lookups require traversal of
> ln(size)/ln(2) elements for AVL trees,
> 2 * ln(size)/ln(2) elements for RB trees.
> 
> However, RB trees are slightly more efficient with insertion/removal.
> 
> The better balancing of AVL trees can make a big difference in lookup
> time compared to RB trees, for trees containing millions of items.
> 
> > how are you supposed to choose between them?
> 
> Basically it's a trade off. Is lookup more important, or insert/remove?
> And ofcourse, if you still don't know, just take what strikes your
> fancy. :D
> 
> But I'm mostly interested because I tend to use trees left, right and
> center and those macros lead to code bloat. If I can use a tree without
> macros, I'm happy.
> -- 
> Ariane
> 


-- 

M.Punov
-
AITNET - Sofia/Bulgaria -
Software & Network Solutions
(+359) 888 73 73 58;(+359) 2 402 4000