Re: Non-opaque hamt type?

2021-04-03 Thread Bruno Haible
Hi Marc,

> Secondly, after having given it some more thought, the alternative protocol
> (which we have called more robust) seems to be harder to understand because
> "p != e" could then mean two different things. So I will leave the original
> protocol in place, which is easy to comprehend: If "old_hamt == new_hamt",
> no insertion has taken place and one has manually free the element one has
> attempted to insert. If "old_hamt != new_hamt" the element has been
> inserted and has now to eventually free "new_hamt" besides "old_hamt".

Yes, an API with 3 possible outcomes is easier to use incorrectly than an
API with 2 possible outcomes.

> After I have rebased my code to HEAD, I will commit the new module to
> Gnulib.

Yes please. Thank you for your great contribution!!

Bruno




Re: Non-opaque hamt type?

2021-04-03 Thread Marc Nieper-Wißkirchen
Please excuse the delay in finalizing the new module. I was distracted due
to the start of the semester in October last year and then forgot to finish
my work.

To summarize, I have finally come to the conclusion not to change the API
as theorized in this thread.

First of all, the benefits of making the hamt type non-opaque are too small
compared with the possible drawbacks (the non-opaqueness, the inability to
return NULL in future API extensions, etc.).

Secondly, after having given it some more thought, the alternative protocol
(which we have called more robust) seems to be harder to understand because
"p != e" could then mean two different things. So I will leave the original
protocol in place, which is easy to comprehend: If "old_hamt == new_hamt",
no insertion has taken place and one has manually free the element one has
attempted to insert. If "old_hamt != new_hamt" the element has been
inserted and has now to eventually free "new_hamt" besides "old_hamt".

After I have rebased my code to HEAD, I will commit the new module to
Gnulib.

Thank you for your patience.

Marc

Am So., 18. Okt. 2020 um 20:11 Uhr schrieb Marc Nieper-Wißkirchen <
marc.nieper+...@gmail.com>:

> Okay, if you find the latter protocol better anyway, I will switch to
> this protocol, and hamts will be stack-allocated (just two words) and
> passed by value.
>
> Thanks,
>
> Marc
>
> Am So., 18. Okt. 2020 um 19:58 Uhr schrieb Bruno Haible :
> >
> > Marc Nieper-Wißkirchen wrote:
> > > The existing protocol is as follows:
> > >
> > > Hamt_entry *e = hamt_entry (...);
> > > Hamt_entry *p = e;
> > > Hamt *new_hamt = hamt_insert (old_hamt, );
> > > if (old_hamt == new_hamt)
> > >   {
> > > /* The element hasn't been insert as an equivalent element has
> already been in the hamt. p now holds a reference to the entry that already
> existed in the hamt.
> > > element_free (e);
> > > ...
> > > hamt_free (old_hamt); /* We don't have to free new_hamt because no
> new hamt was created. */
> > >   }
> > > else
> > >   {
> > > /* The element has been inserted. p hasn't changed. */
> > > ...
> > > hamt_free (old_hamt);  /* This frees all hamts */
> > > hamt_free (new_hamt); /* and all elements inserted, including e. */
> > >   }
> > >
> > > A protocol where no pointer values need to be compared could use p to
> > > carry the information:
> > >
> > > Hamt_entry *e = hamt_entry ();
> > > Hamt_entry *p = e;
> > > Hamt new_hamt = hamt_insert (old_hamt, );
> > > if (p == e)
> > >   {
> > > /* The element has been inserted. */
> > > ... /* See above. */
> > >   }
> > > else if (p == NULL)
> > >   {
> > > /* The element e already existed in the hamt. */
> > >... /* See above. */
> > >   }
> > > else /* p != e && p != NULL */
> > >   {
> > > /* An element equivalent to e already existed in the hamt. p now
> holds this element. */
> > > ... /* See above. */
> > >   }
> >
> > I find the latter protocol more robust: it does not depend on details of
> > the implementation of hamt_insert.
> >
> > Can you decide on your original question (allow stack-allocated HAMTs)?
> > I don't feel I can help you decide this.
> >
> > Bruno
> >
>


Re: Non-opaque hamt type?

2020-10-18 Thread Marc Nieper-Wißkirchen
Okay, if you find the latter protocol better anyway, I will switch to
this protocol, and hamts will be stack-allocated (just two words) and
passed by value.

Thanks,

Marc

Am So., 18. Okt. 2020 um 19:58 Uhr schrieb Bruno Haible :
>
> Marc Nieper-Wißkirchen wrote:
> > The existing protocol is as follows:
> >
> > Hamt_entry *e = hamt_entry (...);
> > Hamt_entry *p = e;
> > Hamt *new_hamt = hamt_insert (old_hamt, );
> > if (old_hamt == new_hamt)
> >   {
> > /* The element hasn't been insert as an equivalent element has already 
> > been in the hamt. p now holds a reference to the entry that already existed 
> > in the hamt.
> > element_free (e);
> > ...
> > hamt_free (old_hamt); /* We don't have to free new_hamt because no new 
> > hamt was created. */
> >   }
> > else
> >   {
> > /* The element has been inserted. p hasn't changed. */
> > ...
> > hamt_free (old_hamt);  /* This frees all hamts */
> > hamt_free (new_hamt); /* and all elements inserted, including e. */
> >   }
> >
> > A protocol where no pointer values need to be compared could use p to
> > carry the information:
> >
> > Hamt_entry *e = hamt_entry ();
> > Hamt_entry *p = e;
> > Hamt new_hamt = hamt_insert (old_hamt, );
> > if (p == e)
> >   {
> > /* The element has been inserted. */
> > ... /* See above. */
> >   }
> > else if (p == NULL)
> >   {
> > /* The element e already existed in the hamt. */
> >... /* See above. */
> >   }
> > else /* p != e && p != NULL */
> >   {
> > /* An element equivalent to e already existed in the hamt. p now holds 
> > this element. */
> > ... /* See above. */
> >   }
>
> I find the latter protocol more robust: it does not depend on details of
> the implementation of hamt_insert.
>
> Can you decide on your original question (allow stack-allocated HAMTs)?
> I don't feel I can help you decide this.
>
> Bruno
>



Re: Non-opaque hamt type?

2020-10-18 Thread Bruno Haible
Marc Nieper-Wißkirchen wrote:
> The existing protocol is as follows:
> 
> Hamt_entry *e = hamt_entry (...);
> Hamt_entry *p = e;
> Hamt *new_hamt = hamt_insert (old_hamt, );
> if (old_hamt == new_hamt)
>   {
> /* The element hasn't been insert as an equivalent element has already 
> been in the hamt. p now holds a reference to the entry that already existed 
> in the hamt.
> element_free (e);
> ...
> hamt_free (old_hamt); /* We don't have to free new_hamt because no new 
> hamt was created. */
>   }
> else
>   {
> /* The element has been inserted. p hasn't changed. */
> ...
> hamt_free (old_hamt);  /* This frees all hamts */
> hamt_free (new_hamt); /* and all elements inserted, including e. */
>   }
> 
> A protocol where no pointer values need to be compared could use p to
> carry the information:
> 
> Hamt_entry *e = hamt_entry ();
> Hamt_entry *p = e;
> Hamt new_hamt = hamt_insert (old_hamt, );
> if (p == e)
>   {
> /* The element has been inserted. */
> ... /* See above. */
>   }
> else if (p == NULL)
>   {
> /* The element e already existed in the hamt. */
>... /* See above. */
>   }
> else /* p != e && p != NULL */
>   {
> /* An element equivalent to e already existed in the hamt. p now holds 
> this element. */
> ... /* See above. */
>   }

I find the latter protocol more robust: it does not depend on details of
the implementation of hamt_insert.

Can you decide on your original question (allow stack-allocated HAMTs)?
I don't feel I can help you decide this.

Bruno




Re: Non-opaque hamt type?

2020-10-18 Thread Marc Nieper-Wißkirchen
Am So., 18. Okt. 2020 um 16:39 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > At the moment, the header file exposes an opaque struct Hamt and
> > communication with the code happens through (stack-allocated) pointers
> > to a Hamt. In the implementation, however, each Hamt just consists of
> > two pointers (a pointer to a function table and a pointer to the
> > root), so stack-allocating Hamts instead and passing them around by
> > values makes as much sense.
> >
> > This would have the advantage of reducing heap allocation.
>
> By how much does it reduce the heap allocation? Say, someone allocates
> a HAMT and adds 100 entries. There will be heap allocations of ca. 5-10
> buckets and internal nodes. Adding one more heap allocation for the
> root object is not a tragedy.

A struct with just two pointers is allocated on the heap, meaning it
is probably below the minimum malloc size on most implementations. In
architectures where structs of two pointers are passed by value in and
out of functions, a heap-allocated structure will mean one more
pointer indirection per hamt access.

> So, in the end the question is whether to optimize the case of small HAMTs.
>
> > The disadvantage would be that adding further fields to a Hamt in future
> > extensions might be more problematic
>
> True. But when this happens we can mitigate this through a warning in the
> NEWS file.
>
> > and that identity of Hamts cannot
> > be decided through pointer identity anymore (so the protocol how to
> > decide whether an element has been inserted or not has to be changed).
>
> Does this protocol change introduce a significant slowdown?

The existing protocol is as follows:

Hamt_entry *e = hamt_entry (...);
Hamt_entry *p = e;
Hamt *new_hamt = hamt_insert (old_hamt, );
if (old_hamt == new_hamt)
  {
/* The element hasn't been insert as an equivalent element has
already been in the hamt. p now holds a reference to the entry that
already existed in the hamt.
element_free (e);
...
hamt_free (old_hamt); /* We don't have to free new_hamt because no
new hamt was created. */
  }
else
  {
/* The element has been inserted. p hasn't changed. */
...
hamt_free (old_hamt);  /* This frees all hamts */
hamt_free (new_hamt); /* and all elements inserted, including e. */
  }

A protocol where no pointer values need to be compared could use p to
carry the information:

Hamt_entry *e = hamt_entry ();
Hamt_entry *p = e;
Hamt new_hamt = hamt_insert (old_hamt, );
if (p == e)
  {
/* The element has been inserted. */
... /* See above. */
  }
else if (p == NULL)
  {
/* The element e already existed in the hamt. */
   ... /* See above. */
  }
else /* p != e && p != NULL */
  {
/* An element equivalent to e already existed in the hamt. p now
holds this element. */
... /* See above. */
  }

Marc



Re: Non-opaque hamt type?

2020-10-18 Thread Bruno Haible
Hi Marc,

> At the moment, the header file exposes an opaque struct Hamt and
> communication with the code happens through (stack-allocated) pointers
> to a Hamt. In the implementation, however, each Hamt just consists of
> two pointers (a pointer to a function table and a pointer to the
> root), so stack-allocating Hamts instead and passing them around by
> values makes as much sense.
> 
> This would have the advantage of reducing heap allocation.

By how much does it reduce the heap allocation? Say, someone allocates
a HAMT and adds 100 entries. There will be heap allocations of ca. 5-10
buckets and internal nodes. Adding one more heap allocation for the
root object is not a tragedy.

So, in the end the question is whether to optimize the case of small HAMTs.

> The disadvantage would be that adding further fields to a Hamt in future
> extensions might be more problematic

True. But when this happens we can mitigate this through a warning in the
NEWS file.

> and that identity of Hamts cannot
> be decided through pointer identity anymore (so the protocol how to
> decide whether an element has been inserted or not has to be changed).

Does this protocol change introduce a significant slowdown?

Bruno




Re: Non-opaque hamt type?

2020-10-12 Thread Marc Nieper-Wißkirchen
One final issue (I hope):

At the moment, the header file exposes an opaque struct Hamt and
communication with the code happens through (stack-allocated) pointers
to a Hamt. In the implementation, however, each Hamt just consists of
two pointers (a pointer to a function table and a pointer to the
root), so stack-allocating Hamts instead and passing them around by
values makes as much sense.

This would have the advantage of reducing heap allocation. The
disadvantage would be that adding further fields to a Hamt in future
extensions might be more problematic and that identity of Hamts cannot
be decided through pointer identity anymore (so the protocol how to
decide whether an element has been inserted or not has to be changed).

What do you think?

Am So., 11. Okt. 2020 um 21:09 Uhr schrieb Bruno Haible :

> OK for me. Thanks for having listened to the many remarks.

Your remarks only helped to make the module better!

Marc