from:"Marc Nieper\-Wißkirchen"

Re: Problem with valgrind-tests: relies on bash not causing error

2017-04-14 Thread Marc Nieper-Wißkirchen

Hi Bruno,

Hi Reuben,

>/The test for whether to use valgrind runs:/
>//
>//bin/bash -c 'exit 0'/
>//
>/This looks pretty harmless; unfortunately, bash itself causes problems:/

I'd suggest that you change m4/valgrind-tests.m4 to use the AC_CACHE_CHECK
macro instead of "lone" AC_MSG_CHECKING and AC_MSG_RESULT invocations.
Then you have an cache variable that you set as an environment variable
in order to override the result of this test, e.g.
  $ env gl_cv_use_valgrind=yes ./configure

Remember, nowadays that the autoconf cache is disabled by default, the
major benefit of AC_CACHE_CHECK is that it provides the user a way to
selectively override configure test results.

Bruno

Today I have run into the same problem (namely that valgrind and bash
are not going along very well).

While your proposed solution would work for anyone who is aware of this
problem, it won't work for an unaware user who would simply run
./configure, wondering why valgrind isn't detected. As it seems rather
involved to resolve the incompatibility between bash and valgrind, I
would like to ask to have gnulib's valgrind-tests changed upstream. As
this change amounts simply in replacing $(SHELL) by any other well-known
utility, the changes to be done are trivial.

Having an updated version upstream makes it much easier for everyone  to
pick up the necessary changes to make valgrind-tests work.

Marc

Valgrind is complaining unitialized values in freea (malloca.c:135)

2017-08-22 Thread Marc Nieper-Wißkirchen

In freea in malloca.c, a possibly uninitialized indicator word is used for
a comparison so that Valgrind reports: "Conditional jump or move depends on
uninitialised value(s)".

Valgrind is not smart enough to understand the logic in freea.

It would be nice if the warning could be silenced, either by amending freea
slightly (it seems that a similar thing has already been done for Clang
warnings) or by reporting the issue to the Valgrind developers so that they
can special-case gnulib's freea.

--

Marc

Re: Valgrind is complaining unitialized values in freea (malloca.c:135)

2017-08-22 Thread Marc Nieper-Wißkirchen

I just noticed the file lib/malloca.valgrind, which can be used with the
Valgrind option suppressions.

Marc

Am 22.08.2017 5:52 nachm. schrieb "Tim Rühsen" <tim.rueh...@gmx.de>:

> On Dienstag, 22. August 2017 06:11:41 CEST Marc Nieper-Wißkirchen wrote:
> > In freea in malloca.c, a possibly uninitialized indicator word is used
> for
> > a comparison so that Valgrind reports: "Conditional jump or move depends
> on
> > uninitialised value(s)".
> >
> > Valgrind is not smart enough to understand the logic in freea.
> >
> > It would be nice if the warning could be silenced, either by amending
> freea
> > slightly (it seems that a similar thing has already been done for Clang
> > warnings) or by reporting the issue to the Valgrind developers so that
> they
> > can special-case gnulib's freea.
>
> I also see several false positives from clang's Undefined Sanitizer due to
> alloca 'magic' (reallocations on stack space ?). This might not be directly
> related, but I think there is a common coding pattern.
>
> glob.c:1738:23: runtime error: index 64 out of bounds for type 'char *[64]'
> #0 0x557545 in glob_in_dir /home/tim/src/wget2/lib/glob.c:1738:40
> #1 0x54ded1 in rpl_glob /home/tim/src/wget2/lib/glob.c:1306:16
>
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior glob.c:1738:23 in
> glob.c:1739:27: runtime error: index 64 out of bounds for type 'char *[64]'
> #0 0x5575d4 in glob_in_dir /home/tim/src/wget2/lib/glob.c:1739:27
> #1 0x54ded1 in rpl_glob /home/tim/src/wget2/lib/glob.c:1306:16
>
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior glob.c:1739:27 in
> glob.c:1811:21: runtime error: index 64 out of bounds for type 'char *[64]'
> #0 0x55845e in glob_in_dir /home/tim/src/wget2/lib/glob.c:1811:21
> #1 0x54ded1 in rpl_glob /home/tim/src/wget2/lib/glob.c:1306:16
>
> Regards, Tim
>

UUID Generator in Gnulib?

2018-12-06 Thread Marc Nieper-Wißkirchen

Universally unique identifiers (UUIDs) are needed in various application
scenarios. It would be quite helpful if Gnulib included a module with a
UUID generator. As Gnulib already contains (strong) random number
generators, implementing such a UUID generator shouldn't be too hard.

Linux systems have libuuid but this may not be present on other POSIX
systems.

Thanks,

Marc

Visibility of gnulib's symbols

2019-07-08 Thread Marc Nieper-Wißkirchen

Dear all,

I'm working on a shared library, for which implementation I make use of
gnulib modules.

I followed the instructions for the lib-symbol-visibility module (see here:
https://www.gnu.org/software/gnulib/manual/html_node/Exported-Symbols-of-Shared-Libraries.html)
to make only the documented parts of my API visible to the outside.

This works for the library code, but now for the imported gnulib modules by
default. I still see all gnulib exported identifiers among the visible
symbols of my shared library.

What is the canonical way to solve this? Is there an easy way to add
$(CFLAGS_VISIBLITY) to the CFLAGs when gnulib is compiled? Or should I try
something different?

Thanks!

Cheers,

Marc

Re: pure and const function attributes

2020-04-29 Thread Marc Nieper-Wißkirchen

Paul Eggert  schrieb am Mi., 29. Apr. 2020, 18:01:

> On 4/29/20 7:28 AM, Marc Nieper-Wißkirchen wrote:
> > I am wondering whether it makes sense to add two new modules, named
> > pure and const that define macros GL_PURE and GL_CONST, respectively
>
> There's already _GL_ATTRIBUTE_PURE and _GL_ATTRIBUTE_CONST. Presumably you
> just
> want them exposed? (I confess that Emacs already uses the latter)
>

That would be perfect!

>

pure and const function attributes

2020-04-29 Thread Marc Nieper-Wißkirchen

Some GNULIB modules (e.g. xsize) use constructs like:

#if __GNUC__ >= 3
__attribute__ ((__pure__))
#endif

I am wondering whether it makes sense to add two new modules, named
pure and const that define macros GL_PURE and GL_CONST, respectively
that expand into nothing for a compiler that does not support these
function attributes and into the relevant code for a compiler that
does.

-- Marc

xsize and flexmember

2020-04-29 Thread Marc Nieper-Wißkirchen

The flexmember module provides the macro FLEXSIZEOF to calculate an
appropriate size to malloc a struct with a flexible array member.

Overflow is reported differently than with the xsize module, which
provides size_overflow_p.

It would be great if the flexmember exported another macro, say
XFLEXSIZEOF, which returned SIZE_MAX in case of arithmetic overflow.

--
Marc

Re: Unicode string literals

2020-04-30 Thread Marc Nieper-Wißkirchen

Am Do., 30. Apr. 2020 um 22:54 Uhr schrieb Paul Eggert :
>
> On 4/30/20 6:08 AM, Bruno Haible wrote:
> > These not-so-new compilers don't perform
> > character set conversion; you have to provide the numeric value of each
> > byte yourself (either as escapes, or by enumerating the bytes of the
> > string one by one).
>
> Could we have a macro to be used only in source code encoded via UTF-8?
> Presumably the older compilers would process the bytes of the string as if 
> they
> were individual 8-bit characters and would pass them through unchanged, so the
> run-time string would be UTF-8 too.

This would allow writing a macro that prefixes "u8" to strings in
compilers supporting enough of C11, skipping the prefix in compilers
that pass UTF-8 encoded bytes in strings unchanged and signal an error
in all other cases (hopefully only very exotic platforms), right?

Re: pure and const function attributes

2020-05-01 Thread Marc Nieper-Wißkirchen

Am Fr., 1. Mai 2020 um 02:46 Uhr schrieb Paul Eggert :
>
> On 4/30/20 6:37 AM, Marc Nieper-Wißkirchen wrote:
> > P.S.: It would also be helpful so that warnings coming from
> > "-Wsuggest-attribute=pure" can be handled for the GCC without
> > affecting other compilers.
>
> So, something like the attached file for modules/attribute-gcc? You can put
> ATTRIBUTE_PURE in the declaration for a function, and for non-GCC compilers it
> expands to nothing.

Yes. And this can even be reused by many existing Gnulib modules,
which currently contain code like:

#if __GNUC__ >= 3
__attribute__ ((__pure__))
#endif

A feature test in configure for the attributes would be even better
because other compilers like Clang will also accept the GCC attributes
(and different GCC versions may support different attributes in the
future).

To my knowledge, C2x will standardize attributes so one should keep an
eye on this to make this new Gnulib module future-proof. (The
standardized attributes seem to have vendor prefixes.)

Re: xsize and flexmember

2020-05-01 Thread Marc Nieper-Wißkirchen

Am Fr., 1. Mai 2020 um 00:20 Uhr schrieb Paul Eggert :
>
> On 4/30/20 2:01 PM, Marc Nieper-Wißkirchen wrote:
>
> >>>> #define XFLEXSIZEOF_XSIZE(type, member, n) \
> >>>>   (((n) <= FLEXSIZEOF (type, member, n) \
> >>>> && FLEXSIZEOF (type, member, n) <= (size_t) -1) \
> >>>>? (size_t) FLEXSIZEOF (type, member, n) : (size_t) -1)
> >
> > Why do you write "(n) <= FLEXSIZEOF (type, member, n)" and not "n <
> > FLEXSIZEOF (type, member, n)"? In case MEMBER is the first element of
> > TYPE, this would not indicate an overflow, would it?
>
> If n == FLEXSIZEOF (type, member, n) then overflow has not occurred, yes. And 
> in
> that case the function should yield n. (Admittedly this case would be 
> rare)
>
> > My idea was:
> >
> > #define XFLEXSIZEOF_XSIZE(type, member, n) xflexsizeof_xsize_bound(
> > FLEXSIZEOF (type, member, n), n)
> > static _GL_INLINE size_t xflexsizeof_xsize_bound (umaxint_t m, size_t n)
> > {
> >   if (n < m && m <= (size_t) -1)
> > return m;
> >   else
> > return (size_t) -1;
> > }
>
> This would require including stdint.h to get uintmax_t, which adds a 
> dependency.
> Also, xflexsizeof_xsize_bound shouldn't be a static function since extern 
> inline
> functions can't call static functions, though that should be easy to fix.

We could make it external inline.

> There's also the theoretical problem that INTMAX_MAX might be greater than
> UINTMAX_MAX, but perhaps we needn't worry about that

Alternatively, we can add a dependency to the xsize module for the
module that exports FLEXSIZEOF_XSIZE so and duplicate the code of
FLEXSIZEOF_XISIZE but do the additions with xsum and friends. The "&"
operator may still be a problem.

> I can see going either way on this. As a macro, FLEXSIZEOF_XSIZE could insist
> that its last argument be free of side effects, and that would be simpler on 
> the
> implementation. It's an annoying restriction, though.

If it is documented, I think this would be okay if not optimal.

>
> > maybe FLEXSIZEOF_XSIZE, which would at least drop the
> > leading "x" as we no error is signaled. :)
>
> Yes, good point.
>

Re: xsize and flexmember

2020-05-01 Thread Marc Nieper-Wißkirchen

Am Fr., 1. Mai 2020 um 11:09 Uhr schrieb Bruno Haible :
>
> Paul Eggert wrote:
> > I realize we have dueling conventions here, but would prefer that
> > saturated size_t arithmetic have a longer prefix or suffix than just "x".
>
> I'm open to this. What prefix would you propose instead of 'x'?

Whatever prefix instead, it should be a short as 'x'. As the functions
exported by xsize are to be used in place of the usual arithmetic
operators, their names should be short.

> Generally, 'xsize' has not caught on as I had expected. It is still a
> simple solution to the task of avoiding inadvertent overflow, especially
> in complex expressions, but
>   - many people continued to prefer ad-hoc code, especially for simple
> expressions,

I'd rather use the xsize code than ad-hoc code because it expresses
the programmer's intent much better.

>   - the 'xsize' module is written for size_t, therefore overflow checking
> for 'unsigned int' or 'unsigned long' still has to be done the
> manual way,

I think that size_t calculations are still the most important ones.

Thanks,

Marc

>   - on glibc systems, the problem has been mitigated since malloc()
> now refuses arguments > SIZE_MAX/2, thus in a loop that grows an
> array malloc() will typically fail before the size overflows.
>
> Thoughts?
>
> Bruno
>

Re: pure and const function attributes

2020-04-30 Thread Marc Nieper-Wißkirchen

Am Mi., 29. Apr. 2020 um 18:05 Uhr schrieb Marc Nieper-Wißkirchen
:
>
>
> Paul Eggert  schrieb am Mi., 29. Apr. 2020, 18:01:
>>
>> On 4/29/20 7:28 AM, Marc Nieper-Wißkirchen wrote:
>> > I am wondering whether it makes sense to add two new modules, named
>> > pure and const that define macros GL_PURE and GL_CONST, respectively
>>
>> There's already _GL_ATTRIBUTE_PURE and _GL_ATTRIBUTE_CONST. Presumably you 
>> just
>> want them exposed? (I confess that Emacs already uses the latter)
>
>
> That would be perfect!

P.S.: It would also be helpful so that warnings coming from
"-Wsuggest-attribute=pure" can be handled for the GCC without
affecting other compilers.

Re: Add gl_list_remove_last to list/xlist

2020-05-02 Thread Marc Nieper-Wißkirchen

Hi Bruno,

thanks again very much!

Am Sa., 2. Mai 2020 um 02:07 Uhr schrieb Bruno Haible :
>
> Marc Nieper-Wißkirchen wrote:
> > This is a feature request to add an operation
> >
> > extern gl_list_node_t gl_list_remove_last (gl_list_t list)
> >
> > to the list/xlist interface.
> >
> > This operation would remove the last element of the list and return
> > the node of the previous element (or NULL if no element remained).
> >
> > There are at least two reasons why adding such an operation is meaningful:
> >
> > (1) It is a common operation if the list is used as a stack. Pushing
> > will be gl_list_add_last, popping will be gl_list_remove_last.
> >
> > (2) The complexity of removing an arbitrary element in an ARRAY list
> > is O(n). Removing the last element, however, is only O(1). With an
> > explicit operation gl_list_remove_last, this can be documented in the
> > table at the beginning of lib_gl_list.h.
>
> I agree about point (2). It applies also the CARRAY and LINKED list types.
>
> Similarly for gl_list_remove_first, which also have smaller complexity than
> gl_list_remove_at for CARRAY and LINKED list types.
>
> However, the return type cannot be gl_list_node_t, because for the LINKED
> list type, that would be returning a pointer to already freed memory.

Could you explain why it is so? I didn't mean that gl_list_remove_last
should return the just deleted node but the node of the element that
is now the last one (the respectively first one for
gl_list_remove_first) after the removal. If there is none (because the
list is empty after the removal), NULL would be returned. It would be
an error to apply gl_list_remove_last to an empty list.

The motivation behind my suggestion is to make it easy to implement a
stack (or a FIFO queue) with the gl_list interface. For this,
operations like gl_list_get_first and gl_list_get_last with guaranteed
O(1) behavior for a number of list implementations would also be nice.

>
> The return type cannot be 'const void *' either, because then it would not
> be possible to distinguish a returned NULL element and a call on an
> empty list - which would also return NULL.
>
> So, the best possible return type here is 'bool'.
>
> Implemented as follows. Thanks for the suggestion.
>
>
> 2020-05-01  Bruno Haible  
>
> list: Add remove_first and remove_last operations.
> Suggested by Marc Nieper-Wißkirchen  in
> <https://lists.gnu.org/archive/html/bug-gnulib/2020-04/msg00092.html>.
> * lib/gl_list.h (struct gl_list_implementation): Add fields
> remove_first, remove_last.
> (gl_list_remove_first, gl_list_remove_last): New functions.
> * lib/gl_array_list.c (gl_array_remove_first, gl_array_remove_last): 
> New
> functions, based on gl_array_remove_at.
> (gl_array_list_implementation): Implement the new operations.
> * lib/gl_carray_list.c (gl_carray_remove_first, 
> gl_carray_remove_last):
> New functions, based on gl_carray_remove_at.
> (gl_carray_list_implementation): Implement the new operations.
> * lib/gl_anylinked_list2.h (gl_linked_remove_first,
> gl_linked_remove_last): New functions, based on gl_linked_remove_at.
> * lib/gl_linked_list.c (gl_linked_list_implementation): Implement the
> new operations.
> * lib/gl_linkedhash_list.c (gl_linkedhash_list_implementation):
> Likewise.
> * lib/gl_anytree_list2.h (gl_tree_remove_first, gl_tree_remove_last):
> New functions, based on gl_tree_remove_at.
> * lib/gl_avltree_list.c (gl_avltree_list_implementation): Implement 
> the
> new operations.
> * lib/gl_avltreehash_list.c (gl_avltreehash_list_implementation):
> Likewise.
> * lib/gl_rbtree_list.c (gl_rbtree_list_implementation): Likewise.
> * lib/gl_rbtreehash_list.c (gl_rbtreehash_list_implementation):
> Likewise.
> * lib/gl_sublist.c (gl_sublist_remove_first, gl_sublist_remove_last):
> New functions, based on gl_sublist_remove_at.
> (gl_sublist_list_implementation): Implement the new operations.
> * lib/gl_list.hh (class gl_List): Add methods remove_first,
> remove_last.
> * tests/test-array_list.c (main): Test also gl_list_remove_first and
> gl_list_remove_last.
> * tests/test-avltree_list.c (main): Likewise.
> * tests/test-avltreehash_list.c (main): Likewise.
> * tests/test-carray_list.c (main): Likewise.
> * tests/test-linked_list.c (main): Likewise.
> * tests/test-linkedhash_list.c (main): Likewise.
> * tests/test-rbtree_list.c (main): Likewise

Re: Add gl_list_remove_last to list/xlist

2020-05-02 Thread Marc Nieper-Wißkirchen

Hi Bruno,

[...]

> The gl_list_node_t type, in the 'list' interface so far, is used in
> operations that work on a single position. Except for the functions
> gl_list_next_node and gl_list_previous_node, which intentionally
> walk from one node to a neighbour node. Having a function that
> does an effect on the last element and returns the position of the
> second-to-last element would be an invitation to incorrect coding.
> Better keep the operations simple!

That's a fair point of view.

[...]

> > The motivation behind my suggestion is to make it easy to implement a
> > stack (or a FIFO queue) with the gl_list interface.
>
> It is already easy to implement a stack or FIFO using the primitives.
> E.g. stack_pop =
>   1. gl_list_get_at (list, gl_list_size (list) - 1)
>   2. gl_list_remove_at (list, gl_list_size (list) - 1) or
>  gl_list_remove_last (list).
>
> It would be a mistake to add semantics of stack or FIFO directly to the
> list interface, because a list *is* not a stack or a FIFO; a list *can
> be used to implement* a stack or a FIFO. It is an important concept
> that one interface can be used to implement another interface (->
> layered software, hiding implementation details, etc.).

Okay; I agree that a separate stack or FIFO module can make more
sense; in particular when it only has to deal with a single
implementation of an underlying data structure. At the moment I do
have other work to finish first, but maybe I will find some time in
the near future for a stack module.

[...]

> > For this,
> > operations like gl_list_get_first and gl_list_get_last with guaranteed
> > O(1) behavior for a number of list implementations would also be nice.
>
> Hmm. I'm reluctant to introduce 4 new functions
>   gl_list_get_first
>   gl_list_get_last
>   gl_list_set_first
>   gl_list_set_last
> when the gl_list_get/gl_list_set operations with appropriate position
> argument already do the job. It appears to be more of a documentation
> issue, right? I.e. I should better revert yesterday's patch, and instead,
> in the table show the guaranteed average performance
>   gl_list_get_first
>   gl_list_get_last
>   gl_list_set_first
>   gl_list_set_last
>   gl_list_remove_first
>   gl_list_remove_last
> where these 6 functions are defined globally, not separately for each
> implementation.

When the functions can be defined globally, that's better than adding
six functions to each vtable. Partly, it is just a documentation
issue. I don't see, however, to implement the function dealing with
end of the list in O(1) behavior when the underlying data structure is
a linked list. Don't we need to expose an implementation-dependent
gl_list_get_last_node (gl_list_first_node for symmetry)? The rest
should then be implementable easily.

Marc

Re: Add gl_list_remove_last to list/xlist

2020-05-02 Thread Marc Nieper-Wißkirchen

Hi Bruno,

[...]

> > I don't see, however, to implement the function dealing with
> > end of the list in O(1) behavior when the underlying data structure is
> > a linked list.
>
> The LINKED list implementation is a doubly-linked list, and the functions
> get_at, set_at, or remove_at are implemented like this:
>   If position < length/2 then
> walk from the start of the list (O(position))
>   else
> walk from the end of the list (O(length-position)).
>
> So, the original invocation
>   gl_list_remove_at (list, gl_list_size (list) - 1)
> already does the job in O(1).

Great. (Now that you say it I think I have once looked into the code
but forgotten the details.) It would be nice if this could be
documented. In any case, the general six functions we discussed should
then be possible without any implementation-specifics.

Re: Add gl_list_remove_last to list/xlist

2020-05-05 Thread Marc Nieper-Wißkirchen

Hi Bruno,

excellent; thank you very much!

Just one note: The documentation needs to be updated in section 14.8
as well ([1]).

So long,

Marc

[1] https://www.gnu.org/software/gnulib/manual/gnulib.html#Container-data-types

Am Sa., 2. Mai 2020 um 23:24 Uhr schrieb Bruno Haible :
>
> I wrote:
> > I should better revert yesterday's patch, and instead,
> > in the table show the guaranteed average performance
> >   gl_list_get_first
> >   gl_list_get_last
> >   gl_list_set_first
> >   gl_list_set_last
> >   gl_list_remove_first
> >   gl_list_remove_last
> > where these 6 functions are defined globally, not separately for each
> > implementation.
>
> Done through the two attached patches.
>
> 2020-05-02  Bruno Haible  
>
> list: Add get_first, get_last, set_first, set_last operations.
> * lib/gl_list.h (gl_list_get_first, gl_list_get_last,
> gl_list_nx_set_first, gl_list_nx_set_last): New functions.
> * lib/gl_xlist.h (gl_list_set_first, gl_list_set_last): New functions.
>
> 2020-05-02  Bruno Haible  
>
> list: Remove redundant code for remove_first and remove_last 
> operations.
> * lib/gl_list.h (struct gl_list_implementation): Remove fields
> remove_first, remove_last.
> (gl_list_remove_first, gl_list_remove_last): Implement in a generic 
> way.
> * lib/gl_array_list.c: Revert last change.
> * lib/gl_carray_list.c: Likewise.
> * lib/gl_anylinked_list2.h: Likewise.
> * lib/gl_linked_list.c: Likewise.
> * lib/gl_linkedhash_list.c: Likewise.
> * lib/gl_anytree_list2.h: Likewise.
> * lib/gl_avltree_list.c: Likewise.
> * lib/gl_avltreehash_list.c: Likewise.
> * lib/gl_rbtree_list.c: Likewise.
> * lib/gl_rbtreehash_list.c: Likewise.
> * lib/gl_sublist.c: Likewise.
>

Re: pure and const function attributes

2020-05-05 Thread Marc Nieper-Wißkirchen

Hi Paul,

thank you very much.

If I understand correctly, the patch seems to cover compilers that
declare "__has_attribute", which should cover GCC and Clang. Does it
make sense to support MSC's "__declspec" as well?

As for the future-proof C2X approach: In the GCC 10 Changes
(https://gcc.gnu.org/gcc-10/changes.html), the C2X standard attributes
are written without double underscores.  In the attached patch, they
are. What is correct?

Marc

Am So., 3. Mai 2020 um 08:21 Uhr schrieb Paul Eggert :
>
> Thanks for catching those glitches. I suppose you're right, the new
> _GL_ATTRIBUTE_* macros should be in gnulib-common.m4 even though that will now
> grow by a bit (meaning config.h will grow quite a bit). Attached is a revised
> draft patch to Gnulib, which (unlike the previous one) I have tested a bit. It
> doesn't attempt to revamp all of Gnulib to use the new macros, but does enough
> of them to give a flavor for the conversion.

Add gl_list_remove_last to list/xlist

2020-04-28 Thread Marc Nieper-Wißkirchen

This is a feature request to add an operation

extern gl_list_node_t gl_list_remove_last (gl_list_t list)

to the list/xlist interface.

This operation would remove the last element of the list and return
the node of the previous element (or NULL if no element remained).

There are at least two reasons why adding such an operation is meaningful:

(1) It is a common operation if the list is used as a stack. Pushing
will be gl_list_add_last, popping will be gl_list_remove_last.

(2) The complexity of removing an arbitrary element in an ARRAY list
is O(n). Removing the last element, however, is only O(1). With an
explicit operation gl_list_remove_last, this can be documented in the
table at the beginning of lib_gl_list.h.

Thanks,

Marc

Re: xsize and flexmember

2020-04-30 Thread Marc Nieper-Wißkirchen

Thank you very much for your quick response!

Am Do., 30. Apr. 2020 um 00:39 Uhr schrieb Paul Eggert :
>
> On 4/29/20 12:29 PM, Marc Nieper-Wißkirchen wrote:
> > It would be great if the flexmember exported another macro, say
> > XFLEXSIZEOF, which returned SIZE_MAX in case of arithmetic overflow.
>
> Something like this?
>
> /* Like FLEXSIZEOF, except yield SIZE_MAX on arithmetic overflow,
>and N might be evaluated more than once.  */
>
> #define XFLEXSIZEOF_XSIZE(type, member, n) \
>   (((n) <= FLEXSIZEOF (type, member, n) \
> && FLEXSIZEOF (type, member, n) <= (size_t) -1) \
>? (size_t) FLEXSIZEOF (type, member, n) : (size_t) -1)
>
> A couple of problems with this approach:
>
>   * It evaluates N more than once.

Couldn't this be solved by calling a static function that would be
subject to be inlined?

>
>   * If the FLEXSIZEOF calls appears in a ptrdiff_t context it might not
> return the right value. ptrdiff_t is also a popular way
> to compute sizes.

Maybe a warning in the comment above the macro's definition would be enough.

>
> But perhaps it's good enough.

Why would you prefer the (longer) name XFLEXSIZEOF_XSIZE vs XFLEXSIZEOF?

Unicode string literals

2020-04-30 Thread Marc Nieper-Wißkirchen

On a system that supports at least C11, I can create an UTF8-encoded
literal string through:

(uint8_t const *) u8"..."

Could Gnulib abstract this into a macro so that substitutes for
systems that do not have u8 string literals can be provided.

On a C11 system, we would have

#define UTF8(s) ((uint8_t const *) u8 ## s)

and similar definitions for UTF16 and UTF32.

Similarly, something like

#define ASCII(s) (u8 ## s [0])

for pre-C2x systems would be nice so that ASCII("c") expands into the
ASCII code point of the character `c'.

Re: Unicode string literals

2020-04-30 Thread Marc Nieper-Wißkirchen

Hi Bruno,

thank you very much for your reply.

Am Do., 30. Apr. 2020 um 12:06 Uhr schrieb Bruno Haible :

[...]

> Unfortunately, we cannot provide such macros. The reason is that the
> translation from the source file's encoding to UTF-8/UTF-16/UTF-32 must
> be done by the compiler, if you want to be able to write
>   static uint8_t my_string[] = u8"Wißkirchen";

For a compiler that supports the "u8" prefix, which is defined by C11,
the compiler should do the translation from the source file encoding
to UTF-8.  I was hoping that compilers not supporting enough of C11
would have some other way to translate from the source file encoding
to UTF-8, which could be exploited by Gnulib.

> Your best bet is
>   1) For UTF-8 encoded strings, ensure that your source code is UTF-8
>  encoded, or use escapes, like in gnulib/tests/uniwidth/test-u8-width.c.

Using escapes for non-ASCII characters, it will work whenever the
execution character set of the compiler is compatible with ASCII,
right?

>   2) For UTF-16 encoded strings, which you'll need only on Windows,
>  write L"Wißkirchen". Or use hex codes, like in
>  gnulib/tests/uniwidth/test-u16-width.c.
>   3) Don't use UTF-32 encoded strings. Or use hex codes, like in
>  gnulib/tests/uniwidth/test-u32-width.c.

These two are less important for me; I mentioned them to have a full
set of macros.

>
> > Similarly, something like
> >
> > #define ASCII(s) (u8 ## s [0])
> >
> > for pre-C2x systems would be nice so that ASCII("c") expands into the
> > ASCII code point of the character `c'.
>
> What's the point of this one? Why not just write 'c'?

I was thinking of a system whose execution character set is not
compatible with ASCII. Or are those excluded in general by Gnulib?

Thanks again,

Marc

Re: c-stack.c and DEBUG: missing import

2020-05-16 Thread Marc Nieper-Wißkirchen

Thank you, both of you!

Have a nice weekend,

Marc

Am Fr., 15. Mai 2020 um 23:04 Uhr schrieb Bruno Haible :
>
> Hi Paul,
>
> > I don't think we need to go that far, since c-stack is already using
> > ignore_value. I installed the attached.
>
> You beat me to it by 2 minutes :)
>
> I verified that the patch fixes the warning that occurred with
>   ./configure CPPFLAGS="-Wall -DDEBUG" --with-libsigsegv-prefix=...
>
> Bruno
>

Easy Accurate Reading and Writing of Floating-Point Numbers

2020-05-19 Thread Marc Nieper-Wißkirchen

It is often important to be able to read back a written floating-point
number accurately so it has to be output with high enough precision.
The Scheme standard even demands this for the number->string and
string->number procedures. Moreover, for output, the shortest rounded
representation that still reads back accurately has to be selected.

Such input and output routines for floats and doubles would also be
very helpful in the C language. Do you think this could become a
Gnulib module?

A simple algorithm is given by Aubrey Jaffer in [1].

Marc

[1] http://people.csail.mit.edu/jaffer/III/EZFPRW

Re: c-stack.c and DEBUG: missing import

2020-05-14 Thread Marc Nieper-Wißkirchen

Hi Bruno,

there is one more issue with c-stack when all warnings are enabled
(and the DEBUG flag is set):

c-stack.c: In function ‘segv_handler’:
c-stack.c:175:5: warning: ignoring return value of ‘write’, declared
with attribute warn_unused_result [-Wunused-result]
  175 | write (STDERR_FILENO, buf, strlen (buf));
  | ^~~~
c-stack.c: In function ‘overflow_handler’:
c-stack.c:198:5: warning: ignoring return value of ‘write’, declared
with attribute warn_unused_result [-Wunused-result]
  198 | write (STDERR_FILENO, buf, strlen (buf));
  | ^~~~

This will be a nice use case for the newly created attribute module
and MAYBE_UNUSED.

Thanks,

Marc

Am Sa., 9. Mai 2020 um 13:34 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > Please add
> >
> > #ifdef DEBUG
> > # include 
> > #endif
> >
> > at the beginning of c-stack.c.
> >
> > When the DEBUG flag is enabled, c-stack.c uses sprintf and without the
> > suggested addition gcc complains about an implicit declaration of the
> > function sprintf.
>
> Yup. Thanks for the suggestion. Done.
>
>
> 2020-05-09  Bruno Haible  
>
> c-stack: Fix warning when DEBUG is enabled.
> Patch suggested by Marc Nieper-Wißkirchen  
> in
> <https://lists.gnu.org/archive/html/bug-gnulib/2020-05/msg00081.html>.
> * lib/c-stack.c: Include .
>
> diff --git a/lib/c-stack.c b/lib/c-stack.c
> index 4050d08..e486591 100644
> --- a/lib/c-stack.c
> +++ b/lib/c-stack.c
> @@ -65,6 +65,10 @@ typedef struct sigaltstack stack_t;
>
>  #include 
>
> +#if DEBUG
> +# include 
> +#endif
> +
>  #if HAVE_LIBSIGSEGV
>  # include 
>  #endif
>

c-stack.c and DEBUG: missing import

2020-05-09 Thread Marc Nieper-Wißkirchen

Please add

#ifdef DEBUG
# include 
#endif

at the beginning of c-stack.c.

When the DEBUG flag is enabled, c-stack.c uses sprintf and without the
suggested addition gcc complains about an implicit declaration of the
function sprintf.

Thanks,

Marc

Re: Easy Accurate Reading and Writing of Floating-Point Numbers

2020-05-20 Thread Marc Nieper-Wißkirchen

Please see the attached patch file, my first attempt (and first
contribution to Gnulib).

Am Di., 19. Mai 2020 um 17:51 Uhr schrieb Paul Eggert :
>
> On 5/19/20 8:35 AM, Marc Nieper-Wißkirchen wrote:
> >> It is, however, locale-dependent, and there is no "c_dtoastr"
> >> version as there is a "c_strtod". (Could we get such a wrapper?)
> > Or a version "c_dtoastr" that uses "c_snprintf" and "c_strtod" internally.
>
> Yes, that'd be good to have. Could you write that?
From f4308c51cc73a8b1397c436f8dd667e15d6d0b9b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marc=20Nieper-Wi=C3=9Fkirchen?= 
Date: Wed, 20 May 2020 13:59:31 +0200
Subject: [PATCH] New module to accurately print (long) doubles in C locale.

---
 ChangeLog| 18 +
 MODULES.html.sh  |  2 ++
 lib/c-dtoastr.c  |  3 +++
 lib/c-ldtoastr.c |  3 +++
 lib/ftoastr.c| 23 +++-
 lib/ftoastr.h|  6 +
 modules/c-dtoastr| 27 +++
 modules/c-dtoastr-tests  | 18 +
 modules/c-ldtoastr   | 27 +++
 modules/c-ldtoastr-tests | 18 +
 tests/test-c-dtoastr.c   | 58 
 tests/test-c-dtoastr.sh  | 15 +++
 tests/test-c-ldtoastr.c  | 58 
 tests/test-c-ldtoastr.sh | 15 +++
 14 files changed, 285 insertions(+), 6 deletions(-)
 create mode 100644 lib/c-dtoastr.c
 create mode 100644 lib/c-ldtoastr.c
 create mode 100644 modules/c-dtoastr
 create mode 100644 modules/c-dtoastr-tests
 create mode 100644 modules/c-ldtoastr
 create mode 100644 modules/c-ldtoastr-tests
 create mode 100644 tests/test-c-dtoastr.c
 create mode 100644 tests/test-c-dtoastr.sh
 create mode 100644 tests/test-c-ldtoastr.c
 create mode 100644 tests/test-c-ldtoastr.sh

diff --git a/ChangeLog b/ChangeLog
index 1c39b92e6..becb40eec 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,21 @@
+2020-05-20  Marc Nieper-Wißkirchen  
+
+	c-dtoastr: new module
+	c-ldtoastr: new module
+	These modules provide the same functionality as the modules
+	dtoastr and ldtoastr except for the formatting taking place in the
+	C locale.
+	* MODULES.html.sh: Add c-dtoastr and c-ldtoastr.
+	* lib/c-dtoastr.c, lib/c-ldtoastr.c: New files
+	* lib/ftoastr.c: Prefix exported functions when the macro C_LOCALE is
+	defined.  Use c_snprintf and c_strtod/c_strtold instead of
+	snprintf and strtod/strtold whhen the macro C_LOCALE is defined.
+	* lib/ftoastr.h: Add prototypes for c_dtoastr and c_ldtoastr.
+	* modules/c-dtoastr, modules/c-dtoastr-tests, modules/c-ldtoastr,
+	modules/c-ldtoastr-tests: New files.
+	* tests/test-c-dtoastr.c, tests/test-c-dtoastr.sh,
+	tests-c-ldtoastr.c tests-c-ldtoastr.sh: New files.
+
 2020-05-19  Paul Eggert  
 
 	ftoastr: fix ifndef typo
diff --git a/MODULES.html.sh b/MODULES.html.sh
index 318a15a1d..280bd14da 100755
--- a/MODULES.html.sh
+++ b/MODULES.html.sh
@@ -2319,6 +2319,8 @@ func_all_modules ()
   func_echo "$element"
 
   func_begin_table
+  func_module c-dtoastr
+  func_module c-ldtoast
   func_module dtoastr
   func_module ftoastr
   func_module intprops
diff --git a/lib/c-dtoastr.c b/lib/c-dtoastr.c
new file mode 100644
index 0..b57524fb1
--- /dev/null
+++ b/lib/c-dtoastr.c
@@ -0,0 +1,3 @@
+#define LENGTH 2
+#define C_LOCALE 1
+#include "ftoastr.c"
diff --git a/lib/c-ldtoastr.c b/lib/c-ldtoastr.c
new file mode 100644
index 0..5446fc3e7
--- /dev/null
+++ b/lib/c-ldtoastr.c
@@ -0,0 +1,3 @@
+#define LENGTH 3
+#define C_LOCALE 1
+#include "ftoastr.c"
diff --git a/lib/ftoastr.c b/lib/ftoastr.c
index 7a7d4113c..47a83152e 100644
--- a/lib/ftoastr.c
+++ b/lib/ftoastr.c
@@ -33,20 +33,28 @@
 #include 
 #include 
 
+#ifdef C_LOCALE
+# include "c-snprintf.h"
+# include "c-strtod.h"
+# define PREFIX(name) c_ ## name
+#else
+# define PREFIX(name) name
+#endif
+
 #if LENGTH == 3
 # define FLOAT long double
 # define FLOAT_DIG LDBL_DIG
 # define FLOAT_MIN LDBL_MIN
 # define FLOAT_PREC_BOUND _GL_LDBL_PREC_BOUND
-# define FTOASTR ldtoastr
+# define FTOASTR PREFIX (ldtoastr)
 # define PROMOTED_FLOAT long double
-# define STRTOF strtold
+# define STRTOF PREFIX (strtold)
 #elif LENGTH == 2
 # define FLOAT double
 # define FLOAT_DIG DBL_DIG
 # define FLOAT_MIN DBL_MIN
 # define FLOAT_PREC_BOUND _GL_DBL_PREC_BOUND
-# define FTOASTR dtoastr
+# define FTOASTR PREFIX (dtoastr)
 # define PROMOTED_FLOAT double
 #else
 # define LENGTH 1
@@ -54,7 +62,7 @@
 # define FLOAT_DIG FLT_DIG
 # define FLOAT_MIN FLT_MIN
 # define FLOAT_PREC_BOUND _GL_FLT_PREC_BOUND
-# define FTOASTR ftoastr
+# define FTOASTR PREFIX (ftoastr)
 # define PROMOTED_FLOAT double
 # if HAVE_STRTOF
 #  define STRTOF strtof
@@ -65,13 +73,16 @@
may generate one or two extra digits, but that's better than not
working at all.  */
 #ifndef STRTOF
-# define STRTOF strtod
+

Re: Easy Accurate Reading and Writing of Floating-Point Numbers

2020-05-19 Thread Marc Nieper-Wißkirchen

Am Di., 19. Mai 2020 um 09:44 Uhr schrieb Marc Nieper-Wißkirchen
:

> Such input and output routines for floats and doubles would also be
> very helpful in the C language. Do you think this could become a
> Gnulib module?

I have just discovered the "dtoastr" module, which seems to do what I
want. It is, however, locale-dependent, and there is no "c_dtoastr"
version as there is a "c_strtod". (Could we get such a wrapper?)

Marc

Re: Easy Accurate Reading and Writing of Floating-Point Numbers

2020-05-19 Thread Marc Nieper-Wißkirchen

Am Di., 19. Mai 2020 um 17:10 Uhr schrieb Marc Nieper-Wißkirchen
:

> I have just discovered the "dtoastr" module, which seems to do what I
> want. It is, however, locale-dependent, and there is no "c_dtoastr"
> version as there is a "c_strtod". (Could we get such a wrapper?)

Or a version "c_dtoastr" that uses "c_snprintf" and "c_strtod" internally.

Marc

P.S. Could we also get a flag for "(c_)strtod", with which one can
choose the %e or %f format explicitly? At the moment, it always uses
"%g".

Re: Easy Accurate Reading and Writing of Floating-Point Numbers

2020-05-19 Thread Marc Nieper-Wißkirchen

Am Di., 19. Mai 2020 um 17:51 Uhr schrieb Paul Eggert :
>
> On 5/19/20 8:35 AM, Marc Nieper-Wißkirchen wrote:
> >> It is, however, locale-dependent, and there is no "c_dtoastr"
> >> version as there is a "c_strtod". (Could we get such a wrapper?)
> > Or a version "c_dtoastr" that uses "c_snprintf" and "c_strtod" internally.
>
> Yes, that'd be good to have. Could you write that?

I should be able to. But please note that I haven't filled out an
assignment form for copyright transfer yet (which is, by the copyright
law of Germany, where I live, technically not possible, by the way). I
hope this is not a major problem.

Marc

ftoastr.h: Include guard incomplete

2020-05-19 Thread Marc Nieper-Wißkirchen

ftoastr.h tests for _GL_FTOASTR_H, but does not define it.

Re: Easy Accurate Reading and Writing of Floating-Point Numbers

2020-05-19 Thread Marc Nieper-Wißkirchen

Am Di., 19. Mai 2020 um 19:23 Uhr schrieb Tim Rühsen :

> "copyright" here means more "Nutzungsrecht" which you can transfer also
> in Germany. You still stay the "Urheber".

If transferring the "Nutzungsrecht" is enough, everything should be fine.

Marc

Re: Easy Accurate Reading and Writing of Floating-Point Numbers

2020-05-23 Thread Marc Nieper-Wißkirchen

Hi Bruno,

[...]

> The only (tiny) issue is a matter of style: Two of the new module files don't
> end in a newline. In Gnulib, we try to let every non-empty text file end in a
> newline, since "cat FILE" and "sed -e ... FILE" work better this way.

I will correct this when I send you the final patch. In a future
version, I may add a flag to select between %e, %f, and %g.

> I would apply this patch. But first, we need a copyright assignment of yours.
> As mentioned in [1], I think a simple copyright assignment is enough. You 
> start
> the process by taking the gnulib/doc/Copyright/request-assign.future template
> and sending it the FSF.

I sent such an assignment request to the FSF a few days ago but
haven't heard back from them yet.

Marc

Re: stack module

2020-05-23 Thread Marc Nieper-Wißkirchen

Hi Bruno,

> In Gnulib, we usually avoid extensive use of multi-line macros, because
> it hampers debuggability. Here, however, one needs macros, in order to
> accommodate the TYPE argument, which is not necessarily compatible to 'void 
> *'.
> Nevertheless, we would try to put as much code as possible into functions.
> The STACK_INIT macro, for example, could be implemented as a function.

How? This would mean that the function stack_init has to take a void *
argument, which has to be cast to

struct
{
  void *base; ...
}

and we have to hope that this structure is guaranteed to be compatible to

struct
{
  type *base; ...
}

by the C standard.

> > #define STACK_CLEAR(stack)\
> >   free ((stack).base)
>
> Shouldn't this one also set .size and .allocated to 0 ?

A stack can be uninitialized or initialized. An uninitialized stack is
initialized by STACK_INIT. It is cleared (and uninitialized) by
STACK_CLEAR. An uninitialized stack does not have to maintain any
invariant. The only way to use it is to initialize it again. Thus,
setting .size and .allocated seems pointless.

> > #define STACK_POP(stack)\
> >   (stack).base [--(stack).size]
> >
> > #define STACK_DISCARD(stack)\
> >   (--(stack).size)
> >
> > #define STACK_TOP(stack)\
> >   (stack).base[(stack).size - 1]
>
> In these three macros, I would consider to abort() when (stack).size == 0.

In the form of assure of the assure module? Or, to facilitate
optimization, better assume from verify module? In non-debug builds, I
want to make sure that no superfluous checks are made.

Marc

Re: Easy Accurate Reading and Writing of Floating-Point Numbers

2020-05-23 Thread Marc Nieper-Wißkirchen

Am Sa., 23. Mai 2020 um 16:52 Uhr schrieb Bruno Haible :

> > In a future version, I may add a flag to select between %e, %f, and %g.
>
> Hmm, that route is likely going the path to sprintf: First a flag to
> select among %e, %f, %g. Then an optional width. Then a flag to select
> whether to fill with spaces or with zeroes.
> If you go that route, it's better to start with full sprintf and define
> a particular "precision" that would mean "as many digits as needed to
> guarantee that the result can be unambiguously read back in".

So you mean that we should hack it into
http://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/vasnprintf.c?

How can we ensure that this module is (at least) in one-way sync with
glibc's (va)s(n)printf?

But it is an intriguing idea!

Re: stack module

2020-05-23 Thread Marc Nieper-Wißkirchen

Am Sa., 23. Mai 2020 um 16:02 Uhr schrieb Bruno Haible :

> I was expecting that you write
>
>   struct
>   {
> void *base; ...
>   }

This removes type safety. The benefit of the current approach is that
stack types of different types are not compatible.

> Then macro should better be called STACK_FREE or STACK_DESTROY.

> The name STACK_CLEAR suggests that the result is still a valid stack, and 
> empty.

Thank you for this suggestion; much better.

> > In the form of assure of the assure module? Or, to facilitate
> > optimization, better assume from verify module? In non-debug builds, I
> > want to make sure that no superfluous checks are made.
>
> The 'verify' module is for compile-time checks.
>
> 'assure' is an improved form of 'assert'.

The verify  also contains a runtime check `assume', which uses
__builtin_unreachable if available to help the compiler in optimizing
modes.

Re: stack module

2020-05-23 Thread Marc Nieper-Wißkirchen

Am Sa., 23. Mai 2020 um 18:44 Uhr schrieb Paul Eggert :
>
> On 5/23/20 8:55 AM, Marc Nieper-Wißkirchen wrote:
> > A combination of assure and assume would be helpful:
> >
> > #define checked_assume(X) do { assure (X); assume (X); } while (0)
>
> No, because the compiler is entitled to optimize away the 'assure (X)' in this
> case. I installed the attached to try to explain this better.

Sure, but not in "-O0" or "-Og" builds, I think. Or am I mistaken?

But it is definitely better to make it more robust and not rely on
optimization levels:

#include 
#include "verify.h"

#ifdef NDEBUG
# define checked_assume(E) assume (E)
#else
# define checked_assume(E) assert (E)
#endif

Re: stack module

2020-05-23 Thread Marc Nieper-Wißkirchen

Am Sa., 23. Mai 2020 um 17:39 Uhr schrieb Paul Eggert :
>
> On 5/23/20 7:36 AM, Marc Nieper-Wißkirchen wrote:
> > The verify  also contains a runtime check `assume', which uses
> > __builtin_unreachable if available to help the compiler in optimizing
> > modes.
>
> I wouldn't call 'assume (X)' a "runtime check". It's more an 
> anti-runtime-check.
>
> 'assume (X)' is a directive from the programmer to the compiler that X is true
> so that the compiler needn't generate code to test whether X is true. This is
> why 'assume' is in verify.h: it's related to static checking (in this case,
> static checking done by the programmer), not dynamic checking.
>
> Perhaps I should add a comment to this effect

A combination of assure and assume would be helpful:

#define checked_assume(X) do { assure (X); assume (X); } while (0)

In debug builds, we get assertions and can check our assumptions. In
non-debug (NDEBUG) builds, we just have the compiler hints.

Re: stack module

2020-05-24 Thread Marc Nieper-Wißkirchen

Thank you very much, Paul!

One bit of the commentary is still misleading, though, I think. You
wrote for the affirm macro that if NDEBUG is defined that the behavior
is undefined if E has side effects. That's not true as long as E does
not evaluate to false.

Marc

Am So., 24. Mai 2020 um 04:14 Uhr schrieb Paul Eggert :
>
> On 5/23/20 3:06 PM, Bruno Haible wrote:
> > How about this instead?
>
> Thanks, good point about the danger. Also, I forgot to include verify.h.
>
> I tightened up the commentary, folded in Marc's suggestion, and installed the
> attached.

Bundling Gnulib tests with an application

2020-05-24 Thread Marc Nieper-Wißkirchen

When an application bundles Gnulib tests in its distribution, say in a
folder named gnulib-tests, and has its own suite of Automake tests in the
folder tests, the console output by Automake when running

make check

is suboptimal because it will print something like


Testsuite summary for MyProgram 1.0


twice, once when processing the tests folder and once when processing the
gnulib-tests folder.

Is there a way to change the title that is printed by Automake?

Marc

Re: Add gl_list_remove_last to list/xlist

2020-05-22 Thread Marc Nieper-Wißkirchen

Hi Bruno,

Am Sa., 2. Mai 2020 um 17:49 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > Okay; I agree that a separate stack or FIFO module can make more
> > sense; in particular when it only has to deal with a single
> > implementation of an underlying data structure. At the moment I do
> > have other work to finish first, but maybe I will find some time in
> > the near future for a stack module.

Alternatively, one could implement a universally usable stack through
the following header file (mimicking somewhat C++ templates). What do
you think? It will be a lot faster than using the general list modules
of Gnulib.

It would be used like:

STACK (int) stack;
STACK_INIT (stack);
assert (STACK_EMPTY (stack));
STACK_PUSH (stack, 4)
assert (!STACK_EMPTY (stack));
assert (STACK_TOP (stack) == 4);
assert (STACK_POP (stack) == 4);
assert (STACK_EMPTY (stack));
STACK_CLEAR (stack);

So long,

Marc



#ifndef _GL_STACK_H
#define _GL_STACK_H

#include 
#include 
#include 

#define STACK(type)\
  struct {\
type *base;\
size_t size;\
size_t allocated;\
  }

#define STACK_INIT(stack)\
  do\
{\
  (stack).base = NULL;\
  (stack).size = 0;\
  (stack).allocated = 0;\
}\
  while (0)

#define STACK_CLEAR(stack)\
  free ((stack).base)

#define STACK_EMPTY(stack)\
  ((stack).size == 0)

#define STACK_BASE(stack)\
  ((stack).base)

#define STACK_PUSH(stack, item)\
  do\
{\
  if ((stack).size == (stack).allocated)\
(stack).base = x2nrealloc ((stack).base, &(stack).allocated,
sizeof (item)); \
  (stack).base [(stack).size++] = item;\
}\
  while (0)

#define STACK_POP(stack)\
  (stack).base [--(stack).size]

#define STACK_DISCARD(stack)\
  (--(stack).size)

#define STACK_TOP(stack)\
  (stack).base[(stack).size - 1]

#define STACK_SIZE(stack)\
  ((stack).size)

#endif /* _GL_STACK_H */

Re: stack module

2020-05-23 Thread Marc Nieper-Wißkirchen

Hi Paul,

Am Sa., 23. Mai 2020 um 19:33 Uhr schrieb Paul Eggert :

> Probably not for -O0. I'm not so sure for -Og. Either way, we shouldn't rely 
> on
> GCC's current behavior in this area as it is neither documented nor guaranteed
> to stay the same.

Agreed.

> > #include 
> > #include "verify.h"
> >
> > #ifdef NDEBUG
> > # define checked_assume(E) assume (E)
> > #else
> > # define checked_assume(E) assert (E)
> > #endif
>
> Something like that would work, though the name "checked_assume" is misleading
> since the assumption is not always checked.
>
> "affirm (E)" would be a better name, since the name's not being used anymore 
> by
> the old software verification project[1] and it slides in well next to 
> "assume"
> and "assert". (Some day we're going to run out of synonyms. :-)

Believe it or not, but when I first proposed the (initial version of
the) macro, I wanted to name it "affirm" after I had looked for
synonyms. Only eventually, I switched to the name "checked_assume".

But if "affirm" is fine with you, I would love to see it in a module.
Either in verify or assure or in a new module named affirm.

valgrind-tests and suppression files

2020-05-23 Thread Marc Nieper-Wißkirchen

Is there any built-in support in the valgrind-tests module for
suppression files? For example, if I want to run my tests under
valgrind (if available), I write

LOG_COMPILER = $(VALGRIND)

or, in case I use libtool,

LOG_COMPILER = $(LIBTOOL) --mode=execute $(VALGRIND)

But how to add valgrind suppression files in this context?

LOG_COMPILER = $(VALGRIND) --suppressions=...

is not a good solution because it will fail when VALGRIND is empty
because valgrind-tests couldn't find a valgrind.

What's the canonical solution of this problem?

Re: stack module

2020-05-23 Thread Marc Nieper-Wißkirchen

Am Sa., 23. Mai 2020 um 22:51 Uhr schrieb Paul Eggert :
>
> On 5/23/20 10:38 AM, Marc Nieper-Wißkirchen wrote:
> > But if "affirm" is fine with you, I would love to see it in a module.
> > Either in verify or assure or in a new module named affirm.
>
> Something like the attached?

That would be a good addition to Gnulib, indeed, I think. (And may
even already be used by a number of modules, which currently call
abort explicitly.)

I find the comments in your code very valuable. I may change only one
thing. Above the definition of affirm, you write "Unlike standard
'assert', this macro always compiles E even when NDEBUG is defined
...". I think it is better to change it to "... always evaluates E
even when ...". In fact, when the assumption is fulfilled, E must have
been evaluated. If the assumption fails we are in the realm of
undefined behavior, so we may as well claim that E has been evaluated.
This is important not only so that the side effects of the evaluation
of E are documented, but also to remind the user of a potentially
costly evaluation.

Module suggestion: Atomic operations

2020-05-24 Thread Marc Nieper-Wißkirchen

C11 has introduced atomic types and atomic operations.  When they are not
available, one can use locks/mutexes instead.

It would be nice if there was a Gnulib module that abstracts over this,
much like the threadlib module and friends abstract over a specific
threading implementation.

What I am thinking of is the following: Given a type T, a new Gnulib module
atomic allows the declaration of an atomic version of type T.  This is
straightforward on a platform that has .  Otherwise the atomic
version of T would be a struct consisting of an object of type T together
with a lock.

The rest of the module would then provide some simple atomic primitives
like fetch_and_add, etc. that are either mapped to the C11 stdatomic
counterparts or are implemented using the lock.

Marc

Re: Unused parameter warnings

2020-10-07 Thread Marc Nieper-Wißkirchen

Am Di., 6. Okt. 2020 um 23:06 Uhr schrieb Bruno Haible :

> Marc Nieper-Wißkirchen wrote:
> > In file included from lib/gl_linked_list.c:29:
> > lib/gl_linked_list.c: In function 'gl_linked_iterator_from_to':
> > lib/gl_anylinked_list2.h:938:1: error: function might be candidate for
> > attribute 'pure' if it is known to return normally
> > [-Werror=suggest-attribute=pure]
> >   938 | gl_linked_iterator_from_to (gl_list_t list,
> >   | ^~
>
> But no such warning in line 920? Makes no sense to me.

The function is static and will be inlined everywhere so the static
analyzer (in the relevant optimizing levels) doesn't seem to get to
the point where it has to/can check whether the function is pure or
not.

> And a function that may invoke abort () does "affect observable state".

This description of the a pure function does not seem to be accurate.
When a function calls another function like abort that is marked with
_Noreturn in a pure context, for the compiler the function can still
be pure (but not const). It can eliminate a second call to the
function with the same parameters. I am pretty sure that the warning
of GCC in line 938 is correct. (It has to be conservative about this
warning because you cannot mark a funciton non-pure.)

(That GCC understands about _Noreturn in a pure context is important
as otherwise the assert debugging facility becomes less usable.)

Re: stack module

2020-10-07 Thread Marc Nieper-Wißkirchen

Dear Bruno,

please see the attached patch for the new module.

Thanks,

Marc

Am Mi., 7. Okt. 2020 um 11:01 Uhr schrieb Marc Nieper-Wißkirchen
:
>
> Hi Bruno,
>
> I am finally finishing my work on the stack module.
>
> Am Do., 23. Juli 2020 um 12:36 Uhr schrieb Bruno Haible :
>
> [...]
>
> > This is perfectly acceptable for Gnulib. It has debuggability and type 
> > safety.
> >
> > You have precedent e.g. in lib/diffseq.h and lib/aligned-malloc.h.
> >
> > You can even omit the 'GL_' prefix from the macro names. The user can #undef
> > the macros after including the file; therefore there is nearly no risk of
> > collision with macros defined by other code.
>
> After I have given it a bit more thought, I see a different type of
> collision possibility, namely if some header file uses the stack
> header file. It will leak the macro names. Assume that the stack
> module accepts the macro name ELEMENT for the element type. A
> hypothetical header for the frob type may be implemented as:
>
> #ifndef FROB_H
> #define FROB_H
> ...
> #define ELEMENT int
> #define STORAGECLASS static inline
> #include "stack.h"
> ...
> struct frob
> {
>   stack frob_stack;
>   ...
> };
> ...
> #endif /* FROB_H */
>
> Now, user code cannot do:
>
> #define ELEMENT hydrogen
> #include "frob.h"
> ...
>
> (The ELEMENT definition in the first line could in fact be exported by
> some other header file.)
>
> Therefore, I think I prefer to use prefixed names.
>
> Cheers,
>
> Marc
From c449d60058f4a5806729ec76779fbb050890ddaf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marc=20Nieper-Wi=C3=9Fkirchen?= 
Date: Wed, 7 Oct 2020 13:29:10 +0200
Subject: [PATCH] Type-safe stack data type. * MODULES.html.sh: Add entry for
 the stack module. * modules/stack: New file. * modules/stack-tests: New file.
 * lib/stack.h: New file. * tests/test-stack.c: New file.

---
 ChangeLog   |   9 +++
 MODULES.html.sh |   1 +
 lib/stack.h | 163 
 modules/stack   |  25 +++
 modules/stack-tests |  13 
 tests/test-stack.c  |  74 
 6 files changed, 285 insertions(+)
 create mode 100644 lib/stack.h
 create mode 100644 modules/stack
 create mode 100644 modules/stack-tests
 create mode 100644 tests/test-stack.c

diff --git a/ChangeLog b/ChangeLog
index 875f3551a..b64689f73 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2020-10-07  Marc nieper-Wißkirchen  
+
+	stack: New module.
+	* MODULES.html.sh: Add entry for the stack module.
+	* modules/stack: New file.
+	* modules/stack-tests: New file.
+	* lib/stack.h: New file.
+	* tests/test-stack.c: New file.
+
 2020-10-05  Paul Eggert  
 
 	thread: pacify GCC on Solaris 10
diff --git a/MODULES.html.sh b/MODULES.html.sh
index a8a629e29..7e7cdae3e 100755
--- a/MODULES.html.sh
+++ b/MODULES.html.sh
@@ -2031,6 +2031,7 @@ func_all_modules ()
   func_module readline
   func_module readtokens
   func_module readtokens0
+  func_module stack
   func_module strverscmp
   func_module filevercmp
   func_end_table
diff --git a/lib/stack.h b/lib/stack.h
new file mode 100644
index 0..22e976952
--- /dev/null
+++ b/lib/stack.h
@@ -0,0 +1,163 @@
+/* Type-safe stack data type.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */
+
+/* Written by Marc Nieper-Wißkirchen , 2020.  */
+
+/* This header file does not have include-guards as it is meant to be
+   includeable multiple times as long as the necessary defines have
+   been set up.
+
+   A stack is implemented with a homogenous array of elements in
+   memory, which will be resized (and moved) as needed.  Elements are
+   passed by value and it is the responsibility of the user-code to
+   free any allocate and free any resources associated with individual
+   elements.
+
+   The exported names are all prefixed by ‘stack’ (e.g. stack_init),
+   but this can be changed by setting an appropriate define.
+ Type:   stack_type
+ Initialization: stack_init ();
+ De-initialization:  stack_destroy ();
+ Predicate:  bool res = stack_empty ();
+ Introspection:  ELEMENT *base = stack_base

Re: stack module

2020-10-07 Thread Marc Nieper-Wißkirchen

Hi Bruno,

I am finally finishing my work on the stack module.

Am Do., 23. Juli 2020 um 12:36 Uhr schrieb Bruno Haible :

[...]

> This is perfectly acceptable for Gnulib. It has debuggability and type safety.
>
> You have precedent e.g. in lib/diffseq.h and lib/aligned-malloc.h.
>
> You can even omit the 'GL_' prefix from the macro names. The user can #undef
> the macros after including the file; therefore there is nearly no risk of
> collision with macros defined by other code.

After I have given it a bit more thought, I see a different type of
collision possibility, namely if some header file uses the stack
header file. It will leak the macro names. Assume that the stack
module accepts the macro name ELEMENT for the element type. A
hypothetical header for the frob type may be implemented as:

#ifndef FROB_H
#define FROB_H
...
#define ELEMENT int
#define STORAGECLASS static inline
#include "stack.h"
...
struct frob
{
  stack frob_stack;
  ...
};
...
#endif /* FROB_H */

Now, user code cannot do:

#define ELEMENT hydrogen
#include "frob.h"
...

(The ELEMENT definition in the first line could in fact be exported by
some other header file.)

Therefore, I think I prefer to use prefixed names.

Cheers,

Marc

Re: Unused parameter warnings

2020-10-05 Thread Marc Nieper-Wißkirchen

Thank you! This works perfectly now.

In the meantime, my GCC has reported another warning here.

In file included from lib/gl_linked_list.c:29:
lib/gl_linked_list.c: In function 'gl_linked_iterator_from_to':
lib/gl_anylinked_list2.h:938:1: error: function might be candidate for
attribute 'pure' if it is known to return normally
[-Werror=suggest-attribute=pure]
  938 | gl_linked_iterator_from_to (gl_list_t list,
  | ^~

Marc

Am Mo., 5. Okt. 2020 um 00:08 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> Marc Nieper-Wißkirchen wrote:
> > When compiling Gnulib with -Wunused-parameter, I get the following
> > report from GCC:
> >
> > lib/localename.c: In function 'gl_locale_name_thread_unsafe':
> > lib/localename.c:3117:57: error: unused parameter 'categoryname'
> > [-Werror=unused-parameter]
> >  3117 | gl_locale_name_thread_unsafe (int category, const char 
> > *categoryname)
> >   | ^~~~
> > lib/localename.c: In function 'gl_locale_name_posix':
> > lib/localename.c:3256:49: error: unused parameter 'categoryname'
> > [-Werror=unused-parameter]
> >  3256 | gl_locale_name_posix (int category, const char *categoryname)
> >   | ^~~~
> > lib/localename.c: In function 'gl_locale_name_environ':
> > lib/localename.c:3321:29: error: unused parameter 'category'
> > [-Werror=unused-parameter]
> >  3321 | gl_locale_name_environ (int category, const char *categoryname)
> >   | ^~~~
> >
> > Wouldn't it make sense to insert MAYBE_UNUSED from "attribute.h" here?
>
> Yes. -Wunused-parameter is part of -Wall, unfortunately. Sigh.
>
> Here I prefer _GL_UNUSED, because it does not require '#include 
> "attribute.h"'.
>
>
> 2020-10-04  Bruno Haible  
>
> localename: Fix a couple of "unused parameter" warnings.
> Reported by Marc Nieper-Wißkirchen  in
> <https://lists.gnu.org/archive/html/bug-gnulib/2020-10/msg00014.html>.
> * lib/localename.c (gl_locale_name_thread_unsafe, 
> gl_locale_name_thread,
> gl_locale_name_posix, gl_locale_name_environ): Add _GL_UNUSED in
> parameter list.
>
> diff --git a/lib/localename.c b/lib/localename.c
> index 5731ceb..1bf47ed 100644
> --- a/lib/localename.c
> +++ b/lib/localename.c
> @@ -3114,7 +3114,7 @@ freelocale (locale_t locale)
>  static
>  # endif
>  const char *
> -gl_locale_name_thread_unsafe (int category, const char *categoryname)
> +gl_locale_name_thread_unsafe (int category, const char *categoryname 
> _GL_UNUSED)
>  {
>  # if HAVE_GOOD_USELOCALE
>{
> @@ -3229,7 +3229,7 @@ gl_locale_name_thread_unsafe (int category, const char 
> *categoryname)
>  #endif
>
>  const char *
> -gl_locale_name_thread (int category, const char *categoryname)
> +gl_locale_name_thread (int category, const char *categoryname _GL_UNUSED)
>  {
>  #if HAVE_GOOD_USELOCALE
>const char *name = gl_locale_name_thread_unsafe (category, categoryname);
> @@ -3253,7 +3253,7 @@ gl_locale_name_thread (int category, const char 
> *categoryname)
>  #endif
>
>  const char *
> -gl_locale_name_posix (int category, const char *categoryname)
> +gl_locale_name_posix (int category, const char *categoryname _GL_UNUSED)
>  {
>  #if defined WINDOWS_NATIVE
>if (LC_MIN <= category && category <= LC_MAX)
> @@ -3318,7 +3318,7 @@ gl_locale_name_posix (int category, const char 
> *categoryname)
>  }
>
>  const char *
> -gl_locale_name_environ (int category, const char *categoryname)
> +gl_locale_name_environ (int category _GL_UNUSED, const char *categoryname)
>  {
>const char *retval;
>
>

Unused parameter warnings

2020-10-04 Thread Marc Nieper-Wißkirchen

When compiling Gnulib with -Wunused-parameter, I get the following
report from GCC:

lib/localename.c: In function 'gl_locale_name_thread_unsafe':
lib/localename.c:3117:57: error: unused parameter 'categoryname'
[-Werror=unused-parameter]
 3117 | gl_locale_name_thread_unsafe (int category, const char *categoryname)
  | ^~~~
lib/localename.c: In function 'gl_locale_name_posix':
lib/localename.c:3256:49: error: unused parameter 'categoryname'
[-Werror=unused-parameter]
 3256 | gl_locale_name_posix (int category, const char *categoryname)
  | ^~~~
lib/localename.c: In function 'gl_locale_name_environ':
lib/localename.c:3321:29: error: unused parameter 'category'
[-Werror=unused-parameter]
 3321 | gl_locale_name_environ (int category, const char *categoryname)
  | ^~~~

Wouldn't it make sense to insert MAYBE_UNUSED from "attribute.h" here?

Thanks,

Marc

Re: Non-opaque hamt type?

2020-10-12 Thread Marc Nieper-Wißkirchen

One final issue (I hope):

At the moment, the header file exposes an opaque struct Hamt and
communication with the code happens through (stack-allocated) pointers
to a Hamt. In the implementation, however, each Hamt just consists of
two pointers (a pointer to a function table and a pointer to the
root), so stack-allocating Hamts instead and passing them around by
values makes as much sense.

This would have the advantage of reducing heap allocation. The
disadvantage would be that adding further fields to a Hamt in future
extensions might be more problematic and that identity of Hamts cannot
be decided through pointer identity anymore (so the protocol how to
decide whether an element has been inserted or not has to be changed).

What do you think?

Am So., 11. Okt. 2020 um 21:09 Uhr schrieb Bruno Haible :

> OK for me. Thanks for having listened to the many remarks.

Your remarks only helped to make the module better!

Marc

Re: Non-opaque hamt type?

2020-10-18 Thread Marc Nieper-Wißkirchen

Okay, if you find the latter protocol better anyway, I will switch to
this protocol, and hamts will be stack-allocated (just two words) and
passed by value.

Thanks,

Marc

Am So., 18. Okt. 2020 um 19:58 Uhr schrieb Bruno Haible :
>
> Marc Nieper-Wißkirchen wrote:
> > The existing protocol is as follows:
> >
> > Hamt_entry *e = hamt_entry (...);
> > Hamt_entry *p = e;
> > Hamt *new_hamt = hamt_insert (old_hamt, );
> > if (old_hamt == new_hamt)
> >   {
> > /* The element hasn't been insert as an equivalent element has already 
> > been in the hamt. p now holds a reference to the entry that already existed 
> > in the hamt.
> > element_free (e);
> > ...
> > hamt_free (old_hamt); /* We don't have to free new_hamt because no new 
> > hamt was created. */
> >   }
> > else
> >   {
> > /* The element has been inserted. p hasn't changed. */
> > ...
> > hamt_free (old_hamt);  /* This frees all hamts */
> > hamt_free (new_hamt); /* and all elements inserted, including e. */
> >   }
> >
> > A protocol where no pointer values need to be compared could use p to
> > carry the information:
> >
> > Hamt_entry *e = hamt_entry ();
> > Hamt_entry *p = e;
> > Hamt new_hamt = hamt_insert (old_hamt, );
> > if (p == e)
> >   {
> > /* The element has been inserted. */
> > ... /* See above. */
> >   }
> > else if (p == NULL)
> >   {
> > /* The element e already existed in the hamt. */
> >... /* See above. */
> >   }
> > else /* p != e && p != NULL */
> >   {
> > /* An element equivalent to e already existed in the hamt. p now holds 
> > this element. */
> > ... /* See above. */
> >   }
>
> I find the latter protocol more robust: it does not depend on details of
> the implementation of hamt_insert.
>
> Can you decide on your original question (allow stack-allocated HAMTs)?
> I don't feel I can help you decide this.
>
> Bruno
>

[New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-09 Thread Marc Nieper-Wißkirchen

Hi,

after I have contributed two comparably trivial modules to Gnulib, I
would like to contribute a less trivial module this time. It
implements a persistent version of Phil Bagwell's HAMTs, which has
been popularized by Clojure. HAMTs can be used when a persistent
(functional/pure) version of a data structure akin to hash tables is
needed. For example, the dynamic environment of a (possibly
multi-threaded) Lisp or Scheme can be modeled with persistent HAMTs.

Please take a look at the attached patch.

Thank you,

Marc
From 39a1ac1f78c8d00b1dfad4f260318a6fb5cf5584 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marc=20Nieper-Wi=C3=9Fkirchen?= 
Date: Wed, 7 Oct 2020 09:53:48 +0200
Subject: [PATCH] hamt: New module.

This module provides (persistent) hash array mapped tries.

* MODULES.html.sh: Add hamt.
* lib/hamt.c, lib/hamt.h, modules/hamt, modules/hamt-tests,
tests/test-hamt.c: New files.
---
 MODULES.html.sh|   1 +
 lib/hamt.c | 812 +
 lib/hamt.h |  97 ++
 modules/hamt   |  29 ++
 modules/hamt-tests |  11 +
 tests/test-hamt.c  | 148 +
 6 files changed, 1098 insertions(+)
 create mode 100644 lib/hamt.c
 create mode 100644 lib/hamt.h
 create mode 100644 modules/hamt
 create mode 100644 modules/hamt-tests
 create mode 100644 tests/test-hamt.c

diff --git a/MODULES.html.sh b/MODULES.html.sh
index a8a629e29..b48ca2bc4 100755
--- a/MODULES.html.sh
+++ b/MODULES.html.sh
@@ -2028,6 +2028,7 @@ func_all_modules ()
   func_module hash-pjw
   func_module hash-pjw-bare
   func_module hash
+  func_module hamt
   func_module readline
   func_module readtokens
   func_module readtokens0
diff --git a/lib/hamt.c b/lib/hamt.c
new file mode 100644
index 0..4876d5809
--- /dev/null
+++ b/lib/hamt.c
@@ -0,0 +1,812 @@
+/* (Persistent) hash array mapped tries.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Written by Marc Nieper-Wißkirchen , 2020.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */
+
+#include 
+#include "hamt.h"
+
+#include 
+#include 
+#include 
+#include 
+#include "count-one-bits.h"
+#include "verify.h"
+#include "xalloc.h"
+
+/* See: Phil Bagwell (2000). Ideal Hash Trees (Report). Infoscience
+   Department, École Polytechnique Fédérale de Lausanne.
+
+   http://infoscience.epfl.ch/record/64398/files/idealhashtrees.pdf
+
+   We implement a persistent version of hash array mapped tires.  Each
+   updating operation returns a new hamt, which shares structure with
+   the original one.  If persistence is not needed, transient hash
+   tables are probably faster. */
+
+typedef
+#ifdef GL_HAMT_THREAD_SAFE
+_Atomic
+#endif
+size_t ref_counter;
+
+/* Hash values are of type size_t.  For each level of the trie, we use
+   5 bits (corresponding to lg2 of the width of a 32-bit word.  */
+#define MAX_DEPTH ((SIZE_WIDTH + 4) / 5)
+
+/***/
+/* Entry Types */
+/***/
+
+/* Leaf nodes are of type element.  Non-leaf nodes are either
+   subtries or, if at maximal depth, buckets.  */
+enum entry_type
+{
+  element_entry = 0,
+  subtrie_entry = 1,
+  bucket_entry = 2
+};
+
+/* Return the type an entry.  */
+static enum entry_type
+entry_type (const Hamt_entry *entry)
+{
+  return entry->ref_count & 3;
+}
+
+//
+/* Reference Counts */
+//
+
+/* Initialize the reference counter, storing its type.  */
+static void
+init_ref_counter (ref_counter *counter, enum entry_type type)
+{
+  *counter = 4 + type;
+}
+
+/* Increase the reference counter of an entry.  */
+static void
+inc_ref_counter (ref_counter *counter)
+{
+  *counter += 4;
+}
+
+/* Decrease the entry reference counter.  Return false if the entry
+   can be deleted.  */
+static bool
+dec_ref_counter (ref_counter *counter)
+{
+  *counter -= 4;
+  return *counter >= 4;
+}
+
+/**/
+/* Structures */
+/**/
+
+/* Different generations of a hamt share a function table.  */
+struct function_table
+{
+  Hamt_hasher *hasher;
+  Hamt_comparator *comparator;
+  Hamt_freer *freer;
+  ref_counter ref_count;
+};
+
+/* Different generations of a hamt share subtries.  A singleton
+   subtrie is modelled as a single element.  */
+struct subtrie
+{
+  ref_counter ref_count;
+  /* Nodes carry labels from 0 to 31.  The i-th bit in map is set if
+ the node l

Re: Unused parameter warnings

2020-10-10 Thread Marc Nieper-Wißkirchen

I can ask for a better description of "pure" in the GCC manual. In any
case, if a function doesn't return at all due to a function call
marked with _Noreturn, it does not affect the observable state locally
and GCC can avoid "emitting some calls in repeated invocations of the
function with the same argument values". As I wrote before, this is an
important observation as otherwise "assert" couldn't be used in "pure"
functions.

Anyway, the function gl_linked_iterator_from_to in Gnulib should
probably be fixed independently by adding the "pure" attribute so that
the code is compilable with -Werror -Wall -Wextra.

Am Sa., 10. Okt. 2020 um 15:50 Uhr schrieb Bruno Haible :
>
> Marc Nieper-Wißkirchen wrote:
> > > And a function that may invoke abort () does "affect observable state".
> >
> > This description of the a pure function does not seem to be accurate.
> > When a function calls another function like abort that is marked with
> > _Noreturn in a pure context, for the compiler the function can still
> > be pure (but not const). It can eliminate a second call to the
> > function with the same parameters. I am pretty sure that the warning
> > of GCC in line 938 is correct.
>
> The documentation of ATTRIBUTE_PURE in Gnulib is taken from the GCC
> documentation [1][2], therefore if you think it needs to be fixed, it's
> through a GCC bug report.
>
> Bruno
>
> [1] https://lists.gnu.org/archive/html/bug-gnulib/2020-05/msg00105.html
> [2] 
> https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Common-Function-Attributes.html
>

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen

Am Sa., 10. Okt. 2020 um 16:54 Uhr schrieb Bruno Haible :
>
> Since you define GL_HAMT_THREAD_SAFE to either 0 or 1, you need
> to test it through
>   #if GL_HAMT_THREAD_SAFE
> not
>   #ifdef GL_HAMT_THREAD_SAFE
>
> Bruno

Oh, yes... thanks!

Re: Unused parameter warnings

2020-10-10 Thread Marc Nieper-Wißkirchen

Am Sa., 10. Okt. 2020 um 16:39 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > I can ask for a better description of "pure" in the GCC manual.
>
> Yes, please.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97364

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen

Speaking of GL_HAMT_THREAD_SAFE:

Is there a special Gnulib way (or Autoconf macro) for the following test:

#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
!defined (__STD_NO_ATOMICS__)

I am asking because there may be non-C11 compilers that nevertheless
understand _Atomic.

Am Sa., 10. Okt. 2020 um 17:01 Uhr schrieb Marc Nieper-Wißkirchen
:
>
> Am Sa., 10. Okt. 2020 um 16:54 Uhr schrieb Bruno Haible :
> >
> > Since you define GL_HAMT_THREAD_SAFE to either 0 or 1, you need
> > to test it through
> >   #if GL_HAMT_THREAD_SAFE
> > not
> >   #ifdef GL_HAMT_THREAD_SAFE
> >
> > Bruno
>
> Oh, yes... thanks!

Re: stack module

2020-10-10 Thread Marc Nieper-Wißkirchen

Hi Bruno,

Am Sa., 10. Okt. 2020 um 16:06 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > please see the attached patch for the new module.
>
> Very nice. Looks all good, except for some nit-picking in the comments:

Thanks.

>
> > + Introspection:  ELEMENT *base = stack_base ();
>
> I would add a comment here:
>   Where ELEMENT is the type to which GL_STACK_ELEMENT was defined when
>   this file was included.
> (It would be tempting to write
> GL_STACK_ELEMENT *base = stack_base ();
> but GL_STACK_ELEMENT is no longer defined after the file was included...)

I will add this comment.

> Other than that, please feel free to commit and push this. Thanks!!

I will do this as soon as I have been approved by Paul. (Do I have to
do something about the ChangeLog entry or will it be auto-generated
from my Git commit message?)

Marc

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen

Hi Bruno,

Am Sa., 10. Okt. 2020 um 16:35 Uhr schrieb Bruno Haible :

> It is exciting to see such a datastructure with various application domains 
> [1]
> in Gnulib!

I'm glad that you like the idea.

>
> I haven't read through it line by line; nevertheless a couple of comments:

A few more lines will follow before the final draft is done because I
have been adding some destructive update procedures in the meantime.

>   - +/* The public interface is documented in the file hamt.c.
>
> This is not good. As a user of the module, I would want to read *only*
> hamt.h and know how to use the public interface. In other words, hamt.h
> should have the functions' comments.

I followed the example of the already existing "hash" module. But I
can easily move the public comments into the header file.

>   - "Each
> updating operation returns a new hamt, which shares structure with
> the original one."
> So, after calling hamt_insert or hamt_delete, which function should I call
> to delete the original Hamt without affecting the new one? (In order to
> avoid accumulating pieces of old Hamts in memory.)

When you do, say,

new_hamt = hamt_insert (hamt, ...)

and new_hamt != hamt afterwards, you have to delete both hamt and
new_hamt with hamt_free to free all the memory. After you have freed
the first hamt, the second won't be affected.

>   - "The GL_HAMT_THREAD_SAFE flag is set if the implementation of hamts
> is thread-safe as long as two threads do not simultaneously access
> the same hamt."
> I don't understand this sentence. Isn't the guarantee that
> "is thread-safe as long as two threads do not simultaneously access
> the same hamt" trivial? And what you want to achieve through the use
> of _Atomic is that it is thread-safe *also* when two threads access
> the same hamt?

The guarantee is not trivial because two different hamts may share
some structure.

So... assume you have a hamt. You do some operations on it (or you do
hamt_copy) to get a new_hamt that shares structure. Now if thread 1
works on hamt and thread 2 on new_hamt, it is safe when
GL_HAMT_THREAD_SAFE is set.

If two threads access the same hamt, it is not guaranteed to be safe.
If you need this, get an (extremely shallow) copy through hamt_copy
first.

Marc

[PATCH] stack: New module.

2020-10-10 Thread Marc Nieper-Wißkirchen

From: Marc Nieper-Wißkirchen 

* MODULES.html.sh: Add entry for the stack module.
* modules/stack: New file.
* modules/stack-tests: New file.
* lib/stack.h: New file.
* tests/test-stack.c: New file.
---
 ChangeLog   |   9 +++
 MODULES.html.sh |   1 +
 lib/stack.h | 165 
 modules/stack   |  25 +++
 modules/stack-tests |  13 
 tests/test-stack.c  |  74 
 6 files changed, 287 insertions(+)
 create mode 100644 lib/stack.h
 create mode 100644 modules/stack
 create mode 100644 modules/stack-tests
 create mode 100644 tests/test-stack.c

diff --git a/ChangeLog b/ChangeLog
index b479cf0c7..f496c8345 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2020-10-10  Marc Nieper-Wißkirchen  
+
+   stack: New module.
+   * MODULES.html.sh: Add entry for the stack module.
+   * modules/stack: New file.
+   * modules/stack-tests: New file.
+   * lib/stack.h: New file.
+   * tests/test-stack.c: New file.
+
 2020-10-10  Paul Eggert  
 
attribute: improve const, pure doc
diff --git a/MODULES.html.sh b/MODULES.html.sh
index a8a629e29..7e7cdae3e 100755
--- a/MODULES.html.sh
+++ b/MODULES.html.sh
@@ -2031,6 +2031,7 @@ func_all_modules ()
   func_module readline
   func_module readtokens
   func_module readtokens0
+  func_module stack
   func_module strverscmp
   func_module filevercmp
   func_end_table
diff --git a/lib/stack.h b/lib/stack.h
new file mode 100644
index 0..5d803901c
--- /dev/null
+++ b/lib/stack.h
@@ -0,0 +1,165 @@
+/* Type-safe stack data type.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */
+
+/* Written by Marc Nieper-Wißkirchen , 2020.  */
+
+/* This header file does not have include-guards as it is meant to be
+   includeable multiple times as long as the necessary defines have
+   been set up.
+
+   A stack is implemented with a homogenous array of elements in
+   memory, which will be resized (and moved) as needed.  Elements are
+   passed by value and it is the responsibility of the user-code to
+   free any resources associated with individual elements.
+
+   The exported names are all prefixed by ‘stack’ (e.g. stack_init),
+   but this can be changed by setting an appropriate define.
+ Type:   stack_type
+ Initialization: stack_init ();
+ De-initialization:  stack_destroy ();
+ Predicate:  bool res = stack_empty ();
+ Introspection:  ELEMENT *base = stack_base ();
+ Pushing:stack_push (, element);
+ Popping:ELEMENT element = stack_pop ();
+ Discarding: stack_discard ();
+ Top-of-stack:   ELEMENT element = stack_top ();
+ Size:   size_t size = stack_size ();
+
+   Here, ELEMENT is the type to which GL_STACK_ELEMENT was defined when
+   this file was included.
+*/
+
+/* Before including this file, you need to define:
+ GL_STACK_ELEMENT   The type of the elements on the stack.
+ GL_STACK_STORAGECLASS  (Optional) The storage class specifier (e.g. 
static)
+to use in the function definitions.
+ GL_STACK_NAME  (Optional) The prefix to use for the type names
+and functions definitions instead of the default
+‘stack’.
+
+   After including this file, these names will be undefined.
+
+   Before including this file, you also need to include:
+ #include ""
+ #include ""
+ #include "assure.h"
+ #include "xalloc.h"
+*/
+
+#ifndef GL_STACK_ELEMENT
+# error "Please define GL_STACK_ELEMENT first."
+#endif
+
+#ifndef GL_STACK_STORAGECLASS
+# define GL_STACK_STORAGECLASS
+#endif
+
+#ifndef GL_STACK_NAME
+# define GL_STACK_NAME stack
+#endif
+
+#define _GL_STACK_PREFIX(name) _GL_CONCAT(GL_STACK_NAME, _GL_CONCAT(_, name))
+#define _GL_STACK_TYPE _GL_STACK_PREFIX(type)
+
+typedef struct
+{
+  GL_STACK_ELEMENT *base;
+  size_t size;
+  size_t allocated;
+} _GL_STACK_TYPE;
+
+/* Initialize a stack.  */
+GL_STACK_STORAGECLASS void
+_GL_STACK_PREFIX (init) (_GL_STACK_TYPE *stack)
+{
+  stack->base = NULL;
+  stack->size = 0;
+  stack->allocated = 0;
+}
+
+/* Destroy a stack by freeing the allocated space.  */
+GL_STACK_STORAGECLASS vo

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen

I have attached an improved version of the HAMT module to this email.

Apart from improving the comments (which includes moving some
documentation into the header file) and changing the things already
discussed, I added a few more tests and three procedures for in-place
updates.

For example, you can now do hamt_insert_x to destructively insert an
element into a hamt (instead of inserting the element into a
conceptual copy as you do with hamt_insert). Hamts that share
structure with the modified hamt are not effected. (The idea with the
_x prefix comes from Guile whose C interface uses _x where Scheme uses
!.)

NB: For those who fancy Scheme, hamts can be used to implement the
library (srfi 146 hash) of SRFI 146 ([1]). The destructive update
operations of this module can be used to implement the linear-update
procedures of (srfi 146 hash) efficiently.

--

[1] https://srfi.schemers.org/srfi-146/srfi-146.html

Am Sa., 10. Okt. 2020 um 23:24 Uhr schrieb Marc Nieper-Wißkirchen
:
>
> Am Sa., 10. Okt. 2020 um 20:19 Uhr schrieb Paul Eggert :
>
> > #if __STDC_VERSION__ < 201112 || defined __STD_NO_ATOMICS__
> >
> > which is a cleaner way of writing the negative of the above test. These days
> > there should be no reason to check whether __STDC_VERSION__ is defined,
> > generally it's clearer to use "<" instead of ">=" so that textual order 
> > reflects
> > numeric order, and the parens after "defined" are better omitted.
>
> Thanks, I did the edit in my local copy.
From db21b3e975ad7526e0db531fd4936354d8f031ae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marc=20Nieper-Wi=C3=9Fkirchen?= 
Date: Sat, 10 Oct 2020 23:38:21 +0200
Subject: [PATCH] hamt: New module.

This module provides (persistent) hash array mapped tries.
* MODULES.html.sh: Add hamt.
* lib/hamt.c: New file.
* lib/hamt.h: New file.
* modules/hamt: New file.
* modules/hamt-tests: New file.
* tests/test-hamt.c: New file.
---
 ChangeLog  |  11 +
 MODULES.html.sh|   1 +
 lib/hamt.c | 984 +
 lib/hamt.h | 182 +
 modules/hamt   |  29 ++
 modules/hamt-tests |  11 +
 tests/test-hamt.c  | 352 
 7 files changed, 1570 insertions(+)
 create mode 100644 lib/hamt.c
 create mode 100644 lib/hamt.h
 create mode 100644 modules/hamt
 create mode 100644 modules/hamt-tests
 create mode 100644 tests/test-hamt.c

diff --git a/ChangeLog b/ChangeLog
index 2aba2b0c7..b9e4f9970 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2020-10-10  Marc Nieper-Wißkirchen  
+
+	hamt: New module.
+	This module provides (persistent) hash array mapped tries.
+	* MODULES.html.sh: Add hamt.
+	* lib/hamt.c: New file.
+	* lib/hamt.h: New file.
+	* modules/hamt: New file.
+	* modules/hamt-tests: New file.
+	* tests/test-hamt.c: New file.
+
 2020-10-10  Bruno Haible  
 
 	*-list, *-oset, *-omap: Avoid possible compiler warnings.
diff --git a/MODULES.html.sh b/MODULES.html.sh
index 7e7cdae3e..2907eb741 100755
--- a/MODULES.html.sh
+++ b/MODULES.html.sh
@@ -2028,6 +2028,7 @@ func_all_modules ()
   func_module hash-pjw
   func_module hash-pjw-bare
   func_module hash
+  func_module hamt
   func_module readline
   func_module readtokens
   func_module readtokens0
diff --git a/lib/hamt.c b/lib/hamt.c
new file mode 100644
index 0..eb15e19c1
--- /dev/null
+++ b/lib/hamt.c
@@ -0,0 +1,984 @@
+/* (Persistent) hash array mapped tries.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */
+
+/* Written by Marc Nieper-Wißkirchen , 2020.  */
+
+#include 
+#include "hamt.h"
+
+#include 
+#include 
+#include 
+#include 
+#include "count-one-bits.h"
+#include "verify.h"
+#include "xalloc.h"
+
+/* Reference counters can be shared between different threads if the
+   entry they belong to is shared between different threads.
+   Operations on them therefore have to be atomic so that no invalid
+   state is observable.
+
+   A thread must not modify an entry or its children (!) if its
+   reference count implies that the entry is shared by at least two
+   hamts.  */
+typedef
+#if GL_HAMT_THREAD_SAFE
+_Atomic
+#endif
+size_t ref_counter;
+
+/* Hash values are of type size_t.  For each level of the trie, we use
+

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen

Am Sa., 10. Okt. 2020 um 19:41 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > Is there a special Gnulib way (or Autoconf macro) for the following test:
> >
> > #if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
> > !defined (__STD_NO_ATOMICS__)
> >
> > I am asking because there may be non-C11 compilers that nevertheless
> > understand _Atomic.
>
> And there may be C11 compilers which don't support _Atomic and nevertheless
> dont't define __STD_NO_ATOMICS__.
>
> So, indeed it would be useful to have an Autoconf macro that tests whether
> _Atomic can be used.

If someone is able to provide such a macro for Gnulib, I would replace
my check with this macro.

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen

Am Sa., 10. Okt. 2020 um 20:19 Uhr schrieb Paul Eggert :

> #if __STDC_VERSION__ < 201112 || defined __STD_NO_ATOMICS__
>
> which is a cleaner way of writing the negative of the above test. These days
> there should be no reason to check whether __STDC_VERSION__ is defined,
> generally it's clearer to use "<" instead of ">=" so that textual order 
> reflects
> numeric order, and the parens after "defined" are better omitted.

Thanks, I did the edit in my local copy.

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-11 Thread Marc Nieper-Wißkirchen

Am So., 11. Okt. 2020 um 03:28 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> Some comments:
>
> * GL_HAMT_THREAD_SAFE can be defined to 1 not only if
> __STDC_VERSION__ >= 201112 && ! defined __STD_NO_ATOMICS__
>   but also if
> __GNUC__ + (__GNUC_MINOR >= 9) > 4

Fixed.

>   (see the other mail).
>
> * Still '#ifdef GL_HAMT_THREAD_SAFE'.

Fixed.

> * Typo s/comparision/comparison/

Fixed.

> * Hamt_processor: The comment does not say what the return value means. Maybe 
> it
>   would be clearer if this type was moved down to the "Iteration" section.

Moved.

> * hamt_lookup: If the caller is allowed to modify the payload stored in the
>   returned entry, this function should return a 'Hamt_entry *', not a
>   'const Hamt_entry *'. Just like 'strchr' and 'strstr' return a 'char *',
>   not a 'const char *'.

Unless the caller knows what it does, modifying the payload is not a
good idea because the entry is shared between different hamts. If the
caller really wants to modify the payload, it has to do an explicit
type cast (which is safe).

> * In trienode_count, you don't need count_one_bits_l, only count_one_bits.
>   It's OK to assume that 'uint32_t' and 'unsigned int' are the same type.

Okay.

> * If subtrie_lookup was inlined in entry_lookup, you could get rid of the
>   forward declaration of entry_lookup, and compilers would be more likely
>   to turn the recursion of entry_lookup into a loop (tail-recursion
>   elimination).

I would prefer to have smaller functions. Are there any relevant
compilers who do not inline the function at higher recursion levels?

> * If subtrie_do_while was inlined in entry_do_while, you could get rid of the
>   forward declaration of entry_do_while.

> * It would be useful to be able to construct modules 'hamt-set' and 
> 'hamt-map',
>   analogous to 'hash-set' and 'hash-map', but the hamt API currently has two
>   issues that block it:
>
>   1) The iterator API is such that you cannot write the iteration using a 
> loop;
>  it involves a function call that is active during the entire iteration.
>  This is also problematic for two other reasons:
>
>- Creating a closure in C (unlike Scheme or Lisp) is tedious hand-work.
>  (The GNU C extension that allows local functions inside functions [1]
>  is not a solution, because
>- it is unportable,
>- it forces an executable stack on many platforms, which is
>  considered bad for security.)
>
>- In Common Lisp, iteration through a hash table is available not only
>  through a function-call interface (MAPHASH [2]) but also through an
>  iteration-loop interface (WITH-HASH-TABLE-ITERATOR [3], LOOP [4]).
>
>  For example, in GNU clisp, LOOP expands to
>
>  [1]> (macroexpand-1 '(loop for x being each hash-key of h do (foo 
> x)))
>  (MACROLET ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-ERROR)))
>   (BLOCK NIL
>(LET ((#:WHTI-3214 (SYSTEM::HASH-TABLE-ITERATOR H)))
> (MULTIPLE-VALUE-BIND (#:MORE?3215 #:HASH-KEY-3216 
> #:HASH-VALUE-3217)
>  (SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214)
>  (DECLARE (IGNORE #:HASH-VALUE-3217))
>  (LET ((X NIL))
>   (LET NIL
>(MACROLET ((LOOP-FINISH NIL '(GO SYSTEM::END-LOOP)))
> (TAGBODY SYSTEM::BEGIN-LOOP (UNLESS #:MORE?3215 (LOOP-FINISH))
>  (SETQ X #:HASH-KEY-3216) (PROGN (PROGN (FOO X)))
>  (MULTIPLE-VALUE-SETQ
>   (#:MORE?3215 #:HASH-KEY-3216 #:HASH-VALUE-3217)
>   (SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214))
>  (GO SYSTEM::BEGIN-LOOP) SYSTEM::END-LOOP
>  (MACROLET
>   ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-WARN)
> '(GO SYSTEM::END-LOOP ;
>
>  You can see that there is an initial function call 
> HASH-TABLE-ITERATOR
>  upfront, and then a function call to HASH-TABLE-ITERATE in each round
>  of the loop.
>
>  This principle makes it possible to iterate e.g. through a hash table
>  and a list in parallel.

In Scheme, it is even more relevant because you have call/cc so any
iteration can be suspended and restored any number of times.

I thought this could be solved by storing a "next" pointer in the
payload but this doesn't work with hamts sharing entries... So I will
add some iterator interface as well. Preferably one that can also be
used to implement Hamts in Scheme.

>   2) We have a dilemma regarding use of malloc vs. xmalloc. While modules
>  meant for use in programs can readily use xmalloc, modules meant for use
>  (also) in libraries cannot use xmalloc, since a library is not supposed
>  to terminate the program, even when memory is tight.

GMP more or less has this behavior.

>  The solution we found for this dilemma is that the *-set and *-map 
> modules
>  use just

Re: out-of-memory handling

2020-10-11 Thread Marc Nieper-Wißkirchen

Am So., 11. Okt. 2020 um 12:54 Uhr schrieb Bruno Haible :

> > The bigger problem is that it mustn't make the code slower for the
> > 99,9% of the use cases where there is either enough memory or
> > calling xalloc_die is the only reasonable action.
>
> OK. When we add hamt as a possible implementation for gl_set and gl_map, we
> could document that out-of-memory handling must be done
>   - either through xalloc_die(),
>   - or by defining xmalloc and xrealloc [this is what Emacs does],
> unlike for the other implementations.

I am not so convinced of whether it makes sense to create a
gl_set/gl_map frontend for this hamt implementation. It is optimized
for the persistence (otherwise, use ordinary hash tables), while the
gl_set/gl_map interface is for destructive updates.

Re: HAMT iterator

2020-10-11 Thread Marc Nieper-Wißkirchen

Am So., 11. Okt. 2020 um 13:02 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > I am going to implement the following iterator API as well:
> >
> > /* An alternative interface to iterating through the entry of a hamt
> >that does not make use of a higher-order function like
> >hamt_do_while uses the opaque Hamt_iterator type.  */
> > typedef struct hamt_iterator Hamt_iterator;
> >
> > /* Return a newly created iterator for HAMT.  */
> > extern Hamt_iterator *hamt_iterator_create (const Hamt *hamt);
>
> The pointer return here is overkill. A prototype
>
>   extern Hamt_iterator hamt_iterator_create (const Hamt *hamt);
>
> is sufficient, because the way such an iterator is used is:
>
>   Hamt_iterator iter = hamt_iterator_create (hamt);
>
>   for (...)
> {
>   ...
>   Hamt_entry *elt;
>   if (hamt_iterator_next (, ))
> {
>   ...
> }
>   ...
> }
>
>   hamt_iterator_free ();
>
> > /* Return an independent copy of ITER that is initially in the same
> >state.  */
> > extern Hamt_iterator *hamt_iterator_copy (Hamt_iterator *iter);
>
> Then a copy function is not needed, because the user's program can do
>
>   Hamt_iterator iter_clone = iter;

The hamt itself has to be copied (to increase the reference counter).

> > /* Return true if ITER is not at the end, place the current element in
> >*ELT_PTR and move the iterator forward.  Otherwise, return
> >*false.  */
> > extern bool hamt_iterator_next (Hamt_iterator *iter,
> > Hamt_entry *const *elt_ptr);
>
> The 'const' here forbids assigning to *ELT_PTR.

Yes. Wrong position of const.

Re: out-of-memory handling

2020-10-11 Thread Marc Nieper-Wißkirchen

Am So., 11. Okt. 2020 um 13:56 Uhr schrieb Bruno Haible :
>
> Marc Nieper-Wißkirchen wrote:
> > I am not so convinced of whether it makes sense to create a
> > gl_set/gl_map frontend for this hamt implementation.
>
> Wikipedia [1] lists some advantages of HAMTs even without the persistence.
>
> > It is optimized
> > for the persistence (otherwise, use ordinary hash tables), while the
> > gl_set/gl_map interface is for destructive updates.
>
> How would a HAMT implementation look like that does not support persistence,
> but is instead optimized for destructive updates? Probably the reference
> counters would go away. Anything else?

The reference counter would go away as would all tests for "shared" in the code.

Without a reference counter, an element can be any "void *" (instead
of a pointer to a struct starting with Hamt_entry), which is a
substantial change.

We can add a number of #if's to the code so that it either compiles to
an implementation of persistent or of transient hamts, but I wouldn't
rather do it right now because it would make the code harder to read.
But when the hamt code has become stable (and has been put to use and,
therefore, to real-world testing), this sounds like a feasible option.

Re: HAMT iterator

2020-10-11 Thread Marc Nieper-Wißkirchen

Am So., 11. Okt. 2020 um 14:14 Uhr schrieb Bruno Haible :

[...]

> > > > /* Return an independent copy of ITER that is initially in the same
> > > >state.  */
> > > > extern Hamt_iterator *hamt_iterator_copy (Hamt_iterator *iter);
> > >
> > > Then a copy function is not needed, because the user's program can do
> > >
> > >   Hamt_iterator iter_clone = iter;
> >
> > The hamt itself has to be copied (to increase the reference counter).
>
> Whether copying an iterator can be done by assignment or requires a function
> call, is of second importance.
>
> The more important point I wanted to make: Does allocating an iterator
> require a (heap) memory allocation?
>
> If you iterate with do_while, no memory allocation is needed, because all
> information is stored on the stack, between the various function calls.
>
> If you iterate with hamt_iterator_create, and the Hamt_iterator type
> has bounded size (I guess it will not require more than 15 pointers and
> 13 indices), it can be stored on the caller's stack.

That's a very good point you make. The hamt iterator has, of course,
(low) bounded size, so it can (and should) be allocated on the stack.

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-11 Thread Marc Nieper-Wißkirchen

I am going to implement the following iterator API as well:

/* An alternative interface to iterating through the entry of a hamt
   that does not make use of a higher-order function like
   hamt_do_while uses the opaque Hamt_iterator type.  */
typedef struct hamt_iterator Hamt_iterator;

/* Return a newly created iterator for HAMT.  */
extern Hamt_iterator *hamt_iterator_create (const Hamt *hamt);

/* Free the iterator ITER created through hamt_iterator_create or
   hamt_iterator_copy.  */
extern void hamt_iterator_free (Hamt_iterator *iter);

/* Return an independent copy of ITER that is initially in the same
   state.  */
extern Hamt_iterator *hamt_iterator_copy (Hamt_iterator *iter);

/* Return true if ITER is not at the end, place the current element in
   *ELT_PTR and move the iterator forward.  Otherwise, return
   *false.  */
extern bool hamt_iterator_next (Hamt_iterator *iter,
Hamt_entry *const *elt_ptr);

Am So., 11. Okt. 2020 um 10:20 Uhr schrieb Marc Nieper-Wißkirchen
:

> Am So., 11. Okt. 2020 um 03:28 Uhr schrieb Bruno Haible :

> >- Creating a closure in C (unlike Scheme or Lisp) is tedious 
> > hand-work.
> >  (The GNU C extension that allows local functions inside functions 
> > [1]
> >  is not a solution, because
> >- it is unportable,
> >- it forces an executable stack on many platforms, which is
> >  considered bad for security.)

PS Conceptually, hamt_do_while takes a closure, which consists of the
function pointer PROC and the closure environment DATA.

If this interface is sufficient for an application, it is more
lightweight than the alternative interface above because all
intermediate state is stored on the stack.

Re: HAMT iterator

2020-10-11 Thread Marc Nieper-Wißkirchen

Am So., 11. Okt. 2020 um 14:04 Uhr schrieb Bruno Haible :
>
> Marc Nieper-Wißkirchen wrote:
> > > > /* Return an independent copy of ITER that is initially in the same
> > > >state.  */
> > > > extern Hamt_iterator *hamt_iterator_copy (Hamt_iterator *iter);
> > >
> > > Then a copy function is not needed, because the user's program can do
> > >
> > >   Hamt_iterator iter_clone = iter;
> >
> > The hamt itself has to be copied (to increase the reference counter).
>
> Then the comment should clarify what "independent" means. I thought it
> means that both iterators (the original one and the copy) can be used
> simultaneously, as long as the HAMT does not change. Do you mean
> something broader?
>   - If someone creates a derivative of the HAMT, the iterator won't
> be affected, right? ("persistence")

Yes.

>   - If someone makes destructive modifications to the HAMT (through the
> *_x functions), the iterator will be affected if it has not yet
> passed the point of modification, right?

No, the iterator shouldn't be affected. (The point of modification is
not well-defined without exposing implementation details.)

> So, what is the scenario where increasing the reference count will make
> a difference?

If you have a language with GC like Lisp or Scheme, say, the hamt may
be GC'd while the iterator is still live.

Re: HAMT iterators

2020-10-11 Thread Marc Nieper-Wißkirchen

Am So., 11. Okt. 2020 um 12:29 Uhr schrieb Bruno Haible :

> The recursion across hamt_do_while / entry_do_while / subtrie_do_while /
> bucket_do_while builds up a stack whose essential contents is a set of
> indices, each at most 5 bits large, with a total of SIZE_WIDTH bits.
> In other words, the iterator object can be a

Yes, this is roughly how the implementation I worked on this morning
works. However, I haven't yet implemented the 5-bit compression.

> struct
>   {
> const Hamt *hamt;
> size_t position;
> int depth;
> const Hamt_entry *entry;
>   };
>
> where during the iteration
>   - hamt stays unmodified,
>   - position is incremented by some amount at each round,
>   - depth is the recursion depth.
>   - entry is a cache of hamt->nodes[(position >> 59) & 31][(position >> 54) & 
> 31]...
> with depth-1 indexing operations, and is changed at each round,
>
> HASH-TABLE-ITERATOR sets position, depth, entry to point to the first entry.
> HASH-TABLE-ITERATE works as follows:
>   - in a bucket, it increments position and fetches the next entry of the 
> bucket,

For a bucket, in the worst case, we need the full size_t range for
position, so we have to store the path in the tree that leads to the
bucket in another size_t variable.

>   - if the bucket is done, it decreases depth and goes to
> hamt->nodes[(position >> 59) & 31][(position >> 54) & 31]... (with depth-1
> indexing operations) until it finds a subtrie that is not yet done,

This means a lot of pointer operations if large subtries are processed
that are deep in the trie. The hamt_do_while operation stores that
chain of entries from the root implicitly on the stack. The same
should be done for the iterator, meaning that an array of MAX_DEPTH
entries has to be cached. On a 32 bit system, this means 28 bytes,
and, on a 64 bit system, 130 bytes.

> then it increases depth, entry to point to the first entry in that 
> subtrie.
>
> If done this way, the iterator will fit into the gl_set_iterator_t and
> gl_map_iterator_t that we already have in gl_set.h and gl_map.h.
>
> Bruno

Re: Draft #3 (with iterators)

2020-10-11 Thread Marc Nieper-Wißkirchen

I have implemented everything we have discussed today. The patch
versus the current master is attached so it can be reviewed.

The changes versus the previous draft can be summarized as follows:

* Bug fixes.
* Use _Atomic on GCC 4.9+.
* Implement a lightweight iterator type akin to the iterators of gl_list.
* Expose element initialization to the user so than an element can be
inserted in more than one hamt.
* Rename delete to remove.
* Improve documentation.

Future options for when the code has matured:

* Inline a number of subtrie procedures to get rid of forward
declarations to help compilers.
* Implement purely non-pure hamts to replace ordinary hash tables.
* Add _nx versions of the procedures that won't call xalloc_die.

Many thanks to Bruno for his support, guidance and his great suggestions.

Marc

Am So., 11. Okt. 2020 um 19:32 Uhr schrieb Marc Nieper-Wißkirchen
:
>
> Am So., 11. Okt. 2020 um 10:20 Uhr schrieb Marc Nieper-Wißkirchen
> :
> >
> > Am So., 11. Okt. 2020 um 03:28 Uhr schrieb Bruno Haible :
>
> [...]
>
> > > * hamt_lookup: If the caller is allowed to modify the payload stored in 
> > > the
> > >   returned entry, this function should return a 'Hamt_entry *', not a
> > >   'const Hamt_entry *'. Just like 'strchr' and 'strstr' return a 'char *',
> > >   not a 'const char *'.
> >
> > Unless the caller knows what it does, modifying the payload is not a
> > good idea because the entry is shared between different hamts. If the
> > caller really wants to modify the payload, it has to do an explicit
> > type cast (which is safe).
>
> I have noticed a problem with the current design: While an element can
> be in more than one hamt (because copies of hamts are created through
> various operations), an element cannot be actively inserted in more
> than one hamt. The reason is that the reference counter of the element
> is initialized whenever the element is inserted.
>
> The way out is to expose the initialization function to the user, who
> becomes responsible for initializing each element exactly once.
>
> As soon as it is possible to insert an element more than once, another
> observation will be made: The insert procedure does not accept a
> pointer to a const element because it must be able to change the
> reference counter internally. Thus it is more convenient if procedures
> like hamt_lookup do not return const versions.
From ece7b9e3cd090e9c084efd72677669130e80dd9c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marc=20Nieper-Wi=C3=9Fkirchen?= 
Date: Sat, 10 Oct 2020 23:38:21 +0200
Subject: [PATCH] hamt: New module.

This module provides (persistent) hash array mapped tries.
* MODULES.html.sh: Add hamt.
* lib/hamt.c: New file.
* lib/hamt.h: New file.
* modules/hamt: New file.
* modules/hamt-tests: New file.
* tests/test-hamt.c: New file.
---
 ChangeLog  |   11 +
 MODULES.html.sh|1 +
 lib/hamt.c | 1083 
 lib/hamt.h |  253 +++
 modules/hamt   |   29 ++
 modules/hamt-tests |   11 +
 tests/test-hamt.c  |  378 
 7 files changed, 1766 insertions(+)
 create mode 100644 lib/hamt.c
 create mode 100644 lib/hamt.h
 create mode 100644 modules/hamt
 create mode 100644 modules/hamt-tests
 create mode 100644 tests/test-hamt.c

diff --git a/ChangeLog b/ChangeLog
index 22f79fb09..7263e628f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2020-10-11  Marc Nieper-Wißkirchen  
+
+	hamt: New module.
+	This module provides (persistent) hash array mapped tries.
+	* MODULES.html.sh: Add hamt.
+	* lib/hamt.c: New file.
+	* lib/hamt.h: New file.
+	* modules/hamt: New file.
+	* modules/hamt-tests: New file.
+	* tests/test-hamt.c: New file.
+
 2020-10-11  Bruno Haible  
 
 	stdioext: Update comments regarding UnixWare.
diff --git a/MODULES.html.sh b/MODULES.html.sh
index 7e7cdae3e..2907eb741 100755
--- a/MODULES.html.sh
+++ b/MODULES.html.sh
@@ -2028,6 +2028,7 @@ func_all_modules ()
   func_module hash-pjw
   func_module hash-pjw-bare
   func_module hash
+  func_module hamt
   func_module readline
   func_module readtokens
   func_module readtokens0
diff --git a/lib/hamt.c b/lib/hamt.c
new file mode 100644
index 0..f92d9c4e8
--- /dev/null
+++ b/lib/hamt.c
@@ -0,0 +1,1083 @@
+/* (Persistent) hash array mapped tries.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License fo

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-11 Thread Marc Nieper-Wißkirchen

Am So., 11. Okt. 2020 um 10:20 Uhr schrieb Marc Nieper-Wißkirchen
:
>
> Am So., 11. Okt. 2020 um 03:28 Uhr schrieb Bruno Haible :

[...]

> > * hamt_lookup: If the caller is allowed to modify the payload stored in the
> >   returned entry, this function should return a 'Hamt_entry *', not a
> >   'const Hamt_entry *'. Just like 'strchr' and 'strstr' return a 'char *',
> >   not a 'const char *'.
>
> Unless the caller knows what it does, modifying the payload is not a
> good idea because the entry is shared between different hamts. If the
> caller really wants to modify the payload, it has to do an explicit
> type cast (which is safe).

I have noticed a problem with the current design: While an element can
be in more than one hamt (because copies of hamts are created through
various operations), an element cannot be actively inserted in more
than one hamt. The reason is that the reference counter of the element
is initialized whenever the element is inserted.

The way out is to expose the initialization function to the user, who
becomes responsible for initializing each element exactly once.

As soon as it is possible to insert an element more than once, another
observation will be made: The insert procedure does not accept a
pointer to a const element because it must be able to change the
reference counter internally. Thus it is more convenient if procedures
like hamt_lookup do not return const versions.

Re: terminology

2020-10-11 Thread Marc Nieper-Wißkirchen

Am So., 11. Okt. 2020 um 16:14 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > I have attached an improved version of the HAMT module to this email.
>
> How about terminology: "delete" vs. "remove"?

> In this sense, 'hamt_delete' is triggering the wrong associations.
> How about renaming 'hamt_remove'?
>
> Deleting an entry from a hash table or HAMT does not always mean to
> delete the object that the entry references.
>
> The Java collections [1], C# collections [2], Python collections [3]
> all use the verb "remove".

I'm fine with this change and agree with your reasoning. I used the
hash module (see your comment below) as a guide for the interface,
that's why I called the procedure hamt_delete in the first place.

> Yes, we still have hash_delete (in module 'hash') and 'argz_delete' (in
> module 'argz'); these are very old modules.

Marc

Re: Non-opaque hamt type?

2020-10-18 Thread Marc Nieper-Wißkirchen

Am So., 18. Okt. 2020 um 16:39 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > At the moment, the header file exposes an opaque struct Hamt and
> > communication with the code happens through (stack-allocated) pointers
> > to a Hamt. In the implementation, however, each Hamt just consists of
> > two pointers (a pointer to a function table and a pointer to the
> > root), so stack-allocating Hamts instead and passing them around by
> > values makes as much sense.
> >
> > This would have the advantage of reducing heap allocation.
>
> By how much does it reduce the heap allocation? Say, someone allocates
> a HAMT and adds 100 entries. There will be heap allocations of ca. 5-10
> buckets and internal nodes. Adding one more heap allocation for the
> root object is not a tragedy.

A struct with just two pointers is allocated on the heap, meaning it
is probably below the minimum malloc size on most implementations. In
architectures where structs of two pointers are passed by value in and
out of functions, a heap-allocated structure will mean one more
pointer indirection per hamt access.

> So, in the end the question is whether to optimize the case of small HAMTs.
>
> > The disadvantage would be that adding further fields to a Hamt in future
> > extensions might be more problematic
>
> True. But when this happens we can mitigate this through a warning in the
> NEWS file.
>
> > and that identity of Hamts cannot
> > be decided through pointer identity anymore (so the protocol how to
> > decide whether an element has been inserted or not has to be changed).
>
> Does this protocol change introduce a significant slowdown?

The existing protocol is as follows:

Hamt_entry *e = hamt_entry (...);
Hamt_entry *p = e;
Hamt *new_hamt = hamt_insert (old_hamt, );
if (old_hamt == new_hamt)
  {
/* The element hasn't been insert as an equivalent element has
already been in the hamt. p now holds a reference to the entry that
already existed in the hamt.
element_free (e);
...
hamt_free (old_hamt); /* We don't have to free new_hamt because no
new hamt was created. */
  }
else
  {
/* The element has been inserted. p hasn't changed. */
...
hamt_free (old_hamt);  /* This frees all hamts */
hamt_free (new_hamt); /* and all elements inserted, including e. */
  }

A protocol where no pointer values need to be compared could use p to
carry the information:

Hamt_entry *e = hamt_entry ();
Hamt_entry *p = e;
Hamt new_hamt = hamt_insert (old_hamt, );
if (p == e)
  {
/* The element has been inserted. */
... /* See above. */
  }
else if (p == NULL)
  {
/* The element e already existed in the hamt. */
   ... /* See above. */
  }
else /* p != e && p != NULL */
  {
/* An element equivalent to e already existed in the hamt. p now
holds this element. */
... /* See above. */
  }

Marc

Re: hard-locale.c: SETLOCALE_NULL_MAX

2020-10-01 Thread Marc Nieper-Wißkirchen

Hi Bruno,

thank you for your reply. I could finally track down the error. It was
in my build configuration. Somehow, the wrong "-I ..." flags were set
up during the compilation of the Gnulib modules. It's working now. So,
everything is alright with Gnulib and I can retract my bug report.
Please excuse me for bothering you. Initially, when I found the bug on
Gentoo's bug tracker, I didn't think of such a simple cause.

As for your questions, it was probably only (6) that caused the problem.

Marc

Am Mi., 30. Sept. 2020 um 18:38 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > The compiler throws the following error at me:
> >
> > lib/hard-locale.c: In function 'hard_locale':
> > lib/hard-locale.c:29:15: error: 'SETLOCALE_NULL_MAX' undeclared (first
> > use in this function); did you mean 'SETLOCALE_NULL_ALL_MTSAFE'?
> >29 |   char locale[SETLOCALE_NULL_MAX];
> >   |   ^~
>
> Does the error occur always, or only with "make -j"?
>
> Are the following statements true in your build?
>   (1) I use the module 'hard-locale'.
>   (2) I use the module 'setlocale-null'.
>   (3) config.status defines GNULIB_SETLOCALE_NULL to 1.
>   (4) The file lib/setlocale_null.h exists.
>   (5) The file lib/locale.h includes setlocale_null.h.
>   (6) The -I options passed to GCC make sure that setlocale_null.h gets found.
>
> Bruno
>

Re: hard-locale.c: SETLOCALE_NULL_MAX

2020-09-30 Thread Marc Nieper-Wißkirchen

PS It may also be due to a problem in my build system (I am using
non-recursive make) so that the system-wide locale.h is included and
not the Gnulib version...

Am Mi., 30. Sept. 2020 um 16:29 Uhr schrieb Marc Nieper-Wißkirchen
:
>
> It seems that I have stumbled across the same bug as reported here:
>
> https://bugs.gentoo.org/709732
>
> The compiler throws the following error at me:
>
> lib/hard-locale.c: In function 'hard_locale':
> lib/hard-locale.c:29:15: error: 'SETLOCALE_NULL_MAX' undeclared (first
> use in this function); did you mean 'SETLOCALE_NULL_ALL_MTSAFE'?
>29 |   char locale[SETLOCALE_NULL_MAX];
>   |   ^~
>   |   SETLOCALE_NULL_ALL_MTSAFE
> lib/hard-locale.c:29:15: note: each undeclared identifier is reported
> only once for each function it appears in
> lib/hard-locale.c:31:7: warning: implicit declaration of function
> 'setlocale_null_r' [-Wimplicit-function-declaration]
>31 |   if (setlocale_null_r (category, locale, sizeof (locale)))
>   |   ^~~~
>
> Is there a quick fix possible?
>
> Thanks,
>
> Marc

Re: recursive algorithms and stack overflow

2020-09-30 Thread Marc Nieper-Wißkirchen

Am Mi., 26. Aug. 2020 um 12:13 Uhr schrieb Bruno Haible :

> [CCing bug-gnulib and Marc Nieper-Wißkirchen.]
>
> Paul Eggert wrote in <https://bugs.gnu.org/42931>:
> > Bug#42931 prompted me to change the way
> > that the Gnulib diffseq module recurses so that the stack size is O(log N)
> > rather than O(N). I installed this change into Gnulib, here:
> >
> > https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=7aadb23803a8fb71d07e6e87ffb1ca510d86f8ef
>
> Similarly, there was a bug report against libcroco [1] recently, because
> libcroco's CSS matching is recursive and can thus trigger stack overflow.
>
> It seems this is something to watch out, when we have recursive algorithms
> (which is quite frequent). Data sizes of 100 MB (or even just 100 KB) are
> considered "small" by today's computation means; if they can trigger stack
> overflow, users are unhappy.

I agree with all the points made. (And that's why a language with
proper tail calls like Scheme is so convenient.)

> Possible mitigations are:
>   - Use iteration instead of recursion when possible - like here in diffseq.h,
>   - Use iteration with a simulated heap-allocated stack - it would be useful
> to have the stack module in Gnulib for this (Marc?),

I am sorry for my silence. I haven't forgotten about this project. I
will submit a patch hopefully soon.

Marc

hard-locale.c: SETLOCALE_NULL_MAX

2020-09-30 Thread Marc Nieper-Wißkirchen

It seems that I have stumbled across the same bug as reported here:

https://bugs.gentoo.org/709732

The compiler throws the following error at me:

lib/hard-locale.c: In function 'hard_locale':
lib/hard-locale.c:29:15: error: 'SETLOCALE_NULL_MAX' undeclared (first
use in this function); did you mean 'SETLOCALE_NULL_ALL_MTSAFE'?
   29 |   char locale[SETLOCALE_NULL_MAX];
  |   ^~
  |   SETLOCALE_NULL_ALL_MTSAFE
lib/hard-locale.c:29:15: note: each undeclared identifier is reported
only once for each function it appears in
lib/hard-locale.c:31:7: warning: implicit declaration of function
'setlocale_null_r' [-Wimplicit-function-declaration]
   31 |   if (setlocale_null_r (category, locale, sizeof (locale)))
  |   ^~~~

Is there a quick fix possible?

Thanks,

Marc

Re: Module suggestion: Atomic operations

2020-05-25 Thread Marc Nieper-Wißkirchen

Hi Bruno,

Am So., 24. Mai 2020 um 22:54 Uhr schrieb Bruno Haible :

> Yes, given that the platform support for these atomic types/operations is
> increasing, it would accelerate the adoption if there was a Gnulib module,
> like you describe it. Program developers could then adopt 
> without losing portability to a number of platforms.

So, you mean that Gnulib should try to provide  in case
it is not available? I have to think about it whether this can ever
work (and in an efficient way).

When I raised the question, I was more thinking of a custom interface
that abstracts both over  and some other implementation
with explicit mutexes, much like the tls module abstracts over C11's
TLS and some other implementations, like the pthread one.

One problem with  as given is that atomic values have no
destructors. Thus, we cannot simply attach locks to them in a pre-C11
implementation because there is no place to destroy the locks. Thus,
we would have to work with a pool of locks shared by all atomic
variables. But when to set up the pool?

A module atomic with a header "atomic.h" could implement an interface
that has destructors (which are mapped to no-ops when  is
available).

> Personally I'm not very motivated to work on that (because in most algorithms
> I've seen, mutexes/locks are the way to go, and because I find the 
> memory_order
> stuff hard to understand). But if you want to work on that, it will be
> welcome!

I should wait for the copyright assignment first. As for the
memory_order things. A first version of the module may map everything
to memory_order_seq_cst, which is safe but not always the most
efficient.

Marc

Re: Module suggestion: Atomic operations

2020-05-25 Thread Marc Nieper-Wißkirchen

Hi Bruno,

Am Mo., 25. Mai 2020 um 09:24 Uhr schrieb Bruno Haible :

> Pools have the drawback that they add a configuration requirement on the
> application: What is the default size of the pool? When to increase the
> pool size? By how much? When to shrink the pool? The developer would have
> determine good answers to this, but 5 years later or under specific
> circumstances the answers may not be good enough. (*)

For general types, the gcc (and clang) implementations use general
locks coming from a pool. (In the gcc case, see libatomic and,
especially, libat_lock_n.) Nevertheless, I agree with you that a pool
is suboptimal.

> Yes, that's the better way to approach it, then.

There is one problem with providing some emulation with Posix locks,
though. At least one type in  is guaranteed to be
lock-free, the atomic flag. Since C18, it can be used in signal
handlers, where Posix locks won't work because they are not
async-safe. Moreover,  provides memory fences, which I
don't know how to emulate in general and which also seem to be crucial
in signal handlers of multithreaded applications.

It seems to me that a substitute of  needs to know a bit
about the underlying compiler (so that builtins can be used) or the
underlying architecture (for example, x86 does not need memory fences
with the release or acquire memory order. Unfortunately, my knowledge
about other compilers than gcc or other architectures than x86 is
limited. I could provide the skeleton of a substitute, but it would
need other people to add their architectures.

Marc

Re: copyright in Germany

2020-05-20 Thread Marc Nieper-Wißkirchen

Am Mi., 20. Mai 2020 um 00:15 Uhr schrieb Bruno Haible :

> Marc Nieper-Wißkirchen wrote:

> If the code you write is not related to your job, that is, if you are not
> being paid to write it, then the "Nutzungsrechte" belong to you - regardless
> whether your employment contract says otherwise. This is guaranteed by
> Art. 2.(1) Grundgesetz. (*)
>
> Often it is clear whether something is related to your job. For example,
> if you have a manager that tells you what to develop. (*)
>
> If your employer is a university [1], things are less clear. In the past,
> some universities have been strict and claimed that the works of their
> professors and researchers belong to them. This has changed somewhat around
> 2000-2005: The slogan "public money - public code" has convinced at least
> some universities to allow the code that came out of research projects to
> be released under Open Source licenses. In any case, this topic is subject
> to negotiation between the professors/researchers and the university.

As far as I know (but I am, luckily, not a lawyer either), it is the
common legal opinion that Art. 5.(3) Grundgesetz implies that the
"Nutzungsrechte" are not being implicitly assigned from a professor to
the university ([2]). So the university cannot claim that some of my
works are theirs. In any case, I even have a letter from the
university saying basically this. Unfortunately, the FSF lawyers don't
accept the German text (I tried to contribute to GCC once) because
they don't have the manpower to verify it; now I have to persuade the
university lawyers to sign the FSF provided text.

> [1] 
> https://www.uni-augsburg.de/de/fakultaet/mntf/math/prof/alg/arbeitsgruppe/nieper-wisskirchen/

Correct.

Thanks,

Marc

--

[2] https://www.bmbf.de/upload_filestore/pub/Handreichung_UrhWissG.pdf

Re: Easy Accurate Reading and Writing of Floating-Point Numbers

2020-05-20 Thread Marc Nieper-Wißkirchen

Am Mi., 20. Mai 2020 um 00:37 Uhr schrieb Bruno Haible :

> Marc Nieper-Wißkirchen wrote:
> > for output, the shortest rounded
> > representation that still reads back accurately has to be selected.
> > ...
> > A simple algorithm is given by Aubrey Jaffer in [1].
> >
> > [1] http://people.csail.mit.edu/jaffer/III/EZFPRW

This was just one, which I liked because of its simplicity. (And
because it is written in my favorite programming language.)

[...]

> There are several other algorithms for this purpose.

There is also the new Ryu algorithm by Ulf Adams, see [2].
Interestingly, he has found some problems in Jaffer's algorithm, see
the paper [3].

[...]

> Then there's also the algorithm, by Michael Stoll and me (1990),
> in GNU clisp and CLN (based on multiprecision arithmetic, not hardware 
> floats).

I didn't know this; I will check this out.

The initial version of "c_dtoastr" will, however just work as "dtoastr" does.

Marc

--

[2] https://github.com/ulfjack/ryu

[3] https://dl.acm.org/doi/10.1145/3296979.3192369

Re: stack module

2020-06-02 Thread Marc Nieper-Wißkirchen

So here is the updated stack module. The name "CLEAR" has been changed
to "DESTROY" as suggested by Bruno. Error checking has been included.
The macro interface remains. Although a macro interface means macros,
the macros are trivial and the type safety wins here, I think. I have
renamed STACK_BASE to STACK_CURRENT_BASE because the base may be
invalidated when items are pushed onto the stack and reallocation has
to happen.

As I haven't heard anything from the FSF yet, I wouldn't mind if you
considered the following in the public domain...

#ifndef _GL_STACK_H
#define _GL_STACK_H

#include 
#include 

#include "assure.h"
#include "xalloc.h"

#define STACK(type)\
  struct {\
type *base;\
size_t size;\
size_t allocated;\
  }

#define STACK_INIT(stack)\
  do\
{\
  (stack).base = NULL;\
  (stack).size = 0;\
  (stack).allocated = 0;\
}\
  while (0)

#define STACK_DESTROY(stack)\
  free ((stack).base)

#define STACK_EMPTY(stack)\
  ((stack).size == 0)

#define STACK_CURRENT_BASE(stack)\
  ((stack).base)

#define STACK_PUSH(stack, item)\
  do\
{\
  if ((stack).size == (stack).allocated)\
(stack).base = x2nrealloc ((stack).base, &(stack).allocated,
sizeof (item)); \
  (stack).base [(stack).size++] = item;\
}\
  while (0)

#define STACK_POP(stack)\
  (affirm (!STACK_EMPTY (stack)), (stack).base [--(stack).size])

#define STACK_DISCARD(stack)\
  (affirm (!STACK_EMPTY (stack)), (--(stack).size))

#define STACK_TOP(stack)\
  (affirm (!STACK_EMPTY (stack)), (stack).base[(stack).size - 1)

#define STACK_SIZE(stack)\
  ((stack).size)

#endif /* _GL_STACK_H */

Am So., 24. Mai 2020 um 21:26 Uhr schrieb Bruno Haible :
>
> Paul Eggert wrote:
> > I don't want to encourage programmers to supply an E with side effects, as 
> > side
> > effects are trouble here.
>
> +1
>

Re: Bundling Gnulib tests with an application

2020-06-02 Thread Marc Nieper-Wißkirchen

Am So., 31. Mai 2020 um 21:51 Uhr schrieb Bruno Haible :

> I'm not aware of a documented way to change this.
>
> Nevertheless, it is probably easy to do: When you look at the generated
> Makefile.in and search for the string 'summary for', you will see that
> it gets the text that follows it from a Makefile variable. This Makefile
> variable gets its default value from the configure file. But if you
> set a value in Makefile.am, it will override the value from the configure
> file - locally in that Makefile only.

That's a great tip, thank you. And, maybe, it is a good idea to
automatically wire this into the test suite generated by Gnulib.

Re: Module with preprocessor utilities

2020-07-20 Thread Marc Nieper-Wißkirchen

Thank you for your input. So I won't come up with a full-featured
(meta-programming) macro module but may propose one or the other small
module about macros that I have found useful (and fool-proof) in my
own code.

The amount of macro one needs depends on the programming style, of
course. For example, do we want to encourage the writing of
sophisticated macros like the TRACE macros in chapter 16 of (the IMHO
opinionated) book Modern C ([1]). In that case, we don't want to write
STRGY or ALEN everytime.

Marc

--

[1] https://gforge.inria.fr/frs/download.php/latestfile/5298/ModernC.pdf

Am Mo., 20. Juli 2020 um 20:14 Uhr schrieb Paul Eggert :
>
> I'm a bit leery of this idea, as it is painful to write and debug reliable
> preprocessor macros. When you need macros you really need them, but they're 
> best
> avoided in the first place.
>
> The fanciest macros I've contributed to Gnulib are in ; they had 
> to
> be macros because C won't let you write polymorphic functions. A quick look at
> the P99 package suggests it wouldn't have helped me write or debug the
> intprops.h macros, and that it's intended more for meta-macro programming,
> something that gets pretty hairy pretty quickly and something I'm not sure I'd
> like to encourage. (Reading the warning for P99_IS_EMPTY gives me the willies,
> for example.)
>
> That being said, if there's a need for a particular macro for something
> Gnulibish then of course we should be all ears.

Re: Module with preprocessor utilities

2020-07-20 Thread Marc Nieper-Wißkirchen

> If you know of shortcomings of _GL_CONCAT, we need to determine whether we
> should just document a limitation, or spend the necessary macro complexity
> in order to fix it.

Sorry for the misunderstanding; _GL_CONCAT is fine because it takes a
detour over _GL_CONCAT0. But not every macro-writing user who wants to
reliably concatenate tokens may know this. That's why I said that it
may make sense to present even simply macros like GL_CONCAT to the
user. It may make user code more readable if they are used instead of
ad-hoc named macros.

Module with preprocessor utilities

2020-07-19 Thread Marc Nieper-Wißkirchen

Besides modules to improve system compatibility, Gnulib contains a
number of modules that provide "missing" functionality of the C
standard library (like the gl_list module).

What would be the chances to include a module with sophisticated
preprocessor macros like P99 ([1]) or the Boost Preprocessing library
([2])?

It would be a header-only module and its functionality could grow over
time. To allow for the latter in a backward compatible way, it would
make sense to prefix all exported macros with GL_. The internal macros
could be prefixed with _GL_. While this is strictly a violation of
what identifiers are allowed in ISO C, Gnulib already does it this
way.

Such a module would have zero footprint in the library and could be
used by other modules that  currently make use of ad-hoc definitions,
for example the verify module, which defines _GL_CONCAT and
_GL_COUNTER ([3]).

Marc

--

[1] https://gustedt.gitlabpages.inria.fr/p99/p99-html/

[2] https://www.boost.org/doc/libs/1_73_0/libs/preprocessor/doc/index.html

[3] http://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/verify.h#n159

Re: c-ldtoastr test failure

2020-08-10 Thread Marc Nieper-Wißkirchen

Hi Bruno,

thanks for catching that. The proposed change is perfect. Actually, I
had made that fix in my local code as well so I am wondering why it
didn't make itself into the patch I sent you.

Best,

Marc

Am Mo., 10. Aug. 2020 um 08:55 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> On a Linux/x86_64 system, the c-ldtoastr uni test fails.
>
> How to reproduce:
> $ ./gnulib-tool --test --single-configure c-ldtoastr
>
> The test fails like this:
> ../../gltests/test-c-ldtoastr.c:54: assertion '!strcmp (buf, "0.1")' failed
>
> In the debugger, I see that where the code expects a result "0.1",
> the actual result is "0.1555".
>
> This rounding error is not caused by the library code for binary to decimal
> conversion, because you can see that the number in its full glory in the
> debugger:
>
> (gdb) step
> c_ldtoastr (buf=buf@entry=0x7fffd670 "1,", bufsize=bufsize@entry=40, 
> flags=flags@entry=0, width=width@entry=0,
> x=0.155511151231257827) at ../../gllib/ftoastr.c:113
>
>
> Here's a suggested fix. OK to push?
>
>
> 2020-08-09  Bruno Haible  
>
> c-ldtoastr tests: Fix test failure.
> * tests/test-c-ldtoastr.c (main): Support platforms where 'long 
> double'
> is longer than 'double'.
>
> diff --git a/tests/test-c-ldtoastr.c b/tests/test-c-ldtoastr.c
> index 140f0c6..7e38422 100644
> --- a/tests/test-c-ldtoastr.c
> +++ b/tests/test-c-ldtoastr.c
> @@ -50,7 +50,7 @@ main (int argc, char *argv[])
>{
>  char buf[DBL_BUFSIZE_BOUND];
>
> -c_ldtoastr (buf, sizeof buf, 0, 0, 0.1);
> +c_ldtoastr (buf, sizeof buf, 0, 0, 0.1L);
>  ASSERT (!strcmp (buf, "0.1"));
>}
>
>

Callbacks in the abstact data types and extra contextual data

2020-07-09 Thread Marc Nieper-Wißkirchen

Hi,

a number of modules (like the hash module or the list module) allow
the user to specify callbacks (e.g. a comparison function).
Unfortunately, these procedures do not take a context parameter, which
can be a problem because C lacks closures.

The original qsort function in stdlib.h has the same problem. Glibs
has remedied the problem by introducing qsort_r.

It would be nice if the modules hash, list, xlist, oset, xoset, ...
are extended in a similar way. We would essentially need new
constructors, say with the '_r' suffix.

Marc

Re: Callbacks in the abstact data types and extra contextual data

2020-07-10 Thread Marc Nieper-Wißkirchen

Hi Bruno,

Am Fr., 10. Juli 2020 um 20:38 Uhr schrieb Bruno Haible :

> OK. Then let's take the problem seriously.

If your solution can be implemented portable, that will be the best
solution by far.

For GCC one can use nested functions, but how can
'partial_function_last' be implemented in ISO C?

> I think it's time to solve the problem once and for all. I propose to add
> a module that defines a function 'partial_function_last' such that, when
> you have a function pointer
>
>int (*cmp3) (void *arg1, void *arg2, void *context)
>
> then
>
>cmp2 = partial_function_last (cmp3, context);

partial_function_last has to return the address of an existing
function. But this function has no context accept for global (thread
local) variables. How can this function know about 'cmp3'?

Marc

Re: Callbacks in the abstact data types and extra contextual data

2020-07-09 Thread Marc Nieper-Wißkirchen

Hi Bruno,

Am Do., 9. Juli 2020 um 22:16 Uhr schrieb Bruno Haible :

> > a number of modules (like the hash module or the list module) allow
> > the user to specify callbacks (e.g. a comparison function).
> > Unfortunately, these procedures do not take a context parameter, which
> > can be a problem because C lacks closures.
>
> Is this a practical, actual problem, or only a theoretical one?

It is a practical one; in a computer algebra application, I have lots
of small entities (two words each), whose sort order is determined by
the extra data.

> I would hesitate to change (or duplicate) public API, when there is no
> practical need.
>
> Note that the lack of context can be remedied by
>   - use of per-thread variables, or

That's doable, but thread-local variables are like dynamically scoped
variables, which makes reasoning about the program a bit harder.
Moreover, for decent speed, native thread locals would have to be
supported.

>   - use of nested functions [1], or

I don't want to use non-portable functions.

>   - storing a pointer to the necessary context in the list elements.

In my use case, this would be an increase of 50% in the size of the
elements everywhere they appear.

> > The original qsort function in stdlib.h has the same problem. Glibs
> > has remedied the problem by introducing qsort_r.
>
> qsort_r is only portable to glibc systems, FreeBSD, and macOS. And yet,
> no one has requested a substitute for it in gnulib. IMO this indicates
> that few programs need this function.

I haven't yet needed specifically qsort_r as well in portable code
(and only once in non-portable private code), but I have mentioned it
because it illustrates the fundamental problem with a missing context
argument.

I see the point that one shouldn't lightly double an API. Although I
think that the current API is flawed with respect to a missing context
parameter, we cannot change it either without breaking old code.

One could use some macro-metaprogramming technique to get two versions
for each module.

Or one could add a global flag to the configuration (which ends in
config.h) and which changes the behavior globally for the application
(I am not sure whether this technique has been employed before).

Marc

Re: Easy Accurate Reading and Writing of Floating-Point Numbers

2020-06-25 Thread Marc Nieper-Wißkirchen

Hi Bruno,

thank you very much.

Best regards,

Marc

Am Do., 25. Juni 2020 um 21:19 Uhr schrieb Bruno Haible :
>
> > Marc Nieper-Wißkirchen wrote:
> > > Please see the attached patch file, my first attempt (and first
> > > contribution to Gnulib).
> >
> > The patch looks all correct.
> >
> > The only (tiny) issue is a matter of style: Two of the new module files 
> > don't
> > end in a newline.
>
> I've pushed Marc's patch in his name, with the newlines fixed.
>
>
> 2020-06-25  Marc Nieper-Wißkirchen  
>
> c-dtoastr, c-ldtoastr: new modules
> These modules provide the same functionality as the modules
> dtoastr and ldtoastr except for the formatting taking place in the
> C locale.
> * MODULES.html.sh: Add c-dtoastr and c-ldtoastr.
> * lib/c-dtoastr.c, lib/c-ldtoastr.c: New files.
> * lib/ftoastr.c: Prefix exported functions when the macro C_LOCALE is
> defined.  Use c_snprintf and c_strtod/c_strtold instead of
> snprintf and strtod/strtold whhen the macro C_LOCALE is defined.
> * lib/ftoastr.h: Add prototypes for c_dtoastr and c_ldtoastr.
> * modules/c-dtoastr, modules/c-dtoastr-tests, modules/c-ldtoastr,
> modules/c-ldtoastr-tests: New files.
> * tests/test-c-dtoastr.c, tests/test-c-dtoastr.sh,
> tests-c-ldtoastr.c tests-c-ldtoastr.sh: New files.
>
>

Re: Add gl_list_remove_last to list/xlist

2020-06-03 Thread Marc Nieper-Wißkirchen

Hi Bruno,

now that some operations together with their complexity for dealing
with the end of the list have been added, what do you think of adding
a reverse iterator?

Thanks,

Marc

PS I've received a reply by the FSF; now I am waiting for the response
by my university's lawyer.

Am Sa., 2. Mai 2020 um 23:24 Uhr schrieb Bruno Haible :
>
> I wrote:
> > I should better revert yesterday's patch, and instead,
> > in the table show the guaranteed average performance
> >   gl_list_get_first
> >   gl_list_get_last
> >   gl_list_set_first
> >   gl_list_set_last
> >   gl_list_remove_first
> >   gl_list_remove_last
> > where these 6 functions are defined globally, not separately for each
> > implementation.
>
> Done through the two attached patches.
>
> 2020-05-02  Bruno Haible  
>
> list: Add get_first, get_last, set_first, set_last operations.
> * lib/gl_list.h (gl_list_get_first, gl_list_get_last,
> gl_list_nx_set_first, gl_list_nx_set_last): New functions.
> * lib/gl_xlist.h (gl_list_set_first, gl_list_set_last): New functions.
>
> 2020-05-02  Bruno Haible  
>
> list: Remove redundant code for remove_first and remove_last 
> operations.
> * lib/gl_list.h (struct gl_list_implementation): Remove fields
> remove_first, remove_last.
> (gl_list_remove_first, gl_list_remove_last): Implement in a generic 
> way.
> * lib/gl_array_list.c: Revert last change.
> * lib/gl_carray_list.c: Likewise.
> * lib/gl_anylinked_list2.h: Likewise.
> * lib/gl_linked_list.c: Likewise.
> * lib/gl_linkedhash_list.c: Likewise.
> * lib/gl_anytree_list2.h: Likewise.
> * lib/gl_avltree_list.c: Likewise.
> * lib/gl_avltreehash_list.c: Likewise.
> * lib/gl_rbtree_list.c: Likewise.
> * lib/gl_rbtreehash_list.c: Likewise.
> * lib/gl_sublist.c: Likewise.
>

Re: Add gl_list_remove_last to list/xlist

2020-06-03 Thread Marc Nieper-Wißkirchen

Am Mi., 3. Juni 2020 um 23:08 Uhr schrieb Bruno Haible :

> It's rarely used: In most cases, a list is either traversed one way or
> the other way.

One could say that a list that is to be transversed in both ways is a
different abstract data type; nevertheless, they appear (e.g. lists of
instructions in compiler analysis usually have to traversed in both
ways).

> If a list going to be traversed in reverse order, the programmer can just
> keep it in opposite order and use the normal forward iterator.
> Or they can use an array (or an array-based list) and use indices.

For an array-based list, this is fine. For a general list, one could
use gl_list_previous_node; however, one would have to keep a node of
the last element (there's no gl_list_first_node/gl_list_last_node).

> I find the amount of bloat in the C++ standard library horrible. In C
> at least, we can concentrate on the things that get used, not on the
> things that some rare programmer might find useful some day.

I admit that there is some wisdom in these words. :)

Re: Module with preprocessor utilities

2020-07-24 Thread Marc Nieper-Wißkirchen

Am Fr., 24. Juli 2020 um 10:56 Uhr schrieb Florian Weimer :

> This is the case that is unclear:
>
> double x[2];
> double *p = [1];
>
> The standard explicitly says “first element of an array”.

That's fine as well, I think. "x + 1" just points to an array of
length 1 in memory.

Re: stack module

2020-07-23 Thread Marc Nieper-Wißkirchen

Hi Bruno,

> This is perfectly acceptable for Gnulib. It has debuggability and type safety.

Perfect. Then I will come up with a module in this form.

> You have precedent e.g. in lib/diffseq.h and lib/aligned-malloc.h.

Good to know; I hadn't taken a look at these sources yet.

> You can even omit the 'GL_' prefix from the macro names. The user can #undef
> the macros after including the file; therefore there is nearly no risk of
> collision with macros defined by other code.

Or the included file #undefs the macros as in lib/diffseq.h.

Speaking of prefixes: As macro names (and other names) starting with
"_GL_" are not officially allowed by ISO C, I think it makes sense to
think of a new convention for new sources (while old sources could
later be updated incrementally).

For example, one could use the prefix "GL__" (double underscore) for
(future) private identifiers.

Marc

Re: Module with preprocessor utilities

2020-07-24 Thread Marc Nieper-Wißkirchen

Am Fr., 24. Juli 2020 um 10:05 Uhr schrieb Florian Weimer :

> It's still a candidate for an RFE.  Martin Sebor has been working on
> such warnings.  I'm going to talk to him.

It may also be useful for optimizations because the compiler may make
assumptions.

> >> It's also undefined when you pass the address of something that is not
> >> the first element of an array of type double to foo1.  (Undefined in the
> >> sense that the standard does not say what happens in that case.)
> >
> > Yes, but a pointer to (an existing) double fulfills this requirement.
>
> I don't think that's how the standard defines “first element of an
> array”.  I suspect the intent was that it the pointer has to point to an
> array of at least the indicated number of elements, but that's not what
> the standard says.

According to ISO C, "an array type describes a contiguously allocated
nonempty set of objects with a particular member object type, called
the element type". Thus, given

double x;
double *p = 

the pointer p points to the first element of an array (of length 1).

Maybe I have misunderstood you.

Thanks,

Marc

Re: stack module

2020-07-22 Thread Marc Nieper-Wißkirchen

Am Sa., 23. Mai 2020 um 19:19 Uhr schrieb Bruno Haible :
>
> Marc Nieper-Wißkirchen wrote:
> > > I was expecting that you write
> > >
> > >   struct
> > >   {
> > > void *base; ...
> > >   }
> >
> > This removes type safety. The benefit of the current approach is that
> > stack types of different types are not compatible.
>
> Indeed. Yes, it's a difficult trade-off between debuggability, binary code 
> size,
> and type safety...

The alternative with the same type safety would be a source file with
stack code procedures meant for inclusion (without include guards).
The source file would expect a preprocessor defines GL_STACK_NAME,
GL_STACK_TYPE, and GL_STACK_EXTERN.

The file itself would contain code like the following:

#define _GL_STACK_PREFIX(name) _GL_CONCAT(GL_STACK_NAME, _GL_CONCAT(_, name))

typedef struct
{
  GL_STACK_TYPE *base;
  size_t size;
  size_t allocated;
}
GL_STACK_PREFIX(type);

GL_STACK_EXTERN GL_STACK_PREFIX(init) (GL_STACK_PREFIX(type) *stack)
{
  stack->base = NULL;
  stack->size = 0;
  stack->allocated = 0;
}

...

The advantage of this model is that it generalizes to other data
structures, for which a sole (or at least simple) macro implementation
is not possible.

What do you think?

Marc

Re: Module with preprocessor utilities

2020-07-24 Thread Marc Nieper-Wißkirchen

Am Fr., 24. Juli 2020 um 08:53 Uhr schrieb Florian Weimer :

> * Bruno Haible:

> > (This is with gcc 10.1.0.)
> >
> > clang, OTOH, produces warnings for both foo1 and foo2.
> >
> > But I won't spend time to report a GCC bug on this, because - as you said -
> > without the ability to declare a pointer to an incomplete type or a 'void *'
> > as non-null, this language feature is worthless.

I don't think that it is a bug in the technical sense. A C99 compiler
does not have to warn as, in general, it is undecidable whether a
particular argument is NULL or not.

For reference, this language feature is described in 6.7.5.3 (7) in
the ISO C99 document.

It's good for documenting interfaces as is the restrict qualifier, or,
rather, it would be good if ISO C allowed arrays of incomplete types
where pointers of incomplete types are allowed.

So, instead of filing a bug for GCC, it may a better use of one's time
to file a request with WG 14 to relax the language in this regard.

> It's also undefined when you pass the address of something that is not
> the first element of an array of type double to foo1.  (Undefined in the
> sense that the standard does not say what happens in that case.)

Yes, but a pointer to (an existing) double fulfills this requirement.

Marc

1 2 >

1 - 100 of 174 matches

Mail list logo