Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Bruno Haible
Hi Marc,

Some comments:

* GL_HAMT_THREAD_SAFE can be defined to 1 not only if
__STDC_VERSION__ >= 201112 && ! defined __STD_NO_ATOMICS__
  but also if
__GNUC__ + (__GNUC_MINOR >= 9) > 4
  (see the other mail).

* Still '#ifdef GL_HAMT_THREAD_SAFE'.

* Typo s/comparision/comparison/

* Hamt_processor: The comment does not say what the return value means. Maybe it
  would be clearer if this type was moved down to the "Iteration" section.

* hamt_lookup: If the caller is allowed to modify the payload stored in the
  returned entry, this function should return a 'Hamt_entry *', not a
  'const Hamt_entry *'. Just like 'strchr' and 'strstr' return a 'char *',
  not a 'const char *'.

* In trienode_count, you don't need count_one_bits_l, only count_one_bits.
  It's OK to assume that 'uint32_t' and 'unsigned int' are the same type.

* If subtrie_lookup was inlined in entry_lookup, you could get rid of the
  forward declaration of entry_lookup, and compilers would be more likely
  to turn the recursion of entry_lookup into a loop (tail-recursion
  elimination).

* If subtrie_do_while was inlined in entry_do_while, you could get rid of the
  forward declaration of entry_do_while.

* It would be useful to be able to construct modules 'hamt-set' and 'hamt-map',
  analogous to 'hash-set' and 'hash-map', but the hamt API currently has two
  issues that block it:

  1) The iterator API is such that you cannot write the iteration using a loop;
 it involves a function call that is active during the entire iteration.
 This is also problematic for two other reasons:

   - Creating a closure in C (unlike Scheme or Lisp) is tedious hand-work.
 (The GNU C extension that allows local functions inside functions [1]
 is not a solution, because
   - it is unportable,
   - it forces an executable stack on many platforms, which is
 considered bad for security.)

   - In Common Lisp, iteration through a hash table is available not only
 through a function-call interface (MAPHASH [2]) but also through an
 iteration-loop interface (WITH-HASH-TABLE-ITERATOR [3], LOOP [4]).

 For example, in GNU clisp, LOOP expands to

 [1]> (macroexpand-1 '(loop for x being each hash-key of h do (foo x)))
 (MACROLET ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-ERROR)))
  (BLOCK NIL
   (LET ((#:WHTI-3214 (SYSTEM::HASH-TABLE-ITERATOR H)))
(MULTIPLE-VALUE-BIND (#:MORE?3215 #:HASH-KEY-3216 #:HASH-VALUE-3217)
 (SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214)
 (DECLARE (IGNORE #:HASH-VALUE-3217))
 (LET ((X NIL))
  (LET NIL
   (MACROLET ((LOOP-FINISH NIL '(GO SYSTEM::END-LOOP)))
(TAGBODY SYSTEM::BEGIN-LOOP (UNLESS #:MORE?3215 (LOOP-FINISH))
 (SETQ X #:HASH-KEY-3216) (PROGN (PROGN (FOO X)))
 (MULTIPLE-VALUE-SETQ
  (#:MORE?3215 #:HASH-KEY-3216 #:HASH-VALUE-3217)
  (SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214))
 (GO SYSTEM::BEGIN-LOOP) SYSTEM::END-LOOP
 (MACROLET
  ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-WARN)
'(GO SYSTEM::END-LOOP ;

 You can see that there is an initial function call HASH-TABLE-ITERATOR
 upfront, and then a function call to HASH-TABLE-ITERATE in each round
 of the loop.

 This principle makes it possible to iterate e.g. through a hash table
 and a list in parallel.

  2) We have a dilemma regarding use of malloc vs. xmalloc. While modules
 meant for use in programs can readily use xmalloc, modules meant for use
 (also) in libraries cannot use xmalloc, since a library is not supposed
 to terminate the program, even when memory is tight.

 The solution we found for this dilemma is that the *-set and *-map modules
 use just malloc, and we have thin wrapper modules (xset, xmap) that do
 call xalloc_die() in case of out-of-memory.

 Unfortunately, the API of many of the functions need to be adjusted to
 cope with the possibility of an ENOMEM failure. That is tedious work, and
 the code does not look so pretty afterwards... But I see no other way to
 make it fit for use in libraries.

Bruno

[1] https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Nested-Functions.html
[2] 
http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/fun_maphash.html
[3] 
http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/mac_with-hash_ble-iterator.html
[4] 
http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-2-1-6.html




Re: _Atomic

2020-10-10 Thread Bruno Haible
Paul Eggert wrote:
> On 10/10/20 8:04 AM, Marc Nieper-Wißkirchen wrote:
> > #if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
> > !defined (__STD_NO_ATOMICS__)
> > 
> > I am asking because there may be non-C11 compilers that nevertheless
> > understand _Atomic.
> 
> I suggest not worrying about this problem until we run into it.

GCC 4.9.x is such a compiler that is non-C11 but supports _Atomic.

With this test program

==
#include 

#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
  && !defined (__STD_NO_ATOMICS__)
# define GL_HAMT_THREAD_SAFE 1
#else
# define GL_HAMT_THREAD_SAFE 0
#endif

_Atomic int x;

void increment_x (void)
{
  x++;
}

int main (void)
{
  printf ("GL_HAMT_THREAD_SAFE=%d\n", GL_HAMT_THREAD_SAFE);
  return 0;
}
==

the results on the various platforms are as follows:

   GL_HAMT_THREAD_SAFE  _Atomic compiles   _Atomic works
  GCC 4.8 0   0  0
  GCC 4.9 0   1  1
  GCC 5   1   1  1
  GCC 6   1   1  1
  GCC 7   1   1  1
  GCC 8   1   1  1
  GCC 9   1   1  1
  GCC 10  1   1  1
  macOS 10.13 1   1  1
  FreeBSD 12  1   1  1
  NetBSD 91   1  1
  OpenBSD 6.7 cc  1   1  1
  OpenBSD 6.7 gcc 0   0  0
  AIX 7.1 xlc 0   0  0
  Solaris 10 cc   0   0  0
  Solaris 11.3 cc 12.40   0  0
  Solaris 11.3 cc 12.51   1  1
  Solaris 11.3 cc 12.61   1  1
  MSVC 14 0   0  0

Bruno




Re: stack bounds

2020-10-10 Thread Bruno Haible
Paul Eggert wrote:
> > On Linux, the kernel allows the stack to grow by any amount, if it does not
> > become closer than 1 MB to another VMA and does not violate the set limits.
> > See linux/mm/mmap.c:expand_downwards and linux/mm/mmap.c:acct_stack_growth.
> > Therefore on Linux, there is no need for a guard page and no need for
> > 'gcc -fstack-clash-protection'.
> 
> There's still a need, if a function declares a large local variable, as the 
> stack pointer can jump around the 1 MB barrier and trash other storage. If I 
> compile the attached program with 'gcc -m32 -O2 stackish.c' on Fedora 31 
> x86-64, 
> the program exits with status 255 (instead of crashing with a stack overflow 
> as 
> it should), because the stack has overflowed and has stomped on the heap. So 
> stack overflow checking is not "just working", at least for this particular 
> case.

Oh, I see: your program is not getting near the heap with the stack, it is
getting directly *into* the heap (because it fills the bottom of array 'b'
without having filled the rest of 'b' first).

gcc -fstack-clash-protection -m32 -O2 stackish.c fixes this issue.

So, you want 'gcc -fstack-clash-protection' [1] to become enabled by default?
Some distros are doing this already:
  - Ubuntu 20.04 [2] (also -fstack-clash-protection is part of the default
gcc flags for users),
  - RHEL 8 [1] (but apparently not by default for user-compiled programs),
and the Firefox people are considering it [3].

Bruno

[1] 
https://developers.redhat.com/blog/2020/05/22/stack-clash-mitigation-in-gcc-part-3/
[2] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1588710
.file   "stackish.c"
.text
.p2align 4
.globl  growby
.type   growby, @function
growby:
.LFB27:
.cfi_startproc
endbr64
pushq   %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movslq  %edi, %rdi
addq$15, %rdi
movq%rsp, %rbp
.cfi_def_cfa_register 6
subq$16, %rsp
movq%fs:40, %rax
movq%rax, -8(%rbp)
xorl%eax, %eax
movq%rsp, %rdx
movq%rdi, %rax
andq$-4096, %rdi
subq%rdi, %rdx
andq$-16, %rax
cmpq%rdx, %rsp
je  .L3
.L14:
subq$4096, %rsp
orq $0, 4088(%rsp)
cmpq%rdx, %rsp
jne .L14
.L3:
andl$4095, %eax
subq%rax, %rsp
testq   %rax, %rax
jne .L15
.L4:
movl%esi, %r8d
movq%rsp, %rcx
addl$256, %r8d
js  .L5
movl%r8d, %edi
xorl%eax, %eax
.p2align 4,,10
.p2align 3
.L6:
movq%rax, %rdx
movb%al, (%rcx,%rax)
addq$1, %rax
cmpq%rdx, %rdi
jne .L6
.L5:
movslq  %esi, %rsi
movslq  %r8d, %r8
movsbl  (%rcx,%rsi), %eax
movsbl  (%rcx,%r8), %edx
subl%edx, %eax
movq-8(%rbp), %rsi
xorq%fs:40, %rsi
jne .L16
leave
.cfi_remember_state
.cfi_def_cfa 7, 8
ret
.p2align 4,,10
.p2align 3
.L15:
.cfi_restore_state
orq $0, -8(%rsp,%rax)
jmp .L4
.L16:
call__stack_chk_fail@PLT
.cfi_endproc
.LFE27:
.size   growby, .-growby
.section.text.startup,"ax",@progbits
.p2align 4
.globl  main
.type   main, @function
main:
.LFB28:
.cfi_startproc
endbr64
pushq   %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movl$1, %esi
movq%rsp, %rbp
.cfi_def_cfa_register 6
pushq   %r12
.cfi_offset 12, -24
leal1073741824(%rdi), %r12d
pushq   %rbx
.cfi_offset 3, -32
movl%edi, %ebx
movslq  %r12d, %rdi
subq$16, %rsp
movq%fs:40, %rax
movq%rax, -24(%rbp)
xorl%eax, %eax
callcalloc@PLT
movl%r12d, %edx
movq%rsp, %rdi
shrl$31, %edx
movq%rax, %rcx
movq%rax, -32(%rbp)
leaq-32(%rbp), %rax
addl%r12d, %edx
sarl%edx
movslq  %edx, %rdx
addq%rcx, %rdx
movq%rsp, %rcx
subq%rdx, %rax
cltq
addq$15, %rax
movq%rax, %rdx
andq$-4096, %rax
subq%rax, %rcx
andq$-16, %rdx
movq%rcx, %rax
cmpq%rax, %rsp
je  .L19
.L32:
subq$4096, %rsp
orq $0, 4088(%rsp)
cmpq%rax, %rsp
jne .L32
.L19:
andl$4095, %edx
subq%rdx, %rsp
testq   %rdx, %rdx
jne .L33
.L20:
movl%ebx, %r8d
movq%rsp, %rcx
addl

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen
I have attached an improved version of the HAMT module to this email.

Apart from improving the comments (which includes moving some
documentation into the header file) and changing the things already
discussed, I added a few more tests and three procedures for in-place
updates.

For example, you can now do hamt_insert_x to destructively insert an
element into a hamt (instead of inserting the element into a
conceptual copy as you do with hamt_insert). Hamts that share
structure with the modified hamt are not effected. (The idea with the
_x prefix comes from Guile whose C interface uses _x where Scheme uses
!.)

NB: For those who fancy Scheme, hamts can be used to implement the
library (srfi 146 hash) of SRFI 146 ([1]). The destructive update
operations of this module can be used to implement the linear-update
procedures of (srfi 146 hash) efficiently.

--

[1] https://srfi.schemers.org/srfi-146/srfi-146.html

Am Sa., 10. Okt. 2020 um 23:24 Uhr schrieb Marc Nieper-Wißkirchen
:
>
> Am Sa., 10. Okt. 2020 um 20:19 Uhr schrieb Paul Eggert :
>
> > #if __STDC_VERSION__ < 201112 || defined __STD_NO_ATOMICS__
> >
> > which is a cleaner way of writing the negative of the above test. These days
> > there should be no reason to check whether __STDC_VERSION__ is defined,
> > generally it's clearer to use "<" instead of ">=" so that textual order 
> > reflects
> > numeric order, and the parens after "defined" are better omitted.
>
> Thanks, I did the edit in my local copy.
From db21b3e975ad7526e0db531fd4936354d8f031ae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marc=20Nieper-Wi=C3=9Fkirchen?= 
Date: Sat, 10 Oct 2020 23:38:21 +0200
Subject: [PATCH] hamt: New module.

This module provides (persistent) hash array mapped tries.
* MODULES.html.sh: Add hamt.
* lib/hamt.c: New file.
* lib/hamt.h: New file.
* modules/hamt: New file.
* modules/hamt-tests: New file.
* tests/test-hamt.c: New file.
---
 ChangeLog  |  11 +
 MODULES.html.sh|   1 +
 lib/hamt.c | 984 +
 lib/hamt.h | 182 +
 modules/hamt   |  29 ++
 modules/hamt-tests |  11 +
 tests/test-hamt.c  | 352 
 7 files changed, 1570 insertions(+)
 create mode 100644 lib/hamt.c
 create mode 100644 lib/hamt.h
 create mode 100644 modules/hamt
 create mode 100644 modules/hamt-tests
 create mode 100644 tests/test-hamt.c

diff --git a/ChangeLog b/ChangeLog
index 2aba2b0c7..b9e4f9970 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2020-10-10  Marc Nieper-Wißkirchen  
+
+	hamt: New module.
+	This module provides (persistent) hash array mapped tries.
+	* MODULES.html.sh: Add hamt.
+	* lib/hamt.c: New file.
+	* lib/hamt.h: New file.
+	* modules/hamt: New file.
+	* modules/hamt-tests: New file.
+	* tests/test-hamt.c: New file.
+
 2020-10-10  Bruno Haible  
 
 	*-list, *-oset, *-omap: Avoid possible compiler warnings.
diff --git a/MODULES.html.sh b/MODULES.html.sh
index 7e7cdae3e..2907eb741 100755
--- a/MODULES.html.sh
+++ b/MODULES.html.sh
@@ -2028,6 +2028,7 @@ func_all_modules ()
   func_module hash-pjw
   func_module hash-pjw-bare
   func_module hash
+  func_module hamt
   func_module readline
   func_module readtokens
   func_module readtokens0
diff --git a/lib/hamt.c b/lib/hamt.c
new file mode 100644
index 0..eb15e19c1
--- /dev/null
+++ b/lib/hamt.c
@@ -0,0 +1,984 @@
+/* (Persistent) hash array mapped tries.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see .  */
+
+/* Written by Marc Nieper-Wißkirchen , 2020.  */
+
+#include 
+#include "hamt.h"
+
+#include 
+#include 
+#include 
+#include 
+#include "count-one-bits.h"
+#include "verify.h"
+#include "xalloc.h"
+
+/* Reference counters can be shared between different threads if the
+   entry they belong to is shared between different threads.
+   Operations on them therefore have to be atomic so that no invalid
+   state is observable.
+
+   A thread must not modify an entry or its children (!) if its
+   reference count implies that the entry is shared by at least two
+   hamts.  */
+typedef
+#if GL_HAMT_THREAD_SAFE
+_Atomic
+#endif
+size_t ref_counter;
+
+/* Hash values are of type size_t.  For each level of the trie, we use
+   5 bits (corresponding to lg2 of the width of a 32-bit word.  */
+#define MAX_DEPTH ((SIZE_WIDTH + 4) / 5)
+
+/***/
+/* Entry Types */

Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen
Am Sa., 10. Okt. 2020 um 20:19 Uhr schrieb Paul Eggert :

> #if __STDC_VERSION__ < 201112 || defined __STD_NO_ATOMICS__
>
> which is a cleaner way of writing the negative of the above test. These days
> there should be no reason to check whether __STDC_VERSION__ is defined,
> generally it's clearer to use "<" instead of ">=" so that textual order 
> reflects
> numeric order, and the parens after "defined" are better omitted.

Thanks, I did the edit in my local copy.



Re: Unused parameter warnings

2020-10-10 Thread Bruno Haible
Paul Eggert wrote:
> I installed the attached commentary changes.

OK, with these definitions of 'pure' and 'const', I can make the change Marc
requested - and some more.


2020-10-10  Bruno Haible  

*-list, *-oset, *-omap: Avoid possible compiler warnings.
Reported by Marc Nieper-Wißkirchen in
.
* lib/gl_anylinked_list2.h (gl_linked_iterator,
gl_linked_iterator_from_to): Mark as 'pure'.
(gl_linked_iterator_free): Mark as 'const'.
* lib/gl_anytree_list2.h (gl_tree_size, gl_tree_node_value,
gl_tree_search_from_to, gl_tree_indexof_from_to, gl_tree_iterator,
gl_tree_iterator_from_to, gl_tree_sortedlist_search,
gl_tree_sortedlist_search_from_to, gl_tree_sortedlist_indexof,
gl_tree_sortedlist_indexof_from_to): Mark as 'pure'.
(gl_tree_iterator_free): Mark as 'const'.
* lib/gl_anytree_omap.h (gl_tree_size, gl_tree_iterator): Mark as
'pure'.
(gl_tree_iterator_free): Mark as 'const'.
* lib/gl_anytree_oset.h (gl_tree_size, gl_tree_next_node,
gl_tree_prev_node, gl_tree_iterator): Mark as 'pure'.
(gl_tree_iterator_free): Mark as 'const'.
* lib/gl_anytreehash_list1.h (node_position, compare_by_position,
compare_position_threshold): Mark as 'pure'.
* lib/gl_array_list.c (gl_array_size, gl_array_indexof_from_to,
gl_array_search_from_to, gl_array_iterator, gl_array_iterator_from_to,
gl_array_sortedlist_indexof_from_to, gl_array_sortedlist_indexof,
gl_array_sortedlist_search_from_to, gl_array_sortedlist_search): Mark as
'pure'.
(gl_array_iterator_free): Mark as 'const'.
* lib/gl_array_omap.c (gl_array_size, gl_array_indexof, gl_array_search,
gl_array_search_atleast, gl_array_iterator): Mark as 'pure'.
(gl_array_iterator_free): Mark as 'const'.
* lib/gl_array_oset.c (gl_array_size, gl_array_indexof, gl_array_search,
gl_array_indexof_atleast, gl_array_search_atleast, gl_array_iterator,
gl_array_iterator_atleast): Mark as 'pure'.
(gl_array_iterator_free): Mark as 'const'.
* lib/gl_carray_list.c (gl_carray_size, gl_carray_node_value,
gl_carray_next_node, gl_carray_previous_node, gl_carray_get_at,
gl_carray_indexof_from_to, gl_carray_search_from_to, gl_carray_iterator,
gl_carray_iterator_from_to, gl_carray_sortedlist_indexof_from_to,
gl_carray_sortedlist_indexof, gl_carray_sortedlist_search_from_to,
gl_carray_sortedlist_search): Mark as 'pure'.
(gl_carray_iterator_free): Mark as 'const'.

2020-10-10  Bruno Haible  

rbtree-list: Avoid possible compiler warnings.
This mirrors the change of avltree-list on 2014-09-16.
* lib/gl_rbtree_list.c (gl_rbtree_list_check_invariants): Add extern
declaration. Add cast to void for ignored value of check_invariants.

>From be744f1e1db0b7dc79f8206fd83918c32fad1d31 Mon Sep 17 00:00:00 2001
From: Bruno Haible 
Date: Sat, 10 Oct 2020 22:17:15 +0200
Subject: [PATCH 1/2] rbtree-list: Avoid possible compiler warnings.

This mirrors the change of avltree-list on 2014-09-16.

* lib/gl_rbtree_list.c (gl_rbtree_list_check_invariants): Add extern
declaration. Add cast to void for ignored value of check_invariants.
---
 ChangeLog | 7 +++
 lib/gl_avltree_list.c | 1 -
 lib/gl_rbtree_list.c  | 4 +++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b479cf0..869f627 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2020-10-10  Bruno Haible  
+
+	rbtree-list: Avoid possible compiler warnings.
+	This mirrors the change of avltree-list on 2014-09-16.
+	* lib/gl_rbtree_list.c (gl_rbtree_list_check_invariants): Add extern
+	declaration. Add cast to void for ignored value of check_invariants.
+
 2020-10-10  Paul Eggert  
 
 	attribute: improve const, pure doc
diff --git a/lib/gl_avltree_list.c b/lib/gl_avltree_list.c
index 35ffec6..600e78b 100644
--- a/lib/gl_avltree_list.c
+++ b/lib/gl_avltree_list.c
@@ -61,7 +61,6 @@ check_invariants (gl_list_node_t node, gl_list_node_t parent)
 
   return 1 + (left_height > right_height ? left_height : right_height);
 }
-
 void
 gl_avltree_list_check_invariants (gl_list_t list)
 {
diff --git a/lib/gl_rbtree_list.c b/lib/gl_rbtree_list.c
index d79becf..5be225f 100644
--- a/lib/gl_rbtree_list.c
+++ b/lib/gl_rbtree_list.c
@@ -37,6 +37,8 @@
 #include "gl_anytree_list2.h"
 
 /* For debugging.  */
+extern void gl_rbtree_list_check_invariants (gl_list_t list);
+
 static unsigned int
 check_invariants (gl_list_node_t node, gl_list_node_t parent)
 {
@@ -64,7 +66,7 @@ void
 gl_rbtree_list_check_invariants (gl_list_t list)
 {
   if (list->root != NULL)
-check_invariants (list->root, NULL);
+(void) check_invariants (list->root, NULL);
 }
 
 const struct gl_list_implementation gl_rbtree_list_implementation =
-- 

Re: Unused parameter warnings

2020-10-10 Thread Bruno Haible
Marc Nieper-Wißkirchen wrote:
> so that the code is compilable with -Werror -Wall -Wextra.

Note that Gnulib does not guarantee this. Especially for -Wsign-compare and
-Wunused-parameter (but also some other warning types) we have no intention
to fix them; the cure would be often worse than the problem.

Bruno




[PATCH] stack: New module.

2020-10-10 Thread Marc Nieper-Wißkirchen
From: Marc Nieper-Wißkirchen 

* MODULES.html.sh: Add entry for the stack module.
* modules/stack: New file.
* modules/stack-tests: New file.
* lib/stack.h: New file.
* tests/test-stack.c: New file.
---
 ChangeLog   |   9 +++
 MODULES.html.sh |   1 +
 lib/stack.h | 165 
 modules/stack   |  25 +++
 modules/stack-tests |  13 
 tests/test-stack.c  |  74 
 6 files changed, 287 insertions(+)
 create mode 100644 lib/stack.h
 create mode 100644 modules/stack
 create mode 100644 modules/stack-tests
 create mode 100644 tests/test-stack.c

diff --git a/ChangeLog b/ChangeLog
index b479cf0c7..f496c8345 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2020-10-10  Marc Nieper-Wißkirchen  
+
+   stack: New module.
+   * MODULES.html.sh: Add entry for the stack module.
+   * modules/stack: New file.
+   * modules/stack-tests: New file.
+   * lib/stack.h: New file.
+   * tests/test-stack.c: New file.
+
 2020-10-10  Paul Eggert  
 
attribute: improve const, pure doc
diff --git a/MODULES.html.sh b/MODULES.html.sh
index a8a629e29..7e7cdae3e 100755
--- a/MODULES.html.sh
+++ b/MODULES.html.sh
@@ -2031,6 +2031,7 @@ func_all_modules ()
   func_module readline
   func_module readtokens
   func_module readtokens0
+  func_module stack
   func_module strverscmp
   func_module filevercmp
   func_end_table
diff --git a/lib/stack.h b/lib/stack.h
new file mode 100644
index 0..5d803901c
--- /dev/null
+++ b/lib/stack.h
@@ -0,0 +1,165 @@
+/* Type-safe stack data type.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see .  */
+
+/* Written by Marc Nieper-Wißkirchen , 2020.  */
+
+/* This header file does not have include-guards as it is meant to be
+   includeable multiple times as long as the necessary defines have
+   been set up.
+
+   A stack is implemented with a homogenous array of elements in
+   memory, which will be resized (and moved) as needed.  Elements are
+   passed by value and it is the responsibility of the user-code to
+   free any resources associated with individual elements.
+
+   The exported names are all prefixed by ‘stack’ (e.g. stack_init),
+   but this can be changed by setting an appropriate define.
+ Type:   stack_type
+ Initialization: stack_init ();
+ De-initialization:  stack_destroy ();
+ Predicate:  bool res = stack_empty ();
+ Introspection:  ELEMENT *base = stack_base ();
+ Pushing:stack_push (, element);
+ Popping:ELEMENT element = stack_pop ();
+ Discarding: stack_discard ();
+ Top-of-stack:   ELEMENT element = stack_top ();
+ Size:   size_t size = stack_size ();
+
+   Here, ELEMENT is the type to which GL_STACK_ELEMENT was defined when
+   this file was included.
+*/
+
+/* Before including this file, you need to define:
+ GL_STACK_ELEMENT   The type of the elements on the stack.
+ GL_STACK_STORAGECLASS  (Optional) The storage class specifier (e.g. 
static)
+to use in the function definitions.
+ GL_STACK_NAME  (Optional) The prefix to use for the type names
+and functions definitions instead of the default
+‘stack’.
+
+   After including this file, these names will be undefined.
+
+   Before including this file, you also need to include:
+ #include ""
+ #include ""
+ #include "assure.h"
+ #include "xalloc.h"
+*/
+
+#ifndef GL_STACK_ELEMENT
+# error "Please define GL_STACK_ELEMENT first."
+#endif
+
+#ifndef GL_STACK_STORAGECLASS
+# define GL_STACK_STORAGECLASS
+#endif
+
+#ifndef GL_STACK_NAME
+# define GL_STACK_NAME stack
+#endif
+
+#define _GL_STACK_PREFIX(name) _GL_CONCAT(GL_STACK_NAME, _GL_CONCAT(_, name))
+#define _GL_STACK_TYPE _GL_STACK_PREFIX(type)
+
+typedef struct
+{
+  GL_STACK_ELEMENT *base;
+  size_t size;
+  size_t allocated;
+} _GL_STACK_TYPE;
+
+/* Initialize a stack.  */
+GL_STACK_STORAGECLASS void
+_GL_STACK_PREFIX (init) (_GL_STACK_TYPE *stack)
+{
+  stack->base = NULL;
+  stack->size = 0;
+  stack->allocated = 0;
+}
+
+/* Destroy a stack by freeing the allocated space.  */
+GL_STACK_STORAGECLASS void
+_GL_STACK_PREFIX (destroy) (_GL_STACK_TYPE *stack)
+{
+  free 

Re: stack bounds

2020-10-10 Thread Paul Eggert

On 10/10/20 5:08 AM, Bruno Haible wrote:

On Linux, the kernel allows the stack to grow by any amount, if it does not
become closer than 1 MB to another VMA and does not violate the set limits.
See linux/mm/mmap.c:expand_downwards and linux/mm/mmap.c:acct_stack_growth.
Therefore on Linux, there is no need for a guard page and no need for
'gcc -fstack-clash-protection'.


There's still a need, if a function declares a large local variable, as the 
stack pointer can jump around the 1 MB barrier and trash other storage. If I 
compile the attached program with 'gcc -m32 -O2 stackish.c' on Fedora 31 x86-64, 
the program exits with status 255 (instead of crashing with a stack overflow as 
it should), because the stack has overflowed and has stomped on the heap. So 
stack overflow checking is not "just working", at least for this particular case.
#include 
#include 

int
growby (int bsize, int argc)
{
  char b[bsize];
  for (int i = 0; i <= argc + 256; i++)
b[i] = i;
  return b[argc] - b[argc + 256];
}


int
main (int argc, char **argv)
{
  int psize = argc + 1024 * 1024 * 1024;
  char *p = calloc (psize, 1);
  int bsize = (char *)  - (p + psize/2);
  int status = growby (bsize, argc);
  for (int i = 0; i < psize; i++)
status |= p[i]++;
  return status;
}


Re: Unused parameter warnings

2020-10-10 Thread Paul Eggert

On 10/10/20 8:00 AM, Marc Nieper-Wißkirchen wrote:


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97364


Looking at that bug report and related material (particularly Richard Biener's 
2012 comment ) the GCC 
folks seem to have decided long ago that const and pure attributes were intended 
more as directives for common subexpression elimination than as independent 
notions in their own right, it's just that they haven't gotten around to (or are 
embarrassed to :-) document that decision. In the meantime, to try to help out 
Gnulib hackers I installed the attached commentary changes. It was cleaner to 
document 'const' first so I rearranged the #defines.
>From e3665e43f2a042b26fa65cdba80dd89846d95540 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 10 Oct 2020 11:48:16 -0700
Subject: [PATCH] attribute: improve const, pure doc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Marc Nieper-Wißkirchen in:
https://lists.gnu.org/r/bug-gnulib/2020-10/msg00035.html
* lib/attribute.h (ATTRIBUTE_CONST, ATTRIBUTE_PURE): Improv doc.  See:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51971#c1
---
 ChangeLog   |  8 
 lib/attribute.h | 21 -
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 875f3551a..b479cf0c7 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2020-10-10  Paul Eggert  
+
+	attribute: improve const, pure doc
+	Problem reported by Marc Nieper-Wißkirchen in:
+	https://lists.gnu.org/r/bug-gnulib/2020-10/msg00035.html
+	* lib/attribute.h (ATTRIBUTE_CONST, ATTRIBUTE_PURE): Improv doc.  See:
+	https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51971#c1
+
 2020-10-05  Paul Eggert  
 
 	thread: pacify GCC on Solaris 10
diff --git a/lib/attribute.h b/lib/attribute.h
index b97514655..48d9826c8 100644
--- a/lib/attribute.h
+++ b/lib/attribute.h
@@ -170,18 +170,21 @@
 /* Applies to: function.  */
 #define ATTRIBUTE_ALWAYS_INLINE _GL_ATTRIBUTE_ALWAYS_INLINE
 
-/* The function does not affect observable state, and always returns a value.
-   Compilers can omit duplicate calls with the same arguments if
-   observable state is not changed between calls.  (This attribute is
-   looser than ATTRIBUTE_CONST.)  */
+/* It is OK for a compiler to omit duplicate calls with the same arguments.
+   This attribute is safe for a function that neither depends on
+   nor affects observable state, and always returns exactly once -
+   e.g., does not loop forever, and does not call longjmp.
+   (This attribute is stricter than ATTRIBUTE_PURE.)  */
 /* Applies to: functions.  */
-#define ATTRIBUTE_PURE _GL_ATTRIBUTE_PURE
+#define ATTRIBUTE_CONST _GL_ATTRIBUTE_CONST
 
-/* The function neither depends on nor affects observable state,
-   and always returns a value.  Compilers can omit duplicate calls with
-   the same arguments.  (This attribute is stricter than ATTRIBUTE_PURE.)  */
+/* It is OK for a compiler to omit duplicate calls with the same
+   arguments if observable state is not changed between calls.
+   This attribute is safe for a function that does not affect
+   observable state, and always returns exactly once.
+   (This attribute is looser than ATTRIBUTE_CONST.)  */
 /* Applies to: functions.  */
-#define ATTRIBUTE_CONST _GL_ATTRIBUTE_CONST
+#define ATTRIBUTE_PURE _GL_ATTRIBUTE_PURE
 
 /* The function is rarely executed.  */
 /* Applies to: functions.  */
-- 
2.25.1



Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Paul Eggert

On 10/10/20 8:04 AM, Marc Nieper-Wißkirchen wrote:

#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
!defined (__STD_NO_ATOMICS__)

I am asking because there may be non-C11 compilers that nevertheless
understand _Atomic.


I suggest not worrying about this problem until we run into it. Quite possibly 
any actual problem will be something like "_Atomic can be used for most things, 
but not for feature X" in which case we'll likely want the more fine-grained 
check anyway. Until we discover actual problems we can get by with


   #if __STDC_VERSION__ < 201112 || defined __STD_NO_ATOMICS__

which is a cleaner way of writing the negative of the above test. These days 
there should be no reason to check whether __STDC_VERSION__ is defined, 
generally it's clearer to use "<" instead of ">=" so that textual order reflects 
numeric order, and the parens after "defined" are better omitted.




Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen
Am Sa., 10. Okt. 2020 um 19:41 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > Is there a special Gnulib way (or Autoconf macro) for the following test:
> >
> > #if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
> > !defined (__STD_NO_ATOMICS__)
> >
> > I am asking because there may be non-C11 compilers that nevertheless
> > understand _Atomic.
>
> And there may be C11 compilers which don't support _Atomic and nevertheless
> dont't define __STD_NO_ATOMICS__.
>
> So, indeed it would be useful to have an Autoconf macro that tests whether
> _Atomic can be used.

If someone is able to provide such a macro for Gnulib, I would replace
my check with this macro.



Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Bruno Haible
Hi Marc,

> >   - +/* The public interface is documented in the file hamt.c.
> >
> > This is not good. As a user of the module, I would want to read *only*
> > hamt.h and know how to use the public interface. In other words, hamt.h
> > should have the functions' comments.
> 
> I followed the example of the already existing "hash" module. But I
> can easily move the public comments into the header file.

Yes please. Here in Gnulib, we don't have a way to extract reference
documentation from the source code (we don't use doxygen or similar);
therefore a programmer who wants to use a Gnulib module must peruse
the .h file.

> >   - "Each
> > updating operation returns a new hamt, which shares structure with
> > the original one."
> > So, after calling hamt_insert or hamt_delete, which function should I 
> > call
> > to delete the original Hamt without affecting the new one? (In order to
> > avoid accumulating pieces of old Hamts in memory.)
> 
> When you do, say,
> 
> new_hamt = hamt_insert (hamt, ...)
> 
> and new_hamt != hamt afterwards, you have to delete both hamt and
> new_hamt with hamt_free to free all the memory. After you have freed
> the first hamt, the second won't be affected.
> 
> >   - "The GL_HAMT_THREAD_SAFE flag is set if the implementation of hamts
> > is thread-safe as long as two threads do not simultaneously access
> > the same hamt."
> > I don't understand this sentence. Isn't the guarantee that
> > "is thread-safe as long as two threads do not simultaneously access
> > the same hamt" trivial? And what you want to achieve through the use
> > of _Atomic is that it is thread-safe *also* when two threads access
> > the same hamt?
> 
> The guarantee is not trivial because two different hamts may share
> some structure.
> 
> So... assume you have a hamt. You do some operations on it (or you do
> hamt_copy) to get a new_hamt that shares structure. Now if thread 1
> works on hamt and thread 2 on new_hamt, it is safe when
> GL_HAMT_THREAD_SAFE is set.
> 
> If two threads access the same hamt, it is not guaranteed to be safe.
> If you need this, get an (extremely shallow) copy through hamt_copy
> first.

I see. Thanks. These explanation would make good comments.

Bruno




Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Bruno Haible
Hi Marc,

> Is there a special Gnulib way (or Autoconf macro) for the following test:
> 
> #if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
> !defined (__STD_NO_ATOMICS__)
> 
> I am asking because there may be non-C11 compilers that nevertheless
> understand _Atomic.

And there may be C11 compilers which don't support _Atomic and nevertheless
dont't define __STD_NO_ATOMICS__.

So, indeed it would be useful to have an Autoconf macro that tests whether
_Atomic can be used.

AFAICS, neither Autoconf, nor Gnulib, nor the autoconf-archive project
have a test for this.

Bruno




Re: ChangeLog entries

2020-10-10 Thread Bruno Haible
Hi Marc,

> (Do I have to
> do something about the ChangeLog entry or will it be auto-generated
> from my Git commit message?)

The ChangeLog entries are not auto-generated, no. We want the ChangeLog
entries to be essentially in sync with the git commits. This means, you
typically prepare the ChangeLog entry and reuse it for the git commit.
Or you prepare a git commit message (with line width <= 72) and use
it in the ChangeLog entry.

To avoid conflicts in ChangeLog during "git pull", I recomment to use the
git-merge-changelog program (see the comments in lib/git-merge-changelog.c).

Bruno




Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen
Speaking of GL_HAMT_THREAD_SAFE:

Is there a special Gnulib way (or Autoconf macro) for the following test:

#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
!defined (__STD_NO_ATOMICS__)

I am asking because there may be non-C11 compilers that nevertheless
understand _Atomic.

Am Sa., 10. Okt. 2020 um 17:01 Uhr schrieb Marc Nieper-Wißkirchen
:
>
> Am Sa., 10. Okt. 2020 um 16:54 Uhr schrieb Bruno Haible :
> >
> > Since you define GL_HAMT_THREAD_SAFE to either 0 or 1, you need
> > to test it through
> >   #if GL_HAMT_THREAD_SAFE
> > not
> >   #ifdef GL_HAMT_THREAD_SAFE
> >
> > Bruno
>
> Oh, yes... thanks!



Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen
Am Sa., 10. Okt. 2020 um 16:54 Uhr schrieb Bruno Haible :
>
> Since you define GL_HAMT_THREAD_SAFE to either 0 or 1, you need
> to test it through
>   #if GL_HAMT_THREAD_SAFE
> not
>   #ifdef GL_HAMT_THREAD_SAFE
>
> Bruno

Oh, yes... thanks!



Re: Unused parameter warnings

2020-10-10 Thread Marc Nieper-Wißkirchen
Am Sa., 10. Okt. 2020 um 16:39 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > I can ask for a better description of "pure" in the GCC manual.
>
> Yes, please.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97364



Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Bruno Haible
Since you define GL_HAMT_THREAD_SAFE to either 0 or 1, you need
to test it through
  #if GL_HAMT_THREAD_SAFE
not
  #ifdef GL_HAMT_THREAD_SAFE

Bruno




Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Marc Nieper-Wißkirchen
Hi Bruno,

Am Sa., 10. Okt. 2020 um 16:35 Uhr schrieb Bruno Haible :

> It is exciting to see such a datastructure with various application domains 
> [1]
> in Gnulib!

I'm glad that you like the idea.

>
> I haven't read through it line by line; nevertheless a couple of comments:

A few more lines will follow before the final draft is done because I
have been adding some destructive update procedures in the meantime.

>   - +/* The public interface is documented in the file hamt.c.
>
> This is not good. As a user of the module, I would want to read *only*
> hamt.h and know how to use the public interface. In other words, hamt.h
> should have the functions' comments.

I followed the example of the already existing "hash" module. But I
can easily move the public comments into the header file.

>   - "Each
> updating operation returns a new hamt, which shares structure with
> the original one."
> So, after calling hamt_insert or hamt_delete, which function should I call
> to delete the original Hamt without affecting the new one? (In order to
> avoid accumulating pieces of old Hamts in memory.)

When you do, say,

new_hamt = hamt_insert (hamt, ...)

and new_hamt != hamt afterwards, you have to delete both hamt and
new_hamt with hamt_free to free all the memory. After you have freed
the first hamt, the second won't be affected.

>   - "The GL_HAMT_THREAD_SAFE flag is set if the implementation of hamts
> is thread-safe as long as two threads do not simultaneously access
> the same hamt."
> I don't understand this sentence. Isn't the guarantee that
> "is thread-safe as long as two threads do not simultaneously access
> the same hamt" trivial? And what you want to achieve through the use
> of _Atomic is that it is thread-safe *also* when two threads access
> the same hamt?

The guarantee is not trivial because two different hamts may share
some structure.

So... assume you have a hamt. You do some operations on it (or you do
hamt_copy) to get a new_hamt that shares structure. Now if thread 1
works on hamt and thread 2 on new_hamt, it is safe when
GL_HAMT_THREAD_SAFE is set.

If two threads access the same hamt, it is not guaranteed to be safe.
If you need this, get an (extremely shallow) copy through hamt_copy
first.

Marc



Re: Unused parameter warnings

2020-10-10 Thread Bruno Haible
Hi Marc,

> I can ask for a better description of "pure" in the GCC manual.

Yes, please.

> Anyway, the function gl_linked_iterator_from_to in Gnulib should
> probably be fixed independently by adding the "pure" attribute so that
> the code is compilable with -Werror -Wall -Wextra.

I don't think it's independent: The GCC people could decide to change
their implementation, not the documentation, regarding attribute 'pure'.
I would prefer to wait for what the GCC people decide.

Bruno




Re: stack module

2020-10-10 Thread Marc Nieper-Wißkirchen
Hi Bruno,

Am Sa., 10. Okt. 2020 um 16:06 Uhr schrieb Bruno Haible :
>
> Hi Marc,
>
> > please see the attached patch for the new module.
>
> Very nice. Looks all good, except for some nit-picking in the comments:

Thanks.

>
> > + Introspection:  ELEMENT *base = stack_base ();
>
> I would add a comment here:
>   Where ELEMENT is the type to which GL_STACK_ELEMENT was defined when
>   this file was included.
> (It would be tempting to write
> GL_STACK_ELEMENT *base = stack_base ();
> but GL_STACK_ELEMENT is no longer defined after the file was included...)

I will add this comment.

> Other than that, please feel free to commit and push this. Thanks!!

I will do this as soon as I have been approved by Paul. (Do I have to
do something about the ChangeLog entry or will it be auto-generated
from my Git commit message?)

Marc



Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)

2020-10-10 Thread Bruno Haible
Hi Marc,

> It implements a persistent version of Phil Bagwell's HAMTs, which has
> been popularized by Clojure. HAMTs can be used when a persistent
> (functional/pure) version of a data structure akin to hash tables is
> needed. For example, the dynamic environment of a (possibly
> multi-threaded) Lisp or Scheme can be modeled with persistent HAMTs.

It is exciting to see such a datastructure with various application domains [1]
in Gnulib!

I haven't read through it line by line; nevertheless a couple of comments:

  - +/* The public interface is documented in the file hamt.c.

This is not good. As a user of the module, I would want to read *only*
hamt.h and know how to use the public interface. In other words, hamt.h
should have the functions' comments.

  - "Each
updating operation returns a new hamt, which shares structure with
the original one."
So, after calling hamt_insert or hamt_delete, which function should I call
to delete the original Hamt without affecting the new one? (In order to
avoid accumulating pieces of old Hamts in memory.)

  - "The GL_HAMT_THREAD_SAFE flag is set if the implementation of hamts
is thread-safe as long as two threads do not simultaneously access
the same hamt."
I don't understand this sentence. Isn't the guarantee that
"is thread-safe as long as two threads do not simultaneously access
the same hamt" trivial? And what you want to achieve through the use
of _Atomic is that it is thread-safe *also* when two threads access
the same hamt?

Bruno

[1] http://infoscience.epfl.ch/record/64398/files/idealhashtrees.pdf




Re: Unused parameter warnings

2020-10-10 Thread Marc Nieper-Wißkirchen
I can ask for a better description of "pure" in the GCC manual. In any
case, if a function doesn't return at all due to a function call
marked with _Noreturn, it does not affect the observable state locally
and GCC can avoid "emitting some calls in repeated invocations of the
function with the same argument values". As I wrote before, this is an
important observation as otherwise "assert" couldn't be used in "pure"
functions.

Anyway, the function gl_linked_iterator_from_to in Gnulib should
probably be fixed independently by adding the "pure" attribute so that
the code is compilable with -Werror -Wall -Wextra.

Am Sa., 10. Okt. 2020 um 15:50 Uhr schrieb Bruno Haible :
>
> Marc Nieper-Wißkirchen wrote:
> > > And a function that may invoke abort () does "affect observable state".
> >
> > This description of the a pure function does not seem to be accurate.
> > When a function calls another function like abort that is marked with
> > _Noreturn in a pure context, for the compiler the function can still
> > be pure (but not const). It can eliminate a second call to the
> > function with the same parameters. I am pretty sure that the warning
> > of GCC in line 938 is correct.
>
> The documentation of ATTRIBUTE_PURE in Gnulib is taken from the GCC
> documentation [1][2], therefore if you think it needs to be fixed, it's
> through a GCC bug report.
>
> Bruno
>
> [1] https://lists.gnu.org/archive/html/bug-gnulib/2020-05/msg00105.html
> [2] 
> https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Common-Function-Attributes.html
>



Re: stack module

2020-10-10 Thread Bruno Haible
Hi Marc,

> please see the attached patch for the new module.

Very nice. Looks all good, except for some nit-picking in the comments:

> + Introspection:  ELEMENT *base = stack_base ();

I would add a comment here:
  Where ELEMENT is the type to which GL_STACK_ELEMENT was defined when
  this file was included.
(It would be tempting to write
GL_STACK_ELEMENT *base = stack_base ();
but GL_STACK_ELEMENT is no longer defined after the file was included...)

Other than that, please feel free to commit and push this. Thanks!!

Bruno




Re: Unused parameter warnings

2020-10-10 Thread Bruno Haible
Marc Nieper-Wißkirchen wrote:
> > And a function that may invoke abort () does "affect observable state".
> 
> This description of the a pure function does not seem to be accurate.
> When a function calls another function like abort that is marked with
> _Noreturn in a pure context, for the compiler the function can still
> be pure (but not const). It can eliminate a second call to the
> function with the same parameters. I am pretty sure that the warning
> of GCC in line 938 is correct.

The documentation of ATTRIBUTE_PURE in Gnulib is taken from the GCC
documentation [1][2], therefore if you think it needs to be fixed, it's
through a GCC bug report.

Bruno

[1] https://lists.gnu.org/archive/html/bug-gnulib/2020-05/msg00105.html
[2] 
https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Common-Function-Attributes.html




Re: stack bounds

2020-10-10 Thread Bruno Haible
Hi Paul,

you wrote on 2020-09-22:
> I am thinking of some combination of gcc -fstack-check and/or 
> -fstack-clash-protection and/or related ideas (not that I've looked into all 
> the 
> details). That is, I'm expecting help from the hardware, the kernel, and from 
> GCC. Stack-exhaustion checking should "just work". I realize it's a 
> nontrivial ask.

gcc -fstack-clash-protection does exactly what most people expect; as you say,
with the help from the hardware, the kernel, and GCC.
  - The hardware provides the MMU and the concept of page tables, which are
the basis of mprotect().
  - The kernel provides mprotect; organizes a guard page at the bottom of the
stack; and grows the stack if an access to the guard page is made.
  - 'gcc -fstack-clash-protection' arranges that a stack frame that is larger
than one page is accessed in single-page steps, from top to bottom, so
that the kernel will grow the stack and not report a SIGSEGV. Test code
below.
This is how it's done on Windows.

On Linux, the kernel allows the stack to grow by any amount, if it does not
become closer than 1 MB to another VMA and does not violate the set limits.
See linux/mm/mmap.c:expand_downwards and linux/mm/mmap.c:acct_stack_growth.
Therefore on Linux, there is no need for a guard page and no need for
'gcc -fstack-clash-protection'.

In both cases, it "just works". What other features around stack bounds checking
are you thinking of?

gcc -fstack-check is unrelated: It merely adds a canary word in each stack
frame and checks that this word is still present upon return. So, it's merely
a defense against buffer overflows on the stack.

Bruno

= foo.c =
void bar (int *array);

void foo (int n)
{
  int array[n];
  for (int i = 0; i < n; i++)
array[i] = i*i;
  bar (array);
}
=

gcc -fstack-clash-protection -O2 -S foo.c
= foo.s =
.file   "foo.c"
.text
.p2align 4
.globl  foo
.type   foo, @function
foo:
.LFB0:
.cfi_startproc
pushq   %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movslq  %edi, %rcx
leaq15(,%rcx,4), %rax
movq%rcx, %rdx
movq%rax, %rsi
andq$-4096, %rax
movq%rsp, %rdi
movq%rsp, %rbp
.cfi_def_cfa_register 6
andq$-16, %rsi
subq%rax, %rdi
cmpq%rdi, %rsp
je  .L3
.L13:
subq$4096, %rsp
orq $0, 4088(%rsp)
cmpq%rdi, %rsp
jne .L13
.L3:
andl$4095, %esi
subq%rsi, %rsp
testq   %rsi, %rsi
jne .L14
.L4:
movq%rsp, %rdi
testl   %edx, %edx
jle .L5
xorl%eax, %eax
.p2align 4,,10
.p2align 3
.L6:
movl%eax, %edx
imull   %eax, %edx
movl%edx, (%rdi,%rax,4)
addq$1, %rax
cmpq%rcx, %rax
jne .L6
.L5:
callbar
leave
.cfi_remember_state
.cfi_def_cfa 7, 8
ret
.p2align 4,,10
.p2align 3
.L14:
.cfi_restore_state
orq $0, -8(%rsp,%rsi)
jmp .L4
.cfi_endproc
.LFE0:
.size   foo, .-foo
.ident  "GCC: (GNU) 10.2.0"
.section.note.GNU-stack,"",@progbits
=




Re: clang warnings in regcomp.c

2020-10-10 Thread Bruno Haible
Paul Eggert wrote:
> > What is the point of these parentheses?
> 
> Haven't a clue. It's an unusual style.
> 
> >   static reg_errcode_t preorder (bin_tree_t *root,
> > -  reg_errcode_t (fn (void *, bin_tree_t *)),
> > +  reg_errcode_t fn (void *, bin_tree_t *),
> >void *extra);
> 
> For corrections like that I suggest the style "reg_errcode_t (*fn) (void *, 
> bin_tree_t *)" instead, as that's the usual style in glibc and I expect it is 
> the style that was intended in regcomp.c anyway.

Thanks for the advice. I reported it on the glibc tracker:
https://sourceware.org/bugzilla/show_bug.cgi?id=26725

Bruno