Re: patch for mmap and friends

2023-01-13 Thread Matt Wette

On 1/13/23 4:49 PM, Matt Wette wrote:

Please consider this patch for adding mmap(), munmap() and msync()
 to libguile/filesys.c.  Included is update for posix.texi and test 
file mman.test.

Once included, feature 'mman should be #t.

Matt

Please add the attached file: test-suite/tests/mman.test.

I thought it was included in the patch.  It's the thought that counts, 
right?


Matt


 mman.test --- Tests for mmap API.-*- scheme -*-

 Copyright 2022 Free Software Foundation, Inc.

 This library is free software; you can redistribute it and/or
 modify it under the terms of the GNU Lesser General Public
 License as published by the Free Software Foundation; either
 version 3 of the License, or (at your option) any later version.

 This library is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 Lesser General Public License for more details.

 You should have received a copy of the GNU Lesser General Public
 License along with this library; if not, write to the Free Software
 Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
USA

(define-module (test-mman)
  #:use-module (test-suite lib)
  #:use-module (test-suite guile-test)
  #:use-module (rnrs bytevectors)
  #:declarative? #f
  )

(define (mmap-test-file)
  (data-file-name "foo.txt"))

(define mmap-test-string "hello, world")

(define (gen-mmap-test-file)
  (with-output-to-file (mmap-test-file)
(lambda () (display mmap-test-string

(when (provided? 'mman)

  (gen-mmap-test-file)

  (with-test-prefix "mman"

(pass-if "mman 1"
  (let ((bv (mmap 0 #x100)))
(bytevector-u8-set! bv 0 34)
(= (bytevector-u8-ref bv 0) 34)))

))

;; --- last line ---


patch for mmap and friends

2023-01-13 Thread Matt Wette

Please consider this patch for adding mmap(), munmap() and msync()
 to libguile/filesys.c.  Included is update for posix.texi and test 
file mman.test.

Once included, feature 'mman should be #t.

Matt
From 6c944174d35d43f87340c8199d47f3f088fa6ca7 Mon Sep 17 00:00:00 2001
From: Matt Wette 
Date: Fri, 13 Jan 2023 16:42:06 -0800
Subject: [PATCH] Add mmap and friends (munmap, msync).

* libguile/filesys.[ch]: added scm_mmap_search, scm_mmap, scm_msync, and
  init_mman, built on availability of HAVE_MMAN_H; also provides feature
  'mman
* doc/ref/posix.texi: added documentation for mmap and friends
* test-suite/Makefile.am: updated for mman.test
* test-suite/tests/mman.test: mmap tests
---
 configure.ac   |   2 +
 doc/ref/posix.texi |  45 +++
 libguile/filesys.c | 264 +
 libguile/filesys.h |   4 +
 test-suite/Makefile.am |   1 +
 5 files changed, 316 insertions(+)

diff --git a/configure.ac b/configure.ac
index f8c12f0d7..c348d14a2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1018,6 +1018,8 @@ AC_CHECK_MEMBERS([struct tm.tm_gmtoff],,,
 ])
 GUILE_STRUCT_UTIMBUF
 
+AC_CHECK_FUNCS([msync])
+
 
 #
 #
diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
index 5653d3758..16f3bbc49 100644
--- a/doc/ref/posix.texi
+++ b/doc/ref/posix.texi
@@ -1216,6 +1216,51 @@ valid separators.  Thus, programs should not assume that
 separator---e.g., when extracting the components of a file name.
 @end defvr
 
+@deffn {Scheme Procedure} mmap addr len [prot [flags [fd [offset
+@deffnx {Scheme Procedure} mmap/search addr len [prot [flags [fd [offset
+Create a memory mapping, returning a bytevector.  @var{addr}, if
+non-zero, is the staring address; or, if zero, is assigned by the
+system.  @var{prot}, if provided, assigns protection.  @var{fd},
+if provided associates the memory region with a file, starting
+at @var{offset}, if provided.
+The region returned by mmap will NOT be searched by the garbage
+ collector for pointers, while that returned by mmap/search will.
+Note that the finalizer for the returned bytevector will call munmap.
+Defaults for optional arguments are
+@table @asis
+@item prot
+(logior PROT_READ PROT_WRITE)
+@item flags
+(logior MAP_ANONYMOUS MAP_PRIVATE)
+@item fd
+-1
+@item offset
+0
+@end table
+@end deffn
+
+@deffn {Scheme Procedure} munmap bvec
+Given bytevector generated by mmap or mmap/search, unmap the
+the associated memory.  The argument will be modified to
+reflect a zero length bv.  The return value is unspecified.
+Note that munmap is called by finalizer associated with
+bytevectors returned from mmap and mmap/search.
+@end deffn
+
+@deffn {Scheme Procedure} msync addr length flag
+Flush changes made to the in-core copy of a file mapped using
+mmap or mmap/search.  This should be executed on modified memory
+before calling munmap.  The @var{flags} argument must be exactly one
+of the following:
+@table @code
+@item MS_ASYNC
+Initiate update, return immediately.
+@item MS_SYNC
+Initiate update, block until complete.
+@item MS_INVALIDATE
+Invalidate other mappings of the same file.
+@end table
+@end deffn
 
 @node User Information
 @subsection User Information
diff --git a/libguile/filesys.c b/libguile/filesys.c
index 1f0bba556..0ddb4cfee 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -67,11 +67,17 @@
 # include 
 #endif
 
+#ifdef HAVE_SYS_MMAN_H
+# include 
+#endif
+
 #include "async.h"
 #include "boolean.h"
 #include "dynwind.h"
 #include "fdes-finalizers.h"
 #include "feature.h"
+#include "finalizers.h"
+#include "foreign.h"
 #include "fports.h"
 #include "gsubr.h"
 #include "iselect.h"
@@ -2263,6 +2269,261 @@ scm_dir_free (SCM p)
 
 
 
+#ifdef HAVE_SYS_MMAN_H
+/* see https://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html */
+
+static void
+mmap_finalizer (void *ptr, void *data)
+{
+  SCM bvec;
+  void *c_addr;
+  size_t c_len;
+  int rv;
+
+  bvec = SCM_PACK_POINTER (ptr);
+  if (!SCM_BYTEVECTOR_P (bvec))
+scm_misc_error ("mmap", "expecting bytevector", SCM_EOL);
+
+  c_addr = SCM_BYTEVECTOR_CONTENTS (bvec);
+  c_len = SCM_BYTEVECTOR_LENGTH (bvec);
+  SCM_SYSCALL (rv = munmap(c_addr, c_len));
+  if (rv != 0)
+scm_misc_error ("mmap", "failed to munmap memory", SCM_EOL);
+}
+
+SCM_DEFINE (scm_mmap_search, "mmap/search", 2, 4, 0,
+(SCM addr, SCM len, SCM prot, SCM flags, SCM fd, SCM offset),
+	"Create a memory mapping, returning a bytevector..  @var{addr},\n"
+	"if non-zero, is the staring address; or, if zero, is assigned by\n"
+	"the system.  @var{prot}, if provided, assigns protection.\n"
+	"@var{fd}, if provided associates the memory region with a file\n"
+	"starting at @var{offset}, if provided.\n"
+	"The region returned by mmap WILL be searched by the garbage\n"
+	"collector for pointers.  See also mmap.  Note that the\n"
+"finalizer for the returned bytevector will 

Re: [PATCH] Add 'bytevector-slice'.

2023-01-13 Thread Ludovic Courtès
Maxime Devos  skribis:

> On 13-01-2023 12:32, Ludovic Courtès wrote:
>>> IIUC, if you use bytevector-slice iteratively, say:
>>>
>>> (let loop ((bv some-initial-value)
>>> (n large-number))
>>>(if (> n 0)
>>>(loop (bytevector-slice bv 0 (bytevector-length bv))
>>>  (- n 1))
>>>bv))
>>>
>>> you will end up with a bytevector containing a reference to a
>>> bytevector containing a reference to ... containing a reference to the
>>> original reference, in a chain of length ≃ large-number.
>> The ‘parent’ word is here just so the backing memory isn’t reclaimed
>> while the slice is still alive.
>> Whether there’s a long chain of ‘parent’ links or not makes no
>> difference to GC performance nor to memory usage.
>
> This is false, the opposite holds: memory usage is at least linear in
> the length of the chain, because the objects in the chain are pairwise
> non-eq?: for a chain (O_1, O_2, ..., O_n) that is alive, each object
> O_i in the chain is kept alive (because that's what the 'parent' link
> is for).  Because bytevector-slice does a fresh allocation, there then
> are n different objects.  Hence, memory usage is at least
> 'SCM_BYTEVECTOR_HEADER_SIZE * sizeof(scm_t_bits) * n'.

Ah yes, that’s right, I misunderstood the comment.

In the example above, where we’re only dealing with slices, we could
“skip” the parent (i.e., have each slice’s parent point to the “root” of
the hierarchy), but I don’t think we can assume this to be the case
generally.

Ludo’.



[PATCH] Document R7RS bytevector functions

2023-01-13 Thread lloda
Hi

Right now the manual just mentions (scheme base), but not the contents. This 
patch at least makes sure that at least the bytevector-related R7RS functions 
appear in the index.

The patch documents a first group of purely bytevector functions and a second 
group of binary I/O that are not elsewhere in Guile/R6RS or that exist but have 
different definitions.

Regards

  Daniel




0001-Document-R7RS-functions-related-to-bytevectors.patch
Description: Binary data


Re: fibers,questions about thread id and mutation of vectors

2023-01-13 Thread Maxime Devos

  for (i=start; i<=stop; i++)  { /* i is private by default */

scm_init_guile();
scm_call_1( func , scm_from_int(i) );


IIUC, you are calling scm_init_guile once per index, whereas calling it 
once per thread would suffice.  For better performance, I propose doing 
it once per thread.


On 13-01-2023 12:10, Damien Mattei wrote:
there should be difference in implementation of Guile between Mac OS and 
Linux but i do not know the inner mechanism and algorithm used to run 
Guile in a C environment,what  scm_init_guile() is doing?


Guile is free software, you can download the source code to read what 
scm_init_guile is doing.



why must it be placed under the // region on Linux (with slower result)


From the manual:

6.4  Each thread that wants to use functions from the Guile API needs to 
put itself into guile mode with either ‘scm_with_guile’ or 
‘scm_init_guile’. The global state of Guile is initialized automatically 
when the first thread enters guile mode.


OpenMP does multi-threading, so you need to call scm_init_guile or 
scm_with_guile.



and anywhere under MacOS ? (speed up code)


You need to do it on non-Linux too -- the extract from the manual does 
not make a 'except on MacOS' exception.  It might work without 
scm_init_guile in some circumstances, but this is not at all guaranteed.


If you want to know the difference between MacOS and Linux in the 
implementation of scm_init_guile, you can read the source code of 
scm_init_guile as mentioned before.


Greetings,
Maxime.


OpenPGP_0x49E3EE22191725EE.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] Add 'bytevector-slice'.

2023-01-13 Thread Maxime Devos

On 13-01-2023 12:32, Ludovic Courtès wrote:

IIUC, if you use bytevector-slice iteratively, say:

(let loop ((bv some-initial-value)
(n large-number))
   (if (> n 0)
   (loop (bytevector-slice bv 0 (bytevector-length bv))
 (- n 1))
   bv))

you will end up with a bytevector containing a reference to a
bytevector containing a reference to ... containing a reference to the
original reference, in a chain of length ≃ large-number.


The ‘parent’ word is here just so the backing memory isn’t reclaimed
while the slice is still alive.

Whether there’s a long chain of ‘parent’ links or not makes no
difference to GC performance nor to memory usage.


This is false, the opposite holds: memory usage is at least linear in 
the length of the chain, because the objects in the chain are pairwise 
non-eq?: for a chain (O_1, O_2, ..., O_n) that is alive, each object O_i 
in the chain is kept alive (because that's what the 'parent' link is 
for).  Because bytevector-slice does a fresh allocation, there then are 
n different objects.  Hence, memory usage is at least 
'SCM_BYTEVECTOR_HEADER_SIZE * sizeof(scm_t_bits) * n'.


Greetings,
Maxime.


OpenPGP_0x49E3EE22191725EE.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] Add 'bytevector-slice'.

2023-01-13 Thread Ludovic Courtès
Hi,

Maxime Devos  skribis:

> The only thing missing for me, is a procedure
> 'bytevector-slice/read-only' and 'bytevector-slice/write-only', then I
> could throw the module implementing the wrapping in Scheme-GNUnet and
> the overhead incurred by wrapping away.

I did consider the read-only part, but it’s not really implementable
currently as SCM_F_BYTEVECTOR_IMMUTABLE flag is even ignored by
instructions, and fixing it is far beyond the scope of this patch:

  https://issues.guix.gnu.org/60779

> On 11-01-2023 16:00, Ludovic Courtès wrote:

[...]

>> +(use-modules (rnrs bytevectors)
>> + (rnrs bytevectors gnu))
>
> I thought that R6RS reserved (rnrs ...) to the RnRS process, so I
> would think that naming it (rnrs bytevectors gnu) would be
> standards-incompliant, though I cannot find in the specification, so
> maybe it isn't actually reserved.
>
> (SRFI looks a bit looser to me, so I find the (srfi ... gnu)
> acceptable, but (rnrs ... gnu) looks weird to me, I would propose
> (ice-9 bytevector-extensions) or such.).

I’ll stick to gnu.scm because that’s the convention we’ve used for
extensions so far, and “gnu” is arguably “reserved”.

>> +Copyright (C) 1996-1997, 2000-2005, 2009-2023 Free Software Foundation,
>
>
> Where does this year 2022 come from?  Does a previous version of this
> patch predate the new year?

I started working on it in December and my ‘write-file-hooks’ updated it.

>> +  c_offset = scm_to_size_t (offset);
>> +
>> +  if (SCM_UNBNDP (size))
>> +{
>> +  if (c_offset < SCM_BYTEVECTOR_LENGTH (bv))
>> +c_size = SCM_BYTEVECTOR_LENGTH (bv) - c_offset;
>> +  else
>> +c_size = 0; > +}
>> +  else
>> +c_size = scm_to_size_t (size);
>> +
>> +  if (c_offset + c_size > SCM_BYTEVECTOR_LENGTH (bv))
>> +scm_out_of_range (FUNC_NAME, offset);
>
>
> If offset=SIZE_MAX-1 and size=1, this will overflow to 0 and hence not
> trigger the error reporting.  This bounds check needs to be rewritten,
> with corresponding additional tests.

OK.

> IIUC, if you use bytevector-slice iteratively, say:
>
> (let loop ((bv some-initial-value)
>(n large-number))
>   (if (> n 0)
>   (loop (bytevector-slice bv 0 (bytevector-length bv))
> (- n 1))
>   bv))
>
> you will end up with a bytevector containing a reference to a
> bytevector containing a reference to ... containing a reference to the
> original reference, in a chain of length ≃ large-number.

The ‘parent’ word is here just so the backing memory isn’t reclaimed
while the slice is still alive.

Whether there’s a long chain of ‘parent’ links or not makes no
difference to GC performance nor to memory usage.

> Do you know what this 'parent' field even is for?  Going by some
> comments in 'libguile/bytevectors.c', it is for GC reasons, but going
> by the existence of the 'bytevector->pointer' + 'pointer->bytevector'
> trick which destroys 'parent' information, it seems unnecessary to me.

Like I wrote, it was introduced to keep backing memory alive, generally.
With ‘pointer->bytevector’, we want to make sure the original pointer
remains live as long as the bytevector is live (pointer objects can have
a finalizer, and that finalizer must not run while the bytevector is
live).

> Nowadays a https://.../ instead of a mail address, is more
> conventional and useful, I'd think.  Its even used in old files,
> e.g. libguile/r6rs-ports.c.

Noted.

> A test is missing for the case where the size is unaligned instead of
> the offset.

Noted.

I’ll come up with a second version taking this into account.

Thanks!

Ludo’.



Re: fibers,questions about thread id and mutation of vectors

2023-01-13 Thread Damien Mattei
i made some test of openMP and Guile with Guile 3.0.8.99-f3ea8 on MacOS M1
and Linux Intel because i was not sure of the performances. I find a
problem on Linux the code is slower (could be a factor of 5) with openMP
and in Mac OS the gain is is of 100% (divide by 2) or 15% depending of
computation complexity.
i can not explain why it works under MacOS and not Linux, the only
difference of compilation is that under Mac OS i had to force this option
to succeed compiling:
configure --enable-mini-gmp

Anyway it is not good performance for openMP with scheme, under openMP with
n CPUs i have gain of almost n x 100% of speedup, in C language or Fortran
OpenMP when use for astronomical numerical simulation.
in the // region i have only this code on MacOS:

  scm_init_guile();

#pragma omp parallel for

  for (i=start; i<=stop; i++)  { /* i is private by default */

scm_call_1( func , scm_from_int(i) );

with linux this create a segmentation fault unless i move inside the for
loop the line scm_init_guile();

like this:

#pragma omp parallel for

  for (i=start; i<=stop; i++)  { /* i is private by default */

scm_init_guile();
scm_call_1( func , scm_from_int(i) );

https://github.com/damien-mattei/library-FunctProg/blob/master/guile-openMP.c#L91

the scheme+ code for speed test looks like that (i use collatz function to
make the computation unpredictable for any C compiler optimisations when i
compare with pur C code):

;; only for speed tests
{vtstlen <+ 2642245}
{vtst <+ (make-vector vtstlen 0)}

{fct <+ (lambda (x) {x * x * x})}

(define (fctapply i) {vtst[i] <- fct(vtst[i])}) ;; neoteric expression of
{vtst[i] <- (fct vtst[i])}

(define (fctpluscollatzapply i) {vtst[i] <- fctpluscollatz(vtst[i])})

(define (speed-test)

  ;; init data
  (display-nl "speed-test : Initialising data.")
  (for ({i <+ 0} {i < vtstlen} {i <- {i + 1}})
   {vtst[i] <- i})

  ;; compute
  (display-nl "speed-test : testing Scheme alone : start")
  (for ({i <+ 0} {i < vtstlen} {i <- {i + 1}})
   (fctpluscollatzapply i));;(fctapply i))
  (display-nl "speed-test : testing Scheme alone : end")

  (newline)

  ;; display a few results
  (for ({i <+ 0} {i < 10} {i <- {i + 1}})
   (display-nl {vtst[i]}))
  (display-nl ".")
  (for ({i <+ {vtstlen - 10}} {i < vtstlen} {i <- {i + 1}})
   (display-nl {vtst[i]}))

  ;; init data
  (display-nl "speed-test : Initialising data.")
  (for ({i <+ 0} {i < vtstlen} {i <- {i + 1}})
   {vtst[i] <- i})

  ;; compute
  (display-nl "speed-test : testing Scheme with OpenMP : start")
  (openmp 0 {vtstlen - 1} (string->pointer
"fctpluscollatzapply"));;"fctapply"))
  (display-nl "speed-test : testing Scheme with OpenMP : end")

  (newline)

  ;; display a few results
  (for ({i <+ 0} {i < 10} {i <- {i + 1}})
   (display-nl {vtst[i]}))
  (display-nl ".")
  (for ({i <+ {vtstlen - 10}} {i < vtstlen} {i <- {i + 1}})
   (display-nl {vtst[i]}))

  )


(define (collatz n)
  (cond ({n = 1} 1)
({(modulo n 2) = 0} {n / 2})
(else {{3 * n} + 1})))


(define (fctpluscollatz x)
  (declare c)
  (if {x = 0}
  {c <- 0}
  {c <- collatz(x)})
  {{x * x * x} + c})


(define openmp (foreign-library-function "./libguile-openMP" "openmp"
#:return-type int #:arg-types (list int int '*)))


(define libomp (dynamic-link "libomp")) ;;  note: require a link : ln -s
/opt/homebrew/opt/libomp/lib/libomp.dylib libomp.dylib
;; export LTDL_LIBRARY_PATH=. under linux with a link as above
;; or better solution: export LTDL_LIBRARY_PATH=/usr/lib/llvm-14/lib

(define omp-get-max-threads
  (pointer->procedure int
  (dynamic-func "omp_get_max_threads" libomp)
  '()))

https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L3581

output:

scheme@(guile-user)> (speed-test )
speed-test : Initialising data.
speed-test : testing Scheme alone : start
speed-test : testing Scheme alone : end

0
2
9
37
66
141
219
365
516
757
.
18446514741354254581
18446535685572961374
18446556629820732765
18446577574071146391
18446598518350624637
18446619462632745120
18446640406943930245
18446661351257757609
18446682295600649637
18446703239946183906
speed-test : Initialising data.
speed-test : testing Scheme with OpenMP : start
speed-test : testing Scheme with OpenMP : end

0
2
9
37
66
141
219
365
516
757
.
18446514741354254581
18446535685572961374
18446556629820732765
18446577574071146391
18446598518350624637
18446619462632745120
18446640406943930245
18446661351257757609
18446682295600649637
18446703239946183906

the sequential region : 4"
the // region: 2" (twice faster)

of course if i run a pure C eqivlent code it is instantaneous:

// openMP cube - collatz test

#include 
#include 
#include 



// OpenMP on macOS with Xcode tools:
// https://mac.r-project.org/openmp/

// export OMP_NUM_THREADS=8

// this main() in a library was only for testing openMP with Mac OS Xcode
and Linux for use uncomment main() and comment openmp() functions


// mac os 

Re: [EXT] [PATCH] Add 'bytevector-slice'.

2023-01-13 Thread lloda



> On 12 Jan 2023, at 23:27, Ludovic Courtès  wrote:
> 
> lloda  skribis:
> 
>>> On 11 Jan 2023, at 18:37, Thompson, David  wrote:
>>> 
>>> On Wed, Jan 11, 2023 at 12:34 PM Ludovic Courtès  wrote:
 
 What could be convenient though is ‘bytevector-copy’ (no bang), which
 would combine ‘make-bytevector’ + ‘bytevector-copy!’.
>>> 
>>> 'bytevector-copy' already exists, or do you mean some different
>>> implementation of it?
>>> 
>>> - Dave
>> 
>> The current bytevector-copy takes a single argument.
> 
> Right, what I had in mind is one that would take an offset and size; I
> hadn’t realized the name was already taken.
> 
> Thanks,
> Ludo’.

Actually (scheme base) from r7rs already defines (bytevector-copy bytevector 
start end), this is of course r7rs's convention. This is hidden in Guile's 
manual, which only lists the 1-argument version from r6rs.