Re: patch for mmap and friends
On 1/13/23 4:49 PM, Matt Wette wrote: Please consider this patch for adding mmap(), munmap() and msync() to libguile/filesys.c. Included is update for posix.texi and test file mman.test. Once included, feature 'mman should be #t. Matt Please add the attached file: test-suite/tests/mman.test. I thought it was included in the patch. It's the thought that counts, right? Matt mman.test --- Tests for mmap API.-*- scheme -*- Copyright 2022 Free Software Foundation, Inc. This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA (define-module (test-mman) #:use-module (test-suite lib) #:use-module (test-suite guile-test) #:use-module (rnrs bytevectors) #:declarative? #f ) (define (mmap-test-file) (data-file-name "foo.txt")) (define mmap-test-string "hello, world") (define (gen-mmap-test-file) (with-output-to-file (mmap-test-file) (lambda () (display mmap-test-string (when (provided? 'mman) (gen-mmap-test-file) (with-test-prefix "mman" (pass-if "mman 1" (let ((bv (mmap 0 #x100))) (bytevector-u8-set! bv 0 34) (= (bytevector-u8-ref bv 0) 34))) )) ;; --- last line ---
patch for mmap and friends
Please consider this patch for adding mmap(), munmap() and msync() to libguile/filesys.c. Included is update for posix.texi and test file mman.test. Once included, feature 'mman should be #t. Matt From 6c944174d35d43f87340c8199d47f3f088fa6ca7 Mon Sep 17 00:00:00 2001 From: Matt Wette Date: Fri, 13 Jan 2023 16:42:06 -0800 Subject: [PATCH] Add mmap and friends (munmap, msync). * libguile/filesys.[ch]: added scm_mmap_search, scm_mmap, scm_msync, and init_mman, built on availability of HAVE_MMAN_H; also provides feature 'mman * doc/ref/posix.texi: added documentation for mmap and friends * test-suite/Makefile.am: updated for mman.test * test-suite/tests/mman.test: mmap tests --- configure.ac | 2 + doc/ref/posix.texi | 45 +++ libguile/filesys.c | 264 + libguile/filesys.h | 4 + test-suite/Makefile.am | 1 + 5 files changed, 316 insertions(+) diff --git a/configure.ac b/configure.ac index f8c12f0d7..c348d14a2 100644 --- a/configure.ac +++ b/configure.ac @@ -1018,6 +1018,8 @@ AC_CHECK_MEMBERS([struct tm.tm_gmtoff],,, ]) GUILE_STRUCT_UTIMBUF +AC_CHECK_FUNCS([msync]) + # # diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi index 5653d3758..16f3bbc49 100644 --- a/doc/ref/posix.texi +++ b/doc/ref/posix.texi @@ -1216,6 +1216,51 @@ valid separators. Thus, programs should not assume that separator---e.g., when extracting the components of a file name. @end defvr +@deffn {Scheme Procedure} mmap addr len [prot [flags [fd [offset +@deffnx {Scheme Procedure} mmap/search addr len [prot [flags [fd [offset +Create a memory mapping, returning a bytevector. @var{addr}, if +non-zero, is the staring address; or, if zero, is assigned by the +system. @var{prot}, if provided, assigns protection. @var{fd}, +if provided associates the memory region with a file, starting +at @var{offset}, if provided. +The region returned by mmap will NOT be searched by the garbage + collector for pointers, while that returned by mmap/search will. +Note that the finalizer for the returned bytevector will call munmap. +Defaults for optional arguments are +@table @asis +@item prot +(logior PROT_READ PROT_WRITE) +@item flags +(logior MAP_ANONYMOUS MAP_PRIVATE) +@item fd +-1 +@item offset +0 +@end table +@end deffn + +@deffn {Scheme Procedure} munmap bvec +Given bytevector generated by mmap or mmap/search, unmap the +the associated memory. The argument will be modified to +reflect a zero length bv. The return value is unspecified. +Note that munmap is called by finalizer associated with +bytevectors returned from mmap and mmap/search. +@end deffn + +@deffn {Scheme Procedure} msync addr length flag +Flush changes made to the in-core copy of a file mapped using +mmap or mmap/search. This should be executed on modified memory +before calling munmap. The @var{flags} argument must be exactly one +of the following: +@table @code +@item MS_ASYNC +Initiate update, return immediately. +@item MS_SYNC +Initiate update, block until complete. +@item MS_INVALIDATE +Invalidate other mappings of the same file. +@end table +@end deffn @node User Information @subsection User Information diff --git a/libguile/filesys.c b/libguile/filesys.c index 1f0bba556..0ddb4cfee 100644 --- a/libguile/filesys.c +++ b/libguile/filesys.c @@ -67,11 +67,17 @@ # include #endif +#ifdef HAVE_SYS_MMAN_H +# include +#endif + #include "async.h" #include "boolean.h" #include "dynwind.h" #include "fdes-finalizers.h" #include "feature.h" +#include "finalizers.h" +#include "foreign.h" #include "fports.h" #include "gsubr.h" #include "iselect.h" @@ -2263,6 +2269,261 @@ scm_dir_free (SCM p) +#ifdef HAVE_SYS_MMAN_H +/* see https://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html */ + +static void +mmap_finalizer (void *ptr, void *data) +{ + SCM bvec; + void *c_addr; + size_t c_len; + int rv; + + bvec = SCM_PACK_POINTER (ptr); + if (!SCM_BYTEVECTOR_P (bvec)) +scm_misc_error ("mmap", "expecting bytevector", SCM_EOL); + + c_addr = SCM_BYTEVECTOR_CONTENTS (bvec); + c_len = SCM_BYTEVECTOR_LENGTH (bvec); + SCM_SYSCALL (rv = munmap(c_addr, c_len)); + if (rv != 0) +scm_misc_error ("mmap", "failed to munmap memory", SCM_EOL); +} + +SCM_DEFINE (scm_mmap_search, "mmap/search", 2, 4, 0, +(SCM addr, SCM len, SCM prot, SCM flags, SCM fd, SCM offset), + "Create a memory mapping, returning a bytevector.. @var{addr},\n" + "if non-zero, is the staring address; or, if zero, is assigned by\n" + "the system. @var{prot}, if provided, assigns protection.\n" + "@var{fd}, if provided associates the memory region with a file\n" + "starting at @var{offset}, if provided.\n" + "The region returned by mmap WILL be searched by the garbage\n" + "collector for pointers. See also mmap. Note that the\n" +"finalizer for the returned bytevector will
Re: [PATCH] Add 'bytevector-slice'.
Maxime Devos skribis: > On 13-01-2023 12:32, Ludovic Courtès wrote: >>> IIUC, if you use bytevector-slice iteratively, say: >>> >>> (let loop ((bv some-initial-value) >>> (n large-number)) >>>(if (> n 0) >>>(loop (bytevector-slice bv 0 (bytevector-length bv)) >>> (- n 1)) >>>bv)) >>> >>> you will end up with a bytevector containing a reference to a >>> bytevector containing a reference to ... containing a reference to the >>> original reference, in a chain of length ≃ large-number. >> The ‘parent’ word is here just so the backing memory isn’t reclaimed >> while the slice is still alive. >> Whether there’s a long chain of ‘parent’ links or not makes no >> difference to GC performance nor to memory usage. > > This is false, the opposite holds: memory usage is at least linear in > the length of the chain, because the objects in the chain are pairwise > non-eq?: for a chain (O_1, O_2, ..., O_n) that is alive, each object > O_i in the chain is kept alive (because that's what the 'parent' link > is for). Because bytevector-slice does a fresh allocation, there then > are n different objects. Hence, memory usage is at least > 'SCM_BYTEVECTOR_HEADER_SIZE * sizeof(scm_t_bits) * n'. Ah yes, that’s right, I misunderstood the comment. In the example above, where we’re only dealing with slices, we could “skip” the parent (i.e., have each slice’s parent point to the “root” of the hierarchy), but I don’t think we can assume this to be the case generally. Ludo’.
[PATCH] Document R7RS bytevector functions
Hi Right now the manual just mentions (scheme base), but not the contents. This patch at least makes sure that at least the bytevector-related R7RS functions appear in the index. The patch documents a first group of purely bytevector functions and a second group of binary I/O that are not elsewhere in Guile/R6RS or that exist but have different definitions. Regards Daniel 0001-Document-R7RS-functions-related-to-bytevectors.patch Description: Binary data
Re: fibers,questions about thread id and mutation of vectors
for (i=start; i<=stop; i++) { /* i is private by default */ scm_init_guile(); scm_call_1( func , scm_from_int(i) ); IIUC, you are calling scm_init_guile once per index, whereas calling it once per thread would suffice. For better performance, I propose doing it once per thread. On 13-01-2023 12:10, Damien Mattei wrote: there should be difference in implementation of Guile between Mac OS and Linux but i do not know the inner mechanism and algorithm used to run Guile in a C environment,what scm_init_guile() is doing? Guile is free software, you can download the source code to read what scm_init_guile is doing. why must it be placed under the // region on Linux (with slower result) From the manual: 6.4 Each thread that wants to use functions from the Guile API needs to put itself into guile mode with either ‘scm_with_guile’ or ‘scm_init_guile’. The global state of Guile is initialized automatically when the first thread enters guile mode. OpenMP does multi-threading, so you need to call scm_init_guile or scm_with_guile. and anywhere under MacOS ? (speed up code) You need to do it on non-Linux too -- the extract from the manual does not make a 'except on MacOS' exception. It might work without scm_init_guile in some circumstances, but this is not at all guaranteed. If you want to know the difference between MacOS and Linux in the implementation of scm_init_guile, you can read the source code of scm_init_guile as mentioned before. Greetings, Maxime. OpenPGP_0x49E3EE22191725EE.asc Description: OpenPGP public key OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH] Add 'bytevector-slice'.
On 13-01-2023 12:32, Ludovic Courtès wrote: IIUC, if you use bytevector-slice iteratively, say: (let loop ((bv some-initial-value) (n large-number)) (if (> n 0) (loop (bytevector-slice bv 0 (bytevector-length bv)) (- n 1)) bv)) you will end up with a bytevector containing a reference to a bytevector containing a reference to ... containing a reference to the original reference, in a chain of length ≃ large-number. The ‘parent’ word is here just so the backing memory isn’t reclaimed while the slice is still alive. Whether there’s a long chain of ‘parent’ links or not makes no difference to GC performance nor to memory usage. This is false, the opposite holds: memory usage is at least linear in the length of the chain, because the objects in the chain are pairwise non-eq?: for a chain (O_1, O_2, ..., O_n) that is alive, each object O_i in the chain is kept alive (because that's what the 'parent' link is for). Because bytevector-slice does a fresh allocation, there then are n different objects. Hence, memory usage is at least 'SCM_BYTEVECTOR_HEADER_SIZE * sizeof(scm_t_bits) * n'. Greetings, Maxime. OpenPGP_0x49E3EE22191725EE.asc Description: OpenPGP public key OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH] Add 'bytevector-slice'.
Hi, Maxime Devos skribis: > The only thing missing for me, is a procedure > 'bytevector-slice/read-only' and 'bytevector-slice/write-only', then I > could throw the module implementing the wrapping in Scheme-GNUnet and > the overhead incurred by wrapping away. I did consider the read-only part, but it’s not really implementable currently as SCM_F_BYTEVECTOR_IMMUTABLE flag is even ignored by instructions, and fixing it is far beyond the scope of this patch: https://issues.guix.gnu.org/60779 > On 11-01-2023 16:00, Ludovic Courtès wrote: [...] >> +(use-modules (rnrs bytevectors) >> + (rnrs bytevectors gnu)) > > I thought that R6RS reserved (rnrs ...) to the RnRS process, so I > would think that naming it (rnrs bytevectors gnu) would be > standards-incompliant, though I cannot find in the specification, so > maybe it isn't actually reserved. > > (SRFI looks a bit looser to me, so I find the (srfi ... gnu) > acceptable, but (rnrs ... gnu) looks weird to me, I would propose > (ice-9 bytevector-extensions) or such.). I’ll stick to gnu.scm because that’s the convention we’ve used for extensions so far, and “gnu” is arguably “reserved”. >> +Copyright (C) 1996-1997, 2000-2005, 2009-2023 Free Software Foundation, > > > Where does this year 2022 come from? Does a previous version of this > patch predate the new year? I started working on it in December and my ‘write-file-hooks’ updated it. >> + c_offset = scm_to_size_t (offset); >> + >> + if (SCM_UNBNDP (size)) >> +{ >> + if (c_offset < SCM_BYTEVECTOR_LENGTH (bv)) >> +c_size = SCM_BYTEVECTOR_LENGTH (bv) - c_offset; >> + else >> +c_size = 0; > +} >> + else >> +c_size = scm_to_size_t (size); >> + >> + if (c_offset + c_size > SCM_BYTEVECTOR_LENGTH (bv)) >> +scm_out_of_range (FUNC_NAME, offset); > > > If offset=SIZE_MAX-1 and size=1, this will overflow to 0 and hence not > trigger the error reporting. This bounds check needs to be rewritten, > with corresponding additional tests. OK. > IIUC, if you use bytevector-slice iteratively, say: > > (let loop ((bv some-initial-value) >(n large-number)) > (if (> n 0) > (loop (bytevector-slice bv 0 (bytevector-length bv)) > (- n 1)) > bv)) > > you will end up with a bytevector containing a reference to a > bytevector containing a reference to ... containing a reference to the > original reference, in a chain of length ≃ large-number. The ‘parent’ word is here just so the backing memory isn’t reclaimed while the slice is still alive. Whether there’s a long chain of ‘parent’ links or not makes no difference to GC performance nor to memory usage. > Do you know what this 'parent' field even is for? Going by some > comments in 'libguile/bytevectors.c', it is for GC reasons, but going > by the existence of the 'bytevector->pointer' + 'pointer->bytevector' > trick which destroys 'parent' information, it seems unnecessary to me. Like I wrote, it was introduced to keep backing memory alive, generally. With ‘pointer->bytevector’, we want to make sure the original pointer remains live as long as the bytevector is live (pointer objects can have a finalizer, and that finalizer must not run while the bytevector is live). > Nowadays a https://.../ instead of a mail address, is more > conventional and useful, I'd think. Its even used in old files, > e.g. libguile/r6rs-ports.c. Noted. > A test is missing for the case where the size is unaligned instead of > the offset. Noted. I’ll come up with a second version taking this into account. Thanks! Ludo’.
Re: fibers,questions about thread id and mutation of vectors
i made some test of openMP and Guile with Guile 3.0.8.99-f3ea8 on MacOS M1 and Linux Intel because i was not sure of the performances. I find a problem on Linux the code is slower (could be a factor of 5) with openMP and in Mac OS the gain is is of 100% (divide by 2) or 15% depending of computation complexity. i can not explain why it works under MacOS and not Linux, the only difference of compilation is that under Mac OS i had to force this option to succeed compiling: configure --enable-mini-gmp Anyway it is not good performance for openMP with scheme, under openMP with n CPUs i have gain of almost n x 100% of speedup, in C language or Fortran OpenMP when use for astronomical numerical simulation. in the // region i have only this code on MacOS: scm_init_guile(); #pragma omp parallel for for (i=start; i<=stop; i++) { /* i is private by default */ scm_call_1( func , scm_from_int(i) ); with linux this create a segmentation fault unless i move inside the for loop the line scm_init_guile(); like this: #pragma omp parallel for for (i=start; i<=stop; i++) { /* i is private by default */ scm_init_guile(); scm_call_1( func , scm_from_int(i) ); https://github.com/damien-mattei/library-FunctProg/blob/master/guile-openMP.c#L91 the scheme+ code for speed test looks like that (i use collatz function to make the computation unpredictable for any C compiler optimisations when i compare with pur C code): ;; only for speed tests {vtstlen <+ 2642245} {vtst <+ (make-vector vtstlen 0)} {fct <+ (lambda (x) {x * x * x})} (define (fctapply i) {vtst[i] <- fct(vtst[i])}) ;; neoteric expression of {vtst[i] <- (fct vtst[i])} (define (fctpluscollatzapply i) {vtst[i] <- fctpluscollatz(vtst[i])}) (define (speed-test) ;; init data (display-nl "speed-test : Initialising data.") (for ({i <+ 0} {i < vtstlen} {i <- {i + 1}}) {vtst[i] <- i}) ;; compute (display-nl "speed-test : testing Scheme alone : start") (for ({i <+ 0} {i < vtstlen} {i <- {i + 1}}) (fctpluscollatzapply i));;(fctapply i)) (display-nl "speed-test : testing Scheme alone : end") (newline) ;; display a few results (for ({i <+ 0} {i < 10} {i <- {i + 1}}) (display-nl {vtst[i]})) (display-nl ".") (for ({i <+ {vtstlen - 10}} {i < vtstlen} {i <- {i + 1}}) (display-nl {vtst[i]})) ;; init data (display-nl "speed-test : Initialising data.") (for ({i <+ 0} {i < vtstlen} {i <- {i + 1}}) {vtst[i] <- i}) ;; compute (display-nl "speed-test : testing Scheme with OpenMP : start") (openmp 0 {vtstlen - 1} (string->pointer "fctpluscollatzapply"));;"fctapply")) (display-nl "speed-test : testing Scheme with OpenMP : end") (newline) ;; display a few results (for ({i <+ 0} {i < 10} {i <- {i + 1}}) (display-nl {vtst[i]})) (display-nl ".") (for ({i <+ {vtstlen - 10}} {i < vtstlen} {i <- {i + 1}}) (display-nl {vtst[i]})) ) (define (collatz n) (cond ({n = 1} 1) ({(modulo n 2) = 0} {n / 2}) (else {{3 * n} + 1}))) (define (fctpluscollatz x) (declare c) (if {x = 0} {c <- 0} {c <- collatz(x)}) {{x * x * x} + c}) (define openmp (foreign-library-function "./libguile-openMP" "openmp" #:return-type int #:arg-types (list int int '*))) (define libomp (dynamic-link "libomp")) ;; note: require a link : ln -s /opt/homebrew/opt/libomp/lib/libomp.dylib libomp.dylib ;; export LTDL_LIBRARY_PATH=. under linux with a link as above ;; or better solution: export LTDL_LIBRARY_PATH=/usr/lib/llvm-14/lib (define omp-get-max-threads (pointer->procedure int (dynamic-func "omp_get_max_threads" libomp) '())) https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L3581 output: scheme@(guile-user)> (speed-test ) speed-test : Initialising data. speed-test : testing Scheme alone : start speed-test : testing Scheme alone : end 0 2 9 37 66 141 219 365 516 757 . 18446514741354254581 18446535685572961374 18446556629820732765 18446577574071146391 18446598518350624637 18446619462632745120 18446640406943930245 18446661351257757609 18446682295600649637 18446703239946183906 speed-test : Initialising data. speed-test : testing Scheme with OpenMP : start speed-test : testing Scheme with OpenMP : end 0 2 9 37 66 141 219 365 516 757 . 18446514741354254581 18446535685572961374 18446556629820732765 18446577574071146391 18446598518350624637 18446619462632745120 18446640406943930245 18446661351257757609 18446682295600649637 18446703239946183906 the sequential region : 4" the // region: 2" (twice faster) of course if i run a pure C eqivlent code it is instantaneous: // openMP cube - collatz test #include #include #include // OpenMP on macOS with Xcode tools: // https://mac.r-project.org/openmp/ // export OMP_NUM_THREADS=8 // this main() in a library was only for testing openMP with Mac OS Xcode and Linux for use uncomment main() and comment openmp() functions // mac os
Re: [EXT] [PATCH] Add 'bytevector-slice'.
> On 12 Jan 2023, at 23:27, Ludovic Courtès wrote: > > lloda skribis: > >>> On 11 Jan 2023, at 18:37, Thompson, David wrote: >>> >>> On Wed, Jan 11, 2023 at 12:34 PM Ludovic Courtès wrote: What could be convenient though is ‘bytevector-copy’ (no bang), which would combine ‘make-bytevector’ + ‘bytevector-copy!’. >>> >>> 'bytevector-copy' already exists, or do you mean some different >>> implementation of it? >>> >>> - Dave >> >> The current bytevector-copy takes a single argument. > > Right, what I had in mind is one that would take an offset and size; I > hadn’t realized the name was already taken. > > Thanks, > Ludo’. Actually (scheme base) from r7rs already defines (bytevector-copy bytevector start end), this is of course r7rs's convention. This is hidden in Guile's manual, which only lists the 1-argument version from r6rs.