Re: [PATCH] add language/wisp to Guile?

Dr. Arne Babenhauserheide Sat, 04 Feb 2023 14:23:20 -0800

Maxime Devos <maximede...@telenet.be> writes:

>> This needs an addition to the extensions via guile -x .w; I wrote that
>> in the documentation. I didn’t want to do that unconditionally, because
>> detecting a Wisp file as a Scheme import would cause errors.
>
> If done carefully, I don't think this situation would happen.
> More precisely:
>
>   * .w would be in the file extensions list.
>
>   * Instead of a list, it would actually be a map from extensions to
>     languages:
>
>       .scm -> scheme
>       .w -> wisp
>
>     With this change, (use-modules (foo)) will load 'foo.scm' as Scheme
>     and 'foo.w' as Wisp.  (Assuming that foo.go is out-of-date or
>     doesn't exist.)
>
>     (For backwards compatibility, I think %load-extensions needs to
>     remain a list of strings, but a %extension-language variable could
>     be defined.)
>
>   * "guile --language=whatever foo" loads foo as whatever, regardless
>     of the extension of 'foo' (if a specific language is requested,
>     then the user knows best).
>
>   * "guile foo" without --language will look up the extension of foo in
>     the extension map. If an entry exists, it would use the
>     corresponding language.  If no entry exists, it would use
>     a default language (scheme).


This sounds good, though a bit more complex than I think it should be.

I think this should stick to loading only Scheme if no language is
detected, to keep Scheme the default language for Guile and to avoid
stumbling over files that just happen to use that extension. Checking
more files could slow down startup, and I think treating multiple
languages as fully equal would risk splintering the development community.

Guile is first and foremost Scheme and fast startup time is essential.
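
A minimal sketch of such a lookup with Scheme as the fallback. The names
%extension-language, file-extension and language-for-file are placeholders
for illustration; none of them exist in Guile today:

```scheme
;; Hypothetical map from file extension to language name.
;; %extension-language is an assumption, not an existing Guile variable.
(define %extension-language
  '((".scm" . scheme)
    (".w" . wisp)))

(define (file-extension file-name)
  "Return the extension of FILE-NAME including the dot, or #f if it has none."
  (let* ((base (basename file-name))
         (dot (string-rindex base #\.)))
    (and dot (substring base dot))))

(define (language-for-file file-name)
  "Return the language registered for FILE-NAME's extension, else scheme."
  (let ((ext (file-extension file-name)))
    (or (and ext (assoc-ref %extension-language ext))
        'scheme)))
```

With these definitions, (language-for-file "foo.w") gives wisp, while a
file with an unknown extension or no extension at all is treated as Scheme.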

More complicated is what should be done if a *.go file is detected
during import. In that case I could see Guile checking whether a source
file with any supported extension is up to date.
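
Roughly, such a check could look like the sketch below. This is only an
illustration: fresh-source-for is a made-up helper, the comparison by
modification time is a simplification, and Guile's real loader does this
work internally rather than through a procedure like this one.

```scheme
(use-modules (srfi srfi-1))   ; for find

;; Hypothetical helper, not part of Guile: return the first source file
;; BASE-NAME + extension that exists and is not newer than GO-FILE,
;; or #f if GO-FILE is missing or no such source file is found.
(define (fresh-source-for go-file base-name extensions)
  (and (file-exists? go-file)
       (let ((go-mtime (stat:mtime (stat go-file))))
         (find (lambda (source)
                 (and (file-exists? source)
                      (<= (stat:mtime (stat source)) go-mtime)))
               (map (lambda (ext) (string-append base-name ext))
                    extensions)))))
```

A call like (fresh-source-for "foo.go" "foo" '(".scm" ".w")) would then
only touch the extra extensions once a compiled file has been found.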

>> Is there a way to only extend the loading mechanism to detect .w when
>> language is changed to wisp?

> I don't care what language the library Foo is written in, and my
> library Bar isn't written in Wisp so it seems unreasonable to have to
> add -x w.

I think you’re right about that. For any already compiled library, the
language should not matter.

>> readable uses
>
> This sentence appears to be incomplete; I might have misinterpreted it
> below (I don't know what you mean with 'readable' -- it's an adjective
> and you are using it as a noun?).

readable is a noun, yes: the readable lisp project.

>> (set! %load-extensions (cons ".sscm" %load-extensions))
>> Would that be the correct way of doing this?

> FWIW, it appears to be an answer to the following unasked question:
>
>   How to make Guile accept "foo.go" when "foo.w" exists and is
>   up-to-date.

Yes, I think that is the most important question. If that is solved,
Guile provides a multi-language environment in which only the build
tools of the libraries themselves have to know the languages used.

> This sounds like the second proposal ('alternatively ...'), but the
> way it is written, you appear to be proposing it as a third proposal.  Is
> this the case?

It only differs in details (keeping Scheme more central and only
checking for non-scheme languages if a *.go file is detected).

> (I mean, after this patch, Wisp is a supported language, so it seems
> equivalent to me.)

Pretty close, yes.

>>>> +; Set locale to something which supports unicode. Required to avoid using fluids.
>>>> +(catch #t
>>>
>>>   * Why avoid fluids?
>> I’m not sure anymore. It has been years since I wrote that code …
>> I think it was because I did not understand what that would mean for the
>> program. And I actually still don’t know …
>> How would I do that instead with fluids?
>>
>>>   * Assuming for sake of argument that fluids are to be avoided,
>>>     what is the point of setting the locale to something supporting
>>>     Unicode?
>> I had problems with reading unicode symbols. Things like
>> define (Σ . args) : apply + args
>> [...]
>> This is to ensure that Wisp files are always read as Unicode. Since it uses
>> regular (read) as part of parsing, it must affect (read), too.
>
> OK.  So, Wisp files are supposed to be UTF-8, no matter the locale?
> AFAICT, the SRFI-119 document does not mention this UTF-8 (or UTF-16,
> or ...) requirement anywhere, this seems like an omission in
> <https://srfi.schemers.org/srfi-119/srfi-119.html> to me.

That’s an omission, yes … but since it was omitted (by me …), you’re
right. Forcing UTF-8 is actually not the right approach.

> First, I would like to point out the following part of
> ‘(guile)The Top of a Script File’:
>
>    • If this source code file is not ASCII or ISO-8859-1 encoded, a
>      coding declaration such as ‘coding: utf-8’ should appear in a
>      comment somewhere in the first five lines of the file: see *note
>      Character Encoding of Source Files::.
…
> (OTOH, (guile)Character Encoding says 'In the absence of any hints,
> UTF-8 is assumed.' which appears to suffice for you, but it also
> contradicts "If this source file is not ASCII or ISO-8859-1 encoded,
> ...", so I don't know what precisely is going on here.)

I think this inconsistency calls for calling in old timers who know why
this is there. Maybe one of these is just a leftover?

> Keep in mind that encodings are a per-port property -- the locale
> might have a default encoding, and ports by default take the encoding
> from %default-port-encoding or the locale (I think), but you can
> override the port encoding:
>
>  -- Scheme Procedure: set-port-encoding! port enc
>  -- C Function: scm_set_port_encoding_x (port, enc)
>      Sets the character encoding that will be used to interpret I/O to
>      PORT.  ENC is a string containing the name of an encoding.  Valid
>      encoding names are those defined by IANA
>      (http://www.iana.org/assignments/character-sets), for example
>      ‘"UTF-8"’ or ‘"ISO-8859-1"’.
>
> As such, I propose calling set-port-encoding! right in the beginning
> of read-one-wisp-sexp.

This sounds like the best way forward in the short term.
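
A minimal sketch of that idea, assuming UTF-8 is the encoding to force.
read-one-sexp-as-utf-8 is a stand-in name for illustration; the actual
read-one-wisp-sexp in the patch does much more than call read:

```scheme
;; Sketch only: stand-in for the start of read-one-wisp-sexp.
(define (read-one-sexp-as-utf-8 port)
  "Set PORT's encoding to UTF-8, then read one datum from it."
  ;; Encodings are a per-port property, so this does not touch the locale.
  (set-port-encoding! port "UTF-8")
  (read port))

;; Usage with a string port:
(call-with-input-string "(define (Σ . args) (apply + args))"
  read-one-sexp-as-utf-8)
```

Since only the port is changed, other ports and the process locale keep
whatever encoding they already had.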

> Also, unrelated, I now noticed some dead code you can remove:
>
> +(define wisp-pending-sexps (list))

You’re right, that was only needed in a previous iteration of wisp (last
used more than 3 years ago, IIRC).

Thank you!

>>> (define repr-dot (make-symbol "REPR-DOT")).
>> That looks better — does uninterned symbol mean it can’t be
>> mis-interpreted?
>
> Yes.  This is because 'read' only reads interned symbols; uninterned
> symbols are unreadable:

…

>> Can I write it into a string and then read it back?
>
> No.  If you could, then uninterned symbols wouldn't be uninterned
> anymore, but rather a separation of symbols in two kinds that pretty
> much behave the same, and then you would again have a (very low) risk
> of a collision:

This sounds like I cannot go that way, because there’s a necessary
pre-processing step in wisp-read via (match-charlist-to-repr peeked):

(define (wisp-read port)
       "wrap read to catch list prefixes."
       (let ((prefix-maxlen 4))
         (let longpeek
           ((peeked '())
            (repr-symbol #f))
           (cond
             ((or (< prefix-maxlen (length peeked))
                  (eof-object? (peek-char port))
                  (equal? #\space (peek-char port))
                  (equal? #\newline (peek-char port)))
               (if repr-symbol ; found a special symbol, return it.
                  repr-symbol
                  (let unpeek
                    ((remaining peeked))
                    (cond
                      ((equal? '() remaining)
                        (read port)); let read do the work
                      (else
                        (unread-char (car remaining) port)
                        (unpeek (cdr remaining)))))))
             (else
               (let*
                 ((next-char (read-char port))
                  (peeked (cons next-char peeked)))
                 (longpeek
                   peeked
                   (match-charlist-to-repr peeked))))))))

This actually needs to be able to write the replacement symbols back
into the port.
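
To illustrate the problem with a small sketch (repr-dot is the name from
the suggestion above, reread is just a helper name for this example): once
the name of an uninterned symbol goes through a port and comes back via
read, the result is an ordinary interned symbol and no longer the same
object.

```scheme
;; An uninterned symbol, as proposed for the REPR markers.
(define repr-dot (make-symbol "REPR-DOT"))

;; Push its name through a string port and let read pick it up again.
(define reread
  (call-with-input-string (symbol->string repr-dot) read))

(symbol-interned? repr-dot)  ; #f: repr-dot itself is uninterned
(symbol-interned? reread)    ; #t: read only produces interned symbols
(eq? repr-dot reread)        ; #f: the identity is lost on the round trip
(eq? 'REPR-DOT reread)       ; #t: it is now the plain symbol REPR-DOT
```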

> ..., for which I proposed a replacement, so do you still need to turn
> it in a string & back?

Sadly yes. Otherwise the normal reader will play tricks on the code,
because it does not know where a symbol needs to be interpreted
differently (i.e. where ` needs to be treated as `() even though that’s
not in the string).

>> The REPR supports the syntactic sugar like '(...) for (quote ...) by
>> turning (' ...) into '(...).
>> Also it is needed to turn ((. a b c)) into (a b c).
>> However the literal array is used to make it possible to define
>> procedure properties which need a literal array.
>> 
>>> Also, I wonder if you could just do something like
>>>
>>>    (apply vector (map wisp-replace-paren-quotation-repr a))
>>>
>>> instead of this 'hack to defer to read' thing.  This seems simpler to
>>> me and equivalent.
>> That looks much cleaner. Thank you!
>
> This sounds positive, but it is unclear to me if I have found a
> solution, because of your negative "However the literal array is used
> to make it possible to define procedure properties which need a
> literal array." comment.
>
> Do I need to look into solving the 'literal array and procedure
> properties' stuff, or does the (apply vector (map ...)) suffice as-is?
>
> (If there is 'literal array and procedure properties' stuff to be
> solved, you will need to elaborate on what you mean, because arrays
> aren't procedures and procedures aren't arrays -- maybe you meant
> 'object properties'?)

I meant this:

(define (foo)
  #((bar . baz))
  #f)
(procedure-properties foo)
=> ((name . foo) (bar . baz))

I use that for doctests:


(define (A)
    #((tests (test-eqv 'A (A))
             (test-assert #t)))
    'A)

(define %this-module (current-module))
(define (main args)
         (doctests-testmod %this-module))

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de
