Hi,

On Fri, Feb 17, 2023, at 6:06 PM, Maxime Devos wrote:
> On 16-02-2023 22:38, Dr. Arne Babenhauserheide wrote:
>> 
>> Matt Wette <matt.we...@gmail.com> writes:
>> 
>>> You may be interested in the load-lang patch I generated a few years ago
>>> to allow file-extension based loading, in addition to '#lang elisp"
>>> type hooks.
>>>
>>> https://github.com/mwette/guile-contrib/blob/main/patch/3.0.8/load-lang.patch
>> 
>> @Maxime: Is this something you’d be interested in championing?
>
> For the '#lang whatever stuff', no:
>
> The '#lang whatever' stuff makes Scheme (*) files unportable between 
> implementations, as '#lang scheme' is not a valid comment -- there exist 
> Schemes beyond Guile and Racket.  If it were changed to recognising
> '-*- mode: scheme -*-' or '-*- language: scheme -*-' or such, it would 
> be better IMO, but insufficient, because (^).
>

I haven't read the patch or this thread closely, but R6RS has an answer to any 
concerns about compatibility with `#lang`. At the beginning of Chapter 4, 
"Lexical and Datum Syntax" 
(<http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-7.html#node_chap_4>) the report 
specifies:

>  An implementation must not extend the lexical or datum syntax in any way, 
> with one exception: it need not treat the syntax `#!<identifier>`, for any 
> <identifier> (see section 4.2.4) that is not `r6rs`, as a syntax violation, 
> and it may use specific `#!`-prefixed identifiers as flags indicating that 
> subsequent input contains extensions to the standard lexical or datum syntax. 
> The syntax `#!r6rs` may be used to signify that the input afterward is 
> written with the lexical syntax and datum syntax described by this report. 
> `#!r6rs` is otherwise treated as a comment; see section 4.2.3.

Chez Scheme uses such comments to support extensions to lexical syntax, as 
documented in 
<https://cisco.github.io/ChezScheme/csug9.5/intro.html#./intro:h1>:

> The Chez Scheme lexical extensions described above are disabled in an input 
> stream after an `#!r6rs` comment directive has been seen, unless a 
> `#!chezscheme` comment directive has been seen since. Each library loaded 
> implicitly via import and each RNRS top-level program loaded via the 
> `--program` command-line option, the `scheme-script` command, or the 
> `load-program` procedure is treated as if it begins implicitly with an 
> `#!r6rs` comment directive.

> The case of symbol and character names is normally significant, as required 
> by the Revised^6 Report. Names are folded, as if by string-foldcase, following 
> a `#!fold-case` comment directive in the same input stream unless a 
> `#!no-fold-case` has been seen since. Names are also folded if neither 
> directive has been seen and the parameter `case-sensitive` has been set to 
> `#f`. 
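
The folding rule quoted above amounts to a small piece of reader state. A minimal Python sketch of it follows; it is not Chez Scheme's implementation. The function name `fold_names` is hypothetical, the input is assumed to be a flat list containing only names and the two directives, and Python's `str.casefold` stands in for `string-foldcase`.

```python
def fold_names(tokens, case_sensitive=True):
    """Apply fold-case directives to a flat token list (hypothetical helper).

    Assumption: tokens are only names or the directives `#!fold-case`
    and `#!no-fold-case`; a real reader tracks this state while lexing.
    """
    fold = None  # None means neither directive has been seen yet
    out = []
    for tok in tokens:
        if tok == "#!fold-case":
            fold = True
        elif tok == "#!no-fold-case":
            fold = False
        else:
            # With no directive seen, fall back to the case-sensitive parameter.
            active = fold if fold is not None else not case_sensitive
            out.append(tok.casefold() if active else tok)
    return out
```

For example, `fold_names(["Foo", "#!fold-case", "Bar", "#!no-fold-case", "Baz"])` leaves `Foo` and `Baz` untouched and folds only `Bar`.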

In Racket, in the initial configuration of the reader when reading a file, 
"`#!` is an alias for `#lang` followed by a space when `#!` is followed by 
alphanumeric ASCII, `+`, `-`, or `_`." (See 
<https://docs.racket-lang.org/reference/reader.html#%28part._parse-reader%29>.) 
This does not conflict with Racket's support for script shebangs: "A `#! `
(which is `#!` followed by a space) or `#!/` starts a line comment that can be 
continued to the next line by ending a line with `\`. This form of comment 
normally appears at the beginning of a Unix script file." (See 
<https://docs.racket-lang.org/reference/reader.html#%28part._parse-comment%29>.)
Furthermore, the lexical syntax for the rest of the file is entirely under 
control of the specified language. Most languages parameterize the reader to 
reject further uses of `#lang` or its `#!` alias. Some "meta-languages" 
chain-load another language but parameterize it in some way (e.g. 
<https://docs.racket-lang.org/exact-decimal-lang/>). The `#!r6rs` language, of 
course, handles `#!` exactly as specified by R6RS, with no extensions. 
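
To make the distinction between the `#!` alias and shebang comments concrete, here is a minimal Python sketch of the first-line rules quoted above. It is an illustration only, not Racket's reader: the function name `classify_first_line` and the returned tags are hypothetical, and it looks at a single line without handling `\` continuations.

```python
import re

# Characters that make `#!` act as an alias for `#lang `, per the quoted
# Racket documentation: alphanumeric ASCII, `+`, `-`, or `_`.
_ALIAS_START = re.compile(r"[A-Za-z0-9+\-_]")

def classify_first_line(line):
    """Return (kind, language) for a file's first line (hypothetical helper)."""
    if line.startswith("#lang "):
        return ("lang", line[len("#lang "):].strip())
    # `#! ` (with a space) or `#!/` starts a line comment, e.g. a shebang.
    if line.startswith("#! ") or line.startswith("#!/"):
        return ("comment", None)
    # Otherwise `#!` followed by an alias character behaves like `#lang `.
    if line.startswith("#!") and len(line) > 2 and _ALIAS_START.match(line[2]):
        return ("lang", line[2:].strip())
    return ("other", None)
```

Under these rules `#!r6rs` names a language, while `#!/usr/bin/env racket` is only a comment, so the two uses of `#!` never collide.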

(Guile does not handle `#!r6rs` properly, presumably because of the legacy 
`#!`/`!#` block comments. I think this should be a surmountable obstacle, 
though, especially since Guile does support standard `#|`/`|#` block comments.)

>
> (^) it doesn't integrate with the module system -- more concretely, 
> (use-modules (foo)) wouldn't try loading foo.js -- adding '-x' arguments 
> would solve that, but we agree that that would be unreasonable in many 
> situations.  (Alternatively one could place ECMAScript code in a file 
> with extension '.scm' with a '#lang' / '-*- mode: ecmascript -*-', but 
> ... no.)
>

Racket has a mechanism to enable additional source file extensions without 
needing explicit command-line arguments: define `module-suffixes` or 
`doc-module-suffixes` in a metadata module that is consulted when the 
collection is "set up" (see <https://docs.racket-lang.org/raco/setup-info.html>). 
However, this mechanism is not widely used.

Overall, the experience of the Racket community strongly suggests that a file 
should say what language it is written in. Furthermore, that language is a 
property of the code, not of its runtime environment, so environment variables, 
command-line options, and similar extralinguistic mechanisms are a particularly 
poor fit for controlling it. File extensions are not the worst possible 
mechanisms, but they have similar problems: code written in an unsaved editor 
or a blog post may not have a file extension. (For more on this theme, see the 
corresponding point of the Racket Manifesto: 
<https://cs.brown.edu/~sk/Publications/Papers/Published/fffkbmt-racket-manifesto/paper.pdf>)
Actually writing the language into the source code has proven to work well.

To end with an argument from authority, this is from Andy Wingo's "lessons 
learned from guile, the ancient & spry" 
(<https://wingolog.org/archives/2020/02/07/lessons-learned-from-guile-the-ancient-spry>):

> On the change side, we need parallel installability for entire languages. 
> Racket did a great job facilitating this with #lang and we should just adopt 
> that.

-Philip
