Re: [racket-users] How would you implement autoquoted atoms?

2019-04-24 Thread rocketnia
Instead of a new `#%q-expression` form, I think there's potential to use 
`#%datum` or `quote` itself for this. Potentially, the only thing that 
makes numbers (for instance) special is that the reader, printer, IDE, and 
bytecode systems already know what module(s) the number structure type(s) 
come from. As long as user-defined structure types are able to provide each 
of those systems with the same knowledge (e.g. through a structure type 
property), then they can have the same benefits.

One complication: User-defined structure types typically aren't 
interoperable across phases and module registries, the way kernel-defined, 
cross-phase persistent structure types like numbers and lists are. Even if 
we know what their module path is, that's not all the information needed: 
some of the information only "exists" once a generative `struct` definition 
has created it.

In particular, I think using `quote` on user-defined types brings up 
cross-phase difficulties: A `quote` expression is typically computed at 
phase (N + 1) for use as a constant at phase N. If we expect to be able to 
compute it at compile time using phase-(N + 1) instances of user-defined 
structure types, and if we expect to be able to process it at run time 
using phase-N instances of those types, then the `quote` operation itself 
needs to perform some kind of marshalling between those.

How to do that marshalling? Well, whatever materials we need, the structure 
type property can provide them. For instance, certain structure types might 
go through no change at all (e.g. simple procedures, perhaps). Certain 
others may do their marshalling using a (module path, identifier, data) 
intermediate stage just like Matthew Flatt and Alexis King are talking 
about. Maybe some values would even use complex higher-order marshalling 
behaviors (similar to contracts or an FFI), letting us take an object that 
uses phase-(N + 1) tools internally and wrap it up in such a way that it 
can process phase-N input and output values. And maybe some steps of the 
marshalling would use side effects, for instance to implement the interning 
zeRusski is talking about.

Whatever the technique we use for marshalling any particular structure 
type, once the marshalling has completed at phase N, we're often going to 
want a value that's compatible with the phase-N instance of the structure 
type. And that means that by the time we ever see that result value, the 
structure type must have been defined at phase N already -- which means the 
module where the `quote` appears should (at least indirectly) have a 
phase-N dependency on the module that contains that definition.

To make this work, this time it can be the structure type itself that 
"carries its own `require` at all times" (via the structure type property), 
and a `quote` implicitly acts as a `require` for all the structure types 
that appear inside it. This has a lot in common with the `#q` approach, but 
it seamlessly blends in user-defined types with core types: We can say that 
when we `quote` numbers and lists, we implicitly `require` their modules 
too, but that since those modules are part of the kernel, the `require` has 
just been imperceptible the whole time.

For types that need to be marshalled to bytecode or saved as plain text 
from a graphical editor, I agree that the (module path, identifier, data) 
format seems like a fine choice.
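
As a rough sketch of how such a (module path, identifier, data) triple might 
be turned back into a value: this assumes the identifier names a constructor 
function and the data is its argument list, both of which are assumptions 
made here for illustration. `realize-triple` is a hypothetical name, and 
`range` from `racket/list` merely stands in for a user-defined constructor.

```racket
#lang racket/base

;; A triple is assumed to be a list: (module-path identifier data),
;; where data is the list of arguments for the named constructor.
(define (realize-triple triple)
  (define module-path (car triple))
  (define identifier (cadr triple))
  (define data (caddr triple))
  ;; dynamic-require instantiates the module (if needed) at run time
  ;; and fetches the value exported under the given name.
  (define constructor (dynamic-require module-path identifier))
  (apply constructor data))

;; Stand-in example: rebuild a list through racket/list's `range`.
(define rebuilt (realize-triple '(racket/list range (0 3))))
```

Here the module is only instantiated when the triple is realized, which is 
one way to read "carries its own `require`"; a real design would also have 
to decide at which phase that happens.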

That said, I do want to point out that in this approach, the *bytecode* and 
*plain text* uses of (module path, identifier, data) triples would be 
subtly different from each other. In bytecode, the module path would be 
required at phase 0 (because, as far as I understand it, phase 0 is all 
that there is in the bytecode) and the construction would be performed near 
the start of the module. In plain text Racket code, that phase 0 behavior 
only happens as the *result* of compiling a `quote` form. Since `quote` 
marshals the value down one phase, it must have started out at phase 1, and 
thus we need the reader to return a phase-1 instantiation of the value.

This suggests that although we would use a reader syntax like 
`#q(module-path identifier data)` in this approach, its behavior would be 
to `require` that module path at phase 1 and perform the construction 
immediately.


On Tuesday, April 23, 2019 at 12:45:04 PM UTC-7, zeRusski wrote:
>
> (begin-for-syntax
>   (list 1 #k(foo) 2))
> ;; => ; tag: undefined;
>
> This can be solved with (require (for-syntax prelude/tags)) but as with 
> other autoquoted types I'd probably want to be able to just write them in 
> any phase. Docs say some stuff about namespaces having a scope that crosses 
> all phases plus separate scopes for each phase. Is there a way for a 
> binding to span all phases without cooperation from the user?
>

I think one thing that might help is to have #k(foo) read as:

((let ()
   (local-require (only-in prelude/tags tag))
   tag)
 'foo)

This still supposes that `#%app`, `let`, `lo
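
For what it's worth, the `local-require` expansion above can be tried out 
with a module that actually exists. In this sketch `racket/list` and `first` 
stand in for `prelude/tags` and `tag`, which are not available here, so it 
only shows that the expression shape works, not the tag library itself.

```racket
#lang racket/base

;; The operator position is an expression that imports its own binding
;; locally and then returns it, so no module-level require is needed.
(define result
  ((let ()
     (local-require (only-in racket/list first))
     first)
   '(a b c)))
```

Note that `local-require` only imports phase-0 bindings relative to where it 
appears, so this is a sketch of the idea rather than a full answer to the 
cross-phase question.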

Re: [racket-users] How would you implement autoquoted atoms?

2019-04-23 Thread zeRusski
On Tuesday, 23 April 2019 15:57:52 UTC+1, Matthew Flatt wrote:
>
> This response will be rambling, too. :) 


And here I thought I asked an embarrassingly silly question :)

While implementing a "naive" version I ran into two issues that I kind of 
predicted upfront, but just wanted to make sure they indeed would present a 
problem:

#lang prelude/tags

(list 1 #k(foo) 2)
;; => (tag 'foo) as rewritten by our extended reader, 
;; where tag is a struct provided by prelude/tags

(begin-for-syntax
  (list 1 #k(foo) 2))
;; => ; tag: undefined;

This can be solved with (require (for-syntax prelude/tags)) but as with 
other autoquoted types I'd probably want to be able to just write them in 
any phase. Docs say some stuff about namespaces having a scope that crosses 
all phases plus separate scopes for each phase. Is there a way for a 
binding to span all phases without cooperation from the user?
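
A small self-contained sketch of the `(require (for-syntax ...))` fix 
mentioned above. The submodule `tags` is a stand-in for `prelude/tags` (an 
assumption for illustration), and the tag is written as `(tag 'foo)`, which 
is what the extended reader is described as producing for `#k(foo)`. The 
macro `compile-time-tag-name` is a hypothetical helper that just shows the 
phase-1 value is usable.

```racket
#lang racket/base

;; Stand-in for prelude/tags: a module that provides the tag struct.
(module tags racket/base
  (provide (struct-out tag))
  (struct tag (name) #:transparent))

;; Import the struct at phase 0 and, separately, at phase 1.
(require 'tags
         (for-syntax racket/base 'tags))

;; Phase 0: works with the plain require.
(define runtime-tags (list 1 (tag 'foo) 2))

;; Phase 1: only works because of the for-syntax import.
(begin-for-syntax
  (define compile-time-tags (list 1 (tag 'foo) 2)))

;; A macro that reads the phase-1 list and expands to the quoted tag name.
(define-syntax (compile-time-tag-name stx)
  #`(quote #,(tag-name (cadr compile-time-tags))))
```

Without the `for-syntax` import, the use of `tag` inside `begin-for-syntax` 
is unbound, which matches the "tag: undefined" error shown above.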

Another problem is with the REPL. The above runs fine when I run the 
module, but not when I type it in at the REPL. 

scratch.rkt> (list 1 #k(foo))
; stdin::1273: read-syntax: bad syntax `#k`
; foo: undefined;
;  cannot reference an identifier before its definition
;   in module: "/Users/russki/Code/scratch.rkt"

What's up with that? Does the reader there need to be defined specially 
somehow?

> I would be really happy to see someone experiment with these ideas, and 
> I'm pretty sure they could be implemented mostly by changing the 
> expander and reader in "racket/src/expander"


I'd love to see this implemented, but Racket internals terrify me. As you 
can see above, I can barely cope with the basics :)

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] How would you implement autoquoted atoms?

2019-04-23 Thread Alexis King
I find this email fascinating, as about three weeks ago, Spencer Florence and I 
discussed something almost identical, from the module path + symbol protocol 
all the way down to the trouble with `quote`. I had been intending to 
experiment with implementing the idea at some point, but I already have a few 
too many balls in the air right now (from the intdef changes I’ve been 
exploring to the `hash-key` implementation I started fiddling with, plus 
starting the process of looking for a new job), so I probably won’t get to it 
any time soon. But I’ll go on record as being interested in doing so.

> On Apr 23, 2019, at 09:57, Matthew Flatt wrote:
> 
> This response will be rambling, too. :)

Re: [racket-users] How would you implement autoquoted atoms?

2019-04-23 Thread Matthew Flatt
This response will be rambling, too. :)

Especially with your follow-up message, I think you're getting to a
problem that we've wrestled with for a while. Sometimes we've called it
the "graphical syntax" problem, because it's related to having non-text
syntax, such as images in DrRacket (which are currently implemented in
an ad hoc way). Another example could be adding quaternion literals,
analogous to complex-number literals. In the cases that we've
considered, we want the language to be extensible with a new kind of
literal, but there's not necessarily any specific import of the language
extension in the program. That means there's a set of binding,
evaluation, and composition problems to solve.


I've discussed the problem the most with William Hatch, and here's as
far as we got with some ideas.

There could be a new primitive datatype --- at the level of symbols,
pairs, vectors, etc. --- to let the reader and expander communicate.
Just to have some concrete syntax for the default reader and printer,
let's say that the new kind of value can be written with `#q`, perhaps
of the form

  #q(<module-path> <identifier> <data>)

The intent of the <module-path> and <identifier> components is to give
the value a kind of binding. That binding is analogous to syntax
objects, but without actually using syntax objects, which is arguably
the wrong concept to pull into the reader level. The remaining
<data> is payload to be interpreted by the <module-path> and
<identifier> combination, such as image data or real numbers for the
components of a quaternion.

Of course, a reader might construct these values as a result of parsing
some other text, but the idea is that printing out the result from that
reader with the default printer would use this `#q` notation, and then
that printed form could be read back in. That is, the values can be
consistently marshaled and unmarshaled, just like pairs and vectors and
numbers.

The benefit of a new datatype is that it can have its own dispatch rule
in the expander. Probably a `#q` in an expression position would get
wrapped by an implicit `#%q-expression`, or something like that, which
would give a language control over whether it wants to allow arbitrary
literal values. But the default `#%q-expression` would consult the
value's "binding" via the <module-path> and <identifier> to expand the
value, which might inline an image or quaternion construction, or
something like that. In effect, the reader form carries its own
`require` at all times.

Maybe interning corresponds to an expansion that lifts out a
calculation (in the sense of `syntax-local-lift-expression`), or maybe
that's not good enough; I'm not sure.

We imagined that the primitive `quote` form might do something similar
to `#%q-expression` in the case that an image or quaternion is part of
a quoted S-expression. But, then, does there need to be an even
stronger `quote` that doesn't try to expand the `#q` content? I don't
know.

Meanwhile, the <module-path> and <identifier> combination could also
identify a value-specific printer, where images might recognize when
the output context can support rendering the actual image, while
quaternions might print using "+" and "i" and "j". Or maybe that
problem should be left to `prop:custom-write`.
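
As a sketch of the `prop:custom-write` option: the `quaternion` struct 
below, with fields `a` through `d`, is hypothetical, and the exact printed 
notation is just one possible choice.

```racket
#lang racket/base

;; Hypothetical quaternion a + bi + cj + dk with a custom printer.
(struct quaternion (a b c d)
  #:property prop:custom-write
  (lambda (q port mode)
    ;; mode distinguishes write/display/print; it is ignored here.
    (fprintf port "~a+~ai+~aj+~ak"
             (quaternion-a q)
             (quaternion-b q)
             (quaternion-c q)
             (quaternion-d q))))

(define shown (format "~a" (quaternion 1 2 3 4)))
```

This covers printing only; it does nothing to make the printed form 
readable again, which is the part the `#q` notation is meant to address.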

At the level of writing down programs, the examples of images and
quaternions seem different. For images, DrRacket and other editors have
to include the concept of images somehow, and they insert values that
turn into `#q` forms when the program is viewed as a character
sequence. But quaternions are written with characters, so maybe that
syntax is more like `@` reading in that a language constructor on the
`#lang` line would add quaternion syntax to the readtable (which would
work for S-expression languages).
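
To make the readtable idea concrete, here is a minimal sketch that uses the 
`#k(foo)` notation from earlier in this thread instead of quaternions. The 
names `wrap-tag`, `read-tag`, `tag-readtable`, and `read-with-tags` are 
hypothetical, and the readtable is installed by hand with `parameterize` 
rather than by a `#lang` line.

```racket
#lang racket/base

;; Turn the datum read after `#k`, e.g. (foo), into the form (tag 'foo).
(define (wrap-tag datum)
  (list 'tag (list 'quote (car datum))))

;; Reader macro: called with two arguments under `read`
;; and with six arguments under `read-syntax`.
(define read-tag
  (case-lambda
    [(ch in)
     (wrap-tag (read/recursive in))]
    [(ch in src line col pos)
     ;; Source locations are dropped here to keep the sketch short.
     (datum->syntax
      #f
      (wrap-tag (syntax->datum (read-syntax/recursive src in))))]))

;; Extend the default readtable (#f) so that #k dispatches to read-tag.
(define tag-readtable
  (make-readtable #f #\k 'dispatch-macro read-tag))

;; Read one datum from a string with the extended readtable in effect.
(define (read-with-tags str)
  (parameterize ([current-readtable tag-readtable])
    (read (open-input-string str))))
```

With this, `(read-with-tags "(list 1 #k(foo) 2)")` yields the S-expression 
`(list 1 (tag 'foo) 2)`. The reader only produces a reference to `tag`; 
whether `tag` is bound where that expression ends up is the separate binding 
problem discussed above.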


Overall, this reply is intended as a kind of endorsement and
elaboration of your thoughts: Yes, this is an interesting problem, and
it seems to need something new in Racket. And, yes, adding some new
datatype (with some default syntax) seems like the right direction,
mainly because it could trigger a new kind of dispatch in the expander.
Probably that new datatype should have something built-in that amounts
to a binding for its compile-time and run-time realization.

I would be really happy to see someone experiment with these ideas, and
I'm pretty sure they could be implemented mostly by changing the
expander and reader in "racket/src/expander" --- although some
cooperation from the bytecode writer and reader is probably also needed,
and I'd be happy to help more there.


At Tue, 23 Apr 2019 06:08:05 -0700 (PDT), zeRusski wrote:
> I must apologize, for what follows will be more of a rambling than an 
> exercise in clear thinking. That is because I am a bit stuck and thought 
> I'd seek help.
> 
> I have been thinking some about languages and how it isn't always easy to 
> clearly separate language being implemented from the language used to 
> implement it. The picture gets particularly blurry in Lisps. This time 
> around the question that gave me pause was one of implementing symbols. 
> Better still Racket keywords

[racket-users] How would you implement autoquoted atoms?

2019-04-23 Thread zeRusski
I must apologize, for what follows will be more of a rambling than an 
exercise in clear thinking. That is because I am a bit stuck and thought 
I'd seek help.

I have been thinking some about languages and how it isn't always easy to 
clearly separate the language being implemented from the language used to 
implement it. The picture gets particularly blurry in Lisps. This time 
around the question that gave me pause was one of implementing symbols, or 
better still Racket keywords, since like many lispy terms "symbol" has so 
many confusing meanings that it's nigh impossible to tell what people mean 
exactly. I am specifically talking about autoquoted datums: two interned 
symbols that are equal? are eq?, two keywords that are equal? are eq?, 42 
is eq? to 42, etc. Symbols are a bad example because people often think of 
'symbol or of an identifier whose semantics is: perform variable lookup.

Someone on this list said everything in Racket is a struct, so let's start 
there.
 
 (struct kw (symbol))

We can also come up with some syntactic representation and extend our 
language with read and read-syntax that translate this new syntax into a 
kw struct as needed. But then we also demand that two syntactically equal 
kws end up being the same value in the language, so no matter where our 
reader encounters #kw(foo) it must produce the same value. This must be 
true across module boundaries, too, just like Racket keywords. So, what are 
we to do? There's a time when the reader runs, followed by expansion. Does 
this mean they need to communicate somehow? Also, the reader "runs", that 
is, it is written in Racket (or some derivative) after all, but the reader's 
environment isn't the one where expansion happens, and that of the final 
code being evaled is different still. Right? 

To ensure eq? of two kws with the same printed representation we'll 
probably want to keep some global table around that keeps track of 
"interned" kws. So, for any two #kw(foo), our reader would have to produce 
something like (lookup-intern-kw #:symbol 'foo), which at run-time would 
consult the table of kws and return the (kw 'foo) already there, or create 
a fresh entry and return that new struct. Two observations: (a) it follows 
that the global table is one that must exist at runtime, not while the 
reader runs, and (b) we end up relying on the host language for symbol 
equality after all: 'foo is eq? to 'foo, and that allows us to key the 
table by symbols, e.g. 'foo.
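
A minimal sketch of that run-time table, assuming a single module-level 
hash table (`kw-table` is a hypothetical name) and the `lookup-intern-kw` 
call shape written above:

```racket
#lang racket/base

(struct kw (symbol))

;; Global intern table, keyed by symbols and compared with eq?.
(define kw-table (make-hasheq))

;; Return the kw already stored for sym, or create and store a fresh one.
(define (lookup-intern-kw #:symbol sym)
  (hash-ref! kw-table sym (lambda () (kw sym))))
```

Two lookups with the same symbol then return the very same struct instance, 
whereas two direct calls to `(kw 'foo)` would not be eq?. This table lives 
in one instantiation of the module, so it says nothing yet about other 
phases or separately instantiated copies.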

Is this how you would do it? Is there a better way that involves the reader 
more and relies on the runtime less?

Bonus question. What if we allow families of kws, effectively partitioning 
kws into namespaces: #kw(family name)? This appears to be a small variation 
of the above, where you'd simply assemble a compound symbol from family and 
name to use for the table lookup. That is, until you allow parameterizing by 
"current-family", so a kw declaration can omit the family part and it gets 
inserted as needed, which is not unreasonable in a language with modules or 
explicit namespaces. We could allow something like this:

#lang racket/kws
#:current-family addams

#kw(morticia)

Now any kw within that module written without a family must translate into 
one of the addams family. But also any #kw(addams morticia) in a different 
module must be eq? to the one above, and in fact to any one like that 
anywhere. One exception is probably if we send them across Racket places, 
which IIUC amount to running separate VMs. In the above example the reader 
would have to be aware of the #:current-family declaration that may appear 
at the top of the module. We'd probably translate that to some 
(current-family 'addams) parameter setup, or wrap the #%module-begin body 
in parameterize; then every kw without an explicit family would have to 
check the (current-family) parameter. 
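
A sketch of the parameter-based variant, under the assumption that the 
table is keyed by a (family . name) pair rather than a compound symbol. The 
positional `name` argument and the `#:family` keyword are illustrative 
choices, not an established API.

```racket
#lang racket/base

(struct kw (family name))

;; #f means "no family declared"; a module-level declaration such as
;; #:current-family addams would be translated into setting this.
(define current-family (make-parameter #f))

;; equal?-based table keyed by (cons family name) pairs.
(define kw-table (make-hash))

;; The default for #:family is evaluated at call time, so it picks up
;; whatever current-family is in effect when the kw is looked up.
(define (lookup-intern-kw name #:family [family (current-family)])
  (unless family
    (error 'lookup-intern-kw "no family given and no current-family set"))
  (hash-ref! kw-table (cons family name) (lambda () (kw family name))))

;; #kw(morticia) inside a module whose current family is addams:
(define morticia
  (parameterize ([current-family 'addams])
    (lookup-intern-kw 'morticia)))
```

A later `(lookup-intern-kw 'morticia #:family 'addams)`, as #kw(addams 
morticia) in another module might expand to, returns the same struct as 
`morticia`.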

Is there a way to push this more to read-time? If there is, what happens 
if we load the module and enter the REPL? Could we ensure its reader is 
properly parameterized so that it uses the appropriate current-family?

How screwed up is my thinking here? Is there a way to leverage the reader 
more and rely on the runtime less? I imagine that'd make the kws discussed 
lighter weight. We talk about phases some in Racket, but the reader runs 
somewhere, or rather sometime, too. I'd like to have a clearer picture in 
my head, I guess.

Thanks


