Re: The empty string and other empty strings
On 13-01-12 17:39, Mark H Weaver wrote: David Kastrup d...@gnu.org writes: However, my mind is not set in stone on this. Does anyone else here agree with David? Should we defend the legitimacy of this optimization, and ask the R7RS working group to include explicit language specifying that empty strings/vectors need not be freshly allocated? It seems to me that it can't hurt to ask for clarification of this issue on scheme-reports. Personally I think the intent of the standard is to say that you cannot expect (string) to be un-eq? nor eq? to (string), but let's get a broader perspective. Marijn
Re: [PATCH] local-eval, local-compile, and the-environment (v3)
David Kastrup d...@gnu.org writes: (define current-module (let ((top-level (the-environment))) (lambda () (eval '(the-environment) top-level)))) Some more notes about the above code (changing `eval' ==> `local-eval'): * (local-eval '(the-environment) environment) is a no-op: it always returns the same environment that was passed in, so there's no point in doing it. * Of course `top-level' is a constant, and would be even if it were within the (lambda () ...) because it captures no lexicals. It is always in the same module, i.e. the module containing the above definition. This same constant value is always returned by (current-module). * Also note that the real `current-module' simply accesses a fluid, which can also be set by `set-current-module'. (Fluids are a Scheme analogue to dynamically-scoped variables in Lisp). Conceptually, it is a variable that is explicitly set by the user. It has no relation to the code that is currently executing. Rather, it is used during compilation (or within a REPL) to keep track of where the user would like top-level definitions to go. Mark
Re: The empty string and other empty strings
Marijn hk...@gentoo.org writes: On 13-01-12 17:39, Mark H Weaver wrote: David Kastrup d...@gnu.org writes: However, my mind is not set in stone on this. Does anyone else here agree with David? Should we defend the legitimacy of this optimization, and ask the R7RS working group to include explicit language specifying that empty strings/vectors need not be freshly allocated? It seems to me that it can't hurt to ask for clarification of this issue on scheme-reports. Personally I think the intent of the standard is to say that you cannot expect (string) to be un-eq? nor eq? to (string), but let's get a broader perspective. It might be worth pointing out the similarity to (list) and (list) and '(). I think that eq-ness of memberless structures of type list and string (which also could allow mutable and immutable variants to be identical) is worth giving separate mention, as it is a special case whose semantics with regard to eq-ness, mutability, and "freshly allocated" are nowhere near as obvious as with content-carrying variants. Even if the statement amounts to "can be implemented as", it would avoid choosing inferior implementation options because of trying to split hairs on what amounts to a bald head. -- David Kastrup
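To make the portability concern in this exchange concrete, here is a minimal sketch in Guile Scheme. It uses only `string`, `make-string`, `string-null?`, `string=?` and `eq?`; the helper name `both-empty?` is made up for illustration and does not come from the thread.

```scheme
;; Two ways of producing an empty string.
(define a (string))
(define b (make-string 0))

;; Portable checks: these do not depend on whether the implementation
;; shares one empty-string object or allocates a fresh one each time.
(define (both-empty? s1 s2)            ; hypothetical helper
  (and (string-null? s1) (string-null? s2)))

(display (both-empty? a b)) (newline)  ; prints #t
(display (string=? a b)) (newline)     ; prints #t

;; Not portable: this may print #t or #f depending on the
;; implementation, which is exactly the point under discussion.
(display (eq? a b)) (newline)
```

Code that compares by content, as `both-empty?` does, behaves the same whichever way the standard question is eventually settled.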
Re: [PATCH] local-eval, local-compile, and the-environment (v3)
I wrote: * Also note that the real `current-module' simply accesses a fluid, which can also be set by `set-current-module'. (Fluids are a Scheme analogue to dynamically-scoped variables in Lisp). Conceptually, it is a variable that is explicitly set by the user. It has no relation to the code that is currently executing. Rather, it is used during compilation (or within a REPL) to keep track of where the user would like top-level definitions to go. That last bit was not quite right, let me try again: The `current-module' is used during compilation, within a REPL, or by primitive-eval, to keep track of which module the user would like top-level forms to be compiled in. It is set at compile time by constructs such as `define-module' or (eval-when (compile) (set-current-module module)), and this module is baked into every identifier at compile time before macro processing takes place. Macros in general might slice and dice and mix together fragments of code from many different modules. When this jumble of macro-expanded code is evaluated, (current-module) is no longer relevant to it. Instead, each top-level variable reference uses the module that was baked into its identifier before macro expansion. Mark
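The claim that (current-module) has no relation to the code that is currently executing can be sketched as follows. This is an illustration under assumptions, not code from the thread: the procedure name `where-am-i` is hypothetical, and `(srfi srfi-1)` is used only because it is a module that ships with Guile.

```scheme
;; A procedure defined in this module. It reports whatever the
;; (current-module) fluid holds when it is *called*, not the module
;; in which it was defined.
(define (where-am-i)                   ; hypothetical name
  (module-name (current-module)))

;; save-module-excursion restores the previous current module when
;; the thunk returns, so the change below stays local.
(define name-seen
  (save-module-excursion
   (lambda ()
     (set-current-module (resolve-module '(srfi srfi-1)))
     (where-am-i))))

(display name-seen) (newline)          ; prints (srfi srfi-1)
```

The body of `where-am-i` still resolves its own top-level references (`module-name`, `current-module`) in the module where it was defined; only the value of the fluid changed.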
Re: [PATCH] local-eval, local-compile, and the-environment (v3)
Mark H Weaver m...@netris.org writes: David Kastrup d...@gnu.org writes: (define current-module (let ((top-level (the-environment))) (lambda () (eval '(the-environment) top-level)))) Some more notes about the above code (changing `eval' ==> `local-eval'): * (local-eval '(the-environment) environment) is a no-op: it always returns the same environment that was passed in, so there's no point in doing it. I think this was based on the "the current module is part of the environment, but local-eval does not change (current-module)" mantra, which I interpreted in a confused manner. The interesting thing is that it is the _lexically_ current module that is part of the environment, and inside of local-eval, this may well differ from the _actually_ current module as given by (current-module). -- David Kastrup
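For readers following along without the patch applied, a minimal sketch of the interface under discussion. It assumes a Guile release in which `the-environment` and `local-eval` are exported from the module `(ice-9 local-eval)`; at the time of this thread the code was still a proposed patch, so the module name is an assumption here.

```scheme
;; Assumption: (ice-9 local-eval) is available in the Guile being used.
(use-modules (ice-9 local-eval))

;; Capture the lexical environment in which x is bound to 10.
(define env
  (let ((x 10))
    (the-environment)))

;; Evaluate an expression later, in that captured environment.
;; x comes from the captured lexicals; + is a top-level reference
;; resolved through the module recorded in the environment.
(define result (local-eval '(+ x 1) env))

(display result) (newline)             ; prints 11
```

Evaluating `'(the-environment)` through `local-eval` in `env` would simply hand back an equivalent environment, which is the no-op Mark describes above.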
Re: local-eval on syntax-local-binding, bound-identifiers
Hi Andy! Andy Wingo wi...@pobox.com writes: + (cons (wrap (car symnames) + (anti-mark (make-wrap (car marks) subst)) * Why are you adding anti-marks here? As the changelog noted (and a comment should have noted ;), the identifiers are anti-marked so that syntax transformers can introduce them, as-is. The purpose of this procedure is to get a list of identifiers, and to capture some subset of them. It will do so by introducing references to them in the expansion of some macro. However they are not introduced identifiers: they come from the code itself. They are input to the macro, and as such need an anti-mark. The anti-mark will be stripped from the expansion when the transformer that called `bound-identifiers' returns. Does this mean that `bound-identifiers' will not function properly when used outside of a macro? What about if it's used within a macro that was generated by another macro (or things of that nature)? Are there cases where you might need to strip more than one anti-mark? To use your phrase, this has a bad smell. More importantly: I notice that you are not stripping the psyntax wrap from identifiers placed within the wrapper procedure above. There are certainly benefits to that, but remember that the wrapper procedure will in general be serialized to disk and evaluated in a different Guile session, where the gensym counters have been reset. Of course, like all macros! The forgeable gensym issue is something we have in Guile, more generally, that needs a broader solution. Ah, good point! Macros already serialize syntax-objects to disk. psyntax wraps are already part of our ABI, so nothing new there. However, I fear that the gensym issue might be a serious problem for `local-eval', even though it hasn't been a problem for macros.
The reason it has not been a problem with macros is that, within a top-level macro (which are the only ones used across Guile sessions), the only syntax-objects that can be meaningfully _introduced_ into the expansion are top-level/module bindings. But these bindings have no associated labels or gensyms, because they're not in the wrap. On the other hand, with `local-eval', it seems to me quite plausible that gensym collisions might occur. Suppose in one Guile session you compile a procedure (foo) that uses (the-environment), and then in another Guile session, you call (foo) and then `local-eval' with the environment returned by (foo). Now the wrapper procedure splices together syntax objects from two different Guile sessions into a single top-level form, where (unlike in the macro case) all of these syntax objects are lexicals, and thus depend on the gensyms and the labels. See how this is a problem now where it wasn't before? Or am I missing something? +((module? e) + ;; Here we evaluate the expression within `lambda', and then + ;; call the resulting procedure outside of the dynamic extent + ;; of `eval'. We do this because `eval' sets (current-module) + ;; within its dynamic extent, and we don't want that. Also, + ;; doing it this way makes this a proper tail call. + ((eval #`(lambda () #,x) e))) * This was my mistake, but since I'm already marking up the code: the `lambda' wrap above needs a `#f' before `e' to force expression context. OK. (Note though that (eval X e) does indeed evaluate X in tail position.) Looks to me like `eval' is initially bound to the C function scm_eval. Is it later rebound to a Scheme procedure? If so, where? For the record, I still think it's better for `the-environment' to be implemented within psyntax as a core form. It's a fundamental syntactic construct with clean semantics, and it belongs in psyntax with its brethren. 
Your desire to remove it from psyntax has caused you to add far less elegant interfaces that have been hastily designed, and that may not even be sufficient for a full implementation of `the-environment' that captures mutually-recursive local macros. In pursuit of the goal of agreeing on a strategy, I would like to convince you that you are wrong on all of these points :) So, in that spirit, I argue: Very well, I will endeavor to be open-minded. `the-environment' is not fundamental: it can be implemented in terms of simpler primitives. The same can be said of `lambda' or `syntax-case', but that's not the appropriate way to choose primitives in a language. The set of primitives chosen in Scheme are not the ones that are simplest to implement. It makes more sense to choose primitives with simple, clean semantics, which segues nicely into your next paragraph. `the-environment' does not have clean semantics, inasmuch as it has nothing worthy of the name, not yet anyway. The lambda calculus, the scheme language, even the syntax-case system have well-studied semantics (denotational and/or operational), and lots of
Re: Throw without catch before boot:
Mark H Weaver m...@netris.org writes: David Kastrup d...@gnu.org writes: Well, deciding to use my guile checkout not just for reference, I tried ./autogen.sh, ./configure and make on master. For now, it's best to stay on the stable-2.0 branch. That's our current focus. I see. CCLD guile ./.libs/libguile-2.0.so: undefined reference to `scm_i_new_smob' collect2: ld returned 1 exit status make[3]: *** [guile] Error 1 make[3]: Leaving directory `/usr/local/tmp/guile/libguile' make[2]: *** [all] Error 2 make[2]: Leaving directory `/usr/local/tmp/guile/libguile' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/usr/local/tmp/guile' make: *** [all] Error 2 dak@lola:/usr/local/tmp/guile$ -- David Kastrup
Re: local-eval on syntax-local-binding, bound-identifiers
Hi Andy! Thanks again for working on this. Andy Wingo wi...@pobox.com writes: * Why are you adding anti-marks here? As the changelog noted (and a comment should have noted ;), the identifiers are anti-marked so that syntax transformers can introduce them, as-is. The purpose of this procedure is to get a list of identifiers, and to capture some subset of them. It will do so by introducing references to them in the expansion of some macro. However they are not introduced identifiers: they come from the code itself. They are input to the macro, and as such need an anti-mark. The anti-mark will be stripped from the expansion when the transformer that called `bound-identifiers' returns. Does this mean that `bound-identifiers' will not function properly when used outside of a macro? What about if it's used within a macro that was generated by another macro (or things of that nature)? Are there cases where you might need to strip more than one anti-mark? Well, bound-identifiers is a procedure, so if you are using it outside the dynamic extent of a transformer procedure, that means that you have a syntax object that you squirreled away from somewhere, so already we're in somewhat uncharted territory. How about something like (bound-identifiers #'here)? or (bound-identifiers #'x) where `x' is some lexical variable? Macro-generating macros should be fine, here. `expand-macro' is iterative, not recursive, so you don't need to strip anti-marks twice. Ah, okay. Good point! I agree that this anti-mark has a bad smell, but the idea of a `bound-identifiers' procedure or form sounds like a good idea, so if you have any suggestions for improvement here, they are most welcome.
As I've already said, I don't think `bound-identifiers' will be useful in a full implementation of `local-eval', so once we move to that improved implementation, `bound-identifiers' will be left around as an orphan: a primitive of dubious value, introduced specifically to implement something that it turned out to be insufficient for. If you insist on this strategy, I think what we really need is a list of ribs, where each rib also specifies whether it is recursive. We don't actually care about `let' vs `letrec' (though there's no harm in providing that information in the interface, and it probably makes sense to for consistency), but we _do_ care about the difference between `let-syntax', `letrec-syntax', and internal bodies with mutually-recursive `define-syntax' forms. See how we're exposing increasingly complex internal psyntax structures in order to achieve your dream of making `local-eval' sleep outside in the shed? [W]ith `local-eval', it seems to me quite plausible that gensym collisions might occur. Suppose in one Guile session you compile a procedure (foo) that uses (the-environment), and then in another Guile session, you call (foo) and then `local-eval' with the environment returned by (foo). Now the wrapper procedure splices together syntax objects from two different Guile sessions into a single top-level form, where (unlike in the macro case) all of these syntax objects are lexicals, and thus depend on the gensyms and the labels. See how this is a problem now where it wasn't before? Or am I missing something? To be perfectly honest, this stuff is very confusing to me, but I think I can see how this can happen, yes. I do think that it's important to fix this bug at some point, but IMO it is not a blocker for local-eval, much less 2.0.4. I strongly disagree. Your implementation will clearly be buggy without a proper solution to the collision of gensyms (labels and marks, at least). 
I don't know about you, but personally I prefer rock-solid code with clearly documented limitations (that almost no one is likely to hit anyway) to buggy code. If you don't want to deal with the gensym problem for 2.0.4, there's an easy solution. Simply strip the wraps for now (as is done by my patch), and everything will be robust as long as we don't capture local syntax. BTW, did you see my most recent model for thinking about `local-eval'? (the-environment) expands to (list (lambda () expr) ...), with one element for every possible expression: a countably infinite list that could be built lazily. `local-eval' simply chooses the appropriate procedure from the list and calls it. A poor implementation strategy, but the semantic meaning is quite clear, no? It sounds clear, but does it have any explanatory power? It sounds like it could apply just as well to any other computation... I don't understand what you mean here. It seems to me that this model can answer any question you could possibly have about the observable behaviors of `the-environment' and `local-eval', besides their efficiency. Can you provide a counter-example to this claim? Creating wraps is not the hack. It's creating wraps that are scoped in another specific module. With the-environment in psyntax, psyntax
Re: bound identifiers
On Mon 16 Jan 2012 20:46, Stefan Israelsson Tampe stefan.ita...@gmail.com writes: why are these two not equal in the sense of bound-identifier=? #(syntax-object x ((top) #(ribcage () () ()) #(ribcage () () ()) #(ribcage #(x) #((m1104 top)) #(i1105))) (hygiene guile-user)) #(syntax-object x ((#f top) shift #(ribcage () () ()) #(ribcage #(x) #((m1104 top)) #(i1105))) (hygiene guile-user))) One has been anti-marked and the other has not? Meaning that one was made up by your syntax expander, and the other and the other came in as part of the form. But that's not the right question or answer. Can you should where these identifiers come from? Andy -- http://wingolog.org/
Re: bound identifiers
In syntax-parse, the Racket code stores syntax values inside structs and then transports them down the macro chain as arguments to macros. Then, when unpacking the struct, they are compared with arguments of syntax values. I think that this is the reason. I tried to experiment with the psyntax macro expander to poke inside structs, and that solved this issue. But on the other hand an even worse problem appeared. What I did now was to manually clean the syntax values, e.g. remove #f and shift from the syntax value, and this gave the best result. As you see, it's just wild west to get the racket code working. Maybe just writing a version from scratch would be better. /Stefan On Mon, Jan 16, 2012 at 10:28 PM, Andy Wingo wi...@pobox.com wrote: On Mon 16 Jan 2012 20:46, Stefan Israelsson Tampe stefan.ita...@gmail.com writes: why are these two not equal in the sense of bound-identifier=? #(syntax-object x ((top) #(ribcage () () ()) #(ribcage () () ()) #(ribcage #(x) #((m1104 top)) #(i1105))) (hygiene guile-user)) #(syntax-object x ((#f top) shift #(ribcage () () ()) #(ribcage #(x) #((m1104 top)) #(i1105))) (hygiene guile-user))) One has been anti-marked and the other has not? Meaning that one was made up by your syntax expander, and the other and the other came in as part of the form. But that's not the right question or answer. Can you should where these identifiers come from? Andy -- http://wingolog.org/
Re: bound identifiers
On Mon 16 Jan 2012 22:28, Andy Wingo wi...@pobox.com writes: On Mon 16 Jan 2012 20:46, Stefan Israelsson Tampe stefan.ita...@gmail.com writes: why are these two not equal in the sense of bound-identifier=? But that's not the right question or answer. Can you should where these identifiers come from? Sorry, I've been making lots of typos recently. I meant to say, can you show where these identifiers come from? Andy -- http://wingolog.org/
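The distinction Andy draws between an identifier that came in as part of the form and one made up by the expander can be sketched with `bound-identifier=?` itself. The macro names below are hypothetical and chosen for illustration; the sketch relies only on `define-syntax`, `syntax-case`, and `bound-identifier=?` as provided by Guile.

```scheme
;; Compare two identifiers that both arrive as macro input.
(define-syntax same-binding?           ; hypothetical name
  (lambda (stx)
    (syntax-case stx ()
      ((_ a b)
       (if (bound-identifier=? #'a #'b) #'#t #'#f)))))

(define r1 (same-binding? x x))        ; same name, same marks
(define r2 (same-binding? x y))        ; different names

(display r1) (newline)                 ; prints #t
(display r2) (newline)                 ; prints #f

;; Compare an input identifier with an `x' that appears in the
;; transformer's own code. Under standard hygiene the two should not
;; be bound-identifier=?, because they carry different marks, much
;; like the (top) versus (#f top) wraps quoted above.
(define-syntax given-vs-introduced?    ; hypothetical name
  (lambda (stx)
    (syntax-case stx ()
      ((_ given)
       (if (bound-identifier=? #'given #'x) #'#t #'#f)))))

(define r3 (given-vs-introduced? x))
(display r3) (newline)
```

If syntax objects are stored in a struct during one expansion step and compared during another, they can end up in the second situation even though they print with the same symbol name.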
Re: local-eval on syntax-local-binding, bound-identifiers
Hi Mark, On Mon 16 Jan 2012 21:36, Mark H Weaver m...@netris.org writes: Thanks again for working on this. And thank you again for all your work, and patience with my pigheadedness. if you insist in this foolish quest to banish `the-environment' to sleep in the shed as a second-class citizen, I cannot stop you :) TBH I think this is the best thing we can do for local-eval. We preserve flexibility for local-eval, make other experiments possible, and the local-eval implementation is a bit more perspicacious, as the scoping is more lexical (in the same file, even). I know there's a smilie in your statement, but really, it's not just local-eval: there's loads more that should be broken out into modules over time, somehow :) Think of it as building a hippie commune of functionality, instead of making everyone live in the same house :) (OK, that's stretching it a bit, but perhaps it is partially apt?) Now, specific commentary. How about something like (bound-identifiers #'here)? scheme@(guile-user)> (bound-identifiers #'here) $5 = () scheme@(guile-user)> (let ((x 10)) (bound-identifiers #'here)) $6 = (#(syntax-object x ((#f top) shift #(ribcage #(x) #((top)) #(i176))) (hygiene guile-user))) What should the answer be in this case? Would you expect `x' in the list? Certainly for the-environment you would. But here: scheme@(guile-user)> (define-syntax bound-here (lambda (x) (with-syntax (((id ...) (map (lambda (id) (datum->syntax x id)) (bound-identifiers #'here)))) #'(list 'id ...)))) scheme@(guile-user)> bound-here $7 = (#(syntax-object x ((#f top) shift #(ribcage #(x) #((top)) #(i192))) (hygiene guile-user))) scheme@(guile-user)> (let ((y 10)) bound-here) $8 = (#(syntax-object x ((#f top) shift #(ribcage #(x) #((top)) #(i192))) (hygiene guile-user))) So, it seems to be sensible. Now, what to do with these identifiers: if you introduce one into another macro, the mark will indeed be stripped. I'm not sure what else you can do with a syntax-object, actually! 
Pass it directly to eval or compile, I guess, and in that case we do lose, as the anti-mark isn't stripped. But that's the case for other syntax objects captured in a syntax transformer, as well. Should we anti-mark only within the dynamic extent of a transformer, I wonder? As I've already said, I don't think `bound-identifiers' will be useful in a full implementation of `local-eval', so once we move to that improved implementation, `bound-identifiers' will be left around as an orphan: a primitive of dubious value, introduced specifically to implement something that it turned out to be insufficient for. Hum. Definitely something to think about. What if instead we implemented closure serialization somehow? Then we would handle procedural macros too, and bound-identifiers would still be sufficient. Maybe that idea is a little too crazy. If we have to associate lexical contours with bindings, recursive is only one bit: you probably also need letrec vs letrec*. To be perfectly honest, this stuff is very confusing to me, but I think I can see how this can happen, yes. I do think that it's important to fix this bug at some point, but IMO it is not a blocker for local-eval, much less 2.0.4. I strongly disagree. Your implementation will clearly be buggy without a proper solution to the collision of gensyms (labels and marks, at least). I don't know about you, but personally I prefer rock-solid code with clearly documented limitations (that almost no one is likely to hit anyway) to buggy code. If you don't want to deal with the gensym problem for 2.0.4, there's an easy solution. Simply strip the wraps for now (as is done by my patch), and everything will be robust as long as we don't capture local syntax. Thinking about it a little more, labels are a non-issue. All they need to be is unique in the sense of eq?. Labels are strings. If they are loaded in separate compilation units, they will be unique, no matter what their contents. 
Labels are more important than marks, also, for the correctness of the algorithm. A mark collision is only an issue if there is also a symbolic collision. Label collision could alias completely unrelated bindings. Anyway, I would rather serialize bad marks than no marks. That's my personal opinion ;-) But if you think this is a huge issue, let's fix the marks to be more unique, no? Note that there is a well-known optimization that you don't actually need to generate the characters corresponding to a gensym until they are needed. It might serve your purposes. OK, I'm getting very sleepy now :) Let me know your thoughts. It would be great if all of this could land before Sunday. Though the cricket folk say pace is nothing without guile, Guile is nothing without a good development pace ;-) Cheers, Andy --
Re: bound identifiers
On Mon 16 Jan 2012 22:56, Stefan Israelsson Tampe stefan.ita...@gmail.com writes: As you see, it's just wild west to get the racket code working. :) Can you give a stripped-down test case for this particular behavior? That code is paged into my and Mark's minds right now :) Andy -- http://wingolog.org/
Eval, tail calls, (current-module), and backward compatibility
Hello all, There's a problem with Guile's `eval'. It doesn't do proper tail recursion as mandated by R5RS et al, and unfortunately we can't fix this without changing its behavior in a potentially incompatible way. The problem is that `eval' uses dynamic-wind to temporarily set (current-module) during the dynamic extent of the expansion and evaluation of its form. Ideally, it should set (current-module) only during expansion, _not_ during evaluation. It is worthwhile to consider what (current-module) is for, and how it should be used. This seems to be an area of great confusion. (current-module) should be relevant only at the beginning of macro-expansion: before any program transformations are performed, (current-module) is baked into every symbol of the top-level form. (psyntax actually does this lazily, but the effect is the same). After that, (current-module) should be completely irrelevant to the rest of compilation and evaluation. After expansion has begun, the expanded code is in general a patchwork of code fragments from many different modules, and thus it no longer makes sense to talk about the current module. Instead, the compiler looks at the modules that were baked into the identifier. For example, each top-level variable reference refers to the module that was baked into its corresponding source identifier before macro expansion. It sometimes makes sense for the user to (set-current-module module). This can be done within an REPL for example. It can also be done in a compiled file within (eval-when (compile) ...), which will cause subsequent top-level forms to be expanded within the newly changed (current-module). This is implicitly done by `define-module'. Pretty much the only proper use of (current-module) is to implement REPLs and things of that sort. It has nothing to do with the code that is currently running. 
It doesn't even really have to do with the code that is currently being expanded, so it's almost never the right thing to look at within procedural macros. * * * * * Can we fix this? Ideally, I think that `eval' should set (current-module) during expansion, but _not_ during evaluation. Then it can be properly tail recursive. However, some code out there might depend on the existing behavior, so I guess we can't change this, at least not in 2.0. Bummer. (BTW, `local-eval' accepts a module as its second argument, and conforms to the R5RS requirements of `eval', so it can serve as a proper replacement for Guile's broken `eval') Similarly, (compile 'expr #:env module) should set (current-module) during expansion but _not_ during evaluation. Then it can be properly tail recursive and have cleaner semantics. It might not be too late to fix this in 2.0. (Note that `local-compile' also conforms to the R5RS requirements of `eval', so can also serve as a proper `eval' replacement) What do other people think? Best, Mark
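The workaround quoted earlier in this thread, evaluating a `lambda' wrapper and calling the result outside the dynamic extent of `eval', can be sketched as below. The procedure names `eval-as-thunk` and `eval/tail` are hypothetical; the leading #f in the wrapper follows the remark earlier in the thread that it is needed to force expression context. This is an illustration of the pattern, not the patch itself.

```scheme
;; Expand and evaluate only the wrapper inside `eval'. The module is
;; current while the form is expanded, which is when it matters.
(define (eval-as-thunk expr module)    ; hypothetical name
  (eval `(lambda () #f ,expr) module))

;; Call the resulting thunk after `eval' has returned, so the call is
;; outside eval's dynamic-wind and is an ordinary tail call.
(define (eval/tail expr module)        ; hypothetical name
  ((eval-as-thunk expr module)))

(define result (eval/tail '(+ 1 2) (current-module)))
(display result) (newline)             ; prints 3
```

One visible difference from plain `eval': code that inspects (current-module) while the thunk runs sees the caller's current module rather than the module passed in, which is the behavior change the message above weighs against backward compatibility.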