Re: [Caml-list] Camlp4/p5 type reflection [was: OCaml maintenance status / community fork (again)]
On 12/11/2011 12:34 AM, Gabriel Scherer wrote:
> A summary to this lengthy mail:
> (1) Why type-enriched Camlp4 is an unreasonable idea
> (2) We should extract the typedtree; why it's hard
> (3) A fictional narrative of the camlp4/camlp5 history
> (4) Why you don't want to become Camlp4 maintainer
> (5) How we could try not to use Camlp4 in the future
> (6) Syntax extension survival advice

Sorry, it's too long to read entirely on a Sunday morning, so I am replying after reading only points (1) and (2):

(1) Many extensions in Camlp4 are used to manipulate types (most of them based on type-conv). So having the compiler's knowledge of types at the point where you want to expand a macro is actually very interesting for them. I wrote a patch to 3.12.0 a long time ago, called ocaml-template, that does exactly that: it calls a plugin function at some points in the program, giving the function both the string to parse and the typing environment, so that it is possible (to some extent, since typing might only be partial at the point where it is called) to know the type of an identifier and the complete description of that type. Here is the link to the patch:

http://www.ocamlpro.com/files/ocaml-template-0.1-for-3.12.0.tar.gz

It is not documented much, but there are some simple examples of plugins and uses.

(2) Exporting the typedtree is not really a problem. The main problem is adding enough information to the typedtree for all the tools that could use it (so that ocamlspotter and refactoring tools can use it). We are currently polishing our final patch to submit it for integration, now that we have tested it with enough tools.

--Fabrice
Re: [Caml-list] Camlp4/p5 type reflection [was: OCaml maintenance status / community fork (again)]
> And Xavier's mail suggests that camlp4 is a maintenance burden for the OCaml team. Why is it such a bad idea to drop camlp4 out of the distribution, and just let camlp5 live?

First of all, I don't have a strong opinion here: I just voiced doubt. My reasoning for doing so goes along two lines of argument:

1. I'm not exactly sure how Camlp4 being in or out of the distribution will change the maintenance burden. The main maintenance difficulty with camlp{4,5} is that it needs to evolve its own parser in parallel with the 'official' one (that's by design), following changes to the language syntax. You need to change Camlp{4,5} when OCaml 3.N+1 introduces a new syntax, or it won't be usable on OCaml 3.N+1 code. There are already users relying on Camlp{4,5}, and those users generally wish to use the new, exciting features of the next version. If they can't, they will complain, regardless of whether their preprocessor is in or out of the distribution.

That means that when the OCaml team is about to release a new version with syntactic changes, they have to worry about the preprocessor anyway, or make users unhappy. So there is a preprocessor burden on the OCaml team, independently of where the code is maintained and located. If the change means making it easier to distribute camlp4 fixes without bumping OCaml's version number, why not. If it means "now we won't care about the state of Camlp4 before releasing a new version", this may mean a degradation in the life of Camlp4 users. I doubt that's the idea.

Being in or out of the distribution also wouldn't change much, I think, for the possibility of external contributions. In my experience Nicolas Pouillard and now Xavier Clerc have a good track record of integrating external contributions (I have sent one or two bugfixes on the tracker), and I'm confident they would be able to work with Jérémie Dimino if he wished to contribute to camlp4's evolution more frequently.

2. I suppose -- purely personal guess -- the intention behind the consortium's suggestion is to try to move away slowly from Camlp{4,5}, towards alternatives such as Alain Frisch's annotations proposal. As I said previously, I would personally welcome such a move, but I don't see said alternatives released yet. I haven't had the opportunity to play with alternate tools, get an idea of how a transition would work out, see if the documentation is reasonably complete, etc. It would make more sense, in my opinion, to downplay camlp{4,5} *once* we have played with alternatives and are confident that they are mature enough to make a transition.

Please also remember that the consortium members represent relatively large, well-educated, experienced players in the OCaml community. It probably wouldn't bother them much if, say, ocamlbuild, ocamldoc, or ocamldbg were taken out of the official distribution. They have important tooling in place and would adapt relatively easily. The end user or OCaml beginner may not adapt to such changes that easily. Now this is a matter where distributions, such as Debian, can be of great help, by providing complete packages regardless of what is or isn't in the official distribution. However, I have handled just enough user reports wondering why they didn't have camlp4o available, or graphics.cma, or whatever, to know that this can also be a barrier to use and adoption.

Again, no strong opinion. I will welcome any change that strengthens the OCaml language. If you think distributing camlp4 outside the distribution would ease the life of OCaml developers and maintainers, at no cost to the users and without complicating the distribution side, then all is good. I just feel that it may be a bit too soon.

On Sun, Dec 11, 2011 at 10:04 AM, Stéphane Glondu st...@glondu.net wrote:
> On 11/12/2011 00:34, Gabriel Scherer wrote:
>> The original Camlp4 tool was mostly developed by Daniel de Rauglaudre. [...] (I'm thinking of eg. Martin Jambon, who had extensive Camlp4 extensions, and the Coq team which has user-defined notations using Camlp4 and, huh, I really don't want to know the details); basically they didn't upgrade to 3.10 -- instead of porting the extensions, as was originally hoped. [...] Daniel, who apparently did not agree with some of the changes made, relatively suddenly restarted development of his branch of Camlp4, taken from the old sources, before refactoring. This was done as a separate project, outside the OCaml distribution (apparently Daniel and the OCaml team prefer not to work together). [...] Camlp4 is a piece of devil beauty. It does incredibly clever things, and is incredibly complex inside: Daniel is clearly a remarkable hacker, but his code is not easy to understand. I know that maintaining the whole thing is very hard; and that is the reason why Camlp4 tends to have problems to bump from one version to
Re: [Caml-list] Camlp4/p5 type reflection [
> I agree it would be a serious change, and I was even thinking of experimenting a bit with this kind of strictly typed metaprogramming. It's perfectly viable, as I've seen some good examples in my life (and Template Haskell does it, AFAIR).

I'm not familiar with Template Haskell at all, but I don't think it does what you say. From what I understood of it, it is quite close to Camlp4, if maybe integrated into GHC in a tighter way (Camlp4 is really a separate source-to-source preprocessing phase). In particular, you can write and use an extension even if it produces ill-typed code in some circumstances (I mean that type-checking the extension doesn't guarantee that the expansions will themselves be well-typed), and I haven't heard of getting type feedback at extension-writing time. I would welcome any pointer about this.

> Of course it is worth pointing out MetaOCaml and MetaML. They do runtime type-safe staged metaprogramming.

Staged metaprogramming is indeed interesting, but it is also very different from Camlp4. It is not a separate processing pass but really program generation at runtime. I don't think it would make much sense to try to integrate it into Camlp4, but I would indeed support such an addition to the OCaml language. Btw., there is an old bugtracker entry about it, which I follow as a sign of support (maybe there would be a market for "I support feature request #0004608" stickers). It hasn't evolved much, as I suppose nobody has time to work on it, and getting this to work reliably in both ocamlc and ocamlopt would certainly be an awful lot of work, but that's one case where the community has helped bug triaging by expressing interest.

http://caml.inria.fr/mantis/print_bug_page.php?bug_id=4608

> A good example would be:
>
>   let t = (1,2,3,4) in map_tuple t (fun x -> x*2)

I'm not sure this example works in MetaOCaml, or at least that map_tuple could be used on tuples of any width (of course you could have a map_tuple function only taking care of 4-uplets).
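To make the tuple-width point concrete, here is a minimal sketch in plain OCaml (no Camlp4, no MetaOCaml). The names map_tuple2 and map_tuple4 are hypothetical and chosen only for illustration. Each function handles exactly one tuple width, and all components must share one type; a width-generic map_tuple is the part that cannot be written as an ordinary function.

```ocaml
(* Hypothetical helpers, one per tuple width. An ordinary OCaml function
   cannot abstract over the width of a tuple, so each arity needs its
   own definition. The tuple comes first, as in the example above. *)
let map_tuple2 (a, b) f = (f a, f b)
let map_tuple4 (a, b, c, d) f = (f a, f b, f c, f d)

let () =
  let t = (1, 2, 3, 4) in
  let (a, b, c, d) = map_tuple4 t (fun x -> x * 2) in
  Printf.printf "%d %d %d %d\n" a b c d
```

Passing a 3-tuple to map_tuple4 is rejected by the type checker, which is why a single map_tuple over all widths needs either code generation or some type-directed mechanism.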
I understand that you want a type-dependent metaprogramming layer to express transformations that couldn't be typed in the source language. I think Alain Frisch's dynamic type information injection proposal, plus in-language runtime computation, could be simpler for about the same expressive power. If you throw in MetaOCaml quotations, you can even remove the runtime interpretive costs. But who's going to work on that? I think that's orthogonal to syntactic processing anyway.

> - syntax extension allows you to do certain things that are not available in a normal way, to enumerate: list comprehensions, pa_lwt, pa_where, pa_monad, pa_js. So they are clearly very useful in some cases

If we had no Camlp4, we should push for some of these things to be integrated in the language. A reasonable but solid mixfix syntax could replace pa_monad, the ##-syntax of pa_js, and some aspects of pa_lwt. jsnew and list comprehensions could perfectly well be handled as quotations.

I'm honored you mention pa_where (being one of the original authors, I'm happy to know that it brings happiness to your home in these Christmas times). I think that's also a good example of the dangers of Camlp4. The precedence rules for such a syntax extension are absolutely horrible to define and to get right; we have been bitten by countless bugs writing it, and I still don't think it's safe for all uses. We would be better off with either a solid version of it entering the language (... but it's hard to convince people that it's worth it, especially if you introduce new opportunities for ambiguity), or maybe just forgetting the idea.

> - combinators in particular monads allow certain syntactical abstractions, however there are runtime costs because of constructing closures and evaluating them later

I personally dislike uses of Camlp4 to optimize code by rewriting some known expressions (eg. (... >>= (fun x -> ...))) into a more efficient form. It is fragile precisely because you can't get accurate points-to information, so you are unsure whether the operators you're manipulating really have the semantics corresponding to your rewrite. I think those things should be done in a different layer (not Camlp4), or not at all. Evidently other people have a different opinion, and that's fine.
Re: [Caml-list] Camlp4/p5 type reflection [was: OCaml maintenance status / community fork (again)]
Many people are still frustrated with the camlp4/p5 situation. IMHO, we should give up on camlp4 inside the distribution, and only implement a few of its features in the regular parser:

- Antiquotation syntax (i.e. expressions), because this makes it very easy to incorporate foreign syntax elements. Of course, we would also need the ability to recursively invoke the ocaml parser for parts of the antiquoted expression (this would also open the door for transformations of the AST).

- A simple macro language (ifdef at least), for the cases where antiquotations would be too complex.

That means we would give up on arbitrary syntax extensions in the core distribution. There could still be camlp4 or p5 as a separate add-on, but it would be a second-class citizen, and it would be clear that it is a bit behind with the newest syntactical features. Also, it could be developed outside the core team.

Gerd
Re: [Caml-list] Camlp4/p5 type reflection [was: OCaml maintenance status / community fork (again)]
Gerd, you are summing up in a few paragraphs what I tried to say in a few pages. There are other parts of Camlp4 that I would also welcome:

- the OCaml quotation parsers that read quoted OCaml expressions (and patterns) and translate them to their ASTs (as an OCaml expression); this makes generating OCaml code easy

- controlled extension points such as 'type-conv' and 'deriving', hopefully generalized to most AST nodes; this goes in the direction of Alain's annotation proposal

> There could be still camlp4 or p5 as a separate add-on, but it will be a second-class citizen, and it would be clear that it is a bit behind with the newest syntactical features.

It's not a big problem if the second-class extensions have trouble processing and producing pieces of OCaml AST using the newest syntactical features (eg. the OCaml quotations don't have sugar for them). But it's a problem if they simply can't be run on files using those features, forcing their users to keep older versions of OCaml. It's not a *new* problem, as it already appears at version transitions, but it also won't go away as long as people use a preprocessor that insists on understanding the whole AST, whether those tools are first- or second-class.
Re: [Caml-list] Camlp4/p5 type reflection [was: OCaml maintenance status / community fork (again)]
On 12/11/2011 12:34 AM, Gabriel Scherer wrote:
> the Coq team which has user-defined notations using Camlp4 and, huh, I really don't want to know the details

My understanding (please correct me if I'm wrong) is that Coq uses camlp{4,5} only as an extensible parser library in order to parse its own language (which can be extended with user-defined notations). In particular, Coq does not use the following camlp{4,5} features:

- the revised OCaml syntax
- the alternative representation of OCaml AST
- the Camlp4 grammar definitions for OCaml syntax(es)
- quotations/antiquotations to produce fragments of the OCaml AST
- OCaml syntax extensions to define grammar entries
- custom OCaml syntax extension for the Coq source code itself (or maybe only very simple ones, like macro/conditional compilation?)

I wonder how much energy it would take to create a stand-alone extensible parser library, implemented in pure OCaml (normal syntax), and following a similar API and semantics as camlp{4,5}, on which Coq parsing could be built. The same library could be used as a foundation for future versions of camlp{4,5}. It would be a simple library, with no external dependency (in particular, no dependency on the OCaml internals), and very little maintenance burden.

My guess is: this would not take so much energy. After all, the representation of extensible grammars and the top-down parsing technology are not so complex. But I would be interested to hear from people who know Coq and camlp{4,5} better.

-- Alain

--
Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
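As a rough illustration of the "extensible grammar plus top-down parsing" idea mentioned above, here is a minimal sketch in plain OCaml. It is not the camlp{4,5} API: the types rule and entry and the functions make_entry, extend and parse are hypothetical names invented for this example, and the tokens are plain strings. It only shows that a grammar entry whose rule set grows at run time fits in a few lines.

```ocaml
(* Hypothetical minimal extensible grammar entry (not the camlp4/camlp5 API).
   A rule consumes a prefix of the token list and returns a value together
   with the remaining tokens, or None if it does not apply. *)
type token = string
type 'a rule = token list -> ('a * token list) option
type 'a entry = { mutable rules : 'a rule list }

let make_entry () = { rules = [] }

(* Extending a grammar at run time: the newest rule is tried first. *)
let extend entry rule = entry.rules <- rule :: entry.rules

(* Top-down parsing: try each rule in order and keep the first success. *)
let parse entry tokens =
  let rec try_rules = function
    | [] -> None
    | r :: rest ->
      (match r tokens with
       | Some _ as result -> result
       | None -> try_rules rest)
  in
  try_rules entry.rules

let expr : int entry = make_entry ()

let () =
  (* Base rule: an integer literal. *)
  extend expr (function
    | tok :: rest ->
      (match int_of_string_opt tok with
       | Some n -> Some (n, rest)
       | None -> None)
    | [] -> None);
  (* A "user-defined notation" added afterwards: succ <expr>. *)
  extend expr (function
    | "succ" :: rest ->
      (match parse expr rest with
       | Some (n, rest') -> Some (n + 1, rest')
       | None -> None)
    | _ -> None)
```

With both rules registered, parse expr ["succ"; "succ"; "41"] evaluates to Some (43, []). A real library would also need precedence levels, error reporting and a proper lexer, none of which this sketch attempts.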
Re: [Caml-list] Camlp4/p5 type reflection [was: OCaml maintenance status / community fork (again)]
On 11/12/2011 14:27, Alain Frisch wrote:
> My understanding (please correct me if I'm wrong) is that Coq uses camlp{4,5} only as an extensible parser library in order to parse its own language (which can be extended with user-defined notations). In particular, Coq does not use the following camlp{4,5} features:
>
> [1] the revised OCaml syntax
> [2] the alternative representation of OCaml AST
> [3] the Camlp4 grammar definitions for OCaml syntax(es)
> [4] quotations/antiquotations to produce fragments of the OCaml AST
> [5] OCaml syntax extensions to define grammar entries
> [6] custom OCaml syntax extension for the Coq source code itself (or maybe only very simple ones, like macro/conditional compilation?)

As far as I know, Coq uses:

[1,2,3] not at all
[4] a bit (e.g. tactics/tauto.ml4)
[5] a lot (e.g. most of the *.ml4 files)
[6] for macros and conditional compilation indeed

> I wonder how much energy it would take to create a stand-alone extensible parser library, implemented in pure OCaml (normal syntax), and following a similar API and semantics as camlp{4,5}, on which Coq parsing could be built. The same library could be used as a foundation for future versions of camlp{4,5}. It would be a simple library, with no external dependency (in particular, no dependency on the OCaml internals), and very little maintenance burden.

I've heard of some attempts to port Coq to dypgen, but since no public announcement was made, I guess this is quite difficult... Are there other such libraries? It's not really Coq's business to create a new one...

Cheers,

-- Stéphane
Re: [Caml-list] Camlp4/p5 type reflection [
On Sunday, 11 December 2011 at 12:19 +0100, Gabriel Scherer wrote:
> If we had no Camlp4, we should push for some of these things to be integrated in the language. A reasonable but solid mixfix syntax could replace pa_monad, the ##-syntax of pa_js, and some aspects of pa_lwt. jsnew and list comprehensions could perfectly be handled as quotations.

I don't think it is possible to do pa_lwt via mixfix syntax. For example, pa_lwt does this kind of transformation:

  lwt x = f () and y = g () and z = h () in
  return (x + y + z)

-->

  let t1 = f () and t2 = g () and t3 = h () in
  Lwt.bind t1 (fun x ->
    Lwt.bind t2 (fun y ->
      Lwt.bind t3 (fun z ->
        return (x + y + z))))

The first form is much more readable. And it has another big advantage: it adds backtrace support, because as soon as you use monads in OCaml, backtraces are unusable. And you just can't seriously ask the user to write this kind of code:

  Lwt.fail (try raise End_of_file with exn -> exn)

So unless we have better support for monads in OCaml itself, I think Camlp4 is still required.

That said, I am not a big fan of extending the syntax either, since it makes code harder to understand for those who don't know the extensions.

-- Jérémie
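The expanded form can be tried without Lwt installed. The following is a sketch under a loud assumption: Toy is a made-up stand-in monad in which a "thread" is simply an already-computed value, so it shows only the shape of the expansion, not Lwt's scheduling or backtrace behaviour. The functions f, g and h are placeholders as well.

```ocaml
(* Toy stand-in for Lwt, for illustration only: a "thread" here is just
   an already-computed value. Real Lwt threads may still be pending. *)
module Toy = struct
  type 'a t = Done of 'a
  let return x = Done x
  let bind (Done x) k = k x
end

(* Placeholder computations. *)
let f () = Toy.return 1
let g () = Toy.return 2
let h () = Toy.return 3

(* Hand-written expansion of
     lwt x = f () and y = g () and z = h () in return (x + y + z)
   All three computations are started first, then bound one by one. *)
let sum =
  let t1 = f () and t2 = g () and t3 = h () in
  Toy.bind t1 (fun x ->
    Toy.bind t2 (fun y ->
      Toy.bind t3 (fun z ->
        Toy.return (x + y + z))))

let () =
  let (Toy.Done n) = sum in
  Printf.printf "%d\n" n
```

The point of starting t1, t2 and t3 before the first bind is that, with a real cooperative-thread monad, the three computations can make progress concurrently. Writing the binds directly around f (), g () and h () would run them one after another.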
Re: [Caml-list] Camlp4/p5 type reflection [was: OCaml maintenance status / community fork (again)]
Jérémie Dimino jere...@dimino.org writes:
> On Saturday, 10 December 2011 at 19:10 +, Wojciech Meyer wrote:
>> I'm aware that these are huge changes to Camlp4, but it would make meta programming more powerful and push Camlp4 to the next level.
>
> Sure. But it seems that the next version of OCaml will have runtime types, see http://www.lexifi.com/blog/runtime-types , so maybe it is not needed to add this to camlp4.

It's interesting and I didn't know about it. However, the problem is slightly different: I would like to know the typing of a piece of code freshly generated by Camlp4 in the previous phase. Then I would pattern match against these meta types in the annotated AST and produce another AST, which in turn most likely has another typing and is passed to the next phase, etc. I would say that Camlp4 is fine for the simpler syntax extensions and the majority of small DSLs, but when you start composing syntax extensions and macros it quickly becomes a problem.

> Also there are problems that I don't know how to solve with the camlp4 approach. For example consider:
>
>   let x = 1
>   type int = A
>   let y = A
>
> The typer knows that x has the type (int, 1) and y has the type (int, 42). But what you send to ocaml is a parse tree, and you cannot make this difference in the parse tree.

Yes, you would need type information in the parse tree as mentioned before, so you want to feed the compiler with the AST, start unrolling macros top-down, and then follow up with the inferred types bottom-up.

Cheers;
Wojciech
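The shadowing example quoted above can be run as it stands. The sketch below adds one hypothetical function, describe, to show that the two types spelled "int" are distinct for the type checker even though a purely syntactic tool sees the same name in both places.

```ocaml
let x = 1            (* x : int, the built-in integer type *)

type int = A         (* a new variant type that happens to be named "int" *)

let y = A            (* y : int, but this is the new variant type *)

(* From here on the annotation "int" refers to the variant type above.
   [describe] is a hypothetical helper added for this illustration. *)
let describe (v : int) = match v with A -> "user-defined int"

let () = print_endline (describe y)

(* [describe x] would be rejected by the type checker, because x has the
   built-in integer type. The parse tree alone cannot tell the two apart. *)
```

This is the same difficulty as the (@) example elsewhere in the thread: a tool working on the untyped parse tree sees only the name, not what it resolves to.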
Re: [Caml-list] Camlp4/p5 type reflection [was: OCaml maintenance status / community fork (again)]
A summary to this lengthy mail:
(1) Why type-enriched Camlp4 is an unreasonable idea
(2) We should extract the typedtree; why it's hard
(3) A fictional narrative of the camlp4/camlp5 history
(4) Why you don't want to become Camlp4 maintainer
(5) How we could try not to use Camlp4 in the future
(6) Syntax extension survival advices

# (1) Why type-enriched Camlp4 is an unreasonable idea

Wojciech, your idea of having type information at the Camlp4 level is absolutely unreasonable. You are not speaking about a minor change here, but a major rewrite that would affect the compiler internals as well. It would really be a new (and interesting) project.

Camlp4 is, and I guess will remain, a syntax-level preprocessing tool. You have to accept the fact that you can't use type information at this level (but you can certainly interact in some way with the type system by producing/transforming pieces of code in a way that you know will have interesting typing effects; for example, you may want to generate code that is purposely ill-typed in some cases, to disallow certain uses of your syntax extension).

I'm not even sure what it would mean to access type information at the camlp4 level, as you're producing and transforming untyped ASTs; would you want partially typed ASTs? How is the typer supposed to work on the parts that you haven't transformed yet, and that are therefore not valid OCaml syntax? I suppose you could have a preprocessing and transformation tool at the typedtree level, but that would be a different tool with different uses, distinct from the syntactic preprocessing part (though you may develop extensions that act on both fronts).

I'm not aware of many Camlp4 situations that would really require typing information. I would be interested in good examples if you have some. One problem that I have had with Camlp4 is that you don't have identifier resolution information (e.g. you don't know if the identifier (@) you're seeing is really list concatenation, or has been redefined/shadowed in the context); this makes uses of Camlp4 for inlining, for example, quirky and fragile. That's still a simpler problem than type information.

# (2) We should extract the typedtree; why it's hard

If you really want to play with type information and scope-resolved identifiers, AST-manipulating tools are probably not the way to go: you indeed want full access to the typedtree. Currently this is only possible by hacking the compiler, and this is what, for example, Jun Furuse's Ocamlspotter project does.

Those kinds of tools could be made less intrusive if it were possible to pass typedtree-like information in and out of the compiler. I remember reading that some people (OcamlPro, I suppose) have this on their target list. The problem, however, is that the compiler's current internal typedtree representation is not at all adapted to external communication. If you want a tool that is robust and future-proof in any sense (you could probably get something working by just marshalling the current typedtree, but then it could break awfully after minor language changes, make the compiler choke, etc.; I certainly wouldn't want to use that), you have to design a clean and efficient representation for OCaml programs after the type inference phase. Producing a solid proposal on this topic will be an awful lot of work.

# (3) A fictional narrative of the camlp4/camlp5 history

Jérémie Dimino wrote: But there is something I don't understand here. Why is there camlp4 and camlp5? These two projects do exactly the same thing and are incompatible. So I don't see the point of maintaining them both. We should at least deprecate one.

Let me repeat the story as I know it (with possible mistakes, I was still a caml baby in the 3.10 times) in a hopefully compact form for those on the list who have no idea about it.
DISCLAIMER: this is only fictional storytelling, meant to give a reasonable idea (or at least my vision) of the situation. I may be wrong about the chronology of events, people's names, hard facts, and of course English spelling and grammar. The story is complicated and I don't know the gory details. If you know a better story, feel free to add important details, correct the obvious mistakes, etc. I also welcome suggestions to make it a funny, entertaining read; finally, a few romantic details could clearly turn it into a blockbuster.

The original Camlp4 tool was mostly developed by Daniel de Rauglaudre. Apparently, personal relations between Daniel and the OCaml team were not easy, and Camlp4 was gradually becoming more and more external to the OCaml distribution (in the past, the stream syntax was available as part of the core language, but it was moved to Camlp4; the O'Reilly book was written before that move) and its maintenance status uncertain.

In the 3.09/3.10 transition, Nicolas Pouillard started working on a refactoring of the Camlp4 codebase (which was mostly a silent, non-moving animal at that time) to make it more
Re: [Caml-list] Camlp4/p5 type reflection [
Gabriel Scherer gabriel.sche...@gmail.com writes:

A summary to this lengthy mail:
(1) Why type-enriched Camlp4 is an unreasonable idea
(2) We should extract the typedtree; why it's hard
(3) A fictional narrative of the camlp4/camlp5 history
(4) Why you don't want to become Camlp4 maintainer
(5) How we could try not to use Camlp4 in the future
(6) Syntax extension survival advices

# (1) Why type-enriched Camlp4 is an unreasonable idea

Wojciech, your idea of having type information at the Camlp4 level is absolutely unreasonable. You are not speaking about a minor change here, but a major rewrite that would affect the compiler internals as well. It would really be a new (and interesting) project.

Hello Gabriel, I agree it would be a serious change, and I was even thinking of experimenting a bit with this kind of strictly typed metaprogramming. It's perfectly viable, as I've seen some good examples in my life (and Template Haskell does it, AFAIR).

Camlp4 is, and I guess will remain, a syntax-level preprocessing tool. You have to accept the fact that you can't use type information at this level (but you can certainly interact in some way with the type system by producing/transforming pieces of code in a way that you know will have interesting typing effects; for example, you may want to generate code that is purposely ill-typed in some cases, to disallow certain uses of your syntax extension).

I'm not even sure what it would mean to access type information at the camlp4 level, as you're producing and transforming untyped ASTs; would you want partially typed ASTs? How is the typer supposed to work on the parts that you haven't transformed yet, and that are therefore not valid OCaml syntax? I suppose you could have a preprocessing and transformation tool at the typedtree level, but that would be a different tool with different uses, distinct from the syntactic preprocessing part (though you may develop extensions that act on both fronts).
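The parenthetical remark above, about generating code whose typing behaviour is chosen on purpose, can be illustrated with a minimal sketch. Everything named here is hypothetical: `require_int` stands for a library function that an imagined extension would desugar to, and `INT_ONLY e` is an invented surface form, not anything Camlp4 provides.

```ocaml
(* Hypothetical library function that an imagined extension desugars to.
   Its argument type acts as the guard: only int expressions are accepted. *)
let require_int (x : int) : int = x

(* The imagined surface form `INT_ONLY e` would expand to `require_int e`.
   A permitted use type-checks as usual: *)
let accepted = require_int (20 + 22)

(* The expansion of a disallowed use would simply fail to compile, e.g.
     let rejected = require_int "forty-two"
   The type checker reports the string/int mismatch; nothing is checked
   at run time. *)

let () = assert (accepted = 42)
```

The extension itself stays purely syntactic in this sketch; the rejection comes from the ordinary type checker running on the expanded code.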
I'm not aware of many Camlp4 situations that would really require typing information. I would be interested in good examples if you have some. One problem that I have had with Camlp4 is that you don't have identifier resolution information (e.g. you don't know if the identifier (@) you're seeing is really list concatenation, or has been redefined/shadowed in the context); this makes uses of Camlp4 for inlining, for example, quirky and fragile. That's still a simpler problem than type information.

Of course it's worth pointing out MetaOCaml and MetaML. They do runtime type-safe staged metaprogramming. A good example would be:

  let t = (1,2,3,4) in map_tuple t (fun x -> x*2)

(at this point I deliberately chose a closure and not an additional syntax extension, because you can pass closures and you can't easily pass code in Camlp4, because of the mentioned single-stage metaprogramming). Of course it's possible with dependent types and lists, but with metaprogramming you could just expand map_tuple to the code you want, *but* only when the type information is available (or you can infer the type yourself, in the simplest case if the tuple was an expression passed directly to map_tuple).

Currently I believe that DSLs are the best way of achieving higher-level abstractions, because you are no longer bound to the language syntax. And you can stay distant from the semantics of your target language.

# (6) Syntax extension survival advices

To the reader considering use of a new syntax extension in his next project:
- don't

Personally I disagree with this statement, because:
- syntax extensions allow you to do certain things that are not available in a normal way; to enumerate: list comprehensions, pa_lwt, pa_where, pa_monad, pa_js. So they are clearly very useful in some cases.
- combinators, in particular monads, allow certain syntactical abstractions; however, there are runtime costs because of constructing closures and evaluating them later.
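As a rough illustration of the closure cost mentioned in the monad point, here is a minimal sketch using a hand-written option monad. The names `safe_div`, `calc_combinators` and `calc_expanded` are invented for this example, and the "expanded" version is only an assumption about what generated or well-inlined code could look like, not the output of any actual extension.

```ocaml
(* A hand-written option monad. *)
let return x = Some x
let bind m f = match m with None -> None | Some x -> f x
let ( >>= ) = bind

(* Division that fails with None instead of raising. *)
let safe_div a b = if b = 0 then None else Some (a / b)

(* Combinator style: every step hands its continuation to bind as a
   function value (a closure whenever it captures a variable such as c). *)
let calc_combinators a b c =
  safe_div a b >>= fun q ->
  safe_div q c >>= fun r ->
  return (r + 1)

(* Expanded style: the same logic written as direct pattern matching,
   with no intermediate function values. *)
let calc_expanded a b c =
  match safe_div a b with
  | None -> None
  | Some q ->
    (match safe_div q c with
     | None -> None
     | Some r -> Some (r + 1))

let () =
  assert (calc_combinators 100 5 2 = Some 11);
  assert (calc_expanded 100 5 2 = Some 11);
  assert (calc_combinators 1 0 2 = None);
  assert (calc_expanded 1 0 2 = None)
```

Both functions compute the same results; the difference is only in how much work is left to the compiler's inliner.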
These runtime costs can easily be trimmed to some level with some compiler optimisations, for instance good inlining (which we don't have, as mentioned before in the thread), and an efficient runtime (we do have a very good GC).
- compilation is much easier than interpretation. You see the generated code, and the generated code is restricted semantically by the target language. Handling an environment is way lower level than just generating some let bindings in a quotation.

- if you really must, try to make sure that your code is also reasonable to use *without* a syntax extension (e.g. by producing a library with a clean interface, making your extension desugar to uses of it, but also making sure that it can be used by the human user)
- if you really must, try to get it in the form of a quotation; the rest is fragile

I agree that quotations are the best part of Camlp4. Right now I use them to generate a lot of ML code (more than hundreds of KLOC) out of a data-oriented DSL whose parsing is detached