Hello,

I have been writing a parser for org-mode files in OCaml, called mlorg. It started as an experiment to see whether org-mode's literate programming mode could scale to a full application (among other things). The project is at an early stage, but it can already "bootstrap" itself (that is, it parses its own source and extracts the source code), although support for the syntax is still very far from complete.

Another goal is to convert org-mode files to LaTeX, HTML, and so on without depending on Emacs. Although org-mode files are just plain text, there is still a feeling of being locked in, because the format is complicated and there does not seem to be a reference library for it. I hope more such libraries will appear, for one main reason: to get a standard syntax we can build upon. Knowing precisely what syntax org-mode understands is very difficult, since no document describes it (or I have found none). When I am done with the main syntactic part, I will try to document it. Besides, I think org-mode is a wonderful editor but does a terrible job at exporting: it is slow, Emacs-specific, gives strange errors on some documents, ...

The code can be found on gitorious: http://gitorious.org/mlorg/mlorg

To compile it, you will need the batteries library from git (I hope it will be released before mlorg reaches a releasable state).

One example of a nice feature that I have added to mlorg and that should be in the org-mode exporter: mlorg emits location annotations (à la cpp) so that compilers can report correct line numbers, which org-mode does not do. This is very helpful when compiling fairly long files.

The point of this message is mainly to attract people interested in testing or even contributing (I would be very glad: there is so much to do). I also hope to get the org-mode community thinking about standardizing the syntax used in org-mode, to ease the work of parser maintainers.

There is no README yet, but the mlorg binary does not do much yet and the code should be self-documenting (I hope).

Simon.
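The location annotations mentioned above can be illustrated with a minimal sketch. It is not mlorg's code: the function names (`starts_with`, `tangle`) and the file name `demo.org` are made up for the example, and it only recognises plain `#+begin_src` / `#+end_src` markers. Each extracted block is preceded by an OCaml line directive of the form `# N "file"`, so that a compiler error points back into the Org file.

```ocaml
(* Hypothetical sketch, stdlib only: not mlorg's actual implementation. *)

(* Case-insensitive prefix test, ignoring surrounding blanks. *)
let starts_with prefix line =
  let line = String.lowercase_ascii (String.trim line) in
  String.length line >= String.length prefix
  && String.sub line 0 (String.length prefix) = prefix

(* [tangle ~file text] returns the code found in the source blocks of
   [text]. Each block is preceded by [# N "file"], where N is the line
   number, in the Org file, of the first line of code of that block. *)
let tangle ~file text =
  let buf = Buffer.create 256 in
  let in_block = ref false in
  List.iteri
    (fun i line ->
      let lineno = i + 1 in
      if !in_block then begin
        if starts_with "#+end_src" line then in_block := false
        else begin
          Buffer.add_string buf line;
          Buffer.add_char buf '\n'
        end
      end
      else if starts_with "#+begin_src" line then begin
        in_block := true;
        (* The code starts on the line after the #+begin_src marker. *)
        Buffer.add_string buf (Printf.sprintf "# %d %S\n" (lineno + 1) file)
      end)
    (String.split_on_char '\n' text);
  Buffer.contents buf

let () =
  let org =
    "* Heading\nSome prose.\n#+begin_src ocaml\nlet x = 1\n#+end_src\n\
     More prose.\n#+begin_src ocaml\nlet y = x + 1\n#+end_src\n"
  in
  print_string (tangle ~file:"demo.org" org)
```

With such directives in the tangled file, an error in `let y = x + 1` would be reported at line 8 of `demo.org` rather than at a line of the generated file.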
On Mon. 27 Feb. (15:27), Alan Schmitt wrote:

> On 26 Feb. 2012, at 17:41, Simon Castellan wrote:
>
>> I have been writing a parser for org-mode files in OCaml, called mlorg. It started as an experiment to see whether org-mode's literate programming mode could scale to a full application (among other things).
>
> This looks very interesting, and would very much help in the dissemination of org-mode. Have you thought of announcing it on the caml mailing list?
>
> Alan

I have, but I prefer to wait until mlorg is more complete. This post was meant mainly to gather information and documents about org's syntax. (But as I said, feedback is welcome.)

Simon.
On Mon. 27 Feb. (09:52), Eric Schulte wrote:

> Simon Castellan simon.castel...@iuwt.fr writes:
>
>> On Mon. 27 Feb. (15:27), Alan Schmitt wrote:
>>
>>> On 26 Feb. 2012, at 17:41, Simon Castellan wrote:
>>>
>>>> I have been writing a parser for org-mode files in OCaml, called mlorg. It started as an experiment to see whether org-mode's literate programming mode could scale to a full application (among other things).
>>>
>>> This looks very interesting, and would very much help in the dissemination of org-mode. Have you thought of announcing it on the caml mailing list?
>>>
>>> Alan
>>
>> I have, but I prefer to wait until mlorg is more complete. This post was meant mainly to gather information and documents about org's syntax. (But as I said, feedback is welcome.)
>
> Hi Simon,
>
> Nicolas Goaziou has been working recently on a new emacs-lisp parser of Org-mode files, with the goals of
>
> 1. standardizing the formal syntax of Org-mode files
> 2. parsing Org-mode files to a canonical emacs-lisp list-based representation in memory (like an Org-mode AST)
> 3. re-basing the existing Org-mode exporters off of this canonical representation
>
> This work is contained in contrib/lisp/org-element.el, which includes a large amount of useful commentary at the top of the file. This should serve as a starting point for learning more about the formal syntax of Org-mode files (as it is defined).
>
> I think that developing parsers for this syntax in multiple languages should be very useful to ensure that a usable syntax is developed separate from any particular implementation.
>
> Cheers,

Thank you very much for this pointer. This is what I was looking for: a list of the syntactic constructs in org-mode. I would say, though, that it lacks a more or less formal syntactic definition of those constructs.

Simon.
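To make the idea of a canonical tree representation more concrete, here is a heavily simplified sketch of what an Org AST could look like in OCaml. All type and constructor names (`inline`, `block`, `headline`, `Special_block`, `src_blocks`) are invented for illustration; they are taken neither from org-element.el nor from mlorg, and a real syntax has many more cases.

```ocaml
(* Hypothetical, simplified Org AST. Names are illustrative only. *)
type inline =
  | Plain of string
  | Bold of inline list
  | Link of string * inline list          (* target, description *)

type block =
  | Paragraph of inline list
  | Src_block of string * string          (* language, code *)
  | Special_block of string * block list  (* e.g. #+begin_lemma ... *)

type headline = {
  level : int;
  title : inline list;
  content : block list;
  children : headline list;
}

let concat_map f l = List.concat (List.map f l)

(* Collect, in document order, the code of every source block written
   in [lang], including blocks nested inside special blocks. *)
let rec src_blocks lang h =
  let rec of_block = function
    | Src_block (l, code) when l = lang -> [ code ]
    | Src_block _ | Paragraph _ -> []
    | Special_block (_, blocks) -> concat_map of_block blocks
  in
  concat_map of_block h.content @ concat_map (src_blocks lang) h.children

let doc =
  { level = 1;
    title = [ Plain "Demo" ];
    content =
      [ Paragraph [ Plain "Some prose." ]; Src_block ("ocaml", "let x = 1") ];
    children =
      [ { level = 2;
          title = [ Plain "Sub" ];
          content =
            [ Special_block ("lemma", [ Src_block ("ocaml", "let y = 2") ]) ];
          children = [] } ] }

let () = List.iter print_endline (src_blocks "ocaml" doc)
```

Once the document is a value of such a type, exporters and extraction tools become ordinary recursive functions over the tree, independent of how the text was parsed.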
Hello,

Thanks for your answer. I agree that a description of org's syntax would be better placed in a separate document. For now I am rebasing my parser on your categories (I must say I was missing a lot of them). Please let me know when you change your syntactic categories (by changes, do you mean additions only, or removals as well?). In my sources I will try to document the meaning and the (very) informal syntax of the constructs I handle.

Besides, what are export snippets? I can't find a reference to them in the manual.

Simon
Hello again,

Four months have passed and a lot of progress has been made. First, I removed the literate programming layer, as it was getting too much in the way. Second, support for the syntax has greatly improved and now covers almost all the constructs mentioned in org-element.el. For most documents it should be fine, I guess, but I don't know which org features are the most used.

For debugging, and to help mlorg talk to other languages, I wrote an XML backend which dumps the structure of the file as an XML tree.

What is more interesting to me (and the reason I started mlorg in the first place) is the quote backend. This backend lets you pick a code block in your file (OCaml only for now) and feed it the whole document as a tree. The code can then extract the particular information you want. For instance, at the end of my contacts.org I have this little snippet that exports the contacts as mutt aliases (F stands for filter, D for document, and |- is function composition; the code is written in point-free style, so the argument is not explicitly mentioned):

#+name: export
#+begin_src ocaml
let replace = Str.global_replace (Str.regexp " ") "_" in
F.run (F.has_property (F.s "EMAIL"))
|- List.map (fun d -> sprintf "alias %s %s\n"
                        (D.name d |> replace) (D.prop_val_ "EMAIL" d))
|- String.concat ""
|- write
#+end_src

With this, I just need to run

$ mlorg --filename contacts.org --backend quote

to get my mutt aliases.

With this quote feature I plan to let the user override the HTML and LaTeX exporters by means of inheritance. For instance, suppose the user has blocks like this in their document:

#+begin_lemma
Some lemma.
#+end_lemma

To export such blocks in a specific way in HTML, the user can put this at the end of the document:

#+name: export
#+begin_src ocaml
let exporter = object(self)
  inherit htmlExporter as super
  method block = function
    | Custom ("lemma", name, contents) ->
      Xml.block "div" ~attr:["class", "lemma"]
        (Xml.data (name ^ " — ") :: self#blocks contents)
    | block -> super#block block
end in
exporter#document
#+end_src

(It doesn't work yet, but it soon will.)

I wrote a short README, available here: http://kiwi.iuwt.fr/~asmanur/projets/mlorg/ (it shows that the HTML backend is pretty basic). It briefly comments on every syntax construct I support.

Performance-wise, mlorg is not optimized at all and as such is quite slow. To process the file http://doc.norang.ca/org-mode.org on my computer, the bytecode version is as fast as org-mode and the native version is about 5-6x faster (tested quickly).

What I plan to do next:
- complete the syntax as much as possible
- improve the HTML and LaTeX backends
- try to be a little faster
- have an agenda backend as well
- implement other languages?

Simon.
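For readers unfamiliar with the point-free style of the contacts snippet, here is a self-contained sketch of the same pipeline using only the standard library. It does not use mlorg's F and D modules: the `entry` record, `has_property`, and `aliases` are hypothetical stand-ins, and `|-` is defined locally on the assumption that it behaves like Batteries' left-to-right composition, so that `(f |- g) x` is `g (f x)`.

```ocaml
(* Left-to-right function composition, defined locally so that the
   sketch does not depend on Batteries: (f |- g) x = g (f x). *)
let ( |- ) f g x = g (f x)

(* Hypothetical stand-in for an Org entry and its property drawer. *)
type entry = { name : string; props : (string * string) list }

let has_property key e = List.mem_assoc key e.props

(* Replace spaces with underscores in alias names. *)
let replace = String.map (fun c -> if c = ' ' then '_' else c)

(* Keep the entries that have an EMAIL property, turn each one into a
   mutt alias line, then concatenate the lines. The entry list itself
   is never named: it is the implicit argument of the composed function. *)
let aliases : entry list -> string =
  List.filter (has_property "EMAIL")
  |- List.map (fun e ->
         Printf.sprintf "alias %s %s\n"
           (replace e.name) (List.assoc "EMAIL" e.props))
  |- String.concat ""

let () =
  print_string
    (aliases
       [ { name = "Jane Doe"; props = [ ("EMAIL", "jane@example.org") ] };
         { name = "No Mail"; props = [] } ])
```

Each stage takes the output of the previous one, which is why the pipeline reads top to bottom in the order the data flows.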
[O] `org-babel-expand-noweb-references' does not take into account `org-babel-tangle-uncomment-comments'
Hello,

I would like the tangle functionality of org-mode to generate line directives so that compilers report correct locations for errors. I am using the following settings, which rely on the `:comments' feature of tangle:

  ;; Do not comment comments
  (setq org-babel-tangle-uncomment-comments t)
  ;; Beginning: set line directives (this is for OCaml)
  (setq org-babel-tangle-comment-format-beg "# %start-line \"%file\"\n")
  ;; No end template [Remember that they are not commented]
  (setq org-babel-tangle-comment-format-end " ")

This works surprisingly well for ordinary code blocks but breaks down when noweb is used. In that case control is passed to org-babel-expand-noweb-references, which comments out the "comment" (I see this can be confusing) because it ignores the setting above.

My question is: is this the intended behaviour? Also, I do think such behaviour should be supported out of the box: it is very important for compiled languages.

Best wishes,
Simon.