Re: [Zim-wiki] Questions about Wiki markup and its Parser
On Mon, Dec 13, 2010 at 4:52 PM, Michael Nagel wrote:

> in my opinion moving the markup and its lexer/parser to something generic
> and future-proof is a very good idea. It could (and should) be the first
> step to make Zim markup compatible with some widespread markup syntax like
> Markdown or reStructuredText...

Zim's wiki language is already close enough to CREOLE.

One issue with Zim is that the interactive wiki-to-neat feature requires either:

1. Access to specific parts of the parser.
2. Repeating parsing logic in the UI.

I took a look at the source code of python-creoleparser and found it quite complicated. Its dependency on genshi is a plus or a minus depending on how you look at it.

I will take the time now to post the prototype parser I wrote to the issue tracker to get feedback.

I think that the modular structure of the parser will be useful in Zim, because the parts may be used independently. These are the well-defined tasks it performs:

1. Break the text into lines.
2. Classify each line according to its potential role in the document: heading, bullet, indent, paragraph.
3. Group lines into blocks with a top-down parser: , , , and .
4. Resolve inline markup where applicable (not in ), allowing for reasonable nesting (italics within bold, for instance).

Come to think of it, the same parser structure can be applied to any wiki dialect.

I was thinking about the role of xml.etree in the parsing. My current take is that the parse tree should support the Visitor pattern, to make it easy to render into several formats, including xml.etree.

The *compiler* module and its way of doing things may be exactly what's needed:

http://docs.python.org/library/compiler.html#compiler.ast.Node
http://docs.python.org/library/compiler.html#compiler.visitor.ASTVisitor

--
Juancarlo *Añez*

___
Mailing list: https://launchpad.net/~zim-wiki
Post to : zim-wiki@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zim-wiki
More help : https://help.launchpad.net/ListHelp
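As a rough illustration of the first three stages listed in the message above (split into lines, classify each line, group lines into blocks), here is a minimal Python sketch. It is not the prototype parser mentioned in the thread. The function names, the line classes, and the regular expressions are assumptions made for illustration, and the grouping stage simply merges adjacent lines of the same class, which is simpler than the top-down parser the message describes.

```python
import re
from itertools import groupby

# Hypothetical line classes; the patterns are assumptions about the wiki
# syntax, not taken from Zim's source code.
LINE_PATTERNS = [
    ("heading", re.compile(r"^(={2,6}) (.+?) =+\s*$")),
    ("bullet", re.compile(r"^\t*\* (.+)$")),
    ("indent", re.compile(r"^\t+(.+)$")),
]


def classify(line):
    """Stage 2: tag one line with its potential role in the document."""
    if not line.strip():
        return ("blank", "")
    for kind, pattern in LINE_PATTERNS:
        match = pattern.match(line)
        if match:
            # The last group of every pattern holds the line's content.
            return (kind, match.groups()[-1])
    return ("paragraph", line.strip())


def group_blocks(classified):
    """Stage 3 (simplified): merge runs of same-kind lines into blocks."""
    blocks = []
    for kind, run in groupby(classified, key=lambda item: item[0]):
        texts = [text for _, text in run]
        if kind == "blank":
            continue  # blank lines only separate blocks
        if kind == "heading":
            # Each heading line is a block of its own.
            blocks.extend(("heading", [text]) for text in texts)
        else:
            blocks.append((kind, texts))
    return blocks


def parse(text):
    lines = text.splitlines()                         # stage 1
    classified = [classify(line) for line in lines]   # stage 2
    return group_blocks(classified)                   # stage 3
```

Because each stage is a plain function, a UI could call `classify` on a single edited line without running the whole pipeline, which is the kind of independent reuse the message argues for.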
Re: [Zim-wiki] Questions about Wiki markup and its Parser
Hello everyone,

in my opinion moving the markup and its lexer/parser to something generic and future-proof is a very good idea. It could (and should) be the first step to make Zim markup compatible with some widespread markup syntax like Markdown or reStructuredText...

In my opinion this step (standard markup syntax) should be taken even if it means breaking compatibility with the old syntax! A converter could be created to ease the transition...

It should not break the WYSIWYG editor capabilities of Zim, however...

Best Regards
Michael
Re: [Zim-wiki] Questions about Wiki markup and its Parser
On Sun, Dec 12, 2010 at 9:50 PM, Jaap Karssenberg < jaap.karssenb...@gmail.com> wrote:

> If you feel that is too much overhead the alternative is to open a
> ticket in the bug tracker and attach a patch file there.

That would be the route to take, because I'm not making changes to Zim (yet), but rather proposing the adoption of a new module.

BTW, today I found this:

http://pyparsing.wikispaces.com/

It seems worth taking a look at.

--
Juancarlo *Añez*
Re: [Zim-wiki] Questions about Wiki markup and its Parser
On Mon, Dec 13, 2010 at 1:49 AM, Juancarlo Añez wrote:

> Where should I post the source code for review?

The best workflow would be to post your own Bazaar branch on Launchpad. That way others can check out the code, propose patches, etc. You can also request merges with the main branch.

If you feel that is too much overhead, the alternative is to open a ticket in the bug tracker and attach a patch file there.

Regards,

Jaap
Re: [Zim-wiki] Questions about Wiki markup and its Parser
On Sat, Dec 11, 2010 at 11:58 AM, Juancarlo Añez wrote:

> I think that it will be best if this new parser produces only an ad-hoc
> syntactic tree (hierarchical structure, unknown meaning) that can be tested
> independently of the rest of Zim. Converting the syntactic tree to an
> xml.etree should be straightforward.

I already have a parser that understands most of the wiki language. It lacks:

1. Checkboxes.
2. xml.etree
3. Cleanup and unit tests.

The parser produces a very hierarchical structure through nested tuples and lists, because that was easy and fast to code. Switching to xml.etree should provide much more functionality, including pretty-printing.

At this point what matters is a code review that assesses whether the code is readable and understandable enough to merit its adoption once it has been tested to production quality.

Where should I post the source code for review?

--
Juancarlo *Añez*
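To make the step from nested tuples and lists to xml.etree concrete, here is a small Python sketch. The tree shape `(tag, [children])` and the tag names are assumptions chosen for illustration; the thread does not show the structure the prototype actually produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree shape: (tag, [children]), where each child is either
# a plain string or another (tag, [children]) tuple.
tree = ("page", [
    ("h", ["Title"]),
    ("p", ["Some ", ("strong", ["bold"]), " text"]),
])


def to_element(node):
    """Recursively convert a (tag, children) tuple into an Element."""
    tag, children = node
    element = ET.Element(tag)
    last = None  # most recently appended child element, if any
    for child in children:
        if isinstance(child, str):
            # ElementTree keeps text before the first child in .text and
            # text that follows a child element in that child's .tail.
            if last is None:
                element.text = (element.text or "") + child
            else:
                last.tail = (last.tail or "") + child
        else:
            last = to_element(child)
            element.append(last)
    return element


root = to_element(tree)
xml_text = ET.tostring(root, encoding="unicode")
# xml_text == '<page><h>Title</h><p>Some <strong>bold</strong> text</p></page>'
```

The only subtle part is the `.text` / `.tail` split that ElementTree uses for mixed content; everything else is a direct recursive walk.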
Re: [Zim-wiki] Questions about Wiki markup and its Parser
On Sat, Dec 11, 2010 at 10:17 AM, Jaap Karssenberg < jaap.karssenb...@gmail.com> wrote:

> Sure, if it supports the same wiki syntax and the code is cleaner than
> mine I'll be happy to use it.

OK. I'll give it a shot.

> I have been meaning to clean up the object structure used for the
> internal parse tree, but if you use the "builder" interface for
> xml.etree it should be compatible when I switch to the new library.

I think that it will be best if this new parser produces only an ad-hoc syntactic tree (hierarchical structure, unknown meaning) that can be tested independently of the rest of Zim. Converting the syntactic tree to an xml.etree should be straightforward.

--
Juancarlo *Añez*
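One way to keep an ad-hoc syntax tree independent of the rest of Zim is to put every output format in its own visitor class, so the tree itself knows nothing about rendering. The Python sketch below assumes a hypothetical `Node` class and hypothetical node kinds (`paragraph`, `strong`, `text`); none of these names come from Zim or from the prototype. The dispatch-by-method-name approach is the same idea used by the standard library's `ast.NodeVisitor`.

```python
import html


class Node:
    """Hypothetical parse-tree node: a kind, child nodes, optional text."""

    def __init__(self, kind, children=(), text=""):
        self.kind = kind
        self.children = list(children)
        self.text = text


class Visitor:
    """Dispatch to visit_<kind>, falling back to generic_visit."""

    def visit(self, node):
        method = getattr(self, "visit_" + node.kind, self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        # Default behaviour: concatenate the results for all children.
        return "".join(self.visit(child) for child in node.children)


class HtmlRenderer(Visitor):
    def visit_text(self, node):
        return html.escape(node.text)

    def visit_paragraph(self, node):
        return "<p>" + self.generic_visit(node) + "</p>"

    def visit_strong(self, node):
        return "<strong>" + self.generic_visit(node) + "</strong>"


class PlainTextRenderer(Visitor):
    def visit_text(self, node):
        return node.text


tree = Node("paragraph", [
    Node("text", text="a < "),
    Node("strong", [Node("text", text="b")]),
])
html_out = HtmlRenderer().visit(tree)        # '<p>a &lt; <strong>b</strong></p>'
text_out = PlainTextRenderer().visit(tree)   # 'a < b'
```

Adding another target, such as an xml.etree builder, would mean adding one more visitor class without touching the parser or the tree, and the tree can be unit-tested with a trivial visitor like `PlainTextRenderer`.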
Re: [Zim-wiki] Questions about Wiki markup and its Parser
On Fri, Dec 10, 2010 at 10:50 PM, Juancarlo Añez wrote:

> I was examining the source code for Zim with the aim of adding support for
> enumerated lists, but found the code that deals with wiki-text too
> complicated to be amenable to a small hack.
>
> Questions:
>
> Is there a reason why Zim doesn't use one of the wiki markup libraries
> readily available for Python?

It's a bit historic, since Zim was originally written in Perl, and also because I had my own ideas about how a good wiki format should look. I have been meaning to add other options for compatibility with other wikis, but never got around to doing that myself.

> Would it help (would it be accepted) if I wrote a simpler and more
> extensible wiki parser?

... 8< ...

Sure, if it supports the same wiki syntax and the code is cleaner than mine I'll be happy to use it. Looks like you are more qualified than I am, since I never followed any formal courses in parsers or grammars.

I have been meaning to clean up the object structure used for the internal parse tree, but if you use the "builder" interface for xml.etree it should be compatible when I switch to the new library.

Regards,

Jaap
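For readers unfamiliar with the "builder" interface mentioned above, the following Python sketch shows how a parser could emit start/data/end events into `xml.etree.ElementTree.TreeBuilder` instead of constructing elements itself. The tag and attribute names (`page`, `h`, `level`, `p`, `strong`) are placeholders for illustration, not the names Zim uses internally.

```python
import xml.etree.ElementTree as ET

# A parser only needs to call start(), data() and end() in document order;
# TreeBuilder assembles the element tree from those events.
builder = ET.TreeBuilder()
builder.start("page", {})
builder.start("h", {"level": "1"})
builder.data("Title")
builder.end("h")
builder.start("p", {})
builder.data("Hello ")
builder.start("strong", {})
builder.data("world")
builder.end("strong")
builder.end("p")
builder.end("page")

root = builder.close()  # returns the root Element
xml_text = ET.tostring(root, encoding="unicode")
# xml_text == '<page><h level="1">Title</h><p>Hello <strong>world</strong></p></page>'
```

Since the parser depends only on these three calls, any object offering the same methods could be substituted for `TreeBuilder`, which is presumably why code written against this interface would survive a change of tree library.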
[Zim-wiki] Questions about Wiki markup and its Parser
I really like Zim. It has let me take care of my day-to-day design notes in a simple, useful, and safe way.

I was examining the source code for Zim with the aim of adding support for enumerated lists, but found the code that deals with wiki-text too complicated to be amenable to a small hack.

Questions:

1. Is there a reason why Zim doesn't use one of the wiki markup libraries readily available for Python?
2. Would it help (would it be accepted) if I wrote a simpler and more extensible wiki parser?

I specialized in programming language theory at university, and later taught the subject as a professor, and I think I know why parsing wiki tends to be complicated:

1. Wiki is not a regular language, so regular expressions alone don't cut it.
2. Wiki has languages within languages, much like Python format strings are a language within Python.
3. Some of the sub-languages in Wiki are regular, and the outer languages seem amenable to top-down analysis, but the whole doesn't look LL or LR, so it is unlikely that a single grammar can be built for it.

The above means that parsing of Wiki should probably be done by several parsers set up in layers or pipelines. It also means that the sub-languages should be more formally defined (with the likely consequence that some odd stuff that currently "just works" may cease to).

Again, if there's interest, I can give it a shot. The purpose would be to have a parser that's easier to understand and improve (somewhere in the Zim docs it says that performance is a non-issue in parsing).

--
Juancarlo *Añez*
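The layering argument above can be illustrated with inline markup. In the Python sketch below, a regular expression handles the regular part (finding delimiters), and a small recursive function handles the nesting that a regular expression alone cannot track. The delimiters `**` for bold and `//` for italics, and the names `tokenize` and `parse_inline`, are assumptions made for this example; it is not code from Zim.

```python
import re

# Assumed delimiters: ** for bold and // for italics.
DELIMITERS = {"**": "strong", "//": "emphasis"}
TOKEN = re.compile(r"(\*\*|//)")


def tokenize(text):
    """Regular layer: split text into delimiter and plain-text tokens."""
    # The capturing group makes re.split keep the delimiters in the result.
    return [token for token in TOKEN.split(text) if token]


def parse_inline(tokens, closing=None):
    """Top-down layer: build nested (tag, children) tuples from tokens.

    Consumes tokens from the front of the list until the expected closing
    delimiter is found or the tokens run out.
    """
    children = []
    while tokens:
        token = tokens.pop(0)
        if token == closing:
            return children
        if token in DELIMITERS:
            # An opening delimiter starts a nested span that ends at the
            # next matching delimiter at the same nesting level.
            inner = parse_inline(tokens, closing=token)
            children.append((DELIMITERS[token], inner))
        else:
            children.append(token)
    return children


result = parse_inline(tokenize("plain **bold //both// more** end"))
# result == ['plain ', ('strong', ['bold ', ('emphasis', ['both']), ' more']), ' end']
```

This sketch also shows why the sub-languages would need a more formal definition: it silently treats an unclosed delimiter as closed at the end of the text, and it would misread the `//` in a URL as italics. A real parser has to decide such cases explicitly.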