Re: [Zim-wiki] Questions about Wiki markup and its Parser

2010-12-14 Thread Juancarlo Añez
On Mon, Dec 13, 2010 at 4:52 PM, Michael Nagel wrote:

> in my opinion moving the markup and its lexer/parser to something generic
> and future-proof is a very good idea. It could (and should) be the first
> step to make zim markup compatible with some widespread markup syntax like
> markdown or restructured text...


Zim's wiki language is already close enough to Creole.

One issue with Zim is that the interactive wiki-to-neat feature requires
either:

   1. Access to specific parts of the parser.
   2. Repeating parsing logic in the UI.

I took a look at the source code in python-creoleparser, and found it quite
complicated. Its dependency on genshi is a plus or a minus depending on how
you look at it.

I will take the time now to post the prototype parser I wrote to the bug
tracker to get feedback.

I think that the modular structure of the parser will be useful in Zim,
because the parts may be used independently. These are the well-defined
tasks it performs:

   1. Break the text into lines.
   2. Classify each line according to its potential role in the document:
   heading, bullet, indent, paragraph.
   3. Group lines into blocks with a top-down parser: , , 
   , and .
   4. Resolve inline markup where applicable (not in ), allowing for
   reasonable nesting (italics within bold, for instance).
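
The four steps above can be sketched roughly as follows. This is only an
illustration, not the prototype itself: the markers ("== Title ==" headings,
"* " bullets, a leading tab for indents, "**" for bold, "//" for italics) and
the names classify, group, inline and parse are assumptions made for the
example, and verbatim blocks are left out.

```python
import re

# Assumed markers, for illustration only.
HEADING = re.compile(r"^(={2,6})\s+(.*?)\s+=+\s*$")
BULLET = re.compile(r"^\*\s+(.*)$")
INDENT = re.compile(r"^\t+(.*)$")
BOLD = re.compile(r"\*\*(.+?)\*\*")
ITALIC = re.compile(r"//(.+?)//")


def classify(line):
    """Step 2: tag one line with its potential role."""
    if not line.strip():
        return ("blank", "")
    m = HEADING.match(line)
    if m:
        return ("heading", m.group(2))
    m = BULLET.match(line)
    if m:
        return ("bullet", m.group(1))
    m = INDENT.match(line)
    if m:
        return ("indent", m.group(1))
    return ("paragraph", line.strip())


def group(classified):
    """Step 3: merge runs of same-kind lines into blocks."""
    blocks = []
    for kind, text in classified:
        if kind == "blank":
            blocks.append(None)  # a blank line closes the current block
        elif kind != "heading" and blocks and blocks[-1] and blocks[-1][0] == kind:
            blocks[-1][1].append(text)
        else:
            blocks.append((kind, [text]))
    return [b for b in blocks if b is not None]


def inline(text):
    """Step 4: resolve inline markup; italics may nest within bold."""
    text = BOLD.sub(r"<b>\1</b>", text)
    return ITALIC.sub(r"<i>\1</i>", text)


def parse(text):
    lines = text.splitlines()                       # step 1
    classified = [classify(line) for line in lines]  # step 2
    blocks = group(classified)                       # step 3
    return [(kind, [inline(t) for t in texts]) for kind, texts in blocks]
```

Each stage takes plain data from the previous one, so a UI could call
classify or inline on its own without running the whole parser.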

Come to think of it, the same parser structure can be applied to any wiki
dialect.

I was thinking about the role of xml.etree in the parsing. My current take
on it is that the parse tree should support the Visitor pattern to make it
easy to render into several formats, including xml.etree. The *compiler*
module and its way of doing things may be exactly what's needed:

http://docs.python.org/library/compiler.html#compiler.ast.Node
http://docs.python.org/library/compiler.html#compiler.visitor.ASTVisitor
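
A minimal sketch of that idea, assuming a hypothetical Node class and
renderer names that are not part of Zim. The compiler package exists only in
Python 2, so the sketch hand-rolls a similar name-based dispatch instead of
importing it. Text is stored only before a node's children, which is a
simplification.

```python
import xml.etree.ElementTree as ET


class Node:
    """A parse-tree node: a kind, optional text, and child nodes."""

    def __init__(self, kind, text="", children=()):
        self.kind = kind
        self.text = text
        self.children = list(children)


class Visitor:
    """Dispatch on node kind: visit_<kind> if defined, else generic_visit."""

    def visit(self, node):
        method = getattr(self, "visit_" + node.kind, self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        return [self.visit(child) for child in node.children]


class ETreeRenderer(Visitor):
    """Render the tree as xml.etree elements."""

    def generic_visit(self, node):
        elem = ET.Element(node.kind)
        elem.text = node.text
        for child in node.children:
            elem.append(self.visit(child))
        return elem


class PlainTextRenderer(Visitor):
    """Render the same tree as plain text, one block per line."""

    def generic_visit(self, node):
        return node.text + "".join(self.visit(c) for c in node.children)

    def visit_document(self, node):
        return "\n".join(self.visit(c) for c in node.children)
```

Adding another output format then means adding another Visitor subclass,
without touching the parser or the tree.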

-- 
Juancarlo *Añez*
___
Mailing list: https://launchpad.net/~zim-wiki
Post to : zim-wiki@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zim-wiki
More help   : https://help.launchpad.net/ListHelp


Re: [Zim-wiki] Questions about Wiki markup and its Parser

2010-12-13 Thread Michael Nagel
Hello everyone,

in my opinion moving the markup and its lexer/parser to something generic and
future-proof is a very good idea. It could (and should) be the first step to
make zim markup compatible with some widespread markup syntax like markdown or
restructured text...
In my opinion this step (standard markup syntax) should be taken even if it
means breaking compatibility with the old syntax! A converter could be created
to ease the transition...
It should not break the WYSIWYG editor capabilities of zim, however...

Best Regards
Michael



Re: [Zim-wiki] Questions about Wiki markup and its Parser

2010-12-12 Thread Juancarlo Añez
On Sun, Dec 12, 2010 at 9:50 PM, Jaap Karssenberg <
jaap.karssenb...@gmail.com> wrote:

> If you feel that is too much overhead the alternative is to open a
> ticket in the bug tracker and attach a patch file there.
>

That would be the route because I'm not making changes to Zim (yet), but
rather proposing the adoption of a new module.

BTW, today I found this:

http://pyparsing.wikispaces.com/

It seems worth taking a look at.

-- 
Juancarlo *Añez*


Re: [Zim-wiki] Questions about Wiki markup and its Parser

2010-12-12 Thread Jaap Karssenberg
On Mon, Dec 13, 2010 at 1:49 AM, Juancarlo Añez  wrote:
> Where should I post the source code for review?

Best workflow would be to post your own bazaar branch on launchpad.
That way others can check out the code and propose patches etc. Also
you can request merges with the main branch.

If you feel that is too much overhead the alternative is to open a
ticket in the bug tracker and attach a patch file there.

Regards,

Jaap



Re: [Zim-wiki] Questions about Wiki markup and its Parser

2010-12-12 Thread Juancarlo Añez
On Sat, Dec 11, 2010 at 11:58 AM, Juancarlo Añez  wrote:

> I think that it will be best if this new parser produces only an ad-hoc
> syntactic tree (hierarchical structure, unknown meaning) that can be tested
> independently of the rest of Zim. Converting the syntactic tree to an
> xml.etree should be straightforward.
>

I already have a parser that understands most of the wiki language.

It lacks:

   1. Checkboxes.
   2. xml.etree
   3. Cleanup and unit tests.

The parser produces a very hierarchical structure through nested tuples and
lists, because that was easy and fast to code. Switching to xml.etree should
provide much more functionality, including pretty-printing.
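
For illustration, converting such a structure could look like the sketch
below. The (tag, children) tuple shape and the name to_etree are assumptions
made for the example, not the actual layout the prototype produces.

```python
import xml.etree.ElementTree as ET


def to_etree(node):
    """Convert a (tag, children) tuple into an Element.

    Assumed shape: children is a list whose items are either plain
    strings or further (tag, children) tuples.
    """
    tag, children = node
    elem = ET.Element(tag)
    last = None  # most recently appended child element
    for child in children:
        if isinstance(child, str):
            # Text before the first child goes in .text; later text
            # becomes the .tail of the preceding child element.
            if last is None:
                elem.text = (elem.text or "") + child
            else:
                last.tail = (last.tail or "") + child
        else:
            last = to_etree(child)
            elem.append(last)
    return elem
```

On Python 3.9 or later, ET.indent(elem) can be applied to the result before
ET.tostring to get indented output.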

At this point what matters is a code review that assesses if the code is
readable and understandable enough to merit its adoption once it has been
tested to production quality.

Where should I post the source code for review?

-- 
Juancarlo *Añez*


Re: [Zim-wiki] Questions about Wiki markup and its Parser

2010-12-11 Thread Juancarlo Añez
On Sat, Dec 11, 2010 at 10:17 AM, Jaap Karssenberg <
jaap.karssenb...@gmail.com> wrote:

> Sure, if it supports the same wiki syntax and the code is cleaner than
> mine I'll be happy to use it.
>

OK. I'll give it a shot.


>  I have been meaning to clean up the object structure used for the
> internal parse tree, but if you use the "builder" interface for
> xml.etree it should be compatible when I switch to the new library.
>

I think that it will be best if this new parser produces only an ad-hoc
syntactic tree (hierarchical structure, unknown meaning) that can be tested
independently of the rest of Zim. Converting the syntactic tree to an
xml.etree should be straightforward.

-- 
Juancarlo *Añez*


Re: [Zim-wiki] Questions about Wiki markup and its Parser

2010-12-11 Thread Jaap Karssenberg
On Fri, Dec 10, 2010 at 10:50 PM, Juancarlo Añez  wrote:
> I was examining the source code for Zim with the aim of adding support for
> enumerated lists, but found the code that deals with wiki-text too
> complicated to be amenable for a small hack.
>
> Questions:
>
> Is there a reason why Zim doesn't use one of the wiki markup libraries
> readily available for Python?

That is partly historic, since zim was originally written in Perl, and also
because I had my own ideas about how a good wiki format should look. I have
been meaning to add other options for compatibility with other wikis, but
never got around to doing that myself.

> Would it help (would it be accepted) if I wrote a simpler and more
> extensible wiki parser?
... 8< ...

Sure, if it supports the same wiki syntax and the code is cleaner than
mine I'll be happy to use it. Looks like you are more qualified than I
am since I never followed any formal courses in parsers or grammars.

I have been meaning to clean up the object structure used for the
internal parse tree, but if you use the "builder" interface for
xml.etree it should be compatible when I switch to the new library.
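
As a rough sketch of what driving that interface looks like, a parser can
emit start/text/end events and hand them to xml.etree.ElementTree.TreeBuilder.
The event tuples, the build helper and the tag names used with it are made up
for the example and are not Zim's actual ones.

```python
import xml.etree.ElementTree as ET


def build(events):
    """Feed (action, value) events to a TreeBuilder and return the root."""
    builder = ET.TreeBuilder()
    for action, value in events:
        if action == "start":
            builder.start(value, {})  # tag name, empty attribute dict
        elif action == "text":
            builder.data(value)
        elif action == "end":
            builder.end(value)
        else:
            raise ValueError("unknown event: %r" % (action,))
    return builder.close()
```

Because the parser only calls start, data, end and close, any object offering
those four methods could stand in for the builder.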

Regards,

Jaap



[Zim-wiki] Questions about Wiki markup and its Parser

2010-12-10 Thread Juancarlo Añez
I really like Zim. It has let me take care of my day-to-day design notes in
a simple, useful, and safe way.

I was examining the source code for Zim with the aim of adding support for
enumerated lists, but found the code that deals with wiki-text too
complicated to be amenable for a small hack.

Questions:

   1. Is there a reason why Zim doesn't use one of the wiki markup libraries
   readily available for Python?
   2. Would it help (would it be accepted) if I wrote a simpler and more
   extensible wiki parser?

I specialized in programming language theory at the university, and later
taught the subject as a professor, and I think I know why parsing wiki tends
to be complicated:

   1. Wiki is not a regular language, so regular expressions alone don't cut
   it.
   2. Wiki has languages within languages, much like Python format strings
   are a language within Python.
   3. Some of the sub-languages in Wiki are regular, and the outer languages
   seem amenable to top-down analysis, but the whole doesn't look LL or LR, so
   it is unlikely that a grammar can be built for it.

The above means that parsing of Wiki should probably be done by several
parsers set up in layers or pipelines. It also means that the sub-languages
should be more formally defined (with the likely consequence that some odd
stuff that currently "just works" may cease to).
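
To illustrate the layering on inline markup only: a regular expression can
find the markers, but matching openers with closers at arbitrary nesting
depth needs a second, recursive layer. The "**" and "//" markers and the
function names are assumptions made for the sketch, and mis-nested input is
not handled.

```python
import re

# Layer 1 (regular): split text into markers and plain runs.
TOKEN = re.compile(r"(\*\*|//)")
TAGS = {"**": "bold", "//": "italic"}


def tokenize(text):
    """Return markers and text runs in order, dropping empty strings."""
    return [t for t in TOKEN.split(text) if t]


def parse_inline(tokens, closer=None):
    """Layer 2 (recursive): consume tokens until `closer`, nesting spans.

    Note: `tokens` is consumed in place.
    """
    out = []
    while tokens:
        tok = tokens.pop(0)
        if tok == closer:
            return out
        if tok in TAGS:
            # An opening marker starts a nested span that ends at the
            # next matching marker.
            out.append((TAGS[tok], parse_inline(tokens, closer=tok)))
        else:
            out.append(tok)
    return out  # unterminated span: keep what was collected
```

The result is a nested structure of strings and (name, children) tuples that
a later layer can render or check.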

Again, if there's interest, I can give it a shot. The purpose would be to
have a parser that's easier to understand and improve (somewhere in the Zim
docs it says that performance is a non-issue in parsing).

-- 
Juancarlo *Añez*