On Fri, Jul 8, 2011 at 2:13 PM,  <[email protected]> wrote:
> On Fri, 8 Jul 2011, Arlen Beiler wrote:
>
>> For your information, TiddlyWiki has no concept of paragraphs. As you may
>> have noticed, it simply adds a line break each time you hit enter, which is
>> why I suggested closing tags at line breaks if they aren't closed. Maybe you
>> meant the same thing, but I thought I would point it out.
>
> I personally think this is a serious bug (the lack of paragraphs). If
> there is discussion going on about clarifying and correcting the
> syntax my number one request would be:
>
> * identify blocks, as things which are separated by \n\n, make \n be
>  meaningless whitespace
>
> This would imply:
>
> * if somebody wants a br, it needs to be explicit, not implied
> * inline tags/markup don't extend beyond the current block
>
> [joe said:]
>>>
>>> I have a feeling that the use of regular expressions greatly hinders
>>> writing a parser - regexps are not good at operator precedence, nor
>>> for matching context sensitive grammars and most wiki parsers seem to
>>> be massive sets of regexps which I just cannot read - top-down
>>> recursive descent parsing seems a lot easier.
>
> Yes, this is the way of the wiki world, unfortunately. On hacker news
> not long ago there was some talk about how complex the parser for
> mediawiki text is[1]. I've tried to write parsers (with formal
> grammars) for usemod/purplewiki, kwiki, socialtext and started
> thinking about it for tiddlywiki and have never made good progress,
> usually because the regexps don't model a formal grammar, so the
> exceptional cases are a pain.

I'm into my second rewrite now :-)

It's a multi-pass parser. So far:

   - pass 1 - split text at line breaks

   - pass 2 - tokenise

       the following tokens are collected
       {h, Str}              = a line starting with !
       {ul, Str}             = a line starting with *
       {ol, Str}             = a line starting with #
       {table, Str}          = a line starting with |
       {block, [Lines]}      = a {{{ ... multiline block }}}
       {css, Class, [Lines]} = a {{Class{ ... }}} block
       blank                 = a blank line
       {str, S}              = anything else

       This cannot be done with regexps alone; it needs a simple FSM with
       a stack, because the {{Class{ ... }}} contexts are nestable.

       Lines is a list of strings (for a block) or a list of tokens (for a
       css context block).

   - pass 3 - call a custom parser based on the tag

       this resets all flags prior to parsing, so errors due to unmatched
       markup tags do not propagate into the next construct.
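
A minimal sketch of passes 1 and 2, written in Python purely for
illustration (the message does not say what the real parser is written in).
The names tokenise and CSS_OPEN are hypothetical. It assumes that "{{{",
"{{Class{" and "}}}" each sit on a line of their own, and that * starts a
bulleted item and # a numbered one, as in standard TiddlyWiki markup. The
stack is what lets {{Class{ ... }}} contexts nest:

```python
import re

# Hypothetical pattern for a line that opens a css context, e.g. "{{note{".
CSS_OPEN = re.compile(r"^\{\{([A-Za-z_][\w-]*)\{$")

# First character of a line -> token tag.
PREFIXES = {"!": "h", "*": "ul", "#": "ol", "|": "table"}


def tokenise(text):
    """Pass 1 splits the text into lines; pass 2 groups lines into tokens."""
    lines = text.split("\n")   # pass 1
    root = []                  # tokens at the top level
    stack = []                 # open css contexts: (class name, parent list)
    current = root             # list that receives the next token
    raw = None                 # lines of an open {{{ block, or None

    for line in lines:         # pass 2
        stripped = line.strip()
        css = CSS_OPEN.match(stripped)
        if raw is not None:
            # Inside {{{ ... }}}: keep lines verbatim until the closer.
            if stripped == "}}}":
                current.append(("block", raw))
                raw = None
            else:
                raw.append(line)
        elif stripped == "{{{":
            raw = []
        elif css:
            # Entering a nested css context: remember where to return to.
            stack.append((css.group(1), current))
            current = []
        elif stripped == "}}}" and stack:
            cls, parent = stack.pop()
            parent.append(("css", cls, current))
            current = parent
        elif stripped == "":
            current.append(("blank",))
        elif line[0] in PREFIXES:
            current.append((PREFIXES[line[0]], line))
        else:
            current.append(("str", line))

    if raw is not None or stack:
        raise ValueError("unclosed {{{ or {{Class{ block")
    return root
```

Each token keeps its full source line, so the per-tag parsers in pass 3
can start from a clean state for every token.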

  The only tricky bit is tables - you have to prescan for links and
  inline blocks etc. For example,

      | {{{ | }}} |

  is a table with one cell, not two - you can't just blindly split this
  on a "|" character.
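
One way to do that prescan, again as a hedged Python sketch rather than the
actual implementation: walk the row character by character and only treat
"|" as a cell separator while outside {{{ ... }}} and [[ ... ]]. The
function name split_row is hypothetical, and the sketch assumes the row
starts and ends with "|":

```python
def split_row(row):
    """Split a table row on '|', ignoring '|' inside {{{...}}} and [[...]]."""
    pairs = {"{{{": "}}}", "[[": "]]"}   # opener -> closer
    cells, buf = [], []
    closer = None                         # closer we are waiting for, if any
    i = 0
    while i < len(row):
        if closer is not None:
            # Inside an inline block or link: copy text until the closer.
            if row.startswith(closer, i):
                buf.append(closer)
                i += len(closer)
                closer = None
            else:
                buf.append(row[i])
                i += 1
            continue
        opener = next((o for o in pairs if row.startswith(o, i)), None)
        if opener is not None:
            buf.append(opener)
            i += len(opener)
            closer = pairs[opener]
        elif row[i] == "|":
            cells.append("".join(buf))    # a real cell boundary
            buf = []
            i += 1
        else:
            buf.append(row[i])
            i += 1
    if closer is not None:
        raise ValueError("unclosed inline markup in table row")
    cells.append("".join(buf))
    # Drop the empty pieces before the first and after the last '|'.
    return [c.strip() for c in cells[1:-1]]
```

With this, split_row("| {{{ | }}} |") yields the single cell "{{{ | }}}".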

>
> If making a correct syntax is the goal, and backwards compatibility is
> not a concern, it would probably make sense to start from a grammar and
> then implement a tokenizer and one or more generators.

This is what I'm doing - my goal is to parse "most" existing tiddlers,
but not "all". For a TiddlyWiki with a few hundred tiddlers I might
expect a few rejects.

By a reject I mean a tiddler with badly formed content - I would also
expect to warn the user. A kind of lint approach. Automatic fixing is
probably not worth the effort. Goal: fix a moderate-size TiddlyWiki with
a few minutes of editing.
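
The lint idea could look roughly like this in Python. This is only a
sketch under assumptions: the function name lint is hypothetical, tiddlers
are assumed to be available as a title -> text mapping, and the only check
shown is for unbalanced {{{ / {{Class{ / }}} lines. It reports problems
with line numbers and changes nothing:

```python
def lint(tiddlers):
    """Return {title: [warnings]} for tiddlers with badly formed blocks."""
    rejects = {}
    for title, text in tiddlers.items():
        warnings = []
        open_at = []   # stack of (line number, kind) for blocks still open
        for n, line in enumerate(text.split("\n"), start=1):
            s = line.strip()
            in_raw = bool(open_at) and open_at[-1][1] == "raw"
            if s == "}}}":
                if open_at:
                    open_at.pop()
                else:
                    warnings.append(
                        "line %d: closing }}} without a matching opener" % n)
            elif in_raw:
                continue   # contents of {{{ ... }}} are not markup
            elif s == "{{{":
                open_at.append((n, "raw"))
            elif s.startswith("{{") and s.endswith("{"):
                open_at.append((n, "css"))
        for n, _kind in open_at:
            warnings.append(
                "line %d: block opened here is never closed" % n)
        if warnings:
            rejects[title] = warnings   # warn only, no automatic fixing
    return rejects
```

The user then gets a short list of tiddlers to fix by hand.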

Cheers

/Joe

>
> [1] http://news.ycombinator.com/item?id=2521576
>
> --
> Chris Dent                                   http://burningchrome.com/
>                                [...]
>
> --
> You received this message because you are subscribed to the Google Groups
> "TiddlyWikiDev" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tiddlywikidev?hl=en.
>
>

