For dashes, italics, and all the other formatting strings, I think the
formatting should close at the first line break unless the formatting
string is on a line by itself. Example:

//
*foo
*foo**foo--foo**foo--
*foo
//
The second list item would parse as usual, but its open tags would be
closed before moving on to the next line. The whole thing would be italic.

On Thu, Jul 7, 2011 at 6:12 PM, Jeremy Ruston <[email protected]> wrote:

> > I'm writing a tiddlyWiki parser as part of an exercise to try and
> > understand how the tiddlyWiki works.
>
> Awesome. I like the way that you've been approaching TiddlyWiki from a
> formal perspective; its history has been influenced by its early
> development as an experiment, and then its rapid mass adoption quite
> early on as a practical tool. It has rather constrained our ability to
> change some of the less desirable fundamentals, and meant that the
> definitive specification is all too frequently the source code.
>
> > I have a few questions:
>
> > 1) What is a wikiword? - I can find no definition. Where is the
> > definitive definition?
>
> The definitive definition is a regular expression in the source code,
> starting at:
>
> https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/Config.js#L179
>
> The important bit is:
>
> config.textPrimitives.wikiLink = "(?:(?:" +
>     config.textPrimitives.upperLetter + "+" +
>     config.textPrimitives.lowerLetter + "+" +
>     config.textPrimitives.upperLetter +
>     config.textPrimitives.anyLetter + "*)|(?:" +
>     config.textPrimitives.upperLetter + "{2,}" +
>     config.textPrimitives.lowerLetter + "+))";
>
> Confusingly, lowerLetter includes the digits and dashes, as does anyLetter.
>
> > My tiddly wiki seems to think that UTF-8 is a wikiword (since it's
> > written ~UTF-8) - my parser
> > finds a ~ then expects a wikiword but UTF-8 is not by the definitions
> > I have found a wikiword
> > ie it's not CamelCase.
>
> See above.
>
> > 2) What is escaped text. As in """ escaped text """
> > Is there any difference between """ ... """ (three consecutive double
> > quotes) and {{{ ... }}}
>
> The purpose of escaped text is to suppress wikification without
> applying the monospaced formatting applied by {{{....}}}.
>
> As you can see from the code, <nowiki>...</nowiki> can also be used to
> escape text.
>
> https://github.com/TiddlyWiki/tiddlywiki/blob/master/js/Formatter.js#L504
>
> > 3) What is supposed to happen when markup elements are incorrectly
> > nested?
>
> The wikifier generally treats the conflicting formatting as if it were
> plain text, but frequently finds a confusingly optimistic way of
> interpreting broken markup.
>
> > I tried a tiddly page containing this:
> >
> > * --abc//def--hij//
> > * ghi
> > * 123
>
> In this case, the first double dash triggers strikethrough formatting.
> Then the double slash triggers italics.
>
> The second double dash isn't interpreted as the end of the
> strikethrough formatting because the wikifier is still waiting for the
> italics to be terminated first. Instead, it interprets the second
> double dash as the start of a new run of strike through text, nested
> inside the first. Of course, there's no visual difference.
>
> By the time the second double slash is encountered, then, the same
> thing happens: the wikifier interprets it as the start of a new,
> nested run of italic text.
>
> All of which is why the second and third lines are rendered with
> italics and strikethrough.
>
> Finally, the reason that the second and third lines are indented is
> that the wikifier is looking for the italics and strikethrough to end
> before it will recognise the next sibling list item. It's probably a
> bug that it therefore falls back to interpreting the subsequent list
> items as child items.
>
> The reason that the wikifier behaves in this unexpected way is that it
> is structured very simply: each individual formatter (ie element of
> wiki syntax) is modelled as a regexp that triggers it, and a function
> that processes it, usually with the help of another regular
> expression. The idea was to make it easy to add new formatters without
> disturbing the existing ones.
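
To make the structure Jeremy describes concrete, here is a minimal sketch of a wikifier built from a table of formatters, each with a trigger regexp and a terminator regexp. This is not the TiddlyWiki code: the names `formatters`, `matchAt` and `wikify` are made up for illustration, only two formatters are modelled, and list handling and HTML escaping are left out. The point is just that while a formatter is open, only its own terminator is looked for, so a mismatched closer is taken as the start of a new nested run.

```javascript
// Hypothetical formatter table (not the real TiddlyWiki one): a trigger
// regexp, a terminator regexp, and the HTML tag to emit. The sticky flag
// ("y") makes a regexp match only at its lastIndex position.
const formatters = [
  { trigger: /--/y, terminator: /--/y, tag: "strike" },
  { trigger: /\/\//y, terminator: /\/\//y, tag: "em" },
];

// Length of the match of `re` exactly at `pos`, or 0 if there is none.
function matchAt(re, text, pos) {
  re.lastIndex = pos;
  const m = re.exec(text);
  return m ? m[0].length : 0;
}

// Wikify from `pos` until `terminator` (or the end of the input) is reached.
function wikify(text, pos = 0, terminator = null) {
  let html = "";
  while (pos < text.length) {
    // Only the innermost open formatter's terminator is looked for.
    const end = terminator ? matchAt(terminator, text, pos) : 0;
    if (end > 0) {
      return { html, pos: pos + end };
    }
    const f = formatters.find((x) => matchAt(x.trigger, text, pos) > 0);
    if (f) {
      // A trigger opens a new run, even if an outer run uses the same marker.
      const start = pos + matchAt(f.trigger, text, pos);
      const inner = wikify(text, start, f.terminator);
      html += `<${f.tag}>${inner.html}</${f.tag}>`;
      pos = inner.pos;
    } else {
      html += text[pos];
      pos += 1;
    }
  }
  // Running out of input closes whatever is still open.
  return { html, pos };
}

console.log(wikify("--abc//def//ghi--").html);
// <strike>abc<em>def</em>ghi</strike>
console.log(wikify("--abc//def--hij//").html);
// <strike>abc<em>def<strike>hij<em></em></strike></em></strike>
```

In the second call the inner `--` and `//` each open a new nested run, which is the behaviour described for the first list item above. Nothing closes until the input ends, so in a multi-line tiddler the following lines would end up inside all of those runs.
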
>
> > Now there is no notion of "incorrect markup" ie parsing cannot fail
> > with a syntax error (or can it?)
>
> No, that is correct, parsing cannot fail; the goal in error conditions
> is just to fail visibly, and emit as much readable text as possible.
>
> > unfortunately what is "sensible" in the event of badly nested markup
> > is debatable. This bodes
> > ill for standardization and wide scale adoption.
>
> Hopefully we can fix the most annoying problems.
>
> > In most html parsers an incorrect markup does not propagate beyond the
> > scope of the
> > current block (given some definition of a block)
>
> Yes, that would be much more useful.
>
> > Consider this:
> >
> > * abc//def
> > * ghi//
> > * 123
> >
> > The result is weird: this generates three levels of list indents - all
> > italic
> >
> > In my mind all "open" markup should be closed at the end of each superior
> > block.
>
> I think that this could be fixed - as discussed elsewhere, we're very
> interested at Osmosoft in evolving a tiddly markup mark 2.
>
> > Comments: 1) this yields well-formed XHTML 2) a bit more work could make
> > this
> > well-formed and retain what I "expected"
>
> I suspect that the broad user community would favour well-formed XHTML
> for these cases, but perhaps we can still make the mis-nesting
> behaviour less unexpected.
>
> Best wishes
>
> jeremy
>
> > Cheers
> >
> > /Joe
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "TiddlyWikiDev" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/tiddlywikidev?hl=en.
>
> --
> Jeremy Ruston
> mailto:[email protected]
> http://www.tiddlywiki.com
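
On question 1 in the quoted message: the wikiLink expression can be tried out on its own to see why UTF-8 counts as a wikiword. The sketch below copies the concatenation from Jeremy's reply, but the three character classes are my assumption: simplified, ASCII-only stand-ins that follow his remark that lowerLetter and anyLetter include digits and dashes. The real definitions in Config.js may cover more characters. `isWikiWord` is a made-up helper.

```javascript
// Assumed, simplified character classes (ASCII only). The real values live
// in TiddlyWiki's Config.js and may differ.
const textPrimitives = {
  upperLetter: "[A-Z]",
  lowerLetter: "[a-z0-9\\-]",
  anyLetter: "[A-Za-z0-9\\-]",
};

// Same concatenation as the snippet quoted from Config.js.
textPrimitives.wikiLink = "(?:(?:" +
  textPrimitives.upperLetter + "+" +
  textPrimitives.lowerLetter + "+" +
  textPrimitives.upperLetter +
  textPrimitives.anyLetter + "*)|(?:" +
  textPrimitives.upperLetter + "{2,}" +
  textPrimitives.lowerLetter + "+))";

// Hypothetical helper: true if the whole string matches the expression.
function isWikiWord(s) {
  return new RegExp("^" + textPrimitives.wikiLink + "$").test(s);
}

console.log(isWikiWord("CamelCase")); // true  (first alternative)
console.log(isWikiWord("UTF-8"));     // true  (second alternative: "UTF" then "-8")
console.log(isWikiWord("Hello"));     // false
console.log(isWikiWord("HTML"));      // false (no trailing lowerLetter run)
```

Under these assumed classes, UTF-8 matches the second alternative: two or more upper-case letters followed by a "lower-case" run, where the dash and the digit count as lower-case.
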
