Re: [xwiki-devs] [SUMMARY] Re: [Discussion] Designing the new Rendering/Parsing component/API

Vincent Massol Mon, 01 Oct 2007 13:05:46 -0700

I've discussed the email below with Mikhail. We covered 2 points:


* How to support the following:

{velocity}
<div>
<h1>Hello $customer.Name!</h1>
<table>
#foreach( $mud in $mudsOnSpecial )
   #if ( $customer.hasPurchased($mud) )
      <tr>
        <td>
          * [link>$flogger.getPromo( $mud )]
        </td>
      </tr>
   #end
#end
</table>
</div>
{/velocity}


The solution we've come to is the following for rendering the above:

a) When parsed with wikimodel, the velocity macro is called

b) the velocity macro uses an XML parser to parse its content (by adding a top level xml wrapping element for example)

c) for each text content found during the XML parsing, call wikimodel on it so that the wiki syntax can be interpreted

Same for the {html} macro. We could also use parameters to decide if XML parsing should be done, if wiki syntax should be interpreted, etc. For example: {html xml=true|false wikisyntax=true|false ...}, {velocity xml=true|false wikisyntax=true|false ...}.
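
A minimal sketch of steps b) and c) together with the two proposed switches, written in Python only for brevity (XWiki and WikiModel are Java). Everything here is an assumption made for illustration: expand_macro, wiki_to_text and the "wrapper" element are hypothetical names, parse_xml plays the role of the xml=true|false parameter, and the wiki parser is reduced to a plain text-to-text callback.

```python
import xml.etree.ElementTree as ET


def expand_macro(content, wiki_to_text, parse_xml=True, wikisyntax=True):
    """Hypothetical macro body handler following steps b) and c)."""
    if not parse_xml:
        # No XML parsing requested: the whole body is a single text run.
        return wiki_to_text(content) if wikisyntax else content

    # b) add a top-level wrapping element so the body parses as one document
    root = ET.fromstring("<wrapper>" + content + "</wrapper>")

    if wikisyntax:
        # c) hand every non-blank text run to the wiki parser
        for element in root.iter():
            if element.text and element.text.strip():
                element.text = wiki_to_text(element.text)
            if element.tail and element.tail.strip():
                element.tail = wiki_to_text(element.tail)

    # Serialize the children only, dropping the artificial wrapper
    return (root.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in root
    )
```

Passing str.upper as the stand-in wiki parser makes it easy to see which parts of the body would be handed to wikimodel and which are left alone as XML structure.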


* How to speed up document rendering?

We propose several caches:

- level 1: Parse documents using a wikisyntax DOM parser which produces a DOM tree which is cached. This tree is composed of nodes representing macros (unparsed or XML-parsed in an XML DOM tree which is cached).
- level 2: Cache the non macro blocks after they are rendered (XHTML) since they are static.
- level 3: Caching at the level of the rendered XHTML. This would be a timed-cache (the cache is refreshed every N minutes). This is absolutely required for heavy sites (Imagine Apache.org using XWiki). I think we would need to cache pages for users not logged in at least. Not sure how to cache pages when users are logged in.
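
To make the level 3 idea concrete, here is a small sketch of a timed cache for rendered XHTML, in Python for brevity. It is only an illustration under assumptions: TimedCache, get_or_render and the page key are hypothetical names, and the clock is injectable purely so the expiry can be exercised without waiting.

```python
import time


class TimedCache:
    """Keeps rendered XHTML per key and re-renders once an entry is too old."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry time, rendered XHTML)

    def get_or_render(self, key, render):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip rendering entirely
        xhtml = render()
        self._entries[key] = (now + self._ttl, xhtml)
        return xhtml
```

For anonymous users the key could simply be the page name; for logged-in users the key would have to include whatever makes their view differ, which is the open question raised above.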


Note1: level 2 and level 3 caches do not require a DOM tree.

Note2: we need the DOM tree to speed up all macros that act on content, like the TOC macro, as otherwise it means the content will need to be parsed again (which is hard and slow, whereas traversing the DOM tree is easy and fast). Since the DOM tree is cached the rendering for these macros will be fast.

Note3: The TOC macro reminds me that we need somehow to support macros that should be rendered last since they operate on the full DOM tree. This is easy to do with a DOM tree but would be quite a bit harder if we were only using a stream.
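
A rough sketch of the "rendered last" idea from Note3, in Python for brevity. All names (Node, DEFERRED, expand, the toc and section macros) are hypothetical; the point is only that with a tree, macros needing the full document can run in a second pass after every other macro has been expanded.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str  # "document", "heading", "paragraph" or "macro"
    text: str = ""
    children: list = field(default_factory=list)


def walk(node):
    """Yield the node and all its descendants in document order."""
    yield node
    for child in node.children:
        yield from walk(child)


DEFERRED = {"toc"}  # macros that must see the fully expanded tree


def expand(root, macros):
    # Pass 1 runs ordinary macros, pass 2 runs the deferred ones.
    for deferred_pass in (False, True):
        for node in list(walk(root)):
            if node.kind == "macro" and (node.text in DEFERRED) == deferred_pass:
                macros[node.text](root, node)


def toc(root, node):
    """Replace the macro node with a list of every heading in the tree."""
    headings = [n.text for n in walk(root) if n.kind == "heading"]
    node.kind, node.text = "paragraph", ", ".join(headings)


def section(root, node):
    """An ordinary macro that produces a heading."""
    node.kind, node.text = "heading", "Generated"
```

With a document made of a toc macro, an "Intro" heading and a section macro, the TOC ends up listing both "Intro" and "Generated" even though it appears first in the page.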


Any comment? Do you agree? Any other idea?

Thanks
-Vincent

On Oct 1, 2007, at 12:51 PM, Mikhail Kotelnikov wrote:

Hi! Excuse me please for the late response...

On 9/26/07, Vincent Massol <[EMAIL PROTECTED]> wrote:
Hi xwiki devs,
This is a summary of the decisions so far and the remaining questions. This is also the outcome of my discussion of today with Mikhail on skype.
1) We'll be able to import all syntaxes.

Yes
2) An XWiki instance will use a single syntax at a time. Once the database is created using that syntax it won't be possible to change it (except by doing an export and reimport).
We also need to decide what syntax we use by default OOB. I propose we use the current xwiki syntax for some time and then switch to the Wikimodel one (Common Syntax) later on.
 I have no choice - I have to vote for the "CommonSyntax" :-)
3) All pages will be able to be exported to any syntax. Some elements have no equivalent in other syntaxes and when this happens a warning will be displayed and the elements in question ignored.
Yes. You can lose data only when you *export* your data from the CommonSyntax, not when you *import* it. The WikiModel/CommonSyntax supports a super-set of elements defined in other wikis.
BTW: I think that it would be useful to involve Max Völkel (http://xam.de/) in the discussion about wiki imports/exports. Max is one of the authors of the Semantic MediaWiki. He proposed a Wiki Interchange Format (WIF) <http://eyaloren.org/pubs/semwiki2006-wif.pdf> which covers the topic. At least it would be useful to know his opinion.
4) Mikhail has agreed to modify wikiModel to add macro block recognition. The syntax isn't fully defined yet but it'll be something like:
{xxx param1=value1 param2=value2}
...
{/xxx}

(param1='value value' and param1="value value" will also be supported)
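
As an illustration of the three parameter forms (bare, single-quoted, double-quoted), here is a small Python sketch of parsing an opening macro tag. It is not WikiModel's grammar; parse_macro_open and the PARAM pattern are hypothetical and ignore escaping inside quoted values.

```python
import re

# key=value where value is "double quoted", 'single quoted' or a bare word
PARAM = re.compile(r"""(\w+)=(?:"([^"]*)"|'([^']*)'|(\S+))""")


def parse_macro_open(tag):
    """Parse '{name p1=v1 p2="v 2"}' into (name, params)."""
    inner = tag.strip()[1:-1]  # drop the surrounding braces
    name, _, rest = inner.partition(" ")
    params = {}
    for match in PARAM.finditer(rest):
        key = match.group(1)
        # exactly one of the three value groups matched
        params[key] = next(g for g in match.groups()[1:] if g is not None)
    return name, params
```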
This means we'll be able to have a common syntax for xwiki's macros and also for groovy code:
{groovy}
...
{/groovy}

And also for HTML blocks:

{html}
...
{/html}
It is already done and committed. You can use "embedded" macro blocks like
{xxx param1=value1 param2="long value 2" param3='this is a parameter 3'}
   ... {yyy} ... {/yyy} ...
{/xxx}

Even embedded elements with the same names are possible:
{xxx}
   ...
     {xxx}
       ... {yyy} ... {/yyy} ...
     {/xxx}
   ...
{/xxx}
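
The nesting shown above, including blocks of the same name inside each other, can be handled with a simple stack. The following Python sketch is only an illustration under assumptions: Block, TOKEN and parse_blocks are hypothetical names, and parameters containing braces are not supported.

```python
import re
from dataclasses import dataclass, field

# Matches {name ...} and {/name}; group 1 is "/" for a closing tag
TOKEN = re.compile(r"\{(/?)(\w+)[^{}]*\}")


@dataclass
class Block:
    name: str
    children: list = field(default_factory=list)  # text runs and Blocks


def parse_blocks(source):
    root = Block("root")
    stack = [root]
    pos = 0
    for match in TOKEN.finditer(source):
        if match.start() > pos:
            # plain text between two tags belongs to the innermost open block
            stack[-1].children.append(source[pos:match.start()])
        closing, name = match.group(1), match.group(2)
        if closing:
            if len(stack) == 1 or stack[-1].name != name:
                raise ValueError(f"unexpected {{/{name}}}")
            stack.pop()
        else:
            block = Block(name)
            stack[-1].children.append(block)
            stack.append(block)
        pos = match.end()
    if len(stack) > 1:
        raise ValueError(f"unclosed {{{stack[-1].name}}}")
    if pos < len(source):
        root.children.append(source[pos:])
    return root
```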


5) We need to decide if we want:
A) No velocity block but document properties/metadata to tell xwiki to render the page using velocity. A user putting velocity code in a page will have to check a box somewhere to say that this is a velocity page.
Pros:
* Slightly easier to enter velocity code

Cons:
* Exception case compared to groovy, macros, etc
* User must not forget to check the checkbox saying the page contains velocity code
OR

B) Velocity blocks same as what exists for groovy/macros/html, namely:

{velocity}
... velocity code with wiki syntax allowed
{/velocity}
Note1: For B) we would allow putting wiki syntax inside the velocity block. Technically we'll apply the velocity rendering and then re-apply wikimodel on the result.
Note2: For backward compatibility we can have a config flag (xwiki.compatibility = 1) that automatically adds the {velocity}{/velocity} marker around the whole page. The only downside is that it'll be as slow as it currently is (actually it'll be faster since wikimodel is going to be faster than radeox)
Pros:
* Speed. Since we know the blocks that use velocity we can cache all the wiki syntax not inside the velocity/macros/groovy blocks which will speed up considerably the rendering of pages
* Consistency with macros and groovy blocks.

My preference goes to B) and I'm proposing to use that.

My +1 for B)
6) Mikhail is going to add support for recognizing XML tags in the wikimodel parser (so that onOpenXmlTag()/onCloseXmlTag() events are called in listeners). This is needed for point 7 below.
Hmm... Yes, I said that I'll do it. Technically it is simple (simpler than to add "macro" blocks). But conceptually it breaks everything.
Explanations: Imagine that the listener already has the onOpenXmlTag(String name, WikiParameters params)/onCloseXmlTag(String name) methods.
Then the text:
----------------------
<table>...

This is a content of the table

...</table>
----------------------

will be reported as follows:
----------------------
- onBeginParagraph
- onOpenXmlTag: => "table"
- onEndParagraph

- onBeginParagraph
- onWord/onSpace => "This is a content of the table"
- onEndParagraph

- onBeginParagraph
- onCloseXmlTag: => "table"
- onEndParagraph
----------------------
It means that the calls onOpenXmlTag/onCloseXmlTag will cross the borders of multiple wiki paragraphs. And this is a BIG problem.
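
The trace above can be reproduced with a toy tokenizer, sketched here in Python. This is not the WikiModel parser: parse_events and tags_crossing_paragraphs are hypothetical names, paragraphs are simply split on blank lines, and a single onText event stands in for the onWord/onSpace events.

```python
import re

TAG = re.compile(r"<(/?)(\w+)[^<>]*>")


def parse_events(text):
    """Emit listener-style events, one wiki paragraph per blank-line chunk."""
    events = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        events.append(("onBeginParagraph", None))
        for piece in re.split(r"(<[^<>]+>)", paragraph):
            match = TAG.fullmatch(piece)
            if match:
                kind = "onCloseXmlTag" if match.group(1) else "onOpenXmlTag"
                events.append((kind, match.group(2)))
            elif piece.strip():
                events.append(("onText", piece.strip()))
        events.append(("onEndParagraph", None))
    return events


def tags_crossing_paragraphs(events):
    """Names of tags opened in one paragraph and closed in another."""
    open_tags = []  # (tag name, number of the paragraph it was opened in)
    crossing = []
    paragraph = 0
    for kind, value in events:
        if kind == "onBeginParagraph":
            paragraph += 1
        elif kind == "onOpenXmlTag":
            open_tags.append((value, paragraph))
        elif kind == "onCloseXmlTag" and open_tags:
            name, opened_in = open_tags.pop()
            if opened_in != paragraph:
                crossing.append(name)
    return crossing
```

For the table example, the open and close events land in the first and third paragraphs, so "table" is reported as crossing a paragraph border.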
It means that we have to choose one of the following:
(A) Report HTML tags "as is" inside of wiki structural elements; In the example above opening and closing "table" tags are reported in two separate wiki paragraphs.
  pro: It is the simplest solution.
  con: if somebody wants to treat these elements and create a well-formed document then it is up to him/her to do it by hand and to ignore non-appropriate structural wiki elements;
(B) Ignore wiki formatting inside of XML tags. In the example above all wiki paragraphs will be skipped and only HTML "table" tags will be taken into account; in this case WikiModel cannot guarantee that each opening element was really closed.
  pro: It is doable.
  con: a) It breaks completely the idea of the WikiModel - to give access to a well-formed structure of wiki documents; b) the grammar will be bigger; c) It is not so simple to implement
(C) Add some HTML tags as markup elements for the CommonSyntax. In this case each '<table>...</table>' tag pair will be interpreted in the same way as normal wiki tables. The same for "ol/ul/li/dd/dl/dt/p/span/div/..." elements.
  pro: you can mix your wiki and HTML markup with the same meaning and all of them will be reported in the same way to the listeners.
  con: a) the grammar will be bigger; b) I have to do much more additional validations of documents by hand to guarantee that the document is well-formed; c) the parsing will be much slower (as the consequence of a and b); d) it is much more difficult to implement; e) the number of possible HTML elements has to be fixed in advance
From my point of view none of these options is good.
Another approach to resolving this problem: see below, in the response to the next question.
7) We need to allow intermixing velocity/HTML and wiki syntax easily. For this our listener (the code that listens to {velocity} events) will evaluate the content using velocity and will call wikimodel again on the resulting code. Since wikimodel requires HTML to be in a block ({html} for us) we'll use a different wikimodel listener that intercepts the onOpenXmlTag/onCloseXmlTag so that it'll output XML tags with no modifications (the standard HTML listener generates &lt; and &gt; for < and >). This will allow writing:
{velocity}
<div>
<h1>Hello $customer.Name!</h1>
<table>
#foreach( $mud in $mudsOnSpecial )
   #if ( $customer.hasPurchased($mud) )
      <tr>
        <td>
          * [link>$flogger.getPromo( $mud )]
        </td>
      </tr>
   #end
#end
</table>
</div>
{/velocity}
I propose to interpret the content of such blocks in the following manner:
- The given text is interpreted as a "relaxed-XML" where all opening XML tags have to be closed (or a tag has to be empty)
- If an XML tag has a text content (not only spaces) then this content is interpreted as a wiki syntax. In this case the wiki content can be handled by a normal wiki parser.
In this case the scenario of usage can be as follows:
1. You parse your initial wiki document and create a backbone structure of the document containing macro blocks
2. All macro blocks corresponding to template blocks are handled by the corresponding engine (i.e. velocity)
3. The output from template engines is used as an input for such a "relaxed-XML" parser
4. Any non-empty text content of tags is interpreted as wiki content using separate wiki parser instances; the content formed by these wiki blocks should be inserted in the document from step 3.
pro:
* It seems that it is the simplest solution and it resolves problems with inter-mixing of the wiki syntax/XML/HTML from the previous point (if such inter-mixing is available only in macro blocks).
* For such a "relaxed-XML" content existing XML parsers can be used. It is possible just to add the "<xml>...</xml>" pair around the content and use directly a normal XML parser. But there is a risk that this document is not a well-formed XML. Or an HTML cleaner (JTidy/NekoHTML/...) can be used to get a well-formed XHTML before the XML parsing.
* All steps give well-formed structures
con: More parsers in the chain of the page treatment => the treatment is slower (this can be partially compensated by additional caches in the future)
Personally I definitely prefer this approach. And from my point of view it resolves the problems with onOpenXmlTag/onCloseXmlTag mentioned above.
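
Steps 2 to 4 of the scenario above, including the well-formedness risk, can be sketched as follows. This is Python for brevity and rests on assumptions: string.Template merely stands in for Velocity, the function names are hypothetical, and step 4 only collects the text runs instead of calling a real wiki parser.

```python
import xml.etree.ElementTree as ET
from string import Template


def run_template(source, context):
    """Step 2 stand-in: string.Template plays the role of Velocity here."""
    return Template(source).substitute(context)


def parse_relaxed_xml(fragment):
    """Step 3: wrap in <xml>...</xml>; None means 'not well-formed'."""
    try:
        return ET.fromstring("<xml>" + fragment + "</xml>")
    except ET.ParseError:
        return None


def wiki_text_runs(root):
    """Step 4: the non-blank text runs that would go to the wiki parser."""
    return [text.strip() for text in root.itertext() if text.strip()]
```

A None result from parse_relaxed_xml is the case where an HTML cleaner would have to run first, as suggested in the second "pro" item.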
8) Documents are stored in textual format in the DB (i.e. as the user sees them). Portions of them will be cached after they're rendered for the first time (see option B above for the best caching option).
Are you ok on these points and especially about using the 5B solution?

Anything else I've forgotten?

If we agree, then my next steps are:
* Understand the wikimodel API in more detail
* Understand the doxia API too (it's a "competitor" to wikimodel). The reason is that I'd like to see two implementations to ensure that the XWiki Interfaces can be implemented using different implementations so that XWiki is independent of the underlying rendering/parsing framework used. Jason Van Zyl is also interested in implementing the doxia part for XWiki in the future.
I think that in any case it is a good idea to have a specialized interface (API) for each type of functionality to isolate the core of the system from external libraries/frameworks.
BTW: Thank you for the pointer on Doxia! I will look at it in more detail. After a brief look I saw that WikiModel contains modules very similar to the "Sink" object of Doxia. I think that it is very simple to create a WikiModel/Doxia bridge. And it seems possible to add the APT (http://maven.apache.org/doxia/references/apt-format.html) format support directly to the WikiModel.
Best regards
Mikhail

* Propose a XWiki API
* Propose an integration path
* Implement it using WikiModel

Thanks
-Vincent

_______________________________________________
devs mailing list
devs@xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs
