And point 2 is exactly what I want to achieve. :) In my frontend I use an XML-to-object mapper (JibX), which tries to parse the content inside the tekst element. I only want to write the String value of this tekst element out in my response to the client. I guess I have to handle this on the JibX side, with a special deserializer on the tekst element, because the other points are very valid. :)

Thanks,

Nick S.

PS: and no, I am not storing XML as CDATA, I am storing HTML as CDATA ;), so that the document is valid XML according to the attached XSD document (which describes the tekst element as xs:string).

Ard Schrijvers wrote:
So you are storing <![CDATA ]]> instead of xml?? I would certainly discourage 
you from doing so!

1) You can no longer set extractors on xpaths within the "xml/html" inside 
<![CDATA ]]>, so you cannot, for example, extract links to other documents

2) In the frontend you have to parse the CDATA to xml if you want to use xslt

3) Lucene will index 'elements' like <html>, <body> etc. within the CDATA, 
because to Lucene it is just text.
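
Point 2 can be illustrated with a minimal sketch that uses only the JDK's DOM parser. The class name CdataReparse and the sample document are made up for illustration and are not part of Hippo CMS or JibX. It shows that the outer parse sees no html element at all, and that a second parse of the string value is needed before XSLT or XPath can work on the markup.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class CdataReparse {

    /** Parses an XML string into a DOM document, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Returns the string value of the first tekst element; CDATA arrives as plain characters. */
    static String tekstValue(Document doc) {
        return doc.getElementsByTagName("tekst").item(0).getTextContent();
    }

    public static void main(String[] args) {
        String stored = "<document><content><section>"
                + "<tekst><![CDATA[<html><body><p>Hello</p></body></html>]]></tekst>"
                + "</section></content></document>";

        Document outer = parse(stored);
        // The outer parse finds no html element, only character data inside tekst.
        System.out.println(outer.getElementsByTagName("html").getLength());

        // Second parse of the string value. It only succeeds when the stored
        // HTML happens to be well-formed XML.
        Document inner = parse(tekstValue(outer));
        System.out.println(inner.getDocumentElement().getNodeName());
    }
}
```

If the frontend only needs to echo the HTML to the client, the string returned by tekstValue is already enough and the second parse can be skipped.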

So, I do not know what you want to achieve, but you are heading for problems if 
you ask me...

Ard

ps sorry for the missing indentation, I am using webmail...


The document xml will change from
<document>
    <meta>
       ....
    </meta>
    <content>
       <section>
          <tekst><html><body>....</body></html></tekst>
       </section>
    </content>
</document>

to

<document>
    <meta>
       ....
    </meta>
    <content>
       <section>
          <tekst><![CDATA[<html><body>....</body></html>]]></tekst>
       </section>
    </content>
</document>

Only the contents of the tekst node will change.

With regards,

Nick Stolwijk

Ard Schrijvers wrote:
ps sorry for the late reaction...
pps (the document xml does not change in your suggested part, does it?). Lucene 
just indexes all xml, regardless of its structure

Ard


Anyone?

Nick Stolwijk wrote:
Can someone confirm whether this will interfere with the Lucene indexer?

regards,

Nick Stolwijk

Dennis Dam wrote:
Hi Nick,

that's possible, but you will have to implement a custom binding class for that. The default binding class is

nl.hippo.cocoon.forms.binding.HTMLAreaBinding

You can use that class as a reference. If I understand correctly, in the doLoad() method you have to load the CDATA element from the source file and assign its string value to the context node. In the doSave() method you have to create a CDATA element, serialize the input DOM element to a string, and assign that string to the CDATA element.
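
The DOM side of those two steps could look roughly like the sketch below. It uses only JDK classes and deliberately leaves out the Cocoon Forms binding API, so the class CdataRoundTrip and its load and save methods are hypothetical stand-ins for what doLoad() and doSave() would delegate to, not the actual HTMLAreaBinding code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class CdataRoundTrip {

    /** Parses an XML string into a DOM document, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Load direction: the string value of the target element, CDATA content included. */
    static String load(Element target) {
        return target.getTextContent();
    }

    /** Save direction: serialize the edited element and store it as one CDATA section. */
    static void save(Element target, Element edited) {
        try {
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            // Force XML output so the stored markup stays well-formed.
            serializer.setOutputProperty(OutputKeys.METHOD, "xml");
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(edited), new StreamResult(out));

            // Replace whatever the target held before with the new CDATA section.
            while (target.hasChildNodes()) {
                target.removeChild(target.getFirstChild());
            }
            target.appendChild(
                    target.getOwnerDocument().createCDATASection(out.toString()));
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize element", e);
        }
    }
}
```

One caveat with this approach: a CDATA section ends at the first "]]>", so HTML that contains that sequence cannot be stored in a single section without extra handling.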

One potential drawback is that, if you write the html in a CDATA block, text indexing in the repository might go wrong: either the html tags are also indexed (something you won't want), or maybe the entire CDATA block is not indexed. I'm not sure about this though; Ard, can you shed some light on this?

regards,
Dennis

********************************************
Hippocms-dev: Hippo CMS development public mailinglist
