Title: Message
Hi Stefan,
 
There is an open enhancement request to preserve the order in which the string elements appear; you can even find a patch submitted by a Castor user in our Bugzilla system.
We plan to implement it in the near future.
 
Hope that helps,

Arnaud
-----Original Message-----
From: Stefan H Westlund [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 30, 2003 10:31 AM
To: [EMAIL PROTECTED]
Subject: [castor-dev] Schema for a weird document

Hello!
I have an XML document that I don't know how to create a valid schema for.
 
                <info>
                 Om
                 <emph type="italic">vacker</emph>
                 syftar på
                 <emph type="italic">det höga landet</emph>
                 är det kongruensfel
                </info>
 
Note the String objects between the emph-tags.
 
I thought of something like this:
 
                <xs:element name="info" minOccurs="1" maxOccurs="1" >
                 <xs:complexType mixed="true" >
                  <xs:sequence>
                   <xs:element name="emph" minOccurs="0" maxOccurs="unbounded" >
                    <xs:complexType mixed="true" >
                     <xs:attribute name="type" type="xs:string" use="required" />
                    </xs:complexType>
                   </xs:element>
                  </xs:sequence>
                 </xs:complexType>
                </xs:element>
 
But as you can see, I will lose the order between the String objects and the Emph objects. The Info class will have a getContent() method returning the whole String "Om syftar på är det kongruensfel." and a sequence of two Emph objects.
 
Is it possible to parse this and keep the positions of the Emph objects between the String objects? The main point is that I want to keep the text inside the Emph objects in place, so the result would be "Om vacker syftar på det höga landet är det kongruensfel".
 
Please note that this is a small part of a bigger document I am trying to parse. The cheapest but somewhat dangerous solution would be to strip the emph tag (and its counterpart </emph>) from the incoming stream. It is dangerous because this annoying emph tag may occur at other places in the document.
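
For illustration only, here is a minimal sketch of reading the mixed content of <info> in document order with the plain JDK DOM API, without stripping tags from the stream. It does not use Castor, and the class name MixedContentReader and the method readInOrder are hypothetical names chosen for this example. It assumes the <info> element is the root of the parsed fragment.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Castor: walks the children of the root
// element and collects text runs and <emph> contents in document order.
public class MixedContentReader {

    public static List<String> readInOrder(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();

            List<String> parts = new ArrayList<>();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.TEXT_NODE
                        || n.getNodeType() == Node.CDATA_SECTION_NODE) {
                    // Plain text between the emph elements; skip whitespace-only runs.
                    String text = n.getNodeValue().trim();
                    if (!text.isEmpty()) {
                        parts.add(text);
                    }
                } else if (n.getNodeType() == Node.ELEMENT_NODE
                        && "emph".equals(n.getNodeName())) {
                    // Text inside an emph element, kept at its original position.
                    parts.add(n.getTextContent().trim());
                }
            }
            return parts;
        } catch (Exception e) {
            // Wrap parser errors so callers need no checked-exception handling.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<info> Om <emph type=\"italic\">vacker</emph> syftar på "
                + "<emph type=\"italic\">det höga landet</emph> är det kongruensfel </info>";
        // Joins the ordered parts with single spaces.
        System.out.println(String.join(" ", readInOrder(xml)));
    }
}
```

Because only emph elements that are direct children of the root are read, emph tags elsewhere in a larger document are left untouched, which avoids the risk of filtering the raw stream.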
 
Best Regards
/Stefan H Westlund
 
PS. Don't blame me for writing this weird document style, it was done before my time :-) DS.
