Re: Using grammarpool with included schemas

Dick Deneer Fri, 07 Jul 2006 11:49:38 -0700

Thanks for the explanation.

I am realizing that my perspective on schemas and XML was wrong.

I always took the approach where the schema was leading: "this is my schema, so come up with the XML and I will validate it". This way I thought I would have maximum control and a simple solution.

But in fact I was swimming against the stream, because the XML itself determines the way it should be handled. You can also see this in tools like XMLSpy, which let you change the XML source to validate against another schema or DTD.

So I will take another approach, one that is more in line with Xerces.

Today I have tested a lot with the Xerces SAXParser and an EntityResolver2.

And I have a simple question.

I am putting in this XML:

<purchaseOrder orderDate="1999-10-20" xmlns="http://tempuri.org/po.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://tempuri.org/po.xsd C:\Temp\Schemas\apo.xsd">

The EntityResolver2 gets the following callbacks:

getExternalSubset (name purchaseOrder baseURI null)

resolveEntity name null publicId null baseURI null systemId C:\Temp\Schemas\apo.xsd

getExternalSubset name xs:schema baseURI file:///C:/Temp/Schemas/apo.xsd

So getExternalSubset is always called; the second time it comes from the loaded schema.

Also, the baseURI is null in the first call because I did not specify a systemId on the XML InputSource.

This is all clear to me.

But I wonder about the values passed to resolveEntity. I thought I would also have the namespace http://tempuri.org/po.xsd available in one of the parameters (I supposed in name or publicId).

But I only get the systemId. It seems to me that the namespace is a very relevant value for determining the right InputSource to give back.

Also the doc says about the name:

name - Identifies the external entity being resolved. Either "[dtd]" for the external subset, or a name starting with "%" to indicate a parameter entity, or else the name of a general entity. This is never null when invoked by a SAX2 parser.

Am I missing something?

And can I just open a URL connection to the given systemId to check whether the parser will be able to resolve the entity? Or should I combine it with the baseURI if that is not null?
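
As an illustration of the last question, here is a minimal sketch using only the JDK's org.xml.sax.ext.EntityResolver2 interface. Its documentation describes baseURI as the URI against which relative system identifiers are interpreted, so combining the two whenever baseURI is not null looks like the safer reading. The class name LoggingResolver and the helper absolutize are hypothetical, and the fallback for strings that are not valid URIs (such as the Windows path above) is an assumption of this sketch, not a description of what Xerces does internally.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.ext.EntityResolver2;

/** Hypothetical resolver: records each callback and absolutizes the systemId. */
class LoggingResolver implements EntityResolver2 {

    /** Callbacks seen so far, kept for inspection. */
    final List<String> calls = new ArrayList<>();

    @Override
    public InputSource getExternalSubset(String name, String baseURI) {
        calls.add("getExternalSubset name=" + name + " baseURI=" + baseURI);
        return null; // this sketch never supplies an external subset
    }

    @Override
    public InputSource resolveEntity(String name, String publicId,
                                     String baseURI, String systemId)
            throws SAXException, IOException {
        calls.add("resolveEntity name=" + name + " publicId=" + publicId
                + " baseURI=" + baseURI + " systemId=" + systemId);
        String absolute = absolutize(baseURI, systemId);
        // Returning null asks the parser to fall back to its default lookup.
        return absolute == null ? null : new InputSource(absolute);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        // Plain SAX callback without name or baseURI; delegate to the richer one.
        return resolveEntity(null, publicId, null, systemId);
    }

    /** Resolves a relative systemId against baseURI when a base is available. */
    static String absolutize(String baseURI, String systemId) {
        if (systemId == null) {
            return null;
        }
        try {
            URI sys = new URI(systemId);
            if (sys.isAbsolute() || baseURI == null) {
                return systemId;
            }
            return new URI(baseURI).resolve(sys).toString();
        } catch (URISyntaxException e) {
            // Assumption: strings that are not valid URIs (for example a path
            // with backslashes) are passed through unchanged.
            return systemId;
        }
    }
}
```

With a hypothetical base of http://example.org/schemas/top.xsd, absolutize turns the relative systemId include.xsd into http://example.org/schemas/include.xsd. Such a resolver would normally be handed to the parser through XMLReader.setEntityResolver.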

On 7 Jul 2006, at 7:12, Michael Glavassevich wrote:

Hi Dick,

An XSGrammar is a collection of schema components for a given target
namespace. The schema validator will consult the grammar pool once per
namespace, so you only have one shot to return an XSGrammar object from
your pool for each namespace. That doesn't prevent your grammar pool
implementation from doing something clever like merging XSGrammars (see
org.apache.xerces.impl.xs.XSLoaderImpl.XSGrammarMerger [1]) but that may
be a bit difficult for you to do particularly if the schemas aren't
disjoint. I would go the entity resolver route instead of trying that. You
can register an XMLEntityResolver [2] with the XMLGrammarPreparser by
calling the setEntityResolver(XMLEntityResolver) method. The parser's
default behaviour is to open a URLConnection for the location specified.
If you're unable to create an InputStream from the URLConnection in your
entity resolver the parser won't succeed at doing that either.
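
The check described in the last two sentences can be sketched with plain JDK classes: try to open a URLConnection for the location and see whether it yields an InputStream. The names SchemaProbe and canOpen are hypothetical, and the sketch assumes the location is already an absolute URI; the parser may expand a relative or platform-specific path before attempting the same thing, so a false result here is only a rough indication.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLConnection;

/** Hypothetical helper: checks whether a location can be opened as a stream. */
class SchemaProbe {

    /** Returns true if a URLConnection to the location yields an InputStream. */
    static boolean canOpen(String location) {
        try {
            URLConnection conn = new URI(location).toURL().openConnection();
            try (InputStream in = conn.getInputStream()) {
                return true;
            }
        } catch (URISyntaxException | IllegalArgumentException | IOException e) {
            // Not a valid or absolute URI, unknown protocol, missing file,
            // or unreachable host: treat all of these as "cannot be opened".
            return false;
        }
    }
}
```

An entity resolver could call canOpen on the requested location and return its own InputSource only when the result is false, returning null otherwise so that the default lookup proceeds.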

Thanks.

[1]
http://svn.apache.org/viewvc/xerces/java/trunk/src/org/apache/xerces/impl/xs/XSLoaderImpl.java?revision=406145&view=markup
[2]
http://xerces.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/parser/XMLEntityResolver.html

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

Dick Deneer <[EMAIL PROTECTED]> wrote on 07/04/2006 01:59:17 PM:

Hi,

I am using a grammar pool to cache schemas and want the user to point
to a number of schemas that will be added to the grammar pool. Then
the grammar pool is used to validate an XML instance document.
One of the schemas ("top") uses an include of another schema ("sub").
If I can rely on the location mentioned in the include, there is
no problem: just add "top" to the pool and everything is fine.
But when the path in the top schema is invalid, the included schema
cannot be found. My first idea was to let the user add the
subschema to the pool as well. But this has no effect. The result is
that the subschema is just ignored and will not be in the
grammar pool (likely because there is already a
grammar with the same namespace in the pool, namely "top").
Validation will be incomplete because the type definitions
in sub are ignored.

Working in the other direction (first add the sub, then the top
schema) does not work either: only the sub schema will be added
to the grammar pool.

The ideal situation for me would be: just add another schema to the
pool if the parser gives an error that something cannot be found.

Is there no other way than using an entity resolver?
If so: can I detect in the entity resolver whether the schema will
be found by the parser? If not, I have to return my own
InputSource.
Can I also use such an entity resolver when using an XMLGrammarPreparser?

PS
You can easily "replay" the above situations with the XMLGrammarBuilder.java
program that is supplied with Xerces. Using the DOM or SAX parser directly
(without preparsing) with a grammar pool gives the same results.
I used this XML and these schemas:

top.xsd:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:include schemaLocation="include.xsd"/>
  <xsd:element name="myRoot" type="myRootType"/>
  <xsd:complexType name="myRootType">
    <xsd:sequence>
      <xsd:element name="label" type="labelType"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

include.xsd:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="labelType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="01"/>
      <xsd:enumeration value="02"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>

instance.xml:
<?xml version="1.0" encoding="UTF-8"?>
<myRoot>
  <label>028</label>
</myRoot>
