Thanks for the explanation.
I am realizing that my perspective on schemas and XML was wrong.
I always took the approach where the schema was leading: "this is my schema, so come up with XML that matches it and I will validate it". I thought this would give me maximum control and a simple solution.
But in fact I was swimming against the stream, because the XML itself determines how it should be handled. You can also see this in tools like XMLSpy, which let you change the XML source to validate against another schema or DTD.

So I will take another approach, one that is more in line with Xerces.
Today I ran a lot of tests with the Xerces SAXParser and an EntityResolver2.
And I have a simple question.
I am feeding in this XML:
<purchaseOrder orderDate="1999-10-20" xmlns="http://tempuri.org/po.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://tempuri.org/po.xsd C:\Temp\Schemas\apo.xsd">

The EntityResolver2 gets the following callbacks:

getExternalSubset (name purchaseOrder baseURI null)

resolveEntity name null publicId null baseURI null systemId C:\Temp\Schemas\apo.xsd

getExternalSubset  name xs:schema baseURI file:///C:/Temp/Schemas/apo.xsd

So getExternalSubset is always called; the second time it comes from the loaded schema.
Also, the baseURI is null in the first call because I did not specify a systemId on the XML InputSource.
This is all clear to me.

But I wonder about the values passed to resolveEntity. I thought the namespace
http://tempuri.org/po.xsd would also be available in one of the parameters (I assumed in name or publicId).
But I only get the systemId. It seems to me that the namespace is a very relevant value for determining the right InputSource to return.
Also, the doc says this about the name:
name - Identifies the external entity being resolved. Either "[dtd]" for the external subset, or a name starting with "%" to indicate a parameter entity, or else the name of a general entity. This is never null when invoked by a SAX2 parser.
Am I missing something?
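
For comparison: the SAX EntityResolver2.resolveEntity callback only takes name, publicId, baseURI and systemId, so it has no slot for a namespace. A different JDK interface, org.w3c.dom.ls.LSResourceResolver, does take a namespaceURI argument. The sketch below is not the SAXParser setup described above. It assumes the schema can be compiled through javax.xml.validation.SchemaFactory, and it uses a made-up wrapper schema (urn:example:wrapper) that imports the po namespace, plus an empty stand-in for apo.xsd, only to show the namespace arriving in the callback. All class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;

final class NamespaceInResolver {
    static final String PO_NS = "http://tempuri.org/po.xsd";

    // Hypothetical wrapper schema; its schemaLocation deliberately points nowhere.
    static final String WRAPPER =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='urn:example:wrapper'>"
      + "<xsd:import namespace='" + PO_NS + "' schemaLocation='missing/apo.xsd'/>"
      + "</xsd:schema>";

    // Empty stand-in for apo.xsd, served from memory by the resolver.
    static final String PO_STUB =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='" + PO_NS + "'/>";

    /** Compiles WRAPPER and returns every namespaceURI the resolver was asked for. */
    static List<String> namespacesSeen() throws Exception {
        DOMImplementationLS ls = (DOMImplementationLS) DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().getDOMImplementation().getFeature("LS", "3.0");
        List<String> seen = new ArrayList<>();

        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver((type, namespaceURI, publicId, systemId, baseURI) -> {
            seen.add(namespaceURI);
            if (!PO_NS.equals(namespaceURI)) {
                return null; // fall back to the default lookup
            }
            // Choose the source by namespace instead of by location.
            LSInput input = ls.createLSInput();
            input.setStringData(PO_STUB);
            input.setSystemId(systemId);
            input.setBaseURI(baseURI);
            return input;
        });
        factory.newSchema(new StreamSource(new StringReader(WRAPPER)));
        return seen;
    }
}
```

Calling NamespaceInResolver.namespacesSeen() should return a list that contains http://tempuri.org/po.xsd, because the import names that namespace.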


And can I just open a URL connection to the given systemId to check whether the parser will be able to resolve the entity? Or should I combine the systemId with the baseURI if it is not null?
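
The SAX documentation describes baseURI as the URI against which relative system IDs are interpreted, so combining the two looks like the safer choice. A minimal sketch of that combination follows, using only java.net.URI. The class name SystemIds and the special handling of Windows drive paths are my own assumptions, not something taken from Xerces, and the parser's own expansion may differ in details.

```java
import java.net.URI;

final class SystemIds {
    private SystemIds() {}

    /** Expands systemId to an absolute URI where possible, using baseURI for relative ids. */
    static URI expand(String systemId, String baseURI) {
        String normalized = systemId.replace('\\', '/');
        // A Windows drive path such as C:/Temp/x.xsd would otherwise parse
        // as a URI whose scheme is "C", so map it to a file: URI first.
        if (normalized.matches("[A-Za-z]:/.*")) {
            return URI.create("file:///" + normalized);
        }
        // URI.create throws IllegalArgumentException for ids that are not
        // valid URI syntax (for example ids containing spaces).
        URI uri = URI.create(normalized);
        if (uri.isAbsolute() || baseURI == null) {
            return uri; // nothing to combine with
        }
        return URI.create(baseURI).resolve(uri);
    }
}
```

With the values from the callbacks above, SystemIds.expand("C:\\Temp\\Schemas\\apo.xsd", null) gives a file: URI for that path, and a relative id such as include.xsd with baseURI file:///C:/Temp/Schemas/apo.xsd ends up in the same directory as apo.xsd.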



On 7 Jul 2006, at 7:12, Michael Glavassevich wrote:

Hi Dick,

An XSGrammar is a collection of schema components for a given target 
namespace. The schema validator will consult the grammar pool once per 
namespace, so you only have one shot to return an XSGrammar object from 
your pool for each namespace. That doesn't prevent your grammar pool 
implementation from doing something clever like merging XSGrammars (see 
org.apache.xerces.impl.xs.XSLoaderImpl.XSGrammarMerger [1]) but that may 
be a bit difficult for you to do particularly if the schemas aren't 
disjoint. I would go the entity resolver route instead of trying that. You 
can register an XMLEntityResolver [2] with the XMLGrammarPreparser by 
calling the setEntityResolver(XMLEntityResolver) method. The parser's 
default behaviour is to open a URLConnection for the location specified. 
If you're unable to create an InputStream from the URLConnection in your 
entity resolver the parser won't succeed at doing that either.
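
A rough sketch of that probing idea, written against the SAX EntityResolver2 interface rather than the XNI XMLEntityResolver mentioned above: try to open the location, return null when that works so the parser does its default lookup, and otherwise hand back a user-supplied replacement. The class name FallbackResolver and the file-name-to-location map are assumptions made for illustration only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.Map;
import org.xml.sax.InputSource;
import org.xml.sax.ext.EntityResolver2;

final class FallbackResolver implements EntityResolver2 {
    // Hypothetical user-supplied table: file name -> replacement location.
    private final Map<String, String> fallbacks;

    FallbackResolver(Map<String, String> fallbacks) {
        this.fallbacks = fallbacks;
    }

    /** Returns true if a stream can be opened for the location. */
    static boolean canOpen(String location) {
        try (InputStream in = URI.create(location).toURL().openStream()) {
            return true;
        } catch (IOException | IllegalArgumentException e) {
            return false; // unreachable, unknown protocol, or not a usable URI
        }
    }

    @Override
    public InputSource resolveEntity(String name, String publicId,
                                     String baseURI, String systemId) {
        String location = systemId;
        try {
            if (baseURI != null) {
                location = URI.create(baseURI).resolve(systemId).toString();
            }
        } catch (IllegalArgumentException e) {
            // Not valid URI syntax; probe the system id as given.
        }
        if (canOpen(location)) {
            return null; // let the parser open it itself
        }
        int slash = Math.max(systemId.lastIndexOf('/'), systemId.lastIndexOf('\\'));
        String replacement = fallbacks.get(systemId.substring(slash + 1));
        return replacement == null ? null : new InputSource(replacement);
    }

    @Override
    public InputSource getExternalSubset(String name, String baseURI) {
        return null; // no external subset is supplied
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        return resolveEntity(null, publicId, null, systemId);
    }
}
```

Note that the probe opens the resource once and the parser opens it again afterwards, so for remote locations this doubles the network traffic.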

Thanks.

[1] 
[2] 

Michael Glavassevich
XML Parser Development
IBM Toronto Lab

Dick Deneer <[EMAIL PROTECTED]> wrote on 07/04/2006 01:59:17 PM:

Hi,

I am using a grammar pool to cache schemas and want the user to point
to a number of schemas that will be added to the grammar pool. Then
the grammar pool is used to validate an XML instance document.
One of the schemas (= top) uses an include of another schema (= sub).
If I can rely on the location mentioned in the include there is
no problem. Just add "top" to the pool and everything is fine.
But when the path in the top schema is invalid, the included schema
cannot be found. My first idea was to just let the user add the
subschema to the pool as well. But this has no effect. The result is
that the subschema is just ignored and will not be in the
grammar pool (likely because there is already a
grammar with the same namespace in the pool, namely "top").
Validation will be incomplete because the type definitions in
sub are ignored.

Working in the other direction (first add the sub, then the top
schema) does not work either: only the sub schema will be added
to the grammar pool.

The ideal situation for me would be: just add another schema to the 
pool if the parser gives an error that something cannot be found.

Is there no other way than using an entity resolver?
If so: can I detect in the entity resolver whether the schema will or
will not be found by the parser? If not, I have to return my own
InputSource.
Can I also use such an entity resolver when using an XMLGrammarPreparser?

PS
You can easily "replay" the above situations with
the XMLGrammarBuilder.java
program that is supplied with Xerces. But using the DOM or SAX parser
directly (without preparsing) with a grammar pool will give the
same results.
I used this XML and these schemas:

top.xsd:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:include schemaLocation="include.xsd"/>
<xsd:element name="myRoot" type="myRootType"/>
<xsd:complexType name="myRootType">
<xsd:sequence>
<xsd:element name="label" type="labelType"/>
</xsd:sequence>
</xsd:complexType>
</xsd:schema>

include.xsd:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:simpleType name="labelType">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="01"/>
<xsd:enumeration value="02"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:schema>

instance.xml:
<?xml version="1.0" encoding="UTF-8"?>
<myRoot>
    <label>028</label>
</myRoot>
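
One way to replay the broken include without a grammar pool is sketched here. It uses the JAXP validation API bundled with the JDK instead of the Xerces XMLGrammarPreparser, so it is a stand-in for the setup above and not the same code path. The strings mirror top.xsd and include.xsd, the resolver serves include.xsd from memory as if the path in the include could not be found, and the class and method names are made up.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.xml.sax.SAXException;

final class IncludeReplay {
    static final String XSD = "http://www.w3.org/2001/XMLSchema";

    static final String TOP =
        "<xsd:schema xmlns:xsd='" + XSD + "'>"
      + "<xsd:include schemaLocation='include.xsd'/>"
      + "<xsd:element name='myRoot' type='myRootType'/>"
      + "<xsd:complexType name='myRootType'><xsd:sequence>"
      + "<xsd:element name='label' type='labelType'/>"
      + "</xsd:sequence></xsd:complexType>"
      + "</xsd:schema>";

    static final String INCLUDE =
        "<xsd:schema xmlns:xsd='" + XSD + "'>"
      + "<xsd:simpleType name='labelType'><xsd:restriction base='xsd:string'>"
      + "<xsd:enumeration value='01'/><xsd:enumeration value='02'/>"
      + "</xsd:restriction></xsd:simpleType>"
      + "</xsd:schema>";

    /** Compiles TOP, answering the include from memory. */
    static Schema compile() throws Exception {
        DOMImplementationLS ls = (DOMImplementationLS) DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().getDOMImplementation().getFeature("LS", "3.0");
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver((type, namespaceURI, publicId, systemId, baseURI) -> {
            if (systemId == null || !systemId.endsWith("include.xsd")) {
                return null; // default lookup for anything else
            }
            LSInput input = ls.createLSInput();
            input.setStringData(INCLUDE);
            input.setSystemId(systemId);
            input.setBaseURI(baseURI);
            return input;
        });
        return factory.newSchema(new StreamSource(new StringReader(TOP)));
    }

    /** Returns null if the document is valid, otherwise the first error message. */
    static String validate(String xml) throws Exception {
        try {
            compile().newValidator().validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }
}
```

With the include resolved, IncludeReplay.validate("<myRoot><label>028</label></myRoot>") should report an error, since 028 is not one of the enumerated values, while a label of 01 should pass.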
