Hi Michael,
Thanks for responding.
I think that Xerces should provide an option/feature to *also* validate
the main document, with XInclude expanded, against the main DTD. I
imagine this can be done by adding another DTD validator component to
the pipeline after the XInclude processing.
This would help, for instance, DocBook users who want to validate a
document with XInclude and see whether it is DocBook valid or not.
Currently, knowing that each document is valid against its own DTD does
not imply that the main document with XInclude expanded is valid against
the DocBook DTD.
I do not have problems with how schema validation plus XInclude work
right now.
It would be great if the last part could be fixed in 2.7.1. I just added
a Jira entry for that:
http://issues.apache.org/jira/browse/XERCESJ-1089
Best Regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
Michael Glavassevich wrote:
George Cristian Bina <[EMAIL PROTECTED]> wrote on 07/21/2005 02:30:12 PM:
Hi,
Here are a couple of things I noticed with the Xerces XInclude support
wrt validation.
Let's assume a simple scenario, one document main.xml that includes a
document fragment.xml.
If we run a validation without setting the schema feature and both
main.xml and fragment.xml specify a DTD then
* main.xml is validated against its DTD, let that be main.dtd
* fragment.xml is validated against its DTD, let that be fragment.dtd
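For reference, the first scenario (DTD validation on, schema feature not set, XInclude enabled) might be configured through JAXP roughly as follows. This is only a sketch: the class name DtdXIncludeConfig is made up for illustration, and it uses the standard JAXP 1.3 factory methods rather than any Xerces-specific API.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper class, not part of Xerces or of the original message.
public class DtdXIncludeConfig {

    // Returns a factory with DTD validation and XInclude processing enabled.
    // The schema feature is deliberately left at its default (off).
    public static DocumentBuilderFactory newFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);  // XInclude processing needs namespaces
        factory.setXIncludeAware(true);   // expand xi:include elements
        factory.setValidating(true);      // DTD validation
        return factory;
    }
}
```

A builder obtained from newFactory().newDocumentBuilder() would then parse main.xml; registering an org.xml.sax.ErrorHandler on the builder is the usual way to see the validation errors.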
From reading through your e-mail, I guess what you may be wondering here
is why DTD validation isn't being performed on the result infoset. If
that's the case consider the following document (if not you can ignore the
rest of this section):
<?xml version="1.0"?>
<!DOCTYPE xi:include [
<!ELEMENT xi:include (xi:fallback?)>
<!ATTLIST xi:include
xmlns:xi CDATA #FIXED "http://www.w3.org/2001/XInclude"
href CDATA #IMPLIED
parse (xml|text) "xml"
xpointer CDATA #IMPLIED
encoding CDATA #IMPLIED
accept CDATA #IMPLIED
accept-language CDATA #IMPLIED>
<!ELEMENT xi:fallback ANY>
<!ATTLIST xi:fallback
xmlns:xi CDATA #FIXED "http://www.w3.org/2001/XInclude">
]>
<xi:include href="fragment.xml"/>
If the DTD validator is placed after the XInclude processor, then the
xmlns:xi attribute will not yet have been defaulted into the document,
meaning the prefix xi will not have been bound to
"http://www.w3.org/2001/XInclude" when the XInclude processor sees the
xi:include element. The infoset being passed to the XInclude processor
here isn't complete. Though one can imagine how to do DTD validation on
the result of an XInclude, you really can't decouple DTD related
processing from the parsing process.
If we run a validation and specify the schema feature and both main.xml
and fragment.xml specify an XML schema then
* the content of main.xml with xi:include replaced with the content of
fragment.xml is validated against the XML Schema specified in main.xml,
let that be main.xsd
* fragment.xml is *not* validated against its schema, let that be
fragment.xsd
What if you don't have a schema for fragment.xml or don't want to validate
fragment.xml? What if fragment.xml contains xi:include elements, do you
want the parser to validate before or after inclusion or both? What if you
have a fragment2.xml which main.xml includes and you don't want to
validate it but you still want to validate fragment.xml? The schema
feature is a boolean so it can't handle these preferences. We did think
about how to go about supporting these combinations [1] but we don't have
a solution yet. An XML pipelining language might help.
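To illustrate why a single boolean cannot express these preferences, here is a rough sketch of how the schema feature is typically switched on. The class name SchemaFeatureConfig is hypothetical; the feature identifier is the Xerces schema validation feature. Whatever value is passed applies to the parser as a whole, so there is no place to say "validate fragment.xml but not fragment2.xml".

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper class, for illustration only.
public class SchemaFeatureConfig {

    // Xerces feature identifier for XML Schema validation.
    public static final String SCHEMA_FEATURE =
            "http://apache.org/xml/features/validation/schema";

    // The feature is a plain on/off switch for the whole parser;
    // it carries no per-document or per-inclusion settings.
    public static DocumentBuilderFactory newFactory(boolean schemaValidation)
            throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setXIncludeAware(true);
        factory.setValidating(true);
        factory.setFeature(SCHEMA_FEATURE, schemaValidation);
        return factory;
    }
}
```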
In a JAXP 1.3 context if a javax.xml.validation.Schema is set on the
parser, validation occurs after any other processing (including XInclude)
so the schema validation behaviour you get when the schema feature is
turned on is consistent with JAXP.
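The JAXP 1.3 setup referred to here might look like the sketch below. The class name JaxpSchemaConfig is an assumption made for illustration; the calls themselves (setSchema, setXIncludeAware) are standard JAXP 1.3 methods.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.validation.Schema;

// Hypothetical helper class, for illustration only.
public class JaxpSchemaConfig {

    // Attaches a precompiled Schema to an XInclude-aware factory.
    // Validation against this Schema runs after the other processing
    // steps, including XInclude, as described in the text.
    public static DocumentBuilderFactory newFactory(Schema schema) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setXIncludeAware(true);
        factory.setSchema(schema);
        return factory;
    }
}
```

The Schema object would normally come from SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).newSchema(...), for example built from main.xsd.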
If we set the schema feature to true and both documents specify a DTD
then:
* main.xml is validated against main.dtd
* fragment.xml is *not* validated
The current behaviour is the result of the fix for Jira bug #843 [2]. To
be consistent with what occurs on the main pipeline, I suppose validation
should occur if the document being processed on the child pipeline has a
DTD and if it doesn't then no errors should be reported for it not having
a DTD grammar. Fixing this behaviour in 2.7.1 is probably doable.
To summarize, the validate action behaves completely differently when
XInclude is enabled: for XML Schema the validation is done on the master
file with XInclude instructions resolved, while for DTD the validation
is performed on each document without expanding the XInclude
instructions.
More problematic, however, seems to be the case when the schema feature
is set to true and the documents specify DTDs. In this case the main
document with XInclude instructions not expanded is validated against
the DTD, but in the XInclude handler the validation is turned off, so
the included documents are not validated at all. In this case I think
the XInclude handler should at least behave in a similar way to the main
pipeline, that is, it should validate the included document against its
DTD.
Any insights on these will be appreciated.
Are there any chances for fixing at least this last part in 2.7.1?
Let me know if I should file a bug on Jira.
Best Regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Thanks.
[1] http://marc.theaimsgroup.com/?l=xerces-j-user&m=112166040310811&w=2
[2] http://issues.apache.org/jira/browse/XERCESJ-843
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]