> On Fri, 2005-08-05 at 17:27 +0200, Eric van der Vlist wrote:
> > W3C XML Schema is quite clear that both are valid... The underlying
> > assumption is that applications will normalize spaces (like the XPath
> > normalize-space function would do) before using any value that is from a
> > datatype that strips spaces.
> Which is about where the Atom group are. Just trying to find some
> text to clarify that underlying assumption.
> The claim
>
>
> >
> > If you agree with this assumption, that means that the value doesn't
> > need to be a valid URI before normalization.... If not, use another
> > type!
> That's where I'm stuck Eric. I don't know where to start looking
> for an alternative type![1]
The only way to define a datatype that preserves all the spaces is to
derive it from xs:string...
If you really want a datatype that checks what's specified in the RFCs
that define the URI syntax, and you also want this datatype to
preserve all spaces, the only solution is thus to apply an xs:pattern
facet to an xs:string.
Now, I don't think I would advise doing so. This syntax is really
complex to check, it can evolve in future versions and well known
so-called URIs do not meet this syntax anyway.
Also, I would think that most errors (such as typos) that can be made in
a URI would still produce syntactically correct URIs, so the
proportion of errors that you'd detect would be low.
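To illustrate that last point, here is a minimal Python sketch. The
pattern and the function name are hypothetical, and the pattern is
deliberately far looser than the RFC 2396 grammar; it only shows that a
typical typo leaves the string syntactically acceptable.

```python
import re

# Hypothetical, deliberately loose pattern: a scheme, a colon, then any
# run of non-whitespace characters. This is NOT the RFC 2396 grammar.
LOOSE_URI = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:\S*")


def looks_like_uri(value: str) -> bool:
    """Return True if the whole string matches the loose pattern."""
    return LOOSE_URI.fullmatch(value) is not None


intended = "http://example.org/index.html"
mistyped = "http://exmaple.org/indx.html"  # two typos, still well formed

print(looks_like_uri(intended))
print(looks_like_uri(mistyped))
```

Both calls report a match, so a syntax check alone would not catch the
mistyped address.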
> >
> > Also note that this isn't the only difference between the RFC and the
> > datatype:
> >
> > The ·lexical space· of anyURI is finite-length character sequences
> > which, when the algorithm defined in Section 5.4 of [XML Linking
> > Language] is applied to them, result in strings which are legal URIs
> > according to [RFC 2396], as amended by [RFC 2732].
>
> Put simply, I don't understand that statement. Sorry Eric.
That means that the goal of xs:anyURI isn't to validate URIs as defined
by the RFC, but to validate strings that, after space normalization and
the transformation defined in the XLink rec, would be valid per the RFC.
> Are spaces included in the character sequence or not?
The leading and trailing spaces are removed and sequences of multiple
spaces are replaced by single spaces before doing the validation.
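As a rough sketch of that normalization (the function name is
hypothetical, and this only approximates the whiteSpace="collapse"
behaviour for space, tab, line feed and carriage return):

```python
def collapse_whitespace(value: str) -> str:
    """Approximate the XML Schema whiteSpace="collapse" behaviour."""
    # Step 1: tab, line feed and carriage return each become a space.
    replaced = value.translate({0x09: " ", 0x0A: " ", 0x0D: " "})
    # Step 2: drop leading/trailing spaces and squeeze runs of spaces
    # into one by discarding the empty pieces produced by the split.
    return " ".join(part for part in replaced.split(" ") if part)


raw = "  http://example.org/a   b \n"
print(repr(collapse_whitespace(raw)))
```

The validation then sees the collapsed string, not the raw one.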
> >
> > Also, well known URIs (including WEBDAV) wouldn't be valid according to the
> > RFC and (all|some)? implementations have relaxed these conditions:
> >
> > http://lists.oasis-open.org/archives/relax-ng/200111/msg00033.html
>
> Interesting thread.
> http://lists.oasis-open.org/archives/relax-ng/200111/msg00039.html
> in particular.
> [1]
> (Ignoring any errors)
> Is the regex there a good approximation to your statement Eric?
> I.e. No spaces.
No regex can check that there are no leading or trailing spaces if the
facet isn't applied to an xs:string datatype.
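A small Python sketch of why, assuming (as in W3C XML Schema) that the
pattern is matched against the value after whitespace normalization.
The pattern and the function names are hypothetical and only stand in
for a "no spaces" regex:

```python
import re

# Hypothetical "no whitespace anywhere" pattern.
NO_WHITESPACE = re.compile(r"\S+")


def collapse(value: str) -> str:
    """Approximate whiteSpace="collapse" for space, tab, LF and CR."""
    return re.sub(r"[ \t\n\r]+", " ", value).strip(" ")


def valid_on_collapsed_type(raw: str) -> bool:
    # The pattern only ever sees the normalized value.
    return NO_WHITESPACE.fullmatch(collapse(raw)) is not None


def valid_on_string_type(raw: str) -> bool:
    # A string-like type preserves whitespace, so the pattern sees it all.
    return NO_WHITESPACE.fullmatch(raw) is not None


padded = "  http://example.org/  "
print(valid_on_collapsed_type(padded))
print(valid_on_string_type(padded))
```

On the collapsing type the padded value is accepted, because the
leading and trailing spaces are gone before the pattern runs; on the
string-like type the same pattern rejects it.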
Eric
--
Curious about Relax NG? Read my book online.
http://books.xmlschemata.org/relaxng/
------------------------------------------------------------------------
Eric van der Vlist http://xmlfr.org http://dyomedea.com
(ISO) RELAX NG ISBN:0-596-00421-4 http://oreilly.com/catalog/relax
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------
