On Sat, 2005-08-06 at 12:36 +0100, Dave Pawson wrote:
> On Sat, 2005-08-06 at 10:58 +0900, MURATA Makoto (FAMILY Given) wrote:
> > Dave> <define name="anyuri">
> > Dave>  <data type="string">
> > Dave>  <param name="pattern">(([a-zA-Z][0-9a-zA-Z
> > Dave> +\-\.]*:)?/{0,2}[0-9a-zA-Z;/?:@&amp;=+$\.\-_!
> > Dave> ~*'()%]+)?(#[0-9a-zA-Z;/?:@&amp;=+$\.\-_!~*'()%]+)?</param>
> > Dave>       </data>
> > Dave> </define>
> >
> > This does not allow non-ASCII characters.  I am very sure that
> > a lot of people want to use non-ASCII characters.
> Noted. Thanks.
>
> Eric also said:
>
> If you really want a datatype that checks what's specified in the RFCs
> that define the URI syntax, and you want this datatype to preserve
> all spaces, the only solution is thus to apply an xs:pattern
> facet on an xs:string.
>
> Now, I don't think I would advise doing so. This syntax is really
> complex to check, it can evolve in future versions and well known
> so-called URIs do not meet this syntax anyway.
>
>
> Summary. The problem as I see it is that Atom wants an element
> content to be a URI (according to the RFC).
> The perceived problem is that an Atom author will get a failure
> (due to leading spaces etc), and not understand why.

What kind of failure would he get? Is that very different from the 404
error he gets if he makes a typo that leaves the URI syntactically valid?

I think people must be aware that not all errors can be detected by a
schema language...

> Validating with msv or jing will confuse the user further.

Note that this space normalization applies not only to xs:anyURI but to
all the WXS datatypes except xs:string and xs:normalizedString (it may
not be what you're expecting, but at least it's consistent). If you think
that this is an issue for xs:anyURI, you'll have the same issue for
other datatypes (aren't you using, for instance, xs:dateTime in Atom?)...
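
To illustrate (a minimal sketch; the element name "updated" is only my
assumption for the example), a RELAX NG grammar using the WXS datatype
library accepts a date with surrounding spaces, because xs:dateTime
collapses whitespace before the value is checked:

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <!-- Hypothetical element, used only for illustration -->
    <element name="updated">
      <!-- xs:dateTime collapses whitespace, so an instance such as
           <updated>  2005-08-06T12:36:00+01:00  </updated>
           is valid against this schema -->
      <data type="dateTime"/>
    </element>
  </start>
</grammar>
```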

> Hence relax NG anyURI is unsuited

This has nothing to do with RELAX NG: the behaviour simply conforms
to W3C XML Schema Part 2 (don't forget that RNG by itself has
no datatype system)!

> for their purpose (as currently worded).
> Using a regex restricted String pattern is unlikely to be
> successful over time.
> Hence the specification text must be used to clarify.
>
> More to the point, relax NG doesn't provide a way to validate
> a URI 'as presented'.

Not quite:
     1. RNG doesn't have direct support for datatyping at all.
     2. W3C XML Schema Part 2 does let you do what you want through an
        xs:pattern facet on the xs:string datatype, and it is usable
        from RNG.
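
As a rough sketch of point 2 (the element name "uri" and the pattern are
placeholders of mine; the pattern only rejects whitespace and is in no
way the RFC grammar), the string is checked exactly as written because
xs:string preserves whitespace:

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <!-- Hypothetical element, used only for illustration -->
    <element name="uri">
      <ref name="strictUri"/>
    </element>
  </start>
  <define name="strictUri">
    <!-- xs:string does not normalize whitespace, so the pattern
         sees leading and trailing spaces -->
    <data type="string">
      <!-- Placeholder: one or more non-whitespace characters.
           A real schema would put its URI expression here. -->
      <param name="pattern">\S+</param>
    </data>
  </define>
</grammar>
```

With this, <uri> http://example.org/</uri> is rejected because of the
leading space, while the same value typed as xs:anyURI would pass.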

> I can see the logic for this, but I
> find it unsatisfactory.

It is precisely because W3C XML Schema Part 2 isn't flawless that the DSDL
working group decided that "DSDL Part 5: Data Type Library Language -
DTLL" was needed (http://dsdl.org/0546.htm).

Help is very welcome :-) ...

> I can't see how schematron can help either.

It can check whether a text node is changed by the normalize-space
function, so it could be used as a complement to an RNG schema.
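
Something along these lines, for instance (a sketch only: I'm assuming
the ISO Schematron namespace and picking atom:id and atom:uri as the
elements to check; adapt both to your schema):

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <ns prefix="atom" uri="http://www.w3.org/2005/Atom"/>
  <pattern>
    <!-- Elements chosen for illustration only -->
    <rule context="atom:id | atom:uri">
      <!-- Fails when normalize-space() would alter the content, i.e.
           on leading, trailing or otherwise non-normalized whitespace -->
      <assert test=". = normalize-space(.)">
        The content of <name/> must not contain leading, trailing
        or repeated whitespace.
      </assert>
    </rule>
  </pattern>
</schema>
```

The RNG schema keeps checking the structure and the datatypes, and this
rule only reports the whitespace that xs:anyURI silently accepts.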

Eric

> regards DaveP.

--
Read me on XML.com.
                                            http://www.xml.com/pub/au/74
------------------------------------------------------------------------
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
(ISO) RELAX NG   ISBN:0-596-00421-4 http://oreilly.com/catalog/relax
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------


