Re: Atom syndication schema

2006-03-19 Thread Martin Duerst


At 18:49 06/03/17, Bjoern Hoehrmann wrote:

> * Martin Duerst wrote:
> > When looking with a microscope, you will find some little
> > differences, because xs:anyURI was described before the IRI
> > spec (RFC 3987) was approved. These differences are:
> >
> > 1) xs:anyURI also allows spaces and a few other ASCII characters
> >    that are not allowed in URIs nor in IRIs (but the IRI spec has
> >    an escape hatch for such cases).
> > 2) The IRI spec contains many more details than the xs:anyURI
> >    description, in particular also some requirements re.
> >    normalization. However, some of the requirements in this
> >    area of the IRI spec may be lowered or removed in the future
> >    because we have received feedback from implementers that
> >    they are difficult to implement.
>
> I agree with Martin that it would be incorrect to use xsd:anyURI here.

Sorry, but I never said that it would be incorrect to use
xsd:anyURI. I personally think that it should be okay to
use xsd:anyURI. The differences are microscopic, and they should
become even smaller, or hopefully go away completely, over time.
It does not make sense to perpetuate minor differences for
something that was and is supposed to be one and the same
thing.

Regards, Martin.



Datatype for IRIs in RELAX NG (was: Re: Atom syndication schema)

2006-03-19 Thread Henri Sivonen


(Discussion started on atom-syntax, but this is a more general RELAX  
NG issue, so cross-posting to rng-users.)


On Mar 19, 2006, at 09:33, Martin Duerst wrote:


> At 18:49 06/03/17, Bjoern Hoehrmann wrote:
>
> > * Martin Duerst wrote:
> > > When looking with a microscope, you will find some little
> > > differences, because xs:anyURI was described before the IRI
> > > spec (RFC 3987) was approved. These differences are:
> > >
> > > 1) xs:anyURI also allows spaces and a few other ASCII characters
> > >    that are not allowed in URIs nor in IRIs (but the IRI spec has
> > >    an escape hatch for such cases).
> > > 2) The IRI spec contains many more details than the xs:anyURI
> > >    description, in particular also some requirements re.
> > >    normalization. However, some of the requirements in this
> > >    area of the IRI spec may be lowered or removed in the future
> > >    because we have received feedback from implementers that
> > >    they are difficult to implement.
> >
> > I agree with Martin that it would be incorrect to use xsd:anyURI
> > here.
>
> Sorry, but I never said that it would be incorrect to use
> xsd:anyURI. I personally think that it should be okay to
> use xsd:anyURI. The differences are microscopic, and they should
> become even smaller, or hopefully go away completely, over time.


I need datatypes for IRIs in general (relative, absolute or just  
fragment identifiers) and for absolute IRIs (possibly with a fragment  
id) in a RELAX NG schema.


Is it really the best practice to use xsd:anyURI and sweep the  
discrepancies under the rug in the hope that future definitions of  
xsd:anyURI change the meaning of the schema later? Can xsd:anyURI be  
augmented with a regexp pattern to restrict spaces and a few other  
ASCII characters in such a way that the resulting datatype  
restriction matches the definition of IRI? Has anyone implemented a  
strictly correct IRI datatype in a Java datatype library (for Jing  
and MSV)?
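
To make the character question concrete, here is a minimal Python sketch (not a RELAX NG pattern and not a full IRI validator). It flags only the printable US-ASCII characters that RFC 3987, section 3.1, lists as not allowed in URIs, plus ASCII control characters. The function name find_disallowed_ascii is made up for illustration, and a string that passes this check can still violate the IRI grammar in other ways.

```python
import re

# Printable US-ASCII characters that RFC 3987 (section 3.1) lists as not
# allowed in URIs: < > " space { } | \ ^ `
# ASCII control characters (U+0000-U+001F and U+007F) are rejected as well.
_DISALLOWED_ASCII = re.compile(r'[\x00-\x20\x7f<>"{}|\\^`]')


def find_disallowed_ascii(value):
    """Return the disallowed ASCII characters found in value, in order."""
    return _DISALLOWED_ASCII.findall(value)


# A space is reported; non-ASCII characters are left alone.
print(find_disallowed_ascii("http://example.org/a b"))
print(find_disallowed_ascii("http://example.org/r\u00e9sum\u00e9"))
```

The same character class could in principle be expressed as a pattern facet on xsd:anyURI, but that would only close the first of the two gaps listed above.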


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [rng-users] Datatype for IRIs in RELAX NG (was: Re: Atom syndication schema)

2006-03-19 Thread John Cowan

Henri Sivonen scripsit:

> Is it really the best practice to use xsd:anyURI and sweep the
> discrepancies under the rug in the hope that future definitions of
> xsd:anyURI change the meaning of the schema later? Can xsd:anyURI be
> augmented with a regexp pattern to restrict spaces and a few other
> ASCII characters in such a way that the resulting datatype
> restriction matches the definition of IRI? Has anyone implemented a
> strictly correct IRI datatype in a Java datatype library (for Jing
> and MSV)?

It's certainly possible to construct a regular expression, a long and complex
one, that will match all IRIs and only IRIs (note that IRI by itself
means absolute IRI with or without fragment identifier).  The question
is whether it's really worth doing so.  If you feel you need it,
by all means go ahead.
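
As a rough illustration of the distinction between an IRI proper and a relative reference, here is a minimal Python sketch. It is not the long expression described above: it uses the component-splitting expression from RFC 3986, Appendix B, which only splits a string into parts and validates nothing, and then checks the scheme against the RFC 3986 scheme syntax. The helper name has_scheme is hypothetical.

```python
import re

# Component-splitting expression from RFC 3986, Appendix B. It splits any
# string into scheme, authority, path, query and fragment; it does not
# check that the parts are well formed.
_SPLIT = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?'
)

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )   (RFC 3986, 3.1)
_SCHEME = re.compile(r'[A-Za-z][A-Za-z0-9+.\-]*')


def has_scheme(reference):
    """Return True if reference begins with a syntactically valid scheme."""
    scheme = _SPLIT.match(reference).group(2)
    return scheme is not None and _SCHEME.fullmatch(scheme) is not None


print(has_scheme("http://example.org/caf\u00e9#frag"))  # has a scheme
print(has_scheme("../images/logo.png"))                 # relative reference
print(has_scheme("#section"))                           # fragment only
```

A reference with a scheme is only a candidate for the stricter type; checking the remaining components is where the expression gets long.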

-- 
LEAR: Dost thou call me fool, boy?      John Cowan
FOOL: All thy other titles              http://www.ccil.org/~cowan
             thou hast given away:      [EMAIL PROTECTED]
      That thou wast born with.         http://www.ap.org



Re: Datatype for IRIs in RELAX NG

2006-03-19 Thread Elliotte Harold


I would recommend against using xsd:anyURI for IRIs. A URI is much more 
restrictive than an IRI, and one of the easiest things for a schema 
validator to check about an xsd:anyURI is that it only contains 
URI-legal ASCII characters. I think a new type is necessary if you do 
want to allow IRIs instead of simple URIs. I suspect you could do it 
with a regular expression but the syntax would be really hairy.
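
The gap between the two character repertoires shows up when an IRI is mapped to a URI. Here is a minimal Python sketch, assuming a simplified form of the mapping in RFC 3987, section 3.1, in which each non-ASCII character is percent-encoded as UTF-8. It skips normalization and any special handling of host names, and iri_to_uri is a hypothetical helper name.

```python
def iri_to_uri(iri):
    """Percent-encode every non-ASCII character of iri as UTF-8 octets.

    Simplified sketch: ASCII characters are copied through unchanged, so
    existing percent-escapes and reserved characters are left as they are.
    """
    parts = []
    for ch in iri:
        if ord(ch) < 0x80:
            parts.append(ch)
        else:
            parts.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(parts)


print(iri_to_uri("http://example.org/r\u00e9sum\u00e9"))
```

A validator that accepts only URI-legal ASCII characters would reject the input above but accept the output.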



--
Elliotte Rusty Harold  [EMAIL PROTECTED]
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim



Re: Datatype for IRIs in RELAX NG

2006-03-19 Thread Bjoern Hoehrmann

* Elliotte Harold wrote:
> I would recommend against using xsd:anyURI for IRIs. A URI is much more
> restrictive than an IRI, and one of the easiest things for a schema
> validator to check about an xsd:anyURI is that it only contains
> URI-legal ASCII characters. I think a new type is necessary if you do
> want to allow IRIs instead of simple URIs. I suspect you could do it
> with a regular expression but the syntax would be really hairy.

In XML Schema 1.1, every xsd:string is also a valid xsd:anyURI.
-- 
Björn Höhrmann · mailto:[EMAIL PROTECTED] · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/