Re: Datatype for IRIs in RELAX NG

2006-03-21 Thread Martin Duerst


At 02:08 06/03/20, Elliotte Harold wrote:

I would recommend against using xsd:anyURI for IRIs. A URI is much more 
restrictive than an IRI, and one of the easiest things for a schema 
validator to check about an xsd:anyURI is that it only contains URI-legal 
ASCII characters.


This is indeed one of the easiest things, but it would be TOTALLY
wrong.

http://www.w3.org/TR/xmlschema-2/datatypes.html#anyURI says, among other things:

   The mapping from anyURI values to URIs is as defined by the URI reference
   escaping procedure defined in Section 5.4 Locator Attribute of [XML
   Linking Language] (see also Section 8 Character Encoding in URI References
   of [Character Model]). This means that a wide range of internationalized
   resource identifiers can be specified when an anyURI is called for, and
   still be understood as URIs per [RFC 2396], as amended by [RFC 2732],
   where appropriate to identify resources.

If there is confusion in other venues about this issue, please help
to make sure it gets fixed.
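
For illustration, here is a rough Python sketch of the escaping step the
quoted passage refers to, as I read XLink Section 5.4: non-ASCII characters
and the RFC 2396 excluded characters (other than "#", "%", "[" and "]") are
converted to UTF-8 and each byte is written as %HH. The function name
escape_to_uri and the example.org address are made up for this example, and
this is not taken from any validator's source.

```python
# Assumption: ASCII characters excluded by RFC 2396 section 2.4.3, minus
# "#", "%", "[" and "]", which the XLink procedure leaves untouched.
EXCLUDED_ASCII = set('<>"{}|\\^` ')


def escape_to_uri(value: str) -> str:
    """Map an anyURI lexical value to a URI by %HH-escaping disallowed characters."""
    out = []
    for ch in value:
        code = ord(ch)
        # Control characters (below 0x20), DEL and everything non-ASCII
        # (above 0x7E), plus the excluded ASCII set, must be escaped.
        if code < 0x20 or code > 0x7E or ch in EXCLUDED_ASCII:
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


print(escape_to_uri("http://example.org/ros\u00e9"))
```

The point is that a value containing non-ASCII characters is still a
legitimate anyURI; it is only mapped to a URI when one is needed.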


Regards, Martin.



Re: Datatype for IRIs in RELAX NG

2006-03-21 Thread Julian Reschke


Martin Duerst wrote:


At 02:08 06/03/20, Elliotte Harold wrote:
 
 I would recommend against using xsd:anyURI for IRIs. A URI is much 
more restrictive than an IRI, and one of the easiest things for a schema 
validator to check about an xsd:anyURI is that it only contains 
URI-legal ASCII characters.


This is indeed one of the easiest things, but it would be TOTALLY
wrong.

http://www.w3.org/TR/xmlschema-2/datatypes.html#anyURI says, among other things:

   The mapping from anyURI values to URIs is as defined by the URI reference
   escaping procedure defined in Section 5.4 Locator Attribute of [XML
   Linking Language] (see also Section 8 Character Encoding in URI References
   of [Character Model]). This means that a wide range of internationalized
   resource identifiers can be specified when an anyURI is called for, and
   still be understood as URIs per [RFC 2396], as amended by [RFC 2732],
   where appropriate to identify resources.

If there is confusion in other venues about this issue, please help
to make sure it gets fixed.


Well,

maybe it's time that *some* specification adds new datatypes that do 
*exactly* what RFC3986 and RFC3987 ask for :-)


Best regards, Julian



Re: Datatype for IRIs in RELAX NG

2006-03-21 Thread Bjoern Hoehrmann

* Martin Duerst wrote:
At 02:30 06/03/20, Bjoern Hoehrmann wrote:

 In Schema 1.1 it is not possible for an xsd:string not to be an xsd:anyURI.

Can you explain? It seems you are saying that all xsd:strings are
also xsd:anyURIs, but that seems to be going a bit too far.

Yes, that's exactly what the XML Schema 1.1 Last Call Working Draft
implies as far as I can tell.
-- 
Björn Höhrmann · mailto:[EMAIL PROTECTED] · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 



Latest IE7 release 'AtomicRSS' output comparison results

2006-03-21 Thread M. David Peterson
Hey Folks,

With yesterday's build release of IE7, it seemed appropriate to run a quick inventory check to see where things stand in regards to the derived MS/RSS conversion of a fairly element/attribute usage heavy Atom feed. Here's the overall breakdown.


Process:
I took a quick snapshot of the Atom feed from my personal blog, put it on a medium dosage of Atom-RNC approved steroids (using Uche's latest RNC update
http://copia.ogbuji.net/blog/2006-02-06/Small_fix_ ), and ran the result through the official live instance of the feedvalidator. Minus the incorrect encoding being reported (obviously a simple fix; I will do that after I finish the current mapping and related transformation file), it validated.

I then subscribed to this feed in the latest (March 20th) build of IE7, extracting the transformed result to compare against the original, looking for areas of potential data loss.

The two docs:

Original:
http://m.david-2.xsltransformations.com/atom-test.xml

Derived:
http://m.david-2.xsltransformations.com/xsltblog.atomicrss-sample.xml

Initial analysis: From what I can tell, it seems that there is only one significant loss that cannot be extracted, interpolated, or otherwise derived from somewhere within the transformed result doc:


Original:
<category label="foobar" scheme="http://www.xsltblog.com/tags" term="Internet Explorer 7.0"/>


Derived:
<category domain="http://www.xsltblog.com/tags">Internet Explorer 7.0</category>

Obviously the derived category element is missing: label=foobar
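
To make the comparison concrete, here is a small Python sketch that checks which attribute values of the original category element can no longer be found anywhere in the derived one. It assumes the two elements look as reconstructed in the snippet itself (namespaces omitted, attribute quoting restored), and the helper name lost_attributes is made up for this example; it says nothing about how IE7 actually performs the conversion.

```python
import xml.etree.ElementTree as ET

# Assumption: the two elements as they appear in the original and derived feeds,
# with namespace declarations left out for brevity.
original = ET.fromstring(
    '<category label="foobar" scheme="http://www.xsltblog.com/tags" '
    'term="Internet Explorer 7.0"/>'
)
derived = ET.fromstring(
    '<category domain="http://www.xsltblog.com/tags">Internet Explorer 7.0</category>'
)


def lost_attributes(atom_cat, rss_cat):
    """Return the original attributes whose values appear nowhere in the derived element."""
    carried = set(rss_cat.attrib.values())
    if rss_cat.text:
        carried.add(rss_cat.text)
    return {name: value for name, value in atom_cat.attrib.items() if value not in carried}


print(lost_attributes(original, derived))
```

Here @scheme survives as @domain and @term survives as the element content, so only @label is reported.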

I should note that it's only in the first entry that I added @label and @scheme to the category element. The rest contain only the required @term.

I'm not sure if this, in and of itself, caused the loss. I guess it would depend on how they go about the conversion (e.g. placing weight on the number of occurrences of an attribute and disregarding whatever falls below a certain threshold?). I don't know... just throwing something out there for the sake of throwing something out there :)


Sean, can you verify this and determine if it's something on your side of the lake, or on mine?

Beyond this, it seems that everything else *SHOULD* be able to map back fairly well.

The areas that are currently untested, and potentially a point of concern (that I can think of off the top of my head, anyway):

* undefinedContent of element atom:category
* any extended usage of xml:base and xml:lang

Some areas worth noting:

The following elements seem to be exact copies (including attributes) of the originals:

atom:link
atom:author
 atom:name
 atom:email
 atom:uri
atom:contributor
 atom:name
 atom:email
 atom:uri
atom:published
atom:summary

Overall it seems to me the MS/RSS team has done a pretty fantastic job of ensuring a fairly high quality conversion, making exact copies of those elements and their associated attributes that could not be cleanly converted to the MS/RSS format and still maintain any hope whatsoever of making the round trip back to the original Atom format.


A BIG PHAT thanks to the MS/RSS team for this! There's a significant amount of conversion work to get back to the original Atom format that is no longer required because of their efforts. So again, thanks to Sean and the rest of the MS/RSS folks for their extended efforts on this.


I will be posting this report to the http://trac.understandingatom.com/wiki/AtomicRSS.NET site, as well as adding the transformation files to the repository (available from this same interface), as soon as I can finish it up and verify to some level of certainty that the original file and the conversion back appear to be the same file. Of course, if this is simply not possible I will post what I have, and report the problem areas back to this thread.
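
One way to check that the original file and the conversion back "appear to be the same file" is to compare canonical forms rather than raw bytes, so that attribute order, quoting style and indentation do not count as differences. A minimal Python sketch, assuming Python 3.8 or later for xml.etree.ElementTree.canonicalize; the helper name same_xml and the sample strings are made up for this example, and this is not the comparison actually used for the report.

```python
from xml.etree.ElementTree import canonicalize


def same_xml(a: str, b: str) -> bool:
    """Compare two XML documents by their C14N 2.0 canonical form.

    strip_text=True drops leading/trailing whitespace in text content, so
    pretty-printing differences are ignored (an assumption that suits feeds,
    but not documents where that whitespace is significant).
    """
    return canonicalize(a, strip_text=True) == canonicalize(b, strip_text=True)


before = '<entry><title type="text" label="x">Hi</title></entry>'
after = "<entry>\n  <title label='x' type='text'>Hi</title>\n</entry>"
print(same_xml(before, after))
```

A mismatch only says that the two documents differ; finding where they differ still needs an element-by-element walk.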

Enjoy your dev-day! :)
-- 
M:D/
M. David Peterson
http://www.xsltblog.com/