Re: [rng-users] Lets standardize PI for associating Relax NG schema with XML document

Henri Sivonen Sun, 03 Jul 2005 06:46:15 -0700

On Jul 3, 2005, at 00:42, B Tommie Usdin wrote:

> At 11:14 PM +0200 7/2/05, Jirka Kosek wrote:
>>
>> Primary motivation (although not stated clearly) for my proposal was
>> not validation, but guided editing of XML document. Describing
>> complex validation is out of scope of my proposal, something much
>> more powerfull like NRL could be used.
>
> But if there is a "standard" there will be pressure to use it for
> everything conceivable, whether it is appropriate or not.

I agree. I think it is a desirable feature that the RELAX NG validation
process takes two *independent* inputs: the schema and the document.
(Mentioned also by James Clark in the famous IETF post:
http://www.imc.org/ietf-xml-use/mail-archive/msg00217.html)

I can see three main cases here:

1) Apps that want to check their input in an off-the-shelf manner
2) Quality assurance tools
3) Editors with autocomplete/error highlighting

In case 1) an application receives input from an outside source and
cannot trust that the outside source produces correct output (correct
in the sense that the receiving application works properly when using
it as input). In order to avoid hand coding checks for all the possible
error situations, the developer of the application decides to embed a
RELAX NG validator and an appropriate schema. Then the hand-coded part
of the application can trust that anything it sees conforms to the
schema. If the input can smuggle in its own rules the way DOCTYPE and
schemaLocation allow it to do, the app can no longer trust the
validation stage, which defeats the whole point of embedding the
validator. Therefore, I think a PI for the input to specify its own
schema is totally wrong considering case 1).

In case 2) a user has a document (not necessarily created by the user
him/herself) and is interested in the syntactic correctness of the
document.
If the document is allowed to define the rules, the user is getting the
answer to the question "Does this document conform to the grammar it
sets for itself?" http://validator.w3.org/ works like this. It gives
you a little badge of validity to show off, but it doesn't tell you if
the internal subset was used to introduce home-grown rules radically
different from what the "This document is valid FooML" message implies.
All you know is that whoever produced the document managed to adhere to
his/her own rules. Then what? The rules could be anything.

http://hsivonen.iki.fi/validator/ - being a RELAX NG validator - works
differently. It allows the user to pose the (in my opinion much more
useful) question "Does this document conform to this grammar?" It does
not give out a badge, but after the validation the user knows what
schema the document did or did not conform to.

I think RELAX NG-based QA tools would regress to a less useful level if
the user of a QA tool only knew that the document is internally
consistent without knowing whether it adheres to the particular grammar
the user is interested in. Therefore, I think a PI for the input to
specify its own schema would harm case 2).

I agree that in case 3) it is desirable to use a RELAX NG schema for
editing assistance. However, I think such use is a private matter
between the user and his/her editor and, therefore, it is not necessary
to expose such private editing method details to whoever subsequently
receives the document. Moreover, the schema repository is likely to be
local, so the most obvious references, i.e. installation-specific file
system paths, would be useless to others, making the PI useful only
privately. OTOH, registering common identifiers for schemas and
abstracting away the file paths would probably be overkill, and for the
same effort you could use some configurable association method that
does not contaminate the document.
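
A minimal sketch of such a configurable association, assuming the
editor keeps a local lookup table keyed on the root element's namespace,
with the filename extension as a fallback. The table contents, schema
paths, and function names are hypothetical; the sketch only illustrates
that the binding can live in the editor's configuration rather than in
the document:

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical editor-side configuration. The paths are illustrative
# assumptions and would differ from one installation to another.
SCHEMA_BY_NAMESPACE = {
    "http://www.w3.org/1999/xhtml": "/usr/local/share/schemas/xhtml.rnc",
}
SCHEMA_BY_EXTENSION = {
    ".xhtml": "/usr/local/share/schemas/xhtml.rnc",
    ".dbk": "/usr/local/share/schemas/docbook.rnc",
}


def root_namespace(xml_bytes):
    """Return the namespace URI of the root element, or None if it has none."""
    tag = ET.fromstring(xml_bytes).tag
    # ElementTree spells namespaced names as "{uri}localname".
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def schema_for(filename, xml_bytes):
    """Pick a schema from local configuration only; the document has no say
    beyond its root namespace and its filename."""
    ns = root_namespace(xml_bytes)
    if ns in SCHEMA_BY_NAMESPACE:
        return SCHEMA_BY_NAMESPACE[ns]
    ext = os.path.splitext(filename)[1].lower()
    return SCHEMA_BY_EXTENSION.get(ext)  # None when nothing is configured
```

With this arrangement the document carries no schema reference at all;
another user with a different installation simply has a different table.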

Also, having to contaminate the document itself with editing
process-specific artifacts can be a sign of a design flaw in the
editor. In the common cases, the schema could be bound to the root
namespace or to the filename extension (as is customary with
programming language-specific syntax highlighting in text editors).

Since case 3) seems more like a private issue, I think central
endorsement of a standard PI is not necessary for case 3).

BTW, I think DOCTYPE and schemaLocation are design bugs, because they
foil the point of cases 1) and 2).

-- 
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/
