On Tue Oct 21 00:53:35 2008, Waqas wrote:
The expat parser (as an example) in namespace-aware mode reports a
fatal error on undeclared prefixes. This was added in response to this bug report: http://sourceforge.net/tracker/index.php?func=detail&aid=695401&group_id=10127&atid=110127
which references this section of XML Names:
http://www.w3.org/TR/REC-xml-names/#ns-qualnames


Which doesn't say anything about mandatory fatal errors.

If you're parsing a static document, it's quite reasonable to generate a fatal error, but I don't think that's the right thing at all with an XML stream.

Ah yes, a namespace aware parser (expat) is indeed being used with
namespace awareness disabled...

Right - and then namespaces are handled, so the overall result is that a namespace aware parser is used. If you're mandating that all XMPP implementations MUST use somebody else's parser, then I don't know quite what to say.


I looked at the Gajim sources, and using
'http://www.gajim.org/xmlns/undeclared-root' as the namespace of all
undeclared prefixes clearly does not conform with [XML-NAMES].
See: http://www.w3.org/TR/REC-xml-names/#ProcessorConformance


Nonsense.

"A processor MUST report violations of namespace well-formedness" - Gajim is doing so, signalling this condition using a specific namespace URI, so it clearly *does* conform. You may argue that I should have used some special non-string object instead, if you like, and that how Gajim handles this signal - by treating it as the unknown namespace it (kind of) is - is sufficiently simple and neat as to warrant being maligned as a hack, but it's a damn sight better than terminating the connection.

Gajim does not conform to XML-NAMES. I reviewed the code, and it
appears to act correctly for most XML. But it does not act correctly
for prefixes on attributes

Not that it did when expat was used to handle the namespaces, either. Making it handle these properly would involve quite a bit more rewriting. (Possible and desirable rewriting, to be sure, but nothing to do with the issue at hand, sorry).

. And it does not have a single one of all
those required checks for non-conforming XML (except the undeclared
prefix check on tag names). XML-NAMES requires a number of checks for
conformance, some of which are in
http://www.w3.org/TR/REC-xml-names/#Conformance while others are
sprinkled throughout the spec.


I'll accept that - I didn't make it check for multiple colons, etc, and I might well allow a redefinition of xml: and xmlns:, which'd be confusing. I ought to fix these at some point.
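
A minimal sketch of the kind of checks mentioned above (multiple colons, and redefinition of xml: and xmlns:), following the reserved-prefix rules in XML-NAMES. This is not Gajim's code; the function names and the problem strings are made up for illustration.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"
XMLNS_NS = "http://www.w3.org/2000/xmlns/"


def qname_problems(qname):
    """Return a list of problems with a raw element or attribute name."""
    problems = []
    if qname.count(":") > 1:
        problems.append("multiple colons")
    prefix, sep, local = qname.partition(":")
    if sep and (not prefix or not local):
        problems.append("empty prefix or local part")
    return problems


def declaration_problems(prefix, uri):
    """Check one namespace declaration; prefix is "" for a plain xmlns="..."."""
    problems = []
    if prefix == "xmlns":
        problems.append("xmlns prefix must not be declared")
    if prefix == "xml" and uri != XML_NS:
        problems.append("xml prefix bound to wrong namespace")
    if prefix != "xml" and uri == XML_NS:
        problems.append("xml namespace bound to another prefix")
    if uri == XMLNS_NS:
        problems.append("xmlns namespace must not be bound")
    if prefix and uri == "":
        # Namespaces 1.0 does not allow undeclaring a prefix.
        problems.append("prefix bound to empty namespace name")
    return problems
```

Returning a list of problems rather than raising leaves the decision about what to do (ignore, log, bounce the stanza) to the caller, which is the behaviour argued for in this thread.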

Incidentally, by stating "except the undeclared prefix check", aren't you arguing that the code *is* following XML-NAMES in this regard?

Dave, I don't think you want to conform to XML-NAMES. I think you'd
prefer to sanitize the XML instead to make it conform to XML-NAMES.
One step closer to HTML ;)

The mechanism by which I happen to have chosen to report undeclared namespaces is merely a convenient one which happens to have the result I desired with minimal programming. I happen to think the code is less hacky than Expat's rather bizarre API, which has namespace handling hacked on via character delimiters, especially given how Gajim then used this API. (Either Expat looks up namespaces and then leaves you a non-standard notation to parse, or else you parse the standard notation and look up namespaces yourself, in a more resilient manner - not a hard choice, really).
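
To make the two Expat modes concrete, here is a small sketch using Python's standard xml.parsers.expat binding. The helper name element_names and the sample document are assumptions for illustration only; the document deliberately uses a prefix (foo) that is never declared.

```python
from xml.parsers import expat


def element_names(data, namespace_separator=None):
    """Return the element names expat reports while parsing data."""
    names = []
    parser = expat.ParserCreate(namespace_separator=namespace_separator)
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)
    return names


# A stream-like document containing one undeclared prefix, "foo".
DOC = (
    '<stream:stream xmlns:stream="http://etherx.jabber.org/streams">'
    "<foo:bar/>"
    "</stream:stream>"
)

# Namespace processing off: names arrive as written ("stream:stream",
# "foo:bar") and prefix lookup is left to the caller.
raw_names = element_names(DOC)

# Namespace processing on: names would arrive as "uri<separator>local",
# but the undeclared prefix makes Parse() raise ExpatError instead.
try:
    element_names(DOC, namespace_separator=" ")
    namespace_mode_failed = False
except expat.ExpatError:
    namespace_mode_failed = True
```

In the first mode the application sees the undeclared prefix and can decide what to do with it; in the second the parse simply stops, which on a stream means the connection is lost.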

What I'm trying to do is look at where we are now, and describe the best option for developers wishing to deploy now, especially bearing in mind we need to obtain the best result, where "best" is in terms of interoperability and potential efficiency. If you disagree with those goals, please say so - I don't think your goals are all that different.

You appear to be arguing that the best interoperability (presumably) is achieved by producing only XML-NAMES conforming XML. I can agree with you there.

I also think this doesn't always happen right now, and that therefore clients are best advised to handle "Bad XMLNS" in a graceful manner, in particular, not generating a fatal stream-level error.

Furthermore, I note that if clients do this, the requirement to produce only "Good XMLNS" can be relaxed slightly, since no serious damage results. That is, "avoid if possible - bad things may result", rather than "avoid at all costs - bad things will result": SHOULD instead of MUST in RFC 2119 terms.

Finally, I note that the costs can be, in fact, remarkably high for a server in the case of forwarding stanzas, since in order to merely forward stanzas, a simple lexing pass is sufficient, whereas to check - in particular - for undeclared prefixes requires a full parse and lookup. These are expensive operations involving allocations, string compares, and other primitives that have a detrimental effect on short-term and long-term server performance.

Which step in my chain of thought here is so offensive that it requires attack by the HTML bogeyman argument? :-)

 Something which needed to be done to cope with xmlns-unaware
servers. All client developers should roll their own XMLNS processing
code? They do, but they shouldn't have had to.

I agree entirely - as I say, nothing in XML-NAMES mandates processors generating fatal errors for the Prefix Declared constraint, and whilst I can see this is a reasonable thing for handling the case in a static document that the user has control over, it's utterly unsuitable for most other cases - what is the user supposed to do, after all?

This is the kind of thing that XML processors should offer, signalling that an element (or attribute) is unbound, rather than generating a fatal error. It's unfortunate that they do not, and my personal hope for the developers of 2015 (which, I hope, will include myself) is that XML processors themselves will improve, providing what developers actually need.

Finally, I should note that XMLNS processing is not really very hard. The code to do so in Gajim consists of around ten lines, from memory - it's slightly more in our server, perhaps as many as 20.
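
As a rough idea of how little code this takes, here is a sketch of prefix resolution over a stack of scopes, reporting an undeclared prefix through the sentinel namespace URI discussed earlier in this thread instead of raising. It is not the actual Gajim or server code; push_scope, resolve and the is_attribute flag are hypothetical names.

```python
# Sentinel URI quoted earlier in this thread; everything else is illustrative.
UNDECLARED_NS = "http://www.gajim.org/xmlns/undeclared-root"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def push_scope(scopes, attrs):
    """Return scopes plus a new scope holding this element's declarations."""
    scope = dict(scopes[-1]) if scopes else {"xml": XML_NS}
    for name, value in attrs.items():
        if name == "xmlns":
            scope[""] = value            # default namespace
        elif name.startswith("xmlns:"):
            scope[name[6:]] = value      # prefixed declaration
    return scopes + [scope]


def resolve(scopes, qname, is_attribute=False):
    """Map a raw name to (namespace, local name) without ever raising."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        # Unprefixed attributes are in no namespace; unprefixed elements
        # take the default namespace, if one is in scope.
        namespace = "" if is_attribute else scopes[-1].get("", "")
        return namespace, qname
    return scopes[-1].get(prefix, UNDECLARED_NS), local


scopes = push_scope([], {
    "xmlns": "jabber:client",
    "xmlns:stream": "http://etherx.jabber.org/streams",
})
known = resolve(scopes, "stream:features")
unknown = resolve(scopes, "foo:bar")   # "foo" was never declared
```

The caller pops the scope again when the element closes, and can treat anything in UNDECLARED_NS like any other namespace it does not understand.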

Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
