Hi Richard,

The application deadline is fast approaching: April 3rd at 19:00 UTC. I think that translates to 5 or 6 AM on Saturday in Australia. If you haven't already, you should update your proposal on the official site [1]. I'm not sure that you will be able to edit it after the deadline.
Thanks.

[1] http://socghop.appspot.com/

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [email protected]
E-mail: [email protected]

Michael Glavassevich/Toronto/i...@ibmca wrote on 03/31/2009 09:58:05 AM:

> Hi Richard,
>
> Richard Kelly <[email protected]> wrote on 03/31/2009 02:44:03 AM:
>
> > Hi Michael,
> >
> > Thanks for your thorough review, I'll revise my proposal ASAP using
> > your feedback.
> >
> > 2009/3/30 Michael Glavassevich <[email protected]>:
> > >
> > > All XML characters are Unicode. If you were thinking of other
> > > character encodings besides UTF-*, these all get converted to Java
> > > chars on input so essentially Xerces is always working on UTF-16
> > > and thus the normalization checker / normalizer will always see a
> > > "Unicode encoding form".
> > >
> >
> > Ok, dealing with a single encoding should make it easier. I think I
> > got mixed up when reading section 4.3.3 of the XML spec which
> > mentions some other encodings. [1]
> >
> > >
> > > Probably something you've already realized but worth clarifying...
> > > The pipeline (XMLParserConfiguration [1]) is shared between the SAX
> > > and DOM (and perhaps one day StAX) parsers, so these features
> > > equally apply to the existing SAX XMLReader, JAXP SAXParser and
> > > DocumentBuilder. There's already a standard SAX feature defined for
> > > normalization checking [2]. We should probably define a Xerces'
> > > specific feature URI to cover the normalization function which
> > > could be set on the SAX parser, similar to the parameter defined in
> > > DOM Level 3 Core / Load & Save. For a DOM in memory the normalizing
> > > / normalization checking functions would be invoked by setting the
> > > parameters on the DOMConfiguration and calling normalizeDocument().
> > > In addition to plugging in the XNI component here it would also
> > > involve updating the DOM with the normalized text. And when a DOM
> > > is loaded with an LSParser if the LSInput.certifiedText [3] flag is
> > > true, I believe the intention is that normalization processing is
> > > skipped so should have some way to bypass the normalization
> > > component (e.g. excluding it from the pipeline) when the input
> > > claims to be certified.
> > >
> >
> > I had one question about another class called XML11Char which lets
> > you check if a character is a valid XML 1.1 character. Should
> > normalization checks play any role in this validation?
>
> XML11Char and its counterpart XMLChar are used for checking
> well-formedness: the set of rules which all XML documents must conform
> to (otherwise they're not XML). Well-formedness checking will have
> logically occurred before the normalization checker / normalizer sees
> the data. I wouldn't expect that you would need to call any of these
> methods again in that context.
>
> > Thanks,
> > Richard
> >
> > [1] http://www.w3.org/TR/xml11/#charencoding
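
For illustration, the two configuration paths described in the quoted message (the standard SAX feature, and the DOM Level 3 parameters followed by normalizeDocument()) might look roughly like the sketch below. The class name NormalizationSketch and its helper methods are made up for this example. The isNfc helper uses the JDK's java.text.Normalizer purely as a stand-in to show what a check over Java chars (UTF-16) means; it is not the Xerces component under discussion. Whether a given parser accepts the feature or the parameters depends on the implementation, so each helper reports that instead of assuming it.

```java
import java.text.Normalizer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper class, for illustration only.
public class NormalizationSketch {

    // Standard SAX2 feature id for normalization checking.
    static final String SAX_NORMALIZATION_CHECKING =
        "http://xml.org/sax/features/unicode-normalization-checking";

    // Stand-in check using the JDK, not Xerces: is the text already in NFC?
    static boolean isNfc(String text) {
        return Normalizer.isNormalized(text, Normalizer.Form.NFC);
    }

    // Tries to turn on normalization checking for a SAX parser.
    // Returns false if the parser does not recognize or support the feature.
    static boolean enableSaxChecking(XMLReader reader) {
        try {
            reader.setFeature(SAX_NORMALIZATION_CHECKING, true);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }

    // Tries to set a boolean DOM Level 3 parameter to true on the document.
    // Returns false if the implementation cannot set it.
    static boolean enableDomParameter(Document doc, String name) {
        DOMConfiguration config = doc.getDomConfig();
        if (!config.canSetParameter(name, Boolean.TRUE)) {
            return false;
        }
        config.setParameter(name, Boolean.TRUE);
        return true;
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println("SAX checking accepted: " + enableSaxChecking(reader));

        Document doc =
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        // "e" followed by a combining acute accent: not in NFC.
        doc.appendChild(doc.createElement("root")).setTextContent("e\u0301");

        System.out.println("check-character-normalization accepted: "
            + enableDomParameter(doc, "check-character-normalization"));
        System.out.println("normalize-characters accepted: "
            + enableDomParameter(doc, "normalize-characters"));

        // Applies whatever parameters the implementation accepted.
        doc.normalizeDocument();

        System.out.println("Precomposed text is NFC: " + isNfc("\u00e9"));
        System.out.println("Decomposed text is NFC: " + isNfc("e\u0301"));
    }
}
```

A parser without normalization support is expected to reject the SAX feature and the two DOM parameters, in which case the helpers return false and normalizeDocument() leaves the text as it was.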
