Hi Mukul,

When I skimmed the spec before responding to your e-mail I missed an important detail. I wasn't thinking about refactoring or the number of lines of code in XMLSchemaValidator.java, though those are things that I suppose we could revisit at some point. I had been assuming that the XDM would be constructed from the whole subtree, and that adding full support for XPath 2.0 would introduce the need to buffer the data as well as complexity in processing PSVI and error info. Given that the XDM is only constructed from the current element (without its children), its attributes and inherited attributes, the issues that I previously mentioned aren't relevant to CTA because you can always evaluate the XPath on the startElement(). Sorry about that. :-)
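To illustrate why nothing needs to be buffered, here is a minimal sketch of evaluating a CTA-style test against a context element that carries only attributes and no children. It is an assumption-laden stand-in, not Xerces or PsychoPath code: it uses the JDK's javax.xml.xpath engine (XPath 1.0) in place of a full XPath 2.0 processor, the class and method names are hypothetical, and namespaces are ignored for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: the names here are illustrative, not part of Xerces.
class CtaContextSketch {

    /**
     * Builds a one-element DOM tree (the element plus its own and inherited
     * attributes, no children) and evaluates the test with that element as
     * the context node. Everything needed is known at startElement() time.
     */
    static boolean evaluateTest(String elementName,
                                Map<String, String> attributes,
                                String test) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element context = doc.createElement(elementName);
        for (Map.Entry<String, String> attr : attributes.entrySet()) {
            context.setAttribute(attr.getKey(), attr.getValue());
        }
        doc.appendChild(context);
        // JDK XPath 1.0 engine used as a stand-in for an XPath 2.0 processor.
        return (Boolean) XPathFactory.newInstance().newXPath()
                .evaluate(test, context, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("kind", "book");
        System.out.println(evaluateTest("item", attrs, "@kind = 'book'"));
        // Children are never part of the tree, so this cannot see any.
        System.out.println(evaluateTest("item", attrs, "count(*) > 0"));
    }
}
```

The second call shows the consequence of the restricted data model: a test that looks at child content simply sees none, so the result never depends on anything that arrives after the start tag.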
With that in mind I think your proposal to use PsychoPath here is fine, though it might be better to always favour our own built-in XPath support (for performance reasons) when the expression does fall within the subset and only use PsychoPath for expressions that Xerces does not handle natively.

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [email protected]
E-mail: [email protected]

Mukul Gandhi <[email protected]> wrote on 09/09/2009 11:56:52 PM:

> Hi Michael,
>    Thanks for your reply.
>
> If you think we must do this later, no problem, we can pursue
> this later.
>
> But I feel that the design for integrating the PsychoPath processor as an
> alternative processor for CTA is probably quite simple. Below is some
> pseudocode for this:
>
> In XMLSchemaValidator.java, instead of:
>
> for (int i = 0; i < alternatives.length; i++) {
>     Test test = alternatives[i].getTest();
>     if (test != null && test.evaluateTest(element, ctaAttributes)) {
>         ...
>
>     }
> }
>
> We need to do something like the following (PsychoPath can be selected
> using a Java system property):
>
> for (int i = 0; i < alternatives.length; i++) {
>     Test test = alternatives[i].getTest();
>
>     boolean xpathSucceeds = false;
>     String ctaProcessorProp =
>         System.getProperty("org.apache.xerces.ctaProcessor");
>     if (ctaProcessorProp == null || ctaProcessorProp.equals("")) {
>         xpathSucceeds = test.evaluateTest(element, ctaAttributes);
>     }
>     else {
>         // construct XDM (DOM, for PsychoPath) tree for CTA (using the
>         // element, and its attributes)
>         xpathSucceeds = evaluate XPath on this XDM tree;
>     }
>
>     if (test != null && xpathSucceeds) {
>         ...
>
>     }
> }
>
> Personally speaking, I think I can write this modification within a
> week's time.
>
> I think providing an option like PsychoPath with CTA to users would
> be good, as PsychoPath is part of the Eclipse Web Tools project, and we
> use it in assertions as well.
>
> About complexity of the XMLSchemaValidator, I agree with you.
> I can see that XMLSchemaValidator.java is already about 5000 lines long.
>
> I strongly suggest we refactor XMLSchemaValidator.java. An
> immediate measure I can think of for controlling XMLSchemaValidator's
> complexity is to move the assertions and CTA code into separate
> components and integrate them with XMLSchemaValidator. I think
> this alone would reduce XMLSchemaValidator's size by roughly 500-1000
> lines.
>
> If refactoring is agreed upon, we can do it after the inheritable
> attributes changes get committed.
>
> If my help is needed for the refactoring, I am always available.
>
> Any thoughts, please, about what I have proposed above?
>
> We could postpone any of the ideas I have proposed above, as you
> and other committers would wish, and also considering feedback from
> the community.
>
> On Thu, Sep 10, 2009 at 1:33 AM, Michael Glavassevich
> <[email protected]> wrote:
> > Hi Mukul,
> >
> > I've given this some thought and think we should probably hold off on adding
> > support beyond the subset (which is streamable) at least until we get some
> > user feedback. It complicates the validator. Now have to be prepared to
> > buffer arbitrarily large portions of the document because we may not be able
> > to determine an element's type until we've processed the entire subtree.
> > Means we won't be able to stream PSVI and error info to the user and may
> > not report accurate line / column numbers (unless we cache all the document
> > positions along the way which is expensive).
> >
> > You could open a JIRA issue for tracking but suggest that we revisit later.
> >
> > Thanks.
> >
> > Michael Glavassevich
> > XML Parser Development
> > IBM Toronto Lab
> > E-mail: [email protected]
> > E-mail: [email protected]
>
> --
> Regards,
> Mukul Gandhi
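A rough sketch of the dispatch suggested at the top of this message (use the built-in XPath support whenever the expression falls within the subset, and a full processor only otherwise) might look like the following. Everything here is hypothetical: `TestEvaluator`, `SubsetEvaluator`, `FullEvaluator` and `choose` are illustrative names rather than Xerces or PsychoPath APIs, the "subset" is reduced to a toy `@name = 'value'` pattern, and the full engine is left as a placeholder.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these types exist in Xerces or PsychoPath.
class CtaDispatchSketch {

    interface TestEvaluator {
        String name();
        boolean supports(String test);
        boolean evaluate(String test, Map<String, String> attributes);
    }

    /** Prefers the built-in evaluator; falls back only for unsupported tests. */
    static TestEvaluator choose(String test, TestEvaluator builtIn, TestEvaluator full) {
        return builtIn.supports(test) ? builtIn : full;
    }

    /** Toy stand-in for the built-in subset: only handles @name = 'value'. */
    static final class SubsetEvaluator implements TestEvaluator {
        private static final Pattern EQ =
                Pattern.compile("@(\\w+)\\s*=\\s*'([^']*)'");

        public String name() { return "built-in"; }

        public boolean supports(String test) {
            return EQ.matcher(test.trim()).matches();
        }

        public boolean evaluate(String test, Map<String, String> attributes) {
            Matcher m = EQ.matcher(test.trim());
            if (!m.matches()) {
                throw new IllegalArgumentException("outside subset: " + test);
            }
            return m.group(2).equals(attributes.get(m.group(1)));
        }
    }

    /** Placeholder for a full XPath 2.0 engine; evaluation is not implemented. */
    static final class FullEvaluator implements TestEvaluator {
        public String name() { return "full"; }

        public boolean supports(String test) { return true; }

        public boolean evaluate(String test, Map<String, String> attributes) {
            throw new UnsupportedOperationException("plug a full XPath 2.0 engine in here");
        }
    }

    public static void main(String[] args) {
        TestEvaluator builtIn = new SubsetEvaluator();
        TestEvaluator full = new FullEvaluator();
        System.out.println(choose("@kind = 'book'", builtIn, full).name());
        System.out.println(choose("matches(@kind, '^b')", builtIn, full).name());
        System.out.println(choose("@kind = 'book'", builtIn, full)
                .evaluate("@kind = 'book'", Map.of("kind", "book")));
    }
}
```

Choosing per expression, rather than through a global switch, would mean that schemas using only the subset never pay for the heavier processor.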
