Re: can we provide full XPath 2.0 language, for type alternatives

Michael Glavassevich Thu, 10 Sep 2009 06:50:48 -0700

Hi Mukul,

When I was skimming over the spec before I responded to your e-mail I
missed an important detail. I wasn't thinking about refactoring or the
number of lines of code in XMLSchemaValidator.java though those are things
that I suppose we could revisit at some point. I had been assuming that the
XDM would be constructed from the whole subtree and that adding full
support for XPath 2.0 would introduce the need to buffer the data as well
as introducing complexity in processing PSVI and error info. Given that the
XDM is only constructed from the current element (without its children),
its attributes and inherited attributes, the issues that I previously
mentioned aren't relevant to CTA because you can always evaluate the XPath
on the startElement(). Sorry about that. :-)
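
As a rough illustration of why no buffering is needed, here is a minimal sketch that builds a childless element from a name and an attribute map (the information available at startElement()) and evaluates a boolean test against it. It uses the JDK's javax.xml.xpath engine (XPath 1.0) purely as a stand-in; it is not PsychoPath or Xerces' internal code. The class and method names are hypothetical, and inherited attributes are assumed to be passed in the same map as ordinary attributes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: not part of Xerces or PsychoPath.
class CtaContextSketch {

    /**
     * Builds an element with the given attributes and no children,
     * then evaluates the test with that element as the context node.
     */
    static boolean evaluateTest(String elementName, Map<String, String> attributes, String test)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element element = doc.createElement(elementName);
        for (Map.Entry<String, String> attr : attributes.entrySet()) {
            element.setAttribute(attr.getKey(), attr.getValue());
        }
        // Attached to a document here only to keep the DOM well-formed for the JDK engine.
        doc.appendChild(element);

        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Boolean) xpath.evaluate(test, element, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("kind", "book");
        System.out.println(evaluateTest("item", attrs, "@kind = 'book'")); // true
        System.out.println(evaluateTest("item", attrs, "@kind = 'dvd'"));  // false
    }
}
```

Because the tree never contains children, it can be built and discarded per element without looking ahead in the document.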


With that in mind I think your proposal to use PsychoPath here is fine,
though it might be better to always favour our own built-in XPath support (for
performance reasons) when the expression does fall within the subset and
only use PsychoPath for expressions that Xerces does not handle natively.
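
One possible shape for that fallback, as a sketch only: try to compile the test with the native subset first, and hand it to the full engine only if it is rejected. Everything named here (CompiledTest, OutsideSubsetException, compileNative, compileFull) is hypothetical and does not correspond to actual Xerces or PsychoPath APIs. The "native" compiler is a toy that accepts only @name = 'value', and the "full" engine is a placeholder that does not evaluate anything.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: not part of Xerces or PsychoPath.
class CtaDispatchSketch {

    /** A compiled CTA test, evaluated against an element's attributes. */
    interface CompiledTest {
        boolean evaluate(Map<String, String> attributes);
        String engine(); // which engine compiled the test, for illustration only
    }

    /** Signals that an expression falls outside the native subset. */
    static class OutsideSubsetException extends Exception {
        OutsideSubsetException(String xpath) {
            super("not in native subset: " + xpath);
        }
    }

    // Toy stand-in for the built-in subset: accepts only @name = 'value'.
    private static final Pattern SIMPLE =
            Pattern.compile("\\s*@(\\w+)\\s*=\\s*'([^']*)'\\s*");

    static CompiledTest compileNative(String xpath) throws OutsideSubsetException {
        Matcher m = SIMPLE.matcher(xpath);
        if (!m.matches()) {
            throw new OutsideSubsetException(xpath);
        }
        final String name = m.group(1);
        final String value = m.group(2);
        return new CompiledTest() {
            public boolean evaluate(Map<String, String> attributes) {
                return value.equals(attributes.get(name));
            }
            public String engine() { return "native"; }
        };
    }

    // Placeholder for a full XPath 2.0 engine; evaluation is deliberately not sketched.
    static CompiledTest compileFull(final String xpath) {
        return new CompiledTest() {
            public boolean evaluate(Map<String, String> attributes) {
                throw new UnsupportedOperationException("full evaluation not sketched: " + xpath);
            }
            public String engine() { return "full"; }
        };
    }

    /** Prefers the native subset; falls back to the full engine only when needed. */
    static CompiledTest compile(String xpath) {
        try {
            return compileNative(xpath);
        } catch (OutsideSubsetException e) {
            return compileFull(xpath);
        }
    }

    public static void main(String[] args) {
        System.out.println(compile("@kind = 'book'").engine()); // native
        System.out.println(compile("count(@*) gt 1").engine()); // full
    }
}
```

Making the choice once, when the test is compiled, keeps the per-element cost of the common case the same as it is today.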

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [email protected]

Mukul Gandhi <[email protected]> wrote on 09/09/2009 11:56:52 PM:

> Hi Michael,
>   Thanks for your reply.
>
>   If you think we must do this later, no problem; we could
> pursue this later.
>
> But I feel that the design for integrating the PsychoPath processor as an
> alternative processor for CTA is probably quite simple. Below is some
> pseudocode for this:
>
> In XMLSchemaValidator.java, instead of:
>
> for (int i = 0; i < alternatives.length; i++) {
>     Test test = alternatives[i].getTest();
>     if (test != null && test.evaluateTest(element, ctaAttributes)) {
>        ...
>
>     }
> }
>
> We need to do something like the following (PsychoPath can be selected using
> a Java system property):
>
> for (int i = 0; i < alternatives.length; i++) {
>   Test test = alternatives[i].getTest();
>   if (test == null) {
>     continue; // no test on this alternative, so nothing to evaluate
>   }
>
>   boolean xpathSucceeds = false;
>   String ctaProcessorProp = System.getProperty("org.apache.xerces.ctaProcessor");
>   if (ctaProcessorProp == null || ctaProcessorProp.equals("")) {
>     // property not set: use the built-in XPath subset
>     xpathSucceeds = test.evaluateTest(element, ctaAttributes);
>   }
>   else {
>     // construct XDM (DOM, for PsychoPath) tree for CTA (using the
>     // element, and its attributes)
>     xpathSucceeds = evaluate XPath on this XDM tree;
>   }
>
>   if (xpathSucceeds) {
>     ...
>
>   }
> }
>
> Personally speaking, I think, I can write this modification within a
> week's time.
>
> I think providing an option like PsychoPath with CTA to users would
> be good, as PsychoPath is part of the Eclipse Web Tools project, and we
> use it in assertions as well.
>
> About complexity of the XMLSchemaValidator, I agree with you. I can
> see, that XMLSchemaValidator.java is already about 5000 lines long.
>
> I strongly suggest that we refactor XMLSchemaValidator.java. An
> immediate measure I can think of for controlling XMLSchemaValidator's
> complexity is to move the assertions and CTA code into separate
> components and integrate them with XMLSchemaValidator. I think
> this alone would reduce XMLSchemaValidator's size by roughly 500-1000
> lines.
>
> If refactoring is agreed upon, we can do it, after inheritable
> attributes changes get committed.
>
> If my help is needed, for the refactoring, I am always available.
>
> Any thoughts, please about what I have proposed above?
>
> We could postpone any of the ideas I have proposed above, as you
> and the other committers wish, also considering feedback from the
> community.
>
> On Thu, Sep 10, 2009 at 1:33 AM, Michael Glavassevich
> <[email protected]> wrote:
> > Hi Mukul,
> >
> > I've given this some thought and think we should probably hold off on adding
> > support beyond the subset (which is streamable) at least until we get some
> > user feedback.  It complicates the validator.  Now have to be prepared to
> > buffer arbitrarily large portions of the document because we may not be able
> > to determine an element's type until we've processed the entire subtree.
> > Means we won't be able to stream PSVI and error info to the user and may
> > not report accurate line / column numbers (unless we cache all the document
> > positions along the way which is expensive).
> >
> > You could open a JIRA issue for tracking but suggest that we revisit later.
> >
> > Thanks.
> >
> > Michael Glavassevich
> > XML Parser Development
> > IBM Toronto Lab
> > E-mail: [email protected]
> > E-mail: [email protected]
>
>
>
> --
> Regards,
> Mukul Gandhi
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
