Joseph Kessselman created XALANJ-2691:
-----------------------------------------

             Summary: Refactor: Replace direct DTM reference with XCI
                 Key: XALANJ-2691
                 URL: https://issues.apache.org/jira/browse/XALANJ-2691
             Project: XalanJ2
          Issue Type: Wish
      Security Level: No security risk; visible to anyone (Ordinary problems in 
Xalan projects.  Anybody can view the issue.)
          Components: DOM, DTM, JAXP, Xalan, Xalan-interpretive, XSLTC
            Reporter: Joseph Kessselman
            Assignee: Gary D. Gregory


DTM was never intended to be a general interface to document models. It was a 
specific solution for cramming as large a document as possible into the limited 
resources available in PCs around the year 2000.

We have been able to create DTM proxy layers for other data sources – 
databases, for example – but they are uniformly ugly and inefficient code.

A better solution, which I implemented in IBM's internal derivative of Xalan 
and hoped to contribute back to Apache, was to make DTM _only_ a data model, 
and access it through a set of cursor-object interfaces I named XCI (XML Cursor 
Interface ... IBM does love its three-letter initialisms). These slotted in 
essentially where the DTM Iterators sit now, but provided complete 
encapsulation of the data implementation – all access to a node was via an XCI 
cursor currently pointing to that node, with no DTM artifacts leaking through.
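
As an illustration only, a cursor-style interface of this general kind might 
look like the sketch below. It is not the IBM XCI, whose API is not shown in 
this issue: the names Cursor, DomCursor and CursorSketch and all of their 
methods are hypothetical, and a W3C DOM tree is used as the backing model only 
because it ships with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical cursor interface: callers never see the backing node type. */
interface Cursor {
    String name();            // name of the node the cursor points at
    boolean toFirstChild();   // each move returns false and stays put if impossible
    boolean toNextSibling();
    boolean toParent();
    Cursor copy();            // independent cursor at the same position
}

/** One possible backing model: a plain W3C DOM tree. */
final class DomCursor implements Cursor {
    private Node current;

    DomCursor(Node start) { this.current = start; }

    public String name() { return current.getNodeName(); }
    public boolean toFirstChild() { return moveTo(current.getFirstChild()); }
    public boolean toNextSibling() { return moveTo(current.getNextSibling()); }
    public boolean toParent() { return moveTo(current.getParentNode()); }
    public Cursor copy() { return new DomCursor(current); }

    private boolean moveTo(Node target) {
        if (target == null) return false;
        current = target;
        return true;
    }
}

public class CursorSketch {
    /** Counts child nodes using only the Cursor API; the caller's cursor is not moved. */
    static int countChildren(Cursor at) {
        Cursor c = at.copy();
        int n = 0;
        if (c.toFirstChild()) {
            do { n++; } while (c.toNextSibling());
        }
        return n;
    }

    /** Parses a sample document and returns a cursor on its document node. */
    static Cursor parse(String xml) {
        try {
            Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return new DomCursor(doc);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        Cursor c = parse("<a><b/><c/><d/></a>");
        c.toFirstChild();   // document node -> <a>
        System.out.println(c.name() + " has " + countChildren(c) + " children");
        // prints "a has 3 children"
    }
}
```

The point of the sketch is that countChildren compiles against Cursor alone, so 
a different backing model could be swapped in without touching it.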

The advantage is the obvious one: XCI can be efficiently implemented _directly_ 
over data models (DTM, DOM, database, custom data trees in any representation, 
potentially directly over Java reflection...), without needing to build a map 
to DTM node numbers. Little to no efficiency is lost accessing DTM, as the DTM 
implementation of the XCI is essentially a repackaging of DTM iterators. 
Efficiency gains, and as importantly ease-of-implementation gains, for other 
back-end models are significant.
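
To show what "directly over a data model" could mean in practice, here is a 
small sketch of a cursor over nested java.util.Map objects, standing in for a 
custom data tree. Everything in it (TreeCursor, MapCursor, MapCursorSketch, the 
sample data) is hypothetical and invented for this example. The cursor's 
position is just a stack of live iterators over the model itself, so no table 
of integer node handles is ever built.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cursor interface, repeated here so the block stands alone. */
interface TreeCursor {
    String name();
    boolean toFirstChild();
    boolean toNextSibling();   // forward-only in this sketch
    boolean toParent();
}

/** Cursor over nested maps: keys act as node names, Map values as children. */
final class MapCursor implements TreeCursor {
    /** One frame per level: the current entry plus the iterator over later siblings. */
    private static final class Frame {
        final Iterator<Map.Entry<String, Object>> rest;
        Map.Entry<String, Object> entry;

        Frame(Iterator<Map.Entry<String, Object>> rest) {
            this.rest = rest;
            this.entry = rest.next();
        }
    }

    private final Deque<Frame> path = new ArrayDeque<>();

    MapCursor(String rootName, Map<String, Object> root) {
        path.push(new Frame(
                Collections.<String, Object>singletonMap(rootName, root).entrySet().iterator()));
    }

    public String name() { return path.peek().entry.getKey(); }

    @SuppressWarnings("unchecked")
    public boolean toFirstChild() {
        Object value = path.peek().entry.getValue();
        if (!(value instanceof Map) || ((Map<?, ?>) value).isEmpty()) return false;
        path.push(new Frame(((Map<String, Object>) value).entrySet().iterator()));
        return true;
    }

    public boolean toNextSibling() {
        Frame f = path.peek();
        if (!f.rest.hasNext()) return false;
        f.entry = f.rest.next();
        return true;
    }

    public boolean toParent() {
        if (path.size() == 1) return false;   // already at the root
        path.pop();
        return true;
    }
}

public class MapCursorSketch {
    /** Lists child names via the cursor API, then returns the cursor to where it started. */
    static List<String> childNames(TreeCursor c) {
        List<String> names = new ArrayList<>();
        if (c.toFirstChild()) {
            do { names.add(c.name()); } while (c.toNextSibling());
            c.toParent();
        }
        return names;
    }

    /** Made-up sample data. */
    static Map<String, Object> sampleOrder() {
        Map<String, Object> customer = new LinkedHashMap<>();
        customer.put("name", "Ada");
        Map<String, Object> order = new LinkedHashMap<>();
        order.put("id", 42);
        order.put("customer", customer);
        order.put("total", 9.5);
        return order;
    }

    public static void main(String[] args) {
        TreeCursor c = new MapCursor("order", sampleOrder());
        System.out.println(childNames(c));   // prints "[id, customer, total]"
    }
}
```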

For models that are general networks rather than trees, there is a risk of 
getting into loops. Stylesheets can be written to avoid that, or the XCI 
implementation can have some mechanism to break those potential loops. In any 
case, Xalan already has protection against unreasonably deep recursion, so any 
remaining loops will be caught and reported.
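
One possible loop-breaking mechanism, sketched under assumptions: the walker 
below presents a cyclic graph as a tree by refusing to follow any link that 
leads back to a node already on the current path. GraphNode and LoopGuardSketch 
are hypothetical names, and this issue does not say which approach a real XCI 
implementation would take.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical graph-shaped data model: links may point back to ancestors. */
final class GraphNode {
    final String name;
    final List<GraphNode> links = new ArrayList<>();

    GraphNode(String name) { this.name = name; }
}

public class LoopGuardSketch {
    /** Depth-first walk that records "depth:name" and cuts links back to an ancestor. */
    private static void walk(GraphNode node, Set<GraphNode> onPath, int depth, List<String> out) {
        if (!onPath.add(node)) return;   // node is its own ancestor: break the loop here
        out.add(depth + ":" + node.name);
        for (GraphNode next : node.links) {
            walk(next, onPath, depth + 1, out);
        }
        onPath.remove(node);             // leaving this branch; node may appear on other paths
    }

    static List<String> walk(GraphNode root) {
        // Identity-based set: two distinct nodes with equal content are not confused.
        Set<GraphNode> onPath =
                Collections.newSetFromMap(new IdentityHashMap<GraphNode, Boolean>());
        List<String> out = new ArrayList<>();
        walk(root, onPath, 0, out);
        return out;
    }

    /** Made-up sample: a -> b -> c -> a. */
    static GraphNode sampleCycle() {
        GraphNode a = new GraphNode("a");
        GraphNode b = new GraphNode("b");
        GraphNode c = new GraphNode("c");
        a.links.add(b);
        b.links.add(c);
        c.links.add(a);
        return a;
    }

    public static void main(String[] args) {
        System.out.println(walk(sampleCycle()));   // prints "[0:a, 1:b, 2:c]"
    }
}
```

Tracking only the current path, rather than every node ever visited, still lets 
a shared node appear under several parents, which matches how a tree-oriented 
stylesheet would expect to see it.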

I had hoped to donate this improvement back to Apache. Unfortunately, at the 
same time I was doing this, the IBM XSLT processor was undergoing other major 
rewrites to build a true optimizing compiler as a replacement for XSLTC, which 
we were not ready (willing?) to contribute to Apache, and I didn't have a 
chance to disentangle the two.

Having done it once: It *is* a big rewrite task.  Not an insanely complicated 
one, since the switch from DTM iterators to XCI cursors is nearly a 1:1 
mapping. But even with experience I'd say 3 months of full-time work minimum; 
possibly much more. And it's been long enough that I'd have to redevelop it de 
novo – which is not a bad thing, as I'm not sure who, if anyone, in IBM could 
be persuaded to donate the XCI I created for them.

I have set the initial priority as *minor*; Xalan can certainly continue 
as it is, with DTM as the (ugly) glue layer, and we've got lots of 
higher-severity items stacked up on the backlog. However, if we had the 
resources we once did (with people actually paid to work on Xalan), I believe 
this is a significant improvement in Xalan architecture and would be worth 
investing in. Certainly if anyone attempts a true Xalan 3.0 (as opposed to a 
Xalan 2.x that supports part or all of XSLT 3.0), this should be part of the 
effort.


