Joseph Kessselman created XALANJ-2691:
-----------------------------------------
Summary: Refactor: Replace direct DTM reference with XCI
Key: XALANJ-2691
URL: https://issues.apache.org/jira/browse/XALANJ-2691
Project: XalanJ2
Issue Type: Wish
Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
Components: DOM, DTM, JAXP, Xalan, Xalan-interpretive, XSLTC
Reporter: Joseph Kessselman
Assignee: Gary D. Gregory

DTM was never intended to be a general interface to document models. It was a specific solution for cramming as large a document as possible into the limited resources available in PCs around the year 2000. We have been able to create DTM proxy layers for other data sources (databases, for example), but they are uniformly ugly and inefficient code.

A better solution, which I implemented in IBM's internal derivative of Xalan and hoped to contribute back to Apache, was to make DTM _only_ a data model, and access it through a set of cursor-object interfaces I named XCI (XML Cursor Interface; IBM does love its three-letter initialisms). These slotted in essentially where the DTM Iterators sit now, but provided complete encapsulation of the data implementation: all access to a node was via an XCI cursor currently pointing to that node, with no DTM artifacts leaking through.

The advantage is the obvious one: XCI can be efficiently implemented _directly_ over data models (DTM, DOM, database, custom data trees in any representation, potentially directly over Java reflection...), without needing to build a map to DTM node numbers. Little to no efficiency is lost accessing DTM, as the DTM implementation of the XCI is essentially a repackaging of DTM iterators. Efficiency gains, and as importantly ease-of-implementation gains, for other back-end models are significant.
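
To make the cursor idea concrete, here is a minimal sketch in Java. It is not the XCI API from IBM's code, which is not public; every name in it (Cursor, TreeNode, TreeCursor, CursorDemo) is hypothetical and chosen only for illustration. It assumes a cursor needs little more than "where am I" plus a few movement operations, and shows a traversal written purely against the cursor interface, over a plain object tree with no DTM node numbers involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical cursor interface; an illustration only, not IBM's XCI. */
interface Cursor {
    String name();            // name of the node the cursor currently points at
    boolean toFirstChild();   // move to the first child; false (and no move) if none
    boolean toNextSibling();  // move to the next sibling; false (and no move) if none
    boolean toParent();       // move to the parent; false (and no move) at the root
}

/** Toy back-end data model: an ordinary object tree. */
final class TreeNode {
    final String name;
    TreeNode parent;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name) { this.name = name; }

    /** Appends a child and returns this node so calls can be chained. */
    TreeNode add(TreeNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }
}

/** Cursor implemented directly over TreeNode, with no intermediate node-number map. */
final class TreeCursor implements Cursor {
    private TreeNode current;

    TreeCursor(TreeNode start) { current = start; }

    public String name() { return current.name; }

    public boolean toFirstChild() {
        if (current.children.isEmpty()) return false;
        current = current.children.get(0);
        return true;
    }

    public boolean toNextSibling() {
        if (current.parent == null) return false;
        List<TreeNode> siblings = current.parent.children;
        int next = siblings.indexOf(current) + 1;
        if (next >= siblings.size()) return false;
        current = siblings.get(next);
        return true;
    }

    public boolean toParent() {
        if (current.parent == null) return false;
        current = current.parent;
        return true;
    }
}

final class CursorDemo {
    /** Collects node names in document order, using only the Cursor interface. */
    static List<String> documentOrder(Cursor c) {
        List<String> out = new ArrayList<>();
        int depth = 0; // depth below the starting node, so we never leave its subtree
        while (true) {
            out.add(c.name());
            if (c.toFirstChild()) { depth++; continue; }
            // No children: climb until some ancestor (below the start) has a next sibling.
            while (depth > 0 && !c.toNextSibling()) {
                c.toParent();
                depth--;
            }
            if (depth == 0) return out;
        }
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode("doc")
                .add(new TreeNode("a").add(new TreeNode("b")))
                .add(new TreeNode("c"));
        System.out.println(documentOrder(new TreeCursor(root))); // prints [doc, a, b, c]
    }
}
```

The point of the sketch is that documentOrder never sees TreeNode: a second Cursor implementation over DOM, a database, or DTM itself could be passed to the same method unchanged.
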
For models which are general networks rather than trees there is a risk of getting into loops; stylesheets can be written to avoid that, or the XCI implementation can have some mechanism to break those potential loops. In any case, Xalan already has protection against unreasonably deep recursion, so those will be caught and reported.

I had hoped to donate this improvement back to Apache. Unfortunately, at the same time I was doing this, the IBM XSLT processor was undergoing other major rewrites to build a true optimizing compiler as a replacement for XSLTC, which we were not ready (willing?) to contribute to Apache, and I didn't have a chance to disentangle the two.

Having done it once: It *is* a big rewrite task. Not an insanely complicated one, since the switch from DTM iterators to XCI cursors is nearly a 1:1 mapping. But even with experience I'd say 3 months of full-time work minimum; possibly much more. And it's been long enough that I'd have to redevelop it de novo, which is not a bad thing, as I'm not sure who, if anyone, in IBM could be persuaded to donate the XCI I created for them.

I have set the initial priority as *minor*; Xalan can certainly continue as it is, with DTM as the (ugly) glue layer, and we've got lots of higher-severity items stacked up on the backlog. However, if we had the resources we once did (with people actually paid to work on Xalan), I believe this is a significant improvement in Xalan architecture and would be worth investing in. Certainly if anyone attempts a true Xalan 3.0 (as opposed to a Xalan 2.x that supports part or all of XSLT 3.0), this should be part of the effort.