Hi Sylvain, thanks for your kind reply! I suspected the XPath limitations you explained so well, but deep in my heart I was hoping for a solution I didn't know yet, which is why I asked :P :P
I'll take a look at both solutions, even if the first sounds to me more compliant with the XPointer recommendation and, at the same time, closer to what I already did - and to older Cocoon XInclude implementations.
Thank you very much for your hints, much appreciated :)
A bientot!
Simone

P.S. Offtopic: maybe I'm wrong, but I'm fairly sure we met once in Toulouse, I was one of the Asemantics juniors involved in Joost :P

On Sun, Nov 22, 2009 at 3:27 PM, Sylvain Wallez <sylv...@apache.org> wrote:
> Simone Tripodi wrote:
>>
>> Hi all guys,
>> I'm very sorry if I don't appear frequently on the ML but since April
>> I've been working very hard for a customer client in Paris that don't
>> let me some spare time to dedicate to OS projects.
>>
>
> Don't be sorry. We all have our own jobs/interest/duties that have driven us
> away from Cocoon. Glad to see you back!
>
>> I'm writing because I'm sure the XInclude transformer I submitted time
>> ago could be optimized, so I'd like to ask you a little help :)
>>
>> The state of the art is that, when including an entire document, it is
>> processed efficiently through SAX APIs; the problem comes when
>> processing a document referenced by xinclude+xpointer, that forces the
>> processor to extract a sub-document of the included.
>>
>> To perform this, I implemented a DOM parsing, then through XPath I
>> extract the sub-document the processor has to be included, then
>> navigating the elements will be converted to SAX events. As you
>> noticed, this takes time, too much IMO, but I didn't find/don't know
>> any better solution :(
>> Since you experienced the stax, maybe you're able to suggest me a fast
>> way to parse a document with xpath and invoke SAX events, so I'm able
>> to provide you a much better - and faster, above all - solution.
>>
>> Any hint? Every suggestion will be very appreciated.
>>
>
> The problem with XPath and XML streaming (be it SAX or StAX) is that XPath
> is a language that allows exploring the document tree in all directions and
> thus inherently expects having the whole document tree available, which is
> clearly not compatible with streaming.
>
> There are different approaches to solving this :
> - use a deferred loading DOM implementation, which buffers events only when
> it needs them to traverse the tree. Axiom [1] provides this IIRC, along with
> an XPath implementation.
> - restrain the XPointer expression to a subset of XPath that can easily be
> implemented on top of a stream. This means restricting selection only on the
> current element, its attribute and its ancestors. There's an implementation
> of this approach in Tika.
>
> The XInclude transformer can be smart enough to use the most efficient
> implementation for the given XPath expression, i.e. try to parse it with
> Tika's restricted subset, and fallback to something more costly, either
> Axiom or plain DOM.
>
> Sylvain
>
> [1] http://ws.apache.org/commons/axiom/
> [2] https://svn.apache.org/repos/asf/lucene/tika/trunk/tika-core/src/main/java/org/apache/tika/sax/xpath/
>
> --
> Sylvain Wallez - http://bluxte.net
>

--
http://www.google.com/profiles/simone.tripodi
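
For illustration, the restricted-subset idea quoted above (selecting only on the current element and its ancestors) can be sketched as a SAX filter that keeps a stack of the open element names and forwards events only while inside a matching subtree, so nothing is buffered. This is a minimal sketch written against Python's xml.sax rather than Java, and it is not Tika's code: the `PathFilter` class, the `extract` helper and the "/doc/section" path syntax are all hypothetical, and only absolute chains of child steps are handled (no predicates, no attributes, no namespaces).

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator


class PathFilter(ContentHandler):
    """Forward SAX events only for subtrees whose root matches a simple path.

    Hypothetical helper: `path` is an absolute chain of child steps such as
    "/doc/section". Matching needs only the names of the open elements,
    which is why it works on a stream.
    """

    def __init__(self, path, downstream):
        super().__init__()
        self.steps = [step for step in path.split("/") if step]
        self.downstream = downstream
        self.stack = []     # names of the currently open elements
        self.depth_in = 0   # > 0 while inside a matched subtree

    def startElement(self, name, attrs):
        self.stack.append(name)
        # Either we are already inside a match, or this element starts one.
        if self.depth_in or self.stack == self.steps:
            self.depth_in += 1
            self.downstream.startElement(name, attrs)

    def characters(self, content):
        if self.depth_in:
            self.downstream.characters(content)

    def endElement(self, name):
        if self.depth_in:
            self.downstream.endElement(name)
            self.depth_in -= 1
        self.stack.pop()


def extract(xml_text, path):
    """Serialise the subtrees selected by `path`, without building a tree."""
    out = io.StringIO()
    serializer = XMLGenerator(out, encoding="utf-8")
    xml.sax.parseString(xml_text.encode("utf-8"), PathFilter(path, serializer))
    return out.getvalue()
```

Calling `extract(document, "/doc/section")` returns only the serialised `section` subtrees. In the combined strategy described above, an expression that this kind of matcher cannot parse would be handed to the costlier Axiom or plain DOM path instead.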