Re: XInclude optimization

2009-12-10 Thread Reinhard Pötz
Simone Tripodi wrote:
 Hi Guys,
 do you have some spare time to review the last patch submitted on [1]?
 I know it requires time...
 Thanks in advance, best regards,

Unless somebody else is quicker than me, I will have a look at your
patch before I create the release.

-- 
Reinhard Pötz   Managing Director, {Indoqa} GmbH
 http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member  reinh...@apache.org



Re: XInclude optimization

2009-12-10 Thread Simone Tripodi
Hi Reinhard,
Much appreciated, thanks! :)
All the best, goodbye!
Simo



-- 
http://www.google.com/profiles/simone.tripodi


Re: XInclude optimization

2009-12-09 Thread Simone Tripodi
Hi Guys,
do you have some spare time to review the last patch submitted on [1]?
I know it requires time...
Thanks in advance, best regards,
Simone

[1] https://issues.apache.org/jira/browse/COCOON3-3


-- 
http://www.google.com/profiles/simone.tripodi


Re: XInclude optimization

2009-11-24 Thread Sylvain Wallez

Simone Tripodi wrote:

Hi Sylvain and Simone,
thanks a lot, the suggestions you provided are all very
interesting. I now wonder whether it is possible to build a processor
that uses the Tika approach when it recognizes certain kinds of
paths, and XSL-on-the-fly for more complex cases. What do you
think?
  


As I suggested previously: first try to parse the XPath expression with 
Tika's parser, and if it fails because the expression doesn't match the 
subset it accepts, fall back to XSL-on-the-fly.


Looking at Tika's parser [1], it looks like you'll have to override the 
parse() method to fail hard by throwing an exception rather than 
returning Matcher.FAIL, so that you can detect XPath features outside 
of the subset it accepts.
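
A minimal sketch of the "try the restricted parser first, fall back
otherwise" selection described above. Every name in it is hypothetical:
the subset check is a stand-in regular expression (absolute paths made of
plain element names), not Tika's XPathParser, and the exception type is
invented for the example. It only illustrates why a hard failure is easier
to act on than a sentinel matcher.

```java
import java.util.regex.Pattern;

/**
 * Chooses between a streaming matcher and an XSLT fallback.
 * Hypothetical names throughout; this does not use or reproduce Tika's API.
 */
final class XPointerStrategySelector {

    enum Strategy { STREAMING, XSLT_ON_THE_FLY }

    /** Thrown when an expression falls outside the assumed streamable subset. */
    static final class UnsupportedXPathException extends Exception {
        UnsupportedXPathException(String xpath) {
            super("Not in the streamable subset: " + xpath);
        }
    }

    // Assumed subset: absolute paths of plain element names, e.g. /doc/section/para
    private static final Pattern SIMPLE_PATH =
            Pattern.compile("(/[A-Za-z_][A-Za-z0-9_.-]*)+");

    /** Fails hard instead of returning a "never matches" sentinel. */
    static String[] parseStreamable(String xpath) throws UnsupportedXPathException {
        if (!SIMPLE_PATH.matcher(xpath).matches()) {
            throw new UnsupportedXPathException(xpath);
        }
        // Drop the leading slash and split into element-name steps.
        return xpath.substring(1).split("/");
    }

    /** Picks the cheap strategy when the expression parses, the costly one otherwise. */
    static Strategy select(String xpath) {
        try {
            parseStreamable(xpath);
            return Strategy.STREAMING;
        } catch (UnsupportedXPathException e) {
            return Strategy.XSLT_ON_THE_FLY;
        }
    }
}
```

With this sketch, select("/doc/section/para") picks the streaming strategy,
while an expression with predicates or "//" steps is routed to the fallback.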



Sylvain, I still haven't read the Tika documentation; could you
point me to the relevant docs on this topic?
  


There's no specific documentation on this particular feature, as it's 
more of an internal utility than a primary feature in Tika. That said, 
the code is pretty straightforward.

Simo, have you already tried generating the XSLT on the fly?
The most basic approach I can think of is generating the XSL string
from a template and then passing it to the XSL parser, but I'm sure it
could be implemented in a better way :P
  


Sounds like the way to go, but you should cache the resulting template 
object to avoid recreating and reparsing the XSL at every request. The 
same applies to Tika matcher objects.
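
The caching advice can be sketched with the standard JAXP API: a compiled
javax.xml.transform.Templates object is safe to share between threads, so
it can be stored per XPath expression and reused. The class name, the
unbounded map and the shape of the generated stylesheet are assumptions
made for this example; a real transformer would probably want a bounded
cache.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

/** Caches one compiled stylesheet per XPath expression (hypothetical helper). */
final class XPointerTemplatesCache {

    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Templates> cache = new ConcurrentHashMap<>();

    /** Returns the compiled stylesheet for this XPath, compiling it only once. */
    Templates templatesFor(String xpath) {
        return cache.computeIfAbsent(xpath, this::compile);
    }

    // TransformerFactory is not guaranteed to be thread-safe, hence synchronized.
    private synchronized Templates compile(String xpath) {
        // Escape the characters that are not allowed inside an XML attribute value.
        String escaped = xpath.replace("&", "&amp;")
                              .replace("<", "&lt;")
                              .replace("\"", "&quot;");
        String xsl = "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:template match=\"/\">"
                + "<xsl:copy-of select=\"" + escaped + "\"/>"
                + "</xsl:template>"
                + "</xsl:stylesheet>";
        try {
            return factory.newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (TransformerConfigurationException e) {
            throw new IllegalArgumentException("Cannot compile XPath: " + xpath, e);
        }
    }
}
```

Each request then calls templatesFor(xpath).newTransformer(), which is
cheap compared to reparsing the XSL.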


Sylvain

[1] 
https://svn.apache.org/repos/asf/lucene/tika/trunk/tika-core/src/main/java/org/apache/tika/sax/xpath/XPathParser.java


--
Sylvain Wallez - http://bluxte.net



Re: XInclude optimization

2009-11-24 Thread Simone Tripodi
Hi Sylvain,
I can't thank you enough, this is much appreciated. I'll
follow your suggestions :)
See you soon
Simone


-- 
http://www.google.com/profiles/simone.tripodi


Re: XInclude optimization

2009-11-24 Thread Simone Tripodi
Hi Sylvain
Sorry but I forgot to ask you a short question in the previous email:
can the Tika code be imported/modified into Cocoon3? AFAIK it should
be allowed, but I don't know the conditions under which it can be
done.
See you soon!
Simo


-- 
http://www.google.com/profiles/simone.tripodi


Re: XInclude optimization

2009-11-24 Thread Sylvain Wallez

Simone Tripodi wrote:

Hi Sylvain
Sorry but I forgot to ask you a short question in the previous email:
can the Tika code be imported/modified into Cocoon3? AFAIK it should
be allowed, but I don't know the conditions under which it can be
done.
  


I don't really understand your question. Tika is an Apache project, so 
there's no license issue.


Now if the question is about how, technically, to include Tika in 
Cocoon, I admit I have no clue about that.


Sylvain

--
Sylvain Wallez - http://bluxte.net



Re: XInclude optimization

2009-11-24 Thread Reinhard Pötz
Simone Tripodi wrote:
 Hi Sylvain
 Sorry but I forgot to ask you a short question in the previous email:
 can the Tika code be imported/modified into Cocoon3? 

Do you really have to modify Tika code? If so, it would be best to
contribute your changes back to their project.

Since you have to include a library I strongly recommend that everything
goes into cocoon-optional in order to keep the number of required
libraries low for the pipeline API.

 AFAIK it should
 be allowed, but I don't know the conditions under which it can be
 done.

If your question is about licensing, then it's very simple: You don't
have to do anything because Tika is an ASF project.

-- 
Reinhard Pötz   Managing Director, {Indoqa} GmbH
 http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member  reinh...@apache.org



Re: XInclude optimization

2009-11-24 Thread Simone Tripodi
Hi all,
Thank you both, my question was about the legal issues, which you have
clarified for me :)

Reinhard, no problem about the optionals. I remembered the
policy, but I appreciate the reminder :) BTW, after a quick look
at Tika, I was thinking about importing just the needed classes and
modifying them according to our needs, so if you agree I'd add the
XInclude in the cocoon-sax module... what do you think about it? Just
let me know!

See you guys and thanks a *lot* for your help :)
Best regards
Simo


-- 
http://www.google.com/profiles/simone.tripodi


Re: XInclude optimization

2009-11-23 Thread Simone Gianni

Hi Simone and Sylvain,
aren't XSLT transformers already optimized for SAX and XPath? I mean, 
isn't an XSLT containing an XPath expression and used in a SAX context 
already able to resolve the XPath while keeping buffering to a minimum?


I clearly remember that a lot of work went into this in 
Xalan and other XSLT engines, and also that complex XPath expressions 
could change the performance of a transformation because of increased 
buffering.


In that case, maybe, instead of reinventing it, it should be possible to 
delegate the transformation (extraction of a fragment from the entire 
XML stream) to an XSLT processor. The simplest way could be to generate 
an XSLT on the fly :) .. the correct way would be to use the 
[Xalan|Saxon|any other] internal APIs to perform the XPath resolution. 
In both cases, it will be faster than transforming to DOM.
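
A sketch of the "generate an XSLT on the fly" option, using only the
standard javax.xml.transform API. The class and method names are
hypothetical, and the single xsl:copy-of template is just one assumed shape
for the generated stylesheet. The result is delivered as SAX events through
a SAXResult, so the selected fragment can be passed down a SAX pipeline;
whether the processor builds an internal tree of the source is up to the
XSLT engine.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.helpers.DefaultHandler;

/** Extracts a fragment by generating a tiny stylesheet (hypothetical helper). */
final class XsltFragmentExtractor {

    /** Builds a one-template stylesheet that copies whatever the XPath selects. */
    static String stylesheetFor(String xpath) {
        String escaped = xpath.replace("&", "&amp;")
                              .replace("<", "&lt;")
                              .replace("\"", "&quot;");
        return "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:template match=\"/\">"
                + "<xsl:copy-of select=\"" + escaped + "\"/>"
                + "</xsl:template>"
                + "</xsl:stylesheet>";
    }

    /** Runs the generated stylesheet and sends the selected nodes to the consumer as SAX events. */
    static void extract(String xml, String xpath, ContentHandler consumer)
            throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheetFor(xpath))));
        transformer.transform(new StreamSource(new StringReader(xml)),
                new SAXResult(consumer));
    }

    /** Demo consumer: records the names of the elements that reach the output. */
    static List<String> selectedElementNames(String xml, String xpath) {
        List<String> names = new ArrayList<>();
        try {
            extract(xml, xpath, new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName,
                                         Attributes atts) {
                    names.add(qName);
                }
            });
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed for: " + xpath, e);
        }
        return names;
    }
}
```

For the input <a><b><c/></b><d/></a> and the expression /a/b, only the b
element and its c child reach the consumer.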


Simone



--
Simone Gianni    CEO Semeru s.r.l.    Apache Committer
http://www.simonegianni.it/



Re: XInclude optimization

2009-11-23 Thread Sylvain Wallez

Simone Gianni wrote:

Hi Simone and Sylvain,
aren't XSLT transformers already optimized for SAX and XPath? I mean, 
isn't an XSLT containing an XPath expression and used in a SAX context 
already able to resolve the XPath while keeping buffering to a 
minimum?


I clearly remember that a lot of work went into this in 
Xalan and other XSLT engines, and also that complex XPath expressions 
could change the performance of a transformation because of increased 
buffering.


Xalan has an optimized implementation of the document tree [1], more 
efficient than the standard DOM for read-only and selection operations. 
Xalan has an incremental processing mode, but IIRC it's more about being 
able to produce some output before the whole document has been read 
than about avoiding building parts of the document tree. So it will 
allow for faster processing, but won't change memory consumption.


In that case, maybe, instead of reinventing it, it should be possible 
to delegate the transformation (extraction of a fragment from the 
entire XML stream) to an XSLT processor. The simplest way could be to 
generate an XSLT on the fly :) .. the correct way would be to use the 
[Xalan|Saxon|any other] internal APIs to perform the XPath resolution. 
In both cases, it will be faster than transforming to DOM.


Agreed. It may be easier to produce a small XSL transformation from the 
XPointer expression than to use Axiom. But still, for simple expressions, 
the pure streaming approach used by Tika would be way more efficient.


Sylvain

[1] http://xml.apache.org/xalan-j/dtm.html

--
Sylvain Wallez - http://bluxte.net



Re: XInclude optimization

2009-11-23 Thread Simone Tripodi
Hi Sylvain and Simone,
thanks a lot, the suggestions you provided are all very
interesting. I now wonder whether it is possible to build a processor
that uses the Tika approach when it recognizes certain kinds of
paths, and XSL-on-the-fly for more complex cases. What do you
think?

Sylvain, I still haven't read the Tika documentation; could you
point me to the relevant docs on this topic?

Simo, have you already tried generating the XSLT on the fly?
The most basic approach I can think of is generating the XSL string
from a template and then passing it to the XSL parser, but I'm sure it
could be implemented in a better way :P

Every suggestion will be much appreciated, thanks in advance.

Best regards, have a nice evening!
Simone


-- 
http://www.google.com/profiles/simone.tripodi


Re: XInclude optimization

2009-11-22 Thread Sylvain Wallez

Simone Tripodi wrote:

Hi all,
I'm very sorry that I don't appear on the ML more often, but since April
I've been working very hard for a client in Paris, which doesn't
leave me any spare time to dedicate to OS projects.
  


Don't be sorry. We all have our own jobs/interest/duties that have 
driven us away from Cocoon. Glad to see you back!



I'm writing because I'm sure the XInclude transformer I submitted some
time ago could be optimized, so I'd like to ask you for a little help :)

The current state is that, when including an entire document, it is
processed efficiently through SAX APIs; the problem comes when
processing a document referenced by xinclude+xpointer, which forces the
processor to extract a sub-document of the included document.

To do this, I parse the document into a DOM, then use XPath to
extract the sub-document to be included, and finally walk its
elements and convert them to SAX events. As you may have
noticed, this takes time, too much IMO, but I couldn't find and don't
know of any better solution :(
Since you have experience with StAX, maybe you can suggest a fast
way to evaluate XPath against a document and emit SAX events, so that I
can provide a much better (and, above all, faster) solution.
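
For reference, the DOM-based approach described above looks roughly like
the following with the standard JAXP APIs. This is an illustrative sketch,
not the code of the submitted transformer: the class and method names are
hypothetical, and an identity Transformer is used to replay each selected
node as SAX events.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Parse to DOM, select with XPath, replay the selection as SAX events. */
final class DomXPathToSax {

    static void replay(String xml, String xpath, ContentHandler consumer) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        // The whole included document is materialized in memory here.
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
        // An identity transformer turns each DOM node back into SAX events.
        // Note: the consumer sees startDocument/endDocument once per node.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        for (int i = 0; i < nodes.getLength(); i++) {
            identity.transform(new DOMSource(nodes.item(i)), new SAXResult(consumer));
        }
    }

    /** Demo consumer: records the names of the replayed elements. */
    static List<String> replayedElementNames(String xml, String xpath) {
        List<String> names = new ArrayList<>();
        try {
            replay(xml, xpath, new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName,
                                         Attributes atts) {
                    names.add(qName);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("Replay failed for: " + xpath, e);
        }
        return names;
    }
}
```

The cost comes from the first step: the full tree is built before a single
event of the selected fragment can be emitted.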

Any hint? Every suggestion will be very appreciated.
  


The problem with XPath and XML streaming (be it SAX or StAX) is that 
XPath is a language that allows exploring the document tree in all 
directions and thus inherently expects having the whole document tree 
available, which is clearly not compatible with streaming.


There are different approaches to solving this:
- use a deferred loading DOM implementation, which buffers events only 
when it needs them to traverse the tree. Axiom [1] provides this IIRC, 
along with an XPath implementation.
- restrict the XPointer expression to a subset of XPath that can easily 
be implemented on top of a stream. This means restricting selection to 
the current element, its attributes and its ancestors. There's an 
implementation of this approach in Tika [2].


The XInclude transformer can be smart enough to use the most efficient 
implementation for the given XPath expression, i.e. try to parse it with 
Tika's restricted subset, and fall back to something more costly, either 
Axiom or plain DOM.
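
To illustrate the second approach, here is a minimal sketch of a pure
streaming selector for an assumed subset of XPath: absolute paths made of
plain element names, such as /a/b. It keeps only the stack of open element
names, so memory use does not depend on document size. The class is
hypothetical and is not Tika's implementation; for the demo it records a
textual trace (attributes omitted) where a real transformer would forward
the events to the next ContentHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Streams through a document and keeps only elements matching a simple path. */
final class StreamingPathSelector extends DefaultHandler {

    private final List<String> target;                    // e.g. [a, b] for "/a/b"
    private final List<String> path = new ArrayList<>();  // names of currently open elements
    private int matchDepth = -1;                          // -1 while outside a selected element
    private final StringBuilder events = new StringBuilder();

    StreamingPathSelector(String simplePath) {
        this.target = Arrays.asList(simplePath.substring(1).split("/"));
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        path.add(qName);
        // The decision needs only the element's ancestors, which are already known.
        if (matchDepth < 0 && path.equals(target)) {
            matchDepth = path.size();
        }
        if (matchDepth >= 0) {
            events.append('<').append(qName).append('>');
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (matchDepth >= 0) {
            events.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (matchDepth >= 0) {
            events.append("</").append(qName).append('>');
            if (path.size() == matchDepth) {
                matchDepth = -1;   // the selected element has just been closed
            }
        }
        path.remove(path.size() - 1);
    }

    /** Parses the XML once and returns the trace of the selected events. */
    static String select(String xml, String simplePath) {
        StreamingPathSelector handler = new StreamingPathSelector(simplePath);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.events.toString();
    }
}
```

Selecting /a/b from <a><b>x<c/></b><d/><b>y</b></a> yields both b elements
with their content and skips d, without ever holding the whole document.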


Sylvain

[1] http://ws.apache.org/commons/axiom/
[2] 
https://svn.apache.org/repos/asf/lucene/tika/trunk/tika-core/src/main/java/org/apache/tika/sax/xpath/


--
Sylvain Wallez - http://bluxte.net



Re: XInclude optimization

2009-11-22 Thread Simone Tripodi
Hi Sylvain,
thanks for your kind reply! I suspected the XPath limitations you
explained so well, but deep in my heart I was hoping for a solution
I didn't know about yet, which is why I asked :P :P

I'll take a look at both solutions, even if the first sounds to me
more compliant with the XPointer recommendation and at the same time
closer to what I already did, and to older Cocoon XInclude
implementations.

Thank you very much for your hints, much appreciated :)
See you soon!
Simone

P.S. Off-topic: maybe I'm wrong, but I'm sure we met once in Toulouse, I
was one of the Asemantics juniors involved in Joost :P


-- 
http://www.google.com/profiles/simone.tripodi


Re: XInclude optimization

2009-11-22 Thread Jos Snellings
Hmmm, I guess the XPath expression is known before the parsing begins?
I remember I have done a similar thing, where a chunk had to be isolated
from a document that came by via a SAX stream, but here the xpath
expression was something like: /element1/elemen...@id=somenumber].

Theorem: any XPath expression can be evaluated with a SAX filter.
Proof?
Do you know some exceptions?

Jos



Re: XInclude optimization

2009-11-22 Thread Sylvain Wallez

Simone Tripodi wrote:

Hi Sylvain,
thanks for your kind reply! I suspected the XPath limitations you
explained so well, but deep in my heart I was hoping for a solution
I didn't know about yet, which is why I asked :P :P

I'll take a look at both solutions, even if the first sounds to me
more compliant with the XPointer recommendation and at the same time
closer to what I already did, and to older Cocoon XInclude
implementations.
  


Axiom is what will give you the best compliance, but it is a 
relatively heavyweight solution compared to pure streaming. This is why 
I was suggesting choosing the actual XPath implementation according to 
the given XPath expression, since the Tika approach is really pure 
streaming. But this adds some complexity.



Thank you very much for your hints, much appreciated :)
See you soon!
Simone

P.S. Off-topic: maybe I'm wrong, but I'm sure we met once in Toulouse, I
was one of the Asemantics juniors involved in Joost :P
  


That's right! I did not make the connection! It's a small world ;-)

Sylvain

--
Sylvain Wallez - http://bluxte.net



Re: XInclude optimization

2009-11-22 Thread Sylvain Wallez

Jos Snellings wrote:

Hmmm, I guess the XPath expression is known before the parsing begins?
I remember I have done a similar thing, where a chunk had to be isolated
from a document that came by via a SAX stream, but here the xpath
expression was something like: /element1/elemen...@id=somenumber].

Theorem: any XPath expression can be evaluated with a SAX filter.
Proof?
Do you know some exceptions?
  


What about this one: //foo[bar[position() = 3]//baz], which finds all 
foo elements whose 3rd bar child has a baz descendant element.


This requires buffering the contents of every foo element to inspect 
their children sub-trees.
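
The counter-example can be checked with the standard javax.xml.xpath API
on a DOM. In the sketch below (the class name is hypothetical) the test
documents can differ only deep inside the third bar element, yet that
difference decides whether the enclosing foo is selected. A streaming
processor therefore cannot know at the foo start tag whether to emit it,
and has to buffer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Evaluates the counter-example expression against a fully built tree. */
final class BufferingCounterExample {

    static final String EXPRESSION = "//foo[bar[position() = 3]//baz]";

    /** Returns how many foo elements the expression selects in the document. */
    static int countMatches(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(EXPRESSION, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Evaluation failed", e);
        }
    }
}
```

A foo whose third bar contains a nested baz is selected; the same document
without that baz, or with baz only under the first bar, selects nothing.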


Sylvain

--
Sylvain Wallez - http://bluxte.net



Re: XInclude optimization

2009-11-22 Thread Simone Tripodi
Hi Jos,
thanks for your reply. The XPath expression is already known before
parsing the document, since the XInclude processor catches the xpointer
reference before including the document.
I think your solution works, but I suspect only for a limited
subset of XPath expressions; the exceptions come when an
expression contains sibling/parent references...
What do you think about it?
Best regards and thanks for your hint!
Simone


-- 
http://www.google.com/profiles/simone.tripodi


Re: XInclude optimization

2009-11-22 Thread Simone Tripodi
Hi Sylvain,
indeed, that's yet another exception I hadn't thought of, thanks for your
clarification!
Have a nice day, see you soon ;)
Simo


-- 
http://www.google.com/profiles/simone.tripodi