Hi Sylvain,
I can't thank you enough, this is very much appreciated. I'll
follow your suggestions :)
See you soon!
Simone

On Tue, Nov 24, 2009 at 10:21 AM, Sylvain Wallez <sylv...@apache.org> wrote:
> Simone Tripodi wrote:
>>
>> Hi Sylvain and Simone,
>> thanks a lot, the suggestions you provided are all very
>> interesting. I now wonder whether it is possible to build a processor
>> that uses the Tika approach when it recognizes certain kinds of
>> paths, and "XSL-on-the-fly" for more complex cases. What do you
>> think?
>>
>
> As I suggested previously: first try to parse the XPath expression with
> Tika's parser, and if it fails because the expression doesn't match the
> subset it accepts, fall back to XSL-on-the-fly.
>
> Looking at Tika's parser [1], it looks like you'll have to override the
> parse() method so that it fails hard by throwing an exception rather than
> returning Matcher.FAIL. That way you can detect XPath features outside of
> the subset it accepts.
>
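
A rough sketch of that "try the strict parser, fall back on failure" idea, in
Java. Everything in it is a stand-in: Selector, UnsupportedXPathException,
parseStreaming() and compileXsl() are hypothetical names, and the regex check
only imitates a strict parser that throws instead of returning Matcher.FAIL.
It is not Tika's actual XPathParser, nor the subset Tika really accepts.

```java
class XPathStrategyChooser {

    /** Hypothetical handle for whichever strategy ends up evaluating the XPath. */
    interface Selector {
        String kind();
    }

    /** Thrown by the strict parser when the expression is outside its subset. */
    static class UnsupportedXPathException extends RuntimeException {
        UnsupportedXPathException(String xpath) {
            super("Unsupported XPath: " + xpath);
        }
    }

    /**
     * Stand-in for a strict streaming parser. As a toy subset it accepts only
     * absolute paths made of plain child steps, such as /a/b/c, and throws
     * for anything else instead of returning a "fail" value.
     */
    static Selector parseStreaming(String xpath) {
        if (!xpath.matches("(/[A-Za-z_][\\w.-]*)+")) {
            throw new UnsupportedXPathException(xpath);
        }
        return () -> "streaming";
    }

    /** Stand-in for the XSL-on-the-fly path, assumed to handle any XPath. */
    static Selector compileXsl(String xpath) {
        return () -> "xsl";
    }

    /** Try the cheap streaming parser first, fall back to XSL when it refuses. */
    static Selector choose(String xpath) {
        try {
            return parseStreaming(xpath);
        } catch (UnsupportedXPathException e) {
            return compileXsl(xpath);
        }
    }

    public static void main(String[] args) {
        System.out.println(choose("/html/body/p").kind());
        System.out.println(choose("//p[@class='x']/following-sibling::p").kind());
    }
}
```

Throwing a dedicated exception keeps the decision in one place: choose() never
has to inspect the returned object to find out whether parsing really worked.
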
>> Sylvain, I still haven't read the Tika documentation. Can you
>> point me to the related docs on this topic?
>>
>
> There's no specific documentation on this particular feature, as it's more an
> internal utility than a primary feature in Tika. That said, the code is pretty
> straightforward.
>>
>> Simo, have you already tried generating the XSLT on the fly?
>> The most basic approach I thought of is generating the XSL string from a
>> template, then passing it to the XSL parser, but I'm sure it could be
>> implemented in a better way :P
>>
>
> Sounds like the way to go, but you should cache the resulting template
> object to avoid recreating and reparsing the XSL at every request. The same
> applies to Tika matcher objects.
>
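
A minimal sketch of the caching part, using the standard JAXP Templates API.
The class name XslTemplateCache, the XSL_SKELETON stylesheet and its <result>
wrapper element are assumptions made up for illustration; only the
javax.xml.transform calls are real. The compiled Templates object is kept per
XPath expression, so the XSL string is built and parsed once per expression.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XslTemplateCache {

    // Hypothetical stylesheet skeleton; %s is replaced by the XPath expression.
    private static final String XSL_SKELETON =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/\">"
        + "<result><xsl:copy-of select=\"%s\"/></result>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Templates> cache = new ConcurrentHashMap<>();

    /** Returns the compiled stylesheet for this XPath, compiling it on first use. */
    Templates templatesFor(String xpath) {
        return cache.computeIfAbsent(xpath, this::compile);
    }

    private Templates compile(String xpath) {
        String xsl = String.format(XSL_SKELETON, escapeAttribute(xpath));
        try {
            // TransformerFactory is not guaranteed to be thread-safe.
            synchronized (factory) {
                return factory.newTemplates(new StreamSource(new StringReader(xsl)));
            }
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Cannot compile XSL for: " + xpath, e);
        }
    }

    /** Escapes the characters that would break the select="..." attribute. */
    private static String escapeAttribute(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    /** Applies the cached stylesheet; a Transformer is cheap to create from Templates. */
    String apply(String xpath, String xml) {
        StringWriter out = new StringWriter();
        try {
            templatesFor(xpath).newTransformer()
                .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        XslTemplateCache cache = new XslTemplateCache();
        System.out.println(cache.apply("/a/b", "<a><b>x</b><c/></a>"));
    }
}
```

Templates objects are thread-safe and meant to be shared, while each
Transformer is used by one request only, which is why the sketch caches the
former and creates the latter on every call. The same map-based approach could
hold the Tika matcher objects, assuming they carry no per-request state.
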
> Sylvain
>
> [1]
> https://svn.apache.org/repos/asf/lucene/tika/trunk/tika-core/src/main/java/org/apache/tika/sax/xpath/XPathParser.java
>
> --
> Sylvain Wallez - http://bluxte.net
>
>



-- 
http://www.google.com/profiles/simone.tripodi