On Thu, Sep 03, 2009 at 03:07:18PM +0200, Jukka Zitting wrote:
> Hi,
>
> On Wed, Sep 2, 2009 at 2:40 PM, David Causse wrote:
> > If I use Tika to parse HTML and pass the parsed String to a Lucene
> > analyzer, what about the offset information for KWIC and return to text
> > (like the google
On Sep 2, 2009, at 5:40 AM, David Causse wrote:
Hi,
If I use Tika to parse HTML and pass the parsed String to a Lucene
analyzer, what about the offset information for KWIC and return to text
(like the Google cache view)? How can I keep track of the offsets
between the Tika parser and lu
.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -----Original Message-----
> From: Jukka Zitting [mailto:jukka.zitt...@gmail.com]
> Sent: Thursday, September 03, 2009 3:07 PM
> To: java-user@lucene.apache.org; David Causse
> Subject: Re: Use of t
Hi,
On Wed, Sep 2, 2009 at 2:40 PM, David Causse wrote:
> If I use Tika to parse HTML and pass the parsed String to a Lucene
> analyzer, what about the offset information for KWIC and return to text
> (like the Google cache view)? How can I keep track of the offsets
> between the Tika parser and