On Mon, Mar 19, 2012 at 4:50 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> Have you marked that for GSOC? Would be a good idea!

yes I did

>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -----Original Message-----
>> From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
>> Sent: Monday, March 19, 2012 4:43 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Using term offsets for hit highlighting
>>
>> Alan, you made my day!
>>
>> The branch is kind of outdated but I looked at it lately and I can
>> certainly help to get it up to speed. The feature in that branch is
>> quite a big one and its in a very early stage. Still I want to
>> encourage you to take a look and work on it. I promise all my help
>> with the issues!
>>
>> let me know if you have questions!
>>
>> simon
>>
>> On Mon, Mar 19, 2012 at 3:52 PM, Alan Woodward
>> <alan.woodw...@romseysoftware.co.uk> wrote:
>> > Cool, thanks Robert. I'll take a look at the JIRA ticket.
>> >
>> > On 19 Mar 2012, at 14:44, Robert Muir wrote:
>> >
>> >> On Mon, Mar 19, 2012 at 10:38 AM, Alan Woodward
>> >> <alan.woodw...@romseysoftware.co.uk> wrote:
>> >>> Hello,
>> >>>
>> >>> The project I'm currently working on requires the reporting of exact
>> >>> hit positions from some pretty hairy queries, not all of which are
>> >>> covered by the existing highlighter modules. I'm working round this
>> >>> by translating everything into SpanQueries, and using the getSpans()
>> >>> method to locate hits (I've extended the Spans interface to make
>> >>> term offsets available - see
>> >>> https://issues.apache.org/jira/browse/LUCENE-3826). This works for
>> >>> our use-case, but isn't terribly efficient, and obviously isn't
>> >>> applicable to non-Span queries.
>> >>>
>> >>> I've seen a bit of chatter on the list about using term offsets to
>> >>> provide accurate highlighting in Lucene. I'm going to have a couple
>> >>> of weeks free in April, and I thought I might have a go at
>> >>> implementing this. Mainly I'm wondering if there's already been
>> >>> thoughts about how to do it. My current thoughts are to somehow
>> >>> extend the Weight and Scorer interface to make term offsets
>> >>> available; to get highlights for a given set of documents, you'd
>> >>> essentially run the query again, with a filter on just the documents
>> >>> you want highlighted, and have a custom collector that gets the term
>> >>> offsets in place of the scores.
>> >>>
>> >>
>> >> Hi Alan, Simon started some initial work on
>> >> https://issues.apache.org/jira/browse/LUCENE-2878
>> >>
>> >> Some work and prototypes were done in a branch, but it might be
>> >> lagging behind trunk a bit.
>> >>
>> >> Additionally at the time it was first done, I think we didn't yet
>> >> support offsets in the postings lists.
>> >> We've since added this and several codecs support it.
>> >>
>> >> --
>> >> lucidimagination.com
>> >>
>> >
>>
>
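A minimal sketch of the approach Alan outlines in the quoted message: re-run the query over only the documents to be highlighted, and have a collector receive term offsets in place of scores. It deliberately does not use Lucene's Weight, Scorer, or Collector classes. Offset, Postings, OffsetCollector, collectOffsets, and highlight are hypothetical names invented for illustration, and the in-memory map is an assumed stand-in for a postings list that stores offsets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class OffsetHighlightSketch {

    /** Start/end character offsets of one term occurrence (end is exclusive). */
    static final class Offset {
        final int start;
        final int end;
        Offset(int start, int end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + "," + end + ")"; }
    }

    /** Toy stand-in for one term's postings with offsets: docId -> occurrences. */
    static final class Postings {
        final String term;
        final Map<Integer, List<Offset>> offsetsByDoc;
        Postings(String term, Map<Integer, List<Offset>> offsetsByDoc) {
            this.term = term;
            this.offsetsByDoc = offsetsByDoc;
        }
    }

    /** Hypothetical collector that is handed offsets rather than a score. */
    interface OffsetCollector {
        void collect(int docId, String term, Offset offset);
    }

    /** "Runs the query again", skipping every document outside the filter set. */
    static void collectOffsets(List<Postings> queryTerms,
                               Set<Integer> docsToHighlight,
                               OffsetCollector collector) {
        for (Postings postings : queryTerms) {
            for (Map.Entry<Integer, List<Offset>> e : postings.offsetsByDoc.entrySet()) {
                if (!docsToHighlight.contains(e.getKey())) {
                    continue; // filtered out: not a document we need highlights for
                }
                for (Offset offset : e.getValue()) {
                    collector.collect(e.getKey(), postings.term, offset);
                }
            }
        }
    }

    /** Gathers the collected offsets per document, ordered by docId. */
    static Map<Integer, List<Offset>> highlight(List<Postings> queryTerms,
                                                Set<Integer> docsToHighlight) {
        Map<Integer, List<Offset>> hits = new TreeMap<>();
        collectOffsets(queryTerms, docsToHighlight, (docId, term, offset) ->
                hits.computeIfAbsent(docId, k -> new ArrayList<>()).add(offset));
        return hits;
    }

    public static void main(String[] args) {
        // Doc 1 is assumed to contain "term offsets for hit highlighting".
        List<Postings> query = List.of(
                new Postings("offsets", Map.of(
                        1, List.of(new Offset(5, 12)),
                        2, List.of(new Offset(0, 7)))),
                new Postings("highlighting", Map.of(
                        1, List.of(new Offset(21, 33)))));
        // Only doc 1 was shown to the user, so only doc 1 is highlighted.
        System.out.println(highlight(query, Set.of(1)));
    }
}
```

In this toy version the filter is a plain set of document ids and the "query" is a flat list of terms. A real implementation along the lines discussed in the thread would need the query structure itself (phrases, spans, boolean clauses) to decide which occurrences count as hits.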
--------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org