You can use SpanNearQuery to seek matches within a specified distance.

Lucene knows nothing about "sentences". However, if you have an analyzer or custom code that artificially bumps the position to the next multiple of some number, such as 100 or 1000, whenever a sentence boundary pattern is encountered, you could use that number times n as the distance to match within roughly n sentences, give or take a sentence or two. Nothing causes the nearness to be rounded or truncated exactly to one of those boundaries.
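To make the position bumping concrete, here is a rough sketch in plain Java with no Lucene dependency. The class and method names are made up for illustration, and the sentence splitting is assumed to have happened already. In a real analyzer the same arithmetic would presumably go into a custom TokenFilter that sets the position increment on the first token of each sentence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: assigns token positions so that
// every sentence starts at a multiple of a fixed gap.
final class SentencePositions {

    // Smallest multiple of gap that is strictly greater than lastPosition.
    static int nextSentenceStart(int lastPosition, int gap) {
        return (lastPosition / gap + 1) * gap;
    }

    // sentences[i] holds the tokens of sentence i, already split (assumption).
    static List<Integer> assignPositions(String[][] sentences, int gap) {
        List<Integer> positions = new ArrayList<>();
        int last = -1; // position of the most recently emitted token
        for (int s = 0; s < sentences.length; s++) {
            // The first sentence starts at 0; later ones jump to the next boundary.
            int pos = (s == 0) ? 0 : nextSentenceStart(last, gap);
            for (int t = 0; t < sentences[s].length; t++) {
                positions.add(pos);
                last = pos;
                pos++;
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String[][] sentences = {
            {"obama", "spoke"},
            {"putin", "replied", "later"}
        };
        // prints [0, 1, 1000, 1001, 1002]
        System.out.println(assignPositions(sentences, 1000));
    }
}
```

Note that a sentence longer than the gap simply spills past the next boundary, so the following sentence starts one multiple later. That is one reason the sentence count is only approximate.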

You may want two numbers: 1) the sentence separation, say 1000, and 2) the maximum sentence length, say 500. The SpanNearQuery would then use a distance of (n-1) times the sentence separation plus the maximum sentence length. You would have to adjust that for how you count sentences: is the current sentence 1 or 0?
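As a sketch of that arithmetic, again in plain Java with a made-up helper name: this version assumes 1-based counting, so n = 1 means "within the current sentence". The result is what you would pass as the slop argument when constructing the SpanNearQuery.

```java
// Hypothetical helper, not part of Lucene: computes the distance to use for
// matching within n sentences, counting the current sentence as 1 (assumption).
final class SentenceSlop {

    static int slopForSentences(int n, int sentenceSeparation, int maxSentenceLength) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // (n-1) full sentence gaps plus room for one sentence of maximum length.
        return (n - 1) * sentenceSeparation + maxSentenceLength;
    }

    public static void main(String[] args) {
        System.out.println(slopForSentences(1, 1000, 500)); // prints 500
        System.out.println(slopForSentences(3, 1000, 500)); // prints 2500
    }
}
```

If you count the current sentence as 0 instead, drop the "- 1" from the formula.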

-- Jack Krupansky

-----Original Message----- From: Igor Shalyminov
Sent: Thursday, April 25, 2013 6:54 AM
To: java-user@lucene.apache.org
Subject: Multiple PositionIncrement attributes

Hi all!

I use the PositionIncrement attribute to find words at some distance from each other, and I have two problems with that:
1) I want to search for words within one sentence. A possible solution would be to set a PositionIncrement of +INF (like 30 :) ) on the sentence break tag.
2) I want to use both word distance and sentence distance between words in my search (e.g. find the word "Putin" within 3 sentences after the word "Obama", or find the words "cheese" and "bacon" in one sentence within 3 words of each other).

For the 2nd problem, is there a way of storing multiple position information sources in the index and using them for searching? Say, at least choosing one of those for a query.


--
Best Regards,
Igor Shalyminov

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
