Hi,

I am just about to work through the demo and get to know Lucene, now that I
actually got it to build :)  I was wondering if someone could point me in
the right direction for my project.

I want to query using a list of words, but the order in which they appear and
how common they are is not relevant (i.e. no 'stop words', if I got that
terminology correct).  The only relevant things are how closely grouped they
are and how many of the words in the list occur.  I want to be able to
configure the gap from 0 (no non-queried words in between) up to 'n'
non-queried words in between.

So for example, if I query for 'a and in house I go together or' (stupid
example I guess) and specify 0 words in between, then I would only want to
get hits with those query words, in any order, sorted by relevance based on
how many of those words occurred.  For example:

'In a house together' may be the most relevant result

If I specify 1 other non-query word allowed, results may look like:

1. 'In a house together.'
2. 'In a house sleeping together.'  ('sleeping' being the one extra word
allowed)
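
To pin down what I mean, here is a rough sketch in plain Java (no Lucene at
all, and every name in it is made up for illustration).  It assumes one
particular reading of the rule: a sentence is a hit when it contains at most
'n' words that are not in the query list, and it is scored by how many
distinct query words it contains.  Counting the gap only between neighbouring
query words would be a different, equally plausible reading.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, only meant to illustrate the matching rule.
class GroupedWordMatch {

    // Lowercase the text and split on anything that is not a letter,
    // digit or apostrophe; empty tokens are dropped.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}']+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns the number of distinct query words found in the sentence,
    // or -1 if the sentence has more than maxOther non-query words.
    // Assumption: the limit applies to the whole sentence, not per gap.
    static int score(String sentence, Set<String> queryWords, int maxOther) {
        Set<String> matched = new HashSet<>();
        int other = 0;
        for (String token : tokenize(sentence)) {
            if (queryWords.contains(token)) {
                matched.add(token);
            } else {
                other++;
            }
        }
        return other <= maxOther ? matched.size() : -1;
    }
}
```

With the query words taken from 'a and in house I go together or', the
sentence 'In a house together.' is a hit at n = 0 with four matching words,
while 'In a house sleeping together.' is rejected at n = 0 and only becomes
a hit at n = 1.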

These should also be complete sentences or clauses, i.e. not 'fragments' - I
guess I need to use a grammar analyser to determine that.

Any help is very much appreciated.  I realise that this is probably
deceptively difficult, but if anyone can give some pointers that would be
amazing.

Svetlana

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Searching-for-sentences-containing-a-list-of-words-with-a-configurable-number-of-words-not-in-the-li-tp3993981.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.