Hi Manjula,

Sounds like ShingleFilter will do what you want:
<http://lucene.apache.org/core/4_6_0/analyzers-common/org/apache/lucene/analysis/shingle/ShingleFilter.html>
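
To make the idea concrete, here is a minimal plain-Java sketch of what two-word shingles look like for your example. It does not use Lucene at all: the class name WordShingles, the method bigrams, and the one-word stop list are assumptions made up for illustration. It only shows the kind of tokens a shingle step produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: builds two-word shingles by hand.
class WordShingles {

    // Lowercases the text, splits on whitespace, drops stop words,
    // then joins each remaining word with the word that follows it.
    static List<String> bigrams(String text, Set<String> stopWords) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!w.isEmpty() && !stopWords.contains(w)) {
                words.add(w);
            }
        }
        List<String> shingles = new ArrayList<>();
        for (int i = 0; i + 1 < words.size(); i++) {
            shingles.add(words.get(i) + " " + words.get(i + 1));
        }
        return shingles;
    }

    public static void main(String[] args) {
        // Assumed stop list containing only "is", to match the example input.
        Set<String> stop = new HashSet<>(Arrays.asList("is"));
        System.out.println(bigrams("blue house is very beautiful", stop));
        // prints [blue house, house very, very beautiful]
    }
}
```

Inside Lucene, ShingleFilter does this joining as part of the analysis chain. Note that its Javadoc says it handles position increments greater than 1 by inserting filler tokens ("_"), so a shingle spanning a removed stop word may not come out exactly as "house very"; it is worth checking the emitted tokens on a sample document.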

Steve
www.lucidworks.com

On Dec 22, 2013 11:25 PM, "Manjula Wijewickrema" <manjul...@gmail.com> wrote:

> Dear All,
>
> My Lucene programme is able to index single words and find the corpus
> documents that best match the input document (based on term frequencies).
> Now I want to index two-word phrases and find the corpus documents that
> best match the input document (based on phrase frequencies).
>
> ex:-
> input document:
> blue house is very beautiful
>
> split it into phrases (say two-term phrases) like:
> blue house
> house very
> very beautiful
> etc.
>
> Is it possible to do this with Lucene? If so, how can I do it?
>
> Thanks,
>
> Manjula.