Re: Problem with MultiPhrase Query in Lucene 4.3

VIGNESH S Thu, 03 Oct 2013 09:08:37 -0700

Thanks for your reply, Ian.

I will check and let you know.



On Thu, Oct 3, 2013 at 9:19 PM, Ian Lea <[email protected]> wrote:

> Below is a little self-contained test program.  You may recognise some
> of the code.
>
> Here's the output from a couple of runs using Lucene 4.4.0.
>
> $ java ian.G1 "Dremel is a scalable, interactive ad-hoc query system" "interactive ad-hoc"
> term=interactive
> term=ad-hoc
> +content:"interactive" +content:"ad-hoc": totalHits=1
>
>
> $ java ian.G1 "Dremel is a scalable, interactive ad-hoc query system" "interactive adhoc"
> term=interactive
> +content:"interactive": totalHits=1
>
> All looks OK to me.  Maybe you can make it fail, or use it to help fix
> your problem.
>
> --
> Ian.
>
> package ian;
>
> import java.util.*;
> import org.apache.lucene.analysis.*;
> import org.apache.lucene.analysis.core.*;
> import org.apache.lucene.analysis.en.*;
> import org.apache.lucene.analysis.standard.*;
> import org.apache.lucene.document.*;
> import org.apache.lucene.queries.*;
> import org.apache.lucene.search.*;
> import org.apache.lucene.store.*;
> import org.apache.lucene.index.*;
> import org.apache.lucene.util.*;
>
> public class G1 {
>
>     void test(String contents, String words) throws Exception {
>         // Index one document with WhitespaceAnalyzer, which splits on
>         // whitespace only and leaves punctuation attached to tokens.
>         RAMDirectory dir = new RAMDirectory();
>         Analyzer anl = new WhitespaceAnalyzer(Version.LUCENE_44);
>         IndexWriterConfig iwcfg = new IndexWriterConfig(Version.LUCENE_44, anl);
>         IndexWriter iw = new IndexWriter(dir, iwcfg);
>
>         FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
>         Field field = new Field("content", "", offsetsType);
>         Document doc = new Document();
>         doc.add(field);
>         field.setStringValue(contents);
>         iw.addDocument(doc);
>         iw.close();
>
>         IndexReader rdr = DirectoryReader.open(dir);
>         Fields fields = MultiFields.getFields(rdr);
>         Terms terms = fields.terms("content");
>
>         // Build one MultiPhraseQuery per search word from the indexed terms
>         // that start with that word; every word must match.
>         BooleanQuery bq = new BooleanQuery();
>         String[] worda = words.split(" ");
>         for (String w : worda) {
>             LinkedList<Term> termsWithPrefix = new LinkedList<Term>();
>             TermsEnum trm = terms.iterator(null);
>             // seekCeil positions the enum on the smallest term >= w. If it
>             // returns END the enum is unpositioned, so skip the loop.
>             if (trm.seekCeil(new BytesRef(w)) != TermsEnum.SeekStatus.END) {
>                 do {
>                     String s = trm.term().utf8ToString();
>                     if (s.startsWith(w)) {
>                         termsWithPrefix.add(new Term("content", s));
>                         System.out.printf("term=%s\n", s);
>                     } else {
>                         break;
>                     }
>                 } while (trm.next() != null);
>             }
>
>             if (!termsWithPrefix.isEmpty()) {
>                 MultiPhraseQuery mpquery = new MultiPhraseQuery();
>                 mpquery.add(termsWithPrefix.toArray(new Term[0]));
>                 bq.add(mpquery, BooleanClause.Occur.MUST);
>             }
>         }
>
>         IndexSearcher searcher = new IndexSearcher(rdr);
>         TopDocs results = searcher.search(bq, 10);
>         System.out.printf("%s: totalHits=%s\n", bq, results.totalHits);
>     }
>
>
>
>     public static void main(String[] _args) throws Exception {
>         G1 t = new G1();
>         t.test(_args[0], _args[1]);
>     }
> }
>
>
> On Thu, Oct 3, 2013 at 4:10 PM, VIGNESH S <[email protected]> wrote:
> > Hi,
> >
> > Sorry, that's my typo.
> >
> > It's not failing because of that.
> >
> >
> > On Thu, Oct 3, 2013 at 8:17 PM, Ian Lea <[email protected]> wrote:
> >
> >> Are you sure it's not failing because "adhoc" != "ad-hoc"?
> >>
> >>
> >> --
> >> Ian.
> >>
> >>
> >> On Thu, Oct 3, 2013 at 3:07 PM, VIGNESH S <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > I am trying to use a MultiPhraseQuery in Lucene 4.3. It works perfectly
> >> > for all scenarios except the one below: when I search for a phrase
> >> > that is preceded by punctuation, it does not work.
> >> >
> >> > TextContent:  Dremel is a scalable, interactive ad-hoc query system for
> >> > analysis of read-only nested data. By combining multi-level execution
> >> > trees and columnar data layout, it is capable of running aggregation
> >> >
> >> > Search phrase :  interactive adhoc
> >> >
> >> > The above search is failing because "interactive adhoc" is preceded by
> >> > "," in the original text.
> >> >
> >> >
> >> > I am indexing like this. Sample code for indexing follows; I have used
> >> > WhitespaceAnalyzer.
> >> >
> >> > Document doc = new Document();
> >> >
> >> > String contents = "Dremel is a scalable, interactive ad-hoc query system for "
> >> >     + "analysis of read-only nested data. By combining multi-level execution "
> >> >     + "trees and columnar data layout, it is capable of running aggregation";
> >> >
> >> > FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
> >> > Field field = new Field("content", "", offsetsType);
> >> >
> >> > doc.add(field);
> >> > field.setStringValue(contents);
> >> >
> >> > mWriter.addDocument(doc);
> >> >
> >> > In the search I am building a MultiPhraseQuery object and adding the
> >> > tokens of the search phrase.
> >> >
> >> > Before adding the tokens, I validated them like this:
> >> >
> >> > LinkedList<Term> termsWithPrefix = new LinkedList<Term>();
> >> > trm.seekCeil(new BytesRef(word));
> >> > do {
> >> >     String s = trm.term().utf8ToString();
> >> >     if (s.startsWith(word)) {
> >> >         termsWithPrefix.add(new Term("content", s));
> >> >     } else {
> >> >         break;
> >> >     }
> >> > } while (trm.next() != null);
> >> > mpquery.add(termsWithPrefix.toArray(new Term[0]));
> >> > }
> >> >
> >> > It works for all scenarios except those where the search phrase is
> >> > preceded by punctuation.
> >> >
> >> > When the text is preceded by punctuation, trm.seekCeil(new BytesRef(word))
> >> > points to a different word, which is what causes the problem.
> >> >
> >> > Please help.
> >> >
> >> >
> >> > --
> >> > Thanks and Regards
> >> > Vignesh Srinivasan
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [email protected]
> >> For additional commands, e-mail: [email protected]
> >>
> >>
> >
> >
> > --
> > Thanks and Regards
> > Vignesh Srinivasan
> > 9739135640


-- 
Thanks and Regards
Vignesh Srinivasan
9739135640
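
The seekCeil behaviour discussed in this thread can be illustrated without
Lucene. The following is a minimal sketch, not Lucene code: it assumes a
java.util.TreeSet of whitespace-split tokens as a stand-in for the term
dictionary, with TreeSet.ceiling and tailSet playing the role of
TermsEnum.seekCeil followed by next(). The names PrefixSeekSketch, indexTerms
and termsWithPrefix are hypothetical. TreeSet orders strings by UTF-16 code
unit while Lucene orders terms by UTF-8 bytes; the two agree for the ASCII
text used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for a sorted term dictionary; this is not Lucene code.
class PrefixSeekSketch {

    // Split on whitespace only, so punctuation stays attached to its token
    // (for example "scalable," keeps its comma).
    static TreeSet<String> indexTerms(String text) {
        return new TreeSet<>(Arrays.asList(text.trim().split("\\s+")));
    }

    // Collect every term that starts with the given prefix.
    static List<String> termsWithPrefix(TreeSet<String> terms, String prefix) {
        List<String> matches = new ArrayList<>();
        // tailSet(prefix, true) begins at the smallest term >= prefix, which
        // may be an unrelated word if nothing starts with the prefix.
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match either
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = indexTerms(
            "Dremel is a scalable, interactive ad-hoc query system");
        String[] words = {"interactive", "ad-hoc", "adhoc", "scalable", "zebra"};
        for (String word : words) {
            // ceiling() returns null when no term is >= word, which is the
            // case the seek loop has to guard against.
            System.out.println(word + " -> ceiling=" + terms.ceiling(word)
                + ", matches=" + termsWithPrefix(terms, word));
        }
    }
}
```

With the sample sentence, seeking "adhoc" lands on "interactive" (because 'h'
sorts after '-') and yields no prefix matches, while "scalable" matches the
token "scalable," with its trailing comma. A word that sorts after every
indexed term has no ceiling at all, so the result of the seek needs checking
before the current term is read.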
