Re: Using lucene queries to search StringFields

Jack Krupansky Sun, 21 Jun 2015 08:28:59 -0700

Solr customizes the query parser to do field-specific analysis, and it
analyzes only tokenized fields, not string fields. The Lucene query parser,
by contrast, unconditionally analyzes every query term for every field
using the single analyzer you specify. In this case that is the whitespace
analyzer, which splits your quoted term with its embedded space into two
separate terms. The parser therefore generates a phrase query rather than a
single term query, and phrase queries are not supported for non-tokenized
fields, which are indexed without position data.


Use the KeywordAnalyzer, which will not split a quoted string into multiple
terms:
http://lucene.apache.org/core/5_2_0/analyzers-common/org/apache/lucene/analysis/core/KeywordAnalyzer.html
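
To make the difference concrete, here is a minimal sketch, assuming the
Lucene 5.2 lucene-core, lucene-analyzers-common, and lucene-queryparser jars
are on the classpath. The class name AnalyzerComparison and its parse helper
are made up for illustration. It parses the same quoted query with both
analyzers and prints the class of the query each one produces; with the
whitespace analyzer it should be a phrase query, and with the keyword
analyzer a single term query.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;

// Hypothetical class name, used only for this illustration.
public class AnalyzerComparison {

    // Parses queryText with the given analyzer. The default field is null,
    // as in the original post, because the query names its field explicitly.
    static Query parse(String queryText, Analyzer analyzer) throws ParseException {
        return new QueryParser(null, analyzer).parse(queryText);
    }

    public static void main(String[] args) throws ParseException {
        String queryText = "B:\"1 2\"";

        // The whitespace analyzer splits the quoted text into two tokens.
        Query split = parse(queryText, new WhitespaceAnalyzer());
        // The keyword analyzer keeps the quoted text as one token.
        Query whole = parse(queryText, new KeywordAnalyzer());

        System.out.println(split.getClass().getSimpleName());
        System.out.println(whole.getClass().getSimpleName());
    }
}
```

Only the query built with the keyword analyzer can run against the
StringField "B" from the original post.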

You can also simply escape the spaces with a backslash rather than quote
the entire term, but you still need to use the keyword analyzer.
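
If you build query strings in code, the escaping can be done with a small
helper. This is only a sketch: QueryTermEscaper, escapeWhitespace, and
fieldQuery are hypothetical names, not part of Lucene, and the helper
handles whitespace only, not the other query syntax characters. The
resulting string still has to be parsed with the keyword analyzer.

```java
// Hypothetical helper class, not part of Lucene.
public class QueryTermEscaper {

    // Puts a backslash in front of every whitespace character so the
    // query parser reads the value as a single term.
    static String escapeWhitespace(String term) {
        StringBuilder sb = new StringBuilder(term.length() + 4);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Builds a field-qualified query string such as B:1\ 2.
    static String fieldQuery(String field, String value) {
        return field + ":" + escapeWhitespace(value);
    }

    public static void main(String[] args) {
        System.out.println(fieldQuery("B", "1 2")); // prints B:1\ 2
    }
}
```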


-- Jack Krupansky

On Fri, Jun 19, 2015 at 2:31 AM, Gimantha Bandara <giman...@wso2.com> wrote:

> Correction..
>
> second time I used the following code to test. Then I got the above
> IllegalStateException issue.
>
> w = new QueryParser(null, new WhitespaceAnalyzer()).parse("B:\"1 2\"");
>
> not the below one.
>
> w = new QueryParser(null, new WhitespaceAnalyzer()).parse("\"B:1 2\"");
>
> Can someone point out the correct way to query for StringFields?
>
> Thanks,
>
> On Thu, Jun 18, 2015 at 2:12 PM, Gimantha Bandara <giman...@wso2.com>
> wrote:
>
> > Hi all,
> >
> > I have created lucene documents like below.
> >
> > Document doc = new Document();
> > doc.add(new TextField("A", "1", Field.Store.YES));
> > doc.add(new StringField("B", "1 2 3", Field.Store.NO));
> > doc.add(new TextField("Publish Date", "2010", Field.Store.NO));
> > indexWriter.addDocument(doc);
> >
> > doc = new Document();
> > doc.add(new TextField("A", "2", Field.Store.YES));
> > doc.add(new StringField("B", "1 2", Field.Store.NO));
> > doc.add(new TextField("Publish Date", "2010", Field.Store.NO));
> > indexWriter.addDocument(doc);
> >
> > doc = new Document();
> > doc.add(new TextField("A", "3", Field.Store.YES));
> > doc.add(new StringField("B", "1", Field.Store.NO));
> > doc.add(new TextField("Publish Date", "2012", Field.Store.NO));
> > indexWriter.addDocument(doc);
> >
> > Now I am using the following code to test the StringField behavior.
> >
> >         Query w = null;
> >         try {
> >             w = new QueryParser(null, new WhitespaceAnalyzer()).parse("B:1 2");
> >         } catch (ParseException e) {
> >             e.printStackTrace();
> >         }
> >         TopScoreDocCollector collector = TopScoreDocCollector.create(100, true);
> >         searcher.search(w, collector);
> >         ScoreDoc[] hits = collector.topDocs(0).scoreDocs;
> >         Document indexDoc;
> >         for (ScoreDoc doc : hits) {
> >             indexDoc = searcher.doc(doc.doc);
> >             System.out.println(indexDoc.get("A"));
> >         }
> >
> > Above code should print only the second document's 'A' value as it is the
> > only one where 'B' has value '1 2'. But it returns the 3rd document. So I
> > tried using double quotation marks for 'B' value as below.
> >
> > w = new QueryParser(null, new WhitespaceAnalyzer()).parse("\"B:1 2\"");
> >
> > It gives the following error.
> >
> > Exception in thread "main" java.lang.IllegalStateException: field "B" was indexed without position data; cannot run PhraseQuery (term=1)
> >     at org.apache.lucene.search.PhraseQuery$PhraseWeight.scorer(PhraseQuery.java:277)
> >     at org.apache.lucene.search.Weight.bulkScorer(Weight.java:131)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:309)
> >
> > Is my searching query wrong? (Note: I am using whitespace analyzer
> > everywhere)
> >
> > --
> > Gimantha Bandara
> > Software Engineer
> > WSO2. Inc : http://wso2.com
> > Mobile : +94714961919
> >
>
>
>
> --
> Gimantha Bandara
> Software Engineer
> WSO2. Inc : http://wso2.com
> Mobile : +94714961919
>
