Re: Wild card and multiple keyword search

Ian Lea Wed, 13 Jul 2005 05:48:47 -0700

Sounds to me that all you need is to AND rather than OR your search terms.

        QueryParser qp = new QueryParser("keywords", analyzer);
        qp.setOperator(QueryParser.DEFAULT_OPERATOR_AND);
        Query q = qp.parse(words);


where analyzer is just the standard one.

Or search for +MAIN +BOARD.  Or MAIN AND BOARD.


--
Ian.


On 13 Jul 2005 12:18:44 -0000, Rahul D Thakare
<[EMAIL PROTECTED]> wrote:
>  
> Hi,
> 
>  We are using doc.add(Field.Text("keywords",keywords)); to add the keywords 
> to the document, where keywords is comma separated keywords string.
> Lucene seems to tokenize the keywords with multiple words like(MAIN BOARD) as 
> different keywords(ie as MAIN and BOARD). Tokenization is based on comma and 
> space...So if we search for "MAIN BOARD", documents having keywords like 
> "MAIN LOGIC", "MAIN PARTS", etc also show up
> 
> If one searches for "MAIN BOARD", we want get only the documents have "MAIN 
> BOARD".  How to do this ?
> 
> To achieve this we used doc.add(Field.Keyword("keywords", keywords)); and 
> while searching
> we cannot use standard analyzer, while searching, as divides the keywords if 
> we search keywords having space... so we wrote an 
> KeywordAnalyser(KeywordAnalyzer is basically returns only one single token) 
> as given below.
> 
> /**
>  * Tokenizes the entire stream as single token
>  */
> 
>  public class KeywordAnalyzer extends Analyzer
>  {
>          public TokenStream tokenStream(String fieldName, final Reader reader)
>          {
>                  return new TokenStream(){
>                          private boolean done;
>                          private final char[] buffer = new char[1024];
>                          public Token next() throws IOException
>                          {
>                                  if(!done)
>                                  {
>                                          done = true;
>                                          StringBuffer buffer = new 
> StringBuffer();
>                                          int length = 0;
>                                          while(true)
>                                          {
>                                                  length = 
> reader.read(this.buffer);
>                                                  if(length == -1) break;
> 
>                                                  
> buffer.append(this.buffer,0,length);
>                                          }
>                                          String text = buffer.toString();
>                                          return new 
> Token(text.toUpperCase(),0,text.length());
>                                  }
>                                  return null;
>                          }
>                  };
>          }
>  }
> 
> Which solve the above said problem, but we are not able to the wild card 
> searchs like MAIN*, etc.
> 
> We need both the functionality ie.
> 1.  if user searches for MAIN BOARD, should get only documents that contain 
> MAIN BOARD and not MAIN LOGIC, MAIN, MAIN PART etc.
> 2. User should be able to do the wild card search like MAIN*, etc and get the 
> desired documents.
> 
> Please let us know, how we should do the indexing ? and which analyzer to use 
> to do the search ?
> 
> thanks
> Rahul...
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Wild card and multiple keyword search

Reply via email to