Re: AnalyZer HELP Please

Erik Hatcher Tue, 17 Aug 2004 06:32:40 -0700

Further on this, Karthik: you really need to understand what you indexed. For example, take a document that has "New Year" in it and follow it through your indexing process. See what your analyzer actually emitted at indexing time. If "new" and "year" are side-by-side tokens emitted from that process, then querying for "New Year" through QueryParser should find a match.

You can easily put together a 10-line JUnit test case using RAMDirectory and your favorite Analyzer to see how this works. I highly recommend you do this in order to isolate the situation even further.
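
As a rough illustration of the check described above, here is a plain-Java sketch that does not use Lucene at all. The class PhraseAdjacencySketch and its methods tokenize and containsPhrase are hypothetical stand-ins: tokenize only imitates an analyzer that lowercases and splits on non-letters, and containsPhrase only imitates the "side-by-side tokens" condition. A real test would use RAMDirectory and an actual Analyzer instead.

```java
import java.util.ArrayList;
import java.util.List;

public class PhraseAdjacencySketch {

    // Hypothetical stand-in for an analyzer: lowercases the text and
    // splits it on every character that is not a letter.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // True if the phrase tokens occur as consecutive tokens in the document.
    static boolean containsPhrase(List<String> docTokens, List<String> phraseTokens) {
        if (phraseTokens.isEmpty()) {
            return false;
        }
        int n = phraseTokens.size();
        for (int start = 0; start + n <= docTokens.size(); start++) {
            if (docTokens.subList(start, start + n).equals(phraseTokens)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Follow one document through the (imitated) indexing-time analysis.
        List<String> doc = tokenize("Happy New Year to everyone");
        System.out.println(doc);

        // The phrase is analyzed the same way, then checked for adjacency.
        System.out.println(containsPhrase(doc, tokenize("New Year")));
        System.out.println(containsPhrase(doc, tokenize("Year New")));
    }
}
```

The point of the sketch is only that the document and the phrase go through the same analysis, and that the match depends on the resulting tokens sitting next to each other in the same order.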

        Erik


On Aug 17, 2004, at 9:25 AM, Patrick Burleson wrote:

Karthik,

What you would want to do with the split tokens ( "New" and "Year" )
is then create a PhraseQuery containing a Term object for each token.
This should do what you want. As Erik said, QueryParser would have
done this internally, but only if the quotes are actually part of the
query string: not just "New Year", but "\"New Year\"".
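
To make the difference between the two strings concrete, here is a small plain-Java sketch. It is not QueryParser's grammar and uses no Lucene classes; QuotedPhraseSketch and its clauses method are hypothetical names, and the "TERM:" / "PHRASE:" labels are only for illustration. It simply shows that the quote characters have to be inside the Java string for anything downstream to see a phrase.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedPhraseSketch {

    // Splits a query string into clauses: text inside double quotes becomes
    // one "PHRASE:" clause, every other whitespace-separated word becomes a
    // "TERM:" clause. This is an illustration, not Lucene's query syntax.
    static List<String> clauses(String query) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < query.length()) {
            char c = query.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '"') {
                int close = query.indexOf('"', i + 1);
                if (close < 0) {
                    close = query.length(); // unterminated quote: take the rest
                }
                out.add("PHRASE:" + query.substring(i + 1, close).trim());
                i = close + 1;
            } else {
                int end = i;
                while (end < query.length()
                        && !Character.isWhitespace(query.charAt(end))
                        && query.charAt(end) != '"') {
                    end++;
                }
                out.add("TERM:" + query.substring(i, end));
                i = end;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The Java literal "New Year" contains no quote characters at all.
        System.out.println(clauses("New Year"));
        // The escaped quotes are part of the string itself.
        System.out.println(clauses("\"New Year\""));
    }
}
```

With the first literal the sketch sees two independent words; with the second it sees a single phrase, which is the case Patrick describes.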

Patrick

On Tue, 17 Aug 2004 18:53:01 +0530, Karthik N S
<[EMAIL PROTECTED]> wrote:

Hi

Erik

  Apologies.......

What I meant to say was that a word such as "New Year" (where the quotes mean "\"") passed to QueryParser.parse(word, "contents", analyzer) should return hits for the full word, but it did not.

 So I did a quick run of the Analyzer process and
 found that it was splitting the word

  "New Year"  =  [New]  [Year]

 Am I doing something wrong here?

Thx in advance.....
Karthik

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 17, 2004 6:18 PM
To: Lucene Users List
Subject: Re: AnalyZer HELP Please

This is what analyzers do.  I don't know of any analyzer that deals
with quotes in the way you're requesting, by keeping the contents
together as a complete token.  You'll have to write your own variant
that does this.

QueryParser, however, uses quotes to denote a phrase query, and will
query for the words together.  Perhaps this is sufficient for your
needs?

        Erik

On Aug 17, 2004, at 8:40 AM, Karthik N S wrote:


Hey Guys.....

Apologies......


Some small Help needed

When I run the Analyzers on the word "New Year" (with quotes) using
Lucene1-4 final.jar on Win 2k O/S,
why is the SimpleAnalyzer splitting it into 2 words?

Or am I missing something here?

Analyzing " New  Year "
org.apache.lucene.analysis.WhitespaceAnalyzer:

["] [New] [+] [Year] ["]

org.apache.lucene.analysis.SimpleAnalyzer:

[new] [year]

org.apache.lucene.analysis.StopAnalyzer:

[new] [year]

org.apache.lucene.analysis.standard.StandardAnalyzer:

[new] [year]

com.controlnet.indexing.analyzers.GrammerAnalyzer:

[year]
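
For readers comparing the outputs above, the following plain-Java sketch imitates two of the behaviours: splitting on whitespace only, and splitting on non-letters with lowercasing. It uses no Lucene classes; TokenizerComparisonSketch and its two methods are hypothetical names, and the sketch does not try to reproduce the listing exactly (for instance, it produces no [+] token).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class TokenizerComparisonSketch {

    // Imitates whitespace-only splitting: punctuation such as quote
    // characters survives as, or inside, tokens.
    static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Imitates letter-based splitting with lowercasing: everything that is
    // not a letter, including quote characters, acts as a separator.
    static List<String> lowercaseLetterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "\" New  Year \"";
        System.out.println(whitespaceTokens(input));
        System.out.println(lowercaseLetterTokens(input));
    }
}
```

Either way the quote characters never glue the two words into one token: they are either kept as separate tokens or discarded as separators.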

      WITH WARM REGARDS
      HAVE A NICE DAY
      [ N.S.KARTHIK]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


