Harini - you won't find a custom analyzer that does exactly what you've described, but building custom analyzers is pretty straightforward. You can learn a lot about it by looking at the pieces within Lucene's source code or the examples (and text) from Lucene in Action.

Reading from an external file and aggregating tokens shouldn't be too difficult.
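As a rough illustration of the aggregation step only, here is a minimal sketch in plain Java. It is not Lucene code: the class name PhraseAggregator and its methods are hypothetical, and it assumes the tokens have already been lowercased and that the company names arrive one per line (for example, read from the external file). Wiring this logic into a Lucene TokenFilter is left out because that API differs between versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not part of Lucene): merges known multi-word
// names into single tokens so they are not split word by word.
class PhraseAggregator {
    private final Set<List<String>> phrases = new HashSet<>();
    private int maxWords = 1;

    // names: one name per entry, e.g. "Apple Computer", as it might
    // be read line by line from an external file.
    PhraseAggregator(List<String> names) {
        for (String name : names) {
            String trimmed = name.trim().toLowerCase(Locale.ROOT);
            if (trimmed.isEmpty()) {
                continue;
            }
            List<String> words = Arrays.asList(trimmed.split("\\s+"));
            phrases.add(words);
            maxWords = Math.max(maxWords, words.size());
        }
    }

    // tokens: lowercased single-word tokens in document order.
    // Returns a new list in which each known multi-word name
    // appears as one token (longest match wins).
    List<String> aggregate(List<String> tokens) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 0;
            int longest = Math.min(maxWords, tokens.size() - i);
            for (int len = longest; len >= 2; len--) {
                if (phrases.contains(tokens.subList(i, i + len))) {
                    matched = len;
                    break;
                }
            }
            if (matched > 0) {
                out.add(String.join(" ", tokens.subList(i, i + matched)));
                i += matched;
            } else {
                out.add(tokens.get(i));
                i++;
            }
        }
        return out;
    }
}
```

With the names "Apple Computer" and "International Business Machines" loaded, the tokens shares, of, apple, computer, rose would come back as shares, of, "apple computer", rose. The same names would have to be aggregated at both index time and query time for the merged tokens to match.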

        Erik


On Jan 11, 2006, at 8:37 AM, Harini Raghavan wrote:

Hi Erik,

I had a look at the SpansExtractor class by Mark, which can convert any
Query to spans. But I think ultimately the analyzer that is used to
convert the text into a TokenStream is what matters more. I am using
the StandardAnalyzer and it seems to return a stream of Tokens where
each token is a single word with some positional information. But I want
some words (company names, which I can read from an external file) not to
be broken into separate tokens. So what I need to work on first is a
custom analyzer. Are there any such existing Analyzer implementations
available?

Thanks,
Harini

Erik Hatcher wrote:


On Jan 9, 2006, at 1:16 PM, Harini Raghavan wrote:

I am using the highlighter package to highlight my search results. The query I am passing to the Highlighter is: +(Content:"Apple Computer" Content:"Apple Comp") +(Title:"Apple Computer" Title:"Apple Comp") But the Highlighter is also highlighting standalone occurrences of the terms 'Computer'/'Comp'. Does anyone know how to make sure that only phrases are highlighted, not just the individual terms?


This is not currently implemented in the Highlighter in contrib/highlighter. Implementing this has been discussed (by converting a Query to a SpanQuery and leveraging its precise range for highlighting).

Patches to achieve this level of precise highlighting would be most welcome!
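To make the difference between term-level and phrase-level highlighting concrete, here is a small sketch in plain Java. It is only an illustration under stated assumptions: PhraseHighlighter is a hypothetical class, it works on an already tokenized list of words rather than on a Lucene Query or SpanQuery, and it is not how the contrib Highlighter works. It marks a run of tokens only when the whole phrase occurs consecutively.

```java
import java.util.List;

// Hypothetical illustration (not Lucene code): highlight a phrase only
// where all of its words occur consecutively, never a lone term.
class PhraseHighlighter {

    // tokens: the words of the text in order; phrase: the words of the phrase.
    // Returns the text with each full phrase occurrence wrapped in <B>...</B>.
    static String highlight(List<String> tokens, List<String> phrase) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < tokens.size()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (matchesAt(tokens, phrase, i)) {
                sb.append("<B>")
                  .append(String.join(" ", tokens.subList(i, i + phrase.size())))
                  .append("</B>");
                i += phrase.size();
            } else {
                sb.append(tokens.get(i));
                i++;
            }
        }
        return sb.toString();
    }

    // True when every word of the phrase matches, ignoring case,
    // starting at position start.
    static boolean matchesAt(List<String> tokens, List<String> phrase, int start) {
        if (phrase.isEmpty() || start + phrase.size() > tokens.size()) {
            return false;
        }
        for (int j = 0; j < phrase.size(); j++) {
            if (!tokens.get(start + j).equalsIgnoreCase(phrase.get(j))) {
                return false;
            }
        }
        return true;
    }
}
```

For the tokens Apple, Computer, sells, a, Computer and the phrase "apple computer", only the first two tokens are wrapped; the trailing standalone Computer is left alone. A span-based approach inside the Highlighter would aim for the same effect using the match positions reported for the query.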

    Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



