Harini - you won't find a custom analyzer that does exactly what
you've described, but building custom analyzers is pretty
straightforward. You can learn a lot about it by looking at the
pieces within Lucene's source code or the examples (and text) from
Lucene in Action.
Reading from an external file and aggregating tokens shouldn't be too
difficult.
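
As a rough illustration of the aggregation step only, here is a minimal sketch in plain Java. It does not use Lucene's API: PhraseTokenMerger, loadPhrases and merge are hypothetical names made up for this example, and the tokens are assumed to be the lowercased single-word terms an analyzer such as StandardAnalyzer would produce. A real implementation would put the same logic into a custom TokenFilter and would also need to carry offsets and position increments across the merged token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not part of Lucene): merges runs of tokens that form a known phrase. */
final class PhraseTokenMerger {

    /** Turns phrase lines (e.g. one company name per line of a file) into lowercased word lists. */
    static Set<List<String>> loadPhrases(List<String> lines) {
        Set<List<String>> phrases = new HashSet<>();
        for (String line : lines) {
            String normalized = line.trim().toLowerCase(Locale.ROOT);
            if (!normalized.isEmpty()) {
                phrases.add(Arrays.asList(normalized.split("\\s+")));
            }
        }
        return phrases;
    }

    /** Returns the tokens with every known multi-word phrase collapsed into a single token. */
    static List<String> merge(List<String> tokens, Set<List<String>> phrases) {
        // The longest phrase bounds how far ahead we need to look.
        int maxLen = 1;
        for (List<String> phrase : phrases) {
            maxLen = Math.max(maxLen, phrase.size());
        }

        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 0;
            // Try the longest candidate first so longer phrases win over their prefixes.
            for (int len = Math.min(maxLen, tokens.size() - i); len >= 2; len--) {
                if (phrases.contains(tokens.subList(i, i + len))) {
                    matched = len;
                    break;
                }
            }
            if (matched > 0) {
                // Emit the whole phrase as one token.
                out.add(String.join(" ", tokens.subList(i, i + matched)));
                i += matched;
            } else {
                // No phrase starts here: pass the single token through unchanged.
                out.add(tokens.get(i));
                i++;
            }
        }
        return out;
    }
}
```

With the single phrase "Apple Computer" loaded, the tokens [apple, computer, sells, computer, parts] come out as [apple computer, sells, computer, parts]: only the adjacent pair is merged, and the later standalone "computer" is left alone. The phrase lines could come from something like Files.readAllLines on a file of company names (the file itself is an assumption here).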
Erik
On Jan 11, 2006, at 8:37 AM, Harini Raghavan wrote:
Hi Erik,
I had a look at the SpansExtractor class by Mark, which can convert any
Query to spans. But I think ultimately the analyzer that is used to
convert the text into a TokenStream is what matters more. I am using
the StandardAnalyzer, and it seems to return a stream of Tokens where
each token is a single word with some positional information. But I want
some words (company names, which I can read from an external file) not to
be broken into separate tokens. So what I need to work on first is a
custom analyzer. Are there any such existing Analyzer implementations
available?
Thanks,
Harini
Erik Hatcher wrote:
On Jan 9, 2006, at 1:16 PM, Harini Raghavan wrote:
I am using the highlighter package to highlight my search
results. The query I am passing to the Highlighter is:
+(Content:"Apple Computer" Content:"Apple Comp") +(Title:"Apple
Computer" Title:"Apple Comp")
But the Highlighter is also highlighting standalone occurrences of the
terms 'Computer'/'Comp'. Does anyone know how to make sure
that only phrases are highlighted, not just the individual terms?
This is not currently implemented in the Highlighter in
contrib/highlighter. Implementing this has been discussed (by converting
a Query to a SpanQuery and leveraging its precise range for
highlighting).
It would be greatly welcome to have some patches to achieve this
level of precise highlighting!
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]