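It definitely cannot be done with custom token types alone: the token type is not written to the index, so it never reaches scoring. You're probably aiming for field-specific boosting, so you will need to parse the HTML into separate fields (for example, one for headings, one for bold text, and one for the full body) and search across those fields with a different boost on each.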
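
To make the field-splitting idea concrete, here is a minimal sketch in plain Python (standard library only, no Lucene calls). The field names "body", "heading" and "bold", the tag-to-field mapping, and the boost values are all assumptions chosen for illustration. The only Lucene-specific part is the `field:term^boost` form of the query string, which the query parser accepts.

```python
from html.parser import HTMLParser

# Hypothetical mapping from HTML tag to index field name (illustration only).
TAG_TO_FIELD = {"h1": "heading", "b": "bold"}


class FieldSplitter(HTMLParser):
    """Collects every word into 'body' and tagged words into extra fields."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # tracked tags that are currently open
        self.fields = {"body": [], "heading": [], "bold": []}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase, so <B> arrives as "b".
        if tag in TAG_TO_FIELD:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the innermost open occurrence of this tag.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]

    def handle_data(self, data):
        words = data.split()
        self.fields["body"].extend(words)
        for tag in set(self.open_tags):
            self.fields[TAG_TO_FIELD[tag]].extend(words)


def split_into_fields(html):
    """Return a dict of field name -> text, ready to index as separate fields."""
    parser = FieldSplitter()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return {name: " ".join(words) for name, words in parser.fields.items()}


def boosted_query(term, boosts):
    """Build a multi-field query string in field:term^boost form."""
    return " ".join(f"{field}:{term}^{boost}" for field, boost in boosts.items())
```

Each value returned by `split_into_fields` would be added to the document as its own field, and `boosted_query("lucene", {"heading": 4, "bold": 2, "body": 1})` gives a query string that weights heading matches above bold matches above plain body matches.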

I'm sure there are other tricks that could be used for boosting, such as inserting the words inside <b> into the same field several times so that their term frequency goes up.
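
The repetition trick could look roughly like the sketch below, again in plain Python with no Lucene calls. The function name `weight_by_repetition` and the repeat counts are hypothetical. The output is a single string meant to be indexed as one field.

```python
from html.parser import HTMLParser


class RepeatingExtractor(HTMLParser):
    """Extracts text, repeating words that sit inside selected tags."""

    def __init__(self, repeats):
        super().__init__()
        self.repeats = repeats  # tag name -> how many times to emit each word
        self.open_tags = []     # tracked tags that are currently open
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.repeats:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the innermost open occurrence of this tag.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]

    def handle_data(self, data):
        # Use the largest repeat count among the enclosing tracked tags.
        factor = max((self.repeats[t] for t in self.open_tags), default=1)
        for word in data.split():
            self.words.extend([word] * factor)


def weight_by_repetition(html, repeats):
    """Return the document text with emphasized words repeated."""
    parser = RepeatingExtractor(repeats)
    parser.feed(html)
    parser.close()  # flush any buffered text
    return " ".join(parser.words)
```

For example, `weight_by_repetition("Use <b>boosting</b> wisely", {"b": 3})` yields the text "Use boosting boosting boosting wisely". One caveat with this approach: the repeated words shift token positions, so phrase queries that span an emphasized word may stop matching.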

Erik


On Jan 21, 2004, at 6:50 AM, Alexey Maksakov wrote:


Hello!

Does anyone have an idea how to boost terms in HTML documents that are
surrounded by HTML tags, such as <B>, <H1>, etc.?


Can it be done with the existing API, or is a reimplementation of
TokenStream with custom Token types needed?


Though it seems to me that even such a reimplementation won't help without
changing the indexing and searcher code... I hope that I'm wrong.


Thanks in advance.

Alexey.




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

