[EMAIL PROTECTED] wrote:
Nice suggestion about capping the highlighter's number of tokens - I'll add that in.

I agree, good suggestion.

I've had a quick look at your knowledgebase docs. Can't you split them at index time into multiple 
smaller docs using the <a name="xxx"> tags as doc boundaries?
Each lucene document could then have a field with the URL [sourcedoc]#xxx, taking you 
to the relevant section in the source document.
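
A minimal sketch of that splitting step, written in Python for brevity rather than Lucene's Java. It assumes the anchors appear literally as <a name="xxx">. The function name split_on_anchors and the "url"/"contents" field names are hypothetical, and the regex-based tag stripping is a crude stand-in for a real HTML parser. Each returned dict would then be indexed as its own Lucene document.

```python
import re
from html import unescape

# Matches anchors of the form <a name="xxx"> (single or double quotes).
# Assumption: "name" is the first attribute; other layouts are not handled.
ANCHOR_RE = re.compile(r'<a\s+name\s*=\s*["\']([^"\']+)["\'][^>]*>', re.IGNORECASE)
TAG_RE = re.compile(r'<[^>]+>')


def strip_tags(html):
    """Crude tag removal and whitespace collapsing, good enough for a sketch."""
    return " ".join(unescape(TAG_RE.sub(" ", html)).split())


def split_on_anchors(source_url, html):
    """Return one {"url", "contents"} dict per anchored section of the page."""
    matches = list(ANCHOR_RE.finditer(html))
    sections = []

    # Text before the first anchor keeps the plain source URL.
    head_end = matches[0].start() if matches else len(html)
    head = strip_tags(html[:head_end])
    if head:
        sections.append({"url": source_url, "contents": head})

    # Each anchor starts a section that runs until the next anchor (or EOF).
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(html)
        body = strip_tags(html[m.end():end])
        if body:
            sections.append({"url": f"{source_url}#{m.group(1)}", "contents": body})
    return sections
```

Documents without any anchors come back as a single section carrying the plain source URL, so the same code path works whether or not boundaries are present.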

Ideally, yes. Unfortunately, I do not control what our customers put into their knowledge base. Where boundaries are present, though, that's actually quite a good suggestion - thanks!

Doug, do you believe that storing token offset information (as an option, of course) is something you'd accept as a contribution to the core of Lucene? Does anyone else think this would be beneficial information to have?


Regards,

Bruce Ritchie
http://www.jivesoftware.com/
