Re: Text storing design and performance question

Chris Hostetter Thu, 11 Jan 2007 13:58:28 -0800

In general, if you are having performance issues with highlighting, the
first thing to do is double check what the bottleneck is: is it accessing
the text to be highlighted, or is it running the highlighter?
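
One rough way to answer that question is to time the two phases separately. The sketch below is only an illustration: `HighlightTiming`, `loadText`, and `highlight` are hypothetical names standing in for however your application reads the stored text and calls the highlighter; nothing here is a Lucene API.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper (not part of Lucene): records how long the
// "fetch the text" phase and the "run the highlighter" phase each take.
final class HighlightTiming {
    final long loadNanos;       // time spent fetching the raw text
    final long highlightNanos;  // time spent producing the highlighted result
    final String result;        // whatever the highlight step returned

    private HighlightTiming(long loadNanos, long highlightNanos, String result) {
        this.loadNanos = loadNanos;
        this.highlightNanos = highlightNanos;
        this.result = result;
    }

    // loadText and highlight are placeholders for your own code.
    static HighlightTiming measure(Supplier<String> loadText,
                                   Function<String, String> highlight) {
        long t0 = System.nanoTime();
        String text = loadText.get();
        long t1 = System.nanoTime();
        String result = highlight.apply(text);
        long t2 = System.nanoTime();
        return new HighlightTiming(t1 - t0, t2 - t1, result);
    }
}
```

Averaging these two numbers over a batch of real queries should show which phase dominates; a single measurement is likely to be skewed by disk caching and JIT warm-up.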


you suggested earlier in the thread that the problem was with accessing
the text...

: >>> Now we keep this big text field on disk (in a file), and feed it to
: >>> the highlighter. Unfortunately the highlighter has to read the file,
: >>> parse it, etc... It's slooow, sometimes over a second on a large

the first question to ask is: are you storing just the text that you want
to highlight, or is it in a higher level format -- i ask because you use
the expression "parse it" which suggests to me that perhaps you are
storing raw HTML or RDF documents or something like that ... things might
be faster for you if you store only the extracted text.

second: for many applications, only a small percentage of the documents
ever wind up getting surfaced via search, and more importantly: a small
percentage tend to be surfaced a high percentage of the time ... perhaps
you could add an in memory cache of the most frequently highlighted
documents (without the highlighting being applied, so cache hits can occur
on popular documents even if the search words are different)
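
A minimal sketch of such a cache, assuming documents are identified by a string id and that a bounded least-recently-used policy is acceptable. `TextCache` and its `loader` function are hypothetical names; the loader stands in for whatever currently reads the text from disk. The cache holds the raw, un-highlighted text, so a hit is useful no matter what the query terms are.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical LRU cache of un-highlighted document text, keyed by doc id.
final class TextCache {
    private final Map<String, String> lru;
    private final Function<String, String> loader; // e.g. reads the file from disk

    TextCache(final int maxEntries, Function<String, String> loader) {
        this.loader = loader;
        // accessOrder=true makes iteration order least-recently-used first,
        // so removeEldestEntry evicts the entry that was touched longest ago.
        this.lru = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached text, loading and caching it on a miss.
    // synchronized because LinkedHashMap is not thread-safe and even a
    // get() mutates the map when access ordering is enabled.
    synchronized String get(String docId) {
        String text = lru.get(docId);
        if (text == null) {
            text = loader.apply(docId);
            lru.put(docId, text);
        }
        return text;
    }

    synchronized int size() {
        return lru.size();
    }
}
```

The right `maxEntries` depends on document size and available heap; the single lock keeps the sketch simple but may become a contention point under heavy concurrent search load.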

if you find that accessing the text isn't the bottleneck, and the actual
highlighting call is what's slow, then perhaps the issue is that the
fragmenter is working too hard to come up with good snippets, and you
could write your own that does less work and takes less time?
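
To illustrate the trade-off (this is not the Lucene Fragmenter interface, just a standalone stand-in), here is a deliberately cheap snippet picker: it takes a fixed-size window around the first occurrence of any query term and makes no attempt to score or rank fragments. `CheapSnippet` and its parameters are hypothetical names.

```java
import java.util.List;

// Hypothetical stand-in for a "lazy" fragmenter: one fixed-size window
// around the earliest term match, with no scoring of alternative fragments.
final class CheapSnippet {

    // Case-insensitive indexOf that works on the original string, so the
    // returned offset is always valid for text.substring(...).
    static int indexOfIgnoreCase(String text, String term) {
        for (int i = 0; i + term.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                return i;
            }
        }
        return -1;
    }

    // Returns up to `window` characters roughly centered on the earliest match,
    // or the start of the text if no term occurs.
    static String snippet(String text, List<String> terms, int window) {
        int best = -1;
        for (String term : terms) {
            int idx = indexOfIgnoreCase(text, term);
            if (idx >= 0 && (best < 0 || idx < best)) {
                best = idx;
            }
        }
        if (best < 0) {
            return text.substring(0, Math.min(window, text.length()));
        }
        int start = Math.max(0, best - window / 2);
        int end = Math.min(text.length(), start + window);
        return text.substring(start, end);
    }
}
```

The snippets will be rougher than what a scoring fragmenter produces (they can cut words in half and ignore later, denser matches), which is exactly the quality-for-speed trade being suggested.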


-Hoss

