Re: [lucy-dev] Thank you, and site built with Lucy

Moritz Lenz Tue, 16 Jul 2013 03:45:13 -0700


On 07/16/2013 01:43 AM, Marvin Humphrey wrote:

On Mon, Jul 15, 2013 at 9:23 AM, Moritz Lenz <[email protected]> wrote:

Some details:

An example search can be found here:
http://irclog.perlgeek.de/perl6/search/?nick=timtoady&q=threads
The backend code is here:
https://github.com/moritz/ilbot/blob/master/lib/Ilbot/Backend/Search.pm

For the indexing I lump together all subsequent lines by the same nick into
one document, and store the database IDs as a comma-separated value in a
second field, and the day in a third field. Each IRC channel has a separate
index.
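
A minimal sketch of the grouping step described above, in plain Perl. The row
layout (id, nick, day, line) and the field names are assumptions made for
illustration, not the actual ilbot schema:

```perl
use strict;
use warnings;

# Hypothetical input: rows as they might come from the log database,
# in chronological order.
my @rows = (
    { id => 1, nick => 'alice', day => '2013-07-15', line => 'hello' },
    { id => 2, nick => 'alice', day => '2013-07-15', line => 'anyone here?' },
    { id => 3, nick => 'bob',   day => '2013-07-15', line => 'yes' },
);

# Merge consecutive rows by the same nick (on the same day) into one
# document, collecting the database IDs as a comma-separated string.
sub group_rows {
    my @docs;
    for my $row (@_) {
        my $last = $docs[-1];
        if (   $last
            && $last->{nick} eq $row->{nick}
            && $last->{day} eq $row->{day})
        {
            $last->{line} .= "\n" . $row->{line};
            $last->{ids}  .= ',' . $row->{id};
        }
        else {
            push @docs, {
                nick => $row->{nick},
                day  => $row->{day},
                line => $row->{line},
                ids  => $row->{id},
            };
        }
    }
    return @docs;
}

my @docs = group_rows(@rows);
# Each element of @docs could then be handed to the indexer's add_doc().
```

With the sample rows, the first document carries both of alice's lines and the
IDs "1,2"; the second carries bob's single line.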

Then when displaying the search results, I retrieve all lines for that day
and channel from the database or cache (which is fast enough, and much
simpler than building complicated queries), and filter out the search
results, plus a few lines before and afterward for context.
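
The display step might look roughly like this. It is only a sketch: the
function name, the hash keys, and the context size are hypothetical, and the
matching IDs are assumed to have been obtained by splitting the stored
comma-separated field of each hit:

```perl
use strict;
use warnings;

# Given all lines of one day and channel (array ref of hashes with an 'id'
# key, in chronological order), the IDs that matched the search, and a
# context size, return the lines to display. Each returned line gets a
# 'match' flag so the caller can render hits differently from context.
sub lines_with_context {
    my ($day_lines, $hit_ids, $context) = @_;
    my %is_hit = map { $_ => 1 } @$hit_ids;

    # Mark every index that lies within $context lines of a hit.
    my %keep;
    for my $i (0 .. $#$day_lines) {
        next unless $is_hit{ $day_lines->[$i]{id} };
        my $from = $i - $context;
        $from = 0 if $from < 0;
        my $to = $i + $context;
        $to = $#$day_lines if $to > $#$day_lines;
        $keep{$_} = 1 for $from .. $to;
    }

    my @out;
    for my $i (sort { $a <=> $b } keys %keep) {
        my $line = $day_lines->[$i];
        push @out, { %$line, match => $is_hit{ $line->{id} } ? 1 : 0 };
    }
    return @out;
}

# Usage with made-up data: ten lines, line 5 matched, two lines of context.
my @day   = map { +{ id => $_, line => "line $_" } } 1 .. 10;
my @shown = lines_with_context(\@day, [5], 2);
```

Overlapping context windows from neighbouring hits collapse naturally, because
the kept indexes are collected in a hash before output.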

So far I'm very happy with these tradeoffs, and like the results.


It's a very nice interface.  Congratulations on a successful design. :)

Thanks. (TBH the design wasn't by me. I provided a useful but ugly
service, and a user was sufficiently annoyed to provide a better design;
this approach has worked several times for me in the open source
community :-)

I wonder whether you might consider making the "line" field stored and
highlightable.

       my $type = Lucy::Plan::FullTextType->new(
           analyzer => $polyanalyzer,
-         stored => 0,
+         highlightable => 1,
       );

I see that you've emboldened the relevant line, but you could go further and
use the Highlighter to emphasize the keywords that were searched for.

     http://lucy.apache.org/docs/perl/Lucy/Docs/Tutorial/Highlighter.html
     http://lucy.apache.org/docs/perl/Lucy/Highlight/Highlighter.html

By default, the Highlighter surrounds keywords with `<strong>` tags, but using
set_pre_tag() and set_post_tag() you can make it use a span with CSS, <blink>
tags, or whatever.


Thanks for the comment.

I'm well aware of the highlighting feature. The reason I don't use it is
that (although it's not obvious from the example search I've linked to)
there is a lot of processing going on (escaping HTML, automatically
turning URLs into links, inserting zero-width breaking spaces into long
words to prevent horizontal scrolling, ...), and I couldn't quite figure
out how to mix my own processing with the highlighting from Lucy.

For my purposes it would be much nicer to obtain indexes into the string
where the search term was found (or alternatively, to be able to set
separate callbacks for the context and for the search results), so that
I could do my own processing with that information.

(I guess I could generate a unique string that doesn't yet appear in the
string, set it as pre/post tag, and then split on that, but that feels
very backwards).
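
For illustration, the split-on-sentinel workaround could be sketched as
follows. Nothing here calls Lucy: the excerpt is assumed to arrive as raw,
unescaped text with the sentinels already in place, and the sentinel
characters, escape_html, and split_on_sentinels are hypothetical names chosen
for this example:

```perl
use strict;
use warnings;

# Minimal HTML escaping, standing in for the real post-processing chain.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Split an excerpt whose matches are wrapped in sentinel strings into a
# list of [ $is_match, $text ] segments, in order of appearance.
sub split_on_sentinels {
    my ($excerpt, $pre, $post) = @_;
    my @segments;
    my $pos = 0;
    while ($excerpt =~ /\Q$pre\E(.*?)\Q$post\E/gs) {
        my $start = $-[0];    # offset where this sentinel pair begins
        push @segments, [ 0, substr($excerpt, $pos, $start - $pos) ]
            if $start > $pos;
        push @segments, [ 1, $1 ];
        $pos = $+[0];         # offset just past the closing sentinel
    }
    push @segments, [ 0, substr($excerpt, $pos) ] if $pos < length $excerpt;
    return @segments;
}

# Assumed sentinels: private-use code points. Real code would have to
# check that they do not occur in the logged text.
my ($pre, $post) = ("\x{E000}", "\x{E001}");
my $excerpt = "perl 6 ${pre}threads${post} are <fun>";

# Apply the custom processing per segment, then wrap only the matches.
my $html = join '', map {
    my ($is_match, $text) = @$_;
    my $escaped = escape_html($text);
    $is_match ? "<strong>$escaped</strong>" : $escaped;
} split_on_sentinels($excerpt, $pre, $post);
```

Because each segment is processed separately, steps that look across segment
boundaries (such as URL detection) would need extra care in a real
implementation.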


Cheers,
Moritz
