On Tue, Sep 05, 2006 at 09:39:46PM -0700, Jeff Breidenbach wrote:
> By private email, one person noted that Xapian is not doing
> so hot on Czech. As in not returning search results.

I'm a little surprised that anything non-ASCII gets found at all, since
Omega's indexer is generating words from utf-8 text but treating it as
iso-8859-1 (unless JeffB has patched it).  Since the second and
subsequent bytes of any multibyte utf-8 sequence are control codes or
symbols in iso-8859-1, that means you'll get a word break after any
non-ASCII character, even if it's an accented letter.
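
To make that concrete, here's a rough Python sketch of the effect.  It
is not Omega's actual indexer code: misread_words() is a made-up helper,
and it assumes a word is simply a run of characters for which Python's
str.isalpha() is true, which is only an approximation of a real
tokenizer.

```python
def misread_words(text):
    """Split text into words the way a tokenizer would if it were
    handed UTF-8 bytes but decoded them as ISO-8859-1.

    Hypothetical illustration only; not Omega's real tokenizer.
    """
    # Encode as UTF-8, then deliberately decode with the wrong charset.
    misread = text.encode("utf-8").decode("iso-8859-1")

    words = []
    current = []
    for ch in misread:
        if ch.isalpha():
            current.append(ch)
        elif current:
            # A non-letter ends the current word.
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


# "č" is bytes C4 8D and "ý" is bytes C3 BD in UTF-8.  Read as
# ISO-8859-1, C4 and C3 are letters, but 8D is a control code and
# BD is the "one half" sign, so each forces a word break.
print(misread_words("český"))  # ['Ä', 'eskÃ']
```

So a search for the real word never matches, because the index only
ever saw the fragments.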

But anyway, utf-8 support for Xapian is pretty much written now.  I've
just sent JeffB a patch to try (but it'll require a reindex so it'll
probably not be live for a while).

> Another managed to craft a query that causes PyLucene to throw an
> exception.

Any query which it can't parse seems to do that.  Try searching
for * or ", for example.

> In the meantime, I'd love to hear some pointed comments
> on the user interface. Do people prefer Omega's "search engine
> standard" layout, or PyLucene's "blend into service" appearance?

FWIW, I think I prefer the "blend in" look.  But the look is essentially
orthogonal to the choice of engine - it's just a matter of slotting the
appropriate HTML into the templates.

Cheers,
    Olly

_______________________________________________
Discussion list for The Mail Archive
Gossip@jab.org
http://jab.org/cgi-bin/mailman/listinfo/gossip
