Lucene: it slices, it dices, it's twice as nice as...well, htDig 3.1 for sure.

Here are a couple of reasons why Lucene is the Battle Royale champ in my eyes:

* Powerful searching features: http://lucene.apache.org/java/docs/queryparsersyntax.html - let's see the other kids do all this.

* When was the last time you searched Unicode characters on The Mail Archive? That's right, never! But with our experimental Lucene interface, you can: http://www.mail-archive.com/lucene/search.py?list=freebsd-users-jp%40jp.freebsd.org&query=%E3%81%95%E3%82%93
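For anyone who wants to try their own Unicode query, here is a rough sketch of how a URL like the one above can be put together. It assumes only what that URL shows: a `list` parameter, a `query` parameter, and UTF-8 percent-encoding. The helper name `search_url` is made up for illustration, and this is Python 3 standard library code, not the code behind the actual interface.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base address taken from the example URL above.
BASE = "http://www.mail-archive.com/lucene/search.py"

def search_url(list_name, query):
    """Build a search URL; urlencode percent-encodes the values as UTF-8."""
    # Hypothetical helper: parameter names are copied from the example URL.
    return BASE + "?" + urlencode({"list": list_name, "query": query})

url = search_url("freebsd-users-jp@jp.freebsd.org", "さん")
print(url)

# Decoding the query string recovers the original characters.
params = parse_qs(urlsplit(url).query)
print(params["query"][0])
```

The "@" in the list name becomes %40 and each Japanese character becomes three percent-encoded bytes, which is what the query in the link above looks like.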


Downside/upside: Xapian has the nice Omega web interface. Lucene has Nutch, but for a few reasons it didn't fit our needs. So, our web interface is coded up by hand using PyLucene and mod_python. The downside is that it meant a bit more work for us, but the upside is complete configurability. Paging is missing right now - the interface simply displays the best 100 matches.
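For what it's worth, paging over an already-ranked list of matches could look something like the sketch below. This is plain Python over a list, not PyLucene, and not what the interface currently does. The function name `page_of`, the page size of 10, and keeping the 100-match cap are all assumptions for illustration.

```python
def page_of(hits, page, per_page=10, cap=100):
    """Return (hits on the requested 1-based page, total number of pages)."""
    capped = hits[:cap]                       # keep only the best `cap` matches
    # Ceiling division; always report at least one (possibly empty) page.
    total_pages = max(1, -(-len(capped) // per_page))
    page = min(max(page, 1), total_pages)     # clamp out-of-range page numbers
    start = (page - 1) * per_page
    return capped[start:start + per_page], total_pages

# Stand-in for a ranked result list; real hits would come from the search library.
ranked = list(range(250))
third_page, pages = page_of(ranked, 3)
print(third_page, pages)
```

Clamping the page number means a stale or hand-edited page value in a URL still lands on a valid page instead of an error.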

Downside: our PyLucene interface doesn't have a killer name like Omega.

Jeff Marshall
[EMAIL PROTECTED]



Jeff Breidenbach wrote:
If you don't care about search, don't read further.

===

Sunday, Sunday, SUNDAY!

Come see the data crunching, webpage hopping, free-styling
search library action. Two monster libraries, titans of Free
Software technology, compete to become the native search
engine for The Mail Archive.

Watch as Xapian Omega crushes and destroys the competition,
finishing off queries in milliseconds. This probabilistic juggernaut is
a battle tested, email chewing reigning champion in Europe. Honed
for years and more hardened than quartz, Jeff Breidenbach will
drive Xapian Omega during this Battle Royale.

PyLucene is a mild mannered garbage collectin' programming library
just like your mom's search index. That is, if your mom's search index
could jump partitions, crush gigabytes down to tiny segments, and
plow through millions of records. Forged on the anvil of a Xerox PARC
alumnus, brimming with black magic, Lucene will be wrought by the
indomitable Jeff Marshall.

We're taking these two byte belching, buffer oversized, monster libraries
and pitting them head to head. Old geezer HtDig 3.1 will also make a
final appearance in the arena.  All three engines can run on any list,
just by replacing gossip@jab.org with the listname of your choice.

Who will win the monster rally? Xapian vs Lucene? Jeff vs Jeff? Yes,
you decide! Send comments to gossip, or privately if you are shy,
for the next week or so. Bonus points for using phrases like
"slamming!", "spectacular" or "crushed like a bug". Who's got
the slickest user interface? Which contender has superior
data-crunching performance? How about grit, determination and
the baddest sounding name?

Want to see something tweaked? Have questions? Ask and it will
be done if humanly possible - this is a gritty bit-for-bit battle of
hotrod software and programmer ingenuity, no holds barred.  Ladies
and gentlemen...  Start your search engines!

HtDig 3.1
http://www.mail-archive.com/cgi-bin/htsearch?config=gossip_jab_org&words=magically

Xapian Omega
http://www.mail-archive.com/cgi-bin/omega/omega?P=magically&[EMAIL PROTECTED]

PyLucene
http://www.mail-archive.com/lucene/[EMAIL PROTECTED]&query=magically

_______________________________________________
Discussion list for The Mail Archive
Gossip@jab.org
http://jab.org/cgi-bin/mailman/listinfo/gossip


