-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

On 07.10.2006 at 17:40, Cristina Belderrain wrote:

Let me remind you that all this must be done just to provide something
that's already there: Nutch is built on top of Lucene, after all. If
it's hard to understand why Lucene's capabilities were simply
neutralized in Nutch, it's even harder to figure out why no choice was
left to users by means of some configuration file.

I think this issue is rooted in the underlying philosophy of Nutch: Nutch was designed with a Google-sized (or similar) crawler and indexer in mind. Regular expressions and wildcard queries do not seem to fit into this philosophy, as such queries would be far less efficient on a huge data set than simple boolean queries.
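
To illustrate the efficiency argument, here is a toy sketch in Python. It is not Nutch or Lucene code; the inverted index, the sample documents and all function names are made up for illustration. An exact term is a single dictionary lookup, while this naive wildcard match has to look at every distinct term in the index. Real engines are smarter about prefixes, but the extra work still grows with the size of the term dictionary.

```python
import fnmatch

def build_index(docs):
    """Build a toy inverted index: term -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index

def term_query(index, term):
    """Exact term: one dictionary lookup, independent of index size."""
    return index.get(term, set())

def boolean_and(index, terms):
    """Boolean AND: intersect the posting sets of a few exact terms."""
    postings = [term_query(index, t) for t in terms]
    return set.intersection(*postings) if postings else set()

def wildcard_query(index, pattern):
    """Naive wildcard: test the pattern against every distinct term.

    Returns the matching document ids and the number of terms scanned.
    """
    hits = set()
    scanned = 0
    for term, doc_ids in index.items():
        scanned += 1
        if fnmatch.fnmatchcase(term, pattern):
            hits |= doc_ids
    return hits, scanned

# Hypothetical sample data, for illustration only.
DOCS = {
    1: "nutch crawls the web",
    2: "lucene indexes text",
    3: "nutch builds on lucene",
}
INDEX = build_index(DOCS)

if __name__ == "__main__":
    print(boolean_and(INDEX, ["nutch", "lucene"]))
    print(wildcard_query(INDEX, "luc*"))
```

With three documents the difference is invisible, but the wildcard scan touches every distinct term, and a web-scale index has a very large number of them.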

Nevertheless, I agree that there should be an option to choose the Lucene query engine instead of the Nutch-flavoured one, because Nutch has proven to be just as suitable for areas that do not require such efficient queries (intranet crawling, for instance) as it is for an all-out web indexing application.

- --
Best regards,
Björn Wilmsmann


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (Darwin)

iD8DBQFFJ+75gz0R1bg11MERAgT7AJ4mPRF8Z0BR2yLCm5Pxsz4VvtTI6QCfcS8b
q8gM8LQapjAloNIRwNV+osE=
=v7Lf
-----END PGP SIGNATURE-----
