FYI, I've built a 111 million page corpus on the
latest & greatest code (w/ Lucene 1.4). Hopefully the
scp'ing of indices to servers will be complete by
10:00 pm EST, so you should be able to run queries then
and see the updated results.

Most documents have been refreshed 3 times, so the
scoring should be better than in the current index.

http://www.mozdex.com

I'll reply to this message once it's completed, but I
thought I would let people know that Nutch/Lucene has
worked great thus far to build this index, and our next
goal will be 250 million URLs :)


_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
