I just wanted to thank everyone for the beautiful project that Nutch is :)
The index has been refreshed (it took a bit longer than I'd expected to copy everything over), and the results after Doug's suggestion are OUTSTANDING compared to my prior index.

Thanks Doug, and thanks to everyone else and all the developers! I welcome you to try out the index. If you want a copy of the corpus, I can make it available on FTP as well, or burn it to a couple of DVDs.

-byron

--- Byron Miller <[EMAIL PROTECTED]> wrote:
> FYI, I've built a 111 million page corpus on the
> latest & greatest code (w/ Lucene 1.4). Hopefully the
> scp'ing of indices to servers will be complete by
> 10:00 pm EST, so you should be able to run queries
> then and see the updated results.
>
> Documents have been refreshed for the most part 3
> times, so the scoring should be better than the
> current index.
>
> http://www.mozdex.com
>
> I'll reply to this message once completed, but I
> thought I would let people know Nutch/Lucene has
> worked great thus far to build this index, and our
> next goal will be 250 million URLs :)
>
> -------------------------------------------------------
> This SF.Net email sponsored by Black Hat Briefings &
> Training.
> Attend Black Hat Briefings & Training, Las Vegas
> July 24-29 - digital self defense, top technical
> experts, no vendor pitches, unmatched networking
> opportunities. Visit www.blackhat.com
> _______________________________________________
> Nutch-developers mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/nutch-developers
