> I have a method that might work for clustering. Would it be possible to
> get a test set from your anchor texts?

Yes, I have 50 million pages and also a dataset of 200 million provided by the Nutch organization.
I sent them a disk and they copied the content onto it. Would you like to do the same?
_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
