On Jun 26, 2007, at 10:46 AM, Doğacan Güney wrote:
>>
>> I think that the distributed online Index part should be done outside of
>> Nutch (or if done here do it with extreme caution:) so it does not get
>> tied to Nutch.
>
> I am not sure I understand you here. If I have 10 machines I am using
> for serving indexes(I am assuming I have a Solr instance running on
> each one), IndexerSolr should be able to partition my index to 10
> machines.
>
It may be that Solr handles this with a master server to send to distributed Solr indexes. I currently use Sami's SolrIndexer with the trunk solrj, and we have a single Solr index of about 5m pages on a single 4GB machine, with stored content. Although the indexing is fast and stable, complicated full text queries are too slow for comfort (forget about MLT/faceting etc.). We are currently looking into ways of partitioning this and we may be of service in the future here.
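
To make the partitioning idea concrete, here is a rough sketch of how documents could be assigned to 10 Solr machines by hashing each page URL. This is only an illustration, not how SolrIndexer or Solr itself does it: the function names, the shard URLs and the choice of MD5 are all assumptions made up for the example.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id (e.g. a page URL) to a shard number.

    Uses a stable hash so the same id always lands on the same shard,
    regardless of which machine or process computes it.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(doc_ids, shard_urls):
    """Group document ids by the (hypothetical) Solr instance that should index them."""
    buckets = {url: [] for url in shard_urls}
    for doc_id in doc_ids:
        target = shard_urls[shard_for(doc_id, len(shard_urls))]
        buckets[target].append(doc_id)
    return buckets


if __name__ == "__main__":
    # Hypothetical host names, one Solr instance per machine.
    shards = [f"http://solr{i}.example:8983/solr" for i in range(10)]
    pages = [f"http://example.com/page/{i}" for i in range(1000)]
    for url, ids in partition(pages, shards).items():
        print(url, len(ids))
```

Plain modulo hashing like this moves most documents to a different shard whenever the number of machines changes, so a real setup would probably want something more careful.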
