Hello,

From what I've learned, once you have a new segment created and indexed, you can drop it in "as is" without merging -- you just need to shutdown/restart Tomcat (or try the "touch" trick mentioned a few posts ago).

Merging segments together will help speed up query response times. Correct me if I'm wrong, folks, but ideally you have ONE SEGMENT only per machine for optimal performance.

QUESTION: I've 10 segments, each about 7GB (~700,000 documents). The mergesegs tool is taking a LONG time -- 14.5 hours to do the merge and on its way to over 13 hours to index the new segment. Any tricks to speeding this up?! I read about 100M-document Nutch instances and I'm only at 7M, but the merging and indexing takes days -- how long do such maintenance tasks take with these larger installations?

...
051215 110130 Processed 4600000 records (227.03795 rec/s)
051215 110258 Processed 4620000 records (226.12411 rec/s)
051215 110427 Processed 4640000 records (225.17198 rec/s)
051215 110604 Processed 4660000 records (207.01791 rec/s)
051215 110732 Processed 4680000 records (226.23923 rec/s)
051215 110856 Processed 4700000 records (237.00052 rec/s)
051215 111446 Processed 4720000 records (57.266716 rec/s)
051215 111615 Processed 4740000 records (223.50114 rec/s)
051215 111742 Processed 4760000 records (231.44398 rec/s)
051215 111909 Processed 4780000 records (229.86656 rec/s)
051215 112043 Processed 4800000 records (212.17458 rec/s)
...

DaveG

-----Original Message-----
From: Arun Kaundal [mailto:[EMAIL PROTECTED]
Sent: Thursday, December 15, 2005 5:24 AM
To: [email protected]
Subject: Re: After mergesegs

Hi there,
I also want to merge segment based on certain criteria, What I need to do so that newly created segment can be re-indexed and searchable. Can u suggest me how can I achieve that. In what order I need to execute various Nutch Tools
Thanx in advance...

On 12/9/05, Andy Liu <[EMAIL PROTECTED]> wrote:
>
> I'm not sure exactly what you're asking, but if you're merging segments,
> you'll have to reindex that segment before you can make it searchable.
> "updatedb" doesn't affect the segments. it just updates the webdb with the
> results of the fetch. so if you already ran updatedb on the pre-merged
> segments, you don't need to run it again.
>
> On 12/8/05, Goldschmidt, Dave <[EMAIL PROTECTED]> wrote:
> >
> > Hi, just wanted to be sure - after I merge segments via the "mergesegs"
> > tool, I need to use the "updatedb" tool before dropping the new indexes
> > in, correct?
> >
> > And, as just posted, I need to shutdown and restart Tomcat, too, yes?
> >
> > Thanks,
> >
> > DaveG
> >

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
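
A rough way to sanity-check how long the rest of the merge might take is to work it out from the "Processed N records" progress lines in the log excerpt. The Python sketch below is only an illustration, not part of Nutch: it assumes the two leading fields are a yymmdd date and an HHMMSS time, and it assumes a total of about 7,000,000 records (10 segments of ~700,000 documents each), which may not match the real record count. The function names are made up for this example.

```python
import re
from datetime import datetime

# Matches progress lines such as:
#   051215 110130 Processed 4600000 records (227.03795 rec/s)
# Assumption: the first field is yymmdd and the second is HHMMSS.
LINE_RE = re.compile(r"(\d{6}) (\d{6}) Processed (\d+) records")


def parse_progress(lines):
    """Return a list of (timestamp, record_count) pairs, one per matching line."""
    samples = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1) + match.group(2), "%y%m%d%H%M%S")
            samples.append((stamp, int(match.group(3))))
    return samples


def average_rate(samples):
    """Records per second between the first and the last sample."""
    (start_time, start_count), (end_time, end_count) = samples[0], samples[-1]
    return (end_count - start_count) / (end_time - start_time).total_seconds()


def hours_remaining(samples, total_records):
    """Estimated hours left, assuming the average rate holds."""
    remaining = total_records - samples[-1][1]
    return remaining / average_rate(samples) / 3600.0


# Three of the lines from the log excerpt, including the slow interval.
log = [
    "051215 110130 Processed 4600000 records (227.03795 rec/s)",
    "051215 111446 Processed 4720000 records (57.266716 rec/s)",
    "051215 112043 Processed 4800000 records (212.17458 rec/s)",
]

samples = parse_progress(log)
rate = average_rate(samples)
# 7_000_000 is an assumed total, taken from the "10 segments x ~700,000 documents" figure.
eta_hours = hours_remaining(samples, 7_000_000)
print(f"average rate: {rate:.1f} rec/s, estimated hours remaining: {eta_hours:.1f}")
```

Averaging over the whole window rather than trusting any single "rec/s" figure matters here, because one stalled interval like the 57 rec/s line pulls the effective rate well below the ~225 rec/s the other lines suggest.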
