Sounds fine to me, although more experienced people here may have a different opinion.

One small thing: if you are setting up each site individually, fully disable the spidering. That way, you can inject the individual sites yourself.
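
Roughly, with the one-shot crawl command, that could look like the sketch below. This assumes the 0.7-style bin/nutch crawl options, a per-site seed file, and that a depth of 1 counts as "no spidering" (only the URLs you inject get fetched); the paths and the urlfilter line are only examples, so check them against the tutorial for your version.

# urls/site-a.txt holds the seed URLs for this site, one per line;
# these are the only pages that get injected.

# In conf/crawl-urlfilter.txt, keep the crawl on that host, e.g.
#   +^http://([a-z0-9]*\.)*example-a\.com/
# followed by a final "-." line to drop everything else.

# Depth 1: only the injected URLs are fetched, no links are followed.
bin/nutch crawl urls/site-a.txt -dir crawl/site-a -depth 1 -threads 4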

Good luck,
Emilijan
Ian Reardon wrote:

I am going to crawl a small set of sites, and I never want to go off
site; I also want to strictly control my link depth.

I set up crawls for each site using the crawl command, then manually
move the segments folder to my "master" directory and re-index (this
can all be scripted). This gives me the flexibility to QA each
individual crawl.

Am I jumping through unnecessary hoops here or does this sound like a
reasonable plan?
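
For reference, the per-site crawl, QA, then merge-and-reindex workflow described above might be scripted roughly like this. The directory layout is made up, and the indexing commands in particular vary between Nutch versions, so treat it as a sketch to adapt rather than something to run as-is.

#!/bin/sh
# Crawl each site into its own directory so it can be QA'd in isolation,
# then copy the approved segments into the master directory.
MASTER=master
mkdir -p $MASTER/segments

for SITE in site-a site-b site-c; do
    # Depth 1 per the no-spidering suggestion above; raise it if you
    # do want a limited amount of link following.
    bin/nutch crawl urls/$SITE.txt -dir crawl/$SITE -depth 1 -threads 4
    # ... inspect crawl/$SITE here before promoting it ...
    cp -r crawl/$SITE/segments/* $MASTER/segments/
done

# Re-index the master segments; the exact index/dedup/merge invocations
# differ between Nutch versions, so take these from the tutorial you follow.
for SEG in $MASTER/segments/*; do
    bin/nutch index $SEG
done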



