Thanks for the helpful reply, very kind!
>
>> I don't know what's up with FooFactory at the moment, but I put together
>> the Solr&Nutch page. I may be able to send/post something. Was there
>> something in particular you were looking for?

I found the Google cached copy of the site, but it links to other resources that I need to see. I tried searching Google for them (site:www.foofactory.fi), with no luck. These are the links to information I'd like to see (copied from the FooFactory site page):

1. Set up conf/regex-urlfilter.txt
2. Set up conf/nutch-site.xml
3. Generate a list of seed urls into folder urls
4. Grab this simple script that will help you along in your crawling task.

and

A patch against Nutch trunk is provided for those who wish to be brave.

The "patch" is a link. (Incidentally, I don't know if it's required or optional, but I figured I'd be brave.)
