Hello, I've been subscribed to this list for several months and have read numerous posts, although most sail merrily above my noggin. I hope this is not an inappropriate post.
I want to start a search engine (SE) and have done quite a lot of thinking about it, although I'm not a programmer by any stretch of the imagination, and my budget is .. um .. "challenged".

The SE I want is a clustering SE for travel, for specific regions.

- I want to spider and index as many pages as possible (I think) on *one server*.
- I want to exclude *all* predominant affiliate sites and all directory sites.
- I *think* I only want to spider to 3 levels, as the site should be about travel resources more so than detailed information.
- I believe updating the information (respidering) every 30 days is sufficient, maybe even every 60 days?
- I think the easiest part is to set up Nutch and get it working, and the harder part is configuring the crawler (or the indexer?) to include only those URLs that fit the requirements I have defined above.
- One specific section of the DMOZ index would probably be okay for seeding the database.

So I'm hoping to get:

- Some reasonable comments on my plan
- A price from a consultant (free works too! ;) to get Nutch + clustering set up and running
- A price from a consultant to configure the spider/indexer

Thanks and best regards,
Dave W.
