Re: Limit Nutch Crawl to Seed URLs

2009-03-14 Thread yanky young
The domain URL filter seems to be in 1.0. Maybe you can just check out this plugin code from the 1.0 trunk and build it into your 0.9 code base. Good luck, yanky. 2009/3/14 MyD myd.ro...@googlemail.com: Where can I find the domain urlfilter? I'm using the branch 0.9... Cheers, Markus. Dennis Kubes-2 wrote:

Re: The Future of Nutch

2009-03-14 Thread yanky young
Hi: I also agree that most usage scenarios of Nutch are in the vertical search area. In some unusual cases users may not even use Nutch indexing at all; they just crawl some pages for mirroring purposes. And in some cases of vertical search, users only need a fraction of the pages, e.g. house rent

Re: The Future of Nutch

2009-03-14 Thread consultas
I have been using Nutch for more than four years now as a vertical search engine, having indexed, at times, over one million pages. On the other hand, I know nothing about programming or some specialized applications. Words like Solr and others are like aliens to me. I am just interested

Re: The Future of Nutch

2009-03-14 Thread John Martyniak
I think that this would be the case for making Nutch a top-level Apache project, so that you can publish the framework and a complete app but still tie it all together. Personally, I think that is the strength of Nutch: that you can use it right out of the box, without