Stefan,

Thanks a bunch. It turns out the ? mark was the cause of the problems. I'm able to run the search now and am not having any problems with it. Thanks a ton!

Matt

Stefan Groschupf wrote:
> Hi Matt,
> the first impression I have is that your segment has only 6 pages.
> You generated many empty pages.
> Is the website a CMS? Does it have question marks in the URLs? You exclude
> all pages with question marks.
> Check how many pages are in your web db after injecting.
> Check how many pages are in your segment fetch list.
>
> HTH
> Stefan
>
>> -.
>>
>>
>> Stefan Neufeind wrote:
>>
>>> Matthew Holt wrote:
>>>
>>>> Just FYI, both of the sites I am trying to crawl are under the
>>>> same domain. Only the sub-domains differ. It works for one; for the
>>>> other it only appears to fetch 6 or so pages and then doesn't fetch any more.
>>>> Do you need any more information to solve the problem? I've tried
>>>> everything and haven't had any luck. Thanks.
>>>
>>>
>>>
>>> What does your crawl-urlfilter.txt look like?
>>>
>>> Stefan
>>>
>>
>>
>>
> _______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
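
For anyone finding this thread later, here is a rough Python sketch of why a question mark rule in crawl-urlfilter.txt can starve a crawl. It is not Nutch code. It assumes the filter reads rules top to bottom, lets the first matching pattern decide (`+` accepts, `-` rejects), and rejects a URL that matches nothing. The rule text is modelled on the kind of rules such a file typically contains, `example.com` is a placeholder domain, and `parse_rules` and `accepts` are hypothetical helper names.

```python
import re

# Hypothetical rule set in the style of a crawl-urlfilter.txt.
# example.com stands in for the real domain, which the thread does not give.
RULES = r"""
# skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
# accept hosts under the placeholder domain
+^http://([a-z0-9]*\.)*example.com/
# skip everything else
-.
"""


def parse_rules(text):
    """Turn rule lines into (allow, compiled_pattern) pairs, skipping comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Assumed semantics: the first rule whose pattern is found in the URL decides."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # assumed default when no rule matches


if __name__ == "__main__":
    rules = parse_rules(RULES)
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/page?id=3",
    ):
        print(url, "->", "fetch" if accepts(rules, url) else "skip")
```

Under these assumptions the static page is kept and the `?id=3` page is dropped before it ever reaches the fetch list, which would match the symptom of a CMS-driven site yielding only a handful of pages. Removing or relaxing the `-[?*!@=]` line is the corresponding fix, at the cost of possibly crawling many near-duplicate query URLs.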
