Just use a depth of 10 or so. If there are no more pages to crawl, one depth level more or less does no harm. For normal websites, anything in the range of 5 to 10 for depth should IMHO be reasonable.
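To see why an over-generous depth is harmless, here is a toy breadth-first sketch (a hypothetical in-memory link graph, not Nutch code): once the frontier is exhausted, raising the depth limit fetches nothing extra.

```python
from collections import deque

# Hypothetical link graph (illustration only, not Nutch itself):
# each page maps to the pages it links to. This site is 3 levels deep.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/a1"],
    "/b": [],
    "/a1": [],
}

def crawl(start, max_depth):
    """Breadth-first crawl up to max_depth link hops from start."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        url, depth = frontier.popleft()
        if depth == max_depth:
            continue  # don't expand links beyond the depth limit
        for nxt in LINKS.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen

# Depth 3 already reaches every page, so depth 10 visits the same set.
print(crawl("/", 3) == crawl("/", 10))  # → True
```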
topN: This allows you to work on only the highest-ranked URLs not yet fetched. It acts as a maximum-pages limit for each run (i.e. per depth level).

Regards,
 Stefan

Matthew Holt wrote:
> Ok thanks.. as far as crawling the entire subdomain.. what exact command
> would I use?
>
> Because depth says how many pages deep to go.. is there anyway to hit
> every single page, without specifying depth? Or should I just say
> depth=10? Also, topN is no longer used, correct?
>
> Stefan Neufeind wrote:
>
>> Matthew Holt wrote:
>>
>>> Question,
>>> I'm trying to index a subdomain of my intranet. How do I make it
>>> index the entire subdomain, but not index any pages off of the
>>> subdomain? Thanks!
>>
>> Have a look at crawl-urlfilter.txt in the conf/ directory.
>>
>> # accept hosts in MY.DOMAIN.NAME
>> +^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
>>
>> # skip everything else
>> -.
>>
>> Regards,
>>   Stefan
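For reference, the filter rules quoted above are applied top to bottom and the first matching pattern decides: a `+` rule accepts the URL, a `-` rule rejects it. A small Python sketch of that first-match semantics (mirroring the rule format, not Nutch's actual filter class; `sub.example.com` is a placeholder for your real subdomain):

```python
import re

# Rules in the style of conf/crawl-urlfilter.txt, applied in order;
# the first pattern that matches decides. '+' accepts, '-' rejects.
# 'sub.example.com' is a made-up stand-in for MY.DOMAIN.NAME.
RULES = [
    ("+", re.compile(r"^http://([a-z0-9]*\.)*sub\.example\.com/")),
    ("-", re.compile(r".")),  # skip everything else
]

def accepts(url):
    for sign, pattern in RULES:
        if pattern.search(url):
            return sign == "+"
    return False  # nothing matched: reject by default

print(accepts("http://sub.example.com/page.html"))  # → True
print(accepts("http://www.sub.example.com/"))       # → True
print(accepts("http://other.example.com/"))         # → False
```

Note that the leading `([a-z0-9]*\.)*` also admits deeper hosts such as `www.sub.example.com`; drop it if you want exactly one host.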
