In addition to Lewis's suggestions, please try a larger value for topN. If your configuration files are set up correctly, you should see more pages crawled.
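
As a rough sketch of what I mean: -topN caps how many URLs are selected for fetching in each round, so with -topN 10 there is little room for outlinks. The snippet below reuses the seed directory and depth from your command. The value 1000 is only an illustrative guess rather than a tuned number, and the helper names crawl_cmd and run_crawl are made up for this example.

```shell
#!/bin/sh
# Sketch only: values below are illustrative assumptions, not tuned settings.
SEED_DIR=urls   # seed directory from the original command
DEPTH=3         # number of crawl rounds, as in the original command

# Print the crawl command for a given topN (max URLs selected per round).
crawl_cmd() {
  echo "bin/nutch crawl $SEED_DIR -depth $DEPTH -topN $1"
}

# Run the command only if the nutch script is present; otherwise just show it.
run_crawl() {
  cmd=$(crawl_cmd "$1")
  if [ -x bin/nutch ]; then
    $cmd
  else
    echo "would run: $cmd"
  fi
}

run_crawl 1000
```

Run it from the Nutch installation directory. If the larger topN still fetches only the seed URLs, the urlfilter rules are the next thing to check.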

On Thu, Mar 14, 2013 at 12:30 AM, Lewis John Mcgibbney <[email protected]> wrote:

> You can use the parsechecker from the nutch script to see what outlinks you
> should be picking up.
> Once you know how the crawler is configured then you can begin to assert
> why outlinks are not either being parsed out, or subsequently being
> fetched.
> hth
>
> On Wed, Mar 13, 2013 at 6:13 PM, Dat Tran <[email protected]> wrote:
>
> > Thank for your reply. After configure urlfilter, i execute this command to
> > crawl
> > bin/nutch crawl urls -topN 10 -depth 3
> > (urls is the directory where seed list located ).
> > But it crawls, fetchs and parses only links which are defined in seed
> > list,
> > not the outlinks.
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Iterative-Crawling-tp4046501p4047209.html
> > Sent from the Nutch - User mailing list archive at Nabble.com.
>
> --
> *Lewis*

--
Kiran Chitturi

<http://www.linkedin.com/in/kiranchitturi>

