RE: CrawlTool - fetching only first page

Fuad Efendi Thu, 11 Aug 2005 09:36:11 -0700

I noticed some changes to the CrawlTool class between 0.6 and 0.7...
That is probably the cause...

Thanks



-----Original Message-----
From: Fuad Efendi [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 11, 2005 12:22 PM
To: [email protected]
Subject: RE: CrawlTool - fetching only first page


I loaded the latest code, created nutch-0.7-dev, and ran the command
bin/nutch crawl url.txt -dir test.crawl -depth 1

It still does not work. In nutch-0.6, with the same depth and url.txt,
it works and fetches about 30 files.



-----Original Message-----
From: Fuad Efendi [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 11, 2005 11:29 AM
To: [email protected]
Subject: RE: CrawlTool - fetching only first page


Yes, I defined depth 5 (I noticed that it creates 5 segments).
It fetches only the main URLs, without the linked pages.


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 11, 2005 11:20 AM
To: [email protected]
Subject: Re: CrawlTool - fetching only first page


Did you define a depth?
What is your exact command?

It should be something like

./nutch crawl urls -dir crawldir -threads 1 -depth 3

Nils 

----- Original Message ----
From:    Fuad Efendi <[EMAIL PROTECTED]>
To:      [email protected]
Date:    11.08.2005 17:16
Subject: CrawlTool - fetching only first page

> I configured the classpath to include the \conf\ and \build\ folders
> (the latter contains the plugins), and ran CrawlTool without any errors,
> but it fetches only the first page and does not fetch linked pages.
> Windows XP.
>
> What is missing?

