...by host.
I guess you are going to suggest changing PartitionUrlByHost. In that case, could you please point out how to change it to get an "impolite" fetcher?



Luca Rondanini


Doğacan Güney wrote:
On 7/25/07, Luca Rondanini <[EMAIL PROTECTED]> wrote:

Hi all,
The generate step of my crawl process is taking more than 2 hours... is that
normal?


Are you partitioning urls by ip or by host?


this is my stat report:

CrawlDb statistics start: crawl/crawldb
Statistics for CrawlDb: crawl/crawldb
TOTAL urls:     586860
retry 0:        578159
retry 1:        1983
retry 2:        2017
retry 3:        4701
min score:      0.0
avg score:      0.0
max score:      1.0
status 1 (db_unfetched):        164849
status 2 (db_fetched):  417306
status 3 (db_gone):     4701
status 5 (db_redir_perm):       4
CrawlDb statistics: done





Luca Rondanini



