I started the crawler with about 2000 sites. The fetcher initially achieved about 7 pages/sec, but throughput gradually dropped to about 2 pages/sec, and sometimes as low as 0.5 pages/sec. The fetch list had 300k pages and I used 500 threads. What are the main causes of this slowdown? Sample status output is below:

050927 005952 status: segment 20050927005922, 100 pages, 3 errors, 1784615 bytes, 14611 ms
050927 005952 status: 6.8441586 pages/s, 954.2334 kb/s, 17846.15 bytes/page
050927 010005 status: segment 20050927005922, 200 pages, 9 errors, 3656863 bytes, 28170 ms
050927 010005 status: 7.0997515 pages/s, 1014.1726 kb/s, 18284.314 bytes/page

After some time ...
050927 171818 status: segment 20050927070752, 101400 pages, 7201 errors, 2593026554 bytes, 36216316 ms
050927 171818 status: 2.799843 pages/s, 559.3617 kb/s, 25572.254 bytes/page
050927 171832 status: segment 20050927070752, 101500 pages, 7204 errors, 2595591632 bytes, 36230516 ms
050927 171832 status: 2.8015058 pages/s, 559.6956 kb/s, 25572.332 bytes/page
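
Judging by the arithmetic, the pages/s figure in each status line is a cumulative average since the segment started (for example, 101500 pages / 36230.516 s is about 2.80 pages/s). To see the current speed rather than the running average, the rate between two consecutive status lines can be computed instead. The following is only a rough sketch in Python; the helper names (parse_status, cumulative_rate, interval_rate) are made up for illustration and are not part of the crawler.

```python
import re

# Matches the per-segment counter lines quoted above, e.g.
# "... status: segment 20050927070752, 101400 pages, 7201 errors, 2593026554 bytes, 36216316 ms"
STATUS_RE = re.compile(
    r"segment (\d+), (\d+) pages, (\d+) errors, (\d+) bytes, (\d+) ms"
)


def parse_status(line):
    """Return the counters from one status line, or None if it does not match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    segment, pages, errors, nbytes, ms = m.groups()
    return {
        "segment": segment,
        "pages": int(pages),
        "errors": int(errors),
        "bytes": int(nbytes),
        "ms": int(ms),
    }


def cumulative_rate(status):
    """Pages per second averaged over the whole segment so far."""
    return status["pages"] / (status["ms"] / 1000.0)


def interval_rate(prev, cur):
    """Pages per second between two status lines of the same segment."""
    if prev["segment"] != cur["segment"]:
        raise ValueError("status lines belong to different segments")
    elapsed_s = (cur["ms"] - prev["ms"]) / 1000.0
    if elapsed_s <= 0:
        raise ValueError("status lines are not in increasing time order")
    return (cur["pages"] - prev["pages"]) / elapsed_s


prev = parse_status("050927 171818 status: segment 20050927070752, "
                    "101400 pages, 7201 errors, 2593026554 bytes, 36216316 ms")
cur = parse_status("050927 171832 status: segment 20050927070752, "
                   "101500 pages, 7204 errors, 2595591632 bytes, 36230516 ms")
print("cumulative pages/s:", cumulative_rate(cur))
print("interval pages/s:  ", interval_rate(prev, cur))
```

For the last two status lines above, the interval covers 100 pages in 14200 ms, so the rate over that interval comes out well above the cumulative figure. That suggests the slowdown may come in bursts rather than being steady, though two samples are not enough to say for sure.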

Thanks,
AJ
