Hi,

When a large fetch finally finishes, we see the typical -activeThreads=0 for a 
long time, with slightly increased RAM consumption (relative to during the 
fetch) and extremely high IO-wait time.

At first it looks as if the fetch job is writing out the files it 
downloaded, but it cannot be, since the total size of the data is much greater 
than the used and available RAM. After a while the IO-wait drops to almost zero 
and process time increases again while it is still finishing the fetch job. At 
this point RAM consumption drops back to its usual level during a fetch.

My question: can anyone please explain this behaviour or at least explain 
what's happening when the fetcher finishes?

Since IO-wait simply stalls the process, the non-IO-wait time is the 
interesting part, because it may be a point of improvement. Why not do those 
tasks while fetching?

In this specific case it is 1.4-dev running a local job, with the fetcher 
limited by time. The crawl is limited to a big TLD and only takes a few 
pages per host. Linux has been tuned to handle a high number of packets (syslog 
no longer mentions dropped packets) and a very large list of hosts (the 
TLD).

Thanks,
M.
