MilleBii wrote:
Interesting updates on the current run of 450K URLs:
+ 30 minutes @ 3 Mbit/s
+ drop to 1 Mbit/s (1/X shape)
+ gradual improvement to 1.5 Mbit/s, then steady for 7 hours
+ sudden drop to 0.9 Mbit/s, then steady for 4 hours
+ up to 1.7 Mbit/s for 1 hour
+ staircasing down to 0.5 Mbit/s in steps of 1 hour
I don't know what conclusion to draw, but it is quite strange to see
such sudden variations in bandwidth, and the crawl is very slow overall.
I can post the graph if people are interested.
This most likely comes from the allocation of URLs to map tasks, and the
maximum number of map tasks that you can run on your cluster. When tasks
finish their run, you see a sudden drop in speed until the next task
starts running. Initially, I suspect that you have more tasks available
than the capacity of your cluster, so it's easy to fill the slots and
max out the speed. Later on, slow map tasks tend to hang around, but
some of them still finish and make room for new tasks. As time goes on,
the majority of your running tasks become slow tasks, so the overall
speed continues to drop.
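To make the effect concrete, here is a toy simulation of the mechanism
described above. It is only a sketch, not Nutch or Hadoop code: the
function name simulate, the 4 slots, and the task sizes and rates are
all made-up assumptions, chosen so that every third task is slow.

```python
from collections import deque


def simulate(tasks, slots, step=1.0):
    """Toy model of a fixed number of map slots working through a task queue.

    tasks: list of (work, rate) pairs, hypothetical units (e.g. Mbit, Mbit/s).
    slots: maximum number of tasks that may run at the same time.
    Returns the aggregate rate of all running tasks at each time step.
    """
    pending = deque(tasks)
    running = []   # each entry is [remaining_work, rate]
    timeline = []
    while pending or running:
        # Fill every free slot with the next pending task, in queue order.
        while pending and len(running) < slots:
            work, rate = pending.popleft()
            running.append([work, rate])
        # Aggregate throughput is the sum of the rates of the running tasks.
        # Simplification: a task counts for the whole step it finishes in.
        timeline.append(sum(rate for _, rate in running))
        # Advance time and drop the tasks that have finished.
        for task in running:
            task[0] -= task[1] * step
        running = [task for task in running if task[0] > 1e-9]
    return timeline


# Assumed workload: 12 equal-sized tasks, every third one five times slower.
tasks = [(10, 1) if i % 3 == 2 else (10, 5) for i in range(12)]
timeline = simulate(tasks, slots=4)
print(timeline)
```

With these assumed numbers the aggregate rate starts at its peak while
the slots hold mostly fast tasks, then falls in steps as slow tasks
accumulate in the slots and the queue of pending tasks runs dry.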
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com