Hi all,

I have just started experimenting with Nutch 0.9 on a Hadoop cluster with 5 nodes (one of them acting as the master name node). One of the performance bottlenecks I have observed is the speed of copying files out of Hadoop, in particular in these cases:
- Reduce copy stage (rate of 0.00Mb per second)
- hadoop dfs -copyToLocal command (it runs at a very slow rate of about 30K per second)

(Please do correct me if these 2 observations are totally unrelated.)

All these machines are connected via a 10Mb LAN, so performance should be higher than this. Would anybody have any idea what to try to improve the copying performance?

Many thanks,
boris
