Hi.

I have a small cluster (9 nodes) here for running Hadoop.

On this cluster, Hadoop will process thousands of directories sequentially.

In each directory there are two input files for the map/reduce job; their sizes range from 1 MB to 5 GB.
In summary, each Hadoop job takes one of these directories as its input.
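To make the workflow concrete, here is a simplified sketch of how the jobs are submitted, one directory at a time. The jar name (myjob.jar), class name (MyJob), and paths are placeholders, not our real names; the leading `echo` makes it a dry run that only prints the commands.

```shell
#!/bin/sh
# One Hadoop job per input directory, submitted sequentially.
# myjob.jar, MyJob, and the /data paths are placeholder names.
# Remove the leading "echo" to actually submit the jobs.
for dir in dirA dirB dirC; do
  echo hadoop jar myjob.jar MyJob "/data/input/$dir" "/data/output/$dir"
done
```

In the real driver the directory list is of course the thousands of input directories, not three hard-coded names.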

To get the best performance, which strategy would be appropriate for us?

Could you offer any suggestions?
Which configuration would be best?

PS) Each node has 12 GB of physical memory.
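For example, these are the kinds of knobs I am unsure how to set, given 12 GB per node (classic MRv1 mapred-site.xml property names; the values below are only illustrative guesses, not what we currently run):

```xml
<!-- mapred-site.xml fragment: illustrative values only -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>  <!-- concurrent map slots per node -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>  <!-- concurrent reduce slots per node -->
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>  <!-- heap per task JVM -->
</property>
```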
