Hi all,
I have a Hadoop cluster with one master and three datanodes.

I want to put a local file of about 128 MB into HDFS, and I have set the
block size to 10 MB.
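
For reference, this is roughly how I set it in hadoop-site.xml (assuming
dfs.block.size is the right property name; the value is in bytes):

    <property>
      <name>dfs.block.size</name>
      <!-- 10 MB blocks, specified in bytes -->
      <value>10485760</value>
    </property>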

When I set the replication to 0,
I found that all the data ends up on the node where I run the command
'bin/hadoop dfs -put file.gz input': that node's disk usage grows by
about 128 MB, but the other nodes store nothing.
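
If it helps, I think the block placement can be verified with fsck,
something like the following (I'm not sure all of these options exist
in 0.15, and the path may need to be absolute):

    bin/hadoop fsck input -files -blocks -locations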

When I set the replication to 3,
every node ends up with the same data, so each node uses about 128 MB
of disk space.
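
I set the replication the same way, in hadoop-site.xml (again assuming
dfs.replication is the right property name):

    <property>
      <name>dfs.replication</name>
      <!-- number of copies kept of each block -->
      <value>3</value>
    </property>

I believe the replication of an existing file can also be changed with
something like 'bin/hadoop dfs -setrep 3 input', though I haven't
tried that.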

What should I do to get the blocks spread across all the datanodes?
I'm using hadoop-0.15.2.

Can anyone help me?

Thanks.
