Hello, I'm working on a large amount of logs, and I've noticed that distributing the data across the network (./hadoop dfs -put input input) takes a lot of time.
Let's say that my data is already distributed across the network. Is there any way to tell Hadoop to use the existing distribution? Thanks -- Jean-Pierre <[EMAIL PROTECTED]>
