Local disk is NOT reliable.  As I mentioned, a machine can die or a map task
can be run multiple times on different machines.

If you want reliable storage, you have to use HDFS.

You can write to local disk during the execution of a task, but you need to
move the data to HDFS before the task completes if you want it to be
permanently stored.

The fact that local disk is not reliable is inherent in the way that Hadoop
works (and, in fact, inherent in the way large scale computing works).  If
this seems surprising to you, you need to work through the basic assumptions
again.

On Wed, Jul 1, 2009 at 1:46 PM, bonito perdo <[email protected]>wrote:

> Since i write them locally to each node, how is it possible? What I want is
> not losing data.
>

Reply via email to