A few queries regarding the way data is loaded into HDFS.

- Is it common practice to load data into HDFS only through the master node? With a two-slave configuration we are able to copy only around 35 logs (64 KB each) per minute.
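For context, each log is pushed with something roughly equivalent to the sketch below (an assumed workflow; the paths, hostname, and port are placeholders for our actual setup):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class LogUploader {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Point the client at the NameNode; the URI is a placeholder.
            conf.set("fs.default.name", "hdfs://master:9000");
            FileSystem fs = FileSystem.get(conf);
            // Copy one ~64 KB log from local disk into HDFS.
            fs.copyFromLocalFile(new Path("/local/logs/app.log"),
                                 new Path("/logs/app.log"));
            fs.close();
        }
    }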
- We are concerned about the time it takes to update filenames and block maps on the master node when data is loaded from a few or all of the slave nodes. Can anyone tell me how long this update generally takes?

One more question: what happens if a node crashes soon after data has been copied onto it? How is data consistency maintained in that case?

Thanks in advance,
Venkates P B
