Hello guys,
I am starting the upgrade from Hadoop 0.20 to a newer version (2.0), which
changes the HDFS format. I have read a lot of tutorials, and they say that
data loss is possible (as expected). To avoid losing HDFS data I will
probably back up the whole HDFS structure (7TB per node). However, this is
a huge amount of data, and copying it will take a long time, during which
my service would be unavailable.
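For reference, the pre-upgrade procedure the tutorials describe seems to
boil down to persisting the namenode metadata and then starting the new
version with the -upgrade flag, which snapshots the storage directories in
place instead of copying the blocks. A rough sketch of what I have pieced
together (the paths are just placeholders for my setup, so please correct
me if I have it wrong):

    # enter safe mode and persist the namespace before shutting down
    hadoop dfsadmin -safemode enter
    hadoop dfsadmin -saveNamespace
    stop-dfs.sh

    # copy the namenode metadata somewhere safe
    # (tiny compared to the block data itself)
    cp -r /data/hadoop/name /backup/name-pre-upgrade

    # bring the cluster up under the new version with the upgrade flag
    start-dfs.sh -upgrade

My understanding is that -upgrade keeps a "previous" directory on each node
until I run dfsadmin -finalizeUpgrade, so a rollback should be possible
without a full copy, but I would like confirmation from someone who has
done it.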

I was thinking about a simple approach: copying all the files to a
different place. I tried to find a parallel file compressor to speed up the
process, but could not find one.
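To frame the question, the closest options I have found are distcp, which
copies in parallel as a MapReduce job, or piping tar through pigz (a
parallel gzip implementation) for a local compressed copy. Hostnames and
paths below are just placeholders:

    # parallel copy of HDFS data to a second cluster via MapReduce
    hadoop distcp hdfs://namenode:8020/data hdfs://backup-namenode:8020/data

    # per-node alternative: compress a datanode's local block directory,
    # using all cores (pigz is a drop-in parallel replacement for gzip)
    tar -cf - /data/hadoop/dfs/data | pigz > /backup/dfs-data.tar.gz

Neither feels ideal at 7TB per node, which is why I am asking.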

How did you guys do it?
Is there some trick?

Thank you in advance,
Pablo Musa
