The reasoning was that in the event of system-inherent failures (i.e.
bugs in HDFS which corrupt the files), a backup system built on a
completely different technology would protect against that type of
failure and prevent it from becoming catastrophic. This sounds (and in
our case probably is) a bit paranoid, but it is common practice for
really critical systems, e.g. in the aerospace industry.
Ted Dunning wrote:
Why not go to the next step and use a second cluster as the backup?
On 5/16/08 6:33 AM, "Robert Krüger" <[EMAIL PROTECTED]> wrote:
Hi,
what are the options for keeping a copy of the data in an HDFS instance
in sync with a backup file system that is not HDFS? Are there rsync-like
tools that transfer only the deltas, or would one have to implement
that oneself (e.g. by writing a Java program that accesses both
file systems)?
Thanks in advance,
Robert
P.S.: Why would one want that? E.g. to have a completely redundant copy
which, in case of systematic failure (e.g. data corruption due to a bug),
offers a backup not affected by that problem.
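
For the do-it-yourself option raised in the question (a Java program
that accesses both file systems), the core of a file-level delta sync
could look roughly like the sketch below. It is only an illustration
under stated assumptions, not an existing tool: DeltaPlan, Meta, Plan and
plan() are hypothetical names, and the two listings are plain maps from
relative path to size and modification time. On the HDFS side such a
listing could presumably be built from Hadoop's FileSystem.listStatus()
and FileStatus.getLen() / getModificationTime(); on the backup side from
whatever API that file system offers. Unlike rsync, this compares whole
files only and does not transfer partial-file deltas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: decides which files a backup run would copy or delete.
class DeltaPlan {

    // Size and modification time of one file (stand-in for a real listing entry).
    static final class Meta {
        final long length;
        final long mtimeMillis;

        Meta(long length, long mtimeMillis) {
            this.length = length;
            this.mtimeMillis = mtimeMillis;
        }
    }

    // Relative paths to copy to the backup and paths to delete from it.
    static final class Plan {
        final List<String> toCopy = new ArrayList<String>();
        final List<String> toDelete = new ArrayList<String>();
    }

    static Plan plan(Map<String, Meta> source, Map<String, Meta> backup) {
        Plan result = new Plan();
        // Copy files that are missing in the backup, differ in size,
        // or are older in the backup than in the source.
        for (Map.Entry<String, Meta> e : new TreeMap<String, Meta>(source).entrySet()) {
            Meta s = e.getValue();
            Meta b = backup.get(e.getKey());
            if (b == null || b.length != s.length || b.mtimeMillis < s.mtimeMillis) {
                result.toCopy.add(e.getKey());
            }
        }
        // Files that exist only in the backup were deleted at the source.
        for (String path : new TreeMap<String, Meta>(backup).keySet()) {
            if (!source.containsKey(path)) {
                result.toDelete.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Meta> source = new HashMap<String, Meta>();
        source.put("a.txt", new Meta(10, 100));
        source.put("b.txt", new Meta(20, 100));
        source.put("c.txt", new Meta(30, 200));

        Map<String, Meta> backup = new HashMap<String, Meta>();
        backup.put("b.txt", new Meta(20, 100));
        backup.put("c.txt", new Meta(30, 150));
        backup.put("old.txt", new Meta(5, 50));

        Plan p = plan(source, backup);
        System.out.println("copy: " + p.toCopy);     // copy: [a.txt, c.txt]
        System.out.println("delete: " + p.toDelete); // delete: [old.txt]
    }
}
```

Comparing size and modification time is cheap but cannot detect silent
corruption on either side; for the failure scenario described in the
P.S., comparing checksums as well would be the safer (and slower)
choice, and deletions should probably be applied with care so that a
bug at the source does not propagate into the backup.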