There was some chatter on the HBase list about a dual HDFS/S3 driver
class which would write to both but only read from HDFS. Of course,
having this functionality at the Hadoop level would be better than in
a subsidiary project.

Maybe the ability to specify a secondary filesystem in
hadoop-site.xml?  Candidates might include S3, NFS, or of course,
another HDFS in a geographically isolated location.
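The rsync-like "transfer only deltas" logic asked about below can be sketched roughly as follows. This is a minimal, hypothetical illustration using plain java.nio with two local directories standing in for the primary and backup filesystems (a real version would go through Hadoop's FileSystem API instead); a file is copied only when it is missing on the backup side, or differs in modification time or size:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class DeltaSync {

    /**
     * Copy files from src to dst only when missing, older, or a different
     * size on the dst side (rsync-like delta detection by mtime/size).
     * Returns the number of files actually transferred.
     */
    static int sync(Path src, Path dst) throws IOException {
        int[] copied = {0};
        try (var paths = Files.walk(src)) {
            paths.filter(Files::isRegularFile).forEach(p -> {
                try {
                    Path target = dst.resolve(src.relativize(p));
                    boolean stale = !Files.exists(target)
                            || Files.getLastModifiedTime(target)
                                    .compareTo(Files.getLastModifiedTime(p)) < 0
                            || Files.size(target) != Files.size(p);
                    if (stale) {
                        Files.createDirectories(target.getParent());
                        // COPY_ATTRIBUTES preserves the mtime, so an
                        // unchanged file is skipped on the next pass.
                        Files.copy(p, target,
                                StandardCopyOption.REPLACE_EXISTING,
                                StandardCopyOption.COPY_ATTRIBUTES);
                        copied[0]++;
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return copied[0];
    }

    public static void main(String[] args) throws IOException {
        // Local temp dirs stand in for the two filesystems.
        Path src = Files.createTempDirectory("primary-standin");
        Path dst = Files.createTempDirectory("backup-standin");
        Files.writeString(src.resolve("block0"), "payload");
        System.out.println(sync(src, dst)); // 1: new file copied
        System.out.println(sync(src, dst)); // 0: nothing changed, delta is empty
    }
}
```

The names and the mtime/size heuristic here are illustrative only; a checksum comparison would be more robust against clock skew between the two systems.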

-- Jim R. Wilson (jimbojw)

On Fri, May 16, 2008 at 12:06 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
>
> Why not go to the next step and use a second cluster as the backup?
>
>
> On 5/16/08 6:33 AM, "Robert Krüger" <[EMAIL PROTECTED]> wrote:
>
>>
>> Hi,
>>
>> what are the options for keeping a copy of the data in an HDFS instance
>> in sync with a backup filesystem which is not HDFS? Are there rsync-like
>> tools that transfer only deltas, or would one have to implement that
>> oneself (e.g. by writing a Java program that accesses both
>> filesystems)?
>>
>> Thanks in advance,
>>
>> Robert
>>
>> P.S.: Why would one want that? E.g. to have a completely redundant copy
>> that, in case of a systematic failure (e.g. data corruption due to a bug),
>> offers a backup unaffected by that problem.
>
>
