Re: HDFS replication factor

2018-02-02 Thread रविशंकर नायर
This is solved in Hadoop 3, so stay tuned.

Best,

On Feb 2, 2018 6:26 AM, "李立伟"  wrote:

> Hi,
>   It's my understanding that an HDFS write operation is not considered
> complete until all of the replicas have been successfully written. If so,
> does the replication factor affect write latency? Will MapReduce/Spark
> tasks be affected?
>   Is there a way to have HDFS write the first replica synchronously
> and return, then write the others asynchronously?
>   Thanks in advance.


HDFS replication factor

2018-02-02 Thread 李立伟
Hi,
  It's my understanding that an HDFS write operation is not considered
complete until all of the replicas have been successfully written. If so,
does the replication factor affect write latency? Will MapReduce/Spark
tasks be affected?
  Is there a way to have HDFS write the first replica synchronously
and return, then write the others asynchronously?
  Thanks in advance.
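
One workaround sometimes suggested for the question above, offered here as a sketch and not as an answer confirmed in this thread: write the file with a replication factor of 1 so the write pipeline contains a single DataNode, then raise the factor afterwards with `hdfs dfs -setrep`, which leaves the NameNode to schedule the extra copies in the background. The helper names and paths below are hypothetical, and the snippet only builds the command lines without running them. Until re-replication finishes the data exists in one copy only, so this trades durability for latency.

```python
from typing import List


def put_with_replication(local_path: str, hdfs_path: str, replication: int = 1) -> List[str]:
    """Build an `hdfs dfs -put` command that overrides dfs.replication for this write only."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    # -D is a Hadoop generic option, so it goes before the -put subcommand.
    return ["hdfs", "dfs", f"-Ddfs.replication={replication}", "-put", local_path, hdfs_path]


def raise_replication(hdfs_path: str, replication: int = 3) -> List[str]:
    """Build an `hdfs dfs -setrep` command; without -w it does not wait for the new copies."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return ["hdfs", "dfs", "-setrep", str(replication), hdfs_path]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(put_with_replication("part-00000", "/data/out/part-00000")))
    print(" ".join(raise_replication("/data/out/part-00000")))
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` on a machine with a configured Hadoop client.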


hadoop distcp and hbase ExportSnapshot hdfs replication factor question.

2016-02-24 Thread Mark Selby
I have a primary Hadoop cluster (2.6.0) running MapReduce and HBase. I
am backing up to a remote data center that has far fewer machines, each
with a higher per-disk density.


The default HDFS replication factor on the primary is 3.
The default HDFS replication factor on the backup is 2.

When I run distcp on the primary cluster, specifying the remote as the
destination, and I DO NOT specify preserve replication factor as an
argument, I still get 3 replicas on the remote.


All my HBase snapshots that are copied from the primary to the backup
also end up with HFiles that have a replication factor of 3.


As a test I ran distcp from the backup, pulling from the primary, and this
did result in a replication factor of 2. However, the backup has far fewer
resources, and I think it would be faster to perform the large copy from
the primary, which has a larger number of machines.


Also, I cannot pull HBase snapshots from the backup cluster. The
ExportSnapshot utility does not support this.


Does anyone know if it is possible to distcp to another cluster that has
a smaller replication factor and have that take effect?


Thanks!
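
A possible approach to the question above, given as a hedged sketch: when `-pr` is not passed, DistCp creates the destination files with the `dfs.replication` value from the configuration of the client that launches the job, which on the primary is 3. Overriding that value on the command line with the generic `-D` option is a commonly used way to get a different factor on the destination. The NameNode URIs and the helper name below are hypothetical, and the snippet only builds the command line. Whether ExportSnapshot honors the same `-Ddfs.replication` override on this HBase version is an assumption that should be checked on a small snapshot first.

```python
from typing import List


def build_distcp_command(source: str, destination: str, replication: int) -> List[str]:
    """Build a `hadoop distcp` command that overrides dfs.replication for the copied files."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    # The -D generic option must precede the source and destination paths.
    # -pr is deliberately omitted so the source replication factor is not preserved.
    return [
        "hadoop",
        "distcp",
        f"-Ddfs.replication={replication}",
        source,
        destination,
    ]


if __name__ == "__main__":
    # Hypothetical NameNode addresses, for illustration only.
    cmd = build_distcp_command(
        "hdfs://primary-nn:8020/hbase-backups",
        "hdfs://backup-nn:8020/hbase-backups",
        replication=2,
    )
    print(" ".join(cmd))
```

Run from the primary, this keeps the copy on the larger cluster while asking for 2 replicas on the backup.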
