[ https://issues.apache.org/jira/browse/HDFS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070979#comment-14070979 ]
Allen Wittenauer commented on HDFS-268:
---------------------------------------
This should probably be part of the HDFS-6382 discussion.
> Distinguishing file missing/corruption for low replication files
> ----------------------------------------------------------------
>
> Key: HDFS-268
> URL: https://issues.apache.org/jira/browse/HDFS-268
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Koji Noguchi
>
> In PIG-856, there's a discussion about reducing the replication factor for
> intermediate files between jobs (see the sketch after this quote).
> I've seen users do the same in mapreduce jobs and get some speedup. (I
> believe their outputs were too small to benefit from the pipelining.)
> The problem is that when users lower the replication to 1 (or 2), ops start
> seeing alerts from fsck and HADOOP-4103 after even a single datanode failure.
> There is also the problem of the Namenode not getting out of safemode when
> it is restarted.
> My answer has been to ask users, "please don't set the replication to less
> than 3".
> But is this the right approach?