The latter. A replication of 1 means there's only one copy of any given data block. If you lose that replica, you lose your data.
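The point above trips people up often enough that it's worth spelling out. A hedged illustration (plain Python, not HDFS code; the function name is ours): the replication factor is the *total* number of copies of each block, not the number of extra copies beyond an "original".

```python
# Hedged illustration (not HDFS code): dfs.replication counts TOTAL
# copies of each block, not extra copies beyond an "original".
def total_block_copies(dfs_replication: int) -> int:
    # With replication 1 there is exactly one copy and no redundancy.
    return dfs_replication

print(total_block_copies(1))  # 1 copy: lose it and the data is gone
print(total_block_copies(3))  # 3 copies: the usual HDFS production default
```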
-Joey

On Thu, May 19, 2011 at 9:44 AM, Steve Cohen <mail4st...@gmail.com> wrote:
> One last question about these replication values. If dfs.replication
> and mapred.submit.replication are set to 1, does that mean they get
> copied one time, so there are two dfs blocks and two job files, or does
> it mean there is one dfs block and one job file?
>
> Thanks,
> Steve Cohen
>
> On Thu, May 19, 2011 at 2:43 AM, Friso van Vollenhoven
> <fvanvollenho...@xebia.com> wrote:
>> I believe it's this:
>>
>> <property>
>>   <name>mapred.submit.replication</name>
>>   <value>10</value>
>>   <description>The replication level for submitted job files. This
>>   should be around the square root of the number of nodes.
>>   </description>
>> </property>
>>
>> You can set it per job in the job-specific conf and/or in mapred-site.xml.
>>
>> Friso
>>
>> On 19 mei 2011, at 03:42, Steve Cohen wrote:
>>
>>> Where is the default replication factor on job files set? Is it different
>>> than the dfs.replication setting in hdfs-site.xml?
>>>
>>> Sent from my iPad
>>>
>>> On May 18, 2011, at 9:10 PM, Joey Echeverria <j...@cloudera.com> wrote:
>>>
>>>> Did you run a map reduce job?
>>>>
>>>> I think the default replication factor on job files is 10, which
>>>> obviously doesn't work well on a pseudo-distributed cluster.
>>>>
>>>> -Joey
>>>>
>>>> On Wed, May 18, 2011 at 5:07 PM, Steve Cohen <mail4st...@gmail.com> wrote:
>>>>> Thanks for the answer. Earlier, I asked why I get occasional
>>>>> "not replicated yet" errors. I had dfs.replication set to one, so what
>>>>> replication could it have been doing? Did the error messages actually
>>>>> mean that the file couldn't be created in the cluster?
>>>>>
>>>>> Thanks,
>>>>> Steve Cohen
>>>>>
>>>>> On May 18, 2011, at 6:39 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>
>>>>>> Tried to send this, but apparently SpamAssassin finds emails about
>>>>>> "replicas" to be spammy.
>>>>>> This time with less rich text :)
>>>>>>
>>>>>> On Wed, May 18, 2011 at 3:35 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>>> Hi Steve,
>>>>>>>
>>>>>>> Running setrep will indeed change those files. Changing
>>>>>>> "dfs.replication" just changes the default replication value for files
>>>>>>> created in the future. Replication level is a file-specific property.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Todd
>>>>>>>
>>>>>>> On Wed, May 18, 2011 at 3:32 PM, Steve Cohen <mail4st...@gmail.com>
>>>>>>> wrote:
>>>>>>>> Say I add a datanode to a pseudo cluster and I want to change the
>>>>>>>> replication factor to 2. I see that I can either run hadoop fs -setrep
>>>>>>>> or change the hdfs-site.xml value for dfs.replication. But do either
>>>>>>>> of these cause the existing blocks to replicate?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Steve Cohen
>>>>>>>
>>>>>>> --
>>>>>>> Todd Lipcon
>>>>>>> Software Engineer, Cloudera
>>>>
>>>> --
>>>> Joseph Echeverria
>>>> Cloudera, Inc.
>>>> 443.305.9434

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
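The mapred.submit.replication description quoted above says the value "should be around the square root of the number of nodes", which is why the default of 10 makes sense for a ~100-node cluster but not for a pseudo-distributed one. A minimal sketch of that rule of thumb (the function name is ours, not a Hadoop API):

```python
import math

# Hedged sketch of the guidance in the mapred.submit.replication
# description: pick a value around the square root of the number of
# nodes. This is an illustrative helper, not part of Hadoop.
def suggested_submit_replication(num_nodes: int) -> int:
    # Round to the nearest whole replica, but never below 1.
    return max(1, round(math.sqrt(num_nodes)))

print(suggested_submit_replication(1))    # pseudo-distributed cluster -> 1
print(suggested_submit_replication(100))  # 100-node cluster -> 10 (the default)
```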