Thanks for clarifying.
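
For the archives, here's the practical upshot as a rough sketch (the paths, jar name, and class name below are made up, and the -D override assumes the job's driver uses ToolRunner/GenericOptionsParser):

  # Re-replicate blocks that already exist. Editing dfs.replication in
  # hdfs-site.xml only changes the default for files created later;
  # setrep rewrites the per-file setting (-R recurses, -w waits for
  # the re-replication to finish).
  hadoop fs -setrep -R -w 2 /user/steve

  # Check the replication factor actually recorded for a file (%r).
  hadoop fs -stat %r /user/steve/data/part-00000

  # Lower the job-file replication for a single job on a small or
  # pseudo-distributed cluster (the shipped default is 10).
  hadoop jar my-job.jar MyJob -D mapred.submit.replication=1 in out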
On Thu, May 19, 2011 at 1:50 PM, Joey Echeverria <j...@cloudera.com> wrote:
> The latter. Replication of 1 means there's only one copy of any given
> data block. If you lose that replica, you lose your data.
>
> -Joey
>
> On Thu, May 19, 2011 at 9:44 AM, Steve Cohen <mail4st...@gmail.com> wrote:
>> One last question about these replication values. If dfs.replication
>> and mapred.submit.replication are set to 1, does that mean they get
>> copied one time, so there are two dfs blocks and two job files, or does
>> it mean there is one dfs block and one job file?
>>
>> Thanks,
>> Steve Cohen
>>
>> On Thu, May 19, 2011 at 2:43 AM, Friso van Vollenhoven
>> <fvanvollenho...@xebia.com> wrote:
>>> I believe it's this:
>>>
>>> <property>
>>>   <name>mapred.submit.replication</name>
>>>   <value>10</value>
>>>   <description>The replication level for submitted job files. This
>>>   should be around the square root of the number of nodes.
>>>   </description>
>>> </property>
>>>
>>> You can set it per job in the job-specific conf and/or in mapred-site.xml.
>>>
>>> Friso
>>>
>>> On May 19, 2011, at 03:42, Steve Cohen wrote:
>>>
>>>> Where is the default replication factor on job files set? Is it different
>>>> from the dfs.replication setting in hdfs-site.xml?
>>>>
>>>> Sent from my iPad
>>>>
>>>> On May 18, 2011, at 9:10 PM, Joey Echeverria <j...@cloudera.com> wrote:
>>>>
>>>>> Did you run a MapReduce job?
>>>>>
>>>>> I think the default replication factor on job files is 10, which
>>>>> obviously doesn't work well on a pseudo-distributed cluster.
>>>>>
>>>>> -Joey
>>>>>
>>>>> On Wed, May 18, 2011 at 5:07 PM, Steve Cohen <mail4st...@gmail.com> wrote:
>>>>>> Thanks for the answer. Earlier, I asked about why I get occasional
>>>>>> "not replicated yet" errors. But I had dfs.replication set to one. What
>>>>>> replication could it have been doing? Did the error messages actually
>>>>>> mean that the file couldn't get created in the cluster?
>>>>>>
>>>>>> Thanks,
>>>>>> Steve Cohen
>>>>>>
>>>>>> On May 18, 2011, at 6:39 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>>
>>>>>>> Tried to send this, but apparently SpamAssassin finds emails about
>>>>>>> "replicas" to be spammy. This time with less rich text :)
>>>>>>>
>>>>>>> On Wed, May 18, 2011 at 3:35 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>>>>
>>>>>>>> Hi Steve,
>>>>>>>> Running setrep will indeed change those files. Changing
>>>>>>>> "dfs.replication" just changes the default replication value for files
>>>>>>>> created in the future. Replication level is a file-specific property.
>>>>>>>> Thanks,
>>>>>>>> -Todd
>>>>>>>>
>>>>>>>> On Wed, May 18, 2011 at 3:32 PM, Steve Cohen <mail4st...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Say I add a datanode to a pseudo cluster and I want to change the
>>>>>>>>> replication factor to 2. I see that I can either run hadoop fs -setrep
>>>>>>>>> or change the hdfs-site.xml value for dfs.replication. But do either
>>>>>>>>> of these cause the existing blocks to replicate?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Steve Cohen
>>>>>>>>
>>>>>>>> --
>>>>>>>> Todd Lipcon
>>>>>>>> Software Engineer, Cloudera
>>>>>>>
>>>>>>> --
>>>>>>> Todd Lipcon
>>>>>>> Software Engineer, Cloudera
>>>>>
>>>>> --
>>>>> Joseph Echeverria
>>>>> Cloudera, Inc.
>>>>> 443.305.9434
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434