No worries Usman, I will try and do the same on Monday. Thanks Todd for the clarification.
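
If I've understood the clarification correctly, the change on my end is something along these lines in mapred-site.xml - the -Xmx value below is only an example, the point being that it has to leave room for io.sort.mb plus the task's own working memory:

  <property>
    <name>io.sort.mb</name>
    <value>150</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>
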
Tim

On Fri, Oct 16, 2009 at 5:30 PM, Usman Waheed <[email protected]> wrote:
> Hi Tim,
>
> I have been swamped with some other stuff so did not get a chance to run
> further tests on my setup.
> Will send them out early next week so we can compare.
>
> Cheers,
> Usman
>
>> On Fri, Oct 16, 2009 at 4:01 AM, tim robertson
>> <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> Adding the following to core-site.xml, mapred-site.xml and
>>> hdfs-site.xml (based on Cloudera guidelines:
>>> http://tinyurl.com/ykupczu)
>>>
>>> io.sort.factor: 15 (mapred-site.xml)
>>> io.sort.mb: 150 (mapred-site.xml)
>>> io.file.buffer.size: 65536 (core-site.xml)
>>> dfs.datanode.handler.count: 3 (hdfs-site.xml - actually this is the
>>> default)
>>>
>>> and using the default of HADOOP_HEAPSIZE=1000 (hadoop-env.sh)
>>>
>>> Using 2 mappers and 2 reducers, can someone please help me with the
>>> maths as to why my jobs are failing with "Error: Java heap space" in
>>> the maps?
>>> (The same runs fine with io.sort.factor of 10 and io.sort.mb of 100.)
>>>
>>> io.sort.mb of 200 x 4 (2 mappers, 2 reducers) = 0.8G
>>> Plus the 2 daemons on the node at 1G each = 1.8G
>>> Plus Xmx of 1G for each hadoop daemon task = 5.8G
>>>
>>> The machines have 8G in them. Obviously my maths is screwy somewhere...
>>
>> Hi Tim,
>>
>> Did you also change mapred.child.java.opts? The HADOOP_HEAPSIZE parameter
>> is for the daemons, not the tasks. If you bump up io.sort.mb you also have
>> to bump up the -Xmx argument in mapred.child.java.opts to give the actual
>> tasks more RAM.
>>
>> -Todd
>>
>>> On Fri, Oct 16, 2009 at 9:59 AM, Erik Forsberg <[email protected]>
>>> wrote:
>>>
>>>> On Thu, 15 Oct 2009 11:32:35 +0200
>>>> Usman Waheed <[email protected]> wrote:
>>>>
>>>>> Hi Todd,
>>>>>
>>>>> Some changes have been applied to the cluster based on the
>>>>> documentation (URL) you noted below,
>>>>
>>>> I would also like to know what settings people are tuning on the
>>>> operating system level. The blog post mentioned here does not say
>>>> much about that, except for the fileno changes.
>>>>
>>>> We got about 3x the read performance when running DFSIOTest by mounting
>>>> our ext3 filesystems with the noatime option. I saw that mentioned in
>>>> the slides from a Cloudera presentation.
>>>>
>>>> (For those who don't know, the noatime option turns off the recording
>>>> of access times on files. Recording atime is a horrible performance
>>>> killer, since it means every read of a file also forces the kernel to
>>>> do a write. These writes are probably queued up, but still, if you
>>>> don't need the atime (very few applications do), turn it off!)
>>>>
>>>> Have people been experimenting with different filesystems, or are most
>>>> of us running on top of ext3?
>>>>
>>>> How about mounting ext3 with "data=writeback"? That's rumoured to give
>>>> the best throughput and could help with write performance. From
>>>> mount(8):
>>>>
>>>>   writeback
>>>>     Data ordering is not preserved - data may be written into the main
>>>>     file system after its metadata has been committed to the journal.
>>>>     This is rumoured to be the highest throughput option. It guarantees
>>>>     internal file system integrity, however it can allow old data to
>>>>     appear in files after a crash and journal recovery.
>>>>
>>>> How would the HDFS consistency checks cope with old data appearing in
>>>> the underlying files after a system crash?
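>>>>
>>>> (For completeness, both options just go on the relevant line in
>>>> /etc/fstab, something like the following - the device and mount point
>>>> here are only placeholders for whatever the data disks look like on
>>>> your nodes:
>>>>
>>>>   /dev/sdb1  /data/1  ext3  defaults,noatime,data=writeback  0  2
>>>>
>>>> noatime can also be applied on the fly with
>>>> "mount -o remount,noatime /data/1", but changing data= on an ext3
>>>> filesystem normally needs a full unmount and mount rather than a live
>>>> remount.)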
>>>>
>>>> Cheers,
>>>> \EF
>>>> --
>>>> Erik Forsberg <[email protected]>
>>>> Developer, Opera Software - http://www.opera.com/
