Got it. I want to run one reduce task on each TaskTracker, so four reduce
tasks across the whole cluster.
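A minimal sketch of the two mapred-site.xml properties involved in that setup
(both already appear in the configuration quoted below); the values reflect the
goal stated above, one reduce slot per TaskTracker and four reduces per job:

<!-- at most one reduce task at a time on each TaskTracker -->
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>1</value>
</property>

<!-- four reduce tasks per job, i.e. one per slave -->
<property>
<name>mapred.reduce.tasks</name>
<value>4</value>
</property>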

2011/5/2 baran cakici <[email protected]>

> Actually it was one; I changed that and got better reduce performance,
> because my reduce algorithm is a little bit complex.
>
> thanks anyway
>
> Regards,
>
> Baran
>
> 2011/5/2 Richard Nadeau <[email protected]>
>
>> I would change "mapred.tasktracker.reduce.tasks.maximum" to one. With your
>> setting
>>
>> On May 2, 2011 8:48 AM, "baran cakici" <[email protected]> wrote:
>> > Without a job running:
>> >
>> > CPU Usage = 0%
>> > Memory = 585 MB (2 GB RAM)
>> >
>> > Baran
>> > 2011/5/2 baran cakici <[email protected]>
>> >
>> >> CPU Usage = 95-100%
>> >> Memory = 650-850 MB (2 GB RAM)
>> >>
>> >> Baran
>> >>
>> >>
>> >> 2011/5/2 James Seigel <[email protected]>
>> >>
>> >>> If you have Windows and Cygwin, you probably don't have a lot of memory
>> >>> left at 2 GB.
>> >>>
>> >>> Pull up system monitor on the data nodes and check for free memory
>> >>> when you have your jobs running. I bet it is quite low.
>> >>>
>> >>> I am not a Windows guy, so I can't take you much further.
>> >>>
>> >>> James
>> >>>
>> >>> Sent from my mobile. Please excuse the typos.
>> >>>
>> >>> On 2011-05-02, at 8:32 AM, baran cakici <[email protected]> wrote:
>> >>>
>> >>> > Yes, I am running under Cygwin on my datanodes too. The OS of the
>> >>> > datanodes is Windows as well.
>> >>> >
>> >>> > What exactly can I do to get better performance? I changed
>> >>> > mapred.child.java.opts to the default value. How can I solve this
>> >>> > "swapping" problem?
>> >>> >
>> >>> > PS: I don't have a chance to get slaves (Celeron 2 GHz) with a Linux OS.
>> >>> >
>> >>> > thanks, both of you
>> >>> >
>> >>> > Regards,
>> >>> >
>> >>> > Baran
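One common way to tame swapping on 2 GB nodes is to shrink the per-task heap
in mapred.child.java.opts; a minimal sketch, where -Xmx200m is only an
illustrative value, not a measured recommendation:

<property>
<name>mapred.child.java.opts</name>
<value>-Xmx200m</value>
</property>

The idea is to keep (map slots + reduce slots) x child heap size below the
physical RAM of each slave, leaving room for the DataNode and TaskTracker JVMs.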
>> >>> > 2011/5/2 Richard Nadeau <[email protected]>
>> >>> >
>> >>> >> Are you running under Cygwin on your data nodes as well? That is certain
>> >>> >> to cause performance problems. As James suggested, swapping to disk is
>> >>> >> going to be a killer; running on Windows with Celeron processors only
>> >>> >> compounds the problem. The Celeron processor is also sub-optimal for
>> >>> >> CPU-intensive tasks.
>> >>> >>
>> >>> >> Rick
>> >>> >>
>> >>> >> On Apr 28, 2011 9:22 AM, "baran cakici" <[email protected]> wrote:
>> >>> >>> Hi everyone,
>> >>> >>>
>> >>> >>> I have a cluster with one master (JobTracker and NameNode - Intel Core2Duo,
>> >>> >>> 2 GB RAM) and four slaves (DataNode and TaskTracker - Celeron, 2 GB RAM).
>> >>> >>> My input data is between 2 GB and 10 GB, and I read it in MapReduce line
>> >>> >>> by line. Now I am trying to speed up my system (benchmark), but I'm not
>> >>> >>> sure whether my configuration is correct. Can you please just check if it
>> >>> >>> is OK?
>> >>> >>>
>> >>> >>> -mapred-site.xml
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.job.tracker</name>
>> >>> >>> <value>apple:9001</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.child.java.opts</name>
>> >>> >>> <value>-Xmx512m -server</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.job.tracker.handler.count</name>
>> >>> >>> <value>2</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.local.dir</name>
>> >>> >>> <value>/cygwin/usr/local/hadoop-datastore/hadoop-Baran/mapred/local</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.map.tasks</name>
>> >>> >>> <value>1</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.reduce.tasks</name>
>> >>> >>> <value>4</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.submit.replication</name>
>> >>> >>> <value>2</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.system.dir</name>
>> >>> >>> <value>/cygwin/usr/local/hadoop-datastore/hadoop-Baran/mapred/system</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.tasktracker.indexcache.mb</name>
>> >>> >>> <value>10</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.tasktracker.map.tasks.maximum</name>
>> >>> >>> <value>1</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.tasktracker.reduce.tasks.maximum</name>
>> >>> >>> <value>4</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.temp.dir</name>
>> >>> >>> <value>/cygwin/usr/local/hadoop-datastore/hadoop-Baran/mapred/temp</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>webinterface.private.actions</name>
>> >>> >>> <value>true</value>
>> >>> >>> </property>
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>mapred.reduce.slowstart.completed.maps</name>
>> >>> >>> <value>0.01</value>
>> >>> >>> </property>
>> >>> >>>
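A rough cross-check of the slot and heap settings in the mapred-site.xml above,
assuming every configured slot is busy at the same time:

1 map slot x 512 MB + 4 reduce slots x 512 MB = 2560 MB of task heap per node

That is already more than the 2 GB of physical RAM on each slave, before
counting the DataNode and TaskTracker JVMs, so swapping is likely whenever a
job fills all the slots.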
>> >>> >>> -hdfs-site.xml
>> >>> >>>
>> >>> >>> <property>
>> >>> >>> <name>dfs.block.size</name>
>> >>> >>> <value>268435456</value>
>> >>> >>> </property>
>> >>> >>> PS: I increased dfs.block.size because I got 50% better performance with
>> >>> >>> this change.
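For reference, 268435456 bytes is 256 MB, so a 2-10 GB input splits into
roughly 8 to 40 map tasks (2048 MB / 256 MB = 8, 10240 MB / 256 MB = 40). With
only one map slot per slave, fewer and larger splits mean less per-task startup
overhead, which is one plausible explanation for the improvement described
above.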
>> >>> >>>
>> >>> >>> I am waiting for your comments...
>> >>> >>>
>> >>> >>> Regards,
>> >>> >>>
>> >>> >>> Baran
>> >>> >>
>> >>>
>> >>
>> >>
>>
>
>
