Re: Reduce Performance

Enis Soztutar Tue, 21 Aug 2007 00:30:28 -0700

See below...

Eric Baldeschwieler wrote:

Actually...
I think it is greatly in the project's interest to have a really elegant one-node solution. It should certainly support multithreading, the web UI, etc.

AFAIK, a local setup has never been a focus of Hadoop; however, a good implementation will definitely be appreciated.

If it is trivial to write and use single node jobs, then we can write an application once in map-reduce and use it either on large clusters or on small devices.

A local threaded implementation will be quite useful in testing code for small inputs. You can look at MiniMRCluster in src/test as an example written for unit tests.
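
As a rough illustration of the idea (this is not code from LocalJobRunner or MiniMRCluster), a threaded local runner could hand each input split to a thread pool and merge the partial results afterwards. The sketch below uses only java.util.concurrent; the class name LocalThreadedWordCount and the methods mapSplit and run are hypothetical, and word counting stands in for whatever the real map and reduce functions would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not part of Hadoop.
class LocalThreadedWordCount {

    // "Map" phase for one input split: count the words in that split.
    static Map<String, Integer> mapSplit(String split) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Runs one map task per split on a fixed-size thread pool, then
    // merges the partial counts in a single-threaded "reduce" step.
    static Map<String, Integer> run(List<String> splits, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (String split : splits) {
                Callable<Map<String, Integer>> task = () -> mapSplit(split);
                futures.add(pool.submit(task));
            }
            Map<String, Integer> total = new TreeMap<>();
            for (Future<Map<String, Integer>> future : futures) {
                // get() blocks until that map task has finished.
                for (Map.Entry<String, Integer> e : future.get().entrySet()) {
                    total.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("local map task failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling LocalThreadedWordCount.run(splits, 4) would process up to four splits at a time. The reduce step is kept single-threaded here for brevity; running more than one reduce task would additionally require partitioning the keys.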

This would be useful.
Supporting Arun's point, this is an open source project. If you find a way of making it work better, give it back and we will incorporate it.

Just open an issue at jira, and attach a patch against trunk. See http://wiki.apache.org/lucene-hadoop/HowToContribute

On Aug 19, 2007, at 9:17 PM, Arun C Murthy wrote:
On Sun, Aug 19, 2007 at 11:33:35PM +0200, Thorsten Schuett wrote:
>I have been looking into the LocalJobRunner today. Is there a chance for
>official support for parallel map execution/>1 reduce tasks or should I look
>into adding it to my local copy of the code?
>
Please file a request (jira), and patch if you are so inclined! There is nothing *official* about anything here... make yourself at home.
Usually there isn't much bang per buck trying to optimize single-node performance of hadoop's map-reduce, but any contribution is always welcome.
Arun

>Thorsten
>
>On 8/19/07, Thorsten Schuett <[EMAIL PROTECTED]> wrote:
>>
>> In my case, it looks as if the loopback device is the bottleneck. So
>> increasing the number of tasks won't help.
>>
>> Thorsten
>>
>> On 8/18/07, Ted Dunning <[EMAIL PROTECTED]> wrote:
>> >
>> >
>> >
>> > You might try increasing the number of map and reduce tasks so that you
>> > can overlap CPU and I/O. It is common in parallel applications that you
>> > need to do something like this.
>> >
>> >
>> > On 8/18/07 8:36 AM, "Thorsten Schuett" <[EMAIL PROTECTED] > wrote:
>> > >> If my assumptions are correct, would it be possible to
>> > >>> read/access the files directly in the "one-node mode"?
>> > >>
>> > >> Please take a look at LocalJobRunner in src/org/apache/hadoop/mapred ...
>> > >> set the jobtracker in your config to 'local' and this happens automatically.
>> > >> (http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms )
>> > >
>> > >
>> > > When I use "local", I lose the web interface and the multi-threading.
>> > > I can live with the former, but the latter is not an option.
>> > >
>> > > Thorsten
>> >
>> >
>>
