See below...

Eric Baldeschwieler wrote:
Actually...

I think it is greatly in the project's interest to have a really elegant one-node solution. It should certainly support multithreading, the web UI, etc.
AFAIK, local setup has never been a focus of Hadoop; however, a good implementation will definitely be appreciated.

If it is trivial to write and use single node jobs, then we can write an application once in map-reduce and use it either on large clusters or on small devices.
A local threaded implementation will be quite useful in testing code for small inputs. You can look at MiniMRCluster in src/test as an example written for unit tests.
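
To make the idea concrete, here is a minimal sketch of what a threaded single-node run could look like, in plain Java with no Hadoop dependency. It is not the LocalJobRunner or MiniMRCluster code; the class LocalThreadedWordCount and its methods mapSplit and run are hypothetical names made up for illustration. It assumes each input split is a small string, runs one map task per split on a fixed thread pool, and does a single reduce by summing the partial counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch only: NOT Hadoop's LocalJobRunner. */
public class LocalThreadedWordCount {

    // "Map" task for one input split: count the words in that split.
    static Map<String, Integer> mapSplit(String split) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Run one map task per split on a thread pool, then reduce by summing.
    static Map<String, Integer> run(List<String> splits, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (String split : splits) {
                Callable<Map<String, Integer>> task = () -> mapSplit(split);
                futures.add(pool.submit(task));
            }
            // Single "reduce": merge the partial counts in sorted key order.
            Map<String, Integer> total = new TreeMap<>();
            for (Future<Map<String, Integer>> f : futures) {
                for (Map.Entry<String, Integer> e : f.get().entrySet()) {
                    total.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> splits = Arrays.asList("a b a", "b c", "a");
        System.out.println(run(splits, 2)); // prints {a=3, b=2, c=1}
    }
}
```

The thread count is a plain parameter here; a real patch would presumably read it from the job configuration instead.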

This would be useful.

Supporting Arun's point, this is an open source project. If you find a way of making it work better, give it back and we will incorporate it.
Just open an issue in JIRA and attach a patch against trunk. See http://wiki.apache.org/lucene-hadoop/HowToContribute

On Aug 19, 2007, at 9:17 PM, Arun C Murthy wrote:

On Sun, Aug 19, 2007 at 11:33:35PM +0200, Thorsten Schuett wrote:
>I have been looking into the LocalJobRunner today. Is there a chance for
>official support for parallel map execution / >1 reduce tasks or should I look
>into adding it to my local copy of the code?
>

Please file a request (jira), and patch if you are so inclined! There is nothing *official* about anything here... make yourself at home.

Usually there isn't much bang for the buck in trying to optimize single-node performance of Hadoop's map-reduce, but any contribution is always welcome.

Arun

>Thorsten
>
>On 8/19/07, Thorsten Schuett <[EMAIL PROTECTED]> wrote:
>>
>> In my case, it looks as if the loopback device is the bottleneck. So
>> increasing the number of tasks won't help.
>>
>> Thorsten
>>
>> On 8/18/07, Ted Dunning <[EMAIL PROTECTED]> wrote:
>> >
>> >
>> >
>> > You might try increasing the number of map and reduce tasks so that you
>> > can overlap cpu and I/O. It is common in parallel applications that you
>> > need to do something like this.
>> >
>> >
>> > On 8/18/07 8:36 AM, "Thorsten Schuett" <[EMAIL PROTECTED] > wrote:
>> > >>> If my assumptions are correct, would it be possible to
>> > >>> read/access the files directly in the "one-node mode"?
>> > >>
>> > >> Please take a look at LocalJobRunner in src/org/apache/hadoop/mapred ...
>> > >> set the jobtracker in your config to 'local' and this happens automatically.
>> > >> (http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms)
>> > >
>> > >
>> > > When I use "local", I lose the web interface and the multi-threading.
>> > > I can live with the former, but the latter is not an option.
>> > >
>> > > Thorsten
>> >
>> >
>>


