RE: task assignment management.

2008-09-08 Thread Dmitry Pushkarev
How about just specifying which machines to run the task on? I haven't seen
that anywhere..

-----Original Message-----
From: Devaraj Das [mailto:[EMAIL PROTECTED] 
Sent: Sunday, September 07, 2008 9:55 PM
To: core-user@hadoop.apache.org
Subject: Re: task assignment management.

No, that is not possible today. However, you might want to look at the
TaskScheduler to see if you can implement a scheduler that provides this
kind of task scheduling.

In current Hadoop, one point regarding computationally intensive tasks is
that if a machine cannot keep up with the rest of the machines (and the
task on that machine is running slower than the others), speculative
execution, if enabled, can help a lot. Also, implicitly, faster/better
machines get more work than slower machines.


On 9/8/08 3:27 AM, "Dmitry Pushkarev" <[EMAIL PROTECTED]> wrote:

> Dear Hadoop users,
> 
>  
> 
> Is it possible, without using Java, to manage task assignment in order to
> implement some simple rules? For example: do not launch more than one
> instance of a crawling task on a machine, do not run data-intensive tasks
> on remote machines, and do not run computationally intensive tasks on
> single-core machines, etc.
> 
>  
> 
> Now it's done by failing tasks that decided to run on the wrong machine,
> but I hope to find a solution on the jobtracker side.
> 
>  
> 
> ---
> 
> Dmitry
> 
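
The kind of per-machine rules asked about above could be expressed as a small
predicate that a custom scheduler consults before handing out a task. The
following is only a sketch, deliberately independent of Hadoop's APIs; the
class and method names are made up, and a real implementation would plug the
same logic into a TaskScheduler subclass:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of the assignment rules discussed in this thread.
 * In a real cluster this logic would live inside a custom TaskScheduler;
 * here it is modeled as a standalone, testable class.
 */
public class AssignmentRules {
    // Number of crawling tasks currently running on each host.
    private final Map<String, Integer> runningCrawlers =
            new HashMap<String, Integer>();

    /**
     * Decide whether a task may be assigned to a host.
     * Rule 1: at most one crawling task per machine.
     * Rule 2: no computationally intensive tasks on single-core machines.
     */
    public boolean canAssign(String host, boolean isCrawler,
                             boolean computeHeavy, int cores) {
        if (isCrawler) {
            Integer n = runningCrawlers.get(host);
            if (n != null && n >= 1) {
                return false; // a crawler is already running here
            }
        }
        if (computeHeavy && cores < 2) {
            return false; // too slow for compute-heavy work
        }
        return true;
    }

    /** Record an assignment so the per-host crawler count stays accurate. */
    public void assign(String host, boolean isCrawler) {
        if (isCrawler) {
            Integer n = runningCrawlers.get(host);
            runningCrawlers.put(host, n == null ? 1 : n + 1);
        }
    }
}
```

A scheduler built this way would call canAssign() for each pending task when
a tasktracker heartbeats in, and assign() once a task is actually launched.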




Re: task assignment management.

2008-09-07 Thread Alejandro Abdelnur
We need something similar
(https://issues.apache.org/jira/browse/HADOOP-3740); the problem with
the TaskScheduler is that it does not have hooks into the lifecycle of a
task.

A

On Mon, Sep 8, 2008 at 10:25 AM, Devaraj Das <[EMAIL PROTECTED]> wrote:
> No, that is not possible today. However, you might want to look at the
> TaskScheduler to see if you can implement a scheduler that provides this
> kind of task scheduling.
>
> In current Hadoop, one point regarding computationally intensive tasks is
> that if a machine cannot keep up with the rest of the machines (and the
> task on that machine is running slower than the others), speculative
> execution, if enabled, can help a lot. Also, implicitly, faster/better
> machines get more work than slower machines.
>
>
> On 9/8/08 3:27 AM, "Dmitry Pushkarev" <[EMAIL PROTECTED]> wrote:
>
>> Dear Hadoop users,
>>
>>
>>
>> Is it possible, without using Java, to manage task assignment in order to
>> implement some simple rules? For example: do not launch more than one
>> instance of a crawling task on a machine, do not run data-intensive tasks
>> on remote machines, and do not run computationally intensive tasks on
>> single-core machines, etc.
>>
>>
>>
>> Now it's done by failing tasks that decided to run on the wrong machine,
>> but I hope to find a solution on the jobtracker side.
>>
>>
>>
>> ---
>>
>> Dmitry
>>
>
>
>


Re: task assignment management.

2008-09-07 Thread Devaraj Das
No, that is not possible today. However, you might want to look at the
TaskScheduler to see if you can implement a scheduler that provides this
kind of task scheduling.

In current Hadoop, one point regarding computationally intensive tasks is
that if a machine cannot keep up with the rest of the machines (and the
task on that machine is running slower than the others), speculative
execution, if enabled, can help a lot. Also, implicitly, faster/better
machines get more work than slower machines.
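
For reference, speculative execution is a configuration switch. A minimal
sketch, assuming the property names used in the 0.18-era releases (verify
against your version's mapred-default.xml):

```xml
<!-- In mapred-site.xml, or per job via JobConf -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>true</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>true</value>
</property>
```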


On 9/8/08 3:27 AM, "Dmitry Pushkarev" <[EMAIL PROTECTED]> wrote:

> Dear Hadoop users,
> 
>  
> 
> Is it possible, without using Java, to manage task assignment in order to
> implement some simple rules? For example: do not launch more than one
> instance of a crawling task on a machine, do not run data-intensive tasks
> on remote machines, and do not run computationally intensive tasks on
> single-core machines, etc.
> 
>  
> 
> Now it's done by failing tasks that decided to run on the wrong machine,
> but I hope to find a solution on the jobtracker side.
> 
>  
> 
> ---
> 
> Dmitry
>