hey chris,

i believe mike is talking about the following case:

1) queue is full
2) scheduler pops job from queue and begins trying to find a node for job
3) queue now has 1 open slot
4) another job is given to the resource manager and is placed in the queue
5) queue is now full again
6) scheduler fails to schedule popped job
7) scheduler pushes job back into the queue
8) queue is full so exception is thrown and job is lost
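
To make the sequence concrete, here is a minimal single-threaded sketch that replays the eight steps in order. The names (BoundedJobQueue, LostJobDemo, the job ids) and the use of IllegalStateException are assumptions for illustration only; this is not the Resource Manager's actual code, just a model of the interleaving described above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the Resource Manager's job queue; not OODT's actual classes.
class BoundedJobQueue {
    private final Deque<String> jobs = new ArrayDeque<>();
    private final int maxSize;

    BoundedJobQueue(int maxSize) {
        this.maxSize = maxSize;
    }

    // Throws when the queue is already at capacity, mirroring the exception in step 8.
    void add(String job) {
        if (jobs.size() >= maxSize) {
            throw new IllegalStateException("queue full, cannot add " + job);
        }
        jobs.addLast(job);
    }

    String pop() {
        return jobs.removeFirst();
    }

    boolean contains(String job) {
        return jobs.contains(job);
    }
}

class LostJobDemo {
    // Replays steps 1-8 on one thread and reports whether the popped job is still queued.
    static boolean poppedJobSurvives() {
        BoundedJobQueue queue = new BoundedJobQueue(2);
        queue.add("job-A");
        queue.add("job-B");              // 1) queue is full
        String popped = queue.pop();     // 2) scheduler pops job-A; 3) one open slot
        queue.add("job-C");              // 4) another job arrives; 5) queue is full again
        boolean scheduled = false;       // 6) scheduler fails to schedule the popped job
        if (!scheduled) {
            try {
                queue.add(popped);       // 7) scheduler pushes the job back
            } catch (IllegalStateException e) {
                // 8) the exception is caught and nothing else holds the job, so it is lost
                System.out.println("lost " + popped + ": " + e.getMessage());
            }
        }
        return queue.contains(popped);
    }

    public static void main(String[] args) {
        System.out.println("popped job still queued? " + poppedJobSurvives());
    }
}
```

In this model the job is lost because the slot freed by the pop is not reserved for the re-queue; any fix would need to either reserve that slot or let the re-queue bypass the size check.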

-brian

On Apr 10, 2012, at 07:08 AM, "Mattmann, Chris A (388J)" <[email protected]> wrote:

Hi Mike,

On Apr 9, 2012, at 9:12 AM, Cayanan, Michael D (388J) wrote:

> Hey Chris,

> Comments are below.
>> 
>> "At the time of this writing, jobs that cannot be added to the queue
>> disappear...."
>> 
>> I think we should be more clear than "disappear". They don't disappear.
>> The Scheduler will try and send a Job to the BatchMgr, and if there is an
>> exception, it tries to re-queue the Job back onto the JobStack. If it's
>> unable to do that, then there is an issue, but it at the very least tries
>> to re-queue the job if there was an issue.

> The reason this blurb was put into the wiki was because when Gabe and I
> were looking through the Resource Manager code, this is what looks to be
> happening. Check out the piece of code that tries to add a job:

Reaching Max queue size is different than saying that jobs that cannot be
added to the queue disappear. I think we should explicitly state:

"At the time of this writing, when the queue has reached the max queue
size, a message is logged by the Scheduler saying there is a Job Queue
Exception adding a job to the queue, and then the Job is dropped."

I think that's more accurate based on your code walk. I was thinking based on
your above message that you were talking about Jobs that couldn't be
Scheduled for whatever reason (e.g., the Batch Mgr being down, or a
Batch Stub being down) in which case they are re-queued.
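
The two cases being distinguished here can be sketched roughly as follows. This is only an illustration of the behavior as described in this thread: the class name TwoFailurePaths, its methods, and the log wording are hypothetical and are not taken from the Resource Manager source.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; class and method names are illustrative, not OODT's API.
class TwoFailurePaths {
    private static final Logger LOG = Logger.getLogger(TwoFailurePaths.class.getName());
    private final Deque<String> queue = new ArrayDeque<>();
    private final int maxQueueSize;

    TwoFailurePaths(int maxQueueSize) {
        this.maxQueueSize = maxQueueSize;
    }

    // Case 1: the queue is at max size, so the failure is logged and the job is dropped.
    boolean submit(String job) {
        if (queue.size() >= maxQueueSize) {
            LOG.log(Level.WARNING,
                    "Job Queue Exception adding job {0}: max queue size reached", job);
            return false;
        }
        queue.addLast(job);
        return true;
    }

    // Case 2: the job could not be scheduled (e.g., the Batch Mgr is down), so it is re-queued.
    // Returns true only if a job was actually handed off for execution.
    boolean scheduleNext(boolean batchMgrUp) {
        String job = queue.pollFirst();
        if (job == null) {
            return false; // nothing to schedule
        }
        if (!batchMgrUp) {
            submit(job); // re-queue; succeeds here because the poll just freed a slot
            return false;
        }
        return true;
    }

    int size() {
        return queue.size();
    }
}
```

With a max queue size of 1, a second submit is logged and dropped, while a job whose scheduling fails goes back on the queue and can be scheduled once the batch manager is reachable again.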

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
