FYI, I thought this would be of interest to the community...

On 12/13/12 5:30 PM, "Jain, Peyush (GSFC-5860)" <[email protected]> wrote:
>>On 12/12/12 6:28 PM, "Mattmann, Chris A (388J)" wrote:
>>
>>>Hey Peyush,
>>>
>>>On 12/12/12 9:32 AM, "Jain, Peyush (GSFC-5860)" wrote:
>>>
>>>>Hi Chris,
>>>>
>>>>Can you tell us how Job Scheduler talks to Resource and Job Monitors?
>>>>Is Job Scheduler giving or getting information (regarding node
>>>>availability) to Resource Monitor?
>>>
>>>Sure, specifically:
>>>
>>>The LRUScheduler keeps a reference to an instance of a Monitor,
>>>specifically for use in obtaining load (per node), nodes (by id and
>>>URL), incrementing and reducing load and so forth. So the scheduler
>>>uses the Monitor to profile the nodes that it assigns load onto and
>>>that it removes load from.
>>>
>>>>
>>>>If Job Scheduler is giving information to Resource Monitor and
>>>>executing the jobs when a node is available then why do we need
>>>>Resource Monitor?
>>>
>>>The monitor is meant to be an interface for managing load with respect
>>>to nodes. Ultimately right now that information is managed partially in
>>>XML files, and partially in memory while the resource manager is
>>>running, but ideally we have always wanted to have a generic
>>>GangliaMonitor to get information from a system like Ganglia, and then
>>>plug it into the scheduler. Right now the monitoring is "virtual" by
>>>profiling what we've sent to a node, and its capacity, etc. But we'd
>>>like it to be more real time (and have hacked together solutions for
>>>this that haven't made their way back into the OODT Apache trunk).
>>>
>>>>I can see a need for Resource Monitor if it is monitoring the nodes
>>>>and sending information (node availability) to Job Scheduler so that
>>>>Job Scheduler can execute a task (from Job Queue).
>>>
>>>Yep you got it -- that's basically what the current one is doing.
>>>
>>>Cheers,
>>>Chris
>>>
>>>>
>>>>Thanks,
>>>>Peyush
>>>>
>>>>On 12/12/12 9:12 AM, "Iwunze, Michael C (GSFC-4700)[NOAA-JPSS]" wrote:
>>>>
>>>>>
>>>>>------ Forwarded Message
>>>>>From: "Mattmann, Chris A"
>>>>>Date: Tue, 11 Dec 2012 15:39:25 -0600
>>>>>To: Michael Iwunze
>>>>>Subject: Re: Resource Manager issue
>>>>>
>>>>>Hey Mike,
>>>>>
>>>>>On 12/11/12 11:29 AM, "Iwunze, Michael C (GSFC-4700)[NOAA-JPSS]" wrote:
>>>>>
>>>>>>Hi Chris,
>>>>>>
>>>>>>How are you doing? Thanks for all your help. Things seem to be
>>>>>>working fine.
>>>>>
>>>>>Doing great and getting ready for the holidays. Going to head to El
>>>>>Paso, TX to watch USC play G. Tech in the Sun Bowl! Even though the
>>>>>Trojans didn't have a great season I am still going to head out to
>>>>>support them. My wife and kid are coming too so it'll be a family
>>>>>vacation around New Years :)
>>>>>
>>>>>As for the integration work into JPSS/GRAVITE and that working fine,
>>>>>that is totally awesome and great to hear! :)
>>>>>
>>>>>>I am trying to get an overview of how all the extension points
>>>>>>interact with each other based on the Resource Manager document
>>>>>>online. Extension points meaning the resource manager client/server,
>>>>>>batch manager, Job scheduler, Job monitor and Resource Monitor.
>>>>>>Without delving deep into the code, my understanding is that once
>>>>>>jobs are submitted via the Workflow to the Resource manager the job
>>>>>>scheduler queues up jobs using the job queue. During this process
>>>>>>the resource monitor checks for available nodes and the job monitor
>>>>>>checks if a job is done executing on the nodes. If a node is
>>>>>>available the job is sent to the batch manager via the scheduler for
>>>>>>execution; if not, it sits in the queue. Is my explanation close to
>>>>>>accurate?
>>>>>
>>>>>+1 totally accurate and pretty much what happens.
>>>>>
>>>>>>In addition, does the Job scheduler directly communicate with both
>>>>>>monitors for node availability?
>>>>>
>>>>>Yep it sure does -- it has a reference to them in the code I believe.
>>>>>
>>>>>Cheers,
>>>>>Chris
>>>>>
>>>>>>
>>>>>>Thanks
>>>>>>mike
>>>>>>
>>>>>------ End of Forwarded Message
>>>>>
>>>>
>
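
For anyone reading this thread later, the interaction discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the OODT code: every class and method name here (SketchMonitor, SketchScheduler, scheduleNext, jobFinished and so on) is hypothetical, node ids are plain strings, each job counts as one unit of load, and the scheduler simply picks the first node with spare capacity rather than applying an LRU policy. The monitor is "virtual" in the sense described in the thread: it only tracks what has been assigned to each node against that node's capacity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical, simplified monitor: tracks load "virtually" by counting
// what has been assigned to each node against its capacity.
class SketchMonitor {
    private final Map<String, Integer> capacity = new LinkedHashMap<>();
    private final Map<String, Integer> load = new LinkedHashMap<>();

    void addNode(String nodeId, int nodeCapacity) {
        capacity.put(nodeId, nodeCapacity);
        load.put(nodeId, 0);
    }

    int getLoad(String nodeId) {
        return load.get(nodeId);
    }

    List<String> getNodeIds() {
        return new ArrayList<>(capacity.keySet());
    }

    // Returns false (and changes nothing) if the node has no room.
    boolean assignLoad(String nodeId, int amount) {
        int current = load.get(nodeId);
        if (current + amount > capacity.get(nodeId)) {
            return false;
        }
        load.put(nodeId, current + amount);
        return true;
    }

    void reduceLoad(String nodeId, int amount) {
        load.put(nodeId, Math.max(0, load.get(nodeId) - amount));
    }
}

// Hypothetical scheduler: holds a reference to the monitor and consults it
// before taking a job off the queue.
class SketchScheduler {
    private final SketchMonitor monitor;
    private final Queue<String> jobQueue = new ArrayDeque<>();
    private final Map<String, String> running = new LinkedHashMap<>(); // jobId -> nodeId

    SketchScheduler(SketchMonitor monitor) {
        this.monitor = monitor;
    }

    void submit(String jobId) {
        jobQueue.add(jobId);
    }

    int queuedJobs() {
        return jobQueue.size();
    }

    // Tries to place the job at the head of the queue. Returns the chosen
    // node id, or null if the queue is empty or no node has capacity
    // (in which case the job stays queued).
    String scheduleNext() {
        String jobId = jobQueue.peek();
        if (jobId == null) {
            return null;
        }
        for (String nodeId : monitor.getNodeIds()) {
            if (monitor.assignLoad(nodeId, 1)) {
                jobQueue.remove();
                // A real system would hand the job to a batch manager here.
                running.put(jobId, nodeId);
                return nodeId;
            }
        }
        return null;
    }

    // Called when a job completes, so the monitor can free the node's load.
    void jobFinished(String jobId) {
        String nodeId = running.remove(jobId);
        if (nodeId != null) {
            monitor.reduceLoad(nodeId, 1);
        }
    }
}

class SchedulerSketchDemo {
    public static void main(String[] args) {
        SketchMonitor monitor = new SketchMonitor();
        monitor.addNode("node1", 1); // one node with room for one job

        SketchScheduler scheduler = new SketchScheduler(monitor);
        scheduler.submit("jobA");
        scheduler.submit("jobB");

        System.out.println(scheduler.scheduleNext()); // jobA placed on node1
        System.out.println(scheduler.scheduleNext()); // null: jobB waits in the queue
        scheduler.jobFinished("jobA");                // frees node1
        System.out.println(scheduler.scheduleNext()); // jobB placed on node1
    }
}
```

The point of keeping the monitor behind its own small interface is the one Chris makes above: a monitor backed by live data (for example from Ganglia) could in principle replace the counting version without changing the scheduler.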
