Take a look here:
http://www.schedmd.com/slurmdocs/documentation.html

See especially the section at the bottom labeled "SLURM Developers".

There is also SLURM architecture information here:
http://www.schedmd.com/slurmdocs/publications.html

Quoting Ravi Gupta <[email protected]>:

> Hi,
>
> I am taking a course on Distributed OS as part of my Masters, and my
> professor has given me the assignment "Implement Sender Initiated Load
> Distribution algorithm over SLURM", which reads as follows:
>
> Q1. Implement *Sender Initiated Load Distribution algorithm over SLURM*.
>
>
>
> Let each node estimate its load 'l' by randomly picking a number 0 < l < n
> that represents the CPU queue length of that node. Take the threshold for
> load transfer as T = 10: l < T makes a node a receiver, and l >= T makes
> it a sender. The selection policy applies to new arrivals. Use a
> *Threshold*-based location policy with a poll limit of 5. To poll a node,
> you may send it a message querying its load index (l) to find out whether
> it can be a receiver. As the output of your distributed scheduler, *print
> out the log of all the polls encountered* until a receiver is found, and a
> message saying the *job is transferred from thread (older) to thread
> (newer) one*. Every time a job is scheduled, the load index increases, and
> the departure of a job reduces it. If you think you need any other
> assumptions, you are free to make them. A sample code snippet is given
> here:
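>
> /* include the necessary header files */
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> #define POLL_LIMIT 5
> #define THRESHOLD 10
>
> char hostname[20][50];  /* names of the hosts in the UP state */
> int no_hosts = 0;       /* number of entries filled in hostname[] */
>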
> void get_available_hosts(void)
> {
>      // find all available hosts that are in the UP state and
>      // store their names in the global 'hostname' array
>      // (use the 'scontrol show partition' command)
> }
>
> char *ret_self_host_name(void)
> {
>      // return this machine's host name
> }
>
> int count_job_host(char *host_name)
> {
>      // count the running jobs on the host passed as parameter
>      // (use the 'scontrol show jobs' command)
>      // return the number of jobs obtained
> }
>
> int allocate_job(char *host, char *program[])
> {
>      // allocate job 'program' to node 'host'
>      // (use the 'salloc' command)
>      // return true if the allocation succeeds, else false
> }
>
> int main(int argc, char *argv[])
> {
>      // the program to execute is taken from the command line
>      char *self_addr;
>      int rndm_node, j;
>
>      get_available_hosts();
>      self_addr = ret_self_host_name();
>
>      // if the local load is below the threshold,
>      // schedule the job on this machine
>      if (count_job_host(self_addr) < THRESHOLD)
>      {
>            // make an entry in log file
>            allocate_job(self_addr, argv);
>      }
>      else
>      {
>            for (j = 0; j < POLL_LIMIT; j++) // poll for receivers
>            {
>                 // no_hosts is the number of entries in hostname[];
>                 // rndm_node should be other than the self node address
>                 rndm_node = random() % no_hosts;
>
>                 // make an entry in log file
>                 if (count_job_host(hostname[rndm_node]) < THRESHOLD)
>                 {
>                      allocate_job(hostname[rndm_node], argv);
>                      break;
>                 }
>            }
>
>            if (j == POLL_LIMIT) // poll limit exhausted: schedule locally
>            {
>                 // make an entry in log file
>                 allocate_job(self_addr, argv);
>            }
>      }
>
>      return 0;
> } // main ends
>
>
> Can anybody familiar with SLURM programming in C guide me on how to do it? I
> searched a lot on the internet for SLURM examples in C but was unable to find
> any. I am not asking for the complete solution, just some guidance.
>
>
> Thanks in advance
>
> Ravi
>
