Ravi,
As a sometimes owner of, or contributor to, a number of open source
software projects, allow me to offer some friendly advice.
Please understand that I speak only for myself; others on the list may
totally disagree with me. Likewise, neither SchedMD nor my own employer
has been consulted, and either may think me wrong.
You will find that open source contributors are generally willing to
bend over backwards to resolve questions and problems related to their
projects, ranging from "How do I configure it to do this?" to "I am
seeing incorrect results" to "The system crashes when I do this."
Most or all of us in the open source community have been through school,
or are there now. You will find that we generally reject requests to do
homework assignments on our projects, as that rarely benefits either the
project or the requester.
You can make this a positive experience. If the available documentation
is inadequate, the source code is available to study. On a project that
I ran a while ago, the best contribution was an installation and user
guide that came out of the blue! If, as a part of your learning, you
come up with documentation or examples of how to use the API that you
want to contribute for others to use in the future, that could be
invaluable!
Now, back to your assignment and the use of the SLURM API. Should you
get further into your implementation and have a question such as why
some call returns a result different from what you expect, please feel
free to check in with this list.
Good luck!
Andy
On 02/28/2012 01:41 AM, Ravi Gupta wrote:
Hi Moe,
Can you provide me a simple Hello World type example written in C? In
the meantime, I am looking at the links you provided.
--Ravi
On Tue, Feb 28, 2012 at 1:14 AM, Moe Jette <[email protected]> wrote:
Take a look here:
http://www.schedmd.com/slurmdocs/documentation.html
Especially the section at the bottom labeled "SLURM Developers"
There is also SLURM architecture information here:
http://www.schedmd.com/slurmdocs/publications.html
Quoting Ravi Gupta <[email protected]>:
> Hi,
>
> I am studying a course on Distributed OS as a part of my Masters, and my
> professor has given me an assignment, i.e. "Implement Sender Initiated Load
> Distribution algorithm over SLURM", which is as follows:
>
> Q1. Implement *Sender Initiated Load Distribution algorithm over SLURM*.
>
> Let each node estimate its load 'l' by randomly picking a number 0 < l < n
> that represents the CPU queue length of the node concerned. Also, consider
> the Threshold for load transfer as T = 10, where l < T makes a node a
> receiver, and l >= T makes a node a sender. The selection policy is on the
> new arrivals. Use a *Threshold* based location policy. Take the poll
> limit as 5. For polling of a node, you may think of sending a message to
> query for the load index (l) of the new node to find out whether it can be
> a receiver. As the output of your distributed scheduler, *print out the log
> of all the polls encountered* till the receiver is found, and a message
> saying the *job is transferred from thread (older) to thread (newer) one*.
> Every time a job is scheduled, the load index is increased, and departure
> of a job reduces the load index. If you think that you need any other
> assumption, you are free to make those assumptions. Sample code snippet is
> given here:
>
>
>
> /* include the necessary header files, e.g. <stdlib.h> for random() */
>
> #define POLL_LIMIT 5
> #define THRESHOLD 10
>
> char hostname[20][50];
> int no_hosts; /* number of entries filled in 'hostname' */
>
> void get_available_hosts()
> {
>     // find out all available host names which are in UP state
>     // store all host names in 'hostname' array defined globally
>     // and record how many were found in 'no_hosts'
>     // Use 'scontrol show partition' command.
> }
>
> char* ret_self_host_name()
> {
>     // return self host name
> }
>
> int count_job_host(char *host_name)
> {
>     // count total number of running jobs on a particular host
>     // passed as parameter. Use 'scontrol show jobs' command
>     // return number of jobs obtained
> }
>
> int allocate_job(char *host, char *program[])
> {
>     // allocate job 'program' to node 'host'
>     // use 'salloc' command to allocate the program to host
>     // return true if allocation is successful else false.
> }
>
> int main(int argc, char *argv[])
> {
>     // program to execute is taken as input from the command line
>     char *self_host;
>     int rndm_node, j;
>
>     get_available_hosts();
>     self_host = ret_self_host_name();
>
>     // if jobs are fewer than the threshold limit then
>     // schedule job on same machine
>     if (count_job_host(self_host) < THRESHOLD)
>     {
>         // make an entry in log file
>         allocate_job(self_host, argv);
>     }
>     else
>     {
>         for (j = 0; j < POLL_LIMIT; j++) // poll for receivers
>         {
>             rndm_node = random() % no_hosts;
>             // rndm_node should be other than self node address.
>
>             // make an entry in log file
>             if (count_job_host(hostname[rndm_node]) < THRESHOLD)
>             {
>                 allocate_job(hostname[rndm_node], argv);
>                 break;
>             }
>         }
>
>         if (j == POLL_LIMIT) // if poll limit is exceeded, schedule on itself
>         {
>             // make an entry in log file
>             allocate_job(self_host, argv);
>         }
>     }
>
>     return 0;
> } // main ends
>
>
> Can anybody familiar with SLURM programming in C guide me on how to do it? I
> searched a lot on the internet for SLURM examples in C but was unable to
> find one. I am not asking for the complete solution, just some guidance.
>
>
> Thanks in advance
>
> Ravi
>
--
Andy Riebs
Hewlett-Packard Company
High Performance Computing
+1-786-263-9743
My opinions are not necessarily those of HP