[slurm-dev] Re: "Super node" Integration with SLURM.

amjad syed Fri, 21 Sep 2012 00:53:09 -0700

I have configured SLURM in front-end configuration mode and chosen
cons_res as my resource selection plugin.
My front-end node runs xcpufs (the XCPU agent), which is required
for VERTEX. When I try to start slurmd on the front-end node, I get
the following error:



slurmd: error: select/cons_res is incompatible with XCPU use
slurmd: fatal: Use SelectType=select/linear
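
For reference, the change the fatal message asks for would look roughly
like this in slurm.conf. This is only a sketch: the SelectType value is
taken from the error above, and the rest of the file is assumed to stay
as it is. select/linear allocates whole nodes rather than individual
resources.

```
# slurm.conf fragment (sketch; only this value comes from the error message)
# select/linear allocates entire nodes to jobs, unlike select/cons_res.
SelectType=select/linear
```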

So my question is: why can I not use cons_res (which allocates individual
resources) when xcpufs is running on a compute/front-end node?





On Wed, Sep 5, 2012 at 11:29 AM, Alejandro Lucero Palau <
[email protected]> wrote:

> Hi Amjad,
>
> As Moe commented, SLURM has a frontend configuration mode that can help
> you. However, there is resource-related code that uses local functions
> along with data created during initialization. For example, you cannot use
> the affinity plugin in frontend mode because the code mixes real hardware
> information from the node running slurmd with "virtual" node information
> received from slurmctld. I suspect this dependency is not limited to the
> affinity plugin.
>
> We are using frontend mode to split a NUMA machine into virtual nodes,
> and I have been working on the affinity plugin problem lately. I have a
> patch that solves this issue, but it requires more testing.
>
>
> On 09/05/2012 03:35 AM, amjad syed wrote:
>
> Hello,
>
> We are working on the concept of a "super node", which transparently
> connects heterogeneous lightweight compute nodes to storage and services
> subsystems. The lightweight compute nodes will be used exclusively for
> computation, and no service daemons will run on them.
> An open source implementation of this product is hosted on GitHub
> ( https://github.com/HPCLinks/Open-Vertex ).
>
> So in terms of SLURM, the lightweight compute nodes will not run slurmd
> daemons. The management node daemon (slurmctld) will communicate only
> with the "super node" daemon (slurmd). This slurmd daemon should be able
> to get dynamic resource information from the lightweight compute nodes
> attached to the "super node" and pass that information to the management
> node. We are looking at a maximum of 10 lightweight compute nodes attached
> to one "super node".
>
>
>
>
> Can slurmd running on a compute node manage remote resources (such as
> memory)?
>
> What is the best way forward to integrate VERTEX with SLURM?
>
> Sincerely,
> Amjad
>
