[slurm-dev] local scratch

2016-10-05 Thread Neile Havens
To work around some performance issues with our storage server, I need my users 
to work in a local scratch directory, copying files in at the beginning of a 
job and out at the end as needed.  We're doing something similar to the 
suggestions on these sites, but would like to write the output and error files 
to the local scratch directory as well.
https://www.princeton.edu/researchcomputing/faq/how-do-i-use-local-scratc/
http://www.ceci-hpc.be/slurm_faq.html#Q11

Something like the following fails.  It does not look like SLURM evaluates 
environment variables in SBATCH directives.

#SBATCH --output="$SCRATCH/%A/%a/stdout.txt"
#SBATCH --error="$SCRATCH/%A/%a/stderr.txt"

Something like the following fails, because the directory in which SLURM tries 
to create the output and error files does not exist when it tries to create the 
files.

#SBATCH --output="/localscratch/netid/%A/%a/stdout.txt"
#SBATCH --error="/localscratch/netid/%A/%a/stderr.txt"
export SCRATCH="/localscratch/netid/$SLURM_ARRAY_JOB_ID/$SLURM_ARRAY_TASK_ID"
rm -rf "$SCRATCH"
mkdir -p "$SCRATCH"
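
One possible workaround, sketched here under assumptions and not taken from the thread: let SLURM discard its own output files and have the batch script redirect its payload once the scratch directory exists. The SCRATCH_ROOT variable, the run_job function, and the fallback values are hypothetical names introduced only for illustration.

```shell
#!/bin/bash
#SBATCH --array=1-4
#SBATCH --output=/dev/null
#SBATCH --error=/dev/null

# Assumption: SCRATCH_ROOT would be /localscratch/netid on the cluster above;
# it defaults to a temporary directory so the sketch also runs elsewhere.
SCRATCH_ROOT="${SCRATCH_ROOT:-${TMPDIR:-/tmp}/localscratch-demo}"

# Outside a job array these variables are unset, so fall back to 0.
job_id="${SLURM_ARRAY_JOB_ID:-0}"
task_id="${SLURM_ARRAY_TASK_ID:-0}"

export SCRATCH="$SCRATCH_ROOT/$job_id/$task_id"
rm -rf "$SCRATCH"
mkdir -p "$SCRATCH"

# Hypothetical payload; replace with the real work of the job.
run_job() {
    echo "running in $SCRATCH"
    echo "a warning" >&2
}

# Redirect the payload's streams now that the directory exists.
run_job >"$SCRATCH/stdout.txt" 2>"$SCRATCH/stderr.txt"
```

Anything the script prints before the redirection is lost to /dev/null, so the directory setup should stay as short as possible.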

Creating a local scratch directory in a Prolog script works.  However, Prolog 
scripts that run on compute nodes do not have the SLURM_ARRAY_JOB_ID or 
SLURM_ARRAY_TASK_ID environment variables.  They can only get SLURM_JOB_ID.  
This makes it difficult to create a unique local scratch directory for each 
array index in the job arrays that most of my users submit.  Does anyone know 
of a way to get job array or array task ids from a Prolog script or a way to 
set an environment variable in a Prolog script that could be used in a batch 
script?

Thanks,

Neile Havens
Scientific Computing System Administrator
Wheaton College


[slurm-dev] Re: Send notification email

2016-10-05 Thread Fanny Pagés Díaz

Following the explanation in the previous mail, I am trying to configure SLURM + 
SMTP (mail client) without using Postfix.
I run the jobs like this, but it does not work:

salloc -n 2 -N 2 --mail-user=fpa...@gmail.com --mail-type=END mpirun jobs1

/var/log/maillog
Oct  5 11:34:09 cluster sSMTP[2139]: Creating SSL connection to host
Oct  5 11:34:09 cluster sSMTP[2139]: SSL connection using 
ECDHE-RSA-AES256-GCM-SHA384
Oct  5 11:34:09 cluster sSMTP[2139]: Sent mail for root@fpa...@citi.cu (221 
2.0.0 Bye) uid=0 username=root outbytes=420
Oct  5 11:34:52 compute-0-3 postfix/qmgr[1792]: 2AC6BC006B: from=, 
size=4328, nrcpt=1 (queue active)
Oct  5 11:34:52 compute-0-3 postfix/smtp[6469]: connect to 
10.8.52.254[10.8.52.254]:25: Connection refused
Oct  5 11:34:52 compute-0-3 postfix/smtp[6469]: 2AC6BC006B: to=, 
orig_to=, relay=none, delay=8869, delays=8869/0.01/0/0, dsn=4.4.1, 
status=deferred $

-Original Message-
From: John Hearns [mailto:john.hea...@xma.co.uk]
Sent: Wednesday, 5 October 2016 11:33
To: slurm-dev
Subject: [slurm-dev] Re: Send notification email

Fany,
Are you able to send us some of the lines from the /var/log/maillog file which 
indicate why the email server is rejecting the email?
Thank you



-Original Message-
From: Fany Pages [mailto:fpa...@udio.cujae.edu.cu]
Sent: 05 October 2016 16:30
To: slurm-dev 
Subject: [slurm-dev] Re: Send notification email


Thanks anyway.

All the best.
Fany

-Original Message-
From: John Hearns [mailto:john.hea...@xma.co.uk]
Sent: Wednesday, 5 October 2016 11:20
To: slurm-dev
Subject: [slurm-dev] Re: Send notification email

Fany,
You are correct.  I understand this a bit better now.
My answer, I am afraid, is that you will have to ask your corporate IT people to 
allow email from this address.

I recently dealt with a similar case at a university. The mail servers were 
refusing to accept mail from the cluster head node, as it did not have a 
reverse DNS entry. In the end we had to configure email to go via an 
Office365 server!

Other people on the list may be able to offer a better solution though.





-Original Message-
From: Fanny Pagés Díaz [mailto:fpa...@citi.cu]
Sent: 05 October 2016 15:45
To: slurm-dev 
Subject: [slurm-dev] Re: Send notification email


Yes, I mean that the cluster's IP address is not valid on the external network. 
My domain (@cluster.citi.cu) is not registered and therefore is not in the MX 
records, so when I relay through my Postfix, my corporate mail server refuses 
to let the mail leave my internal network. I think that is what is happening. 
Am I wrong?

-Original Message-
From: John Hearns [mailto:john.hea...@xma.co.uk]
Sent: Wednesday, 5 October 2016 10:17
To: slurm-dev
Subject: [slurm-dev] Re: Send notification email

Fany,
Many clusters have an internal network which is a private network.

However, the other interface on the cluster head node, which is normally called 
the 'external' interface, can have a real, proper IP address on your external 
network.
It will therefore be able to send email.
The cluster compute nodes can be configured to 'relay' email via the head node.



-Original Message-
From: Fanny Pagés Díaz [mailto:fpa...@citi.cu]
Sent: 05 October 2016 15:13
To: slurm-dev 
Subject: [slurm-dev] Re: Send notification email


Hi,
Thanks for your answer.
My HPC cluster does not have a real IP segment; it is a test cluster. Therefore 
it is not recognized on the external network, so I need to try another way.
All the best,
Fany

-Original Message-
From: Christopher Samuel [mailto:sam...@unimelb.edu.au]
Sent: Tuesday, 4 October 2016 18:43
To: slurm-dev
Subject: [slurm-dev] Re: Send notification email


On 03/10/16 23:39, Fanny Pagés Díaz wrote:

> I have Slurm running on the HPC cluster server, but I need to send 
> all notifications through my corporate mail server, which runs on 
> another server on my internal network. I do not need to use the local 
> Postfix installed on the Slurm server.

The most reliable solution will be to configure Postfix to send emails via the 
corporate server.

All our clusters send using our own mail server quite deliberately.

We set:

relayhost (to say where to relay email via)
myorigin (to set the system name to its proper FQDN)
aliasmaps (to add an LDAP lookup to rewrite users' email to the value in LDAP)

But really this isn't a Slurm issue, it's a host config issue for Postfix.
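
As a rough illustration of the first two settings, the sketch below prints the corresponding main.cf lines. The host names are hypothetical placeholders, and the LDAP alias lookup is left out because it depends on the site's directory schema. The square brackets around the relay host tell Postfix to skip the MX lookup for it.

```shell
#!/bin/sh
# Hypothetical values; substitute the real relay host and cluster FQDN.
RELAY_HOST="${RELAY_HOST:-mail.example.com}"
CLUSTER_FQDN="${CLUSTER_FQDN:-cluster.example.com}"

# Print the main.cf settings described above, one per line.
postfix_relay_settings() {
    echo "relayhost = [$RELAY_HOST]:25"
    echo "myorigin = $CLUSTER_FQDN"
}

postfix_relay_settings

# To apply them (as root), each line can be passed to postconf:
#   postfix_relay_settings | while IFS= read -r line; do postconf -e "$line"; done
#   postfix reload
```

Printing the settings first makes it easy to review them before anything in the live Postfix configuration is changed.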

All the best,
Chris
--
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/  http://twitter.com/vlsci
Any views or opinions presented in this email are solely those of the author 
and do not necessarily represent those of the company. Employees of XMA Ltd are 
expressly required not to make defamatory 

[slurm-dev] Re: cons_res / CR_CPU - we don't have select plugin type 102

2016-10-05 Thread Jose Antonio

Hello Lachlan,

Thanks for your reply. All the nodes have access to /usr/lib64/slurm, 
but the directory is not shared, each one has its own, and yes they do 
have the file "select_cons_res.so":


$ ls /usr/lib64/slurm/ | grep select*
select_alps.so
select_bluegene.so
select_cons_res.so
select_cray.so
select_linear.so
select_serial.so

Thanks for the tip, scontrol reconfigure is way easier than restarting 
the daemons.


As the problem persists, I will post my slurm.conf in case I am 
getting something wrong.


ControlMachine=phb1
ControlAddr=X

MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/cons_res # If I change it to select/linear it works fine
SelectTypeParameters=CR_CPU
#
# LOGGING AND ACCOUNTING
AccountingStorageHost=phb1
AccountingStorageType=accounting_storage/slurmdbd
ClusterName=phb1-cluster
#JobAcctGatherFrequency=30
#JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurmd.log
#
#
# COMPUTE NODES
NodeName=phb2 CPUs=8 Sockets=1 CoresPerSocket=4 ThreadsPerCore=2 
RealMemory=15877 NodeAddr=X State=UNKNOWN

PartitionName=phb2-queue Nodes=phb2 Default=Yes MaxTime=1:00:00 State=UP

Regards,

Jose

On 05/10/2016 at 0:12, Lachlan Musicman wrote:

Jose,

Do all the nodes have access to either a shared /usr/lib64/slurm or do 
they each have their own? And is there a file in that dir (on each 
machine) called select_cons_res.so?


Also, when changing slurm.conf here's a quick and easy workflow:

1. change slurm.conf
2. deploy to all machines in cluster (I use ansible, but puppet, 
satellite, clusterssh, pssh, pdsh etc are also good here)

3. on head node: restart slurmctld
4. on head node: run "scontrol reconfigure"


That's it. No need to reboot any nodes or even log in to them.
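
Steps 2 to 4 above could be scripted roughly as follows. This is only a sketch: the config path, the use of scp and systemctl, and the RUN dry-run switch are assumptions, and the node name phb2 is borrowed from the slurm.conf posted in this thread.

```shell
#!/bin/sh
# Assumptions: config path, node list, and systemd are illustrative only.
CONF="${CONF:-/etc/slurm/slurm.conf}"
NODES="${NODES:-phb2}"
# RUN defaults to "echo", so the commands are only printed (dry run).
# Set RUN= (empty) to execute them for real.
RUN="${RUN-echo}"

deploy_slurm_conf() {
    # Step 2: copy the edited slurm.conf to every compute node.
    for node in $NODES; do
        $RUN scp "$CONF" "$node:$CONF"
    done
    # Step 3: restart the controller on the head node.
    $RUN systemctl restart slurmctld
    # Step 4: tell the running daemons to re-read the configuration.
    $RUN scontrol reconfigure
}

deploy_slurm_conf
```

By default the script only prints the commands it would run, which gives a chance to check the node list before touching the cluster.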

cheers
L.



--
The most dangerous phrase in the language is, "We've always done it 
this way."


- Grace Hopper

On 5 October 2016 at 07:25, Jose Antonio wrote:


Hi Manuel,

Thanks for replying. Yes, I have checked the slurm.conf, they are
all the same on the server and compute nodes. I restarted the
slurmd  daemon on the compute nodes and finally restarted the
slurmctld service on the server. I rebooted the machines too, but
it keeps showing the same error message on the console (Zero Bytes
were...) and log files.

I have also set the PluginDir=/usr/lib64/slurm just in case it
could not find the plugins, but it does not work either.
All the partitions are active (idle), they did not turn to down or
drained state.

Regards,

Jose


On 04/10/2016 at 20:28, Manuel Rodríguez Pascual wrote:

Hi Jose,

I don't know if it's the case, but this error tends to arise
after changing the configuration of slurmctld without rebooting the
compute nodes, or when they have a different configuration. Have you
double-checked this?

Best regards,

Manuel

On Tuesday, 4 October 2016, Jose Antonio wrote:


Hi,

Currently I have set the SelectType parameter to "select/linear", which
works fine. However, when a job is sent to a node, the job takes all the
cpus of the machine, even if it only uses 1 core.

That is why I changed SelectType to "select/cons_res" and its
SelectTypeParameters to "CR_CPU", but this doesn't seem to work. If I
try to send a task to a partition, which works with select/linear, the
following message pops up:

sbatch: error: slurm_receive_msg: Zero Bytes were transmitted or received
sbatch: error: Batch job submission failed: Zero Bytes were transmitted or received

The log in the server node (/var/log/slurmctld.log):

error: we don't have select plugin type 102
error: select_g_select_jobinfo_unpack: unpack error
error: Malformed RPC of type REQUEST_SUBMIT_BATCH_JOB(4003) received
error: slurm_receive_msg: Header lengths are longer than data received
error: slurm_receive_msg [155.54.204.200:38850]: Header lengths are longer than data received

There is no update in the compute node logs after this error comes up.

Any ideas?

Thanks,

Jose








[slurm-dev] Re: Send notification email

2016-10-05 Thread John Hearns
Fany,
Are you able to send us some of the lines from the /var/log/maillog file which 
indicate why the email server is rejecting the email?
Thank you



Any views or opinions presented in this email are solely those of the author 
and do not necessarily represent those of the company. Employees of XMA Ltd are 
expressly required not to make defamatory statements and not to infringe or 
authorise any infringement of copyright or any other legal right by email 
communications. Any such communication is contrary to company policy and 
outside the scope of the employment of the individual concerned. The company 
will not accept any liability in respect of such communication, and the 
employee responsible will be personally liable for any damages or other 
liability arising. XMA Limited is registered in England and Wales (registered 
no. 2051703). Registered Office: Wilford Industrial Estate, Ruddington Lane, 
Wilford, Nottingham, NG11 7EP

[slurm-dev] Re: Send notification email

2016-10-05 Thread Fany Pages

Thanks anyway.

All the best.
Fany



[slurm-dev] Re: Send notification email

2016-10-05 Thread John Hearns
Fany,
You are correct.  I understand this a bit better now.
My answer, I am afraid, is that you will have to ask your corporate IT people to 
allow email from this address.

I recently dealt with a similar case at a university. The mail servers were 
refusing to accept mail from the cluster head node, as it did not have a 
reverse DNS entry. In the end we had to configure email to go via an 
Office365 server!

Other people on the list may be able to offer a better solution though.







[slurm-dev] Re: Send notification email

2016-10-05 Thread Fanny Pagés Díaz

Yes, I mean that the cluster's IP address is not valid on the external network. 
My domain (@cluster.citi.cu) is not registered and therefore is not in the MX 
records, so when I relay through my Postfix, my corporate mail server refuses 
to let the mail leave my internal network. I think that is what is happening. 
Am I wrong?



[slurm-dev] Re: Send notification email

2016-10-05 Thread John Hearns
Fany,
Many clusters have an internal network which is a private network.

However, the other interface on the cluster head node, which is normally called 
the 'external' interface, can have a real, proper IP address on your external 
network.
It will therefore be able to send email.
The cluster compute nodes can be configured to 'relay' email via the head node.





[slurm-dev] Re: Send notification email

2016-10-05 Thread Fanny Pagés Díaz

Hi,
Thanks for your answer. 
My HPC cluster does not have a real IP segment; it is a test cluster. Therefore 
it is not recognized on the external network, so I need to try another way.
All the best,
Fany



[slurm-dev] Scheduling Algorithms

2016-10-05 Thread Michael Miller

Dear List,


let’s assume I have one user submitting batch jobs like:

sbatch --array=1-60 --wrap="/bin/sleep 10" --job-name=sleeper1
sbatch --array=1-90 --wrap="/bin/sleep 3" --job-name=sleeper2

Then, because of “fifo”, the job “sleeper1” is processed first, followed by 
“sleeper2”. They are all in one partition. Is it somehow possible to have the 
scheduler process both jobs at the same time? I saw fairshare and gang 
scheduling, but neither seems to do what I am looking for. 

(I have consumable resources activated with CPU metrics, in case that is 
relevant.)

Best Regards,

Michael