We had this problem lots, and I can't quite remember how I solved it - I
think it might've been either a JSV or a qsub wrapper that shoves all
GPU jobs into the superordinate queue.
Now that I'm thinking about this again - does the subordinate queue
setting accept 'queue@@hostgroup' syntax?
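For illustration only, the qsub-wrapper idea could look roughly like the sketch below. It is not the wrapper used on that cluster (the post does not show it). The resource name "gpu" and the queue name "gpu.q" are assumptions, and the sketch only rewrites an argument list; it does not call qsub itself.

```python
from typing import List

GPU_RESOURCE = "gpu"   # assumed complex name, not taken from the thread
GPU_QUEUE = "gpu.q"    # assumed name of the superordinate queue


def requests_gpu(args: List[str]) -> bool:
    """Return True if any '-l' resource list names the GPU resource."""
    for i, arg in enumerate(args[:-1]):
        if arg == "-l":
            for item in args[i + 1].split(","):
                name = item.split("=", 1)[0].strip()
                if name == GPU_RESOURCE:
                    return True
    return False


def route_gpu_job(args: List[str]) -> List[str]:
    """Prepend '-q GPU_QUEUE' for GPU jobs that do not name a queue."""
    if requests_gpu(args) and "-q" not in args:
        return ["-q", GPU_QUEUE] + args
    return list(args)
```

A real wrapper would pass the rewritten list on to the actual qsub binary, and would have to stop scanning at the job script name so that script arguments are not mistaken for qsub options.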
Hello,
from a kernel/mechanism point of view, it is perfectly possible to
restrict device access using cgroups. I use that on my current cluster,
works really well (both for things like CPU cores and GPUs - you only
see what you request, even using something like 'nvidia-smi').
Sadly, my
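As a rough sketch of the mechanism, assuming the cgroup v1 'devices' controller (the post does not say which cgroup setup is in use): a job's cgroup is denied every GPU device node it did not request. The function below only builds the rule strings. Writing them into the job's devices.deny file is left out, and the function name and arguments are hypothetical.

```python
from typing import Iterable, List

NVIDIA_MAJOR = 195  # character-device major number of /dev/nvidiaN


def gpu_deny_rules(total_gpus: int, granted: Iterable[int]) -> List[str]:
    """Build cgroup v1 devices.deny entries for every GPU not granted."""
    granted_set = set(granted)
    return [
        f"c {NVIDIA_MAJOR}:{minor} rwm"
        for minor in range(total_gpus)
        if minor not in granted_set
    ]
```

For a four-GPU node where the job was granted GPU 1, this yields deny rules for minors 0, 2 and 3, so tools inside the job would only see the one device.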
I was about to ask that :)
$SGE_ROOT ought to be accessible from (the) submit host(s), at least. So
in general, you should be able to access it from there?
(Note that you can also tell qacct where the accounting file lives - it
assumes a default location, but the file does not have to be in that
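(qacct takes the file name via its -f option.) Because the accounting file is plain colon-separated text, a copy of it can also be read directly. The sketch below assumes the field order from accounting(5), with the job owner as the fourth field, and counts records per owner. The function name is made up for illustration.

```python
from collections import Counter
from typing import Iterable


def jobs_per_owner(lines: Iterable[str]) -> Counter:
    """Count accounting records per job owner (fourth field)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and the comment header at the top of the file.
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 6:
            continue  # too short to be an accounting record
        counts[fields[3]] += 1
    return counts
```

It can be fed an open file object for whichever copy of the accounting file is readable from the host in question.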
...or one can just use logrotate (rather than run an extra cron job).
It's surprisingly good at that sort of thing ;)
Tina
On 29/01/2019 16:21, Reuti wrote:
> Hi,
>
>> On 29.01.2019 at 17:09, John Young wrote:
>>
>> The gridengine accounting file on our cluster has gotten
>> rather large. I
fancy new things like mesos that have a different
> > programming model or are too new.
> >
> > Dan
> > ___
> > users mailing list
> > users@gridengine.org
> > https://gridengine.org/mailman/listinfo/users
>
> Thanks for sharing,
>
> Paul.
--
Tina Friedrich, Snr HPC Systems Administrator, Advanced Research Computing
Research Computing and Support Serv
>
> Douglas Duckworth, MSc, LFCS
> HPC System Administrator
> Scientific Computing Unit
> Weill Cornell Medicine
> E: d...@med.cornell.edu
> O: 212-746-6305
> F: 212-746-8690
ounting" ?
>
> Thanks for any help.
>
>
>
>
>
> -Noel Benitez, Salk iT Dept.
--
Tina Friedrich, Snr HPC Systems Administrator, Advanced Research Computing
Research Computing and Support Services, Academic IT
IT Services, University of Oxford
http://www.arc.ox.ac.uk
___
I dealt with a similar problem by way of using the pam-regex[1] module to
simply transform all entered usernames to lower case... as long as your user
names (on the Linux side) are all *supposed* to be lower case, that should do
the trick :)
(Had the bonus of also solving all sorts of other
Only time I ever had problems with duplicate IDs it was simply because
they rolled over - been a while ago though (might've been SGE6.2,
actually - I think that might've hit max job ID at 99 ). You'd have
to run through a very large amount of jobs to hit that monthly, though.
Tina
On
My trick won't work for the random port on the submit host.
William
--
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Sc
gles
--
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
--
This e-mail and any attachments
e does the
verification. Would spare me rather a lot of helpdesk calls :)
Tina
how people have made XDMoD and other variants
work with Grid Engine(s) -- can we get a little thread going on best
practices and recommendations for 3rd party reporting/metrics tools?
Suspect there would be decent interest in this ...
-Chris
Tina Friedrich mailto:tina.friedr...@diamond.ac.uk
connecting to other
nodes though; I simply needed a X forwarding capable qlogin :)
Tina
On 13/10/14 22:45, Prentice Bisbal wrote:
Is that ssh conf dynamically generated to limit access only to nodes
that SGE has assigned to that user?
Prentice
On 10/13/2014 12:44 PM, Tina Friedrich wrote:
We
qrsh and
qlogin but don't expose SSH directly to the users?
Regards,
Derrick
run in queue' stanzas, so unfortunately I'm not much
wiser. (qstat -F for a queue instance that would fit does give me
'qc:slots=8').
Tina
On 07/07/14 17:24, Tina Friedrich wrote:
Okay, I checked. All jobs in the queue have the same priority. They all
request the same resources. There aren't
Hi William,
On 07/07/14 15:22, William Hay wrote:
On Fri, 4 Jul 2014 10:37:56 +
Tina Friedrich tina.friedr...@diamond.ac.uk wrote:
Hello list,
I have a couple of jobs sitting in the queue (been there for ages)
that never seem to start (they're in qw).
qalter -w p #JOBNO says
Friedrich wrote:
Hi William,
On 07/07/14 15:22, William Hay wrote:
On Fri, 4 Jul 2014 10:37:56 +
Tina Friedrich tina.friedr...@diamond.ac.uk wrote:
Hello list,
I have a couple of jobs sitting in the queue (been there for ages)
that never seem to start (they're in qw).
qalter -w p #JOBNO says
the python script starts (and should print something),
in the runs where it works a 'write' is called and if it fails it doesn't.
Does anyone have any idea?
Tina
, and will simply have
to not use MPI. Such hardship :)
Tina
this?
Tina
On 02/04/14 16:04, Skylar Thompson wrote:
An exclusive host consumable is the right way to approach the problem. If
the task elements might be part of a parallel environment, then you'll want
to set the scaling to JOB as well.
On Wed, Apr 02, 2014 at 03:39:03PM +0100, Tina Friedrich wrote
) with 'slots
1'.
-Hugh
-Original Message-
From: users-boun...@gridengine.org [mailto:users-boun...@gridengine.org] On
Behalf Of Skylar Thompson
Sent: Wednesday, April 02, 2014 11:04 AM
To: Tina Friedrich
Cc: users@gridengine.org
Subject: Re: [gridengine users] array job / node allocation
Hello All,
just double checking - there still is no way to use anything but a
user's primary group for ACLs etc?
(Directly use, I mean. Without resorting to duplicating information in
SGE setup, or using a JSV, or wrapping qsub, or ...)
Tina
Hi Reuti,
On 31/03/14 12:47, Reuti wrote:
Hi,
On 31.03.2014 at 12:22, Tina Friedrich wrote:
just double checking - there still is no way to use anything but a user's
primary group for ACLs etc?
(Directly use, I mean. Without resorting to duplicating information in SGE
setup, or using
Hi Reuti,
On 31/03/14 15:05, Reuti wrote:
On 31.03.2014 at 14:14, Tina Friedrich wrote:
On 31/03/14 12:47, Reuti wrote:
Hi,
On 31.03.2014 at 12:22, Tina Friedrich wrote:
just double checking - there still is no way to use anything
but a user's primary group for ACLs etc?
(Directly use
I was about to say, that sounds like something got missed.
Glad it all worked!
Tina
On 28/03/14 01:11, Kevin Buckley wrote:
On 28 March 2014 13:48, Kevin Buckley
kevin.buckley.ecs.vuw.ac...@gmail.com wrote:
On 28 March 2014 00:37, Tina Friedrich tina.friedr...@diamond.ac.uk wrote:
(Sorry I
On 27/03/14 05:18, Kevin Buckley wrote:
On 26 March 2014 23:34, Tina Friedrich tina.friedr...@diamond.ac.uk wrote:
It does sound as if you need to move the SGE_ROOT file system from one
host to the next as well,
Yes, we do.
I'd say stopping everything and simply syncing it should work.
Yes
to bother with the IP - changing 'act_qmaster'
ought to do the trick.
Tina
On 26/03/14 09:57, Tina Friedrich wrote:
Can't you just install the new one, make it one of the shadow masters,
and call a migrate? I've never done this across OSs (all my qmasters are
Linux hosts), but I quite often migrate
of a problem (likely simply more noticeable with higher cluster load).
Tina
On 03/03/14 12:46, Reuti wrote:
On 03.03.2014 at 12:59, Tina Friedrich wrote:
I was about to ask a similar question; we have the same sort of setup - high,
medium and low priority queues - and run into the same problem
useful information yet.
Any recommendations on a source for info on installing / configuring ARCo?
Or another alternative?
Isaac Jessop
in SGE8.1.3?
Tina
to
configuration management - hence CFEngine.
I'd second Dave's requirement 4 - I wouldn't really go for anything
that's coupled to the OS.
Tina
'. I would
consider a combination of PXE, kickstart (or whatever installation
scripting system you are using) and Puppet/Chef/CFEngine/... satisfy my
'OS independence' requirement, really.
I was more thinking of things like Rocks.
Tina
, but as I couldn't manage to detect
the current jobs I reverted back to SGE.
Txema
On 10/09/13 11:40, Tina Friedrich wrote:
Hi Txema,
I recently upgraded our Grid Engine from SGE6.2u4 (I think it was) to
OGE8.1.3. No rocks though, so I don't know any details on that.
What I basically did was:
1
: +49.9471.200.195 | Mobile:
+49.170.819.7390
Where Grid Engine lives
://arc.liv.ac.uk/mailman/listinfo/sge-discuss
use ssh per
remote_startup(5).
it has
those symptoms, but there appears to be a race in the threading of the
builtin startup that appears on recent Ubuntu, for instance, but doesn't
on RHEL5 or 6 in our experience. You can still use ssh per
remote_startup(5).
require non-authenticated qrsh (if
I'm not mistaken).
Tina
, cluster nodes get upgraded
along with the rest of the estate.
)
with the standard workstation setup (and hence, things that work on
workstations not working on the cluster or vice versa) is - to us - much
more of a concern. So, cluster nodes get upgraded along with the rest of
the estate.
Tina
installed hwloc in a non-standard location (and
currently have to tell the execd process where it is). Is there an
option for aimk to build statically linked binaries? (I'm sort of
guessing that that's what the difference is here.).
Apologies for the very long post.
Tina
haven't got an obvious bug somewhere.
Tina
decision.
I have tried a load sensor - basically counting the number of jobs in
the queue on a machine - but that didn't seem to make a difference;
which might be due to the weighting, I suppose.
Anyone got any bright ideas?
Tina
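For reference, a load sensor of the kind described follows a simple stdin/stdout protocol: execd sends a line for each report it wants, the sensor answers with a begin/end block of host:name:value lines, and it exits when it reads "quit". The sketch below is not the sensor from this post. The complex name "jobcount" and the count_local_jobs placeholder are assumptions, and how the jobs are actually counted is left open.

```python
import socket
import sys
from typing import Callable, Iterable, Iterator

COMPLEX_NAME = "jobcount"  # assumed complex name, must exist in the complex list


def format_report(host: str, name: str, value: int) -> str:
    """Format one load report block in the load sensor protocol."""
    return f"begin\n{host}:{name}:{value}\nend\n"


def sensor_replies(requests: Iterable[str], host: str,
                   count_jobs: Callable[[], int]) -> Iterator[str]:
    """Yield one report per request line until 'quit' is read."""
    for line in requests:
        if line.strip() == "quit":
            return
        yield format_report(host, COMPLEX_NAME, count_jobs())


def count_local_jobs() -> int:
    """Placeholder: a real sensor would count the jobs on this host."""
    return 0


def main() -> None:
    # The host name must match the name the qmaster uses for this host.
    for reply in sensor_replies(sys.stdin, socket.gethostname(), count_local_jobs):
        sys.stdout.write(reply)
        sys.stdout.flush()  # execd reads each report as it is written
```

Calling main() runs the sensor loop. Whether the reported value changes scheduling still depends on the complex being part of the load formula and on how it is weighted, which matches the suspicion about weighting above.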
Hi Reuti,
On 08/04/13 13:12, Reuti wrote:
Hi Tina,
On 08.04.2013 at 11:16, Tina Friedrich wrote:
is it possible to restrict access to a queue by anything but ACL or project? A
complex/resource would be a favourite.
A forced boolean complex attached to a queue?
Ah, no. Thought
Hi Reuti,
discussion veering off a bit :)
On 08/04/13 13:33, Reuti wrote:
On 08.04.2013 at 14:28, Tina Friedrich wrote:
Hi Reuti,
On 08/04/13 13:12, Reuti wrote:
Hi Tina,
On 08.04.2013 at 11:16, Tina Friedrich wrote:
is it possible to restrict access to a queue by anything but ACL
priority queue (might be the easiest).
I could introduce a project for this I suppose; however, if there was a
way to solve it with a resource I'd prefer that. Any suggestions?
Tina
deadline -
so time is not on our side.
-paul
(and had forgotten about). Can anyone cast more light on
it?