hostlist such that the host with the greatest
number of slots assigned to the job is first in the list, reducing the
frequency that the problem is hit.
* Enhance GE by making qrsh more light-weight.
If you made it down to the bottom of this post, my thanks :)
Mark
--
-----
Mark Dixon Email: m.c.di...@leeds.ac.uk
HPC/Grid Systems Support Tel (int): 35429
Information Systems Services Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
's source to find out how
it does it.
Cheers,
Mark
liant, thanks to you both.
For what it's worth, it seems to be giving similar answers to qmon. I'll
keep an eye out and report back if I notice anything amiss...
Cheers,
Mark
ou
got on :)
Mark
is (and many other improvements).
All the best,
Mark
pirun". For really crazy things, we use the
queue starter_method :)
Perhaps we've been lucky, but I don't think we've got anything that cannot
be wrapped in some shape or form at the moment.
Cheers,
Mark
--
-
e instance. You might want to give that a go.
All the best,
Mark
.
Mark
PS v20z? I liked them, but for some reason I always ended up bleeding
slightly from my right hand index finger whenever I took the lids off.
equest, but are not
otherwise enforced.
How far along with your solution are you? Am I just duplicating work
someone else has already done?
Mark
e
execd, and editing the "obvious" places (data structures and sizes)
doesn't seem to be the end of it.
gdb ahoy!
Mark
On Wed, 23 May 2012, Rayson Ho wrote:
On Wed, May 23, 2012 at 4:09 AM, Mark Dixon wrote:
Intended notable features of the patchset:
* Two new resources h_mem and s_mem to limit total memory + swap usage (i.e.
not just rss).
In my implementation, I did not add any new queue resource limits
the development work
that has gone into it, and thanks again: it's very pleasing to see this
sort of thing done under an open source model.
them under the SISSL. Nothing was mentioned about copyright assignment.
[actually, that conversation with them went stale... must pick it up
again].
Mark
u can just translate
the percentage to an absolute value using the JSV feature.
* If you never want to use an absolute value, you can put in a consumable
value of 100 and hack the execd code to interpret the number however you
like.
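The translation itself is trivial; here is a minimal sketch of the arithmetic such a JSV would do, assuming (hypothetically) 128G nodes and the percentage request passed in as "$1":

```shell
#!/bin/sh
# Sketch only: convert a percentage-of-node memory request into an
# absolute value a consumable can understand. Node size is an assumption.
node_mem_mb=131072                   # 128G in MiB (assumed node size)
percent="${1:-25}"                   # percentage requested (demo default: 25)
abs_mb=$(( node_mem_mb * percent / 100 ))
echo "${abs_mb}M"                    # 25% of 128G -> 32768M
```

A real JSV would read the request with the jsv_get_param family of calls and write the absolute value back before accepting the job.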
Mark
--
to make $
but since Oracle gave us the maintainership to maintain open source
Grid Engine we have the obligation to maintain it in an *Open Source*
way.
...
I'm very reluctant to end on this, but here goes; that paragraph should
have at least one of:
s/as the$/as an/
Or:
s/^open so
to reimplement
functionality already found in ssh(d) and rsh(d) on almost every host?
:)
Mark
port to the
wildcard address (instead of just the loopback) and puts a real hostname
in your DISPLAY variable. "qsub -V" then makes sure the compute nodes get
it.
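In case it helps, the DISPLAY rewrite itself can be as small as this (the display number is invented for the demo):

```shell
# Sketch: replace the host part of DISPLAY with the submit host's real
# name, so "qsub -V" hands compute nodes an address they can reach.
DISPLAY="localhost:10.0"             # what the X proxy might have set (demo value)
display_num="${DISPLAY#*:}"          # keep the display number, e.g. "10.0"
DISPLAY="$(hostname -f):${display_num}"
export DISPLAY
```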
Mark
't repeat the good stuff that other people have already written about
what you can do today but, if you haven't already, you might want to
skim-read the ongoing cgroups thread that describes a much better approach
that should be available in the future.
All the best,
Mark
--
-
On Fri, 25 May 2012, Rayson Ho wrote:
On Fri, May 25, 2012 at 12:01 PM, Mark Dixon wrote:
That's what I was wondering was the answer :)
In my opinion there are simpler ways round it as long as not having
encrypted X11 within the cluster is ok:
As I've mentioned befo
de BTW??
Rayson
It's in an email from me to d...@gridengine.org, dated 9/12/2011 with
subject "[PATCH] Fix PE task array job failure due to missing job script
on execd".
I'm happy for any of the gridengine forks to take it (hint).
Mark
--
-
On Tue, 29 May 2012, Mark Dixon wrote:
...
Is this the gridscheduler-developers sourceforge list?
I tried looking for a button to push or address to email to join it, but
didn't have any luck. The web frontend doesn't show anything posted to it
recently - is archiving turned off
ase you missed it, here's the nub of what I've previously said:
On Fri, 25 May 2012, Mark Dixon wrote:
...
memsw usage and virtual address space usage can wildly diverge under
fairly common use cases:
* Processes under 64-bit mode. Comparing the VIRT column in "top" with
On Wed, 30 May 2012, Mark Dixon wrote:
Hi Rayson,
Sorry to sound needy, but have you had time to consider controlling
cgroup's memory.memsw.limit_in_bytes via a new attribute defined as the
OS-dependent way to measure memory usage (typically RAM+swap), rather than
overloading h_vmem/s
limit's RLIMIT_AS, as previously?
Best wishes,
Mark
're putting into it. I've obviously not done a very good job
at being clear and concise.
All the best,
Mark
h_vmem is needed for?
Thx
Aside from the other methods mentioned in this thread, I often find the
following CLI tool useful:
https://software.sandia.gov/trac/utilib/wiki/Documentation/memmon
Mark
h, but it's a much-needed break from the past which takes
advantage of cgroups to greatly improve utilisation of the resources
available.
How does that sound?
Cheers,
Mark
On Thu, 14 Jun 2012, Mark Dixon wrote:
...
* I do NOT believe that AS should be summed when the processes of a job
are being polled. Instead it should be RSS+SWAP (or similar).
(sorry, ignore that 2nd sentence)
* I DO believe that the per-process AS setrlimit should be settable to a
value
tude of sins being passed through to my starter_method,
generally when qrsh gets involved (e.g. tightly-integrated parallel jobs),
which generally get coped with better that way.
Mark
arg_one="$1"
shift
if ! which "$arg_one" > /dev/null 2>&1; then
    eval "$arg_one" "$@"
    exit $?
fi
exec "$arg_one" "$@"
Perhaps there's a better way.
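To show what I mean, here's the wrapper exercised outside gridengine (paths invented; the first call takes the exec branch, the second the eval branch):

```shell
# Demo of the wrapper above, written to a temp file and run twice.
cat > /tmp/starter_demo.sh <<'EOF'
#!/bin/sh
arg_one="$1"
shift
if ! which "$arg_one" > /dev/null 2>&1; then
    eval "$arg_one" "$@"     # not a binary in PATH: let the shell interpret it
    exit $?
fi
exec "$arg_one" "$@"         # a real command: exec it directly
EOF
chmod +x /tmp/starter_demo.sh
/tmp/starter_demo.sh echo hello        # exec branch: prints "hello"
/tmp/starter_demo.sh 'cd / && pwd'     # eval branch: prints "/"
```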
Mark
--
-
tle more complicated - cheers!
Good luck,
Mark
PS One other thing I recall, is that I think the exact behaviour of what
is passed to the starter_method depends on the queue's shell_start_mode
setting. Ours is set to unix_behavior to try and get a more intuitive
result than you get from the de
/env to simulate
unix_behavior for reasons of backward compatibility. Should be fun.
W.Hay Esq ornate grid engine configurations a speciality
Yowsah.
Go on - what's the subtlety you've hit there?
Mark
environment variables.
Mark
cification. I doubt it will generally
behave the same as without the starter, but it's probably relatively
safe in the absence of Liverpool access to the system.
...
If there any particular use case that's concerning you here?
Cheers,
Mark
--
'm not the
biggest fan of PE customisation (taste). This may or may not be one of
them :)
Cheers,
Mark
using the same languages, as the JSV bindings? If Python or Lua
are important, perhaps JSV bindings should be written for them (if not
available already).
Mark
On Thu, 12 Jul 2012, Dave Love wrote:
Mark Dixon writes:
Things I think we've used starter_methods for in the past:
Gosh. You live in interesting times^Wclusters.
I've certainly had some interesting problems to tackle. Something's got to
keep me busy in f
This is to create some functionality that is currently missing. It's an
experiment at the moment: I'll report back if there's anything positive to
say (or negative for that matter, if people are interested).
Mark
--
----
On Tue, 11 Sep 2012, Schmidt U. wrote:
Hi Mark Dixon,
is the mentioned patch maybe a little bit helpful as well for my problem
with the virtual memory overload of the first node in massive parallel jobs
? overhead_vmem = bash_vmem + mpirun_vmem + (nodes -1)*qrsh_vmem
Udo
Hi Udo,
Apologies
y controller.
"This can may contain worms."
...
Well said! This is obviously a new area and, aside from the obvious
problems, we may still get interesting interactions with some of the more
exotic things regularly found in our environments (e.g. Lustre,
InfiniBand, etc.). It'
On Wed, 19 Sep 2012, Mark Dixon wrote:
...
"This can may contain worms."
...
Well said! This is obviously a new area and, aside from the obvious
problems, we may still get interesting interactions with some of the
more exotic things regularly found in our environments (e
a poke to see if there was any conclusion.
Mark
onment too, so I'm not sure if this
helps. Would using SGE_CLUSTER_NAME help?
I don't think so.
...
Ho-hum :)
Mark
be
really significant or just a minor stuff.
Jérémie
Depending on what your program does, the overestimate can pretty much be
as big as you like (until you hit architectural 32-bit/64-bit limits).
Mark
On Mon, 16 Jul 2012, Mark Dixon wrote:
...
* Transparent (from the job script's perspective) serial BLCR
integration
Could you post the recipe/code? DMTCP is facing the knife for exactly
that, but C++ encourages displacement activities.
Sure, I'll dig it out (to follow under
about tackling this bug?
...
Hi Orlando,
Great to hear you're game :)
Are you after advice on debugging gridengine, or pointers for this bug
specifically?
Mark
s the qmaster is being given
to be used in the accounting file and the share tree.
If you're unlucky, the problem is in how the qmaster aggregates, records
and decays the share tree values over time.
If you're really unlucky, the problem might only occur if the vario
rted while
it's queuing, you'll probably lose the job entirely.
TTFN
Mark
ungrid-dev machine localhost:16.0? Shouldn't it
be :0?
...
You might want to read the bottom of the following post:
http://gridengine.org/pipermail/dev/2011-November/70.html
Mark
). It could do MIT-MAGIC-COOKIE-1,
but had to be hit until it did it.
Mark
ers,
Mark
details.
If you don't have the same amount of RAM everywhere, you might also want
to play with "usage_scaling" parameters in the execd host definitions.
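For example (host name and factors invented here), a host with less RAM than its peers could have its reported memory usage scaled up so the share tree treats it consistently:

```
# hypothetical execd host configuration excerpt ("qconf -me smallnode01"):
hostname        smallnode01
usage_scaling   cpu=1.000000,mem=2.000000,io=1.000000
```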
Good luck :)
Mark
's usage via a usage_scaling
of cpu=0.00,mem=0.00,io=0.00.
All the best,
Mark
On Tue, 30 Apr 2013, Dave Love wrote:
Mark Dixon writes:
We use the share tree here, rather than the functional policy, so this
might not be applicable.
By default, the "usage" of a job is wholly based on slots*seconds.
I think it's only (effectively) s
formula to calculate the normalized value?
...
If you're looking at that level of detail, I would read the source code if
I were you. I'd be interested to hear what you find.
All the best,
Mark
--
-
em.
All the best,
Mark
better way for doing that?
...
Hi Christoph,
We normally use an advance reservation to do this, draining node(s) for a
particular point in time:
man qrsub
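For example (date, duration and PE name all invented here):

```
# Reserve 96 slots for 2 hours from midday on 2016-10-04, usable by anyone:
qrsub -a 201610041200 -d 2:0:0 -pe mpi 96 -u '*'
qrstat      # note the AR id; jobs then target it with "qsub -ar <id>"
```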
All the best,
Mark
to call me naive, but this announcement sounds like good news to
me - Oracle were clearly not interested in gridengine. Congratulations to
Fritz, the engineers and Univa :)
Mark
ound in Univa's public git repo here:
https://github.com/gridengine/gridengine
Alternatively, it was integrated into Son of Gridengine some time ago.
All the best,
Mark
ority when they shouldn't.
Given that ticket allocations are calculated afresh every scheduling
interval, I don't think there's any point in errored jobs attracting
tickets like this.
Is that right?
Mark
matters? Or is that what you meant?
Cheers,
Mark
(sorry for delay - got distracted on a training course)
then submitting the array into
that. Not sure which option I favour, still.
The AR approach sounds horrendous!
Mark
On Mon, 10 Mar 2014, Joshua Baker-LePain wrote:
I'm hoping to get some idea as to the status and future of OGS/GE. As
background, we're a moderately sized (4000+ cores) academic cluster and
have been running SGE for several years (we've been through versions 6,
6.1, 6.2, and are now running OGS
it explained to me by reuti on
this list a while ago...)
Reuti's great for that :)
Mark
Hi,
I think we've been bitten by something that others have seen and brought
up on this list over the years, where the amount of usage reported in the
share tree can become unexpectedly large when using task array jobs.
I am trying to reproduce this on a test install, to take a closer look at
On Fri, 4 Apr 2014, Joshua Baker-LePain wrote:
...
If you have any issues recreating this, let me know and I'll see if I
can still do so.
...
Thanks for the pointers :)
Mark
On Fri, 4 Apr 2014, Cameron Brunner wrote:
Mark -
I analyzed this issue on a UGE 8.1.4 cluster last fall and found that this
process was the simplest way to get some overbooking reported.
...
Thanks for the pointers :)
Very useful Cameron and Joshua, thanks again for taking the trouble to
On Fri, 4 Apr 2014, Cameron Brunner wrote:
Mark -
I analyzed this issue on a UGE 8.1.4 cluster last fall and found that this
process was the simplest way to get some overbooking reported. I think any
version of SGE with original share tree code will show this issue. The
following steps show
On Tue, 30 Sep 2014, Derrick Lin wrote:
...
I am trying to configure SSH as underlying protocol for qrsh, qlogin.
However, this requires allowing users to SSH into compute nodes. In such
case, users can simply go to compute nodes with SSH, bypassing SGE (qrsh,
qlogin etc).
I am wondering what th
On Mon, 13 Oct 2014, Prentice Bisbal wrote:
...
I think what he wants to do is this, which is actually a pretty common
desire:
1. Not let users ssh directly into cluster nodes and bypass the scheduler.
2. If a user is in a qrsh or qlogin session and has requested multiple
nodes, for debugging p
On Thu, 30 Apr 2015, Chris Dagdigian wrote:
...
- Does my use of "-p" to send lower-than-zero values for my submitted jobs
affect just MY jobs and the order in which they get dispatched or will I end
up penalizing myself globally because all the other jobs from other users on
the cluster are ru
On Thu, 30 Apr 2015, Fritz Ferstl wrote:
Nah, the weight_priority won't help. It just determines how much influence
the -p has vs things like job wait time or urgency. If you have none of those
then all being equal it would have the same effect as if you left it
untouched. And if you have infl
On Tue, 4 Feb 2014, Mark Dixon wrote:
...
Over the years, we've run various versions of SGE and SoGE with a share
tree policy. I think it's always been the case that jobs in an error state
still attract tickets from the share tree policy - despite the fact that
the job isn't
On Tue, 10 Nov 2015, Marlies Hankel wrote:
...
We are using OGS/Grid Engine 2011.11. I have recently implemented a fair
share policy which seems to work OK. However, on occasion, when a user comes
up to a deadline I would like to advance them up the queue. Previously I
could change their priori
On Fri, 4 Apr 2014, Joshua Baker-LePain wrote:
On Fri, 4 Apr 2014 at 8:45am, Mark Dixon wrote
I think we've been bitten by something that others have seen and
brought up on this list over the years, where the amount of usage
reported in the share tree can become unexpectedly large when
Hi there,
Is there any interest for a meeting in the UK looking at the internals of
gridengine? Potential topics might be:
* Building from source
* How the code is organised
* How to debug or develop gridengine
The principles discussed ought to be applicable to any flavour of
gridengine that
On Thu, 8 Sep 2016, William Hay wrote:
...
At present we're using a huge swap partition and TMPFS instead of btrfs.
You could probably do this with a volume manager and creating a
regular filesystem as well but it would be slower.
...
Hi William,
I always liked your idea for handling scratch s
On Thu, 8 Sep 2016, William Hay wrote:
...
Remember tmpfs is not a ramdisk but the linux VFS layer without an
attempt to provide real file system guarantees. It shouldn't be cached
any more aggressively than other filesystems under normal circumstances.
Most of the arguments against it seem to
On Tue, 4 Oct 2016, Derrick Lin wrote:
...
I have had a simple implementation working. Now I need to look at a
situation when -pe is specified. It looks like the accurate way to
determine host/slot allocation is to get from $pe_hostfile. But
$pe_hostfile seems to be available only in start_proc
On Tue, 4 Oct 2016, Reuti wrote:
...
Do you mean your implementation or the general behavior? The $TMPDIR
will be created when a `qrsh -inherit ...` spawns a process on a node and
is removed once it returns.
...
Hi Reuti,
Thanks: I thought $TMPDIR only appeared on the host with the MASTER task
On Tue, 4 Oct 2016, William Hay wrote:
...
I have a per-job consumable and the TMPDIR filesystem is created on
every node of the job. We have a (jsv enforced) policy that all
multi-node jobs have exclusive access to the node and run on identical
nodes so it works as a faux per-host consumable.
On Tue, 4 Oct 2016, Reuti wrote:
...
Yeah, I had the idea of different temporary directories some time ago,
as some applications like Molcas need a persistent one across several
`mpiruns` on each node and how to delete them again.
...
Hi Reuti,
Interesting!
I wonder what happens with $TMPDIR
On Wed, 5 Oct 2016, William Hay wrote:
...
It was originally head node only so per job until a user requested local
TMPDIR on each node so historical reasons.
...
Hi William,
What do you do with people who want to keep the contents of $TMPDIR at the
end of the job?
It's easy to use the epilog
On Wed, 5 Oct 2016, William Hay wrote:
...
Our prolog and epilog (parallel) ssh into the slave nodes and do the
equivalent of run-parts on directories full of scripts some of which
check if they are running on the head node of the job before doing
anything. If we did want the epilog to save TMP
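For the record, the saving end of that can be pretty small; a hedged sketch (destination path invented, environment faked so it runs standalone):

```shell
# Epilog fragment: copy whatever the job left in $TMPDIR to a keep area
# before gridengine removes the scratch directory.
TMPDIR=$(mktemp -d)                    # faked here; gridengine sets these for real
JOB_ID=1234
echo "result" > "$TMPDIR/out.txt"

save_dir="/tmp/keep_demo/job_$JOB_ID"  # invented destination
mkdir -p "$save_dir"
cp -r "$TMPDIR"/. "$save_dir"/
rm -rf "$TMPDIR"
```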
Hi,
I've been playing with allocating GPUs using gridengine and am wondering
if I'm trying to make it too complicated.
We have some 24 core, 128G RAM machines, each with two K80 GPU cards in
them. I have a little client/server program that allocates named cards to
jobs (via a starter method
On Tue, 14 Feb 2017, William Hay wrote:
...
We tweak the permissions on the device nodes from a privileged prolog
but otherwise I suspect we're doing something similar.
Hi William,
Yeah, but I've put the permission tweaker in the starter, as that fits our
existing model a bit better (looking
On Tue, 14 Feb 2017, William Hay wrote:
...
qsub -ac Template=GPU -l gpu=1 script
Your jsv could spot that a Template had been requested and fill in
sensible defaults based on other requests. If no template is requested
users have access to the full power of this fully operational command
l
On Tue, 14 Feb 2017, William Hay wrote:
...
options nvidia NVreg_ModifyDeviceFiles=0
...
Hi William,
Many thanks, much appreciated :)
Mark
___
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
On Tue, 14 Feb 2017, William Hay wrote:
...
Our prolog does a parallel ssh(passing through appropriate envvars) into
every node assigned to the job and does the equivalent of a run-parts on
a directory filled with scripts. Some of these scripts check if they
are running on the head node.
(be
On Mon, 27 Mar 2017, Joshua Baker-LePain wrote:
Investigating my odd 'some commands don't generate output' issue revealed a
couple other issues. First, on my old cluster (running OGS 2011.11p2), only
a very few hosts are admin hosts -- notably, the exec nodes are not. On those
nodes, 'qconf -
Hi,
It's this bit that's doing it: "SGE_ND=true". It's there so that the
qmaster doesn't daemonise, in order to play nicely with systemd.
Unfortunately, as it was originally put in to aid debugging, it also
enables some debug messages.
If too much is being generated, I'd suggest either redir
Hi Jakub,
That's right: if you need to cut down the logging, one option is to add
the redirection in the start script.
You're looking for the line starting "sge_qmaster", and you might want to
try adding a ">/dev/null" after it. You'll lose all syslog messages from
sge_qmaster though (normal
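In other words, something along these lines in the start script (the exact surrounding line varies between versions):

```
# hypothetical excerpt: discard the debug chatter sge_qmaster now emits
sge_qmaster >/dev/null
```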
Hi William,
I've seen this before back in the SGE 6.2u5 days when it used to write out
core binding options it couldn't subsequently read back in.
IIRC, users are read from disk at startup in turn and then the files are
only written to from then on - so this sort of thing only tends to be
no
On Mon, 16 Apr 2018, William Hay wrote:
...
I don't think that can be right given that the qmaster complains about
multiple user files on start up. If it gave up after the first then
presumably it wouldn't complain about the others.
All I know is that, when we had this sort of problem, most o
On Tue, 17 Apr 2018, Joshua Baker-LePain wrote:
As an alternative to fixing our current setup, I'd be most interested to
hear if/how other folks are handling GPUs in their SoGE setups. I was
considering changing the slot count in gpu.q to match the number of GPUs
in a host (rather than CPU core
Hi Daniel,
Well done on wanting to work on gridengine, it's really good to see people
interested.
Although the topmost layers have clearly suffered from years of applying
patches on top of patches on top of patches and so are in sore need of a
bit of refactoring, there are some really nice b
On Tue, 5 Jun 2018, Ilya M wrote:
...
Is there a way to submit AR when there are projects attached to queues?
I am using SGE 6.2u5.
...
Hi Ilya,
I've run into this, too: I'm afraid that there isn't. I logged it here:
https://arc.liv.ac.uk/trac/SGE/ticket/1466
I started to fix it but ran out
projects from queues' configuration.
Ilya.
On Wed, Jun 6, 2018 at 2:41 AM, Mark Dixon wrote:
On Tue, 5 Jun 2018, Ilya M wrote:
...
Is there a way to submit AR when there are projects attached to queues? I
am using SGE 6.2u5.
...
Hi Ilya,
I've run into this, too: I'm afraid that