Thanks - I'll try doing it that way and hopefully we can get everything
stable again.
Thanks to everyone for quick reply - and a fix!
Simon.
On 08/06/2015 18:08, "Alex Chekholko" wrote:
>IIRC the JOB vs YES for h_vmem for parallel vs batch is a "known issue".
>
>So we modified our environment
What was the "grid reconfiguration"?
On 06/08/2015 11:42 AM, Dan Hyatt wrote:
We are running a binary program called metaanalysis, which the user says
was working prior to a grid reconfiguration.
qsub -cwd -b y /dsg_cent/bin/metal < c22srcfile.txt > c22SBP.log
This starts, runs, creates the logs, and then fails to create the data files.
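One thing worth double-checking here (a hedged aside, not a confirmed diagnosis): in `qsub -b y cmd < in > out`, the redirections are performed by the submitting shell, so `c22srcfile.txt` feeds qsub itself and `c22SBP.log` captures qsub's "Your job ... has been submitted" message rather than the program's output on the execution host. A wrapper script keeps the redirections with the job; `run_metal.sh` is a hypothetical name:

```shell
# Sketch: move the redirections into a script that runs on the exec host.
cat > run_metal.sh <<'EOF'
#!/bin/sh
/dsg_cent/bin/metal < c22srcfile.txt > c22SBP.log
EOF
chmod +x run_metal.sh
# then submit the wrapper instead of the bare binary:
#   qsub -cwd run_metal.sh
```

This way stdin and stdout are opened in the job's working directory on the compute node, not on the submit host.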
The most common scenario when "it works from the command line" but "it does
not work in grid engine" is:
- A different shell environment between command-line and batch execution
(especially if SGE's shell_start_mode is set to posix_compliant)
- Different environment variables between the CLI and the batch environment
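A quick way to check both points above is to capture the environment a batch job actually sees and diff it against the interactive shell (file names here are arbitrary):

```shell
# Record the interactive environment on the submit host.
env | sort > cli_env.txt
# Submit the same probe through grid engine (run this on a submit host):
#   qsub -cwd -b y -o batch_env.txt -j y env
# When the job finishes, compare the two:
#   sort batch_env.txt | diff cli_env.txt -
```

Differences in PATH, LD_LIBRARY_PATH, or locale variables are the usual suspects.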
On 06/08/2015 12:49 AM, William Hay wrote:
On Fri, 5 Jun 2015 22:16:12 +
Alex Chekholko wrote:
Hi all,
I have a standard grid engine cluster (sge-8.1.8 tarball from Dave
Love's site) where users use qlogin to get interactive shells on compute
nodes, and we use a qlogin wrapper script to enable X11 forwarding
IIRC the JOB vs YES for h_vmem for parallel vs batch is a "known issue".
So we modified our environment back to "YES" instead of "JOB" and
adjusted the rest of the environment appropriately, e.g. your jsv
So our h_vmem requests are all "per slot".
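For reference, the restored per-slot variant looks something like this in the complex definition (a sketch of one `qconf -sc` line; every field except the consumable column is an assumption about this site's setup):

```
#name    shortcut   type    relop  requestable  consumable  default  urgency
h_vmem   h_vmem     MEMORY  <=     YES          YES         0        0
```

Under the previous configuration the consumable column would have read JOB instead of YES.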
On 6/8/15 9:00 AM, Simon Andrews wrote:
Thanks for replying!
Am I reading that right, that if the resource is allocated per job then it
doesn't actually need to be available?
If that's the case, what is the correct way to set up a job level resource
which we can use for scheduling? I suppose I could change the resource to
be slot level
Hi Simon,
Since you defined h_vmem as "JOB", note what the manual says:
"
A consumable defined by 'y' is a per slot consumable, which
means the limit is multiplied by the number of slots being
used by the job before being applied. In case of 'j' the
consumable is a per job consumable.
"
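The per-slot ('y'/YES) multiplication described in the manual can be sketched with illustrative numbers (the 4-slot, 2G values are assumptions, not from the thread):

```shell
# Per-slot semantics: the request is multiplied by the slot count
# before being charged against the host's h_vmem pool.
slots=4          # e.g. -pe smp 4
h_vmem_gb=2      # e.g. -l h_vmem=2G, requested per slot
total=$((slots * h_vmem_gb))
echo "scheduler reserves ${total}G for this job"   # reserves 8G
```

With 'j'/JOB the 2G figure would apply to the whole job regardless of slot count.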
Having done a bit of investigation it seems that the problem we're hitting is
that our h_vmem limits aren't being respected if the jobs are being submitted
as parallel jobs.
If I put two jobs in:
$ qsub -o test.log -l h_vmem=1000G hostname
Your job 343719 ("hostname") has been submitted
$ qsub
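The parallel counterpart of the test above would look roughly like this (a sketch; it assumes a parallel environment named 'smp' exists, and hedges on the exact behaviour of the JOB-consumable bug):

```
# With h_vmem consumable set to JOB, the parallel form of the same
# oversized request may be accepted even though no node can satisfy it:
$ qsub -o test.log -pe smp 2 -l h_vmem=1000G hostname
```

If the serial job is rejected or held while the parallel one is dispatched, that matches the "known issue" mentioned elsewhere in this thread.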
Our cluster seems to have ended up in a strange state, and I don't understand
why.
We have set up h_vmem to be a consumable resource so that users can't exhaust
the memory on any compute node. This has been working OK and in our tests it
all seemed to be right, but we've now found that somehow
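The usual way to set up such a per-node memory pool (a sketch; the host name and 64G figure are assumptions, not this site's values) is to give each execution host a complex_values entry for the consumable:

```
# Give node01 a 64G h_vmem pool for consumable accounting:
qconf -mattr exechost complex_values h_vmem=64G node01
# Running jobs' h_vmem requests are then debited against that pool,
# so the scheduler refuses to overcommit the node's memory.
```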