2011/10/4 Carlos Fernández Iglesias:
> Hello,
>
> Is there a way to associate a job to a reservation so it would only
> execute when the reservation starts and in the node the reservation is made?
>
> Thanks.
The -ar flag to qsub does this I believe.
William
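For later readers, the two-step usage might look like this (the AR id 42, date, and PE name are invented for the example; qrsub prints the real id when the reservation is granted):

```shell
# Sketch: reserve 4 slots for an hour starting at noon on 1 Dec (illustrative),
# then bind a job to that reservation with -ar. The job stays in "qw" until
# the AR starts and runs only on the reserved nodes.
qrsub -a 201112011200 -d 3600 -pe mpi 4   # prints the new AR id, e.g. 42
qsub -ar 42 -pe mpi 4 job.sh
```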
On 4 October 2011 15:40, Schmidt U. wrote:
> Dear all,
> sometimes I have trouble with array jobs.
> e.g. #$ -t 1-5000
> Then it happens that some jobs are dispatched and some are rejected back
> to "qw". In that case the "touched" queues are set into "E" state.
> I have a cron job to "qmod -cq "*"
On 6 October 2011 09:39, wzlu wrote:
> Dear All,
>
> There are 144 nodes in my queue and I configured 1 slot for each node. That
> is 144 nodes with 144 slots.
> The PE is using 121 slots now. One job needs 12 of the PE's slots and there
> are enough nodes and slots for this job.
> But it is queued with "cannot
On 7 October 2011 09:45, Balint Takacs wrote:
> Can I somehow change the *relative* priority of my own jobs?
> I am working in a company environment where lots of people are competing for
> grid resources, and jobs usually have to queue. Some of my jobs are more
> important than others, but they so
On 11 October 2011 12:55, Reuti wrote:
> Am 10.10.2011 um 20:46 schrieb Gerald Ragghianti:
>
>> We have a cluster consisting of 48-core compute nodes where we need to run
>> parallel (MPI) jobs across nodes. There is a hardware limitation on the QDR
>> Infiniband cards that limits the available
On 11 October 2011 23:33, Gerald Ragghianti wrote:
>
>> Like the OP mentioned, one could use a consumable complex for 6.1. If you
>> add "complex_values network=16" to the queue, and "load_thresholds
>> network=15" it will be pushed to alarm state automatically and you can avoid
>> the load sen
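For reference, the two settings quoted would sit in the queue configuration (qconf -mq <queue>), assuming "network" has already been defined as a consumable complex via qconf -mc:

```
complex_values    network=16
load_thresholds   network=15
```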
Grid engine allows you to define:
a) prolog in the grid engine config (qconf -sconf)
b) prolog in the queue definition (qconf -sq)
c) start_proc_args in a PE definition.
Is the order in which these are run defined anywhere?
Likewise for epilog and stop_proc_args.
I'm hoping I can avoid having to re
On 18 October 2011 10:58, Reuti wrote:
> Am 18.10.2011 um 11:42 schrieb William Hay:
>
>> Grid engine allows you to define:
>> a) prolog in the grid engine config (qconf -sconf)
>> b)prolog in the queue definition (qconf -sq)
>> c)start_proc_args in a pe definition.
We've purchased a license for Gaussian with support for parallelism
via Linda. A quick google doesn't turn up any tight integrations for
Linda/SGE.
i) Does anyone have a working tight integration config for Linda? Or
even a Gaussian-specific one?
ii) If not, does anyone have experience of running Li
On 4 November 2011 07:24, Johan Finstadsveen wrote:
> Hi,
> Unsure whether this is the correct forum for this debate.
>
> We are currently in the process of acquiring a gpu-cluster. From before we
> have a cpu-based cluster running Rocks 5.3 and SGE. The desire from the
> users is to have three di
On 10 November 2011 03:46, Ron Chen wrote:
>
> 4) Fritz was telling customers (including William Hay) that open source Grid
> Engine is "buggy, unstable, hard to debug", and to use SGE in production
> customers need to buy support from Univa.
I should point out this wa
On 9 November 2011 16:22, Rayson Ho wrote:
> The Open Grid Scheduler Project is releasing a new release: Grid
> Engine 2011.11. We are going back to the open source model that was
So the software is still called Grid Engine even though the project is
Open Grid Scheduler?
> used by Sun Microsystem
Looking at the various rsh-impersonating qrsh wrappers provided with
SGE I notice that the difference between the mpi/rsh and the
mpi/openmpi/rsh wrapper is that the openmpi variant uses the -V option
to pass all environment variables through to the slave processes.
What if anything is the downside
Are there any guides to doing a tight integration between SGE and
Intel MPI? There is a guide to loose integration on the Intel website
with a comment suggesting that the mpich2_mpd integration on the
sunsource site should work (presumably the same as the mpd section at
http://arc.liv.ac.uk/SGE/ho
On 14 November 2011 12:50, Reuti wrote:
> Hi,
>
> Am 14.11.2011 um 13:41 schrieb William Hay:
>
>> Are there any guides to doing a tight integration between SGE and
>> Intel MPI? There is a guide to loose integration on the Intel website
>> with a comment
On 14 November 2011 14:28, Reuti wrote:
> Am 14.11.2011 um 15:24 schrieb William Hay:
>
>> On 14 November 2011 12:50, Reuti wrote:
>>> Hi,
>>>
>>> Am 14.11.2011 um 13:41 schrieb William Hay:
>>>
>>>> Are there any guides to doing a t
On 16 November 2011 03:29, Vang Le wrote:
> Hello GridUsers,
> My grid is running, it can deliver jobs, but they only run on one node at a
> time.
> When I tried running with mpirun in a batch script, i get errors like
> "execution daemon on host didn't accept task" as shown at the
> bottom
On 16 November 2011 00:10, Dave Love wrote:
> William Hay writes:
>
>> On 10 November 2011 03:46, Ron Chen wrote:
>>
>>>
>>> 4) Fritz was telling customers (including William Hay) that open source
>>> Grid Engine is "buggy, unstable, hard to d
On 16 November 2011 09:38, Reuti wrote:
> While I myself use SGE on all machines I set up, we have access to a
> cluster using Torque and I noticed something similar. Besides that, we need a
> tight integration of parallel jobs using the Linda library (i.e. Gaussian),
> and as there is nothi
I added PROFILE=1 to the params of sched_conf in order to measure
where the scheduler was spending its time. A fortuitous side effect
is that any "job BLAH should have finished since" lines appear
between the PROF: sge_mirror and PROF: static urgency lines. We have
a script that pro
On 16 November 2011 09:38, Reuti wrote:
> Am 16.11.2011 um 10:24 schrieb William Hay:
>
>> On 16 November 2011 00:10, Dave Love wrote:
>>> William Hay writes:
>>>
>>>> On 10 November 2011 03:46, Ron Chen wrote:
>>>>
>>>>>
>
On 16 November 2011 11:51, Reuti wrote:
> Am 16.11.2011 um 12:45 schrieb William Hay:
>
>>>
>>> While I on my own use SGE on all machines I set up, we have access to a
>>> cluster using Torque and I noticed something similar. Besides that we need
>>>
On 16 November 2011 13:52, Vang Le wrote:
> Hi William and Reuti,
> Thank you for your suggestions and your time. They are really helpful. I
> solved almost all of my problems.
>
> I installed rsh-redone-client and rsh-redone-server, and I also modified my
> PE so that "control_slaves TRUE" is set. I can ru
On 16 November 2011 13:52, Vang Le wrote:
> I googled and there was something mentioned about editing /etc/hosts.equiv
> file to permit rsh and rlogin without password. However, typing "qconf
> -mconf" at the management host, I saw this:
>
> rlogin_daemon /usr/sbin/sshd -i
> r
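The quoted line comes from the cluster configuration; an ssh-based remote-startup setup (a common sketch, not the only way, and assuming host- or key-based ssh authentication between nodes is already working) replaces the rsh pairs like so (qconf -mconf):

```
rlogin_daemon   /usr/sbin/sshd -i
rlogin_command  /usr/bin/ssh
rsh_daemon      /usr/sbin/sshd -i
rsh_command     /usr/bin/ssh
qlogin_daemon   /usr/sbin/sshd -i
```

With a setup like this no /etc/hosts.equiv entries are needed, since authentication is handled by sshd rather than rsh.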
On 16 November 2011 23:44, Dave Love wrote:
> William Hay writes:
>> The main issue we
>> currently have with SGE is the time a scheduling cycle takes. We're
>> currently trying to tweak the configuration to minimise the work SGE
>> has to do while still impl
On 17 November 2011 10:27, Reuti wrote:
> The wrappers are no longer used if you use a recent version of Open MPI
> (compiled --with-sge) or MPICH2. Both call `qrsh -inherit -V ...` directly if
> they discover that they are running under SGE, and entries in
> start-/stop_proc_args can be s
On 18 November 2011 14:21, Gerard Henry wrote:
> hello all,
>
> I'm having trouble configuring a queue on SGE 6.2u5 (Linux)
>
> I have two machines amd64, with this topology: SCCSCC so the total of
> cores is 8.
>
> first, i defined a group:
> # qconf -shgrp @qlong
> group_name @qlong
> hostlist charyb
On 19 November 2011 04:53, mahbube rustaee wrote:
> Hi,
> I defined a queue on @node-grp (a group of nodes).
> I defined mpi2 parallel environment as:
> start_proc_args /opt/gridengine/mpi/startmpi.sh $pe_hostfile
> stop_proc_args /opt/gridengine/mpi/stopmpi.sh
> allocation_rule 2
> c
On 19 November 2011 05:03, mahbube rustaee wrote:
> Hi,
>
> I define slots of all hosts with:
> {
> name limit-slots-of-hosts
> description limits slots of the cluster's hosts
> enabled TRUE
> limit hosts {@gpu} to slots=48
> limit hosts {@xeon} to slots=24
On 19 November 2011 09:58, mahbube rustaee wrote:
>
>
> On Sat, Nov 19, 2011 at 12:05 PM, William Hay wrote:
>>
>> On 19 November 2011 04:53, mahbube rustaee wrote:
>> > Hi,
>> > I defined a queue on @node-grp (a group of nodes).
>> > I defined mp
On 22 November 2011 20:05, Chris Dagdigian wrote:
>
> Hi folks,
>
> I'm hands-on with a shiny new cluster running Univa's 8.0.1 release and
> am having some issues running jobs as a non-root user via an account
> that lives in Active Directory.
>
> The cluster is the standard sort of RHEL 5.7 base
Are there any instructions for getting CFX working under tight
integration? It appears to work OK loosely integrated but it doesn't
appear to work under our existing integrations. If we use an
rsh-resembling wrapper around qrsh I get the following output:
+ cfx5solve -max-elapsed-time '14 [min]'
On 24 November 2011 12:59, Reuti wrote:
> Am 24.11.2011 um 12:51 schrieb William Hay:
>
>> Are there any instructions for getting CFX working under tight
>> integration? It appears to work OK loosely integrated but it doesn't
>> appear to work under our existing in
B_ID and TASK_ID to a file just
before invoking qrsh and all seems sensible).
William
>
> Brian
>
> -Original Message-
> From: wish.dum...@gmail.com [mailto:wish.dum...@gmail.com] On Behalf Of
> William Hay
> Sent: Saturday, November 26, 2011 2:21 AM
> To: Murphy, B
128 > 2147483648
A user has submitted a job requesting 128 slots:
qstat -j produces the following output:
parallel environment: qlc-[1ABCDEFGHIJTWKLMNOPX] range: 128
qalter -w v produces the following:
Job 404311 cannot run in PE "qlc-H" because it only offers 2147483648 slots
Job 404311 cannot
On 6 December 2011 09:48, Reuti wrote:
> Hi,
>
> Am 06.12.2011 um 10:04 schrieb William Hay:
>
>> 128 > 2147483648
>>
>> A user has submitted a job requesting 128 slots:
>> qstat -j producing the following output:
>> parallel environment: qlc-[1A
On 6 December 2011 10:21, Reuti wrote:
> Hi,
>
> Am 04.12.2011 um 11:57 schrieb mahbube rustaee:
>
>> I defined an exclusive tag in complex resources for users that can request
>> "-l excl=1 "
>> Such users, lock free slots of hosts that have added excl=true to
>> consumable resources.
>>
>> 1
On 6 December 2011 13:10, Reuti wrote:
> Am 06.12.2011 um 12:16 schrieb William Hay:
>
>> On 6 December 2011 09:48, Reuti wrote:
>>> Hi,
>>>
>>> Am 06.12.2011 um 10:04 schrieb William Hay:
>>>
>>>> 128 > 2147483648
>>>
On 6 December 2011 16:03, Reuti wrote:
> Am 06.12.2011 um 17:01 schrieb William Hay:
>
>> On 6 December 2011 13:10, Reuti wrote:
>>> Am 06.12.2011 um 12:16 schrieb William Hay:
>>>
>>>> On 6 December 2011 09:48, Reuti wrote:
>>>>> H
On 6 December 2011 16:44, Reuti wrote:
> Am 06.12.2011 um 17:32 schrieb William Hay:
>
>> On 6 December 2011 16:03, Reuti wrote:
>>> Am 06.12.2011 um 17:01 schrieb William Hay:
>>>
>>>> On 6 December 2011 13:10, Reuti wrote:
>>>>> Am 0
One of our users is complaining that jobs they have put on hold with
qalter -h u are becoming unheld without intervention from them. Is
there any practical way to investigate this? AFAICS Grid Engine
doesn't provide logging of hold and release events. I would expect it
to be enabled by the joblo
On 7 December 2011 11:24, Reuti wrote:
> Am 07.12.2011 um 09:33 schrieb William Hay:
>
>> On 6 December 2011 16:44, Reuti wrote:
>>> Am 06.12.2011 um 17:32 schrieb William Hay:
>>>
>>>> On 6 December 2011 16:03, Reuti wrote:
>>>>> Am 0
On 7 December 2011 11:24, Reuti wrote:
> Can you try to create a copy of the job with `qresub` and change for the copy
> the resource requests like a time limit. Any change?
Well here I think I've found where my little trick with the JSV does
bite me. If I qresub this job then it is resubmitte
On 10 December 2011 09:01, mahbube rustaee wrote:
> Hi all,
>
> some crafty users submit jobs, save the output they need, then run qdel and
> delete the job on sge.
> There isn't any accounting record for such jobs even though the user has
> saved the outputs.
>
> Any suggestion to prevent the mentioned scenario?
On 13 December 2011 15:25, Lars van der bijl wrote:
> Hey everyone,
>
> we have been running our sge for a while now but we implemented a new
> technique and I'm having trouble figuring out how to make the grid
> help with it.
>
> I have the following task / dependency structure.
>
> task1
>
> tas
Possibly assigning a fair share to each job with -js would cause them
to change priority between scheduling runs so different jobs would
snaffle the reservations on each run.
On 13 December 2011 15:57, Lars van der bijl wrote:
> hey Reuti,
>
> I wrote a python api using networkx and a database la
On 14 December 2011 07:50, mahbube rustaee wrote:
> Hi ,
>
> 1) By default the prolog's output goes into the user's job output file.
> How can I set another path/filename for the prolog output?
Well, since your prolog appears to be a shell script, add a line at the
top of the script:
exec >/location/of/prolog/output
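Fleshed out slightly (the directory is a placeholder, 2>&1 additionally captures stderr, and $JOB_ID is assumed to be set in the prolog's environment):

```shell
#!/bin/sh
# First line of the prolog: divert everything it prints to its own file.
exec > /var/spool/sge/prolog-output/$JOB_ID.log 2>&1
```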
>
> 2) I set
On 13 December 2011 19:11, Christoph Müller
wrote:
> Hi Reuti,
>
>> -Ursprüngliche Nachricht-
>> Von: Reuti [mailto:re...@staff.uni-marburg.de]
>> Gesendet: Dienstag, 13. Dezember 2011 19:20
>> An: Christoph Müller
>> Cc: users@gridengine.org
>> Betreff: Re: AW: AW: [gridengine users] Acce
On 14 December 2011 08:53, Christoph Müller
wrote:
> Hi William,
>
>> -Ursprüngliche Nachricht-
>> Von: wish.dum...@gmail.com [mailto:wish.dum...@gmail.com] Im Auftrag
>> von William Hay
>> Gesendet: Mittwoch, 14. Dezember 2011 09:47
>> An:
On 13 December 2011 23:46, Gowtham wrote:
>
> In some of our Rocks 5.4.2 clusters running SGE
> 6.2u5, I have been noticing the load average on
> several compute nodes being significantly higher
> than others when all cores/processors in all
> compute nodes involved are doing about the same
> amou
On 14 December 2011 10:06, Christoph Müller
wrote:
> Hi William,
>
>> -Ursprüngliche Nachricht-
>> Von: wish.dum...@gmail.com [mailto:wish.dum...@gmail.com] Im Auftrag
>> von William Hay
>> Gesendet: Mittwoch, 14. Dezember 2011 10:09
>> An:
The schedule file contains lots of information on jobs. I believe
that the 4th field for a RUNNING job is the start time of the job (in
seconds since the epoch). Can someone (or better yet some docs)
confirm this, and if so, is it guaranteed to match up with the start
time for the head node in the a
On 16 December 2011 14:21, Reuti wrote:
> Am 16.12.2011 um 14:08 schrieb William Hay:
>
>> The schedule file contains lots of information on jobs. I believe
>> that the 4th field for a RUNNING job is the start time of the job (in
>> seconds since the epoch).
>
> I
On 21 December 2011 20:53, Rayson Ho wrote:
> Hi Dave,
>
> Is the original wiki really under a free license? I could not find
> references that explicitly give anyone the permission to use it.
>
I don't know but a colleague of mine noticed the disappearance as
well. We found a copy of the page we
On 22 December 2011 14:04, Dave Love wrote:
> William Hay writes:
>
>> I don't know but a colleague of mine noticed the disappearance as
>> well. We found a copy of the page we were looking for in the wayback
>> machine so Dave isn't the only one to copy it.
>
According to:
http://arc.liv.ac.uk/pipermail/gridengine-users/2010-December/033190.html
The mpiexec.hydra provided with Intel MPI doesn't tightly integrate
with Grid Engine and one therefore has to use MPD (which is a pain).
However the MPICH2 FAQ
claims(http://wiki.mcs.anl.gov/mpich2/index.php/Fr
On 10 January 2012 15:40, Reuti wrote:
> Am 10.01.2012 um 16:03 schrieb William Hay:
>
>> According to:
>> http://arc.liv.ac.uk/pipermail/gridengine-users/2010-December/033190.html
>> The mpiexec.hydra provided with Intel MPI doesn't tightly integrate
>> with G
On 12 January 2012 11:41, Semi wrote:
> I need to setup high and low priority queues for the same nodes.
> I preferred to make it without subordinate lists.
> I know, that the following parameters are dealing with this:
> seq_no 10
The seq_no is used to determine which queue a j
On 27 January 2012 08:03, Gerard Henry wrote:
> hello all,
> sorry if this is a trivial question, but I can't find how to change a
> pending job from one queue to another:
> # status -a
qalter -q big
should do it
> ...
> queue used free
> --
> CLUSTER 0
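i.e. something like this (the job id and queue name are invented for the example):

```shell
qalter -q big.q 12345   # retarget pending job 12345 at queue big.q
```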
On 27 January 2012 14:01, Martin Gumbau wrote:
> Hi,
>
> I don't known if it is possible and the best way to make it (if was
> possible):
>
> SCENARIO:
>
> - 2 Cells (cell-A and cell-B)
>
> - Cell-A have 3 queues (q1-A, q2-A,q3-A)
>
> - Cell-B have 2 queues (q1-B, q2-B)
>
> ONLY IN CELL cell-A
>
>
On 31 January 2012 08:09, Anton Löfgren wrote:
> Any hints or insight on how this works by default would be much appreciated.
sudo?
>
> Regards,
> Anton
>
___
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
On 31 January 2012 08:55, Anton Löfgren wrote:
> Really? Or is that speculation?
>
> -Original Message-
> From: wish.dum...@gmail.com [mailto:wish.dum...@gmail.com] On Behalf Of
> William Hay
> Sent: den 31 januari 2012 09:45
> To: Anton Löfgren
> Cc: users@grid
On 1 February 2012 09:49, Harris He, Kun - CD wrote:
> Hi all,
>
>
>
> When I add a new user via NIS, I can access every server with this user
> account. But I cannot submit any job, including interactive jobs, at all.
> The system just returns “critical error: can't resolve group”.
>
> The NIS passwd
On 1 February 2012 19:42, Brian Smith wrote:
> I've started a github page for some tools I've put together from various
> bits of code, how-tos, etc. to simplify the setup of parallel
> environments so that they work universally for all MPI implementations
> (on x86_64 Linux) w/ tight-integration
On 2 February 2012 15:33, Robert Hutton wrote:
> Hi Everyone,
>
> I've just set up a small Grid Engine cluster, but I'm new to using Grid
> Engine, and would like some advice on the best way to submit jobs that
> in turn submit jobs. What I'd like to do is:
>
> Run a regular shell script that loo
We discovered a host where the infiniband connection was playing up.
Our normal procedure for this is to remove the host from the
hostgroups it is normally in and add it to a hostgroup associated with
queues that only
accept single node jobs (ie serial jobs and PEs with an allocation
method of $pe_
On 6 February 2012 22:13, Reuti wrote:
> Am 06.02.2012 um 10:55 schrieb William Hay:
>
>> We discovered a host where the infiniband connection was playing up.
>> Our normal procedure for this is to remove the host from the
>> hostgroups it is normally in and add it to a h
On 8 February 2012 15:37, Prentice Bisbal wrote:
> So I finally have MATLAB set up and working fine with SGE. I can submit
> parallel and distributed jobs from MATLAB to SGE, and then SGE does its
> thing.
> I have one remaining problem, and I thought I'd ask here first before
> talking to Mathwor
On 9 February 2012 16:28, Sabine Kreidl wrote:
> Hi,
>
> I'm experiencing (SGE 8.0.0a) that disabled nodes (we have only one
> queue) are added to advance reservations. As those nodes are usually
> disabled for a reason (defect hard drive, etc...) this is not what I
> would want to happen (especia
On 21 February 2012 19:20, Txema Heredia Genestar wrote:
> Hello all,
>
> I am having some problems running threaded jobs in SGE 6.1u4. In our
> cluster, h_vmem is defined as a consumable attribute on all nodes. It is
> mandatory; all jobs must request it, with a default value of 6Gb. That
> constr
On 22 February 2012 08:21, Hay, William wrote:
> On 21 February 2012 19:20, Txema Heredia Genestar
> wrote:
>> Hello all,
>>
>> I am having some problems to run threaded jobs in SGE 6.1u4. In our
>> cluster, h_vmem is defined as a consumable attribute in all nodes. It is
>> mandatory, all jobs m
We just got bit by https://arc.liv.ac.uk/trac/SGE/ticket/802 and it
took me a lot longer to figure out than it should have, in part
because there does not appear to be any indication when a job has an
array dependency on another job (at least in 6.2u3, which we're using).
All holds and dependencies
On 22 February 2012 19:04, Dave Love wrote:
> On Wed, 22 Feb 2012 14:39:00 +
> William Hay wrote:
>
>> We just got bit by https://arc.liv.ac.uk/trac/SGE/ticket/802 and it
>
> Could you attach a script to submit dummy jobs that reproduce it?
>
I don't know for a f
On 23 February 2012 00:36, Maes, Richard wrote:
> Reuti,
> For the example below where you spec which PE to instantiate.
>> $ qsub -pe ixia* 1 job.sh
>
> Can this accept something other than wildcards? Is there a way to make
> it do REGEX? Or ranges?
> For a case where I have Ixia1, Ixia2, and
On 22 February 2012 17:57, Txema Heredia Genestar wrote:
> William - Yours is my best bet. Long time ago I tried tinkering with the
> "slots" attribute, but never thought about adding this threaded one. I
> only see one (minor) flaw in your solution: I cannot ask for an interval
> of threads (fro
On 23 February 2012 09:31, Reuti wrote:
> Am 23.02.2012 um 10:01 schrieb William Hay:
>
>> On 23 February 2012 00:36, Maes, Richard wrote:
>>> Reuti,
>>> For the example below where you spec which PE to instantiate.
>>>> $ qsub -pe ixia* 1 job.sh
&g
On 23 February 2012 11:56, Reuti wrote:
>>>
>> That the pe is interpreted as a full pattern (per sge_types) which can
>> be set to ixia[12]
>> from the server side JSV is the undocumented part. Sorry if I was unclear.
>
> Argh, this was an extension beyond 6.2u5 and the pe_name can be any
> obj
On 23 February 2012 13:49, Reuti wrote:
> Am 23.02.2012 um 14:32 schrieb William Hay:
>
>> On 23 February 2012 11:56, Reuti wrote:
>>
>>>>>
>>>> That the pe is interpreted as a full pattern (per sge_types) which can
>>>> be set to ixia[1
On 24 February 2012 14:58, Reuti wrote:
?
>
> Default values are only for consumables.
True but you can get a similar effect by putting a default request in
$SGE_ROOT/$SGE_CELL/sge_request.
William
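For example, a cluster-wide default in that file might look like this (the request shown is invented; per sge_request(5) the file normally lives under the common/ subdirectory):

```
# $SGE_ROOT/$SGE_CELL/common/sge_request
-l h_vmem=2G
```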
On 24 February 2012 15:11, William Hay wrote:
> On 24 February 2012 14:58, Reuti wrote:
> ?
>>
>> Default values are only for consumables.
> True but you can get a similar effect by putting a default request in
> $SGE_ROOT/$SGE_CELL/sge_request.
Sorry, meant S$
On 24 February 2012 15:27, Reuti wrote:
> Am 24.02.2012 um 16:12 schrieb William Hay:
>
>> On 24 February 2012 15:11, William Hay wrote:
>>> On 24 February 2012 14:58, Reuti wrote:
>>> ?
>>>>
>>>> Default values are only for consumables.
On 24 February 2012 15:53, Reuti wrote:
> Am 24.02.2012 um 16:40 schrieb William Hay:
>
>> On 24 February 2012 15:27, Reuti wrote:
>>> Am 24.02.2012 um 16:12 schrieb William Hay:
>>>
>>>> On 24 February 2012 15:11, William Hay wrote:
>
On 26 February 2012 09:13, mahbube rustaee wrote:
> Hi all,
>
> How can prevent some users to use some resource with -l option ? JSV is best
> way for that ?
>
> Thx
If it is a consumable you could try configuring a quota of 0 for them.
William
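A resource quota set sketch of that idea (qconf -arqs; the rule name, user names, and the consumable name are all invented):

```
{
   name         deny_special_resource
   description  block listed users from the 'excl' consumable
   enabled      TRUE
   limit        users {alice,bob} to excl=0
}
```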
On 26 February 2012 12:41, Reuti wrote:
> Am 26.02.2012 um 12:32 schrieb William Hay:
>
>> On 26 February 2012 09:13, mahbube rustaee wrote:
>>> Hi all,
>>>
>>> How can prevent some users to use some resource with -l option ? JSV is best
>>> way
On 28 February 2012 11:02, Stefano Bridi wrote:
> Hi list, I have a problem on an SGE setup where the home directories are
> shared through glusterfs and some jobs fail to start because of
> latency in the filesystem propagation between the login node and the
> compute node.
> What happens is that a
On 29 February 2012 17:47, Reuti wrote:
> Hi,
>
> Am 29.02.2012 um 18:07 schrieb Txema Heredia Genestar:
>
>> I want to control the usage of the local disk of our execution nodes. As far
>> as I have found, the only related option offered by SGE is the h_fsize
>> limit. But that will not work be
On 29 February 2012 22:14, Joe Whitney wrote:
> Hello,
>
> I am having a simple problem where mem_free resource requests are
> being treated differently on two different queues (actually,
> separate installations of SGE).
>
> For context, the hosts servicing queue.A have 32G/4core
On 1 March 2012 07:41, Rayson Ho wrote:
> On Thu, Mar 1, 2012 at 2:03 AM, William Hay wrote:
>> It is possible that it is also consumable in installation/queue A but per
>> JOB.
>
> Joe,
>
> As pointed out by William, that can likely be the root of the issue -
>
We have multiple queue instances on each node each with slots equal to
the number of cpus. To prevent oversubscription I added a slots
consumable to each host restricting it to a number of slots equal to
the cpus on the node.
This has worked up to now but this morning there are a couple of jobs
th
On 7 March 2012 09:47, William Hay wrote:
> We have multiple queue instances on each node each with slots equal to
> the number of cpus. To prevent oversubscription I added a slots
> consumable to each host restricting it to a number of slots equal to
> the cpus on the node.
> Thi
on't have the same issue. Is there any reason to
believe the RQS solution will be more reliable than the host
consumable solution (which has worked pretty well up to now)?
> Regards,
> On Wed, Mar 7, 2012 at 11:00 AM, William Hay wrote:
>>
>> On 7 March 2012 09:47, William
On 7 March 2012 11:36, Reuti wrote:
> Am 07.03.2012 um 11:18 schrieb William Hay:
>
>> On 7 March 2012 10:11, Mazouzi wrote:
>>> I remember Reuti proposed a solution using RQS:
>>>
>>> {
>>> name noverload
>>> descript
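The quoted rule set is cut off; the usual complete form of that per-host cap is something like the following (a sketch; $num_proc is the built-in dynamic limit that resolves to each host's processor count):

```
{
   name         noverload
   description  prevent host oversubscription across queues
   enabled      TRUE
   limit        hosts {*} to slots=$num_proc
}
```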
On 9 March 2012 14:50, Stuart Barkley wrote:
> We are running into an awkward problem with one of our users jobs.
>
> We have h_vmem set as a consumable resource and have it set to the
> physical memory (minus a small amount) on the systems. These are
> diskless systems and have no swap defined.
On 13 March 2012 09:59, Lars van der bijl wrote:
> Hey everyone,
>
> We're having the following problem.
>
> Randomly, on some tasks, we start getting "CPU time limit exceeded". We
> don't specify a time limit; we do specify h_vmem.
> This only happens on some tasks and not others, even between same t
We recently made a host-specific sge_conf change (an alternate prolog).
This didn't propagate out until we soft-stopped and then restarted the
execd, even though we left it for a day.
Is this normal or should we be concerned?
William
A colleague of mine increased the posix priority of an array job, part
of which was running. The running parts have increased ppri, npprior,
and prior, but qstat of the queued portion shows only
ppri and npprior increased while prior remains as it was. In practice
new tasks from the array job seem to
On 15 March 2012 11:23, Hay, William wrote:
> A colleague of mine increased the posix priority of an array job part
> of which was running. The running parts have increased ppri,npprior
> and prior but qstat of the queued portion shows only
> ppri and npprior increased while prior remains as it w