On Tue, Jul 10, 2012 at 06:51:03PM -0400, David Erickson wrote:
Hi all-
Following up with a slightly different question from yesterday, I have
another GE installation that has many hosts grabbing jobs from a
single queue. These jobs are logically grouped together, although
they are not submitted
Hi all-
Following up with a slightly different question from yesterday, I have
another GE installation that has many hosts grabbing jobs from a
single queue. These jobs are logically grouped together, although
they are not submitted together (but actually when a job finishes from
another GE cluster
Great info, will be hacking on this this afternoon.
Thanks!
On Tue, Jul 10, 2012 at 11:43 AM, Rayson Ho wrote:
> On Tue, Jul 10, 2012 at 5:45 AM, Reuti wrote:
>>
>> Just to note, that the path can be accessed by $SGE_JOB_SPOOL_DIR.
>
>
> Thanks Reuti - it will be useful to David.
>
> I forgot this environment var as I have not used this hack for almost a year...
On Tue, Jul 10, 2012 at 4:23 PM, John Young wrote:
> With this in place, it seems odd that from one of my queues I
> get a default setting for the number of descriptors of 1024.
>
> So I have two questions really:
>
> 1. Why am I getting different behavior from the two queues?
Could be OS issue -
On Tue, Jul 10, 2012 at 3:51 PM, Reuti wrote:
>> The failure was that users' .bashrc files were not being read. In each user's
>> account, I have it source a system-wide shell script which sets certain
>> things up, like our module environment configuration, and so nothing was
>> being set up.
>
> Yep,
On 07/10/2012 04:14 PM, Rayson Ho wrote:
On Tue, Jul 10, 2012 at 4:02 PM, John Young wrote:
If you really have a real use-case for setting the # of descriptors in
the queue config, then let us know and we can implement that in OGS/GE
(... when time permits).
Well... I have an engineer here who wants to run a 2048-core job.
On Tue, Jul 10, 2012 at 4:02 PM, John Young wrote:
>> If you really have a real use-case for setting the # of descriptors in
>> the queue config, then let us know and we can implement that in OGS/GE
>> (... when time permits).
>>
> Well... I have an engineer here who wants to run a 2048-core job.
On 07/10/2012 03:47 PM, Rayson Ho wrote:
The number of file descriptors is not part of the queue limit, see the
message I sent to the list 2 months ago:
http://gridengine.org/pipermail/users/2012-May/003705.html
If you really have a real use-case for setting the # of descriptors in
the queue config, then let us know and we can implement that in OGS/GE
(... when time permits).
On 10.07.2012 at 21:20, Joseph Farran wrote:
> On 07/10/2012 11:38 AM, Rayson Ho wrote:
>> On Tue, Jul 10, 2012 at 1:48 PM, Joseph Farran wrote:
>>> I was using the same identical script, so it's still a mystery why the
>>> script ran on some nodes while it failed on others, but now with this change
>>> it works on all nodes and that is good enough.
The number of file descriptors is not part of the queue limit, see the
message I sent to the list 2 months ago:
http://gridengine.org/pipermail/users/2012-May/003705.html
If you really have a real use-case for setting the # of descriptors in
the queue config, then let us know and we can implement that in OGS/GE
(... when time permits).
I have a short test job that I can submit to different
queues on my cluster that appear to be configured the
same, but I get different results. Here is the job:
---
#!/bin/tcsh
#
#$ -N show-limits
#$ -S /bin/tcsh
#$ -o show-limits.out
#$ -e show-limits.err
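The preview cuts off before the script body. A hedged guess at what a "show-limits" job goes on to do is simply to print the limits the job actually runs under; this sketch is not the poster's script, just an illustration in sh syntax (the tcsh equivalent inside the job above would be the `limit` builtin, e.g. `limit descriptors`):

```shell
#!/bin/sh
# Print the per-process file descriptor limits this shell runs under.
# Inside a GE job this shows the limits the execd imposed on the job,
# which is exactly what differs between the two queues discussed here.
echo "soft descriptor limit: $(ulimit -Sn)"
echo "hard descriptor limit: $(ulimit -Hn)"
```

Comparing this output across the two queues would show whether the execd hosts start jobs under different OS-level limits.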
On 07/10/2012 11:38 AM, Rayson Ho wrote:
On Tue, Jul 10, 2012 at 1:48 PM, Joseph Farran wrote:
I was using the same identical script, so it's still a mystery why the
script ran on some nodes while it failed on others, but now with this change
it works on all nodes and that is good enough.
Th
On Tue, Jul 10, 2012 at 5:45 AM, Reuti wrote:
>
> Just to note, that the path can be accessed by $SGE_JOB_SPOOL_DIR.
Thanks Reuti - it will be useful to David.
I forgot this environment var as I have not used this hack for almost
a year... basically since getting the job exit status in epilog w
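As a sketch of the hack being described (the spool path and job id below are invented for the demo and stand in for the real execd spool; a real epilog would use $SGE_JOB_SPOOL_DIR rather than a hard-coded path):

```shell
#!/bin/sh
# Sketch: read a job's exit status from the execd spool layout
# active_jobs/<jobid>.<taskid>/exit_status. Everything under /tmp
# here is a mock-up standing in for the real spool directory.
SPOOL=/tmp/demo_execd_spool
JOB_DIR="$SPOOL/active_jobs/42.1"

mkdir -p "$JOB_DIR"
echo 137 > "$JOB_DIR/exit_status"        # pretend sge_execd wrote this

if [ -r "$JOB_DIR/exit_status" ]; then
    status=$(cat "$JOB_DIR/exit_status")
    echo "job exited with status $status"
fi
```

With the mocked-up file above this prints `job exited with status 137`.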
On Tue, Jul 10, 2012 at 1:48 PM, Joseph Farran wrote:
> I was using the same identical script, so it's still a mystery why the
> script ran on some nodes while it failed on others, but now with this change
> it works on all nodes and that is good enough.
Those settings affect whether the global l
On 07/10/2012 02:37 AM, Reuti wrote:
On 10.07.2012 at 09:29, Hung-Sheng Tsao Ph.D. wrote:
hi
to use -S /bin/bash
need
shell_start_mode posix_behavior
you may also want to add bash to login_shells in qconf -mconf global
regards
On 7/10/2012 1:21 AM, Joseph A. Farran wrote:
Hello.
I
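For reference, the two settings mentioned above would be applied roughly like this; the queue name `all.q` and the shell list are only examples, not a copy of any real configuration:

```shell
# Sketch of where the two settings live:
#
#   qconf -mq all.q
#     shell_start_mode   posix_behavior         # honour the -S shell as given
#
#   qconf -mconf global
#     login_shells       sh,bash,ksh,csh,tcsh   # add bash here
```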
On Tue, 10 Jul 2012, Reuti wrote:
...
* PE_HOSTFILE rewriting
This I would suggest to do in the start_proc_args of the PE, but it
might be personal taste of course.
-- Reuti
The advantage of doing PE_HOSTFILE rewriting in the starter_method is that
we can modify the job's PE_HOSTFILE variable.
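A minimal sketch of what such PE_HOSTFILE rewriting can look like; the hostfile contents and the one-slot-per-host transformation are made up for illustration, and a real start_proc_args or starter_method script would operate on "$PE_HOSTFILE" rather than a file under /tmp:

```shell
#!/bin/sh
# Sketch: rewrite a PE hostfile so each host contributes one slot.
# /tmp/demo_pe_hostfile is a mock-up of the "host slots queue range"
# line format; a real script would read "$PE_HOSTFILE" instead.
cat > /tmp/demo_pe_hostfile <<'EOF'
node01 4 all.q@node01 UNDEFINED
node02 4 all.q@node02 UNDEFINED
EOF

# Force the slot count (field 2) to 1 on every line.
awk '{ $2 = 1; print }' /tmp/demo_pe_hostfile > /tmp/demo_pe_hostfile.new
cat /tmp/demo_pe_hostfile.new
```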
On 10.07.2012 at 07:42, mahbube rustaee wrote:
>
> >
> > Yes, I configured BLCR checkpointing. When a job is suspended (qmod -sj) its
> > state becomes "s" and it is re-queued (Rq) automatically.
>
> This is the normal behavior.
>
> > How can I do that manually? I mean, have the job stay in the "s" state until I unsuspend it
>
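For what it's worth, the manual suspend/resume commands look like this; job id 42 is illustrative, and with a BLCR checkpoint queue whether suspension triggers a checkpoint-and-requeue depends on the checkpoint interface's `when` specification:

```shell
# Sketch only -- needs a live cluster and a real job id:
#   qmod -sj 42     # suspend job 42 (state becomes "s")
#   qmod -usj 42    # unsuspend it again
```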
On 10.07.2012 at 05:10, Rayson Ho wrote:
> 1) There's the "exit_status" file in the job's spool directory, and
> you can just go to the active_jobs directory in the execd's spool, and
> in there you will find a subdirectory for each job. So you can just
> parse the file to get the exit status of the job.
On 10.07.2012 at 10:25, Mark Dixon wrote:
> On Mon, 9 Jul 2012, Dave Love wrote:
> ...
>> You essentially need to mimic the built-in starter, which is why it's
>> best to hook into it instead. The mimic is straightforward as long as
>> you don't use shell and can follow the built-in code.
>>
>>
On 10.07.2012 at 09:29, Hung-Sheng Tsao Ph.D. wrote:
> hi
> to use -S /bin/bash
> need
> shell_start_mode posix_behavior
>
> you may also want to add bash to login_shells in qconf -mconf global
>
> regards
>
> On 7/10/2012 1:21 AM, Joseph A. Farran wrote:
>> Hello.
>>
>> I have a cluster
On Mon, 9 Jul 2012, Dave Love wrote:
...
I'm not sure I understand the problem. Is it specific to using the
starter method? Fluent 12 apparently works here with the standard
wrapper as "rsh".
All I know is that it didn't work for me :)
...
I guess it depends on the specification. I doubt i
On Mon, 9 Jul 2012, Dave Love wrote:
...
You essentially need to mimic the built-in starter, which is why it's
best to hook into it instead. The mimic is straightforward as long as
you don't use shell and can follow the built-in code.
What sort of things do people need to do other than firkle w
hi
to use -S /bin/bash
need
shell_start_mode posix_behavior
you may also want to add bash to login_shells in qconf -mconf global
regards
On 7/10/2012 1:21 AM, Joseph A. Farran wrote:
Hello.
I have a cluster with Rocks 5.4.3 (SL) and I believe this is a Rocks
issue, but I'm not sure.
I have
Hi Reuti,
I have contacted the original poster. Unfortunately he cannot remember the
exact solution. He wrote something about changes to
the queueing system... but that does not really help.
Cheers,
Sascha