ample.
Regards,
Jeevan.
On 28 March 2017 10:54:18 pm Jesse Becker <becke...@mail.nih.gov> wrote:
We wrote a small JSV script (in Python) that logs the resource requests,
--
Jesse Becker (Contractor)
___
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
ssions:
http://gridengine.org/pipermail/users/2011-November/001932.html
http://comments.gmane.org/gmane.comp.clustering.opengridengine.user/1700
--
Jesse Becker (Contractor)
? Thanks for your help.
Look into using a "parallel environment". We have one called
"multicore", and to run a 6-thread job, we'd do something like this:
qsub -pe multicore 6 /path/to/script.sh
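A job started this way can size its own worker pool from the slot count it was granted. The following is only a sketch: it assumes the job runs under a PE like the "multicore" one above and that SGE exports the granted slot count as NSLOTS in the job environment. The helper names (granted_slots, run_parallel) are made up for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def granted_slots(env=os.environ, default=1):
    """Return the slot count from NSLOTS, or `default` when it is unset or invalid."""
    try:
        slots = int(env.get("NSLOTS", default))
    except ValueError:
        return default
    return max(slots, 1)


def run_parallel(func, items, env=os.environ):
    """Map `func` over `items` using one thread per granted slot."""
    with ThreadPoolExecutor(max_workers=granted_slots(env)) as pool:
        return list(pool.map(func, items))
```

Keeping the thread count tied to NSLOTS means the same script behaves correctly whether it is submitted with 6 slots or 2, and falls back to a single worker when run outside the scheduler.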
--
Jesse Becker (Contractor)
--
Jesse Becker (Contractor)
o want to
consider removing them and reconfiguring them if possible.
Process 42491 detached
[pid 42490] +++ killed by SIGABRT +++
[pid 42489] +++ killed by SIGABRT +++
[pid 42488] +++ killed by SIGABRT +++
[pid 42487] +++ killed by SIGABRT +++
[pid 42486] +++ killed by SIGABRT +++
[pid 42485
of messages especially
if you don't know what you are looking for.
Try running the qmaster under strace or GDB. You'll likely have to
either modify the init script, or run it by hand.
--
Jesse Becker (Contractor)
--
Jesse Becker (Contractor)
s are for this error?
Thanks,
Justin
--
Jesse Becker (Contractor)
cine
--
Jesse Becker (Contractor)
--
Jesse Becker (Contractor)
that using:
limit users {*} hosts @compute to slots=16
doesn't work?
--
Jesse Becker (Contractor)
foo=TRUE host1 host2 host3
--
Jesse Becker (Contractor)
--
Jesse Becker (Contractor)
of a '$parent_process'!\n;
}
}
--snip--
--
Jesse Becker (Contractor)
with actually scheduling the job, since it may be a long
time before an entire compute node is free...)
--
Jesse Becker (Contractor)
the old 9,999,999 twice in my accounting records...)
I ingested all of our old logs, which go back to last summer. We roll the job
ids about every 2 months, and I haven't seen any problems.
On 02/03/15 21:21, Jesse Becker wrote:
I spent a bit of time looking at things to replace ARCO--which I found
On Mon, Mar 02, 2015 at 04:21:57PM -0500, Jesse Becker wrote:
One thing that we *have* learned is that you should keep all of the
raw records. They compress well, and disk space is cheap. Our UGE
logs compress about 85% using gzip -9, which is fast. Other methods
(xz) get almost 90%, but take
easier: just use a sharetree
or functional shares and who owns a box. It also means that anytime
new hardware is added, *everyone* benefits.
--
Jesse Becker (Contractor)
--
Jesse Becker (Contractor)
--
Jesse Becker (Contractor)
://beowulf.rutgers.edu/info-user/pdf/ge_presentation.pdf
--
Jesse Becker (Contractor)
submissions, what is happening, and what
you want/expect?
--
Jesse Becker (Contractor)
a minor step or three)
Thanks
--
Dan Hyatt
--
Jesse Becker (Contractor)
Box 8506
St. Louis, MO 63108
314 747 4767 (o)
314 473 8713 (c)
dhy...@dsgmail.wustl.edu
--
Jesse Becker (Contractor
CPU cycles.
You can try removing the job entry from SGE using qdel -f jobid
--
Jesse Becker (Contractor)
pull certain information about the just-completed job. We use
it occasionally to trap errors that the prologue can miss.
--
Jesse Becker (Contractor)
On Thu, May 22, 2014 at 11:17:07AM +0300, Semi wrote:
Please explain the meaning of the following parameters:
CQLOAD aoACDS cdsuE
These are all described in detail in the OUTPUT FORMATS / Cluster Queue
Format section of the qstat(1) man page.
--
Jesse Becker (Contractor
--
Jesse Becker (Contractor)
at 11:48 AM, Jesse Becker becke...@mail.nih.gov wrote:
On Tue, May 06, 2014 at 11:34:50AM -0400, VG wrote:
Hi Everyone,
I have started to understand the basics of SGE, but still I am pretty new
in this cluster computation.
Welcome.
When I qsub an application, and use qstat to check
}
BTW, I think there's no way to make these limits dynamic? Dynamic limits
are only for hosts? I'd like to limit to N * max-num-slots-for-user-on-queue
+1 on this.
--
Jesse Becker (Contractor)
On Wed, May 15, 2013 at 11:28:23PM +0100, Dave Love wrote:
Jesse Becker becker...@mail.nih.gov writes:
I seem to recall hearing that 5 halflives is how long radioactive
stuff has to decay before it's safe. Don't quote me on that though.
:)
That rather depends on how unsafe it was to start
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
module, and (obviously) outputs a .dot file which can be
converted into the image format of your choice.
Run it on a system that can call 'qconf', as it will attempt to pull in
your actual SGE settings for the chart.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
#!/usr/bin/perl -w
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
.
Hopefully this will help someone, somewhere, at some point.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
:(){ ::};:
the 'stress' program:
http://weather.ou.edu/~apw/projects/stress/
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
for large(ish) memory jobs).
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
into resource quotas as well, to keep a given
group from taking over the cluster.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
On Tue, Nov 27, 2012 at 05:15:24PM -0500, Allan Tran wrote:
On Tue, Nov 27, 2012 at 2:43 PM, Jesse Becker
becker...@mail.nih.gov wrote:
I was thinking to enable the functional share policy and actually set it up,
following these instructions
(http://docs.oracle.com
to running jobs. I can't remember if I've ever
had this particular scenario come up before...
Regards,
Chris
--
Jesse Becker
NHGRI Linux support (Digicon
(if the shepherd survives) unless you specify the
keep_active (?) flag in execd_params.
It's KEEP_ACTIVE=TRUE in execd_params.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
are actually running. Specifically,
the files indicate there should be a job, but nothing is actually running.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
large Illumina pipeline jobs run 32-way on a single compute node.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
... a followup query.. any special
settings you all are using as per the Linux interfaces or on the switch? For
example I have pretty much default everything but have 9000 for the MTU (jumbo
frames).
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
On Wed, Jul 18, 2012 at 07:35:47AM -0400, Dave Love wrote:
Jesse Becker becker...@mail.nih.gov writes:
I use the attached script. It's loosely based on something found
packaged with SGE a while back, but didn't fit my needs.
(Do you remember) what's wrong for your needs
dependencies
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
?
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
)
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
the
submission command file itself to parse the output of
PE_HOSTFILE to create hadoop *.site.xml, masters and slaves
files at run time. This methodology is suitable for any
scheduler as it is not dependent on them. If there is
interest I can post the prologue script. Thanks.
Please do.
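Until the real prologue script is posted, here is a rough sketch of the parsing step only. It is not the poster's script. It assumes the usual PE_HOSTFILE layout of one host per line with the host name in the first column and the slot count in the second, and it assumes (purely for illustration) that the first host becomes the Hadoop master while every host is listed as a slave. The function names are hypothetical.

```python
import os


def parse_pe_hostfile(text):
    """Return (hostname, slots) pairs from PE_HOSTFILE contents."""
    hosts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        hosts.append((fields[0], int(fields[1])))
    return hosts


def masters_and_slaves(hosts):
    """Assumed convention: first host is the master, all hosts are slaves."""
    names = [name for name, _slots in hosts]
    return names[:1], names


def write_hadoop_host_files(hostfile_path, conf_dir):
    """Write 'masters' and 'slaves' files into conf_dir from a PE hostfile."""
    with open(hostfile_path) as fh:
        masters, slaves = masters_and_slaves(parse_pe_hostfile(fh.read()))
    for filename, names in (("masters", masters), ("slaves", slaves)):
        with open(os.path.join(conf_dir, filename), "w") as out:
            out.write("".join(name + "\n" for name in names))
    return masters, slaves
```

Inside a job this could be called as write_hadoop_host_files(os.environ["PE_HOSTFILE"], conf_dir), where conf_dir is whatever per-job Hadoop configuration directory the site uses. Generating the *.site.xml files is left out because their contents are site specific.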
--
Jesse
be) of running the
SGE qmaster.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
Specialization is for insects. -- R.A.Heinlein
them off with
rsh/ssh is something we want to avoid.
earl
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
/pipermail/users/2012-January/002526.html
-- Reuti
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
://beowulf.rutgers.edu/info-user/pdf/ge_presentation.pdf
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
start time,
after the job has actually started.
I suppose that altering '-m b' post-start will actually change the
setting, but it looks as if the wording is to tell users that they
shouldn't expect to see any difference in behavior until after a
restart/migration.
--
Jesse Becker
NHGRI
biophysikalische Chemie, Abteilung 105
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
scenario?
Hmm...not a prevention, per se, but you could use a prologue script to
dump a 'pseudo-accounting' record of the job. The two logs could then
be reconciled afterwards. Additionally, if you turn the logging level
up far enough, the qmaster logs will log qdels.
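A prologue along those lines could be as small as the sketch below. It is an illustration, not a tested production script: it assumes the prologue sees the usual job environment variables (JOB_ID, JOB_NAME, USER, QUEUE, HOSTNAME, NSLOTS), and the record layout (one JSON object per line) and the function names are invented here.

```python
import json
import time

# Job environment variables copied into each record (assumed to be visible
# to the prologue; missing ones are stored as empty strings).
FIELDS = ("JOB_ID", "JOB_NAME", "USER", "QUEUE", "HOSTNAME", "NSLOTS")


def start_record(env, now=None):
    """Build a 'job started' record from the prologue's environment."""
    record = {name: env.get(name, "") for name in FIELDS}
    record["start_time"] = int(time.time() if now is None else now)
    return record


def append_record(path, record):
    """Append one JSON line per job so the log is easy to reconcile later."""
    with open(path, "a") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

The prologue would call append_record(log_path, start_record(os.environ)) with a log_path of your choosing. Any job that has a start record here but never shows up in the real accounting file is a candidate for a job that was deleted or lost.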
--
Jesse Becker
NHGRI Linux
number sorting, and resource complexes as well.
thanks in advance
regards
Walid
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
.
Thanks in advance for your help.
Janet He
Linux/HPC Team
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
No.: HRB 382196
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
indicates a 32/64 bit issue. I think
there's information about it in the mailing list.
I'm not sure about child processes.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
* host. You could, in theory,
have something running on the qmaster (or elsewhere) to report on the
nodes. This could be useful for tracking how long a host is off (via
periodic pings), or watching accounting logs to try and find idle
time.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
into the queue, and let SGE
deal with them when it can. You could also use qrsh or qsub -now y.
These will both block until the job is actually complete. This doesn't
sound like an ideal solution, but it may open some other options.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
something we can change.
Understood.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
of
these approaches will just lead to the abandonment of the exercise.
It may be that what they are asking is not directly possible, in
which case you can propose reasonable alternatives.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
: +49.9471.200.195 |
Mobile: +49.170.819.7390
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
/true
This eventually runs, and control passes back to the driving script.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
with qstat. A
simple qstat -j jobid will tell you if a job is running or not.
The accounting file can grow to be quite large, and parsing it can take
quite a while not to mention incur an IO hit.
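If you do end up reading the accounting file, streaming it line by line at least avoids pulling the whole thing into memory. The sketch below assumes the colon-separated layout described in accounting(5), with the job number in the sixth field, and only names the leading fields. It also assumes no field contains a literal colon. The function name is hypothetical.

```python
# Leading fields of an accounting record, in file order (per accounting(5)).
ACCOUNTING_FIELDS = (
    "qname", "hostname", "group", "owner", "job_name", "job_number",
    "account", "priority", "submission_time", "start_time", "end_time",
    "failed", "exit_status",
)


def find_job(lines, job_number):
    """Yield a dict of the leading fields for each record of `job_number`."""
    wanted = str(job_number)
    for line in lines:
        if line.startswith("#"):
            continue  # comment/header line
        fields = line.rstrip("\n").split(":")
        if len(fields) < len(ACCOUNTING_FIELDS):
            continue  # truncated or malformed record
        if fields[5] == wanted:
            yield dict(zip(ACCOUNTING_FIELDS, fields))
```

Passing an open file object as `lines` keeps memory use flat, but the scan still reads the whole file from disk, which is exactly the IO hit mentioned above.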
--
Jesse Becker
NHGRI Linux support (Digicon Contractor
the need for a new logo. Any takers? :)
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
On Mon, Mar 21, 2011 at 04:45:07AM -0400, Esztermann, Ansgar wrote:
On Mar 21, 2011, at 1:36 , Jesse Becker wrote:
Thus, for my *simple* testing, data and rss are both ignored (and yes,
rss *is* documented as being ignored, but data is not).
Did you write to memory after allocation
On Sun, Mar 20, 2011 at 05:22:33PM -0400, Dave Love wrote:
Jesse Becker becker...@mail.nih.gov writes:
Are you having a problem that the limits are not set by SGE, or that they
are not actually enforced by the OS? I've had a lot of trouble on Linux
systems where the data and rss settings
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
for the number of
CPUs. These numbers should more closely match what is actually reported
by the OS on the node.
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)