On Mon, Oct 06, 2008 at 09:58:54PM +0200, Bas van der Vlies wrote:
> I find this a good suggestion. Just checked the latest snapshot
> and the DPrint statements did not change.
Great.
> I am also interested in what kind of patches you applied to maui to
> handle this number of jobs.
I did:
In include/msched.h:
These were at 4096, upped to 32k:
#define MAX_MJOB 32768
#define MMAX_JOB 32768
#define MAX_MJOB_TRACE 32768
In include/msched-common.h:
This was upped as well:
#ifndef MAX_MTASK
# define MAX_MTASK 32768
#endif /* MAX_MTASK */
I wasn't sure what all needed to be bumped from 4096 to whatever, but
these are what I changed.
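As a rough illustration of why raising these limits costs memory, here is a
minimal sketch. It assumes the limits size tables of fixed-size records, which
has not been checked against the maui source. The struct job_slot type and the
table_bytes() helper are hypothetical and do not exist in maui.

```c
#include <stddef.h>

/* Hypothetical per-job record; the real maui job structure is different. */
struct job_slot {
    char  name[64];
    long  state;
    void *data;
};

/* Bytes needed by a table that holds `limit` fixed-size slots. */
size_t table_bytes(size_t limit)
{
    return limit * sizeof(struct job_slot);
}
```

Under that assumption, going from 4096 to 32768 multiplies the footprint of
every such table by 8. Since MAX_MTASK is wrapped in #ifndef, it can also be
set from the compiler command line (for example -DMAX_MTASK=32768) instead of
editing the header.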
I believe someone asked about memory usage. With these patches and a
workload of up to about 20k jobs, I've seen maui use about 1 GB of
memory, maybe more. We bought more RAM for the machine because it was
thrashing the swap.
This is also on a 32-bit machine. I've run maui on 64-bit machines before
(Alpha, not Intel), but I have found issues with 64-bit builds on Intel
boxes. For example, LOGFILEMAXSIZE is cast to an int instead of
size_t, and I would guess there are other similar issues.
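To make the int versus size_t point concrete, here is a minimal sketch of
reading a size setting without passing it through an int. It is not maui's
code: fits_in_int() and parse_size() are hypothetical helpers, and the example
assumes a platform where int is 32 bits.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if `bytes` can be stored in an int without loss, 0 otherwise. */
int fits_in_int(unsigned long long bytes)
{
    return bytes <= (unsigned long long)INT_MAX;
}

/*
 * Parse a decimal byte count into a size_t.
 * Returns 0 on success, -1 on malformed input or overflow.
 */
int parse_size(const char *text, size_t *out)
{
    char *end = NULL;
    unsigned long long value;

    /* Require a leading digit so that signs and whitespace are rejected. */
    if (text == NULL || out == NULL || !isdigit((unsigned char)text[0]))
        return -1;

    errno = 0;
    value = strtoull(text, &end, 10);
    if (errno == ERANGE || *end != '\0' || value > SIZE_MAX)
        return -1;

    *out = (size_t)value;
    return 0;
}
```

With a 32-bit int, a limit of 3000000000 bytes does not fit, so a cast to int
would silently change the value, while parse_size() keeps it intact wherever
size_t is 64 bits wide.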
I've found similar issues in torque, in src/resmom/linux/mom_mach.c,
after:
else if (!strcmp(pname,"file"))
I don't remember what the problem was, whether it was in getsize() or inside
of the if(!strcmp(pname,"file")) {} block, but with either filesystems
or files larger than 2 or 4 GB, that chunk of code did not work. It may have
been the setrlimit() call that failed. (This was with torque-2.1.10; it may
have been fixed since that release.)
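For reference, here is a minimal sketch of setting a file size limit with a
range check, so that a value too large for rlim_t is reported rather than
silently truncated. This is not the torque code; bytes_to_rlim() and
set_file_size_limit() are hypothetical helpers built on the standard
getrlimit() and setrlimit() calls.

```c
#include <sys/resource.h>

/*
 * Convert a byte count to rlim_t.
 * Returns 0 on success, -1 if the value does not fit in rlim_t or
 * would collide with the RLIM_INFINITY sentinel.
 */
int bytes_to_rlim(unsigned long long bytes, rlim_t *out)
{
    rlim_t lim = (rlim_t)bytes;

    if ((unsigned long long)lim != bytes || lim == RLIM_INFINITY)
        return -1;
    *out = lim;
    return 0;
}

/* Set the soft RLIMIT_FSIZE limit, capped at the current hard limit. */
int set_file_size_limit(unsigned long long bytes)
{
    struct rlimit rl;
    rlim_t wanted;

    if (bytes_to_rlim(bytes, &wanted) != 0)
        return -1;
    if (getrlimit(RLIMIT_FSIZE, &rl) != 0)
        return -1;

    /* An unprivileged process cannot raise the soft limit past the hard one. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;

    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_FSIZE, &rl);
}
```

On 32-bit Linux with glibc, rlim_t is only 32 bits wide unless the code is
built with -D_FILE_OFFSET_BITS=64, which would be consistent with trouble
appearing somewhere between 2 and 4 GB.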
-mb
--
+-----------------------------------------------
| Michael Barnes
|
| Thomas Jefferson National Accelerator Facility
| 12000 Jefferson Ave.
| Newport News, VA 23606
| (757) 269-7634
+-----------------------------------------------