Re: [gridengine users] Berkeley DB (was building RHEL5)

Chris Dagdigian Fri, 08 Apr 2011 07:45:08 -0700

Rayson answered the ARCO question - spooling does not matter, since the only files ARCO scrapes are the accounting and reporting files.
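To make that concrete: the accounting file is plain colon-separated text, so anything can read it regardless of how qmaster spools. A rough sketch follows, assuming the field order from the accounting(5) man page and the usual location under $SGE_ROOT/$SGE_CELL/common/accounting. The function name and the sample record are made up for illustration, and only the first 14 fields are named.

```python
# Sketch only: field order assumed from accounting(5); only leading fields named.
FIELDS = [
    "qname", "hostname", "group", "owner", "job_name", "job_number",
    "account", "priority", "submission_time", "start_time", "end_time",
    "failed", "exit_status", "ru_wallclock",
]


def parse_accounting_line(line):
    """Return a dict of the leading accounting fields, or None for comments/blank lines."""
    if line.startswith("#") or not line.strip():
        return None
    values = line.rstrip("\n").split(":")
    # zip() stops at the shorter sequence, so trailing fields are simply ignored.
    return dict(zip(FIELDS, values))


# Made-up sample record, not taken from a real cluster.
sample = "all.q:node001:users:alice:sleep.sh:42:sge:0:1302270000:1302270010:1302270070:0:0:60"
record = parse_accounting_line(sample)
print(record["owner"], record["job_number"], record["ru_wallclock"])
```

All values come back as strings, and this naive split would misparse a record whose job name itself contains a colon, so treat it as an illustration of the file format and not as a replacement for qacct or ARCO.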


classic vs. berkeley is always an interesting question.

I also am firmly in the classic spooling camp but we sometimes use berkeley spooling. There seem to be two main things driving the choice:

- NFS performance. If your NFS server is poor and you have a large client count then at some point spooling may become a bottleneck. However, on the flip side, if you have a great NFS server you can use classic spooling at large scale. One trivial example -- a 4,000-core cluster easily using classic spooling even with more than ~500,000 jobs per day because the NFS service is coming from a small Isilon scale-out NAS system that is running wire-speed across a dozen GbE NICs

- Job submission rate and job "churn". I think DanT said this in a blog post years ago but if you expect to need 200+ qsubs per second then you are going to need berkeley spooling. Same goes for clusters that experience huge amounts of job flows or state changes. I have less experience here but in these sorts of systems I think binary spooling makes a real difference
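For a sense of scale, the two figures above can be put in the same units. This is only back-of-the-envelope arithmetic on the numbers quoted in this message, and the helper names are made up. It compares a daily average against a sustained rate, so real bursts on the 500,000 jobs/day cluster will be higher than the average shown.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second_rate(daily_jobs):
    """Average jobs per second for a given number of jobs per day."""
    return daily_jobs / SECONDS_PER_DAY


def per_day_total(rate_per_second):
    """Jobs per day if a per-second rate were sustained around the clock."""
    return rate_per_second * SECONDS_PER_DAY


classic_avg = per_second_rate(500_000)   # the classic-spooling example above
berkeley_daily = per_day_total(200)      # the 200 qsubs/sec rule of thumb

print(f"500,000 jobs/day averages {classic_avg:.1f} jobs/sec")
print(f"200 qsubs/sec sustained is {berkeley_daily:,} jobs/day")
print(f"ratio: about {200 / classic_avg:.0f}x")
```

So the classic-spooling example averages under 6 submissions per second, well over an order of magnitude below the 200 per second mark where berkeley spooling is said to become necessary.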


My $.02 of course!

-chris




Mark Suhovecky wrote:


OK, I got SGE6.2u5p1 to build with version 4.4.20 of Berkeley DB,
and proceeded to try and install Grid Engine on the master host
via inst_sge.

  At some point it tells me that I should install Berkeley DB
on the master host  first, so I do "inst_sge -db", which hangs when it tries
to start the DB for the first time. Then, because some
days I'm not terribly bright, I decide to see if the DB will start
at machine reboot. Well, now it hangs when sgedb start
runs from init. Still gotta fix that.

So let me back up for a minute and ask about Berkeley DB...

We currently run sge_6.2u1 on 1250 or so hosts, with "classic"
flat file spooling, and it's pretty stable.
When we move to SGE6.2u5p1, we'd like
to use the Arco reporting package, and I'm blithely assuming
that I need a DB with an SQL interface to accommodate this.

Is that true? Can we use Arco w/o DB spooling?

