In case you didn't see it on the list, PBS has a bug in its uninstall script, so the start-over script probably won't take it out right either. I wasn't sure if you were using some sort of pre-install backup image, but I have rarely gotten OSCAR to work reliably after an aborted install.
I think I have the fix for the PBS script archived somewhere if you can't find it. I don't think running Torque will fix your problems with deferring, since I am 99% sure that is a Maui strangeness, but it is supposed to be much better. Sadly I didn't have time to figure out SGE (or Torque for that matter) before my boss made me give up on the schedulers for a while and do some production runs. OpenPBS/Maui appears to be working fine for me at the moment, but then I don't really have any users yet, so we'll see how long that holds up.

On Fri, 3 Sep 2004 12:30:54 -0700, Bernard Li <[EMAIL PROTECTED]> wrote:
> Hi Jenny:
>
> You should be able to run both SGE and Torque at the same time (though I
> haven't tried it) since they are drastically different systems. The
> only thing to keep in mind is that they use similar binary names for
> commonly used programs like qsub and qstat, so make sure you know which
> one you're executing.
>
> But it might be just as easy (and cleaner) to install Torque and SGE on
> different installations (machines), and then you can see their
> performance side by side.
>
> Cheers,
>
> Bernard
>
> > -----Original Message-----
> > From: Jenny Aquilino [mailto:[EMAIL PROTECTED]
> > Sent: Friday, September 03, 2004 12:22
> > To: Bernard Li; [EMAIL PROTECTED]
> > Subject: RE: [Oscar-users] How do you make master node a
> > compute node too?
> >
> > Oh, that's great news. I was going to check to see if the
> > opd would list Torque or SGE but you answered that for me.
> > Before I sent off my last e-mail saying that configuring the
> > master as a client through the install_server GUI killed my
> > setup, I tried it again since I've learned a lot since I last
> > tried that and sure enough it did the same thing again. Doh!
> > So I'm currently in the process of rebuilding that machine
> > again but I'll try installing Torque after I've got it up
> > again and see how that works.
> > I just saw Arnie's post and he
> > seems to be happy with SGE so I can try that too. I hope
> > there are no issues with having several schedulers installed.
> > I would turn the daemons off for the others but still. Ok,
> > well it looks like my system just finished kickstarting so
> > I'm gonna try this again.
> > Thanks for all the info.
> >
> > -Jenny =)
> > At 12:06 PM -0700 9/3/04, Bernard Li wrote:
> > >Hi Jenny:
> > >
> > >I believe PBS Pro is free for educational uses, so perhaps you guys can
> > >get it for free.
> > >
> > >Torque is basically OpenPBS but with it all patched up (and updated
> > >quite regularly too).
> > >
> > >Sun Grid Engine is getting more and more popular (and my personal
> > >preference).
> > >
> > >You can actually d/l and install Torque from OPD (OSCAR Package
> > >Downloader). I am working on creating a package for SGE as well but
> > >that has been pre-empted by other projects I am working on.
> > >
> > >Cheers,
> > >
> > >Bernard
> > >
> > >> -----Original Message-----
> > >> From: Jenny Aquilino [mailto:[EMAIL PROTECTED]
> > >> Sent: Friday, September 03, 2004 12:03
> > >> To: Bernard Li; Michael Edwards; [EMAIL PROTECTED]
> > >> Subject: RE: [Oscar-users] How do you make master node a
> > >> compute node too?
> > >>
> > >> Hi Bernard,
> > >>
> > >> That is good input. I went to their site to try and get some
> > >> documentation and it looked like OpenPBS is no longer so
> > >> Open. =( Professional PBS appears to be what they're
> > >> throwing their effort behind now. I'll check out the other
> > >> batch schedulers you suggested.
> > >> I can still use my same infrastructure that was setup with
> > >> OSCAR and just use cexec or something to push out the new
> > >> scheduler to the other nodes...once I have them. Thanks for
> > >> the feedback. =)
> > >>
> > >> -Jenny =)
> > >> At 11:28 AM -0700 9/3/04, Bernard Li wrote:
> > >> >Hi Jenny:
> > >> >
> > >> >Just throwing in my 2 cents here.
> > >> >
> > >> >If you are new to batch scheduling systems and you don't have a
> > >> >preference, I would recommend using something like Torque or SGE
> > >> >(instead of OpenPBS). OpenPBS is no longer being maintained and there
> > >> >are much better choices out there.
> > >> >
> > >> >Cheers,
> > >> >
> > >> >Bernard
> > >> >
> > >> >> -----Original Message-----
> > >> >> From: [EMAIL PROTECTED]
> > >> >> [mailto:[EMAIL PROTECTED] On Behalf Of Jenny Aquilino
> > >> >> Sent: Friday, September 03, 2004 11:08
> > >> >> To: Michael Edwards; [EMAIL PROTECTED]
> > >> >> Subject: Re: [Oscar-users] How do you make master node a compute
> > >> >> node too?
> > >> >>
> > >> >> Hi Michael,
> > >> >>
> > >> >> Wow, thanks for the quick response. =) Yeah, please don't break
> > >> >> your production cluster just to check this out. I will look at the
> > >> >> installer again to see if I see that check box you're talking about.
> > >> >> It actually sounds really familiar. I installed an OSCAR cluster
> > >> >> about 3 years ago and I think I remember seeing that then but I
> > >> >> can't seem to find it now. I did try defining the master node's
> > >> >> private interface as node0.cluster using the install_server GUI with
> > >> >> horrific results. Yeah, I won't try that again. It completely
> > >> >> broke my access to the oda database so I couldn't back out of the
> > >> >> installation or fix it. I spent quite a bit of time trying different
> > >> >> things like granting access to the [EMAIL PROTECTED] user but
> > >> >> it still wouldn't work so I finally just threw my hands up and
> > >> >> rebuilt the system. =P
> > >> >>
> > >> >> As for just getting mpich to work on its own, it appears that it is
> > >> >> configured correctly after the OSCAR installation so that isn't
> > >> >> really a huge problem.
> > >> >> Well, except for the fact that it is only
> > >> >> seeing one processor for use. If you have any idea why that is
> > >> >> happening please let me know. I would really like to have all the
> > >> >> management stuff working though which is why I'm still pushing to
> > >> >> try and get the xpbs stuff setup correctly. If you have any other
> > >> >> ideas for me, that would be great. Thanks Michael.
> > >> >>
> > >> >> -Jenny =)
> > >> >> At 12:33 PM -0400 9/3/04, Michael Edwards wrote:
> > >> >> >I am fairly sure there is a radio box somewhere in the install process
> > >> >> >that says "Use head node to compute". I will try and see if I can find
> > >> >> >it if I can find a computer I don't mind breaking... Don't want to
> > >> >> >mess with my production cluster, at least one of the installer steps
> > >> >> >does things even if you hit cancel :)
> > >> >> >
> > >> >> >If you just need to use lam (or I assume mpich, never tried installing
> > >> >> >that one), it is quite easy to install stand alone and configure to run
> > >> >> >on one computer. OSCAR is mainly handy for installing the scheduler
> > >> >> >and resource managers. Then you could go back and start over with
> > >> >> >OSCAR once you get some compute nodes. Just a thought for a quick fix.
> > >> >> >
> > >> >> >Unless there is something else in OSCAR you would need for development,
> > >> >> >I can't think of anything off the top of my head though.
> > >> >> >I could walk you through using lam on just one node if you need a hand,
> > >> >> >I have done it before while testing things.
> > >> >> >
> > >> >> >On Fri, 3 Sep 2004 09:05:44 -0700, Jenny Aquilino
> > >> >> ><[EMAIL PROTECTED]> wrote:
> > >> >> >> Hi,
> > >> >> >>
> > >> >> >> I was hoping that someone out there might be able to help me out
> > >> >> >> with this question.
> > >> >> >> I am currently working on a rather strange
> > >> >> >> cluster...it's only one node. I know that sounds strange but there
> > >> >> >> is a reason for it. The user only has one system to start out with
> > >> >> >> but would like to add more nodes very soon. The problem is that he
> > >> >> >> can't wait for the other nodes to come on-line before he starts
> > >> >> >> developing on it so I need to make this one node cluster functional
> > >> >> >> for him.
> > >> >> >>
> > >> >> >> I saw the FAQ on the OSCAR sourceforge website that says to modify
> > >> >> >> the /var/spool/pbs/server_priv/nodes file and then restart the
> > >> >> >> pbs_mom, pbs_server in order to make the master a node so I did that.
> > >> >> >> When I run "pbsnodes -a" it does reflect my one node. However, when
> > >> >> >> I run xpbs to see how many processors are allocated to my server, it
> > >> >> >> reflects 0. It also shows the workq as not having access to any
> > >> >> >> processors. I added a print statement to one of the mpich example
> > >> >> >> programs to print out the number of cpus recognized by the
> > >> >> >> "MPI_Comm_size" command and it returned 1 when in fact my system has
> > >> >> >> 2 processors. Does anyone know why this might be? Is one of the
> > >> >> >> processors on the master node always reserved? More generally, is
> > >> >> >> there a set of steps anyone could recommend to be able to get the
> > >> >> >> master node setup like a client as far as the monitoring and
> > >> >> >> scheduling tools go?
> > >> >> >> I did go into
> > >> >> >> qmgr and by hand can enter the information but it seems that
> > >> >> >> when other nodes get built, that information is automatically
> > >> >> >> populated into the tables and I would really like it if I could
> > >> >> >> get the master to populate this information on itself when it is
> > >> >> >> setup to also be a compute node.
> > >> >> >>
> > >> >> >> I hope this e-mail makes sense. I'm a little crazy after several
> > >> >> >> nights of tossing and turning trying to figure out why things aren't
> > >> >> >> working the way I would expect them to. Thanks in advance for any
> > >> >> >> help you can offer. =)
> > >> >> >>
> > >> >> >> -Jenny Aquilino =)
> > >> >> >>
> > >> >> >> -------------------------------------------------------
> > >> >> >> This SF.Net email is sponsored by BEA Weblogic Workshop FREE Java
> > >> >> >> Enterprise J2EE developer tools!
> > >> >> >> Get your free copy of BEA WebLogic Workshop 8.1 today.
> > >> >> >> http://ads.osdn.com/?ad_id=5047&alloc_id=10808&op=click
> > >> >> >> _______________________________________________
> > >> >> >> Oscar-users mailing list
> > >> >> >> [EMAIL PROTECTED]
> > >> >> >> https://lists.sourceforge.net/lists/listinfo/oscar-users
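
On the original question about xpbs showing 0 processors: the nodes file named in the FAQ conventionally holds one host per line, with an optional np=N attribute giving the number of processors PBS may schedule on that host. Here is a minimal Python sketch for checking how many processors a nodes file declares. The function name count_declared_cpus and the sample contents are made up for illustration, and treating a host without np= as one processor is an assumption. Nothing in this thread confirms that a missing np= is what caused the numbers reported above.

```python
def count_declared_cpus(nodes_text):
    """Sum the np= values declared in the text of a PBS nodes file."""
    total = 0
    for raw_line in nodes_text.splitlines():
        fields = raw_line.split()
        if not fields:
            continue  # skip blank lines
        cpus = 1  # assumption: a host with no np= attribute counts as 1
        for field in fields[1:]:  # fields[0] is the hostname
            if field.startswith("np="):
                cpus = int(field[len("np="):])
        total += cpus
    return total


# Hypothetical file contents; node0.cluster is the name used in this thread.
sample = "node0.cluster np=2\n"
print(count_declared_cpus(sample))  # prints 2
```

To check a real installation, pass in the contents of /var/spool/pbs/server_priv/nodes instead of the sample string. Separately, MPI_Comm_size reports the number of processes in the communicator, so an example program launched as a single process prints 1 no matter how many CPUs the machine has.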
