Presuming this is not a production cluster...

You should be able to upgrade the RPMs on top of your existing RPMs;
however, you need to do a few things:

- update RPMs on headnode
- update RPMs on image
  (rpm -Uvh --root /var/lib/systemimager/images/oscarimage <rpms>)
- re-push image or use cexec to update RPMs on compute nodes
- copy updated RPMs to /tftpboot/rpm
- re-start the daemons if that is not automatically done
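
The steps above could be sketched roughly as the dry-run script below. It
only prints the commands so they can be reviewed first. RPM_DIR is a
hypothetical directory holding the downloaded RPMs (assumed to be visible
from the compute nodes, e.g. on a shared filesystem), using rpm -Uvh for
the upgrade is an assumption, and the exact cexec invocation may need
adjusting for your C3 setup. Daemon restarts are left out because the init
script names depend on the packages.

```shell
#!/bin/sh
# Dry-run sketch of the upgrade steps: prints each command, runs nothing.
# IMAGE is the image path from the list above.
# RPM_DIR is a hypothetical directory holding the downloaded RPMs.
IMAGE=/var/lib/systemimager/images/oscarimage
RPM_DIR=${RPM_DIR:-/root/torque-rpms}

plan_upgrade() {
    # 1. Upgrade the packages on the headnode.
    echo "rpm -Uvh $RPM_DIR/*.rpm"
    # 2. Upgrade the same packages inside the image directory tree.
    echo "rpm -Uvh --root $IMAGE $RPM_DIR/*.rpm"
    # 3. Upgrade the running compute nodes with cexec instead of
    #    re-pushing the image (assumes RPM_DIR is reachable on the nodes).
    echo "cexec 'rpm -Uvh $RPM_DIR/*.rpm'"
    # 4. Refresh the copies under /tftpboot/rpm.
    echo "cp $RPM_DIR/*.rpm /tftpboot/rpm/"
}

plan_upgrade
```

Run it as is to see the plan, then execute the printed commands by hand as
root once they look right for your cluster.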

See if that works for you.

P.S. OSCAR 4.2.1b6 should be cut today, would appreciate it if you give
that a whirl as well.

Thanks,

Bernard 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> carlos vasco
> Sent: Tuesday, March 28, 2006 2:09
> To: Bernard Li
> Cc: [email protected]
> Subject: [Oscar-devel] Re: Oscar 4.2.1b5 testing
> 
> Thanks, Bernard, but what is the way to update these RPMs
> (reinstall OSCAR, or can it be done over the existing installation)?
> 
> Carlos
> 
> On 3/28/06, Bernard Li <[EMAIL PROTECTED]> wrote:
> >
> >
> > Hi Carlos:
> >
> > You can use the newer TORQUE RPMs from trunk:
> >
> > http://svn.oscar.openclustergroup.org/oscar/trunk/packages/torque/
> >
> > Please let us know if this fixes your problem.
> >
> > Cheers,
> >
> > Bernard
> >
> >  ________________________________
> >
> > From: carlos vasco [mailto:[EMAIL PROTECTED]
> > Sent: Tue 28/03/2006 01:55
> > To: Bernard Li
> > Cc: [email protected]
> > Subject: Re: Oscar 4.2.1b5 testing
> >
> >
> >
> >
> > Hi Bernard,
> >
> > I have been searching the torque list about the time use issue, and found
> > that:
> >
> > >> We are using torque-1.2.0p5 and Maui-3.2.6p13. When I do a qstat I
> > >> see that the 'Time Use' is only a couple of seconds, yet the jobs
> > >> have been running for a couple of hours. We are running Matlab jobs
> > >> which are launched from a script. They are only single cpu (no mpi).
> > >
> > > This is a bug fixed in 1.2.0p6...
> >
> > Since my problem is very similar (not an MPI issue), and OSCAR 4.2.1
> > ships torque-1.2.0p5 (I think), the solution could be to use
> > torque-1.2.0p6. Is there an easy way to update TORQUE?
> >
> > Thanks,
> > Carlos
> >
> > On 3/28/06, carlos vasco <[EMAIL PROTECTED]> wrote:
> > > Hi Bernard (and oscar-devel, I forgot last time to cc them):
> > >
> > > The test now worked OK, apart from ganglia, but this has been
> > > reconfigured by our IT people, so it should be ok.
> > >
> > > TORQUE still reports 00:00:00 ...
> > >
> > > Carlos
> > >
> > > > On 3/28/06, carlos vasco <[EMAIL PROTECTED]> wrote:
> > > > > Not sure what I did, but apparently while installing OSCAR I modified
> > > > > ssh_config on the server instead of sshd_config, and that is why I
> > > > > think I forgot to modify sshd_config.
> > > > >
> > > > > I am trying the test again.
> > > > >
> > > > > Carlos
> > > > >
> > > > > On 3/28/06, Bernard Li <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > >
> > > > > > [ CC:ing oscar-devel on this ]
> > > > > >
> > > > > > You shouldn't need to manually modify your /etc/ssh/sshd_config to add
> > > > > > the "PermitRootLogin" - this should already be done for you.
> > > > > >
> > > > > > In your error log, it indicates that you have put the option in
> > > > > > /etc/ssh/ssh_config, which is _wrong_.  Try taking out that line and re-run
> > > > > > the tests (that option should be in sshd_config, not ssh_config, but as I
> > > > > > mentioned you shouldn't need to manually modify it).
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Bernard
> > > > > >
> > > > > >  ________________________________
> > > > > >  From: carlos vasco [mailto:[EMAIL PROTECTED]
> > > > > > Sent: Mon 27/03/2006 22:18
> > > > > > To: Bernard Li
> > > > > > Subject: Re: [Oscar-devel] Oscar 4.2.1b5 testing
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Hi, Bernard,
> > > > > >
> > > > > > I don't know exactly what the logs are, I only can find the following
> > > > > > in the /home/oscartst/ directory:
> > > > > >
> > > > > > drwxr-xr-x  2 oscartst oscartst 4096 Mar 27 15:30 ganglia
> > > > > > drwxr-xr-x  2 oscartst oscartst 4096 Mar 27 15:31 lam
> > > > > > drwxr-xr-x  2 oscartst oscartst 4096 Mar 22 16:49 maui
> > > > > > drwxr-xr-x  2 oscartst oscartst 4096 Mar 27 15:30 mpich
> > > > > > -rwxr-xr-x  1 oscartst oscartst 4826 Mar 27 15:30 pbs_test
> > > > > > drwxr-xr-x  2 oscartst oscartst 4096 Mar 27 15:30 pvm
> > > > > > -rwxr-xr-x  1 oscartst oscartst  927 Mar 27 15:30 ssh_user_tests
> > > > > > -rwxr-xr-x  1 oscartst oscartst 7326 Mar 27 15:30 test_cluster
> > > > > > -rwxr-xr-x  1 oscartst oscartst 3562 Mar 27 15:30 testprint
> > > > > > drwxr-xr-x  2 oscartst oscartst 4096 Mar 27 15:30 torque
> > > > > >
> > > > > > In mpich,
> > > > > > -rw-r--r--  1 oscartst oscartst   3093 Mar 18 00:30 cpi.c
> > > > > > -rw-r--r--  1 oscartst oscartst   1732 Mar 18 00:30 cxxhello.cc
> > > > > > -rw-r--r--  1 oscartst oscartst   1647 Mar 18 00:30 f77hello.f
> > > > > > -rwxrwxr-x  1 oscartst oscartst 337512 Mar 27 15:30 mpich-cpi
> > > > > > -rw-------  1 oscartst oscartst    136 Mar 27 15:30 mpichtest.err
> > > > > > -rw-------  1 oscartst oscartst    454 Mar 27 15:30 mpichtest.out
> > > > > > -rwxr-xr-x  1 oscartst oscartst   1412 Mar 18 00:30 pbs_script.mpich
> > > > > > -rw-rw-r--  1 oscartst oscartst    510 Mar 27 13:51 PI21051
> > > > > > -rw-rw-r--  1 oscartst oscartst    510 Mar 27 12:37 PI3408
> > > > > > -rwxr-xr-x  1 oscartst oscartst   2837 Mar 18 00:30 test_user
> > > > > >
> > > > > > I attach the mpichtest files.
> > > > > >
> > > > > > Not sure how to track the TORQUE problem, maybe I can config it in the
> > > > > > same way we configured the other clusters.
> > > > > >
> > > > > > Thanks,
> > > > > > Carlos
> > > > > >
> > > > > >
> > > > > > On 3/27/06, Bernard Li <[EMAIL PROTECTED]> wrote:
> > > > > > > Hi Carlos:
> > > > > > >
> > > > > > > > No problems have been found during installation, but some errors did
> > > > > > > > occur during the test phase (see attachment).
> > > > > > >
> > > > > > > Can you post the relevant logs in /home/oscartst?
> > > > > > >
> > > > > > > > Other problem found is that qstat reports 00:00:00 in the
> > > > > > > > Time Use field.
> > > > > > >
> > > > > > > I wonder if this is a TORQUE bug or a bug of us setting it up - do you
> > > > > > > think you can dig deeper into this?
> > > > > > >
> > > > > > > > During installation, I think I forgot to put PermitRootLogin yes in
> > > > > > > > sshd_config, and after the nodes were created, I cpushed the corrected
> > > > > > > > sshd_config file. Could these be related with the errors?
> > > > > > >
> > > > > > > You shouldn't need to edit sshd_config manually - anyways, we should be
> > > > > > > able to figure out what's wrong by investigating the log files.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Bernard
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 
> 
> _______________________________________________
> Oscar-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/oscar-devel
> 

