Re: [Oscar-users] Re: Two install issues OSC2.3/RH80

Jeff Johnson Thu, 04 Sep 2003 15:12:41 -0700

On Thu, 2003-09-04 at 14:25, Terrence Fleury wrote:
> >> On 04 Sep 2003, Jeff Johnson <[EMAIL PROTECTED]> wrote:
> > Greetings,
> > 
> >     I have run into strange behavior on two separate installs of Oscar 2.3
> > on top of Red Hat 8.0. In both cases RH8 was updated to current as of Aug
> > 29th. The same behavior was seen on both installs, which were on two
> > separate clusters.
> > 
> > The first was during step 1, download additional packages. After
> > selecting this step a progress bar is displayed and the install gui
> > becomes unresponsive. This condition lasts for over a half hour during
> > which perl (according to top) runs as high as 90%, takes 2GB of RAM and
> > dips into swap before the gui dies. Running the gui again runs fine as
> > long as step 1 is bypassed.
> 
> There are two possible issues here.  One is the 'opd' program and the other
> is the Opder GUI.  The GUI simply calls the 'opd' script (which is found in
> $OSCAR_HOME/scripts/).  It could be that 'opd' is not working properly OR it
> could be that the files you are trying to download are REALLY big and it's
> just taking a long time to transfer the files.  Right now, there's no way to
> display the file download status within the Opder GUI (because opd itself
> doesn't output that info when called from another process).  This is
> something that we will definitely address in the future.


No file transfer takes place. The menu of additional packages to select
never even appears. Selecting "download additional packages" from the
main Oscar install GUI brings up a blank grey window that hangs and then
dies in the manner I described in the original message. From your
comments, I assume the problem lies in the opd script that the GUI calls
when that selection is first made.

> So, my suggestion is to run the $OSCAR_HOME/scripts/opd program from the
> command line and see if you can download the files that way.  It should show
> you a progress bar on a per-file basis so you can see if the problem is opd
> failing, or just huge files taking a long time to download.  
> 
> If running opd from the command line seems to run fine (and quickly), you
> can try the Opder GUI again and look in the /var/cache/oscar/opd directory
> while getting files to see if they are actually coming in.  The files are
> given an .opd extension while downloading.  Any files that were successfully
> downloaded get put in /var/cache/oscar/downloads.
> 
> If the problem is in fact opd failing, please let us know.  Thanks.
> 
> Terry Fleury
> [EMAIL PROTECTED]
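
For anyone following the directory check suggested above, here is a rough
Python sketch of one way to do it. It is not part of OSCAR. The two paths
and the .opd extension come from Terry's description; the function names,
polling interval, and round count are made up for illustration. It polls
the in-progress directory and reports which .opd files are still growing,
which should help tell a slow transfer from a hung opd.

```python
import os
import time

# Paths as described in the quoted advice; adjust if your install differs.
OPD_PARTIAL_DIR = "/var/cache/oscar/opd"       # files still downloading (.opd)
OPD_DONE_DIR = "/var/cache/oscar/downloads"    # files that finished


def snapshot(directory, suffix=""):
    """Return {filename: size_in_bytes} for regular files ending with suffix."""
    sizes = {}
    try:
        entries = os.listdir(directory)
    except OSError:
        return sizes  # directory missing or unreadable: report nothing
    for name in entries:
        path = os.path.join(directory, name)
        if name.endswith(suffix) and os.path.isfile(path):
            try:
                sizes[name] = os.path.getsize(path)
            except OSError:
                pass  # file vanished between listdir and getsize
    return sizes


def growing(before, after):
    """Names that are new or larger in the second snapshot."""
    return sorted(n for n, s in after.items() if s > before.get(n, -1))


def watch(interval=5.0, rounds=12):
    """Hypothetical helper: poll and print download activity."""
    previous = snapshot(OPD_PARTIAL_DIR, ".opd")
    for _ in range(rounds):
        time.sleep(interval)
        current = snapshot(OPD_PARTIAL_DIR, ".opd")
        done = snapshot(OPD_DONE_DIR)
        print("growing:", growing(previous, current), "| completed:", len(done))
        previous = current


if __name__ == "__main__":
    watch()
```

If "growing" stays empty for several rounds while perl keeps consuming
memory, that would point at opd itself rather than at a slow download.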

The other issue, more crucial in my opinion, is a drastic slowdown in
job startup and in ssh transactions involving PAM. Because of it, a
simple cexec or ckill command takes 60-90 seconds to complete. Starting
an MPICH job, whether through PBS or manually
(e.g., mpirun -nolocal -np 34 ./PMB2 -npmin 32), takes a very long time.
To give you an idea: to make the test_cluster script pass, I had to raise
the time factor in all of the test scripts to 12 so that it had 210+
seconds to complete. The cluster is 17 nodes of dual 3GHz Xeons on a
gigabit network. This test normally completes in under 30 seconds.
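
To put numbers on the delay, a small Python sketch along these lines
could time a trivial ssh command against each node, one at a time. This
is only an illustration, not a diagnosis: the helper names are invented,
the node names are whatever you pass on the command line, and it assumes
key-based ssh already works (BatchMode=yes makes ssh fail rather than
prompt). If a bare "ssh node true" is itself slow, the lag is in the
per-connection login path rather than in cexec, PBS, or mpirun.

```python
import subprocess
import sys
import time


def time_command(argv, timeout=300):
    """Run argv; return (elapsed_seconds, returncode).

    returncode is None if the command timed out or could not be started.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        code = result.returncode
    except (subprocess.TimeoutExpired, OSError):
        code = None
    return time.monotonic() - start, code


def ssh_argv(node):
    """Build a minimal ssh invocation that runs 'true' on the node."""
    # BatchMode=yes: never prompt for a password, just fail.
    return ["ssh", "-o", "BatchMode=yes", node, "true"]


if __name__ == "__main__":
    # Usage (node names are placeholders): python time_ssh.py node1 node2
    for node in sys.argv[1:]:
        elapsed, code = time_command(ssh_argv(node))
        print(f"{node}: {elapsed:.1f}s exit={code}")
```

Comparing the per-node times with the 60-90 seconds seen for cexec would
show whether the delay is paid once per connection or accumulates across
nodes.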

What is it about RH8 over RH73 or Oscar2.3 over previous versions with
regard to PAM that causes such a severe lag?

I appreciate your advice.

Jeff
-- 
Jeff Johnson <[EMAIL PROTECTED]>
Western Scientific, Inc

"Rome did not create a great Empire by holding meetings. They did it by
killing all those who opposed them."
