Okay well - glad that it's working for you now :-)
 
Cheers,
 
Bernard


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of DD
Sent: Friday, March 31, 2006 12:17
To: [email protected]
Subject: RE: [Oscar-users] Error marking node - Unauthorized Request (Got it!)

That's another glitch whose source I can't identify. The first time I ran "Complete Cluster Installation", it failed. So I re-ran the "./install_cluster eth0" command to get the wizard, went straight to "Complete cluster", and it passed. After that, "test cluster" failed repeatedly until last night, when I was able to pinpoint the pfilter.conf file. I must have done a million things in between, and I can't remember what they were anymore.



Bernard Li <[EMAIL PROTECTED]> wrote:
When you added more nodes to your cluster, did you run "Complete Cluster Installation" prior to testing?
 
Cheers,
 
Bernard


From: DD [mailto:[EMAIL PROTECTED]
Sent: Friday, March 31, 2006 11:45
To: Bernard Li
Subject: RE: [Oscar-users] Error marking node - Unauthorized Request (Got it!)

The ganglia directory has a ganglia.err file, and looking in there, I saw the message: "The number of nodes expected is different from the number of nodes detected.
Check to see if gmond is running on all your nodes and make sure that you
are not having any network issues."

While looking for more clues, I stumbled on the /etc/pfilter.conf files, and there it was! The pfilter.conf file on each of the compute nodes was missing the last two nodes I added to the cluster from the "%define nodes" line. I added those two nodes to pfilter.conf, did a cpush, restarted pfilter, and that was it! (I hope.)
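
A minimal sketch of how that check could be automated, assuming the "%define nodes" line is followed by node names separated by whitespace or commas (the exact pfilter.conf syntax is not shown in this thread, so treat that format, the function name, and the sample node names as hypothetical):

```python
import re

PREFIX = "%define nodes"

def missing_nodes(pfilter_text, expected):
    """Return the expected node names that the '%define nodes' line lacks."""
    for line in pfilter_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(PREFIX):
            # Assumed format: names separated by whitespace and/or commas.
            listed = set(re.split(r"[\s,]+", stripped[len(PREFIX):].strip()))
            listed.discard("")
            return [name for name in expected if name not in listed]
    # No '%define nodes' line found: report every expected node as missing.
    return list(expected)

# Hypothetical sample: two nodes listed, four expected.
sample = "%define nodes node01 node02\n"
print(missing_nodes(sample, ["node01", "node02", "node03", "node04"]))
```

Running something like this against the copy of pfilter.conf on each compute node would show which newly added nodes still need to be listed before the file is pushed out again.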

My cluster is working now. Thanks, Bernard Li.

Dzung





Bernard Li <[EMAIL PROTECTED]> wrote:
Test error logs are stored in the package-specific directories in /home/oscartst as .err files - anything interesting there?
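
A small sketch for finding which of those .err files actually contain something, assuming only the layout described above (package-specific subdirectories under /home/oscartst); the helper name is made up for illustration:

```python
from pathlib import Path

def nonempty_err_files(root):
    """Return sorted paths of all non-empty *.err files below root."""
    return sorted(
        path
        for path in Path(root).rglob("*.err")
        if path.is_file() and path.stat().st_size > 0
    )

if __name__ == "__main__":
    log_root = Path("/home/oscartst")
    # Only scan when the directory exists, so the sketch is safe to run anywhere.
    if log_root.is_dir():
        for err_file in nonempty_err_files(log_root):
            print(err_file)
```

Empty .err files are skipped on the assumption that a test which wrote nothing to its error log is less likely to be the one that failed.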

Cheers,

Bernard

________________________________

From: [EMAIL PROTECTED] on behalf of DD
Sent: Thu 30/03/2006 09:48
To: [email protected]
Subject: RE: [Oscar-users] Error marking node - Unauthorized Request


Well, I fixed that problem, but then something else must have broken, because ./test_cluster now shows this:

# ./test_cluster
Performing root tests...
Torque node check [PASSED]
Torque service check:pbs_server [PASSED]
Maui service check:maui [PASSED]
/home mounts [PASSED]

Preparing user tests...
Performing user tests...
SSH ping test [PASSED]
SSH server->node [PASSED]
SSH node->server [PASSED]
LAM/MPI (via PBS) [FAILED]
Torque default queue definition [PASSED]
Torque Shell Test [FAILED]
PVM (via Torque) [FAILED]
MPICH (via Torque) [FAILED]
Can't find string terminator '"' anywhere before EOF at -e line 1.

Ganglia setup test [FAILED]
There were issues running some user test scripts. Please check your logs
located in /home/oscartst.

Run APItests...

Running Installation tests for pvm
[PASS] 2006-03-30T11:32:24Z pvmd-path-ls.apt
[PASS] 2006-03-30T11:32:24Z envvar-pvm_arch.apt
[PASS] 2006-03-30T11:32:24Z envvar-pvm_root.apt
[PASS] 2006-03-30T11:32:24Z pvmd-path-which.apt
[PASS] 2006-03-30T11:32:24Z modulecmd-path-ls.apt
[PASS] 2006-03-30T11:32:24Z pvm-module-list.apt
[PASS] 2006-03-30T11:32:24Z pvm-module-show-pvm_rsh.apt
[PASS] 2006-03-30T11:32:25Z pvm-module-show-pvm_arch.apt
[PASS] 2006-03-30T11:32:25Z pvm-module-show-pvm_root.apt
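
For long runs, the failed user tests can be pulled out of output like the above with a short script. This is only a sketch that assumes each result line ends in "[PASSED]" or "[FAILED]" as in the paste; the function name and the shortened sample are illustrative:

```python
import re

# A result line is a test name followed by a trailing [PASSED] or [FAILED].
STATUS_RE = re.compile(r"^(?P<name>.+?)\s*\[(?P<status>PASSED|FAILED)\]$")

def failed_tests(output):
    """Return the names of tests whose line ends in [FAILED]."""
    failed = []
    for line in output.splitlines():
        match = STATUS_RE.match(line.strip())
        if match and match.group("status") == "FAILED":
            failed.append(match.group("name"))
    return failed

# Shortened sample taken from the output above.
sample = """\
SSH ping test [PASSED]
LAM/MPI (via PBS) [FAILED]
Torque Shell Test [FAILED]
Ganglia setup test [FAILED]
[PASS] 2006-03-30T11:32:24Z pvmd-path-ls.apt
"""
print(failed_tests(sample))
```

The APItest lines start with "[PASS]" rather than ending with a status, so they are ignored by this pattern.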

Bernard Li <[EMAIL PROTECTED]> wrote:

I thought you fixed that problem? As mentioned this is usually due to some mismatch in your hostname resolution... you need to dig into your PBS logs to figure out what the problem is.

Cheers,

Bernard

________________________________

From: [EMAIL PROTECTED] on behalf of DD
Sent: Tue 28/03/2006 19:28
To: [email protected]
Subject: RE: [Oscar-users] Error marking node - Unauthorized Request


I found something's not right with "qmgr".

"qstat -q" lists all the queues correctly;

but using qmgr to create or modify queues fails:

Qmgr: set queue workq resources_available.nodect = 24
qmgr obj=workq svr=default: Unauthorized Request

Qmgr: create queue testq
qmgr obj=testq svr=default: Unauthorized Request

Thanks.

Bernard Li <[EMAIL PROTECTED]> wrote:

They seem to be all free and available, so most definitely an issue with your scheduler (MAUI).

Cheers,

Bernard

________________________________

From: [EMAIL PROTECTED] on behalf of DD
Sent: Tue 28/03/2006 18:50
To: [email protected]
Subject: RE: [Oscar-users] Error marking node - Unauthorized Request


"pbsnodes -a" returned ...
node01
state = free
np = 2
properties = all
ntype = cluster
status = arch=linux,uname=Linux node01 2.6.9-11.ELsmp #1 SMP Wed Jun 8 16:59:12 CDT 2005 x86_64,sessions=4596 5241,nsessions=2,nusers=1,idletime=122203,totmem=8294408kb,availmem=8246976kb,physmem=8165928kb,ncpus=2,loadave=0.00,netload=18446744073118928684,state=free,rectime=1143600489

node02
state = free
np = 2
properties = all
ntype = cluster
status = arch=linux,uname=Linux node02 2.6.9-11.ELsmp #1 SMP Wed Jun 8 16:59:12 CDT 2005 x86_64,sessions=? 0,nsessions=? 0,nusers=0,idletime=37638,totmem=8294408kb,availmem=8251144kb,physmem=8165928kb,ncpus=2,loadave=0.00,netload=311232074,state=free,rectime=1143600489

node03
state = free
np = 2
properties = all
ntype = cluster
status = arch=linux,uname=Linux node03 2.6.9-11.ELsmp #1 SMP Wed Jun 8 16:59:12 CDT 2005 x86_64,sessions=? 0,nsessions=? 0,nusers=0,idletime=122202,totmem=8294408kb,availmem=8251664kb,physmem=8165928kb,ncpus=2,loadave=0.00,netload=18446744072900255856,state=free,rectime=1143600489
.
.
.
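
With many nodes, a listing in this shape is easier to scan once it is parsed. A rough sketch, assuming blocks separated by blank lines, a node name on its own line, and "key = value" attribute lines as shown above (the helper names and the "down" sample are hypothetical, and elision lines such as "." are not handled):

```python
def parse_pbsnodes(text):
    """Map node name -> attribute dict for 'pbsnodes -a'-style text."""
    nodes, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None              # a blank line ends the current block
        elif " = " in line:
            if current is not None:
                key, value = line.split(" = ", 1)
                nodes[current][key.strip()] = value.strip()
        else:
            current = line              # a line without ' = ' names a node
            nodes[current] = {}
    return nodes

def not_free(nodes):
    """Return names of nodes whose state is anything other than 'free'."""
    return [name for name, attrs in nodes.items() if attrs.get("state") != "free"]

sample = """\
node01
state = free
np = 2

node02
state = down
np = 2
"""
print(not_free(parse_pbsnodes(sample)))
```

An empty result would match the situation in this message, where every node reports "state = free" and yet no jobs run.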



Bernard Li wrote:

Probably issue with the scheduler... or none of your nodes are available... what does pbsnodes -a give you?

Cheers,

Bernard

________________________________

From: [EMAIL PROTECTED] on behalf of DD
Sent: Tue 28/03/2006 18:37
To: [email protected]
Subject: RE: [Oscar-users] Error marking node - Unauthorized Request


While waiting for your response, I consulted with... Google and found this:
http://www.mail-archive.com/[email protected]/msg05197.html

Oh yes, it's credited right back to you as the provider of the solution! :)

Now a different problem - All my jobs get queued up, but nothing is running!

I'll go back searching again.



Bernard Li wrote:

[I'm Cc:ing back to the oscar-users list, please reply there]

I'm just thinking you might have changed the hostname of the headnode and thus confused TORQUE - anyways perhaps you can post what you have in /etc/hosts and also any TORQUE log messages.

Cheers,

Bernard


________________________________

From: DD [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 28, 2006 17:33
To: Bernard Li
Subject: RE: [Oscar-users] Error marking node - Unauthorized Request


Yes, I edited the /etc/hosts file a few times, and now I don't know what it looked like before. But I thought I was careful not to touch the data below the "# These entries are managed by SIS, please don't modify them" line. Not true?



Bernard Li wrote:

This usually means there's a mismatch in the hostname according to TORQUE - have you changed anything in /etc/hosts ever since you built the cluster?

Also, I'd check for the TORQUE logs in /var/spool/pbs/.
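
One way to look for such a mismatch is to check how the head node's name appears in /etc/hosts. The sketch below rests on an assumption that this thread does not confirm: that a head-node name listed on more than one line, or on a loopback (127.x) line, is worth a second look. The hostname, addresses, and function names are all hypothetical:

```python
def hosts_entries(text):
    """Yield (ip, names) for each non-comment line of /etc/hosts-style text."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if line:
            ip, *names = line.split()
            if names:
                yield ip, names

def addresses_for(text, hostname):
    """Return every IP address whose line lists the given hostname."""
    return [ip for ip, names in hosts_entries(text) if hostname in names]

def looks_suspicious(text, hostname):
    """True if hostname maps to zero or several addresses, or to loopback."""
    ips = addresses_for(text, hostname)
    return len(ips) != 1 or ips[0].startswith("127.")

# Hypothetical file in which 'headnode' appears on two lines.
sample = """\
127.0.0.1   localhost.localdomain localhost headnode
10.0.0.1    headnode.cluster headnode
# These entries are managed by SIS, please don't modify them
10.0.0.2    node01
"""
print(addresses_for(sample, "headnode"), looks_suspicious(sample, "headnode"))
```

This only flags candidates for inspection; the TORQUE logs remain the place to confirm which name the server actually rejected.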

Cheers,

Bernard

________________________________

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of DD
Sent: Tuesday, March 28, 2006 16:42
To: OSCAR-Users
Subject: [Oscar-users] Error marking node - Unauthorized Request


Hello everyone,
I was trying to expand my cluster with more nodes today by using the wizard. All the steps seemed to work OK until I got to the ./post_install steps (step #7: Complete Cluster Setup). It has been a long time since I did any update, and I never had a problem setting up OSCAR from scratch. Before I go ahead and redo the entire cluster, maybe someone can take a look at the error below and hopefully suggest a quicker solution. Thanks.

.
.
.
Updating pbs_server nodes
Error marking node node01 - Unauthorized Request
qmgr obj=node01 svr=default: Unauthorized Request
set node node01 np = 2
Error marking node node01 - Unauthorized Request
Error marking node node02 - Unauthorized Request
qmgr obj=node02 svr=default: Unauthorized Request
set node node02 np = 2
Error marking node node02 - Unauthorized Request
Error marking node node03 - Unauthorized Request
qmgr obj=node03 svr=default: Unauthorized Request
set node node03 np = 2
Error marking node node03 - Unauthorized Request
Error marking node node04 - Unauthorized Request
qmgr obj=node04 svr=default: Unauthorized Request
set node node04 np = 2
.
.
.





