Re: [Mauiusers] Output and error files are missing

2007-10-30 Thread Preethi Chockalingam

I have fixed the problem. It was with ssh: I had included only the IP address 
and the public RSA key in the known_hosts file, without an entry for the 
hostname and its public RSA key. Host name verification was the problem.

I thank all who helped me with this. Thanks a lot!
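
A minimal sketch of how such a mismatch could be checked: the helper below reports whether a given name appears in the host field of a known_hosts file. The function name known_host_listed and the host name in the usage note are hypothetical, and the check assumes plain (unhashed) entries; for hashed files, ssh-keygen -F <hostname> is the usual lookup.

```shell
# Succeed if NAME appears in the host field of a known_hosts FILE.
# Assumption: entries are not hashed (no HashKnownHosts).
known_host_listed() {
    name="$1"
    file="$2"
    awk -v name="$name" '
        # skip comments and lines too short to be "hosts keytype key"
        /^[ \t]*#/ || NF < 3 { next }
        {
            # the first field may list several names: host,ip,alias
            n = split($1, hosts, ",")
            for (i = 1; i <= n; i++)
                if (hosts[i] == name) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$file"
}
```

For example, known_host_listed headnode.example.org ~/.ssh/known_hosts || echo "no entry for this hostname". The name to check is whatever appears after the @ in the scp command that pbs_mom logs. Since that command runs with -B (batch mode), scp cannot prompt to accept an unknown host key, so an entry for the IP address alone is probably not enough.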


----- Original Message ----
From: James A. Peltier <[EMAIL PROTECTED]>
To: Preethi Chockalingam <[EMAIL PROTECTED]>
Cc: Alexander Piavka <[EMAIL PROTECTED]>;
Sent: Tuesday, 30 October, 2007 9:06:01 PM
Subject: Re: [Mauiusers] Output and error files are missing

Preethi Chockalingam wrote:
> This is the output of the /var/log/messages file:
> Oct 30 09:00:07 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp 
> -rpB /var/spool/torque/spool/85.academyl.ER 
> [EMAIL PROTECTED]:/home/jaya/torque-2.1.6/trial.e85' 
> failed with status=1, giving up after 4 attempts

Are you using NFS/NIS or some similar setup?  If so, try adding the 
following to your PBS MOM config file:

$usecp *:/ /

James A. Peltier
Technical Director, RHCE
SCIRF | GrUVi @ Simon Fraser University - Burnaby Campus
Phone  : 778-782-3610
Fax: 778-782-3045
Mobile  : 778-840-6434

mauiusers mailing list

Re: [Mauiusers] Output and error files are missing

2007-10-30 Thread James A. Peltier

Preethi Chockalingam wrote:

This is the output of the /var/log/messages file:


Oct 30 09:00:07 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp 
-rpB /var/spool/torque/spool/85.academyl.ER 
[EMAIL PROTECTED]:/home/jaya/torque-2.1.6/trial.e85' 
failed with status=1, giving up after 4 attempts

Are you using NFS/NIS or some similar setup?  If so, try adding the 
following to your PBS MOM config file:

$usecp *:/ /

James A. Peltier
Technical Director, RHCE
SCIRF | GrUVi @ Simon Fraser University - Burnaby Campus
Phone   : 778-782-3610
Fax : 778-782-3045
Mobile  : 778-840-6434
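
As an illustration of the suggestion above, the sketch below appends a $usecp mapping to a MOM config file only if the same line is not already there. The function name add_usecp is hypothetical, and the config path in the usage note is an assumption derived from the /var/spool/torque prefix in the log; the actual location depends on the installation. $usecp makes pbs_mom deliver with a local cp instead of scp/rcp for matching destinations, so it only helps when the destination path is really shared between the nodes.

```shell
# Append "$usecp MAPPING" to CONFIG_FILE unless that exact line exists.
add_usecp() {
    config_file="$1"
    mapping="$2"
    line="\$usecp $mapping"
    # -x: match the whole line, -F: fixed string, so '*' and '$' are literal
    if ! grep -qxF -- "$line" "$config_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$config_file"
    fi
}
```

For example, add_usecp /var/spool/torque/mom_priv/config '*:/ /' would add the line quoted in the message. pbs_mom then needs to re-read its config, typically by restarting it.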

Re: [Mauiusers] Output and error files are missing

2007-10-30 Thread Preethi Chockalingam
' failed with status=1, giving up 
after 4 attempts
Oct 30 09:30:21 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/90.academyl.ER to [EMAIL 
Oct 30 09:37:02 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/91.academyl.OU [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/trial.o91' failed with status=1, giving up 
after 4 attempts
Oct 30 09:37:02 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/91.academyl.OU to [EMAIL 
Oct 30 09:37:06 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/91.academyl.ER [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/trial.e91' failed with status=1, giving up 
after 4 attempts
Oct 30 09:37:06 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/91.academyl.ER to [EMAIL 
Oct 30 09:49:39 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/92.academyl.OU [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/trial.o92' failed with status=1, giving up 
after 4 attempts
Oct 30 09:49:39 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/92.academyl.OU to [EMAIL 
Oct 30 09:49:44 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/92.academyl.ER [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/trial.e92' failed with status=1, giving up 
after 4 attempts
Oct 30 09:49:44 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/92.academyl.ER to [EMAIL 
Oct 30 10:08:14 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/93.academyl.OU [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/loop.out' failed with status=1, giving up 
after 4 attempts
Oct 30 10:08:14 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/93.academyl.OU to [EMAIL 
Oct 30 10:08:18 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/93.academyl.ER [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/loop.error' failed with status=1, giving up 
after 4 attempts
Oct 30 10:08:18 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/93.academyl.ER to [EMAIL 
Oct 30 10:08:49 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/94.academyl.OU [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/loop.out' failed with status=1, giving up 
after 4 attempts
Oct 30 10:08:49 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/94.academyl.OU to [EMAIL 
Oct 30 10:08:53 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/94.academyl.ER [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/loop.error' failed with status=1, giving up 
after 4 attempts
Oct 30 10:08:53 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/94.academyl.ER to [EMAIL 
Oct 30 10:11:27 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/95.academyl.OU [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/try.o95' failed with status=1, giving up 
after 4 attempts
Oct 30 10:11:27 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/95.academyl.OU to [EMAIL 
Oct 30 10:11:31 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/95.academyl.ER [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/try.e95' failed with status=1, giving up 
after 4 attempts
Oct 30 10:11:31 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/95.academyl.ER to [EMAIL 
Oct 30 10:22:36 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/96.academyl.OU [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/try.o96' failed with status=1, giving up 
after 4 attempts
Oct 30 10:22:36 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/96.academyl.OU to [EMAIL 
Oct 30 10:22:40 academylab3 pbs_mom: sys_copy, command '/usr/bin/scp -rpB 
/var/spool/torque/spool/96.academyl.ER [EMAIL 
PROTECTED]:/home/jaya/torque-2.1.6/try.e96' failed with status=1, giving up 
after 4 attempts
Oct 30 10:22:40 academylab3 pbs_mom: req_cpyfile, Unable to copy file 
/var/spool/torque/spool/96.academyl.ER to [EMAIL 
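
With this many repeated entries, a short filter makes it easier to see which spool files were never delivered. The sketch below is only an illustration: the function name failed_spool_files is hypothetical, and it assumes each syslog entry is on a single line in the real file (the line breaks above come from the mail client).

```shell
# Print the unique spool files that pbs_mom reported it could not copy.
# Defaults to /var/log/messages, the file quoted in this thread.
failed_spool_files() {
    log_file="${1:-/var/log/messages}"
    sed -n 's/.*pbs_mom: req_cpyfile, Unable to copy file \([^ ]*\) to .*/\1/p' \
        "$log_file" | sort -u
}
```

Running failed_spool_files with no argument on the pbs_mom node lists each affected .OU and .ER file once, however many retries were logged.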

----- Original Message ----
From: Alexander Piavka <[EMAIL

Re: [Mauiusers] Output and error files are missing

2007-10-30 Thread Preethi Chockalingam

Yes, I find the files in /var/spool/torque/undelivered, but they have a 
different extension (.ER and .OU). Why is this happening?

Regarding the messages file, I am not able to make out what is wrong. I do find 
keywords like sshd. Should I look for some specific message?
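
On the extensions: the scp commands that pbs_mom logs in this thread copy <jobid>.OU to the requested output name and <jobid>.ER to the requested error name, so .OU is the spooled standard output and .ER the spooled standard error. A minimal sketch for listing what is waiting in the undelivered directory follows; the function name list_undelivered is hypothetical and the default directory is the one named in the message above.

```shell
# List spooled job output files with the stream each one holds.
list_undelivered() {
    dir="${1:-/var/spool/torque/undelivered}"
    for f in "$dir"/*.OU "$dir"/*.ER; do
        # an unmatched glob stays literal, so skip names that do not exist
        [ -e "$f" ] || continue
        case "$f" in
            *.OU) kind=stdout ;;
            *.ER) kind=stderr ;;
        esac
        printf '%s\t%s\n' "$kind" "$f"
    done
}
```

The files can be read in place or copied by hand to the intended destination once the delivery problem is understood.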


----- Original Message ----
From: Alexander Piavka <[EMAIL PROTECTED]>
To: Preethi Chockalingam <[EMAIL PROTECTED]>
Cc: rishi pathak <[EMAIL PROTECTED]>;
Sent: Tuesday, 30 October, 2007 4:16:56 PM
Subject: Re: [Mauiusers] Output and error files are missing

On Tue, 30 Oct 2007, Preethi Chockalingam wrote:

> Hi,
> There are no error messages regarding scp and ssh on the pbs_mom node.

  So you don't have any errors in /var/log/messages on pbs_mom regarding scp/ssh?

  On the pbs_mom node, do you have the output and error files of your job in 
or /var/spool/pbs/spool while the job is in E state and after it exits the

> Job status displays the name of the error and output path, but I don't find 
> the files in the specified path.

You can add:
qmgr  -c "set queue NAME keep_completed = 600"
so that when the job completes it is still tracked for 10 minutes,

and after the job completes run 'qstat -f jid'.
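
The full qstat -f listing is long, so a small filter can pull out the fields that matter here. This is a sketch under assumptions: the function name job_paths is hypothetical, and it relies on the "name = value" layout that qstat -f prints for attributes such as job_state, Output_Path and Error_Path. Long values that qstat wraps onto continuation lines would be cut at the first line.

```shell
# Read `qstat -f` output on stdin and print the job state and the
# delivery paths as name=value lines.
job_paths() {
    awk -F' = ' '
        $1 ~ /^[ \t]*(Output_Path|Error_Path|job_state)$/ {
            key = $1
            sub(/^[ \t]+/, "", key)   # drop the leading indentation
            print key "=" $2
        }
    '
}
```

Usage would be qstat -f jid | job_paths, with jid replaced by the real job id. The host part of Output_Path and Error_Path is the name pbs_mom passes to scp, which is the name that has to resolve and be trusted from the compute node.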

> Thanks
> -Preethi
> ----- Original Message ----
> From: Alexander Piavka <[EMAIL PROTECTED]>
> To: Preethi Chockalingam <[EMAIL PROTECTED]>
> Cc: rishi pathak <[EMAIL PROTECTED]>;
> Sent: Tuesday, 30 October, 2007 1:43:07 PM
> Subject: Re: [Mauiusers] Output and error files are missing
>  Look for scp/ssh errors in syslog messages on the pbs_mom node.
> What does 'qstat -f jid' give?
> On Tue, 30 Oct 2007, Preethi Chockalingam wrote:
>> Hi Rishi,
>> I checked my pbs_server and mom logs. I don't find any error.
>> I am able to scp from all nodes in the cluster to the server node, but 
>> the output and error files are still not created.
>> What else do you think could be wrong?
>> Thanks,
>> -Preethi
>> ----- Original Message ----
>> From: rishi pathak <[EMAIL PROTECTED]>
>> To: Preethi Chockalingam <[EMAIL PROTECTED]>
>> Cc:
>> Sent: Tuesday, 30 October, 2007 11:36:22 AM
>> Subject: Re: [Mauiusers] Output and error files are missing
>> Hi,
>>  Check your mom logs and pbs_server logs for 'post job file processing 
>> error'.
>> Also check if you can rsh/rcp (as a cluster user) from any compute node to 
>> the node where pbs_server is running.
>> This is not related to Maui.
>> I suggest you post mom_logs and server_logs for better identification of 
>> the problem.
>> On 10/30/07, Preethi Chockalingam <[EMAIL PROTECTED]> wrote:
>> Hi all,
>> I have been integrating Maui and Torque. When I submit jobs through Torque 
>> they appear in state 'E' and the job comes out of the queue.
>> I am not able to find the output and input files anywhere.
>> Any suggestions on this, please?
>> Thanks in Advance,
>> Preethi.C
>> --
>> Regards--
>> Rishi Pathak


Re: Re: Re: [Mauiusers] Output and error files are missing

2007-10-30 Thread Jan Ploski
Preethi Chockalingam <[EMAIL PROTECTED]> wrote on 10/30/2007 
10:57:50 AM:

> Hi,
> How do I check the maximum space of the spool directory? How do I 
> resolve the problem? Should I increase it?

Just check your disk space with df. This is not TORQUE-specific.

Jan Ploski
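
A minimal sketch of the df check mentioned above, wrapped so the free space of one directory's filesystem comes back as a single number. The function name free_kb is hypothetical; /var/spool/torque/spool in the usage note is the spool path that appears in the pbs_mom log lines of this thread.

```shell
# Print the free space, in 1024-byte blocks, of the filesystem holding DIR.
free_kb() {
    # -P: portable one-line-per-filesystem format, -k: sizes in KiB
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}
```

For example, free_kb /var/spool/torque/spool on the compute node. It is worth running the same check on the directory where the output files are supposed to land.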

Re: Re: [Mauiusers] Output and error files are missing

2007-10-30 Thread Preethi Chockalingam

How do I check the maximum space of the spool directory? How do I resolve the 
problem? Should I increase it?


----- Original Message ----
From: Jan Ploski <[EMAIL PROTECTED]>
To: Preethi Chockalingam <[EMAIL PROTECTED]>
Sent: Tuesday, 30 October, 2007 2:07:07 PM
Subject: Re: Re: [Mauiusers] Output and error files are missing

[EMAIL PROTECTED] wrote on 10/30/2007 08:19:43 AM:

> Hi Rishi,
> I checked my pbs_server and mom logs. I don't find any error. 
> I am able to scp from all nodes in the cluster to the server node, 
> but the output and error files are still not created. 
> What else do you think could be wrong?

Maybe you are running out of disk space in the spool directory on the 
target node.

Jan Ploski


Re: [Mauiusers] Output and error files are missing

2007-10-30 Thread Preethi Chockalingam

There are no error messages regarding scp and ssh on the pbs_mom node.
Job status displays the name of the error and output path, but I don't find the 
files in the specified path.


----- Original Message ----
From: Alexander Piavka <[EMAIL PROTECTED]>
To: Preethi Chockalingam <[EMAIL PROTECTED]>
Cc: rishi pathak <[EMAIL PROTECTED]>;
Sent: Tuesday, 30 October, 2007 1:43:07 PM
Subject: Re: [Mauiusers] Output and error files are missing

  Look for scp/ssh errors in syslog messages on the pbs_mom node.
What does 'qstat -f jid' give?

On Tue, 30 Oct 2007, Preethi Chockalingam wrote:

> Hi Rishi,
> I checked my pbs_server and mom logs. I don't find any error.
> I am able to scp from all nodes in the cluster to the server node, but 
> the output and error files are still not created.
> What else do you think could be wrong?
> Thanks,
> -Preethi
> ----- Original Message ----
> From: rishi pathak <[EMAIL PROTECTED]>
> To: Preethi Chockalingam <[EMAIL PROTECTED]>
> Cc:
> Sent: Tuesday, 30 October, 2007 11:36:22 AM
> Subject: Re: [Mauiusers] Output and error files are missing
> Hi,
>  Check your mom logs and pbs_server logs for 'post job file processing error'.
> Also check if you can rsh/rcp (as a cluster user) from any compute node to the 
> node where pbs_server is running.
> This is not related to Maui.
> I suggest you post mom_logs and server_logs for better identification of 
> the problem.
> On 10/30/07, Preethi Chockalingam <[EMAIL PROTECTED]> wrote:
> Hi all,
> I have been integrating Maui and Torque. When I submit jobs through Torque 
> they appear in state 'E' and the job comes out of the queue.
> I am not able to find the output and input files anywhere.
> Any suggestions on this, please?
> Thanks in Advance,
> Preethi.C
> -- 
> Regards--
> Rishi Pathak

Re: Re: [Mauiusers] Output and error files are missing

2007-10-30 Thread Jan Ploski
[EMAIL PROTECTED] wrote on 10/30/2007 08:19:43 AM:

> Hi Rishi,
> I checked my pbs_server and mom logs. I don't find any error. 
> I am able to scp from all nodes in the cluster to the server node, 
> but the output and error files are still not created. 
> What else do you think could be wrong?

Maybe you are running out of disk space in the spool directory on the 
target node.

Jan Ploski

Re: [Mauiusers] Output and error files are missing

2007-10-29 Thread Preethi Chockalingam
Hi Rishi,

I checked my pbs_server and mom logs. I don't find any error. 
I am able to scp from all nodes in the cluster to the server node, but 
the output and error files are still not created. 
What else do you think could be wrong?


----- Original Message ----
From: rishi pathak <[EMAIL PROTECTED]>
To: Preethi Chockalingam <[EMAIL PROTECTED]>
Sent: Tuesday, 30 October, 2007 11:36:22 AM
Subject: Re: [Mauiusers] Output and error files are missing

   Check your mom logs and pbs_server logs for 'post job file processing error'.
Also check if you can rsh/rcp (as a cluster user) from any compute node to the 
node where pbs_server is running.
This is not related to Maui.

I suggest you post mom_logs and server_logs for better identification of the 
problem.

On 10/30/07, Preethi Chockalingam <[EMAIL PROTECTED]> wrote:
Hi all,
I have been integrating Maui and Torque. When I submit jobs through Torque they 
appear in state 'E' and the job comes out of the queue.
I am not able to find the output and input files anywhere.
Any suggestions on this, please? 
Thanks in Advance,


Rishi Pathak


Re: [Mauiusers] Output and error files are missing

2007-10-29 Thread rishi pathak
   Check your mom logs and pbs_server logs for 'post job file processing
error'.
Also check if you can rsh/rcp (as a cluster user) from any compute node to
the node where pbs_server is running.
This is not related to Maui.

I suggest you post mom_logs and server_logs for better identification of
the problem.
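
The log check suggested above could be scripted roughly as follows. This is a sketch only: the function name logs_with_postjob_errors is hypothetical, and the directories in the usage note are assumptions based on the /var/spool/torque prefix seen elsewhere in this thread and the mom_logs and server_logs names mentioned above.

```shell
# Print the names of log files in LOG_DIR that contain the error string.
logs_with_postjob_errors() {
    log_dir="$1"
    # -l: print only the names of matching files
    grep -l 'post job file processing error' "$log_dir"/* 2>/dev/null
}
```

For example, logs_with_postjob_errors /var/spool/torque/mom_logs on a compute node and logs_with_postjob_errors /var/spool/torque/server_logs on the server. The copy test is best run as the job owner from a compute node, using the same host name that pbs_mom would use, since a copy that works by IP address or as root does not prove that delivery will work.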

On 10/30/07, Preethi Chockalingam <[EMAIL PROTECTED]> wrote:
> Hi all,
> I have been integrating Maui and Torque. When I submit jobs through Torque
> they appear in state 'E' and the job comes out of the queue.
> I am not able to find the output and input files anywhere.
> Any suggestions on this, please?
> Thanks in Advance,
> Preethi.C

Rishi Pathak