On further digging, we found that the script is failing in the following part 
of $GALAXY_HOME/lib/galaxy/jobs/runners/pbs.py:

        # submit
        galaxy_job_id = job_wrapper.job_id
        log.debug("(%s) submitting file %s" % ( galaxy_job_id, job_file ) )
        log.debug("(%s) command is: %s" % ( galaxy_job_id, command_line ) )
        job_id = pbs.pbs_submit(c, job_attrs, job_file, pbs_queue_name, None)
        pbs.pbs_disconnect(c)

        # check to see if it submitted
        if not job_id:
            errno, text = pbs.error()
            log.debug( "(%s) pbs_submit failed, PBS error %d: %s" % (galaxy_job_id, errno, text) )
            job_wrapper.fail( "Unable to run this job due to a cluster error" )
            return
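
Since the failure is intermittent, one thing that could be tried is retrying
the submission a few times before failing the job. The following is only a
minimal sketch under stated assumptions, not what Galaxy does: `submit` is a
hypothetical callable standing in for whatever wraps the pbs_submit call above
and returns a job id (or a false value on failure), and the names
`submit_with_retry`, `attempts` and `delay` are made up for illustration.
Whether a retry actually gets around error 15031 is untested.

```python
import time


def submit_with_retry(submit, attempts=3, delay=1.0, sleep=time.sleep):
    """Call submit() until it returns a truthy job id or attempts run out.

    `submit` is a hypothetical zero-argument callable; it is assumed to
    return the PBS job id on success and None (or "") on failure.
    """
    for attempt in range(1, attempts + 1):
        job_id = submit()
        if job_id:
            return job_id
        # Wait before the next try, but not after the final attempt.
        if attempt < attempts:
            sleep(delay)
    return None
```

The `sleep` argument is only there so the delay can be replaced in a test.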

Could this be a problem with the pbs_python egg (pbs_python-4.1.0) used by 
Galaxy, or is it a Torque-specific issue? Just to reiterate, we are on a 
development snapshot of Torque, which is hard to replace because many other 
people are using it.

Also, could you please advise which Torque and pbs_python version combinations 
you have successfully tested against?

Regards,
Sonali

PS: pbs_python version 4.3 is now out 
(https://subtrac.sara.nl/oss/pbs_python/wiki/TorqueInstallation). Why is it 
not in the PSU egg repository yet? Would it make a difference?

-----Original Message-----
From: Sonali Amonkar 
Sent: Tuesday, February 15, 2011 4:10 PM
To: 'Nate Coraor'
Cc: Galaxy Dev
Subject: RE: [galaxy-dev] Error with setuptools version in Galaxy installation 
on Cluster

Hi Nate,

We went through the mom_logs. We could see the jobs that completed 
successfully in Galaxy. However, the jobs/tool runs that failed did not 
appear there. These failed jobs were never submitted to the PBS queue, so 
they were never logged in the mom_logs.
On another note, is the error "pbs_submit failed, PBS error 15031: Protocol 
(ASN.1) error" related to the combination of the pbs_python version in Galaxy 
and the Torque version? 
Also, please find attached a workaround we adopted to get Galaxy working on 
our PBS Torque version. Could this problem be related to one of those workarounds?
Is there any specific PBS Torque version on which Galaxy has been tested?
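
To see how often the submission fails and with which PBS error codes, the
Galaxy server log can be scanned for the "pbs_submit failed" message. This is
a rough sketch, assuming each log entry sits on one line in the form produced
by the log.debug call in pbs.py; the names `FAILURE_RE`,
`find_submit_failures` and `count_by_errno` are made up for illustration, and
the sample line is taken from the log excerpt quoted below.

```python
import re
from collections import Counter

# Matches e.g.:
# galaxy.jobs.runners.pbs DEBUG 2011-02-03 05:17:09,880 (151) pbs_submit
# failed, PBS error 15031: Protocol (ASN.1) error   (all on one line)
FAILURE_RE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\((?P<job_id>[^)]+)\) pbs_submit failed, "
    r"PBS error (?P<errno>\d+): (?P<text>.*)"
)

SAMPLE_LOG = [
    "galaxy.jobs INFO 2011-02-03 05:17:03,522 job 151 dispatched",
    "galaxy.jobs.runners.pbs DEBUG 2011-02-03 05:17:09,880 (151) "
    "pbs_submit failed, PBS error 15031: Protocol (ASN.1) error",
    "galaxy.jobs DEBUG 2011-02-03 05:17:13,363 job 150 ended",
]


def find_submit_failures(lines):
    """Return (timestamp, galaxy_job_id, errno, text) for each failure line."""
    failures = []
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            failures.append((
                match.group("timestamp"),
                match.group("job_id"),
                int(match.group("errno")),
                match.group("text").strip(),
            ))
    return failures


def count_by_errno(failures):
    """Count failures per PBS error number."""
    return Counter(errno for _, _, errno, _ in failures)
```

Running `find_submit_failures` over the lines of the real server log (instead
of `SAMPLE_LOG`) gives the timestamps to compare against the Torque server
logs for the same period.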

Thank you for your time Nate,
Warm Regards,
Sonali Amonkar

-----Original Message-----
From: Nate Coraor [mailto:n...@bx.psu.edu]
Sent: Saturday, February 12, 2011 2:10 AM
To: Sonali Amonkar
Cc: Galaxy Dev
Subject: Re: [galaxy-dev] Error with setuptools version in Galaxy installation 
on Cluster

Nate Coraor wrote:
> Sonali Amonkar wrote:
> > 
> > I am currently facing another issue. When I run my Workflow, I am seeing 
> > the following  error on the server log. This error is not consistent, and 
> > occurs in an erratic manner. 
> > 
> > galaxy.jobs INFO 2011-02-03 05:17:03,522 job 151 dispatched 
> > galaxy.jobs.runners.pbs DEBUG 2011-02-03 05:17:09,755
> > (150/69156.<primaryserver>) PBS job has left queue 
> > galaxy.jobs.runners.pbs DEBUG 2011-02-03 05:17:09,879 (151) 
> > submitting file galaxy-dist/database/pbs/151.sh 
> > galaxy.jobs.runners.pbs DEBUG 2011-02-03 05:17:09,880 (151) command
> > is: java -cp galaxy-dist/tools/my_tools/jars/PreRef1.jar
> > RefFilterModule galaxy-dist/database/files/000/dataset_192.dat
> > galaxy-dist/database/files/000/dataset_194.dat
> > galaxy-dist/database/files/000/dataset_195.dat
> > galaxy.jobs.runners.pbs DEBUG 2011-02-03 05:17:09,880 (151) 
> > pbs_submit failed, PBS error 15031: Protocol (ASN.1) error 
> > galaxy.jobs DEBUG 2011-02-03 05:17:13,363 job 150 ended
> > galaxy.jobs ERROR 2011-02-03 05:17:15,816 Unable to cleanup job 152
> 
> Hi Sonali,
> 
> I am pretty sure this problem is somehow specific to the TORQUE setup, 
> and I see you also posted this to the torquedev list, but 
> unfortunately received no response.
> 
> I am not sure what is up here, but you may want to try adjusting the 
> 'tcp_timeout' server setting. (qmgr -c 'set server tcp_timeout = X')

Also, you may want to see if PBS is making an attempt to queue this job on a 
particular node, and if so, check the mom_logs for that node.

> 
> --nate
> 
> > 
> > Any help/pointer would be appreciated for this issue. Thank you very much 
> > for your time Nate.
> > 
> > Regards,
> > Sonali
> _______________________________________________
> To manage your subscriptions to this and other Galaxy lists, please 
> use the interface at:
> 
>   http://lists.bx.psu.edu/

