Hi guys,

Since the last upgrade we have observed this error. When submitting a job, it sometimes comes back failed with the following error:
tool error
An error occurred with this dataset: *Unable to run this job due to a cluster error, please retry it later*

It is not reproducible: when you relaunch the job, it works. It is still very annoying when demonstrating Galaxy, though.

It looks to me as if something is wrong with torque/pbs: when one submits a number of jobs, it reaches an internal limit of some kind and then stops communicating with torque (the batch system).

galaxy.jobs DEBUG 2013-09-24 14:05:08,692 (4264) Working directory for job is: /data/galaxyTools/galaxydev/database/job_working_directory/004/4264
galaxy.jobs.handler DEBUG 2013-09-24 14:05:08,700 (4264) Dispatching to pbs runner
galaxy.jobs.runners.pbs DEBUG 2013-09-24 14:05:08,885 (4263/9034.galaxy-compute) PBS job state changed from N to R
galaxy.jobs DEBUG 2013-09-24 14:05:08,925 (4264) Persisting job destination (destination id: pbs:///)
galaxy.jobs.handler INFO 2013-09-24 14:05:09,014 (4264) Job dispatched
galaxy.jobs.runners.pbs ERROR 2013-09-24 14:05:09,524 (4262) All attempts to submit job failed
galaxy.jobs.runners.pbs DEBUG 2013-09-24 14:05:15,795 (4264) submitting file /data/galaxyTools/galaxydev/database/pbs/4264.sh
galaxy.jobs.runners.pbs DEBUG 2013-09-24 14:05:15,795 (4264) command is: python /data/galaxyTools/galaxydev/tools/fastq/fastq_groomer.py '/data/galaxyTools/galaxydev/database/files/007/dataset_7546.dat' 'sanger' '/data/galaxyTools/galaxydev/database/files/007/dataset_7610.dat' 'sanger' 'ascii' 'summarize_input'; cd /data/galaxyTools/galaxydev; /data/galaxyTools/galaxydev/set_metadata.sh ./database/files /data/galaxyTools/galaxydev/database/job_working_directory/004/4264 . /data/galaxyTools/galaxydev/universe_wsgi.ini /data/galaxyTools/galaxydev/database/tmp/tmpcdfNIZ /data/galaxyTools/galaxydev/database/job_working_directory/004/4264/galaxy.json /data/galaxyTools/galaxydev/database/job_working_directory/004/4264/metadata_in_HistoryDatasetAssociation_9064_Zk1kgC,/data/galaxyTools/galaxydev/database/job_working_directory/004/4264/metadata_kwds_HistoryDatasetAssociation_9064_yBjKtV,/data/galaxyTools/galaxydev/database/job_working_directory/004/4264/metadata_out_HistoryDatasetAssociation_9064_RBQTv6,/data/galaxyTools/galaxydev/database/job_working_directory/004/4264/metadata_results_HistoryDatasetAssociation_9064_zBFTMS,,/data/galaxyTools/galaxydev/database/job_working_directory/004/4264/metadata_override_HistoryDatasetAssociation_9064_0W_zYn
galaxy.jobs.runners.pbs WARNING 2013-09-24 14:05:15,796 (4264) pbs_submit failed (try 1/5), PBS error 15033: No free connections
galaxy.jobs.runners.pbs WARNING 2013-09-24 14:05:17,798 (4264) pbs_submit failed (try 2/5), PBS error 15033: No free connections
galaxy.jobs.runners.pbs WARNING 2013-09-24 14:05:19,800 (4264) pbs_submit failed (try 3/5), PBS error 15033: No free connections

Have any of you experienced the same problem, and if so, how did you solve it?

Regards,
Philippe
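P.S. In case it helps with diagnosis, here is a rough sketch of how the submit failures could be tallied from the Galaxy log, to see how often error 15033 shows up and for which jobs. It only assumes that each log record sits on its own line and that the warnings keep the "pbs_submit failed (try N/M), PBS error CODE: MESSAGE" wording shown in the log; the function name tally_submit_failures is made up for this example and is not part of Galaxy.

```python
import re
from collections import Counter

# Pattern for warnings such as:
#   (4264) pbs_submit failed (try 1/5), PBS error 15033: No free connections
# Assumption: one log record per line, wording as in the log above.
SUBMIT_FAILURE = re.compile(
    r"\((?P<job>\d+)\) pbs_submit failed "
    r"\(try (?P<attempt>\d+)/(?P<max>\d+)\), "
    r"PBS error (?P<code>\d+): (?P<message>.*)"
)


def tally_submit_failures(lines):
    """Count pbs_submit failures per (error code, message) and per Galaxy job id.

    Hypothetical helper for illustration only.
    """
    by_error = Counter()
    by_job = Counter()
    for line in lines:
        match = SUBMIT_FAILURE.search(line)
        if match is None:
            continue  # not a pbs_submit failure record
        key = (int(match.group("code")), match.group("message").strip())
        by_error[key] += 1
        by_job[int(match.group("job"))] += 1
    return by_error, by_job


if __name__ == "__main__":
    import sys

    # Usage: python tally_failures.py < galaxy_log_file
    errors, jobs = tally_submit_failures(sys.stdin)
    for (code, message), count in errors.most_common():
        print(f"PBS error {code} ({message}): {count} failed attempts")
    for job_id, count in jobs.most_common():
        print(f"job {job_id}: {count} failed attempts")
```

Running this over a longer stretch of the log should show whether the failures cluster around bursts of submissions, which would support the idea of a connection limit being hit.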