It looks like Galaxy works with an LSF queue that submits to Debian etch nodes, but it segfaults when jobs run on Debian lenny nodes. The server itself is running on a lenny node. Investigating further...

Marina

On 02/02/2011 16:43, Marina Gourtovaia wrote:
Hello

I've set up Galaxy to use LSF. My first job failed because Galaxy submitted it to the default queue, which was wrong in my case. However, Galaxy gracefully survived the failure, and I was able to get the job number from the console output and figure out what went wrong.
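
Pulling the job number out of the console output can be scripted. The following is a minimal sketch in standalone Python, not part of Galaxy; it assumes the submission line has the form LSF prints in the log in this message ("Job <ID> is submitted to queue <NAME>."), and additionally assumes an optional "default " before "queue" when no queue was requested. The function name find_submissions is hypothetical.

```python
import re

# Matches lines such as "Job <855341> is submitted to queue <test>."
# The optional "default " part is an assumption about LSF's wording
# when the job lands in the default queue.
SUBMIT_LINE = re.compile(r"Job <(\d+)> is submitted to (?:default )?queue <([^>]+)>")


def find_submissions(console_text):
    """Return a list of (job_id, queue_name) pairs found in the console text."""
    return [(int(m.group(1)), m.group(2)) for m in SUBMIT_LINE.finditer(console_text)]


if __name__ == "__main__":
    sample = "Job <855341> is submitted to queue <test>.\n"
    for job_id, queue in find_submissions(sample):
        print("job %d in queue %s" % (job_id, queue))
```

The job number found this way can then be passed to the usual LSF tools to see what happened to the job.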

Next, I ran Galaxy with the LSB_DEFAULTQUEUE environment variable set like this:

LSB_DEFAULTQUEUE=test DRMAA_LIBRARY_PATH=/usr/local/lsf/7.0/linux2.6-glibc2.3-x86_64/lib/libdrmaa.so.1.0.4 PATH=/usr/bin:/software/solexa/bin:$PATH sh run.sh

The job is submitted to the correct queue and at this point Galaxy fails with this error:

run.sh: line 46: 6506 Segmentation fault python ./scripts/paster.py serve universe_wsgi.ini $@
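
When the server is started from a wrapper script rather than an interactive shell, the "Segmentation fault" message may not be visible, but the exit status still records it. The sketch below is an illustration under assumptions, not something Galaxy provides: it interprets a child's return code, treating a negative value (as reported by Python's subprocess module) or a value above 128 (as reported by a shell) as death by signal. The name describe_exit is hypothetical.

```python
import signal


def _signal_name(signum):
    """Return the symbolic name for a signal number, or None if unknown."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return None


def describe_exit(returncode):
    """Describe how a child process ended, given its return code."""
    if returncode < 0:
        # subprocess reports death by signal N as -N
        name = _signal_name(-returncode) or "signal %d" % -returncode
        return "killed by %s" % name
    if returncode > 128:
        # shells report death by signal N as 128 + N
        name = _signal_name(returncode - 128)
        if name is not None:
            return "killed by %s" % name
    return "exited with status %d" % returncode


if __name__ == "__main__":
    import subprocess
    import sys

    # Usage: python this_script.py sh run.sh
    result = subprocess.run(sys.argv[1:])
    print(describe_exit(result.returncode))
```

A result of "killed by SIGSEGV" corresponds to the shell message shown above.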

The job successfully completes in its own time.

When I try to run Galaxy again I get the following:

galaxy.jobs DEBUG 2011-02-02 16:27:32,565 dispatching job 36 to drmaa runner
galaxy.jobs INFO 2011-02-02 16:27:32,675 job 36 dispatched
galaxy.jobs.runners.drmaa DEBUG 2011-02-02 16:27:33,192 (36) submitting file /nfs/users/nfs_m/mg8/mygalaxy/galaxy-dist/database/pbs/galaxy_36.sh
galaxy.jobs.runners.drmaa DEBUG 2011-02-02 16:27:33,192 (36) command is: java -jar /nfs/users/nfs_m/mg8/mygalaxy/galaxy-dist/tool-data/shared/jars/SamToFastq.jar VALIDATION_STRINGENCY=SILENT QUIET=true INPUT=/lustre/scratch103/sanger/mg8/galaxy/datasets/000/dataset_16.dat FASTQ=/lustre/scratch103/sanger/mg8/galaxy/datasets/000/dataset_51.dat SECOND_END_FASTQ=/lustre/scratch103/sanger/mg8/galaxy/datasets/000/dataset_52.dat
Job <855341> is submitted to queue <test>.
run.sh: line 46: 6506 Segmentation fault python ./scripts/paster.py serve universe_wsgi.ini $@

i.e. it looks like Galaxy is trying to pick up where it left off and fails again.

I configured my job runners like this:

start_job_runners = drmaa
default_cluster_job_runner = drmaa:///

Any suggestions?

Regards

Marina


--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
_______________________________________________
galaxy-dev mailing list
galaxy-dev@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-dev
