My thanks as well, since we don't really have a way to debug LSF here. Platform only granted us a one-month license for development, so I'd have to get another license to debug it.
--nate

Leandro Hermida wrote:
> Hi Marina,
>
> Thanks for posting updates and information... we've moved from SGE to
> LSF and are going to implement Galaxy LSF integration next month, so
> what you have posted is very interesting.
>
> -Leandro
>
> On Mon, Feb 7, 2011 at 6:22 PM, Marina Gourtovaia <m...@sanger.ac.uk> wrote:
> > This has now been narrowed down to a seg fault in the drmaa libraries
> > immediately after submitting a job when an LSF queue is set explicitly with
> > the LSB_DEFAULTQUEUE global variable.
> >
> > Marina
> >
> > On 02/02/2011 16:43, Marina Gourtovaia wrote:
> >>
> >> Hello
> >>
> >> I've set up Galaxy to use LSF. My first job failed because Galaxy
> >> submitted it to the default queue, which was wrong in my case. However,
> >> Galaxy gracefully survived the failure; I was able to get the job number
> >> from the console output and figure out what went wrong.
> >>
> >> The next time, I ran Galaxy with the LSB_DEFAULTQUEUE env variable set like
> >> this:
> >>
> >> LSB_DEFAULTQUEUE=test
> >> DRMAA_LIBRARY_PATH=/usr/local/lsf/7.0/linux2.6-glibc2.3-x86_64/lib/libdrmaa.so.1.0.4
> >> PATH=/usr/bin:/software/solexa/bin:$PATH sh run.sh
> >>
> >> The job is submitted to the correct queue, and at this point Galaxy fails
> >> with this error:
> >>
> >> run.sh: line 46: 6506 Segmentation fault python ./scripts/paster.py
> >> serve universe_wsgi.ini $@
> >>
> >> The job successfully completes in its own time.
> >>
> >> When I try to run Galaxy again I get the following:
> >>
> >> galaxy.jobs DEBUG 2011-02-02 16:27:32,565 dispatching job 36 to drmaa
> >> runner
> >> galaxy.jobs INFO 2011-02-02 16:27:32,675 job 36 dispatched
> >> galaxy.jobs.runners.drmaa DEBUG 2011-02-02 16:27:33,192 (36) submitting
> >> file /nfs/users/nfs_m/mg8/mygalaxy/galaxy-dist/database/pbs/galaxy_36.sh
> >> galaxy.jobs.runners.drmaa DEBUG 2011-02-02 16:27:33,192 (36) command is:
> >> java -jar
> >> /nfs/users/nfs_m/mg8/mygalaxy/galaxy-dist/tool-data/shared/jars/SamToFastq.jar
> >> VALIDATION_STRINGENCY=SILENT QUIET=true
> >> INPUT=/lustre/scratch103/sanger/mg8/galaxy/datasets/000/dataset_16.dat
> >> FASTQ=/lustre/scratch103/sanger/mg8/galaxy/datasets/000/dataset_51.dat
> >>
> >> SECOND_END_FASTQ=/lustre/scratch103/sanger/mg8/galaxy/datasets/000/dataset_52.dat
> >> Job <855341> is submitted to queue <test>.
> >> run.sh: line 46: 6506 Segmentation fault python ./scripts/paster.py
> >> serve universe_wsgi.ini $@
> >>
> >> i.e. it looks like Galaxy is trying to pick up where it left off and fails
> >> again.
> >>
> >> I configured my job runners like this:
> >>
> >> start_job_runners = drmaa
> >> default_cluster_job_runner = drmaa:///
> >>
> >> Any suggestions?
> >>
> >> Regards
> >>
> >> Marina
> >>
> >
> > --
> > The Wellcome Trust Sanger Institute is operated by Genome Research Limited,
> > a charity registered in England with number 1021457 and a company registered
> > in England with number 2742969, whose registered office is 215 Euston Road,
> > London, NW1 2BE.
_______________________________________________
galaxy-dev mailing list
galaxy-dev@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-dev