Hi Marina,

Thanks for posting these updates. We've moved from SGE to LSF and are
going to implement Galaxy LSF integration next month, so what you have
posted is very interesting.
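
In case it is useful while debugging, here is a minimal sketch for pulling the job number and queue out of the LSF confirmation line that shows up in your console output ("Job <855341> is submitted to queue <test>."). The function name parse_submission and the regular expression are my own, not anything from Galaxy or the drmaa bindings, and I am assuming the line keeps this format (the optional "default " covers the wording LSF uses when no queue is given).

```python
import re

# Pattern for the confirmation line LSF prints after a submission,
# e.g. "Job <855341> is submitted to queue <test>."
# Assumption: the format matches the line quoted in this thread.
SUBMIT_RE = re.compile(r"Job <(\d+)> is submitted to (?:default )?queue <([^>]+)>\.")


def parse_submission(line):
    """Return (job_id, queue) for an LSF submission line, or None if it does not match."""
    match = SUBMIT_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(parse_submission("Job <855341> is submitted to queue <test>."))
```

With the job number in hand you can look the job up on the LSF side even when the Galaxy process itself has died.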

-Leandro

On Mon, Feb 7, 2011 at 6:22 PM, Marina Gourtovaia <m...@sanger.ac.uk> wrote:
> This has now been narrowed down to a segfault in the drmaa libraries
> immediately after submitting a job when an LSF queue is set explicitly with
> the LSB_DEFAULTQUEUE environment variable.
>
> Marina
>
> On 02/02/2011 16:43, Marina Gourtovaia wrote:
>>
>> Hello
>>
>> I've set up Galaxy to use LSF. My first job failed because Galaxy
>> submitted it to the default queue, which was wrong in my case. However,
>> Galaxy gracefully survived the failure, and I was able to get the job
>> number from the console output and figure out what went wrong.
>>
>> Next time I ran Galaxy with the LSB_DEFAULTQUEUE env variable set like
>> this:
>>
>> LSB_DEFAULTQUEUE=test
>> DRMAA_LIBRARY_PATH=/usr/local/lsf/7.0/linux2.6-glibc2.3-x86_64/lib/libdrmaa.so.1.0.4
>> PATH=/usr/bin:/software/solexa/bin:$PATH  sh run.sh
>>
>> The job is submitted to the correct queue and at this point Galaxy fails
>> with this error:
>>
>> run.sh: line 46:  6506 Segmentation fault      python ./scripts/paster.py serve universe_wsgi.ini $@
>>
>> The job successfully completes in its own time.
>>
>> When I try to run Galaxy again I get the following:
>>
>> galaxy.jobs DEBUG 2011-02-02 16:27:32,565 dispatching job 36 to drmaa runner
>> galaxy.jobs INFO 2011-02-02 16:27:32,675 job 36 dispatched
>> galaxy.jobs.runners.drmaa DEBUG 2011-02-02 16:27:33,192 (36) submitting file /nfs/users/nfs_m/mg8/mygalaxy/galaxy-dist/database/pbs/galaxy_36.sh
>> galaxy.jobs.runners.drmaa DEBUG 2011-02-02 16:27:33,192 (36) command is: java -jar /nfs/users/nfs_m/mg8/mygalaxy/galaxy-dist/tool-data/shared/jars/SamToFastq.jar       VALIDATION_STRINGENCY=SILENT       QUIET=true INPUT=/lustre/scratch103/sanger/mg8/galaxy/datasets/000/dataset_16.dat FASTQ=/lustre/scratch103/sanger/mg8/galaxy/datasets/000/dataset_51.dat SECOND_END_FASTQ=/lustre/scratch103/sanger/mg8/galaxy/datasets/000/dataset_52.dat
>> Job <855341> is submitted to queue <test>.
>> run.sh: line 46:  6506 Segmentation fault      python ./scripts/paster.py serve universe_wsgi.ini $@
>>
>> i.e. it looks like Galaxy is trying to pick up where it left off and
>> fails again.
>>
>> I configured my job runners like this:
>>
>> start_job_runners = drmaa
>> default_cluster_job_runner = drmaa:///
>>
>> Any suggestions?
>>
>> Regards
>>
>> Marina
>>
>
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited,
> a charity registered in England with number 1021457 and a company registered
> in England with number 2742969, whose registered office is 215 Euston Road,
> London, NW1 2BE.
> _______________________________________________
> galaxy-dev mailing list
> galaxy-dev@lists.bx.psu.edu
> http://lists.bx.psu.edu/listinfo/galaxy-dev
>
