[galaxy-dev] PBSpro Cluster Server possible?
Hello,

I have tried to couple our Galaxy server (unified, running on a virtual server, Ubuntu 12) with our big SGI cluster server uv100, which runs PBS Pro as its job system (pbs_version = PBSPro_11.1.1.112253).

On the Galaxy server I have installed:

ii  pbs-drmaa-dev  1.0.10-2  DRMAA for Torque/PBS Pro - devel
ii  pbs-drmaa1     1.0.10-2  DRMAA for Torque/PBS Pro - runtime

We installed the Galaxy sources very recently; job creation is prepared, but no job is scheduled on our uv100. Has anyone heard whether the libdrmaa.so that comes with this package can be used to couple Galaxy to a PBS Pro server?

DRMAA_LIBRARY_PATH has been set accordingly to /usr/lib/libdrmaa.so.1.0.10, and we set

default_cluster_job_runner = drmaa://uv100.awi.de/slong/

We also checked that the Torque system on the Galaxy server can successfully submit jobs with "qsub" to the slong queue on the PBS Pro server.

Best regards and many thanks in advance,
S. Frickenhaus

--
Prof. Dr. Stephan Frickenhaus
FB1 - Biotechnologie
Hochschule Bremerhaven
An der Karlstadt 8
27568 Bremerhaven
0471-4823-525
0151-1741 1631

Alfred-Wegener-Institut f. Polar- u. Meeresforschung
Bioinformatik/Rechenzentrum
Am Handelshafen 12
27570 Bremerhaven
stephan.frickenh...@awi.de
0471-4831-1179
0151-1741 1631

___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:

http://lists.bx.psu.edu/
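One quick check for a setup like the one described above is whether the file named in DRMAA_LIBRARY_PATH can be loaded at all. The following is only a minimal sketch: the helper name check_drmaa_library is hypothetical, and the only thing it assumes about the library is that a DRMAA 1.0 C library exports drmaa_init. It says nothing about whether the library can actually talk to a PBS Pro server.

```python
import ctypes
import os


def check_drmaa_library(path=None):
    """Try to load a DRMAA shared library; return (ok, message).

    Hypothetical helper for illustration, not part of Galaxy.
    """
    # Fall back to the DRMAA_LIBRARY_PATH environment variable.
    path = path or os.environ.get("DRMAA_LIBRARY_PATH")
    if not path:
        return False, "DRMAA_LIBRARY_PATH is not set"
    try:
        lib = ctypes.CDLL(path)
    except OSError as exc:
        return False, "cannot load %s: %s" % (path, exc)
    # drmaa_init belongs to the DRMAA 1.0 C binding; if it is missing,
    # the file is probably not a DRMAA library.
    if not hasattr(lib, "drmaa_init"):
        return False, "%s loaded but has no drmaa_init symbol" % path
    return True, "%s loaded" % path
```

Calling check_drmaa_library("/usr/lib/libdrmaa.so.1.0.10") on the Galaxy server would then report whether that file loads in the Python process.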
Re: [galaxy-dev] PBSpro Cluster Server possible?
Hi Stephan,

I am no expert with PBS, but your default_cluster_job_runner looks wrong. Instead of

default_cluster_job_runner = drmaa://uv100.awi.de/slong/

you should put PBS-specific parameters in the drmaa:/// part. I am using SGE and specify some of the qsub parameters:

default_cluster_job_runner = drmaa://-pe make 16 -q bioinf.q/

This submits the job to our bioinf cluster on a node with 16 free slots. So you should check how you submit jobs from the command line with PBS and use those parameters when setting drmaa:///.

Cheers,
Sascha Kastens

Project Manager
GATC Biotech AG
Jakob-Stadler-Platz 7
D-78467 Konstanz
Phone: +49 (0) 7531-81604110
Fax: +49 (0) 7531-816081
Email: s.kast...@gatc-biotech.com
http://www.gatc-biotech.com

Original message
Subject: [galaxy-dev] PBSpro Cluster Server possible?
Sent: Tuesday, 24 July 2012, 12:23
From: Stephan Frickenhaus (stephan.frickenh...@awi.de)
> [...]
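Sascha's point can be sketched in a few lines of Python. This is only an illustration of the URL shape used in this thread (drmaa://<native options>/), not Galaxy's own parser; the function names are hypothetical, and "-q slong" is an assumed PBS equivalent based on the queue name Stephan mentions.

```python
def drmaa_runner_url(native_options=""):
    """Build a runner URL of the form drmaa://<native options>/."""
    return "drmaa://%s/" % native_options.strip()


def native_options(url):
    """Return the part of a drmaa:// URL that would be read as native options.

    Assumption: everything between 'drmaa://' and the next slash is passed
    to the scheduler as submit options, as the examples in this thread suggest.
    """
    prefix = "drmaa://"
    if not url.startswith(prefix):
        raise ValueError("not a drmaa:// URL: %r" % url)
    return url[len(prefix):].split("/", 1)[0]


# SGE example from this thread.
print(drmaa_runner_url("-pe make 16 -q bioinf.q"))
# Assumed PBS variant that only selects the slong queue.
print(drmaa_runner_url("-q slong"))
# Under the assumption above, the original setting would pass the host name
# as if it were a submit option.
print(native_options("drmaa://uv100.awi.de/slong/"))
```

Under that reading, the host name and queue in drmaa://uv100.awi.de/slong/ are not interpreted as a server and a queue, which would explain why no job reaches the slong queue.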
[galaxy-dev] Local instance is running way too slow!!!
Dear All,

I successfully installed Galaxy on my new MBP with 16 GB of RAM, but when I tried to use Galaxy it was painfully slow! The first test I did was to create an admin account and import data (RNA-seq FASTQ, about 6 GB in size) into the database and then into a history, and that worked fine. The second test was to run FASTQ Groomer on this FASTQ file, and it took forever (3+ hours).

Does anybody have an idea why it is so slow? Could it be that Galaxy was set up to run as a single process instead of using the 8-core processor? If that is the case, how do I fix it?

Please help!

Di Nguyen
Postdoc, U of W, Seattle, WA
Re: [galaxy-dev] Local instance is running way too slow!!!
Galaxy just wraps existing tools, so it's probably not Galaxy that is slow per se, but the FASTQ Groomer tool. Each tool has its own performance characteristics.

I don't use FASTQ Groomer, so I don't know how it can be expected to perform.

Are you sure you need it? If you know that your quality scores are scaled in Sanger units (Ion Torrent and CASAVA 1.8 FASTQ files are), then you may not.

If you look at Activity Monitor you can see whether CPU or disk is the limiting factor for the work you are doing.

Brad

--
Brad Langhorst
langho...@neb.com
978-380-7564

On Jul 24, 2012, at 3:41 PM, Di Nguyen wrote:
> [...]
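Whether a file is already in Sanger units can often be guessed from the quality lines themselves. The sketch below is a common heuristic, not what FASTQ Groomer does: the function name and the max_records limit are hypothetical, it assumes 4-line records, and it relies on the usual ASCII ranges (characters below 59 only occur with the Sanger offset of 33; a minimum of 64 or more suggests the older Illumina offset of 64).

```python
def guess_quality_offset(fastq_lines, max_records=10000):
    """Guess the Phred offset (33 or 64) from 4-line FASTQ records.

    Returns 33, 64, or None when the quality characters are ambiguous.
    Heuristic only; a very high-quality Sanger file can be misread as 64.
    """
    lowest = None
    for index, line in enumerate(fastq_lines):
        if index // 4 >= max_records:
            break
        if index % 4 != 3:
            continue  # only the 4th line of each record holds qualities
        quality = line.rstrip("\r\n")
        if not quality:
            continue
        low = min(ord(ch) for ch in quality)
        lowest = low if lowest is None else min(lowest, low)
    if lowest is None:
        return None
    if lowest < 59:
        return 33  # impossible with a +64 offset, so Sanger scaling
    if lowest >= 64:
        return 64  # likely Illumina 1.3+ scaling
    return None  # 59..63: Solexa-scaled or undecidable


with_sanger = ["@r1\n", "ACGT\n", "+\n", "!5?I\n"]
print(guess_quality_offset(with_sanger))
```

Because it accepts any iterable of lines, an open file handle can be passed directly, and only the first max_records records are inspected.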
[galaxy-dev] Build indexes for BWA, Bowtie, and others in local instance
Dear all,

I just installed my local instance. In order to use the NGS tools, I need indexes. Do I have to build these indexes species by species and program by program, or is there a shortcut, such as Galaxy-compatible indexes that are ready for download? If I am not mistaken, building these indexes can take weeks?

Kindest regards,
Di
Re: [galaxy-dev] Local instance is running way too slow!!!
Hello all,

We were having the same issue, with the groomer taking up to 12 hours for large files. I had a look at the code and saw that it was using only a single core. I changed the code to split the FASTQ input into multiple file parts, process them in parallel, and reassemble the results. It also reassembles the aggregator data (which prints the final summary).

Using 8 cores we saw a 7x improvement. Naturally the data output is identical. One limitation is that it does not support FASTQ files that have multiple lines per sequence. I have read that this practice is discouraged anyway because it was problematic (though it was in the original spec), and I haven't seen it occur in our data so far.

I believe there is still room for improvement, as Python's readline has suboptimal performance: it does too much file I/O without enough buffering.

I'm new to bioinformatics, though I come from a background in R&D computer engineering. If anyone is at the Chicago Galaxy conference, you can talk to Warren Kaplan about this. I can provide the code.

Regards,
Kenny

--
Bioinformatics Architect
Garvan Institute

On Wed, Jul 25, 2012 at 5:54 AM, Langhorst, Brad wrote:
> [...]
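The split, process in parallel, and reassemble approach Kenny describes can be sketched roughly as follows. This is not his code: all function names are hypothetical, the per-chunk work is a stand-in (shifting Phred+64 qualities to Phred+33), the data is held in memory rather than in file parts, and, like his version, it assumes exactly four lines per record.

```python
from multiprocessing import Pool


def split_records(lines, parts):
    """Split FASTQ lines into at most `parts` chunks on record boundaries."""
    if len(lines) % 4 != 0:
        raise ValueError("expected exactly 4 lines per record")
    records = len(lines) // 4
    per_chunk = max(1, -(-records // parts))  # ceiling division
    step = per_chunk * 4
    return [lines[i:i + step] for i in range(0, len(lines), step)]


def convert_chunk(chunk):
    """Stand-in for the grooming step: shift Phred+64 qualities to Phred+33.

    Returns the converted lines and the record count for the summary.
    """
    out = []
    for index, line in enumerate(chunk):
        if index % 4 == 3:
            line = "".join(chr(ord(ch) - 31) for ch in line.rstrip("\n")) + "\n"
        out.append(line)
    return out, len(chunk) // 4


def groom_parallel(lines, workers=8):
    """Process chunks in parallel, then reassemble output and summary."""
    chunks = split_records(lines, workers)
    if workers == 1:
        results = [convert_chunk(chunk) for chunk in chunks]
    else:
        with Pool(workers) as pool:
            # Pool.map returns results in the same order as the chunks.
            results = pool.map(convert_chunk, chunks)
    groomed = [line for part, _ in results for line in part]
    total_records = sum(count for _, count in results)
    return groomed, total_records
```

On platforms where multiprocessing starts workers by spawning (for example macOS and Windows), the call with workers greater than 1 should be made under an `if __name__ == "__main__":` guard. For multi-gigabyte inputs, writing the chunks to temporary files as Kenny did would avoid holding everything in memory.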
Re: [galaxy-dev] Local instance is running way too slow!!!
If your problem really is just that FASTQ Groomer is slow, I implemented several small optimizations for FASTQ Groomer that I think resulted in a big improvement in performance. It seems it is not really used at my institution any more, so I never pushed the changes out to our production server or pushed too hard on the pull request. But I did some testing as I was making the changes, and none of the changes broke the functional tests, so there is some chance they don't break anything.

You can pull my changes from here if you are interested:

https://bitbucket.org/galaxy/galaxy-central/pull-request/20/fastq_groomer-optimizations

-John

John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223

On Tue, Jul 24, 2012 at 6:32 PM, Kenny Sabir wrote:
> [...]