Re: [galaxy-dev] [Nbicgalaxy-admin] RPM repository for NGS tools in Galaxy
Hi James,

On 2 March 2011 19:44, James Taylor wrote:

> Hi Leon,
>
> Thanks for sharing this with the community!
>
> As far as similar activities, we are actively working on a solution for
> packaging and deploying tools. Enis can share more about that, it is what we
> use already to automatically build our cloud images with all tools and data
> installed.
>
> Importantly, we are not using an existing package manager like RPM for
> (mostly) two reasons. First, we're trying to avoid focusing specifically on
> redhat et al. But more importantly, we want to avoid installing anything at
> the system level. In particular because it is difficult to have multiple
> versions of the same tool installed and usable at the same time. Instead, we
> are installing everything in isolated directories like:
>
> $GALAXY_APPS/package/version/
>
> And adding the appropriate information to the environment at runtime based
> on requirement tags in the tool config.

Thank you for your feedback. We'd be interested in seeing what you do for your cloud images. I do think that we can achieve the same as you do with RPM, if we design our repositories and the RPMs well. My idea is that we would have two repositories:

1. Repository with the latest versions
2. Repository with versioned RPMs

The repository with the latest versions is for people who always want the latest versions; the repository with the versioned RPMs is for those who want to pin to specific versions. The versioned repository could install with the same $GALAXY_APPS/package/version/ structure as you use.

The nice thing about having RPM packages is that you know exactly which version is installed, and that the package management system is already there to take care of dependencies.

We are considering providing packages for other distributions later, but we start with serving our own services, and those run Red Hat. Actually, since RPM is the packaging format chosen in the Linux Standard Base, any compliant distribution (and all major ones are) should be able to install RPM packages if we take care to provide the correct dependencies (i.e. -compat packages for things that are missing).

With kind regards,
David van Enckevort

> On Mar 2, 2011, at 12:38 PM, Leon Mei wrote:
>
> > Dear colleagues,
> >
> > In order to ease administration on our servers running at VIB and
> > NBIC, we will set up an RPM Repository for bioinformatics tools, the
> > primary focus being NGS tools. The purpose is to come to a stable
> > repository of easily installable packages for the common
> > bioinformatics tools that can be used in a local Galaxy server.
> >
> > A list of tools under consideration can be found at
> > https://wiki.nbic.nl/index.php/NBIC_%26_VIB_Bioinformatics_RPM_Repository
> >
> > So are there already similar activities going on? If yes, we would
> > really love to hear your experience and probably work together on
> > this.
> >
> > If you would like to join this effort and contribute into this
> > repository, you are more than welcome to contact us as well!
> >
> > Thanks,
> > Leon
> >
> > --
> > Hailiang (Leon) Mei
> > Netherlands Bioinformatics Center (http://www.nbic.nl/)
> > Skype: leon_mei
> > Mobile: +31 6 41709231
> >
> > ___
> > To manage your subscriptions to this and other Galaxy lists, please use
> > the interface at:
> >
> > http://lists.bx.psu.edu/
>
> -- jt
>
> James Taylor, Assistant Professor, Biology / Computer Science, Emory
> University
>
> ___
> Nbicgalaxy-admin mailing list
> nbicgalaxy-ad...@trac.nbic.nl
> https://trac.nbic.nl/mailman/listinfo/nbicgalaxy-admin

--
David van Enckevort
Project Leader biobanking taskforce
Software Integration Engineer BioAssist
mob: +31 6 543 32 276
tel: +31 24 36 19 500
fax: +31 24 89 01 798
E-mail: david.van.enckev...@nbic.nl
Skype: enckevort76
Netherlands Bioinformatics Centre
260 NBIC
P.O. Box 9101
6500 HB Nijmegen

___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-dev] Galaxy Velvet error: Unknown option -ins_length3
Hi,

I've downloaded the Suite of Velvet assembler tools at http://community.g2.bx.psu.edu/ and installed them as detailed in http://gmod.827538.n3.nabble.com/attachment/868065/0/README?by-user=t

I've then run velveth in Galaxy using two files (one each of corresponding left and right paired-end reads), which gives me the following output:

[0.01] Reading FastQ file /opt/galaxy_dist/database/files/002/dataset_2411.dat
[0.003878] 1538 reads found.
[0.003881] Done
[0.003889] Reading FastQ file /opt/galaxy_dist/database/files/002/dataset_2412.dat
[0.007500] 1538 reads found.
[0.007502] Done
[0.007533] Reading read set file /opt/galaxy_dist/database/files/002/dataset_2456_files/Sequences;
[0.008571] 3076 sequences found
[0.018786] Done
[0.018791] 3076 sequences in total.
[0.018825] Writing into roadmap file /opt/galaxy_dist/database/files/002/dataset_2456_files/Roadmaps...
[0.021716] Inputting sequences...
[0.021719] Inputting sequence 0 / 3076
[0.169007] Done inputting sequences
[0.169017] Destroying splay table
[0.177049] Splay table destroyed

I then go to the velvetg tool and select my velveth output file as the input to velvetg, use 'auto' as the -ins_length and -ins_length_sd, and then leave the remaining ins_length fields blank. I leave the final values (from -exp_cov to Minimum Read-Pair Validation) as the default values.

When I execute the job it runs and finishes almost instantly, and the only output for 'velvetg on data X' is:

[0.01] Unknown option: -ins_length3

The output files Contigs, Contig Stats and Unused Reads are all empty and the LastGraph file has the error:

ERROR: /opt/galaxy_dist/database/files/002/dataset_2457_files/LastGraph not found!

The version of Velvet that I have installed and that is in the path is velvet_1.0.18.

I was wondering if anyone could give me some help here or has any suggestions.

Many thanks,
Graham

Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory,
Norwich Research Park,
Norwich NR4 7UH.
UK
Re: [galaxy-dev] [Nbicgalaxy-admin] RPM repository for NGS tools in Galaxy
This would be great. The 'tool dependency injection' part of Galaxy is designed so any directory having this structure will work, and you can have as many as you want and they will be searched in order.

On Mar 4, 2011, at 3:21 AM, David van Enckevort wrote:

> The repository with the latest versions is used for people who just want to
> always have the latest versions, the repository with the versioned RPMs is
> used if you want to pin to specific versions. The versioned repository could
> install with the same $GALAXY_APPS/package/version/ structure as you use.
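
The "searched in order" behaviour described above can be sketched roughly as follows. This is only an illustration of the lookup idea, not Galaxy's actual code: the function names, the example package name, and the assumption that executables live in a `bin` subdirectory are all hypothetical.

```python
import os
from typing import Dict, Optional, Sequence


def find_package_dir(base_dirs: Sequence[str], package: str, version: str) -> Optional[str]:
    """Return the first <base>/<package>/<version> directory that exists.

    base_dirs is searched in order, so earlier entries take priority.
    (Hypothetical helper, not part of Galaxy.)
    """
    for base in base_dirs:
        candidate = os.path.join(base, package, version)
        if os.path.isdir(candidate):
            return candidate
    return None


def env_with_package(env: Dict[str, str], package_dir: str) -> Dict[str, str]:
    """Return a copy of env with <package_dir>/bin prepended to PATH.

    The 'bin' subdirectory is an assumption made for this sketch.
    """
    new_env = dict(env)
    bin_dir = os.path.join(package_dir, "bin")
    old_path = new_env.get("PATH", "")
    new_env["PATH"] = bin_dir + (os.pathsep + old_path if old_path else "")
    return new_env


if __name__ == "__main__":
    # Example: two candidate roots, e.g. one filled by versioned RPMs and one by hand.
    roots = ["/opt/galaxy_apps_rpm", "/opt/galaxy_apps_local"]
    found = find_package_dir(roots, "velvet", "1.0.18")
    if found is None:
        print("package not found in any root")
    else:
        print(env_with_package(dict(os.environ), found)["PATH"])
```

Because each tool gets its own modified copy of the environment, several versions of the same package can sit side by side under different `version` directories without interfering with each other.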
Re: [galaxy-dev] custom datatypes
Hi Glen,

Sorry for the delay in response.

> And how do I limit selections in an input drop down to just my specific file
> type? I'm guessing I need to extend the Tabular class, but I don't need to
> add any additional functionality at this point, I just want to limit how the
> tools can be chained together.

We added the ability to dynamically create subclasses in changeset 5176:34d3fcd8037b, by adding subclass="True" to the datatypes_conf.xml file. But your guess is correct: before this change you would need to have created a dummy do-nothing class.

> when I load data with this tool the format is set to "tabular" by galaxy and
> not to my custom type. I have a feeling Galaxy is ignoring what the tool
> .xml says the format of the output is and tries to autodetect, and comes up
> with tabular. If a data source says the output is a specific type shouldn't
> galaxy use that as the format rather than autodetect?

There is a lot of legacy magic going on with datatype detection in datasource tools. We're currently working on cleaning this up and making it behave more sanely (e.g. make better use of the provided format attribute of the output dataset). However, the preferred method of setting the datatype in datasource tools is to provide a 'data_type' parameter to Galaxy. If the datasource cannot provide this information, you can have Galaxy create it by providing a request_param_translation --> request_param tag set in the configuration .xml file for the datasource tool. An example:

Please let us know if we can provide additional information or help in any other way. Sorry again for the delay.

Thanks for using Galaxy,
Dan

On Feb 23, 2011, at 3:00 PM, Glen Beane wrote:

> I've created a custom file type based on the Galaxy Tabular type (this is so
> some tools I'm developing can declare this type as input or output and
> prevent any arbitrary tabular file from being used as input for a tool)
>
> I have a data source tool that declares its output format to be my type (this
> tool brings the user to a website where they query a database and then sends
> the data file back to galaxy)
>
> when I load data with this tool the format is set to "tabular" by galaxy and
> not to my custom type. I have a feeling Galaxy is ignoring what the tool
> .xml says the format of the output is and tries to autodetect, and comes up
> with tabular. If a data source says the output is a specific type shouldn't
> galaxy use that as the format rather than autodetect?
>
> Also, I have other tools that declare my type as the input type, yet when I
> go to select an input file it shows me all tabular files in my history, not
> just those with my custom type.
>
> how do I get the format to be correctly set for a file I load from my
> data_source tool? And how do I limit selections in an input drop down to
> just my specific file type? I'm guessing I need to extend the Tabular class,
> but I don't need to add any additional functionality at this point, I just
> want to limit how the tools can be chained together. This datatype is
> probably just a place holder for what will probably end up being a binary
> type, so I'm not going to put much effort in.
>
> --
> Glen L. Beane
> Software Engineer
> The Jackson Laboratory
> Phone (207) 288-6153
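
The "dummy do-nothing class" mentioned in the reply above can be pictured with a minimal sketch. To stay self-contained it uses a stand-in `Tabular` class rather than Galaxy's real one, and the names `MyTabular`, `file_ext` value `"mytabular"`, and `accepts` are hypothetical. It also assumes that input filtering follows the subclass relationship, which is an assumption of this sketch and not something the thread confirms.

```python
class Tabular:
    """Stand-in for Galaxy's Tabular datatype (assumption: not the real class)."""
    file_ext = "tabular"


class MyTabular(Tabular):
    """Do-nothing subclass: same behaviour as Tabular, but a distinct type."""
    file_ext = "mytabular"  # hypothetical extension name


def accepts(required_type: type, dataset_type: type) -> bool:
    """Assumed matching rule: a dataset fits an input if its type is the
    required type or a subclass of it."""
    return issubclass(dataset_type, required_type)


if __name__ == "__main__":
    # A tool asking for plain tabular input would also see the custom type...
    print(accepts(Tabular, MyTabular))
    # ...but a tool asking for the custom type would not see plain tabular files.
    print(accepts(MyTabular, Tabular))
```

Under that assumption, the subclass adds no functionality at all; its only job is to give tools a narrower type to require, which is what limits how they can be chained.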
[galaxy-dev] Map with Bowtie for Illumina - multiple input fastqs
Hi,

When running bowtie on the command line I was able to use more than one fastq file as input simply by listing them separated by a comma, e.g.:

bowtie -p 3 -q -m 2 --best --strata --sam --chunkmb 256 /databank/indices/bowtie/hg18/hg18 input1.fastq,input2.fastq output.sam

How can I do this within galaxy? The "Map with Bowtie for Illumina" tool only allows for 1 input fastq file as far as I can see.

Thanks,
Nicki

-
Nicki Gray
MRC Molecular Haematology Unit
01865 222434
[galaxy-dev] upload large data file
Dear Sir/Madam,

I installed galaxy on my local server, then I tried to upload a 4.7 Gb fastq file into galaxy, but it failed. Below is the error message.

OverflowError: signed integer is greater than maximum

How could I upload large data files into galaxy and process the data? Any information from you would be much appreciated.

Thanks,
Dicty
Re: [galaxy-dev] Map with Bowtie for Illumina - multiple input fastqs
Nicki,

You are right that Galaxy's Bowtie only allows one input fastq. You would have to combine your multiple input files into one before running it in Galaxy. You can do this with the Concatenate datasets tool (under Text Manipulation).

Let us know if you have any further questions.

Regards,
Kelly

On Mar 4, 2011, at 11:49 AM, Nicki Gray wrote:

> Hi When running bowtie on the command line I was able to use more than one
> fastq file as input simply by listing them separated by a comma eg:
>
> bowtie -p 3 -q -m 2 --best --strata --sam --chunkmb 256 /databank/indices/bowtie/hg18/hg18 input1.fastq,input2.fastq output.sam
>
> How can I do this within galaxy? The "Map with Bowtie for Illumina" tool only
> allows for 1 input fastq file as far as I can see.
>
> Thanks,
> Nicki
>
> -
> Nicki Gray
> MRC Molecular Haematology Unit
> 01865 222434
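
For files that have not yet been uploaded, the same concatenation can be done outside Galaxy before upload. The following is a minimal sketch of that idea, not the implementation of Galaxy's Concatenate datasets tool; the function name and file names are hypothetical. It adds a newline between files only when an input does not already end with one, so FASTQ records from adjacent files cannot run together.

```python
from typing import Sequence


def concatenate_files(inputs: Sequence[str], output: str) -> None:
    """Append each input file to output, in the order given.

    Files are copied in 1 MiB chunks so large FASTQ files are never
    held in memory as a whole.
    """
    with open(output, "wb") as out:
        for path in inputs:
            last_byte = b""
            with open(path, "rb") as src:
                for chunk in iter(lambda: src.read(1 << 20), b""):
                    out.write(chunk)
                    last_byte = chunk[-1:]
            # Keep records from adjacent files on separate lines.
            if last_byte and last_byte != b"\n":
                out.write(b"\n")


if __name__ == "__main__":
    # Hypothetical file names, matching the command line quoted above.
    concatenate_files(["input1.fastq", "input2.fastq"], "combined.fastq")
```

The combined file can then be uploaded as a single dataset and passed to the Bowtie tool as its one fastq input.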
Re: [galaxy-dev] upload large data file
On Fri, Mar 04, 2011 at 04:28:20PM +, Yanji Xu wrote:

> Dear Sir/Madam,
>
> I installed galaxy in my local server, then I tried to upload a 4.7 Gb
> fastq file into galaxy, but failed. Below is the error message.
>
> OverflowError: signed integer is greater than maximum
>
> How could I upload large data files into galaxy and process the data?

Use either the Upload from filepath mechanism available for data libraries ( https://bitbucket.org/galaxy/galaxy-central/wiki/DataLibraries/UploadingFiles ), which has you copy the file to the server in advance and then import it, or set up the Upload via FTP functionality ( https://bitbucket.org/galaxy/galaxy-central/wiki/UploadViaFTP ).

--
Ry4an Brase 612-626-6575
Software Developer
Application Development
University of Minnesota Supercomputing Institute
http://www.msi.umn.edu
Re: [galaxy-dev] phastCons and phastOdds scores
Hi David,

Apologies for the late reply. We were trying to get you a complete answer, but it is not ready yet. Meanwhile, perhaps this will help.

For phastCons: the scores need to be converted to binned array files using wiggle_to_binned_array.py from bx-python. The loc file points to a base directory containing the binned array files split by chromosome.

The phastOdds scores are different, requiring a special output format. Right now we are not sure if it is still supported. James or another team member will follow up.

If you have figured it out yourself, please post back to the list (if you have time) so that we and the list members can learn.

Thanks again!

Best,
Jen
Galaxy team

On 2/1/11 6:58 AM, David Hoover wrote:

> Does anyone have a clear explanation of what files are required/available for
> Get Genomic Scores -> Aggregate datapoints, Compute phastOdds
> (binned_scores.loc and phastOdds.loc files)? I know there are .mod and .pp
> files and .wigFix files from UCSC, but I can't figure out exactly what Galaxy
> is looking for. Do I need to install PHAST and generate these score files?
>
> David Hoover
> Helix Systems Staff
> http://helix.nih.gov
> ___
> galaxy-dev mailing list
> galaxy-dev@lists.bx.psu.edu
> http://lists.bx.psu.edu/listinfo/galaxy-dev

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org