Re: [galaxy-dev] [Nbicgalaxy-admin] RPM repository for NGS tools in Galaxy

2011-03-04 Thread David van Enckevort
Hi James,

On 2 March 2011 19:44, James Taylor  wrote:

> Hi Leon,
>
> Thanks for sharing this with the community!
>
> As far as similar activities, we are actively working on a solution for
> packaging and deploying tools. Enis can share more about that, it is what we
> use already to automatically build our cloud images with all tools and data
> installed.
>
> Importantly, we are not using an existing package manager like RPM for
> (mostly) two reasons. First, we're trying to avoid focusing specifically on
> redhat et al. But more importantly, we want to avoid installing anything at
> the system level. In particular because it is difficult to have multiple
> versions of the same tool installed and usable at the same time. Instead, we
> are installing everything in isolated directories like:
>
>  $GALAXY_APPS/package/version/
>
> And adding the appropriate information to the environment at runtime based
> on requirement tags in the tool config.
>

Thank you for your feedback. We'd be interested in seeing what you do for
your cloud images.

I do think that we can achieve the same as you do with RPM, if we design
our repositories and the RPMs well.

My idea is that we would have two repositories:
1. Repository with the latest versions
2. Repository with versioned RPMs

The repository with the latest versions is used for people who just want to
always have the latest versions, the repository with the versioned RPMs is
used if you want to pin to specific versions. The versioned repository could
install with the same $GALAXY_APPS/package/version/ structure as you use.

The nice thing about having RPM packages is that you know exactly which
version is installed, and that the package management system is already
there to take care of dependencies.

We are considering providing packages for other distributions later, but
serving our own servers is what we start with, and those run Red Hat.
Actually, since RPM is the packaging format chosen in the Linux Standard
Base, any compliant distribution (and all major ones are) should be able to
install RPM packages if we take care to provide the correct dependencies
(i.e. -compat packages for things that are missing).


With kind regards,


 David van Enckevort

>
> On Mar 2, 2011, at 12:38 PM, Leon Mei wrote:
>
> > Dear colleagues,
> >
> > In order to ease administration on our servers running at VIB and
> > NBIC, we will set up an RPM Repository for bioinformatics tools, the
> > primary focus being NGS tools. The purpose is to come to a stable
> > repository of easily installable packages for the common
> > bioinformatics tools that can be used in a local Galaxy server.
> >
> > A list of tools under consideration can be found at
> >
> https://wiki.nbic.nl/index.php/NBIC_%26_VIB_Bioinformatics_RPM_Repository
> >
> > So are there already similar activities going on? If yes, we would
> > really love to hear your experience and probably work together on
> > this.
> >
> > If you would like to join this effort and contribute into this
> > repository, you are more than welcome to contact us as well!
> >
> > Thanks,
> > Leon
> >
> > --
> > Hailiang (Leon) Mei
> > Netherlands Bioinformatics Center (http://www.nbic.nl/)
> > Skype: leon_mei
> > Mobile: +31 6 41709231
> >
> > ___
> > To manage your subscriptions to this and other Galaxy lists, please use
> > the interface at:
> >
> >  http://lists.bx.psu.edu/
>
> -- jt
>
> James Taylor, Assistant Professor, Biology / Computer Science, Emory
> University
>
>
>
>
> ___
> Nbicgalaxy-admin mailing list
> nbicgalaxy-ad...@trac.nbic.nl
> https://trac.nbic.nl/mailman/listinfo/nbicgalaxy-admin
>



-- 
David van Enckevort
Project Leader biobanking taskforce
Software Integration Engineer
BioAssist
mob: +31 6 543 32 276
tel: +31 24 36 19 500
fax: +31 24 89 01 798
E-mail:  david.van.enckev...@nbic.nl
Skype: enckevort76
Netherlands Bioinformatics Centre

260 NBIC
P.O. Box 9101
6500 HB Nijmegen
___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-dev] Galaxy Velvet error: Unknown option -ins_length3

2011-03-04 Thread graham etherington (JIC)
Hi,
I've downloaded the Suite of Velvet assembler tools at 
http://community.g2.bx.psu.edu/
and installed them as detailed in 
http://gmod.827538.n3.nabble.com/attachment/868065/0/README?by-user=t
I've then run velveth in Galaxy using two files - a file each of corresponding 
left and right paired-end reads which gives me the following output:
[0.01] Reading FastQ file 
/opt/galaxy_dist/database/files/002/dataset_2411.dat
[0.003878] 1538 reads found.
[0.003881] Done
[0.003889] Reading FastQ file 
/opt/galaxy_dist/database/files/002/dataset_2412.dat
[0.007500] 1538 reads found.
[0.007502] Done
[0.007533] Reading read set file 
/opt/galaxy_dist/database/files/002/dataset_2456_files/Sequences;
[0.008571] 3076 sequences found
[0.018786] Done
[0.018791] 3076 sequences in total.
[0.018825] Writing into roadmap file 
/opt/galaxy_dist/database/files/002/dataset_2456_files/Roadmaps...
[0.021716] Inputting sequences...
[0.021719] Inputting sequence 0 / 3076
[0.169007] Done inputting sequences
[0.169017] Destroying splay table
[0.177049] Splay table destroyed

I then go to the velvetg tool and select my velveth output file as the input to 
velvetg, use 'auto' as the -ins_length and -ins_length_sd, and then leave the 
remaining ins_length fields blank. I leave the final values (from -exp_cov to  
Minimum Read-Pair Validation) as the default values.
When I execute the job it runs and finishes almost instantly and the only 
output for 'velvetg on data X' is:
[0.01] Unknown option: -ins_length3

The output files Contigs, Contig Stats and Unused Reads are all empty and the 
LastGraph file has the error:
ERROR: /opt/galaxy_dist/database/files/002/dataset_2457_files/LastGraph not 
found!

The version of Velvet that I have installed and that is in the path is 
velvet_1.0.18.

I was wondering if anyone could give me some help here or has any suggestions.

Many thanks,
Graham


Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory,
Norwich Research Park, 
Norwich NR4 7UH.
UK






Re: [galaxy-dev] [Nbicgalaxy-admin] RPM repository for NGS tools in Galaxy

2011-03-04 Thread James Taylor
This would be great. The 'tool dependency injection' part of Galaxy is  
designed so any directory having this structure will work, and you can  
have as many as you want and they will be searched in order.
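
The "searched in order" behaviour can be sketched roughly as follows. This is
only an illustration, not Galaxy's actual code: the function name
find_tool_dir, the root directory names, and the bowtie version number are
hypothetical, and the only thing taken from this thread is the
package/version/ layout.

```python
import os
import tempfile


def find_tool_dir(search_roots, package, version):
    """Return the first <root>/<package>/<version> directory that exists, else None."""
    for root in search_roots:
        candidate = os.path.join(root, package, version)
        if os.path.isdir(candidate):
            return candidate
    return None


# Demo with two throwaway roots (hypothetical names); both hold velvet 1.0.18.
with tempfile.TemporaryDirectory() as tmp:
    local_root = os.path.join(tmp, "local_apps")
    rpm_root = os.path.join(tmp, "rpm_apps")
    os.makedirs(os.path.join(local_root, "velvet", "1.0.18"))
    os.makedirs(os.path.join(rpm_root, "velvet", "1.0.18"))
    os.makedirs(os.path.join(rpm_root, "bowtie", "0.12.7"))

    roots = [local_root, rpm_root]
    velvet_dir = find_tool_dir(roots, "velvet", "1.0.18")  # first root wins
    bowtie_dir = find_tool_dir(roots, "bowtie", "0.12.7")  # only in the second root
    missing_dir = find_tool_dir(roots, "velvet", "9.9.9")  # installed nowhere
```

With this layout, an RPM-managed tree can simply be listed as one more root
next to a hand-installed one, and the order of the list decides which copy
is used.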


On Mar 4, 2011, at 3:21 AM, David van Enckevort wrote:

The repository with the latest versions is used for people who just  
want to always have the latest versions, the repository with the  
versioned RPMs is used if you want to pin to specific versions. The  
versioned repository could install with the same $GALAXY_APPS/ 
package/version/ structure as you use.




Re: [galaxy-dev] custom datatypes

2011-03-04 Thread Daniel Blankenberg
Hi Glen,

Sorry for the delay in response.

> And how do I limit selections in an input drop down to just my specific file 
> type?  I'm guessing I need to extend the Tabular class, but I don't need to 
> add any additional functionality at this point, I just want to limit how the 
> tools can be chained together. 


We added the ability to dynamically create subclasses in changeset 
5176:34d3fcd8037b, by adding subclass="True" to the datatypes_conf.xml file. 
But your guess is correct: before this change you would need to have created a 
dummy do-nothing class.
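
For illustration, a "dummy do-nothing class" amounts to something like the
sketch below. The Tabular class here is only a stand-in so the example runs
on its own (the real one lives in Galaxy), the names MyTabular and accepts
are hypothetical, and the isinstance check is a simplified model of how an
input restricted to one type filters datasets, not Galaxy's actual code.

```python
class Tabular:
    """Stand-in for Galaxy's Tabular datatype class (assumption, for a runnable sketch)."""
    file_ext = "tabular"


class MyTabular(Tabular):
    """Do-nothing subclass: inherits everything, exists only to be a distinct type."""
    file_ext = "mytabular"


def accepts(input_type, dataset_type_instance):
    """Simplified model: an input accepts datasets of its own type or a subclass of it."""
    return isinstance(dataset_type_instance, input_type)


# A tool asking for plain tabular input still accepts the custom type...
tabular_takes_custom = accepts(Tabular, MyTabular())
# ...but a tool asking for the custom type rejects arbitrary tabular data.
custom_takes_tabular = accepts(MyTabular, Tabular())
```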

> when I load data with this tool the format is set to "tabular" by galaxy and 
> not to my custom type.  I have a  feeling Galaxy is ignoring what the tool 
> .xml says the format of the output is and tries to autodetect, and comes up 
> with tabular.  If a data source says the output is a specific type shouldn't 
> galaxy use that as the format rather than autodetect?


There is a lot of legacy magic going on with datatype detection in datasource 
tools. We're currently working on cleaning this up and making it behave more 
sanely (e.g. make better use of the provided format attribute of the output 
dataset).  However, the preferred method of setting the datatype in datasource 
tools is to provide a 'data_type' parameter to Galaxy. If the datasource cannot 
provide this information, you can have Galaxy create it by providing a 
request_param_translation --> request_param tag set in the configuration .xml 
file for the datasource tool. An example:
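
A minimal sketch of such a tag set follows, wrapped in a few lines of Python
that check it is well-formed and show the intended effect. The type name
"mytabular" and the helper resolve_data_type are hypothetical, and the
attribute names (galaxy_name, remote_name, missing) are given from memory of
existing data source tool configs, so please verify them against a tool
shipped with your Galaxy version.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for a data source tool's .xml file.
# "mytabular" is a made-up custom datatype name.
FRAGMENT = """
<request_param_translation>
    <request_param galaxy_name="data_type" remote_name="data_type" missing="mytabular" />
</request_param_translation>
"""


def resolve_data_type(fragment, remote_params):
    """Model the translation: use the remote value if sent, else the 'missing' default."""
    root = ET.fromstring(fragment)
    for param in root.findall("request_param"):
        if param.get("galaxy_name") == "data_type":
            return remote_params.get(param.get("remote_name"), param.get("missing"))
    return None


default_type = resolve_data_type(FRAGMENT, {})                     # remote sent nothing
explicit_type = resolve_data_type(FRAGMENT, {"data_type": "bed"})  # remote sent a value
```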




Please let us know if we can provide additional information or help in any 
other way. Sorry again for the delay.


Thanks for using Galaxy,

Dan


On Feb 23, 2011, at 3:00 PM, Glen Beane wrote:

> I've created a custom file type based on the Galaxy Tabular type  (this is so 
> some tools I'm developing can declare this type as input or output and 
> prevent any arbitrary tabular file from being used as input for a tool)
> 
> I have a data source tool that declares its output format to be my type (this 
> tool brings the user to a website where they query a database and then sends 
> the data file back to galaxy)
> 
> when I load data with this tool the format is set to "tabular" by galaxy and 
> not to my custom type.  I have a  feeling Galaxy is ignoring what the tool 
> .xml says the format of the output is and tries to autodetect, and comes up 
> with tabular.  If a data source says the output is a specific type shouldn't 
> galaxy use that as the format rather than autodetect?
> 
> Also,  I have other tools that declare my type as the input type, yet when I 
> go to select an input file it shows me all tabular files in my history, not 
> just those with my custom type.
> 
> 
> how do I get the format to be correctly set for a file I load from my 
> data_source tool?  And how do I limit selections in an input drop down to 
> just my specific file type?  I'm guessing I need to extend the Tabular class, 
> but I don't need to add any additional functionality at this point, I just 
> want to limit how the tools can be chained together. This datatype is 
> probably just a place holder for what will probably end up being a binary 
> type, so I'm not going to put much effort in. 
> 
> 
> --
> Glen L. Beane
> Software Engineer
> The Jackson Laboratory
> Phone (207) 288-6153
> 
> 
> 
> 


[galaxy-dev] Map with Bowtie for Illumina - multiple input fastqs

2011-03-04 Thread Nicki Gray

Hi

When running bowtie on the command line I was able to use more than  
one fastq file as input simply by listing them separated by a comma eg:


bowtie -p 3 -q -m 2 --best --strata --sam --chunkmb 256 /databank/ 
indices/bowtie/hg18/hg18 input1.fastq,input2.fastq output.sam


How can I do this within galaxy? The "Map with Bowtie for Illumina"  
tool only allows for 1 input fastq file as far as I can see.


Thanks, Nicki
-
Nicki Gray
MRC Molecular Haematology Unit
01865 222434


[galaxy-dev] upload large data file

2011-03-04 Thread Yanji Xu
Dear Sir/Madam,

I installed galaxy in my local server, then I tried to upload a 4.7 Gb fastq 
file into galaxy, but failed.  Below is the error message.

OverflowError: signed integer is greater than maximum

How could I upload large data files into galaxy and process the data?

Any information from you would be quite appreciated.

Thanks,

Dicty

Re: [galaxy-dev] Map with Bowtie for Illumina - multiple input fastqs

2011-03-04 Thread Kelly Vincent

Nicki,

You are right that Galaxy's Bowtie only allows one input fastq. You  
would have to combine your multiple input files into one before  
running it in Galaxy. You can do this with the Concatenate datasets  
tool (under Text Manipulation).
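
For files that are still outside Galaxy, the same effect can be had before
uploading. The sketch below is not the Concatenate datasets tool itself, just
an equivalent for plain-text FASTQ; the function name concatenate and the toy
reads are hypothetical, and it assumes every input file ends with a newline.

```python
import os
import shutil
import tempfile


def concatenate(input_paths, output_path):
    """Append each input file to output_path, in the order given."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


# Demo with two tiny made-up FASTQ files in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "input1.fastq")
    second = os.path.join(tmp, "input2.fastq")
    combined = os.path.join(tmp, "combined.fastq")
    with open(first, "w") as fh:
        fh.write("@read1\nACGT\n+\nIIII\n")
    with open(second, "w") as fh:
        fh.write("@read2\nTTGA\n+\nIIII\n")

    concatenate([first, second], combined)
    with open(combined) as fh:
        combined_text = fh.read()
```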


Let us know if you have any further questions.

Regards,
Kelly


On Mar 4, 2011, at 11:49 AM, Nicki Gray wrote:


Hi

When running bowtie on the command line I was able to use more than  
one fastq file as input simply by listing them separated by a comma  
eg:


bowtie -p 3 -q -m 2 --best --strata --sam --chunkmb 256 /databank/ 
indices/bowtie/hg18/hg18 input1.fastq,input2.fastq output.sam


How can I do this within galaxy? The "Map with Bowtie for Illumina"  
tool only allows for 1 input fastq file as far as I can see.


Thanks, Nicki
-
Nicki Gray
MRC Molecular Haematology Unit
01865 222434


Re: [galaxy-dev] upload large data file

2011-03-04 Thread Ry4an Brase
On Fri, Mar 04, 2011 at 04:28:20PM +, Yanji Xu wrote:
> Dear Sir/Madam,
> 
> I installed galaxy in my local server, then I tried to upload a 4.7 Gb
> fastq file into galaxy, but failed.  Below is the error message.
> 
> OverflowError: signed integer is greater than maximum
> 
> How could I upload large data files into galaxy and process the data?

Use either the Upload from filepath mechanism available for data
libraries
(https://bitbucket.org/galaxy/galaxy-central/wiki/DataLibraries/UploadingFiles),
which has you copy the file to the server in advance and then import it,
or set up the Upload via FTP functionality
(https://bitbucket.org/galaxy/galaxy-central/wiki/UploadViaFTP).

-- 
Ry4an Brase 612-626-6575
Software Developer  Application Development
University of Minnesota Supercomputing Institute  http://www.msi.umn.edu


Re: [galaxy-dev] phastCons and phastOdds scores

2011-03-04 Thread Jennifer Jackson

Hi David,

Apologies for the late reply. We were trying to get you a complete 
answer, but it is not ready yet. Meanwhile, perhaps this will help.


So, for phastCons:

The scores need to be converted to binned array files using 
wiggle_to_binned_array.py from bx-python.


The loc file points to a base directory containing the binned array 
files split by chromosome.


The phastOdds are different, requiring a special output format. Right 
now we are not sure if it is still supported.


James or another team member will follow up. If you have figured it out 
yourself, please post back to the list (if you have time) so that we and 
the list members can learn.


Thanks again!

Best,

Jen
Galaxy team


On 2/1/11 6:58 AM, David Hoover wrote:

Does anyone have a clear explanation of what files are required/available for Get 
Genomic Scores ->  Aggregate datapoints, Compute phastOdds (binned_scores.loc 
and phastOdds.loc files)?  I know there are .mod and .pp files and .wigFix files 
from UCSC, but I can't figure out exactly what Galaxy is looking for.  Do I need 
to install PHAST and generate these score files?

David Hoover
Helix Systems Staff
http://helix.nih.gov
___
galaxy-dev mailing list
galaxy-dev@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-dev


--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org