Re: [galaxy-user] BED to BAM conversion in Galaxy

2011-09-29 Thread shamsher jagat
Thanks Jen,

My problem is that I have ChIP-seq data consisting of one BED
file with coordinates:


chr1   724027  724226  61PDWAAXX100706:4:19:6952:18071   -

Then there is a wig file. Is it possible to analyze this data in
Galaxy/Cistrome? I tried to use Cistrome, which gave me an error message.



Thanks
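
A note on the sample line above: it has five whitespace-separated fields, with the strand in the fifth position. In standard BED the fifth column is a numeric score and the strand is the sixth. Whether that is what caused the Cistrome error is not known from this thread. The following is only a minimal sketch of a column check, and the function name describe_bed_line is hypothetical, not part of Galaxy or Cistrome.

```python
def describe_bed_line(line):
    """Split one BED line and flag optional columns that look non-standard."""
    # BED is tab-delimited; split() with no argument also tolerates spaces.
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("BED needs at least chrom, start and end")
    report = {
        "chrom": fields[0],
        "start": int(fields[1]),
        "end": int(fields[2]),
        "n_columns": len(fields),
        "problems": [],
    }
    # Column 5 is the score in standard BED and should be numeric.
    if len(fields) >= 5 and not fields[4].lstrip("-").isdigit():
        report["problems"].append(
            "column 5 should be a numeric score, found %r" % fields[4])
    # Column 6 is the strand in standard BED.
    if len(fields) >= 6 and fields[5] not in ("+", "-", "."):
        report["problems"].append(
            "column 6 should be a strand, found %r" % fields[5])
    return report


sample = "chr1   724027  724226  61PDWAAXX100706:4:19:6952:18071   -"
for problem in describe_bed_line(sample)["problems"]:
    print(problem)
```

If the check reports a problem, inserting a placeholder score column (for example 0) before the strand is one possible fix to try.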


On Wed, Sep 28, 2011 at 3:46 PM, Jennifer Jackson  wrote:

> Hello,
>
> It is possible to go from SAM/BAM to BED, but not the reverse. SAM/BAM
> files contain the actual sequence data associated with the original aligned
> read. BED files only have the reference genome location of the alignment (no
> read "sequence").
>
> It is possible to extract genomic sequence based on BED coordinates, but
> the resulting sequence would not necessarily be the same sequence as in the
> original aligned read (any variation would be lost).
>
> BED is very similar to Interval format, so Interval tools also work with
> BED format. A BED file is basically a 3-12 column, tab delimited file, so
> tools that work with Tabular data are also appropriate for BED files. Note
> that you may need to change the datatype to be interval or tab for certain
> tools to recognize a BED file as an input.
>
> Hopefully this helps,
>
> Jen
> Galaxy team
>
>
>
>
> On 9/22/11 2:55 PM, shamsher jagat wrote:
>
>> Is it possible to use some tool in Galaxy to convert a BED file to a BAM/SAM
>> file? In other words, do we have BEDTools or another option in Galaxy?
>>
>> Thanks
>>
>>
>> ___
>> The Galaxy User list should be used for the discussion of
>> Galaxy analysis and other features on the public server
>> at usegalaxy.org.  Please keep all replies on the list by
>> using "reply all" in your mail client.  For discussion of
>> local Galaxy instances and the Galaxy source code, please
>> use the Galaxy Development list:
>>
>>   http://lists.bx.psu.edu/listinfo/galaxy-dev
>>
>> To manage your subscriptions to this and other Galaxy lists,
>> please use the interface at:
>>
>>   http://lists.bx.psu.edu/
>>
>
> --
> Jennifer Jackson
> http://usegalaxy.org
> http://galaxyproject.org/Support
>
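
Jen's point that sequence pulled from BED coordinates may differ from the original read can be illustrated with a small sketch. Everything here is toy data: the reference "chrT", the read, and the helper names are hypothetical. The only fact relied on is that BED intervals are 0-based and half-open.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_interval(reference, chrom, start, end, strand="+"):
    """Slice reference sequence for a BED interval [start, end)."""
    seq = reference[chrom][start:end]
    return reverse_complement(seq) if strand == "-" else seq


# Toy reference and a hypothetical aligned read carrying one mismatch.
reference = {"chrT": "ACGTACGTAC"}
original_read = "GTAGG"

recovered = extract_interval(reference, "chrT", 2, 7)
print(recovered)                   # reference bases at the read's location
print(recovered == original_read)  # the read's own variation is not recoverable
```

The BED line only records where the read aligned, so the slice returns the reference bases and the mismatch in the read is gone.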
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

Re: [galaxy-user] [galaxy-dev] Add library to dataset performance metric: developer vs production instances

2011-09-29 Thread Duddy, John
We routinely put large compressed fastq files into data libraries by that
method (linking, no copy), and it has been very fast since the patch that
stopped Galaxy from decompressing the files.

You should probably make sure you specify the file format (fastqsanger) so 
Galaxy does not attempt to sniff the file to learn its datatype.

John Duddy
Sr. Staff Software Engineer
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Tel: 858-736-3584
E-mail: jdu...@illumina.com


-----Original Message-----
From: galaxy-dev-boun...@lists.bx.psu.edu 
[mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Jennifer Jackson
Sent: Thursday, September 29, 2011 12:13 PM
To: Roman Valls; Galaxy-Dev
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-dev] [galaxy-user] Add library to dataset performance 
metric: developer vs production instances

Hi Roman,

This is a good question for the development community to provide 
feedback on, so I'll cross-post your question over to that list.

Best,

Jen
Galaxy team

On 9/19/11 2:30 PM, Roman Valls wrote:
> Hello,
>
> Today I was routinely adding a 27GB Illumina lane on my galaxy instance
> running on a cluster node. Just the regular cloned-from-hg type of
> instance with set_metadata_externally, no more tuning.
>
> It took more than 10 minutes to have the dataset imported into a data
> library via the filesystem path upload method... not copying it into
> galaxy, just "linking".
>
> galaxy.jobs INFO 2011-09-19 18:05:08,641 job 120 dispatched
> (...)
> galaxy.jobs DEBUG 2011-09-19 18:16:52,822 job 120 ended
> galaxy.datatypes.metadata DEBUG 2011-09-19 18:16:52,824 Cleaning up
> external metadata files
>
> Since I cannot add datasets to libraries in usegalaxy.org and compare, I
> was wondering if someone can state an approximated average time *for a
> production* galaxy installation to do that operation.
>
> I would like to have some empirical number to show on how a production
> deployment[1] could speed things up, as opposed to having individual
> galaxy instances per user in a cluster (as per IT policies):
>
> http://blogs.nopcode.org/brainstorm/2011/08/22/galaxy-on-uppmax-simplified/
>
> Thanks in advance !
> Roman
>
> [1] http://usegalaxy.org/production

-- 
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support
___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/


[galaxy-user] Fwd: de novo assembly

2011-09-29 Thread John Nash
Forgive the forwarded email. Not used to working from an iPad.  

> On 2011-09-29, at 7:58 PM, Peter Cock  wrote:
> 
>> On Thu, Sep 29, 2011 at 7:49 PM, Cecilia Tamborindeguy
>>  wrote:
>>> Hello,
>>> 
>>> I would like to know if Galaxy can do de novo assembly without a reference
>>> genome.
>>> 
>>> Thanks.
>>> 
>>> Cecilia
>> 
>> Are you trying to use the Public Galaxy or a local install? There
>> are several assemblers with Galaxy Wrappers on the Galaxy
>> ToolShed (e.g. Roche "Newbler", and MIRA 3) which you could
>> add to your own local Galaxy if you have one.
>> 
>> However, de novo genome assembly can be very computationally
>> demanding, so not many Galaxy Instances will want to offer it.
>> 
>> Peter
> 
> I would like to echo Peter's advice. (Again, that's twice in 5 min from 2 
> different lists. I promise you I'm not stalking you, Peter). 
> 
> Genome assembly is a bit of a dedicated domain with respect to expertise and 
> time. If possible, if you are assembling a lot of genome data, you really 
> should set yourself up properly with a multiple-CPU unix box with a lot of 
> RAM and dedicate it to assembly. Install MIRA, Newbler, Velvet, AMOS, 
> samtools, bamtools, bedtools, Staden, phred/phrap/consed on it, and you can 
> assemble and interconvert data to your heart's content. 
> 
> Galaxy is a wonderful and useful service but assembling genomes does require 
> dedicated power and expertise, and preferably in house. I just forked out for 
> a 64 CPU processor with 1 TB RAM bc we assemble lots of genomes. You don't 
> have to go that far but a 3-4 quad processor box with 128 GB RAM and 1 TB 
> disk should be on your mind. 
> 
> John

Re: [galaxy-user] Add library to dataset performance metric: developer vs production instances

2011-09-29 Thread Jennifer Jackson

Hi Roman,

This is a good question for the development community to provide 
feedback on, so I'll cross-post your question over to that list.


Best,

Jen
Galaxy team

On 9/19/11 2:30 PM, Roman Valls wrote:

Hello,

Today I was routinely adding a 27GB Illumina lane on my galaxy instance
running on a cluster node. Just the regular cloned-from-hg type of
instance with set_metadata_externally, no more tuning.

It took more than 10 minutes to have the dataset imported into a data
library via the filesystem path upload method... not copying it into
galaxy, just "linking".

galaxy.jobs INFO 2011-09-19 18:05:08,641 job 120 dispatched
(...)
galaxy.jobs DEBUG 2011-09-19 18:16:52,822 job 120 ended
galaxy.datatypes.metadata DEBUG 2011-09-19 18:16:52,824 Cleaning up
external metadata files
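
For reference, the elapsed time between the "dispatched" and "ended" log lines above can be computed directly from the timestamps. This is a minimal sketch that assumes the two timestamps are copied by hand from the excerpt; the helper name elapsed_seconds is hypothetical.

```python
from datetime import datetime

# Timestamp layout used in the log excerpt: date, time, comma, milliseconds.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(dispatched, ended):
    """Return the number of seconds between two log timestamps."""
    start = datetime.strptime(dispatched, LOG_FORMAT)
    stop = datetime.strptime(ended, LOG_FORMAT)
    return (stop - start).total_seconds()


seconds = elapsed_seconds("2011-09-19 18:05:08,641",
                          "2011-09-19 18:16:52,822")
minutes, remainder = divmod(seconds, 60)
print("%d min %.3f s" % (minutes, remainder))
```

For these two lines the job ran for a little under 12 minutes, consistent with the "more than 10 minutes" reported.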

Since I cannot add datasets to libraries in usegalaxy.org and compare, I
was wondering if someone can state an approximated average time *for a
production* galaxy installation to do that operation.

I would like to have some empirical number to show on how a production
deployment[1] could speed things up, as opposed to having individual
galaxy instances per user in a cluster (as per IT policies):

http://blogs.nopcode.org/brainstorm/2011/08/22/galaxy-on-uppmax-simplified/

Thanks in advance !
Roman

[1] http://usegalaxy.org/production

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support


Re: [galaxy-user] de novo assembly

2011-09-29 Thread Jeremy Goecks

Cecilia,


> Are you trying to use the Public Galaxy or a local install? There
> are several assemblers with Galaxy Wrappers on the Galaxy
> ToolShed (e.g. Roche "Newbler", and MIRA 3) which you could
> add to your own local Galaxy if you have one.


There are wrappers for ABySS as well. These assemblers are generally  
for genome data.


For transcriptome data, galaxy-central provides a wrapper for the  
Trinity assembler.



> However, de novo genome assembly can be very computationally
> demanding, so not many Galaxy Instances will want to offer it.



If you don't want to/can't set up a local instance for assembly,  
consider using a cloud instance:


http://wiki.g2.bx.psu.edu/Admin/Cloud

Good luck,
J.

Re: [galaxy-user] de novo assembly

2011-09-29 Thread Peter Cock
On Thu, Sep 29, 2011 at 7:49 PM, Cecilia Tamborindeguy
 wrote:
> Hello,
>
> I would like to know if Galaxy can do de novo assembly without a reference
> genome.
>
> Thanks.
>
> Cecilia

Are you trying to use the Public Galaxy or a local install? There
are several assemblers with Galaxy Wrappers on the Galaxy
ToolShed (e.g. Roche "Newbler", and MIRA 3) which you could
add to your own local Galaxy if you have one.

However, de novo genome assembly can be very computationally
demanding, so not many Galaxy Instances will want to offer it.

Peter


[galaxy-user] de novo assembly

2011-09-29 Thread Cecilia Tamborindeguy
Hello,
I would like to know if Galaxy can do de novo assembly without a reference 
genome.
Thanks.
Cecilia

Re: [galaxy-user] Server error

2011-09-29 Thread Jennifer Jackson

Hello Karthik,

Are you still running into this problem (is it reproducible)? If so, 
would you be able to provide a few more details?


- Are you using the main public Galaxy instance at http://usegalaxy.org ?

- Are you logged in as a user or a guest ?

- What are the exact steps that lead to this error?

- It may be most helpful to explain the steps and share a link to your 
history so that we can follow them exactly. Do this by using "Options -> 
Share or Publish", generating a link, then emailing it back to me 
directly (no cc to the mailing list, to keep your data private). Or, you 
can share the history with just me using my email address. In your email 
back with the other details, please note the name of the history so that 
I can locate it, if you use that method.


Hopefully we can sort out the root cause of the problem,

Best,

Jen
Galaxy team

On 9/29/11 2:02 AM, Karthik R Padmanabhan wrote:

Hello,

I am currently experiencing a server error (see below):

-


  Server Error

An error occurred. See the error logs for more information. (Turn debug
on to display exception reports here)

-

Kindly take a look. Thanks.

--
Regards,

Karthik R. Padmanabhan




--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support


[galaxy-user] non-coding RNA

2011-09-29 Thread dongdong zhaoweiming
Hi, I want to evaluate whether my assembled transcripts produced by Trinity are
protein-coding or non-coding. I found two methods: the "txCdsPredict"
program from UCSC (John R Prensner, 2011) and Codon Substitution
Frequencies, CSF (Michael F. Lin, 2008). I wonder if Galaxy can do this? Thanks a
lot!
 
weimin zhao

Re: [galaxy-user] problem mapping 454 reads in fastq format using LASTZ

2011-09-29 Thread Jennifer Jackson

Hello Chris,

Using a fasta file as input for LASTZ is the correct way to run the tool 
right now.


You discovered a small mismatch between our documentation and the 
version of the LASTZ wrapper on the main Galaxy instance. Fastq is not 
directly accepted (contrary to what the documentation states), so you 
need to either use an original fasta file or run the "FASTQ to FASTA 
converter" tool before running LASTZ.
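
For anyone working outside Galaxy, the conversion step amounts to keeping the header and sequence of each record and dropping the quality lines. The following is only a minimal sketch, not the implementation of the Galaxy "FASTQ to FASTA converter" tool. It assumes strict four-line FASTQ records (no wrapped sequences); the function name and the example reads are hypothetical.

```python
def fastq_to_fasta(fastq_text):
    """Convert four-line FASTQ records to FASTA, discarding qualities."""
    # Ignore blank lines so double-spaced input is tolerated.
    lines = [ln for ln in fastq_text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    out = []
    for i in range(0, len(lines), 4):
        header, seq, plus, _qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed record starting at line %d" % (i + 1))
        out.append(">" + header[1:])  # '@name' becomes '>name'
        out.append(seq)
    return "\n".join(out) + "\n"


example = "@read1 length=4\nACGT\n+\nIIII\n@read2\nGG\n+\n<<\n"
print(fastq_to_fasta(example), end="")
```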


Currently, LASTZ itself does not use quality scores for the alignment 
process, but will pass these values (ascii format) into the output file 
(SAM) for use in downstream analysis. The public Galaxy instance will 
likely be updated in the future to support this option, but there is no 
set time-line. To be clear, the alignment results themselves would be 
the same with or without the quality scores being passed through a fastq 
input/SAM output.
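
On the "ascii format" mentioned above: in Sanger-style FASTQ each quality score is stored as a single character, chr(score + 33), rather than as space-separated decimal numbers like those in the reads quoted later in this message. A minimal sketch of that encoding follows, assuming the decimal values are Phred scores; the function names are hypothetical.

```python
SANGER_OFFSET = 33  # Sanger FASTQ stores Phred score q as chr(q + 33)


def phred_to_sanger(scores):
    """Encode a list of integer Phred scores as a Sanger quality string."""
    for q in scores:
        if not 0 <= q <= 93:
            raise ValueError("Phred score out of Sanger range: %d" % q)
    return "".join(chr(q + SANGER_OFFSET) for q in scores)


def sanger_to_phred(quality_string):
    """Decode a Sanger quality string back to integer Phred scores."""
    return [ord(c) - SANGER_OFFSET for c in quality_string]


# First eight quality values of the second read quoted in this thread.
decimal_quals = "27 23 23 19 19 19 18 19"
scores = [int(tok) for tok in decimal_quals.split()]
encoded = phred_to_sanger(scores)
print(encoded)
```

The encoded string has exactly one character per base, which is what tools expecting fastqsanger input rely on.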


Our apologies for the confusion!

Best,

Jen
Galaxy team

On 9/26/11 7:49 PM, chris.how...@csiro.au wrote:

Dear Galaxy Team and users,

I have some 454 reads that I would like to map against a contig assembly
using LASTZ. I have already mapped the reads uploaded in fasta format
against the assembly but, as mapping the reads in fasta ignores the base
qualities that would be present in a fastq file, I am concerned that I
might need the base quality information that may be crucial in deciding
on ‘real’ SNPs later down the line. So, as LASTZ apparently recognises
fastq (*see below), I converted the reads to fastq using the ‘Combine
fasta and qual’ tool in Galaxy and I am now currently trying to map the
reads to the assembly. However, Galaxy would not recognise the fastq
reads in the LASTZ input page. So I tried to fool it by changing the
data type of the fastq to fasta using the ‘Edit attributes’ function of
the history. This kept the fastq info but allowed Galaxy to recognise
the file as input for LASTZ. However, this mapping has been running for
almost 24 hours now and so I am concerned that there is an error.

Is anyone able to offer any help with why Galaxy does not recognise the
reads in fastq format prior to mapping with LASTZ?


Here are what the first two reads look like in fastq format:

@GIQ547K01A7QJK length=76 xy=0381_0142 region=1 run=R_2010_06_11_16_16_09_

GCTTCGTGTGCGACGACACTCGTCATCGACAACGCAAGACTGGCGCTATCGCAATTGGACACACAACATGTGACCG

+

27 19 17 17 18 19 11 14 14 17 19 17 22 17 17 14 14 14 17 19 17 19 17 17
19 19 25 25 22 17 12 13 14 19 21 21 21 27 21 19 19 17 17 19 17 24 25 22
20 22 22 17 16 16 12 12 12 12 19 22 17 17 17 20 20 22 27 21 25 22 20 20
22 21 16 12

@GIQ547K01AE4BG length=40 xy=0055_0266 region=1 run=R_2010_06_11_16_16_09_

GTGACTAGATACATGCAATCAATTGTCCATGTCATTCGAG

+

27 23 23 19 19 19 18 19 21 19 18 19 18 25 27 27 26 26 27 27 27 27 27 19
19 18 19 19 27 27 25 24 25 25 21 21 22 22 22 18

* Input formats (copied from the LASTZ input page in Galaxy)

LASTZ accepts reference and reads in FASTA format. However, because
Galaxy supports implicit format conversion the tool will recognize fastq
and other method specific formats.

With thanks,

Chris





--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support


[galaxy-user] Server error

2011-09-29 Thread Karthik R Padmanabhan
 Hello, 

I am currently experiencing a server error (see below):

-
Server Error
 An error occurred. See the error logs for more information. (Turn debug on to 
display exception reports here)

-

Kindly take a look. Thanks. 

--
Regards,
 
Karthik R. Padmanabhan
