Re: [galaxy-user] question about Filtering Cufflink files

2011-05-09 Thread Jeremy Goecks
Jagat,

First, a couple of housekeeping issues:

(a) the questions you're asking are better suited to the galaxy-user list 
(questions about using Galaxy and performing analyses) rather than galaxy-dev 
(questions about installing Galaxy locally and tool development), so I've moved 
this thread to galaxy-user;

(b) please start new threads when appropriate rather than replying to older 
threads as this makes threads shorter and more focused.

On to your questions:

> I have another question: when I filter the gene list, the filtered list has 
> multiple rows per gene. Should I have one gene per row? I have attached a 
> snapshot of the output, but I am not sure whether the Galaxy server will 
> accept it. I did see the discussion on another forum:
> http://seqanswers.com/forums/showthread.php?t=8830

GTF files have multiple lines per feature, so your output is reasonable.
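
To make the point concrete, here is a minimal Python sketch that counts how 
many GTF lines share a gene_id. The sample records and the helper names 
(parse_attributes, count_lines_per_gene) are made up for illustration and 
are not taken from your history or from any Galaxy tool.

```python
from collections import Counter

# Hypothetical GTF records: gene G1 has two transcripts (three exon lines),
# gene G2 has one transcript with a single exon line.
SAMPLE_GTF = """\
chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";
chr1\tCufflinks\texon\t300\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1";
chr1\tCufflinks\texon\t100\t250\t.\t+\t.\tgene_id "G1"; transcript_id "T2";
chr1\tCufflinks\texon\t500\t600\t.\t+\t.\tgene_id "G2"; transcript_id "T3";
"""


def parse_attributes(field):
    """Parse the ninth GTF column ('key "value"; ...') into a dict."""
    attributes = {}
    for part in field.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attributes[key] = value.strip().strip('"')
    return attributes


def count_lines_per_gene(gtf_text):
    """Count how many feature lines carry each gene_id."""
    counts = Counter()
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        counts[parse_attributes(columns[8])["gene_id"]] += 1
    return counts


if __name__ == "__main__":
    for gene_id, n in sorted(count_lines_per_gene(SAMPLE_GTF).items()):
        print(gene_id, n)
```

Filtering such a file by gene keeps every line that belongs to a selected 
gene, so several rows per gene are expected.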


> which suggests possible complications in getting one gene per row. My 
> next question is: in that scenario, what is the best way of representing 
> one FPKM value per gene? Should we take the average FPKM per gene? I think 
> the gene-filtered file is still giving transcript FPKM values, but these 
> values are different from the previous file filtered with transcript ID.

As Vasu noted, this is an ongoing area of research. For some experiments, it 
may be reasonable to group alternatively-spliced isoforms of the same gene and 
jointly estimate FPKM, and for others it may not. Fortunately, if you do want 
to group transcripts to get gene FPKM values, Cuffdiff does this for you: see 
its gene FPKM expression file.

Best,
J.
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-user] Composite Datatypes Q.

2011-05-09 Thread Todd Yilk
I have a program I'm trying to "galaxify" that emits a variable number of 
result files. I would like the output of my Galaxy tool to show up in Galaxy as 
an html file with links to the result files. So when you click on the eye, the 
html file should show up in the middle pane ... sorry if I'm not describing 
this in an elegant way.

Creating a composite datatype is the way to go in this situation, correct? I'm 
creating a class that inherits from Html. How do I get the result returned from 
my custom "generate_primary_file" function to show up as a tool's output ... if 
that's the right way to go about this?
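
For illustration only, this is roughly the kind of page I have in mind, as a 
plain Python sketch that does not touch Galaxy's datatype classes. The 
function name build_index_html and the file names are hypothetical, and where 
Galaxy expects the linked files to live is exactly the part I am unsure about.

```python
import html
import os


def build_index_html(result_files, title="Tool results"):
    """Return an HTML page that links to each result file by its base name."""
    items = []
    for path in sorted(result_files):
        name = os.path.basename(path)
        # Escape the name for use in both the attribute and the link text.
        items.append('<li><a href="%s">%s</a></li>'
                     % (html.escape(name, quote=True), html.escape(name)))
    return ("<html><head><title>%s</title></head><body>\n"
            "<h1>%s</h1>\n<ul>\n%s\n</ul>\n</body></html>\n"
            % (html.escape(title), html.escape(title), "\n".join(items)))


if __name__ == "__main__":
    # Hypothetical result files emitted by the wrapped program.
    print(build_index_html(["out/b.txt", "out/a.txt"]))
```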

Thanks!


[galaxy-user] Filter Tool

2011-05-09 Thread Jeremy Goecks
(Starting new thread on galaxy-user.)

Jagat,

It depends on which filter tool you're using and which dataset you're 
filtering. There is a generic filter tool that can be used to filter Cuffdiff 
tabular files on either FPKM values or differential expression tests. There is 
also a tool for filtering GTF files based on a Cuffdiff expr dataset. It 
sounds like you may be confusing either the tools or the inputs.

If after double-checking you're still having problems with filtering, please 
put together a short list of your analysis steps and share your history with 
me, and I can take a look.
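
As a rough illustration of what filtering a tabular file on status and p-value 
amounts to, here is a small Python sketch. The column names (test_id, status, 
p_value), the sample rows, and the function name filter_rows are assumptions 
made for the example; check the header of your own expr file, and note that 
this is not how the Galaxy filter tool is implemented.

```python
import csv
import io

# Hypothetical excerpt of a tab-delimited expression/test file.
SAMPLE = (
    "test_id\tstatus\tp_value\n"
    "gene1\tOK\t0.001\n"
    "gene2\tNOTEST\t1\n"
    "gene3\tOK\t0.2\n"
)


def filter_rows(text, max_p=0.05):
    """Return test_ids of rows whose test succeeded and whose p-value <= max_p."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    kept = []
    for row in reader:
        # Columns are addressed by header name, so the condition is applied
        # to the intended column and the header row itself is never compared.
        if row["status"] == "OK" and float(row["p_value"]) <= max_p:
            kept.append(row["test_id"])
    return kept


if __name__ == "__main__":
    print(filter_rows(SAMPLE))
```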

Thanks,
J.

> Further to my question, it appears that there is some problem with the 
> filter option:
> When I use the isoform/gene expr file as is, it works fine, but when I 
> filter these files on a parameter such as status (whether the test was 
> successful) or p-value, it returns an empty file. The way I am saving the 
> file is: filter the expr file, save it as a txt file, and upload it back 
> into Galaxy.
> Any suggestions?
>  
> Jagat
> 
> 
> On Tue, May 3, 2011 at 3:08 AM, shamsher jagat  wrote:
> Jeremy,
>  
> I have been trying to follow the steps for filtering Cufflinks output files 
> that you described in one of the previous messages 
> (http://gmod.827538.n3.nabble.com/Re-downstream-analysis-of-cuffdiff-out-put-td2836457.html):
>  
> I have shared my history with you, but in summary:
>  
> File 35: Filter GTF data by attribute values list on data 11 (combined 
> GTF) and data 33 (the gene expr file). Shouldn't this have one gene per 
> row? It does not.
> 
> File 39: Filter GTF data by attribute values list on data 11 and data 38 
> (Cuffdiff splicing expr) failed. I would assume that it should filter on 
> the basis of TSS id. The error message is:
> 
> Traceback (most recent call last):
>   File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 67, in
>     filter( gff_file, attribute_name, ids_file, output_file )
>   File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 57, in filter
>     if attributes[ attribute_name ] in ids_dict:
> KeyError: 'tss_id'
> 
> File 40: Filter GTF data by attribute values list on data 11 and data 34 
> (TSS group expr) failed, and the error message is:
> 
> Traceback (most recent call last):
>   File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 67, in
>     filter( gff_file, attribute_name, ids_file, output_file )
>   File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 57, in filter
>     if attributes[ attribute_name ] in ids_dict:
> KeyError: 'tss_id'
>  
> I would assume that if one gene has different IDs, then there is splicing.
> 
> In contrast, the isoform file with transcript IDs works fine (File 20).
> 
> On a different note, can I convert a GTF file to a tab-delimited txt file? 
> I tried to convert file 11 to txt (using Edit attributes), but the file is 
> not properly formatted, especially col-pid and TSS id. Am I doing something 
> wrong?
> 
> Thanks.
> 
>  
> 
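
The KeyError in the tracebacks above is raised by the lookup 
attributes[ attribute_name ], which means that at least one line of the GTF 
input has no tss_id attribute. The following Python sketch shows one way such 
a lookup can tolerate missing attributes. It reuses the variable names visible 
in the traceback, but the function filter_gtf_lines and the sample lines are 
hypothetical; this is not the code of the Galaxy script.

```python
def filter_gtf_lines(lines, attribute_name, ids):
    """Keep GTF lines whose attribute value is in ids; skip lines lacking it."""
    kept = []
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a complete GTF record
        attributes = {}
        for part in columns[8].split(";"):
            key, _, value = part.strip().partition(" ")
            if key:
                attributes[key] = value.strip('"')
        # attributes[attribute_name] raises KeyError when the key is absent;
        # .get() returns None instead, so such lines are simply skipped.
        if attributes.get(attribute_name) in ids:
            kept.append(line)
    return kept


if __name__ == "__main__":
    sample = [
        'chr1\tx\texon\t1\t2\t.\t+\t.\tgene_id "G1"; tss_id "TSS1";',
        'chr1\tx\texon\t3\t4\t.\t+\t.\tgene_id "G2";',  # no tss_id here
    ]
    print(filter_gtf_lines(sample, "tss_id", {"TSS1"}))
```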

[galaxy-user] SNP annotation

2011-05-09 Thread Oliver, Gavin
Hi,

I am new to Galaxy and am wondering what tools are available for
annotation of SNPs.

I know that snpEff is implemented in Galaxy and that it enables annotation
such as location and predicted effect; however, I am wondering if there
are any automated means of annotating a polymorphism as known/novel or
determining its frequency in the population.

Does Galaxy offer anything like this, or would it be necessary to create
scripts that would, for example, compare to dbSNP for uniqueness and to the
1000 Genomes Project for frequency?

Best,

Gavin


[galaxy-user] galaxy on the cloud - cannot select fastq files when trying to run workflow

2011-05-09 Thread Amit Indap
Hi Galaxy,

I am a newbie to Amazon EC2 and have been carefully following the
steps in the screencast. I am able to upload two fastq files from the
s3 bucket:
http://s3.amazonaws.com/heteroplasmy/F4-bM4-1.fastq
http://s3.amazonaws.com/heteroplasmy/F4-bM4-2.fastq

I am also able to import the published workflow
http://s3.amazonaws.com/heteroplasmy/Galaxy-Workflow-mt_analysis_0.01_strand-specific_(fastq_double).ga

But when it comes to running it, I cannot select the fastq file from
the drop down menu. I am able to view them on my GC instance, since I
imported them successfully, but am at a loss as to why
I can't select them from the drop down menu in the workflow to begin
their alignment. Is it something to do with my security group settings
in setting up the EC2 instance?  Any assistance you can provide would
be great!

Thanks,
Amit

-- 
Amit Indap


Re: [galaxy-user] galaxy on the cloud - cannot select fastq files when trying to run workflow

2011-05-09 Thread Enis Afgan
Hi Amit,
The workflow requires its input data to be in 'fastqsanger' format before it
can run. The files you uploaded from S3 are already in the correct format,
but this is most likely not set correctly in the metadata. So, click
on the pencil icon for each of the datasets in your history and edit the
data type by setting it to 'fastqsanger'. Save the changes and try running
the workflow. It should work fine then.

For future reference, if you decide to upload the rest of the files from the
screencast/heteroplasmy study, you can choose the fastqsanger type right on
the data upload form and can thus avoid this subsequent step.
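
If you want to double-check a file before setting the type by hand, a rough
heuristic is to look at the quality characters: 'fastqsanger' data uses a
Phred+33 encoding, and characters with ASCII codes below 59 cannot occur in
the Phred+64 variants. The Python sketch below applies that rule; the function
name looks_like_sanger and the sample records are made up for illustration,
and it assumes plain four-line FASTQ records.

```python
def looks_like_sanger(fastq_lines):
    """Return True if any quality character can only be Phred+33 encoded."""
    for index, line in enumerate(fastq_lines):
        if index % 4 != 3:  # the quality string is every fourth line
            continue
        for char in line.rstrip("\n"):
            if ord(char) < 59:  # below ';', impossible in Phred+64 encodings
                return True
    # No decisive character seen: the encoding cannot be confirmed this way.
    return False


if __name__ == "__main__":
    record = ["@read1", "ACGT", "+", "!!II"]  # hypothetical record
    print(looks_like_sanger(record))
```

A False result is not proof of another encoding, since high-quality Phred+33
reads may contain no low characters at all.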

Enis


On Mon, May 9, 2011 at 5:23 PM, Amit Indap  wrote:

> Hi Galaxy,
>
> I am a newbie to Amazon EC2 and have been carefully following the
> steps in the screencast. I am able to upload two fastq files from the
> s3 bucket:
> http://s3.amazonaws.com/heteroplasmy/F4-bM4-1.fastq
> http://s3.amazonaws.com/heteroplasmy/F4-bM4-2.fastq
>
> I am also able to import the published workflow
>
> http://s3.amazonaws.com/heteroplasmy/Galaxy-Workflow-mt_analysis_0.01_strand-specific_(fastq_double).ga
>
> But when it comes to running it, I cannot select the fastq file from
> the drop down menu. I am able to view them on my GC instance, since I
> imported them successfully, but am at a loss as to why
> I can't select them from the drop down menu in the workflow to begin
> their alignment. Is it something to do with my security group settings
> in setting up the EC2 instance?  Any assistance you can provide would
> be great!
>
> Thanks,
> Amit
>
> --
> Amit Indap

[galaxy-user] Stastistics _q valu

2011-05-09 Thread vasu punj
I am trying to use the statistics function of Galaxy to compute q-values from 
p-values. It is giving the following error:
An error occurred running this job: Error in scan(file, what, nmax, sep, dec, 
quote, skip, nlines, na.strings, : 
scan() expected 'a real', got 'pvals'
Execution halted

Any suggestions?
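
The message says that R's scan() met the text 'pvals' where it expected a 
number, which suggests that the p-value column still contains its header 
line. A minimal Python sketch of reading such a column while skipping a 
leading header is shown below; the function name read_p_values and the sample 
values are hypothetical, and this is not part of the Galaxy tool.

```python
def read_p_values(lines):
    """Parse one p-value per line, skipping blank lines and a leading header."""
    values = []
    for number, line in enumerate(lines, start=1):
        token = line.strip()
        if not token:
            continue
        try:
            values.append(float(token))
        except ValueError:
            if values:
                # Text after the numbers have started is a real problem.
                raise ValueError("non-numeric value on line %d: %r"
                                 % (number, token))
            # Non-numeric text before any number is treated as a header.
    return values


if __name__ == "__main__":
    print(read_p_values(["pvals", "0.01", "0.5", ""]))
```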

[galaxy-user] Galaxy Community Conference Registration Closes May 17

2011-05-09 Thread Dave Clements
Hello all,

We are closing registrations for the Galaxy Community Conference at the end
of May 17.  However, the conference is likely to fill up before that date.
So, if you are thinking about attending, *now* is the time to register for
the meeting:

  http://galaxy.psu.edu/gcc2011/Register.html

We have also secured some additional rooms on the night of May 24, in a
hotel in Lunteren:

  http://galaxy.psu.edu/gcc2011/Logistics.html

Again, it is not necessary to stay in Lunteren the night of May 24, as it is
easy to get there from Amsterdam on May 25, before the meeting starts.

Finally, a draft schedule is now available at:

  http://galaxy.psu.edu/gcc2011/Programme.html

This is still subject to change, but we are hoping it won't change much.

Just 15 days to go!

Dave C.
-- 
http://galaxy.psu.edu/gcc2011/
http://getgalaxy.org
http://usegalaxy.org/