Re: [galaxy-user] miRNA-seq help

2013-10-11 Thread Hoang, Thanh
Hi Calvin,
I am analyzing miRNA differential expression in small RNA sequencing data
from mouse tissue using Bowtie > HTSeq > DESeq.
I tried both the whole mouse genome and the hairpin miRNA sequences (from
miRBase) as reference sequences, with the annotation of all known miRNAs
(also from miRBase). Both worked for me.
Another option is to try miRDeep2 and Novoalign.
Anyway, what organism are you working with? Where did you download the
piRNA reference sequence?
Let me know what happens.
Thanh
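
For the BAM-to-spreadsheet question quoted below, one possible route is to count reads per miRNA first and then write the counts as a delimited table, which Excel opens directly. A minimal sketch, assuming per-sample counts are already available as feature/count pairs (for example from an HTSeq-style counting step); the function name `merge_counts` and the sample and miRNA names are hypothetical:

```python
import csv
import io


def merge_counts(samples):
    """Merge per-sample counts into one CSV table.

    samples: dict mapping sample name -> dict of feature name -> read count.
    Features missing from a sample are written as 0.
    """
    names = sorted(samples)
    features = sorted({f for counts in samples.values() for f in counts})
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["feature"] + names)
    for feature in features:
        writer.writerow([feature] + [samples[n].get(feature, 0) for n in names])
    return out.getvalue()


# Hypothetical counts, for illustration only.
example = {
    "control": {"mmu-mir-21a": 1200, "mmu-mir-122": 15},
    "treated": {"mmu-mir-21a": 300},
}
table = merge_counts(example)
print(table)
```

Saving the returned text to a file with a .csv extension gives one row per miRNA and one column per sample.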


On Fri, Oct 11, 2013 at 12:51 PM, Gabriel Calvin  wrote:

> Hi, I'm new to Galaxy and am trying to view several miRNA datasets as a
> differential expression. The pipeline I'm using is Bowtie for Illumina
> (paired-end run) > SAM-to-BAM > ? > xls. The references I used with Bowtie
> are a mature miRNA fasta and a piRNA fasta and the reads are 30nt in length.
>
> So, my questions are: Is this the proper pipeline? How do I go about
> converting the BAM into a xls file viewable in Excel?
>
> Thanks!
>
> ___
> The Galaxy User list should be used for the discussion of
> Galaxy analysis and other features on the public server
> at usegalaxy.org.  Please keep all replies on the list by
> using "reply all" in your mail client.  For discussion of
> local Galaxy instances and the Galaxy source code, please
> use the Galaxy Development list:
>
>   http://lists.bx.psu.edu/listinfo/galaxy-dev
>
> To manage your subscriptions to this and other Galaxy lists,
> please use the interface at:
>
>   http://lists.bx.psu.edu/
>
> To search Galaxy mailing lists use the unified search at:
>
>   http://galaxyproject.org/search/mailinglists/
>

[galaxy-user] Can TopHat take SNPs into account?

2013-10-02 Thread Hoang, Thanh
Hi,
I have been using TopHat to map RNA-seq data from one mouse strain to the
genome of a different mouse strain. I am wondering whether TopHat can take
SNPs into account during the alignment (for example, using a SNP track as an
optional input)?
Thanks
Thanh

[galaxy-user] non-coding RNA annotation

2013-09-22 Thread Hoang, Thanh
Hi all,
I am analyzing small RNA sequencing data from mouse tissue. Does anyone
know where to download an annotation file for non-coding RNAs?
Thanks
Thanh

Re: [galaxy-user] RNA-DNA converter in Fasta format

2013-09-22 Thread Hoang, Thanh
Thank you Bjoern,
I figured it out. I just used the nucleotide changer in the FASTX-toolkit.
Thanh
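
For readers without the FASTX-toolkit wrappers installed, the same U-to-T conversion can be sketched in a few lines. This is an illustration only, not the toolkit's implementation; the function name `rna_to_dna` and the example record are hypothetical:

```python
def rna_to_dna(fasta_text):
    """Replace U with T in sequence lines, leaving FASTA headers untouched."""
    converted = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            converted.append(line)  # header line: keep as-is
        else:
            converted.append(line.replace("U", "T").replace("u", "t"))
    return "\n".join(converted)


# Hypothetical record; the header contains a "U" to show it is not altered.
example = ">example_miRNA_U\nUAGCUUAUCAGACUGAUGUUGA"
dna = rna_to_dna(example)
print(dna)
```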


On Sun, Sep 22, 2013 at 4:51 AM, Björn Grüning <
bjoern.gruen...@pharmazie.uni-freiburg.de> wrote:

> Hi Thanh,
>
> under "FASTA manipulation" I have one tool that is called "RNA/DNA
> converter". If it is not available in your instance you need to install
> the FASTX-toolkit wrappers. These are available in the Tool Shed.
>
> Hope that helps,
> Bjoern
>
>
> > Hi all,
> > I want to map my sequencing reads to a miRNA reference database.
> > Does anyone know how to convert RNA to DNA in FASTA format (U to T)?
> > Thanks
> > Thanh
> >
> >

[galaxy-user] RNA-DNA converter in Fasta format

2013-09-21 Thread Hoang, Thanh
Hi all,
I want to map my sequencing reads to a miRNA reference database. Does anyone
know how to convert RNA to DNA in FASTA format (U to T)?
Thanks
Thanh

[galaxy-user] 3' adapter trimming using FASTX-toolkit clipper

2013-09-19 Thread Hoang, Thanh
Hi all,
I am analyzing miRNA sequencing data now. My reads are 51 bp, single-end,
about 5 M reads in total. I want to remove the adapter sequences from the
reads before mapping to the genome or a known miRNA database.
My 3' adapter sequence is 5'-AGATCGGAAGAGCACACGTCT-3'. I found that many
reads contain only part of the 3' adapter sequence. I am using the
FASTX-toolkit to clip it off. How many bases should I put in the "Enter
custom clipping sequence" field? I ask because I end up with more reads in
the output file when I enter the whole 3' adapter sequence than when I enter
only its first 8 nt.
Also, miRNAs are about 17-25 nt long, so I guess that the rest of each read
(51-21=30 bp) must contain part or all of the 5' adapter sequence or
by-products of mRNA/tRNA degradation. So I think that I have to trim the 5'
adapter as well.
Any suggestions will be highly appreciated.
Thanh
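
The partial-adapter situation described above can be sketched as follows. This is an illustration under stated assumptions, not how the FASTX-toolkit clipper works internally: it uses exact matching only (no mismatches), the function name `clip_3prime_adapter` and the `min_overlap` parameter are hypothetical, and the example reads are made up. In a single-end run whose read length exceeds the insert length, the bases after the insert normally come from the 3' adapter, so a full adapter is removed wherever it starts, and a truncated adapter is removed when a prefix of it sits at the very end of the read.

```python
def clip_3prime_adapter(read, adapter, min_overlap=5):
    """Remove a 3' adapter from a read using exact matching.

    1. If the complete adapter occurs in the read, cut from its first position.
    2. Otherwise, if the read ends with a prefix of the adapter that is at
       least min_overlap bases long, cut that prefix off.
    3. Otherwise return the read unchanged.
    """
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


ADAPTER = "AGATCGGAAGAGCACACGTCT"  # 3' adapter quoted in the message above

# Made-up reads: a short insert followed by the full adapter and extra bases,
# and a long insert followed by only the first 11 adapter bases.
short_insert = "TAGCTTATCAGACTGATGTTGA"
long_insert = "ACGT" * 10
full_case = clip_3prime_adapter(short_insert + ADAPTER + "ATCTCGTA", ADAPTER)
partial_case = clip_3prime_adapter(long_insert + ADAPTER[:11], ADAPTER)
print(full_case, partial_case)
```

Entering only the first few adapter bases makes the match shorter and therefore more likely to occur by chance inside a genuine insert, which is one reason the two settings can give different outputs.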

Re: [galaxy-user] miRNA analysis

2013-09-18 Thread Hoang, Thanh
Thanks Mete and Jennifer for your information.
Last time, I did an mRNA sequencing analysis and decided to go with
TopHat > HTSeq > DESeq for DE genes, simply because the results matched up
quite nicely with my qPCR validation. Although Cufflinks/Cuffdiff produced
results with a trend quite similar to DESeq, it seemed to me that Cuffdiff
tends to inflate the fold change and is not good at the statistical analysis.
Thanks Jenny and Ross.
Anyway, about the miRNA data I am working on now: it is 51 bp,
single-end. I am going to cut the adapter using the FASTX-toolkit and align
the reads using Novoalign (as Mete suggests) and Bowtie.
Just one question for now: my 3' adapter sequence is
5'-rAppAGATCGGAAGAGCACACGTCT-NH2-3'.
How many bases should I put in the "Enter custom clipping sequence" field?
Is "AGATCGGA" optimal? I ask because I observed that many reads contain
only part of the 3' adapter sequence.
Also, do I need to trim the 5' adapter as well, and if so, how?
Thanks so much for your help.
Thanh




On Wed, Sep 18, 2013 at 8:12 PM, Jennifer Jackson  wrote:

>  Hi Thanh,
>
> Did the Tuxedo suite not work out for you in the end? Or the other tools
> that Ross suggested? These are both pipelines that are in common use.
> http://lists.bx.psu.edu/pipermail/galaxy-user/2013-July/006367.html
>
> Using a cloud Galaxy and installing tools from the Tool Shed is required
> for certain tools, perhaps that is the problem? Many tools now have
> automatic dependency installation, making set-up much easier. For a
> demonstration, watch the Channel: Galaxy ToolShed videos at Vimeo:
> http://vimeo.com/user20484153
>
> You also may want to look at some of the miRNA specific tools in the Tool
> Shed. They can be found under "Sequence Analysis". Most of these have
> online documentation, or the tool author includes documentation, that you
> can review to see if the tool is a good fit for what you want to do (if it
> is not expression analysis anymore, or you want to try something different
> like DESeq).
> http://toolshed.g2.bx.psu.edu/repository
>
> Hopefully this helps,
>
> Jen
> Galaxy team
>
>
> On 9/18/13 10:19 AM, Hoang, Thanh wrote:
>
> Hi all,
> I would like to analyze my miRNA sequencing data from mouse tissue. I
> have no idea which tools or pipelines work best. Do you have any
> suggestions?
> Regards
> Thanh
>
>
>
>
> --
> Jennifer Hillman-Jackson
> http://galaxyproject.org
>
>

[galaxy-user] miRNA analysis

2013-09-18 Thread Hoang, Thanh
Hi all,
I would like to analyze my miRNA sequencing data from mouse tissue. I
have no idea which tools or pipelines work best. Do you have any
suggestions?
Regards
Thanh

[galaxy-user] "Significant" in Cuffdiff's output

2013-07-19 Thread Hoang, Thanh
Hi all,
I have some questions about how Cuffdiff does its statistical analysis.
I am looking for DE genes between two sample groups (3 replicates per group).
In Cuffdiff's gene_exp.diff, I found many genes that have a very large RPKM
fold change between the two groups (with p-value either below or above 0.05)
but are still NOT marked as significant. Something like this:

test_id         gene_id         gene    locus                  sample_1  sample_2  status  value_1  value_2  log2(fold_change)  test_stat  p_value  q_value   significant
ENSMUSG0047139  ENSMUSG0047139  Cd24a   10:43579168-43584262   q1        q2        OK      96.2585  2700.55  4.8102             1.6486     0.03995  0.078237  no
ENSMUSG0066975  ENSMUSG0066975  Cryba4  5:112246492-112252518  q1        q2        OK      424.582  46190.2  6.7654             0.598327   0.3408   0.442128  no

Then I checked the READ_GROUP_TRACKING file for those genes to check the
RPKM value for each replicate:

tracking_id     condition  replicate  raw_frags  internal_scaled_frags  external_scaled_frags  FPKM     effective_length  status
ENSMUSG0047139  q1         1          11256      5876.82                5876.82                125.915  -                 OK
ENSMUSG0047139  q1         0          3783       4343.44                4343.44                42.0316  -                 OK
ENSMUSG0047139  q1         2          10051      5639.48                5639.48                120.829  -                 OK
ENSMUSG0047139  q2         1          76771      156059                 156059                 3343.66  -                 OK
ENSMUSG0047139  q2         0          82394      162172                 162172                 1420.33  -                 OK
ENSMUSG0066975  q1         1          12825      6696                   6696                   407.899  -                 OK
ENSMUSG0066975  q1         0          3694       4241.26                4241.26                375.211  -                 OK
ENSMUSG0066975  q1         2          14397      8077.95                8077.95                490.636  -                 OK
ENSMUSG0066975  q2         1          348103     707619                 707619                 42455.1  -                 OK
ENSMUSG0066975  q2         0          420896     828430                 828430                 48920.6  -                 OK
ENSMUSG0066975  q2         2          331098     767405                 767405                 47195    -                 OK


Shouldn't I expect these DE genes to be significant? Could anyone explain why
Cufflinks shows this result?


Best regards
Thanh
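
Two things can be checked directly from the numbers in this message: value_1 and value_2 appear to be the means of the per-replicate FPKM values, and the significant column follows q_value (the p-value after multiple-testing correction) rather than p_value. A minimal sketch using the Cryba4 and Cd24a figures quoted above; the 0.05 threshold is assumed to be the FDR setting used for the run, and the function names are hypothetical:

```python
import math
from statistics import mean


def log2_fold_change(value_1, value_2):
    """log2 of the ratio of condition 2 to condition 1."""
    return math.log2(value_2 / value_1)


def is_significant(q_value, fdr=0.05):
    """Call a gene significant only when its q-value is below the FDR cutoff.

    The 0.05 default is an assumption about the run settings.
    """
    return q_value < fdr


# Per-replicate FPKM values for Cryba4, copied from the tracking table above.
cryba4_q1 = [407.899, 375.211, 490.636]
cryba4_q2 = [42455.1, 48920.6, 47195]

value_1 = mean(cryba4_q1)
value_2 = mean(cryba4_q2)
cryba4_lfc = log2_fold_change(value_1, value_2)

# Cd24a: p_value 0.03995 is below 0.05, but q_value 0.078237 is not.
cd24a_call = is_significant(0.078237)

print(value_1, value_2, cryba4_lfc, cd24a_call)
```

The wide spread between replicates (for example 42.0 versus 125.9 FPKM for Cd24a in q1) is one plausible reason the test statistic stays small despite the large fold change.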

Re: [galaxy-user] Exceptionally high RPKM values of miRNA and other short genes in Cuffdiff's output

2013-07-19 Thread Hoang, Thanh
Thank you Mohammad and Ross for your valuable information.
I am re-running Cuffdiff with no effective length correction and am also
running the edgeR tool to see how the results compare.
Thanh


On Thu, Jul 18, 2013 at 8:42 PM, Ross  wrote:

> Hi, Thanh,
> If your primary goal is inference about differential 'gene' expression
> taking biological variability into account with biological replicates for
> each of two conditions, you might want (eg see Dillies et al.,
> http://bib.oxfordjournals.org/content/early/2012/09/15/bib.bbs046.long and
> http://wiki.galaxyproject.org/Events/GCC2013/Abstracts#Events.2FGCC2013.2FAbstracts.2FPosters.P4:_Comparing_R-based_methods_and_Cuffdiff2_for_analysis_of_RNA-seq_data_in_Galaxy)
> to try (and compare!) edgeR (and optionally DESeq and VOOM/limma). A set of
> *very much beta* tools is available for admin installation and user testing
> from the test toolshed in the statistics section owned by fubar.
>
> The edgeR tool can optionally run 2 way GLM. It requires raw count
> matrices as inputs which can be generated from a GTF/'gene' model of your
> choice and any number of mapped SAM/BAM inputs using the htseq based
> companion tool in the same tool shed section. Please don't install to a
> production machine yet but we're getting good results from it - feedback
> and code improvements are welcomed from willing beta testers.
>
> The R 3.0.x tool shed dependency package in particular is still under
> development and is likely to change substantially in the next week or two
> as we sort out a sane and generalised Atlas dependency installation.
>
>
> On Fri, Jul 19, 2013 at 2:55 AM, Hoang, Thanh  wrote:
>
>> Hi all,
>> I have been analyzing my RNA-seq data on mouse tissues. My reads are
>> single-end and 51 bp in length. I ran TopHat/Cufflinks/Cuffdiff to test for
>> differential gene expression.
>> In Cuffdiff's output, I got very high RPKM values for some miRNAs and
>> some other short genes (less than 100 bp). These genes are among the top
>> genes with the highest RPKM. I think the RPKM values of these genes are
>> probably too high to be true.
>>
>> test_id         gene_id         gene     locus                   sample_1    sample_2  status  value_1   value_2  log2(fold_change)  test_stat  p_value  q_value   significant
>> ENSMUSG0093077  ENSMUSG0093077  Mir5105  5:146231229-146302874   Epithelium  Fiber     OK      1.53E+06  445558   -1.78097           -355.367   0.00715  0.016986  yes
>> ENSMUSG0093098  ENSMUSG0093098  Gm22641  7:130162450-133124354   Epithelium  Fiber     OK      87894.1   36474.7  -1.26887           -0.59863   0.4913   0.587174  no
>> ENSMUSG0089855  ENSMUSG0089855  Gm15662  10:105187662-105583874  Epithelium  Fiber     OK      42868.9   21566.5  -0.99114           -20.7066   0.0186   0.039568  yes
>> ENSMUSG0092984  ENSMUSG0092984  Mir5115  2:73012853-73012927     Epithelium  Fiber     OK      21104.8   8317.49  -1.34335           -447.314   0.0001   0.000354  yes
>> ENSMUSG0086324  ENSMUSG0086324  Gm15564  16:35926510-36037131    Epithelium  Fiber     OK      6443.35   3664.15  -0.81433           -1.52095   0.2129   0.301429  no
>> ENSMUSG0092981  ENSMUSG0092981  Mir5125  17:23803186-23824739    Epithelium  Fiber     OK      5974.14   2390.75  -1.32127           -0.34111   0.5746   0.661937  no
>>
>> I checked some forums and they said that this is a drawback of
>> TopHat/Cufflinks/Cuffdiff when dealing with short genes, but I am still not
>> clear about this. Has anyone had the same problem? What can I do in this
>> situation?
>> Can anyone suggest other good tools to test for (1) differential gene
>> expression OR (2) both differential gene expression and gene discovery?
>>
>> Thank you
>> Thanh
>>
>
>

[galaxy-user] Exceptionally high RPKM values of miRNA and other short genes in Cuffdiff's output

2013-07-18 Thread Hoang, Thanh
Hi all,
I have been analyzing my RNA-seq data on mouse tissues. My reads are
single-end and 51 bp in length. I ran TopHat/Cufflinks/Cuffdiff to test for
differential gene expression.
In Cuffdiff's output, I got very high RPKM values for some miRNAs and
some other short genes (less than 100 bp). These genes are among the top
genes with the highest RPKM. I think the RPKM values of these genes are
probably too high to be true.

test_id         gene_id         gene     locus                   sample_1    sample_2  status  value_1   value_2  log2(fold_change)  test_stat  p_value  q_value   significant
ENSMUSG0093077  ENSMUSG0093077  Mir5105  5:146231229-146302874   Epithelium  Fiber     OK      1.53E+06  445558   -1.78097           -355.367   0.00715  0.016986  yes
ENSMUSG0093098  ENSMUSG0093098  Gm22641  7:130162450-133124354   Epithelium  Fiber     OK      87894.1   36474.7  -1.26887           -0.59863   0.4913   0.587174  no
ENSMUSG0089855  ENSMUSG0089855  Gm15662  10:105187662-105583874  Epithelium  Fiber     OK      42868.9   21566.5  -0.99114           -20.7066   0.0186   0.039568  yes
ENSMUSG0092984  ENSMUSG0092984  Mir5115  2:73012853-73012927     Epithelium  Fiber     OK      21104.8   8317.49  -1.34335           -447.314   0.0001   0.000354  yes
ENSMUSG0086324  ENSMUSG0086324  Gm15564  16:35926510-36037131    Epithelium  Fiber     OK      6443.35   3664.15  -0.81433           -1.52095   0.2129   0.301429  no
ENSMUSG0092981  ENSMUSG0092981  Mir5125  17:23803186-23824739    Epithelium  Fiber     OK      5974.14   2390.75  -1.32127           -0.34111   0.5746   0.661937  no

I checked some forums and they said that this is a drawback of
TopHat/Cufflinks/Cuffdiff when dealing with short genes, but I am still not
clear about this. Has anyone had the same problem? What can I do in this
situation?
Can anyone suggest other good tools to test for (1) differential gene
expression OR (2) both differential gene expression and gene discovery?

Thank you
Thanh
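
RPKM divides the read count by the gene length in kilobases and by the number of mapped reads in millions, so for a fixed number of reads a very short gene gets a much larger value than a long one. Mir5115 in the table above spans 2:73012853-73012927, which is only 75 bp. A minimal sketch of the arithmetic; the read counts and library size are hypothetical, and Cuffdiff's own estimate additionally involves an effective-length correction that is not modelled here:

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical numbers: the same 500 reads assigned to a 75 bp gene and to a
# 2000 bp gene in a library of 30 million mapped reads.
short_gene = rpkm(500, 75, 30_000_000)
long_gene = rpkm(500, 2000, 30_000_000)
print(short_gene, long_gene, short_gene / long_gene)
```

With identical read counts, the ratio of the two values is just the ratio of the gene lengths (2000 / 75), which is why a handful of reads on a miRNA-sized feature can push it to the top of an RPKM ranking.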

[galaxy-user] Problem with repeated genes in Cuffdiff's output

2013-07-16 Thread Hoang, Thanh
Hi all,
I am working on RNA-seq using TopHat/Cufflinks/Cuffdiff for differential
gene expression and new gene discovery (this is what I am interested in).
However, I found many genes that are repeated in Cuffdiff's output.
These are the same genes at exactly the same locus. There should be
only one line per gene. Something like this:

Genes  Locus                Status  q1       q2       Log2 fold change  Significance
Lnp    2:74517521-74584544  OK      8.91501  85.2735  3.25779           yes
Lnp    2:74517521-74584544  OK      12.0044  171.352  3.83533           yes

If I re-run Cuffdiff for differential gene expression only (no gene
discovery), the problem goes away. Does anyone know how to explain and fix
this?
Thank you so much

Re: [galaxy-user] Evaluating TopHat's results

2013-07-15 Thread Hoang, Thanh
Hi Jen,
Thanks so much for your advice.
> The tool " NGS: Picard (beta) -> SAM/BAM Alignment Summary Metrics" may be
> the tool you are looking for. There are others in this tool group that
> add up numbers in BAM or SAM files, and SAMTools has "flagstat", so you
> could create your own calculation with one of those, plus a count on the
> fastq inputs, and the "Compute" tool, if it is not exactly right.

I ran Flagstat on my TopHat output BAM file. I am now confused about the
result:

44066574 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
44066574 + 0 mapped (100.00%:-nan%)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Note that my RNA-seq data is single-end. The raw data for this sample
before mapping with TopHat has only 33174286 reads. My question is: why do
I have more mapped reads in the BAM file from TopHat's output than reads in
the input? And does this BAM file contain the mapped reads only (NOT the
unmapped reads)?

I have also tried the SAM/BAM Alignment Summary Metrics tool. This time I
get 25541681 reads from the BAM file (the result seems to show only mapped
reads). Is that the number I should expect?

Thank you

Thanh
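
One likely reason for the larger count is that flagstat tallies alignment records rather than distinct reads, and a read that maps to several places is reported once per location. A minimal sketch of the distinction, assuming uncompressed SAM text and single-end data; `count_alignments_and_reads` and the example records are hypothetical:

```python
def count_alignments_and_reads(sam_lines):
    """Return (mapped alignment records, distinct mapped read names).

    Header lines (starting with "@") and records with the unmapped flag
    (0x4) set are skipped. For single-end data each read name identifies
    one read, so the second number is the count of reads that mapped.
    """
    alignments = 0
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:  # read is unmapped
            continue
        alignments += 1
        names.add(fields[0])
    return alignments, len(names)


# Made-up records: read1 has two reported locations, read2 has one,
# and read3 is unmapped.
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t0\tchr1\t100\t50\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read1\t256\tchr2\t200\t0\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read2\t16\tchr1\t300\t50\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
]
result = count_alignments_and_reads(example)
print(result)
```

If the 25541681 figure reported by the Picard tool counts distinct reads, the mapping rate would be 25541681 / 33174286, roughly 77%.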



On Mon, Jul 15, 2013 at 10:41 AM, Jennifer Jackson  wrote:

>  Hello Thanh,
>
> The tool " NGS: Picard (beta) -> SAM/BAM Alignment Summary Metrics" may be
> the tool you are looking for. There are others in this tool group that
> add up numbers in BAM or SAM files, and SAMTools has "flagstat", so you
> could create your own calculation with one of those, plus a count on the
> fastq inputs, and the "Compute" tool, if it is not exactly right.
>
> Are you using the public Main Galaxy instance at
> https://main.g2.bx.psu.edu/ (usegalaxy.org) clicking over to connect to
> the Genomic HyperBrowser, via web? Or are you doing something else? Can you
> give this another try this morning and see if it is working?
>
> Hopefully the first part helped, let us know about the second,
>
> Take care,
>
> Jen
> Galaxy team
>
>
> On 7/14/13 1:47 PM, Hoang, Thanh wrote:
>
> Hi,
> I ran TopHat on Galaxy for my RNA-seq data. I want to analyze TopHat's
> output files, for example to get the percentage of reads mapped to the
> genome, but I am not sure how to do that.
> I am also trying to visualize the BAM file in IGB, but the following error
> message appears: "Failed to authenticate to the server".
>
>  Can anyone help with these issues?
>
>  Thanks so much
> Thanh
>
>
>
>
> --
> Jennifer Hillman-Jackson
> Galaxy Support and Training
> http://galaxyproject.org
>
>

[galaxy-user] Evaluating TopHat's results

2013-07-14 Thread Hoang, Thanh
Hi,
I ran TopHat on Galaxy for my RNA-seq data. I want to analyze TopHat's
output files, for example to get the percentage of reads mapped to the
genome, but I am not sure how to do that.
I am also trying to visualize the BAM file in IGB, but the following error
message appears: "Failed to authenticate to the server".

Can anyone help with these issues?

Thanks so much
Thanh

[galaxy-user] How to define the cutoff value of RPKM for expressed genes?

2013-07-03 Thread Hoang, Thanh
Hi all,
I have been working on RNA-seq data analysis using TopHat and Cuffdiff. One
of the problems I have is how to define the RPKM cutoff that distinguishes
an expressed gene from background noise.
Could anybody give me a suggestion?
Thank you
Thanh

[galaxy-user] Get Gene name from Cuffdiff's output?

2013-06-17 Thread Hoang, Thanh
Hi guys,
I am trying to examine differential gene expression in my mouse samples
using:
Cufflinks >> Cuffmerge >> Cuffdiff
The output from Cuffdiff shows only the gene id, but not the gene name:
test_id  gene_id  gene  locus              sample_1    sample_2
XLOC_01  XLOC_01  -     1:3200263-3200566  Epithelium  Fiber
Could anyone tell me how to make the gene name show up?
I used Mus_musculus.GRCm38.71.dna.toplevel.fa as the reference sequence
(not GRCm38/mm10 from the UCSC Table Browser, because I think that may be an
older version).
I have been trying to find a solution online but am still very confused.
Thanks so much
Thanh
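
XLOC identifiers are assigned when transcripts are merged, and the gene column shows "-" when no gene name is attached to the locus. If the merged GTF carries gene_name attributes (which typically requires supplying a reference annotation at the merge step), the names can be looked up from it. A minimal sketch, assuming standard GTF attribute syntax; `gene_names_from_gtf` and the example record are hypothetical:

```python
import re

# Matches GTF attributes of the form: key "value"
ATTRIBUTE = re.compile(r'(\S+) "([^"]*)"')


def gene_names_from_gtf(gtf_lines):
    """Map gene_id -> gene_name for every record that has both attributes."""
    mapping = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attributes = dict(ATTRIBUTE.findall(fields[8]))
        if "gene_id" in attributes and "gene_name" in attributes:
            mapping[attributes["gene_id"]] = attributes["gene_name"]
    return mapping


# Made-up record, for illustration only.
example = [
    '1\tCufflinks\texon\t3200263\t3200566\t.\t-\t.\t'
    'gene_id "XLOC_EXAMPLE"; transcript_id "TCONS_EXAMPLE"; gene_name "ExampleGene";',
]
names = gene_names_from_gtf(example)
print(names)
```

Records without a gene_name attribute are simply left out of the mapping, so novel loci keep their XLOC identifier.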
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:

  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-user] FTP upload problem

2013-06-12 Thread Hoang, Thanh
Hi all,
I have been trying to upload my data files to Galaxy via FTP. It worked
perfectly yesterday, but now the upload job does not run. The status is
"Job is waiting to run", and I have already waited for many hours.
Do you guys know what the problem is?
Thanks
Thanh



On Tue, Jun 11, 2013 at 9:40 PM, Jennifer Jackson  wrote:

>  Hi Thanh,
>
> I am glad the password reset worked. This new issue was on our side and
> has been corrected again. Please try the FTP upload now. Our apologies for
> the multiple problems confusing the process,
>
> Jen
> Galaxy team
>
>  On 6/11/13 3:42 PM, Hoang, Thanh wrote:
>
> Hi all,
> I just finished uploading 2 files via FTP and I have some more files to
> upload. After an Internet disconnection, I got the same problem that
> Delong had. I got this message:
>   "530 Sorry, the maximum number of clients (3) for this user are already
> connected
>  Error: Critical error
> Error: Could not connect to server"
>
>  I tried to re-connect but it still did not work. Could anyone please
> help me with this?
> Regards
> Thanh
>
>
> On Tue, Jun 11, 2013 at 4:16 PM, Jennifer Jackson  wrote:
>
>>  Hello Delong,
>>
>> This was a temporary issue during a short time window yesterday. Letting
>> us know about it is always OK, but then going back in and trying to connect
>> again is the right solution, as it will generally clear up. If you haven't
>> been able to make the connection by now (I just tested it again right now,
>> and it is functioning for me), please let us know,
>>
>> Best,
>>
>> Jen
>> Galaxy team
>>
>> On 6/10/13 10:50 AM, Delong, Zhou wrote:
>>
>> Hello,
>> I was using the ftp uploading via filezilla. It worked perfectly this
>> morning, but after a disconnection of the internet I can no longer upload
>> any file. This is the message I got:
>> 530 Sorry, the maximum number of clients (3) for this user are already
>> connected
>> I restarted my computer and shut down the program for 30 minutes, hoping
>> the server would reset the connection, but it didn't work. I've searched
>> the internet but nothing helped.
>> I noticed that this problem has already been mentioned on the ML but no
>> solution was given. Could anyone help, please?
>> Thanks
>> Delong
>>
>>
>> --
>> Jennifer Hillman-Jackson
>> Galaxy Support and Training
>> http://galaxyproject.org
>>
>>
>
>
> --
> Jennifer Hillman-Jackson
> Galaxy Support and Training
> http://galaxyproject.org
>
>

Re: [galaxy-user] FTP upload problem

2013-06-11 Thread Hoang, Thanh
Hi all,
I just finished uploading 2 files via FTP and I have some more files to
upload. After an Internet disconnection, I ran into the same problem that
Delong had. I got this message:
 "530 Sorry, the maximum number of clients (3) for this user are already
connected
Error: Critical error
Error: Could not connect to server"

I tried to re-connect but it still did not work. Could anyone please help
me with this?
Regards
Thanh
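
For archive readers who want to script the "try to connect again" advice in
the reply quoted below, here is a rough sketch using Python's standard
ftplib. It only automates waiting and retrying after a 530 reply; it cannot
clear sessions that the server still counts as connected. The function names,
the retry count, the delay, and the host in the usage comment are all
assumptions for illustration, not Galaxy settings.

```python
import ftplib
import time


def connect_with_retry(connect, attempts=5, delay=60.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    Only a 530 reply (login refused, for example too many clients) is
    retried; any other permanent error is raised immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ftplib.error_perm as exc:
            # ftplib raises error_perm for 5xx replies; the text starts with the code.
            if not str(exc).startswith("530") or attempt == attempts:
                raise
            sleep(delay)  # wait before the next attempt


def make_connector(host, user, password):
    """Build a connect() callable for connect_with_retry (hypothetical helper)."""
    def connect():
        ftp = ftplib.FTP(host, timeout=30)
        try:
            ftp.login(user, password)
        except ftplib.all_errors:
            # Close the socket so a failed login does not linger as an open session.
            ftp.close()
            raise
        return ftp
    return connect


# Usage (placeholder host and credentials, not real values):
# ftp = connect_with_retry(make_connector("ftp.example.org", "user", "secret"))
# ftp.quit()
```

Closing the socket after a failed login is deliberate: the error message is
about a per-user connection limit, so the sketch avoids adding half-open
sessions of its own.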


On Tue, Jun 11, 2013 at 4:16 PM, Jennifer Jackson  wrote:

>  Hello Delong,
>
> This was a temporary issue during a short time window yesterday. Letting
> us know about it is always OK, but then going back in and trying to connect
> again is the right solution, as it will generally clear up. If you haven't
> been able to make the connection by now (I just tested it again right now,
> and it is functioning for me), please let us know,
>
> Best,
>
> Jen
> Galaxy team
>
> On 6/10/13 10:50 AM, Delong, Zhou wrote:
>
> Hello,
> I was using the ftp uploading via filezilla. It worked perfectly this
> morning, but after a disconnection of the internet I can no longer upload
> any file. This is the message I got:
> 530 Sorry, the maximum number of clients (3) for this user are already
> connected
> I restarted my computer and shut down the program for 30 minutes, hoping
> the server would reset the connection, but it didn't work. I've searched
> the internet but nothing helped.
> I noticed that this problem has already been mentioned on the ML but no
> solution was given. Could anyone help, please?
> Thanks
> Delong
>
>
> --
> Jennifer Hillman-Jackson
> Galaxy Support and Training
> http://galaxyproject.org
>
>

[galaxy-user] Upload files to Galaxy via FTP

2013-06-11 Thread Hoang, Thanh
Hi,
I have been trying to upload my RNA-seq data files (~20 Gb total) via FTP
using my account in Galaxy. But I could not connect to the server. The
following message showed up:
"Error: Critical error
Error: Could not connect to server"

Could you suggest how to overcome this problem?
Thank you so much
Thanh