Re: [galaxy-user] Stats on BAM files

2011-04-05 Thread Sean Davis
On Tue, Apr 5, 2011 at 2:01 PM, Slim Sassi  wrote:
> Sean,
> I only wanted to start collecting stats with flagstats but knew that I
> needed something else to get everything needed.
> I would like to know:
> % that didn't pass QC

That information is not in the FASTQ files, so it will not be in the
BAM files, generally.

> % mapped

That information assumes that the aligner writes the unaligned
reads to the BAM file.  Tophat does not do that.
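
If the unaligned reads are missing from the BAM file, one way around it
is to take the denominator from the original FASTQ instead.  What follows
is only a rough sketch under several assumptions: single-end reads, a
FASTQ with exactly four lines per record, and alignments available as SAM
text.  The function names are made up for illustration, and the set of
read names will use a lot of memory on tens of millions of reads.

```python
import io

UNMAPPED = 0x4  # SAM FLAG bit meaning "segment unmapped"


def count_fastq_reads(handle):
    """Count FASTQ records, assuming exactly four lines per record."""
    return sum(1 for _ in handle) // 4


def count_aligned_reads(sam_handle):
    """Count distinct read names among SAM records without the 0x4 bit.

    Distinct names are used because a read reported at several
    locations appears as several records.
    """
    names = set()
    for line in sam_handle:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if not int(fields[1]) & UNMAPPED:
            names.add(fields[0])
    return len(names)


def percent_mapped(n_aligned, n_input):
    """Percentage of input reads that have at least one alignment."""
    return 100.0 * n_aligned / n_input


# Toy data: four input reads; r1 aligns twice, r4 does not align at all.
fastq = io.StringIO("".join(f"@r{i}\nACGT\n+\nIIII\n" for i in range(1, 5)))
sam = io.StringIO(
    "@HD\tVN:1.0\n"
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r1\t0\tchr2\t500\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t16\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r3\t0\tchr1\t300\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
)
print(percent_mapped(count_aligned_reads(sam), count_fastq_reads(fastq)))
```

Counting records rather than distinct names would overstate the mapped
fraction whenever reads have multiple reported hits.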

> % reads in exons/introns/intergenic regions

That is not something that flagstat provides.  However, there are a
number of other tools that could be coerced to give you this type of
information (including Galaxy, probably).
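
To make the idea concrete, here is a minimal sketch of the classification
itself.  It assumes you already have gene and exon intervals for one
chromosome as 0-based, half-open (start, end) pairs and one representative
position per read; the function names are hypothetical, and the linear
scan is for clarity only (a real annotation would want an interval index).

```python
from collections import Counter


def classify(pos, exons, genes):
    """Label a position as exon, intron, or intergenic.

    exons and genes are lists of 0-based, half-open (start, end) pairs
    on the same chromosome as pos.
    """
    if any(start <= pos < end for start, end in exons):
        return "exon"
    if any(start <= pos < end for start, end in genes):
        return "intron"  # inside a gene but outside every exon
    return "intergenic"


def region_percentages(positions, exons, genes):
    """Return the percentage of positions falling in each region class."""
    positions = list(positions)
    counts = Counter(classify(p, exons, genes) for p in positions)
    return {
        label: 100.0 * counts[label] / len(positions)
        for label in ("exon", "intron", "intergenic")
    }


# Toy annotation: one gene with two exons.
genes = [(100, 500)]
exons = [(100, 200), (400, 500)]
print(region_percentages([150, 300, 50, 450], exons, genes))
```

Spliced reads span more than one region, so using a single position per
read is a simplification.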

> and then, knowing that this is more complicated, I wanted to measure bias
> within transcripts (for example 3' versus 5'). Of course I am assuming that
> there is a consistent bias.

For this task, you may have to write some code unless someone else on
the list is aware of a package that does this for RNA-seq data.
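
As a starting point, the code could look something like the sketch below.
It assumes each read has already been assigned to a transcript and that
positions are in transcript (spliced) coordinates or that the transcript
has a single exon; intron handling is left out.  The function names and
the choice of ten bins are illustrative only.

```python
def relative_position(read_pos, tx_start, tx_end, strand):
    """Return the read position as a fraction along the transcript.

    0.0 is the 5' end and 1.0 is the 3' end, so minus-strand
    transcripts are flipped.
    """
    frac = (read_pos - tx_start) / (tx_end - tx_start)
    return frac if strand == "+" else 1.0 - frac


def coverage_profile(fractions, n_bins=10):
    """Histogram of relative positions, first bin at the 5' end."""
    bins = [0] * n_bins
    for frac in fractions:
        idx = min(int(frac * n_bins), n_bins - 1)  # keep 1.0 in the last bin
        bins[idx] += 1
    return bins


# Toy example: reads piled up near the 3' end of a plus-strand transcript.
fracs = [relative_position(p, 0, 1000, "+") for p in (50, 950, 990, 999)]
print(coverage_profile(fracs))
```

Summing such profiles over many transcripts and comparing the first and
last bins would give a crude measure of 5' versus 3' bias.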

>
> Thanks
> Slim
>
>
> On Apr 5, 2011, at 1:50 PM, Sean Davis wrote:
>
> On Apr 5, 2011 1:05 PM, "Slim Sassi"  wrote:
>>
>> Sean,
>> You are correct,  I did use tophat. Can you or anyone suggest a program
>> for BAM/SAM stats where the alignment was done with tophat?
>>
>
> Slim,
>
> What stats do you want to capture?  The output you gave for flagstats is
> correct for single-end tophat alignments.  All reads are aligned, none are
> paired, none are marked as duplicates.
>
> Sean
>
>> Thanks
>> Slim
>> On Apr 5, 2011, at 12:51 PM, Sean Davis wrote:
>>
>> > Hi, Slim.
>> >
>> > My guess is that you used an aligner that outputs only aligned reads
>> > (tophat, for example) and that the input was single-ended.  If that is
>> > the case, then what you see below is exactly as expected.  If not,
>> > then you might need to be more specific about how you generated the
>> > BAM file.
>> >
>> > Sean
>> >
>> > On Tue, Apr 5, 2011 at 12:31 PM, Slim Sassi
>> >  wrote:
>> >> Hello,
>> >> I tried to use NGS: SAM Tools ->flagstat on a BAM file for basic
>> >> stats, but
>> >> I got results like you see below. It doesn't seem to be working. Any
>> >> suggestions?
>> >> 26584869 in total
>> >>
>> >> 0 QC failure
>> >> 0 duplicates
>> >> 26584869 mapped (100.00%)
>> >> 0 paired in sequencing
>> >> 0 read1
>> >> 0 read2
>> >> 0 properly paired (-nan%)
>> >> 0 with itself and mate mapped
>> >> 0 singletons (-nan%)
>> >> 0 with mate mapped to a different chr
>> >> 0 with mate mapped to a different chr (mapQ>=5)
>> >>
>> >> Thanks
>> >> Slim
>> >>
>> >> The information in this e-mail is intended only for the person to whom
>> >> it is
>> >> addressed. If you believe this e-mail was sent to you in error and the
>> >> e-mail
>> >> contains patient information, please contact the Partners Compliance
>> >> HelpLine at
>> >> http://www.partners.org/complianceline . If the e-mail was sent to you
>> >> in
>> >> error
>> >> but does not contain patient information, please contact the sender and
>> >> properly
>> >> dispose of the e-mail.
>>
>
>

___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-user] Stats on BAM files

2011-04-05 Thread Slim Sassi
Sean,

I only wanted to start collecting stats with flagstats but knew that I needed 
something else to get everything needed. 
I would like to know:
% that didn't pass QC
% mapped
% reads in exons/introns/intergenic regions

and then, knowing that this is more complicated, I wanted to measure bias 
within transcripts (for example 3' versus 5'). Of course I am assuming that 
there is a consistent bias.


Thanks
Slim



On Apr 5, 2011, at 1:50 PM, Sean Davis wrote:

> 
> On Apr 5, 2011 1:05 PM, "Slim Sassi"  wrote:
> >
> > Sean,
> > You are correct,  I did use tophat. Can you or anyone suggest a program for 
> > BAM/SAM stats where the alignment was done with tophat?
> >
> 
> Slim,
> 
> What stats do you want to capture?  The output you gave for flagstats is 
> correct for single-end tophat alignments.  All reads are aligned, none are 
> paired, none are marked as duplicates.
> 
> Sean 
>   
> > Thanks
> > Slim
> > On Apr 5, 2011, at 12:51 PM, Sean Davis wrote:
> >
> > > Hi, Slim.
> > >
> > > My guess is that you used an aligner that outputs only aligned reads
> > > (tophat, for example) and that the input was single-ended.  If that is
> > > the case, then what you see below is exactly as expected.  If not,
> > > then you might need to be more specific about how you generated the
> > > BAM file.
> > >
> > > Sean
> > >
> > > On Tue, Apr 5, 2011 at 12:31 PM, Slim Sassi  
> > > wrote:
> > >> Hello,
> > >> I tried to use NGS: SAM Tools ->flagstat on a BAM file for basic stats, 
> > >> but
> > >> I got results like you see below. It doesn't seem to be working. Any
> > >> suggestions?
> > >> 26584869 in total
> > >>
> > >> 0 QC failure
> > >> 0 duplicates
> > >> 26584869 mapped (100.00%)
> > >> 0 paired in sequencing
> > >> 0 read1
> > >> 0 read2
> > >> 0 properly paired (-nan%)
> > >> 0 with itself and mate mapped
> > >> 0 singletons (-nan%)
> > >> 0 with mate mapped to a different chr
> > >> 0 with mate mapped to a different chr (mapQ>=5)
> > >>
> > >> Thanks
> > >> Slim
> > >>
> >
