Re: [galaxy-user] run Bowtie to estimate Mean Inner Distance between Mate Pairs

2012-08-21 Thread Du, Jianguang
Hi All,
Thank you for your help. I understand how to do it now.
Jianguang


From: rshar...@bx.psu.edu [rshar...@bx.psu.edu]
Sent: Tuesday, August 21, 2012 11:15 AM
To: galaxy-user@lists.bx.psu.edu
Cc: Du, Jianguang
Subject: Re: [galaxy-user] run Bowtie to estimate Mean Inner Distance between 
Mate Pairs

Howdy Jianguang,

There's a more complete description of the SAM format in "The Sequence
Alignment/Map format and SAMtools", Li et al, Bioinformatics (2009).  And
you can find the latest specification for the format at
samtools.sourceforge.net .

In the spec, the terminology for the ISIZE field has been changed to TLEN,
template length, to allow for sequencing technologies that produce more
than two sequenced segments.  The description there is "the number of
bases from the leftmost mapped base to the rightmost mapped base".

So I think to convert to "inner distance between mate pairs" you would
typically take ISIZE (its absolute value, since the sign differs between
the two mates) and subtract the lengths of both mates.  Note that for
some technologies that value could be negative (which just means the mates
overlap).  You might need to take into account whether the mates have been
mapped with proper orientation; for example, if an inversion has flipped
one mate it has also carried that mate closer to or farther from the
other.
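
As a worked check of that subtraction, the numbers from the pair in the
original post (POS 163761156, MPOS 163761301, ISIZE 181) can be plugged in
directly. This is only a sketch: it assumes both mates are 36 bases long
(both CIGAR strings are 36M) and that neither alignment contains indels or
clipping; the variable names are illustrative.

```python
# Values taken from the read pair posted in this thread.
pos = 163761156     # POS of the forward read (1-based leftmost base)
mpos = 163761301    # MPOS, i.e. the leftmost base of the reverse mate
read_len = 36       # assumption: both mates are 36 bases (CIGAR 36M)

# Span from the leftmost mapped base to the rightmost mapped base.
rightmost = mpos + read_len - 1
isize = rightmost - pos + 1

# Inner distance: the unsequenced gap between the two mates.
inner_distance = isize - 2 * read_len

print(isize, inner_distance)
```

Under those assumptions ISIZE comes out as 181, matching the reported
value, and the inner distance is 181 - 36 - 36 = 109 bases. The 145 bp
difference between POS and MPOS is shorter than ISIZE by one read length
because MPOS marks the leftmost base of the second mate, not its rightmost.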

Bob H


> Hello Jianguang,
>
> On the Bowtie tool form itself, please find this text:
>
> Outputs
>
> The output is in SAM format, and has the following columns:
>
> Column  Description
>
>   1 QNAME  Query (pair) NAME
>   2 FLAG   bitwise FLAG
>   3 RNAME  Reference sequence NAME
>   4 POS    1-based leftmost POSition/coordinate of clipped sequence
>   5 MAPQ   MAPping Quality (Phred-scaled)
>   6 CIGAR  extended CIGAR string
>   7 MRNM   Mate Reference sequence NaMe ('=' if same as RNAME)
>   8 MPOS   1-based Mate POSition
>   9 ISIZE  Inferred insert SIZE
>  10 SEQ    query SEQuence on the same strand as the reference
>  11 QUAL   query QUALity (ASCII-33 gives the Phred base quality)
>  12 OPT    variable OPTional fields in the format TAG:VTYPE:VALUE
>
>
> The value of ISIZE is the total insert size for this read pair.
>
>
> Hopefully this helps!
>
> Jen
> Galaxy team
>
> On 8/16/12 2:34 PM, Du, Jianguang wrote:
>> Dear All,
>>
>> In order to figure out the Mean Inner Distance between Mate Pairs of my
>> paired-end RNA-seq datasets, I ran Bowtie (Map with Bowtie for Illumina)
>> with both forward and reverse datasets and mouse mm9 as reference
>> genome. Below I list the Bowtie output for only one pair of reads (I put
>> the fields on the left side):
>>
>> For the forward read
>>  ...snip...
>> Is ISIZE the insert size? The difference between POS and MPOS is
>> 145 bp, which is 36 bp shorter than ISIZE (181). My question is: if
>> ISIZE does mean insert size, how should I convert ISIZE into Mean Inner
>> Distance between Mate Pairs?
>>
>> Thanks,
>>
>> Jianguang Du
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/


Re: [galaxy-user] run Bowtie to estimate Mean Inner Distance between Mate Pairs

2012-08-21 Thread rsharris
Howdy Jianguang,

There's a more complete description of the SAM format in "The Sequence
Alignment/Map format and SAMtools", Li et al, Bioinformatics (2009).  And
you can find the latest specification for the format at
samtools.sourceforge.net .

In the spec, the terminology for the ISIZE field has been changed to TLEN,
template length, to allow for sequencing technologies that produce more
than two sequenced segments.  The description there is "the number of
bases from the leftmost mapped base to the rightmost mapped base".

So I think to convert to "inner distance between mate pairs" you would
typically take ISIZE (its absolute value, since the sign differs between
the two mates) and subtract the lengths of both mates.  Note that for
some technologies that value could be negative (which just means the mates
overlap).  You might need to take into account whether the mates have been
mapped with proper orientation; for example, if an inversion has flipped
one mate it has also carried that mate closer to or farther from the
other.
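
To turn that per-pair rule into a mean over a whole Bowtie run, one option
is to stream the SAM output and average over properly paired reads. The
following is a minimal sketch, not a Galaxy or SAMtools feature: the
function name `mean_inner_distance` is made up here, it assumes
tab-separated SAM text, and it approximates the mate's length by the
length of the current read's SEQ.

```python
def mean_inner_distance(sam_lines):
    """Average inner distance over properly paired reads in SAM text.

    Returns None when no usable pair is found.
    """
    total = 0
    count = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        isize = int(fields[8])
        seq = fields[9]
        # Keep only proper pairs (flag bit 0x2), and count each pair once
        # by using the mate that reports a positive ISIZE.
        if not (flag & 0x2) or isize <= 0:
            continue
        # Assumption: both mates have the same length as this read.
        total += isize - 2 * len(seq)
        count += 1
    return total / count if count else None
```

It could be called with an open file handle, for example
`mean_inner_distance(open("bowtie_output.sam"))`, where the file name is a
placeholder. Restricting the average to proper pairs is one way to set
aside the orientation problems mentioned above; whether that filter is
appropriate depends on the data.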

Bob H





Re: [galaxy-user] run Bowtie to estimate Mean Inner Distance between Mate Pairs

2012-08-20 Thread Jennifer Jackson

Hello Jianguang,

On the Bowtie tool form itself, please find this text:


Outputs

The output is in SAM format, and has the following columns:

  Column  Description

   1 QNAME  Query (pair) NAME
   2 FLAG   bitwise FLAG
   3 RNAME  Reference sequence NAME
   4 POS    1-based leftmost POSition/coordinate of clipped sequence
   5 MAPQ   MAPping Quality (Phred-scaled)
   6 CIGAR  extended CIGAR string
   7 MRNM   Mate Reference sequence NaMe ('=' if same as RNAME)
   8 MPOS   1-based Mate POSition
   9 ISIZE  Inferred insert SIZE
  10 SEQ    query SEQuence on the same strand as the reference
  11 QUAL   query QUALity (ASCII-33 gives the Phred base quality)
  12 OPT    variable OPTional fields in the format TAG:VTYPE:VALUE


The value of ISIZE is the total insert size for this read pair.
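
For anyone reading the columns outside Galaxy, a SAM record can be split
into the fields listed above with a few lines of Python. This is a rough
sketch rather than a full SAM parser: `parse_sam_line` is a hypothetical
helper, the column names follow the table on the tool form, and SEQ and
QUAL in the example are given as `*` placeholders rather than real read
data.

```python
SAM_COLUMNS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
               "MRNM", "MPOS", "ISIZE", "SEQ", "QUAL"]
INT_COLUMNS = {"FLAG", "POS", "MAPQ", "MPOS", "ISIZE"}


def parse_sam_line(line):
    """Split one tab-separated SAM alignment line into named fields."""
    parts = line.rstrip("\n").split("\t")
    record = {}
    for name, value in zip(SAM_COLUMNS, parts):
        # Numeric columns are converted so they can be used in arithmetic.
        record[name] = int(value) if name in INT_COLUMNS else value
    # Everything after the 11 mandatory columns is an optional field.
    record["OPT"] = parts[len(SAM_COLUMNS):]
    return record


# Forward read from this thread; SEQ and QUAL replaced by "*" placeholders.
example = "\t".join(["SRR322837.8.1", "99", "chr1", "163761156", "255",
                     "36M", "=", "163761301", "181", "*", "*",
                     "XA:i:1", "MD:Z:0A35", "NM:i:1"])
record = parse_sam_line(example)
print(record["ISIZE"], record["OPT"])
```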


Hopefully this helps!

Jen
Galaxy team



--
Jennifer Jackson
http://galaxyproject.org


[galaxy-user] run Bowtie to estimate Mean Inner Distance between Mate Pairs

2012-08-16 Thread Du, Jianguang
Dear All,

In order to figure out the Mean Inner Distance between Mate Pairs of my 
paired-end RNA-seq datasets, I ran Bowtie (Map with Bowtie for Illumina) with 
both forward and reverse datasets and mouse mm9 as reference genome. Below I 
list the Bowtie output for only one pair of reads (I put the fields on the left 
side):


For the forward read
QNAME:   SRR322837.8.1
FLAG:    99
RNAME:   chr1
POS:     163761156
MAPQ:    255
CIGAR:   36M
MRNM:    =
MPOS:    163761301
ISIZE:   181
SEQ:     NTGGATACTAGCCATAAATGAATT
QUAL:    %(,,')(())@@@2235885<<2@@@##
OPT:     XA:i:1  MD:Z:0A35  NM:i:1

For the reverse read
QNAME:   SRR322837.8.2
FLAG:    147
RNAME:   chr1
POS:     163761301
MAPQ:    255
CIGAR:   36M
MRNM:    =
MPOS:    163761156
ISIZE:   -181
SEQ:     TATTATGTCAATCTATGAAGAAGGACGGCGAGGTGA
QUAL:    GDBE@B>EEGDB=BD-=GG>GGGEDDG
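
The FLAG values 99 and 147 in the two records above can be unpacked bit by
bit to see how the mates were oriented. A small sketch follows; the bit
values come from the SAM specification, while the label strings and the
`decode_flag` helper are names chosen here for illustration.

```python
# (bit, label) pairs for the pairing-related SAM FLAG bits.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse_strand"),
    (0x20, "mate_reverse_strand"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
]


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [label for bit, label in FLAG_BITS if flag & bit]


print(decode_flag(99))   # forward read
print(decode_flag(147))  # reverse read
```

For 99 this gives paired, proper_pair, mate_reverse_strand and
first_in_pair; for 147 it gives paired, proper_pair, reverse_strand and
second_in_pair. In other words, the aligner reported this pair as properly
oriented, with the first mate on the forward strand and the second on the
reverse strand.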