Re: [galaxy-user] cufflinks FPKM problem

2011-04-11 Thread 李世勇
Hi Paul Korir,
   Thank you for your help. I now understand the cause, but I still have a small 
question about how to fix it.
   If I want to add an XS tag, what should I do? Can you explain in detail? For 
example, does it have only two possible values, XS:A:- and XS:A:+, and never 
something like XS:B:+ (with any of B through Z in place of A)? 
Best wishes
- Original Message -
From: "Paul Korir" 
To: "lishiyong" 
Cc: "tophat.cufflinks", "galaxy-user", "高欢" 
Sent: Monday, April 11, 2011, 11:10:56 PM
Subject: Re: [galaxy-user] cufflinks FPKM problem

Hi Li, 

TopHat includes a custom tag 'XS' at the end of spliced read alignments, which 
your pipeline is not aware of. 

The following is taken from http://cufflinks.cbcb.umd.edu/manual.html 


"Cufflinks takes a text file of SAM alignments as input. For more details on 
the SAM format, see the specification. The RNA-Seq read mapper TopHat produces 
output in this format, and is recommended for use with Cufflinks. However 
Cufflinks will accept SAM alignments generated by any read mapper. Here's an 
example of an alignment Cufflinks will accept: 

s6.25mer.txt-913508 16  chr1 4482736 255 14M431N11M * 0 0 \ 
   CAAGATGCTAGGCAAGTCTTGGAAG I NM:i:0 XS:A:- 

Note the use of the custom tag XS. This attribute, which must have a value of 
"+" or "-", indicates which strand the RNA that produced this read came from. 
While this tag can be applied to any alignment, including unspliced ones, it 
must be present for all spliced alignment records (those with a 'N' operation 
in the CIGAR string)." 

Kind regards, 

Paul 



2011/4/11 lishiyong < lishiy...@genomics.org.cn > 




Hi: 
I am using SOLiD paired-end sequencing data, mapped with the BioScope tools 
(supported by AB), which is better for mapping SOLiD data, so I did not use 
Bowtie for the mapping. I obtained a BAM file, and now I want to use Cufflinks 
to calculate gene expression, but I get an error. 

[15:08:06] Inspecting reads and determining fragment length distribution. 
BAM record error: found spliced alignment without XS attribute 
BAM record error: found spliced alignment without XS attribute 
The BAM file: 
323_358_201073  chr1343 0   45M5H   *   0   0   CCCTAACCCTACCCTAACCCTAACCCTAACCCTAACCCTAACCCT   III))C/1@?<4)='))415'-4118-'1)9>'+1'<6+'1)85+)-+6- CS:Z:T200230100231102301000301002301002301000320
 
423_236_195581  chr1550 0   8H42M   =   699451  698945  GTGCAGAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGG  GF>%%III))8?%%  RG:Z:20110328192522421   NH:i:2  CM:i:5  SM:i:3  CQ:Z:9BA;?AB:55;A%9?AB,4:@@*/)7>2<%5@<:3,;-.%8.*;5 CS:Z:T2030311033322303302232133302223222131122330223
 
298_1884_1495   113 chr1562 0   7H43M   chr3199392032   0   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGG 5AI;6:>A>?I7FIE  RG:Z:20110328192522421  NH:i:2  CM:i:0  SM:i:3  CQ:Z:BB@782:?A388.A&28(77;64.1*-/<&0:9/%3? CS:Z:T202212311122100303110333220033022321331022
 
62_1428_195489  chr1562 1   50M *   0   0   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGGAGAATGC  *=AIII4/CII=%%I((=EIII   RG:Z:20110328192522421  NH:i:0  CM:i:4  SM:i:0  CQ:Z:@B@BABB=ABBB?@A=B>>@@?<;?>B>=



-- 
Paul Korir 
www.paulkorir.com 

___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

Re: [galaxy-user] cufflinks FPKM problem

2011-04-11 Thread gaohuan
Thank you very much for your reply!

I'd like to know how to add this 'XS' tag, since far fewer reads map to the 
genome when we use TopHat. Can we just add a '+' or '-' at the end of 
each line?


2011-04-11 



gaohuan 



From: Ryan Golhar 
Sent: 2011-04-11  23:19:10 
To: lishiyong 
Cc: tophat.cufflinks; galaxy-user; 高欢 
Subject: Re: [galaxy-user] cufflinks FPKM problem 
 
Cufflinks requires an 'XS' tag on every spliced read in the BAM file. TopHat 
adds it, but other mappers may not. You can write a script to add it or remap 
with TopHat. 


How much of a difference do you see between tophat and bioscope?

Please excuse any typos -- Sent from my iPhone

On Apr 11, 2011, at 9:46 AM, lishiyong  wrote:


Hi:
I am using SOLiD paired-end sequencing data, mapped with the BioScope tools 
(supported by AB), which is better for mapping SOLiD data, so I did not use 
Bowtie for the mapping. I obtained a BAM file, and now I want to use Cufflinks 
to calculate gene expression, but I get an error.
[15:08:06] Inspecting reads and determining fragment length distribution.
BAM record error: found spliced alignment without XS attribute
BAM record error: found spliced alignment without XS attribute
The BAM file:
323_358_201073  chr1343 0   45M5H   *   0   0   
CCCTAACCCTACCCTAACCCTAACCCTAACCCTAACCCTAACCCT   
III))C/1@?<4)='))415'-4118-'1)9>'+1'<6+'1)85+)-+6- 
CS:Z:T200230100231102301000301002301002301000320
423_236_195581  chr1550 0   8H42M   =   699451  698945  
GTGCAGAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGG  
GF>%%III))8?%%  RG:Z:20110328192522421   NH:i:2 
 CM:i:5  SM:i:3  CQ:Z:9BA;?AB:55;A%9?AB,4:@@*/)7>2<%5@<:3,;-.%8.*;5 
CS:Z:T2030311033322303302232133302223222131122330223
298_1884_1495   113 chr1562 0   7H43M   chr3199392032   
0   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGG 
5AI;6:>A>?I7FIE  RG:Z:20110328192522421  NH:i:2 
 CM:i:0  SM:i:3  CQ:Z:BB@782:?A388.A&28(77;64.1*-/<&0:9/%3? 
CS:Z:T202212311122100303110333220033022321331022
62_1428_195489  chr1562 1   50M *   0   0   
ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGGAGAATGC  
*=AIII4/CII=%%I((=EIII   RG:Z:20110328192522421 
 NH:i:0  CM:i:4  SM:i:0  
CQ:Z:@B@BABB=ABBB?@A=B>>@@?<;?>B>=

Re: [galaxy-user] cufflinks FPKM problem

2011-04-11 Thread Adam Roberts
Since SOLiD reads are strand-specific, you can use the option
'--library-type fr-secondstrand', and the strand information will automatically
be added to the reads during the run.
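
For illustration, here is a minimal Python sketch that assembles the command from the original post with this option added. All file names and the other flags are copied from that post; whether `--library-type` is accepted depends on the installed Cufflinks version, so treat this as an assumption to check against `cufflinks --help` rather than a tested invocation.

```python
import shlex

# Arguments copied from the command in the original post; only the
# --library-type option (suggested in this message) is new.
cmd = [
    "cufflinks",
    "-G", "refGene_hg18.gtf",
    "-p", "3",
    "-r", "human_hg18.fa",
    "-o", "test",
    "--library-type", "fr-secondstrand",
    "test.pe.bam",
]

# Print the shell-quoted command line so it can be inspected first.
print(shlex.join(cmd))

# To actually run it (after `import subprocess`):
# subprocess.run(cmd, check=True)
```

Building the command as a list keeps each argument separate, so the option and its value cannot be split or merged by accident.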

-Adam

On Mon, Apr 11, 2011 at 8:27 AM, gaohuan  wrote:

>  Thank you very much for your reply!
>
> I'd like to know how to add this 'XS' tag, since far fewer reads map to the
> genome when we use TopHat. Can we just add a '+' or '-' at the end of each
> line?
>
>
> 2011-04-11
> --
>  gaohuan
> --
> *From:* Ryan Golhar
> *Sent:* 2011-04-11  23:19:10
> *To:* lishiyong
> *Cc:* tophat.cufflinks; galaxy-user; 高欢
> *Subject:* Re: [galaxy-user] cufflinks FPKM problem
>   Cufflinks requires an 'XS' tag on every spliced read in the BAM file. TopHat
> adds it, but other mappers may not. You can write a script to add it or remap
> with TopHat.
>
> How much of a difference do you see between tophat and bioscope?
>
> Please excuse any typos -- Sent from my iPhone
>
> On Apr 11, 2011, at 9:46 AM, lishiyong  wrote:
>
>   Hi:
> I am using SOLiD paired-end sequencing data, mapped with the BioScope tools
> (supported by AB), which is better for mapping SOLiD data, so I did not use
> Bowtie for the mapping. I obtained a BAM file, and now I want to use Cufflinks
> to calculate gene expression, but I get an error.
>  [15:08:06] Inspecting reads and determining fragment length distribution.
> BAM record error: found spliced alignment without XS attribute
> BAM record error: found spliced alignment without XS attribute
> The BAM file:
>
> 323_358_201073  chr1343 0   45M5H   *   0   0 
>   CCCTAACCCTACCCTAACCCTAACCCTAACCCTAACCCTAACCCT   
> III))C/1 ;7BI+'7))I?3   RG:Z:20110328192522421   NH:i:0  CM:i:4  SM:i:2  
> CQ:Z:A=ABA<<>@?<4)='))415'-4118-'1)9>'+1'<6+'1)85+)-+6- 
> CS:Z:T200230100231102301000301002301002301000320
>
> 423_236_195581  chr1550 0   8H42M   =   699451  
> 698945  GTGCAGAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGG  
> GF>%%III))8?%%  RG:Z:20110328192522421   
> NH:i:2  CM:i:5  SM:i:3  CQ:Z:9BA;?AB:55;A%9?AB,4:@
> @*/)7>2<%5@
> <:3,;-.%8.*;5 CS:Z:T2030311033322303302232133302223222131122330223
>
> 298_1884_1495   113 chr1562 0   7H43M   chr3199392032 
>   0   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGG 
> 5AI;6:>A>?I7FIE  RG:Z:20110328192522421  
> NH:i:2  CM:i:0  SM:i:3  CQ:Z:BB@7
>  =2;=>82:?A388.A&28(77;64.1*-/<&0:9/%3? 
> CS:Z:T202212311122100303110333220033022321331022
>
> 62_1428_195489  chr1562 1   50M *   0   0 
>   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGGAGAATGC  
> *=AIII4/CII=%%I((=EIII   
> RG:Z:20110328192522421  NH:i:0  CM:i:4  SM:i:0  CQ:Z:@B
> @BABB=ABBB?@A=B>>@@?<;?>B>= .4* CS:Z:T1313022202212311122100303110331222033022321331
>
> I have sorted the BAM file and the GTF file.
> cufflinks  -G refGene_hg18.gtf -p 3 -r  human_hg18.fa -o test  test.pe.bam
> (The Cufflinks version is v0.9.2.)
> Does anyone know the reason, and what should I do?
> Best wishes!
> Shiyong Li
> 2011-04-11
> --
> lishiyong
>

Re: [galaxy-user] cufflinks FPKM problem

2011-04-11 Thread Ryan Golhar
Cufflinks requires an 'XS' tag on every spliced read in the BAM file. TopHat 
adds it, but other mappers may not. You can write a script to add it or remap 
with TopHat. 
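
Such a script could look like the following minimal Python sketch. It is not part of Cufflinks or TopHat, and the function names `xs_strand` and `add_xs` are hypothetical. It works on tab-separated SAM text and assumes an fr-secondstrand library, in which read 1 (or an unpaired read) lies on the transcript strand and read 2 on the opposite strand; if your protocol differs, the strand rule must change.

```python
def xs_strand(flag):
    """Transcript strand for a read, ASSUMING an fr-secondstrand library:
    read 1 (or an unpaired read) lies on the transcript strand and
    read 2 lies on the opposite strand."""
    reverse = bool(flag & 0x10)      # read aligned to the reverse strand
    second_mate = bool(flag & 0x80)  # read is the second mate of a pair
    return "-" if reverse != second_mate else "+"


def add_xs(sam_line):
    """Append XS:A:<strand> to a spliced SAM record that has no XS tag.
    Headers, unspliced records and already tagged records are returned
    unchanged."""
    if sam_line.startswith("@"):
        return sam_line  # header line
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11 or "N" not in fields[5]:
        return sam_line  # incomplete record, or no 'N' operation in CIGAR
    if any(tag.startswith("XS:A:") for tag in fields[11:]):
        return sam_line  # already tagged
    fields.append("XS:A:" + xs_strand(int(fields[1])))
    return "\t".join(fields) + "\n"


# Demo on a made-up spliced record (flag 16 = unpaired, reverse strand).
record = "\t".join(["read1", "16", "chr1", "4482736", "255", "14M431N11M",
                    "*", "0", "0", "CAAGATGCTAGGCAAGTCTTGGAAG", "I" * 25,
                    "NM:i:0"]) + "\n"
print(add_xs(record), end="")
```

Note that the value is a complete optional field, `XS:A:+` or `XS:A:-`, separated from the previous field by a tab. A bare '+' or '-' at the end of the line would not be a valid SAM tag.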

How much of a difference do you see between tophat and bioscope?

Please excuse any typos -- Sent from my iPhone

On Apr 11, 2011, at 9:46 AM, lishiyong  wrote:

> Hi:
> I am using SOLiD paired-end sequencing data, mapped with the BioScope tools 
> (supported by AB), which is better for mapping SOLiD data, so I did not use 
> Bowtie for the mapping. I obtained a BAM file, and now I want to use Cufflinks 
> to calculate gene expression, but I get an error.
> [15:08:06] Inspecting reads and determining fragment length distribution.
> BAM record error: found spliced alignment without XS attribute
> BAM record error: found spliced alignment without XS attribute
> The BAM file:
> 323_358_201073  chr1343 0   45M5H   *   0   0 
>   CCCTAACCCTACCCTAACCCTAACCCTAACCCTAACCCTAACCCT   
> III))C/1 NH:i:0  CM:i:4  SM:i:2  
> CQ:Z:A=ABA<<>@?<4)='))415'-4118-'1)9>'+1'<6+'1)85+)-+6- 
> CS:Z:T200230100231102301000301002301002301000320
> 423_236_195581  chr1550 0   8H42M   =   699451  
> 698945  GTGCAGAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGG  
> GF>%%III))8?%%  RG:Z:20110328192522421   
> NH:i:2  CM:i:5  SM:i:3  
> CQ:Z:9BA;?AB:55;A%9?AB,4:@@*/)7>2<%5@<:3,;-.%8.*;5 
> CS:Z:T2030311033322303302232133302223222131122330223
> 298_1884_1495   113 chr1562 0   7H43M   chr3199392032 
>   0   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGG 
> 5AI;6:>A>?I7FIE  RG:Z:20110328192522421  
> NH:i:2  CM:i:0  SM:i:3  
> CQ:Z:BB@782:?A388.A&28(77;64.1*-/<&0:9/%3? 
> CS:Z:T202212311122100303110333220033022321331022
> 62_1428_195489  chr1562 1   50M *   0   0 
>   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGGAGAATGC  
> *=AIII4/CII=%%I((=EIII   
> RG:Z:20110328192522421  NH:i:0  CM:i:4  SM:i:0  
> CQ:Z:@B@BABB=ABBB?@A=B>>@@?<;?>B>= CS:Z:T1313022202212311122100303110331222033022321331
>  
> I have sorted the BAM file and the GTF file.
> cufflinks  -G refGene_hg18.gtf -p 3 -r  human_hg18.fa -o test  test.pe.bam 
> (The Cufflinks version is v0.9.2.) 
> Does anyone know the reason, and what should I do?
> Best wishes!
> Shiyong Li  
> 2011-04-11
> lishiyong

Re: [galaxy-user] cufflinks FPKM problem

2011-04-11 Thread Paul Korir
Hi Li,

TopHat includes a custom tag 'XS' at the end of spliced read alignments,
which your pipeline is not aware of.

The following is taken from http://cufflinks.cbcb.umd.edu/manual.html

"Cufflinks takes a text file of SAM alignments as input. For more details on
the SAM format, see the specification. The RNA-Seq read mapper TopHat produces output
in this format, and is recommended for use with Cufflinks. However Cufflinks
will accept SAM alignments generated by any read mapper. Here's an example
of an alignment Cufflinks will accept:

s6.25mer.txt-913508 16  chr1 4482736 255 14M431N11M * 0 0 \
   CAAGATGCTAGGCAAGTCTTGGAAG I NM:i:0 XS:A:-

Note the use of the custom tag XS. This attribute, which must have a value
of "+" or "-", indicates which strand the RNA that produced this read came
from. While this tag can be applied to any alignment, including unspliced
ones, it *must* be present for all spliced alignment records (those with a
'N' operation in the CIGAR string)."
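
The rule quoted above can be checked before running Cufflinks. The following is a minimal Python sketch, not code from Cufflinks or TopHat; the function names `is_spliced` and `missing_xs` are hypothetical. It assumes tab-separated SAM text and uses the example record from the manual, exactly as it appears in the quote, with and without its XS tag.

```python
import re


def is_spliced(cigar):
    """True if the CIGAR string contains an 'N' (skipped region) operation."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return any(op == "N" for _, op in ops)


def missing_xs(sam_line):
    """True if a spliced SAM record lacks a valid XS:A:+ or XS:A:- tag."""
    if sam_line.startswith("@"):
        return False  # header line
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return False  # not a complete alignment record
    cigar, tags = fields[5], fields[11:]
    has_xs = any(tag in ("XS:A:+", "XS:A:-") for tag in tags)
    return is_spliced(cigar) and not has_xs


# The example record from the manual, with and without its XS tag.
with_xs = "\t".join(["s6.25mer.txt-913508", "16", "chr1", "4482736", "255",
                     "14M431N11M", "*", "0", "0",
                     "CAAGATGCTAGGCAAGTCTTGGAAG", "I", "NM:i:0", "XS:A:-"])
without_xs = with_xs.rsplit("\t", 1)[0]

print(missing_xs(with_xs), missing_xs(without_xs))
```

Records for such a check could come from `samtools view test.pe.bam`; any line for which `missing_xs` returns true is one that Cufflinks would reject with the "spliced alignment without XS attribute" error.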

Kind regards,

Paul


2011/4/11 lishiyong 

>  Hi:
> I am using SOLiD paired-end sequencing data, mapped with the BioScope tools
> (supported by AB), which is better for mapping SOLiD data, so I did not use
> Bowtie for the mapping. I obtained a BAM file, and now I want to use Cufflinks
> to calculate gene expression, but I get an error.
>  [15:08:06] Inspecting reads and determining fragment length distribution.
> BAM record error: found spliced alignment without XS attribute
> BAM record error: found spliced alignment without XS attribute
> The BAM file:
>
> 323_358_201073  chr1343 0   45M5H   *   0   0 
>   CCCTAACCCTACCCTAACCCTAACCCTAACCCTAACCCTAACCCT   
> III))C/1 ;7BI+'7))I?3   RG:Z:20110328192522421   NH:i:0  CM:i:4  SM:i:2  
> CQ:Z:A=ABA<<>@?<4)='))415'-4118-'1)9>'+1'<6+'1)85+)-+6- 
> CS:Z:T200230100231102301000301002301002301000320
>
> 423_236_195581  chr1550 0   8H42M   =   699451  
> 698945  GTGCAGAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGG  
> GF>%%III))8?%%  RG:Z:20110328192522421   
> NH:i:2  CM:i:5  SM:i:3  CQ:Z:9BA;?AB:55;A%9?AB,4:@
> @*/)7>2<%5@
> <:3,;-.%8.*;5 CS:Z:T2030311033322303302232133302223222131122330223
>
> 298_1884_1495   113 chr1562 0   7H43M   chr3199392032 
>   0   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGG 
> 5AI;6:>A>?I7FIE  RG:Z:20110328192522421  
> NH:i:2  CM:i:0  SM:i:3  CQ:Z:BB@7
>  =2;=>82:?A388.A&28(77;64.1*-/<&0:9/%3? 
> CS:Z:T202212311122100303110333220033022321331022
>
> 62_1428_195489  chr1562 1   50M *   0   0 
>   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGGAGAATGC  
> *=AIII4/CII=%%I((=EIII   
> RG:Z:20110328192522421  NH:i:0  CM:i:4  SM:i:0  CQ:Z:@B
> @BABB=ABBB?@A=B>>@@?<;?>B>= .4* CS:Z:T1313022202212311122100303110331222033022321331
>
> I have sorted the BAM file and the GTF file.
> cufflinks  -G refGene_hg18.gtf -p 3 -r  human_hg18.fa -o test  test.pe.bam
> (The Cufflinks version is v0.9.2.)
> Does anyone know the reason, and what should I do?
> Best wishes!
> Shiyong Li
> 2011-04-11
> --
> lishiyong
>



-- 
Paul Korir
www.paulkorir.com

[galaxy-user] cufflinks FPKM problem

2011-04-11 Thread lishiyong
Hi:
I am using SOLiD paired-end sequencing data, mapped with the BioScope tools 
(supported by AB), which is better for mapping SOLiD data, so I did not use 
Bowtie for the mapping. I obtained a BAM file, and now I want to use Cufflinks 
to calculate gene expression, but I get an error.
[15:08:06] Inspecting reads and determining fragment length distribution.
BAM record error: found spliced alignment without XS attribute
BAM record error: found spliced alignment without XS attribute
The BAM file:
323_358_201073  chr1343 0   45M5H   *   0   0   
CCCTAACCCTACCCTAACCCTAACCCTAACCCTAACCCTAACCCT   
III))C/1@?<4)='))415'-4118-'1)9>'+1'<6+'1)85+)-+6- 
CS:Z:T200230100231102301000301002301002301000320
423_236_195581  chr1550 0   8H42M   =   699451  698945  
GTGCAGAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGG  
GF>%%III))8?%%  RG:Z:20110328192522421   NH:i:2 
 CM:i:5  SM:i:3  CQ:Z:9BA;?AB:55;A%9?AB,4:@@*/)7>2<%5@<:3,;-.%8.*;5 
CS:Z:T2030311033322303302232133302223222131122330223
298_1884_1495   113 chr1562 0   7H43M   chr3199392032   
0   ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGG 
5AI;6:>A>?I7FIE  RG:Z:20110328192522421  NH:i:2 
 CM:i:0  SM:i:3  CQ:Z:BB@782:?A388.A&28(77;64.1*-/<&0:9/%3? 
CS:Z:T202212311122100303110333220033022321331022
62_1428_195489  chr1562 1   50M *   0   0   
ACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTGAGGAGAATGC  
*=AIII4/CII=%%I((=EIII   RG:Z:20110328192522421 
 NH:i:0  CM:i:4  SM:i:0  
CQ:Z:@B@BABB=ABBB?@A=B>>@@?<;?>B>=