Good Morning Assaf:

You can fetch the mRNA sequence for these genes from the genome browser. Click through on the genes in the display and select mRNA sequence.
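The tallies in the rest of this message come from faCount. For anyone without the kent tools at hand, a rough Python sketch of the same kind of per-sequence base count is given here. It is not the faCount source, only an approximation of its len/A/C/G/T/N columns; the names count_bases and format_row and the toy sequences are made up for illustration.

```python
from collections import Counter
from io import StringIO


def count_bases(fasta_lines):
    """Yield (name, Counter) for each record in an iterable of FASTA lines."""
    name, counts = None, Counter()
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, counts
            parts = line[1:].split()
            name = parts[0] if parts else ""
            counts = Counter()
        else:
            # Count case-insensitively so soft-masked bases are included.
            counts.update(line.upper())
    if name is not None:
        yield name, counts


def format_row(name, counts):
    """Tab-separated row: name, total length, then A, C, G, T and N counts."""
    total = sum(counts.values())
    per_base = [counts[base] for base in "ACGTN"]
    return "\t".join([name, str(total)] + [str(n) for n in per_base])


if __name__ == "__main__":
    # Toy input (hypothetical, not taken from the cat cDNA file).
    demo = StringIO(">toy1\nACGTNN\nacgt\n>toy2\nNNNA\n")
    print("#seq\tlen\tA\tC\tG\tT\tN")
    for rec_name, rec_counts in count_bases(demo):
        print(format_row(rec_name, rec_counts))
```

To run it on a real file, pass an open file handle to count_bases instead of the StringIO object. Comparing the A, C, G and T columns between two files, while ignoring N, is the check used throughout this message.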
Some of the N's you are removing are actually in the mRNA sequence. From the cDNA sequence you gave me, I find exactly the same sequence in the mRNA sequence from the genome browser, counting the sequence:

Your cDNA:

#seq                len   A    C    G    T    N
ENSFCAT00000010563  5409  904  813  982  451  2259

The mRNA from the genome browser:

#seq                len   A    C    G    T    N
ENSFCAT00000010563  1671  500  405  551  215  0
ENSFCAT00000010563  242   58   71   62   51   0
ENSFCAT00000010563  1267  346  337  369  185  30
total               3180  904  813  982  451  30

Note the 30 N's in the third bit of mRNA sequence. The amount of ACGT is exactly the same.

Your cDNA:

[hiram@okazaki /tmp] faCount Felis_catus.CAT.62.cdna.examples
#seq                len   A    C    G    T    N
ENSFCAT00000005360  675   139  105  134  108  189

The mRNA from the genome browser:

#seq                len   A    C    G    T    N
ENSFCAT00000005360  165   49   44   38   34   0
ENSFCAT00000005360  321   90   61   96   74   0
total               486   139  105  134  108  0

The amount of ACGT is exactly the same. The extra N's are only in the cDNA.

Your cDNA:

#seq                len   A    C    G    T    N
ENSFCAT00000015608  2295  416  472  406  388  613

The mRNA from the genome browser:

#seq                len   A    C    G    T    N
ENSFCAT00000015608  493   103  126  142  122  0
ENSFCAT00000015608  1625  288  261  239  224  613
ENSFCAT00000015608  177   25   85   25   42   0
total               2295  416  472  406  388  613

The N sequence in this case is the same in the cDNA and the mRNA, with exactly the same ACGT sequence.

--Hiram

asas asasa wrote:
> Hi Hiram,
>
> My problem is how to know which part of the cDNA is mapped to which part of
> the scaffold.
> In fact this problem exists also in the main site, in the case of felCat3
> for example.
> Attached the cDNAs of 3 examples downloaded in ensembl 62 ftp. The exons
> starts/ends appear in the ensGene for felCat3. Accordingly:
>
> * the gene ENSFCAT00000010563 is mapped to 3 scaffolds and the sum of all
> mapped exon sizes is 3180 bp, while the total size of the cDNA is 5409 bp.
> After removing the Ns we get 3150 bp, which is still not equal.
>
> * For ENSFCAT00000005360 the exons sum is 486 bp, while the cDNA is 675, and
> only after removing the Ns we get 486 bp.
>
> * in ENSFCAT00000015608 the total size of cDNA is 2295 bp including Ns
> block, which is equal to the sum of exons.
>
>
> Best,
> Assaf

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
