2009/4/23 Ryan Raaum <[email protected]>:
> The refseq entry tells you which non-refseq entry/entries it was
> derived from. In this case it says DQ386163, which suggests there are
> at least 2 potato chloroplast sequences available - one by an Italian
> group and one by a Korean group.

Right, I see. Any way to judge the quality of the two? In the RefSeq
record I read "PROVISIONAL REFSEQ: This record has not yet been subject
to final NCBI review." - Any way to kick them about that? i.e. Dear
RefSeq, I have DQ231562 and DQ386163, should they be merged into
NC_008096?

Thanks for the info,
Dan.

> On Thu, Apr 23, 2009 at 11:42 AM, Dan Bolser <[email protected]> wrote:
>> Hi,
>>
>> I found that the potato chloroplast sequence from GenBank (DQ231562.1)
>> has several differences (260 SNPs and 30 indels) relative to the same
>> sequence in RefSeq (NC_008096.1). As far as I am aware this sequence
>> has only been obtained once, why would the two differ? In general
>> should I trust the refseq sequence?
>>
>>
>> For your reference here is the output of dnadiff over the two files:
>>
>> Reference/DQ231562.fasta Query/NC_008096.fasta
>> NUCMER
>>
>>                     [REF]               [QRY]
>> [Sequences]
>> TotalSeqs           1                   1
>> AlignedSeqs         1(100.00%)          1(100.00%)
>> UnalignedSeqs       0(0.00%)            0(0.00%)
>>
>> [Bases]
>> TotalBases          155312              155298
>> AlignedBases        155312(100.00%)     155298(100.00%)
>> UnalignedBases      0(0.00%)            0(0.00%)
>>
>> [Alignments]
>> 1-to-1              1                   1
>> TotalLength         155312              155298
>> AvgLength           155312.00           155298.00
>> AvgIdentity         99.81               99.81
>>
>> M-to-M              1                   1
>> TotalLength         155312              155298
>> AvgLength           155312.00           155298.00
>> AvgIdentity         99.81               99.81
>>
>> [Feature Estimates]
>> Breakpoints         0                   0
>> Relocations         0                   0
>> Translocations      0                   0
>> Inversions          0                   0
>>
>> Insertions          0                   0
>> InsertionSum        0                   0
>> InsertionAvg        0.00                0.00
>>
>> TandemIns           0                   0
>> TandemInsSum        0                   0
>> TandemInsAvg        0.00                0.00
>>
>> [SNPs]
>> TotalSNPs           260                 260
>> AC                  23(8.85%)           14(5.38%)
>> AG                  24(9.23%)           30(11.54%)
>> AT                  15(5.77%)           14(5.38%)
>> CA                  14(5.38%)           23(8.85%)
>> CG                  24(9.23%)           18(6.92%)
>> CT                  32(12.31%)          19(7.31%)
>> GA                  30(11.54%)          24(9.23%)
>> GC                  18(6.92%)           24(9.23%)
>> GT                  13(5.00%)           34(13.08%)
>> TA                  14(5.38%)           15(5.77%)
>> TC                  19(7.31%)           32(12.31%)
>> TG                  34(13.08%)          13(5.00%)
>>
>> TotalGSNPs          113                 113
>> AC                  9(7.96%)            8(7.08%)
>> AG                  17(15.04%)          17(15.04%)
>> AT                  5(4.42%)            3(2.65%)
>> CA                  8(7.08%)            9(7.96%)
>> CG                  6(5.31%)            7(6.19%)
>> CT                  15(13.27%)          8(7.08%)
>> GA                  17(15.04%)          17(15.04%)
>> GC                  7(6.19%)            6(5.31%)
>> GT                  6(5.31%)            12(10.62%)
>> TA                  3(2.65%)            5(4.42%)
>> TC                  8(7.08%)            15(13.27%)
>> TG                  12(10.62%)          6(5.31%)
>>
>> TotalIndels         30                  30
>> A.                  14(46.67%)          4(13.33%)
>> C.                  1(3.33%)            0(0.00%)
>> G.                  0(0.00%)            0(0.00%)
>> T.                  7(23.33%)           4(13.33%)
>>
>> TotalGIndels        24                  24
>> A.                  10(41.67%)          4(16.67%)
>> C.                  1(4.17%)            0(0.00%)
>> G.                  0(0.00%)            0(0.00%)
>> T.                  5(20.83%)           4(16.67%)
>>
>>
>> Thanks for any pointers,
>> Dan.
>>
>> _______________________________________________
>> BBB mailing list
>> [email protected]
>> http://www.bioinformatics.org/mailman/listinfo/bbb
>>
>
> --
> Ryan Raaum
> Assistant Professor
> Department of Anthropology
> Lehman College, The City University of New York
> 250 Bedford Park Blvd. West
> Bronx, NY 10468
> e: [email protected]
> w: http://www.raaum.org
> o: (718) 960-8845
> f: (718) 960-8406
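
As a quick sanity check on the dnadiff report quoted above, the figures can be cross-checked against each other. The sketch below is only an illustration: the numbers are copied by hand from the [Bases], [SNPs] and TotalIndels rows, and the helper names (`column_sums`, `is_mirrored`) are made up here. It also assumes that an indel row such as "A." counts bases present in that column's sequence and missing from the other one, which is my reading of the report rather than something stated in the thread.

```python
# Figures copied by hand from the dnadiff report quoted above.
ref_len, qry_len = 155312, 155298  # TotalBases for DQ231562 and NC_008096

# [SNPs] rows: substitution -> (REF column count, QRY column count)
snps = {
    "AC": (23, 14), "AG": (24, 30), "AT": (15, 14),
    "CA": (14, 23), "CG": (24, 18), "CT": (32, 19),
    "GA": (30, 24), "GC": (18, 24), "GT": (13, 34),
    "TA": (14, 15), "TC": (19, 32), "TG": (34, 13),
}

# TotalIndels rows: base -> (REF column count, QRY column count)
indels = {"A.": (14, 4), "C.": (1, 0), "G.": (0, 0), "T.": (7, 4)}


def column_sums(rows):
    """Return the (REF, QRY) column totals for a block of report rows."""
    ref_total = sum(ref for ref, _ in rows.values())
    qry_total = sum(qry for _, qry in rows.values())
    return ref_total, qry_total


def is_mirrored(rows):
    """True if the REF count for X->Y equals the QRY count for Y->X."""
    return all(rows[key][0] == rows[key[::-1]][1] for key in rows)


snp_ref, snp_qry = column_sums(snps)
indel_ref, indel_qry = column_sums(indels)

print("SNP column totals:", snp_ref, snp_qry)
print("SNP table mirrored:", is_mirrored(snps))
print("Indel bases (REF only, QRY only):", indel_ref, indel_qry)
print("Net indel difference:", indel_ref - indel_qry)
print("Length difference:", ref_len - qry_len)
```

Under that assumption the report is internally consistent: both SNP columns add up to the reported 260, the two columns mirror each other, the indel rows add up to the reported 30, and the net indel count matches the difference between the two sequence lengths. So the differences look like real sequence-level differences between the two records rather than an artifact of the comparison.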
