I wish to draw your attention to discrepancies I found in the CDS file for the human data (please see the attached screenshots for the settings I used to download this file). For the accessions listed below, the length of the coding region is not a multiple of three; in other words, the coding sequence is incomplete. A manual check shows that the terminal exons in the CDS file are missing one or two bases at the end, although there may be other variations on this theme. Random checks of some of the same entries in the corresponding NCBI file show no discrepancy. I am listing some of the affected entries from the X chromosome below, but the problem likely exists for entries on other chromosomes as well.
Also, I find that some entries have been annotated on both strands (e.g. NM_001079538). Could you please have a look at both issues?

Affected accessions on the X chromosome:
NM_001101357 NM_001136234 NM_138702 NM_001004486 NM_003868 NM_005193 NM_001007524 NM_001013627 NM_033380 NM_001136273 NM_001011719 NM_001007523 NM_001079538
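
For reference, the multiple-of-three check described above can be sketched in Python roughly as follows. This is only a minimal sketch: it assumes the CDS file is in FASTA format with one record per accession, and the function names and the toy records are made up for illustration rather than taken from the downloaded file.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def incomplete_cds(handle):
    """Return (header, length, remainder) for records whose length is not a multiple of 3."""
    return [
        (header, len(seq), len(seq) % 3)
        for header, seq in read_fasta(handle)
        if len(seq) % 3 != 0
    ]


# Toy records (hypothetical, not real accessions): one complete, one a base short.
sample = io.StringIO(">toy_complete\nATGAAATAA\n>toy_short\nATGAAATA\n")
for header, length, remainder in incomplete_cds(sample):
    print(header, length, remainder)
```

On a real download, the `io.StringIO` object would be replaced by an open file handle. A remainder of 1 or 2 tells how many bases are left over after the last complete codon.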
