Hi, I noticed some discrepancies between the coding sequences that the Table-tool generates for known genes and the knownGenePep sequences posted on May 10.
For example, the human gene SIPA1 has 5 known genes in hg18:

SIPA1 (uc009yqq.1) at chr11:65164878-65174965
SIPA1 (uc009yqp.1) at chr11:65164168-65174965
SIPA1 (uc009yqo.1) at chr11:65162171-65174965
SIPA1 (uc001ofd.1) at chr11:65164168-65174965
SIPA1 (uc001ofb.1) at chr11:65162171-65174965

with the following CDS coordinates and protein IDs:

#name       chrom  strand  cdsStart  cdsEnd    proteinID
uc009yqo.1  chr11  +       65164968  65169167  NP_006738
uc001ofb.1  chr11  +       65164968  65174762  Q96FS4
uc009yqp.1  chr11  +       65164968  65169167  NP_694985
uc001ofd.1  chr11  +       65164968  65174762  Q96FS4
uc009yqq.1  chr11  +       65164968  65174762  Q96FS4

Getting the CDS sequences with the Table-tool returns sequences of the following lengths (nt):

uc009yqo.1  1149
uc001ofb.1  3129
uc009yqp.1  1149
uc001ofd.1  3129
uc009yqq.1  3129

These correspond to what is shown in the browser: two shorter and three longer proteins, 382 and 1042 aa respectively.

However, in knownGenePep.txt all 5 proteins are reported as the longer version (aa):

uc009yqo.1  1042
uc001ofb.1  1042
uc009yqp.1  1042
uc001ofd.1  1042
uc009yqq.1  1042

Could you please clarify this?

Thanks a lot!
Irina Khrebtukova
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
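
P.S. For reference, here is a minimal sketch of the length check behind the numbers above. It assumes that each CDS length in nucleotides includes the stop codon, so the implied protein length is length/3 - 1 (1149 nt gives 382 aa, 3129 nt gives 1042 aa). The dictionary and function names are only illustrative, and the values are copied by hand from this message rather than read from the actual files.

```python
# CDS lengths (nt) returned by the Table-tool, copied from the message above.
cds_nt = {
    "uc009yqo.1": 1149,
    "uc001ofb.1": 3129,
    "uc009yqp.1": 1149,
    "uc001ofd.1": 3129,
    "uc009yqq.1": 3129,
}

# Protein lengths (aa) reported in knownGenePep.txt, also from the message.
pep_aa = {name: 1042 for name in cds_nt}


def expected_aa(nt_len):
    """Protein length implied by a CDS length.

    Assumption: the CDS includes the stop codon, hence the "- 1".
    """
    if nt_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return nt_len // 3 - 1


def find_mismatches(cds, pep):
    """Return {name: (implied_aa, reported_aa)} for entries that disagree."""
    return {
        name: (expected_aa(nt_len), pep[name])
        for name, nt_len in cds.items()
        if expected_aa(nt_len) != pep[name]
    }


mismatches = find_mismatches(cds_nt, pep_aa)
for name, (implied, reported) in sorted(mismatches.items()):
    print(f"{name}: CDS implies {implied} aa, knownGenePep reports {reported} aa")
```

With the values above, only the two transcripts with the shorter CDS (uc009yqo.1 and uc009yqp.1) are flagged.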
