Hi,

I noticed some discrepancies between the coding sequences that the Table
tool generates for known genes and the knownGenePep sequences posted on
May 10.

For example, for the human gene SIPA1 there are 5 known genes in hg18:
SIPA1 (uc009yqq.1) at chr11:65164878-65174965 
SIPA1 (uc009yqp.1) at chr11:65164168-65174965 
SIPA1 (uc009yqo.1) at chr11:65162171-65174965 
SIPA1 (uc001ofd.1) at chr11:65164168-65174965 
SIPA1 (uc001ofb.1) at chr11:65162171-65174965 

#name           chrom   strand  cdsStart        cdsEnd          proteinID
uc009yqo.1      chr11   +       65164968        65169167        NP_006738
uc001ofb.1      chr11   +       65164968        65174762        Q96FS4
uc009yqp.1      chr11   +       65164968        65169167        NP_694985
uc001ofd.1      chr11   +       65164968        65174762        Q96FS4
uc009yqq.1      chr11   +       65164968        65174762        Q96FS4

Getting the CDS sequences with the Table tool returns sequences of the
following lengths (in nucleotides):

uc009yqo.1      1149 
uc001ofb.1      3129
uc009yqp.1      1149
uc001ofd.1      3129
uc009yqq.1      3129

This corresponds to what is shown in the browser: two shorter and three
longer proteins, of 382 and 1042 aa respectively.
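
The arithmetic behind those two protein lengths can be checked with a
minimal Python sketch. The function name cds_nt_to_aa is made up for
illustration, and the sketch assumes the CDS returned by the Table tool
includes the stop codon:

```python
def cds_nt_to_aa(cds_length_nt, includes_stop=True):
    """Convert a CDS length in nucleotides to a protein length in aa.

    Assumption: the CDS length includes the stop codon, which does not
    code for an amino acid and is therefore subtracted.
    """
    if cds_length_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_length_nt // 3
    return codons - 1 if includes_stop else codons


# The two distinct CDS lengths returned by the Table tool.
for nt in (1149, 3129):
    print(nt, "nt ->", cds_nt_to_aa(nt), "aa")
```

With that assumption, 1149 nt gives 382 aa and 3129 nt gives 1042 aa.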

However, in knownGenePep.txt all 5 proteins are reported as the longer
version:

uc009yqo.1      1042
uc001ofb.1      1042
uc009yqp.1      1042
uc001ofd.1      1042
uc009yqq.1      1042
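
Putting the two sets of lengths side by side, here is a small Python
sketch that flags the transcripts where they disagree. The numbers are
copied from the listings above; the variable names are my own, and the
conversion again assumes the Table-tool CDS includes the stop codon:

```python
# CDS lengths in nucleotides, as returned by the Table tool.
table_cds_nt = {
    "uc009yqo.1": 1149,
    "uc001ofb.1": 3129,
    "uc009yqp.1": 1149,
    "uc001ofd.1": 3129,
    "uc009yqq.1": 3129,
}

# Protein lengths in amino acids, as found in knownGenePep.txt.
known_gene_pep_aa = {
    "uc009yqo.1": 1042,
    "uc001ofb.1": 1042,
    "uc009yqp.1": 1042,
    "uc001ofd.1": 1042,
    "uc009yqq.1": 1042,
}

mismatches = []
for name, nt in table_cds_nt.items():
    # Assumption: one codon is the stop codon and yields no amino acid.
    expected_aa = nt // 3 - 1
    if expected_aa != known_gene_pep_aa[name]:
        mismatches.append(name)

print("Transcripts with differing protein lengths:", mismatches)
```

Only the two transcripts with the shorter CDS (uc009yqo.1 and uc009yqp.1)
are flagged.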

Could you please clarify this?
Thanks a lot!
Irina Khrebtukova

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
