Hi everybody,
I downloaded the knownGene table with the table browser (default settings) to
use the annotations for some protein mapping.
Hsu's paper (The UCSC Known Genes, 2006) says "mRNA with the highest score is
selected as the representative mRNA for the protein" and "removing duplicates
having identical chromosome number, start and ending positions of coding
sequence"
So I expected one gene per protein (UniProt ID), but I found a couple
of genes coding for the same protein.
This is causing some trouble, because now I'm not sure which one to take for my
protein.
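
To make the problem concrete, here is a minimal sketch of how such cases can be listed. It assumes the Table Browser output is tab-separated with the twelve columns shown in the header further down; the function names and the `sample` data are only illustrative, not part of any UCSC tool.

```python
from collections import defaultdict

# Column order as in the "#name ..." header line of the Table Browser output.
COLUMNS = ["name", "chrom", "strand", "txStart", "txEnd", "cdsStart",
           "cdsEnd", "exonCount", "exonStarts", "exonEnds",
           "proteinID", "alignID"]


def parse_known_gene(lines):
    """Yield one dict per data row, skipping blank and '#' header lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")  # assumption: tab-separated output
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        yield dict(zip(COLUMNS, fields))


def transcripts_per_protein(lines):
    """Map each proteinID to its transcript names, keeping only IDs seen more than once."""
    groups = defaultdict(list)
    for row in parse_known_gene(lines):
        if row["proteinID"]:  # ignore rows without a protein ID
            groups[row["proteinID"]].append(row["name"])
    return {pid: names for pid, names in groups.items() if len(names) > 1}


# Illustrative input: the two chr11 rows from this post.
sample = [
    "#" + "\t".join(COLUMNS),
    "\t".join(["uc001lqz.2", "chr11", "+", "747431", "765023", "747481",
               "764845", "8",
               "747431,755878,758949,760121,763343,763746,764287,764812,",
               "747578,756002,759057,760253,763519,763944,764433,765023,",
               "P37837", "uc001lqz.2"]),
    "\t".join(["uc001lra.2", "chr11", "+", "747431", "765023", "747481",
               "764413", "8",
               "747431,755878,758949,760121,763343,763746,764287,764812,",
               "747578,756002,759057,760253,763519,763940,764433,765023,",
               "P37837", "uc001lra.2"]),
]

print(transcripts_per_protein(sample))
```

With a real download, `sample` would be replaced by the open file handle of the saved table.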
The "redundant" (?) genes share almost the same loci and/or CDS, but seem to
differ in the number of exons and/or splice sites.
So I'm afraid they don't code for the same amino acid sequence, which I thought
usually leads to different proteins (or are there many exceptions?).
Here is an example (I attached some more):
#name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds proteinID alignID
uc001lqz.2 chr11 + 747431 765023 747481 764845 8 747431,755878,758949,760121,763343,763746,764287,764812, 747578,756002,759057,760253,763519,763944,764433,765023, P37837 uc001lqz.2
uc001lra.2 chr11 + 747431 765023 747481 764413 8 747431,755878,758949,760121,763343,763746,764287,764812, 747578,756002,759057,760253,763519,763940,764433,765023, P37837 uc001lra.2
The second gene's cdsEnd is smaller and exon 6 ends 4 positions earlier but is
still in the cds.
I had a look at the sequences and there is a shift. So I think the probability
that they code for the same protein is pretty low.
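
The comparison above can be sketched in a few lines of Python. This is only an illustration, assuming the usual UCSC table convention of 0-based starts and exclusive ends (so a block's length is end minus start); the helper names are made up for this example.

```python
def parse_coords(field):
    """Turn a comma-separated coordinate list (with trailing comma) into ints."""
    return [int(x) for x in field.split(",") if x]


def cds_blocks(exon_starts, exon_ends, cds_start, cds_end):
    """Clip every exon to [cds_start, cds_end) and keep the non-empty parts."""
    blocks = []
    for start, end in zip(parse_coords(exon_starts), parse_coords(exon_ends)):
        start, end = max(start, cds_start), min(end, cds_end)
        if start < end:
            blocks.append((start, end))
    return blocks


def cds_length(blocks):
    """Total number of coding bases in a list of (start, end) blocks."""
    return sum(end - start for start, end in blocks)


exon_starts = "747431,755878,758949,760121,763343,763746,764287,764812,"

# uc001lqz.2
blocks_a = cds_blocks(
    exon_starts,
    "747578,756002,759057,760253,763519,763944,764433,765023,",
    747481, 764845)

# uc001lra.2: exon 6 ends 4 bases earlier and cdsEnd is smaller
blocks_b = cds_blocks(
    exon_starts,
    "747578,756002,759057,760253,763519,763940,764433,765023,",
    747481, 764413)

print(cds_length(blocks_a), len(blocks_a))  # 1014 8
print(cds_length(blocks_b), len(blocks_b))  # 957 7
```

Both totals are multiples of 3, but they differ (1014 vs. 957 coding bases), and in the second transcript the coding region already ends inside exon 7, so the two annotations cannot describe the same full-length amino acid sequence.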
But how should I handle those genes now? Do they code for protein isoforms,
which didn't get a unique protein id yet?
Maybe I got Hsu's paper wrong? Or did I just miss some information somewhere?
I hope you can help me with this. Thanks a lot.
Greetinz,
Mathias
#name chrom strand txStart txEnd cdsStart cdsEnd exonCount
exonStarts exonEnds proteinID alignID
---
uc002agn.2 chr15 - 60639350 60690185 60639828
60678274 14
60639350,60641273,60643391,60643922,60644581,60646352,60648117,60649344,60653139,60656627,60674540,60678226,60682411,60690141,
60639888,60641396,60643450,60644018,60644675,60646412,60648197,60649435,60653253,60656722,60674640,60678285,60682516,60690185,
P07355 uc002agn.2
uc002agk.2 chr15 - 60639350 60690185 60639828
60678274 14
60639350,60641273,60643391,60643922,60644581,60646352,60648117,60649344,60653139,60656627,60674540,60678226,60689456,60690141,
60639888,60641396,60643450,60644018,60644675,60646412,60648197,60649435,60653253,60656722,60674640,60678285,60689537,60690185,
P07355 uc002agk.2
uc002agl.2 chr15 - 60639350 60690185 60639828
60678274 13
60639350,60641273,60643391,60643922,60644581,60646352,60648117,60649344,60653139,60656627,60674540,60678226,60690141,
60639888,60641396,60643450,60644018,60644675,60646412,60648197,60649435,60653253,60656722,60674640,60678285,60690185,
P07355 uc002agl.2
---
uc002pjp.1 chr19 - 49118587 49121141 49118628
49120614 6 49118587,49119133,49119335,49119982,49120572,49121047,
49118704,49119203,49119459,49120081,49120680,49121141, Q07020 uc002pjp.1
uc002pjq.1 chr19 - 49118587 49122433 49118628
49122400 7
49118587,49119133,49119335,49119982,49120572,49121047,49122397,
49118704,49119203,49119459,49120081,49120680,49121134,49122433, Q07020
uc002pjq.1
---
uc001fum.1 chr1 - 159887902 159890349 159888589
159890299 4 159887902,159889063,159889450,159890119,
159888731,159889166,159889625,159890349, P37802 uc001fum.1
uc001fun.1 chr1 - 159887902 159895284 159888589
159890299 5 159887902,159889063,159889450,159890119,159895239,
159888731,159889166,159889625,159890327,159895284, P37802 uc001fun.1
uc001fuo.1 chr1 - 159887951 159895284 159889059
159890299 4 159887951,159889450,159890119,159895239,
159889166,159889625,159890327,159895284, P37802 uc001fuo.1
---
uc002fpg.1 chr16 + 89989744 90002466 90001297
90002212 4 89989744,89998978,89999875,90001141,
89989866,89999087,89999990,90002466, P68371 uc002fpg.1
uc010cjb.1 chr16 + 89998978 90002505 90001297
90002212 2 89998978,90001136, 89999087,90002505, P68371
uc010cjb.1
uc002fpk.1 chr16 + 90000319 90002505 90001297
90002212 2 90000319,90001136, 90000983,90002505, P68371
uc002fpk.1
---
uc011kjj.1 chr7 - 99752044 99755385 99752633
99754943 9
99752044,99752884,99753293,99754008,99754256,99754467,99754716,99754915,99755255,
99752787,99753078,99753448,99754172,99754339,99754615,99754837,99755082,99755385,
Q8WVR3 uc011kjj.1
uc003utr.2 chr7 - 99752044 99756302 99752633
99756122 11
99752044,99752884,99753293,99754008,99754256,99754467,99754716,99754995,99755255,99755465,99755711,
99752787,99753078,99753448,99754172,99754339,99754615,99754837,99755082,99755385,99755561,99756302,
Q8WVR3 uc003utr.2
---
uc004dzo.2 chrX + 70503041 70521016 70510487
70519926 13
70503041,70504246,70504833,70510478,70511628,70514076,70516414,70516700,70517226,70517685,70518318,70518556,70519791,
70503571,70504303,70504947,70510641,70511822,70514378,70516510,70516897,70517311,70517788,70518358,70518666,70521016,
Q15233 uc004dzo.2
uc004dzn.2 chrX + 70503041 70521016 70510487
70519926 12
70503041,70504246,70510478,70511628,70514076,70516414,70516700,70517226,70517685,70518318,70518556,70519791,
70503571,70504303,70510641,70511822,70514378,70516510,70516897,70517311,70517788,70518358,70518666,70521016,
Q15233 uc004dzn.2
uc004dzp.2 chrX + 70503041 70521016 70510487
70519926 11
70503041,70510478,70511628,70514076,70516414,70516700,70517226,70517685,70518318,70518556,70519791,
70503571,70510641,70511822,70514378,70516510,70516897,70517311,70517788,70518358,70518666,70521016,
Q15233 uc004dzp.2
---
uc001iou.2 chr10 + 17270257 17279592 17271421
17279270 10
17270257,17271274,17272648,17275585,17275768,17276691,17277167,17277844,17278292,17279228,
17270523,17271984,17272709,17275681,17275930,17276817,17277388,17277888,17278378,17279592,
P08670 uc001iou.2
uc001iox.1 chr10 + 17271236 17279592 17271421
17279270 9
17271236,17272648,17275585,17275768,17276691,17277167,17277844,17278292,17279228,
17271984,17272709,17275681,17275930,17276817,17277388,17277888,17278378,17279592,
P08670 uc001iox.1
uc009xjv.1 chr10 + 17271274 17279592 17271421
17279270 8
17271274,17272648,17275585,17275768,17277167,17277844,17278292,17279228,
17271984,17272709,17275681,17275930,17277388,17277888,17278378,17279592,
P08670 uc009xjv.1
---
uc003zxo.3 chr9 + 35673914 35681152 35673956
35681022 11
35673914,35675534,35675757,35676060,35676293,35677786,35679181,35679850,35680109,35680749,35680961,
35674359,35675564,35675928,35676203,35676386,35677853,35679339,35679995,35680136,35680831,35681152,
Q16790 uc003zxo.3
uc003zxp.3 chr9 + 35673914 35681152 35673956
35681022 10
35673914,35675534,35675757,35676060,35676293,35677786,35679181,35679850,35680749,35680961,
35674359,35675564,35675928,35676203,35676386,35677853,35679339,35679995,35680831,35681152,
Q16790 uc003zxp.3
---
uc002ynb.2 chr21 - 30428649 30446010 30428795
30445911 15
30428649,30432861,30433573,30433816,30434448,30434648,30434810,30435672,30437288,30439036,30439211,30439876,30441743,30442567,30445851,
30428873,30432981,30433738,30433888,30434564,30434736,30434877,30435851,30437426,30439098,30439392,30440026,30441823,30442658,30446010,
P50990 uc002ynb.2
uc002ync.2 chr21 - 30428649 30446010 30428795
30445911 15
30428649,30432861,30433573,30433816,30434448,30434648,30434810,30435672,30437288,30439036,30439211,30439872,30441743,30442567,30445851,
30428873,30432981,30433738,30433888,30434564,30434736,30434877,30435851,30437426,30439098,30439385,30440026,30441823,30442658,30446010,
P50990 uc002ync.2
---
uc004ang.3 chr9 - 86582998 86595184 86584264
86593167 17
86582998,86585076,86585651,86585811,86586187,86586586,86586796,86587758,86588200,86588816,86589431,86590376,86591909,86592603,86593109,86593287,86595067,
86584295,86585246,86585734,86585827,86586271,86586641,86587104,86587887,86588314,86588888,86589504,86590420,86591966,86592701,86593194,86593367,86595184,
P61978 uc004ang.3
uc004anj.3 chr9 - 86582998 86595184 86584264
86593167 16
86582998,86585076,86585651,86586174,86586586,86586796,86587758,86588200,86588816,86589431,86590376,86591909,86592603,86593109,86593287,86595067,
86584295,86585246,86585734,86586271,86586641,86587104,86587887,86588314,86588888,86589504,86590420,86591966,86592701,86593194,86593367,86595184,
P61978 uc004anj.3
uc004anm.3 chr9 - 86582998 86595569 86584264
86593167 17
86582998,86585076,86585651,86585811,86586187,86586586,86586796,86587758,86588200,86588816,86589431,86590376,86591909,86592603,86593109,86593287,86595417,
86584295,86585246,86585734,86585827,86586271,86586641,86587104,86587887,86588314,86588888,86589504,86590420,86591966,86592701,86593194,86593367,86595569,
P61978 uc004anm.3
---
uc001sjs.2 chr12 + 56546334 56551770 56547702
56551510 8
56546334,56547656,56548596,56548838,56549202,56551251,56551481,56551665,
56546415,56547876,56548624,56548982,56549376,56551329,56551519,56551770,
P14649 uc001sjs.2
uc001sjt.2 chr12 + 56546334 56551770 56547702
56551510 8
56546334,56547592,56548596,56548838,56549202,56551251,56551481,56551665,
56546415,56547876,56548624,56548982,56549376,56551329,56551519,56551770,
P14649 uc001sjt.2
uc001sju.1 chr12 + 56547656 56551716 56547702
56551510 6 56547656,56548596,56548838,56549202,56551251,56551481,
56547876,56548624,56548982,56549376,56551329,56551716, P14649 uc001sju.1
---
uc001lqz.2 chr11 + 747431 765023 747481 764845 8
747431,755878,758949,760121,763343,763746,764287,764812,
747578,756002,759057,760253,763519,763944,764433,765023, P37837
uc001lqz.2
uc001lra.2 chr11 + 747431 765023 747481 764413 8
747431,755878,758949,760121,763343,763746,764287,764812,
747578,756002,759057,760253,763519,763940,764433,765023, P37837
uc001lra.2
---
uc001fni.2 chr1 + 156084460 156109878 156084709
156108897 12
156084460,156100407,156104193,156104595,156104977,156105691,156106004,156106711,156106903,156107444,156108278,156108870,
156085065,156100564,156104319,156104766,156105103,156105912,156106227,156106819,156107023,156107534,156108548,156109878,
P02545 uc001fni.2
uc001fnk.2 chr1 + 156096345 156109878 156099640
156108897 13
156096345,156099617,156100407,156104193,156104595,156104977,156105691,156106004,156106711,156106903,156107444,156108278,156108870,
156096442,156099699,156100564,156104319,156104766,156105103,156105912,156106227,156106819,156107023,156107534,156108548,156109878,
P02545 uc001fnk.2
---
uc001ttu.2 chr12 - 112842994 112847443 112843027
112846460 7
112842994,112843656,112844097,112844550,112846043,112846223,112847214,
112843180,112843841,112844146,112844694,112846142,112846460,112847443, Q02878
uc001ttu.2
uc001ttv.2 chr12 - 112842994 112847443 112843027
112846460 7
112842994,112843656,112844097,112844550,112846043,112846223,112847390,
112843180,112843841,112844146,112844694,112846142,112846460,112847443, Q02878
uc001ttv.2