Hello Florian,

Hopefully we can help clarify the data a bit.

The results depend on how you did the filtering, but our guess is that 
you removed duplicates by comparing the coordinates directly. If so, the 
different counts are not surprising: two RefSeq transcripts can share 
the same 5' UTR but have a different 3' UTR (or the reverse), so they 
collapse into one entry in one list but stay as two entries in the 
other. Two RefSeq transcripts can also share the same 5' and 3' UTR but 
differ in the CDS (coding) region.
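
As a rough illustration of that effect, here is a minimal Python sketch. 
It assumes the filtering kept one entry per unique (chrom, start, end, 
strand) tuple and that the flanks are a fixed 1000 bases from the 
transcript ends; both are guesses about your procedure. The transcript 
names are made up for the example and are not real RefSeq accessions.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    name: str      # hypothetical identifier, not a real RefSeq accession
    chrom: str
    strand: str    # "+" or "-"
    tx_start: int  # 0-based start of the alignment
    tx_end: int    # end of the alignment (exclusive)


def flank(t, size, upstream):
    """Return the (chrom, start, end, strand) flank of a transcript."""
    # Upstream lies left of tx_start on "+" and right of tx_end on "-".
    left_side = (t.strand == "+") == upstream
    if left_side:
        return (t.chrom, max(0, t.tx_start - size), t.tx_start, t.strand)
    return (t.chrom, t.tx_end, t.tx_end + size, t.strand)


def count_unique_flanks(transcripts, size=1000):
    """Count distinct upstream and downstream regions by coordinates."""
    up = {flank(t, size, True) for t in transcripts}
    down = {flank(t, size, False) for t in transcripts}
    return len(up), len(down)


toy = [
    Transcript("NM_TOY_A", "chr2", "+", 10000, 20000),
    Transcript("NM_TOY_B", "chr2", "+", 10000, 25000),  # same start as A
    Transcript("NM_TOY_C", "chr2", "+", 12000, 30000),
    Transcript("NM_TOY_D", "chr2", "-", 40000, 50000),
]

print(count_unique_flanks(toy))
```

Here A and B share a start, so their upstream regions merge into one 
entry, while their downstream regions remain distinct. The two lists 
therefore end up with different lengths even though they come from the 
same four transcripts.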

The RefSeq track is not expected to contain any completely redundant 
entries that are not labeled with the same RefSeq identifier. That is, 
a single distinct RefSeq transcript may map to the genome in more than 
one location (as documented in the track Methods), but two or more 
distinct RefSeq transcripts would not be expected to be identical in 
content.

The only data processing behind the RefSeq Genes track in the UCSC 
Browser is aligning the transcripts to the genome (we do not curate the 
content of the dataset as published by NCBI). If you want to explore 
this further, the best advice is to examine the transcripts that you 
think are redundant and first eliminate those with the same identifier 
(NM_* name). If you do find two records that are exactly the same in 
the UCSC browser, confirm by examining the most recent records at NCBI. 
Then contact NCBI to share the redundancy evidence, as it would 
indicate a problem that they may or may not be aware of.
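
One way to run that check is sketched below in Python. It assumes the 
records have been exported as rows with genePred-style fields (name, 
chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonStarts, exonEnds); 
the row layout and the toy identifiers are assumptions made for this 
example. It compares alignment structure only, so any match still needs 
to be confirmed against the sequence records at NCBI.

```python
from collections import defaultdict

# Fields that together describe where and how a transcript aligns.
STRUCTURE_FIELDS = (
    "chrom", "strand", "txStart", "txEnd",
    "cdsStart", "cdsEnd", "exonStarts", "exonEnds",
)


def find_redundancy(rows):
    """Separate multi-mapped identifiers from truly identical records.

    Returns (multi_mapped, identical): identifiers that appear at more
    than one location, and groups of distinct identifiers whose aligned
    structure is exactly the same.
    """
    by_name = defaultdict(list)
    by_structure = defaultdict(set)
    for row in rows:
        key = tuple(row[f] for f in STRUCTURE_FIELDS)
        by_name[row["name"]].append(key)
        by_structure[key].add(row["name"])
    multi_mapped = sorted(n for n, keys in by_name.items() if len(keys) > 1)
    identical = [sorted(names) for names in by_structure.values()
                 if len(names) > 1]
    return multi_mapped, identical


def make_row(name, tx_start, tx_end):
    """Build a single-exon toy row; all values are illustrative."""
    return {
        "name": name, "chrom": "chr2", "strand": "+",
        "txStart": tx_start, "txEnd": tx_end,
        "cdsStart": tx_start + 100, "cdsEnd": tx_end - 100,
        "exonStarts": f"{tx_start},", "exonEnds": f"{tx_end},",
    }


rows = [
    make_row("NM_TOY_X", 1000, 2000),
    make_row("NM_TOY_X", 9000, 10000),  # same identifier, second location
    make_row("NM_TOY_Y", 5000, 6000),
    make_row("NM_TOY_Z", 5000, 6000),   # different identifier, same structure
    make_row("NM_TOY_W", 7000, 8000),
]

multi_mapped, identical = find_redundancy(rows)
print(multi_mapped, identical)
```

Identifiers in the first list are the expected kind of repeat (one 
transcript aligned to several places). Groups in the second list are the 
candidates worth checking at NCBI.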

Thanks!
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu/

On 2/26/10 3:29 AM, Florian Wagner wrote:
> Dear Sir/Madam,
>
> I fetched upstream and downstream regions for all RefSeq genes in a
> certain region of chr2, based on NCBI36/hg18. Initially, the lists had
> the same number of entries, but after filtering for redundant entries I
> ended up with slightly different numbers (138 for upstream, 130 for
> downstream). This is a bit surprising, as I would expect equal numbers
> (it is based on the same RefSeq genes). Could you please comment on this?
>
> Thank you and best regards,
> Florian Wagner
>
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
