Dear Genome List,

While using the Table Browser to fetch different regions of RefSeq gene
models (genome: hg19, group: mRNA and EST tracks, table: RefSeq Genes,
output format: BED), I find different numbers of refGene ids. I would
expect to find the same number of refGene ids in each region's BED file.

For example, using the table browser I got all the 5' UTR and 3' UTR  
refGene regions. Here are the first lines of each respective bed file:

#5' UTR bed file
chr1    66999824        67000041        NM_032291_utr5_0_0_chr1_66999825_f      0       +

#3' UTR bed file
chr1    67208778        67210767        NM_032291_utr3_24_0_chr1_67208779_f     0       +

This is as expected, since the RefSeq id (NM_032291) appears in both. I
then parsed each file to get a non-redundant list of all the refGene ids
it contains, and the two lists differ in size. For example, the RefSeq id
NM_000462 is only in the 3' UTR BED file and not in the 5' UTR one.
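
A minimal sketch of the kind of parsing described above, in Python. It is
an illustration and not necessarily the exact script used: the function
names are hypothetical, and it assumes that every name field (4th column)
starts with the accession followed by _utr5_ or _utr3_, as in the two
example lines.

```python
import re

# Accession prefix of a name such as NM_032291_utr5_0_0_chr1_66999825_f.
# The pattern is an assumption based on the two example lines only.
ACCESSION = re.compile(r"^([A-Z]{2}_\d+)_utr[35]_")


def refseq_ids(bed_lines):
    """Return the set of distinct accessions found in the BED name column."""
    ids = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # no name column on this line
        match = ACCESSION.match(fields[3])
        if match:
            ids.add(match.group(1))
    return ids


def compare(utr5_lines, utr3_lines):
    """Return (ids only in the 5' UTR file, ids only in the 3' UTR file)."""
    utr5 = refseq_ids(utr5_lines)
    utr3 = refseq_ids(utr3_lines)
    return utr5 - utr3, utr3 - utr5
```

Passing the open file objects for the two BED files to compare() gives
the ids that are unique to each file, which is how an id such as
NM_000462 shows up as 3' UTR only.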

In total there are 30408 and 30627 refGene ids in the 5' and 3' UTR bed  
files, respectively.

Could you explain the discrepancy?

Thank you in advance.

Cheers,

Dave
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
