Dear Genome List,

While using the Table Browser to fetch different regions of the RefSeq gene models (genome: hg19, group: mRNA and EST tracks, table: RefSeq Genes, output format: BED), I find different numbers of refGene IDs. I would expect to find the same number of refGene IDs in each region's BED file.
For example, using the Table Browser I got all the 5' UTR and 3' UTR refGene regions. Here are the first lines of each respective BED file:

#5' UTR bed file
chr1 66999824 67000041 NM_032291_utr5_0_0_chr1_66999825_f 0 +

#3' UTR bed file
chr1 67208778 67210767 NM_032291_utr3_24_0_chr1_67208779_f 0 +

This is as expected, because the RefSeq ID (NM_032291) is in both files. I then parsed each file to get a non-redundant list of all the refGene IDs it contains, and found different numbers. For example, the RefSeq ID NM_000462 is only in the 3' UTR BED file and not in the 5' UTR one. In total there are 30408 and 30627 refGene IDs in the 5' and 3' UTR BED files, respectively.

Could you explain the discrepancy?

Thank you in advance.

Cheers,
Dave
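
P.S. For reference, here is a minimal sketch of the kind of parsing described above, not the exact script used. It assumes the name column (4th field) always starts with the accession followed by `_utr5_` or `_utr3_`, as in the two lines quoted above. The function name `refseq_ids` and the NM_000462 line with placeholder coordinates are hypothetical and only there for illustration.

```python
import re

# Assumed name layout, based on the two quoted lines:
#   <accession>_<utr5|utr3>_<n>_<n>_<chrom>_<start>_<strand>
NAME_RE = re.compile(r"^(?P<acc>[A-Z]{2}_\d+)_(?:utr5|utr3)_")


def refseq_ids(bed_lines):
    """Return the non-redundant set of RefSeq accessions found in BED lines."""
    ids = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blanks, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # no name column to parse
        match = NAME_RE.match(fields[3])
        if match:
            ids.add(match.group("acc"))
    return ids


# The first data line of each file, as quoted in the message.
utr5_lines = [
    "#5' UTR bed file",
    "chr1 66999824 67000041 NM_032291_utr5_0_0_chr1_66999825_f 0 +",
]
utr3_lines = [
    "#3' UTR bed file",
    "chr1 67208778 67210767 NM_032291_utr3_24_0_chr1_67208779_f 0 +",
    # Hypothetical line: placeholder coordinates, NOT real data.
    "chrN 100 200 NM_000462_utr3_0_0_chrN_101_f 0 +",
]

utr5_ids = refseq_ids(utr5_lines)
utr3_ids = refseq_ids(utr3_lines)

# Accessions present in one file but missing from the other.
only_in_utr3 = utr3_ids - utr5_ids
only_in_utr5 = utr5_ids - utr3_ids
print(len(utr5_ids), len(utr3_ids), sorted(only_in_utr3))
```

On the real files, each list of lines would be replaced by an open file handle; the two set differences then list exactly which accessions account for the mismatch in counts.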
