Hi Andrew,

You might also be able to use the knownCanonical table instead of knownGene to get around this problem. The knownCanonical table contains a single transcript for each cluster of transcripts. (The full list of transcripts in each cluster is in the knownIsoforms table.) The only drawback to this approach is that you will lose some of the options for including or excluding parts of genes (introns, exons, UTRs, etc.) on the sequence retrieval options page.

If you need a different solution, please feel free to write back to this list, and we will try to help you find another way to approach the problem.
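
To illustrate the idea, here is a minimal Python sketch. It is not how the Table Browser itself is implemented. It uses the two transcript rows from the knownGene output quoted further down in this thread, keeps only transcripts listed as canonical, and computes a strand-aware promoter window of 1000 bp upstream and 50 bp downstream of the transcription start site. Which of the two transcripts is canonical is an assumption made purely for illustration, as are the helper names (`promoter`, `canonical_only`) and the treatment of coordinates as 0-based, half-open intervals.

```python
from collections import namedtuple

# Subset of knownGene columns, named as in the header quoted in this thread.
Transcript = namedtuple("Transcript", "name chrom strand txStart txEnd")

# The two rows from the knownGene output quoted in this thread.
known_gene = [
    Transcript("uc001meo.3", "chr11", "-", 6734382, 6743110),
    Transcript("uc010ras.1", "chr11", "-", 6737526, 6767670),
]

# ASSUMPTION: hypothetical knownCanonical content. Which transcript is
# actually canonical for this cluster is not stated in the thread.
known_canonical = {"uc001meo.3"}


def promoter(tx, upstream=1000, downstream=50):
    """Return (chrom, start, end) for the promoter window of one transcript.

    Coordinates are assumed to be 0-based, half-open. On the minus strand
    the transcription start site is at txEnd, so "upstream" means higher
    coordinates.
    """
    if tx.strand == "+":
        start, end = tx.txStart - upstream, tx.txStart + downstream
    else:
        start, end = tx.txEnd - downstream, tx.txEnd + upstream
    return tx.chrom, max(start, 0), end


def canonical_only(transcripts, canonical_names):
    """Keep only the transcripts whose name is in the canonical set."""
    return [tx for tx in transcripts if tx.name in canonical_names]


# One promoter per transcript: two different windows for the same gene.
all_promoters = [promoter(tx) for tx in known_gene]

# One promoter per cluster once the list is restricted to canonical transcripts.
canonical_promoters = [
    promoter(tx) for tx in canonical_only(known_gene, known_canonical)
]

print(all_promoters)
print(canonical_promoters)
```

Because each transcript has its own transcription start site, querying knownGene yields one promoter window per transcript, while restricting to the canonical set yields one per cluster.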

Please let us know if you have any additional questions: [email protected]

-
Greg Roe
UCSC Genome Bioinformatics Group

On 7/21/11 11:48 AM, Greg Roe wrote:
> Hi Andrew,
>
> I think what's happening is that there is more than one transcript for
> these genes, so when you query a single gene, e.g. Gvin1, you're
> getting multiple records, one for each transcript of that gene.
> Gvin1, for example, has two:
>
> #name       chrom  strand  txStart  txEnd    cdsStart  cdsEnd   exonCount  exonStarts        exonEnds          proteinID  alignID
> uc001meo.3  chr11  -       6734382  6743110  6735841   6741268  1          6734382,          6743110,          Q7Z2Y8     uc001meo.3
> uc010ras.1  chr11  -       6737526  6767670  6737527   6741268  2          6737526,6767579,  6743113,6767670,  Q7Z2Y8     uc010ras.1
>
> Please let us know if you have any additional questions:
> [email protected]
>
> -
> Greg Roe
> UCSC Genome Bioinformatics Group
>
> On 7/12/11 12:15 PM, Bissell, Andrew wrote:
>> Greetings,
>>
>> I am utilizing your database for the acquisition of promoter regions
>> of various genes within the Mus musculus genome of the mm9 build,
>> using the UCSC Genes track. My parameters are fairly basic, as I
>> have only been searching out the promoter region of each gene, to the
>> extent of 1000 bp upstream and 50 bp downstream, from the 5' side of
>> the transcription start site; and the 'One FASTA record per gene'
>> option is selected.
>>
>> My question is that, with the inclusion of only one known gene (e.g.,
>> Gvin1 or Lilrb3), why are multiple promoter region sequences provided
>> along different chromosomal coordinates for a single gene?
>>
>> Thank you kindly,
>>
>> Andrew
>> _______________________________________________
>> Genome maillist - [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
