Hello, I downloaded the file "upstream1000.fa.gz" (sequences 1000 bases upstream of annotated transcription starts for RefSeq genes with annotated 5' UTRs). It seems that one NM name can sometimes have multiple sequences when it maps to different locations on the same or different chromosomes. For example, NM_175342 has 3 sequences, all from chr14; NM_023052 has 6 sequences, 2 from chrUn_random and 4 from chr4.
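To make the duplication concrete, here is a minimal Python sketch that groups FASTA header lines by NM name and reports the names that occur more than once. It assumes each header begins with the accession directly after `>` (something like `>NM_175342_up_1000_chr14_...`); that layout is an assumption about the file, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: the accession (NM_ followed by digits) comes right after '>'.
ACCESSION = re.compile(r"^>(NM_\d+)")

def group_headers_by_accession(lines):
    """Map each NM accession to the list of header lines that carry it."""
    groups = defaultdict(list)
    for line in lines:
        match = ACCESSION.match(line)
        if match:
            # Store the header without the leading '>' and trailing newline.
            groups[match.group(1)].append(line[1:].strip())
    return dict(groups)

def multi_copy_accessions(groups):
    """Keep only the accessions that appear with more than one header."""
    return {acc: hdrs for acc, hdrs in groups.items() if len(hdrs) > 1}
```

The input can be any iterable of text lines, for example the handle returned by `gzip.open("upstream1000.fa.gz", "rt")`. This only lists the duplicates; it does not decide which copy to keep.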
If I want to keep only one sequence for each NM name (the sequence analysis software I am using requires this), how should I decide which one to keep so that the choice makes the most sense?

Best,
Yunfei Li
--------------------------------------------------------------------------------------
Research Assistant
Department of Statistics & School of Molecular Biosciences
Biotechnology Life Sciences Building 427
Washington State University
Pullman, WA 99164-7520
Phone: 509-339-5096
http://www.wsu.edu/~ye_lab/people.html
