Hi, this may sound rather stupid, but since I'm not too familiar with how this "works", I'll just go ahead and ask.
I need some DNA data. I want to run some statistical tests on the frequencies of certain "words" (combinations of DNA "letters"), and for that I need simple text files containing the A, G, C, T sequence of each chromosome (two files per chromosome, one for each direction). Is such a thing available anywhere, or is the human DNA only partially mapped? (I'm actually just as interested in the junk DNA as in the protein-coding parts.)

I went to http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/ and downloaded the file "est.fa.gz". It seems to contain some nice letters, but I can't understand what the ">AA000972 1" parts mean (is it a "place holder" for 972 "letters"?).

Hope you can help me out.
Thanks

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
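P.S. In case it helps to show what I am after, here is a rough Python sketch of the kind of word counting I have in mind. It rests on an assumption I have not confirmed: that the file is in FASTA format, where a line starting with ">" is a record header and not sequence data. The function names (read_fasta, reverse_complement, count_words) and the sample records are made up for illustration.

```python
from collections import Counter

# Translation table for complementing the four DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of text lines.

    Assumption: lines starting with '>' are record headers, and all
    following lines up to the next '>' are sequence for that record.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            # Upper-case so that 'acgt' and 'ACGT' are counted together.
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def reverse_complement(seq):
    """Return the sequence as read along the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def count_words(seq, k):
    """Count overlapping words of length k, skipping words with non-ACGT letters."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


# Made-up sample records standing in for a real file.
sample = [">rec1 1", "ACGTAC", "GT", ">rec2 1", "NNAC"]
records = list(read_fasta(sample))
forward = count_words(records[0][1], 2)
reverse = count_words(reverse_complement(records[0][1]), 2)
print(forward.most_common())
print(reverse.most_common())
```

For a real download, the lines could presumably come from something like gzip.open("est.fa.gz", "rt") instead of the sample list. Counting the reverse complement separately would mean I do not need a second file for the other direction.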
