Hello,

In FASTA format, each record represents a single sequence string (mRNA, DNA, etc.). In short, a record consists of:

1) a header line with a leading ">" followed by a unique identifier, then optionally some whitespace and a description
2) lines of equal length representing the sequence, continuing until the end of the sequence is reached

The actual line length can vary across datasets but must be consistent within the same dataset (all query lines the same, all target lines the same, but query and target can differ from each other).
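For anyone who wants to check a file against the description above, here is a minimal Python sketch. The function names parse_fasta and line_widths_consistent are made up for illustration and are not part of any UCSC tool. It assumes that blank lines can be ignored and that the last line of each record may be shorter than the others.

```python
from typing import Iterable, List, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return one (identifier, description, sequence) tuple per record."""
    records: List[Tuple[str, str, str]] = []
    ident: Optional[str] = None
    desc = ""
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no meaning
        if line.startswith(">"):
            if ident is not None:
                records.append((ident, desc, "".join(chunks)))
            # Header: identifier, then optional whitespace and description.
            parts = line[1:].split(None, 1)
            ident = parts[0] if parts else ""
            desc = parts[1] if len(parts) > 1 else ""
            chunks = []
        elif ident is None:
            raise ValueError("sequence data found before any '>' header")
        else:
            chunks.append(line)
    if ident is not None:
        records.append((ident, desc, "".join(chunks)))
    return records


def line_widths_consistent(lines: Iterable[str]) -> bool:
    """Check that every sequence line, except the last line of each
    record, has the same width. The last line of a record is not
    checked at all (assumption: it is allowed to differ)."""
    width: Optional[int] = None
    pending: Optional[int] = None  # width of the previous line in this record
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            pending = None  # previous line was the last of its record
            continue
        if pending is not None:
            # The previous line was followed by more sequence, so it
            # must be a full-width line.
            if width is None:
                width = pending
            elif pending != width:
                return False
        pending = len(line)
    return True
```

Both functions accept any iterable of text lines, so an open file handle can be passed in directly.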
An example of the format is:

>sequence_1
ATGCAGAGCAAGGTGCTGCTGGCCGTCGCCCTGTGGCTCTGCGTGGAGAC
CCGGGCCGCCTCTGTGGGTTTGCCTAGTGTTTCTCTTGATCTGCCCAGGC
>sequence_2
ATGTTGTTTACCGTAAGCTGTAGTAAAATGAGCTCGATTGTTGACAGAGA
TGACAGTAGTATTTTTGATGGGTTGGTGGAAGAAGATGACAAGGACAAAG
>sequence_3
ATGCTGCGAACAGAGAGCTGCCGCCCCAGGTCGCCCGCCGGACAGGTGGC
CGCGGCGTCCCCGCTCCTGCTGCTGCTGCTGCTGCTCGCCTGGTGCGCGG

Alignments between genomes are contained in the UCSC Browser under the Conservation, Chain and Net tracks. The data can be obtained from the Table browser or the Downloads area, or simply viewed in the Assembly browser.

FAQs for these data formats:
http://genome.ucsc.edu/FAQ/FAQformat#format5
http://genome.ucsc.edu/goldenPath/help/chain.html
http://genome.ucsc.edu/goldenPath/help/net.html

If you need help locating any of this data, please let us know. In general, clicking a track's name in the Assembly browser will point you to the correct database tables/files. You can also search the Downloads area for similar data by species.

Jennifer Jackson
UCSC Genome Bioinformatics Group

[email protected] wrote:
> Hello!
>
> Is it possible to drive alignments for whole genomes, say mouse versus
> rat, so that the results could be seen in FASTA format?
>
> K.M.
> Institute of biotechnology, Finland
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
