Hi Shlomit,

For an answer to your first question, please see this previously answered mailing list question:
https://lists.soe.ucsc.edu/pipermail/genome/2007-May/013587.html
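
As a rough illustration of the threshold asked about in the quoted message (at least 54 matching bases for a 60mer, i.e. 90% of the whole probe), here is a minimal sketch that post-filters blat's default PSL output. It is an assumption on the editor's part, not necessarily the approach recommended in the linked post. The function name filter_psl and the min_percent parameter are hypothetical; the column positions follow the standard 21-column PSL layout.

```python
def filter_psl(lines, min_percent=90):
    """Yield PSL rows whose matching bases cover at least
    min_percent of the full query length (hypothetical helper)."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the PSL header and anything that is not a 21-column data row.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches = int(fields[0])      # column 1: matching bases outside repeats
        rep_matches = int(fields[2])  # column 3: matching bases inside repeats
        q_size = int(fields[10])      # column 11: query (probe) length
        # Integer comparison avoids floating-point rounding at the boundary.
        if 100 * (matches + rep_matches) >= min_percent * q_size:
            yield fields
```

For example, iterating over filter_psl(open("out.psl")) (out.psl being a placeholder file name) and counting the rows per query name in fields[9] would show whether a probe has more than one qualifying hit.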
Converting to .2bit is worth it for large files. The .2bit file will load faster than the fasta file. Use faToTwoBit with soft-masking (the default). Here is the description and usage statement:

faToTwoBit - Convert DNA from fasta to 2bit format
usage:
   faToTwoBit in.fa [in2.fa in3.fa ...] out.2bit
options:
   -noMask - Ignore lower-case masking in fa file.
   -stripVersion - Strip off version number after . for genbank accessions.
   -ignoreDups - only convert first sequence if there are duplicates

Soft-masking (lower-case means repeat-masked) is the default.

Please don't hesitate to contact the mail list again if you have any further questions.

Katrina Learned
UCSC Genome Bioinformatics Group

shlomit farkash wrote, On 04/14/10 01:58:
> Hello,
>
> Thanks for a very useful tool, blat. We are trying to use blat to design
> probes for microarray. We would like to see whether our probe is unique
> in the genome or has more than one match with at least 90% identity.
> The probes are 60mers and the genome is not an ordinary genome, but
> rather a mouse genome that was converted by bisulfite treatment - where
> every C was converted to T. In the manual, it says that the default for
> blat search is 90% identity for DNA sequences, but when trying to run a
> search, we also got matches that are 34 or 38 long. Is there a way to
> ask blat to report only on matches that are at least 90% identical out
> of all the 60mer (then, the smallest match could be 54 mer)? Should
> we change the minScore?
>
> Another question is, if we have the (converted) chromosomes in fasta
> format, should we convert it to 2bit format (for faster execution) and
> how?
>
> Thanks a lot,
>
> Shlomit Amar-Farkash
>
> The Hebrew University,
>
> Jerusalem, Israel
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
