Thanks, Mera, for the test sequence. I found the alignment on chr6 of the grape assembly. It matches perfectly with what you found.
Aligning 772 bases perfectly, with a single one-base gap, is a very good alignment: very high scoring, with no significant probability that it could occur by chance. That is what the BLAST bit score and expectation values are telling us.

Earlier I wrote:

> The eval 0.0e+00 and bit score 1481.0 both seem to be unlikely.

That statement was incorrect; BLAT's blast8 output was right. I tracked this down and checked the arithmetic. Given the bit score of 1481, the expectation is 3e9 * 2^(-1481), which is about e^(-1005), so close to zero (0.0e+00) as makes no difference. You expect zero occurrences of an alignment this good and this long in a database of 3 billion nucleotides. The high bit score is another way of saying that the probability of this alignment occurring randomly is negligible.

So I have checked the blast8 output, and BLAT appears to be computing it correctly.

The one part of the calculation that I found arbitrary (although it works well for most mammals) is the search size, which is 3e9, or 3 billion nucleotides. This constant is currently hard-wired into the code, so if your assembly is very different in size, the expectation value in BLAT's blast8 output will be off by the corresponding factor (the bit score does not depend on it). One could probably adjust the values for a given genome size using a simple mathematical expression and a script.

If you want to understand the blast8 output of BLAT, it is worth reading up on BLAST scoring, e.g.
http://people.musc.edu/~hazards/WebBioInformatics/Similarity_Searching.html

-Galt

On Fri, 9 Jan 2009, Galt Barber wrote:

> > The eval 0.0e+00 and bit score 1481.0 both seem to be unlikely.
>
> What version of BLAT are you using?
> This could be a problem that has already been fixed.
>
> From kent/src/lib/blastOut.c
>
> fprintf(f, "%3.1e\t", blastzScoreToNcbiExpectation(axt->score));
> fprintf(f, "%d.0\n", blastzScoreToNcbiBits(axt->score));
>
> static double blastzScoreToNcbiExpectation(int bzScore)
> /* Convert blastz score to expectation in NCBI sense. */
> {
> double bits = bzScore * 0.0205;
> double logProb = -bits*log(2);
> return 3.0e9 * exp(logProb);
> }
>
> static int blastzScoreToNcbiBits(int bzScore)
> /* Convert blastz score to bit score in NCBI sense. */
> {
> return round(bzScore * 0.0205);
> }
>
> From kent/src/lib/axt.c
> The function for calculating the axt->score
> uses the blastz approach which has a simple
> gapOpen/gapExtension scoring mechanism.
>
> -Galt
>
> On Fri, 9 Jan 2009, Mera Vigyan wrote:
>
>> Good morning,
>> I have a particular BLAT alignment in blast8 format like this :
>>
>> TOCA136B20FOR1 scaffold_24 99.87 773 0 1 1 772 4129889 4129117 0.0e+00 1481.0
>>
>> here the columns represent :
>>
>> Query id, Subject id, % identity, alignment length, mismatches, gap openings,
>> q. start, q. end, s. start, s. end, e-value, bit score
>>
>> Here we have the alignment length as 773 with 1 gap. Then how come
>> the score is given as 1481.
>> What is the formula for score calculation in the "blast8" format ?
>> kindly explain how this is done.
>>
>> warm regards
>> Mera Vigyan
>> _______________________________________________
>> Genome maillist - [email protected]
>> http://www.soe.ucsc.edu/mailman/listinfo/genome
>>
> _______________________________________________
Genome maillist - [email protected]
http://www.soe.ucsc.edu/mailman/listinfo/genome
