On Thu, Apr 29, 2010 at 12:56 PM, Galt Barber <[email protected]> wrote:
> Looks like you'd have to just get the whole kent source.
>
> http://hgdownload.cse.ucsc.edu/admin/jksrc.zip
>
> There are several fixes which are done
> but not officially released yet as version 35.
>
> Here are the unreleased fixes from blat/version.doc:
> o (in 34x1) Making total query output reporting a 64 bit number to avoid
>   overflow when people using more than 4 gig of query sequence.
> o (in 34x2) Fixed -out=blast to use +/- instead of -/+ for non-translated.
> o (in 34x3) Fixed -minScore, filter was not working when over half
>   query-size.
> o (in 34x4) Made it convert u's to t's for RNA sequence stuff.
> o (in 34x5) Made gfServer calculate repMatch based on stepSize/tileSize
>   combination the way blat does
>   rather than just being good for stepSize 11.
> o (in 34x6) Fixed negative strand pcr psl output
> o (in 34x7) Made it check and error out if the same name is reused in the
>   target database.
>
> From looking at the usual BLAT source package,
> these are the only files/subdirs you need to keep:
>
> Apr 20 2007 blatSrc34.zip
>
> blatSrc
>
> cd blatSrc
> blatSrc> ls -l
> -rw-rw-r-- 1438 Feb 18  2005 README
> drwxrwxr-x  512 Apr 29 10:35 blat
> drwxrwxr-x  512 Apr 20  2007 gfClient
> drwxrwxr-x  512 Apr 20  2007 gfServer
> drwxrwxr-x  512 Feb 10  2004 hg
> drwxrwxr-x 3072 Apr 20  2007 inc
> drwxrwxr-x  512 Apr 20  2007 jkOwnLib
> drwxrwxr-x 3584 Apr 20  2007 lib
> -rw-rw-r--  402 Dec 16  2005 makefile
> drwxrwxr-x  512 Mar 25  2004 utils
> drwxrwxr-x  512 Apr 20  2007 webBlat
>
>
> The README just has this to say:
> ---
> CONTENTS AND COPYRIGHT
>
> This archive contains the entire source tree for BLAT and
> associated utilities. All files are copyrighted, but license
> is hereby granted for personal, academic, and non-profit use.
> A license is also granted for the contents of the top level
> lib and inc directories for commercial users.
> Commercial users should contact [email protected] for access to other modules.
>
> INSTALL INSTRUCTIONS
>
> 1. Unzip this to create a blatSrc directory.
> 2. Check that the environment variable MACHTYPE
>    exists on your system. It should on Unix.
>    (And making this on non-Unix systems is beyond
>    the scope of this README). For a Linux
>    system MACHTYPE will probably be 'i386', for
>    an Alpha it will be 'alpha', for a Sun
>    probably 'sparc'. If necessary set up
>    this environment variable. Do this under the
>    bash shell as so:
>        MACHTYPE=something
>        export MACHTYPE
>    or under tcsh as so:
>        setenv MACHTYPE something
> 3. Make the directory ~/bin/$MACHTYPE which is
>    where the (non-web) executables will go.
>    Add this directory to your path.
> 4. Go to the lib directory. If it doesn't
>    already exist do a mkdir $MACHTYPE.
> 5. If you're on an alpha system do a:
>        setenv SOCKETLIB -lxnet
>    on Solaris do
>        setenv SOCKETLIB "-lsocket -lnsl"
>    on SunOS do
>        setenv SOCKETLIB "-lsocket -lnsl -lresolv"
>    on Linux you can skip this step.
> 6. At the blatSrc directory type 'make'
> ---
>
> Here is the makefile:
> ---
> all:
>     cd lib && ${MAKE}
>     cd jkOwnLib && ${MAKE}
>     cd blat && $(MAKE)
>     cd gfClient && $(MAKE)
>     cd gfServer && $(MAKE)
>     cd hg/pslPretty && $(MAKE)
>     cd hg/pslReps && $(MAKE)
>     cd hg/pslSort && $(MAKE)
>     cd utils/nibFrag && $(MAKE)
>     cd utils/faToNib && $(MAKE)
>     cd utils/faToTwoBit && $(MAKE)
>     cd utils/twoBitToFa && $(MAKE)
>     cd utils/twoBitInfo && $(MAKE)
>     cd webBlat && $(MAKE)

Hi Galt,

I want to bring to your attention that running make in each directory
the way shown above may not be the best approach. I'd suggest using
'$(MAKE) -C' instead. With the current approach, the '-j' option (of GNU
make) does not work correctly for the subdirectories; with '$(MAKE) -C',
the '-j' option will work. Could you fix this?

> clean:
>     rm -f */*.o */*/*.o
> ---
>
> This should be enough for you to
> download and compile the latest blat source.
>
> Obviously, this should come with the simple
> warning that this is pre-official release
> code. However, we have been using it here
> without trouble.
>
> As usual with BLAT, it's good to remind people
> that the software is licensed. It's only free
> for academic, personal, non-commercial use.
> Commercial licenses may be purchased.
>
> -Galt
>
> On 4/29/2010 9:47 AM, Peng Yu wrote:
>>
>> On Tue, Apr 27, 2010 at 9:00 PM, Galt Barber <[email protected]> wrote:
>>>
>>> Hi, Peng!
>>>
>>> As the FAQ points out
>>> http://genome.ucsc.edu/FAQ/FAQblat.html
>>>
>>> "A note on filtering output: increasing the -minScore parameter value
>>> beyond one-half of the query size has no further effect. Therefore, use
>>> either the pslReps or pslCDnaFilter program available in the Genome
>>> Browser source code to filter for the size, score, coverage, or quality
>>> desired. For information on obtaining the source code, see our FAQ on
>>> source code licensing and downloads."
>>>
>>> This seems to have been an odd restriction
>>> which was removed at the urging of users,
>>> however, the change came only in 2008:
>>>
>>> blat/version.doc
>>> 1.72 (galt 09-Dec-08): (in blat version 34x3)
>>> Fixed -minScore, filter was not working when over half query-size.
>>> v197_branch: 1.72.0.2
>>>
>>> revision 1.72
>>> date: 2008/12/09 08:11:46; author: galt; state: Exp; lines: +1 -0
>>> fixing minScore
>>> ----------------------------
>>>
>>> galt
>>> Tue Dec 9 08:11:46 2008 +0000
>>> fixing minScore
>>> diff --git src/jkOwnLib/gfBlatLib.c src/jkOwnLib/gfBlatLib.c
>>> --- src/jkOwnLib/gfBlatLib.c
>>> +++ src/jkOwnLib/gfBlatLib.c
>>> @@ -18,7 +18,7 @@
>>>
>>>
>>>  static void saveAlignments(char *chromName, int chromSize, int chromOffset,
>>>          struct ssBundle *bun, struct hash *t3Hash,
>>>          boolean qIsRc, boolean tIsRc,
>>>          enum ffStringency stringency, int minMatch, struct gfOutput *out)
>>>  /* Save significant alignments to file in .psl format. */
>>>  {
>>>  struct dnaSeq *tSeq = bun->genoSeq, *qSeq = bun->qSeq;
>>>  struct ssFfItem *ffi;
>>> -if (minMatch > qSeq->size/2) minMatch = qSeq->size/2;
>>> -if (minMatch < 1) minMatch = 1;
>>>  for (ffi = bun->ffList; ffi != NULL; ffi = ffi->next)
>>>      {
>>>      struct ffAli *ff = ffi->ff;
>>>      struct trans3 *t3List = NULL;
>>>      int score;
>>>      if (t3Hash != NULL)
>>>          t3List = hashMustFindVal(t3Hash, tSeq->name);
>>>      score = scoreAli(ff, bun->isProt, stringency, tSeq, t3List);
>>>      if (score >= minMatch)
>>>          {
>>>          out->out(chromName, chromSize, chromOffset, ff, tSeq, t3Hash,
>>>                  qSeq, qIsRc, tIsRc, stringency, minMatch, out);
>>>          }
>>>      }
>>>  }
>>>
>>> See the two lines leading with "-" ?
>>> They were deleted. They seemed to be
>>> unneeded and causing unexpected behavior
>>> to users.
>>
>> Hi Galt,
>>
>> Where do I get the blat version that you have fixed?
>>
>>> Unfortunately, Jim Kent's official release
>>> seems to date back to 2007, but you could
>>> get the source and compile it.
>>>
>>> Any blat version after 34x3 should have the fix.
>>>
>>> With the newer version, the cutoff works more
>>> as you would expect. And for your example
>>> of a 25bp stretch of dna with one mismatch,
>>> your score would be +24 for the matches and
>>> -1 for the 1 mismatch, thus score=24-1==23.
>>>
>>> And thus if you use minScore of 23 or lower
>>> you can see the output psl record.
>>> -minScore=23
>>>
>>> As we mentioned before,
>>> you can just set minScore to zero and
>>> then filter the psl output
>>> with other tools afterwards.
>>>
>>> -Galt
>>>
>>> On 4/27/2010 3:35 PM, Peng Yu wrote:
>>>>
>>>> Hi Galt,
>>>>
>>>> Here is the command that I use. You mentioned "Generally people don't
>>>> much bother with using BLAT's own commandline options for minScore,
>>>> etc." But I want to understand what minScore is and when it can be
>>>> ignored. Would you please let me know?
>>>>
>>>>
>>>> $ blat -t=dna -q=dna -stepSize=5 -minScore=25 -maxGap=0 -noHead \
>>>>     database.fasta \
>>>>     query.fasta \
>>>>     query.psl
>>>> $ cat query.fasta
>>>> >test_sequence
>>>> cttgcaccggaaagtctgctccaga
>>>> $ cat database.fasta
>>>> >database_chr1
>>>> ctagcaccggaaagtctgctccaga
>>>> $ cat query.psl
>>>> 24 1 0 0 0 0 0 0 + test_sequence 25 0 25 database_chr1 25 0 25 1 25, 0, 0,
>>>>
>>>>
>>>>
>>>> On Mon, Apr 26, 2010 at 4:30 PM, Jennifer Jackson <[email protected]> wrote:
>>>>>
>>>>> Hello Peng,
>>>>>
>>>>> Very sorry, your reply went to the genome mailing list only, not to
>>>>> your email address as well. Our apologies.
>>>>>
>>>>> Here is the posting:
>>>>> https://lists.soe.ucsc.edu/pipermail/genome/2010-April/022012.html
>>>>>
>>>>> Jennifer
>>>>>
>>>>> ---------------------------------
>>>>> Jennifer Jackson
>>>>> UCSC Genome Informatics Group
>>>>> http://genome.ucsc.edu/
>>>>>
>>>>> On 4/24/10 12:09 PM, Peng Yu wrote:
>>>>>>
>>>>>> Could somebody answer the following question for me?
>>>>>>
>>>>>> On Wed, Apr 21, 2010 at 2:48 PM, Peng Yu <[email protected]> wrote:
>>>>>>>
>>>>>>> I'm wondering what "some sort of gap penalty" refers to. Also, when I
>>>>>>> query a 25bp sequence using the default, BLAT still gives the result.
>>>>>>> By definition, a 25bp sequence should have a score of at most 25,
>>>>>>> which is less than 30.
>>>>>>> Why does the query still return the result?
>>>>>>>
>>>>>>> -minScore=N sets minimum score. This is the matches minus the
>>>>>>> mismatches minus some sort of gap penalty. Default is 30
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Peng

--
Regards,
Peng
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
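
For illustration, the '$(MAKE) -C' suggestion made at the top of this message could be sketched roughly as follows. This is only a sketch, assuming GNU make; the variable names LIBDIRS and PROGDIRS are made up here, and the assumption that every program needs both lib and jkOwnLib built first is inferred from the order of the quoted makefile, not checked against the kent source tree. Giving each subdirectory its own target is what would let 'make -j' schedule several directories at once.

```makefile
# Sketch only, assuming GNU make. Recipe lines must begin with a tab.
# LIBDIRS and PROGDIRS are hypothetical names introduced for this example.
LIBDIRS  = lib jkOwnLib
PROGDIRS = blat gfClient gfServer \
           hg/pslPretty hg/pslReps hg/pslSort \
           utils/nibFrag utils/faToNib utils/faToTwoBit \
           utils/twoBitToFa utils/twoBitInfo webBlat

.PHONY: all clean $(LIBDIRS) $(PROGDIRS)

all: $(PROGDIRS)

# One phony target per directory; each runs a sub-make in that directory.
$(LIBDIRS) $(PROGDIRS):
	$(MAKE) -C $@

# Assumption: the libraries must exist before any program is linked,
# and are built in the same order as in the quoted makefile.
jkOwnLib: lib
$(PROGDIRS): jkOwnLib

clean:
	rm -f */*.o */*/*.o
```

With a layout like this, 'make -j4' at the top level would build lib, then jkOwnLib, and then the program directories in parallel, subject to the dependency assumption stated above.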
