Hi, John! I don't know of any size limit such as you mention, but our databases are usually around 3 to 5GB, not 36GB.
We will pass along your message to Jim Kent, the author of BLAT, and probably be getting back to you off-list.

-Galt

On 06/29/11 08:22, John Legato wrote:
>
> We're experiencing a segmentation fault running blat on an x86_64 box
> with 128GB of RAM. We are running against NCBI's nt database (in FASTA)
> form. We are using the following query sequences for testing (testst.fa
> in the example below):
>
>
> >Test1
> ATCTCTACATCCGCCCACTCCCAAATCCGTTTTGTGCAACCAACCTCTATT
> >Test2
> CCCCCACAGCAGCAGGAATAATCAAGGGGATGACAGGAAGAGNNNNNNNNN
> >Test3
> AAGTAACCTAGACCTTAAAATTGTACATAGCCTCTCCGAGGANNNNNNNNN
> >Test4
> TTCAAACTTAAGGAATGTAGTGTTGCGATGGGTACTCAACTGATCCCANTT
> >Test5
> AGATGTGGTTCCACCCATAACTCAAGGGCAGATAGGAAACACCNNNNNNNN
> >Test6
> AGGCAACCCCCGGCAGGATCATTCCAGGCACCGTGGGTTTCANNNNNNNNN
> >Test7
> TCTTAGTGTTGAGTCAGACGCAAAGTTGAGACAGGGGAAAAGGCNNNNNNN
> >Test8
> CTTCTACATGTTGGCTGCCAGTTAAACCAGCACCATTTGTTGCAAATGCTA
> >Test9
> CCTCACTAACACAAATGTTGGAGGAAGTCTTGGGAGGCATCCTATTGATAC
> >Test10
> TTTGTGTTCTGGGGCAGCTGGCTTTAGAAAGAGAACTCCAGGTCAANNNNG
>
>
> We've recompiled blat 34 from source with -g, gdb reports the following
> when we do a back trace:
>
>
> (gdb) set args nt.fa testst.fa testout1.psl -out=blast
> (gdb) r
> Starting program: /v/server1a/jlegato/bin/x86_64/blat nt.fa testst.fa
> testout1.psl -out=blast
> Loaded 36318681436 letters in 14096376 sequences
>
> Program received signal SIGSEGV, Segmentation fault.
> gfFindClumpsWithQmask (gf=0x82f028d50, seq=<value optimized out>,
> qMaskBits=<value optimized out>, qMaskOffset=<value optimized out>,
> lm=<value optimized out>, retHitCount=<value optimized out>) at
> genoFind.c:1359
> 1359 slAddHead(pb, hit);
> (gdb) bt
> #0 gfFindClumpsWithQmask (gf=0x82f028d50, seq=<value optimized out>,
> qMaskBits=<value optimized out>, qMaskOffset=<value optimized out>,
> lm=<value optimized out>, retHitCount=<value optimized out>) at
> genoFind.c:1359
> #1 0x000000000040acc5 in gfLongDnaInMem (query=0x7fff20f542c0,
> gf=0x82f028d50, isRc=0, minScore=30, qMaskBits=0x0, out=0xb30810,
> fastMap=0, band=0) at gfBlatLib.c:1530
> #2 0x000000000040329e in searchOneStrand (seq=0x7fff20f542c0,
> gf=0x82f028d50, psl=<value optimized out>, isRc=0, maskHash=<value
> optimized out>, qMaskBits=0x0) at blat.c:200
> #3 0x000000000040332c in searchOne (seq=0x7fff20f542c0, gf=0x82f028d50,
> f=0xb30510, isProt=0, maskHash=0x0, qMaskBits=0x0) at blat.c:241
> #4 0x000000000040343f in searchOneMaskTrim (seq=0x6488c0, isProt=0,
> gf=0x82f028d50, outFile=0xb30510, maskHash=0x0,
> retTotalSize=0x7fff20f54358, retCount=0x7fff20f54364) at blat.c:310
> #5 0x00000000004036a6 in searchOneIndex (fileCount=1, files=0xb30790,
> gf=0x82f028d50, outName=<value optimized out>, isProt=0, maskHash=0x0,
> outFile=0xb30510, showStatus=1) at blat.c:380
> #6 0x00000000004039e9 in blat (dbFile=<value optimized out>,
> queryFile=<value optimized out>, outName=0x7fff20f547cd "testout1.psl")
> at blat.c:606
> #7 0x0000000000404049 in main (argc=4, argv=0x7fff20f54528) at blat.c:783
>
> We've also tried the psl output form with similar results.
>
> Does this suggest an error in the output functions? We're also wondering
> if the size of nt.fa (36GB) is just too large for blat. Any other ideas
> on what might be causing the segfault? We've had success with smaller
> databases.
>
> Thanks
>
> John
>
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
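
One check that may help narrow down the size question is plain arithmetic on the letter count gdb reported. The sketch below is only that: it tests whether 36318681436 fits in a 32-bit offset and how many pieces the database would need to be split into to stay under a chosen limit. Whether blat's index actually stores positions in 32 bits is an assumption here, not something established in this thread, and the helper names (`fits_in_uint32`, `wrapped_uint32`, `chunks_needed`) are hypothetical.

```python
# Letter count taken from the "Loaded ... letters" line in the gdb session.
LOADED_LETTERS = 36_318_681_436

# Largest value an unsigned 32-bit offset can hold.
# ASSUMPTION: 32-bit offsets are used for illustration only; the thread
# does not confirm how blat stores positions internally.
UINT32_MAX = 2**32 - 1


def fits_in_uint32(value: int) -> bool:
    """Return True if value can be stored in an unsigned 32-bit integer."""
    return 0 <= value <= UINT32_MAX


def wrapped_uint32(value: int) -> int:
    """Return what value becomes if truncated to 32 unsigned bits."""
    return value % 2**32


def chunks_needed(total: int, limit: int) -> int:
    """Return how many pieces of at most `limit` letters cover `total`."""
    return -(-total // limit)  # ceiling division


if __name__ == "__main__":
    print(fits_in_uint32(LOADED_LETTERS))         # False
    print(wrapped_uint32(LOADED_LETTERS))         # 1958943068
    print(chunks_needed(LOADED_LETTERS, 2**32))   # 9
```

If the 32-bit assumption holds, splitting nt.fa into pieces that each stay under the limit and running blat on each piece would be consistent with the success reported on smaller databases.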
