We're experiencing a segmentation fault when running blat on an x86_64 box with 128GB of RAM. We are running against NCBI's nt database (in FASTA form). We are using the following query sequences for testing (testst.fa in the example below):
>Test1
ATCTCTACATCCGCCCACTCCCAAATCCGTTTTGTGCAACCAACCTCTATT
>Test2
CCCCCACAGCAGCAGGAATAATCAAGGGGATGACAGGAAGAGNNNNNNNNN
>Test3
AAGTAACCTAGACCTTAAAATTGTACATAGCCTCTCCGAGGANNNNNNNNN
>Test4
TTCAAACTTAAGGAATGTAGTGTTGCGATGGGTACTCAACTGATCCCANTT
>Test5
AGATGTGGTTCCACCCATAACTCAAGGGCAGATAGGAAACACCNNNNNNNN
>Test6
AGGCAACCCCCGGCAGGATCATTCCAGGCACCGTGGGTTTCANNNNNNNNN
>Test7
TCTTAGTGTTGAGTCAGACGCAAAGTTGAGACAGGGGAAAAGGCNNNNNNN
>Test8
CTTCTACATGTTGGCTGCCAGTTAAACCAGCACCATTTGTTGCAAATGCTA
>Test9
CCTCACTAACACAAATGTTGGAGGAAGTCTTGGGAGGCATCCTATTGATAC
>Test10
TTTGTGTTCTGGGGCAGCTGGCTTTAGAAAGAGAACTCCAGGTCAANNNNG

We've recompiled blat 34 from source with -g; gdb reports the following when we do a backtrace:

(gdb) set args nt.fa testst.fa testout1.psl -out=blast
(gdb) r
Starting program: /v/server1a/jlegato/bin/x86_64/blat nt.fa testst.fa testout1.psl -out=blast
Loaded 36318681436 letters in 14096376 sequences

Program received signal SIGSEGV, Segmentation fault.
gfFindClumpsWithQmask (gf=0x82f028d50, seq=<value optimized out>, qMaskBits=<value optimized out>, qMaskOffset=<value optimized out>, lm=<value optimized out>, retHitCount=<value optimized out>) at genoFind.c:1359
1359 slAddHead(pb, hit);
(gdb) bt
#0 gfFindClumpsWithQmask (gf=0x82f028d50, seq=<value optimized out>, qMaskBits=<value optimized out>, qMaskOffset=<value optimized out>, lm=<value optimized out>, retHitCount=<value optimized out>) at genoFind.c:1359
#1 0x000000000040acc5 in gfLongDnaInMem (query=0x7fff20f542c0, gf=0x82f028d50, isRc=0, minScore=30, qMaskBits=0x0, out=0xb30810, fastMap=0, band=0) at gfBlatLib.c:1530
#2 0x000000000040329e in searchOneStrand (seq=0x7fff20f542c0, gf=0x82f028d50, psl=<value optimized out>, isRc=0, maskHash=<value optimized out>, qMaskBits=0x0) at blat.c:200
#3 0x000000000040332c in searchOne (seq=0x7fff20f542c0, gf=0x82f028d50, f=0xb30510, isProt=0, maskHash=0x0, qMaskBits=0x0) at blat.c:241
#4 0x000000000040343f in searchOneMaskTrim (seq=0x6488c0, isProt=0, gf=0x82f028d50, outFile=0xb30510, maskHash=0x0, retTotalSize=0x7fff20f54358, retCount=0x7fff20f54364) at blat.c:310
#5 0x00000000004036a6 in searchOneIndex (fileCount=1, files=0xb30790, gf=0x82f028d50, outName=<value optimized out>, isProt=0, maskHash=0x0, outFile=0xb30510, showStatus=1) at blat.c:380
#6 0x00000000004039e9 in blat (dbFile=<value optimized out>, queryFile=<value optimized out>, outName=0x7fff20f547cd "testout1.psl") at blat.c:606
#7 0x0000000000404049 in main (argc=4, argv=0x7fff20f54528) at blat.c:783

We've also tried the psl output format, with similar results. Does this suggest an error in the output functions? We're also wondering if the size of nt.fa (36GB) is just too large for blat. Any other ideas on what might be causing the segfault? We've had success with smaller databases.

Thanks,
John

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
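
A note on the size question above: the following is a rough arithmetic sketch, not a diagnosis. It assumes that blat's index stores database positions as unsigned 32-bit offsets; we have not verified that in the source, and the helper names are made up for illustration. Under that assumption, the letter count reported at load time would not fit in an offset, which would at least be consistent with small databases working and nt.fa failing.

```python
# Letter count reported by blat when loading nt.fa (from the gdb session).
TOTAL_LETTERS = 36_318_681_436

# ASSUMPTION (not verified against the blat source): index positions are
# stored as unsigned 32-bit integers.
OFFSET_BITS = 32


def fits_in_offset(position, bits=OFFSET_BITS):
    """Return True if position can be stored in an unsigned offset of that width."""
    return 0 <= position < 2 ** bits


def wrapped_offset(position, bits=OFFSET_BITS):
    """Return the value position would wrap to if truncated to that width."""
    return position % 2 ** bits


def min_chunks(total, bits=OFFSET_BITS):
    """Return the smallest number of pieces that keeps every piece addressable."""
    return -(-total // 2 ** bits)  # ceiling division


if __name__ == "__main__":
    print("fits in 32 bits:", fits_in_offset(TOTAL_LETTERS))
    print("wraps to:", wrapped_offset(TOTAL_LETTERS))
    print("minimum pieces:", min_chunks(TOTAL_LETTERS))
```

If the assumption holds, nt.fa would need to be split into at least this many pieces, each searched separately. The real number would be somewhat higher, since the pieces have to break on sequence boundaries.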
