We're experiencing a segmentation fault running blat on an x86_64 box 
with 128GB of RAM. We are running against NCBI's nt database (in FASTA 
form), using the following query sequences for testing (testst.fa in 
the example below):


 >Test1
ATCTCTACATCCGCCCACTCCCAAATCCGTTTTGTGCAACCAACCTCTATT
 >Test2
CCCCCACAGCAGCAGGAATAATCAAGGGGATGACAGGAAGAGNNNNNNNNN
 >Test3
AAGTAACCTAGACCTTAAAATTGTACATAGCCTCTCCGAGGANNNNNNNNN
 >Test4
TTCAAACTTAAGGAATGTAGTGTTGCGATGGGTACTCAACTGATCCCANTT
 >Test5
AGATGTGGTTCCACCCATAACTCAAGGGCAGATAGGAAACACCNNNNNNNN
 >Test6
AGGCAACCCCCGGCAGGATCATTCCAGGCACCGTGGGTTTCANNNNNNNNN
 >Test7
TCTTAGTGTTGAGTCAGACGCAAAGTTGAGACAGGGGAAAAGGCNNNNNNN
 >Test8
CTTCTACATGTTGGCTGCCAGTTAAACCAGCACCATTTGTTGCAAATGCTA
 >Test9
CCTCACTAACACAAATGTTGGAGGAAGTCTTGGGAGGCATCCTATTGATAC
 >Test10
TTTGTGTTCTGGGGCAGCTGGCTTTAGAAAGAGAACTCCAGGTCAANNNNG


We've recompiled blat 34 from source with -g. gdb reports the following 
when we do a backtrace:


(gdb) set args  nt.fa testst.fa testout1.psl -out=blast
(gdb) r
Starting program: /v/server1a/jlegato/bin/x86_64/blat nt.fa testst.fa 
testout1.psl -out=blast
Loaded 36318681436 letters in 14096376 sequences

Program received signal SIGSEGV, Segmentation fault.
gfFindClumpsWithQmask (gf=0x82f028d50, seq=<value optimized out>, 
qMaskBits=<value optimized out>, qMaskOffset=<value optimized out>, 
lm=<value optimized out>, retHitCount=<value optimized out>) at 
genoFind.c:1359
1359        slAddHead(pb, hit);
(gdb) bt
#0  gfFindClumpsWithQmask (gf=0x82f028d50, seq=<value optimized out>, 
qMaskBits=<value optimized out>, qMaskOffset=<value optimized out>, 
lm=<value optimized out>, retHitCount=<value optimized out>) at 
genoFind.c:1359
#1  0x000000000040acc5 in gfLongDnaInMem (query=0x7fff20f542c0, 
gf=0x82f028d50, isRc=0, minScore=30, qMaskBits=0x0, out=0xb30810, 
fastMap=0, band=0) at gfBlatLib.c:1530
#2  0x000000000040329e in searchOneStrand (seq=0x7fff20f542c0, 
gf=0x82f028d50, psl=<value optimized out>, isRc=0, maskHash=<value 
optimized out>, qMaskBits=0x0) at blat.c:200
#3  0x000000000040332c in searchOne (seq=0x7fff20f542c0, gf=0x82f028d50, 
f=0xb30510, isProt=0, maskHash=0x0, qMaskBits=0x0) at blat.c:241
#4  0x000000000040343f in searchOneMaskTrim (seq=0x6488c0, isProt=0, 
gf=0x82f028d50, outFile=0xb30510, maskHash=0x0, 
retTotalSize=0x7fff20f54358, retCount=0x7fff20f54364) at blat.c:310
#5  0x00000000004036a6 in searchOneIndex (fileCount=1, files=0xb30790, 
gf=0x82f028d50, outName=<value optimized out>, isProt=0, maskHash=0x0, 
outFile=0xb30510, showStatus=1) at blat.c:380
#6  0x00000000004039e9 in blat (dbFile=<value optimized out>, 
queryFile=<value optimized out>, outName=0x7fff20f547cd "testout1.psl") 
at blat.c:606
#7  0x0000000000404049 in main (argc=4, argv=0x7fff20f54528) at blat.c:783
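
One thing we can check from the "Loaded ... letters" line in the output above is whether the letter count still fits in a 32-bit unsigned integer. The sketch below is only arithmetic on the reported numbers. Whether blat actually stores database offsets in 32 bits is an assumption on our part that we have not confirmed in the source, and the helper name fits_uint32 is ours.

```python
# Numbers copied from the "Loaded 36318681436 letters in 14096376 sequences"
# line of the gdb session above.
letters = 36318681436
sequences = 14096376

UINT32_MAX = 2**32 - 1  # largest value a 32-bit unsigned integer can hold


def fits_uint32(n):
    """Return True if n can be represented as a 32-bit unsigned integer."""
    return 0 <= n <= UINT32_MAX


# ASSUMPTION: this matters only if blat keeps offsets in 32-bit integers.
print("letters fit in uint32:  ", fits_uint32(letters))
print("sequences fit in uint32:", fits_uint32(sequences))

# How many times over the 32-bit range the database is, and what the letter
# count would become if it were silently truncated to 32 bits.
print("letters / 2**32:", letters / 2**32)
wrapped = letters % 2**32
print("letters mod 2**32:", wrapped)
```

If the letter count does not fit, a truncated offset could point at the wrong place in the index, which would be consistent with a crash inside the search code rather than in the output functions.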

We've also tried the psl output format, with similar results.

Does this suggest an error in the output functions? We're also wondering 
if the size of nt.fa (36GB) is just too large for blat. Any other ideas 
on what might be causing the segfault? We've had success with smaller 
databases.
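
Since smaller databases work for us, one way to test the size theory would be to split nt.fa into smaller FASTA files and run blat against each piece. Below is a minimal sketch of such a splitter. The function name split_fasta, the output prefix, and the letter budget are all our own illustrative choices and not part of blat; we have not verified what limit, if any, blat actually imposes.

```python
def split_fasta(path, out_prefix, max_letters):
    """Split a FASTA file into chunk files of at most max_letters letters.

    Records are never split: a record that would push the current chunk
    over the budget starts a new chunk, and a single record larger than
    the budget gets a chunk of its own. Returns the chunk file paths.
    """
    chunk_paths = []
    out = None             # currently open chunk file
    letters_in_chunk = 0   # sequence letters written to the current chunk
    record = []            # lines of the record being read
    record_letters = 0     # sequence letters in that record

    def flush_record():
        nonlocal out, letters_in_chunk, record, record_letters
        if not record:
            return
        over_budget = (letters_in_chunk > 0
                       and letters_in_chunk + record_letters > max_letters)
        if out is None or over_budget:
            if out is not None:
                out.close()
            chunk_path = f"{out_prefix}.{len(chunk_paths):04d}.fa"
            chunk_paths.append(chunk_path)
            out = open(chunk_path, "w")
            letters_in_chunk = 0
        out.writelines(record)
        letters_in_chunk += record_letters
        record = []
        record_letters = 0

    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                flush_record()          # a header closes the previous record
            else:
                record_letters += len(line.strip())
            record.append(line)
    flush_record()                      # write the final record
    if out is not None:
        out.close()
    return chunk_paths


# Hypothetical usage (file names and budget are illustrative only):
#   chunks = split_fasta("nt.fa", "nt_chunk", 2_000_000_000)
#   then run blat once per chunk with the same query file.
```

Each database sequence ends up in exactly one chunk, so the per-chunk results could in principle be concatenated afterwards, though we have not checked how that interacts with the blast output format.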

Thanks

John


_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
