Hi, Galt and John.
I found this year-old thread on the Genome list while searching for
messages about BLAT segfaults. I'm having a similar problem to John's:
using a 20GiB BLAT DB, I get a segfault at the same place, on different
hardware and with different data:
>> genoFind.c:1359
>> 1359 slAddHead(pb, hit);
What worries me is that the "blat" process terminated normally the
third time I ran it under "gdb". I've run memory diagnostics on the
server (2156GiB RAM) and no errors were reported. Has this problem been
resolved since the message was posted a year ago?
Bye,
Tony.
> Hi, John!
>
> I don't know of any size-limit such as you mention,
> but our databases are usually around 3 to 5GB and not 36GB.
>
> We will pass along your message to Jim Kent, the author of BLAT,
> and probably be getting back to you off-list.
>
> -Galt
>
> On 06/29/11 08:22, John Legato wrote:
>>
>> We're experiencing a segmentation fault running blat on an x86_64 box
>> with 128GB of RAM. We are running against NCBI's nt database (in FASTA
>> form). We are using the following query sequences for testing (testst.fa
>> in the example below):
>>
>>
>> >Test1
>> ATCTCTACATCCGCCCACTCCCAAATCCGTTTTGTGCAACCAACCTCTATT
>> >Test2
>> CCCCCACAGCAGCAGGAATAATCAAGGGGATGACAGGAAGAGNNNNNNNNN
>> >Test3
>> AAGTAACCTAGACCTTAAAATTGTACATAGCCTCTCCGAGGANNNNNNNNN
>> >Test4
>> TTCAAACTTAAGGAATGTAGTGTTGCGATGGGTACTCAACTGATCCCANTT
>> >Test5
>> AGATGTGGTTCCACCCATAACTCAAGGGCAGATAGGAAACACCNNNNNNNN
>> >Test6
>> AGGCAACCCCCGGCAGGATCATTCCAGGCACCGTGGGTTTCANNNNNNNNN
>> >Test7
>> TCTTAGTGTTGAGTCAGACGCAAAGTTGAGACAGGGGAAAAGGCNNNNNNN
>> >Test8
>> CTTCTACATGTTGGCTGCCAGTTAAACCAGCACCATTTGTTGCAAATGCTA
>> >Test9
>> CCTCACTAACACAAATGTTGGAGGAAGTCTTGGGAGGCATCCTATTGATAC
>> >Test10
>> TTTGTGTTCTGGGGCAGCTGGCTTTAGAAAGAGAACTCCAGGTCAANNNNG
>>
>>
>> We've recompiled blat 34 from source with -g; gdb reports the following
>> when we do a backtrace:
>>
>>
>> (gdb) set args nt.fa testst.fa testout1.psl -out=blast
>> (gdb) r
>> Starting program: /v/server1a/jlegato/bin/x86_64/blat nt.fa testst.fa
>> testout1.psl -out=blast
>> Loaded 36318681436 letters in 14096376 sequences
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> gfFindClumpsWithQmask (gf=0x82f028d50, seq=<value optimized out>,
>> qMaskBits=<value optimized out>, qMaskOffset=<value optimized out>,
>> lm=<value optimized out>, retHitCount=<value optimized out>) at
>> genoFind.c:1359
>> 1359 slAddHead(pb, hit);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
*** My debugging session shows a segfault at the same point as John's:
> atravis@bifx-cli:~/work/BLAT$ gdb /homes/atravis/bin/blat
> GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2) 7.4-2012.04
> Copyright (C) 2012 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu".
> For bug reporting instructions, please see:
> <http://bugs.launchpad.net/gdb-linaro/>...
> Reading symbols from /homes/atravis/bin/blat...done.
> (gdb) run -out=blast8 /data1/human/GRCh37/GRCh37.fof
> NNNTCTCTAGC_FIBfl_comp.fasta NNNTCTCTAGC_FIBfl_comp.blat
> Starting program: /homes/atravis/bin/blat -out=blast8
> /data1/human/GRCh37/GRCh37.fof NNNTCTCTAGC_FIBfl_comp.fasta
> NNNTCTCTAGC_FIBfl_comp.blat
> Loaded 21439866084 letters in 223 sequences
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000040a1de in clumpHits (gf=0xa45dd0, hitList=0x3ad1c88, minMatch=2)
> at genoFind.c:1359
> 1359 slAddHead(pb, hit);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> (gdb) where
> #0 0x000000000040a1de in clumpHits (gf=0xa45dd0, hitList=0x3ad1c88,
> minMatch=2) at genoFind.c:1359
> #1 0x000000000040b365 in gfFindClumpsWithQmask (gf=0xa45dd0,
> seq=0x7fffffffddb0, qMaskBits=0x0, qMaskOffset=0, lm=0x4667a0,
> retHitCount=0x7fffffffde20) at genoFind.c:1866
> #2 0x00000000004107ed in gfLongDnaInMem (query=0x7fffffffdf10, gf=0xa45dd0,
> isRc=0, minScore=30, qMaskBits=0x0, out=0xa45d70,
> fastMap=0, band=0) at gfBlatLib.c:1530
> #3 0x00000000004028ba in searchOneStrand (seq=0x7fffffffdf10, gf=0xa45dd0,
> psl=0x466510, isRc=0, maskHash=0x0, qMaskBits=0x0)
> at blat.c:200
> #4 0x0000000000402a19 in searchOne (seq=0x7fffffffdf10, gf=0xa45dd0,
> f=0x466510, isProt=0, maskHash=0x0, qMaskBits=0x0)
> at blat.c:241
> #5 0x0000000000402d04 in searchOneMaskTrim (seq=0x461880, isProt=0,
> gf=0xa45dd0, outFile=0x466510, maskHash=0x0,
> retTotalSize=0x7fffffffdfa0, retCount=0x7fffffffdfd8) at blat.c:310
> #6 0x0000000000402ffe in searchOneIndex (fileCount=1, files=0x466750,
> gf=0xa45dd0,
> outName=0x7fffffffe4bc "NNNTCTCTAGC_FIBfl_comp.blat", isProt=0,
> maskHash=0x0, outFile=0x466510, showStatus=1) at blat.c:380
> #7 0x0000000000403a98 in blat (dbFile=0x7fffffffe480
> "/data1/human/GRCh37/GRCh37.fof",
> queryFile=0x7fffffffe49f "NNNTCTCTAGC_FIBfl_comp.fasta",
> outName=0x7fffffffe4bc "NNNTCTCTAGC_FIBfl_comp.blat") at blat.c:606
> #8 0x00000000004041c4 in main (argc=4, argv=0x7fffffffe1b8) at blat.c:783
> (gdb) print pb
> $1 = (struct gfHit **) 0x3b553f0
> (gdb) print hit
> $2 = (struct gfHit *) 0x3a71a80
> (gdb) run -out=blast8 /data1/human/GRCh37/GRCh37.fof
> NNNTCTCTAGC_FIBfl_comp.fasta NNNTCTCTAGC_FIBfl_comp.blat
> The program being debugged has been started already.
> Start it from the beginning? (y or n) y
>
> Starting program: /homes/atravis/bin/blat -out=blast8
> /data1/human/GRCh37/GRCh37.fof NNNTCTCTAGC_FIBfl_comp.fasta
> NNNTCTCTAGC_FIBfl_comp.blat
> Loaded 21439866084 letters in 223 sequences
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000040a1de in clumpHits (gf=0xa45dd0, hitList=0x3ad1c88, minMatch=2)
> at genoFind.c:1359
> 1359 slAddHead(pb, hit);
> (gdb) q
> A debugging session is active.
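
*** One thing the two sessions have in common is the size reported on the
"Loaded ... letters" line. The sketch below is only a quick sanity check
under an assumption that nothing in this thread confirms: that target
positions are held in unsigned 32-bit integers somewhere in the index. The
letter counts are copied from the two sessions above; the function names
are made up for illustration and are not part of BLAT.

```python
# Letter counts copied from the "Loaded ... letters" lines in the two sessions.
LOADED_LETTERS = {
    "John (nt.fa)": 36318681436,
    "Tony (GRCh37.fof)": 21439866084,
}

# ASSUMPTION (not confirmed in this thread): positions are stored as
# unsigned 32-bit integers, so only 2**32 distinct offsets are available.
UINT32_LIMIT = 2 ** 32


def fits_in_uint32(letter_count):
    """Return True if positions 0..letter_count-1 all fit in 32 bits."""
    return letter_count <= UINT32_LIMIT


def wrapped_offset(position):
    """Value a position would take if truncated to an unsigned 32-bit integer."""
    return position % UINT32_LIMIT


if __name__ == "__main__":
    for label, count in LOADED_LETTERS.items():
        verdict = "fits" if fits_in_uint32(count) else "does NOT fit"
        print(f"{label}: {count} letters {verdict} in 32 bits")
```

If the assumption holds, both databases are several times larger than a
32-bit offset can address, which would be consistent with a crash that
depends on the data rather than on the hardware. If it does not hold, the
check says nothing about the cause.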