Hi, Galt and John.

I found this year-old thread on the Genome list while searching for
messages about BLAT segfaults. I'm having a similar problem to John's:
using a 20GiB BLAT DB, I get a segfault at the same place as he did, on
different hardware and with different data:

   >> genoFind.c:1359
   >> 1359        slAddHead(pb, hit);

What is worrying is that the "blat" process terminated normally the
third time I ran it under "gdb". I've run memory diagnostics on the
server (2156GiB RAM) and no errors were reported. Has this problem been
resolved since the message was posted a year ago?

Bye,

   Tony.

> Hi, John!
>
> I don't know of any size-limit such as you mention,
> but our databases are usually around 3 to 5GB and not 36GB.
>
> We will pass along your message to Jim Kent, the author of BLAT,
> and probably be getting back to you off-list.
>
> -Galt
>
> On 06/29/11 08:22, John Legato wrote:
>>
>> We're experiencing a segmentation fault running blat on an x86_64 box
>> with 128GB of RAM. We are running against NCBI's nt database (in FASTA
>> form). We are using the following query sequences for testing (testst.fa
>> in the example below):
>>
>>
>>  >Test1
>> ATCTCTACATCCGCCCACTCCCAAATCCGTTTTGTGCAACCAACCTCTATT
>>  >Test2
>> CCCCCACAGCAGCAGGAATAATCAAGGGGATGACAGGAAGAGNNNNNNNNN
>>  >Test3
>> AAGTAACCTAGACCTTAAAATTGTACATAGCCTCTCCGAGGANNNNNNNNN
>>  >Test4
>> TTCAAACTTAAGGAATGTAGTGTTGCGATGGGTACTCAACTGATCCCANTT
>>  >Test5
>> AGATGTGGTTCCACCCATAACTCAAGGGCAGATAGGAAACACCNNNNNNNN
>>  >Test6
>> AGGCAACCCCCGGCAGGATCATTCCAGGCACCGTGGGTTTCANNNNNNNNN
>>  >Test7
>> TCTTAGTGTTGAGTCAGACGCAAAGTTGAGACAGGGGAAAAGGCNNNNNNN
>>  >Test8
>> CTTCTACATGTTGGCTGCCAGTTAAACCAGCACCATTTGTTGCAAATGCTA
>>  >Test9
>> CCTCACTAACACAAATGTTGGAGGAAGTCTTGGGAGGCATCCTATTGATAC
>>  >Test10
>> TTTGTGTTCTGGGGCAGCTGGCTTTAGAAAGAGAACTCCAGGTCAANNNNG
>>
>>
>> We've recompiled blat 34 from source with -g; gdb reports the following
>> when we do a backtrace:
>>
>>
>> (gdb) set args  nt.fa testst.fa testout1.psl -out=blast
>> (gdb) r
>> Starting program: /v/server1a/jlegato/bin/x86_64/blat nt.fa testst.fa
>> testout1.psl -out=blast
>> Loaded 36318681436 letters in 14096376 sequences
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> gfFindClumpsWithQmask (gf=0x82f028d50, seq=<value optimized out>,
>> qMaskBits=<value optimized out>, qMaskOffset=<value optimized out>,
>> lm=<value optimized out>, retHitCount=<value optimized out>) at
>> genoFind.c:1359
>> 1359        slAddHead(pb, hit);
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
*** My debugging session shows a segfault at the same point as John's:

> atravis@bifx-cli:~/work/BLAT$ gdb /homes/atravis/bin/blat
> GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2) 7.4-2012.04
> Copyright (C) 2012 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu".
> For bug reporting instructions, please see:
> <http://bugs.launchpad.net/gdb-linaro/>...
> Reading symbols from /homes/atravis/bin/blat...done.
> (gdb) run -out=blast8 /data1/human/GRCh37/GRCh37.fof 
> NNNTCTCTAGC_FIBfl_comp.fasta NNNTCTCTAGC_FIBfl_comp.blat
> Starting program: /homes/atravis/bin/blat -out=blast8 
> /data1/human/GRCh37/GRCh37.fof NNNTCTCTAGC_FIBfl_comp.fasta 
> NNNTCTCTAGC_FIBfl_comp.blat
> Loaded 21439866084 letters in 223 sequences
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000040a1de in clumpHits (gf=0xa45dd0, hitList=0x3ad1c88, minMatch=2) 
> at genoFind.c:1359
> 1359      slAddHead(pb, hit);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> (gdb) where
> #0  0x000000000040a1de in clumpHits (gf=0xa45dd0, hitList=0x3ad1c88, 
> minMatch=2) at genoFind.c:1359
> #1  0x000000000040b365 in gfFindClumpsWithQmask (gf=0xa45dd0, 
> seq=0x7fffffffddb0, qMaskBits=0x0, qMaskOffset=0, lm=0x4667a0,
>     retHitCount=0x7fffffffde20) at genoFind.c:1866
> #2  0x00000000004107ed in gfLongDnaInMem (query=0x7fffffffdf10, gf=0xa45dd0, 
> isRc=0, minScore=30, qMaskBits=0x0, out=0xa45d70,
>     fastMap=0, band=0) at gfBlatLib.c:1530
> #3  0x00000000004028ba in searchOneStrand (seq=0x7fffffffdf10, gf=0xa45dd0, 
> psl=0x466510, isRc=0, maskHash=0x0, qMaskBits=0x0)
>     at blat.c:200
> #4  0x0000000000402a19 in searchOne (seq=0x7fffffffdf10, gf=0xa45dd0, 
> f=0x466510, isProt=0, maskHash=0x0, qMaskBits=0x0)
>     at blat.c:241
> #5  0x0000000000402d04 in searchOneMaskTrim (seq=0x461880, isProt=0, 
> gf=0xa45dd0, outFile=0x466510, maskHash=0x0,
>     retTotalSize=0x7fffffffdfa0, retCount=0x7fffffffdfd8) at blat.c:310
> #6  0x0000000000402ffe in searchOneIndex (fileCount=1, files=0x466750, 
> gf=0xa45dd0,
>     outName=0x7fffffffe4bc "NNNTCTCTAGC_FIBfl_comp.blat", isProt=0, 
> maskHash=0x0, outFile=0x466510, showStatus=1) at blat.c:380
> #7  0x0000000000403a98 in blat (dbFile=0x7fffffffe480 
> "/data1/human/GRCh37/GRCh37.fof",
>     queryFile=0x7fffffffe49f "NNNTCTCTAGC_FIBfl_comp.fasta", 
> outName=0x7fffffffe4bc "NNNTCTCTAGC_FIBfl_comp.blat") at blat.c:606
> #8  0x00000000004041c4 in main (argc=4, argv=0x7fffffffe1b8) at blat.c:783
> (gdb) print pb
> $1 = (struct gfHit **) 0x3b553f0
> (gdb) print hit
> $2 = (struct gfHit *) 0x3a71a80
> (gdb) run -out=blast8 /data1/human/GRCh37/GRCh37.fof 
> NNNTCTCTAGC_FIBfl_comp.fasta NNNTCTCTAGC_FIBfl_comp.blat
> The program being debugged has been started already.
> Start it from the beginning? (y or n) y
>
> Starting program: /homes/atravis/bin/blat -out=blast8 
> /data1/human/GRCh37/GRCh37.fof NNNTCTCTAGC_FIBfl_comp.fasta 
> NNNTCTCTAGC_FIBfl_comp.blat
> Loaded 21439866084 letters in 223 sequences
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000040a1de in clumpHits (gf=0xa45dd0, hitList=0x3ad1c88, minMatch=2) 
> at genoFind.c:1359
> 1359      slAddHead(pb, hit);
> (gdb) q
> A debugging session is active.
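
Both runs above load far more letters than fit in an unsigned 32-bit
integer, which is one reason a size limit might be suspected. Whether
that has anything to do with the crash at genoFind.c:1359 is only a
guess: nothing in this thread confirms that BLAT stores offsets in
32-bit fields. A minimal sketch in C, using the two letter counts from
the "Loaded ... letters" lines and two hypothetical helper names
(fits_in_u32, wrap_to_u32) that are not part of BLAT:

```c
#include <stdbool.h>
#include <stdint.h>

/* Letter counts copied from the two "Loaded ... letters" lines above. */
const uint64_t NT_LETTERS     = 36318681436ULL; /* John's nt.fa run       */
const uint64_t GRCH37_LETTERS = 21439866084ULL; /* Tony's GRCh37.fof run  */

/* True if a total letter count can be held in an unsigned 32-bit field. */
bool fits_in_u32(uint64_t letters)
{
    return letters <= UINT32_MAX;
}

/*
 * What a 64-bit offset becomes when stored in a 32-bit field: the value
 * reduced modulo 2^32.
 * ASSUMPTION: purely illustrative. Whether BLAT stores database offsets
 * this way is not established anywhere in this thread.
 */
uint32_t wrap_to_u32(uint64_t offset)
{
    return (uint32_t)offset;
}
```

Calling fits_in_u32() on either count returns false, while a database
of 3GB (about the lower end of the sizes Galt mentions) would pass. If
offsets were truncated as in wrap_to_u32(), positions past 2^32 would
silently map back to the start of the range, which is the kind of
error that can surface as an invalid pointer much later.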
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
