Maybe there is still a bug in Thread? I now use threads in a very simple way:
    
    
    for q in get_seq_data(config, min_n_read, min_len_aln):
      var (seqs, seed_id) = q
      log("len(seqs)=", $len(seqs), ", seed_id=", seed_id)
      var cargs: ConsensusArgs = (inseqs: seqs, seed_id: seed_id, config: config)
      if n_core == 0:
        process_consensus(cargs)
      else:
        var rthread: ref Thread[ConsensusArgs]
        new(rthread)
        createThread(rthread[], process_consensus, cargs)
        joinThread(rthread[])
    
    
    
    ... (threadpool first creates 48 threads, even though I do not use threadpool.)
    [New Thread 0x7ffff015a700 (LWP 202052)]
    [New Thread 0x7fffefedb700 (LWP 202053)]
    [New Thread 0x7fffefbdc700 (LWP 202054)]
    main(n_core=1)
    len(seqs)=25, seed_id=2
    [New Thread 0x7fffef52b700 (LWP 202055)]
    [Thread 0x7fffef52b700 (LWP 202055) exited]
    len(seqs)=98, seed_id=14
    [New Thread 0x7fffef52b700 (LWP 202056)]
    [Thread 0x7fffef52b700 (LWP 202056) exited]
    len(seqs)=58, seed_id=15
    [New Thread 0x7fffef52b700 (LWP 202057)]
    [Thread 0x7fffef52b700 (LWP 202057) exited]
    len(seqs)=43, seed_id=22
    [New Thread 0x7fffef52b700 (LWP 202058)]
    [Thread 0x7fffef52b700 (LWP 202058) exited]
    len(seqs)=55, seed_id=25
    [New Thread 0x7fffef52b700 (LWP 202059)]
    
    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7fffef52b700 (LWP 202059)]
    deallocOsPages_e5IRqVbks39a9bBzvLjGxw2g (a=0x7ffff7f3d0c8) at /home/UNIXHOME/cdunn/repo/gh/Nim/lib/system/alloc.nim:740
    740         osDeallocPages(it, it.origSize and not 1)
    
    (gdb) bt
    #0  deallocOsPages_e5IRqVbks39a9bBzvLjGxw2g (a=0x7ffff7f3d0c8) at /home/UNIXHOME/cdunn/repo/gh/Nim/lib/system/alloc.nim:740
    #1  0x00000000004143f3 in deallocOsPages_njssp69aa7hvxte9bJ8uuDcg_3 () at /home/UNIXHOME/cdunn/repo/gh/Nim/lib/system/gc.nim:107
    #2  threadProcWrapStackFrame_dXJaXMz804k05DGz7X4RkA (thrd=0x7ffff7f79328) at /home/UNIXHOME/cdunn/repo/gh/Nim/lib/system/threads.nim:427
    #3  threadProcWrapper_2AvjU29bJvs3FXJIcnmn4Kg_2 (closure=0x7ffff7f79328) at /home/UNIXHOME/cdunn/repo/gh/Nim/lib/system/threads.nim:437
    #4  0x00007ffff76ba182 in start_thread (arg=0x7fffef52b700) at pthread_create.c:312
    #5  0x00007ffff73e700d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
    
    (gdb) l
    735         when defined(debugHeapLinks):
    736           cprintf("owner %p; dealloc A: %p size: %ld; next: %p\n", addr(a),
    737             it, it.origSize and not 1, next)
    738         sysAssert it.origSize >= PageSize, "origSize too small"
    739         # note:
    740         osDeallocPages(it, it.origSize and not 1)
    741         it = next
    742       when false:
    743         for p in elements(a.chunkStarts):
    744           var page = cast[PChunk](p shl PageShift)
    
    (gdb) p it
    $1 = (BigChunk_Rv9c70Uhp2TytkX7eH78qEg *) 0x101010101010101
    

That is with an up-to-date Nim origin/devel, at 
    
    
    commit 172a9c8e97694846c3348983a9b2b7c2931c939d
    Author: Dominik Picheta <[email protected]>
    Date:   Mon Mar 27 12:14:06 2017
    

My program works fine without threads (n_core=0). It worked fine when I used 
threadpool.
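
(For comparison, the threadpool dispatch has roughly this shape; a toy sketch, not my actual code, with `work` standing in for process_consensus. Compile with --threads:on.)

    import threadpool

    proc work(seed_id: int) =
      echo "seed_id=", seed_id

    for seed_id in 0 ..< 10:
      # Tasks are queued to a fixed pool of worker threads instead of
      # creating and joining a new thread per item.
      spawn work(seed_id)
    sync()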

Another problem with this approach is that it runs 3x slower than my single-threaded version (despite using GC_disable within the thread), and the single-threaded version was itself 3x faster than C+Python/multiprocessing. Very disappointing. The single-threaded version also suffers an explosion in memory fragmentation, though not as bad as before I started re-using strings and seqs within each task.
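
(What I mean by "re-using strings and seqs within each task" is roughly the following pattern; the `records` data and the `echo` are placeholders for the real per-task input and processing.)

    # Allocate a working buffer once, then setLen(0) and refill it per record
    # instead of building a fresh string every time.
    var records = @["ACGT", "ACGTACGT", "ACGTAC"]   # placeholder data
    var buf = newStringOfCap(64 * 1024)
    for rec in records:
      buf.setLen(0)   # keeps the capacity, drops the contents
      buf.add(rec)
      echo len(buf)   # stand-in for the real work on buf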

So at this point, I've lost my runtime advantage; I have to jump through hoops 
to avoid memory fragmentation (compared with Python multiprocessing); and now I 
have this seg-fault.

If anyone wants to debug this, let me know. I can put together a full test-case 
(via my corporate cloud server). I have 3 test-cases: 75k, 1.4M, and 800M. This 
crash happens only on the largest, but at least it happens pretty quickly.
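
In the meantime, here is a stripped-down, self-contained sketch of the same create/join-a-thread-per-item pattern, with synthetic data and a stand-in process_consensus (not my real code); compile with --threads:on:

    import strutils

    type ConsensusArgs = tuple[inseqs: seq[string], seed_id: int]

    proc process_consensus(cargs: ConsensusArgs) {.thread.} =
      # Stand-in for the real consensus work: just touch the data.
      GC_disable()
      var total = 0
      for s in cargs.inseqs:
        total += len(s)
      echo "seed_id=", cargs.seed_id, " total_len=", total

    when isMainModule:
      for seed_id in 0 ..< 1000:
        # Synthetic input roughly the shape of (seqs, seed_id) from get_seq_data.
        var seqs = newSeq[string](50)
        for i in 0 ..< len(seqs):
          seqs[i] = repeat("ACGT", 1000)
        var cargs: ConsensusArgs = (inseqs: seqs, seed_id: seed_id)
        # One thread per task, joined immediately (same pattern as above).
        var rthread: ref Thread[ConsensusArgs]
        new(rthread)
        createThread(rthread[], process_consensus, cargs)
        joinThread(rthread[])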
