On 02/12/13 07:32 PM, Jeff Tan wrote:
> Hi all,
>
> We're seeing this seg fault occur in Vertex.cpp, it seems, using Ray 2.3.0 on 
> x86 launched via Slurm:

Hi,

I checked the stack trace you provided. The fault occurs while some
checkpoints are being loaded (in the RAY_MPI_TAG_START_SEEDING handler).

m_readsStartingHere is the head of a linked list, and a value of 0 means that
the list is empty, so the 0x0 that gdb prints is not a problem in itself.
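
To illustrate why an empty list is harmless here, this is a minimal sketch of
prepending to a singly linked list whose head may be 0. The names Annotation
and Node are hypothetical stand-ins and not Ray's actual classes; only the
setNext(head) pattern mirrors the line shown in the gdb session.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an element of the list (not Ray's ReadAnnotation).
struct Annotation {
    int readId;
    Annotation* next;

    explicit Annotation(int id) : readId(id), next(nullptr) {}

    void setNext(Annotation* n) { next = n; }
};

// Hypothetical stand-in for the object that owns the list (not Ray's Vertex).
struct Node {
    Annotation* head = nullptr;  // a null head means the list is empty

    // Prepend e to the list. Passing a null head to setNext is fine;
    // only e itself has to point to a valid object.
    void addRead(Annotation* e) {
        assert(e != nullptr);  // the kind of check an assertion-enabled build can make
        e->setNext(head);
        head = e;
    }

    // Walk the list and count its elements.
    std::size_t count() const {
        std::size_t n = 0;
        for (const Annotation* p = head; p != nullptr; p = p->next) {
            ++n;
        }
        return n;
    }
};
```

In this sketch, calling addRead on a Node whose head is still null simply
makes the new element the only entry in the list. A crash on the setNext line
can only come from the element pointer, not from the empty head.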

You can build Ray with ASSERT=y to help with debugging. Aside from that, I
can't pinpoint the issue from the available information.


>
> [jtan@barcoo-m barcoo]$ grep 27960 slurm-349755.out
> Rank 233: Rank= 233 Size= 512 ProcessIdentifier= 27960
> [barcoo050:27960] *** Process received signal ***
> [barcoo050:27960] Signal: Segmentation fault (11)
> [barcoo050:27960] Signal code: Address not mapped (1)
> [barcoo050:27960] Failing at address: 0x18
> [barcoo050:27960] [ 0] /lib64/libpthread.so.0(+0xf500) [0x2b076d782500]
> [barcoo050:27960] [ 1] 
> /usr/local/Ray/2.3.0-gcc/Ray(_ZN6Vertex7addReadEP4KmerP14ReadAnnotation+0xe) 
> [0x5d18ee]
> [barcoo050:27960] [ 2] 
> /usr/local/Ray/2.3.0-gcc/Ray(_ZN33Adapter_RAY_MPI_TAG_START_SEEDING4callEP7Message+0x481)
>  [0x4cdf11]
> [barcoo050:27960] [ 3] 
> /usr/local/Ray/2.3.0-gcc/Ray(_ZN18MessageTagExecutor11callHandlerEiP7Message+0x22)
>  [0x62cca2]
> [barcoo050:27960] [ 4] 
> /usr/local/Ray/2.3.0-gcc/Ray(_ZN11ComputeCore15runWithProfilerEv+0x1105) 
> [0x60fb75]
> [barcoo050:27960] [ 5] 
> /usr/local/Ray/2.3.0-gcc/Ray(_ZN11ComputeCore3runEv+0x28e) [0x60d33e]
> [barcoo050:27960] [ 6] 
> /usr/local/Ray/2.3.0-gcc/Ray(_ZN7Machine5startEv+0x2024) [0x475544]
> [barcoo050:27960] [ 7] /usr/local/Ray/2.3.0-gcc/Ray(_ZN7Machine3runEv+0x6) 
> [0x473516]
> [barcoo050:27960] [ 8] /usr/local/Ray/2.3.0-gcc/Ray(main+0x2d7) [0x470c47]
> [barcoo050:27960] [ 9] /lib64/libc.so.6(__libc_start_main+0xfd) 
> [0x2b076d9aecdd]
> [barcoo050:27960] [10] /usr/local/Ray/2.3.0-gcc/Ray() [0x4708a9]
>
> [jtan@barcoo-m barcoo]$ gdb -d ~/src/Ray-2.3.0 -c core.27960 `which Ray`
> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/local/Ray/2.3.0-gcc/Ray...done.
> ....
> Core was generated by `/usr/local/Ray/2.3.0-gcc/Ray BarcooRay31.conf'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00000000005d18ee in Vertex::addRead (this=0x2b079cfae040,
> vertex=0x7fffadade1b0, e=0x2b079cfae040)
>      at code/VerticesExtractor/Vertex.cpp:176
> 176             e->setNext(m_readsStartingHere);
> ...
>
> with the source code in ~/src/Ray-2.3.0.
>
> and I find:
>
> (gdb) where full
> #0  0x00000000005d18ee in Vertex::addRead (this=0x2b079cfae040,
> vertex=0x7fffadade1b0, e=0x2b079cfae040)
>      at code/VerticesExtractor/Vertex.cpp:176
> ...
> (gdb) print m_readsStartingHere
> $1 = (ReadAnnotation *) 0x0
>
>
>
> With the user's permission, I have attached the configuration file, but not 
> the 800MB core dump. :-)
>
> Does anyone have any experience with this sort of problem? Maybe suggestions 
> on how to debug this further?
>
> Regards
>
> Jeff Tan
> High Performance Computing Specialist
> IBM Research Collaboratory for Life Sciences, Melbourne
>
>


_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
