Thanks for that! We'll give the extra debugging a try and see how it goes.

Regards
Jeff
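For anyone following the thread, the statement at Vertex.cpp:176 in the quoted message (`e->setNext(m_readsStartingHere)`) is a head insertion into a singly linked list. The sketch below is a standalone illustration of that pattern, not Ray's source: `Node`, `Owner`, `head`, and `count` are hypothetical stand-ins for `ReadAnnotation`, `Vertex`, and `m_readsStartingHere`, and the assertion only imitates the kind of check an assertion-enabled build might add.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for ReadAnnotation: one node of a singly linked list.
struct Node {
    int readId;
    Node* next;

    explicit Node(int id) : readId(id), next(nullptr) {}
    void setNext(Node* n) { next = n; }
};

// Hypothetical stand-in for Vertex: it stores only the head of the list.
struct Owner {
    Node* head = nullptr;  // a null head (0) simply means the list is empty

    // Prepend e to the list. The head is only copied here, never
    // dereferenced, so an empty list is safe. The pointer that must be
    // valid is e itself (and the object the method is called on).
    void addRead(Node* e) {
        assert(e != nullptr);  // fail loudly instead of writing through null
        e->setNext(head);
        head = e;
    }

    // Walk the list and count its nodes.
    std::size_t count() const {
        std::size_t n = 0;
        for (const Node* p = head; p != nullptr; p = p->next) {
            ++n;
        }
        return n;
    }
};
```

Under these assumptions, a head of 0x0 as printed in gdb is consistent with an empty list, which is why the pointers worth checking first are `e` and `this`.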

On Thu, 2014-01-09 at 15:57 -0500, Sébastien Boisvert wrote:
> On 02/12/13 07:32 PM, Jeff Tan wrote:
> > Hi all,
> >
> > We're seeing this seg fault occur in Vertex.cpp, it seems, using Ray 2.3.0
> > on x86 launched via Slurm:
>
> Hi,
>
> I checked the stack you provided and the fault occurs during the loading of
> some checkpoints (RAY_MPI_TAG_START_SEEDING).
>
> m_readsStartingHere is a linked list and a value of 0 means that it's empty.
>
> You can build Ray with ASSERT=y to help in debugging. Aside from that, I don't
> see the issue with the available information.
>
> >
> > [jtan@barcoo-m barcoo]$ grep 27960 slurm-349755.out
> > Rank 233: Rank= 233 Size= 512 ProcessIdentifier= 27960
> > [barcoo050:27960] *** Process received signal ***
> > [barcoo050:27960] Signal: Segmentation fault (11)
> > [barcoo050:27960] Signal code: Address not mapped (1)
> > [barcoo050:27960] Failing at address: 0x18
> > [barcoo050:27960] [ 0] /lib64/libpthread.so.0(+0xf500) [0x2b076d782500]
> > [barcoo050:27960] [ 1]
> > /usr/local/Ray/2.3.0-gcc/Ray(_ZN6Vertex7addReadEP4KmerP14ReadAnnotation+0xe)
> > [0x5d18ee]
> > [barcoo050:27960] [ 2]
> > /usr/local/Ray/2.3.0-gcc/Ray(_ZN33Adapter_RAY_MPI_TAG_START_SEEDING4callEP7Message+0x481)
> > [0x4cdf11]
> > [barcoo050:27960] [ 3]
> > /usr/local/Ray/2.3.0-gcc/Ray(_ZN18MessageTagExecutor11callHandlerEiP7Message+0x22)
> > [0x62cca2]
> > [barcoo050:27960] [ 4]
> > /usr/local/Ray/2.3.0-gcc/Ray(_ZN11ComputeCore15runWithProfilerEv+0x1105)
> > [0x60fb75]
> > [barcoo050:27960] [ 5]
> > /usr/local/Ray/2.3.0-gcc/Ray(_ZN11ComputeCore3runEv+0x28e) [0x60d33e]
> > [barcoo050:27960] [ 6]
> > /usr/local/Ray/2.3.0-gcc/Ray(_ZN7Machine5startEv+0x2024) [0x475544]
> > [barcoo050:27960] [ 7] /usr/local/Ray/2.3.0-gcc/Ray(_ZN7Machine3runEv+0x6)
> > [0x473516]
> > [barcoo050:27960] [ 8] /usr/local/Ray/2.3.0-gcc/Ray(main+0x2d7) [0x470c47]
> > [barcoo050:27960] [ 9] /lib64/libc.so.6(__libc_start_main+0xfd)
> > [0x2b076d9aecdd]
> > [barcoo050:27960] [10] /usr/local/Ray/2.3.0-gcc/Ray() [0x4708a9]
> >
> > [jtan@barcoo-m barcoo]$ gdb -d ~/src/Ray-2.3.0 -c core.27960 `which Ray`
> > GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
> > Copyright (C) 2010 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later
> > <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "x86_64-redhat-linux-gnu".
> > For bug reporting instructions, please see:
> > <http://www.gnu.org/software/gdb/bugs/>...
> > Reading symbols from /usr/local/Ray/2.3.0-gcc/Ray...done.
> > ....
> > Core was generated by `/usr/local/Ray/2.3.0-gcc/Ray BarcooRay31.conf'.
> > Program terminated with signal 11, Segmentation fault.
> > #0 0x00000000005d18ee in Vertex::addRead (this=0x2b079cfae040,
> > vertex=0x7fffadade1b0, e=0x2b079cfae040)
> > at code/VerticesExtractor/Vertex.cpp:176
> > 176 e->setNext(m_readsStartingHere);
> > ...
> >
> > with the source code in ~/src/Ray-2.3.0.
> >
> > and I find:
> >
> > (gdb) where full
> > #0 0x00000000005d18ee in Vertex::addRead (this=0x2b079cfae040,
> > vertex=0x7fffadade1b0, e=0x2b079cfae040)
> > at code/VerticesExtractor/Vertex.cpp:176
> > ...
> > (gdb) print m_readsStartingHere
> > $1 = (ReadAnnotation *) 0x0
> >
> >
> >
> > With the user's permission, I have attached the configuration file, but not
> > the 800MB core dump. :-)
> >
> > Does anyone have any experience with this sort of problem? Maybe
> > suggestions on how to debug this further?
> >
> > Regards
> >
> > Jeff Tan
> > High Performance Computing Specialist
> > IBM Research Collaboratory for Life Sciences, Melbourne
> >
> >

-- 
Jeff Tan
High Performance Computing Specialist
IBM Research Collaboratory for Life Sciences, Melbourne
Phone: +61 3 9035 4392

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users