On 22/11/13 11:29 AM, JC Grenier wrote:
> Hi Sebastien,
>
> I have a question concerning the usage of the checkpoints options.
>
> I have run one big analysis using 40 nodes of 12 cores using the
> -write-checkpoints option.
> So like I was guessing, I didn't have enough time to complete the job. The
> thing is now that when I relaunch the analysis with the
> -read-write-checkpoints option, all the nodes are crashing.
>
> Here's an example :
>
> [node-c3-79:27010] *** Process received signal ***
> [node-c3-79:27010] Signal: Segmentation fault (11)
> [node-c3-79:27010] Signal code: Address not mapped (1)
> [node-c3-79:27010] Failing at address: 0x10
> [node-c3-79:27010] [ 0] /lib64/libpthread.so.0() [0x39bf40f500]
> [node-c3-79:27010] [ 1] Ray(_ZN6Vertex12addDirectionEP4KmerP9Direction+0xe)
> [0x58f91e]
> [node-c3-79:27010] [ 2]
> Ray(_ZN16MessageProcessor38call_RAY_MPI_TAG_SAVE_WAVE_PROGRESSIONEP7Message+0x104)
> [0x4c0004]
> [node-c3-79:27010] [ 3]
> Ray(_ZN52Adapter_RAY_MPI_TAG_SAVE_WAVE_PROGRESSION_WITH_REPLY4callEP7Message+0x27)
> [0x4c0287]
> [node-c3-79:27010] [ 4] Ray(_ZN11ComputeCore15runWithProfilerEv+0x312)
> [0x5c9442]
> [node-c3-79:27010] [ 5] Ray(_ZN11ComputeCore3runEv+0xbc) [0x5cd35c]
> [node-c3-79:27010] [ 6] Ray(_ZN7Machine5startEv+0x1756) [0x477666]
> [node-c3-79:27010] [ 7] Ray(_ZN11RankProcessI7MachineE3runEv+0x9f) [0x474cef]
> [node-c3-79:27010] [ 8] Ray(main+0xc7) [0x474dc7]
> [node-c3-79:27010] [ 9] /lib64/libc.so.6(__libc_start_main+0xfd)
> [0x39bec1ecdd]
> [node-c3-79:27010] [10] Ray() [0x472119]
> [node-c3-79:27010] *** End of error message ***
>
>
>
> Could this be because I'm not reserving the same number of processors as the
> first analysis?

Yes, you need to use the same number of ranks as the run that wrote the checkpoints.

>
> Thanks for your help.
> --
> Jean-Christophe Grenier, M.Sc.
>
> -----------------------------------------
> /Bio-informaticien/
> /Laboratoire de Philip Awadalla/
> /Laboratoire de Luis Barreiro/
> /CHU Sainte-Justine/
> //3175, Côte Sainte-Catherine, local B-607
> ///Tél : 514-345-4931 poste 5199/
> -----------------------------------------

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
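
For reference, a minimal sketch of relaunching with a matching rank count. It assumes one MPI rank per core (so 40 nodes of 12 cores gives 480 ranks), an `mpiexec` launcher, and a checkpoint directory named `checkpoints`. The directory name, the helper functions, and the idea that `-read-write-checkpoints` is followed by that directory are illustrative assumptions, not taken from the thread; the remaining Ray arguments from the original run are left out.

```python
import shlex


def rank_count(nodes: int, cores_per_node: int) -> int:
    """Total MPI ranks, assuming one rank per core (an assumption)."""
    return nodes * cores_per_node


def relaunch_command(ranks: int, checkpoint_dir: str, ray_args=()) -> list:
    """Build a relaunch command line using the same rank count as the first run.

    `checkpoint_dir` and `ray_args` are hypothetical placeholders for the
    checkpoint directory and the other arguments of the original run.
    """
    return [
        "mpiexec", "-n", str(ranks),
        "Ray", "-read-write-checkpoints", checkpoint_dir,
        *ray_args,
    ]


# The first run in this thread used 40 nodes of 12 cores.
ranks = rank_count(40, 12)
command = relaunch_command(ranks, "checkpoints")
print(shlex.join(command))
```

The point of computing the count from the original allocation is that the relaunch then requests exactly as many ranks as the run that wrote the checkpoints, whatever the new node layout is.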