On 22/11/13 11:29 AM, JC Grenier wrote:
> Hi Sebastien,
>
> I have a question concerning the usage of the checkpoints options.
>
> I ran one large analysis on 40 nodes of 12 cores each with the 
> -write-checkpoints option.
> As I expected, the job did not have enough time to complete. Now, when I 
> relaunch the analysis with the -read-write-checkpoints option, all the 
> nodes crash.
>
> Here's an example :
>
> [node-c3-79:27010] *** Process received signal ***
> [node-c3-79:27010] Signal: Segmentation fault (11)
> [node-c3-79:27010] Signal code: Address not mapped (1)
> [node-c3-79:27010] Failing at address: 0x10
> [node-c3-79:27010] [ 0] /lib64/libpthread.so.0() [0x39bf40f500]
> [node-c3-79:27010] [ 1] Ray(_ZN6Vertex12addDirectionEP4KmerP9Direction+0xe) [0x58f91e]
> [node-c3-79:27010] [ 2] Ray(_ZN16MessageProcessor38call_RAY_MPI_TAG_SAVE_WAVE_PROGRESSIONEP7Message+0x104) [0x4c0004]
> [node-c3-79:27010] [ 3] Ray(_ZN52Adapter_RAY_MPI_TAG_SAVE_WAVE_PROGRESSION_WITH_REPLY4callEP7Message+0x27) [0x4c0287]
> [node-c3-79:27010] [ 4] Ray(_ZN11ComputeCore15runWithProfilerEv+0x312) [0x5c9442]
> [node-c3-79:27010] [ 5] Ray(_ZN11ComputeCore3runEv+0xbc) [0x5cd35c]
> [node-c3-79:27010] [ 6] Ray(_ZN7Machine5startEv+0x1756) [0x477666]
> [node-c3-79:27010] [ 7] Ray(_ZN11RankProcessI7MachineE3runEv+0x9f) [0x474cef]
> [node-c3-79:27010] [ 8] Ray(main+0xc7) [0x474dc7]
> [node-c3-79:27010] [ 9] /lib64/libc.so.6(__libc_start_main+0xfd) [0x39bec1ecdd]
> [node-c3-79:27010] [10] Ray() [0x472119]
> [node-c3-79:27010] *** End of error message ***
>
>
>
> Could this be because I'm not reserving the same number of processors as the 
> first analysis?

Yes. You need to relaunch with the same number of ranks that wrote the 
checkpoints.
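
To make the rank-count requirement concrete, here is a minimal sketch of a 
check you could run before resubmitting. It is not part of Ray: the function 
names are made up for illustration, and it assumes one MPI rank per core, 
which may not match how your job was launched.

```python
def total_ranks(nodes: int, cores_per_node: int) -> int:
    """Rank count for a job, ASSUMING one MPI rank is started per core."""
    return nodes * cores_per_node


def same_rank_count(ranks_at_write: int, ranks_at_read: int) -> bool:
    """True when the relaunch uses as many ranks as the run that wrote
    the checkpoints (hypothetical helper, not a Ray function)."""
    return ranks_at_write == ranks_at_read


# First run from the message above: 40 nodes of 12 cores.
first_run = total_ranks(40, 12)

# Hypothetical relaunch on a smaller reservation, e.g. 20 nodes of 12 cores.
relaunch = total_ranks(20, 12)

if not same_rank_count(first_run, relaunch):
    print(f"Rank mismatch: checkpoints written by {first_run} ranks, "
          f"relaunch requests {relaunch}.")
```

If the two counts differ, adjust the reservation so the relaunch starts as 
many ranks as the first run before using -read-write-checkpoints again.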

>
> Thanks for your help.
> --
> Jean-Christophe Grenier, M.Sc.
>
> -----------------------------------------
> /Bio-informaticien/
> /Laboratoire de Philip Awadalla/
> /Laboratoire de Luis Barreiro/
> /CHU Sainte-Justine/
> //3175, Côte Sainte-Catherine, local B-607
> ///Tél : 514-345-4931 poste 5199/
> -----------------------------------------


_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
