On 12/06/13 05:10 PM, Lin wrote:
> Hi,
>
> Yes, they are. When I run "top" or "ps", there are exactly 16 Ray ranks and
> one mpiexec process on the oak machine.

But this is before one of the Ray ranks receives the SIGHUP (signal 1, Hangup), right?

>
> But this problem does not always happen: I have gotten good
> results from Ray when I ran it on other datasets.
>

I never got this SIGHUP with Ray. That's strange.

Is it reproducible? That is, if you run the same thing 10 times, do you get
this SIGHUP all 10 times?
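
One way to check this, as a rough sketch: the helper below (count_sighup_runs is a hypothetical name, not part of Ray or Open MPI) runs a command several times, saves each run's output in run_N.log, and counts the runs whose log contains the "exited on signal 1" message quoted further down in this thread. It assumes mpiexec prints that same phrase every time the problem occurs.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print how many runs
# logged the "exited on signal 1" message.
# Usage: count_sighup_runs N command [args...]
count_sighup_runs() {
    runs=$1
    shift
    hits=0
    i=1
    while [ "$i" -le "$runs" ]; do
        log="run_$i.log"
        # Capture stdout and stderr of this run in its own log file.
        "$@" > "$log" 2>&1
        # Assumption: the failure always shows up as this phrase in the output.
        if grep -q "exited on signal 1" "$log"; then
            hits=$((hits + 1))
        fi
        i=$((i + 1))
    done
    echo "$hits"
}

# Example, using the command line from earlier in this thread:
# count_sighup_runs 10 mpiexec -n 16 Ray Col.conf
```

If Ray refuses to reuse an existing output directory, RayOutputOfCol would have to be removed or renamed between runs.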


> Thanks
> Lin
>
>
> On Wed, Jun 12, 2013 at 8:00 AM, Sébastien Boisvert <[email protected]> wrote:
>
>     On 10/06/13 05:26 PM, Lin wrote:
>
>         Hi,
>
>         Thanks for your answers.
>         However, I got the error message from nohup.out. That is to say, I
>         have used nohup to run Ray.
>
>         This is my command:
>         nohup mpiexec -n 16 Ray Col.conf &
>
>
>     Are all your MPI ranks running on the "oak" machine ?
>
>
>         And the Col.conf contains:
>
>         -k 55  # this is a comment
>         -p /s/oak/a/nobackup/lin/Art/Col___illumina_art/Col_il1.fastq
>              /s/oak/a/nobackup/lin/Art/Col___illumina_art/Col_il2.fastq
>
>         -o RayOutputOfCol
>
>
>
>
>         On Mon, Jun 10, 2013 at 2:02 PM, Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca> wrote:
>
>              On 09/06/13 11:35 AM, Lin wrote:
>
>                  Hi, Sébastien
>
>                  I changed the max k-mer length to 64 and set k to 55 in a run,
>                  but it always ends up with a problem like this:
>                  "mpiexec noticed that process rank 11 with PID 25012 on node oak exited on signal 1(Hangup)"
>                  Could you help me figure it out?
>
>
>              The signal 1 is SIGHUP according to this list:
>
>              $ kill -l
>               1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP
>               6) SIGABRT      7) SIGBUS       8) SIGFPE       9) SIGKILL     10) SIGUSR1
>              11) SIGSEGV     12) SIGUSR2     13) SIGPIPE     14) SIGALRM     15) SIGTERM
>              16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
>              21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ
>              26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO       30) SIGPWR
>              31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3
>              38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
>              43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
>              48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
>              53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7
>              58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
>              63) SIGRTMAX-1  64) SIGRTMAX
>
>
>              This signal is not related to the compilation option MAXKMERLENGTH=64.
>
>              You are getting this signal because the parent process of your
>              mpiexec process dies (probably because you are closing your
>              terminal), and this causes a SIGHUP to be sent to your Ray processes.
>
>
>              There are several solutions to this issue (pick one solution
>              from the list below):
>
>
>              1. Use nohup (e.g.: nohup mpiexec -n 999 Ray -p data1.fastq.gz data2.fastq.gz)
>
>              2. Launch your work inside a screen session (the screen command)
>
>              3. Launch your work inside a tmux session (the tmux command)
>
>              4. Use a job scheduler (like Moab, Grid Engine, or another).
>
>
>              --SÉB--
>
>


_______________________________________________
Denovoassembler-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
