Hi all,
I've checked "limits.conf", and it contains these lines:
# Jcb 29.06.2007 : pbs wrf (Siji)
#* hard stack 1000000
#* soft stack 1000000
# Dr 14.02.2008 : pour voltaire mpi
* hard memlock unlimited
* soft memlock unlimited
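Note that both stack entries in the excerpt above are commented out, so this file sets only memlock. A minimal sketch for listing the active entries, assuming the usual four-field layout of limits.conf (domain, type, item, value); the function name active_limits and the SAMPLE string are made up for illustration, and SAMPLE is just the excerpt quoted above:

```python
def active_limits(text):
    """Return (domain, type, item, value) tuples for uncommented entries."""
    entries = []
    for line in text.splitlines():
        # Drop everything after '#', so fully commented lines become empty.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) == 4:
            entries.append(tuple(fields))
    return entries


# Hypothetical sample: the excerpt from limits.conf quoted in this message.
SAMPLE = """\
# Jcb 29.06.2007 : pbs wrf (Siji)
#* hard stack 1000000
#* soft stack 1000000
# Dr 14.02.2008 : pour voltaire mpi
* hard memlock unlimited
* soft memlock unlimited
"""

if __name__ == "__main__":
    entries = active_limits(SAMPLE)
    for domain, kind, item, value in entries:
        print(domain, kind, item, value)
    # Only memlock is active here; no stack entry survives the comment filter.
    print("stack configured:", "stack" in {e[2] for e in entries})
```

With no active stack entry, the stack limit comes from wherever else it is set on the node (system default, shell startup files, or the batch system), which this sketch does not inspect.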
Many thanks for your help
Mouhamad
Gus Correa <g...@ldeo.columbia.edu> wrote:
Hi Mouhamad, Ralph, Terry
Big programs like wrf very often crash with a segfault because they
can't allocate memory on the stack; they assume the system doesn't
impose any limit on it. This has nothing to do with MPI.
Mouhamad: Check whether your stack size is set to unlimited on all
compute nodes. The easy way to do that is to edit
/etc/security/limits.conf, where you or your system administrator
could add these lines:
* - memlock -1
* - stack -1
* - nofile 4096
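To check what a process actually gets on a given node, the limits can be read from inside a process. A minimal sketch, assuming Python with the standard resource module is available on the compute nodes; the helper names describe and report_limits are made up for illustration. Note that getrlimit reports stack and memlock in bytes, while limits.conf expresses them in KB:

```python
import resource
import socket


def describe(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def report_limits():
    """Return {name: (soft, hard)} for the three limits discussed above."""
    names = {
        "stack": resource.RLIMIT_STACK,      # bytes
        "memlock": resource.RLIMIT_MEMLOCK,  # bytes
        "nofile": resource.RLIMIT_NOFILE,    # number of open files
    }
    report = {}
    for name, res in names.items():
        soft, hard = resource.getrlimit(res)
        report[name] = (describe(soft), describe(hard))
    return report


if __name__ == "__main__":
    host = socket.gethostname()
    for name, (soft, hard) in report_limits().items():
        print(f"{host}: {name} soft={soft} hard={hard}")
```

Running it through the same launcher and batch system used for wrf.exe should show the limits the job really inherits, which can differ from those of an interactive login shell.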
My two cents,
Gus Correa
Ralph Castain wrote:
Looks like you are crashing in wrf - have you asked them for help?
On Oct 25, 2011, at 7:53 AM, Mouhamad Al-Sayed-Ali wrote:
Hi again,
This is exactly the error I have:
----
taskid: 0 hostname: part034.u-bourgogne.fr
[part034:21443] *** Process received signal ***
[part034:21443] Signal: Segmentation fault (11)
[part034:21443] Signal code: Address not mapped (1)
[part034:21443] Failing at address: 0xfffffffe01eeb340
[part034:21443] [ 0] /lib64/libpthread.so.0 [0x3612c0de70]
[part034:21443] [ 1] wrf.exe(__module_ra_rrtm_MOD_taugb3+0x418) [0x11cc9d8]
[part034:21443] [ 2] wrf.exe(__module_ra_rrtm_MOD_gasabs+0x260) [0x11cfca0]
[part034:21443] [ 3] wrf.exe(__module_ra_rrtm_MOD_rrtm+0xb31) [0x11e6e41]
[part034:21443] [ 4] wrf.exe(__module_ra_rrtm_MOD_rrtmlwrad+0x25ec) [0x11e9bcc]
[part034:21443] [ 5] wrf.exe(__module_radiation_driver_MOD_radiation_driver+0xe573) [0xcc4ed3]
[part034:21443] [ 6] wrf.exe(__module_first_rk_step_part1_MOD_first_rk_step_part1+0x40c5) [0xe0e4f5]
[part034:21443] [ 7] wrf.exe(solve_em_+0x22e58) [0x9b45c8]
[part034:21443] [ 8] wrf.exe(solve_interface_+0x80a) [0x902dda]
[part034:21443] [ 9] wrf.exe(__module_integrate_MOD_integrate+0x236) [0x4b2c4a]
[part034:21443] [10] wrf.exe(__module_wrf_top_MOD_wrf_run+0x24) [0x47a924]
[part034:21443] [11] wrf.exe(main+0x41) [0x4794d1]
[part034:21443] [12] /lib64/libc.so.6(__libc_start_main+0xf4) [0x361201d8b4]
[part034:21443] [13] wrf.exe [0x4793c9]
[part034:21443] *** End of error message ***
-------
Mouhamad
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users