On 3/21/07, Keith M Wesolowski wrote:
On Tue, Mar 20, 2007 at 01:43:57PM -0700, Bart Smaalders wrote:
> Turns out the limit command doesn't display the actual
> limit size above some pretty nominal values, but the
> limit command seems to work up to 2GB or so.
I've tried Bart's stack test and it works for me too:
[EMAIL PROTECTED]:~/test$ cc -o stack stack.c -fast -fsimple=1 -xopenmp -xtarget=opteron -xarch=sse3a -m64
[EMAIL PROTECTED]:~/test$ ./stack 1999990
allocating 1999990 k...worked
I ran pmap against the stack test, and this is what it yields:
[EMAIL PROTECTED]:~/test$ ulimit -s unlimited
[EMAIL PROTECTED]:~/test$ ulimit -s
unlimited
[EMAIL PROTECTED]:~/test$ ./stack 1999990
allocating 1999990 k...^Z
[EMAIL PROTECTED]:~/test$ pmap -x 638
(...)
00007FFFFFFFB000 8 8 8 - rwx-- ld.so.1
FFFFFD7F85CE2000 1999992 1999992 1999992 - rw--- [ stack ]
which looks fine, so as you say, it must be a library problem.
Yes, the thing that's interesting here is that his process is either
somehow not honouring the limit or the limit is being reset along the
way:
FFFFFD7FE9DFC000 8 8 8 - rwx-- ld.so.1
FFFFFD7FFC419000 59292 59292 59292 - rw--- [ stack ]
> FFFFFD7FFC419000-FFFFFD7FE9DFC000+(0t59292<<0t10)=X
16004000
So the effective limit appears to be about 352MB. Normally, a 64-bit
process (such as yours) will yield this when run with an unlimited
stack size.
...
00007FFFFFFFB000 8 8 8 - rwx-- ld.so.1
FFFFFD7FFFDFD000 12 12 12 - rw--- [ stack ]
which leaves oodles. So...is some fortran library or openmp-related
generated code resetting the limits somewhere along the way? Sounds
like a job for truss and/or DTrace...
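For anyone following along without mdb, the expression above can be rechecked with a few lines of Python. This is only a sketch of the same arithmetic; the variable names are mine, and the three inputs are copied from the pmap lines quoted above.

```python
# Inputs copied from the two pmap lines of the failing process.
LD_SO_BASE = 0xFFFFFD7FE9DFC000   # start of the ld.so.1 mapping
STACK_BASE = 0xFFFFFD7FFC419000   # lowest address of the [ stack ] mapping
STACK_KB = 59292                  # size of the [ stack ] mapping in Kbytes


def span_bytes(ld_so_base, stack_base, stack_kb):
    """Bytes from the ld.so.1 mapping up to the top of the stack."""
    # Same expression as the mdb line: base difference plus stack size.
    return stack_base - ld_so_base + (stack_kb << 10)


span = span_bytes(LD_SO_BASE, STACK_BASE, STACK_KB)
print(hex(span))           # 0x16004000
print(span / 2**20, "MB")  # 352.015625 MB
```

That matches the 16004000 printed by mdb and the "about 352MB" figure.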
To find out a bit more, I've written the following program:
[EMAIL PROTECTED]:~/test$ cat stack.f90
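program stack
  ! 16 GB of default integers, declared as an automatic (stack) array
  integer*8, parameter :: k=1024
  integer*8, parameter :: s=16*k*k*k/4
  integer,automatic :: foo(s)
  print*, 'allocating:', s/1024*4,'kbytes'
  foo=1    ! touch every element so the whole array is really used
  pause    ! stop here so pmap can be run against the process
  print*, foo(1),foo(s)
  print*, 'worked'
end program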
This touches an entire 16GB array located on the stack. It runs fine;
here is the pmap:
[EMAIL PROTECTED]:~/test$ f90 -o stackf stack.f90 -fast -fsimple=1 -xopenmp -xtarget=opteron -xarch=sse3a -m64
[EMAIL PROTECTED]:~/test$ ./stackf
allocating: 16777216 kbytes
PAUSE
To continue execution, type "go"
^Z
[3]+ Stopped ./stackf
[EMAIL PROTECTED]:~/test$ pmap -x 715
(...)
00007FFFFFFFB000 8 8 8 - rwx-- ld.so.1
FFFFFD7BFFDFF000 16777220 16777220 16777220 - rw--- [ stack ]
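As a cross-check of the sizes above, here is a small Python sketch. It assumes the default Fortran integer is 4 bytes, which is what the `s/1024*4` in the print statement implies; the names are mine.

```python
K = 1024
INT_BYTES = 4  # assumption: default INTEGER is 4 bytes with these options

elements = 16 * K * K * K // 4        # value of s in stack.f90
array_kb = elements * INT_BYTES // K  # size of foo in Kbytes
stack_kb = 16777220                   # [ stack ] size reported by pmap -x

print(array_kb)             # 16777216
print(stack_kb - array_kb)  # 4
```

So the stack mapping is the 16GB array plus 4K, i.e. essentially the whole stack is the array.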
The compile options are the same as for the program that fails. Now
let's add some OpenMP:
[EMAIL PROTECTED]:~/test$ cat stack.f90
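program stack
  implicit none
  ! 512 MB of default integers, declared as an automatic (stack) array
  integer*8, parameter :: k=1024
  integer*8, parameter :: s=512*k*k/4
  integer,automatic :: foo(s)
  integer*8 i
  print*, 'allocating:', s/1024*4,'kbytes'
  ! fill the array from an OpenMP parallel loop
  !$omp parallel do private(i)
  do i=1,s
    foo(i)=1
  end do
  !$omp end parallel do
  pause    ! stop here so pmap can be run against the process
  print*, foo(1),foo(s)
  print*, 'worked'
end program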
[EMAIL PROTECTED]:~/test$ f90 -o stackf stack.f90 -fast -fsimple=1 -xopenmp -xtarget=opteron -xarch=sse3a -m64
[EMAIL PROTECTED]:~/test$ ./stackf
allocating: 524288 kbytes
Segmentation fault (SEGV) (core dumped)
[EMAIL PROTECTED]:~/test$ pmap core
core 'core' of 856: ./stackf
0000000000400000 252K r-x-- /home/franjesus/test/stackf
000000000044E000 8K rw--- /home/franjesus/test/stackf
0000000000450000 32K rw--- [ heap ]
00007FFFFABFF000 4K rw--- [ stack tid=16 ]
00007FFFFAFFF000 4K rw--- [ stack tid=15 ]
00007FFFFB3FF000 4K rw--- [ stack tid=14 ]
00007FFFFB7FF000 4K rw--- [ stack tid=13 ]
00007FFFFBBFF000 4K rw--- [ stack tid=12 ]
00007FFFFBFFF000 4K rw--- [ stack tid=11 ]
00007FFFFC3FF000 4K rw--- [ stack tid=10 ]
00007FFFFC7FF000 4K rw--- [ stack tid=9 ]
00007FFFFCBFF000 4K rw--- [ stack tid=8 ]
00007FFFFCFFF000 4K rw--- [ stack tid=7 ]
00007FFFFD3FF000 4K rw--- [ stack tid=6 ]
00007FFFFD7FF000 4K rw--- [ stack tid=5 ]
00007FFFFDBFF000 4K rw--- [ stack tid=4 ]
00007FFFFDFFF000 4K rw--- [ stack tid=3 ]
00007FFFFE3FF000 4K rw--- [ stack tid=2 ]
00007FFFFE600000 2048K rwx--
00007FFFFE910000 1128K r-x-- /usr/lib/locale/es_ES.UTF-8/amd64/es_ES.UTF-8.so.3
00007FFFFEA39000 8K rw--- /usr/lib/locale/es_ES.UTF-8/amd64/es_ES.UTF-8.so.3
00007FFFFEA40000 1632K r-x-- /lib/amd64/libc.so.1
00007FFFFEBE8000 44K rw--- /lib/amd64/libc.so.1
00007FFFFEBF3000 8K rw--- /lib/amd64/libc.so.1
00007FFFFEC00000 2912K r-x-- /export/home/opt/SUNWspro/prod/lib/amd64/libfsu.so.1
00007FFFFEEE7000 12K rw--- /export/home/opt/SUNWspro/prod/lib/amd64/libfsu.so.1
00007FFFFEEEA000 4K rw--- /export/home/opt/SUNWspro/prod/lib/amd64/libfsu.so.1
00007FFFFEF10000 64K rwx--
00007FFFFEF31000 4K rwx--
00007FFFFEF39000 64K rw---
00007FFFFEF4B000 64K rw---
00007FFFFEF5D000 132K r----*
00007FFFFEF80000 72K r-x-- /usr/lib/locale/common/amd64/methods_unicode.so.3
00007FFFFEFA1000 4K rw--- /usr/lib/locale/common/amd64/methods_unicode.so.3
00007FFFFEFB0000 4K rwx--
00007FFFFEFC0000 28K r-x-- /usr/lib/locale/es_ES.ISO8859-15/amd64/es_ES.ISO8859-15.so.3
00007FFFFEFD6000 8K rw--- /usr/lib/locale/es_ES.ISO8859-15/amd64/es_ES.ISO8859-15.so.3
00007FFFFEFE0000 28K r-x-- /usr/lib/locale/es_ES.ISO8859-1/amd64/es_ES.ISO8859-1.so.3
00007FFFFEFF6000 8K rw--- /usr/lib/locale/es_ES.ISO8859-1/amd64/es_ES.ISO8859-1.so.3
00007FFFFF000000 236K r-x-- /export/home/opt/SUNWspro/prod/lib/amd64/libfai.so.1
00007FFFFF04A000 4K rw--- /export/home/opt/SUNWspro/prod/lib/amd64/libfai.so.1
00007FFFFF04B000 13316K rw--- /export/home/opt/SUNWspro/prod/lib/amd64/libfai.so.1
00007FFFFFD50000 4K rwx--
00007FFFFFD60000 4K rwx--
00007FFFFFD70000 4K rwx--
00007FFFFFD80000 16K r-x-- /lib/amd64/libpthread.so.1
00007FFFFFD90000 24K rwx--
00007FFFFFDA0000 4K rwx--
00007FFFFFDB0000 16K r-x-- /lib/amd64/libm.so.1
00007FFFFFDC3000 4K rw--- /lib/amd64/libm.so.1
00007FFFFFDD0000 4K rwx--
00007FFFFFDE0000 388K r-x-- /lib/amd64/libm.so.2
00007FFFFFE50000 4K rw--- /lib/amd64/libm.so.2
00007FFFFFE51000 4K rw--- /lib/amd64/libm.so.2
00007FFFFFE60000 4K rwx--
00007FFFFFE70000 204K r-x-- /lib/amd64/libmtsk.so.1
00007FFFFFEB2000 12K rw--- /lib/amd64/libmtsk.so.1
00007FFFFFEB5000 32K rw--- /lib/amd64/libmtsk.so.1
00007FFFFFEC0000 312K r-x-- /export/home/opt/SUNWspro/prod/lib/amd64/libsunmath.so.1
00007FFFFFF1D000 8K rw--- /export/home/opt/SUNWspro/prod/lib/amd64/libsunmath.so.1
00007FFFFFF30000 4K rwx--
00007FFFFFF40000 4K rwx--
00007FFFFFF50000 60K r-x-- /export/home/opt/SUNWspro/prod/lib/amd64/libfui.so.2
00007FFFFFF6E000 4K rw--- /export/home/opt/SUNWspro/prod/lib/amd64/libfui.so.2
00007FFFFFF80000 4K r-x-- /lib/amd64/libdl.so.1
00007FFFFFF90000 4K rwx--
00007FFFFFF94000 340K r-x-- /lib/amd64/ld.so.1
00007FFFFFFF9000 8K rwx-- /lib/amd64/ld.so.1
00007FFFFFFFB000 8K rwx-- /lib/amd64/ld.so.1
FFFFFD7FDFDFF000 524292K rw--- [ stack ]
total 547960K
[EMAIL PROTECTED]:~/test$ mdb core
Loading modules: [ libc.so.1 ld.so.1 ]
$r
%rax = 0x0000000000000000 %r8 = 0x00007ffffef54f4c
%rbx = 0x0000000000000000 %r9 = 0x00007ffffebf0484
%rcx = 0x0000000000000000 %r10 = 0x0000000000000000
%rdx = 0x0000000052e3d905 %r11 = 0x0000000000800000
%rsi = 0x0000000000000000 %r12 = 0x0000000000000000
%rdi = 0x0000000000000000 %r13 = 0x00007ffffef54f40
%r14 = 0x00007fffffd92000
%r15 = 0x00007ffffe6043c8
%cs = 0x0053 %fs = 0x0000 %gs = 0x0000
%ds = 0x0000 %es = 0x0000 %ss = 0x004b
%rip = 0x0000000100000001
%rbp = 0xfffffd7fffdff520
%rsp = 0xfffffd7fffdff4d0
%rflags = 0x00010246
id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
status=<of,df,IF,tf,sf,ZF,af,PF,cf>
%gsbase = 0x0000000000000000
%fsbase = 0x00007fffffd92000
%trapno = 0xe
%err = 0x14
It looks like this is now a different problem. With 500MB it runs just
fine; with 512MB it segfaults.
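A quick sanity check on the register dump, as a sketch rather than a diagnosis. Taking the addresses from the `pmap core` and `$r` output above at face value, it tests whether %rsp lies inside the main [ stack ] mapping and whether %rip falls in the gap between the end of the heap and the lowest thread stack, where pmap lists nothing. The constant names are mine.

```python
# Values copied from the `pmap core` and mdb `$r` output above.
STACK_BASE = 0xFFFFFD7FDFDFF000    # [ stack ] mapping, lowest address
STACK_KB = 524292                  # [ stack ] size in Kbytes
RSP = 0xFFFFFD7FFFDFF4D0           # %rsp
RIP = 0x0000000100000001           # %rip
HEAP_END = 0x450000 + 32 * 1024    # end of the 32K [ heap ] mapping
NEXT_MAPPING = 0x00007FFFFABFF000  # lowest thread stack (tid=16)

stack_top = STACK_BASE + STACK_KB * 1024

# Is the stack pointer inside the main stack mapping?
rsp_in_stack = STACK_BASE <= RSP < stack_top

# Is the instruction pointer in the hole where pmap lists no mapping?
rip_in_gap = HEAP_END <= RIP < NEXT_MAPPING

print(hex(stack_top))    # 0xfffffd7fffe00000
print(rsp_in_stack)      # True
print(stack_top - RSP)   # 2864
print(rip_in_gap)        # True
```

If I read this correctly, %rsp is still only about 2.8K below the top of the main stack, while %rip points at an address that no listed mapping covers. With %trapno 0xe (a page fault on x86), that would fit a bad jump or call rather than the main stack running out, but that is only one reading of the dump.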
I don't know how to use DTrace, though I guess it would help me a lot
in finding where the problem is :-(