The truss(1) man page documents this.
I believe any object which still includes symbols (i.e. not stripped) is
traceable in this manner.
Happy trussing.
Brian
[email protected] wrote:
Thanks for the reply.
I will try with the -ulibc::tempnam,fopen option.
One last query: in general, if I am interested in tracing a method using
truss, will -u<libraryname>::methodname work?
Is it applicable only to libraries provided by Solaris, or also to any 3rd
party libraries?
Thanks
Girish
-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: 15 July 2009 23:42
To: Girish Prabhakarrao
Subject: Re: [osol-discuss] Problems with tempnam
[ I notice you've removed CC:[email protected] - was
this intentional? I've not included it again in case it was, but feel
free to add it to your reply. ]
IIRC, seeing an open followed by a close is exactly the same failure
mode as the old "gethostbyname() fails with >255 file descriptors" issue
(CR 4353836 I think - see
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=4353836 for
some details).
In that case, truss would show an open("/etc/netconfig") which succeeded,
returning an fd# >= 256, but in place of a read() call or anything
else, the next line would immediately close(2) the fd.
Replies are inline (including some I missed in your original post):
[email protected] wrote:
>
> Hi,
> Thanks for the quick reply.
>
> That implies the problem is not with tempnam but with fopen. If I use
> C++ stream libraries instead of fopen, will the fd limit be the
> one specified by ulimit?
> Say,
> ofstream file; file.open(temporaryFile);
>
That sounds about right. stdio has documented limits, and it looks like
you're hitting one.
>
> 2) I just removed the fopen and ran the code with just tempnam; the
> truss shows me that stat() and access() system calls are made
> every time tempnam is executed.
>
> 23683/1: stat64(0x000484D4, 0xFFBFF670) = 0
> 23683/1: 0x000484D4: "/home/giripra/"
> 23683/1: d=0x05F0017B i=245281 m=0040777 l=46 u=7145 g=1 sz=40960
> 23683/1: at = Jul 15 12:30:26 IST 2009 [ 1247641226 ]
> 23683/1: mt = Jul 15 18:06:04 IST 2009 [ 1247661364 ]
> 23683/1: ct = Jul 15 18:06:04 IST 2009 [ 1247661364 ]
> 23683/1: bsz=8192 blks=88 fs=nfs
> 23683/1: access(0x00067C68, 3) = 0
> 23683/1: 0x00067C68: "/home/giripra"
>
> However on the site where the actual problem is occurring (please refer to
> http://forums.sun.com/thread.jspa?threadID=5396830&tstart=0 ), I
> see that for some hours when the binary is working as expected the number
> of fds is less than 255, and once it stops working the number of fds
> is more than 255. In the code, fopen is called immediately after
> tempnam. Your explanation clarifies the problem.
>
Your forum thread relates to Solaris 10. This is an OpenSolaris mailing
list. You would be better pursuing this through your support contract
and your local Solution Centre.
> But if fopen fails, why is it that we do not see system calls for tempnam
> in the truss? Below I have pasted the truss for 2 cases: when the binary is
> working as expected (fewer than 255 fds) and when the binary
> does not behave as expected (occurs when the fds in the process have
> exceeded 255).
>
Are these truss outputs from your test case program originally posted or
from the in-production app? If the latter, then I would hope that error
checking is being done. Your test case does not check the value of
outFile after calling fopen(). If it did, it should detect that fopen
returns NULL (or should as documented in the man page) and sets errno to
EMFILE.
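As a minimal sketch of the kind of check meant here (this is not the original test case, which is not reproduced in this message), a small wrapper can test the result of fopen() and report errno. The name open_checked is hypothetical; at the stdio descriptor limit one would expect the reported error to be EMFILE.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: open a file with fopen() and report any failure.
 * Returns the FILE pointer, or NULL with errno preserved for the caller.
 */
FILE *open_checked(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);

    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        int saved = errno;  /* fprintf() may itself change errno */
        fprintf(stderr, "fopen(%s) failed: %s (errno %d)\n",
                path, strerror(saved), saved);
        errno = saved;
    }
    return fp;
}
```

Calling something like open_checked(name, "w") in place of the bare fopen() would make the failing iteration visible on stderr without needing truss.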
If the truss output is coming from your test app, then I'm not sure what
is going on. The equivalent C program always shows the same stat/access
calls, regardless of whether fopen is used or not. Try adding the option
"-ulibc::tempnam,fopen" to truss to trace entry to and exit from the
tempnam() and fopen() library calls. I get output like this around the
255/256 fd mark:
/1@1: -> libc:tempnam(0x8050b90, 0x0)
/1: stat64("/tmp/brian/", 0x08047900) = 0
/1: access("/tmp/brian", W_OK|X_OK) = 0
/1: getpid() = 1205 [1204]
/1: lstat64("/tmp/brian/SJA44aOwc", 0x08047810) Err#2 ENOENT
/1@1: <- libc:tempnam() = 0x80662c8
/1@1: -> libc:fopen(0x80662c8, 0x8050b9c)
/1: open("/tmp/brian/SJA44aOwc", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 255
/1@1: <- libc:fopen() = 0x80675a0
Count 255 /tmp/brian/SJA44aOwc opened
/1: write(1, " C o u n t 2 5 5 / t".., 38) = 38
/1@1: -> libc:tempnam(0x8050b90, 0x0)
/1: stat64("/tmp/brian/", 0x08047900) = 0
/1: access("/tmp/brian", W_OK|X_OK) = 0
/1: getpid() = 1205 [1204]
/1: lstat64("/tmp/brian/TJA54aOwc", 0x08047810) Err#2 ENOENT
/1@1: <- libc:tempnam() = 0x80662f0
/1@1: -> libc:fopen(0x80662f0, 0x8050b9c)
/1: open("/tmp/brian/TJA54aOwc", O_WRONLY|O_CREAT|O_TRUNC, 0666) Err#24 EMFILE
/1@1: <- libc:fopen() = 0
Count 256 /tmp/brian/TJA54aOwc FAILED
/1: write(1, " C o u n t 2 5 6 / t".., 38) = 38
From this we can clearly see which system calls were associated with
the tempnam() function, and which with fopen().
In your original post:
>
> The truss looks very interesting. The file which gets fd 256 seems to
> get closed immediately even though there is no close call in the code.
> However the other fds, i.e. from 0 to 254, continue to be open till the
> program exits. This is strange behaviour. See attached file for truss.
>
This is due to the call to close() in the failure path of fopen. See
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/port/stdio/_endopen.c#56
for the OpenSolaris version of this code. The close you are hitting is
probably the one at line 106.
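The effect can be sketched roughly as follows. This is an illustration only, not the libc source: open_below_limit and its max_fd parameter are hypothetical names, and the assumption is that stdio cannot use a descriptor above some limit (255 in the case discussed here), so an open() that already succeeded is undone with close().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Illustration only (hypothetical helper, not the libc implementation):
 * open the file, then give the descriptor back if it is too large to use.
 * truss would show this as a successful open() followed by a close().
 */
int open_below_limit(const char *path, int flags, int max_fd)
{
    assert(path != NULL);

    int fd = open(path, flags, 0666);
    if (fd < 0)
        return -1;              /* open() itself failed; errno is set */

    if (fd > max_fd) {
        (void) close(fd);       /* the "unexpected" close seen in truss */
        errno = EMFILE;         /* assumed error code, as discussed above */
        return -1;
    }
    return fd;
}
```

The caller never sees the descriptor in the failing case, which is why the close appears in the trace even though the application code contains no close call.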
Does your app actually need all 255 file descriptors at the same time,
or is it simply leaking them?
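One rough way to tell the two cases apart is to log how many descriptors the process holds at interesting points. The sketch below is only an assumption about how one might do that, probing each descriptor number with fcntl(F_GETFD); count_open_fds and its limit parameter are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical diagnostic: count descriptors in [0, limit) that are open.
 * fcntl(fd, F_GETFD) fails with EBADF for a descriptor that is not open.
 */
int count_open_fds(int limit)
{
    assert(limit >= 0);

    int count = 0;
    for (int fd = 0; fd < limit; fd++) {
        if (fcntl(fd, F_GETFD) != -1 || errno != EBADF)
            count++;
    }
    return count;
}
```

If the value returned by count_open_fds(256) climbs steadily across calls to the suspect code path, that would point towards a leak rather than a genuine need for that many open files.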
Hope that helps,
Brian
--
Brian Ruthven Sun Microsystems UK
Solaris Revenue Product Engineering Tel: +44 (0)1252 422 312
Sparc House, Guillemont Park, Camberley, GU17 9QG