I am looking for help isolating a recurring problem with our 389 DS version 
1.2.8.3 installation. We use the directory server to authenticate users of our 
website. Every couple of days, our website login application starts reporting 
that a user authentication timed out, and the timeouts become more frequent as 
time passes. Our current fix is to restart the directory server, which resolves 
the problem for a couple of days before the timeouts start again. I traced one 
application timeout back to the DS access log and found the following entry at 
the same time:

[14/Mar/2012:10:23:01 -0500] conn=14730 op=-1 fd=1093 closed error 104 
(Connection reset by peer) - TCP connection reset by peer.

I looked through the older logs and the only time this conn/fd was used was two 
days ago. Here are the access log entries:

[12/Mar/2012:14:33:06 -0500] conn=14730 fd=1093 slot=1093 connection from 
10.1.xx.xx to 10.1.xx.xx
[12/Mar/2012:14:33:06 -0500] conn=14730 op=0 BIND dn="uid,dc=domain,dc=com" 
method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730 op=0 RESULT err=0 tag=97 nentries=0 
etime=0 dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:33:06 -0500] conn=14730 op=1 SRCH 
base="ou=users,ou=external,dc=domain,dc=com" scope=2 
filter="(&(uid=xxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:33:06 -0500] conn=14730 op=1 RESULT err=0 tag=101 nentries=1 
etime=0
[12/Mar/2012:14:33:06 -0500] conn=14730 op=2 BIND 
dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730 op=2 RESULT err=0 tag=97 nentries=0 
etime=0 dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:33:06 -0500] conn=14730 op=3 BIND dn="uid,dc=domain,dc=com" 
method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730 op=3 RESULT err=0 tag=97 nentries=0 
etime=0 dn="uid=theManager,dc= domain,dc=com"
[12/Mar/2012:14:35:20 -0500] conn=14730 op=4 SRCH 
base="ou=groups,ou=external,dc= domain,dc=com" scope=2 
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))" 
attrs="1.1"
[12/Mar/2012:14:35:20 -0500] conn=14730 op=4 RESULT err=0 tag=101 nentries=1 
etime=0
[12/Mar/2012:14:35:20 -0500] conn=14730 op=5 SRCH 
base="ou=groups,ou=external,dc= domain,dc=com" scope=2 
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
 attrs="cn"
[12/Mar/2012:14:35:20 -0500] conn=14730 op=5 RESULT err=0 tag=101 nentries=0 
etime=0
[12/Mar/2012:14:36:50 -0500] conn=14730 op=6 SRCH 
base="ou=users,ou=external,dc=domain,dc=com" scope=2 
filter="(&(uid=xxxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:36:50 -0500] conn=14730 op=6 RESULT err=0 tag=101 nentries=1 
etime=0
[12/Mar/2012:14:36:50 -0500] conn=14730 op=7 BIND 
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:36:50 -0500] conn=14730 op=7 RESULT err=0 tag=97 nentries=0 
etime=0 dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:36:50 -0500] conn=14730 op=8 BIND 
dn="uid=theManager,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:36:50 -0500] conn=14730 op=8 RESULT err=0 tag=97 nentries=0 
etime=0 dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:37:02 -0500] conn=14730 op=9 SRCH 
base="ou=groups,ou=external,dc=domain,dc=com" scope=2 
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))" 
attrs="1.1"
[12/Mar/2012:14:37:02 -0500] conn=14730 op=9 RESULT err=0 tag=101 nentries=1 
etime=0
[12/Mar/2012:14:37:02 -0500] conn=14730 op=10 SRCH 
base="ou=groups,ou=external,dc=domain,dc=com" scope=2 
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
 attrs="cn"
[12/Mar/2012:14:37:02 -0500] conn=14730 op=10 RESULT err=0 tag=101 nentries=0 
etime=0
[12/Mar/2012:14:39:35 -0500] conn=14730 op=11 SRCH 
base="ou=users,ou=external,dc=domain,dc=com" scope=2 
filter="(&(uid=xxxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:39:35 -0500] conn=14730 op=11 RESULT err=0 tag=101 nentries=1 
etime=0
[12/Mar/2012:14:39:35 -0500] conn=14730 op=12 BIND 
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:39:35 -0500] conn=14730 op=12 RESULT err=0 tag=97 nentries=0 
etime=0 dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:39:35 -0500] conn=14730 op=13 BIND 
dn="uid=theManager,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:39:35 -0500] conn=14730 op=13 RESULT err=0 tag=97 nentries=0 
etime=0 dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:40:23 -0500] conn=14730 op=14 SRCH 
base="ou=groups,ou=external,dc=domain,dc=com" scope=2 
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))" 
attrs="1.1"
[12/Mar/2012:14:40:23 -0500] conn=14730 op=14 RESULT err=0 tag=101 nentries=1 
etime=0
[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 SRCH 
base="ou=groups,ou=external,dc=domain,dc=com" scope=2 
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
 attrs="cn"
[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 RESULT err=0 tag=101 nentries=0 
etime=0
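
To make this kind of tracing repeatable, here is a minimal Python sketch that summarizes each connection in an access log: when it was opened, when its last operation was logged, and when and why it was closed. It assumes every log entry is on a single line (the entries above are wrapped by the mail client) and that close entries contain "fd=<n> closed", as in the error 104 entry above. The function names are hypothetical and not part of any 389 DS tooling.

```python
import re
from datetime import datetime

# Timestamp and connection number that start each access log entry.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] conn=(?P<conn>\d+) (?P<rest>.*)$")
# Close entries look like "op=-1 fd=1093 closed error 104 (...)".
CLOSE_RE = re.compile(r"\bfd=\d+ closed\b(?P<reason>.*)")
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def summarize_connections(lines):
    """Return {conn: {"opened", "last_op", "closed", "close_reason"}}."""
    conns = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank lines and anything that is not a conn entry
        ts = datetime.strptime(m.group("ts"), TS_FORMAT)
        rest = m.group("rest")
        info = conns.setdefault(
            m.group("conn"),
            {"opened": None, "last_op": None, "closed": None, "close_reason": None},
        )
        close = CLOSE_RE.search(rest)
        if close:
            info["closed"] = ts
            info["close_reason"] = close.group("reason").strip()
        elif " connection from " in rest:
            info["opened"] = ts
        else:
            info["last_op"] = ts  # BIND, SRCH, RESULT, and so on
    return conns


def idle_seconds_before_close(info):
    """Seconds between the last logged operation and the close, if both exist."""
    if info["last_op"] is None or info["closed"] is None:
        return None
    return (info["closed"] - info["last_op"]).total_seconds()


def never_closed(conns):
    """Connection numbers with no close entry in the lines that were read."""
    return [conn for conn, info in conns.items() if info["closed"] is None]
```

Feeding it the open files of the current and rotated access logs, then sorting by idle_seconds_before_close, should surface connections like conn=14730 that sat idle for days before being reset.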

The pattern seems to be that the DS works fine after a restart until it runs 
out of unused connections and/or file descriptors (max FDs = 8192). Once it 
starts recycling them, the 104 errors appear more often in the access logs and 
we get more authentication errors. We suspect that the original connection was 
never terminated correctly, but we don't know whether the application or a DS 
setting is at fault.

Our servers have been tuned according to the wiki doc at 
http://directory.fedoraproject.org/wiki/Performance_Tuning#Linux
We have set the idle "timeout" to 60 seconds and the search "timelimit" to 120 
seconds, with no change in behavior.

Watching netstat -nap | grep slapd shows established connections that never 
drop off; their number just keeps growing.
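
The same check can be scripted so the growth is tracked over time and per client. The Python sketch below is only an illustration: it assumes the usual Linux `netstat -nap` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name), and the function name is hypothetical.

```python
from collections import Counter


def count_slapd_connections(netstat_output, program="slapd"):
    """Count TCP connection states and ESTABLISHED peers for one program.

    netstat_output is the text printed by `netstat -nap`; program is matched
    as a substring of the PID/Program column, like `grep slapd` would.
    """
    states = Counter()
    peers = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Keep only tcp/tcp6 rows that have all seven expected columns.
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        foreign, state, owner = fields[4], fields[5], fields[6]
        if program not in owner:
            continue
        states[state] += 1
        if state == "ESTABLISHED":
            # Drop the port so connections are grouped by client host.
            peers[foreign.rsplit(":", 1)[0]] += 1
    return states, peers
```

Running this against captured netstat output every few minutes and logging states["ESTABLISHED"] together with peers.most_common(5) would show how fast the count climbs and which client host holds the connections.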

Any help would be greatly appreciated.

Nicholas J Alther
Sr. Software Developer/Analyst
Black Hills Corporation
Phone: 605.721.2158
Cell:     605.593.1899





--
389 users mailing list
[email protected]
https://admin.fedoraproject.org/mailman/listinfo/389-users
