Re: [Freeipa-users] Timeout (?) issues

Petr Spacek Fri, 20 Sep 2013 01:10:43 -0700

On 20.9.2013 01:24, KodaK wrote:

This is ridiculous, right?


IPA server 1:

# for i in $(ls access*); do echo -n  $i:\  ;grep err=32 $i | wc -l; done
access: 248478
access.20130916-043207: 302774
access.20130916-123642: 272572
access.20130916-201516: 294308
access.20130917-081053: 295060
access.20130917-144559: 284498
access.20130917-231435: 281035
access.20130918-091611: 291165
access.20130918-154945: 275792
access.20130919-014322: 296113

IPA server 2:

access: 4313
access.20130909-200216: 4023
access.20130910-200229: 4161
access.20130911-200239: 4182
access.20130912-200249: 5069
access.20130913-200258: 3833
access.20130914-200313: 4208
access.20130915-200323: 4702
access.20130916-200332: 4532


IPA server 3:

access: 802
access.20130910-080737: 3876
access.20130911-080748: 3902
access.20130912-080802: 3678
access.20130913-080810: 3765
access.20130914-080826: 3524
access.20130915-080907: 4142
access.20130916-080916: 4930
access.20130917-080926: 4769
access.20130918-081005: 2879

IPA server 4:

access: 2812
access.20130910-003051: 4095
access.20130911-003105: 3623
access.20130912-003113: 3606
access.20130913-003125: 3581
access.20130914-003135: 3758
access.20130915-003150: 3935
access.20130916-003159: 4184
access.20130917-003210: 3859
access.20130918-003221: 5110


The vast majority of the err=32 messages are DNS entries.

It depends on your setup. Bind-dyndb-ldap does LDAP search for eachnon-existent name to verify that the name wasn't added to LDAP in meanwhile.If you have clients doing 1M queries for non-existing names per day, then youwill see 1M LDAP queries with err=32 per day.

Next major version of bind-dyndb-ldap will have reworked internal database andit will support negative caching, so number of err=32 should drop significantly.

Here are some samples:

[19/Sep/2013:18:19:51 -0500] conn=9 op=169764 SRCH base="idnsName=xxx.com
,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0
filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169764 RESULT err=32 tag=101
nentries=0 etime=0

This is interesting, because this LDAP query is equal to DNS query for"xxx.com.unix.xxx.com." Are your clients that crazy? :-)

[19/Sep/2013:18:19:51 -0500] conn=9 op=169774 SRCH base="idnsName=
slpoxacl01.unix.xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com"
scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169774 RESULT err=32 tag=101
nentries=0 etime=0


This is equivalent to DNS query for "slpoxacl01.unix.xxx.com.unix.xxx.com.".

[19/Sep/2013:18:19:51 -0500] conn=9 op=169770 SRCH base="idnsName=
sla400q1.unix.xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com"
scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169770 RESULT err=32 tag=101
nentries=0 etime=0


And this is "sla400q1.unix.xxx.com.unix.xxx.com.".

[19/Sep/2013:18:19:51 -0500] conn=9 op=169772 SRCH base="idnsName=
magellanhealth.com,idnsname=unix.magellanhealth.com,cn=dns,dc=unix,dc=magellanhealth,dc=com"
scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169772 RESULT err=32 tag=101
nentries=0 etime=0

So far today there are over half a million of these.  That can't be right.

I would recommend you to use network sniffer and check which clients sendsthese crazy queries.


My guess is that your resolver library (libc?) causes this.

On my Linux system with glibc-2.17-14.fc19.x86_64 it behaves in this way:

client query = nonexistent.example.com.
(I used $ "ping nonexistent.example.com.")
search domain in /etc/resolv.conf = brq.redhat.com.

DNS query #1: nonexistent.example.com. => NXDOMAIN
DNS query #2: nonexistent.example.com.brq.redhat.com. => NXDOMAIN
DNS query #3: nonexistent.example.com.redhat.com. => NXDOMAIN

On Thu, Sep 19, 2013 at 3:05 PM, KodaK <sako...@gmail.com> wrote:

I didn't realize that DNS created one connection.  I thought it was one
connection spanning several days.

In theory, there should be 2-4 LDAP connections from each DNS server and thoseconnections should live until DNS or LDAP server restarts/crashes.


Petr^2 Spacek

On Thu, Sep 19, 2013 at 2:51 PM, Rich Megginson <rmegg...@redhat.com>wrote:

  On 09/19/2013 12:57 PM, KodaK wrote:

Well, this is awkward:

  [root@slpidml01 slapd-UNIX-xxx-COM]# grep conn=170902 access* | wc -l
5453936
[root@slpidml01 slapd-UNIX-xxx-COM]#


Why is it awkward?




On Thu, Sep 19, 2013 at 1:48 PM, KodaK <sako...@gmail.com> wrote:

Thanks.  I've been running that against my logs, and this has to be
abnormal:

  err=32               129274    No Such Object
err=0                 10952    Successful Operations
err=14                  536    SASL Bind in Progress
err=53                   39    Unwilling To Perform
err=49                    3    Invalid Credentials (Bad Password)

  I'm still trying to figure out why there are so many error 32s.  Are
there any usual suspects I should know about?  (That's just the current
access log, btw.)


On Tue, Sep 17, 2013 at 9:01 AM, Rich Megginson <rmegg...@redhat.com>wrote:

   On 09/16/2013 07:57 PM, Dmitri Pal wrote:

On 09/16/2013 12:02 PM, KodaK wrote:

Yet another AIX related problem:

  The AIX LDAP client is called secldapclntd (sure, they could make it
more awkward, but the budget ran out.)  I'm running into the issue detailed
here:

  http://www-01.ibm.com/support/docview.wss?uid=isg1IV11344

  "If an LDAP server fails to answer an LDAP query, secldapclntd caches
the non-answered query negatively. This may happen if the LDAP server
is down for example. After the LDAP server is back again secldapclntd will
use the negative cache entry and the application initiating the original
query will still fail until the cache entry expires."

  IBM is working on porting the fix to our specific TL and SP levels.

  What I'm concerned with here, though, is *why* is it timing out?  I
don't know what the current timeout values are (AIX sucks, etc.)

  I don't see timeout issues on my Linux boxes, which leads me to
believe that either the sssd timouts are longer or that sssd is just more
robust when dealing with timeouts.

  I believe I'm seeing similar behavior with LDAP sudo on AIX as well,
because I occasionally have to re-run sudo commands because they initially
fail (and I know I'm using the right passwords.)  However, sudo doesn't
appear to have a cache (or it handles caching better.)

  Does anyone have any troubleshooting suggestions?  Any general "speed
things up" suggestions on the IPA side?

  Thanks,

  --Jason

  --
The government is going to read our mail anyway, might as well make it
tough for them.  GPG Public key ID:  B6A1A7C6


_______________________________________________
Freeipa-users mailing 
listFreeipa-users@redhat.comhttps://www.redhat.com/mailman/listinfo/freeipa-users


Is the server FreeIPA?
Can see in the server logs what is actually happening is it the server
that really takes time or there is a network connectivity issue or FW is
dropping packets?
I would really start with the server side logs.


  As far as 389 goes, run logconv.pl against the access logs in
/var/log/dirsrv/slapd-DOMAIN-COM


_______________________________________________
Freeipa-users mailing list
Freeipa-users@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-users

Re: [Freeipa-users] Timeout (?) issues

Reply via email to