Re: [Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-17 Thread Alexander Bokovoy

On Wed, 18 Feb 2015, Thomas Raehalme wrote:

> Hi!
>
> On Mon, Feb 16, 2015 at 8:44 AM, Alexander Bokovoy wrote:
>
>> I suspect you've triggered https://fedorahosted.org/freeipa/ticket/4586
>> and https://fedorahosted.org/freeipa/ticket/4635 -- slapi-nis plugin
>> configuration does not limit itself to $SUFFIX and listens to changes in
>> cn=changelog too, so it may deadlock with replication traffic.
>>
>> We fixed these partly by changing slapi-nis configuration, partly by
>> fixing bugs in 389-ds.
>>
>> I wonder if amending your slapi-nis config to avoid triggering internal
>> searches on cn=changelog would be enough.
>
> Is it possible to work around this issue by disabling replication? If so, is
> ipa-replica-manage disconnect enough or should we use del instead?

I think you are solving the wrong issue.

Changing the slapi-nis configuration to ignore cn=changelog is the change
we made for FreeIPA 4.1. We ended up ignoring a few more subtrees too:
https://fedorahosted.org/freeipa/ticket/4635#comment:16
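
A rough sketch for checking whether a server already carries such a setting
is shown here. It rests on several assumptions that should be verified
against the ticket comment above and the slapi-nis documentation: that the
attribute is named schema-compat-ignore-subtree, that the installed slapi-nis
release understands it, and that the configuration is stored in the
instance's dse.ldif. The function name compat_ignores is made up for
illustration.

```shell
#!/bin/sh
# compat_ignores LDIF SUBTREE
# Succeeds if LDIF contains the exact line
#   schema-compat-ignore-subtree: SUBTREE
# (case-insensitive). The attribute name is an assumption, not something
# confirmed by this thread; check your slapi-nis documentation.
compat_ignores() {
    grep -qixF "schema-compat-ignore-subtree: $2" "$1"
}

# Example (path assumes an instance named EXAMPLE-COM):
#   if compat_ignores /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif cn=changelog; then
#       echo "cn=changelog is ignored"
#   else
#       echo "cn=changelog is NOT ignored"
#   fi
```

The check only reads the file. It does not handle LDIF values folded across
lines, so a negative result is a hint to look closer rather than proof.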

To verify that it is the same issue, you need to show backtraces of
ns-slapd taken while it does not respond to LDAP queries, but I suspect it
very likely is.
--
/ Alexander Bokovoy


-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go To http://freeipa.org for more info on the project

Re: [Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-17 Thread Thomas Raehalme
Hi!

On Mon, Feb 16, 2015 at 8:44 AM, Alexander Bokovoy 
wrote:

> I suspect you've triggered https://fedorahosted.org/freeipa/ticket/4586
> and https://fedorahosted.org/freeipa/ticket/4635 -- slapi-nis plugin
> configuration does not limit itself to $SUFFIX and listens to changes in
> cn=changelog too, so it may deadlock with replication traffic.
>
> We fixed these partly by changing slapi-nis configuration, partly by
> fixing bugs in 389-ds.
>
> I wonder if amending your slapi-nis config to avoid triggering internal
> searches on cn=changelog would be enough.
>
>
Is it possible to work around this issue by disabling replication? If so, is
ipa-replica-manage disconnect enough or should we use del instead?

Best regards,
Thomas

Re: [Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-17 Thread Thomas Raehalme
On Mon, Feb 16, 2015 at 8:44 AM, Alexander Bokovoy 
wrote:

> I suspect you've triggered https://fedorahosted.org/freeipa/ticket/4586
> and https://fedorahosted.org/freeipa/ticket/4635 -- slapi-nis plugin
> configuration does not limit itself to $SUFFIX and listens to changes in
> cn=changelog too, so it may deadlock with replication traffic.
>
> We fixed these partly by changing slapi-nis configuration, partly by
> fixing bugs in 389-ds.
>
> I wonder if amending your slapi-nis config to avoid triggering internal
> searches on cn=changelog would be enough.
>
> If you have RHEL subscription, please open a case with Red Hat's
> support.
>
>
I opened a support case, but unfortunately the IPA server is running on
CentOS so no help from the support. Any chance you could share the
configuration changes you referred to above?

At the moment we cannot even use ipa-replica-manage because it fails with
"Can't contact LDAP server". I doubt it has something to do with
Kerberos-based authentication, as kinit is also really unstable at the moment.

Best regards,
Thomas

Re: [Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-16 Thread Thomas Raehalme
On Mon, Feb 16, 2015 at 8:44 AM, Alexander Bokovoy 
wrote:

> I wonder if amending your slapi-nis config to avoid triggering internal
> searches on cn=changelog would be enough.
>

I can try, but would need some more details, if possible.


>
> If you have RHEL subscription, please open a case with Red Hat's
> support.
>

Ahh, it's been on my todo list for quite some time now (performing fresh
installs of all those CentOS servers isn't something I look forward to).
But an order has now been sent, and we'll start with IPA :-)

Best regards,
Thomas

Re: [Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-15 Thread Alexander Bokovoy

On Mon, 16 Feb 2015, Thomas Raehalme wrote:

> On Mon, Feb 16, 2015 at 1:04 AM, Thomas Raehalme <
> thomas.raeha...@codecenter.fi> wrote:
>
>>> Finally, do some stacktraces every couple of seconds over a period of a
>>> minute.  For example, is the server really hung at the poll() in thread 32,
>>> or will the poll() eventually return write ready and proceed?
>>
>> Will do. Unfortunately it'll probably not take too long until the next
>> occurrence.
>
> Here's a new set of stacktraces from a period of approx 1 minute. I hope
> the attachment is not too large.

I suspect you've triggered https://fedorahosted.org/freeipa/ticket/4586
and https://fedorahosted.org/freeipa/ticket/4635 -- slapi-nis plugin
configuration does not limit itself to $SUFFIX and listens to changes in
cn=changelog too, so it may deadlock with replication traffic.

We fixed these partly by changing slapi-nis configuration, partly by
fixing bugs in 389-ds.

I wonder if amending your slapi-nis config to avoid triggering internal
searches on cn=changelog would be enough.

If you have RHEL subscription, please open a case with Red Hat's
support.

--
/ Alexander Bokovoy



Re: [Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-15 Thread Thomas Raehalme
On Mon, Feb 16, 2015 at 12:57 AM, Rich Megginson 
wrote:

> Some info missing.  Since this is IPA, you'll need some additional
> debuginfo packages: debuginfo-install ipa-server slapi_nis (or maybe it's
> slapi-nis)
>
> Also looks as though the nspr debuginfo does not match the nspr version.
> rpm -q nspr nspr-debuginfo
>

I had installed only the minimum debuginfo packages mentioned at that link
(with yum). I have now added the ones you mentioned here.


> Finally, do some stacktraces every couple of seconds over a period of a
> minute.  For example, is the server really hung at the poll() in thread 32,
> or will the poll() eventually return write ready and proceed?
>
>
Will do. Unfortunately it'll probably not take too long until the next
occurrence.

Best regards,
Thomas

Re: [Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-15 Thread Rich Megginson

On 02/15/2015 03:41 PM, Thomas Raehalme wrote:

> Hi!
>
> On Sun, Feb 15, 2015 at 11:37 PM, Rich Megginson wrote:
>
>>> Today we started having problems with dirsrv hanging. We have
>>> observed the following symptoms (using EXAMPLE.COM instead of the
>>> real domain):
>>
>> see http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-hangs
>
> Thanks! Please find the stacktrace attached.


Some info missing.  Since this is IPA, you'll need some additional 
debuginfo packages: debuginfo-install ipa-server slapi_nis (or maybe 
it's slapi-nis)


Also looks as though the nspr debuginfo does not match the nspr version.
rpm -q nspr nspr-debuginfo
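
The comparison behind that rpm query can be scripted. The sketch below
assumes that a debuginfo package is named PKG-debuginfo and should carry the
same VERSION-RELEASE as PKG; the function names are hypothetical.

```shell
#!/bin/sh
# pkg_vr PKG: print VERSION-RELEASE of an installed package.
# Prints nothing and fails if the package (or rpm itself) is missing.
pkg_vr() {
    rpm -q "$1" > /dev/null 2>&1 || return 1
    rpm -q --qf '%{VERSION}-%{RELEASE}\n' "$1" | head -n 1
}

# same_vr A B: succeed only when both strings are non-empty and identical.
same_vr() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# check_debuginfo PKG: compare PKG with PKG-debuginfo and report the result.
check_debuginfo() {
    if same_vr "$(pkg_vr "$1")" "$(pkg_vr "$1-debuginfo")"; then
        echo "$1: debuginfo matches"
    else
        echo "$1: debuginfo missing or mismatched"
    fi
}

# Example:
#   for p in nspr 389-ds-base; do check_debuginfo "$p"; done
```

A mismatch usually means gdb cannot resolve symbols for that library, which
is why the frames in the first stacktrace looked incomplete.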

Finally, do some stacktraces every couple of seconds over a period of a 
minute.  For example, is the server really hung at the poll() in thread 
32, or will the poll() eventually return write ready and proceed?
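
One way to collect such a series is sketched here. It assumes gdb is
installed, that the script runs as root, and that a single ns-slapd PID is
passed in; the pid file path in the example is an assumption about a default
install, and the function names are illustrative.

```shell
#!/bin/sh
# dump_stacks PID OUTFILE: write a backtrace of every thread of PID to OUTFILE.
dump_stacks() {
    gdb -batch -ex 'set pagination off' -ex 'thread apply all bt full' \
        -p "$1" > "$2" 2>&1
}

# sample_stacks PID COUNT INTERVAL [DUMPER]
# Take COUNT dumps, INTERVAL seconds apart, and print each file name.
# DUMPER defaults to dump_stacks; it is a parameter only to allow dry runs.
sample_stacks() {
    pid=$1; count=$2; interval=$3; dumper=${4:-dump_stacks}
    i=1
    while [ "$i" -le "$count" ]; do
        out="stacktrace.$(date +%s).$i.txt"
        "$dumper" "$pid" "$out"
        echo "$out"
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

# Example: 12 dumps, 5 seconds apart (roughly one minute):
#   sample_stacks "$(cat /var/run/dirsrv/slapd-EXAMPLE-COM.pid)" 12 5
```

Comparing the same thread across consecutive files shows whether it is
stuck in one call or still making progress. Note that gdb pauses the process
while it is attached, so each dump briefly stalls the server.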




Best regards,
Thomas






Re: [Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-15 Thread Thomas Raehalme
Hi!

On Sun, Feb 15, 2015 at 11:37 PM, Rich Megginson 
wrote:

>
>  Today we started having problems with dirsrv hanging. We have observed
> the following symptoms (using EXAMPLE.COM instead of the real domain):
>
>
> see http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-hangs
>
>
Thanks! Please find the stacktrace attached.

Best regards,
Thomas


stacktrace.1424039830.txt.gz
Description: GNU Zip compressed data

Re: [Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-15 Thread Rich Megginson

On 02/15/2015 01:02 PM, Thomas Raehalme wrote:

Hi!

Today we started having problems with dirsrv hanging. We have observed
the following symptoms (using EXAMPLE.COM instead of the real domain):


/var/log/dirsrv/slapd-EXAMPLE-COM/errors:

[15/Feb/2015:21:48:50 +0200] slapd_ldap_sasl_interactive_bind - Error: 
could not perform interactive bind for id [] mech [GSSAPI]: LDAP error 
-1 (Can't contact LDAP server) ((null)) errno 107 (Transport endpoint 
is not connected)
[15/Feb/2015:21:48:50 +0200] slapi_ldap_bind - Error: could not 
perform interactive bind for id [] mech [GSSAPI]: error -1 (Can't 
contact LDAP server)


/var/log/messages:

Feb 15 21:49:02 ipa named[5545]: LDAP query timed out. Try to adjust 
"timeout" parameter
Feb 15 21:49:03 ipa named[5545]: LDAP query timed out. Try to adjust 
"timeout" parameter

(repeated)

Trying to access the DS also with ldapsearch just hangs:

ldapsearch -h localhost -x "dc=example,dc=com"


see http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-hangs



And Kerberos is unavailable as well:

# KRB5_TRACE=/dev/stdout kinit admin
[6421] 1424029967.466519: Getting initial credentials for ad...@example.com
[6421] 1424029967.467202: Sending request (172 bytes) to EXAMPLE.COM
[6421] 1424029967.467736: Sending initial UDP request to dgram 10.1.1.1:88
[6421] 1424029968.469031: Initiating TCP connection to stream 10.1.1.1:88
[6421] 1424029968.469205: Sending TCP request to stream 10.1.1.1:88
[6421] 1424029971.472024: Sending retry UDP request to dgram 10.1.1.1:88
[6421] 1424029976.477340: Sending retry UDP request to dgram 10.1.1.1:88
kinit: Cannot contact any KDC for realm 'EXAMPLE.COM' while getting initial credentials


Strange thing is that there is hardly any CPU utilization when the 
problem is occurring.


In addition we have started to see the following entries in 
/var/log/messages:


Feb 15 21:37:27 ipa kernel: possible SYN flooding on port 88. Sending 
cookies.
Feb 15 21:39:37 ipa kernel: possible SYN flooding on port 88. Sending 
cookies.


I'm not sure if this is related, but it's something we haven't seen 
before.


We are running CentOS release 6.6 (Final) with the latest available 
packages:


389-ds-base-libs-1.2.11.15-48.el6_6.x86_64
389-ds-base-1.2.11.15-48.el6_6.x86_64
ipa-client-3.0.0-42.el6.centos.x86_64
ipa-server-selinux-3.0.0-42.el6.centos.x86_64
libipa_hbac-1.11.6-30.el6_6.3.x86_64
sssd-ipa-1.11.6-30.el6_6.3.x86_64
ipa-admintools-3.0.0-42.el6.centos.x86_64
ipa-python-3.0.0-42.el6.centos.x86_64
ipa-pki-ca-theme-9.0.3-7.el6.noarch
ipa-server-3.0.0-42.el6.centos.x86_64
libipa_hbac-python-1.11.6-30.el6_6.3.x86_64
ipa-pki-common-theme-9.0.3-7.el6.noarch
krb5-workstation-1.10.3-33.el6.x86_64
krb5-libs-1.10.3-33.el6.x86_64
sssd-krb5-common-1.11.6-30.el6_6.3.x86_64
python-krbV-1.0.90-3.el6.x86_64
krb5-server-1.10.3-33.el6.x86_64
sssd-krb5-1.11.6-30.el6_6.3.x86_64
pam_krb5-2.3.11-9.el6.x86_64

Killing the dirsrv processes and restarting them resolves the issue - 
until it happens again after about 15 minutes.


Any idea what could have gone wrong? I can e-mail logs, if necessary.

Thank you in advance!

Best regards,
Thomas





[Freeipa-users] dirsrv hangs, 0% CPU util

2015-02-15 Thread Thomas Raehalme
Hi!

Today we started having problems with dirsrv hanging. We have observed the
following symptoms (using EXAMPLE.COM instead of the real domain):

/var/log/dirsrv/slapd-EXAMPLE-COM/errors:

[15/Feb/2015:21:48:50 +0200] slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: LDAP error -1 (Can't contact LDAP server) ((null)) errno 107 (Transport endpoint is not connected)
[15/Feb/2015:21:48:50 +0200] slapi_ldap_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: error -1 (Can't contact LDAP server)

/var/log/messages:

Feb 15 21:49:02 ipa named[5545]: LDAP query timed out. Try to adjust "timeout" parameter
Feb 15 21:49:03 ipa named[5545]: LDAP query timed out. Try to adjust "timeout" parameter
(repeated)

Trying to access the DS with ldapsearch also just hangs:

ldapsearch -h localhost -x "dc=example,dc=com"
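
To tell a hang from an outright failure without tying up a terminal, the
query can be wrapped in a deadline. This is a minimal sketch, assuming the
coreutils timeout command and the OpenLDAP ldapsearch client are available;
probe_ldap and describe_status are made-up names. It passes the search base
with -b, since ldapsearch treats a bare trailing argument as the filter.

```shell
#!/bin/sh
# describe_status CODE: turn the exit status of "timeout ldapsearch" into text.
# 124 is the status coreutils timeout returns when the deadline expires.
describe_status() {
    case $1 in
        0)   echo "responding" ;;
        124) echo "hung (no answer before the deadline)" ;;
        *)   echo "failed (exit $1)" ;;
    esac
}

# probe_ldap HOST [SECONDS]
# Anonymous read of the root DSE, abandoned after SECONDS (default 10).
probe_ldap() {
    status=0
    timeout "${2:-10}" ldapsearch -x -h "$1" -b "" -s base \
        "(objectClass=*)" namingContexts > /dev/null 2>&1 || status=$?
    describe_status "$status"
}

# Example:
#   probe_ldap localhost 10
```

Run from cron or a loop, this gives a timestamped record of when the server
stops answering, which can be lined up with the error log.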

And Kerberos is unavailable as well:

# KRB5_TRACE=/dev/stdout kinit admin
[6421] 1424029967.466519: Getting initial credentials for ad...@example.com
[6421] 1424029967.467202: Sending request (172 bytes) to EXAMPLE.COM
[6421] 1424029967.467736: Sending initial UDP request to dgram 10.1.1.1:88
[6421] 1424029968.469031: Initiating TCP connection to stream 10.1.1.1:88
[6421] 1424029968.469205: Sending TCP request to stream 10.1.1.1:88
[6421] 1424029971.472024: Sending retry UDP request to dgram 10.1.1.1:88
[6421] 1424029976.477340: Sending retry UDP request to dgram 10.1.1.1:88
kinit: Cannot contact any KDC for realm 'EXAMPLE.COM' while getting initial credentials

The strange thing is that there is hardly any CPU utilization when the problem
is occurring.
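
Near-zero CPU use is what one would expect if the server threads are blocked
waiting rather than spinning. A quick way to see where they wait, short of a
full gdb backtrace, is to group threads by kernel wait channel. This is only
a sketch: summarize_wchan is a hypothetical helper, and the pid file path in
the example is an assumption about a default install.

```shell
#!/bin/sh
# summarize_wchan: read "ps -L -o tid,wchan:32" output on stdin and print
# how many threads sit in each kernel wait channel, most common first.
# The first input line is the ps header and is skipped.
summarize_wchan() {
    awk 'NR > 1 { n[$2]++ } END { for (w in n) print n[w], w }' | sort -rn
}

# Example:
#   ps -L -o tid,wchan:32 -p "$(cat /var/run/dirsrv/slapd-EXAMPLE-COM.pid)" \
#       | summarize_wchan
```

If most threads report a futex-related wait channel, that points toward lock
contention or a deadlock, though only the stack traces can confirm which
locks are involved.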

In addition we have started to see the following entries in
/var/log/messages:

Feb 15 21:37:27 ipa kernel: possible SYN flooding on port 88. Sending cookies.
Feb 15 21:39:37 ipa kernel: possible SYN flooding on port 88. Sending cookies.

I'm not sure if this is related, but it's something we haven't seen before.

We are running CentOS release 6.6 (Final) with the latest available
packages:

389-ds-base-libs-1.2.11.15-48.el6_6.x86_64
389-ds-base-1.2.11.15-48.el6_6.x86_64
ipa-client-3.0.0-42.el6.centos.x86_64
ipa-server-selinux-3.0.0-42.el6.centos.x86_64
libipa_hbac-1.11.6-30.el6_6.3.x86_64
sssd-ipa-1.11.6-30.el6_6.3.x86_64
ipa-admintools-3.0.0-42.el6.centos.x86_64
ipa-python-3.0.0-42.el6.centos.x86_64
ipa-pki-ca-theme-9.0.3-7.el6.noarch
ipa-server-3.0.0-42.el6.centos.x86_64
libipa_hbac-python-1.11.6-30.el6_6.3.x86_64
ipa-pki-common-theme-9.0.3-7.el6.noarch
krb5-workstation-1.10.3-33.el6.x86_64
krb5-libs-1.10.3-33.el6.x86_64
sssd-krb5-common-1.11.6-30.el6_6.3.x86_64
python-krbV-1.0.90-3.el6.x86_64
krb5-server-1.10.3-33.el6.x86_64
sssd-krb5-1.11.6-30.el6_6.3.x86_64
pam_krb5-2.3.11-9.el6.x86_64

Killing the dirsrv processes and restarting them resolves the issue - until
it happens again after about 15 minutes.

Any idea what could have gone wrong? I can e-mail logs, if necessary.

Thank you in advance!

Best regards,
Thomas