Re: [Freeipa-users] named's LDAP connection hangs
If there is a resolution to this, we would love to know. We have been experiencing the same issues.

From: freeipa-users-boun...@redhat.com [freeipa-users-boun...@redhat.com] on behalf of Thomas Raehalme [thomas.raeha...@codecenter.fi]
Sent: Sunday, June 22, 2014 8:29 AM
To: freeipa-users@redhat.com
Subject: Re: [Freeipa-users] named's LDAP connection hangs

Hi!

Today it finally happened again - named is not resolving names under the IPA domain, pvnet.cc. Killing the named process and restarting it solves the problem (until it happens again).

Petr, I'll send you the logs directly so I don't have to leave anything out. I hope that's okay.

Thank you for the help!

Best regards,
Thomas

On Mon, Jun 16, 2014 at 1:54 PM, Petr Spacek pspa...@redhat.com wrote:
> On 16.6.2014 09:41, Thomas Raehalme wrote:
> > Hi,
> >
> > We have a problem with IPA going out of service every now and then. There seem to be two kinds of situations:
> >
> > 1) The connection between named and dirsrv fails. named can resolve external names, but the domain managed by IPA does not resolve any names. named cannot be stopped. After killing the process and restarting it, the issue is resolved.
> >
> > 2) Sometimes the situation is more severe and dirsrv is also unresponsive. The solution then seems to be restarting both named and dirsrv (individually or through the 'ipa' service).
> >
> > Regarding #1, the file /var/log/messages contains the following:
> >
> > Jun 16 03:22:23 ipa named[7295]: received control channel command 'reload'
> > Jun 16 03:22:23 ipa named[7295]: loading configuration from '/etc/named.conf'
> > Jun 16 03:22:23 ipa named[7295]: using default UDP/IPv4 port range: [1024, 65535]
> > Jun 16 03:22:23 ipa named[7295]: using default UDP/IPv6 port range: [1024, 65535]
> > Jun 16 03:22:23 ipa named[7295]: sizing zone task pool based on 6 zones
> > Jun 16 03:22:23 ipa named[7295]: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Ticket expired)
> > Jun 16 03:22:23 ipa named[7295]: bind to LDAP server failed: Local error
> >
> > The reload is triggered by logrotate. For some reason authentication fails, and the IPA domain is no longer resolvable.
> >
> > I haven't discovered a pattern in how often these problems occur. Maybe once every week or two.
> >
> > The FreeIPA master, running on CentOS 6.5, has been configured with the default settings. In addition, a single replica has been added.
> >
> > Any ideas where I should look for the source of the problem?
>
> I have heard about this problem, but nobody has managed to reproduce it.
>
> Please:
> - configure the KRB5_TRACE variable as described on https://fedorahosted.org/bind-dyndb-ldap/wiki/BIND9/NamedCannotStart#a1.Gathersymptoms
> - restart named
> - send me logs when it happens again.
>
> Thank you!
>
> --
> Petr^2 Spacek
>
> ___
> Freeipa-users mailing list
> Freeipa-users@redhat.com
> https://www.redhat.com/mailman/listinfo/freeipa-users

--
Thomas Raehalme
CTO, teknologiajohtaja
Mobile +358 40 545 0605
Codecenter Oy
Väinönkatu 26 A, 4th Floor
40100 JYVÄSKYLÄ, Finland
Tel. +358 10 322 0040
www.codecenter.fi
Codecenter - Tietojärjestelmiä ymmärrettävästi

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go To http://freeipa.org for more info on the project
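The failure signature quoted above (a GSSAPI error followed by a failed LDAP bind, both logged by named) can be searched for in syslog. The following is only a sketch: the function names are hypothetical helpers, not part of BIND or FreeIPA, and the patterns are taken from the log lines in this thread, so other versions may word the messages differently.

```shell
#!/bin/sh
# Hypothetical helper: reads syslog lines on stdin and prints the named
# messages that indicate a failed GSSAPI bind to the LDAP server.
named_gssapi_failures() {
    grep -E 'named\[[0-9]+\]: (GSSAPI Error|bind to LDAP server failed)'
}

# Hypothetical helper: exits 0 if stdin contains at least one such line.
named_has_gssapi_failure() {
    named_gssapi_failures | grep -q .
}
```

On a live server this would typically be fed from the log file, for example `named_gssapi_failures < /var/log/messages`.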
[Freeipa-users] named unresponsive at seemingly random times
It seems to be at random and on different servers, but I will see the following in named.run:

update_zone (psearch) failed for 'idnsname=example.com,cn=dns,dc=example,dc=com'. Zones can be outdated, run `rndc reload`: bad zone

When I see this, I cannot do any DNS lookup for records in example.com. In addition, named will not restart; I have to manually kill it and then start it again. Once it is restarted, everything is fine and I can look up records again.

I am looking for suggestions on troubleshooting, or to hear whether anyone has seen this before and found a resolution.

I am running CentOS 6.5:
389-ds-base-1.2.11.15-30
bind-dyndb-ldap-2.3-5
bind-libs-9.8.2-0.17.rc1
bind-utils-9.8.2-0.17.rc1
bind-9.8.2-0.17.rc1

Thanks
Re: [Freeipa-users] named unresponsive at seemingly random times
The only thing I see that could be related is:

Jan 21 10:31:05 freeipa2 named[20660]: LDAP query timed out. Try to adjust timeout parameter

and then the message:

Jan 21 10:31:05 freeipa2 named[20660]: update_zone (psearch) failed for 'idnsname=example.com,cn=dns,dc=example,dc=com'. Zones can be outdated, run `rndc reload`: timed out

However, in the errors/access logs for that 389 instance, I do not see anything around that time. When this happens again I will do what you suggested below (I already have the debug packages installed) and will email you.

Thanks a TON for your help on this!

-----Original Message-----
From: Petr Spacek pspa...@redhat.com
Sent: Tuesday, January 21, 2014 10:29am
To: andrew.tranqu...@mailtrust.com, freeipa-users@redhat.com
Subject: Re: [Freeipa-users] named unresponsive at seemingly random times

On 19.1.2014 03:38, andrew.tranqu...@mailtrust.com wrote:
> It seems to be at random and on different servers, but I will see the following in named.run:
>
> update_zone (psearch) failed for 'idnsname=example.com,cn=dns,dc=example,dc=com'. Zones can be outdated, run `rndc reload`: bad zone

This typically means that your zone is missing NS or glue records. Did you make any changes in the zone at the time when the message appeared?

Do you see any errors related to the connection between the LDAP server and named? Look carefully in /var/log/messages for any other messages from named.

> When I see this, I cannot do any dns lookup for records in example.com. In addition, named will not restart, I have to manually kill it and then start it again. Once it is restarted, everything is fine, I can lookup records again.

This is really weird. Could you capture stacks at the time when the problem manifests? You can use the following commands:

$ yum install gdb
$ debuginfo-install bind bind-dyndb-ldap
$ gdb -ex 'set confirm off' -ex 'set pagination off' -ex 'thread apply all bt full' -ex 'quit' `which named` `pgrep named` > stacktrace.`date +%s`.log 2>&1

Please send the stacktrace file to this list or privately to me and I will look into it.

Have a nice day!

Petr^2 Spacek

> I am looking for suggestions on troubleshooting or if anyone has seen this before and found a resolution.
>
> I am running Centos 6.5:
> 389-ds-base-1.2.11.15-30
> bind-dyndb-ldap-2.3-5
> bind-libs-9.8.2-0.17.rc1
> bind-utils-9.8.2-0.17.rc1
> bind-9.8.2-0.17.rc1
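The two reports in this thread show the same update_zone message ending in different reasons ("bad zone" and "timed out"). A small filter can pull out just the reason, which makes it easier to see which case a server is hitting. This is a sketch only: `psearch_failure_reasons` is a hypothetical name, and the pattern assumes the message format exactly as quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads named log lines on stdin and prints the trailing
# reason of every "update_zone (psearch) failed" message, one per line.
psearch_failure_reasons() {
    # Match the quoted zone DN, skip the "Zones can be outdated..." hint,
    # and keep only the text after the final ": ".
    sed -n "s/.*update_zone (psearch) failed for '[^']*'\..*: \(.*\)\$/\1/p"
}
```

Piping the output through `sort | uniq -c` gives a count per reason, for example `psearch_failure_reasons < named.run | sort | uniq -c` when run in the directory that holds named.run.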
[Freeipa-users] Server randomly will stop accepting krb requests
I have 6 servers set up as FreeIPA replicas. 5 are working great, no problems. They are all running ipa-server-3.0.0-26.el6_4.4.x86_64.

However, the same one will randomly stop working. By "stop working" I mean the following (domain name and IPs have been redacted):

I cannot kinit as any user on that machine:

[root@badserver ~]# kinit admin
kinit: Generic error (see e-text) while getting initial credentials

I cannot connect on 389 or 636 to that server:

telnet badserver 636
telnet: Unable to connect to remote host: Connection refused

slapd is running and listening on port 389 according to netstat:

[root@badserver ~]# netstat -lpn | grep 389
tcp 0 0 :::7389 :::* LISTEN 16419/ns-slapd

but nothing is returned for port 636.

In the /var/log/slapd-PKI* or slapd-DOMAIN error files, the last error is from over a week ago; in fact, the last entry of any kind is from then:

[18/Sep/2013:01:09:34 -0400] slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (KDC returned error string: PROCESS_TGS)) errno 2 (No such file or directory)

/var/log/krb5kdc.log shows:

Sep 30 12:22:24 badserver krb5kdc[32063](info): AS_REQ (4 etypes {18 17 16 23}) ip: LOOKING_UP_CLIENT: ad...@example.com for krbtgt/example@example.com, Server error

A 'service ipa restart' ALWAYS fixes it.

I added debug=true to /etc/ipa/default.conf but I do not see anything that is helpful. The only things listed in default.conf are things related to importing plugin module

Any guidance/advice/docs to read would be greatly appreciated! The fact that it seems to be so random and the other 5 IPA servers are working great makes it even more frustrating!

Thanks!
Re: [Freeipa-users] Server randomly will stop accepting krb requests
Thanks for the response.

I did look in /var/log/slapd-PKI* or slapd-DOMAIN (I guess I was not too clear in my email that I did that). The last thing in that log is from Sep 18.

From /var/log/dirsrv/slapd-EXAMPLE-COM/errors:

[18/Sep/2013:01:09:34 -0400] slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (KDC returned error string: PROCESS_TGS)) errno 2 (No such file or directory)

That is all; the items before that time are additions/deletions of entries, which is normal.

-----Original Message-----
From: Alexander Bokovoy aboko...@redhat.com
Sent: Monday, September 30, 2013 12:47pm
To: Andrew Tranquada andrew.tranqu...@rackspace.com
Cc: freeipa-users@redhat.com
Subject: Re: [Freeipa-users] Server randomly will stop accepting krb requests

On Mon, 30 Sep 2013, Andrew Tranquada wrote:
> I have 6 servers setup as freeipa replicas. 5 are working great, no problems. They are all running ipa-server-3.0.0-26.el6_4.4.x86_64
>
> However, the same one will randomly stop working. By stop working I mean the following: (domain name and ips have been redacted)
>
> I cannot kinit as any user on that machine:
> [root@badserver ~]# kinit admin
> kinit: Generic error (see e-text) while getting initial credentials
>
> I cannot connect on 389 or 636 to that server:
> telnet badserver 636
> telnet: Unable to connect to remote host: Connection refused
>
> slapd is running and listening on port 389 according to netstat:
> [root@badserver ~]# netstat -lpn | grep 389
> tcp 0 0 :::7389 :::* LISTEN 16419/ns-slapd

This is port 7389, for the CA LDAP instance, not port 389, which is the main LDAP instance.

> but nothing is returned for port 636

Because port 636 is served by the same main dirsrv instance that is down.

> in the /var/log/slapd-PKI* or slapd-DOMAIN error files, the last error is from over a week ago, actually the last entry period is from there.
>
> [18/Sep/2013:01:09:34 -0400] slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (KDC returned error string: PROCESS_TGS)) errno 2 (No such file or directory)
>
> /var/log/krb5kdc.log shows
> Sep 30 12:22:24 badserver krb5kdc[32063](info): AS_REQ (4 etypes {18 17 16 23}) ip: LOOKING_UP_CLIENT: ad...@example.com for krbtgt/example@example.com, Server error
>
> a service ipa restart ALWAYS fixes it.

The directory server instance is down, so the LDAP server is not accessible, so the Kerberos KDC cannot read the data which is only in LDAP, so it denies access.

> Any guidance/advice/docs to read would be greatly appreciated! The fact that it seems to be so random and the other 5 ipa servers are working great makes it even more frustrating!

Look at the directory server's logs in /var/log/dirsrv/slapd-DOMAIN/errors to see why it refused to start.

--
/ Alexander Bokovoy
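The netstat check quoted above was misleading because `grep 389` also matches 7389. A stricter check compares the port at the end of the local address exactly. The sketch below assumes the usual `netstat -lpn` column layout (local address in the fourth field); `listens_on_port` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `netstat -lpn` output on stdin and exits 0 only
# if a LISTEN socket has a local address ending in exactly ":<port>".
listens_on_port() {
    awk -v port="$1" '
        /LISTEN/ {
            # The local address is the 4th column; the port follows the last ":".
            n = split($4, parts, ":")
            if (n > 1 && parts[n] == port) { found = 1 }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `netstat -lpn | listens_on_port 389` would fail on the server described here, since only the CA instance on 7389 was listening.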
Re: [Freeipa-users] Server randomly will stop accepting krb requests
Well I feel silly for not checking this earlier. You were correct.

Sep 18 01:09:35 freeipa1 kernel: : ns-slapd[16553]: segfault at 4 ip 0041227a sp 7fb9d15edc68 error 4 in ns-slapd[40+53000]

I am installing the 389-ds-base-debuginfo and accompanying packages now, restarting ipa, enabling core dumps in the kernel and changing core file size to unlimited. Will see what happens next!

Thanks!

-----Original Message-----
From: Rob Crittenden rcrit...@redhat.com
Sent: Monday, September 30, 2013 1:13pm
To: Andrew Tranquada andrew.tranqu...@rackspace.com, Alexander Bokovoy aboko...@redhat.com
Cc: freeipa-users@redhat.com
Subject: Re: [Freeipa-users] Server randomly will stop accepting krb requests

Andrew Tranquada wrote:
> -----Original Message-----
> From: Alexander Bokovoy aboko...@redhat.com
> Sent: Monday, September 30, 2013 12:47pm
>
> Look at directory server's logs to see what was the reason for refusing starting up in /var/log/dirsrv/slapd-DOMAIN/errors.

I'd look for evidence in /var/log/messages of ns-slapd core dumping.

rob
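Rob's suggestion amounts to searching the system log for kernel segfault reports that name ns-slapd. A minimal sketch, assuming the kernel message format shown in this thread (`slapd_segfaults` is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: reads syslog lines on stdin and prints kernel
# segfault reports for the ns-slapd process.
slapd_segfaults() {
    grep -E 'kernel: .*ns-slapd\[[0-9]+\]: segfault at '
}
```

Typical use would be `slapd_segfaults < /var/log/messages`; the timestamp on a matching line can then be compared with the last entry in the dirsrv errors log.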
[Freeipa-users] Replicas
Hello everyone. Is there a limit to the number of replicas you may have? Are there any documents detailing scaling limits for freeIPA? Thanks!
Re: [Freeipa-users] Replicas
Awesome thank you.

From: Rob Crittenden [rcrit...@redhat.com]
Sent: Tuesday, May 14, 2013 10:05 AM
To: Andrew Tranquada; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] Replicas

Andrew Tranquada wrote:
> Hello everyone.
> Is there a limit to the number of replicas you may have? Are there any documents detailing scaling limits for freeIPA?

The maximum number of masters tested is 20. There is nothing in the code to prevent more, and there are users that have more.

For scaling and performance I'd start with the 389-ds documentation.

rob
Re: [Freeipa-users] Replicas
understood thank you

From: Simo Sorce [sso...@redhat.com]
Sent: Tuesday, May 14, 2013 10:54 AM
To: Andrew Tranquada
Cc: Rob Crittenden; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] Replicas

----- Original Message -----
> Awesome thank you.

Note: we recommend no more than 4 replication agreements per master, so you should create a topology keeping this in mind (i.e., do not make 19 servers all have a replication agreement with 1).

Simo.

> From: Rob Crittenden [rcrit...@redhat.com]
> Sent: Tuesday, May 14, 2013 10:05 AM
> To: Andrew Tranquada; freeipa-users@redhat.com
> Subject: Re: [Freeipa-users] Replicas
>
> Andrew Tranquada wrote:
> > Hello everyone.
> > Is there a limit to the number of replicas you may have? Are there any documents detailing scaling limits for freeIPA?
>
> The maximum number of masters tested is 20. There is nothing in the code to prevent more, and there are users that have more.
>
> For scaling and performance I'd start with the 389-ds documentation.
>
> rob

--
Simo Sorce * Red Hat, Inc. * New York
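Simo's guideline of at most 4 replication agreements per master can be checked against a planned topology. The sketch below assumes a purely hypothetical input format of one agreement per line, written as two host names separated by whitespace; how that list is produced is left open, and `overloaded_masters` is not an IPA command.

```shell
#!/bin/sh
# Hypothetical helper: reads "hostA hostB" agreement pairs on stdin and prints
# every host that takes part in more than the allowed number of agreements
# (default 4), followed by its agreement count.
overloaded_masters() {
    awk -v max="${1:-4}" '
        NF >= 2 { count[$1]++; count[$2]++ }   # an agreement counts for both ends
        END { for (h in count) if (count[h] > max) print h, count[h] }
    ' | sort
}
```

A hub-and-spoke layout is the case the guideline warns about: a single hub with five spokes is reported as `hub 5`, while a ring of any size prints nothing because every server has only two agreements.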