On Wed, Dec 21, 2011 at 16:43, Simo Sorce <s...@redhat.com> wrote:
> On Wed, 2011-12-21 at 15:33 -0500, Dan Scott wrote:
>> On Wed, Dec 21, 2011 at 14:10, Dan Scott <danieljamessc...@gmail.com> wrote:
>> > On Mon, Dec 19, 2011 at 15:26, Dan Scott <danieljamessc...@gmail.com> wrote:
>> >> On Mon, Dec 19, 2011 at 14:14, Simo Sorce <s...@redhat.com> wrote:
>> >>> On Mon, 2011-12-19 at 11:01 -0500, Dan Scott wrote:
>> >>>> On Thu, Dec 15, 2011 at 11:51, Rich Megginson <rmegg...@redhat.com> wrote:
>> >>>> > On 12/15/2011 09:48 AM, Dan Scott wrote:
>> >>>> >>
>> >>>> >> Hi,
>> >>>> >>
>> >>>> >> On Thu, Dec 15, 2011 at 10:58, Rich Megginson <rmegg...@redhat.com> wrote:
>> >>>> >>>
>> >>>> >>> On 12/15/2011 08:41 AM, Dan Scott wrote:
>> >>>> >>>>
>> >>>> >>>> Hi,
>> >>>> >>>>
>> >>>> >>>> On my Fedora 15 FreeIPA server, I'm having some problems with
>> >>>> >>>> stability. The server appears to 'hang' and stops responding to
>> >>>> >>>> LDAP lookups. When I restart the dirsrv service, I get:
>> >>>> >>>>
>> >>>> >>>> Dec 15 09:40:02 ohm kernel: [254566.011404] ns-slapd[28910]: segfault at 17d ip 00007f00dbc0208c sp 00007fff929b7848 error 4 in libc-2.14.so[7f00dbb87000+18f000]
>> >>>> >>>>
>> >>>> >>>> and the /var/log/dirsrv/slapd-EXAMPLE-COM/errors contains
>> >>>> >>>>
>> >>>> >>>> [15/Dec/2011:09:47:35 -0500] set_krb5_creds - Could not get initial credentials for principal [ldap/example....@example.com] in keytab [WRFILE:/etc/dirsrv/ds.keytab]: -1765328228 (Cannot contact any KDC for requested realm)
>> >>>> >>>> [15/Dec/2011:09:47:35 -0500] slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credentials cache file '/tmp/krb5cc_496' not found))
>> >>>> >>>>
>> >>>> >>>> This is happening very frequently, I'm having to restart the dirsrv
>> >>>> >>>> process once an hour, otherwise people start complaining.
>> >>>> >>>>
>> >>>> >>>> I experienced similar problems with FreeIPA 1, when I was using Fedora
>> >>>> >>>> 14 and earlier, and had to regularly (also once per hour) restart the
>> >>>> >>>> dirsrv process. Could this be related?
>> >>>> >>>>
>> >>>> >>>> I also noticed this:
>> >>>> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=730387
>> >>>> >>>>
>> >>>> >>>> There are updates in 'updates-testing' which I believe fix the above
>> >>>> >>>> issue, but I'm reluctant to install from a testing repo on my
>> >>>> >>>> production server, can anyone report any feedback on this?
>> >>>> >>>
>> >>>> >>> The above bug does not cause a segfault.
>> >>>> >>> What version of 389-ds-base are you using?
>> >>>> >>
>> >>>> >> [root@ohm ~]# rpm -qa|grep 389
>> >>>> >> 389-ds-base-libs-1.2.10-0.4.a4.fc15.x86_64
>> >>>> >> 389-ds-base-1.2.10-0.4.a4.fc15.x86_64
>> >>>> >> [root@ohm ~]#
>> >>>> >
>> >>>> > a4 is alpha software. Not sure how that got released to stable.
>> >>>> >
>> >>>> >>> Please enable the collection of core dumps so we can debug the
>> >>>> >>> crash - see
>> >>>> >>> http://directory.fedoraproject.org/wiki/FAQ#Debugging_Crashes
>> >>>> >>
>> >>>> >> OK. I think there is a small typo in the instructions:
>> >>>> >>
>> >>>> >> 'debuginfo-install 389-ds-base-debuginfo' should be
>> >>>> >> 'debuginfo-install 389-ds-base'
>> >>>> >
>> >>>> > Thanks. Fixed.
>> >>>> >
>> >>>> >> I managed to get the core dump (attached - so I only sent this message
>> >>>> >> to you, not the list as well), but it doesn't contain much
>> >>>> >> information.
>> >>>> >
>> >>>> > This is https://bugzilla.redhat.com/show_bug.cgi?id=755725
>> >>>> >
>> >>>> > Will be fixed in 1.2.10.a6
>> >>>> >
>> >>>> > But this still doesn't explain your kerberos errors.
>> >>>>
>> >>>> An additional problem is also occurring. I've been finding that the:
>> >>>>
>> >>>> /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif
>> >>>>
>> >>>> file is empty and prevents dirsrv from starting. I can restore it from
>> >>>> dse.ldif.bak or dse.ldif.startOK, but this may be related to the LDAP
>> >>>> problems that I'm having?
>> >>>
>> >>> This is an upgrade time problem, it should be fixed in latest packages.
>> >>> Did you recently upgrade freeipa packages if so from what version to
>> >>> what version ?
>> >>
>> >> The 0 length file doesn't appear related to upgrades. Possibly it only
>> >> happens on the first service restart after an upgrade?
>> >>
>> >> It's happened at least 4 times since the last freeipa package upgrade
>> >> on 4th November, so it seems to be happening too regularly to be the
>> >> result of an upgrade.
>> >>
>> >> [root@curie ~]# grep freeipa /var/log/yum.log
>> >> Sep 06 16:56:51 Installed: freeipa-python-2.0.1-2.fc15.x86_64
>> >> Sep 06 17:00:13 Installed: freeipa-client-2.0.1-2.fc15.x86_64
>> >> Sep 06 17:00:14 Installed: freeipa-admintools-2.0.1-2.fc15.x86_64
>> >> Sep 06 17:01:52 Installed: freeipa-server-selinux-2.0.1-2.fc15.x86_64
>> >> Sep 06 17:01:56 Installed: freeipa-server-2.0.1-2.fc15.x86_64
>> >> Sep 08 11:23:35 Updated: freeipa-python-2.1.0-1.fc15.x86_64
>> >> Sep 08 11:23:41 Updated: freeipa-client-2.1.0-1.fc15.x86_64
>> >> Sep 08 11:23:41 Updated: freeipa-admintools-2.1.0-1.fc15.x86_64
>> >> Sep 08 11:25:00 Updated: freeipa-server-selinux-2.1.0-1.fc15.x86_64
>> >> Sep 08 11:26:06 Updated: freeipa-server-2.1.0-1.fc15.x86_64
>> >> Nov 04 15:46:43 Updated: freeipa-python-2.1.3-2.fc15.x86_64
>> >> Nov 04 15:52:48 Updated: freeipa-client-2.1.3-2.fc15.x86_64
>> >> Nov 04 15:52:48 Updated: freeipa-admintools-2.1.3-2.fc15.x86_64
>> >> Nov 04 15:54:47 Updated: freeipa-server-2.1.3-2.fc15.x86_64
>> >> Nov 04 15:56:02 Updated: freeipa-server-selinux-2.1.3-2.fc15.x86_64
>> >>
>> >> Dan
>> >
>> > I'm still having fairly serious problems. I keep getting:
>> >
>> > ipa: ERROR: Kerberos error: Kerberos error: ('Unspecified GSS failure. Minor code may provide more information', 851968)/('Cannot contact any KDC for requested realm', -1765328228)/
>> >
>> > Whenever I try and run IPA commands on either of my servers, or a
>> > client with the admin tools installed.
>> >
>> > The server logs contain:
>> >
>> > slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: error -1 (Can't contact LDAP server) ((null))
>> > slapi_ldap_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: error -1 (Can't contact LDAP server)
>> >
>> > And I can't create new replicas because they fail with:
>> >
>> > 2011-12-21 11:25:58,356 DEBUG Failed to start replication
>> >   File "/usr/sbin/ipa-replica-install", line 484, in <module>
>> >     main()
>> >
>> >   File "/usr/sbin/ipa-replica-install", line 435, in main
>> >     ds = install_replica_ds(config)
>> >
>> >   File "/usr/sbin/ipa-replica-install", line 137, in install_replica_ds
>> >     pkcs12_info)
>> >
>> >   File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py", line 284, in create_replica
>> >     self.start_creation("Configuring directory server", 60)
>> >
>> >   File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 248, in start_creation
>> >     method()
>> >
>> >   File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py", line 297, in __setup_replica
>> >     r_bindpw=self.dm_password)
>> >
>> >   File "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py", line 694, in setup_replication
>> >     raise RuntimeError("Failed to start replication")
>> >
>> > Can someone help me? This is getting fairly serious because I can't
>> > create/modify anything and I'm worried that there will be problems
>> > with existing users soon as well.
>>
>> OK, I think I'm narrowing in on this. It looks like the replication
>> agreement is broken and the servers have got out of sync:
>
> odd
>
>> On the 'master' server (which contains the PKI dirsrv process):
>
> The PKI instance uses a different set of replication agreements so you
> can't see those agreements with ipa-replica-manage which handles only
> the IPA IdM instance.
>
>> [root@fileserver1 ~]# ipa-replica-manage list
>> fileserver1.example.com: master
>>
>> On the other server:
>>
>> [root@fileserver2 ~]# ipa-replica-manage list
>> fileserver1.example.com: master
>> fileserver2.example.com: master
>
> strange indeed.
>
>> When I try and add the missing replication:
>>
>> [root@fileserver1 ~]# ipa-replica-manage connect fileserver2.example.com
>> unexpected error: list index out of range
>>
>> Do I need to delete the replication from fileserver2?
>
> You can't remove a replication agreement if it is the only agreement you
> have. This is to avoid split-brain situations.
>
> Not sure how to handle a disappeared agreement though it's
> theoretically not possible unless you 'inadvertently' ran
> ipa-replica-manage --force del fileserver2 on fileserver1 ...

This is possible... oops. I tried a few times to add another replica
(fileserver3), which failed as I mentioned above. The replication
process got most of the way through and showed up on one of the
servers, but not the other, so I removed the replica. It's possible
that I force-removed fileserver2 by mistake.

> Can you look into cn=config and see if you have references to
> fileserver2 ?
> Maybe it is just a bug in displaying actually active replicas.

I'm using the 'jxplore' LDAP browser. My command line LDAP skills
aren't very good and I can't seem to get the Kerberos authentication
working properly. In any case, I'm having trouble authenticating
because of the problems mentioned above. I did an unauthenticated
search for cn=config on fileserver1 and got no results. In
cn=ipa,cn=etc there are:

cn=masters, which contains an entry for fileserver1, and
cn=replicas, which is empty.

Thanks,

Dan

_______________________________________________
Freeipa-users mailing list
Freeipa-users@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-users
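
For the cn=config check discussed above, a command-line equivalent might look like the sketch below. An unauthenticated search of cn=config usually comes back empty because that subtree is normally not readable anonymously, so the sketch binds as the Directory Manager with a simple bind, which also sidesteps the broken Kerberos authentication. The function name is hypothetical, and the sketch assumes the default bind DN "cn=Directory Manager" and the standard LDAP port; neither is confirmed anywhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: list the replication agreements stored under
# cn=config on one directory server.
# Assumptions (not taken from the thread): the bind DN is the default
# "cn=Directory Manager" and the server listens on the standard LDAP port.
list_replication_agreements() {
    host=$1
    # -x  simple bind (no SASL/GSSAPI, so no Kerberos ticket is needed)
    # -W  prompt for the password rather than passing it on the command line
    # -b  search base; the filter matches replication agreement entries
    # The trailing names limit the output to a few relevant attributes.
    ldapsearch -x -H "ldap://${host}" \
        -D "cn=Directory Manager" -W \
        -b "cn=config" \
        "(objectClass=nsds5ReplicationAgreement)" \
        cn nsDS5ReplicaHost nsds5replicaLastUpdateStatus
}
```

Running `list_replication_agreements fileserver1.example.com` and looking for fileserver2 in the nsDS5ReplicaHost values would show whether an agreement towards fileserver2 still exists. Because a simple bind sends the password unencrypted, it is probably best run on the server itself.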