OK, so we tried to implement this again, and as soon as we put on a server that authenticates heavily, the IPA server came to its knees again. This time I was able to watch it closely and troubleshoot a lot more, and I also know exactly which server caused it (Mercurial with the help of Bamboo). This runs fine against a normal old OpenLDAP server. The user logs in very frequently, and each time it logs in I can see in the logs that the krbLastsuccessfullogin parameter (or whatever it is called) is updated over and over in the changelog (/var/lib/dirsrv/slapd-$instanceid/db). Those logs fill VERY quickly and then disappear fairly quickly as well.
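One rough way to put a number on that write rate is to tally operations in the 389-ds access log. The sketch below is Python and assumes the usual access-log line shape ([timestamp] conn=N op=N OPNAME ...). The function name count_operations and the sample lines are made up for illustration and are not taken from this server.

```python
import re
from collections import Counter

# Assumed line shape: "[timestamp] conn=N op=N OPNAME ..." (389-ds style).
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+conn=\d+\s+op=\d+\s+(?P<op>[A-Z]+)\b"
)


def count_operations(lines):
    """Return (requests_by_type, mods_per_second) for an iterable of log lines."""
    ops = Counter()
    mods_per_second = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # connection open/close lines and anything unrecognised
        op = match.group("op")
        if op == "RESULT":
            continue  # count requests only, not their result lines
        ops[op] += 1
        if op == "MOD":
            # Timestamps have one-second resolution, so they serve as buckets.
            mods_per_second[match.group("ts")] += 1
    return ops, mods_per_second


# Hypothetical sample lines, for illustration only.
SAMPLE = [
    '[07/Aug/2013:16:08:01 -0400] conn=5 op=0 BIND dn="uid=hguser,cn=users,cn=accounts,dc=example,dc=com" method=128 version=3',
    '[07/Aug/2013:16:08:01 -0400] conn=5 op=0 RESULT err=0 tag=97 nentries=0 etime=0',
    '[07/Aug/2013:16:08:01 -0400] conn=6 op=1 MOD dn="uid=hguser,cn=users,cn=accounts,dc=example,dc=com"',
    '[07/Aug/2013:16:08:01 -0400] conn=7 op=1 MOD dn="uid=hguser,cn=users,cn=accounts,dc=example,dc=com"',
    '[07/Aug/2013:16:08:02 -0400] conn=8 op=2 SRCH base="dc=example,dc=com" scope=2 filter="(uid=hguser)" attrs=ALL',
]

if __name__ == "__main__":
    ops, mods = count_operations(SAMPLE)
    print(dict(ops))
    print("peak MODs in one second:", max(mods.values(), default=0))
```

Fed the real access log from /var/log/dirsrv/slapd-$instanceid/, this would show whether MOD operations per second track the login rate of the Mercurial user.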

Issue 1: This is causing severe disk latency, which obviously slows everything down. Wait times were around 25%+.

Issue 2: These changes need to be replicated to my slave server, which adds to the mess.

My question is: why does the IPA server fail to keep up with the load when the OpenLDAP server didn't have an issue? Indexes?

I'm running the following:

CentOS release 6.4 (Final)
389-ds-base-126.96.36.199-20.el6_4.x86_64
389-ds-base-libs-188.8.131.52-20.el6_4.x86_64
ipa-python-3.0.0-26.el6_4.4.x86_64
ipa-admintools-3.0.0-26.el6_4.4.x86_64
ipa-pki-common-theme-9.0.3-7.el6.noarch
python-iniparse-0.3.1-2.1.el6.noarch
ipa-server-3.0.0-26.el6_4.4.x86_64
ipa-pki-ca-theme-9.0.3-7.el6.noarch
ipa-server-selinux-3.0.0-26.el6_4.4.x86_64
libipa_hbac-1.9.2-82.7.el6_4.x86_64
ipa-client-3.0.0-26.el6_4.4.x86_64
libipa_hbac-python-1.9.2-82.7.el6_4.x86_64

So I've implemented this server anyway (against my better judgement, given these issues) and just made the user that logs into Mercurial a local user instead of an IPA user. Also note that before I did that, for fun, I put the change logs on a RAM disk. That dropped the wait time to 0 (except for bursts where it would rise to 30 to write the access log), but the CPU went to 100% trying to keep up with the load. I have also killed the replication.

Any help would be appreciated.

Thanks,
_____________________________________________________
John Moyer
Director, IT Operations

On Aug 7, 2013, at 4:08 PM, John Moyer <john.mo...@digitalreasoning.com> wrote:

>
> Thanks,
> _____________________________________________________
> John Moyer
> Director, IT Operations
> Digital Reasoning Systems, Inc.
> john.mo...@digitalreasoning.com
> Office: 703.678.2311
> Mobile: 240.460.0023
> Fax: 703.678.2312
> www.digitalreasoning.com
>
> On Aug 6, 2013, at 10:57 AM, Rich Megginson <rmegg...@redhat.com> wrote:
>
>> On 08/05/2013 09:17 PM, John Moyer wrote:
>>> Hello,
>>>
>>> So I've been preparing my infrastructure for a big change from an older
>>> openldap system to a nice new IPA server. I have a redundant secondary
>>> server and snapshots taken daily. I populated all my user data into IPA,
>>> and gave the users a week to set a password. They all did this and the big
>>> switch was this past weekend. We had done previous tests on each server
>>> and it all worked. We switched this past weekend and it worked great.
>>>
>>> This morning a light load hit it (since I've only put a small fraction of
>>> our servers on it about 15) and the primary came to its knees.
>>
>> What platform? What version of ipa? What version of 389-ds-base?
>>
>> What was the nature of the load? Search requests? Update requests?
>> Updates from replication?
>>
>> The logconv.pl tool can be used to analyze the 389-ds-base access logs.
>>
>> During this time of the load, are there any errors in the errors log?
>>
>>> Processor spiked, and logs started to fill (didn't fill at this point).
>>
>> I'm not sure what you mean by "logs started to fill (didn't fill at this
>> point)."
>>
>>> I then decided it's probably a glitch (I'm an optimist) so I restarted
>>> IPA services. They all restarted except for named which crashed (which
>>> then caused everything to stop). I looked and now the disk was full.
>>
>> Which directory contained the files that caused the disk to become full?
>> /var/log? /var/lib? Somewhere else?
>>
>>> So I trash the logs (had no easy place to put them at the time which I
>>> regret now) and I restart the services again.
>>
>> What do you mean by "trash the logs"?
>>
>>> IPA fully crashes now (didn't even start the DIRSRV for my domain).
>>
>> Which component of IPA is crashing? If it is dirsrv that is refusing to
>> start, is it crashing? What's in /var/log/dirsrv/slapd-*/errors?
>>
>> If it is crashing, we will need a core file and/or stack trace - see
>> http://port389.org/wiki/FAQ#Debugging_Crashes
>>
>>>
>>> So here are my questions:
>>>
>>> 1. Any idea what caused this? Any performance issues that have been seen?
>>
>> It could be almost anything given the above information.
>>
>>>
>>> 2. Are the connection settings for IPA good out of the box? I ask
>>> because in RHDS (in the first versions I used) the default connection
>>> timeouts were a MAJOR issue,
>>
>> How so? Details?
>>
>>> I used to run a network of 400 servers and I had to set the time-outs
>>> to >30sec which made my servers run really really well,
>>
>> Exactly which timeout settings are you talking about?
>>
>>> but if I used the 60 min defaults they also would come to their knees. Is
>>> there a buried setting like this? (However, I must admit there didn't seem
>>> like there were a lot of connections like when I had the issue with the 400
>>> servers years ago).
>>>
>>> Also is there an easy place to set log rotation settings? (If it's log
>>> rotate just let me know, I just don't want to step on an internal app
>>> rotate).
>>
>> IPA does not provide a GUI nor a command line utility for managing 389
>> logging settings.
>>
>> This document gives an example of how logs are managed using the RHDS GUI
>> (not available when using IPA).
>>
>> https://access.redhat.com/site/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Monitoring_Server_and_Database_Activity.html#types-of-log-files
>>
>> However, all of these correspond to settings which you can set via
>> ldapmodify:
>>
>> https://access.redhat.com/site/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Configuration_Command_and_File_Reference/Core_Server_Configuration_Reference.html#cnconfig-nsslapd_accesslog_logexpirationtime_Access_Log_Expiration_Time
>>
>> There are several attributes which control access log rotation parameters.
>>
>>>
>>> Thanks,
>>> _____________________________________________________
>>> John Moyer
>>> Director, IT Operations
>>>
>>>
>>> _______________________________________________
>>> Freeipa-users mailing list
>>> Freeipaemail@example.com
>>> https://www.redhat.com/mailman/listinfo/freeipa-users