Re: [Freeipa-users] 2.20 dirsrv memory usage
On Wed, Jul 18, 2012 at 2:26 PM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 05:09 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 1:52 PM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 04:27 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 1:06 PM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 03:45 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 12:28 PM, John Dennis jden...@redhat.com wrote: On 07/18/2012 02:59 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 6:45 AM, Petr Vobornik pvobo...@redhat.com wrote: On 07/17/2012 11:43 PM, Stephen Ingram wrote: 8-- I'm beginning to think this is just the Web UI itself instead of 389 although it is really difficult to tell. I've poured over the debug logs and didn't see anything that caused me concern. It's certainly usable, but I just got really spoiled by the unbelievable quickness of 2.1.3. When your release notes indicate it should be faster, what are you comparing it to? Maybe this only happens with upgraded instances and not fresh installs. It is always possible something didn't get upgraded properly but I've done 2.1.3 - 2.2.0 upgrades and haven't seen this. When we say something is faster we're always referring to the previous version (or versions). Maybe I was just lucky with 2.1.3. On a first load it might take some time to load the frame as I call it. But the data would load almost instantaneously from there (certainly no more than 1 s) as you moved from page to page. Here, even if I return to the same page, the system acts as if the data is begin fetched for the very first time as it is no faster than the first load. Maybe that is significant to the problem? I think the culprit is Web UI paging capabilities introduced in 2.2. With lot of users, responses might grow in size. You can check their size and duration in browser developers tools. I suggest chrome/chromium - press F12 and choose 'network' tab. This new feature can't be disabled in configuration. To test if the slowdown is done by paging you can (at own risk) replace line /usr/share/ipa/ui/facet.js:538 that.pagination = spec.pagination === undefined ? true : spec.pagination; with: that.pagination = false; Note: It will break some other parts of the UI - so for testing only. I've made the substitution in the code (was line 507 for me-do I have a different version?). Looking at the time chart in Chrome I see that the bulk of the time is for /ipa/session waiting. Would waiting mean waiting for the directory server or memcached? Actually neither, it means waiting for a response from the web server (technically it's making an RPC call via HTTP Ajax). The RPC call needs to go through the web server, memcached, and typically will invoke one or more directory server queries, and run a bunch of Python to massage everything before the RPC returns with the result. It doesn't look like you've got much difference in times between with pagination on and pagination off. I don't know the pagination code but I suspect it's run after the RPC call returns so the RPC timing is not telling us much with respect to that. Waiting for up to 3 seconds for an RPC call does seem on the high side. Do you have a lot of LDAP data? No. 49 users, 17 hosts, 25 services, 6 DNS zones, only 1 of which has any significant amount of hosts in it. But really, unless we get timing results for each component we're grasping at straws :-( Understood. Steve Do you have a replica and does this replica behave the same? No replica yet. I wanted to get the memory leak issue solved first. All I have to compare to is the old 2.1.3 before. This one is much slower. I just can't seem to figure out what's wrong. The upgrade seemed to complete successfully and there were no errors in the log. The only things I've found thus far (earlier in this thread) are the unindexed entries (all hosts entries) that Rich seemed to think might be slowing things up. As the slowness is on every page, I wouldn't think that would be the problem. I wouldn't have said as much about this were it not for the promised faster speed mentioned in the release notes. It's comparable to the old 2.0 release candidates so I thought it might have been due to the complexity of the feature additions. Steve Is the time correct on this system? Yes. HW clock is GMT and localtime is Pacific Daylight Time and it is accurate to the minute. Steve Can you check the Kerberos logs whether the ldap service ticket is acquired on every http request via browser? That looks to be the case. I tailed the log while accessing each page and got: Jul 18 22:43:15 ipa1.example.com krb5kdc[1243](info): TGS_REQ (4 etypes {18 17 16 23}) xxx.xx.xx.xx: ISSUE: authtime 1342640481, etypes {rep=18 tkt=18 ses=18}, HTTP/ipa1.example@example.com for ldap/ipa1.example@example.com Jul 18 22:43:15 ipa1.example.com krb5kdc[1243](info): ... CONSTRAINED-DELEGATION
Re: [Freeipa-users] 2.20 dirsrv memory usage
On 07/18/2012 08:59 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 6:45 AM, Petr Vobornik pvobo...@redhat.com wrote: On 07/17/2012 11:43 PM, Stephen Ingram wrote: 8-- I'm beginning to think this is just the Web UI itself instead of 389 although it is really difficult to tell. I've poured over the debug logs and didn't see anything that caused me concern. It's certainly usable, but I just got really spoiled by the unbelievable quickness of 2.1.3. When your release notes indicate it should be faster, what are you comparing it to? Maybe this only happens with upgraded instances and not fresh installs. It is always possible something didn't get upgraded properly but I've done 2.1.3 - 2.2.0 upgrades and haven't seen this. When we say something is faster we're always referring to the previous version (or versions). Maybe I was just lucky with 2.1.3. On a first load it might take some time to load the frame as I call it. But the data would load almost instantaneously from there (certainly no more than 1 s) as you moved from page to page. Here, even if I return to the same page, the system acts as if the data is begin fetched for the very first time as it is no faster than the first load. Maybe that is significant to the problem? I think the culprit is Web UI paging capabilities introduced in 2.2. With lot of users, responses might grow in size. You can check their size and duration in browser developers tools. I suggest chrome/chromium - press F12 and choose 'network' tab. This new feature can't be disabled in configuration. To test if the slowdown is done by paging you can (at own risk) replace line /usr/share/ipa/ui/facet.js:538 that.pagination = spec.pagination === undefined ? true : spec.pagination; with: that.pagination = false; Note: It will break some other parts of the UI - so for testing only. I've made the substitution in the code (was line 507 for me-do I have a different version?). I was looking at the top of FreeIPA 2.2 branch. RHEL version differs a bit. It shouldn't matter in this case though. Looking at the time chart in Chrome I see that the bulk of the time is for /ipa/session waiting. Would waiting mean waiting for the directory server or memcached? Basically all the stuff, which is needed for processing of the request. The pipeline is something like (I don't want to go into details): httpd - mod_wsgi - python - memcache, dir server request and back. From what I see I think the problem is not on Web UI side. Most of the time is waiting for server response. I initially thought the problem lies in a large number of users (1000+). But in other post you mentioned that it is under 100. Hence this new paging feature shouldn't be a big problem. Here's a portion of the initial load of the Users page: The first 3 requests are inital load of web UI (not including .js files and such) - you can see they are the same in both cases. I don't see a login request so session is already established. json/ipa/session POST 200 Success application/json jquery.js:7365 Script 33.94KB 33.10KB 1.57s 1.47s 96ms (1.37s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 568.09KB 564.36KB 3.92s 2.95s 963ms (2.85s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 556.94KB 553.40KB 3.78s 2.94s 836ms (2.83s waiting) This one is user load: json/ipa/session POST 200 Success application/json jquery.js:7365 Script 46.93KB 46.38KB 1.87s 1.71s (1.60s waiting) Now, with the pagination turned back on: json/ipa/session POST 200 Success application/json jquery.js:7365 Script 33.94KB 33.10KB 1.58s 1.48s 100ms (1.38s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 568.09KB 564.36KB 4.05s 3.09s 964ms (2.98s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 556.94KB 553.40KB 3.84s 2.99s 855ms (2.88s waiting) Here you can see a change. With pagination turned on it should be two request. One to get primary keys (logins) and second to get users. The latter is missing in this list. With a lot of users the first response grows. With low number of users it is in fact smaller than with pagination turned off. json/ipa/session POST 200 Success application/json jquery.js:7365 Script 46.93KB 46.38KB 1.52s 1.51s (1.40s waiting) As I mentioned you are missing one request so it will add aprox. 1.5s. The second request is kinda a slowdown from IPA 2.1.4 but the main issue is still the long server processing time. Steve -- Petr Vobornik ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On Thu, 19 Jul 2012, John Dennis wrote: Rob may have already contacted you with this, but if not we would like to get more debugging information by have the server log what is occurring when it processes your requests. To do this you'll need to turn on the debug flag in the IPA configuration file /etc/ipa/default.conf, add a line that says: debug = True Then restart the server to pick up the new configuration. The information will be written to /var/log/httpd/error_log. We only need the contents of the log from when the server was restarted with debug logging enabled. For privacy reasons I suggest you send the contents of the log to one of the IPA team members directly in a private email, not to the public freeipa list. In addition we would like to see what's happening with krb5 communication under httpd processes. In order to obtain that tracing information you need to do following: 1. Add KRB5_TRACE=/tmp/http_krb5_trace.log to /etc/sysconfig/httpd 2. Restart httpd (or httpd.service in Fedora) 3. Now you need to create the file and chown it to apache's user so that httpd processes would be able to write to it: find out PID of any of httpd processes, doesn't matter which one touch /proc/$PID/cwd/tmp/http_krb5_trace.log chown apache /proc/$PID/cwd/tmp/http_krb5_trace.log 4. Now you can issue IPA commands and you'll get krb5 client tracing in /proc/$PID/cwd/tmp/http_krb5_trace.log The reason why (3) talks about PID of httpd process is because in Fedora, unlike in RHEL6.x, systemd is handling services startup and systemd confines httpd to a private /tmp. Using /proc/$PID/cwd/tmp is the easiest way to reach that private tmp. 5. Once finished and copied /proc/$PID/cwd/tmp/http_krb5_trace.log to an archive location, make sure to remove the file and its reference from /etc/sysconfig/httpd and restart the service. -- / Alexander Bokovoy ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On 07/19/2012 02:29 AM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 2:26 PM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 05:09 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 1:52 PM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 04:27 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 1:06 PM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 03:45 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 12:28 PM, John Dennis jden...@redhat.com wrote: On 07/18/2012 02:59 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 6:45 AM, Petr Vobornik pvobo...@redhat.com wrote: On 07/17/2012 11:43 PM, Stephen Ingram wrote: 8-- I'm beginning to think this is just the Web UI itself instead of 389 although it is really difficult to tell. I've poured over the debug logs and didn't see anything that caused me concern. It's certainly usable, but I just got really spoiled by the unbelievable quickness of 2.1.3. When your release notes indicate it should be faster, what are you comparing it to? Maybe this only happens with upgraded instances and not fresh installs. It is always possible something didn't get upgraded properly but I've done 2.1.3 - 2.2.0 upgrades and haven't seen this. When we say something is faster we're always referring to the previous version (or versions). Maybe I was just lucky with 2.1.3. On a first load it might take some time to load the frame as I call it. But the data would load almost instantaneously from there (certainly no more than 1 s) as you moved from page to page. Here, even if I return to the same page, the system acts as if the data is begin fetched for the very first time as it is no faster than the first load. Maybe that is significant to the problem? I think the culprit is Web UI paging capabilities introduced in 2.2. With lot of users, responses might grow in size. You can check their size and duration in browser developers tools. I suggest chrome/chromium - press F12 and choose 'network' tab. This new feature can't be disabled in configuration. To test if the slowdown is done by paging you can (at own risk) replace line /usr/share/ipa/ui/facet.js:538 that.pagination = spec.pagination === undefined ? true : spec.pagination; with: that.pagination = false; Note: It will break some other parts of the UI - so for testing only. I've made the substitution in the code (was line 507 for me-do I have a different version?). Looking at the time chart in Chrome I see that the bulk of the time is for /ipa/session waiting. Would waiting mean waiting for the directory server or memcached? Actually neither, it means waiting for a response from the web server (technically it's making an RPC call via HTTP Ajax). The RPC call needs to go through the web server, memcached, and typically will invoke one or more directory server queries, and run a bunch of Python to massage everything before the RPC returns with the result. It doesn't look like you've got much difference in times between with pagination on and pagination off. I don't know the pagination code but I suspect it's run after the RPC call returns so the RPC timing is not telling us much with respect to that. Waiting for up to 3 seconds for an RPC call does seem on the high side. Do you have a lot of LDAP data? No. 49 users, 17 hosts, 25 services, 6 DNS zones, only 1 of which has any significant amount of hosts in it. But really, unless we get timing results for each component we're grasping at straws :-( Understood. Steve Do you have a replica and does this replica behave the same? No replica yet. I wanted to get the memory leak issue solved first. All I have to compare to is the old 2.1.3 before. This one is much slower. I just can't seem to figure out what's wrong. The upgrade seemed to complete successfully and there were no errors in the log. The only things I've found thus far (earlier in this thread) are the unindexed entries (all hosts entries) that Rich seemed to think might be slowing things up. As the slowness is on every page, I wouldn't think that would be the problem. I wouldn't have said as much about this were it not for the promised faster speed mentioned in the release notes. It's comparable to the old 2.0 release candidates so I thought it might have been due to the complexity of the feature additions. Steve Is the time correct on this system? Yes. HW clock is GMT and localtime is Pacific Daylight Time and it is accurate to the minute. Steve Can you check the Kerberos logs whether the ldap service ticket is acquired on every http request via browser? That looks to be the case. I tailed the log while accessing each page and got: Jul 18 22:43:15 ipa1.example.com krb5kdc[1243](info): TGS_REQ (4 etypes {18 17 16 23}) xxx.xx.xx.xx: ISSUE: authtime 1342640481, etypes {rep=18 tkt=18 ses=18}, HTTP/ipa1.example@example.com for ldap/ipa1.example@example.com Jul 18 22:43:15
Re: [Freeipa-users] 2.20 dirsrv memory usage
On Tue, Jul 17, 2012 at 3:56 PM, John Dennis jden...@redhat.com wrote: On 07/17/2012 05:43 PM, Stephen Ingram wrote: [ details of performance analysis snipped for brevity ] I wonder if we shouldn't add some timing metrics to our code. As it is it's very hard to know where time is being spent. When I wrote the session code I added some timestamps used for managing session timeouts. It wouldn't be too hard to expand this to time how long it takes a command to execute because it's evaluated for every command. Combined with timestamping in the UI code we could get a reasonable idea of where some bottlenecks lie (or don't). I've never used this before so I'm not sure how it would work, but it sounds great. It's really difficult to tell what's causing the issue when there are so many processes occurring. Steve ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On 07/18/2012 01:53 PM, Stephen Ingram wrote: On Tue, Jul 17, 2012 at 3:56 PM, John Dennis jden...@redhat.com wrote: On 07/17/2012 05:43 PM, Stephen Ingram wrote: [ details of performance analysis snipped for brevity ] I wonder if we shouldn't add some timing metrics to our code. As it is it's very hard to know where time is being spent. When I wrote the session code I added some timestamps used for managing session timeouts. It wouldn't be too hard to expand this to time how long it takes a command to execute because it's evaluated for every command. Combined with timestamping in the UI code we could get a reasonable idea of where some bottlenecks lie (or don't). I've never used this before so I'm not sure how it would work, but it sounds great. It's really difficult to tell what's causing the issue when there are so many processes occurring. While we are going with the technical digging let us also try to collect the sufficient information about the problem. Here is some questions that would help us to reproduce the issue. 1) If the problem with every frame of just some specific UI pages? Can you for example see IPA Configuration panel or log as a self service user? Are those fast? 2) Say it is users is so how many users do you have? Is it thousands? Or may be it is a specific group? We might need to reproduce the same setup and see what is going on. Steve ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users -- Thank you, Dmitri Pal Sr. Engineering Manager for IdM portfolio Red Hat Inc. --- Looking to carve out IT costs? www.redhat.com/carveoutcosts/ ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On Wed, Jul 18, 2012 at 6:45 AM, Petr Vobornik pvobo...@redhat.com wrote: On 07/17/2012 11:43 PM, Stephen Ingram wrote: 8-- I'm beginning to think this is just the Web UI itself instead of 389 although it is really difficult to tell. I've poured over the debug logs and didn't see anything that caused me concern. It's certainly usable, but I just got really spoiled by the unbelievable quickness of 2.1.3. When your release notes indicate it should be faster, what are you comparing it to? Maybe this only happens with upgraded instances and not fresh installs. It is always possible something didn't get upgraded properly but I've done 2.1.3 - 2.2.0 upgrades and haven't seen this. When we say something is faster we're always referring to the previous version (or versions). Maybe I was just lucky with 2.1.3. On a first load it might take some time to load the frame as I call it. But the data would load almost instantaneously from there (certainly no more than 1 s) as you moved from page to page. Here, even if I return to the same page, the system acts as if the data is begin fetched for the very first time as it is no faster than the first load. Maybe that is significant to the problem? I think the culprit is Web UI paging capabilities introduced in 2.2. With lot of users, responses might grow in size. You can check their size and duration in browser developers tools. I suggest chrome/chromium - press F12 and choose 'network' tab. This new feature can't be disabled in configuration. To test if the slowdown is done by paging you can (at own risk) replace line /usr/share/ipa/ui/facet.js:538 that.pagination = spec.pagination === undefined ? true : spec.pagination; with: that.pagination = false; Note: It will break some other parts of the UI - so for testing only. I've made the substitution in the code (was line 507 for me-do I have a different version?). Looking at the time chart in Chrome I see that the bulk of the time is for /ipa/session waiting. Would waiting mean waiting for the directory server or memcached? Here's a portion of the initial load of the Users page: json/ipa/session POST 200 Success application/json jquery.js:7365 Script 33.94KB 33.10KB 1.57s 1.47s 96ms (1.37s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 568.09KB 564.36KB 3.92s 2.95s 963ms (2.85s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 556.94KB 553.40KB 3.78s 2.94s 836ms (2.83s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 46.93KB 46.38KB 1.87s 1.71s (1.60s waiting) Now, with the pagination turned back on: json/ipa/session POST 200 Success application/json jquery.js:7365 Script 33.94KB 33.10KB 1.58s 1.48s 100ms (1.38s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 568.09KB 564.36KB 4.05s 3.09s 964ms (2.98s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 556.94KB 553.40KB 3.84s 2.99s 855ms (2.88s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 46.93KB 46.38KB 1.52s 1.51s (1.40s waiting) Steve ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On 07/18/2012 02:59 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 6:45 AM, Petr Vobornik pvobo...@redhat.com wrote: On 07/17/2012 11:43 PM, Stephen Ingram wrote: 8-- I'm beginning to think this is just the Web UI itself instead of 389 although it is really difficult to tell. I've poured over the debug logs and didn't see anything that caused me concern. It's certainly usable, but I just got really spoiled by the unbelievable quickness of 2.1.3. When your release notes indicate it should be faster, what are you comparing it to? Maybe this only happens with upgraded instances and not fresh installs. It is always possible something didn't get upgraded properly but I've done 2.1.3 - 2.2.0 upgrades and haven't seen this. When we say something is faster we're always referring to the previous version (or versions). Maybe I was just lucky with 2.1.3. On a first load it might take some time to load the frame as I call it. But the data would load almost instantaneously from there (certainly no more than 1 s) as you moved from page to page. Here, even if I return to the same page, the system acts as if the data is begin fetched for the very first time as it is no faster than the first load. Maybe that is significant to the problem? I think the culprit is Web UI paging capabilities introduced in 2.2. With lot of users, responses might grow in size. You can check their size and duration in browser developers tools. I suggest chrome/chromium - press F12 and choose 'network' tab. This new feature can't be disabled in configuration. To test if the slowdown is done by paging you can (at own risk) replace line /usr/share/ipa/ui/facet.js:538 that.pagination = spec.pagination === undefined ? true : spec.pagination; with: that.pagination = false; Note: It will break some other parts of the UI - so for testing only. I've made the substitution in the code (was line 507 for me-do I have a different version?). Looking at the time chart in Chrome I see that the bulk of the time is for /ipa/session waiting. Would waiting mean waiting for the directory server or memcached? Actually neither, it means waiting for a response from the web server (technically it's making an RPC call via HTTP Ajax). The RPC call needs to go through the web server, memcached, and typically will invoke one or more directory server queries, and run a bunch of Python to massage everything before the RPC returns with the result. It doesn't look like you've got much difference in times between with pagination on and pagination off. I don't know the pagination code but I suspect it's run after the RPC call returns so the RPC timing is not telling us much with respect to that. Waiting for up to 3 seconds for an RPC call does seem on the high side. Do you have a lot of LDAP data? But really, unless we get timing results for each component we're grasping at straws :-( Here's a portion of the initial load of the Users page: json/ipa/session POST 200 Success application/json jquery.js:7365 Script 33.94KB 33.10KB 1.57s 1.47s 96ms (1.37s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 568.09KB 564.36KB 3.92s 2.95s 963ms (2.85s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 556.94KB 553.40KB 3.78s 2.94s 836ms (2.83s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 46.93KB 46.38KB 1.87s 1.71s (1.60s waiting) Now, with the pagination turned back on: json/ipa/session POST 200 Success application/json jquery.js:7365 Script 33.94KB 33.10KB 1.58s 1.48s 100ms (1.38s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 568.09KB 564.36KB 4.05s 3.09s 964ms (2.98s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 556.94KB 553.40KB 3.84s 2.99s 855ms (2.88s waiting) json/ipa/session POST 200 Success application/json jquery.js:7365 Script 46.93KB 46.38KB 1.52s 1.51s (1.40s waiting) Steve ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users -- John Dennis jden...@redhat.com Looking to carve out IT costs? www.redhat.com/carveoutcosts/ ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On Wed, Jul 18, 2012 at 10:59 AM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 01:53 PM, Stephen Ingram wrote: On Tue, Jul 17, 2012 at 3:56 PM, John Dennis jden...@redhat.com wrote: On 07/17/2012 05:43 PM, Stephen Ingram wrote: [ details of performance analysis snipped for brevity ] I wonder if we shouldn't add some timing metrics to our code. As it is it's very hard to know where time is being spent. When I wrote the session code I added some timestamps used for managing session timeouts. It wouldn't be too hard to expand this to time how long it takes a command to execute because it's evaluated for every command. Combined with timestamping in the UI code we could get a reasonable idea of where some bottlenecks lie (or don't). I've never used this before so I'm not sure how it would work, but it sounds great. It's really difficult to tell what's causing the issue when there are so many processes occurring. While we are going with the technical digging let us also try to collect the sufficient information about the problem. Here is some questions that would help us to reproduce the issue. 1) If the problem with every frame of just some specific UI pages? The frame seems to load quickly. It is the inside part that contains the data that is much slower. Can you for example see IPA Configuration panel or log as a self service user? Are those fast? I'm not sure what you mean by configuration panel, but if I login as admin or self-service user, they are both equally slow. 2) Say it is users is so how many users do you have? Is it thousands? No, only 49 users at the moment. We're still adding people. There isn't a lot of data in the directory period--another reason I'm so surprised by the slowness. Or may be it is a specific group? I notice that everyone is automatically subscribed to the ipausers group. Hasn't that been changed such that the subscription is no longer automatic? Maybe it is taking too long to enumerate that group? I can unsubscribe the users if needed. Our groups only contain 3 users on average. We might need to reproduce the same setup and see what is going on. I'm more than willing to help in any way I can. I'm even considering pulling our old 2.1.3 system from backup, but it would be difficult as this on is in production now. I switched because of the memory leak in 2.1.3. Steve ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On 07/18/2012 03:45 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 12:28 PM, John Dennis jden...@redhat.com wrote: On 07/18/2012 02:59 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 6:45 AM, Petr Vobornik pvobo...@redhat.com wrote: On 07/17/2012 11:43 PM, Stephen Ingram wrote: 8-- I'm beginning to think this is just the Web UI itself instead of 389 although it is really difficult to tell. I've poured over the debug logs and didn't see anything that caused me concern. It's certainly usable, but I just got really spoiled by the unbelievable quickness of 2.1.3. When your release notes indicate it should be faster, what are you comparing it to? Maybe this only happens with upgraded instances and not fresh installs. It is always possible something didn't get upgraded properly but I've done 2.1.3 - 2.2.0 upgrades and haven't seen this. When we say something is faster we're always referring to the previous version (or versions). Maybe I was just lucky with 2.1.3. On a first load it might take some time to load the frame as I call it. But the data would load almost instantaneously from there (certainly no more than 1 s) as you moved from page to page. Here, even if I return to the same page, the system acts as if the data is begin fetched for the very first time as it is no faster than the first load. Maybe that is significant to the problem? I think the culprit is Web UI paging capabilities introduced in 2.2. With lot of users, responses might grow in size. You can check their size and duration in browser developers tools. I suggest chrome/chromium - press F12 and choose 'network' tab. This new feature can't be disabled in configuration. To test if the slowdown is done by paging you can (at own risk) replace line /usr/share/ipa/ui/facet.js:538 that.pagination = spec.pagination === undefined ? true : spec.pagination; with: that.pagination = false; Note: It will break some other parts of the UI - so for testing only. I've made the substitution in the code (was line 507 for me-do I have a different version?). Looking at the time chart in Chrome I see that the bulk of the time is for /ipa/session waiting. Would waiting mean waiting for the directory server or memcached? Actually neither, it means waiting for a response from the web server (technically it's making an RPC call via HTTP Ajax). The RPC call needs to go through the web server, memcached, and typically will invoke one or more directory server queries, and run a bunch of Python to massage everything before the RPC returns with the result. It doesn't look like you've got much difference in times between with pagination on and pagination off. I don't know the pagination code but I suspect it's run after the RPC call returns so the RPC timing is not telling us much with respect to that. Waiting for up to 3 seconds for an RPC call does seem on the high side. Do you have a lot of LDAP data? No. 49 users, 17 hosts, 25 services, 6 DNS zones, only 1 of which has any significant amount of hosts in it. But really, unless we get timing results for each component we're grasping at straws :-( Understood. Steve ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users Do you have a replica and does this replica behave the same? -- Thank you, Dmitri Pal Sr. Engineering Manager for IdM portfolio Red Hat Inc. --- Looking to carve out IT costs? www.redhat.com/carveoutcosts/ ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On 07/18/2012 04:27 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 1:06 PM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 03:45 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 12:28 PM, John Dennis jden...@redhat.com wrote: On 07/18/2012 02:59 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 6:45 AM, Petr Vobornik pvobo...@redhat.com wrote: On 07/17/2012 11:43 PM, Stephen Ingram wrote: 8-- I'm beginning to think this is just the Web UI itself instead of 389 although it is really difficult to tell. I've poured over the debug logs and didn't see anything that caused me concern. It's certainly usable, but I just got really spoiled by the unbelievable quickness of 2.1.3. When your release notes indicate it should be faster, what are you comparing it to? Maybe this only happens with upgraded instances and not fresh installs. It is always possible something didn't get upgraded properly but I've done 2.1.3 - 2.2.0 upgrades and haven't seen this. When we say something is faster we're always referring to the previous version (or versions). Maybe I was just lucky with 2.1.3. On a first load it might take some time to load the frame as I call it. But the data would load almost instantaneously from there (certainly no more than 1 s) as you moved from page to page. Here, even if I return to the same page, the system acts as if the data is begin fetched for the very first time as it is no faster than the first load. Maybe that is significant to the problem? I think the culprit is Web UI paging capabilities introduced in 2.2. With lot of users, responses might grow in size. You can check their size and duration in browser developers tools. I suggest chrome/chromium - press F12 and choose 'network' tab. This new feature can't be disabled in configuration. To test if the slowdown is done by paging you can (at own risk) replace line /usr/share/ipa/ui/facet.js:538 that.pagination = spec.pagination === undefined ? true : spec.pagination; with: that.pagination = false; Note: It will break some other parts of the UI - so for testing only. I've made the substitution in the code (was line 507 for me-do I have a different version?). Looking at the time chart in Chrome I see that the bulk of the time is for /ipa/session waiting. Would waiting mean waiting for the directory server or memcached? Actually neither, it means waiting for a response from the web server (technically it's making an RPC call via HTTP Ajax). The RPC call needs to go through the web server, memcached, and typically will invoke one or more directory server queries, and run a bunch of Python to massage everything before the RPC returns with the result. It doesn't look like you've got much difference in times between with pagination on and pagination off. I don't know the pagination code but I suspect it's run after the RPC call returns so the RPC timing is not telling us much with respect to that. Waiting for up to 3 seconds for an RPC call does seem on the high side. Do you have a lot of LDAP data? No. 49 users, 17 hosts, 25 services, 6 DNS zones, only 1 of which has any significant amount of hosts in it. But really, unless we get timing results for each component we're grasping at straws :-( Understood. Steve ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users Do you have a replica and does this replica behave the same? No replica yet. I wanted to get the memory leak issue solved first. All I have to compare to is the old 2.1.3 before. This one is much slower. I just can't seem to figure out what's wrong. The upgrade seemed to complete successfully and there were no errors in the log. The only things I've found thus far (earlier in this thread) are the unindexed entries (all hosts entries) that Rich seemed to think might be slowing things up. As the slowness is on every page, I wouldn't think that would be the problem. I wouldn't have said as much about this were it not for the promised faster speed mentioned in the release notes. It's comparable to the old 2.0 release candidates so I thought it might have been due to the complexity of the feature additions. Steve Is the time correct on this system? -- Thank you, Dmitri Pal Sr. Engineering Manager for IdM portfolio Red Hat Inc. --- Looking to carve out IT costs? www.redhat.com/carveoutcosts/ ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On 07/18/2012 05:09 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 1:52 PM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 04:27 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 1:06 PM, Dmitri Pal d...@redhat.com wrote: On 07/18/2012 03:45 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 12:28 PM, John Dennis jden...@redhat.com wrote: On 07/18/2012 02:59 PM, Stephen Ingram wrote: On Wed, Jul 18, 2012 at 6:45 AM, Petr Vobornik pvobo...@redhat.com wrote: On 07/17/2012 11:43 PM, Stephen Ingram wrote: 8-- I'm beginning to think this is just the Web UI itself instead of 389 although it is really difficult to tell. I've poured over the debug logs and didn't see anything that caused me concern. It's certainly usable, but I just got really spoiled by the unbelievable quickness of 2.1.3. When your release notes indicate it should be faster, what are you comparing it to? Maybe this only happens with upgraded instances and not fresh installs. It is always possible something didn't get upgraded properly but I've done 2.1.3 - 2.2.0 upgrades and haven't seen this. When we say something is faster we're always referring to the previous version (or versions). Maybe I was just lucky with 2.1.3. On a first load it might take some time to load the frame as I call it. But the data would load almost instantaneously from there (certainly no more than 1 s) as you moved from page to page. Here, even if I return to the same page, the system acts as if the data is begin fetched for the very first time as it is no faster than the first load. Maybe that is significant to the problem? I think the culprit is Web UI paging capabilities introduced in 2.2. With lot of users, responses might grow in size. You can check their size and duration in browser developers tools. I suggest chrome/chromium - press F12 and choose 'network' tab. This new feature can't be disabled in configuration. To test if the slowdown is done by paging you can (at own risk) replace line /usr/share/ipa/ui/facet.js:538 that.pagination = spec.pagination === undefined ? true : spec.pagination; with: that.pagination = false; Note: It will break some other parts of the UI - so for testing only. I've made the substitution in the code (was line 507 for me-do I have a different version?). Looking at the time chart in Chrome I see that the bulk of the time is for /ipa/session waiting. Would waiting mean waiting for the directory server or memcached? Actually neither, it means waiting for a response from the web server (technically it's making an RPC call via HTTP Ajax). The RPC call needs to go through the web server, memcached, and typically will invoke one or more directory server queries, and run a bunch of Python to massage everything before the RPC returns with the result. It doesn't look like you've got much difference in times between with pagination on and pagination off. I don't know the pagination code but I suspect it's run after the RPC call returns so the RPC timing is not telling us much with respect to that. Waiting for up to 3 seconds for an RPC call does seem on the high side. Do you have a lot of LDAP data? No. 49 users, 17 hosts, 25 services, 6 DNS zones, only 1 of which has any significant amount of hosts in it. But really, unless we get timing results for each component we're grasping at straws :-( Understood. Steve ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users Do you have a replica and does this replica behave the same? No replica yet. I wanted to get the memory leak issue solved first. All I have to compare to is the old 2.1.3 before. This one is much slower. I just can't seem to figure out what's wrong. The upgrade seemed to complete successfully and there were no errors in the log. The only things I've found thus far (earlier in this thread) are the unindexed entries (all hosts entries) that Rich seemed to think might be slowing things up. As the slowness is on every page, I wouldn't think that would be the problem. I wouldn't have said as much about this were it not for the promised faster speed mentioned in the release notes. It's comparable to the old 2.0 release candidates so I thought it might have been due to the complexity of the feature additions. Steve Is the time correct on this system? Yes. HW clock is GMT and localtime is Pacific Daylight Time and it is accurate to the minute. Steve Can you check the Kerberos logs whether the ldap service ticket is acquired on every http request via browser? Also http logs might shed some light on what is going on a request. May be this would give us some hint. What you are experiencing is wrong and alarming. This is why I want to drill down to the root of the problem. -- Thank you, Dmitri Pal Sr. Engineering Manager for IdM portfolio Red Hat Inc.
Re: [Freeipa-users] 2.20 dirsrv memory usage
On Mon, Jul 16, 2012 at 12:23 PM, Rob Crittenden rcrit...@redhat.com wrote: Stephen Ingram wrote: On Mon, Jul 16, 2012 at 11:34 AM, Rich Megginson rmegg...@redhat.com wrote: On 07/16/2012 11:48 AM, Stephen Ingram wrote: On Mon, Jul 16, 2012 at 9:35 AM, Rich Megginsonrmegg...@redhat.com wrote: On 07/16/2012 10:19 AM, Stephen Ingram wrote: On Fri, Jul 13, 2012 at 6:14 AM, Rob Crittendenrcrit...@redhat.com wrote: Stephen Ingram wrote: On Thu, Jul 12, 2012 at 2:59 PM, Steven Jonessteven.jo...@vuw.ac.nz wrote: Hi, I had huge memory issues pre 6.3, now its low and flatSounds like you have an issue somewhere. My normal cpu use is a few hundred mhzbut when something goes wrong such as replication failing that climbs...ditto memory use Yes, I saw your conversation with Rich on this list about that. And, yes, 6.2 (2.1.3) was bad for me too. I'm not sure why 2.2.0 is still having issues. It was an upgrade from 2.1.3, but the upgrade seemed to complete without issue. I'm also not even doing replication yet so I'm not sure why memory is so high. Web interface is much slower too so perhaps something else is wrong. Can you tell where it is being slow? Does it seem related to retrieving data from LDAP? I'm not really sure yet what is causing the slowness. I have the same number of directory entries as before the upgrade. It was very quick with 2.1.3, but once I upgraded, I felt like I was back to the pre-2.0 days--without a doubt much, much slower. You might check your 389-ds access logs and look for searches with notes=U. Perhaps you are missing an index. Yes there are lots of notes=U. What does this mean? Was something missed in the upgrade script? Try running logconv.pl Nice! I'm guessing that notes=U are unindexed searches then. I have 34 over the last 24 hours so I'm not sure this would be causing the issue as the slowness persists through every click. Yeah, I would expect to see a lot more than 34 if that were the cause. Can you post the search filters that are unindexed? Sure, here's a partial list (sanitized): filter=(managedBy=fqdn=ec2-x.x.x.us-west-2.compute.amazonaws.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap2.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=ipa1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn All the rest are the same, just with different hosts. I've traced the unindexed searches back to the time of Web UI access and they don't match. I also don't see any other obvious errors when running logconv.pl. One strange thing I have noticed is that the 389 server logs seem to update in spurts. If I'm tailing the logs while I access a Web UI page, there is nothing, then a couple of seconds later, I see the logs quickly scroll with new entires. Has this always been the case? I don't seem to remember this before. Yes. The 389 access log is buffered, for performance reasons. Just thought it might be relevant. I'm not sure what is causing the extreme slowness. I've also shut off memcached and tried without it with no discernible difference. The directory seems to be handling the load of external queries just fine, although I'm not sure I've solved the memory issue--I'm still testing with the compat plugin disabled to see if I can stop the memory creep. Maybe it's something in the code of the Web UI itself as its even slow when changing from page to page of users and hosts. Shutting of memcached is just going to make things even slower. I really didn't see much difference so I turned it back on right away. Things you can try on a quiet system: 1. Create /etc/ipa/server.conf: [global] debug=True Restart Apache Use the UI to do a few things. Verify in the logs that the session cache is being used. Yes, it is. It's interesting, 2.2 is slower such that you can see the frame load, and then the loading symbol spins below in the display area for a few seconds while that loads up. Before, with 2.1.3, you really couldn't distinguish the two as they loaded so quickly. 2. Check your browser configuration. You don't need delegation-uris set any more. Having it set might mask other problems (you still need negotiation-auth.trusted-uris set). I forgot about this. I changed it, completely cleared the browser cache and accessed without any noticeable difference. 3. Watch what URI is being used in the Apache access log. It should be /ipa/session/json. Check. this is where it lands. I'm beginning to think this is just the Web UI itself instead of 389 although it is really difficult to tell. I've poured over the debug logs and didn't see anything that caused me concern.
Re: [Freeipa-users] 2.20 dirsrv memory usage
Stephen Ingram wrote: On Mon, Jul 16, 2012 at 12:23 PM, Rob Crittenden rcrit...@redhat.com wrote: Stephen Ingram wrote: On Mon, Jul 16, 2012 at 11:34 AM, Rich Megginson rmegg...@redhat.com wrote: On 07/16/2012 11:48 AM, Stephen Ingram wrote: On Mon, Jul 16, 2012 at 9:35 AM, Rich Megginsonrmegg...@redhat.com wrote: On 07/16/2012 10:19 AM, Stephen Ingram wrote: On Fri, Jul 13, 2012 at 6:14 AM, Rob Crittendenrcrit...@redhat.com wrote: Stephen Ingram wrote: On Thu, Jul 12, 2012 at 2:59 PM, Steven Jonessteven.jo...@vuw.ac.nz wrote: Hi, I had huge memory issues pre 6.3, now its low and flatSounds like you have an issue somewhere. My normal cpu use is a few hundred mhzbut when something goes wrong such as replication failing that climbs...ditto memory use Yes, I saw your conversation with Rich on this list about that. And, yes, 6.2 (2.1.3) was bad for me too. I'm not sure why 2.2.0 is still having issues. It was an upgrade from 2.1.3, but the upgrade seemed to complete without issue. I'm also not even doing replication yet so I'm not sure why memory is so high. Web interface is much slower too so perhaps something else is wrong. Can you tell where it is being slow? Does it seem related to retrieving data from LDAP? I'm not really sure yet what is causing the slowness. I have the same number of directory entries as before the upgrade. It was very quick with 2.1.3, but once I upgraded, I felt like I was back to the pre-2.0 days--without a doubt much, much slower. You might check your 389-ds access logs and look for searches with notes=U. Perhaps you are missing an index. Yes there are lots of notes=U. What does this mean? Was something missed in the upgrade script? Try running logconv.pl Nice! I'm guessing that notes=U are unindexed searches then. I have 34 over the last 24 hours so I'm not sure this would be causing the issue as the slowness persists through every click. Yeah, I would expect to see a lot more than 34 if that were the cause. Can you post the search filters that are unindexed? Sure, here's a partial list (sanitized): filter=(managedBy=fqdn=ec2-x.x.x.us-west-2.compute.amazonaws.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap2.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=ipa1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn All the rest are the same, just with different hosts. I've traced the unindexed searches back to the time of Web UI access and they don't match. I also don't see any other obvious errors when running logconv.pl. One strange thing I have noticed is that the 389 server logs seem to update in spurts. If I'm tailing the logs while I access a Web UI page, there is nothing, then a couple of seconds later, I see the logs quickly scroll with new entires. Has this always been the case? I don't seem to remember this before. Yes. The 389 access log is buffered, for performance reasons. Just thought it might be relevant. I'm not sure what is causing the extreme slowness. I've also shut off memcached and tried without it with no discernible difference. The directory seems to be handling the load of external queries just fine, although I'm not sure I've solved the memory issue--I'm still testing with the compat plugin disabled to see if I can stop the memory creep. Maybe it's something in the code of the Web UI itself as its even slow when changing from page to page of users and hosts. Shutting of memcached is just going to make things even slower. I really didn't see much difference so I turned it back on right away. Things you can try on a quiet system: 1. Create /etc/ipa/server.conf: [global] debug=True Restart Apache Use the UI to do a few things. Verify in the logs that the session cache is being used. Yes, it is. It's interesting, 2.2 is slower such that you can see the frame load, and then the loading symbol spins below in the display area for a few seconds while that loads up. Before, with 2.1.3, you really couldn't distinguish the two as they loaded so quickly. A lot of what gets loaded a just big javascript files. I wonder if there is a DNS problem, that would explain the slowness. The javascript and much of the UI is completely unprotected by anything. 2. Check your browser configuration. You don't need delegation-uris set any more. Having it set might mask other problems (you still need negotiation-auth.trusted-uris set). I forgot about this. I changed it, completely cleared the browser cache and accessed without any noticeable difference. 3. Watch what URI is being used in the Apache access log. It should be /ipa/session/json. Check. this is where it lands. I'm beginning to think this
Re: [Freeipa-users] 2.20 dirsrv memory usage
On Tue, Jul 17, 2012 at 2:01 PM, Rob Crittenden rcrit...@redhat.com wrote: Stephen Ingram wrote: On Mon, Jul 16, 2012 at 12:23 PM, Rob Crittenden rcrit...@redhat.com wrote: Stephen Ingram wrote: On Mon, Jul 16, 2012 at 11:34 AM, Rich Megginson rmegg...@redhat.com wrote: On 07/16/2012 11:48 AM, Stephen Ingram wrote: On Mon, Jul 16, 2012 at 9:35 AM, Rich Megginsonrmegg...@redhat.com wrote: On 07/16/2012 10:19 AM, Stephen Ingram wrote: On Fri, Jul 13, 2012 at 6:14 AM, Rob Crittendenrcrit...@redhat.com wrote: Stephen Ingram wrote: On Thu, Jul 12, 2012 at 2:59 PM, Steven Jonessteven.jo...@vuw.ac.nz wrote: Hi, I had huge memory issues pre 6.3, now its low and flatSounds like you have an issue somewhere. My normal cpu use is a few hundred mhzbut when something goes wrong such as replication failing that climbs...ditto memory use Yes, I saw your conversation with Rich on this list about that. And, yes, 6.2 (2.1.3) was bad for me too. I'm not sure why 2.2.0 is still having issues. It was an upgrade from 2.1.3, but the upgrade seemed to complete without issue. I'm also not even doing replication yet so I'm not sure why memory is so high. Web interface is much slower too so perhaps something else is wrong. Can you tell where it is being slow? Does it seem related to retrieving data from LDAP? I'm not really sure yet what is causing the slowness. I have the same number of directory entries as before the upgrade. It was very quick with 2.1.3, but once I upgraded, I felt like I was back to the pre-2.0 days--without a doubt much, much slower. You might check your 389-ds access logs and look for searches with notes=U. Perhaps you are missing an index. Yes there are lots of notes=U. What does this mean? Was something missed in the upgrade script? Try running logconv.pl Nice! I'm guessing that notes=U are unindexed searches then. I have 34 over the last 24 hours so I'm not sure this would be causing the issue as the slowness persists through every click. Yeah, I would expect to see a lot more than 34 if that were the cause. Can you post the search filters that are unindexed? Sure, here's a partial list (sanitized): filter=(managedBy=fqdn=ec2-x.x.x.us-west-2.compute.amazonaws.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap2.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=ipa1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn All the rest are the same, just with different hosts. I've traced the unindexed searches back to the time of Web UI access and they don't match. I also don't see any other obvious errors when running logconv.pl. One strange thing I have noticed is that the 389 server logs seem to update in spurts. If I'm tailing the logs while I access a Web UI page, there is nothing, then a couple of seconds later, I see the logs quickly scroll with new entires. Has this always been the case? I don't seem to remember this before. Yes. The 389 access log is buffered, for performance reasons. Just thought it might be relevant. I'm not sure what is causing the extreme slowness. I've also shut off memcached and tried without it with no discernible difference. The directory seems to be handling the load of external queries just fine, although I'm not sure I've solved the memory issue--I'm still testing with the compat plugin disabled to see if I can stop the memory creep. Maybe it's something in the code of the Web UI itself as its even slow when changing from page to page of users and hosts. Shutting of memcached is just going to make things even slower. I really didn't see much difference so I turned it back on right away. Things you can try on a quiet system: 1. Create /etc/ipa/server.conf: [global] debug=True Restart Apache Use the UI to do a few things. Verify in the logs that the session cache is being used. Yes, it is. It's interesting, 2.2 is slower such that you can see the frame load, and then the loading symbol spins below in the display area for a few seconds while that loads up. Before, with 2.1.3, you really couldn't distinguish the two as they loaded so quickly. A lot of what gets loaded a just big javascript files. I wonder if there is a DNS problem, that would explain the slowness. The javascript and much of the UI is completely unprotected by anything. Hmmm. DNS issues? What sort of things would I be looking for? ipa1.example.com resolves both ways both from IPA and outside nameserver that doesn't connect at all with IPA. Would there be anything else? Could a DNS conflict within the DNS portion of IPA cause an issue?
Re: [Freeipa-users] 2.20 dirsrv memory usage
On 07/17/2012 05:43 PM, Stephen Ingram wrote: [ details of performance analysis snipped for brevity ] I wonder if we shouldn't add some timing metrics to our code. As it is it's very hard to know where time is being spent. When I wrote the session code I added some timestamps used for managing session timeouts. It wouldn't be too hard to expand this to time how long it takes a command to execute because it's evaluated for every command. Combined with timestamping in the UI code we could get a reasonable idea of where some bottlenecks lie (or don't). -- John Dennis jden...@redhat.com Looking to carve out IT costs? www.redhat.com/carveoutcosts/ ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On Mon, Jul 16, 2012 at 11:34 AM, Rich Megginson rmegg...@redhat.com wrote: On 07/16/2012 11:48 AM, Stephen Ingram wrote: On Mon, Jul 16, 2012 at 9:35 AM, Rich Megginsonrmegg...@redhat.com wrote: On 07/16/2012 10:19 AM, Stephen Ingram wrote: On Fri, Jul 13, 2012 at 6:14 AM, Rob Crittendenrcrit...@redhat.com wrote: Stephen Ingram wrote: On Thu, Jul 12, 2012 at 2:59 PM, Steven Jonessteven.jo...@vuw.ac.nz wrote: Hi, I had huge memory issues pre 6.3, now its low and flatSounds like you have an issue somewhere. My normal cpu use is a few hundred mhzbut when something goes wrong such as replication failing that climbs...ditto memory use Yes, I saw your conversation with Rich on this list about that. And, yes, 6.2 (2.1.3) was bad for me too. I'm not sure why 2.2.0 is still having issues. It was an upgrade from 2.1.3, but the upgrade seemed to complete without issue. I'm also not even doing replication yet so I'm not sure why memory is so high. Web interface is much slower too so perhaps something else is wrong. Can you tell where it is being slow? Does it seem related to retrieving data from LDAP? I'm not really sure yet what is causing the slowness. I have the same number of directory entries as before the upgrade. It was very quick with 2.1.3, but once I upgraded, I felt like I was back to the pre-2.0 days--without a doubt much, much slower. You might check your 389-ds access logs and look for searches with notes=U. Perhaps you are missing an index. Yes there are lots of notes=U. What does this mean? Was something missed in the upgrade script? Try running logconv.pl Nice! I'm guessing that notes=U are unindexed searches then. I have 34 over the last 24 hours so I'm not sure this would be causing the issue as the slowness persists through every click. Yeah, I would expect to see a lot more than 34 if that were the cause. Can you post the search filters that are unindexed? Sure, here's a partial list (sanitized): filter=(managedBy=fqdn=ec2-x.x.x.us-west-2.compute.amazonaws.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap2.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=ipa1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn All the rest are the same, just with different hosts. I've traced the unindexed searches back to the time of Web UI access and they don't match. I also don't see any other obvious errors when running logconv.pl. One strange thing I have noticed is that the 389 server logs seem to update in spurts. If I'm tailing the logs while I access a Web UI page, there is nothing, then a couple of seconds later, I see the logs quickly scroll with new entires. Has this always been the case? I don't seem to remember this before. Yes. The 389 access log is buffered, for performance reasons. Just thought it might be relevant. I'm not sure what is causing the extreme slowness. I've also shut off memcached and tried without it with no discernible difference. The directory seems to be handling the load of external queries just fine, although I'm not sure I've solved the memory issue--I'm still testing with the compat plugin disabled to see if I can stop the memory creep. Maybe it's something in the code of the Web UI itself as its even slow when changing from page to page of users and hosts. Steve ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On Mon, 2012-07-16 at 12:11 -0700, Stephen Ingram wrote: On Mon, Jul 16, 2012 at 11:34 AM, Rich Megginson rmegg...@redhat.com wrote: On 07/16/2012 11:48 AM, Stephen Ingram wrote: On Mon, Jul 16, 2012 at 9:35 AM, Rich Megginsonrmegg...@redhat.com wrote: On 07/16/2012 10:19 AM, Stephen Ingram wrote: On Fri, Jul 13, 2012 at 6:14 AM, Rob Crittendenrcrit...@redhat.com wrote: Stephen Ingram wrote: On Thu, Jul 12, 2012 at 2:59 PM, Steven Jonessteven.jo...@vuw.ac.nz wrote: Hi, I had huge memory issues pre 6.3, now its low and flatSounds like you have an issue somewhere. My normal cpu use is a few hundred mhzbut when something goes wrong such as replication failing that climbs...ditto memory use Yes, I saw your conversation with Rich on this list about that. And, yes, 6.2 (2.1.3) was bad for me too. I'm not sure why 2.2.0 is still having issues. It was an upgrade from 2.1.3, but the upgrade seemed to complete without issue. I'm also not even doing replication yet so I'm not sure why memory is so high. Web interface is much slower too so perhaps something else is wrong. Can you tell where it is being slow? Does it seem related to retrieving data from LDAP? I'm not really sure yet what is causing the slowness. I have the same number of directory entries as before the upgrade. It was very quick with 2.1.3, but once I upgraded, I felt like I was back to the pre-2.0 days--without a doubt much, much slower. You might check your 389-ds access logs and look for searches with notes=U. Perhaps you are missing an index. Yes there are lots of notes=U. What does this mean? Was something missed in the upgrade script? Try running logconv.pl Nice! I'm guessing that notes=U are unindexed searches then. I have 34 over the last 24 hours so I'm not sure this would be causing the issue as the slowness persists through every click. Yeah, I would expect to see a lot more than 34 if that were the cause. Can you post the search filters that are unindexed? Sure, here's a partial list (sanitized): filter=(managedBy=fqdn=ec2-x.x.x.us-west-2.compute.amazonaws.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap2.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=ipa1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn All the rest are the same, just with different hosts. I've traced the unindexed searches back to the time of Web UI access and they don't match. I also don't see any other obvious errors when running logconv.pl. One strange thing I have noticed is that the 389 server logs seem to update in spurts. If I'm tailing the logs while I access a Web UI page, there is nothing, then a couple of seconds later, I see the logs quickly scroll with new entires. Has this always been the case? I don't seem to remember this before. Yes. The 389 access log is buffered, for performance reasons. Just thought it might be relevant. I'm not sure what is causing the extreme slowness. I've also shut off memcached and tried without it with no discernible difference. The directory seems to be handling the load of external queries just fine, although I'm not sure I've solved the memory issue--I'm still testing with the compat plugin disabled to see if I can stop the memory creep. Maybe it's something in the code of the Web UI itself as its even slow when changing from page to page of users and hosts. Looks like the symptoms of not using session cookies. Do you see constant activity getting tickets for ldap/ipa.server.fqdn in the krb5kdc.log ? Simo. -- Simo Sorce * Red Hat, Inc * New York ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
Stephen Ingram wrote: On Mon, Jul 16, 2012 at 11:34 AM, Rich Megginson rmegg...@redhat.com wrote: On 07/16/2012 11:48 AM, Stephen Ingram wrote: On Mon, Jul 16, 2012 at 9:35 AM, Rich Megginsonrmegg...@redhat.com wrote: On 07/16/2012 10:19 AM, Stephen Ingram wrote: On Fri, Jul 13, 2012 at 6:14 AM, Rob Crittendenrcrit...@redhat.com wrote: Stephen Ingram wrote: On Thu, Jul 12, 2012 at 2:59 PM, Steven Jonessteven.jo...@vuw.ac.nz wrote: Hi, I had huge memory issues pre 6.3, now its low and flatSounds like you have an issue somewhere. My normal cpu use is a few hundred mhzbut when something goes wrong such as replication failing that climbs...ditto memory use Yes, I saw your conversation with Rich on this list about that. And, yes, 6.2 (2.1.3) was bad for me too. I'm not sure why 2.2.0 is still having issues. It was an upgrade from 2.1.3, but the upgrade seemed to complete without issue. I'm also not even doing replication yet so I'm not sure why memory is so high. Web interface is much slower too so perhaps something else is wrong. Can you tell where it is being slow? Does it seem related to retrieving data from LDAP? I'm not really sure yet what is causing the slowness. I have the same number of directory entries as before the upgrade. It was very quick with 2.1.3, but once I upgraded, I felt like I was back to the pre-2.0 days--without a doubt much, much slower. You might check your 389-ds access logs and look for searches with notes=U. Perhaps you are missing an index. Yes there are lots of notes=U. What does this mean? Was something missed in the upgrade script? Try running logconv.pl Nice! I'm guessing that notes=U are unindexed searches then. I have 34 over the last 24 hours so I'm not sure this would be causing the issue as the slowness persists through every click. Yeah, I would expect to see a lot more than 34 if that were the cause. Can you post the search filters that are unindexed? Sure, here's a partial list (sanitized): filter=(managedBy=fqdn=ec2-x.x.x.us-west-2.compute.amazonaws.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=imap2.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn filter=(managedBy=fqdn=ipa1.example.com,cn=computers,cn=accounts,dc=example,dc=com) attrs=fqdn All the rest are the same, just with different hosts. I've traced the unindexed searches back to the time of Web UI access and they don't match. I also don't see any other obvious errors when running logconv.pl. One strange thing I have noticed is that the 389 server logs seem to update in spurts. If I'm tailing the logs while I access a Web UI page, there is nothing, then a couple of seconds later, I see the logs quickly scroll with new entires. Has this always been the case? I don't seem to remember this before. Yes. The 389 access log is buffered, for performance reasons. Just thought it might be relevant. I'm not sure what is causing the extreme slowness. I've also shut off memcached and tried without it with no discernible difference. The directory seems to be handling the load of external queries just fine, although I'm not sure I've solved the memory issue--I'm still testing with the compat plugin disabled to see if I can stop the memory creep. Maybe it's something in the code of the Web UI itself as its even slow when changing from page to page of users and hosts. Shutting of memcached is just going to make things even slower. Things you can try on a quiet system: 1. Create /etc/ipa/server.conf: [global] debug=True Restart Apache Use the UI to do a few things. Verify in the logs that the session cache is being used. 2. Check your browser configuration. You don't need delegation-uris set any more. Having it set might mask other problems (you still need negotiation-auth.trusted-uris set). 3. Watch what URI is being used in the Apache access log. It should be /ipa/session/json. rob ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
Stephen Ingram wrote: On Thu, Jul 12, 2012 at 2:59 PM, Steven Jones steven.jo...@vuw.ac.nz wrote: Hi, I had huge memory issues pre 6.3, now its low and flatSounds like you have an issue somewhere. My normal cpu use is a few hundred mhzbut when something goes wrong such as replication failing that climbs...ditto memory use Yes, I saw your conversation with Rich on this list about that. And, yes, 6.2 (2.1.3) was bad for me too. I'm not sure why 2.2.0 is still having issues. It was an upgrade from 2.1.3, but the upgrade seemed to complete without issue. I'm also not even doing replication yet so I'm not sure why memory is so high. Web interface is much slower too so perhaps something else is wrong. Can you tell where it is being slow? Does it seem related to retrieving data from LDAP? You might check your 389-ds access logs and look for searches with notes=U. Perhaps you are missing an index. rob ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users
Re: [Freeipa-users] 2.20 dirsrv memory usage
On 07/12/2012 06:55 PM, Stephen Ingram wrote: On Thu, Jul 12, 2012 at 3:41 PM, Dmitri Pal d...@redhat.com wrote: On 07/12/2012 06:19 PM, Stephen Ingram wrote: On Thu, Jul 12, 2012 at 3:10 PM, Stephen Ingram sbing...@gmail.com wrote: On Thu, Jul 12, 2012 at 2:59 PM, Steven Jones steven.jo...@vuw.ac.nz wrote: Hi, I had huge memory issues pre 6.3, now its low and flatSounds like you have an issue somewhere. My normal cpu use is a few hundred mhzbut when something goes wrong such as replication failing that climbs...ditto memory use Yes, I saw your conversation with Rich on this list about that. And, yes, 6.2 (2.1.3) was bad for me too. I'm not sure why 2.2.0 is still having issues. It was an upgrade from 2.1.3, but the upgrade seemed to complete without issue. I'm also not even doing replication yet so I'm not sure why memory is so high. Web interface is much slower too so perhaps something else is wrong. Oops, I meant Rob, not Rich. Do you use any things exposed via compat tree? Do you have a lot of modifications that affect the data that is exposed via this tree? I suspect that the leak is somewhere there. Try turning off the things that you do not use if there are any. I only query cn=users,cn=accounts,dc=example,dc=com and cn=groups,cn=accounts,dc=example,dc=com containers for mail info and use for Kerberos auth. There are only very infrequent mods to those trees previously mentioned via the Web UI. Almost all activity is reads, but lots of them for the mail servers (validating users, etc.). I plan to use replication and DNS, but not using now. From this, I don't think I'm using the compat tree. Would turning it off help anyway? Steve Please use documented tools to disable it. Do not do it manually via LDAP. -- Thank you, Dmitri Pal Sr. Engineering Manager IPA project, Red Hat Inc. --- Looking to carve out IT costs? www.redhat.com/carveoutcosts/ ___ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users