On 10/26/2017 02:49 PM, Jeffrey Eaton wrote: > I can try to build out a RHEL7 system with 1.3 and see if I can > reproduce it there with my data set as well. I'll let you know what I > find. Great! > > I created issue 49429 with more or less the same details as below. Thank you! > > -jeaton > >> On Oct 26, 2017, at 12:50 PM, Mark Reynolds <[email protected] >> <mailto:[email protected]>> wrote: >> >> >> >> On 10/26/2017 12:16 PM, Jeffrey Eaton wrote: >>> I recently updated to 389-ds-base-1.2.11.15-91.el6_9.x86_64 (from >>> 389-ds-base-1.2.11.15-50.el6_6.x86_64) and have uncovered a bug with >>> large paged searches blocking all other searches on the server. I >>> tried to access https://pagure.io/389-ds-base/new_issue to open a >>> bug there, but its giving a 500 error. >> Its up and running now, so if you don't mind please file a ticket. >> >> But... 1.2.11 is no longer under development - meaning no new >> releases are being done. >> >> So we will see if we can reproduce this on 1.3.6, and fix it there if >> it's still an issue. >> >> Tickets of interest between 1.2.11.15-91 <--> 1.2.11.15-50: >> >> 1.2.11.15-75 --- Paged results search returns the blank list of >> entries (DS 48808) >> 1.2.11.15-71 --- SimplePagedResults -- in the search error case, >> simple paged results slot was not released (DS 48375) >> 1.2.11.15-69 --- SimplePagedResults -- abandon could happen between >> the abandon check and sending results -- Fixing a regression >> introduced in 1.2.11.15-68 (DS 48338) >> 1.2.11.15-68 --- SimplePagedResults -- abandon could happen between >> the abandon check and sending results (DS 48338) >> 1.2.11.15-64 --- pagedresults - when timed out, search results could >> have been already freed. (DS 48299) >> >> >> Regards, >> Mark >>> >>> The case I've discovered is as follows: >>> >>> I have a search which returns about 28,000 entries. The system >>> which generates this search does a paged search with 1000 results >>> per page. I can replicate the behavior with a command line ldap >>> search like the following >>> >>> # ldapsearch -H ldaps://servername:636 -D >>> 'uid=binddn,ou=specials,dc=cmu,dc=edu' -x -W -E pr=1000/noprompt >>> "(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))", >>> >>> As soon as the first 1000 entries are returned, and the search >>> requests the second page, all other searches being done to the same >>> server block. >>> >>> Specifically, I run a loop doing a trivial search: >>> >>> # while true; do time ldapsearch -h servername -x uid=jeaton; sleep >>> 1 ; done >>> >>> This search will return in a fraction of a second: >>> >>> real0m0.031s >>> user0m0.009s >>> sys0m0.006s >>> >>> until the server starts working on the second (and any subsequent >>> pages). When that happens, the trivial search blocks for the entire >>> time that it takes to generate the 1000 entries for the page: >>> >>> real0m45.907s >>> user0m0.009s >>> sys0m0.008s >>> >>> everything looks good in the access log at first: >>> >>> [26/Oct/2017:10:10:02 -0400] conn=202655 fd=85 slot=85 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=2 UNBIND >>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=2 fd=85 closed - U1 >>> [26/Oct/2017:10:10:03 -0400] conn=202656 fd=85 slot=85 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=2 UNBIND >>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=2 fd=85 closed - U1 >>> [26/Oct/2017:10:10:04 -0400] conn=202657 fd=85 slot=85 SSL >>> connection from 10.0.0.2 to 10.1.1.1 >>> [26/Oct/2017:10:10:04 -0400] conn=202657 TLS1.0 256-bit AES >>> [26/Oct/2017:10:10:04 -0400] conn=202657 op=0 BIND >>> dn="uid=binddn,ou=specials,dc=cmu,dc=edu" method=128 version=3 >>> [26/Oct/2017:10:10:04 -0400] conn=202657 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="uid=binddn,ou=specials,dc=cmu,dc=edu" >>> [26/Oct/2017:10:10:04 -0400] conn=202657 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 >>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))" >>> attrs=ALL >>> [26/Oct/2017:10:10:05 -0400] conn=202658 fd=86 slot=86 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=2 UNBIND >>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=2 fd=86 closed - U1 >>> [26/Oct/2017:10:10:06 -0400] conn=202659 fd=86 slot=86 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=2 UNBIND >>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=2 fd=86 closed - U1 >>> >>> >>> until we get to the point where the first page is returned and the >>> client requests the second page of results: >>> >>> >>> [26/Oct/2017:10:10:46 -0400] conn=202697 fd=86 slot=86 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=2 UNBIND >>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=2 fd=86 closed - U1 >>> [26/Oct/2017:10:10:47 -0400] conn=202698 fd=86 slot=86 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=2 UNBIND >>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=2 fd=86 closed - U1 >>> [26/Oct/2017:10:10:48 -0400] conn=202657 op=1 RESULT err=0 tag=101 >>> nentries=1000 etime=44 notes=P pr_idx=0 >>> [26/Oct/2017:10:10:48 -0400] conn=202657 op=2 SRCH >>> base="dc=cmu,dc=edu" scope=2 >>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))" >>> attrs=ALL >>> [26/Oct/2017:10:10:48 -0400] conn=202699 fd=86 slot=86 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:11:33 -0400] conn=202657 op=2 RESULT err=0 tag=101 >>> nentries=1000 etime=45 notes=P pr_idx=0 >>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:11:33 -0400] conn=202657 op=3 SRCH >>> base="dc=cmu,dc=edu" scope=2 >>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))" >>> attrs=ALL >>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=2 UNBIND >>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=2 fd=86 closed - U1 >>> [26/Oct/2017:10:12:14 -0400] conn=202657 op=3 RESULT err=0 tag=101 >>> nentries=1000 etime=41 notes=P pr_idx=0 >>> [26/Oct/2017:10:12:14 -0400] conn=202700 fd=86 slot=86 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:12:14 -0400] conn=202657 op=4 SRCH >>> base="dc=cmu,dc=edu" scope=2 >>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))" >>> attrs=ALL >>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=2 UNBIND >>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=2 fd=86 closed - U1 >>> [26/Oct/2017:10:12:58 -0400] conn=202657 op=4 RESULT err=0 tag=101 >>> nentries=1000 etime=44 notes=P pr_idx=0 >>> >>> >>> Note the large gap in time between 10:10:48 and 10:11:33. You can >>> see my client doing the trivial search connect at 10:10:48 with >>> conn=202699. At that point the client just hangs, waiting on the >>> server's response. Only after the server responds with the second >>> page of results at 10:11:33, is my client able to send the bind >>> request and search. This same pattern happens with every further >>> page of searches, eg, 10:11:33 - 10:12:14. >>> >>> Then when we get to the final page, as soon as that completes, the >>> trivial searches once again complete at one per second: >>> >>> [26/Oct/2017:10:25:30 -0400] conn=202657 op=28 SRCH >>> base="dc=cmu,dc=edu" scope=2 >>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))" >>> attrs=ALL >>> [26/Oct/2017:10:25:30 -0400] conn=202737 op=0 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:25:30 -0400] conn=202738 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:25:30 -0400] conn=202738 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:25:30 -0400] conn=202738 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:25:58 -0400] conn=202738 op=2 UNBIND >>> [26/Oct/2017:10:25:58 -0400] conn=202740 fd=89 slot=89 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:25:58 -0400] conn=202657 op=28 RESULT err=0 tag=101 >>> nentries=1000 etime=28 notes=P pr_idx=0 >>> [26/Oct/2017:10:25:58 -0400] conn=202737 op=1 UNBIND >>> [26/Oct/2017:10:25:58 -0400] conn=202737 op=1 fd=87 closed - U1 >>> [26/Oct/2017:10:25:58 -0400] conn=202738 op=2 fd=88 closed - U1 >>> [26/Oct/2017:10:25:58 -0400] conn=202740 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:25:58 -0400] conn=202657 op=29 SRCH >>> base="dc=cmu,dc=edu" scope=2 >>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))" >>> attrs=ALL >>> [26/Oct/2017:10:25:58 -0400] conn=202740 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:25:58 -0400] conn=202740 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:25:58 -0400] conn=202740 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:26:08 -0400] conn=202741 fd=87 slot=87 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:26:08 -0400] conn=202740 op=2 UNBIND >>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:26:08 -0400] conn=202740 op=2 fd=89 closed - U1 >>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:26:08 -0400] conn=202657 op=29 RESULT err=0 tag=101 >>> nentries=455 etime=10 notes=P pr_idx=0 >>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:26:08 -0400] conn=202657 op=30 UNBIND >>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=2 UNBIND >>> [26/Oct/2017:10:26:08 -0400] conn=202657 op=30 fd=85 closed - U1 >>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=2 fd=87 closed - U1 >>> [26/Oct/2017:10:26:09 -0400] conn=202742 fd=85 slot=85 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=2 UNBIND >>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=2 fd=85 closed - U1 >>> [26/Oct/2017:10:26:10 -0400] conn=202743 fd=85 slot=85 connection >>> from 10.0.0.1 to 10.1.1.1 >>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=0 BIND dn="" method=128 >>> version=3 >>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=0 RESULT err=0 tag=97 >>> nentries=0 etime=0 dn="" >>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=1 SRCH >>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL >>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=1 RESULT err=0 tag=101 >>> nentries=1 etime=0 >>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=2 UNBIND >>> >>> >>> I did some testing and think that the bug was introduced >>> around 389-ds-base-1.2.11.15-72.el6_7. I can probably upgrade one >>> by one and find it specifically if desired. >>> >>> >>> >>> >>> _______________________________________________ >>> 389-devel mailing list -- [email protected] >>> To unsubscribe send an email to [email protected] >> >> > > > > _______________________________________________ > 389-devel mailing list -- [email protected] > To unsubscribe send an email to [email protected]
_______________________________________________ 389-devel mailing list -- [email protected] To unsubscribe send an email to [email protected]
