On 10/26/2017 02:49 PM, Jeffrey Eaton wrote:
> I can try to build out a RHEL7 system with 1.3 and see if I can
> reproduce it there with my data set as well.  I'll let you know what I
> find.
Great!
>
> I created issue 49429 with more or less the same details as below.
Thank you!
>
> -jeaton
>
>> On Oct 26, 2017, at 12:50 PM, Mark Reynolds <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>
>>
>> On 10/26/2017 12:16 PM, Jeffrey Eaton wrote:
>>> I recently updated to 389-ds-base-1.2.11.15-91.el6_9.x86_64 (from
>>> 389-ds-base-1.2.11.15-50.el6_6.x86_64) and have uncovered a bug with
>>> large paged searches blocking all other searches on the server.  I
>>> tried to access https://pagure.io/389-ds-base/new_issue to open a
>>> bug there, but its giving a 500 error.
>> Its up and running now, so if you don't mind please file a ticket.
>>
>> But... 1.2.11 is no longer under development - meaning no new
>> releases are being done.  
>>
>> So we will see if we can reproduce this on 1.3.6, and fix it there if
>> it's still an issue.
>>
>> Tickets of interest between 1.2.11.15-91  <-->   1.2.11.15-50:
>>
>> 1.2.11.15-75 --- Paged results search returns the blank list of
>> entries (DS 48808)
>> 1.2.11.15-71 --- SimplePagedResults -- in the search error case,
>> simple paged results slot was not released (DS 48375)
>> 1.2.11.15-69 --- SimplePagedResults -- abandon could happen between
>> the abandon check and sending results -- Fixing a regression
>> introduced in 1.2.11.15-68 (DS 48338)
>> 1.2.11.15-68 --- SimplePagedResults -- abandon could happen between
>> the abandon check and sending results (DS 48338)
>> 1.2.11.15-64 --- pagedresults - when timed out, search results could
>> have been already freed. (DS 48299)
>>
>>
>> Regards,
>> Mark
>>>
>>> The case I've discovered is as follows:
>>>
>>> I have a search which returns about 28,000 entries.  The system
>>> which generates this search does a paged search with 1000 results
>>> per page.  I can replicate the behavior with a command line ldap
>>> search like the following
>>>
>>> # ldapsearch -H ldaps://servername:636 -D
>>> 'uid=binddn,ou=specials,dc=cmu,dc=edu' -x -W -E pr=1000/noprompt
>>> "(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))",
>>>
>>> As soon as the first 1000 entries are returned, and the search
>>> requests the second page, all other searches being done to the same
>>> server block.
>>>
>>> Specifically, I run a loop doing a trivial search:
>>>
>>> # while true; do time ldapsearch -h servername -x uid=jeaton; sleep
>>> 1 ; done
>>>
>>> This search will return in a fraction of a second:
>>>
>>> real0m0.031s
>>> user0m0.009s
>>> sys0m0.006s
>>>
>>> until the server starts working on the second (and any subsequent
>>> pages).  When that happens, the trivial search blocks for the entire
>>> time that it takes to generate the 1000 entries for the page:
>>>
>>> real0m45.907s
>>> user0m0.009s
>>> sys0m0.008s
>>>
>>> everything looks good in the access log at first:
>>>
>>> [26/Oct/2017:10:10:02 -0400] conn=202655 fd=85 slot=85 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=0 BIND dn="" method=128
>>> version=3 
>>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=2 UNBIND
>>> [26/Oct/2017:10:10:02 -0400] conn=202655 op=2 fd=85 closed - U1
>>> [26/Oct/2017:10:10:03 -0400] conn=202656 fd=85 slot=85 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=2 UNBIND
>>> [26/Oct/2017:10:10:04 -0400] conn=202656 op=2 fd=85 closed - U1
>>> [26/Oct/2017:10:10:04 -0400] conn=202657 fd=85 slot=85 SSL
>>> connection from 10.0.0.2 to 10.1.1.1
>>> [26/Oct/2017:10:10:04 -0400] conn=202657 TLS1.0 256-bit AES
>>> [26/Oct/2017:10:10:04 -0400] conn=202657 op=0 BIND
>>> dn="uid=binddn,ou=specials,dc=cmu,dc=edu" method=128 version=3
>>> [26/Oct/2017:10:10:04 -0400] conn=202657 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn="uid=binddn,ou=specials,dc=cmu,dc=edu"
>>> [26/Oct/2017:10:10:04 -0400] conn=202657 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2
>>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))"
>>> attrs=ALL
>>> [26/Oct/2017:10:10:05 -0400] conn=202658 fd=86 slot=86 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=2 UNBIND
>>> [26/Oct/2017:10:10:05 -0400] conn=202658 op=2 fd=86 closed - U1
>>> [26/Oct/2017:10:10:06 -0400] conn=202659 fd=86 slot=86 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=2 UNBIND
>>> [26/Oct/2017:10:10:06 -0400] conn=202659 op=2 fd=86 closed - U1
>>>
>>>
>>> until we get to the point where the first page is returned and the
>>> client requests the second page of results:
>>>
>>>
>>> [26/Oct/2017:10:10:46 -0400] conn=202697 fd=86 slot=86 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=2 UNBIND
>>> [26/Oct/2017:10:10:46 -0400] conn=202697 op=2 fd=86 closed - U1
>>> [26/Oct/2017:10:10:47 -0400] conn=202698 fd=86 slot=86 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=2 UNBIND
>>> [26/Oct/2017:10:10:47 -0400] conn=202698 op=2 fd=86 closed - U1
>>> [26/Oct/2017:10:10:48 -0400] conn=202657 op=1 RESULT err=0 tag=101
>>> nentries=1000 etime=44 notes=P pr_idx=0
>>> [26/Oct/2017:10:10:48 -0400] conn=202657 op=2 SRCH
>>> base="dc=cmu,dc=edu" scope=2
>>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))"
>>> attrs=ALL
>>> [26/Oct/2017:10:10:48 -0400] conn=202699 fd=86 slot=86 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:11:33 -0400] conn=202657 op=2 RESULT err=0 tag=101
>>> nentries=1000 etime=45 notes=P pr_idx=0
>>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:11:33 -0400] conn=202657 op=3 SRCH
>>> base="dc=cmu,dc=edu" scope=2
>>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))"
>>> attrs=ALL
>>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=2 UNBIND
>>> [26/Oct/2017:10:11:33 -0400] conn=202699 op=2 fd=86 closed - U1
>>> [26/Oct/2017:10:12:14 -0400] conn=202657 op=3 RESULT err=0 tag=101
>>> nentries=1000 etime=41 notes=P pr_idx=0
>>> [26/Oct/2017:10:12:14 -0400] conn=202700 fd=86 slot=86 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:12:14 -0400] conn=202657 op=4 SRCH
>>> base="dc=cmu,dc=edu" scope=2
>>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))"
>>> attrs=ALL
>>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=2 UNBIND
>>> [26/Oct/2017:10:12:14 -0400] conn=202700 op=2 fd=86 closed - U1
>>> [26/Oct/2017:10:12:58 -0400] conn=202657 op=4 RESULT err=0 tag=101
>>> nentries=1000 etime=44 notes=P pr_idx=0
>>>
>>>
>>> Note the large gap in time between 10:10:48 and 10:11:33.  You can
>>> see my client doing the trivial search connect at 10:10:48 with
>>> conn=202699.  At that point the client just hangs, waiting on the
>>> server's response.  Only after the server responds with the second
>>> page of results at 10:11:33, is my client able to send the bind
>>> request and search.   This same pattern happens with every further
>>> page of searches, eg, 10:11:33 - 10:12:14.
>>>
>>> Then when we get to the final page, as soon as that completes, the
>>> trivial searches once again complete at one per second:
>>>
>>> [26/Oct/2017:10:25:30 -0400] conn=202657 op=28 SRCH
>>> base="dc=cmu,dc=edu" scope=2
>>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))"
>>> attrs=ALL
>>> [26/Oct/2017:10:25:30 -0400] conn=202737 op=0 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:25:30 -0400] conn=202738 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:25:30 -0400] conn=202738 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:25:30 -0400] conn=202738 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:25:58 -0400] conn=202738 op=2 UNBIND
>>> [26/Oct/2017:10:25:58 -0400] conn=202740 fd=89 slot=89 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:25:58 -0400] conn=202657 op=28 RESULT err=0 tag=101
>>> nentries=1000 etime=28 notes=P pr_idx=0
>>> [26/Oct/2017:10:25:58 -0400] conn=202737 op=1 UNBIND
>>> [26/Oct/2017:10:25:58 -0400] conn=202737 op=1 fd=87 closed - U1
>>> [26/Oct/2017:10:25:58 -0400] conn=202738 op=2 fd=88 closed - U1
>>> [26/Oct/2017:10:25:58 -0400] conn=202740 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:25:58 -0400] conn=202657 op=29 SRCH
>>> base="dc=cmu,dc=edu" scope=2
>>> filter="(&(isMemberOf=cn=GroupName,ou=groups,dc=cmu,dc=edu)(guid=*)(objectClass=inetorgperson))"
>>> attrs=ALL
>>> [26/Oct/2017:10:25:58 -0400] conn=202740 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:25:58 -0400] conn=202740 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:25:58 -0400] conn=202740 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:26:08 -0400] conn=202741 fd=87 slot=87 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:26:08 -0400] conn=202740 op=2 UNBIND
>>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:26:08 -0400] conn=202740 op=2 fd=89 closed - U1
>>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:26:08 -0400] conn=202657 op=29 RESULT err=0 tag=101
>>> nentries=455 etime=10 notes=P pr_idx=0
>>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:26:08 -0400] conn=202657 op=30 UNBIND
>>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=2 UNBIND
>>> [26/Oct/2017:10:26:08 -0400] conn=202657 op=30 fd=85 closed - U1
>>> [26/Oct/2017:10:26:08 -0400] conn=202741 op=2 fd=87 closed - U1
>>> [26/Oct/2017:10:26:09 -0400] conn=202742 fd=85 slot=85 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=2 UNBIND
>>> [26/Oct/2017:10:26:09 -0400] conn=202742 op=2 fd=85 closed - U1
>>> [26/Oct/2017:10:26:10 -0400] conn=202743 fd=85 slot=85 connection
>>> from 10.0.0.1 to 10.1.1.1
>>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=0 BIND dn="" method=128
>>> version=3
>>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0 dn=""
>>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=1 SRCH
>>> base="dc=cmu,dc=edu" scope=2 filter="(uid=jeaton)" attrs=ALL
>>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=1 RESULT err=0 tag=101
>>> nentries=1 etime=0
>>> [26/Oct/2017:10:26:10 -0400] conn=202743 op=2 UNBIND
>>>
>>>
>>> I did some testing and think that the bug was introduced
>>> around 389-ds-base-1.2.11.15-72.el6_7.  I can probably upgrade one
>>> by one and find it specifically if desired.
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> 389-devel mailing list -- [email protected]
>>> To unsubscribe send an email to [email protected]
>>
>>
>
>
>
> _______________________________________________
> 389-devel mailing list -- [email protected]
> To unsubscribe send an email to [email protected]

_______________________________________________
389-devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Reply via email to