[
https://issues.apache.org/jira/browse/DIRSERVER-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411717#comment-15411717
]
Emmanuel Lécharny edited comment on DIRSERVER-2162 at 8/14/22 1:23 PM:
-----------------------------------------------------------------------
First, let me ask how many entries you have in your server? Also, how many of
them have a {{person}} objectClass? And which ApacheDS version are you using?
Now, regardless of those values, using a filter like
{{(&(cn=username)(ObjectClass=*))}} is equivalent to using {{(cn=username)}}:
all entries have an {{ObjectClass}} attribute, so the second part of the filter
is simply discarded. Using a filter like
{{(&(cn=username)(ObjectClass=person))}} works differently: as it's an
{{AND}} filter, we have to evaluate both filter elements to know how many
entries each one will return. Let's say you have 3 entries which match
{{(cn=username)}} and 10450 that match {{(ObjectClass=person)}}: the next
step is to take the smaller set (i.e., the first) and check each of its entries
against the second filter. All in all, we will fetch 3 entries from the
backend and, for each of them, check that it matches the
{{(ObjectClass=person)}} filter.
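To make the evaluation order above concrete, here is a minimal sketch using the same numbers (3 entries matching {{cn=username}}, 10450 matching {{ObjectClass=person}}). The function {{evaluate_and}} and the in-memory {{indexes}} and {{entries}} dictionaries are hypothetical stand-ins for illustration only; this is not the ApacheDS search engine code.

```python
def evaluate_and(indexes, entries, conditions):
    """Evaluate an AND of (attribute, value) equality assertions.

    Returns the matching entry IDs and the number of entries fetched.
    """
    # Ask each index how many candidates its condition would select.
    counted = [
        (len(indexes[attr].get(value, ())), attr, value)
        for attr, value in conditions
    ]
    counted.sort(key=lambda c: c[0])

    # Walk only the smallest candidate set.
    smallest_count, best_attr, best_value = counted[0]
    remaining = counted[1:]

    matches = []
    for entry_id in sorted(indexes[best_attr].get(best_value, ())):
        entry = entries[entry_id]  # one fetch from the "backend"
        # Check the fetched entry against every other condition.
        if all(value in entry.get(attr, ()) for _, attr, value in remaining):
            matches.append(entry_id)
    return matches, smallest_count


# Toy data: 10450 person entries, 3 of which have cn=username.
entries = {}
for i in range(10450):
    cn = "username" if i < 3 else f"user{i}"
    entries[i] = {"cn": {cn}, "objectClass": {"person"}}

# Build one value -> set-of-entry-IDs index per attribute.
indexes = {"cn": {}, "objectClass": {}}
for entry_id, entry in entries.items():
    for attr, values in entry.items():
        for value in values:
            indexes[attr].setdefault(value, set()).add(entry_id)

matches, fetched = evaluate_and(
    indexes, entries, [("cn", "username"), ("objectClass", "person")]
)
```

With this toy data only the three {{cn=username}} candidates are fetched, however many {{person}} entries exist.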
That does not tell us a lot about what happens in your case, because we have to
know how many entries are selected by each of those filter elements.
One more thing: there is a third filter element that is not shown here,
because it's added behind the curtain: the subtree filter. Actually, every
search done with a filter will have an additional filter element added. So, in
your case, the real filter will be:
{{(&(&(cn=username)(ObjectClass=*))(subtree=ou=users,dc=xxx,dc=fi))}} (kind of).
You can determine the number of selected entries by enabling DEBUG logging for
the {{DefaultSearchEngine}} class. You will typically get output like:
{noformat}
Nb results : 3 for filter : ([3]&([3]cn=username)([10450]ObjectClass=person))
{noformat}
where the numbers between {{[}} and {{]}} are the number of candidates. That
will give us some information about why it's slower when you use {{person}} in
the filter.
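If you want to pull the candidate counts out of such a line, a small sketch like the following may help. It assumes the line looks exactly like the example above; the real DEBUG output may differ between ApacheDS versions, and the helper name {{candidate_counts}} is made up for this illustration.

```python
import re

# A "[count]" marker followed by the filter element it annotates,
# e.g. "[10450]ObjectClass=person" or "[3]&".
CANDIDATE = re.compile(r"\[(\d+)\]([^()\[\]]+)")


def candidate_counts(line):
    """Return (filter element, candidate count) pairs found in a log line."""
    return [
        (token.strip(), int(count))
        for count, token in CANDIDATE.findall(line)
    ]


line = (
    "Nb results : 3 for filter : "
    "([3]&([3]cn=username)([10450]ObjectClass=person))"
)
counts = candidate_counts(line)
```

The first pair is the count for the {{AND}} node itself; the following pairs are the counts for each of its elements.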
Note that for an index like {{ObjectClass}}, where one value may refer to many
entries, we use a sub-index. As {{person}} is potentially used by thousands of
entries, we don't store all of the entry IDs in a simple list associated with
the {{person}} key; we use a B-tree to store an ordered list of them. The
rationale is that adding an entry with a {{person}} value to a list of
thousands of entry IDs would take way too long, while adding it to a B-tree
only costs a few updates. Nevertheless, we *know* how many entries are
associated with the {{person}} value, so the evaluation is trivial and costs
next to nothing: there is something else at play here.
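As a rough illustration of why the count itself should be cheap: if the index keeps the entry IDs for each value in their own container, the number of candidates for a value is a lookup, not a scan. In this toy sketch the class {{ToyIndex}} and its methods are hypothetical, and a Python set merely stands in for the B-tree sub-index; it does not reflect how ApacheDS actually stores its indexes.

```python
from collections import defaultdict


class ToyIndex:
    """Maps an attribute value to the IDs of the entries holding it."""

    def __init__(self):
        # One container of entry IDs per value (stand-in for a sub-index).
        self._ids_by_value = defaultdict(set)

    def add(self, value, entry_id):
        self._ids_by_value[value].add(entry_id)

    def count(self, value):
        # The container already knows its size: no entry is fetched.
        ids = self._ids_by_value.get(value)
        return len(ids) if ids is not None else 0


object_class = ToyIndex()
for entry_id in range(10450):
    object_class.add("person", entry_id)

person_count = object_class.count("person")
```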
If it's a bug, it has to be fixed. We will need your input to understand
what's going on.
Thanks!
> Searcing for users using ObjectClass=person takes long
> ------------------------------------------------------
>
> Key: DIRSERVER-2162
> URL: https://issues.apache.org/jira/browse/DIRSERVER-2162
> Project: Directory ApacheDS
> Issue Type: Bug
> Components: search
> Affects Versions: 2.0.0-M20
> Reporter: John Peter
> Priority: Major
> Fix For: 2.0.0.AM27
>
>
> When we do the below query the result takes long. Around 10-50 seconds.
> Search base: ou=users,dc=xxx,dc=fi
> Filter (&(cn=*USERNAME*)(objectClass=person))
> Scope: Subtree
> However the below query returns the result immediately
> Search base: ou=users,dc=xxx,dc=fi
> Filter (&(cn=*USERNAME*)(objectClass=*))
> Scope: Subtree
> Looking at the Partition settings it has Indexed attributes ObjectClass and
> cn.
> First both queries took long. Then we added cn to the index and rebooted
> apacheDS and the second query got fast.
> It seems like a bug that using ObjectClass in the query makes it slow
> although it is in the index.
> It seems something similar was reported before DIRSERVER-2048, but it says
> it's fixed in M20 which we are using.