Re: RFC 2696 and total entries in the search result

2016-09-22 Thread steve . hammond
Many sites will retrieve 6 pages, and show you "page 1 2 3 4 5 >>"  and 
not know the full count.
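
A minimal sketch of that look-ahead approach, assuming the application fetches at most six pages' worth of results and only needs to decide which page links to draw. The function name `page_links` and its parameters are hypothetical, not part of any LDAP library.

```python
def page_links(fetched_count, page_size, window=5):
    """Return the page labels to render.

    fetched_count is the number of results a look-ahead query returned,
    capped by the caller at (window + 1) * page_size.
    """
    full_pages, remainder = divmod(fetched_count, page_size)
    pages_seen = full_pages + (1 if remainder else 0)
    # Show at most `window` numbered links.
    links = [str(n) for n in range(1, min(pages_seen, window) + 1)]
    # Anything beyond the window only proves that "more" exists.
    if pages_seen > window:
        links.append(">>")
    return links
```

The total count is never needed: the sixth page only serves to decide whether the ">>" link is shown.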








Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Emmanuel Lécharny
On 22/09/16 at 16:28, Richard Sand wrote:
> Makes sense, thanks for the explanation. But for smaller directories
> where we know there aren't large volumes of entries (say, less than
> 1000 objects) and we want to do this operation, it should be possible,
> so long as we understand the risks and monitor the performance. Does
> ADS have the capability?

No, it doesn't.

As I explained in a previous response, implementing it in the
core server would make it expensive. The only possible route would be to
add an interceptor that does the job, but again, the price is high: you
will have to either keep all the results in memory or do the request twice.

The reason is that we don't pull entries when we process a request: we
pull candidates (i.e., entry IDs), evaluate those candidates *just*
before sending them to the client, and wait for the client to have read
them before processing the next entry (there is a technical reason for
doing so; it would take too long to explain here, but
trust me on that: this is the way to go).

Bottom line, we will discard some of the candidates and return the
others. It's impossible to know how many entries will be returned unless
we have processed all the candidates, and processing all the candidates
takes either time or memory...
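
The lazy evaluation described above can be illustrated with a small sketch. This is not ApacheDS code: the names `search`, `load_entry`, `store` and `evaluated` are made up for the example, and a Python generator stands in for the server's cursor.

```python
def search(candidate_ids, load_entry, matches):
    """Yield matching entries one at a time.

    A candidate is loaded and checked against the filter only when the
    consumer asks for the next result.
    """
    for entry_id in candidate_ids:
        entry = load_entry(entry_id)
        if entry is not None and matches(entry):
            yield entry


# Hypothetical backend: four candidates, two of which match the filter.
store = {
    1: {"cn": "alice", "active": True},
    2: {"cn": "bob", "active": False},
    3: {"cn": "carol", "active": True},
    4: {"cn": "dave", "active": False},
}
evaluated = []  # records which candidates were actually loaded


def load_entry(entry_id):
    evaluated.append(entry_id)
    return store.get(entry_id)


results = search([1, 2, 3, 4], load_entry, lambda e: e["active"])
first = next(results)  # only candidate 1 has been evaluated so far
```

After the first result has been sent, the code still cannot say how many results there will be: candidates 2 to 4 have not been evaluated yet, and some of them will be discarded.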


Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Richard Sand
Makes sense, thanks for the explanation. But for smaller directories 
where we know there aren't large volumes of entries (say, less than 1000 
objects) and we want to do this operation, it should be possible, so 
long as we understand the risks and monitor the performance. Does ADS 
have the capability?


-Richard








Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Emmanuel Lécharny
On 22/09/16 at 14:44, Richard Sand wrote:
> Ok so basically do not pre-populate the number of pages.
>
> I guess applications that do this are backed by an RDBMS not an LDAP?

That is the exact same problem with an RDBMS, as I said in a previous
response.

In order to know how many elements you will get from an RDBMS, you have to
run a SELECT count(*)... beforehand, and then do your SELECT. In
other words, you do the request twice, doubling the server's CPU usage :/
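
For the RDBMS case, a sketch of how the extra count query can be avoided, using Python's standard `sqlite3` module. The `person` table, its contents and the `fetch_page` helper are assumptions made up for the illustration.

```python
import sqlite3

# Hypothetical table with 45 rows, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO person (name) VALUES (?)",
    [(f"user{i:02d}",) for i in range(45)],
)


def fetch_page(conn, page, page_size):
    """Return (rows, has_next) for a zero-based page number.

    A single SELECT asks for one row more than the page holds; the
    presence of that extra row says whether a next page exists.
    """
    rows = conn.execute(
        "SELECT id, name FROM person ORDER BY id LIMIT ? OFFSET ?",
        (page_size + 1, page * page_size),
    ).fetchall()
    return rows[:page_size], len(rows) > page_size
```

Each page costs one query, with no separate SELECT count(*) pass over the same data.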



Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Richard Sand

Ok so basically do not pre-populate the number of pages.

I guess applications that do this are backed by an RDBMS not an LDAP? Or 
they're just being really expensive?










Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Emmanuel Lécharny
On 22/09/16 at 14:18, Richard Sand wrote:
> Ok, what is the proper technique then?

Always pull at least one more element than the number you can show on one
page: that will tell you whether you will have a next page or not.

This saves CPU on the server, a lot of CPU. Actually, getting the number
of results beforehand amounts to doing the request twice, more or less
(for servers that handle such things).
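
A minimal sketch of the "one more than the page size" technique over any result iterator. The helper name `take_page` is hypothetical; the iterator stands in for whatever the search API hands back.

```python
from itertools import islice


def take_page(results, page_size):
    """Pull at most page_size + 1 items from an iterator.

    Returns the items to display and a flag telling whether a next page
    exists. The extra item is consumed from the iterator, so a caller
    that keeps the iterator open must hold on to it for the next page.
    """
    chunk = list(islice(results, page_size + 1))
    return chunk[:page_size], len(chunk) > page_size
```

For example, `take_page(iter(range(25)), 10)` returns the first ten items together with `True`, without ever touching the remaining fourteen.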



Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Richard Sand

Ok, what is the proper technique then?

-Richard






Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Emmanuel Lécharny
On 22/09/16 at 13:44, Richard Sand wrote:
> I think it's to have the web page say "showing 20 results" and links for
> "page 2, page 3" etc.?

I think so, too, but it is totally wrong to do it that way.



Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Richard Sand
I think it's to have the web page say "showing 20 results" and links for
"page 2, page 3" etc.?


Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Emmanuel Lécharny
On 22/09/16 at 10:42, Doan Tin Nghia wrote:
> Thanks. I need it for paging (same issue like
> https://sourceforge.net/p/ldap-sdk/mailman/message/29370418/)

Ok, but let me ask you again: why do you technically need to know how
many entries you will get?

This information is totally spurious, unless you want to tell your
client "you are going to get 1 238 654 entries".

Most of the time, and I have seen this many, many times from people trying to
implement a web page with back and forth buttons, the answer is "But,
but, but, I *NEED* to know how many elements I will get in order to
correctly design my web page, otherwise I will not be able to know if I
will have a next page !!!". This is typically wrong. If you decide to
show N elements in one page, and have a Next button, then just poll 2xN
elements from the backend, and if you get more than N elements, you know
that you will have a Next button.

FTR, those using an RDBMS frequently do a SELECT count(*) before
doing the real SELECT for the exact same reason, and this is STUPID:
they are doing the exact same request *TWICE*, overloading the server.
Be smart.



Re: RFC 2696 and total entries in the search result

2016-09-22 Thread Doan Tin Nghia
Thanks. I need it for paging (same issue as
https://sourceforge.net/p/ldap-sdk/mailman/message/29370418/)




Re: RFC 2696 and total entries in the search result

2016-09-21 Thread Emmanuel Lécharny
On 21/09/16 at 06:40, Doan Tin Nghia wrote:
> Hi Emmanuel,
>
> Is there any way to configure that calculation at server side ?

An option would be to write a new interceptor that would be added at the
very beginning of the chain, in order to gather all the results in
memory before sending them. As we use Cursors to manage the results, you
will have to define an accumulator cursor. Not that complex.

Now, why do you really need to know how many entries you will get back ?
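
To make the cost of the accumulator idea concrete, here is a rough sketch. It does not use the ApacheDS Cursor or interceptor interfaces: `AccumulatingCursor` is a hypothetical stand-in that only shows the trade-off.

```python
class AccumulatingCursor:
    """Drain a lazy result source into memory so the total is known.

    The count becomes available before the first entry is sent, but
    every result has to be evaluated and held in memory up front.
    """

    def __init__(self, source):
        self._entries = list(source)  # evaluates the whole result set

    def __len__(self):
        return len(self._entries)

    def __iter__(self):
        return iter(self._entries)
```

Wrapping a lazy result stream in such an object gives an exact size, at the price of the memory and CPU that lazy evaluation was saving.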



Re: RFC 2696 and total entries in the search result

2016-09-20 Thread Doan Tin Nghia
Hi Emmanuel,

Is there any way to configure that calculation at server side ?

Thanks



Re: RFC 2696 and total entries in the search result

2016-09-19 Thread Doan Tin Nghia
I got it. Thanks for spotting that.



Re: RFC 2696 and total entries in the search result

2016-09-19 Thread Emmanuel Lécharny
On 19/09/16 at 12:12, Doan Tin Nghia wrote:
> I could not obtain total entries in the search result. The 'size' value in
> PageResults Control was always zero. Seems the API is not following RFC
> 2696.

It does follow the RFC, which says:

"the
   size MAY be set to the server's estimate of the total number of
   entries in the entire result set. Servers that cannot provide such an
   estimate MAY set this size to zero (0)."

Estimating the number of results is not simple if you want to have a
performant server: you have to compute it beforehand, which is costly.
ApacheDS doesn't compute the results if not needed (i.e., it does not fetch
every entry from the backend, check whether it matches the filter, remove all
the attributes that are not requested, add all the computed attributes, etc.,
until it has to).
Bottom line, and especially for the paged search control, where we can use a
cursor that will stay in memory, that saves a LOT of CPU.
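
On the client side, this means a size of zero should be read as "no estimate" rather than "no results". A small sketch, assuming the size and cookie have already been extracted from the response control by whatever LDAP library is in use; the function name `describe_page` is hypothetical.

```python
def describe_page(size, cookie):
    """Interpret the fields of an RFC 2696 paged results response.

    Returns (estimated_total, has_more). A size of zero is treated as
    "no estimate available"; an empty cookie means the last page has
    been returned.
    """
    estimated_total = size if size > 0 else None
    has_more = len(cookie) > 0
    return estimated_total, has_more
```

With a server that always reports zero, `estimated_total` is always `None`, and the cookie alone drives the paging loop.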



RFC 2696 and total entries in the search result

2016-09-19 Thread Doan Tin Nghia
I could not obtain total entries in the search result. The 'size' value in
PageResults Control was always zero. Seems the API is not following RFC
2696.

Has anybody seen this?

Thanks