Hi Abhishek

We are using ML 5.0-4

We have all the default indexes on, but have also enabled word searches to 
support unstemmed queries. We also have 3 character and trailing wildcard 
indexes enabled.

The collection and uri lexicons are also turned on.

The query is very straightforward. We have some range indexes, so some of the 
queries are element or attribute range queries. However, I am observing this 
behaviour even when the query is at its most basic, e.g. a directory query ANDed 
with some element or attribute value queries.

The problem occurs even when no query is in effect and the searchable 
expression is effectively fn:collection().

If I run this search as follows:
search:search("", (), 1, 10)

I get back:
<search:response total="157855" start="1" page-length="1"

Then for:
search:search("", (), 100000, 1)
<search:response total="153302" start="100000" page-length="1"

Then for:
search:search("", (), 152208, 1)
<search:response total="0" start="152208" page-length="1"

So it seems the use of cts:remainder to estimate the count is quite variable as 
your offset increases. No doubt this is because no filtering is used to correct 
the value after the current offset. However, you reach a point where the search 
doesn't get back a result and the xdmp:estimate returns 0.
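
A rough sketch of how the drift could be measured side by side, using the same 
three offsets as above. It assumes the standard Search API module import and 
uses xdmp:estimate(fn:collection()) as the index-only baseline; the <check> 
element is just a hypothetical wrapper for the output.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Index-only count of all fragments; does not depend on the paging offset. :)
let $estimate := xdmp:estimate(fn:collection())
(: The three offsets tried above. :)
for $start in (1, 100000, 152208)
let $response := search:search("", (), $start, 1)
return
  (: <check> is a hypothetical wrapper, used only to line the numbers up. :)
  <check start="{$start}"
         total="{$response/@total}"
         estimate="{$estimate}"/>
```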

So it's important to remember that you need your indexes and queries to be very 
carefully set up to provide accurate estimates, and to run searches 
unfiltered. However, I don't get why 0 would be returned in response/@total when 
that is plainly not the case.
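
For reference, a minimal sketch of running the same search unfiltered through 
the Search API. It assumes the <search-option> element in the options node is 
available in the version in use; the offset is the one that returned 0 above.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Ask the Search API to pass "unfiltered" through to the underlying search. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <search-option>unfiltered</search-option>
  </options>
(: Same empty query and offset as the failing call above. :)
return search:search("", $options, 152208, 1)/@total
```

Unfiltered results are only as accurate as the indexes allow, so this trades 
per-result correctness for consistency with the estimate.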

Regards,
Gavin


________________________________________
From: [email protected] 
<[email protected]> on behalf of 
[email protected] <[email protected]>
Sent: 23 April 2014 18:45
To: [email protected]
Subject: Re: [MarkLogic Dev General] Search API total results

Hello Gavin,

What version of MarkLogic are you using? Also, what indexes are being used with 
the queries passed to search:search?

Thanks
Abhishek
________________________________________
From: [email protected] 
[[email protected]] on behalf of Gavin Haydon 
[[email protected]]
Sent: Wednesday, April 23, 2014 11:09 AM
To: General List
Subject: [MarkLogic Dev General] Search API total results

Hi,


I am using search:search() and using the search:response/@total value to 
indicate the number of hits. The searches are running with filtering turned on.


When offsetting within the results, so that start < total, the reported total 
seems to be reliable. However, if the start value exceeds the known total, the 
Search API is returning zero for the total.


This is only happening in larger databases with many fragments. If I attempt 
the same on a database with only a few dozen fragments, I always get a reliable 
total even when offsetting past the end.


I realise that the Search API will use cts:remainder() to calculate the total 
when it has at least one result to work with. If the search returns no results, 
because of the query or paging too far, it resorts to using xdmp:estimate() to 
get the total.


So given that the query does return results, how is the xdmp:estimate() failing 
to get a count? If I run the xdmp:estimate() by hand, using the cts:query() 
that the Search API would run, I do get a proper count!
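
A sketch of the manual check, comparing the two ways of arriving at a total. 
The directory URI, element name and value are hypothetical placeholders, and 
the remainder arithmetic is an assumption about what the Search API does 
internally, not taken from its source.

```xquery
xquery version "1.0-ml";

(: Hypothetical query: the directory URI, element name and value are placeholders. :)
let $query := cts:and-query((
  cts:directory-query("/content/", "infinity"),
  cts:element-value-query(xs:QName("status"), "published")
))
let $start := 11
(: Filtered search, positioned at the requested offset. :)
let $hit := cts:search(fn:collection(), $query)[$start]
return (
  (: Index-only estimate; independent of paging. :)
  xdmp:estimate(cts:search(fn:collection(), $query)),
  (: Remainder-based total; only possible when a hit exists at $start. :)
  if (fn:exists($hit))
  then cts:remainder($hit) + $start - 1
  else ()
)
```

When $start is past the last hit, the second value is empty, which is the case 
where a fallback to the estimate would be expected.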


By the way, it is no better if the query narrows the result set down to very 
few results, such as 10. If you set start to 11, the total becomes zero.


Has anybody encountered this before, and is there a way to correct it?


Regards,

Gavin


Gavin Haydon

Technical Team Leader

PRESS
ASSOCIATION

www.pressassociation.com

[email protected]



This email is from the Press Association. For more information, see 
www.pressassociation.com. This email may contain confidential information. Only 
the addressee is permitted to read, copy, distribute or otherwise use this 
email or any attachments. If you have received it in error, please contact the 
sender immediately. Any opinion expressed in this email is personal to the 
sender and may not reflect the opinion of the Press Association. Any email 
reply to this address may be subject to interception or monitoring for 
operational reasons or for lawful business practices.
This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
If you are not the intended recipient(s), please reply to the sender and 
destroy all copies of the original message. Any unauthorized review, use, 
disclosure, dissemination, forwarding, printing or copying of this email, 
and/or any action taken in reliance on the contents of this e-mail is strictly 
prohibited and may be unlawful.
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
