Hi Abhishek
We are using ML 5.0-4
We have all the default indexes on, but have also enabled word searches to
support unstemmed queries. We also have 3 character and trailing wildcard
indexes enabled.
The collection and uri lexicons are also turned on.
The queries are very straightforward. We have some range indexes, so some of
the queries are element or attribute range queries. However, I am observing
this behaviour even when the query is at its most basic, e.g. a directory query
ANDed with some element or attribute value queries.
The problem occurs even when no query is in effect and the searchable
expression is effectively fn:collection().
If I run this search as follows:
search:search("", (), 1, 10)
I get back:
<search:response total="157855" start="1" page-length="1"
Then for:
search:search("", (), 100000, 1)
<search:response total="153302" start="100000" page-length="1"
Then for:
search:search("", (), 152208, 1)
<search:response total="0" start="152208" page-length="1"
So it seems the use of cts:remainder to estimate the count is quite variable as
your offset increases. No doubt this is because no filtering is used to correct
the value after the current offset. However, you reach a point where the search
doesn't get back a result and the xdmp:estimate returns 0.
So it's important to remember that your indexes and queries need to be set up
very carefully to provide accurate estimates and to run searches unfiltered.
However, I don't get why 0 would be returned in response/@total when that is
plainly not the case.
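For anyone wanting to reproduce this, here is a minimal sketch of the same
call run unfiltered. It assumes the Search API's <search-option> element
accepts "unfiltered" in this release; I have not confirmed that against
5.0-4, so treat the options node as an assumption rather than a known fix.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Assumption: <search-option> passes "unfiltered" through to cts:search. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <search-option>unfiltered</search-option>
  </options>
(: Same empty query, start and page length as the failing call above. :)
return fn:data(search:search("", $options, 152208, 1)/@total)
```

Comparing this value with the filtered call at the same start position
should show whether filtering plays any part in the zero total.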
Regards,
Gavin
________________________________________
From: [email protected]
<[email protected]> on behalf of
[email protected] <[email protected]>
Sent: 23 April 2014 18:45
To: [email protected]
Subject: Re: [MarkLogic Dev General] Search API total results
Hello Gavin,
What version of MarkLogic are you using? Also, what indexes are being used with
the queries being passed to search:search?
Thanks
Abhishek
________________________________________
From: [email protected]
[[email protected]] on behalf of Gavin Haydon
[[email protected]]
Sent: Wednesday, April 23, 2014 11:09 AM
To: General List
Subject: [MarkLogic Dev General] Search API total results
Hi,
I am using search:search() and using the search:response/@total value to
indicate the number of hits. The searches are running with filtering turned on.
When offsetting within the results, so that start < total, the reported total
seems to be reliable. However, if the start value exceeds the known total, the
Search API is returning zero for the total.
This is only happening in larger databases with many fragments. If I attempt
the same on a database with only a few dozen fragments, I always get a reliable
total even when offsetting past the end.
I realise that the Search API will use cts:remainder() to calculate the total
when it has at least one result to work with. If the search returns no results,
because of the query or paging too far, it resorts to using xdmp:estimate() to
get the total.
So given that the query does return results, how is the xdmp:estimate() failing
to get a count? If I run the xdmp:estimate() by hand, using the cts:query()
that the Search API would run, I do get a proper count!
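A rough sketch of the logic described above, written by hand rather than
taken from the Search API source. The directory query and the start value are
hypothetical stand-ins for whatever cts:query and offset the Search API
actually uses.

```xquery
xquery version "1.0-ml";

(: Hypothetical stand-in for the cts:query that the Search API builds. :)
let $query := cts:directory-query("/example/", "infinity")
let $start := 11
(: The first result of the requested page; empty when paging past the end. :)
let $first := cts:search(fn:collection(), $query)[$start]
return
  if (fn:exists($first))
  (: cts:remainder counts the current node, so add the items skipped. :)
  then ($start - 1) + cts:remainder($first)
  (: No result to hang cts:remainder on, so fall back to an index estimate. :)
  else xdmp:estimate(cts:search(fn:collection(), $query))
```

Run by hand like this, the fallback branch gives a non-zero count, which is
what makes the zero in search:response/@total surprising.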
By the way, it is no better if the query narrows the result set down to very
few, such as 10. If you set start to 11, the total becomes zero.
Has anybody encountered this before, and is there a way to correct it?
Regards,
Gavin
Gavin Haydon
Technical Team Leader
PRESS
ASSOCIATION
www.pressassociation.com<http://www.pressassociation.com/>
[email protected]
This email is from the Press Association. For more information, see
www.pressassociation.com. This email may contain confidential information. Only
the addressee is permitted to read, copy, distribute or otherwise use this
email or any attachments. If you have received it in error, please contact the
sender immediately. Any opinion expressed in this email is personal to the
sender and may not reflect the opinion of the Press Association. Any email
reply to this address may be subject to interception or monitoring for
operational reasons or for lawful business practices.
This e-mail and any files transmitted with it are for the sole use of the
intended recipient(s) and may contain confidential and privileged information.
If you are not the intended recipient(s), please reply to the sender and
destroy all copies of the original message. Any unauthorized review, use,
disclosure, dissemination, forwarding, printing or copying of this email,
and/or any action taken in reliance on the contents of this e-mail is strictly
prohibited and may be unlawful.
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general