< caveat > I'm no expert, just a humble user </caveat>

 

This is something you're going to have to get used to.

The search estimates (whether from the search: or the cts: functions) are
always based on the number of fragments, which is not always the number
of documents.

 

This is just how MarkLogic works. Its indexes resolve down to the
fragment level, so an estimate uses only the index entries, which know
which fragments may match but not how many actual hits each one holds.
To get real hit counts you have to fully resolve the search, which
involves MarkLogic fetching every candidate fragment and parsing the
document (slower).
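
A rough sketch of the difference, using the cts: functions directly. The
word "hello" and the element name para are placeholders I made up, not
anything from your data:

```xquery
xquery version "1.0-ml";

(: Assumption: "hello" and the element name "para" are placeholders. :)
let $query := cts:word-query("hello")
return (
  (: Index-only estimate: the number of fragments that may contain a match. :)
  xdmp:estimate(cts:search(//para, $query)),
  (: Filtered count: fetches the candidate fragments and counts the para
     elements that really match. :)
  fn:count(cts:search(//para, $query))
)
```

The first number comes straight from the indexes and counts fragments;
the second has to fetch and filter each candidate fragment, so it is
exact but slower.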

 

 

This is not always recommended (I'm still not sure why, although I've
read all the reasons), but you may get more accurate results by
fragmenting your documents using the fragment root and fragment parent
settings in the database configuration. Careful use of fragments can
give 'hit' counts which are closer to what you're looking for, but you
have to refine your searches to hit just the fragment's root element
for it to make sense.

 

The usually recommended way is to break your document up into smaller
documents instead of fragmenting a big document.

Then each "hit" is the 'document' you are looking for, or at least
closer to the right number.

But that only makes sense if these smaller "documents" are meaningful
atomic units.
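
Something along these lines might do the splitting, as a sketch only.
The URIs and the book/section element names are hypothetical; substitute
whatever the meaningful unit is in your content:

```xquery
xquery version "1.0-ml";

(: Assumption: /books/big.xml and the book/section structure are made-up
   examples. Each section is written out as its own small document. :)
for $section at $i in fn:doc("/books/big.xml")/book/section
return
  xdmp:document-insert(
    fn:concat("/books/big/section-", $i, ".xml"),
    $section)
```

After that, an estimate over the section documents counts sections
rather than the one big book.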

MarkLogic experts: please feel free to tell me I'm totally wrong!

 

 

 

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Adam
Patterson
Sent: Saturday, March 20, 2010 2:43 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] RE: Search:search total

 

Thanks Danny,

 

This seems like odd behaviour to me and I have to admit I don't
understand the reasoning behind search:search returning an estimate on
the number of documents. I mean, if I do a search and provide a
searchable-expression in the options which matches on nodes within the
document(s) of the database, then the total returned by the search is
always wrong. Why wouldn't it calculate an estimate on the number of
hits?

 

I guess I'm showing my ignorance of the search API, but it seems to me
that developers would be interested in the total number of matches in
the search result set and not in the number of documents. If someone
could explain this to me I'd appreciate it.

 

Thanks,

 

Adam

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Danny
Sokolsky
Sent: March 19, 2010 5:34 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] RE: Search:search total

 

Hi Adam,

 

If you are really only searching a single document and want to know the
number of hits in that document, one approach could be to set
max-matches in transform-results to a large number, then count the
search:match elements in the output from search:search.

Something like:

 

xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Count the search:match elements in the snippet of the first result :)
fn:count(
  search:search("hello",
    <options xmlns="http://marklogic.com/appservices/search">
      <transform-results apply="snippet">
        <per-match-tokens>30</per-match-tokens>
        <max-matches>200</max-matches>
        <max-snippet-chars>2000</max-snippet-chars>
        <preferred-elements/>
      </transform-results>
    </options>)/search:result[1]/search:snippet/search:match)

 

-Danny

 

 

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Adam
Patterson
Sent: Friday, March 19, 2010 12:06 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Search:search total

 

Hi,

 

I'm using search:search to search a single document in my database. I
accomplish this by having an additional-query, a cts:document-query, in
my search options. I understand from the online documentation that
search:search returns the total number of documents which the search hit
on in the total attribute. I'm wondering if there's a way to tell
search:search that I want the total number of hits (at the node level)
which the search produced, and I do not want the number of documents as
it's always "1".

 

I've also tried using search:estimate with the query returned by
search:search with <return-query>{fn:true()}</return-query> set in the
search options, and I'm sure you're not surprised to hear that I get the
exact same results.

 

Any thoughts are appreciated.

 

Cheers,

 

Adam

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
