<caveat>I'm no expert, just a humble user.</caveat>
This is something you're going to have to get used to. The search estimates (in either the search: or cts: APIs) are always based on the number of fragments, which is not always the number of documents. This is just how MarkLogic works: its indexes index down to the fragment level, so any estimate uses only the index entries, which don't know the actual hits. To get real hit counts you have to fully resolve the search, which involves MarkLogic fetching every candidate fragment and parsing it (slower).

This is not always recommended (I'm still not sure why, although I've read all the reasons), but you may get more accurate results by fragmenting your documents using the fragment root and fragment parent settings in the database. Careful use of fragments can result in 'hit' counts that are closer to what you're looking for. But you have to be careful and refine your searches to hit just the fragment's root element for it to make sense.

The usually recommended way is to break your document up into smaller documents instead of fragmenting a big document. Then each "hit" is the 'document' you are looking for, or at least closer to the right number. But that only makes sense if these smaller "documents" are meaningful atomic units.

MarkLogic experts: please feel free to tell me I'm totally wrong!

From: [email protected] [mailto:[email protected]] On Behalf Of Adam Patterson
Sent: Saturday, March 20, 2010 2:43 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] RE: Search:search total

Thanks Danny,

This seems like odd behaviour to me, and I have to admit I don't understand the reasoning behind search:search returning an estimate of the number of documents. If I do a search and provide a searchable-expression in the options which matches on nodes within the document(s) of the database, then the total returned by the search is always wrong. Why wouldn't it calculate an estimate on the number of hits?
I guess I'm showing my ignorance of the search API, but it seems to me that developers would be interested in the total number of matches in the search result set and not in the number of documents. If someone could explain this to me I'd appreciate it.

Thanks,
Adam

From: [email protected] [mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: March 19, 2010 5:34 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] RE: Search:search total

Hi Adam,

If you are really only searching a single document and want to know the number of hits in that document, one approach could be to set the max-matches in transform-results to a large number, then count the search:match elements in the output from search:search. Something like:

    xquery version "1.0-ml";

    import module namespace search = "http://marklogic.com/appservices/search"
      at "/MarkLogic/appservices/search/search.xqy";

    (: Count the search:match elements in the snippet of the first result. :)
    fn:count(
      search:search("hello",
        <options xmlns="http://marklogic.com/appservices/search">
          <transform-results apply="snippet">
            <per-match-tokens>30</per-match-tokens>
            <max-matches>200</max-matches>
            <max-snippet-chars>2000</max-snippet-chars>
            <preferred-elements/>
          </transform-results>
        </options>)
      /search:result[1]/search:snippet/search:match)

-Danny

From: [email protected] [mailto:[email protected]] On Behalf Of Adam Patterson
Sent: Friday, March 19, 2010 12:06 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Search:search total

Hi,

I'm using search:search to search a single document in my database. I accomplish this by having an additional-query, a cts:document-query, in my search options. I understand from the online documentation that search:search returns the total number of documents which the search hit on in the total attribute.
I'm wondering if there's a way to tell search:search that I want the total number of hits (at the node level) which the search produced, and I do not want the number of documents as it's always "1". I've also tried using search:estimate with the query returned by search:search with <return-query>{fn:true()}</return-query> set in the search options, and I'm sure you're not surprised to hear that I get the exact same results.

Any thoughts are appreciated.

Cheers,
Adam
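
For anyone following the thread, here is a minimal sketch of the difference between an index-based estimate and a resolved count, assuming a single document is being searched. The document URI, the search term, and the element name para are hypothetical placeholders, not anything from the original messages, and this is only one possible way to get a node-level count.

```xquery
xquery version "1.0-ml";

(: Hypothetical URI and search term; substitute your own. :)
let $uri := "/example/doc.xml"
let $word := cts:word-query("hello")
let $query := cts:and-query((cts:document-query($uri), $word))
return (
  (: Resolved from the indexes only: counts matching fragments. :)
  xdmp:estimate(cts:search(fn:doc(), $query)),

  (: Filtered search: each candidate fragment is fetched and checked, so this is slower. :)
  fn:count(cts:search(fn:doc(), $query)),

  (: Node-level count: how many para elements (hypothetical name) contain the term. :)
  fn:count(fn:doc($uri)//para[cts:contains(., $word)])
)
```

With an unfragmented single document, the first two values can be at most 1, which is the behaviour described in the original question; only the third expression counts nodes inside the document.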
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
