Re: [MarkLogic Dev General] Documents count in a Forest

Michael Blakeley Tue, 24 Dec 2013 06:38:54 -0800

Have you profiled the query?

The admin API is a little more heavyweight than you need for this use-case, so 
I would drop it. If I did have to use the admin API, I would create my own map 
of forest ids and names. But since we aren't actually changing the server 
configuration, we can avoid it entirely.
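
If the admin API really were required, a minimal sketch of that map might look 
something like this. It assumes the URIs arrive in an external variable called 
$uris (a hypothetical name), and it reads the configuration once rather than 
once per document.

    xquery version "1.0-ml";
    import module namespace admin = "http://marklogic.com/xdmp/admin"
      at "/MarkLogic/admin.xqy";

    (: hypothetical input: the URIs of the book's documents :)
    declare variable $uris as xs:string* external;

    (: read the configuration once, not once per document :)
    let $config := admin:get-configuration()
    (: map of forest id (as a string key) to forest name :)
    let $names := map:map()
    let $_ :=
      for $f in xdmp:database-forests(xdmp:database())
      return map:put($names, xs:string($f),
        admin:forest-get-name($config, $f))
    for $uri in $uris
    let $fid := xdmp:document-forest($uri)
    (: skip URIs that do not exist in this database :)
    where exists($fid)
    return text { $uri, map:get($names, xs:string($fid)) }

This still does one forest lookup per document, so it only removes the 
repeated configuration reads.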

Try inverting the problem. Instead of querying the database once per document, 
query it once per forest. This should be much more efficient, because you have 
many documents and few forests. Use xdmp:estimate with a cts:document-query, 
and pass each forest id as the forest-ids parameter of cts:search 
(http://docs.marklogic.com/cts:search). Something like this:

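    (: $uris: the URIs of the book's documents :)
    let $uris := ...
    let $query := cts:document-query($uris)
    (: one estimate per forest, rather than one lookup per document :)
    for $f in xdmp:database-forests(xdmp:database())
    let $name := xdmp:forest-name($f)
    (: the fifth argument restricts the search to this forest :)
    let $count := xdmp:estimate(
      cts:search(doc(), $query, (), (), $f))
    order by $count descending
    return text { $name, $count }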

Change the sort order and return expressions to suit yourself. You could 
probably write $query more efficiently too: for example, if each book's 
documents share a collection, replace the document-query with a 
cts:collection-query.
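
As a sketch of the collection variant, assuming each book's documents were 
loaded into a collection of their own (the name "book-12345" is hypothetical):

    xquery version "1.0-ml";

    (: hypothetical collection name; substitute your own :)
    let $query := cts:collection-query("book-12345")
    for $f in xdmp:database-forests(xdmp:database())
    (: restrict the estimate to one forest at a time :)
    let $count := xdmp:estimate(
      cts:search(doc(), $query, (), (), $f))
    order by $count descending
    return text { xdmp:forest-name($f), $count }

This avoids building the list of URIs at all.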

BTW this seems to be the first time anyone on this list has referred to a lac 
or lakh of documents.

-- Mike

On 24 Dec 2013, at 06:04 , Shipra, Gupta <[email protected]> wrote:

> We have a three-node cluster (ml1, ml2 and ml3) and approx 10 lacs of 
> documents are stored there. These documents belong to books. Each book and 
> its related data has one common id. I need to get the count of documents in 
> each forest for a particular book. 
> 
> I tried admin function for this:
> admin:forest-get-name(admin:get-configuration(), xdmp:document-forest(uri))
> where uri is the uri of the document.
> 
> but this is taking a long time to process. 
> 
> Can anyone suggest a better way to do this?

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
