We're developing a MarkLogic-based project where the data consists of around 100K XML documents. Each document belongs to one of 5 different publications, which need to be differentiated for certain searches. I'm aware of at least three methods of handling this differentiation:
1) assign each document to a collection and use cts:collection-query() or equivalent;
2) load documents into subdirectories, one for each publication, and use cts:directory-query() or equivalent;
3) store the publication identifier in the XML data as an element, then create an element range index to enable searches on it.

Is there any way to guesstimate which of these approaches will have the best performance when combined with various word and element queries, or will it require empirical testing?

David

-- 
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 400314, Charlottesville, VA 22904-4314 USA
Email: [email protected] Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/
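
P.S. To make the comparison concrete, here is a minimal sketch of how the three variants might be written and timed side by side. Everything named in it is a placeholder rather than our real setup: the collection "pub-a", the directory "/pub-a/", the element name "publication", and the search word "liberty" are all assumptions, and variant 3 assumes a string element range index on that element already exists. It uses xdmp:estimate() to count matches from the indexes only, so it says nothing about the cost of filtering or fetching results.

```xquery
xquery version "1.0-ml";

(: Placeholder word query; substitute any word or element query of interest. :)
declare variable $word := cts:word-query("liberty");

(: 1) Collection constraint (hypothetical collection URI "pub-a"). :)
declare variable $by-collection :=
  cts:and-query((cts:collection-query("pub-a"), $word));

(: 2) Directory constraint (hypothetical directory "/pub-a/");
      "infinity" also matches documents in subdirectories. :)
declare variable $by-directory :=
  cts:and-query((cts:directory-query("/pub-a/", "infinity"), $word));

(: 3) Element range constraint; assumes a string range index
      on a hypothetical <publication> element. :)
declare variable $by-range :=
  cts:and-query((
    cts:element-range-query(xs:QName("publication"), "=", "pub-a"),
    $word
  ));

(: Report an index-only match estimate and the elapsed time per variant. :)
for $q at $i in ($by-collection, $by-directory, $by-range)
let $start := xdmp:elapsed-time()
let $count := xdmp:estimate(cts:search(fn:doc(), $q))
return
  fn:concat("variant ", $i, ": ", $count, " matches, ",
            xdmp:elapsed-time() - $start)
```

A single run like this would mostly measure cache warm-up, so each variant would need to be repeated several times, and in a different order, before the timings mean much.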
