Thank you for the faster solution, that was very interesting to read.
Would it be fair to say then that content enrichment plus indexes
would be the way one normally tries to tackle this class of problem in
MarkLogic Server?

If, for example, one had a more complicated record to retrieve based on
the key, one whose values could not all be put into the distinct key,
would it be advisable to build those keys into the documents themselves?

The Corb tool looks very useful, thank you for pointing it out!

Jim

> the MedlineCitation fragments into memory, plus calculating the 
> distinct-values of the pipe-delimited key. That's fine for small numbers 
> of fragments, but this approach requires rapidly-increasing amounts of 
> memory with large content sets. That's likely to be an issue with Saxon, 
> too.
> 
> To scale up, it's better to use a range index of type string. We can 
> access its values via cts:element-values() or 
> cts:element-attribute-values(). This approach can deliver your answers 
> in milliseconds.
> 
> For your desired output, though, this would involve some content 
> enrichment as well: perhaps by adding a 'key' attribute on every Author 
> element. The new "Corb" tool on http://developer.marklogic.com/code/ is 
> a good resource for this sort of enrichment, and the example 
> medline-iso8601.xqy module is very close to what you'd need 
> (http://developer.marklogic.com/svn/corb/trunk/src/java/com/marklogic/developer/corb/medline-iso8601.xqy).
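
A minimal sketch of the range-index lookup described in the quoted
reply, not a tested solution. It assumes each Author element has already
been enriched with a 'key' attribute (the attribute name comes from the
suggestion above; its exact contents are hypothetical), and that an
element-attribute range index of type string has been configured on
Author/@key with the default collation:

```xquery
xquery version "1.0-ml";

(: Assumption: an element-attribute range index of type string exists
   on Author/@key. Without that index this call raises an error. :)
for $key in cts:element-attribute-values(
  xs:QName("Author"), xs:QName("key"))
return
  (: cts:frequency reports how often each value occurs in the index :)
  <author key="{$key}" count="{cts:frequency($key)}"/>
```

Because the values come straight from the index, no MedlineCitation
fragments need to be loaded into memory to compute the distinct keys.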

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
James A. Robinson                       [EMAIL PROTECTED]
Stanford University HighWire Press      http://highwire.stanford.edu/
+1 650 7237294 (Work)                   +1 650 7259335 (Fax)
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general