I've written some code to resolve links in a batch process. The links can point to any of a number of different element/@id targets in any document, and we are trying to record the destination document URI with each link so that it can be rendered quickly at run-time; missing links won't be rendered at all.

Basically, the process is: for each document in the batch, for each of its links, search for the document containing the matching @id, and replace the link with an element carrying a uri attribute that points to that document.
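In rough outline, the per-link rewrite looks like this (the element and attribute names "link", "ref", and "target", and the "batch" collection, are simplified placeholders, not our real schema):

    xquery version "1.0-ml";

    (: For each link in each batch document, find a document containing
       the matching @id and rewrite the link in place. :)
    for $doc in fn:collection("batch")
    for $link in $doc//link
    let $target :=
      (cts:search(fn:collection(),
        cts:element-attribute-value-query(
          xs:QName("target"), xs:QName("id"), fn:string($link/@ref))))[1]
    return
      if (fn:exists($target))
      then
        xdmp:node-replace($link,
          <link uri="{xdmp:node-uri($target)}">{$link/node()}</link>)
      else
        (: no matching @id anywhere: leave the link as-is,
           so it won't be rendered :)
        ()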
Overall, this process is running much slower than I had expected. I've been examining the query with the profiler, and after doing some optimization of the searches I've found something a bit strange: the breakdown reported by the profiler doesn't seem to account for the total time. All the searches appear to complete fairly quickly (logging statements indicate that every document in the batch has been "processed"), and then the query just seems to hang for a while before returning. It spends about 90% of the total time in this second stage. My assumption is that the time is going into performing the updates: committing, indexing, writing the journal, or something like that.

My questions are: should I expect that work to be reflected in the profiler output? And is there some way to figure out why it is taking so long, and what I can do about it? Would inserting a node be faster than replacing one? I've tried a tree-walk rather than lots of node-replaces, but that actually seemed quite a bit slower.
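For reference, the "processed" logging I mentioned is just an elapsed-time checkpoint after each document, roughly like this (local:resolve-links stands in for the real per-document work):

    xquery version "1.0-ml";

    declare function local:resolve-links($doc as document-node()) {
      (: the real per-document link resolution goes here :)
      ()
    };

    (: One log line per document, so the gap between the last
       "processed" line and the request returning is visible. :)
    for $doc at $n in fn:collection("batch")
    return (
      local:resolve-links($doc),
      xdmp:log(fn:concat("processed ", xdmp:node-uri($doc),
        " (", $n, ") at ", xdmp:elapsed-time()))
    )

The last of those checkpoints shows up in the log quickly; the request itself doesn't return until much later, and that gap is what I'm asking about.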
Thanks for any suggestions!

--
Michael Sokolov
Engineering Director
www.ifactory.com