Hi Mike,

Another 2 cents: are you running a lot of parallel batches? Could they be interfering with each other, for instance through directory locking?

@Danny: aren't node replaces within a single request accumulated automatically?

Kind regards,
Geert

> -----Original Message-----
> From: [email protected] [mailto:general-[email protected]] On Behalf Of Danny Sokolsky
> Sent: Wednesday, 8 August 2012 20:48
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] my link resolver is slow
>
> Hi Mike,
>
> It is hard to answer a question like this in generalities. But here are a
> few random ideas, probably most of which you have tried:
>
> Have you tried profiling the query? That can point you to hot spots fairly
> quickly. I'm not totally sure what you mean by "the breakdown reported by
> the optimizer". Do you mean in xdmp:plan? xdmp:query-trace? Something else?
>
> What version of MarkLogic are you running (xdmp:version())?
>
> Are you using range index lookups to find the links (with a cts:query
> param, for example)?
>
> When you say you are doing node replaces, do you mean you are writing each
> document multiple times? That can get expensive, and it is often faster to
> create a new version of the document in memory and then write the document
> once. There is a library to do in-memory node-replaces too if you don't
> feel like writing that yourself.
>
> -Danny
>
> -----Original Message-----
> From: [email protected] [mailto:general-[email protected]] On Behalf Of Mike Sokolov
> Sent: Wednesday, August 08, 2012 10:41 AM
> To: MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] my link resolver is slow
>
> I've written some code to resolve links in a batch process; the links
> can point to a number of different element/@id in any document, and we
> are trying to record the destination document uri with the link so it
> can be rendered quickly at run-time, and missing links won't be rendered
> at all.
>
> Basically the process is: for each of some batch of documents, for each
> of its links, search for the matching document, and replace the link
> with an element having a uri attribute pointing to that document.
>
> Overall, this process is running much slower than I had expected. I've
> been examining the query using the profiler, and after doing some
> optimization of the searches, I find something a bit strange. The
> breakdown reported by the optimizer doesn't seem to account for the
> total time. It looks to me as if all the searches are completing fairly
> quickly, based on logging statements that indicate all the documents in
> the batch have been "processed", and then the query just seems to hang
> for a while before returning. It seems to spend about 90% of the total
> time in this second stage. My assumption is this time is spent
> performing the updates, committing, indexing, writing a journal file, or
> something like that.
>
> My question is: should I expect this to be reflected in the optimizer?
> And is there some way I can figure out why it is taking so long, and
> what I can do about it? Maybe inserting a node would be faster than
> replacing? I've tried a tree-walk rather than lots of node-replaces,
> but that actually seemed quite a bit slower.
>
> Thanks for any suggestions!
>
> --
> Michael Sokolov
> Engineering Director
> www.ifactory.com
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
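
A note on Danny's suggestion of creating the new version of the document in memory and writing it once: the sketch below models that idea in Python rather than XQuery, purely as a language-neutral illustration. Nothing in it is MarkLogic API. The `store` dictionary stands in for the database, and `build_id_index`, `resolve_links`, `process_batch`, the `link`/`target` markup, and the sample URIs are all hypothetical names chosen for the example. It also assumes that an unresolved link is simply left without a `uri` attribute so a renderer can skip it; the thread does not say how missing links are marked.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the database: document uri -> serialized XML.
store = {
    "/docs/a.xml": '<doc><p id="a1">See <link target="b7"/> and <link target="zz"/>.</p></doc>',
    "/docs/b.xml": '<doc><p id="b7">Target paragraph.</p></doc>',
}
writes = []  # records each write so the number of writes is visible


def build_id_index(store):
    """Map every @id value to the uri of the document that contains it."""
    index = {}
    for uri, xml in store.items():
        for element in ET.fromstring(xml).iter():
            if "id" in element.attrib:
                index[element.attrib["id"]] = uri
    return index


def resolve_links(xml, index):
    """Resolve all links of one document in memory and return the new XML."""
    root = ET.fromstring(xml)
    for link in root.iter("link"):
        uri = index.get(link.attrib["target"])
        if uri is not None:
            # Resolved: record the destination document on the link itself.
            link.set("uri", uri)
        # Unresolved (assumption): leave the link without a uri attribute.
    return ET.tostring(root, encoding="unicode")


def process_batch(uris, store, index):
    """Rewrite each document once, regardless of how many links it has."""
    for uri in uris:
        store[uri] = resolve_links(store[uri], index)  # the single write
        writes.append(uri)


index = build_id_index(store)
process_batch(["/docs/a.xml"], store, index)
print(len(writes))
```

The point of the design is that the number of writes scales with the number of documents in the batch, not with the number of links. Whether that helps in a given MarkLogic setup depends on how updates within one request are actually applied, which is the question Geert raises above.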
