Re: [MarkLogic Dev General] New Module for Memory Operations on XML

Whitby, Rob, Springer Healthcare UK Tue, 17 Apr 2012 13:11:52 -0700

Hi Ryan,

This is really interesting, thanks for sharing it.

I recently encountered really poor performance using the in-mem-update module, 
and modified it slightly to use fn:generate-id().
https://github.com/robwhitby/commons/tree/master/memupdate

In my simple test of deleting nodes, the in-mem-update module takes 13.8s;
modifying it to use fn:generate-id() improves this to 0.25s. I just tried your
module and got 0.04s! Obviously this is just one use case, but it's really
impressive nonetheless. Do you have unit tests you could share on GitHub? Or
perhaps there are existing tests for in-mem-update that could be applied?

Thanks again,
Rob

-----Original Message-----
From: [email protected] on behalf of Ryan Dew
Sent: Tue 4/17/2012 18:17
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] New Module for Memory Operations on XML

The function mapping idea is good. I'm not quite sure how I would use
cts:highlight; I'll have to think on that one. I wanted to make it easy for
the module to be fully XQuery 1.0 compatible. Currently I have some
commented-out code to replace the functionality of fn:generate-id (an XQuery 3.0
function) to generate a unique id for a node (mine is a little slower, but
the module still provides overall better performance). I might consider
forking it so one version is fully XQuery 1.0 compliant and another is
tailored to MarkLogic.

Thanks for the suggestions!

-Ryan Dew

On Tue, Apr 17, 2012 at 11:03 AM, Michael Blakeley <[email protected]> wrote:

> Geert, I expect that the xdmp update functions also operate by walking the
> input tree and copying it to an output tree. Otherwise how would you have
> multi-version concurrency?
>
> But the xdmp functions are implemented in C++, which makes a difference.
> You might be able to quantify that difference by comparing
> xdmp:node-replace with the equivalent in-memory operations plus
> xdmp:document-insert. That kind of evidence could help persuade someone at
> MarkLogic that the feature would be worthwhile.
>
> Ryan, I think you could improve performance even more with judicious use
> of function mapping. It is often faster than FLWOR expressions are. You
> might also see if there is a way to use cts:highlight for some operations,
> since that is a C++ function.
>
> -- Mike
>
> On 17 Apr 2012, at 07:02 , Geert Josten wrote:
>
> > Where can we find the code itself?
> >
> > And how much does it resemble the kind of updates allowed in XQUF?
> >
> > By the way, I was kind of hoping MarkLogic would allow applying the xdmp
> > node update functions (or copies of those) to in-memory structures as well.
> > Direct manipulation of the tree, without copying it recursively, would be
> > way faster.
> >
> > Kind regards,
> > Geert
> >
> >
> > From: [email protected] [mailto:[email protected]] On Behalf Of Ryan Dew
> > Sent: Tuesday, 17 April 2012 15:47
> > To: MarkLogic Developer Discussion
> > Subject: [MarkLogic Dev General] New Module for Memory Operations on XML
> >
> > I've been working on my own module for updating XML in memory. It has
> > greater functionality than the module shipped with MarkLogic, such as
> > performing multiple operations at one time, and better performance from
> > what I have been able to measure. You can see my post on it at
> > http://maxdewpoint.blogspot.com/2012/04/lessons-learned-from-xquery-xml-memory.html.
> > I would love to get some input from the MarkLogic community on this.
> >
> > -Ryan Dew
> > _______________________________________________
> > General mailing list
> > [email protected]
> > http://developer.marklogic.com/mailman/listinfo/general
>
