Hi Neil,
First of all, you end your code with xdmp:unquote after doing search/replaces
that treat the text as plain text rather than XML. That is rather risky. I hope
your search patterns are a bit smarter than the ones you give as examples: you
don't want to replace parts of element or attribute names by mistake when you
intend to perform replacements in character data. But perhaps your example is
just an oversimplified version of what you are actually doing.
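To illustrate the risk, here is a minimal sketch of restricting a replacement
to character data. It only applies if the replacements can be done after the
text has been parsed as XML, which may not match your real process. The
function local:replace-text and the sample document are made up for this
example; only fn:replace and the typeswitch expression are standard XQuery.

```xquery
xquery version "1.0";

(: Hypothetical helper: copies a node tree and applies one replacement
   to text nodes only, leaving element and attribute names and
   attribute values untouched. :)
declare function local:replace-text(
  $node as node(),
  $pattern as xs:string,
  $replacement as xs:string
) as node()?
{
  typeswitch ($node)
    case document-node() return
      document {
        for $child in $node/node()
        return local:replace-text($child, $pattern, $replacement)
      }
    case element() return
      element { fn:node-name($node) } {
        $node/@*,
        for $child in $node/node()
        return local:replace-text($child, $pattern, $replacement)
      }
    case text() return
      text { fn:replace($node, $pattern, $replacement) }
    (: comments and processing instructions are copied as they are :)
    default return $node
};

(: Sample input, invented for this sketch. :)
let $doc :=
  <Document title="Document one"><p>A Document about documents.</p></Document>
return local:replace-text($doc, "Document", "DOCUMENT")
```

With this approach only the text inside the p element should change, while
the Document element name and the title attribute stay as they were. A plain
fn:replace on the serialized text would have changed all three.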
Secondly, a let clause is allowed to reuse the name of an existing variable;
the new binding simply shadows the earlier one. So you could write:
let $Text := '....bla...'
let $Text := replace($Text, 'bla', 'BLA')
...
So you don't need to bother with xdmp:set. ;-)
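If the list of replacements is long, the same idea can be written as a small
recursive function instead of hundreds of let clauses. This is only a sketch:
local:replace-all is a name invented here, the sample string is made up, and I
make no claim about how much memory Mark Logic uses while evaluating it.

```xquery
xquery version "1.0";

(: Hypothetical helper: applies the patterns one after another,
   pairing $patterns[n] with $replacements[n]. :)
declare function local:replace-all(
  $text as xs:string,
  $patterns as xs:string*,
  $replacements as xs:string*
) as xs:string
{
  if (fn:empty($patterns)) then
    $text
  else
    local:replace-all(
      fn:replace($text, $patterns[1], $replacements[1]),
      fn:subsequence($patterns, 2),
      fn:subsequence($replacements, 2)
    )
};

(: The three steps from the original example, applied in order. :)
local:replace-all(
  "A Document about Document handling",
  ("Doc", "ume", "nt"),
  ("DOC", "UME", "NT")
)
```

Each call only holds the result of the previous step, so no intermediate
result needs a name of its own.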
Moreover, I am not sure that Mark Logic actually implements let clauses as
variables, which makes it hard to predict what really happens when the query is
executed. Those let clauses could just as well be treated as macro definitions
that are expanded at the parsing stage. Someone from Mark Logic will have to
confirm this. Optimizations are applied as well, so there is only one real
option: test and measure.
Last but not least: if you have many search/replace operations and worry that
doing them in a single pass would exhaust memory, you could convert it to a
multi-step process and use the Content Processing Framework (CPF) to tie the
separate steps together. It might be worth your while to take a look at CPF
anyhow. It is really interesting for import processes.
Kind regards,
Geert
Drs. G.P.H. Josten
Consultant
http://www.daidalos.nl/
Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
KvK 27164984
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Neil Bradley
> Sent: Friday 9 October 2009 12:57
> To: [email protected]
> Subject: [MarkLogic Dev General] Text Updates Garbage Collection?
>
> Hi,
>
>
>
> I want to check if there is likely to be any problem with
> memory exhaustion in the following scenario.
>
>
>
> I will have text documents stored in a MarkLogic database
> that I want to update using a large number of consecutive
> search/replaces, then finally convert to XML.
>
>
>
> It seems obvious to me that I could easily run out of memory
> if I adopt this approach (and have hundreds of replaces
> applied to large text documents). In this trivial example, I
> am simply converting the word "Document" to "DOCUMENT" in
> three steps, which I would obviously do in one for real, but
> just to show the method I originally considered...
>
>
>
> let $Text :=
> ".............................................................
> . (large text document).............................."
>
> let $NewText1 := fn:replace($Text, "Doc", "DOC")
>
> let $NewText2 := fn:replace($NewText1, "ume", "UME")
>
> let $NewText3 := fn:replace($NewText2, "nt", "NT")
>
> let $XML := xdmp:unquote($NewText3)
>
> return
>
> $XML
>
>
>
> I am assuming that each variable contains a variant of the
> text document, so memory will quickly become exhausted.
>
>
>
> However, if I use xdmp:set(), would that solve the problem,
> because the first variable content is being replaced, and the
> later variables have no content at all?...
>
>
>
> let $Text :=
> ".............................................................
> . (large text document).............................."
>
> let $NewText1 := fn:replace($Text, "Doc", "DOC")
>
> let $NewText2 := xdmp:set($NewText1,
> fn:replace($NewText1, "ume", "UME"))
>
> let $NewText3 := xdmp:set($NewText1,
> fn:replace($NewText1, "nt", "NT"))
>
> let $XML := xdmp:unquote($NewText1)
>
> return
>
> $XML
>
>
>
> Or would I still expect old text to still be occupying memory
> (lack of string garbage collection)?
>
>
>
> Thanks,
>
>
>
> Neil.
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general