Hi Neil,

First of all, your code ends with an xdmp:unquote, but the search/replaces 
before it treat the text as plain text rather than XML. That is rather risky. 
I hope your search patterns are a bit smarter than the ones in your examples: 
you don't want to replace parts of element or attribute names by mistake when 
you intend to replace character data only. But perhaps your example is just an 
oversimplified version of what you are actually doing.
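
To illustrate the point, here is a minimal sketch of a replacement that only 
touches character data. It assumes the text is already well-formed XML before 
the replacements, which may not hold in your case. The function 
local:replace-text and the sample input are made up for this example.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: copies a node tree and applies one replacement
   to text nodes only, leaving element and attribute names untouched. :)
declare function local:replace-text(
  $node as node(),
  $pattern as xs:string,
  $replacement as xs:string
) as node()
{
  typeswitch ($node)
    case document-node() return
      document {
        for $child in $node/node()
        return local:replace-text($child, $pattern, $replacement)
      }
    case element() return
      element { fn:node-name($node) } {
        $node/@*,
        for $child in $node/node()
        return local:replace-text($child, $pattern, $replacement)
      }
    case text() return
      text { fn:replace($node, $pattern, $replacement) }
    (: Comments, processing instructions etc. are copied unchanged. :)
    default return $node
};

(: Sample input, an assumption for illustration only. :)
let $doc := xdmp:unquote("<Document>A Document</Document>")
return local:replace-text($doc, "Doc", "DOC")
```

The element name Document is left alone here, while a plain fn:replace on the 
serialized string would have changed it as well.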

Secondly, a let clause is allowed to redefine an existing variable, so you 
could write:

        let $Text := '....bla...'
        let $Text := replace($Text, 'bla', 'BLA')
        ...

So, you don't need to bother about xdmp:set. ;-)
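
Applied to your own example, that could look like the sketch below. The short 
sample string is an assumption that stands in for the large text document. 
Note that each let introduces a new binding that hides the previous one; 
whether the older strings are released early is up to the server.

```xquery
xquery version "1.0-ml";

(: Sketch only: the literal string stands in for the large text document. :)
let $Text := "<doc>A Document about Documents</doc>"
(: Each let rebinds $Text, hiding the previous binding of the same name. :)
let $Text := fn:replace($Text, "Doc", "DOC")
let $Text := fn:replace($Text, "ume", "UME")
let $Text := fn:replace($Text, "nt", "NT")
return xdmp:unquote($Text)
```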

Moreover, I am not sure that Mark Logic actually implements lets as 
variables, which makes it hard to predict what really happens when the query 
is executed. Those lets could just as well be treated as macro definitions 
that are substituted at the parsing stage. Someone from Mark Logic will have 
to confirm this. Optimizations are applied as well, so there is only one real 
option: test and measure.
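
A rough way to start measuring is sketched below, using xdmp:elapsed-time() 
to report the time spent since the start of the query. The synthetic input is 
an assumption; a real test should use one of your actual documents. This only 
shows timing, so memory use would still have to be watched on the server 
itself.

```xquery
xquery version "1.0-ml";

(: Synthetic input, an assumption for illustration: a real test should
   use one of the actual large text documents. :)
let $Text := fn:string-join(for $i in 1 to 100000 return "A Document ", "")
let $Text := fn:replace($Text, "Doc", "DOC")
let $Text := fn:replace($Text, "ume", "UME")
let $Text := fn:replace($Text, "nt", "NT")
return (
  (: Asking for the length forces the replacements to be evaluated. :)
  fn:string-length($Text),
  (: Time elapsed since the start of this query. :)
  xdmp:elapsed-time()
)
```

Running the same query with a different number of replace steps, or with the 
$NewText1, $NewText2 style of naming, gives a first idea of whether the 
variants behave differently at all.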

Last but not least: if you have many search/replace operations and worry that 
doing them all in a single pass would exhaust memory, you could convert it to 
a multi-step process and use the Content Processing Framework (CPF) to tie 
the separate steps together. It might be worth your while to look at CPF 
anyhow. It is really interesting for import processes.

Kind regards,
Geert


Drs. G.P.H. Josten
Consultant


http://www.daidalos.nl/
Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
KvK 27164984


> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Neil Bradley
> Sent: Friday, 9 October 2009 12:57
> To: [email protected]
> Subject: [MarkLogic Dev General] Text Updates Garbage Collection?
>
> Hi,
>
>
>
> I want to check if there is likely to be any problem with
> memory exhaustion in the following scenario.
>
>
>
> I will have text documents stored in a MarkLogic database
> that I want to update using a large number of consecutive
> search/replaces, then finally convert to XML.
>
>
>
> It seems obvious to me that I could easily run out of memory
> if I adopt this approach (and have hundreds of replaces
> applied to large text documents). In this trivial example, I
> am simply converting the word "Document" to "DOCUMENT" in
> three steps, which I would obviously do in one for real, but
> just to show the method I originally considered...
>
>
>
>     let $Text :=
> ".............................................................
> . (large text document).............................."
>
>     let $NewText1 := fn:replace($Text, "Doc", "DOC")
>
>     let $NewText2 := fn:replace($NewText1, "ume", "UME")
>
>     let $NewText3 := fn:replace($NewText2, "nt", "NT")
>
>     let $XML := xdmp:unquote($NewText3)
>
>     return
>
>       $XML
>
>
>
> I am assuming that each variable contains a variant of the
> text document, so memory will quickly become exhausted.
>
>
>
> However, if I use xdmp:set(), would that solve the problem,
> because the first variable content is being replaced, and the
> later variables have no content at all?...
>
>
>
>     let $Text :=
> ".............................................................
> . (large text document).............................."
>
>     let $NewText1 := fn:replace($Text, "Doc", "DOC")
>
>     let $NewText2 := xdmp:set($NewText1,
> fn:replace($NewText1, "ume", "UME"))
>
>     let $NewText3 := xdmp:set($NewText1,
> fn:replace($NewText1, "nt", "NT"))
>
>     let $XML := xdmp:unquote($NewText1)
>
>     return
>
>       $XML
>
>
>
> Or should I expect the old text to still be occupying memory
> (lack of string garbage collection)?
>
>
>
> Thanks,
>
>
>
> Neil.

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
