Hi Geert,

 

I wasn't sure exactly how it was working, so it's good to know that there's a
reason for it.  Too small a batch size can slow down a query, so finding the
largest workable size can be difficult.  In parts of the application I've
made this adjustable.

 

This current issue could be due to large documents, so your advice of a
smaller batch makes sense.  Thanks for your help!

 

Gary

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Geert Josten
Sent: Thursday, November 07, 2013 2:56 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] export / import ?

 

Hi Gary,

 

Not data related: well, yes and no. I indeed believe that MarkLogic does
some kind of garbage collection after each iteration. The FLWOR statement is
very well optimized in MarkLogic. It has to be, as it is the basis for
sorting cts:search results.

 

But your batch size is still rather large. If those 1000 docs happen to be
relatively large, that will easily max out the memory reserved for that
thread, such as the tree cache.

 

Kind regards,

Geert

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Gary Larsen
Sent: Wednesday, November 06, 2013 2:42 PM
To: 'MarkLogic Developer Discussion'
Subject: Re: [MarkLogic Dev General] export / import ?

 

Hi Geert,

 

I use this syntax on queries which return many results.  It appears that the
objects from cts:search() are able to be removed from the cache after each
iteration.

 

I didn't think the problem would be data related (large documents), but I need
to investigate that.  Thanks for the MLCP link.  I was putting off upgrading
to version 6 but now have a good reason.

 

 

Gary Larsen

Envisn Inc.

508 259-6465

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Geert Josten
Sent: Tuesday, November 05, 2013 3:46 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] export / import ?

 

Hi Gary,

 

Different memory settings, larger documents, the total number of documents,
things like that.

 

From the looks of your code, you are still trying to handle all documents
within one query. How about spawning tasks to do your import/export? Perhaps
making $incr a bit smaller too.
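
A minimal sketch of the task-spawning idea, not a tested implementation. It
assumes a hypothetical task module /export-batch.xqy that declares external
variables $start and $end and rebuilds the same query itself; the collection
name and the batch size of 500 are placeholders too:

```xquery
xquery version "1.0-ml";

(: Assumption: the real $cq comes from the application; a collection
   query on a placeholder collection name stands in for it here. :)
declare variable $cq as cts:query := cts:collection-query("my-collection");

(: Smaller batch than the original 5000; tune per site. :)
let $incr := 500
let $size := xdmp:estimate(cts:search(fn:doc(), $cq, "unfiltered"))
let $segs := xs:integer(fn:ceiling($size div $incr))
for $x in (1 to $segs)
let $start := (($x - 1) * $incr) + 1
let $end := $start + $incr - 1
return
  (: Each batch runs as its own task on the Task Server, so no single
     request has to hold every document in memory.
     /export-batch.xqy is hypothetical; it would run
     cts:search(fn:doc(), $cq, "unfiltered")[$start to $end]
     and export or insert that slice. :)
  xdmp:spawn("/export-batch.xqy",
    (xs:QName("start"), $start, xs:QName("end"), $end))
```

Each spawned task gets its own request, which is the point of the split: the
memory used for one batch can be released before the next one runs.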

 

Information Studio takes that approach too. It might be worth taking a closer
look at that as well. I used it as the foundation for import/export in one of
my projects.

 

Kind regards,

Geert

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Gary Larsen
Sent: Tuesday, November 05, 2013 5:20 PM
To: 'MarkLogic Developer Discussion'
Subject: Re: [MarkLogic Dev General] export / import ?

 

I'll check it out.  Thanks.

 

I've had good luck using this construct to avoid the tree cache errors when
a query returns a lot of data.  Not sure why it's not working at this site.

 

        (: $cq is a cts:query built elsewhere; $incr is the batch size :)
        let $incr := 5000
        let $size := xdmp:estimate(cts:search(doc(), $cq, 'unfiltered'))
        let $segs := ceiling($size div $incr)
        return
        for $x in (1 to $segs)
            let $start := (($x - 1) * $incr) + 1
            let $end := $start + $incr - 1
            return cts:search(doc(), $cq, 'unfiltered')[$start to $end]

 

Gary Larsen

Envisn Inc.

508 259-6465

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Eric Bloch
Sent: Tuesday, November 05, 2013 11:01 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] export / import ?

 

Gary, 

 

See http://developer.marklogic.com/products/mlcp.

 

That said, tree cache errors usually mean you have a query that needs to be
rewritten in some way.  For example, see

 

http://stackoverflow.com/questions/16979086/marklogic-expanded-tree-cache-error-while-inserting-documents

 

Best,

Eric

 

Eric Bloch

Director, Community

MarkLogic Corporation

 

desk +1 650 655 2390 | mobile +1 650 339 0376

email  [email protected]

web    developer.marklogic.com

twitter @eedeebee

 

On Nov 5, 2013, at 7:59 AM, Gary Larsen <[email protected]> wrote:

 

Is there a utility which will allow me to extract all the documents from a
collection, and then import them into another database?

 

Trying to debug tree cache errors at a customer site and need to replicate
the error.  I'd rather not try to use backup/restore of the database due to
the size.

 

Thanks,

Gary

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general

 
