Re: [MarkLogic Dev General] Too Many Stands

Steiner, David J. (LNG-DAY) Tue, 30 Oct 2012 05:52:23 -0700

So, if I understand you correctly, you're saying that turning off merging 
isn't an effective technique to use during a load.
However, merges use memory, and leaving them "on" interferes with other 
memory-based operations like transformations.  I continue to get expanded tree 
cache exceptions when I leave merging on while trying to load, and I don't when 
I turn it off.
It would appear that's because ML evenly distributes the documents across the 
forests and thus, when it's time to merge, they all merge, which leaves no 
memory for the collections and transformations that are going on.
So, instead of being able to load hundreds of millions of records however the 
data comes to me in files, you're saying I need to figure out how to pre-batch 
my data to ensure that I don't have memory issues?  That's just shifting the 
burden from one place to another and it still doesn't help me load data.  If my 
load dies because memory gets exhausted, how's that any better than dying 
because I've run out of stands?


Let me try my question another way then: In general, how many fragments does it 
take to make up a stand?  I'm guessing it's a fragment thing that's making the 
newer stands as data is being loaded because it can't be a document size thing 
since my documents are 3 elements and at most 100 - 200 characters.

David

From: [email protected] 
[mailto:[email protected]] On Behalf Of Charles Greer
Sent: Monday, October 29, 2012 1:21 PM
To: MarkLogic Developer Discussion
Cc: Steiner, David J. (LNG-DAY)
Subject: Re: [MarkLogic Dev General] Too Many Stands

My understanding is that you should really not turn off merges -- merging has 
improved a lot in later versions, and while it can be a performance hit if the 
server starts merging during a big load, MarkLogic does a better job now with 
scheduling and throttling merges than in the past.  Moreover, merging greatly 
improves the performance of the database after the fact.

I think you should probably look to other means for helping with ingest times 
-- batch inserts (multiple docs per transaction) are probably the biggest 
improvement you can get (but of course this is highly dependent on your 
document structures).
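
As a rough illustration of the batching idea, here is a minimal client-side 
sketch in Python.  It only shows how a stream of documents could be grouped so 
that each group is sent as one transaction.  The names `batches`, `load`, and 
`insert_batch` are hypothetical and are not part of any MarkLogic API; 
`insert_batch` stands in for whatever call you use to insert several documents 
in a single transaction, and the right batch size would need to be found by 
experiment.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical document shape: a (uri, xml) pair.
Doc = Tuple[str, str]


def batches(docs: Iterable[Doc], size: int) -> Iterator[List[Doc]]:
    """Yield consecutive lists of at most `size` documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load(docs: Iterable[Doc], size: int,
         insert_batch: Callable[[List[Doc]], None]) -> int:
    """Call `insert_batch` once per batch; return the number of calls.

    `insert_batch` is an assumed callback that inserts every document
    in the batch within one transaction.
    """
    transactions = 0
    for batch in batches(docs, size):
        insert_batch(batch)
        transactions += 1
    return transactions
```

Because `batches` consumes its input lazily, the documents can be read from 
files as they arrive without holding the whole data set in memory on the 
client.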

Charles



On 10/29/2012 09:53 AM, Steiner, David J. (LNG-DAY) wrote:
Hello,

I thought I'd seen in the documentation that one could "speed up" loading by 
turning off merges, so I did.  It seemed to work pretty well until I got this 
error:

XDMP-TOOMANYSTANDS: xdmp:eval("import module namespace infodev = 
&quot;http://marklogic.com/app...", (fn:QName("", "document"), 
fn:doc("[uri].xml"), fn:QName("", "path"), ...), <options 
xmlns="xdmp:eval"><database>1385720675613291619</database></options>) -- Too 
many stands

So, apparently a periodic merge is required to even proceed with loading.  Is 
there documentation on how to know when a merge would be needed?  For instance, 
I have X docs to load into Y forests so at most I can load X/Z docs, then I'll 
need to manually merge before more loading.

Thanks,
David




--
Charles Greer
Senior Engineer
MarkLogic Corporation
[email protected]
Phone: +1 707 408 3277
http://www.marklogic.com




