Re: [MarkLogic Dev General] Processing Large Documents?

Todd Gochenour Sun, 19 Feb 2012 22:57:29 -0800

This advice repeats a recommendation I came across earlier tonight in my
research, namely that with MarkLogic it's better to break documents up
into smaller fragments.  I gather there's a performance gain from bursting
a document into small fragments, perhaps something to do with concurrency
and locking, or with minimizing the depth of the hierarchy?


Note that my document equates not to a table but to the entire database,
which is two levels removed from the recommendation that each document
equate to a row.  The conventional wisdom seems to be to burst large
documents into smaller fragments so that each fragment can be handled
independently.  I've always found it simpler and more accurate to load and
use the XML file as is rather than shred it into multiple parts.  That is
the very reason I want to replace the MySQL database with an XML database.
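
To make the "two levels" concrete, here is a rough sketch of what bursting a
database-level document into row-level documents might look like.  It is
plain Python, not MarkLogic code, and the element names (database, table,
row), the sample data, and the /<table>/<id>.xml URI scheme are all made up
for illustration; the real export will be shaped differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a database-level export: database -> table -> row.
SAMPLE = """<database name="shop">
  <table name="customer">
    <row id="1"><name>Ann</name></row>
    <row id="2"><name>Bob</name></row>
  </table>
  <table name="order">
    <row id="10"><customer>1</customer></row>
  </table>
</database>"""


def burst(xml_text):
    """Split a database-level document into one small document per row.

    Returns a dict mapping a made-up URI (/<table>/<id>.xml) to the
    serialized row element.
    """
    root = ET.fromstring(xml_text)
    fragments = {}
    for table in root.findall("table"):
        for row in table.findall("row"):
            uri = "/{}/{}.xml".format(table.get("name"), row.get("id"))
            # tostring() also emits the whitespace that follows the element,
            # so strip it to get a clean standalone fragment.
            fragments[uri] = ET.tostring(row, encoding="unicode").strip()
    return fragments


if __name__ == "__main__":
    for uri, doc in sorted(burst(SAMPLE).items()):
        print(uri, doc)
```

Each row ends up as its own small document keyed by a URI, which is the
"documents equate to rows" layout; the single-file approach I prefer keeps
the whole tree under one URI instead.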

So I've managed to load this large document into the database, and I've run
my first transformation of it, using XQuery to perform the extraction.
Performance seems rather impressive.  I've done the same thing with both
eXistDB and xDB with no problem, indexing everything including the deep
hierarchical structure.  Once the document is in the database, I should be
able to update fragments within it as easily as if those fragments had been
burst into individual files.  Is there a technical reason, one I have yet
to discover, why this would not be the case?

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
