Is there a CPF pipeline attached to the database (or some other triggers)?  If 
so, that might be what is causing the updates.

In general, batches are more efficient than single updates, maybe batches of
100 or 1000 (or more or less...), depending.  As you said, though, a failure
means that the whole batch fails.  But that might not be a very big deal, as
long as you have logic to deal with that.  Another thing that might work, even
with single updates, is to run many threads from different clients, especially
if you have enough resources.  If you have a single client, it is likely
serializing in a single thread, so each transaction might have to wait for the
previous one to commit.  Another option is to have a single client spawn a
bunch of tasks to do updates.
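
To make the batching and threading ideas concrete, here is a rough sketch in
Python.  It is not MarkLogic-specific: insert_batch is a hypothetical stand-in
for whatever call your client uses to write a list of documents in one
transaction, and the (uri, content) tuple shape, the batch size and the worker
count are all assumptions.  Each batch is tried as a whole; if it fails, the
documents in it are retried one at a time so that only the bad ones are
rejected.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, Tuple

# Hypothetical document shape: (uri, content).
Doc = Tuple[str, str]
# Hypothetical writer: inserts every document in one transaction, or raises.
InsertBatch = Callable[[Sequence[Doc]], None]


def chunked(docs: Sequence[Doc], size: int) -> Iterator[Sequence[Doc]]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def load_batch(batch: Sequence[Doc], insert_batch: InsertBatch) -> List[str]:
    """Try the whole batch at once; on failure, retry document by document.

    Returns the URIs of the documents that could not be loaded.
    """
    try:
        insert_batch(batch)
        return []
    except Exception:
        failed = []
        for doc in batch:
            try:
                insert_batch([doc])
            except Exception:
                failed.append(doc[0])
        return failed


def load_all(docs: Sequence[Doc], insert_batch: InsertBatch,
             batch_size: int = 100, workers: int = 4) -> List[str]:
    """Load batches concurrently and collect the URIs that were rejected."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda b: load_batch(b, insert_batch),
                           chunked(docs, batch_size))
        return [uri for failed in results for uri in failed]
```

Calling load_all(docs, insert_batch) returns the list of rejected URIs, which
you can log or reprocess later.  The fallback costs extra round trips only for
batches that actually contain a bad document.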

Again, without knowing what the code is doing it is hard to speculate; 1 per 
second might be slow or might not, depending on what it is doing.

-Danny

From: [email protected] 
[mailto:[email protected]] On Behalf Of Bob O
Sent: Thursday, May 23, 2013 12:57 PM
To: [email protected]
Subject: [MarkLogic Dev General] Updates creating additional documents

Hello again!

So, I'm doing some investigation into my new ML project and I'm finding some 
weird things happening:

-When updates are done, it creates additional documents (creating rather than 
updating).  We would get new documents when we try to update a particular field 
that is part of the Data Access Descriptor (DAD) object, such as the URI that 
points to the document's product. What would cause this? I'm thinking it is 
some logic in their code.

-Ingest runs at about one document per second, which seems really slow to me 
(average document size is approximately 15 KB). On my last project, we would 
batch 1,000 documents in one file and that seemed to work better for us. The 
only drawback is that if one document is rejected during the ingest, the 
entire batch of 1,000 doesn't get sent. It sends up to the point where the 
corrupted document comes up. For example, if the 999th document fails, only 
998 get sent through. Is batch processing something we should consider now?

Any thoughts?

Any suggestions are appreciated. Thanks in advance!

~~Bob O.
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
