Hi!

  Well, on my PowerBook G4 1.67 GHz, with no conflicts yet:

  1050 records saved one by one: 14 secs
  1050 records saved in 50 record batches: 6 secs
  1050 records saved in 500 record batches: 5 secs
  10000 records saved in 50 record batches: 31 secs
  10000 records saved in 500 record batches: 26 secs
[Curiosity] pasting a 10k line text block into a Safari text area: >1 minute!

  Given these results, some preliminary conclusions:

1) This takes much less time than I expected. If it runs at this speed on my old PowerPC, with a slow drive, lots of processes running (including Eclipse), the WO app running in development mode etc, then on a real server (with Intel procs) it should run even faster (much faster, from what I've seen of Java running on Intel).

2) There is no significant difference between batch sizes of 50 and 500. I was NSLogging every time I saved a batch, so I did a lot more logging in the 50-record batch tests. Logging takes a lot of time, so I think overall it's not that different.

3) Inserting one by one is noticeably slower, but not THAT much slower (I logged every 50 inserts, so no extra logging time here).

So, I think what I'll do is write in batches of 50 or so, and if a batch fails, write that batch's contacts one by one. It's probably a bit slower than fetching, removing duplicates and saving, but it's not that bad, it's much easier to code, and it won't fail a second time if concurrent updates are being made (each contact will be saved, or not, period). It's actually fast enough that it doesn't need a background process; an AJAXed long response will do.
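
For what it's worth, the batch-then-fallback idea could be sketched roughly like this in plain Java. This is only a sketch under assumptions: BatchSaver, saveAll and the saveBatch callback are hypothetical names, not WO API, and the callback is assumed to be all-or-nothing (it either saves the whole list or throws and saves none of it).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

public class BatchSaver {

    /**
     * Saves records in batches of batchSize. When a batch fails, each record
     * of that batch is retried on its own. Returns the records that were
     * rejected even when saved individually.
     *
     * Assumption: saveBatch is all-or-nothing, i.e. it throws a
     * RuntimeException and saves nothing if any record in the list conflicts.
     */
    public static <T> List<T> saveAll(List<T> records, int batchSize,
                                      Consumer<List<T>> saveBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> rejected = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            List<T> batch = records.subList(start, end);
            try {
                // Fast path: the whole batch goes in with a single save.
                saveBatch.accept(batch);
            } catch (RuntimeException batchFailure) {
                // Slow path: retry one by one so only the offenders are dropped.
                for (T record : batch) {
                    try {
                        saveBatch.accept(Collections.singletonList(record));
                    } catch (RuntimeException recordFailure) {
                        rejected.add(record);
                    }
                }
            }
        }
        return rejected;
    }
}
```

In the real app the callback would presumably insert the contacts and save the editing context, putting it back in a clean state when the save throws; that part is left out here because it depends on how the app is set up.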

  Thanks to all of you for the help!

  Yours

Miguel Arroz

On 2008/01/15, at 15:19, Mike Schrag wrote:

1) Do a fetch request to get the contacts with the emails of the 100 contacts batch (ie, blablabla where email = email1 or email = email2 or email = email3 ...). 2) Remove duplicates in memory using a fast method, like putting the stuff in NSSets or whatever. 3) Try to save again. Of course, it may still fail (concurrency sucks) but the probability is much lower.
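
The in-memory part of step 2 could look something like the sketch below. It uses plain java.util sets rather than NSSet, EmailDeduper is a made-up name, and the set of already-stored emails is assumed to come from the fetch in step 1. Emails are compared exactly; whether that matches the database's UNIQUE constraint (case sensitivity and so on) is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EmailDeduper {

    /**
     * Returns the emails from the batch that are neither already stored nor
     * repeated earlier in the same batch, keeping the original order.
     *
     * Assumption: existingEmails holds the result of the fetch for this
     * batch, and emails are matched exactly (no case folding or trimming).
     */
    public static List<String> removeDuplicates(List<String> batchEmails,
                                                Set<String> existingEmails) {
        // Start from what the database already has, then add as we go.
        Set<String> seen = new HashSet<>(existingEmails);
        List<String> fresh = new ArrayList<>();
        for (String email : batchEmails) {
            // Set.add returns false when the email was already present.
            if (seen.add(email)) {
                fresh.add(email);
            }
        }
        return fresh;
    }
}
```

Set lookups are constant time on average, so this pass is linear in the batch size and should be cheap next to the fetch and the save.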

This all assumes that the UNIQUE-related exception is thrown when the first offending object is inserted, so I won't get all the information I need in a single exception; I'm not 100% sure that's true yet.
Depending on how your unique constraint is configured, it may throw when the first conflicting insert happens or at the end of the commit (this is that "deferrable initially deferred" thing, which I've honestly never tried on a unique constraint, but presumably it works the same).

The only thing I would consider is how frequent conflicts will be. If conflicts will be frequent, it may be cheaper to fetch dupes first to weed them out (so you're not constantly failing out 100-insert blocks).

I think if I were in your position I would just benchmark:
1) committing one at a time -- this is logically the easiest, but the overhead may be way too high ... but WO doesn't do batching inserts ANYWAY, so who knows
2) fetching 100, comparing, deduping, then inserting and committing
3) inserting 100, committing, catch exception (fetch 100, comparing, deduping, inserting, rinse and repeat)

You might also just benchmark the fetching and the inserting independently so you know the relative cost of 100 of each for your average data.
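
A tiny timing helper is enough to measure the two steps separately. The sketch below is plain Java with hypothetical names (SaveBenchmark, timeMillis, recordsPerSecond); the fetch and the insert would each be passed in as a Runnable. A single wall-clock run like this is only a rough number, so it's worth repeating it a few times.

```java
import java.util.concurrent.TimeUnit;

public class SaveBenchmark {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    public static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Converts a record count and an elapsed time into records per second. */
    public static double recordsPerSecond(int records, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return records * 1000.0 / elapsedMillis;
    }
}
```

Timing the fetch of 100 and the insert of 100 as two separate calls to timeMillis gives the relative cost of each, which is what decides between options 2 and 3.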

ms

_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      ([email protected])
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/webobjects-dev/arroz%40guiamac.com

This email sent to [EMAIL PROTECTED]

Miguel Arroz
http://www.terminalapp.net
http://www.ipragma.com


