On Mar 22, 2010, at 4:06 AM, Steve Steinitz wrote:

> On 18/3/10, Ben Trumbull wrote:
> 
>> there wasn't a good solution for multiple machines simultaneously
>> accessing an SQLite db file (or most other incrementally updated
>> binary file formats).  By "good" I mean a solution that worked
>> reliably and didn't make other more important things work less well.
> 
> I'm curious about the reliability issues you saw.  Also, by less well do you 
> mean slower?

Because different NFS clients have file caches with differing amounts of 
staleness, and the SQLite db is updated incrementally, it's possible for an NFS 
client to think it has the latest blocks, then derive material from one block and 
write it into another (it is, after all, a b-tree).  The written blocks have 
implicit dependencies on all the other active blocks in the database, so working 
from stale data is bad.

>> For nearly all Mac OS X customers (sadly not you) achieving a near
>> 100x performance boost when accessing database files on an AFP or SMB
>> mount (like their home directory in educational deployments) is pretty
>> huge.
> 
> I agree.  But wouldn't those same educational institutions be prime 
> candidates for multiple machine access?

No.  The restriction applies to multiple physical machines holding the same 
database files open simultaneously.  While an AFP server can be configured (via 
an advanced setting) to allow multiple logins to the same account, in general 
AFP servers prevent users from logging in to the same account more than once at 
a time.

>> To address both sets of problems on all network FS, we enforce a
>> single exclusive lock on the server for the duration the application
>> has the database open.  Closing the database connection (or logging
>> out) allows another machine to take its turn.
> 
> Could my application close the database connection and re-open it to work 
> around the problem?  How would I do that?  I suppose once I got it going I'd 
> have to retry saves, fetches etc.

In theory, one could, but in practice that won't be very manageable without 
architectural changes.  Something along the lines of: open the remote 
connection, pull down the interesting data and save it to a local file, and 
close the remote connection.  Given that your current deployment setup provides 
adequate performance, and all you need is a bug fix, I'm not sure this would be 
very helpful.
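That shape can be sketched with Core Data's store-migration API; a minimal, hypothetical example (the two URLs and the function name are assumptions for illustration, not anything from the original message):

```swift
import CoreData

// Hypothetical locations: a database on a network mount, and a local working copy.
let remoteStoreURL = URL(fileURLWithPath: "/Volumes/Server/Shared/app.sqlite")
let localStoreURL  = URL(fileURLWithPath: NSTemporaryDirectory() + "working.sqlite")

func pullDownLocalCopy(model: NSManagedObjectModel) throws -> NSPersistentStoreCoordinator {
    let psc = NSPersistentStoreCoordinator(managedObjectModel: model)

    // Opening the remote store takes the server-side exclusive lock.
    let remoteStore = try psc.addPersistentStore(ofType: NSSQLiteStoreType,
                                                 configurationName: nil,
                                                 at: remoteStoreURL,
                                                 options: nil)

    // Copy the data down to a local file.  migratePersistentStore(_:to:options:withType:)
    // removes the remote store from the coordinator as a side effect, which closes the
    // remote connection and releases the lock for the next machine.
    _ = try psc.migratePersistentStore(remoteStore,
                                       to: localStoreURL,
                                       options: nil,
                                       withType: NSSQLiteStoreType)
    return psc
}
```

After this returns, the coordinator is backed only by the local copy, so subsequent fetches and saves never touch the network mount; pushing changes back is the part that needs the retry logic mentioned above.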

>> You'll get the 10.5 performance characteristics, however.
> 
> Again, the 10.5 performance over gigabit ethernet is almost unbelievably 
> fast.  I may know why.  Despite your helpful explanations I'm still not 
> exactly clear on the relationship between caching and locking, but I wonder 
> if the speed we are seeing is helped by the fact that the entire database 
> fits into the Synology NAS's 128meg cache?

That probably doesn't hurt.

> In another message in this thread, you made a tantalizing statement:
> 
>> Each machine can have its own database and they can share their results
>> with NSDistributedNotification or some other IPC/networking protocol. You 
>> can hook into the NSManagedObjectContextDidSaveNotification to track
>> when one of the peers has committed changes locally.
> 
> Let me guess how that would work: before saving, the peer would create a 
> notification containing the inserted, updated and deleted objects as recorded 
> in the MOC.  The receiving machine would attempt to make those changes on its 
> own database.  Some questions:
> 
>    Would that really be feasible?

Yes, but as you observe, it's more tractable for simple data records than for 
complex graphs with common merge conflicts.
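The basic loop can be wired up roughly like this, using the didSave notification Ben mentions (the distributed-notification name is an assumption; note that distributed notifications only cross process boundaries on one machine, so truly separate machines need DO or another networking layer, as the quoted message says):

```swift
import CoreData
import Foundation

// Assumed name for the peer broadcast; any agreed-upon string works.
let peerNoteName = Notification.Name("com.example.PeerDidSaveChanges")

// Whenever the local context saves, rebroadcast a plist-safe summary of what changed.
let observer = NotificationCenter.default.addObserver(
    forName: .NSManagedObjectContextDidSave, object: nil, queue: nil) { note in

    var payload: [String: [String]] = [:]
    for key in [NSInsertedObjectsKey, NSUpdatedObjectsKey, NSDeletedObjectsKey] {
        let objects = note.userInfo?[key] as? Set<NSManagedObject> ?? []
        // objectID URIs encode which store they came from; a per-record UUID
        // attribute is the other option discussed later in the thread.
        payload[key] = objects.map { $0.objectID.uriRepresentation().absoluteString }
    }

    DistributedNotificationCenter.default().postNotificationName(
        peerNoteName, object: nil, userInfo: payload, deliverImmediately: true)
}
```

A receiving peer observes `peerNoteName`, maps each identifier back to its own local objects, and applies the inserts, updates, and deletes to its own context.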

>    Would it be a problem that each machine would have different 
> primary keys?

Yes.

>    How would the receiving machine identify the local objects that changed 
> remotely?

Typically this is done with a UUID & a fetch.  Since the database on each 
client is different, the stores themselves have different UUIDs, and any 
encoded NSManagedObjectID URI will note which store the objectID came from, so 
you could also map the objectID URIs to the local values directly.
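The UUID-and-fetch lookup is a one-predicate fetch, assuming each entity carries its own `uuid` string attribute (an assumption — Core Data does not add one for you, so the model must define it, ideally indexed):

```swift
import CoreData

// Hypothetical helper: find the local counterpart of a remotely changed object.
func localObject(forUUID uuid: String,
                 entityName: String,
                 in context: NSManagedObjectContext) throws -> NSManagedObject? {
    let request = NSFetchRequest<NSManagedObject>(entityName: entityName)
    request.predicate = NSPredicate(format: "uuid == %@", uuid)
    request.fetchLimit = 1          // UUIDs are unique, so one result at most
    return try context.fetch(request).first
}
```

If the helper returns nil for an update, the peer never received the original insert; a robust implementation treats that as an insert instead.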

>    Could relationships (indeed the object graph itself) feasibly be 
> maintained?

Yes.  Updates to to-many relationships require an (additions, subtractions) 
pair instead of simply setting the new contents.
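The (additions, subtractions) pair is just two set differences computed before and after a save; a minimal sketch over member UUID strings (the element type is an assumption):

```swift
// Compute the delta for a to-many relationship rather than its full contents,
// so a peer applies only the changes and doesn't clobber its own concurrent edits.
func relationshipDelta(old: Set<String>, new: Set<String>)
        -> (additions: Set<String>, subtractions: Set<String>) {
    (new.subtracting(old), old.subtracting(new))
}

let delta = relationshipDelta(old: ["a", "b", "c"], new: ["b", "c", "d"])
// delta.additions contains "d"; delta.subtractions contains "a".
```

The receiving peer then inserts the additions into and removes the subtractions from its local relationship, instead of replacing the whole set.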

>    How would relationships between the remote objects be identified?  
> Hand-parsing?

Either by UUID or objectID URI.

>    Has anyone done it that you know of?

Yes.  I'm aware of 4 solutions; however, I would only recommend 1 as 
appropriate for the general audience (skill, time, pain threshold), and it 
avoids complex relationship graphs: basic data record replication over DO.

>    Is there sample code?


No.  The only real trick in converting the didSave notification into something 
appropriate for DO to consume is to copy it and replace the NSManagedObjects 
with dictionaries, each containing a UUID instead of an object ID, the 
attribute values, and the relationship contents expressed as lists of UUIDs.
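That conversion can be sketched as follows, again assuming every entity has a `uuid` string attribute and that related objects have one too (both assumptions on top of the described trick):

```swift
import CoreData

// Reduce one NSManagedObject to a DO-safe dictionary: its UUID, its attribute
// values, and each relationship expressed as UUID(s) of the members.
func summary(of object: NSManagedObject) -> [String: Any] {
    var dict: [String: Any] = [:]
    dict["uuid"] = object.value(forKey: "uuid") as? String ?? ""

    // Plain attribute values (strings, numbers, dates) transfer as-is.
    for (name, _) in object.entity.attributesByName where name != "uuid" {
        dict[name] = object.value(forKey: name)
    }

    // Relationships become UUIDs rather than objectIDs, which are store-specific.
    for (name, relationship) in object.entity.relationshipsByName {
        if relationship.isToMany {
            let members = object.value(forKey: name) as? Set<NSManagedObject> ?? []
            dict[name] = members.compactMap { $0.value(forKey: "uuid") as? String }
        } else if let member = object.value(forKey: name) as? NSManagedObject {
            dict[name] = member.value(forKey: "uuid") as? String
        }
    }
    return dict
}
```

Mapping `summary(of:)` over the inserted, updated, and deleted sets in the didSave notification's userInfo yields a payload that any peer can decode without reference to the sender's store.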

- Ben



_______________________________________________

Cocoa-dev mailing list ([email protected])
