Re: Design pattern for bulk data handling
Thank you for the comments everyone, however I think I must have phrased the question badly. I am not looking for assistance on speeding up SQLite, more for designing a cleaner API that works well with this. I am already pretty familiar with SQLite and it's various options. To try and clarify: I have an database object that looks something like this: @interface MyDB : NSObject - (id)initWithFileAtURL:(NSURL *)fileURL; - (id)executeQuery:(NSString *)sql completionHandler:(myBlock)handler; @end Using -executeQuery:completionHandler: passes the sql string to an NSOperation which does the database-touching work and passes results back through the completion block. I'm now trying to work out a tidy API of executing (up to) millions of SQL statements without building up a huge queue of operations. After sleeping on the matter, my current idea is to add a method to the MyDB class like: - (void)importFromURL:(NSURL *)fileURL withOptions:(NSDictionary *)options completionHandler:(myBlock)handler; This would allow me to pass key/value pairs with options for table or column mappings and have all the database-specific API grouped in one place. - Ben On 8 Feb 2014, at 18:20, Ben ben_cocoa_dev_l...@yahoo.co.uk wrote: Hi list, I'm looking for the right design pattern for providing different types of access to an SQLite database. Currently I have a database object where queries are run on a serial NSOperationQueue and each operation has a completion block for reporting its results. The operation queue is an implementation detail behind a more basic API. This is fine for most things, except that I sometimes need faster access to the underlying database - for example, when importing/exporting data. In these cases I'm after bulk data throughput without the overhead of creating/destroying many NSOperations with completion handlers since there can be in the order of millions of statements to handle. In the past I had simply broken encapsulation by exposing the underlying database engine API. I'd rather find a better method than this for the future. Options considered so far: 1. Continue exposing underlying DB engine 2. Add a simpler synchronous API for use on a separate thread 3. Expose NSOperationQueue and allow custom import operations to be queued 4. Add importing functions to the database object itself Does anyone have any suggestions as to how best to approach this problem? - Ben ___ Cocoa-dev mailing list (Cocoa-dev@lists.apple.com) Please do not post admin requests or moderator comments to the list. Contact the moderators at cocoa-dev-admins(at)lists.apple.com Help/Unsubscribe/Update your Subscription: https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com This email sent to arch...@mail-archive.com
Design pattern for bulk data handling
Hi list, I'm looking for the right design pattern for providing different types of access to an SQLite database. Currently I have a database object where queries are run on a serial NSOperationQueue and each operation has a completion block for reporting its results. The operation queue is an implementation detail behind a more basic API. This is fine for most things, except that I sometimes need faster access to the underlying database - for example, when importing/exporting data. In these cases I'm after bulk data throughput without the overhead of creating/destroying many NSOperations with completion handlers since there can be in the order of millions of statements to handle. In the past I had simply broken encapsulation by exposing the underlying database engine API. I'd rather find a better method than this for the future. Options considered so far: 1. Continue exposing underlying DB engine 2. Add a simpler synchronous API for use on a separate thread 3. Expose NSOperationQueue and allow custom import operations to be queued 4. Add importing functions to the database object itself Does anyone have any suggestions as to how best to approach this problem? - Ben ___ Cocoa-dev mailing list (Cocoa-dev@lists.apple.com) Please do not post admin requests or moderator comments to the list. Contact the moderators at cocoa-dev-admins(at)lists.apple.com Help/Unsubscribe/Update your Subscription: https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com This email sent to arch...@mail-archive.com
Re: Design pattern for bulk data handling
On Feb 8, 2014, at 10:20 AM, Ben ben_cocoa_dev_l...@yahoo.co.uk wrote: This is fine for most things, except that I sometimes need faster access to the underlying database - for example, when importing/exporting data. In these cases I'm after bulk data throughput without the overhead of creating/destroying many NSOperations with completion handlers since there can be in the order of millions of statements to handle. You can read arbitrarily large amounts of data with a single statement, so that's not an issue. But in general it takes many statements to insert a lot of rows. (And for performance reasons you really want to group all of those statements in a single transaction, or you'll lose an order of magnitude of performance.) What I'd do is provide a new operation type that does a bulk-insert or bulk-update. When run the operation performs a series of SQL statements starting with a BEGIN and ending with an END. —Jens ___ Cocoa-dev mailing list (Cocoa-dev@lists.apple.com) Please do not post admin requests or moderator comments to the list. Contact the moderators at cocoa-dev-admins(at)lists.apple.com Help/Unsubscribe/Update your Subscription: https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com This email sent to arch...@mail-archive.com
Re: Design pattern for bulk data handling
Also, consider dropping the table indexes before you start inserting. After the you're done inserting, rebuild the indexes. This should give you an an additional performance boost. For even more speed, consider turning off the following setting: http://www.sqlite.org/pragma.html#pragma_synchronous *warning*: read the link carefully and understand what it does before using it! -- Tito On Feb 8, 2014, at 14:25, Jens Alfke j...@mooseyard.com wrote: On Feb 8, 2014, at 10:20 AM, Ben ben_cocoa_dev_l...@yahoo.co.uk wrote: This is fine for most things, except that I sometimes need faster access to the underlying database - for example, when importing/exporting data. In these cases I'm after bulk data throughput without the overhead of creating/destroying many NSOperations with completion handlers since there can be in the order of millions of statements to handle. You can read arbitrarily large amounts of data with a single statement, so that's not an issue. But in general it takes many statements to insert a lot of rows. (And for performance reasons you really want to group all of those statements in a single transaction, or you'll lose an order of magnitude of performance.) What I'd do is provide a new operation type that does a bulk-insert or bulk-update. When run the operation performs a series of SQL statements starting with a BEGIN and ending with an END. —Jens ___ Cocoa-dev mailing list (Cocoa-dev@lists.apple.com) Please do not post admin requests or moderator comments to the list. Contact the moderators at cocoa-dev-admins(at)lists.apple.com Help/Unsubscribe/Update your Subscription: https://lists.apple.com/mailman/options/cocoa-dev/tciuro%40mac.com This email sent to tci...@mac.com ___ Cocoa-dev mailing list (Cocoa-dev@lists.apple.com) Please do not post admin requests or moderator comments to the list. Contact the moderators at cocoa-dev-admins(at)lists.apple.com Help/Unsubscribe/Update your Subscription: https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com This email sent to arch...@mail-archive.com
Re: Design pattern for bulk data handling
On Feb 8, 2014, at 2:54 PM, Tito Ciuro tci...@mac.com wrote: For even more speed, consider turning off the following setting: http://www.sqlite.org/pragma.html#pragma_synchronous (This is getting slightly off-topic, but…) The only time I'd set synchronous to OFF is while populating a _new_ database file, as part of some sort of safe save process where you create the file in a temporary directory, write to it, and only move it into the real directory at the end. Basically when synchronous is OFF it's entirely possible that a power failure or kernel panic could corrupt the database. (I've seen this happen several times. It's not just theoretical.) So don't use it on a database file that has any data you mind losing, or that's in a location where the user might try to re-open it later (because if she does, you'll get a support ticket about your app crashing or failing to open the file.) —Jens ___ Cocoa-dev mailing list (Cocoa-dev@lists.apple.com) Please do not post admin requests or moderator comments to the list. Contact the moderators at cocoa-dev-admins(at)lists.apple.com Help/Unsubscribe/Update your Subscription: https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com This email sent to arch...@mail-archive.com