Dominique Devienne wrote:
> > Looking at the sqlite web site and mailing lists shows that the SQLite team
> > has taken a stab at answering the question, "is it faster to read a blob
> > out of sqlite or out of a file?". See the links below.
> >
> > Does the team have analogous guidance regarding write speeds?
>
> Depends if you care only about the initial write, or also about
> incremental writes.
Thanks to those of you who responded already! I'll give a little more
flavor for those who are interested.

My particular application is a cache of large, immutable files, so the
UPDATE case doesn't apply here. I already have a system on top of SQLite's
blobs in which I store the file data in chunks, one chunk per row. (I'm
downloading the files from the network, and at the moment am using a
version of the Python SQLite API that does not have hooks into the
incremental blob I/O.) So I download a few megabytes, start the next hunk
download in the background (using the "requests-futures" Python library,
which is really sexy, BTW), write out the current hunk, commit, and
continue. Each row also records the byte range it is storing:

    CREATE TABLE IF NOT EXISTS file(
        fileID TEXT NOT NULL,
        start  INTEGER NOT NULL,  -- offset of first byte of chunk
        end    INTEGER NOT NULL,  -- offset of last byte of chunk
        bytes  BLOB NOT NULL,
        CONSTRAINT BlobSize CHECK(length(bytes) == (end - start + 1))
    );

All requests are for retrieval of a particular range of bytes of a
particular file, so I have an index:

    CREATE INDEX IF NOT EXISTS FileIdxNameStart ON file(fileID, start);

and I create my own little "query plan" in application logic for servicing
a particular request on the read side (a sketch is in the P.S. below). The
read side is fast enough for my purposes.

Inserts are slow and deletes are slow. The disk I am using is capable of
writing about 6.5 MB/sec, but my app + SQLite only hit around 600 KB to
1 MB/sec under this scheme, depending on some variables. That is roughly
half the overall speed I expect, because if I use 'wget' to just write a
downloaded file straight to disk, I get about 2 MB/sec sustained over
hundreds of MB. I've played with the chunk size, the synchronous pragma,
the commit interval, the page size pragma, and the cache size pragma. I
have not cleared my application of all blame yet, since I have not taken
enough measurements. I'm considering prototyping a C implementation that
uses incremental blob I/O to write out the file and is cleverer about the
download state machine, but I was hoping to learn from this group whether
that is a fool's errand (e.g., if you all have some particular reason to
know that huge blobs in SQLite are a priori a bad idea for performance
reasons).

Deletes by fileID are really awful compared to a simple filesystem
delete -- like two orders of magnitude worse by my eyeballing, at least
with the default 1024-byte page size. Then again, I think you'd expect
that: an unlink(2) only has to touch a few pages of disk, whereas an
SQLite delete of 1 GB worth of blob rows, I guess, goes and touches
something every page_size bytes (right?).

So I'm thinking of just storing the data outside the database, perhaps
retaining an SQLite index. But I'm worried about the safety issues that
arise from that. E.g., I finish writing my file, I write an index entry
and commit it, and immediately afterward I get a power failure. Am I sure
that the file bytes are correct? And so on. (The P.P.S. below sketches
the write ordering I assume I'd need.) All of those considerations are
right in SQLite's wheelhouse -- I'd hate not to be able to take advantage
of it.

--
Eric A. Rubin-Smith
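P.S. For concreteness, here is roughly what the read-side "query plan"
looks like. This is a simplified sketch, not my production code; the
table and column names match the schema above, but the read_range helper
itself is hypothetical, and it assumes chunks are contiguous and
non-overlapping (which my writer guarantees):

    import sqlite3

    def read_range(conn, file_id, lo, hi):
        """Return bytes lo..hi (inclusive) of file_id by stitching
        together the overlapping chunk rows."""
        out = bytearray()
        # A chunk overlaps [lo, hi] iff start <= hi AND end >= lo.
        # The (fileID, start) index narrows the scan by fileID and
        # start; the end test filters out chunks entirely before lo.
        for start, end, blob in conn.execute(
                "SELECT start, end, bytes FROM file"
                " WHERE fileID = ? AND start <= ? AND end >= ?"
                " ORDER BY start", (file_id, hi, lo)):
            # Trim the first and last chunks to the requested range.
            a = max(lo, start) - start
            b = min(hi, end) - start
            out += blob[a:b + 1]
        return bytes(out)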
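P.P.S. For the store-outside-the-database idea, the ordering I assume
I'd need is: write the bytes to a temporary file, fsync it, rename it
into place, fsync the directory, and only then commit the index row --
so a committed row always implies durable file bytes, and a crash before
the commit leaves at worst an orphaned file to garbage-collect later.
A minimal POSIX-only sketch; the fileIndex table, the on-disk layout,
and the store_file helper are all made up for illustration:

    import os
    import sqlite3

    def store_file(conn, file_id, data, cache_dir):
        path = os.path.join(cache_dir, file_id)
        tmp = path + ".part"
        with open(tmp, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # file bytes durable before rename
        os.rename(tmp, path)       # atomic within one filesystem
        # fsync the directory so the rename survives a power failure
        dfd = os.open(cache_dir, os.O_RDONLY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)
        # Only now is it safe to publish the entry.  If we crash before
        # this commit, the orphan on disk can be cleaned up at startup.
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO fileIndex(fileID, size) VALUES (?, ?)",
                (file_id, len(data)))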