Re: [sqlite] Performance of writing blobs
On Tue, 12 Jun 2018 14:13:33 +0200 Dominique Devienne wrote:

> You're right of course. Thank you Clemens.
>
> With synchronous = OFF, which suits my use-case here, the commit-time
> just vanishes, and even out-performs HDF5 now (see below). I might
> still prefer HDF5, mainly because the N-d typed arrays are
> self-describing and thus visible using Hdf5View for example, instead
> of completely opaque in SQLite, but now I know SQLite in non-durable
> mode is as fast or even faster for these smallish blob sizes than
> HDF5. I'll need to double-check for larger sizes (individual blobs
> get upward of 1GB in size), with my large 31GB, 200K blobs, 1.2M
> entities project.

I'm joining this thread late, but here are some comments; I use SQLite
in environments similar to yours:

- Don't use synchronous=OFF if you need more speed; use PRAGMA
  journal_mode = MEMORY or PRAGMA journal_mode = OFF instead. They
  aren't any safer, but they are faster.
- As you noted, don't preallocate a zero-blob; insert the blob directly.
- Don't delete/update blobs; instead, mark the row as
  deprecated/invalid and insert a new one.
- If you have blobs larger than 2GB you'll need to split them; SQLite
  can't work with blobs larger than 2GB.
- Use the misc/unionvtab extension if you need a safe environment. It's
  a bit tricky to use, but you easily get a RAID0/JBOD-style database
  (one db split across many files), and it's faster.
- Compile SQLite with SQLITE_DIRECT_OVERFLOW_READ; with this, your
  blob reads don't pollute the SQLite page cache.
- Compress the blobs with the misc/compress extension while writing:
  the less you write, the faster you read. See the sqlar project
  (http://sqlite.org/sqlar).

HTH

---
Eduardo Morras

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
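[Editor's note: Eduardo's first three suggestions can be sketched with Python's stdlib sqlite3 module, which wraps the same engine as the C API used in this thread. The table and column names below are illustrative, not from the thread, and an in-memory database stands in for a file.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path in practice

# Trade durability for speed: keep the rollback journal in memory
# (or disable it entirely with journal_mode = OFF).
conn.execute("PRAGMA journal_mode = MEMORY")

# Insert the blob directly instead of preallocating with zeroblob(N).
conn.execute("CREATE TABLE blobs(id INTEGER PRIMARY KEY, "
             "deprecated INTEGER DEFAULT 0, data BLOB)")
payload = b"\x00" * 1024  # stand-in for real blob data
conn.execute("INSERT INTO blobs(data) VALUES (?)", (payload,))

# "Update" a blob by deprecating the old row and inserting a new one,
# rather than rewriting its pages in place.
conn.execute("UPDATE blobs SET deprecated = 1 WHERE id = 1")
conn.execute("INSERT INTO blobs(data) VALUES (?)", (b"\x01" * 2048,))
conn.commit()
```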
Re: [sqlite] Performance of writing blobs
On Tue, Jun 12, 2018 at 12:49 PM Clemens Ladisch wrote:

> Dominique Devienne wrote:
> > In JOURNAL mode, new data goes to the DB file directly, and modified
> > pages go to the JOURNAL file. And since here this is INSERT-only,
> > from empty tables, I assumed the pages copied to the JOURNAL file
> > should be minimal.
>
> Yes.  You can check the journal size with PRAGMA journal_mode = PERSIST.
>
> > ... --blobs-as-hdf5=false --blobs-inline=false
> > Write ROWs  =  4.358s (12.6%)
> > Write BLOBs =  3.130s ( 9.0%)
> > Epilog      = 10.221s (29.5%)
> >
> > ... --blobs-as-hdf5=true --blobs-inline=false
> > Write ROWs  = 4.368s (17.0%)
> > Write BLOBs = 4.179s (16.3%)
> > Epilog      = 0.461s ( 1.8%)
>
> This looks as if HDF5 does not do fsync().
>
> Compare with PRAGMA synchronous = OFF, or run
> https://docs.microsoft.com/en-us/sysinternals/downloads/sync
> after you've finished with the HDF5 file.

You're right of course. Thank you Clemens.

With synchronous = OFF, which suits my use-case here, the commit-time
just vanishes, and even out-performs HDF5 now (see below). I might still
prefer HDF5, mainly because the N-d typed arrays are self-describing and
thus visible using Hdf5View for example, instead of completely opaque in
SQLite, but now I know SQLite in non-durable mode is as fast or even
faster for these smallish blob sizes than HDF5. I'll need to double-check
for larger sizes (individual blobs get upward of 1GB in size), with my
large 31GB, 200K blobs, 1.2M entities project.

So thanks again Clemens for the guidance. Amazing job from SQLite, as
usual. --DD

D:\>... --export-sqlite GL-blobs-outofline-sync-OFF.db ... --blobs-as-hdf5=false --blobs-inline=false ...
Total = 23.455s ...
Write ROWs  = 4.110s (17.5%)
Write BLOBs = 2.822s (12.0%)
Epilog      = 0.005s ( 0.0%)

D:\>... --export-sqlite GL-blobs-ashdf5-sync-OFF.db ... --blobs-as-hdf5=true --blobs-inline=false ...
Total = 25.238s ...
Write ROWs  = 4.464s (17.7%)
Write BLOBs = 4.358s (17.3%)
Epilog      = 0.020s ( 0.1%)

06/12/2018  01:49 PM        64,475,136 GL-blobs-ashdf5-sync-OFF.db
06/12/2018  01:49 PM     1,039,652,416 GL-blobs-ashdf5-sync-OFF.h5
06/12/2018  01:48 PM     1,111,736,320 GL-blobs-outofline-sync-OFF.db
Re: [sqlite] Performance of writing blobs
Dominique Devienne wrote:
> In JOURNAL mode, new data goes to the DB file directly, and modified
> pages go to the JOURNAL file. And since here this is INSERT-only, from
> empty tables, I assumed the pages copied to the JOURNAL file should be
> minimal.

Yes.  You can check the journal size with PRAGMA journal_mode = PERSIST.

> ... --blobs-as-hdf5=false --blobs-inline=false
> Write ROWs  =  4.358s (12.6%)
> Write BLOBs =  3.130s ( 9.0%)
> Epilog      = 10.221s (29.5%)
>
> ... --blobs-as-hdf5=true --blobs-inline=false
> Write ROWs  = 4.368s (17.0%)
> Write BLOBs = 4.179s (16.3%)
> Epilog      = 0.461s ( 1.8%)

This looks as if HDF5 does not do fsync().

Compare with PRAGMA synchronous = OFF, or run
https://docs.microsoft.com/en-us/sysinternals/downloads/sync
after you've finished with the HDF5 file.

Regards,
Clemens
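[Editor's note: Clemens's two suggestions here — leave the rollback journal on disk with PERSIST so its size can be inspected after commit, and compare timings under synchronous = OFF — can be tried with a quick sketch using Python's stdlib sqlite3. File and table names are illustrative.]

```python
import os
import sqlite3
import tempfile

dbpath = os.path.join(tempfile.mkdtemp(), "perf.db")
conn = sqlite3.connect(dbpath)

# PERSIST keeps the journal file around after commit (only its header
# is zeroed), so its on-disk size can be checked afterwards.
conn.execute("PRAGMA journal_mode = PERSIST")
# OFF hands writes to the OS without fsync() -- fast, not crash-safe.
conn.execute("PRAGMA synchronous = OFF")

conn.execute("CREATE TABLE t(data BLOB)")
with conn:  # one transaction for all inserts, as in the thread
    conn.executemany("INSERT INTO t(data) VALUES (?)",
                     [(b"x" * 4096,) for _ in range(100)])

journal_size = os.path.getsize(dbpath + "-journal")
```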
Re: [sqlite] Performance of writing blobs
On Tue, Jun 12, 2018 at 8:03 AM Clemens Ladisch wrote:

> Dominique Devienne wrote:
> > On Mon, Jun 11, 2018 at 4:27 PM Clemens Ladisch wrote:
> >> It does write to the same pages, but those pages must be copied to
> >> the rollback journal so that they can be restored if the transaction
> >> is rolled back.  (Or are the two passes inside the same transaction?)
> >
> > They are part of the same transaction. So the no-rewrite pages should
> > still apply, no?
>
> Indeed.  I'd guess that just writing all those zero-filled pages is too
> much overhead.
>
> And the incremental blob I/O functions are intended to access partial
> blobs; when you write the entire value at once, just use SQL.
>
> >> Don't insert the zero blobs in the first pass; this still rewrites
> >> all rows, but there is less data that might need to be rolled back.
> >>
> >> Consider moving the blobs to a separate table.
> >
> > Is that really/still relevant, considering the above?
>
> Well, there's only one way to find out ... ;-)

OK! Output below should be self-explanatory.

Blobs are still saved in a 2nd pass, still in the same transaction as the
1st pass inserting the rows, but now the blobs are inserted directly
(sqlite3_bind_blob) into a separate (tab, col, row, val) "blob table", as
suggested. Still in JOURNAL mode.

Bottom line is that not calling zeroblob(N) helps, but the commit time
(in Epilog) is still around 10s :(

I guess my mental model of JOURNAL mode and heavy BLOB insert is not
correct... In WAL mode, new data goes to the WAL file, and is then
checkpointed later to the DB file. In JOURNAL mode, new data goes to the
DB file directly, and modified pages go to the JOURNAL file. And since
this is INSERT-only, from empty tables, I assumed the pages copied to the
JOURNAL file should be minimal. So what's going on during the commit to
surpass writing all row *AND* blob data combined??? That commit time
represents 30% of my ETL, and that 1GB project export is a "small" one...

FWIW, here's a rough "histogram" of blob sizes and max size (1/2 MB) for
this small project:
Read 20771 LOBs; min-max: 3-508032 bytes; 1333 <1KB; 3625 <8KB; 10599 <64KB; 5214 <512KB;

D:\>... --export-sqlite GL-blobs-inline.db ... --blobs-as-hdf5=false --blobs-inline=true ...
Total = 37.586s ...
Write ROWs  =  7.068s (18.8%)
Write BLOBs =  3.016s ( 8.0%)
Epilog      = 10.637s (28.3%)

D:\>... --export-sqlite GL-blobs-outofline.db ... --blobs-as-hdf5=false --blobs-inline=false ...
Total = 34.609s ...
Write ROWs  =  4.358s (12.6%)
Write BLOBs =  3.130s ( 9.0%)
Epilog      = 10.221s (29.5%)

D:\>... --export-sqlite GL-blobs-ashdf5.db ... --blobs-as-hdf5=true --blobs-inline=false ...
Total = 25.646s ...
Write ROWs  = 4.368s (17.0%)
Write BLOBs = 4.179s (16.3%)
Epilog      = 0.461s ( 1.8%)

D:\> dir GL* ...
06/12/2018  10:51 AM        64,475,136 GL-blobs-ashdf5.db
06/12/2018  10:51 AM     1,039,652,416 GL-blobs-ashdf5.h5
06/12/2018  10:53 AM     1,110,253,568 GL-blobs-inline.db
06/12/2018  10:50 AM     1,111,736,320 GL-blobs-outofline.db
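[Editor's note: the out-of-line "(tab, col, row, val)" blob table described above can be sketched with Python's stdlib sqlite3. The C code binds the value with sqlite3_bind_blob; here an ordinary parameter binding does the same. Table and column names besides the (tab, col, row, val) schema are illustrative.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The main table holds only row data; blob values live in a separate
# table keyed by source table name, column name, and rowid.
conn.execute("CREATE TABLE entity(id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE blob_table("
             "tab TEXT, col TEXT, row INTEGER, val BLOB, "
             "PRIMARY KEY(tab, col, row))")

with conn:  # rows and blobs in the same transaction, as in the thread
    cur = conn.execute("INSERT INTO entity(name) VALUES ('mesh-1')")
    rowid = cur.lastrowid
    payload = b"\xde\xad\xbe\xef" * 256  # stand-in blob data
    conn.execute("INSERT INTO blob_table(tab, col, row, val) "
                 "VALUES ('entity', 'geometry', ?, ?)", (rowid, payload))
```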
Re: [sqlite] Performance of writing blobs
Dominique Devienne wrote:
> On Mon, Jun 11, 2018 at 4:27 PM Clemens Ladisch wrote:
>> It does write to the same pages, but those pages must be copied to the
>> rollback journal so that they can be restored if the transaction is
>> rolled back.  (Or are the two passes inside the same transaction?)
>
> They are part of the same transaction. So the no-rewrite pages should
> still apply, no?

Indeed.  I'd guess that just writing all those zero-filled pages is too
much overhead.

And the incremental blob I/O functions are intended to access partial
blobs; when you write the entire value at once, just use SQL.

>> Don't insert the zero blobs in the first pass; this still rewrites all
>> rows, but there is less data that might need to be rolled back.
>>
>> Consider moving the blobs to a separate table.
>
> Is that really/still relevant, considering the above?

Well, there's only one way to find out ... ;-)

Regards,
Clemens
Re: [sqlite] Performance of writing blobs
On Mon, Jun 11, 2018 at 4:27 PM Clemens Ladisch wrote:

> Dominique Devienne wrote:
> > My assumption was that after the zeroblob(N), there was enough room
> > in the main DB's pages, such that the subsequent blob
> > open+write+close did not need to generate any "page churn" (i.e.
> > journal activity) and could write directly to the pages created on
> > initial insert.
>
> It does write to the same pages, but those pages must be copied to the
> rollback journal so that they can be restored if the transaction is
> rolled back.  (Or are the two passes inside the same transaction?)

They are part of the same transaction. So the no-rewrite pages should
still apply, no?

> Don't insert the zero blobs in the first pass; this still rewrites all
> rows, but there is less data that might need to be rolled back.
>
> Consider moving the blobs to a separate table.

Is that really/still relevant, considering the above?

Thanks, --DD
Re: [sqlite] Performance of writing blobs
Dominique Devienne wrote:
> My assumption was that after the zeroblob(N), there was enough room in
> the main DB's pages, such that the subsequent blob open+write+close did
> not need to generate any "page churn" (i.e. journal activity) and could
> write directly to the pages created on initial insert.

It does write to the same pages, but those pages must be copied to the
rollback journal so that they can be restored if the transaction is
rolled back.  (Or are the two passes inside the same transaction?)

Don't insert the zero blobs in the first pass; this still rewrites all
rows, but there is less data that might need to be rolled back.

Consider moving the blobs to a separate table.

Regards,
Clemens
[sqlite] Performance of writing blobs
I'm surprised about the commit time of SQLite when writing blobs is
involved. Can anybody shed light on this subject? Below is a description
of what I do, with the results.

I'm exporting data into SQLite, spread over several tables. I process
about 240,000 rows, and write around 1GB in 20,000 blobs.

I process the data in two passes:
1) write the rows, record the rowid when a row has a blob, and
   zeroblob(N) during insert, to leave room for Incremental Blob I/O
   later in the 2nd pass.
2) process the blobs only.

Then I compare this with writing the blobs in HDF5 instead. Instead of
writing the blob itself in-row, I just insert its size in SQLite. (HDF5
is a "filesystem in a file", non-transactional, with "files" being
N-dimensional typed arrays; thus a 1D array of bytes approximates a
"file".)

Results writing both rows and blobs in SQLite:
Write ROWs  =  6.968s (18.7%) (includes the zeroblob(N))
Write BLOBs =  2.764s ( 7.4%)
Commit TX   = 10.566s (28.4%)
The result is a single 1,110,253,568-byte SQLite DB file.

Now writing only rows to SQLite and blobs to HDF5:
Write ROWs  = 4.265s (16.6%)
Write BLOBs = 4.010s (15.6%)
Commit TX   = 0.195s ( 0.8%) (includes flushing the HDF5 file too)
This results in two files: a 64,475,136-byte SQLite DB, and a
1,039,652,416-byte HDF5 file.

SQLite (version 3.19.3) is used in journal mode, and the DB is written
"from scratch", so there's not much to put in the journal, AFAIK. My
assumption was that after the zeroblob(N), there was enough room in the
main DB's pages, such that the subsequent blob open+write+close did not
need to generate any "page churn" (i.e. journal activity) and could write
directly to the pages created on initial insert. But what surprised me
was the commit time jumping to 10s, i.e. 50x more than when not writing
the blobs.

After the initial zeroblob(N) + sqlite3_last_insert_rowid(), part of
"Write ROWs", the code writing the blobs is below (simplified, error
handling removed).
sqlite3_blob* p_blob = nullptr;
sqlite3_blob_open(db, "main", tab, col, rowid, /*read-write*/1, &p_blob);
assert(sqlite3_blob_bytes(p_blob) == length);
sqlite3_blob_write(p_blob, buffer, static_cast<int>(length), 0);
sqlite3_blob_close(p_blob);

So is the code above "the right way" to write blobs? I don't yet have the
buffers to write the blobs directly at row-insert-time, for reasons
beyond this particular problem. Please assume this is by-design.

I can understand that 64MB I/O vs 1GB I/O takes more time, and that's
reflected in the 4.2s writing just the rows, versus the 7s + 2.7s = 9.7s
when writing both rows and blobs, but why does the commit time jump from
0.2s to 10.5s ???

Thanks for any insights into this. Thanks, --DD
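[Editor's note: the two-pass pattern above — zeroblob(N) at insert time, then incremental blob I/O to fill the reserved space — can be reproduced with Python's stdlib sqlite3. Connection.blobopen (Python 3.11+) wraps the same sqlite3_blob_open C API; a plain UPDATE serves as a fallback on older versions. Table and payload are illustrative.]

```python
import sqlite3
import sys

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, data BLOB)")

payload = b"\xab" * 4096

with conn:
    # Pass 1: insert the row with zeroblob(N) to reserve space in-row.
    cur = conn.execute("INSERT INTO t(data) VALUES (zeroblob(?))",
                       (len(payload),))
    rowid = cur.lastrowid

    # Pass 2: fill the reserved blob via incremental blob I/O.
    if sys.version_info >= (3, 11):
        with conn.blobopen("t", "data", rowid) as blob:
            blob.write(payload)
    else:
        # Fallback for older Pythons: an ordinary UPDATE writes the value.
        conn.execute("UPDATE t SET data = ? WHERE id = ?",
                     (payload, rowid))
```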