Re: [sqlite] fsync on -wal still happening
One presumes that you mayhaps read the documentation? "In WAL mode when synchronous is NORMAL (1), the WAL file is synchronized before each checkpoint and the database file is synchronized after each completed checkpoint and the WAL file header is synchronized when a WAL file begins to be reused after a checkpoint, but no sync operations occur during most transactions. With synchronous=FULL in WAL mode, an additional sync operation of the WAL file happens after each transaction commit. The extra WAL sync following each transaction help ensure that transactions are durable across a power loss. Transactions are consistent with or without the extra syncs provided by synchronous=FULL. If durability is not a concern, then synchronous=NORMAL is normally all one needs in WAL mode." So, the question is whether or not you want D in your ACID. If all you need is ACI, then synchronous=NORMAL guarantees that. synchronous=FULL adds the D to your ACI giving you ACID. --- The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume. >-Original Message- >From: sqlite-users [mailto:sqlite-users- >boun...@mailinglists.sqlite.org] On Behalf Of Ian Freeman >Sent: Friday, 29 December, 2017 22:38 >To: SQLite mailing list >Subject: Re: [sqlite] fsync on -wal still happening > >On Fri, 2017-12-29 at 22:18 +, Simon Slavin wrote: >> >> On 29 Dec 2017, at 7:07pm, Ian Freemanwrote: >> >> > Thanks, Simon. Indeed I did see that option. But I'm concerned >> > about >> > maintaining integrity in the middle of an power-interrupted >commit, >> >> This is not a problem for either WAL mode or the older journal >> modes. If you avoid these commands >> >> PRAGMA journal_mode = OFF >> PRAGMA journal_mode = MEMORY >> PRAGMA synchronous = OFF >> PRAGMA synchronous = NORMAL >> >> then SQLite will ensure that the database file always reflects your >> data as it was before the more recent COMMIT or after the most >recent >> COMMIT. >> > >Agreed, the defaults would be very safe. But I realize I never >mentioned what my actual goal was: minimizing disk writes for a >write- >heavy application. That's why I want to run -wal out of memory, or >set >synchronous=NORMAL. Of course, I cannot do this at the expense of >integrity. > >There are other ways to solve my problem, for example I could operate >solely in memory and perform periodic online backups. But then I'm >dumping entire databases to disk when I really only need to capture a >single row's update, etc. Those could be minimized with breaking up >the >database into sub-sections, etc. But if I can just keep the -wal >updates off of the disk until checkpoint, it would be the best >solution. > >Thanks, all. I'm still open to your appreciated input as I mull this >over. > > >On Fri, 2017-12-29 at 15:09 -0700, Barry Smith wrote: >> I believe the inotifywait does not actually wait for the fsync >> operation before notifying. > >Barry Smith is making me question my inotifywait results. I need to >look into that further. >___ >sqlite-users mailing list >sqlite-users@mailinglists.sqlite.org >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] advice about schema versioning and WAL
Thanks Dan for this information! (For some reason, your reply appears on the mailing list archive, but never found the way of my inbox. I'm replying to my own message) Gwendal > Dan Kennedy wrote: > > "PRAGMA schema_version" should work the same way in WAL mode. The pragma > will read the "database header" from the newest version of page 1 in the > wal file if required. > > Dan. > >> Le 28 déc. 2017 à 19:28, Gwendal Rouéa écrit : >> >> Hello, >> >> Season's greetings to all SQLite fellows! >> >> >> I'm developping a library that would like to keep a "cache" of some >> information about the database schema. Such information are the columns of a >> table, its primary key, or its indexes. The purpose of this cache is not >> really speed, even if it may help, but mainly to avoid cluttering the >> sqlite3_trace/sqlite3_trace_v2 hooks with noisy pragmas whenever the library >> needs to infer some implicit information from the actual database schema. >> >> This cache has to be invalidated whenever the schema changes. The >> Compile-Time Authorization Callback [1] is the perfect tool for the job, >> since it allows to identify statements that create, drop, alter tables and >> indexes. >> >> Everything is fine and easy when a single connection is used in a >> single-threaded way: statements are executed one after the other, and the >> management of the schema cache is trivial. >> >> >> It's much less trivial with the WAL mode. I focus on a setup which uses a >> unique writer connection, and several reader connections. All connections >> are used sequentially in their own thread, but readers and writer can run >> concurrently in order to take advantage from the WAL mode. >> >> When a read-only connection uses a deferred transaction to enter snapshot >> isolation, it doesn't see the changes performed by other transactions. For >> example, if a reader acquires snapshot isolation before a table is altered >> by the writer, it won't see the alteration until it commits its deferred >> transaction. I wish my schema cache would behave the same. >> >> To be precise, I only have two important needs: >> >> 1. A connection's schema cache is correct, which means that it never >> contains information that does not match SQLite's genuine view of the >> database schema. Being invalidated/empty is correct, if not efficient (the >> missing information is then loaded from SQLite). >> 2. Synchronization points between readers and writers are avoided >> (non-blocking access is the whole point of WAL, and I want to avoid locks as >> much as possible) >> >> I was hoping that a connection would have a "schema version": an >> automatically incremented value that SQLite bumps whenever the schema is >> changed. That would have been enough for my use case. Unfortunately, PRAGMA >> schema_version reads the database header, and I thus guess that it does not >> play well with WAL (I'm not sure). Furthermore, PRAGMA schema_version >> clutters the tracing hook. >> >> The most simple solution I have is to invalidate a reader's schema cache >> each time it is used. This would unfortunately invalidate the readers' >> caches too often, since most real-life uses only alter the schema at >> application start-up, which means that the schema is, practically speaking, >> stable after this initialisation phase. >> >> Do any of you have any better idea? >> >> Thanks in advance, regards, >> Gwendal Roué >> >> [1] https://sqlite.org/c3ref/set_authorizer.html > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] fsync on -wal still happening
On Sat, 2017-12-30 at 05:40 +, Simon Slavin wrote: > Accumulate your change commands in memory as text. Eventually > execute them all inside BEGIN … COMMIT. I like it; why didn't I think of it? Don't you love those moments? The application would need severe retooling for reads unless I also have duplicate memory versions of the table state, but it is a solution. With this scheme you could also load the entire db into memory without the expense of having to dump the entire db to disk at "checkpoints". ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] fsync on -wal still happening
On 30 Dec 2017, at 5:37am, Ian Freemanwrote: > Agreed, the defaults would be very safe. But I realize I never > mentioned what my actual goal was: minimizing disk writes for a write- > heavy application. That's why I want to run -wal out of memory, or set > synchronous=NORMAL. Of course, I cannot do this at the expense of > integrity. Accumulate your change commands in memory as text. Eventually execute them all inside BEGIN … COMMIT. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] fsync on -wal still happening
On Fri, 2017-12-29 at 22:18 +, Simon Slavin wrote: > > On 29 Dec 2017, at 7:07pm, Ian Freemanwrote: > > > Thanks, Simon. Indeed I did see that option. But I'm concerned > > about > > maintaining integrity in the middle of an power-interrupted commit, > > This is not a problem for either WAL mode or the older journal > modes. If you avoid these commands > > PRAGMA journal_mode = OFF > PRAGMA journal_mode = MEMORY > PRAGMA synchronous = OFF > PRAGMA synchronous = NORMAL > > then SQLite will ensure that the database file always reflects your > data as it was before the more recent COMMIT or after the most recent > COMMIT. > Agreed, the defaults would be very safe. But I realize I never mentioned what my actual goal was: minimizing disk writes for a write- heavy application. That's why I want to run -wal out of memory, or set synchronous=NORMAL. Of course, I cannot do this at the expense of integrity. There are other ways to solve my problem, for example I could operate solely in memory and perform periodic online backups. But then I'm dumping entire databases to disk when I really only need to capture a single row's update, etc. Those could be minimized with breaking up the database into sub-sections, etc. But if I can just keep the -wal updates off of the disk until checkpoint, it would be the best solution. Thanks, all. I'm still open to your appreciated input as I mull this over. On Fri, 2017-12-29 at 15:09 -0700, Barry Smith wrote: > I believe the inotifywait does not actually wait for the fsync > operation before notifying. Barry Smith is making me question my inotifywait results. I need to look into that further. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] finding the number of records until a value is different
I think it's possible with CTE. Recently I wondered whether it would be possible to implement an operation that might be called "an accumulated group by". It's when you enumerate the rows and based on the values of the previous row and current row you apply some new "group" value that can be used in the following "group by" query. My use case was a "words" table when every word has a length and the task is to "format" them as the words are formatted when printing a text on the page or on the screen, so the new "group" here is a printing row that increased when the current printing line is no longer able to fit the following word. I see your task is similar, even a little simpler since we don't accumulate here, we just compare previous and next values. CTE worked for me, but the big problem is that there are much redundancy in the text of the query since contrary to general select queries, it's much harder to to reusable aliasing in CTE, you will see this in the final query of this post. The basic template for any filtering for "accumulated group by". given the table CREATE TABLE [testdata] ([id] integer primary key) the following query outputs the table as it is but the second select inside now has a luxury of compare previous and next values (ordered by the primary key). with recursive filter(curid) as ( select (select min(id) from testdata) UNION ALL select (select id from testdata where id > curid order by id limit 1) as nextid from filter where nextid not null ) select * from filter Back to your case if the table is CREATE TABLE [testdata] ([id] integer primary key, [bc] integer, [temp] integer) the following gigantic query should output the counts for every consecutive groups ("grp" here is a temporal column used exclusively for the final group by I was talking about previously). with recursive filter(curid, bc, temp, grp) as ( select (select min(id) from testdata), (select bc from testdata where id=(select min(id) from testdata)), (select temp from testdata where id=(select min(id) from testdata)), 1 UNION ALL select (select id from testdata where id > curid order by id limit 1) as nextid, (select bc from testdata where id > curid order by id limit 1), (select temp from testdata where id > curid order by id limit 1), case when (select bc from testdata where id > curid order by id limit 1) = bc then grp else grp + 1 end from filter where nextid not null ) select bc, count(*) from filter group by grp I'll be glad to reduce expressions here to some more readable constructs, but I suspect it's impossible with current SQL syntax. Max On Thu, Dec 7, 2017 at 9:46 PM,wrote: > Hi all, > > I have a DB i.e. like this: > > table values > bc temp > 35 123 > 35 124 > 35 123 > 20 123 > 12 123 > 12 123 > 16 123 > 35 123 > 35 123 > 35 123 > 35 123 > 35 123 > > The value in temp (or all the other columns) is not of interest. > > Now I a looking for the basic concept how to count the number of rows where > bc=bc_of_last_row after the last different bc. > > SELECT COUNT(bc) FROM values WHERE bc=35; > > gives me the number of all rows with bc=35, so here 8. > But I am looking for 5, the number of rows after the last change of bc. > > Regards Matth > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] fsync on -wal still happening
On 29 Dec 2017, at 7:07pm, Ian Freemanwrote: > Thanks, Simon. Indeed I did see that option. But I'm concerned about > maintaining integrity in the middle of an power-interrupted commit, This is not a problem for either WAL mode or the older journal modes. If you avoid these commands PRAGMA journal_mode = OFF PRAGMA journal_mode = MEMORY PRAGMA synchronous = OFF PRAGMA synchronous = NORMAL then SQLite will ensure that the database file always reflects your data as it was before the more recent COMMIT or after the most recent COMMIT. SQLite has a lot of code in to ensure the best handling in the case of power-loss. Each time a database file is reopened SQLite looks for tell-tales that it wasn’t properly closed, and uses a number of strategies to restore one of those two situations. This includes comparing the state of the database file with the states of the associated files (-wal, -shm, etc.). Because of this, do not delete/rename any of the associated files even when SQLite does not have them open. SQLite must be able to find these files /in the same folder as the database file/. If you ever back up one of those files, back them all up, including the database file, as a unit. They are useless without copies of all the other files as they were at the same time. Since the "synchronous" setting is default, the advice for installations where power-loss is not unexpected is to issue PRAGMA synchronous = EXTRA , to have the database in WAL mode if you want it (neither safer nor less safer with respect to power-loss), and leave everything else as it is. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Dan. Latest zipfile.c checkin feedback
I tried the new write feature. Not sure if intended but read back row order is somewhat unintuitive. Is there an easy way to create a functional archive read back order without introducing sortable name prefix artifacts or an index file entry? CREATE VIRTUAL TABLE newzip USING zipfile('new.zip'); INSERT INTO newzip(name,data) VALUES ('row1','row1 data'),('row2','row2 data'); SELECT * FROM newzip; name,mode,mtime,sz,data,method row2,0,4039286400,0,"row2 data",0 row1,0,4039286400,0,"row1 data",0 A sensible default for mtime could also be helpful. Also, for compiling, it couldn't hurt to add a modest comment about how to link to the right zlib library on the major platforms. Since the library is now linked, are you planning on decompressing data automatically(or by module parameter flag) in a near release? It doesn't decompress data automatically from my tests. Aside from crashing when sensible arguments aren't provided for name or data, the write upgrade and inclusion of compression seems very usable. Creating the table suggests a related question. Forgive my ignorance if it was debated before. Why can't the normal table name syntax of INSERT/UPDATE support eponymous vtable module arguments? Using this example for instance: INSERT INTO newzip('new.zip')(name,data) VALUES ... Or something more efficiently parsed like square brackets? INSERT INTO newzip['new.zip'](name,data) VALUES ... Peter ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] fsync on -wal still happening
I believe the inotifywait does not actually wait for the fsync operation before notifying. Process A can write to a file, the OS can cache it without flushing to disk, and a request by process B will be served directly from the cache. Therefore the operating system can notify you of the change before it's written to persistent storage. So, in normal synchronous mode, SQLite doesn't cache its writes, it just waits before calling fsync and relies on the OS cache. Perhaps benchmarking before engaging in such an endeavour as you plan would indicate whether the OS cache gives you sufficient performance. > On 29 Dec 2017, at 12:07 pm, Ian Freemanwrote: > > Thanks, Simon. Indeed I did see that option. But I'm concerned about > maintaining integrity in the middle of an power-interrupted commit, so > I was hoping to leverage WAL's inverted behavior of not modifying the > database file directly until checkpoint time. The only thing left to > investigate is how WAL handles power loss during the middle of a > checkpoint. Normally I believe it's able to recover because the next db > open has the -wal file to look at. But not if I'm storing it in-memory. > > So I'm thinking I will copy the -wal file to disk before the checkpoint > op for safety, and then load it back onto ramdisk after. I suppose I > may have to exit WAL mode in order to perform that copy, making > checkpointing a very expensive operation for my app. > >>> On Fri, 2017-12-29 at 16:48 +, Simon Slavin wrote: >>> >>> On 29 Dec 2017, at 4:34pm, Ian Freeman wrote: >>> >>> I see, then what I'm seeing is just normal behavior of the writes >>> being >>> flushed to disk. I read what I wanted to hear about >>> synchronous=NORMAL >>> delaying writes to the -wal file. Instead I'm going to see if I can >>> move -wal to a ramdisk and see how that will affect db integrity >>> with >>> power losses at in-opportune times. >> >> Did you see >> >> PRAGMA journal_mode = MEMORY >> >> ? >> >> Simon. >> ___ >> sqlite-users mailing list >> sqlite-users@mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] fsync on -wal still happening
Thanks, Simon. Indeed I did see that option. But I'm concerned about maintaining integrity in the middle of an power-interrupted commit, so I was hoping to leverage WAL's inverted behavior of not modifying the database file directly until checkpoint time. The only thing left to investigate is how WAL handles power loss during the middle of a checkpoint. Normally I believe it's able to recover because the next db open has the -wal file to look at. But not if I'm storing it in-memory. So I'm thinking I will copy the -wal file to disk before the checkpoint op for safety, and then load it back onto ramdisk after. I suppose I may have to exit WAL mode in order to perform that copy, making checkpointing a very expensive operation for my app. On Fri, 2017-12-29 at 16:48 +, Simon Slavin wrote: > > On 29 Dec 2017, at 4:34pm, Ian Freemanwrote: > > > I see, then what I'm seeing is just normal behavior of the writes > > being > > flushed to disk. I read what I wanted to hear about > > synchronous=NORMAL > > delaying writes to the -wal file. Instead I'm going to see if I can > > move -wal to a ramdisk and see how that will affect db integrity > > with > > power losses at in-opportune times. > > Did you see > > PRAGMA journal_mode = MEMORY > > ? > > Simon. > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Knowing when not to use a R-Tree index, and clustering tables along spatial indexes
Hi, I'm writing some software that can read data off GeoPackage (SQLite + rtree + standardized set of metadata tables) as an alternative format for spatial databases, like PostgreSql with the PostGIS extension. Now, when I use PostGIS the query plan optimizer checks the bbox provided in the query and verifies if using the spatial index is a good idea, or not. At conferences I've been told that the query has to be rather selective (e.g., retrieve less than 10% of the data) in order for the index to actually be used. With SQLite R-Tree I'm using either a join with the index virtual table, or a subquery retrieving the ids from the rtree. Regardless, the query is basically ordering SQLite to use the index. So I was wondering, is there any opportunity to run a blazing fast pre-query against the index that will tell me whether joining/subquerying into the rtree is going to be a win, or not? Also, while I'm here, in PostGIS there is an option to cluster a table along the spatial index, in order to reduce IO when the spatial index is the main access driver (which is often the case in geographic information systems). I looked at tables with no rowids, but it does not seem like a way to do it (spatial index not being suitable for primary key). Anything else that could be done here? Cheers Andrea ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] fsync on -wal still happening
On 29 Dec 2017, at 4:34pm, Ian Freemanwrote: > I see, then what I'm seeing is just normal behavior of the writes being > flushed to disk. I read what I wanted to hear about synchronous=NORMAL > delaying writes to the -wal file. Instead I'm going to see if I can > move -wal to a ramdisk and see how that will affect db integrity with > power losses at in-opportune times. Did you see PRAGMA journal_mode = MEMORY ? Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] fsync on -wal still happening
I see, then what I'm seeing is just normal behavior of the writes being flushed to disk. I read what I wanted to hear about synchronous=NORMAL delaying writes to the -wal file. Instead I'm going to see if I can move -wal to a ramdisk and see how that will affect db integrity with power losses at in-opportune times. Thank you. On Fri, 2017-12-29 at 11:17 -0500, Richard Hipp wrote: > On 12/28/17, Ian Freemanwrote: > > I have configured journal_mode=WAL, synchronous=NORMAL, and > > disabled > > autocheckpointing, and yet the -wal file is modified after every > > update > > operation... > > That is correct. "sync" means forcing the data from the internal > operating-system buffers out to persistent storage on your disk > drive, > where it will survive and OS reset or power loss. The WAL is written > on every transaction, in the sense that the content is moved into the > operating system. But it is only synced to persistent storage on a > checkpoint. > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] fsync on -wal still happening
On 12/28/17, Ian Freemanwrote: > I have configured journal_mode=WAL, synchronous=NORMAL, and disabled > autocheckpointing, and yet the -wal file is modified after every update > operation... That is correct. "sync" means forcing the data from the internal operating-system buffers out to persistent storage on your disk drive, where it will survive and OS reset or power loss. The WAL is written on every transaction, in the sense that the content is moved into the operating system. But it is only synced to persistent storage on a checkpoint. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] fsync on -wal still happening
I have configured journal_mode=WAL, synchronous=NORMAL, and disabled autocheckpointing, and yet the -wal file is modified after every update operation... That's not the documented behavior as I understand it. This smells like a bug to me, but before I go that direction, is there an explanation for this behavior? My simple test code loops an update query followed by a sleep call, and inotifywait shows a MODIFY for each of those update queries. I'm confused. Thanks for any light you can shed. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] The value of nFetchOut and zPath after wal-file calls sqlite3OsOpen()
I want to analyze the executive process of wal-file with the source code 3.16.2 . And I added: unixFile *pFd = (unixFile *)pRet->pWalFd; if(pFd->nFetchOut != 0){ printf("hi\n"); } to sqlite3WalOpen() after sqlite3OsOpen() is called: rc = sqlite3OsOpen(pVfs, zWalName, pRet->pWalFd, vfsFlags, ); I think sqlite3OsOpen() should open the pWalFd file and set the most of its member elements in struct unixFile to 0. But I saw so many 'hi' when I ran the TCL test of sqlite, especially in e_walckpt.test, wal.test and wal2.test. I find nFetchOut has been correctly initialized after fillInUnixFile() is called before unixOpen() returned. But nFetchOut, and sometimes zPath, mmapSize and other data elements seem to have an incorrect value when I check it after sqliteOsOpen(). Is it right? And why? Thanks -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users