Re: [sqlite] When is data committed on one connection seen on another?
On Wed, 14 Sep 2016 18:29:36 + (UTC) Alex Wardwrote: > Perhaps our schema needs a rework, would one table with a million > rows be better than 500 tables with 2000 rows each? 500 tables isn't right or wrong, but *counting* tables is. Table count is not a design-quality metric. There is no rule of thumb except Boyce-Codd Normal Form. Table design reflects the entities you choose to represent your domain of discourse. You should strive, as Einstein advised, to make your model as simple as possible, but no simpler. Sometimes we see schemas on this list with sets of identical tables, each set dedicated to a different client or somesuch. Those folks buy themselves trouble, because otherwise identical SQL has to vary by tablename, adding nothing but complexity to the application layer. At the opposite end of the spectrum is the classic entity-attribute-value design error. One table conquers all, including the programmers when they discover how slow things are when SQL is used before learned. Number of columns? Many widely used and correctly normalized financial market databases maintain thousands of rows on hundreds of columns. Complex models can easily have a hundred tables. Before I would agree a design has "too many" tables, I would want an affirmative answer to one of two questions: 1. Is there a process in place to dynamically extend the schema? If so, that indicates data -- some change over time -- has found its way into the metadata (the table names). 2. Do you frequently find yourself looking in more than one place for what -- to you -- substantially the same information. That would indicate the tables do not reflect your mental model of the problem domain. There are forces that drive intelligent table design other than the logical model. Things like performance and convenience inevitably cause tables that would otherwise pass textbook muster to be split and combined. Those things count, too. One thing that doesn't count is the count. Regards, --jkl ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
On 2016/09/14 8:29 PM, Alex Ward wrote: We currently have 500 tables and 1500 triggers in the schema. Perhaps that is why we didn't have much luck having one connection per thread or opening a connection per access. Perhaps our schema needs a rework, would one table with a million rows be better than 500 tables with 2000 rows each? YES. In 99% of cases it is far more better to have x million rows in a single table than trying to use multiple tables. It is the job of the DB engine to efficiently handle large tables, and sqlite (like most other DB engines) does a brilliant job of it because it is the Alpha use-case and the single biggest focus of efforts to enhance during development. There are cases when a dispersed schema might be superior, but they are few and obscure. 1 Table = 1 set of Indices, 1 set of triggers, 1 set of views. Perhaps your case was different, but mostly when people do something like this multiple-table thing, it is because the try to think for the DB engine, and assume the multiple table way is better (maybe it just "feels" better), but upon testing, you will find the amount of effort to run multiple tables outweighs any gain in access speed significantly. Also, a B-Tree Index works far better on one large table than many B-Tree indices on many tables. The increase in time taken to hit a specific PK in a large table diminishes rapidly with table growth. (In simple binary terms, IIRC, it takes 5 lookup steps to hit a PK in just 30 rows, yet only 24 look-up steps to hit a PK in 1 million rows, 25 to hit it in 2 mil rows, etc. - Law of diminishing returns in action) We were considering adding a view per table too, that would make it 3000 elements in the schema, if 100 is considered a lot are we attempting to use sqlite in a manner that it is not suited for? We may need to consider another alternative if that's the case, although that would be a shame. This alone should be enough of a reason to reconsider. Sadly we are not comparing sqlite with another SQL DB. We are attempting to replace an in-house memory resident non-sql database with sqlite. So our comparison is between a) reading from memory and b) opening a connection and reading from sqlite. This is perfect, it's what SQLite was born to do. Just trust SQLite to do it's thing and don't try to pre-empt how it will fare and prematurely optimize by distributed schemata and the like. Take the simplest route first, if that turns out to really be too slow, /then/ perhaps ask what can be done to improve, given the data/schema specifics. (And yes, we are not oblivious to the fact that you may have already invested insane amounts of time in doing it the other way, don't throw away the code yet, just try the normal way also)! Good luck! Ryan ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
>> As already stated, and per my own experience, each thread should have it's >> own connection and do whatever it needs to do there, without interference >> from other threads. > >I appreciate this point. Early prototyping indicated that this might not be >possible for our system, which makes me a little nervous if that is what most >users end up doing. We will >definitely take another look at not sharing >connections. > Maybe consider creating a pool of "reader" threads that get created at application startup. Each one opens it's very own dedicated connection at the beginning. Also have one "listener" thread that accepts requests and farms them out to the "readers". That way you have control over the number of connections being created and the overhead of connecting is done once at application startup. ___ Gunter Hick Software Engineer Scientific Games International GmbH FN 157284 a, HG Wien Klitschgasse 2-4, A-1130 Vienna, Austria Tel: +43 1 80100 0 E-Mail: h...@scigames.at This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
Alex, > Le 14 sept. 2016 à 20:29, Alex Warda écrit : > >> Unless you have a very complex schema (I would say at the very least more >> than on hundred tables and other create statement), opening a SQLite >> connection is lightweight, > > We currently have 500 tables and 1500 triggers in the schema. Perhaps that > is why we didn't have much luck having one connection per thread or opening a > connection per access. Perhaps our schema needs a rework, would one table > with a million rows be better than 500 tables with 2000 rows each? > > We were considering adding a view per table too, that would make it 3000 > elements in the schema, if 100 is considered a lot are we attempting to use > sqlite in a manner that it is not suited for? We may need to consider > another alternative if that's the case, although that would be a shame. My 100 number is nothing definitive. SQLite stores the text of the schema, and parses the schema on connection. The parser has an impressive speed. The larger/more complex the schema becomes, it clearly will take a little bit longer to parse. But it is very hard to know where to draw the line. Below 100 is certainly small, that does not necessarily make > 100 too big. For sure one million rows (assuming they're not each extraordinary in size) is _not_ a large number of rows. Is it more fitted than 500 tables with 2000 rows each? I can't discuss your schema (and many people here are way more proficient than me in SQL design) with so few knowledge of what's the model of the data. I can only say that if you're artificially splitting a single logical data set in 500 tables of 2000 rows, then I would keep them in a single table. You would have much less triggers also. But I assume this is not simply the case. (Reviewing this text before posting, I read Simon goes even further along the same line. I'd take that as a good incentive to have a closer look at your schema.) This aside, if you have a problem with the time taken by establishing a new connection, then building on what I briefly suggested, I would pool threads without closing their connection when the thread is done with its work and ready to suspend until needed again. This way you would have the benefit of one connection per thread, without the full impact of having to re-open the database file each time a thread has got to do some work. It's a pool of both threads and connections, to gain on the two sides. A simple pool of pre-connected connections could be worth considering (probably what you had in mind) but I would again strongly advise you to refrain to share any of these connections between two (or more) running threads. You would have to either use SQLite in its mode of SERIALIZED (default unless changed at runtime or compile time) or add mutual exclusion between threads sharing a connection if using the MULTITHREAD mode. The MULTITHREAD mode is not a magic mode that 'makes it work' within threaded applications. To the contrary, it is meant for threaded applications which take the whole responsibility of knowing what they're doing. SQLite will not protect concurrent access to the connection state. So bad things will happen, unless very properly serialized. In either solution, threads sharing a same connection are now executing serialized. It might be equivalent and simpler to queue up the requests and process them one after each other through one single thread... If you can achieve/afford one connection per thread, MULTITHREAD mode is then easy to use without risks. Along with the database set for WAL journal mode, all threads doing reads will really have opportunities to work at the same time. Will the time needed to establish a new connection per each thread will kill the benefits of having threads which can actually work simultaneously? Only you will tell. (Sorry for these long answers.) Best, -- Meilleures salutations, Met vriendelijke groeten, Best Regards, Olivier Mascia, integral.be/om ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
On 14 Sep 2016, at 7:29pm, Alex Wardwrote: > Perhaps our schema needs a rework, would one table with a million rows be > better than 500 tables with 2000 rows each? Yes. As a guideline, if two tables have the same columns (or even nearly the same columns) you should consider merging them and adding one extra column to replace the table name. This is not a 100% rule, but it's definitely the way to think. Fewer tables mean you fewer indexes, fewer views, fewer database pages, better re-use of released space and more efficient schema handling (no need to search through hashes of 500 tables for each command). Since we're talking guidelines, another is that you should be able to hold the entire column makeup of one table in your head at once. So a table with more than say 20 columns is another sign you might rethink things. I'm not accusing you of making that mistake, I just thought I'd mention the two guidelines together. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
> On 14 Sep 2016, at 5:47pm, Alex Wardwrote: > >> Behalf Of Simon Slavin > >> It is not enough to lock the _prepare, lock the _step()s, and lock the >> _finalize. >> If they're sharing a connection with other threads then the lock has to be >> placed at the start of the prepare and be released at the end of the >> finalize. >> Don't forget that they're all part of the same transaction. > > Got it, it's the set of the 3 operations that should be serialized. Thanks. > > The finalize is the end of the implicit transaction, correct? Right. > Assuming that it is the finalize for the only statement being executed on > that connection. Then if the same thread did another prepare/step/finalize > on that same connection it would be in a new implicit transaction. Do I have > that right? The parameter you pass to _step() and _finalize() is a statement handle (and the statement knows which connection it should be using). So yes, it's possible to maintain multiple statements for one connection, and SQLite will know which statement you're finalizing. The SQL standard says that if you try doing SELECT without a BEGIN it should return an error message. Because you shouldn't be doing database stuff outside a transaction. Instead of returning an error SQLite automatically wraps BEGIN/COMMIT around any statement (even if it just reads) if you haven't already done your own BEGIN. So what you're getting from prepare/step/finalize is actually (I might have this wrong I haven't read the source code) _prepare() first _step() BEGIN SELECT result (first row) more _step() SELECT results (more rows) eventually a SELECT results in SQLITE_DONE _finalize() COMMIT So think about what would happen if you had two of these running simultaneously. The BEGIN on one query would not lock out the BEGIN on another query, but it could prevent a INSERT/UPDATE from writing to the database. This is what you're trying to simulate when you do your own locking/mutex, and that's why it's so easy to get it wrong. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On > Behalf Of Olivier Mascia > If I'm permitted: you're wrong. Ha, yes, well if our initial threading model is anything to go by, I think you're on safe ground with that assertion. > Unless you have a very complex schema (I would say at the very least more > than on hundred tables and other create statement), opening a SQLite > connection is lightweight, We currently have 500 tables and 1500 triggers in the schema. Perhaps that is why we didn't have much luck having one connection per thread or opening a connection per access. Perhaps our schema needs a rework, would one table with a million rows be better than 500 tables with 2000 rows each? We were considering adding a view per table too, that would make it 3000 elements in the schema, if 100 is considered a lot are we attempting to use sqlite in a manner that it is not suited for? We may need to consider another alternative if that's the case, although that would be a shame. > is lightweight, especially compared to whatever experience you > might have had with quite any other SQL system. Sadly we are not comparing sqlite with another SQL DB. We are attempting to replace an in-house memory resident non-sql database with sqlite. So our comparison is between a) reading from memory and b) opening a connection and reading from sqlite. > you intend to loose on both of them (pooling connections > and sharing them across threads at the cost of mutual exclusion contention > to get it working). Believe me that we would like nothing more than to do what you suggest, we will be looking into any way possible to get this implemented in the optimal manner. Thanks a lot for giving your advice. Merci Alex ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On > Behalf Of Igor Tandetnik > Statement execution starts with the first sqlite3_step after sqlite3_prepare > or the most recent sqlite3_reset; and ends with sqlite3_reset or > sqlite3_finalize. Thanks makes sense. Thanks. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
> Le 14 sept. 2016 à 18:14, Alex Warda écrit : > > I don't think we can afford to have a connection per thread. We have an > arbitrarily large number of threads coming and going all the time and a > combination of the amount of memory each connection is taking up and how long > it takes to connect to the db it looks like limiting the number of > connections would be the direction we go if we can get it to work. Maybe a > pool of connections. If I'm permitted: you're wrong. This assertion is plain wrong, even when I don't have the least idea of your code architecture and the number of threads we're talking here. You, generally, won't get anything, except problems or at least 'complications', by sharing a SQLite connection between multiple threads. For one thing, please, do _not_ think you will gain anything like better performance from this approach. To get it to work you will basically have to get to do so much mutual exclusion between threads competing for the shared connection that the workload done by these threads could as well be serialised in a single thread. And this is not what your threading design is intended to do, I guess. Unless you have a very complex schema (I would say at the very least more than on hundred tables and other create statement), opening a SQLite connection is lightweight, especially compared to whatever experience you might have had with quite any other SQL system. Also please keep in mind that in SQLite, the 'connection' object is the 'transactional-control' object, which quite often is a distinct object in other SQL systems. My only recommendation would be to take the time to test your software with one new connection opened by any thread when it needs it or at start of thread, then properly closed when done with it or right before thread end. The one case where I agree this wouldn't be appropriate is if the database file is NON local (network filesystem). But then a heavily multi-threaded piece of software dealing with such a remote file is probably wrong in the first place. On Windows systems what I do is to pool threads. Creating a thread is far from being costless (in resources and time). There I get performance benefits. And from using WAL mode along with a nice distinct connection per each thread, I get a good level of read concurrency among threads. I win on two levels where you intend to loose on both of them (pooling connections and sharing them across threads at the cost of mutual exclusion contention to get it working). :) -- Meilleures salutations, Met vriendelijke groeten, Best Regards, Olivier Mascia, integral.be/om ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On > Behalf Of Simon Slavin > It is not enough to lock the _prepare, lock the _step()s, and lock the > _finalize. > If they're sharing a connection with other threads then the lock has to be > placed at the start of the prepare and be released at the end of the > finalize. > Don't forget that they're all part of the same transaction. Got it, it's the set of the 3 operations that should be serialized. Thanks. The finalize is the end of the implicit transaction, correct? Assuming that it is the finalize for the only statement being executed on that connection. Then if the same thread did another prepare/step/finalize on that same connection it would be in a new implicit transaction. Do I have that right? > This is why you don't generally share a connection between simultaneous > threads. If the threads have different connections and you let SQLite handle > the locking things work properly. Hmm yes, I'm hearing that repeatedly. We will attempt to identify why using a lot of connections is using such a huge amount of memory and maybe we will be able to go that route. Thanks again Alex ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On > Behalf Of Hick Gunter > Sharing a connection between threads makes it practically impossible for any > one thread to tell when a transaction begins or ends. From the point of view > of the database connection, the first statement to begin processing opens a > transaction and the last statement to end processing (which could be in a > totally different thread) closes it. IT ia all too easy to have a thread open > a > transaction and forget about it later. This causes all the other threads to > see > consistent (not stale) data. Enabling "read uncommitted" may alleviate the > sypmtoms, but it does not remove the cause. Seems like if we are going to share a single connection we would need to ensure that only one operation is happening at one time. I'm hoping that when you say above "and the last statement to end processing...closes it" means that if we guarantee to have just one set of prepare/step/finalize happening at one time on a connection then the next set will always start a new transaction. Does that sound correct? > Your assertions "a deleted row on one connection is found by a select on the > other" and "BEGIN/DELETE/COMMIT and SELECT is happening in the same > thread" is perfectly consistent if BEGIN/DELETE/COMMIT happens on one > connection and SELECT on the other. Yes that is exactly what we were doing. BEGIN/DELETE/COMMIT happened on one connection and SELECT on the other. Ooops. > As already stated, and per my own experience, each thread should have it's > own connection and do whatever it needs to do there, without interference > from other threads. I appreciate this point. Early prototyping indicated that this might not be possible for our system, which makes me a little nervous if that is what most users end up doing. We will definitely take another look at not sharing connections. Danke sehr Alex ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On > Behalf Of Simon Slavin > Your description of your fault suggests that at least two of your > threads/processes are trying to use the same connection to the database at > the same time We have a bunch of reads happening at the same time on one connection. The writes were serialized with a write lock but the reads were just doing whatever they wanted. It seems like this is our mistake. > > We are trying to implement a system where all writes occur on one > connection ... Does it > sound like we are doing something fundamentally wrong by trying to share > connections across threads in this way? > > If you don't have your own mutex locks, then yes. Don't try to set up a > situation where two transactions are happening at the same time with the > same connection. The standard way to do that correctly is to give each > thread its own connection to the database and let SQLite do all the locking > necessary. However it's not difficult to make your own mutex system work, > it just seems like your own one isn't working properly. > Great info. I don't think we can afford to have a connection per thread. We have an arbitrarily large number of threads coming and going all the time and a combination of the amount of memory each connection is taking up and how long it takes to connect to the db it looks like limiting the number of connections would be the direction we go if we can get it to work. Maybe a pool of connections. We'll try serializing the reads on one connection and see what performance that gives us. It seems like we will be giving up a lot of parallelism, but as we are not sure yet exactly how the locking in the DB works that may not be as bad system impact as we fear. Thanks for the info Alex ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
AFAIK there is no "write mark" in the WAL journal. Instead, each read transaction (either implicit or explicit) has an associated "read mark" that determines which data the transaction will see (i.e. the data from the main file plus any pages in the WAL journal file before the read mark). This is the "CI" in ACID. Whatever the state of any table in the database, the transaction will always see the same state, even when querying the same row more than once (Consistent), irrespective of what other transactions may have "concurrently" changed (Isolation). Sharing a connection between threads makes it practically impossible for any one thread to tell when a transaction begins or ends. From the point of view of the database connection, the first statement to begin processing opens a transaction and the last statement to end processing (which could be in a totally different thread) closes it. IT ia all too easy to have a thread open a transaction and forget about it later. This causes all the other threads to see consistent (not stale) data. Enabling "read uncommitted" may alleviate the sypmtoms, but it does not remove the cause. Your assertions "a deleted row on one connection is found by a select on the other" and "BEGIN/DELETE/COMMIT and SELECT is happening in the same thread" is perfectly consistent if BEGIN/DELETE/COMMIT happens on one connection and SELECT on the other. As already stated, and per my own experience, each thread should have it's own connection and do whatever it needs to do there, without interference from other threads. -Ursprüngliche Nachricht- Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im Auftrag von Alex Ward Gesendet: Dienstag, 13. September 2016 22:00 An: sqlite-users@mailinglists.sqlite.org Betreff: [sqlite] When is data committed on one connection seen on another? We think we are seeing the case where a read on one WAL mode Sqlite database connection using the C API is reading stale data after a transaction has committed data changes on another connection. For instance, a deleted row on one connection is still found by a select on the other. The BEGIN/DELETE/COMMIT then SELECT (prepare/step/finalize for each) is happening in the same thread.We expected the commit of the write to be the point in time after which any read would read that committed data. This does not seem to be the case here. What is the standard idiom to avoid stale data while still allowing all but a writing thread not to see uncommitted data? Is there a window of time between a commit on one connection and the data being available on another connection? Is that deterministic? When is the WAL mode "end mark" moved on a connection past other changes? Is it possible that we have an end mark on the read connection that is still behind the write on the other connection? What would trigger it to move? Would other threads doing reads in parallel on the same connection affect when the end mark is moved? In our test we serialized reads so there could only be one happening at a time in an attempt to remove any chance of this. But we still saw stale data. We are trying to implement a system where all writes occur on one connection (in a SQL transaction where a writing thread would see the uncommitted data) and all other reads on a second connection. Does it sound like we are doing something fundamentally wrong by trying to share connections across threads in this way? Any advice greatly appreciated Alex ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ Gunter Hick Software Engineer Scientific Games International GmbH FN 157284 a, HG Wien Klitschgasse 2-4, A-1130 Vienna, Austria Tel: +43 1 80100 0 E-Mail: h...@scigames.at This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
On 13 Sep 2016, at 9:00pm, Alex Wardwrote: > What is the standard idiom to avoid stale data while still allowing all but a > writing thread not to see uncommitted data? You should not need to do anything special to arrange this. > Is there a window of time between a commit on one connection and the data > being available on another connection? Is that deterministic? When is the > WAL mode "end mark" moved on a connection past other changes? Is it possible > that we have an end mark on the read connection that is still behind the > write on the other connection? What would trigger it to move? It doesn't work quite the way you describe but again you should not need to do anything special to make it all work properly. > Would other threads doing reads in parallel on the same connection affect > when the end mark is moved? In our test we serialized reads so there could > only be one happening at a time in an attempt to remove any chance of this. > But we still saw stale data. There's no 'end mark' as such. A read should, at absolute worst, delay changes making it to the database until the read is finalized. A read done using sqlite3_exec() finalizes itself. A read done using _prepare() and _step() must be manually finalized using either _finalize() or _reset(). Your description of your fault suggests that at least two of your threads/processes are trying to use the same connection to the database at the same time. (It doesn't really matter whether these threads are reading or writing, you can get confusion on the connection either way.) Another possibility is that you're using a PRAGMA which looks like it should speed things up but it does so by telling SQLite not to do necessary multi-thread/multi-processing checks. > We are trying to implement a system where all writes occur on one connection > (in a SQL transaction where a writing thread would see the uncommitted data) > and all other reads on a second connection. Does it sound like we are doing > something fundamentally wrong by trying to share connections across threads > in this way? If you don't have your own mutex locks, then yes. Don't try to set up a situation where two transactions are happening at the same time with the same connection. The standard way to do that correctly is to give each thread its own connection to the database and let SQLite do all the locking necessary. However it's not difficult to make your own mutex system work, it just seems like your own one isn't working properly. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] When is data committed on one connection seen on another?
On 9/13/2016 4:00 PM, Alex Ward wrote: We think we are seeing the case where a read on one WAL mode Sqlite database connection using the C API is reading stale data after a transaction has committed data changes on another connection. For instance, a deleted row on one connection is still found by a select on the other. The BEGIN/DELETE/COMMIT then SELECT (prepare/step/finalize for each) is happening in the same thread.We expected the commit of the write to be the point in time after which any read would read that committed data. This does not seem to be the case here. Somehow or other, you have an open read transaction on the connection doing the SELECT, dating from before the write was committed. Often caused by forgetting to reset or finalize a statement. Would other threads doing reads in parallel on the same connection affect when the end mark is moved? Which connection is "the same" one? The reader or the writer? If you have other reads in progress on the reader connection, then they likely keep a transaction open, and your SELECT is done within that transaction, which goes back in time farther than you thought. If you have other reads in progress on the writer connection, then COMMIT likely fails. We are trying to implement a system where all writes occur on one connection (in a SQL transaction where a writing thread would see the uncommitted data) and all other reads on a second connection. Does it sound like we are doing something fundamentally wrong by trying to share connections across threads in this way? Realize that a transaction is a property of a connection, not a thread or a query. Suppose thread A starts a read at time T and ends it at time T+20; and thread B starts a read on the same connection at time T+10 and ends it at T+30. Both reads are part of the same unbroken transaction lasting (at least) from T to T+30. If there's an update committed at T+5 on a different connection, neither read would see it, even though thread B started after it. -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users