Re: [sqlite] SQLite Application Server Concurrency
James, you don't need to go to all the lengths of creating a SQLite proxy RPC as has been suggested. Your service could implement one thing only: a lock on the database file (external to the SQLite locking mechanism, of course). So it only needs to serialize concurrency; the client does the RPC (via whatever protocol is handy to you - http, WCF, whatever) to obtain the lock; then it can safely write to the DB over NFS. As long as all clients pass through the locking mechanism, you are safe. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
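A minimal in-process sketch of the locking idea (all names invented; a process-wide threading.Lock stands in for the RPC lock service, so only the serialization logic is shown; in a real deployment acquire/release would be RPC calls and the database file would live on NFS):

```python
import os
import sqlite3
import tempfile
import threading

# Hypothetical stand-in for the lock service: in a real deployment this
# lock would live in a separate process and be acquired via RPC.
write_lock = threading.Lock()

def locked_write(db_path, value):
    # Acquire the global lock (the "RPC call"), then write safely.
    with write_lock:
        con = sqlite3.connect(db_path)
        con.execute("CREATE TABLE IF NOT EXISTS t(v INTEGER)")
        con.execute("INSERT INTO t VALUES (?)", (value,))
        con.commit()
        con.close()

def demo():
    db = os.path.join(tempfile.mkdtemp(), "shared.db")
    threads = [threading.Thread(target=locked_write, args=(db, i))
               for i in range(10)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    con = sqlite3.connect(db)
    n = con.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    con.close()
    return n
```

Since every writer passes through the same lock, the writes serialize and all ten inserts land, which is the whole job the proposed service would do.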
Re: [sqlite] Efficient ways to maintaining order rank / row_number() in a rather large set ?
Analogous to the percentile solution (it's actually the same thing), you can use a checkpointing table. This has roughly SQRT(n) complexity for both read and write. E.g., say you expect to have 1M records and define order based on value then id. You then make a checkpoint table (first_rank, value, rec_id) holding every 1000th record in sorted order, so row 1 of the checkpoint table corresponds to the 1000th sorted record. When you insert/delete a row, you only need to update the checkpoints that come after said row. When you are searching for row 4521, you do something like:

SELECT FROM table JOIN checkpoint
WHERE ( (table.value=checkpoint.value AND table.id>=checkpoint.id)
        OR table.value>checkpoint.value )
  AND checkpoint.first_rank=4500
ORDER BY table.value ASC, table.id ASC
LIMIT 21,1
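A scaled-down sketch of this checkpoint scheme (every 10th row instead of every 1000th; all table and column names are invented):

```python
import sqlite3

K = 10  # checkpoint granularity (the post uses 1000; scaled down here)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, value INTEGER)")
con.execute("CREATE INDEX t_value ON t(value, id)")
con.executemany("INSERT INTO t(id, value) VALUES (?, ?)",
                [(i, (i * 37) % 50) for i in range(1, 101)])

# Checkpoint table: every K-th record in (value, id) sorted order.
con.execute("CREATE TABLE checkpoint"
            "(first_rank INTEGER PRIMARY KEY, value INTEGER, id INTEGER)")
ordered = con.execute("SELECT value, id FROM t ORDER BY value, id").fetchall()
for rank0 in range(0, len(ordered), K):
    v, i = ordered[rank0]
    con.execute("INSERT INTO checkpoint VALUES (?, ?, ?)", (rank0 + 1, v, i))

def row_at_rank(rank):
    """Fetch the rank-th row (1-based) by seeking from the nearest checkpoint."""
    first_rank = ((rank - 1) // K) * K + 1
    cp_value, cp_id = con.execute(
        "SELECT value, id FROM checkpoint WHERE first_rank = ?",
        (first_rank,)).fetchone()
    # Walk forward from the checkpoint: at most K-1 rows to skip.
    return con.execute(
        "SELECT value, id FROM t "
        "WHERE (value = ? AND id >= ?) OR value > ? "
        "ORDER BY value, id LIMIT 1 OFFSET ?",
        (cp_value, cp_id, cp_value, rank - first_rank)).fetchone()
```

With K near SQRT(n), both sides stay balanced: a lookup skips at most K rows past a checkpoint, and an insert/delete shifts at most n/K checkpoints.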
Re: [sqlite] Efficient ways to maintaining order rank / row_number() in a rather large set ?
The only way to do this efficiently would be to have counting (range) index b-trees. Since you don't, you're stuck with an O(n) implementation, either on reading or on writing. So your solution is as good as it gets, save maybe some implementation particularities. However, you may consider a shift in perspective: with this kind of data, statisticians use percentiles. That is, instead of querying for ranks 21,000-22,000, you query for "top 1%", "6-8%" etc., based on either value or rank; this way, you can maintain a percentile rank table as granular as you like (e.g. every 1% results in a table with 100 lines). Each line would hold count, value min, value max. Such a table is much faster to update, and then if you need to retrieve the actual records, you query by range (value BETWEEN min AND max) joined with the percentile table.
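A sketch of the percentile-table idea (invented schema; 1000 rows summarised into 100 one-percent lines):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, value INTEGER)")
con.execute("CREATE INDEX t_value ON t(value)")
con.executemany("INSERT INTO t(id, value) VALUES (?, ?)",
                [(i, i * 3) for i in range(1, 1001)])

# Percentile table: one line per 1% of the data, each holding
# count, min value and max value of its slice.
con.execute("CREATE TABLE pct(p INTEGER PRIMARY KEY, cnt INTEGER,"
            " vmin INTEGER, vmax INTEGER)")
ordered = [r[0] for r in con.execute("SELECT value FROM t ORDER BY value")]
step = len(ordered) // 100
for p in range(100):
    chunk = ordered[p * step:(p + 1) * step]
    con.execute("INSERT INTO pct VALUES (?, ?, ?, ?)",
                (p, len(chunk), chunk[0], chunk[-1]))

# "Top 1%": resolve the rank range to a value range, then fetch the
# actual records with an indexed BETWEEN instead of a rank scan.
vmin, vmax = con.execute("SELECT vmin, vmax FROM pct WHERE p = 99").fetchone()
top = con.execute("SELECT id, value FROM t WHERE value BETWEEN ? AND ?"
                  " ORDER BY value", (vmin, vmax)).fetchall()
```

The percentile table is tiny (100 rows) and cheap to maintain, while the final fetch rides the value index; note that duplicate values straddling a bucket boundary would make the BETWEEN slightly over-fetch.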
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
Another reason to do at least the LEFT JOIN + WHERE -> INNER JOIN optimisation: if a view is based on a LEFT JOIN, running a WHERE query on it will exhibit the same poor behavior, and in that case there is no way to rewrite the query.
Re: [sqlite] The performance of indexed select
... and the downside that it's just linear overhead for e.g. a unique index; it works best for indexes with low cardinality... win some, lose some :)
Re: [sqlite] The performance of indexed select
What I do notice reading https://www.sqlite.org/fileformat.html (if I get it right) is that index entries are stored in the form (a, b, c, rowid) for an index on a, b, c. Whereas they could be represented as [a, b, c, pk_list] plus a count, with pk_list being a subtree; reverse lookup from table record to index record would arguably be as fast, if not faster. So this would have the advantage of more compact indexes (less data) and of having a per-key entry count (no prefix, so there is always just one update involved), with the downside of the complexity of an added level of indirection.
Re: [sqlite] Why doesn't SQLite optimise this subselect?
If you want just one (arbitrary) value from the lookup, you can use:

SELECT post_processing_info,
       ( SELECT is_json FROM tables_lookup
         WHERE tables_lookup.content_hash=webpage_contents.content_hash
         LIMIT 1 ) AS is_json
FROM webpage_contents
WHERE content_hash = 'abc'

Or, if you want the full join:

SELECT post_processing_info, x.is_json
FROM webpage_contents
JOIN ( SELECT content_hash, MIN(is_json) AS is_json
       FROM tables_lookup GROUP BY content_hash ) AS x
  ON x.content_hash=webpage_contents.content_hash

I'm using MIN() here as an arbitrary aggregate to get just one value from the group.
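A runnable sketch of both variants on a toy schema (table contents invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE webpage_contents(content_hash TEXT, post_processing_info TEXT);
CREATE TABLE tables_lookup(content_hash TEXT, is_json INTEGER);
INSERT INTO webpage_contents VALUES ('abc', 'info-abc');
INSERT INTO tables_lookup VALUES ('abc', 1), ('abc', 0), ('xyz', 1);
""")

# Variant 1: correlated scalar subquery; LIMIT 1 picks an arbitrary
# matching lookup row.
r1 = con.execute("""
    SELECT post_processing_info,
           (SELECT is_json FROM tables_lookup
             WHERE tables_lookup.content_hash = webpage_contents.content_hash
             LIMIT 1) AS is_json
      FROM webpage_contents
     WHERE content_hash = 'abc'
""").fetchall()

# Variant 2: join against a grouped derived table; MIN() is only an
# arbitrary aggregate used to collapse each group to one value.
r2 = con.execute("""
    SELECT post_processing_info, x.is_json
      FROM webpage_contents
      JOIN (SELECT content_hash, MIN(is_json) AS is_json
              FROM tables_lookup GROUP BY content_hash) AS x
        ON x.content_hash = webpage_contents.content_hash
     WHERE webpage_contents.content_hash = 'abc'
""").fetchall()
```

Both return exactly one row per content hash; note they need not return the *same* lookup value, since variant 1 takes whichever row the scan hits first while variant 2 deterministically takes the minimum.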
Re: [sqlite] sqlite3_column_decltype and max and min
Keith Medcalf wrote
> Full Schema Tables:

Thanks for this, I will have great use for it too, soon! Actually, it's so nice I think it could go into the documentation.
Re: [sqlite] The performance of indexed select
I think storing index prefix counts would only make sense in a special kind of 'statistical' index, where you would store count(x IS NOT NULL), sum(x), sum(x^2) so that usual statistical functions can be computed optimally. For a table count, I think it would make sense.
Re: [sqlite] Why doesn't SQLite optimise this subselect?
Could you describe your intent in more detail? SELECT * FROM ... GROUP BY is unstable at best, so I cannot really grasp your intention - which lookup record's value do you want? If I guess right, you might want something along the lines of:

SELECT w.post_processing_info, l.is_json
FROM webpage_contents w
JOIN (SELECT DISTINCT content_hash, is_json FROM lookup_tables.lookups) l USING (content_hash)
WHERE content_hash = 'abc'

(this might return 0-3 rows if lookup_tables contains 3 values (0, 1, and NULL) for is_json), which in turn I think would optimize best if written as:

SELECT DISTINCT w.content_hash, w.post_processing_info, l.is_json
FROM webpage_contents w
JOIN lookup_tables.lookups l USING (content_hash)
WHERE content_hash = 'abc'
Re: [sqlite] The performance of indexed select
Richard Hipp-3 wrote > all the parent b-tree pages must be updated Yup, no question about it, at best it could be an opt-in. But as it is a design decision, I checked to make sure count() really is O(n) as Jonathan's question implied.
Re: [sqlite] The performance of indexed select
Clemens Ladisch wrote > For count(*), the database does not need the actual table rows. I think this is not true, he has a point here: SELECT COUNT(*) WHERE =? needs to examine every index key prefix (excluding at least ROWID) that matches. This may mean reading in the whole index. I think b-trees can store the counts of descendant nodes for every node to solve this issue in O(log n), but I don't see anything like it in the SQLite format.
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
Keith Medcalf wrote
> The primary difference is likely the test on b.c.

/There was some text here that seems to have been lost, and my browser opened a "super resume templates" page... maybe I've already been spectre-hacked? Anyway, I repost:/

The primary difference, from what I see, is that it changes the query plan to start off with an index-based search on b, not a search (scan) on a. To me, this can be due to the supplementary condition in the ON clause making this plan more appealing to the optimizer. So with this in mind, an easy optimisation would be to just "permeate" any imperative non-null condition from the WHERE clause into the ON clause. I don't know if the execution plan is identical to that of an inner join (I never tried to actually convert it to an inner join, and I'm working on something else at the moment), but it's satisfactory and it starts off the right way, with a search on b, with an execution time well within the expected margins.

Keith Medcalf wrote
> I think the query is ill-conceived when written as an outer join.

If anything, to reflect intentionality the query should have been written as such:

SELECT FROM ( SELECT FROM a LEFT JOIN b ) WHERE b.c=5
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
To reiterate, Keith: to get the query to execute properly, I didn't change the LEFT JOIN to an INNER JOIN! Nope, I rewrote

SELECT FROM a LEFT JOIN b ON WHERE b.c=5

to

SELECT FROM a LEFT JOIN b ON AND b.c=5 WHERE b.c IS NOT NULL

So I just added a redundant predicate and it runs perfectly, on SQLite! That's why I said this simple improvement can surely be taken care of in the optimizer, while the larger discussion of actually changing the outer join to an inner join, or even tackling the -OR- case, is for sure something nice to think of, but increasingly more complicated.
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
Keith Medcalf wrote
> but cannot be re-ordered for OUTER JOINS.

Actually, I think the order requirements (or rather, row grouping requirements; as far as I can tell, the requirement is just that all outer-joined rows come in a bunch for each main join tuple) would not be violated if the join is made on a unique key on the left side and an index is used on the right side :) or something similar. I don't know, extensive algebra must be involved :) Even without index order inference, the main trunk keys can be sorted in a temp structure to preserve the condition, like in a GROUP BY query. However, I do see that SQLite seems to actually do it (scan b before a, I mean) if I rewrite the query as I showed.

> select * from a, b where a.a *= b.a
was replaced by
> select * from a LEFT JOIN b ON a.a = b.a

Right, right, and with this in mind you can see my problem with the query is easy to understand. My query, in the old format, is:

select * from a, b where a.a *= b.a AND b.c = 5

My "improved" query, in the old format:

select * from a, b where a.a *= b.a AND b.c *= 5 AND b.c = 5 // (or: b.c IS NOT NULL)

You can see the b.c *= 5 (JOIN ON ... AND b.c=5 ... WHERE b.c IS NOT NULL) is redundant, because it's just a weaker predicate; I needed to add it just as an index hint on the join loop to trigger the right execution plan.

Keith Medcalf wrote
> I would sincerely doubt that there is *any* SQL optimizer or query planner
> that can optimize mis-spoken queries containing OUTER JOINS.

I don't know about mis-spoken; I don't think anything is mis-spoken. This app is running fine on Maria, and I'm in the process of porting it to SQLite. I wouldn't have picked on this query unless it was lagging behind by orders of magnitude (2.5s vs 50ms). So I think Maria does it (I haven't bothered to check the execution plan there; I went straight to hacking SQLite).
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
Short sum-up:
- The -OR- alternative I provided is not semantically equivalent; I will work on one that is :)
- The other one, without the -OR- (second post), still stands.
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
Richard, thanks for acknowledging this. You are absolutely right; that's why I stated that no DB does perfect optimisations and that computing the alternative -OR- based closures is probably much harder to tackle. Also, E. Pasma pointed out that the -OR- queries as I wrote them are not really semantically equivalent unless the 2 joins are disjunct. However, the case of the imperative WHERE NOT NULL implying an INNER JOIN is just a matter of replacing a predicate with a stronger one, so in all fairness I imagined it a far lesser overhead than, say, the query flattener. And I imagine it's a much more common situation, too, especially when users are adding additional filters via WHERE clauses to a base query, so it might benefit a lot of users. I know it would help us, by not having to rewrite these queries when porting; we are working on x86 servers, and a stick of memory or a hard drive costs less than a programmer's day for us :) As for the extra memory, I know that for computing relational closures the spatial complexity can get big, but only when the structure of the query warrants it in the first place, so it shouldn't weigh heavily on a query that doesn't have this structure. This is just my best view on this; obviously it's a political decision to be made, so it's no make-or-break thing. Like I mentioned before, we are porting a pretty big system, and when I notice differences with SQLite I jot them down, in the hope it might benefit you or the millions of users - if not by changing SQLite, then simply by pointing out the workaround to other users, such as moving the WHERE condition out to the ON clause; it's not necessarily a trivial thing to consider for everyone.

Ryan, you cannot ask SQL a "method" query; that's where my whole RDBMS understanding takes me. It nullifies the purpose of queries as well as all the efforts you yourselves have put into a lot of things, query flattening to mention just one. The "same result" is not accidental: equivalent queries will produce the same result no matter which data populates the tables. That is the only definition I know of semantic equivalence. SQL is declarative, and thus everything that describes the same thing is the same thing.

E. Pasma, thanks for taking the time to make the TC. This is always a huge putdown for me, because ultimately the execution plan depends on the data the indexes are populated with (via ANALYZE), and our tables are huge, so creating a minimal TC is always painful. Indeed, I noticed just now that the 2 queries are not equivalent that way :) Thanks for pointing that out! I will work on an equivalent -and- optimized rewrite :) For the query plans though, here is where the index stats come in: here a "SCAN a" makes sense, but in our case the number of records in a is on the order of 1x the records in b and c, and the cardinality of b.d and c.d is on the order of 1000; so a "SEARCH b, SEARCH c" works out. At minimum you should have indexes on b.d, c.d, a.ab, a.ac; but even so, and after adding another 1000 records to a, b and c and running the query:

EXPLAIN QUERY PLAN
SELECT * FROM a
JOIN b ON b=ab AND b.d IN (1,2,3)
JOIN c ON c=ac AND c.d IN (4,5)

selectid | order | from | detail
---------|-------|------|----------------------------------------------------
0        | 0     | 0    | SCAN TABLE a
0        | 1     | 1    | SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?)
0        | 0     | 0    | EXECUTE LIST SUBQUERY 1
0        | 2     | 2    | SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?)

it still plans a "SCAN a" first. So I guess I'll have to backtrack from the real data to generate a TC.
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
Thank you for your answer, Keith. I had my problem "fixed" before I wrote the first mail. Also, with every problem I also provided the fix that worked, for anyone who might run into the same problem. However, it's difficult not to get a little frustrated with your answer. At https://sqlite.org/queryplanner.html I read: "The best feature of SQL (in all its implementations, not just SQLite) is that it is a declarative language, not a procedural language. When programming in SQL you tell the system what you want to compute, not how to compute it." And I completely agree with this; "how to compute it" is called relational algebra, and it's what a query planner should do best. And the two queries are algebraically identical: "(X ∊ S or X:=null) AND (X is not null)" is equivalent to "X ∊ S". The two queries might look different only from an imperative programming point of view.

As to why the query is written that way: with the above in mind, I will contend that there can absolutely never exist a "mistaken" way to write a query, as long as the description of the predicates is correct and consistent with the schema. You should consider that quite frequently queries are the result of one or more levels of logic abstraction (ORM, DBAL, etc.). In my case, modifying the query was not difficult to do, but in other cases one may have few options for rewriting the way the query structure is generated. The only way to reduce a fabricated query is through relational algebra, and that is up to the DB - not the programmer, not the abstractions in between. In this particular case, the WHERE is optional; depending on parameters, I want the set of data that is correctly defined as the left join of tables a and b, or I might want a subset of this join that has a particular property over the left-joined set. The query was correctly written; to rewrite it so that the query planner might know how to run it is wrong, IMHO.

To sum it up: I think it's every DB's intention to optimize a query into an execution plan as best as possible. None does it perfectly, but all try to, very hard. With this intention, I reported a case where the query planner COULD be improved. I think you will at least agree with me that making it better can't be wrong. Whether that happens tomorrow, in a year, or never is up to the mercy, resources and priorities of the developers, so I really am not interested in arguing over this.
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
Algebraically, having a non-null imperative lookup condition in the WHERE clause means you have a stronger predicate on the same subject (ALL MUST fit vs. ANY that fit).
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
I think they are equivalent, if you look closer.

SELECT FROM a LEFT JOIN b ON a.x=b.x WHERE b.y=5

-is- equivalent to

SELECT FROM a JOIN b ON a.x=b.x AND b.y=5
SELECT FROM a JOIN b WHERE a.x=b.x AND b.y=5
SELECT FROM a LEFT JOIN b ON a.x=b.x AND b.y=5 WHERE b.y IS NOT NULL

All the above are semantically equivalent. When there is only one LEFT JOIN, the presence of any non-null, non-alternative condition on the joined table in the WHERE clause transforms it into an INNER join: there is no other way to have a non-null value except if the row exists. The reciprocal is not true, of course. I don't know how difficult the -OR- closure is to compute; it is certainly harder. But for an imperative non-null condition, I did expect the WHERE condition to be ported to the ON lookup for optimisation.
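A quick check of the claimed equivalence on a toy dataset (schema and data invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE a(x INTEGER);
CREATE TABLE b(x INTEGER, y INTEGER);
INSERT INTO a VALUES (1), (2), (3);
INSERT INTO b VALUES (1, 5), (2, 7);  -- no b row matches a.x = 3
""")

# The four formulations from the post, with explicit select lists.
queries = [
    "SELECT a.x FROM a LEFT JOIN b ON a.x=b.x WHERE b.y=5",
    "SELECT a.x FROM a JOIN b ON a.x=b.x AND b.y=5",
    "SELECT a.x FROM a JOIN b WHERE a.x=b.x AND b.y=5",
    "SELECT a.x FROM a LEFT JOIN b ON a.x=b.x AND b.y=5 "
    "WHERE b.y IS NOT NULL",
]
results = [sorted(con.execute(q).fetchall()) for q in queries]
```

The row with a.x=3 (no b match) and the row with b.y=7 are filtered out by every variant, so all four return the same single row.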
Re: [sqlite] LEFT JOIN + WHERE / OR optimisation
Probably related: a compound join with a left outer join generates different execution plans:

LEFT JOIN ( b JOIN c ON ) WHERE b.something = 5

vs.

LEFT JOIN ( b JOIN c ON AND b.something = 5 ) WHERE b.something IS NOT NULL
[sqlite] LEFT JOIN + WHERE / OR optimisation
Hi all, I've run into an optimisation problem with a double LEFT JOIN that works as an "either" clause. The query is as follows:

SELECT * FROM a
LEFT JOIN b ON
LEFT JOIN c ON
WHERE b.someId IN (1,2,3) OR c.someId IN (4,5)

This results in a bloated execution plan:

SEARCH a
SEARCH SUBQUERY 1 USING AUTOMATIC COVERING INDEX
SEARCH SUBQUERY 2 USING AUTOMATIC COVERING INDEX

However, the semantically equivalent:

SELECT * FROM a
LEFT JOIN b ON AND b.someId IN (1,2,3)
LEFT JOIN c ON AND c.someId IN (4,5)
WHERE b.someId IS NOT NULL OR c.someId IS NOT NULL

gets the proper execution plan:

SEARCH b
SEARCH c
EXECUTE LIST SUBQUERY
Re: [sqlite] Efficient query to count number of leaves in a DAG.
Yes, Lifepillar's way is the more orthodox approach; however, I have always preferred the path-based one because: 1) One seldom runs queries based only on the descendants map; there is usually an "AND -some other conditions-" involved, so the ability to have one covering index for the whole condition comes in very handy. 2) In ontologies you very often also require "structural" sorting - for example, the chapters in a book may need to be read like 1 < 1.1 < 1.1.A < 1.1.B < 1.2 etc.; this is very likely to be needed at some point and is only achievable using some modified form of the path field (with alpha-sortable numbers). It cannot be achieved with a normalized descendants map.
Re: [sqlite] Efficient query to count number of leaves in a DAG.
If a different perspective may be helpful to you: if moving overhead to writes is an option (i.e. you don't have many, or time-critical, writes), then the tree-descendants problem can be sped up to stellar speeds by using a path column. That is, add a column "path" in the nodes table that would contain something like "1.2.3" for node 3 that is a descendant of node 2, which is a descendant of node 1. Then querying node 2's descendants reduces to the following range:

"path" >= '1.2.' AND "path" < '1.2/'

or you can try using LIKE semantics - both can use an index if you are careful with collation; I found out the hard way that the range queries are more resilient and portable. SQLite and others have a pretty awkward way of plugging in the LIKE optimisations that may result in the index being skipped for not-so-obvious reasons. Inserting nodes is trivial, but moving edges requires an algorithm to update paths (whenever a node's parent changes, all descendants' paths must be updated). However, for most real-world ontology use scenarios, this operation happens very rarely and usually in the admin range of functions, so you can afford this operation, which can be pretty slow.
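A minimal sketch of the path-column range query (schema invented; note the node's own path also falls inside the prefix range, so it is excluded explicitly here):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE nodes(id INTEGER PRIMARY KEY, path TEXT)")
con.execute("CREATE INDEX nodes_path ON nodes(path)")
# A small tree: node 1 is the root; 2 and 4 are under 1; 3 under 2; 5 under 4.
con.executemany("INSERT INTO nodes VALUES (?, ?)", [
    (1, '1.'), (2, '1.2.'), (3, '1.2.3.'), (4, '1.4.'), (5, '1.4.5.'),
])

# Descendants of node 2: everything under the '1.2.' prefix. '/' is the
# ASCII character right after '.', so the half-open range covers exactly
# the prefix and stays index-friendly. Node 2's own path ('1.2.') also
# falls in the range, hence the id != 2 filter.
desc = con.execute(
    "SELECT id FROM nodes WHERE path >= '1.2.' AND path < '1.2/' AND id != 2"
).fetchall()
```

Node 5 ('1.4.5.') sorts outside the range, so only node 3 comes back; the range predicate is sargable, so the path index serves it directly.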
Re: [sqlite] Atomic DELETE index optimisation?
Thank you all for the replies. I will hack this problem one way or another after the holidays and let you know how it went. In the meantime, I wish you all happy, peaceful holidays, and a great New Year! Dinu
Re: [sqlite] Atomic DELETE index optimisation?
Hi sub sk79, I have so far the following suggestions from this thread:
1) Copy table -> TRUNCATE -> copy back; this doesn't work: what was supposed to have TRUNCATE semantics (DELETE FROM without WHERE) has the same performance as with WHERE.
2) Structure alterations (either drop the table, or drop the indexes): I am reluctant to do this; my evangelical instinct tells me hacking the semantics of life might lead to the implosion of Earth :)
3) "Deleted" bit field - presumably the "soft delete" as you call it. I am analyzing this, but here the question is whether we include the bit in the indexes. If so, performing a heap of UPDATEs should be even more inefficient; if we don't include it in the index, the cost of filtering the rows needs some analysis, which I will probably do. The problem with this solution is that it has residual effects: we run some pretty complex queries against this table, with complicated joins, and we already got some surprising execution plans that needed query rewriting. So with this "deleted" bit out of the index pool, we need to check various other queries to make sure they are still optimized to what we need.

All this said and done, 3 hours to delete 15G of data seems atrocious even if you do it by standards resulting from generations of DOD and NSA inbreeding... so I'm still hopeful for some DB-related solution.
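A minimal sketch of option 3, the soft-delete bit (schema and data invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE big(id INTEGER PRIMARY KEY, payload TEXT,"
            " deleted INTEGER NOT NULL DEFAULT 0)")
con.executemany("INSERT INTO big(id, payload) VALUES (?, ?)",
                [(i, "row%d" % i) for i in range(1, 101)])

# "Delete" half the rows by flipping the bit instead of removing them;
# every subsequent query must now filter on the flag.
con.execute("UPDATE big SET deleted = 1 WHERE id % 2 = 0")
live = con.execute("SELECT COUNT(*) FROM big WHERE deleted = 0").fetchone()[0]
```

This is exactly where the trade-off from the post shows up: if `deleted` is in the indexes, the mass UPDATE has to touch them all; if it isn't, every existing query plan has to absorb the extra filter.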
Re: [sqlite] [EXTERNAL] Re: Atomic DELETE index optimisation?
Hick Gunter wrote > If you are running an SQLite version 3.11.0 or newer, only pages that are > changed for the first time in the transaction are copied to the WAL file, > thus extending it. > > While progressing the transaction, the rate of "newly changed" to "changed > again" pages will shift towards re-reading and rewriting the WAL file copy > of a page; this will tend to slow down the rate at which the WAL file is > growing, even at a constant delete rate, until every page has been updated > at least once, and then stop growing until all the remaining deletes have > been processed. Running 3.19; thanks for the explanation, this never occurred to me and makes perfect sense.
Re: [sqlite] Atomic DELETE index optimisation?
Simon Slavin-3 wrote > DELETE FROM MyTable We have tried it and DELETE FROM table (without any WHERE) behaves exactly the same! I reiterate there are no FKs or triggers defined. So this is no fix...
Re: [sqlite] Atomic DELETE index optimisation?
Simon Slavin-3 wrote
> Once you've bust the cache things slow down.

I do realize that. However, not nonlinearly. Once I bust the cache, throughput should drop X times and stay there. Here, the speed decreases with the progress. Let me put it this way: say the cache size was 0. I would expect the delete to work very slowly, but near-linearly with the # of rows being deleted or the progress of deleting them. Here the performance dramatically and constantly decreases as the WAL file grows. You can literally see how every MB of throughput is slower than the previous :) I am trying to find out why. For me it's very important for the migration process.

Simon Slavin-3 wrote
> Have you tried these things ? Did the time taken improve or get worse ?

Not yet; we will probably implement what you proposed (it's been suggested before). But if I can find out the reason behind the nonlinear DELETE behavior, it would still help greatly.
Re: [sqlite] Atomic DELETE index optimisation?
Richard Hipp-3 wrote
> DROP, CREATE, and ALTER are transactional in SQLite, just like DELETE,
> INSERT, and UPDATE. This has always been the case, and always shall
> be.

Thanks! But still, isn't this incompatible with any lower-than-DB-level transaction locking mechanism? I.e., should you ever have row-level locking, this absolutely needs a persistent, cross-connection ROWID index, while any FOR UPDATE locking semantics need persistent gap-locking indexes... Just a thought for the distant future; I realize it's not a discussion to have now :)

Keith Medcalf wrote
> pragma cache_size=262144; -- 1 GB page cache

Actually, I realized that the DB page size is 1K. Is this bad? I tried to run the pragma with 1M pages, to amount to the same 1G; there seems to be a dramatic improvement in throughput at the beginning of the query, but it quickly succumbs to the nonlinear slow-down-to-a-drag nonetheless. Richard, leaving alternate strategies aside (DROP, TRUNCATE etc.), what could be the reason behind this non-linear delete behavior? Why does it slow down to a grinding halt? It would be tremendously helpful for me to know; we are in the process of migrating more than just this table to SQLite, so knowledge about the inner workings of SQLite helps us tremendously. Thanks!
Re: [sqlite] Atomic DELETE index optimisation?
Rowan Worth-2 wrote
> The entire point of SQL transactions is to implement this magic!

I beg to differ; you proposed:

> (1) BEGIN TRANSACTION
> (2) DROP all indexes from the table

This is by no means valid SQL semantics; in all the RDBMSs I've worked with, the relation of structure changes to a transaction is undefined at best. Even if it works now, there's no guarantee the "BEGIN; DROP" behavior will be consistent at any time in the future. So in this respect, the alternative of copying to a different table, TRUNCATE, copy back looks much more semantically acceptable.

Rowan Worth-2 wrote
> WAL has different concurrency characteristics.

Yep, it's supposed to do just this: keep readers from starving until a write is complete. The WAL flush works quite well, it seems, keeping the HDD at a respectable 100%, so the WAL flush will take less than 30s. Populating the table, on the other hand, takes much longer.
Re: [sqlite] Atomic DELETE index optimisation?
Keith Medcalf wrote > If you "moved" that memory (or some of it) from Level 2 to Level 1 you > would increase performance tremendously. > > pragma cache_size=262144; -- 1 GB page cache Thanks, I will try that! -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [EXTERNAL] Re: Atomic DELETE index optimisation?
Hick Gunter wrote > SQLite does this too Thanks! -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Atomic DELETE index optimisation?
Rowan Worth-2 wrote > I'm not sure what you're worried about? Dropping and recreating identical > indices within a transaction won't cause a visible structure change to > concurrent readers -- that's the point of a transaction. I honestly don't see how, in any DB system, the client process would not crash if the index it's running a cursor on were to be removed. Even if SQLite were to pull this magic out of the hat, starving client processes for the lack of an index (a full-scan query would probably take in excess of 30s) would quickly pile up the clients to the point where one would have to kill them anyway. So with this in mind, I'm really not looking for a barbaric fix to this; I'm rather trying to understand the problem and find a viable, semantically stable solution (and maybe trigger some improvements in SQLite, if there's a system bug). -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Atomic DELETE index optimisation?
Richard Hipp-3 wrote > Can you try this: Thanks Richard, as mentioned earlier, any structure change is unacceptable due to concurrent reader clients. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Atomic DELETE index optimisation?
Keith Medcalf wrote > I had a similar problem with a multi-terabyte database once upon a time. Na, they are not single-use rows, otherwise I'd have used a FIFO :) Every now and then, a large portion of the table becomes obsolete by external factors. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Atomic DELETE index optimisation?
SQLite 3.19.3 CentOS 7 (64 bit) RAM: 6G total, 4.4G buff/cache, 0.6G unused Sqlite memory: 133M RES, 0 SWAP, 0 DIRTY - don't know if it's relevant anymore, the query seems to have entered into another execution phase, it looks like now it's flushing the WAL. No other PRAGMA No FKs, no triggers. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Atomic DELETE index optimisation?
INSERTs are taking constant time as they should; the DELETE is quite obviously making blatantly non-linear progress. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Atomic DELETE index optimisation?
Thanks, it's something worth trying. However, no, that can't be the only reason. Populating via script with the same data takes less than 10 minutes, including side processing... So the disproportion between insert and delete performance is huge. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Atomic DELETE index optimisation?
For the table swap operation, that is out of the question; the DB is running on a live system that has multiple clients. Foreign keys or not, any structure change crashes all clients ("DB structure has changed"). Too high a price to pay for a delete that may happen routinely. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Atomic DELETE index optimisation?
By "one time only", I mean that in my understanding, the way most DBs handle a DELETE is this: cache the ROWIDs while deleting the data rows from the main table, then when all ROWIDs are collected, sort the ROWID stream and prune the index trees from the sorted stream. This is both highly efficient (just like inserts, deletes of already-ordered records are very efficient) and highly parallelizable. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
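A userland approximation of this sorted-pruning idea can be sketched as follows (a sketch only; the table `t` and the `obsolete` flag are made up, not the poster's schema): collect the matching rowids, sort them, and delete in ordered batches, so each pass over the B-trees is close to sequential:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, obsolete INTEGER)")
conn.executemany("INSERT INTO t(obsolete) VALUES (?)",
                 [(int(i % 4 == 0),) for i in range(10000)])
conn.commit()

# Collect matching rowids first, then delete them in sorted batches.
rowids = sorted(r[0] for r in
                conn.execute("SELECT rowid FROM t WHERE obsolete=1"))
BATCH = 500  # stay under older SQLITE_MAX_VARIABLE_NUMBER limits (999)
for i in range(0, len(rowids), BATCH):
    chunk = rowids[i:i + BATCH]
    placeholders = ",".join("?" * len(chunk))
    conn.execute(f"DELETE FROM t WHERE rowid IN ({placeholders})", chunk)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM t WHERE obsolete=1").fetchone()[0])  # 0
```

This does not change what SQLite does inside each DELETE, but deleting in ascending rowid order tends to touch B-tree pages more sequentially than an arbitrary-order scan.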
Re: [sqlite] Atomic DELETE index optimisation?
Ok, so to address the size of the problem properly: we have ~32M records, out of which we are deleting about ~24M. The raw size of the DB is about 20G, out of which 15G goes away. Under these circumstances any page cache becomes irrelevant. The delete operation, which is a single delete run from the sqlite3 tool, has been running for 3 hours now. The WAL size is in excess of 9G atm. The WAL is growing ever slower (it was about 10M/s, now it's 1M every 2s, and slowing down). This indicates to me a non-linear process that I can link only to the B-trees; they are the only non-linear component I can think of that could cause this slowing-down-to-a-drag. The CPU is pegged while the HDD is at idle level, which also hints at the same issue. In reply to your remarks: - I understand the B-trees need to be modified. However, if you prune, (maybe) rebalance, and write to disk one node at a time, that is hugely stressful and inefficient when you are pruning half of a 32M-node tree. Since the operation is atomic, the indexes could be updated one time and one time only. So, what to do? I think this delete may never end... in the time it took to write this reply, the WAL growth has succumbed to 1M every 4s. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
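One common workaround for the runaway-WAL symptom described here (my own suggestion, not something from the thread) is to break the huge DELETE into bounded chunks with a commit between them, so WAL checkpoints can run and the log stays small. The `rowid IN (... LIMIT ...)` form works on stock builds that lack `DELETE ... LIMIT`. Note this sacrifices the atomicity of the delete, which may or may not be acceptable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, obsolete INTEGER)")
conn.executemany("INSERT INTO t(obsolete) VALUES (?)",
                 [(int(i % 4 == 0),) for i in range(10000)])
conn.commit()

# Delete in chunks; committing between chunks lets WAL checkpoints run,
# so the log cannot grow without bound inside one multi-hour transaction.
while True:
    cur = conn.execute(
        "DELETE FROM t WHERE rowid IN "
        "(SELECT rowid FROM t WHERE obsolete=1 LIMIT 1000)")
    conn.commit()
    if cur.rowcount == 0:
        break

print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 7500
```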
Re: [sqlite] Atomic DELETE index optimisation?
Update: we are also using WAL; I have noticed during a huge delete, the WAL also grows huge. Could this also be a reason for slowness, that SQLite duplicates every block that hosts a delete? Is there any way to work around this? On 17.12.2017 22:53, Dinu Marina wrote: Hi all, It became apparent to me from performance measurements that the DELETE operation is very slow, when operating on large recordsets with indexes involved. My deduction is that SQLite updates the indexes for every deleted row, which is painstakingly slow... Since the DELETE should be atomic, the index update could also be atomic, and a mass pruning of the index tree is hugely faster than individual removals. My question is: is my deduction correct? Is there any way to fix/improve this in userland? Or are there prospects for this to be improved in SQLite in the foreseeable future? Thanks, Dinu ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Atomic DELETE index optimisation?
Hi all, It became apparent to me from performance measurements that the DELETE operation is very slow, when operating on large recordsets with indexes involved. My deduction is that SQLite updates the indexes for every deleted row, which is painstakingly slow... Since the DELETE should be atomic, the index update could also be atomic, and a mass pruning of the index tree is hugely faster than individual removals. My question is: is my deduction correct? Is there any way to fix/improve this in userland? Or are there prospects for this to be improved in SQLite in the foreseeable future? Thanks, Dinu ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] EXISTS optimisation?
Quick note to self and others: IN() and EXISTS() in all RDBs I know of are the uncle no one mentions; that is to say, they have different compare semantics than JOIN, so the naive strategy is to evaluate them as dependent subqueries, re-executed per outer row, rather than decorrelating them; this would be consistent with the behavior I noticed. However, I do know that MariaDB and Postgres do a decent job at optimizing EXISTS() (which I think is by all means the correct semantic for this intent). But there's by no means a golden standard across RDBs, so that's why it would be very useful to have some documentation on it, as it's one of the migration pitfalls. I'm still in debt with the TC, will work on it in the next days. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] EXISTS optimisation?
Thanks Clemens, I triple-checked it and it is indeed generating different execution plans, with the queries being absolutely equivalent. I will try to produce a minimal test case (right now the query where this occurs is a 100-line monster). However, I am a bit confused by the examples you provided: the 3rd query is equivalent to the first 2 only if the lookup table has a unique index on x (which it does in your example). However, this would make a very restricted case of why you would use EXISTS(). In my case, it is completely the opposite: x is definitely not unique in the lookup table; that's precisely why I'm using EXISTS or IN, to avoid the row multiplication generated by a JOIN. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
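The row-multiplication point can be illustrated with a toy schema (table and column names made up, not the poster's 100-line query): when x is not unique in the lookup table, a plain JOIN duplicates outer rows, while EXISTS yields each qualifying outer row exactly once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_t(x INTEGER);
    CREATE TABLE lookup(x INTEGER);          -- x deliberately NOT unique
    INSERT INTO main_t VALUES (1), (2);
    INSERT INTO lookup VALUES (1), (1), (1); -- three matches for x=1
""")

join_rows = conn.execute(
    "SELECT m.x FROM main_t m JOIN lookup l ON l.x = m.x").fetchall()
exists_rows = conn.execute(
    "SELECT m.x FROM main_t m "
    "WHERE EXISTS (SELECT 1 FROM lookup l WHERE l.x = m.x)").fetchall()

print(len(join_rows))    # 3 -- outer row multiplied by the duplicate matches
print(len(exists_rows))  # 1 -- each qualifying outer row exactly once
```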
[sqlite] Weird (slow) TEMP B-TREE ORDER BY
Yeap. And as I said, the predicted result sizes by ANALYZE are pretty accurate, the data is pretty linearly distributed. The unsorted plan is perfect, so a wrong prediction is not at fault, but the sorted plan is completely different. The sorted column is an aggregate anyway, so no index can help, but a sort in place could deal with the problem. After some more research it appears sqlite does not have a sort buffer (someone correct me if I'm wrong), but then what is the TEMP B-TREE? On 14.03.2015 18:40, Bert Huijben wrote: > >> -Original Message- >> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users- >> bounces at mailinglists.sqlite.org] On Behalf Of Dinu Marina >> Sent: donderdag 12 maart 2015 11:34 >> To: sqlite-users at mailinglists.sqlite.org >> Subject: Re: [sqlite] Weird (slow) TEMP B-TREE ORDER BY >> >> Hi Richard, >> >> Like said, we've already tried: >> CREATE TEMP TABLE x AS (...query without outer sort...); CREATE xind ON >> x (price); SELECT * FROM x ORDER BY 2 >> but it also takes 30 seconds; for some really esoteric reason, CREATE >> TABLE x AS ... takes all of those 30s, even though the execution plan >> looks identical to the simple select (the same 72 steps). This is what I >> would dare call "spooky", any explanation would be appreciated here too. > Did you try running analyze after adding some data and before looking at the > query plans? > > With indexes that contain multiple columns adding the sqlite_stat1 table > (via analyze, or in another way) may introduce huge differences in query > plans and performance. > > We found that out the hard way in Subversion, because we have a pretty much > static first column wc_id in most of our indexes in preparation for some new > features. Even when it used the right indexes in query plans, it sometimes > didn't use all the columns that it should. > > We now add an initial sqlite_stat1 table and we can trust to see a stable > (good) performance from Sqlite. 
> https://www.sqlite.org/optoverview.html#manctrl > > > Bert > > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Releasing a read (SHARED) lock
You should also check, as R.Smith and Slavin pointed, that you don't accidentally have a BEGIN [DEFERRED] somewhere, as that would trigger exactly this behavior: it will lock on the first select (the first step() actually) On 13.03.2015 02:44, Simon Slavin wrote: > On 13 Mar 2015, at 12:17am, Barry wrote: > >> On 13 March 2015 at 01:21, Dinu Marina wrote: >> >>> You should be also aware of a more common pitfall: unclosed result sets. >>> Any lock is held until you read PAST the last row or you call stmt_finalize >>> (or the equivalent abstraction in your DBAL). Always close select >>> statements. >> Hi Dinu, >> >> Am I correct in my understanding then that it is not enough to call >> sqlite3_reset, I must call sqlite3_finalize after I am done with a >> statement? > In order to dispose of a statement you can do either sqlite3_reset() or > sqlite3_finalize(). The common mistake is to do something like this: > > SELECT * FROM myTable WHERE rowid = 36 > > and then not do either _reset() or _finalize() because you know you have > asked for only one row so you expect SQLite to have done a _finalize() for > you. > > Simon. > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Releasing a read (SHARED) lock
Na, it should be the same; everywhere I said finalize you can replace with reset. I had this problem come up over and over from people forgetting to finalize (there was no intention to reuse the statement), but reset is the same. My point is just that if you don't have explicit transactions, a read lock is acquired by the first step() and not released until finalize(), reset(), or a step() that runs past the last row. You should check one-rowers first (COUNT is the usual suspect for me) to make sure they are finalized (reset) properly, because they are the easiest to miss, since they don't have an ugly loop following. Other than that, there is no reason a lock should be held in autocommit mode. You could debug the reader client this way: create a second connection and try to acquire a write lock (BEGIN EXCLUSIVE) at various points. It will return SQLITE_BUSY when you have a leaked lock. On 13.03.2015 02:17, Barry wrote: > On 13 March 2015 at 01:21, Dinu Marina wrote: > >> You should be also aware of a more common pitfall: unclosed result sets. >> Any lock is held until you read PAST the last row or you call stmt_finalize >> (or the equivalent abstraction in your DBAL). Always close select >> statements. > > Hi Dinu, > > Am I correct in my understanding then that it is not enough to call > sqlite3_reset, I must call sqlite3_finalize after I am done with a > statement? > > Cheers, > > - Barry > > >> On Mar 12, 2015 11:40 AM, "R.Smith" wrote: >> >>> >>> On 2015-03-12 04:38 AM, Barry wrote: >>> >>>> Hello everybody, >>>> >>>> I have a situation where two processes are accessing the same SQLite >>>> database. One process only reads from the database, one process reads >> and >>>> writes. >>>> >>>> These processes keep a single database connection open for the lifetime >> of >>>> the process. 
>>>> >>>> It seems to me that once the reader process accesses the database (after >>>> it >>>> performs its first SELECT statement), it maintains a lock on the >> database >>>> until the connection is closed (when the program is exited). This >> prevents >>>> the writer process from updating the database. >>>> >>> Hi Barry, >>> >>> This is the usual situation when one of your transactions in the "reading >>> only" database does not finalize. i.e. you started a transaction there >> and >>> did not end it with either "END TRANSACTION" or "COMMIT" or "ROLLBACK". >>> >>> In the non-WAL DB it will simply hold the lock preventing changes. In the >>> WAL DB it will hold the lock for its own view of the data but let the >> other >>> writer write.. however it won't see the changes for itself. >>> >>> This is very common actually. Just find every transaction you start >>> (explicitly or implicitly) and make sure you end it and when you end it, >>> see what the return value is from SQLite and whether it reports any >> error. >>> HTH! >>> Ryan >>> >>> >>> >>>> I tried changing to WAL. This made the writer process able to commit its >>>> changes, but now the reader does not see any modifications made to the >>>> database until it is restarted (It seems to see a snapshot of the DB at >>>> the >>>> time of its first read). >>>> >>>> I am using prepared statements: On opening the DB, I create all my >>>> prepared >>>> statements. When I need to execute a statement, I bind to the statement, >>>> call sqlite3_step (possibly multiple times), then call sqlite3_reset. I >> do >>>> not finalise the statements until the program closes. >>>> >>>> In order to simulate 'save' behaviour, the writer process always holds a >>>> transaction open. When the user chooses 'save', the current transaction >> is >>>> committed and a new transaction is begun. 
(I understand that the reader >>>> will not see any changes in the uncommitted transaction, but is not >> seeing >>>> any committed transactions either). >>>> >>>> I have checked quite thoroughly through my code and cannot find any >>>> instances of statements executed without a sqlite3_reset quickly >>>> following. >>>> >>>> Is this intended behaviour - that once a connection has performed a >> read, >>>> it maintains its lock on
[sqlite] Releasing a read (SHARED) lock
There is also a more obscure side-effect exhibited by sqlite only: if you issue consecutive unfinalized select statements, it never releases the read lock (although the statements should be atomic), but it does trip the deadlock mechanism; any writer in the wait will receive SQLITE_BUSY at the very moment the second read statement is issued, regardless of the busy_timeout value. This happens in non-WAL mode. The usual culprits are one-record results, like SELECT COUNT(*)... usually the client expects one row and reads one row, but you need to either try to read 2 rows or explicitly free the cursor. Do note (for evangelism's sake) that DBs with random-access cursors do not free the read lock even when you read past the last record, because you could decide to rewind. So always free result sets.
[sqlite] Releasing a read (SHARED) lock
You should be also aware of a more common pitfall: unclosed result sets. Any lock is held until you read PAST the last row or you call stmt_finalize (or the equivalent abstraction in your DBAL). Always close select statements. On Mar 12, 2015 11:40 AM, "R.Smith" wrote: > > > On 2015-03-12 04:38 AM, Barry wrote: > >> Hello everybody, >> >> I have a situation where two processes are accessing the same SQLite >> database. One process only reads from the database, one process reads and >> writes. >> >> These processes keep a single database connection open for the lifetime of >> the process. >> >> It seems to me that once the reader process accesses the database (after >> it >> performs its first SELECT statement), it maintains a lock on the database >> until the connection is closed (when the program is exited). This prevents >> the writer process from updating the database. >> > > Hi Barry, > > This is the usual situation when one of your transactions in the "reading > only" database does not finalize. i.e. you started a transaction there and > did not end it with either "END TRANSACTION" or "COMMIT" or "ROLLBACK". > > In the non-WAL DB it will simply hold the lock preventing changes. In the > WAL DB it will hold the lock for its own view of the data but let the other > writer write.. however it won't see the changes for itself. > > This is very common actually. Just find every transaction you start > (explicitly or implicitly) and make sure you end it and when you end it, > see what the return value is from SQLite and whether it reports any error. > > HTH! > Ryan > > > >> I tried changing to WAL. This made the writer process able to commit its >> changes, but now the reader does not see any modifications made to the >> database until it is restarted (It seems to see a snapshot of the DB at >> the >> time of its first read). >> >> I am using prepared statements: On opening the DB, I create all my >> prepared >> statements. 
When I need to execute a statement, I bind to the statement, >> call sqlite3_step (possibly multiple times), then call sqlite3_reset. I do >> not finalise the statements until the program closes. >> >> In order to simulate 'save' behaviour, the writer process always holds a >> transaction open. When the user chooses 'save', the current transaction is >> committed and a new transaction is begun. (I understand that the reader >> will not see any changes in the uncommitted transaction, but is not seeing >> any committed transactions either). >> >> I have checked quite thoroughly through my code and cannot find any >> instances of statements executed without a sqlite3_reset quickly >> following. >> >> Is this intended behaviour - that once a connection has performed a read, >> it maintains its lock on the database for its lifetime? Is SQLite smart >> enough to know that the pages it holds in cache of the reader are invalid >> after the writer has made changes to the DB on disk? >> >> If this is not the intended behaviour - is there a way I can find out >> which >> statements are causing the lock to be held open? Or can I force SQLite to >> discard its cache? >> >> Any help would be appreciated. >> >> Regards, >> >> Barry Smith >> ___ >> sqlite-users mailing list >> sqlite-users at mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >> > > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >
[sqlite] Weird (slow) TEMP B-TREE ORDER BY
Hi Richard, Like said, we've already tried: CREATE TEMP TABLE x AS (...query without outer sort...); CREATE xind ON x (price); SELECT * FROM x ORDER BY 2 but it also takes 30 seconds; for some really esoteric reason, CREATE TABLE x AS ... takes all of those 30s, even though the execution plan looks identical to the simple select (the same 72 steps). This is what I would dare call "spooky", any explanation would be appreciated here too. Thanks
[sqlite] UPDATE Problem
Hi Peter, From https://www.sqlite.org/datatype3.html: "When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL (in order of preference) if such conversion is lossless and reversible"; basically any fraction you insert into a NUMERIC column with at most 15 digits will be converted to REAL (float). What you are looking for is the DECIMAL data type, which is not supported by sqlite, and NUMERIC is not supposed to be a replacement; NUMERIC is only useful as an extension beyond the range of INTEGER or FLOAT for huge numbers or extreme precision, in which case it will behave as TEXT and it's the responsibility of the client app to convert back to some numeric representation. Do note however, that if you're dealing with only positive fractions, you can safely use TEXT to emulate DECIMAL, as long as all strings are the same length and right-aligned: " 100.50" < " 999.00" On 12.03.2015 01:27, Peter Haworth wrote: > I have a table, Transactions, with a column , Value, of type NUMERIC. The > Value column is supposed to have 2 decimal places in all rows but some have > only one. > > To correct this I issued the following command: > > UPDATE Transactions SET Value=Value||0 WHERE substr(Value,-2,1)='.' > > No errors on execution but nothing was changed. > > To help me figure out why that didn't work, I issued the following command: > > SELECT Value||0 FROM Transactions WHERE substr(Value,-2,1)='.' > > This time, all the values with only 1 decimal place were listed with a > trailing zero added. > > On the face of it, the WHERE statement works fine in a SELECT statement but > does not find any rows in an UPDATE statement. Either that or the SET is > not calculating the correct value. > > Is this a bug or am I missing something? > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
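The affinity behavior behind Peter's "nothing was changed" can be reproduced directly (a toy table, not his Transactions schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(v NUMERIC)")
conn.execute("INSERT INTO t VALUES ('100.50')")

# The text '100.50' is a well-formed real literal, so NUMERIC affinity
# coerces it to the REAL value 100.5 on insert:
before = conn.execute("SELECT v, typeof(v) FROM t").fetchone()
print(before)   # (100.5, 'real')

# Appending '0' rebuilds the text '100.50', which is immediately coerced
# back to REAL 100.5 on storage, so the UPDATE appears to do nothing:
conn.execute("UPDATE t SET v = v || '0'")
after = conn.execute("SELECT v, typeof(v) FROM t").fetchone()
print(after)    # (100.5, 'real')
```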
[sqlite] Weird (slow) TEMP B-TREE ORDER BY
Hi guys, Sorry to be a pain, but does anyone have an idea about this issue? I have to recommend switching DBs this week based on this, and I am reluctant to do so, but we haven't been able to figure a fix either. Thanks, Dinu
[sqlite] Weird (slow) TEMP B-TREE ORDER BY
Hi Donald, Thanks for the answer, I'll try to address these issues: 1,4) I was lazy and got the DDL from the GUI without looking at it. Here is the proper form from the master table (it shouldn't affect replicability though): CREATE TABLE [materializedInstances] ([provider] TEXT, [instanceId] TEXT, [packageId] INTEGER, [date] TEXT, [duration] INTEGER, [transportType] INTEGER, PRIMARY KEY ([provider], [instanceId])) CREATE INDEX "1ab7a7b8a24dce53a34a650587c5bfdabcc785a2" ON [materializedInstances] ( "packageId" , "provider" , "instanceId" , "date" , "duration" , "transportType") CREATE TABLE "materializedPrices" ("id" INTEGER ,"price" FLOAT ,"perPerson" float ,"currency" TEXT ,"promotionType" INTEGER ,"promotionName" TEXT ,"provider" text ,"instanceId" TEXT ,"accomodationHash" TEXT ,"accomodationName" TEXT ,"start" DATE ,"end" DATE ,PRIMARY KEY ("id")) CREATE INDEX "ffd0d22435eccb1020821f3ec70d038dfc7e5d91" ON "materializedPrices"("provider","instanceId","start","end","accomodationHash","price") It seems my GUI does some postprocessing on the DDL, splitting the PRIMARY KEY into a constraint + index (not sure how wrong that is). 2) As this is basically a hierarchical query it would be best suited for CTEs, but I wouldn't go there as I'm sure optimizations aren't in place for these in sqlite. 
The problems with joins are: 1) An optimization problem regarding WHERE and GROUP BY, which seem unable to both use the same index in sqlite (I'll get back with that later) 2) Since this aggregates over chunks of the same table, a JOIN is always of C1*C2*logN complexity, while iterating with nested selects does (C1+C2)*logN Example: SELECT MIN(a.d)+MIN(b.d) FROM t a JOIN t b ON a.y=b.y WHERE a.cond=1 AND b.cond=2 GROUP BY a.y Is basically squared time (looks like cubic but I think that's the planner bug I was talking about, need to reproduce a minimal test case) while SELECT (SELECT MIN(d) FROM t WHERE t.y=j.y AND t.cond=1 )+ (SELECT MIN(d) FROM t WHERE t.y=j.y AND t.cond=2 ) FROM (SELECT DISTINCT y FROM t) j Although much uglier, gets there in index search time. 3) Yeap. And as I said, the predicted result sizes by ANALYZE are pretty accurate, the data is pretty linearly distributed. The unsorted plan is perfect, so a wrong prediction is not at fault, but the sorted plan is completely different. Thanks, Dinu On 10.03.2015 17:17, Donald Griggs wrote: > Hi, Dinu, > > I'll have to leave it to those more knowledgeable to respond to other > aspects of your question, but I was struck by the following. > > 1) Regarding columns such as "provider TEXT(20)" > > Do you really mean that values in such columns might need to contain up to > two giga-characters (2 * 10**9)? Sqlite will simply interpret this as > "TEXT" plain and simple -- but if your actual data contains > monstrously-long values, AND these are part of primary keys, then I can > imagine this would disastrously affect performance. > > 2) Complexity. > > I understand you said could not achieve good performance with > straightforward joins. I wondered if it might be helpful to post your > cleanest, simplest JOIN-based SELECT (and indexes and query plan) that you > thought should have good performance, yet does not. > > > 3) Analyze. You've likely already run the ANALYZE command, right? 
(A > one-time run of VACUUM wouldn't hurt either, I suppose.) > > 4) Regarding: "CREATE INDEX sqlite_autoindex_materializedInstances_1 ON > materializedInstances (provider,instanceId); > > Am I right that since (provider,instanceId) is already PRIMARY KEY, then > this index is redundant? (Even so, I would only expect it to take up disk > space and slow down inserts/deletes, but not slow down SELECTs.) > > > Donald > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
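The two query shapes discussed in this exchange (JOIN with GROUP BY vs. correlated nested selects over a DISTINCT list) can be checked for equivalence on a toy dataset. A sketch with made-up data; note the nested form also emits NULL sums for groups missing one of the conditions, so the data below gives every group both:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t(y INTEGER, cond INTEGER, d INTEGER);
    INSERT INTO t VALUES (1,1,10), (1,2,20), (2,1,5), (2,2,7);
""")

join_form = conn.execute("""
    SELECT MIN(a.d) + MIN(b.d) FROM t a JOIN t b ON a.y = b.y
    WHERE a.cond = 1 AND b.cond = 2 GROUP BY a.y ORDER BY a.y
""").fetchall()

nested_form = conn.execute("""
    SELECT (SELECT MIN(d) FROM t WHERE t.y = j.y AND t.cond = 1)
         + (SELECT MIN(d) FROM t WHERE t.y = j.y AND t.cond = 2)
    FROM (SELECT DISTINCT y FROM t) j ORDER BY j.y
""").fetchall()

print(join_form)    # [(30,), (12,)]
print(nested_form)  # [(30,), (12,)]
```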
[sqlite] Weird (slow) TEMP B-TREE ORDER BY
Forgot to add DDL for the outer table: CREATE TABLE materializedInstances ( provider TEXT(20), instanceId TEXT(20), packageId INTEGER, date TEXT(20), duration INTEGER, transportType INTEGER, CONSTRAINT MATERIALIZEDINSTANCES_PK PRIMARY KEY (provider,instanceId) ); CREATE INDEX 1ab7a7b8a24dce53a34a650587c5bfdabcc785a2 ON materializedInstances (packageId,provider,instanceId,date,duration,transportType); CREATE INDEX sqlite_autoindex_materializedInstances_1 ON materializedInstances (provider,instanceId);
Re: [sqlite] Problem with sqlite3_column_origin_name and AS Clause
"Igor Tandetnik" <itandet...@mvps.org> wrote in message news:h6bg82$td...@ger.gmane.org... > Dinu Scheppelmann (DAISY) wrote: > > Unfortunately when I get the column names by function > > sqlite3_column_origin_name(), the result columns have the names > > "DokId", "Name" and "Name" again - instead of "Id", "PatientName" and > > "Name"!! > > You want sqlite3_column_name > > Igor Tandetnik > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > Thanks for that fast help;-) ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
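Igor's point can be seen from Python's sqlite3 module, whose cursor.description is backed by sqlite3_column_name() and therefore honors the AS alias (toy table, not the original poster's schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Dokument(DokId INTEGER)")

cur = conn.execute("SELECT DokId AS Id FROM Dokument")
# cursor.description exposes sqlite3_column_name(), which reports the
# alias; sqlite3_column_origin_name() would still report 'DokId', since
# it names the underlying table column the result derives from.
print(cur.description[0][0])  # Id
```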
[sqlite] Problem with sqlite3_column_origin_name and AS Clause
Hello to all SQLite people (that's my first post - please be patient:-)) I already searched the mailings from 2008/2009 but could not find a post that describes this problem. I have a select over a few tables, and since many tables use identical names for some columns (like "Id"), I rename the column using AS clause: SELECT Dokument.DokId AS Id, Patient.Name AS PatientName, Einsender.Name FROM Dokument, Patient, Einsender WHERE Dokument.AkteId = Patient.Id AND Dokument.SenderId = Einsender.Id; Unfortunately when I get the column names by function sqlite3_column_origin_name(), the result columns have the names "DokId", "Name" and "Name" again - instead of "Id", "PatientName" and "Name"!! For me this is severe because I pass back the whole result set of a query as one nice "object" and the column names have to be unique! (hope you understand what I'm saying) Is it possible to fix this "bug" or is there a workaround? (a VIEW does not help either) Thanks for any help! Dinu == DAISY Archivierungssysteme GmbH Preysingstr. 2 68526 Ladenburg www.adakta.de Tel: +49-6203-40400 Fax: +49-6203-404020 Mail: dinu.scheppelm...@adakta.de HRB: 432841 Amtsgericht Mannheim Geschäftsführer: Dr. Dinu Scheppelmann == ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users