Re: [sqlite] EXISTS optimisation?

2017-11-28 Thread Dinu
Quick note to self and others: IN() and EXISTS() in all RDB's I know of are the uncle noone mentions; that is to say, they have different compare semantics than JOIN so the naive strategy is to evaluate them as dependent subqueries, not correlated ones, which would be consistent with the behavior

Re: [sqlite] EXISTS optimisation?

2017-11-28 Thread Dinu
Thanks Clemens, I triple-checked it and it is indeed generating different execution plans, with the queries being absolutely equivalent. I will try to produce a minimal test case (right now the query where this occurs is a 100 lines long monster). However, I am a bit confused by the examples

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-18 Thread Dinu
Rowan Worth-2 wrote > I'm not sure what you're worried about? Dropping and recreating identical > indices within a transaction won't cause a visible structure change to > concurrent readers -- that's the point of a transaction. I honestly don't see how in any DB system the client process would

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-18 Thread Dinu
Keith Medcalf wrote > If you "moved" that memory (or some of it) from Level 2 to Level 1 you > would increase performance tremendously. > > pragma cache_size=262144; -- 1 GB page cache Thanks, I will try that! -- Sent from: http://sqlite.1065341.n5.nabble.com/

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-18 Thread Dinu
Rowan Worth-2 wrote > The entire point of SQL transactions is to implement this magic! I beg to differ; you proposed: > (1) BEGIN TRANSACTION > > (2) DROP all indexes from the table This is by no means valid SQL semantics; in all RDBBMS I've worked with, structure changes' relation to a

Re: [sqlite] [EXTERNAL] Re: Atomic DELETE index optimisation?

2017-12-18 Thread Dinu
Hick Gunter wrote > SQLite does this too Thanks! -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-18 Thread Dinu
Richard Hipp-3 wrote > DROP, CREATE, and ALTER are transactional in SQLite, just like DELETE, > INSERT, and UPDATE. This has always been the case, and always shall > be. Thanks! But still, isn't this incompatible with any lower-than-DB-level transaction locking mechanism? I.E. should you ever

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-18 Thread Dinu
Simon Slavin-3 wrote > DELETE FROM MyTable We have tried it and DELETE FROM table (without any WHERE) behaves exactly the same! I reiterate there are no FKs or triggers defined. So this is no fix... -- Sent from: http://sqlite.1065341.n5.nabble.com/

Re: [sqlite] [EXTERNAL] Re: Atomic DELETE index optimisation?

2017-12-18 Thread Dinu
Hick Gunter wrote > If you are running an SQLite version 3.11.0 or newer, only pages that are > changed for the first time in the transaction are copied to the WAL file, > thus extending it. > > While progressing the transaction, the rate of "newly changed" to "changed > again" pages will shift

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-18 Thread Dinu
Simon Slavin-3 wrote > Once you’ve bust the cache things slow down. I do realize that. However, not illinearly. Once I bust the cache, throughput should drop X times and stay there. Here, the speed decreases with the progress. Let me put it this way: say the cache size was 0. I would expect the

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-19 Thread Dinu
Hi sub sk79, I have so far from this thread the following suggestions: 1) Copy table -> TRUNCATE -> copy back; this doesn't work, what was supposed to be TRUNCATE semantics (DELETE FROM without WHERE) has the same performance as with WHERE. 2) Structure alterations; either drop table, or drop

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu
By "one time only", I mean in my understanding, the way most DBs do on a DELETE is this: cache the ROWIDs while deleting data rows from the main and from the indexes, then when all ROWIDS are explored, sort the ROWID stream, and prune the trees from a sorted stream. This is both highly efficient

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu
For the table swap operation, that is out of the question, the DB is running on a live system that has multiple clients. Foreign keys or not, any structure change crashes all clients (DB structure has changed). Too high a price to pay for a delete thay may happen routinely. -- Sent from:

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu
Thanks, it's something worth trying. However, no, that can't be the only reason. Populating scriptically with the same data takes less than 10 minutes, including side processing... So the disproportion between insert and delete performance is huge. -- Sent from:

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu
Ok, so to address the size of the problem properly: We have ~32M records out of which we are deleting about ~24M. The raw size of the DB is about 20G out of which 15G goes away. Under these circumstances any page cache becomes irrelevant. The delete operation, which is a single delete run from

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu
INSERTs are taking constant time as they should, the DELETE is quite obviously working is a blatantly non-linear progress. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu
SQLite 3.19.3 CentOS 7 (64 bit) RAM: 6G total, 4.4G buff/cache, 0.6G unused Sqlite memory: 133M RES, 0 SWAP, 0 DIRTY - don't know if it's relevant anymore, the query seems to have entered into another execution phase, it looks like now it's flushing the WAL. No other PRAGMA No FKs, no triggers.

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu
Keith Medcalf wrote > I had a similar problem with a multi-terabyte database once upon a time. Na, they are not a single-time use rows, otherwise I'd have used a FIFO :) Every now and then, a large portion of the table becomes obsolete by external factors. -- Sent from:

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu
Richard Hipp-3 wrote > Can you try this: Thanks Richard, as mentioned earlier, any structure change is unacceptable due to concurrent reader clients. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-05 Thread Dinu
Short sum-up: - The -OR- alternative I provided is not semantically equivalent, I will work on one that is :) - The other one, without the -OR- (second post) still stands. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-05 Thread Dinu
Richard, Thanks for acknowledging this, you are absolutely right, that's why I stated that no DB does perfect optimisations and that computing the alternative -OR- based closures are probably much harder to tackle. Also E. Pasma pointed out the -OR- queries as I wrote them are not really

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-04 Thread Dinu
Algebrically, having a non-null imperative lookup condition in the WHERE clause means you have a stronger predicate on the same subject (ALL MUST fit vs. ANY that fit). -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-04 Thread Dinu
I think they are equivalent, if you look closer. SELECT FROM a LEFT JOIN b ON a.x=b.x WHERE b.y=5 -is- equivalent to SELECT FROM a JOIN b ON a.x=b.x AND b.y=5 SELECT FROM a JOIN b WHERE a.x=b.x AND b.y=5 SELECT FROM a LEFT JOIN b ON a.x=b.x AND b.y=5 WHERE b.y IS NOT NULL All the above are

[sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-04 Thread Dinu
Hi all, I've ran into an optimisation problem with a double-left join that works as an "either" clause. The query is as follows: SELECT * FROM a LEFT JOIN b ON LEFT JOIN c ON WHERE b.someId IN (1,2,3) OR c.someId IN (4,5) This results in a bloated execution plan: SEARCH a SEARCH

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-04 Thread Dinu
Probably related: Compound join with a left outer join generates different execution plans: LEFT JOIN ( b JOIN c ON ) WHERE b.something = 5 vs. LEFT JOIN ( b JOIN c ON AND b.something = 5 ) WHERE b.something IS NOT NULL -- Sent from: http://sqlite.1065341.n5.nabble.com/

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-04 Thread Dinu
Thank you for your answer, Keith. I had my problem "fixed" before I wrote the first mail. Also with every problem I also provided the fix that worked, for anyone that might run into the same problem. However, it's difficult to not get a little frustrated with your answer. At

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-05 Thread Dinu
To reiterate, Keith: to get the query to execute properly, I didn't change the LEFT JOIN to an INNER JOIN! Nope, I rewrote SELECT FROM a LEFT JOIN b ON WHERE b.c=5 to SELECT FROM a LEFT JOIN b ON AND b.c=5 WHERE b.c IS NOT NULL So I just added a redundant predicate and it runs

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-05 Thread Dinu
Keith Medcalf wrote > The primary difference is likely the test on b.c. Keith Medcalf wrote > I think the query is ill-conceived when written as an outer join. If anything, to reflect intentionality the query should have been written as such: SELECT FROM ( SELECT FROM a LEFT JOIN b ) WHERE

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-05 Thread Dinu
Keith Medcalf wrote > but cannot be re-ordered for OUTER JOINS. Actually, I think order requirements (or rather, row grouping requirements, as far as I can paint it the requirement is just that all outer joined rows come in a bunch for each main join tuple) would not be violated if the join is

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-05 Thread Dinu
Keith Medcalf wrote > The primary difference is likely the test on b.c. /There was some text here that seems to have been lost and my browser opened a "super resume templates" page... maybe I've been already spectre-hacked? Anyway, I repost:/ The primary difference from what I see is that it

Re: [sqlite] The performance of indexed select

2018-01-06 Thread Dinu
Clemens Ladisch wrote > For count(*), the database does not need the actual table rows. I think this is not true, he has a point here: SELECT COUNT(*) WHERE =? needs to examine every index key prefix (excluding at least ROWID) that matches. This may mean reading in the whole index. I think

Re: [sqlite] The performance of indexed select

2018-01-06 Thread Dinu
Richard Hipp-3 wrote > all the parent b-tree pages must be updated Yup, no question about it, at best it could be an opt-in. But as it is a design decision, I checked to make sure count() really is O(n) as Jonathan's question implied. -- Sent from: http://sqlite.1065341.n5.nabble.com/

Re: [sqlite] The performance of indexed select

2018-01-07 Thread Dinu
What I do notice reading https://www.sqlite.org/fileformat.html (if I get it right) is that index lines are in the form (for an index on a,b,c ie): Whereas they could be represented as: [ , , ](3) whith [pk_list] being a subtree; reverse lookup from table record to index record

Re: [sqlite] The performance of indexed select

2018-01-07 Thread Dinu
... and the downside that it's just linear overhead for i.e. an unique index, it works best for indexes with low cardinality... win some, lose some :) -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list

Re: [sqlite] sqlite3_column_decltype and max and min

2018-01-06 Thread Dinu
Keith Medcalf wrote > Full Schema Tables: Thanks for this, I will have grat use for it too, soon! Actually it's so nice I think it could go into the documentation. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list

Re: [sqlite] Why doesn't SQLite optimise this subselect?

2018-01-06 Thread Dinu
If you want just 1 value (any) from lookup, you can use: SELECT post_processing_info, ( SELECT is_json FROM tables_lookup WHERE tables_lookup.content_hash=webpage_contents.content_hash LIMIT 1 ) AS is_json FROM

Re: [sqlite] Efficient ways to maintaining order rank / row_number() in a rather large set ?

2018-01-09 Thread Dinu
Analogous to the percentile solution (it's actually the same thing), you can use a checkpointing table. This has roughly the complexity of SQRT(n) for both read and write. I.E. say you expect to have 1M records and define order based on value then id. You then make a checkpoint table

Re: [sqlite] Efficient ways to maintaining order rank / row_number() in a rather large set ?

2018-01-09 Thread Dinu
The only way to efficiently do this would be to have counting (range) index b-trees. Since you don't, you're stuck with a O(n) implementation, either on reading or writing. So your solution is as good as it gets, save maybe some implementation particularities. However, you may consider a shift in

Re: [sqlite] LEFT JOIN + WHERE / OR optimisation

2018-01-08 Thread Dinu
Another reason to do at least the LEFT JOIN + WHERE -> INNER JOIN optimisation: If a view is based on a LEFT JOIN, running a WHERE query on it will exhibit the same poor behavior and here there will be no way to rewrite the query. -- Sent from: http://sqlite.1065341.n5.nabble.com/

Re: [sqlite] Why doesn't SQLite optimise this subselect?

2018-01-06 Thread Dinu
Could you describe your intent in more detail? SELECT * FROM ... GROUP BY is unstable at best, so I cannot really grasp your intention. Which lookup record's If I guess right, you might want something in the lines of: SELECT w.post_processing_info, l.is_json FROM

Re: [sqlite] The performance of indexed select

2018-01-06 Thread Dinu
I think storing index prefix counts would only make sense in a special kind of 'statistical' index, where you would store count(x IS NOT NULL), sum(x), sum(x^2) so that usual statistical functions can be computed optimally. For a table count, I think it would make sense. -- Sent from:

Re: [sqlite] SQLite Application Server Concurrency

2018-01-13 Thread Dinu
James, You don't need to go all the length to creating a sqlite proxy RPC as has been suggested. Your service could implement one thing only - a lock to the database file (external to the SQLite locking mechanism, of course). So it only needs to serialize concurrency; the client does the RPC (via

Re: [sqlite] Efficient query to count number of leaves in a DAG.

2018-01-01 Thread Dinu
If a different perspective may be helpful to you: If moving overhead to writes is an option (ie you dont have many or time critical writes), then the tree descendants problem can be sped up to stellar speeds by using a path column. IE. add a column "path" in the nodes table that would contain

Re: [sqlite] Efficient query to count number of leaves in a DAG.

2018-01-02 Thread Dinu
Yes, Lifepillar's way is the more orthodox approach, however I always preferred the path-based one because: 1) One seldom runs queries only based on the descendants map; there usually is an "AND -some other conditions-" involved; thus the ability to have one covering index of the condition comes

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-22 Thread Dinu
Thank you all for the replies, I will hack this problem one way or another after the hoildays and let you know how it went. In the mean time, I wish you all happy peaceful holidays, and a great New Year! Dinu -- Sent from: http://sqlite.1065341.n5.nabble.com

[sqlite] Weird (slow) TEMP B-TREE ORDER BY

2015-03-10 Thread Dinu Marina
Forgot to add DDL for the outer table: CREATE TABLE materializedInstances ( provider TEXT(20), instanceId TEXT(20), packageId INTEGER, date TEXT(20), duration INTEGET, transportType INTEGER, CONSTRAINT MATERIALIZEDINSTANCES_PK PRIMARY KEY

[sqlite] Weird (slow) TEMP B-TREE ORDER BY

2015-03-10 Thread Dinu Marina
t.y=j.y AND t.cond=2 ) FROM (SELECT DISTINCT t.y) j Although much uglier, gets there in index search time. 3) Yeap. And as I said, the predicted result sizes by ANALYZE are pretty accurate, the data is pretty linearly distributed. Th

[sqlite] Weird (slow) TEMP B-TREE ORDER BY

2015-03-11 Thread Dinu Marina
Hi guys, Sorry to be a pain, but does anyone have an idea about this issue? I have to recommend switching DBs this week based on this, and I am reluctant to do so, but we haven't been able to figure a fix either. Thanks, Dinu

[sqlite] UPDATE Problem

2015-03-12 Thread Dinu Marina
Hi Peter, From https://www.sqlite.org/datatype3.html: "When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL (in order of preference) if such conversion is lossless and reversible"; basically any fraction you insert into a NUMERIC

[sqlite] Weird (slow) TEMP B-TREE ORDER BY

2015-03-12 Thread Dinu Marina
Hi Richard, Like said, we've already tried: CREATE TEMP TABLE x AS (...query without outer sort...); CREATE xind ON x (price); SELECT * FROM x ORDER BY 2 but it also takes 30 seconds; for some really esoteric reason, CREATE TABLE x AS ... takes all of those 30s, even though the execution plan

[sqlite] Releasing a read (SHARED) lock

2015-03-12 Thread Dinu Marina
You should be also aware of a more common pitfall: unclosed result sets. Any lock is held until you read PAST the last row or you call stmt_finalize (or the equivalent abstraction in your DBAL). Always close select statements. On Mar 12, 2015 11:40 AM, "R.Smith" wrote: > > > On 2015-03-12 04:38

[sqlite] Releasing a read (SHARED) lock

2015-03-12 Thread Dinu Marina
There is also a more obscure side-effect exhibited by sqlite only: if you issue consecutive unfinalized select statements, it never releases the write lock (although they should be atomic), but it does trip the deadlock mechanism; any writer in the wait will receive SQLITE_BUSY at the very

[sqlite] Releasing a read (SHARED) lock

2015-03-13 Thread Dinu Marina
when you have a leaked lock. On 13.03.2015 02:17, Barry wrote: > On 13 March 2015 at 01:21, Dinu Marina wrote: > >> You should be also aware of a more common pitfall: unclosed result sets. >> Any lock is held until you read PAST the last row or you call stmt_finalize >

[sqlite] Releasing a read (SHARED) lock

2015-03-13 Thread Dinu Marina
7am, Barry wrote: > >> On 13 March 2015 at 01:21, Dinu Marina wrote: >> >>> You should be also aware of a more common pitfall: unclosed result sets. >>> Any lock is held until you read PAST the last row or you call stmt_finalize >>> (or the equivalent a

[sqlite] Weird (slow) TEMP B-TREE ORDER BY

2015-03-15 Thread Dinu Marina
e-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users- >> bounces at mailinglists.sqlite.org] On Behalf Of Dinu Marina >> Sent: donderdag 12 maart 2015 11:34 >> To: sqlite-users at mailinglists.sqlite.org >> Subject: Re: [sqlite] Weird (slow) TEMP B-TREE ORDER BY &

Re: [sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu Marina
Update: we are also using WAL; I have noticed during a huge delete, the WAL also grows huge. Could this also be a reason for slowness, that SQLite duplicates every block that hosts a delete? Is there any way to work around this? On 17.12.2017 22:53, Dinu Marina wrote: Hi all, It became

[sqlite] Atomic DELETE index optimisation?

2017-12-17 Thread Dinu Marina
future? Thanks, Dinu ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Problem with sqlite3_column_origin_name and AS Clause

2009-08-17 Thread Dinu Scheppelmann (DAISY)
have to be unique! (hope you understand what I'm saying) Is it possible to fix this "bug" or is there a workaround? (a VIEW does not help either) Thanks for any help! Dinu == DAISY Archivierungssysteme GmbH Preysingstr. 2 68526 Ladenburg www.

Re: [sqlite] Problem with sqlite3_column_origin_name and AS Clause

2009-08-20 Thread Dinu Scheppelmann (DAISY)
"Igor Tandetnik" <itandet...@mvps.org> schrieb im Newsbeitrag news:h6bg82$td...@ger.gmane.org... > Dinu Scheppelmann (DAISY) wrote: > > Unfortunately when I get the column names by function > > sqlite3_column_origin_name(), the result columns have the names >