Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread dirdi
On 8/30/19 3:34 AM, Keith Medcalf wrote:
> Fascinating again, because the code is identical.  The p1 difference in the
> OpenWrite/IdxInsert/IdxDelete opcodes is just the "file handle" that happens
> to be in use and doesn't have any real significance (at least I don't think
> it does).  The p3 in the Transaction opcode is merely the schema cookie and
> only indicates that the schema has seen 2 changes between these plans being
> generated (presumably the DROP and CREATE of the index).
> 
> I am at a loss to explain the difference in execution speed. 

Do not bother! I was just curious whether this was expected behavior. So far
I have put about five hours into solving this little riddle. With DROP INDEX /
CREATE INDEX I have something that works for me, so I will leave it at that.

However, if one wants to investigate further, I am happy to provide
additional information.


-- 
Best regards

dirdi


Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread Keith Medcalf

Fascinating again, because the code is identical.  The p1 difference in the 
OpenWrite/IdxInsert/IdxDelete opcodes is just the "file handle" that happens to 
be in use and doesn't have any real significance (at least I don't think it 
does).  The p3 in the Transaction opcode is merely the schema cookie and only 
indicates that the schema has seen 2 changes between these plans being 
generated (presumably the DROP and CREATE of the index).

I am at a loss to explain the difference in execution speed.
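For anyone who wants to watch that cookie move: PRAGMA schema_version exposes 
the same schema cookie, so a small sketch like the one below (the db.sqlite3 
file name and the idx1/tbl1 names are just the ones assumed from this thread) 
should show REINDEX leaving it alone while DROP/CREATE bumps it twice.

```
import sqlite3

conn = sqlite3.connect("db.sqlite3")   # assumed file name from this thread

def cookie():
    # PRAGMA schema_version reads the schema cookie seen as p3 of Transaction.
    return conn.execute("PRAGMA schema_version").fetchone()[0]

print("initial cookie:   ", cookie())
conn.execute("REINDEX")                # rebuilds index content, schema unchanged
print("after REINDEX:    ", cookie())
conn.execute("DROP INDEX `idx1`")
conn.execute("CREATE INDEX `idx1` ON `tbl1` (`int`)")
conn.commit()
print("after DROP/CREATE:", cookie())  # expect an increase of 2
```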

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of dirdi
>Sent: Thursday, 29 August, 2019 15:10
>To: sqlite-users@mailinglists.sqlite.org
>Subject: Re: [sqlite] Unexpected REINDEX behavior.
>
>On 8/29/19 9:19 PM, Keith Medcalf wrote:
>> Fascinating ... because the optimal query plan would be to do a
>table scan of tbl1 and then dip into tbl2 for each row to get the
>update value.  The only meaningful index then would be on tbl2(int)
>(that is idx3) since an index will be unhelpful for the LIKE
>constraint.
>
>I noticed that, too. But I stopped wondering about optimizer
>internals,
>back when web-browsers started to JIT compile javascript ;)
>
>> Is the generated code different for the two queries, the one that
>runs slow -vs- the one that runs fast?  (That is, the output from
>EXPLAIN ..., not the output from EXPLAIN QUERY PLAN, or from .eqp
>full in the CLI).
>
>This is the output of EXPLAIN UPDATE before I recreate the index:
>> 0   Init          0   73  0           00
>> 1   Null          0   4   5           00
>> 2   OpenRead      0   13  0   0       00
>> 3   Rewind        0   7   0           00
>> 4   Rowid         0   5   0           00
>> 5   RowSetAdd     4   5   0           00
>> 6   Next          0   4   0           01
>> 7   OpenWrite     0   13  0   7       00
>> 8   OpenWrite     1   18  0   k(2,,)  00
>> 9   RowSetRead    4   72  5           00
>> 10  NotExists     0   9   5           00
>> 11  Rowid         0   6   0           00
>> 12  Null          0   7   0           00
>> 13  Null          0   8   0           00
>> 14  Column        0   3   9           00
>> 15  Null          0   10  0           00
>> 16  Null          0   11  0           00
>> 17  Null          0   12  0           00
>> 18  Copy          5   13  0           00
>> 19  Null          0   14  0           00
>> 20  Column        0   1   15          00
>> 21  Column        0   2   16          00
>> 22  Null          0   22  22          00
>> 23  Noop          5   3   0           00
>> 24  Integer       1   23  0           00
>> 25  OpenRead      4   11  0   3       00
>> 26  OpenRead      6   23  0   k(2,,)  00
>> 27  Column        0   2   24          00
>> 28  Affinity      24  1   0   D       00
>> 29  SeekLE        6   39  24  1       00
>> 30  DeferredSeek  6   0   4           00
>> 31  Column        4   1   29          00
>> 32  Concat        29  28  26          00
>> 33  Column        0   1   27          00
>> 34  Function0     0   26  25  like(2) 02
>> 35  IfNot         25  38  1           00
>> 36  IdxRowid      6   22  0           00
>> 37  DecrJumpZero  23  39  0           00
>> 38  Prev          6   30  0           00
>> 39  SCopy         22  17  0           00
>> 40  Column        0   4   18  NULL    00
>> 41  Column        0   5   19  NULL    00
>> 42  Column        0   6   20  0       00
>> 43  Affinity      14  7   0   DBDDBDD 00
>> 44  SCopy         17  2   0           00
>> 45  IntCopy       13  3   0           00
>> 46  MakeRecord    2   2   1           00
>> 47  MakeRecord    14  7   21          00
>> 48  FkIfZero      0   56  0           00
>> 49  IsNull        9   56  0           00
>> 50  SCopy         9   25  0           00
>> 51  MustBeInt     25  55  0           00
>> 52  OpenRead      7   11  0   3       00
>> 53  NotExists     7   55  25          00
>> 54  Goto          0   56  0           00
>> 55  FkCounter     0   -1  0           00
>> 56  Close         7   0   0           00
>> 57  Column        0   3   26          00
>> 58  Rowid         0   27  0           00
>> 59  IdxDelete     1   26  2           00
>> 60  Delete        0   68  13  tbl1    00
>> 61  IsNull        17  68  0           00
>> 62  SCopy

Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread dirdi
On 8/29/19 9:19 PM, Keith Medcalf wrote:
> Fascinating ... because the optimal query plan would be to do a table scan of 
> tbl1 and then dip into tbl2 for each row to get the update value.  The only 
> meaningful index then would be on tbl2(int) (that is idx3) since an index 
> will be unhelpful for the LIKE constraint.

I noticed that, too. But I stopped wondering about optimizer internals,
back when web-browsers started to JIT compile javascript ;)

> Is the generated code different for the two queries, the one that runs slow 
> -vs- the one that runs fast?  (That is, the output from EXPLAIN ..., not the 
> output from EXPLAIN QUERY PLAN, or from .eqp full in the CLI).

This is the output of EXPLAIN UPDATE before I recreate the index:
> 0   Init          0   73  0           00
> 1   Null          0   4   5           00
> 2   OpenRead      0   13  0   0       00
> 3   Rewind        0   7   0           00
> 4   Rowid         0   5   0           00
> 5   RowSetAdd     4   5   0           00
> 6   Next          0   4   0           01
> 7   OpenWrite     0   13  0   7       00
> 8   OpenWrite     1   18  0   k(2,,)  00
> 9   RowSetRead    4   72  5           00
> 10  NotExists     0   9   5           00
> 11  Rowid         0   6   0           00
> 12  Null          0   7   0           00
> 13  Null          0   8   0           00
> 14  Column        0   3   9           00
> 15  Null          0   10  0           00
> 16  Null          0   11  0           00
> 17  Null          0   12  0           00
> 18  Copy          5   13  0           00
> 19  Null          0   14  0           00
> 20  Column        0   1   15          00
> 21  Column        0   2   16          00
> 22  Null          0   22  22          00
> 23  Noop          5   3   0           00
> 24  Integer       1   23  0           00
> 25  OpenRead      4   11  0   3       00
> 26  OpenRead      6   23  0   k(2,,)  00
> 27  Column        0   2   24          00
> 28  Affinity      24  1   0   D       00
> 29  SeekLE        6   39  24  1       00
> 30  DeferredSeek  6   0   4           00
> 31  Column        4   1   29          00
> 32  Concat        29  28  26          00
> 33  Column        0   1   27          00
> 34  Function0     0   26  25  like(2) 02
> 35  IfNot         25  38  1           00
> 36  IdxRowid      6   22  0           00
> 37  DecrJumpZero  23  39  0           00
> 38  Prev          6   30  0           00
> 39  SCopy         22  17  0           00
> 40  Column        0   4   18  NULL    00
> 41  Column        0   5   19  NULL    00
> 42  Column        0   6   20  0       00
> 43  Affinity      14  7   0   DBDDBDD 00
> 44  SCopy         17  2   0           00
> 45  IntCopy       13  3   0           00
> 46  MakeRecord    2   2   1           00
> 47  MakeRecord    14  7   21          00
> 48  FkIfZero      0   56  0           00
> 49  IsNull        9   56  0           00
> 50  SCopy         9   25  0           00
> 51  MustBeInt     25  55  0           00
> 52  OpenRead      7   11  0   3       00
> 53  NotExists     7   55  25          00
> 54  Goto          0   56  0           00
> 55  FkCounter     0   -1  0           00
> 56  Close         7   0   0           00
> 57  Column        0   3   26          00
> 58  Rowid         0   27  0           00
> 59  IdxDelete     1   26  2           00
> 60  Delete        0   68  13  tbl1    00
> 61  IsNull        17  68  0           00
> 62  SCopy         17  25  0           00
> 63  MustBeInt     25  67  0           00
> 64  OpenRead      8   11  0   3       00
> 65  NotExists     8   67  25          00
> 66  Goto          0   68  0           00
> 67  FkCounter     0   1   0           00
> 68  Close         8   0   0           00
> 69  IdxInsert     1   1   2   2       00
> 70  Insert        0   21  13  tbl1

Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread dirdi
On 8/29/19 9:35 PM, Simon Slavin wrote:
> On 29 Aug 2019, at 8:12pm, dirdi  wrote:
> 
>> The only difference between both runs:
> 
> Well, that's nothing.  I see no reason for the massive change in timing from 
> what you posted.  But someone else might.

Well, the quoted line was shifted to the second-to-last position of the
output. Hence I assume that internally something must have changed.

-- 
Best regards

dirdi


Re: [sqlite] SQLITE_BUSY, database is locked in "PRAGMA journal_mode"

2019-08-29 Thread test user
Just some more details to clarify the issue:

If I do a `BEGIN IMMEDIATE` on one connection, and then a `PRAGMA
journal_mode` on another, BUSY is not returned (as expected).

But if I have around 7 connections contending for a write lock via `BEGIN
IMMEDIATE`, and a different connection runs `PRAGMA journal_mode` (with no
`BEGIN`), it returns BUSY (not expected).

I also occasionally get a "SQLITE_BUSY_RECOVERY", even though all API usages
call finalize/close to release any SQLite API resources.

*Questions:*
A. In which cases will BUSY be returned for read-only queries when in WAL
mode?
- I assumed this was never, as WAL mode allows many concurrent readers.
- How can I find out exceptions to this rule?


B. Is there any method for determining lock transitions for connections?
- Is there an API?
- Would it be possible to use dtrace to instrument SQLite to detect
lock transitions?
- Where should I be looking?

I'd really appreciate any pointers.

Thanks.
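
A minimal sketch of the kind of contention described above (the file name,
thread count and timings are arbitrary placeholders, and whether the PRAGMA
actually hits BUSY under this load is exactly the open question):

```
import sqlite3, threading, time

DB = "busy_test.db"                              # placeholder file name
sqlite3.connect(DB).execute("PRAGMA journal_mode=WAL").fetchone()

def writer():
    # isolation_level=None = autocommit, so the explicit BEGIN goes to SQLite as-is
    conn = sqlite3.connect(DB, isolation_level=None, timeout=0)
    try:
        conn.execute("BEGIN IMMEDIATE")          # take the write lock
        time.sleep(0.5)
        conn.execute("COMMIT")
    except sqlite3.OperationalError as e:
        print("writer:", e)                      # the losers see "database is locked"

threads = [threading.Thread(target=writer) for _ in range(7)]
for t in threads:
    t.start()

reader = sqlite3.connect(DB, timeout=0)          # timeout=0 surfaces BUSY immediately
try:
    print("journal_mode:", reader.execute("PRAGMA journal_mode").fetchone()[0])
except sqlite3.OperationalError as e:
    print("reader:", e)

for t in threads:
    t.join()
```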

On Wed, Aug 28, 2019 at 9:46 PM test user 
wrote:

> Hello,
>
> I'm getting this message in the log:
>
> `SQLITE_BUSY, database is locked in "PRAGMA journal_mode"`
>
> I get this response when running the query `PRAGMA journal_mode`.
>
> The file is in journal_mode=WAL.
>
> Another connection holds a write transaction.
>
> Seeing as `PRAGMA journal_mode` is just a read, why would this return BUSY?
>
> I assumed that in WAL mode you can have many reads and a single writer at
> the same time?
>
> Thanks
>


Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread Simon Slavin
On 29 Aug 2019, at 8:12pm, dirdi  wrote:

> The only difference between both runs:

Well, that's nothing.  I see no reason for the massive change in timing from 
what you posted.  But someone else might.


Re: [sqlite] What is the zlib compression level used by sqlar?

2019-08-29 Thread Richard Hipp
On 8/29/19, Peng Yu  wrote:
> It is not clear what level of zlib compression is used in sqlar. Does
> anybody know?
>
> https://sqlite.org/sqlar/doc/trunk/README.md
>
> Is the following Python code guaranteed to generate a file that conforms
> to the sqlar format standard? (Note that I deliberately set the
> timestamp and permissions to zero as I don't care about them.)

I don't think the compression level matters as far as the file format
is concerned.  A higher compression level simply means that the
compressor tries harder to find repeated patterns.  The same decoder
works regardless of the compression level.
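
A quick way to convince yourself of that (a throwaway sketch, nothing
sqlar-specific): compress the same payload at several levels and feed every
result to the same decoder.

```
import zlib

data = b"example payload " * 1000
for level in (1, 6, 9):                    # arbitrary levels, just for illustration
    blob = zlib.compress(data, level)
    assert zlib.decompress(blob) == data   # the same decoder handles every level
    print("level", level, "->", len(blob), "bytes")
```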
-- 
D. Richard Hipp
d...@sqlite.org


[sqlite] What is the zlib compression level used by sqlar?

2019-08-29 Thread Peng Yu
It is not clear what level of zlib compression is used in sqlar. Does
anybody know?

https://sqlite.org/sqlar/doc/trunk/README.md

Is the following Python code guaranteed to generate a file that conforms
to the sqlar format standard? (Note that I deliberately set the
timestamp and permissions to zero as I don't care about them.)

```
import sys
import sqlite3
import zlib

conn = sqlite3.connect(sys.argv[1])
c = conn.cursor()
data = sys.stdin.read()
# Assumes the sqlar table already exists with columns (name, mode, mtime, sz, data).
# buffer() is Python 2; on Python 3, pass the bytes from zlib.compress() directly.
c.execute('REPLACE INTO sqlar VALUES(?, ?, ?, ?, ?)',
          [sys.argv[2], 0, 0, len(data), buffer(zlib.compress(data))])
conn.commit()
```

The content is taken from stdin and the sqlar file is in argv[1] and
the path in the archive is in argv[2].

-- 
Regards,
Peng


Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread Keith Medcalf

Fascinating ... because the optimal query plan would be to do a table scan of 
tbl1 and then dip into tbl2 for each row to get the update value.  The only 
meaningful index then would be on tbl2(int) (that is idx3) since an index will 
be unhelpful for the LIKE constraint.

Is the generated code different for the two queries, the one that runs slow 
-vs- the one that runs fast?  (That is, the output from EXPLAIN ..., not the 
output from EXPLAIN QUERY PLAN, or from .eqp full in the CLI).
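
One way to capture that, sketched here in Python (the db.sqlite3 file name and
the helper are hypothetical; the UPDATE is the one from the original post), is
to grab the EXPLAIN listing before and after the DROP/CREATE and diff the two:

```
import difflib
import sqlite3

UPDATE_SQL = """UPDATE `tbl1` SET `tbl2_id` = (
    SELECT `tbl2`.`id` FROM `tbl2`
    WHERE `tbl1`.`int` >= `tbl2`.`int` AND `tbl1`.`str` LIKE '%' || `tbl2`.`str`
    ORDER BY `tbl2`.`int` DESC LIMIT 1)"""

def vdbe(conn, sql):
    # EXPLAIN returns one row per opcode without running the statement.
    return ["  ".join(str(c) for c in row) for row in conn.execute("EXPLAIN " + sql)]

conn = sqlite3.connect("db.sqlite3")              # assumed file name
before = vdbe(conn, UPDATE_SQL)                   # program with the original idx1
conn.executescript("DROP INDEX `idx1`; CREATE INDEX `idx1` ON `tbl1` (`int`);")
after = vdbe(conn, UPDATE_SQL)                    # program with the recreated idx1
print("\n".join(difflib.unified_diff(before, after, "before", "after", lineterm="")))
```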

Also, what version of SQLite3 are you using?

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of dirdi
>Sent: Thursday, 29 August, 2019 11:57
>To: sqlite-users@mailinglists.sqlite.org
>Subject: Re: [sqlite] Unexpected REINDEX behavior.
>
>On 8/29/19 6:10 PM, Keith Medcalf wrote:
>> Do you have sqlite_stat1 or sqlite_stat4 tables present (that is,
>have you ever run analyze)?
>
>No.
>
>> If the sqlite_stat1 or sqlite_stat4 tables are present, then if you
>run ANALYZE; rather than reindexing or dropping/recreating the index,
>what is the result?
>
>If I run ...
>
>> ANALYZE;
>> REINDEX;
>> ANALYZE;
>
>... instead of ...
>
>> DROP INDEX `idx1`;
>> CREATE INDEX `idx1` ON `tbl1` (
>> `int`
>> );
>
>the UPDATE query remains slow (~36m).
>
>--
>Best regards
>
>dirdi





Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread dirdi
On 8/29/19 6:00 PM, Simon Slavin wrote:
> Just in case, please run an integrity-check on the database as it is before 
> you delete/rebuild the index.

The python script creates the DB from scratch every time. However ...

> PRAGMA integrity_check;

... returned ...

> ok

> You might be able to learn more about the various versions of that index 
> using this:
> 
> 
> 
> It may show that the various versions of idx1 you see/create take different 
> numbers of pages, or have different depths.  However, I agree with you that 
> the amount of time disparity is unexpected.

Here is what sqlite3_analyzer reports about the index:
> *** Index IDX1 of table TBL1 ***
> 
> Percentage of total database....  3.7%
> Number of entries............... 551407
> Bytes of storage consumed....... 5517312
> Bytes of payload................ 3826907  69.4%
> Bytes of metadata............... 1670381  30.3%
> B-tree depth.................... 3
> Average payload per entry....... 6.94
> Average unused bytes per entry.. 0.04
> Average metadata per entry...... 3.03
> Average fanout.................. 224.00
> Non-sequential pages............ 1282  95.2%
> Maximum payload per entry....... 7
> Entries that use overflow....... 0  0.0%
> Index pages used................ 6
> Primary pages used.............. 1341
> Overflow pages used............. 0
> Total pages used................ 1347
> Unused bytes on index pages..... 5816  23.7%
> Unused bytes on primary pages... 14208  0.26%
> Unused bytes on overflow pages.. 0
> Unused bytes on all pages....... 20024  0.36%

I ran it twice, once before executing DROP INDEX and CREATE INDEX
commands ...

> $ ./sqlite3_analyzer db.sqlite3 > run1

... and once afterwards:

> $ ./sqlite3_analyzer db.sqlite3 > run2

The only difference between both runs:
> $ diff run1 run2
> 1260d1259
> < INSERT INTO space_used VALUES('idx_words1','words',1,0,551407,550067,3,3826907,0,0,7,6,1341,0,5816,14208,0,1282,5517312);
> 1270a1270
> > INSERT INTO space_used VALUES('idx_words1','words',1,0,551407,550067,3,3826907,0,0,7,6,1341,0,5816,14208,0,1282,5517312);

-- 
Best regards

dirdi


Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread Richard Damon
On 8/29/19 10:56 AM, dirdi wrote:
> On 8/29/19 6:10 PM, Keith Medcalf wrote:
>> Do you have sqlite_stat1 or sqlite_stat4 tables present (that is, have you 
>> ever run analyze)?  
> No.
>
>> If the sqlite_stat1 or sqlite_stat4 tables are present, then if you run 
>> ANALYZE; rather than reindexing or dropping/recreating the index, what is 
>> the result?
> If I run ...
>
>> ANALYZE;
>> REINDEX;
>> ANALYZE;
> ... instead of ...
>
>> DROP INDEX `idx1`;
>> CREATE INDEX `idx1` ON `tbl1` (
>> `int`
>> );
> the UPDATE query remains slow (~36m).
>
One thought: perhaps the ANALYZE gives the query planner misleading
information about the structure of the data in the table, and dropping and
recreating the index (which deletes the results of the last ANALYZE) makes
the planner choose a different plan which works better. Perhaps you could
look at and compare the query plans of the two different scenarios.

-- 
Richard Damon



Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread dirdi
On 8/29/19 6:10 PM, Keith Medcalf wrote:
> Do you have sqlite_stat1 or sqlite_stat4 tables present (that is, have you 
> ever run analyze)?  

No.

> If the sqlite_stat1 or sqlite_stat4 tables are present, then if you run 
> ANALYZE; rather than reindexing or dropping/recreating the index, what is the 
> result?

If I run ...

> ANALYZE;
> REINDEX;
> ANALYZE;

... instead of ...

> DROP INDEX `idx1`;
> CREATE INDEX `idx1` ON `tbl1` (
> `int`
> );

the UPDATE query remains slow (~36m).
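
In case it helps to reproduce the difference, a rough timing harness (file
names are placeholders; each run works on a throwaway copy of the database so
both preparations start from the same state):

```
import shutil, sqlite3, time

UPDATE_SQL = """UPDATE `tbl1` SET `tbl2_id` = (
    SELECT `tbl2`.`id` FROM `tbl2`
    WHERE `tbl1`.`int` >= `tbl2`.`int` AND `tbl1`.`str` LIKE '%' || `tbl2`.`str`
    ORDER BY `tbl2`.`int` DESC LIMIT 1)"""

def run(label, prep):
    shutil.copy("db.sqlite3", "scratch.sqlite3")  # work on a disposable copy
    conn = sqlite3.connect("scratch.sqlite3")
    conn.executescript(prep)                      # REINDEX or DROP/CREATE
    t0 = time.time()
    conn.execute(UPDATE_SQL)
    conn.commit()
    print(label, round(time.time() - t0, 1), "seconds")

run("REINDEX:    ", "REINDEX;")
run("DROP/CREATE:", "DROP INDEX `idx1`; CREATE INDEX `idx1` ON `tbl1` (`int`);")
```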

-- 
Best regards

dirdi


Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread Keith Medcalf

Do you have sqlite_stat1 or sqlite_stat4 tables present (that is, have you ever 
run analyze)?  

Neither running a REINDEX nor a VACUUM will modify the statistics; however, 
dropping the index will delete the statistics for that index (and re-creating 
the index will not re-create them).

If the statistics tables (sqlite_stat1 and/or sqlite_stat4) are present, then 
you update them by running ANALYZE.  You can also drop the sqlite_stat1 and/or 
sqlite_stat4 tables to get rid of the statistics entirely, or delete individual 
rows from those tables to get rid of stale shape data.

If the sqlite_stat1 or sqlite_stat4 tables are present, then if you run 
ANALYZE; rather than reindexing or dropping/recreating the index, what is the 
result?
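
For example, a small sketch of that inspection and cleanup (db.sqlite3 and
idx1 are just the file and index names assumed from this thread):

```
import sqlite3

conn = sqlite3.connect("db.sqlite3")
try:
    # Shape data recorded by the last ANALYZE for this particular index, if any.
    for row in conn.execute(
            "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = ?", ("idx1",)):
        print(row)
    # Dropping just the stale rows for that index, as described above.
    conn.execute("DELETE FROM sqlite_stat1 WHERE idx = ?", ("idx1",))
    conn.commit()
except sqlite3.OperationalError:
    print("no sqlite_stat1 table; ANALYZE has never been run")
```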

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of dirdi
>Sent: Thursday, 29 August, 2019 08:10
>To: sqlite-users@mailinglists.sqlite.org
>Subject: [sqlite] Unexpected REINDEX behavior.
>
>Hi there, I noticed some - at least for me - unexpected behavior of
>the
>REINDEX command:
>
>I have a DB with 7 tables, 11 indexes and a size of about 140MiB. The
>DB
>is generated by a python script parsing an XML file. Entries are
>added
>to the tables in arbitrary order.
>
>After the tables have been created and filled, some UPDATE queries
>are
>executed. One of those queries took ~36m. I tried to reduce the
>runtime
>by using the REINDEX command, but with no success. However, if I
>delete
>and recreate one of the indexes, using the DROP INDEX and CREATE
>INDEX
>commands, the UPDATE query takes only ~2m.
>
>Unfortunately I can not share the DB with you, due to legal reasons
>and
>was also not able to create a test script to trigger this behavior.
>
>The two tables, affected by the UPDATE query:
>> CREATE TABLE `tbl1` (
>> `id` INTEGER PRIMARY KEY AUTOINCREMENT,
>> `str` TEXT NOT NULL UNIQUE COLLATE NOCASE,
>> `int` INTEGER NOT NULL,
>> `tbl2_id` INTEGER,
>> `other_str` TEXT DEFAULT NULL,
>> `other_int1` INTEGER DEFAULT NULL,
>> `other_int2` INTEGER NOT NULL DEFAULT 0,
>> FOREIGN KEY(`tbl2_id`) REFERENCES `tbl2`(`id`)
>> );
>>
>> CREATE TABLE `tbl2` (
>> `id` INTEGER PRIMARY KEY AUTOINCREMENT,
>> `str` TEXT NOT NULL UNIQUE COLLATE NOCASE,
>> `int` INTEGER NOT NULL
>> );
>
>Their corresponding indexes:
>> CREATE INDEX `idx1` ON `tbl1` (
>> `int`
>> );
>> CREATE INDEX `idx2` ON `tbl1` (
>> `tbl2_id`
>> );
>> CREATE INDEX `idx3` ON `tbl2` (
>> `int`
>> );
>... filling the tables ...
>
>Recreating index idx1 (to achieve the speedup, recreating the other
>indexes is not necessary):
>> DROP INDEX `idx1`;
>> CREATE INDEX `idx1` ON `tbl1` (
>> `int`
>> );
>And the update query:
>> UPDATE `tbl1`
>> SET `tbl2_id` = (
>> SELECT `tbl2`.`id`
>> FROM `tbl2`
>> WHERE `tbl1`.`int` >= `tbl2`.`int`
>> AND `tbl1`.`str` LIKE '%' || `tbl2`.`str`
>> ORDER BY `tbl2`.`int` DESC
>> LIMIT 1
>> );
>
>And its corresponding query plan:
>> id  parent  notused  detail
>> 3   0       0        SCAN TABLE tbl1
>> 23  0       0        CORRELATED SCALAR SUBQUERY 1
>> 29  23      0        SEARCH TABLE tbl2 USING INDEX idx3 (int
>tbl1 and tbl2 contain about 600k and 20k entries, respectively.
>
>Does one have an explanation for this behavior?
>Is it expected that the REINDEX command produces other results than
>"manually" recreation of an index?
>If yes, under which circumstances does this happen?
>And is there some way to measure the "quality" of an index / if it
>has
>been scattered?
>
>PS: I also tried the VACUUM (combined w/ and w/o REINDEX) command to
>no
>success.
>
>--
>Best regards
>
>dirdi





Re: [sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread Simon Slavin
On 29 Aug 2019, at 3:09pm, dirdi  wrote:

> Does one have an explanation for this behavior?

> Is it expected that the REINDEX command produces other results than 
> "manually" recreation of an index?
> If yes, under which circumstances does this happen?
> And is there some way to measure the "quality" of an index / if it has been 
> scattered?

Just in case, please run an integrity-check on the database as it is before you 
delete/rebuild the index.

What you're seeing seems weird to me.  One thing that comes to mind is that 
deleting and recreating an index might put all the pages of the new index at 
the end of the database file, whereas the REINDEX command might reuse the pages 
which the index is currently using.  But I don't know whether this is how 
SQLite really works.

You might be able to learn more about the various versions of that index using 
this:



It may show that the various versions of idx1 you see/create take different 
numbers of pages, or have different depths.  However, I agree with you that the 
amount of time disparity is unexpected.


Re: [sqlite] dqlite 1.0.0 - SQLite replication and failover library

2019-08-29 Thread Free Ekanayaka
Hi Dominique,

Dominique Devienne  writes:

> On Thu, Aug 29, 2019 at 2:35 PM Jose Isaias Cabrera 
> wrote:
>
>> Free Ekanayaka, on Thursday, August 29, 2019 06:40 AM, wrote...
>> > See https://dqlite.io for more details.
>>
>> Can dqlite be installed on Windows?  I went to the site, read the
>> README.md file, and could not find any reference of it.  I can see the
>>
>> $ sudo add-apt-repository -y ppa:dqlite/v1 && sudo apt install dqlite
>>
>> which lets me know that it is linux/unix based.  But, is Windows an option
>> also?  Thanks.
>>
>
> Ditto. Wasn't clear if Windows was supported.

Added this as well to the FAQ:

https://github.com/canonical/dqlite/blob/master/doc/faq.md#are-windows-and-osx-supported

> But I'd add whether dqlite could replicate across platforms as well.
> SQLite DB files are portable across platforms and architectures.
> So can dqlite replicate across Windows, Linux, OSX, etc???

As said, yes it's possible, but not done. I don't think it would be too
much work.
>
> Also, does it come in easy-to-digest amalgamation?
> Especially since it requires a patch to SQLite, a portable
> C-amalgamation source distribution form would maximize
> chances of it being tried out.

Good point, I have thought about it but did not have time to
implement it. I'll definitely provide that option in the near future.

> Perhaps OT, but would the hooks added to the WAL-journaling
> ever be up-streamed by Richard, under a build flag for example?
> Without it, that's basically "forking" SQLite, which is not
> ideal... --DD

Absolutely. I'd love to upstream it at some point. However I'll wait
until I'm 100% sure of the API, the code and the logic, since Richard
and the team are known to be rightfully very conservative when it comes
to contributions.

One bit that I'd like to explore is having a few more replication
options than just WAL pages: for instance you could have some kind of
mixed logical/binary replication that streams the operations performed on
the btree (e.g. "I updated this cell with this value"), or statement-based
replication when it makes sense. Ideally a mix of all those approaches
with some heuristics to automatically decide which one is the best for a
particular transaction.

Free



Re: [sqlite] dqlite 1.0.0 - SQLite replication and failover library

2019-08-29 Thread Free Ekanayaka
Hello Jose,

Jose Isaias Cabrera  writes:

> Free Ekanayaka, on Thursday, August 29, 2019 06:40 AM, wrote...
>>
>> Hi,
>>
>> following up from my previous post back in 2017 [0], I'd like to
>> announce version 1.0.0 of dqlite, a C library that brings data
>> replication and high-availability to SQLite, using the Raft consensus
>> algorithm.
>>
>> The biggest change is that Go is not used anymore, the engine itself is
>> all pure C now.
>>
>> It still requires to apply a patch to SQLite, but it's a rather tiny one
>> that just adds a few hooks when writing to the Write-Ahead log.
>>
>> See https://dqlite.io for more details.
>>
>> Thanks again to the SQLite authors for their excellent work.
>
> Can dqlite be installed on Windows?  I went to the site, read the README.md 
> file, and could not find any reference of it.  I can see the
>
> $ sudo add-apt-repository -y ppa:dqlite/v1 && sudo apt install dqlite
>
> which lets me know that it is linux/unix based.  But, is Windows an
> option also?  Thanks.

At the moment Windows is not an option, mainly because under the hood
dqlite uses a Linux-specific asynchronous file system API (io_submit) not
available on Windows. However, such code lives behind a pluggable
interface that could be implemented on other OSs.

Free


[sqlite] Unexpected REINDEX behavior.

2019-08-29 Thread dirdi
Hi there, I noticed some - at least for me - unexpected behavior of the
REINDEX command:

I have a DB with 7 tables, 11 indexes and a size of about 140MiB. The DB
is generated by a python script parsing an XML file. Entries are added
to the tables in arbitrary order.

After the tables have been created and filled, some UPDATE queries are
executed. One of those queries took ~36m. I tried to reduce the runtime
by using the REINDEX command, but with no success. However, if I delete
and recreate one of the indexes, using the DROP INDEX and CREATE INDEX
commands, the UPDATE query takes only ~2m.

Unfortunately I cannot share the DB with you, for legal reasons, and I was
also not able to create a test script that triggers this behavior.

The two tables, affected by the UPDATE query:
> CREATE TABLE `tbl1` (
> `id` INTEGER PRIMARY KEY AUTOINCREMENT,
> `str` TEXT NOT NULL UNIQUE COLLATE NOCASE,
> `int` INTEGER NOT NULL,
> `tbl2_id` INTEGER,
> `other_str` TEXT DEFAULT NULL,
> `other_int1` INTEGER DEFAULT NULL,
> `other_int2` INTEGER NOT NULL DEFAULT 0,
> FOREIGN KEY(`tbl2_id`) REFERENCES `tbl2`(`id`)
> );
> 
> CREATE TABLE `tbl2` (
> `id` INTEGER PRIMARY KEY AUTOINCREMENT,
> `str` TEXT NOT NULL UNIQUE COLLATE NOCASE,
> `int` INTEGER NOT NULL
> );

Their corresponding indexes:
> CREATE INDEX `idx1` ON `tbl1` (
> `int`
> );
> CREATE INDEX `idx2` ON `tbl1` (
> `tbl2_id`
> );
> CREATE INDEX `idx3` ON `tbl2` (
> `int`
> );
... filling the tables ...

Recreating index idx1 (to achieve the speedup, recreating the other
indexes is not necessary):
> DROP INDEX `idx1`;
> CREATE INDEX `idx1` ON `tbl1` (
> `int`
> );
And the update query:
> UPDATE `tbl1`
> SET `tbl2_id` = (
> SELECT `tbl2`.`id`
> FROM `tbl2`
> WHERE `tbl1`.`int` >= `tbl2`.`int`
> AND `tbl1`.`str` LIKE '%' || `tbl2`.`str`
> ORDER BY `tbl2`.`int` DESC
> LIMIT 1
> );

And its corresponding query plan:
> id  parent  notused  detail
> 3   0       0        SCAN TABLE tbl1
> 23  0       0        CORRELATED SCALAR SUBQUERY 1
> 29  23      0        SEARCH TABLE tbl2 USING INDEX idx3 (int

tbl1 and tbl2 contain about 600k and 20k entries, respectively.

Does one have an explanation for this behavior?
Is it expected that the REINDEX command produces other results than
"manually" recreating an index?
If yes, under which circumstances does this happen?
And is there some way to measure the "quality" of an index / if it has
been scattered?

PS: I also tried the VACUUM (combined w/ and w/o REINDEX) command, to no
success.

-- 
Best regards

dirdi


Re: [sqlite] dqlite 1.0.0 - SQLite replication and failover library

2019-08-29 Thread Dominique Devienne
On Thu, Aug 29, 2019 at 2:35 PM Jose Isaias Cabrera 
wrote:

> Free Ekanayaka, on Thursday, August 29, 2019 06:40 AM, wrote...
> > See https://dqlite.io for more details.
>
> Can dsqlite be installed on Windows?  I went to the site, read the
> README.md file, and could not find any reference of it.  I can see the
>
> $ sudo add-apt-repository -y ppa:dqlite/v1 && sudo apt install dqlite
>
> which lets me know that it linux/unix based.  But, is Windows an option
> also?  Thanks.
>

Ditto. Wasn't clear if Windows was supported.
But I'd add whether dqlite could replicate across platforms as well.
SQLite DB files are portable across platforms and architectures.
So can dqlite replicate across Windows, Linux, OSX, etc???

Also, does it come in easy-to-digest amalgamation?
Especially since it requires a patch to SQLite, a portable
C-amalgamation source distribution form would maximize
chances of it being tried out.

Perhaps OT, but would the hooks added to the WAL-journaling
ever be up-streamed by Richard, under a build flag for example?
Without it, that's basically "forking" SQLite, which is not ideal... --DD


Re: [sqlite] dqlite 1.0.0 - SQLite replication and failover library

2019-08-29 Thread Jose Isaias Cabrera

Free Ekanayaka, on Thursday, August 29, 2019 06:40 AM, wrote...
>
> Hi,
>
> following up from my previous post back in 2017 [0], I'd like to
> announce version 1.0.0 of dqlite, a C library that brings data
> replication and high-availability to SQLite, using the Raft consensus
> algorithm.
>
> The biggest change is that Go is not used anymore, the engine itself is
> all pure C now.
>
> It still requires to apply a patch to SQLite, but it's a rather tiny one
> that just adds a few hooks when writing to the Write-Ahead log.
>
> See https://dqlite.io for more details.
>
> Thanks again to the SQLite authors for their excellent work.

Can dqlite be installed on Windows?  I went to the site, read the README.md 
file, and could not find any reference of it.  I can see the

$ sudo add-apt-repository -y ppa:dqlite/v1 && sudo apt install dqlite

which lets me know that it is linux/unix based.  But, is Windows an option also?  
Thanks.

josé



Re: [sqlite] dqlite 1.0.0 - SQLite replication and failover library

2019-08-29 Thread Free Ekanayaka
Hello,

I updated the FAQ [0] with your two questions.

Free

[0] https://github.com/canonical/dqlite/blob/master/doc/faq.md#why-c

test user  writes:

> Hey Free,
>
> Looks like an interesting project.
>
> Is there a blog or docs about the reasons for the move from Go to C?
>
> Also what types of systems would utilise dqlite? Are there current users?
>
> Thanks
>
>
> On Thu, 29 Aug 2019 at 11:41, Free Ekanayaka  wrote:
>
>> Hi,
>>
>> following up from my previous post back in 2017 [0], I'd like to
>> announce version 1.0.0 of dqlite, a C library that brings data
>> replication and high-availability to SQLite, using the Raft consensus
>> algorithm.
>>
>> The biggest change is that Go is not used anymore, the engine itself is
>> all pure C now.
>>
>> It still requires to apply a patch to SQLite, but it's a rather tiny one
>> that just adds a few hooks when writing to the Write-Ahead log.
>>
>> See https://dqlite.io for more details.
>>
>> Thanks again to the SQLite authors for their excellent work.
>>
>> Cheers,
>>
>> Free
>>
>> [0]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2017-August/074384.html


Re: [sqlite] dqlite 1.0.0 - SQLite replication and failover library

2019-08-29 Thread test user
Hey Free,

Looks like an interesting project.

Is there a blog or docs about the reasons for the move from Go to C?

Also what types of systems would utilise dqlite? Are there current users?

Thanks


On Thu, 29 Aug 2019 at 11:41, Free Ekanayaka  wrote:

> Hi,
>
> following up from my previous post back in 2017 [0], I'd like to
> announce version 1.0.0 of dqlite, a C library that brings data
> replication and high-availability to SQLite, using the Raft consensus
> algorithm.
>
> The biggest change is that Go is not used anymore, the engine itself is
> all pure C now.
>
> It still requires to apply a patch to SQLite, but it's a rather tiny one
> that just adds a few hooks when writing to the Write-Ahead log.
>
> See https://dqlite.io for more details.
>
> Thanks again to the SQLite authors for their excellent work.
>
> Cheers,
>
> Free
>
> [0]
> http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2017-August/074384.html


[sqlite] dqlite 1.0.0 - SQLite replication and failover library

2019-08-29 Thread Free Ekanayaka
Hi,

following up from my previous post back in 2017 [0], I'd like to
announce version 1.0.0 of dqlite, a C library that brings data
replication and high-availability to SQLite, using the Raft consensus
algorithm.

The biggest change is that Go is not used anymore, the engine itself is
all pure C now.

It still requires to apply a patch to SQLite, but it's a rather tiny one
that just adds a few hooks when writing to the Write-Ahead log.

See https://dqlite.io for more details.

Thanks again to the SQLite authors for their excellent work.

Cheers,

Free

[0] 
http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2017-August/074384.html