Re: [sqlite] Sqlite error code 14 when using 3.31.0+

2020-03-09 Thread Rowan Worth
On Mon, 9 Mar 2020 at 23:22, Daniel Polski  wrote:

> Updated to 3.31.1 but my application started spitting out an error when
> opening the database, so I tested some earlier sqlite versions to figure
> out when the problem starts.
> I don't get the message in versions <= 3.30.1.
>
> (from the applications log)
> SQLite Version: 3.31.0
> INFO: Database opened: /tmp/database.sqlite
> WARNING: SQLITE error code: 14 cannot open file at line 36982 of
> [3bfa9cc97d]
> WARNING: SQLITE error code: 14 os_unix.c:36982: (40) openDirectory(/tmp) -
>

errno 40 is ELOOP, "Too many symbolic links encountered". open(2) says:

   ELOOP  Too many symbolic links were encountered in resolving
 pathname,  or  O_NOFOLLOW  was
  specified but pathname was a symbolic link.

Is your /tmp/ a symlink? Sqlite seems to use O_NOFOLLOW unconditionally in
openDirectory() since this checkin:

https://www.sqlite.org/src/info/6a64fb6a2da6c98f

Probably a bug? The changelog for sqlite 3.31.0 include this which is
likely related:

- * Add the SQLITE_OPEN_NOFOLLOW
 option to
sqlite3_open_v2()  that prevents
SQLite from opening symbolic links.

-Rowan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to open an sqlite3 for reading without being blocked by process writing to it?

2020-03-09 Thread Keith Medcalf

On Monday, 9 March, 2020 18:18, Peng Yu  wrote:

>But I never experience the problem in my original email when I used
>python3's default sqlite3 module (WAL was not used). What is the
>difference between the default sqlite3 module and apsw? Thanks.

THe relevant difference is that the sqlite3 wrapper sets a default busy timeout 
of 5000 milliseconds when opening a connection, which you can change with the 
timeout parameter on the sqlite3.connect connection constructor.

APSW does not do anything behind your back, so if you want a non-zero 
busy_timeout you have to set one.

cn = apsw.Connection( ... )

cn.setbusytimeout(5000)
or
cn.cursor().execute('pragma busy_timeout=5000;')

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to open an sqlite3 for reading without being blocked by process writing to it?

2020-03-09 Thread Peng Yu
On 3/8/20, Keith Medcalf  wrote:
>
> On Sunday, 8 March, 2020 21:24, Peng Yu  wrote:
>
>>When I open an sqlite3 db using the following python code,
>
>>conn=apsw.Connection(filepath, flags = apsw.SQLITE_OPEN_READONLY)
>
>>, I got the following error.
>
>>Traceback (most recent call last):
>>  File "/xxx.py", line 21, in 
>>for x in c.execute('SELECT (data) FROM sqlar'):
>>  File "src/cursor.c", line 236, in resetcursor
>>apsw.BusyError: BusyError: database is locked
>
>>The db file is currently processed by a python script which opens the
>>db file for writing in the following way.
>
>>conn = apsw.Connection(filepath)
>
>>Since the first process just reads, I'd like it not be blocked. What
>>is the correct way to do so? Thanks.
>
> Opening a connection with the SQLITE_OPEN_READONLY only means that the
> connection cannot write to the database using that connection.  It does not
> affect the locking and transaction system in any way.  Merely that if you
> try to "write" using the readonly connection that you will get an error to
> the effect that the connection is read-only.
>
> Perhaps you want to set a busy timeout or use WAL journal mode.

But I never experience the problem in my original email when I used
python3's default sqlite3 module (WAL was not used). What is the
difference between the default sqlite3 module and apsw? Thanks.


-- 
Regards,
Peng
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite3: .width counts bytes, not characters

2020-03-09 Thread Richard Hipp
On 3/9/20, Keith Medcalf  wrote:
>>I use sqlite3 (sqlite3 --version = "3.11.0 2016-02-15 17:29:24
>>3d862f207e3adc00f78066799ac5a8c282430a5f" on Ubuntu 16.04.6 LTS) for
>
> Have you tried version more recent than 4 years and 1 month old?
> I think some of these issues may have been fixed in the last couple of
> years.

I think the enhancement is here:
https://sqlite.org/src/timeline?c=ed0842c156ab1a78

That would correspond to version 3.20.0.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite3: .width counts bytes, not characters

2020-03-09 Thread Keith Medcalf
>I use sqlite3 (sqlite3 --version = "3.11.0 2016-02-15 17:29:24
>3d862f207e3adc00f78066799ac5a8c282430a5f" on Ubuntu 16.04.6 LTS) for

Have you tried version more recent than 4 years and 1 month old?  
I think some of these issues may have been fixed in the last couple of years.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite3: .width counts bytes, not characters

2020-03-09 Thread Jose Isaias Cabrera

softw...@quantentunnel.de, on Monday, March 9, 2020 04:02 PM, wrote...
>
> Hi
>
> I use sqlite3 (sqlite3 --version = "3.11.0 2016-02-15 17:29:24
> 3d862f207e3adc00f78066799ac5a8c282430a5f" on Ubuntu 16.04.6 LTS) for
> formatted output. '.width' does not behave as I expected when non-ASCII
> Unicode characters are printed. It seems that .width counts bytes and not
> characters. See test case below (in case the email does not display
> non-ASCII characters properly: the second INSERT has 6 characters '3/4' as
> first field). Is there an option to format the output nicely in columns
> when non-ASCII characters may be included?
>
> Test case:
> echo "CREATE TEMPORARY TABLE width_test (data, description);" >
> width_test.sql
> echo "INSERT INTO width_test VALUES('aa', '6 ASCII letters');" >>
> width_test.sql
> echo "INSERT INTO width_test VALUES('¾¾', '6 non-AScII
> letters');" >> width_test.sql
> echo ".mode col" >> width_test.sql
> echo ".width 6 20" >> width_test.sql
> echo "SELECT * FROM width_test;" >> width_test.sql
> echo ".width 9 20" >> width_test.sql
> echo "SELECT * FROM width_test;" >> width_test.sql
> echo ".width 12 20" >> width_test.sql
> echo "SELECT * FROM width_test;" >> width_test.sql
> cat width_test.sql | sqlite3 "test.db"

On Windows is working as expected.  Or maybe, I don't understand the problem, 
which is something that could be very likely.
12:10:51.82>sqlite3
SQLite version 3.24.0 2018-06-04 19:24:41
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TEMPORARY TABLE width_test (data, description);
sqlite> INSERT INTO width_test VALUES('aa', '6 ASCII letters');
sqlite> INSERT INTO width_test VALUES('¾¾', '6 non-AScII letters');
sqlite> .mode col
sqlite> .width 6 20
sqlite> SELECT * FROM width_test;
aa  6 ASCII letters
__  6 non-AScII letters
sqlite> .width 9 20
sqlite> SELECT * FROM width_test;
aa 6 ASCII letters
__ 6 non-AScII letters
sqlite> .width 12 20
sqlite> SELECT * FROM width_test;
aa6 ASCII letters
__6 non-AScII letters
sqlite>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] sqlite3: .width counts bytes, not characters

2020-03-09 Thread Software
Hi

I use sqlite3 (sqlite3 --version = "3.11.0 2016-02-15 17:29:24 
3d862f207e3adc00f78066799ac5a8c282430a5f" on Ubuntu 16.04.6 LTS) for formatted 
output. '.width' does not behave as I expected when non-ASCII Unicode 
characters are printed. It seems that .width counts bytes and not characters. 
See test case below (in case the email does not display non-ASCII characters 
properly: the second INSERT has 6 characters '3/4' as first field). Is there an 
option to format the output nicely in columns when non-ASCII characters may be 
included?

Test case:
echo "CREATE TEMPORARY TABLE width_test (data, description);" > width_test.sql
echo "INSERT INTO width_test VALUES('aa', '6 ASCII letters');" >> 
width_test.sql
echo "INSERT INTO width_test VALUES('¾¾', '6 non-AScII letters');" >> 
width_test.sql
echo ".mode col" >> width_test.sql
echo ".width 6 20" >> width_test.sql
echo "SELECT * FROM width_test;" >> width_test.sql
echo ".width 9 20" >> width_test.sql
echo "SELECT * FROM width_test;" >> width_test.sql
echo ".width 12 20" >> width_test.sql
echo "SELECT * FROM width_test;" >> width_test.sql
cat width_test.sql | sqlite3 "test.db"


Best regards

Andreas
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to open an sqlite3 for reading without being blocked by process writing to it?

2020-03-09 Thread Keith Medcalf

On Monday, 9 March, 2020 08:33, Simon Slavin  wrote:

>If your .shm and .wal files still exist when no apps are accessing the
>database, the most likely cause is that at least one of the apps is not
>closing its connection correctly.

or you are opening the database connection with SQLITE_OPEN_READNLY because a 
READONLY connection cannot delete the shm and wal files as that would require 
writing, and that readonly connection happens to be the last one to close.

In other words where there are multiple connections to a WAL database, the 
associated WAL and SHM files will be deleted when the last connection to that 
database closes cleanly *unless* that connection is not permitted to write (was 
opened with SQLITE_OPEN_READONLY) or was otherwise configured not to delete the 
WAL and SHM files.

This is, of course, documented.

The original unreferenced contextless fragment of text was this:

There is an additional quasi-persistent "-wal" file and "-shm" shared
memory file associated with each database, which can make SQLite less
appealing for use as an application file-format.

This is because now you have three things which may in certain circumstances 
constitute 1 thing.  If journal_mode=DELETE you only have one thing be one 
thing -- the database file -- and if there is a -journal file then you know 
something "is broked".  However in journal_mode=WAL it is not so simple.  More 
watts (not merely milliwatts) much be expended to understand why there may be 
three things consitituting one thing, and that it does not necessarily indicate 
any "brokeness" but may rather be a necessary and normal state of affairs.  Or 
not.  But an external observed will not be able to tell.  Hence a version of 
Word that stores its documents in an SQLite database in WAL mode may not be 
suitable for use by someone who expects that "one document is one file".

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Custom VFSes and PENDING_BYTE position

2020-03-09 Thread Alexey Alyaev
Hi D. Richard Hipp,

Thank you for quick response.

I did consider all options as it stands:

1) Skipping the lock bytes in our vfs. This becomes really messy very fast, as
   it is based on test_onefile vfs and both db and journal are part of the same
   file, and lock bytes may end up in any of 3 segments: db, free space, or
   journal.
   
   In addition we have logic to grow (during writes)/shrink (via PRAGMA) the
   file as necessary. Basically, really a lot of skipping logic in many places.
   
   And even if I'd go ahead and coded all this up, we'd end up breaking all our
   currently deployed applications for all our users, since the skipped offsets
   would contain valid pages.
   
2) Set new PENDING_BYTE via sqlite3_test_control(). This looked like an option
   at first, until I realized that the same value [sqlite3PendingByte]
   is used in both: in pager, and the os_* vfses.
   
   Since our VFS is based on test_onefile with db, free space, and journal all
   occupying one file, it becomes impossible to handle all 3 cases that could
   end up writing into the lock byte region.
   
   And again, modifying the PENDING_BYTE would break the application for all
   users, due to lock-byte page position change.

3) Which is what I considered as the most suitable solution in cases similar
   to ours (one file vfses), and the reason I decide to write to this list:
   
   To be able to set (via config), an offsets for pending byte region
   in os_* VFSes, but have no effect on the lock-byte page offset as seen by
   higher layers.
   
   Since os_* VFSes are pretty much generic and are used as auxilaries in many
   other "wrapper" VFSes, it would be logical to at least have an option to
   configure the pending byte region only for os_* VFSes.
   
In our case, having an option such as (3), we could move the pending byte
region into the head space (occurs before the database) of the file, which
contains sufficient space to house all the lock bytes and that always exists.

This would handle all possible scenarios with one config tweak, as well as not
break any current deployments.

For others, any custom VFS that allocates space before the database in file,
and use os_* VFSes as auxilary, the same option can be used to add a shift
to the pending byte region, aligning it correctly with the lock byte page
offset used by pager.

Ofcourse, I can patch a few lines in sqlite3 source that we use, but I just
wanted to let you guys know of a situation we ran into and had no real simple
way of going forward with, even though everything else seemed to be configurable
and flexible.

Thanks.

-- Alexey
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Sqlite error code 14 when using 3.31.0+

2020-03-09 Thread Daniel Polski
Updated to 3.31.1 but my application started spitting out an error when 
opening the database, so I tested some earlier sqlite versions to figure 
out when the problem starts.

I don't get the message in versions <= 3.30.1.

(from the applications log)
SQLite Version: 3.31.0
INFO: Database opened: /tmp/database.sqlite
WARNING: SQLITE error code: 14 cannot open file at line 36982 of 
[3bfa9cc97d]

WARNING: SQLITE error code: 14 os_unix.c:36982: (40) openDirectory(/tmp) -

It seems related to opening the WAL file. Anything I should change when 
using sqlite >= 3.31.0 ?


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to open an sqlite3 for reading without being blocked by process writing to it?

2020-03-09 Thread Simon Slavin
On 9 Mar 2020, at 2:16pm, David Raymond  wrote:

> In general I believe the last connection tries to do a complete checkpoint 
> when it closes, and if it succeeds then it'll delete the -wal and -shm files. 
> If you have automatic checkpointing turned off (maybe you're doing regular 
> checkpoints from a separate dedicated process) then it's possible for all 
> connections to finish and close the database without error, but still have 
> those files sitting around because they haven't been checkpointed yet.

If your .shm and .wal files still exist when no apps are accessing the 
database, the most likely cause is that at least one of the apps is not closing 
its connection correctly.

If your app runs code to close connections, and the connections are still not 
closing properly, then the most likely cause is that you have a statement 
active when you close the connection.

To make sure your statements are not active, make sure you call 
sqlite3_finalize() on each statement when you no longer need it.  If you are 
not using the C library to execute SQLite calls, look for some equivalent in 
your SQLite library.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to open an sqlite3 for reading without being blocked by process writing to it?

2020-03-09 Thread David Raymond
I see this. What does it mean? Does it mean even when the sqlite3
session is closed there is still -wal and -shm left on the disk?

"""
There is an additional quasi-persistent "-wal" file and "-shm" shared
memory file associated with each database, which can make SQLite less
appealing for use as an application file-format.
"""


In rollback journal mode the -journal file only shows up when you're making 
changes. In wal mode the -wal and -shm files are there for the entire life of 
the connection, as long as there's anything even reading the file, they're 
there.

In general I believe the last connection tries to do a complete checkpoint when 
it closes, and if it succeeds then it'll delete the -wal and -shm files. If you 
have automatic checkpointing turned off (maybe you're doing regular checkpoints 
from a separate dedicated process) then it's possible for all connections to 
finish and close the database without error, but still have those files sitting 
around because they haven't been checkpointed yet.

Some combination of those two is what is being referred to there I believe.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Custom VFSes and PENDING_BYTE position

2020-03-09 Thread Richard Hipp
On 3/9/20, Alexey Alyaev  wrote:
>
> I have been working on a custom VFS project since 2017
>
> would it not be reasonable to allow setting a different
> lock byte offset just in windows/unix VFS, while keeping the lock-byte page
> position untouched?
>
> This would allow VFSes such as ours to move the lock-byte region used by
> default OS VFS, outside of the database segment, say into the head space
> which we have allocated and reserved for meta.

In the default implementation, the location of the PENDING_BYTE is set
by a global variable sqlite3PendingByte.  If you want to change that
global variable, you can.  Be warned, however, that changing the
location of the pendingByte results in an incompatible file format.
You will also run into severe problems (assertion-faults and/or
segfaults) if you set the pendingByte to some value that is not a
multiple of the page size or that overlaps with page 1 of the database
file.

Aside:  The PENDING_BYTE location can be adjusted in this way for
testing purposes.  Even release builds are able to move the
PENDING_BYTE (using the sqlite3_test_control() interface) since our
dogma is "fly what you test and test what you fly."

Maybe a better solution for you would be to modify your custom xRead()
and xWrite() routines so that the skip over the bytes in the range of
0x4000 through 0x4200.  If the offset is greater than or equal
to 0x4000 just add 512 to the offset before continuing.  Depending
on your circumstances, I suppose you would also have to do something
similar with xTruncate() and xFileSize().

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] speeding up an FTS5 search with a JOIN

2020-03-09 Thread P Kishor
following up on my own response to Dan…

I see what you mean Dan. I am not an expert at reading the query plans, but I 
do see that effectively my new query has the same query plan as the last two 
queries combined as well as the original query. The only difference is the 
order in which it proceeds. So, the original (slow query)

|--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
|--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
`--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)

The two sub-queries separately that run very fast

`--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m

and 
|--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
`--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)

and the new query that also runs very fast using the two sub-queries

QUERY PLAN
|--CO-ROUTINE 1
|  |--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
|  `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
|--SCAN SUBQUERY 1 AS a
`--LIST SUBQUERY 2
  `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m


No idea what is going on.


> On Mar 9, 2020, at 2:08 PM, P Kishor  wrote:
> 
> 
> 
>> On Mar 9, 2020, at 1:04 PM, Dan Kennedy  wrote:
>> 
>> 
>> On 9/3/63 01:44, Puneet Kishor wrote:
>>> Update: so, after much hitting of my head against the wall of sql, I came 
>>> up with the following – as noted above, I really have two distinct set of 
>>> queries I can do separately like so
>>> 
>>> Q1: (SELECT t1Id FROM t1 WHERE …) AS a
>>> 
>>> Q2: (SELECT t1Id FROM vt1 WHERE vt1 MATCH ‘bar’) AS b
>>> 
>>> Then, I can do the following -
>>> 
>>> SELECT Count(*) FROM a WHERE a.t1Id IN b
>>> 
>>> Of course, in reality, I don’t do this separately but all in one go to make 
>>> a really messy SQL but a really fast query, a couple of hundred ms as 
>>> opposed to > 25s
>> 
>> 
>> I don't see why that would be any different from your join query. What does 
>> EXPLAIN QUERY PLAN say for the version that takes a few hundred ms?
> 
> 
> Here it is, the new query made up of two queries derived from the original 
> query (summarize again below). The query itself takes 301ms. 
> 
> ```
> SQLite version 3.30.0 2019-10-04 15:03:17
> Enter ".help" for usage hints.
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(*) FROM
>   ...> 
>   ...> -- this one is 'QUERY a'
>   ...> (SELECT Count(DISTINCT t1.t1Id) t1Id 
>   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id 
>   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '') a
>   ...>
>   ...> WHERE a.t1Id IN 
>   ...> 
>   ...> -- and this one is 'QUERY b'
>   ...> (SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH "foo");
> QUERY PLAN
> |--CO-ROUTINE 1
> |  |--SEARCH TABLE t2 USING INDEX ix_t2 (deleted=? AND deleted=?)
> |  `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
> |--SCAN SUBQUERY 1 AS a
> `--LIST SUBQUERY 2
>   `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
> sqlite>
> ```
> 
> Here is the original query (QUERY zero). This is the one that takes a very 
> long time, 10.611s. Note, in this one I’ve added an additional table ’t2’ but 
> the concept is the same. Normal tables JOINed to each other, and then JOINed 
> to a virtual table.
> 
> ```
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(DISTINCT t1.t1Id)
>   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id JOIN vt1 ON t1.t1Id = vt1.t1Id
>   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '' AND vt1 MATCH 
> "foo";
> QUERY PLAN
> |--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
> |--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
> `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
> sqlite>
> ```
> 
> Here are the two queries derived from QUERY zero. The first one (QUERY a) 
> takes 324ms
> 
> ```
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(DISTINCT t1.t1Id)
>   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id
>   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '';
> QUERY PLAN
> |--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
> `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
> sqlite>
> ```
> 
> The second query (QUERY b) takes: 27ms
> 
> ```
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH “foo";
> QUERY PLAN
> `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
> sqlite>
> ```
> 
> One more thing: All of this goes to hell if the virtual table returns way too 
> many matches. For example, I have a term that returns 80K rows from the FTS 
> MATCH. In that case, even my new query very slow because, well, because the 
> QUERY b above is slow.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to open an sqlite3 for reading without being blocked by process writing to it?

2020-03-09 Thread Peng Yu
I see this. What does it mean? Does it mean even when the sqlite3
session is closed there is still -wal and -shm left on the disk?

"""
There is an additional quasi-persistent "-wal" file and "-shm" shared
memory file associated with each database, which can make SQLite less
appealing for use as an application file-format.
"""

On 3/8/20, Simon Slavin  wrote:
> [This explanation is much simplified for clarity.  Before filling in missing
> details please consider what the OP wants.  Don't just show off your
> exhaustive knowledge of SQLite.]
>
> A database is normally in delete journal mode, as if you'd executed
>
> PRAGMA journal_mode=DELETE
>
> In this journal mode you can't read from a database which is being changed.
> If one process is changing the database it has the database locked.  Another
> process cannot read it until it is unlocked.  Because the reading process
> might read some of the row before it is changed, and the rest of the row
> after it is changed.
>
> To fix this, change the journal mode to WAL:
>
> PRAGMA journal_mode=WAL
>
> In this mode one process can read a database while another process is
> changing it.  The process that reads the database gets the data as it was
> before the change.
>
> For further details see
>
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
Regards,
Peng
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] speeding up an FTS5 search with a JOIN

2020-03-09 Thread P Kishor


> On Mar 9, 2020, at 1:04 PM, Dan Kennedy  wrote:
> 
> 
> On 9/3/63 01:44, Puneet Kishor wrote:
>> Update: so, after much hitting of my head against the wall of sql, I came up 
>> with the following – as noted above, I really have two distinct set of 
>> queries I can do separately like so
>> 
>> Q1: (SELECT t1Id FROM t1 WHERE …) AS a
>> 
>> Q2: (SELECT t1Id FROM vt1 WHERE vt1 MATCH ‘bar’) AS b
>> 
>> Then, I can do the following -
>> 
>> SELECT Count(*) FROM a WHERE a.t1Id IN b
>> 
>> Of course, in reality, I don’t do this separately but all in one go to make 
>> a really messy SQL but a really fast query, a couple of hundred ms as 
>> opposed to > 25s
> 
> 
> I don't see why that would be any different from your join query. What does 
> EXPLAIN QUERY PLAN say for the version that takes a few hundred ms?


Here it is, the new query made up of two queries derived from the original 
query (summarize again below). The query itself takes 301ms. 

```
SQLite version 3.30.0 2019-10-04 15:03:17
Enter ".help" for usage hints.
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(*) FROM
   ...> 
   ...> -- this one is 'QUERY a'
   ...> (SELECT Count(DISTINCT t1.t1Id) t1Id 
   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id 
   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '') a
   ...>
   ...> WHERE a.t1Id IN 
   ...> 
   ...> -- and this one is 'QUERY b'
   ...> (SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH "foo");
QUERY PLAN
|--CO-ROUTINE 1
|  |--SEARCH TABLE t2 USING INDEX ix_t2 (deleted=? AND deleted=?)
|  `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
|--SCAN SUBQUERY 1 AS a
`--LIST SUBQUERY 2
   `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
sqlite>
```

Here is the original query (QUERY zero). This is the one that takes a very long 
time, 10.611s. Note, in this one I’ve added an additional table ’t2’ but the 
concept is the same. Normal tables JOINed to each other, and then JOINed to a 
virtual table.

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(DISTINCT t1.t1Id)
   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '' AND vt1 MATCH 
"foo";
QUERY PLAN
|--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
|--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
`--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
sqlite>
```

Here are the two queries derived from QUERY zero. The first one (QUERY a) takes 
324ms

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(DISTINCT t1.t1Id)
   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id
   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '';
QUERY PLAN
|--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
`--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
sqlite>
```

The second query (QUERY b) takes: 27ms

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH “foo";
QUERY PLAN
`--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
sqlite>
```

One more thing: All of this goes to hell if the virtual table returns way too 
many matches. For example, I have a term that returns 80K rows from the FTS 
MATCH. In that case, even my new query very slow because, well, because the 
QUERY b above is slow.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Custom VFSes and PENDING_BYTE position

2020-03-09 Thread Alexey Alyaev
Hi sqlite developers,

As this is my first post to the mailing list, I wanted to say huge thanks
and regards to D. Richard Hipp and his team for all the hard work put into
this wonderful database software/library :-)

I have been working on a custom VFS project since 2017. The project
got released, and has since been in deployment as part of one mobile app.

Recently, we ran into an issue that cannot be addressed using any of the
provided sqlite configuration mechanisms, and I would like to see if it's
possible to make a feature request.

The problem we hit is around PENDING_BYTE position.

Our VFS project is based on test_onefile vfs found in sqlite source tree, with
some modifications to store additional meta data at the beginning of file.

The actual reading/writing, as in original onefile vfs, is delegated to the
auxilary vfs - default (OS) vfs of sqlite.

The database begins at an offset just after the initial metadata. So, the
offsets at which read/writes occur in auxilary vfs, are shifted to the right.

This all worked well, until our db files started growing to sizes > 1GB, and
the lock byte issue became evident on windows platform.

After investigating PENDING_BYTE, the history of lock-byte page, and
mechanisms of tweaking its position, I realized that there is strong coupling
between the lock-byte page position as seen by pager, and the offsets which OS
VFSes use for lock bytes.

As I understand, wrapping default OS VFS for reading/writing in another VFS is
a common use case, so would it not be reasonable to allow setting a different
lock byte offset just in windows/unix VFS, while keeping the lock-byte page
position untouched?

This would allow VFSes such as ours to move the lock-byte region used by
default OS VFS, outside of the database segment, say into the head space
which we have allocated and reserved for meta.

Thanks.

-- Alexey
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] speeding up an FTS5 search with a JOIN

2020-03-09 Thread Dan Kennedy


On 9/3/63 01:44, Puneet Kishor wrote:

Update: so, after much hitting of my head against the wall of sql, I came up 
with the following – as noted above, I really have two distinct set of queries 
I can do separately like so

Q1: (SELECT t1Id FROM t1 WHERE …) AS a

Q2: (SELECT t1Id FROM vt1 WHERE vt1 MATCH ‘bar’) AS b

Then, I can do the following -

SELECT Count(*) FROM a WHERE a.t1Id IN b

Of course, in reality, I don’t do this separately but all in one go to make a 
really messy SQL but a really fast query, a couple of hundred ms as opposed to 
> 25s



I don't see why that would be any different from your join query. What 
does EXPLAIN QUERY PLAN say for the version that takes a few hundred ms?


Dan.





You might notice that in my Q2 above I MATCHed for ‘bar’ instead of ‘foo’. That is 
because ‘bar’ returns fewer rows than ‘foo’ does. The problem remains when there 
are too many matches in the FTS query in which case the FTS query itself is slow, 
for example, with ‘foo’ which matches > 80K rows.

Now, one interesting point of comparison – the same kind of query (from the 
user point-of-view) against an ElasticSearch instance (that is, all the rows 
with ‘foo’ anywhere in the text) is very fast, in the order of sub-hundred ms. 
I realize it might be unfair comparing SQLite with ElasticSearch, but still. 
(Or, is it a fair comparison?)



On Mar 7, 2020, at 8:59 AM, P Kishor  wrote:

I asked this question on Stackoverflow with not much success, and a suggestion 
to ask it on the list. So here I am. I have two tables, t1(id, t1Id, … other 
cols …, fullText) and a FTS5 virtual table vt1(t1Id, fullText)

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(*) as num FROM t1 WHERE deleted = 0;

QUERY
PLAN
--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=?)

sqlite> SELECT Count(*) as num FROM t1 WHERE deleted = 0;
308498

Run Time
: real 0.043 user 0.023668 sys 0.009005
```


As can be see above, the actual query takes ~43ms

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(*) as num FROM vt1 WHERE vt1 MATCH 'foo';

QUERY
PLAN
--SCAN TABLE vt1 VIRTUAL TABLE INDEX 131073:

sqlite> SELECT Count(*) as num FROM vt1 WHERE vt1 MATCH 'foo';
80789

Run Time
: real 0.047 user 0.008021 sys 0.009640
```

The actual query, in this case, takes ~47ms. So far so good. But the problem 
occurs when I join the two tables

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND vt1 MATCH 'foo';

QUERY
PLAN
|--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
--SEARCH TABLE t1 USING INDEX sqlite_autoindex_t1_1 (t1Id=?)

sqlite> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND vt1 MATCH 'foo';
80789

Run Time
: real 26.218 user 1.396376 sys 5.413630
```

The answer is correct but the query takes more than 26 seconds! Of course, I 
would like to speed up this query by several orders of magnitude, but I would 
also like to understand why this join is causing the slowdown.

Now, the reason I have constructed a query like this is because users can add 
further constraints for the table t1. For example,

```
sqlite> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND
   ...> WHERE t1.frob = ‘bar' AND
   ...> WHERE t1.nob = ‘baz' AND
   ...> vt1 MATCH 'foo’;
```

Also, in every operation, for every given constraint, two queries are 
performed, one that returns the count and the other that returns the actual 
columns. And, finally, only a subset of the results are returned using LIMIT 
and OFFSET but *after* a sort ORDER has been prescribed. So, in reality, the 
last constraint above would result in the following

```
sqlite> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND
   ...> WHERE t1.frob = ‘bar' AND
   ...> WHERE t1.nob = ‘baz' AND
   ...> vt1 MATCH 'foo’;
20367

sqlite> SELECT t1.id, t1.t1Id, … other cols …,
   ...> snippet(vt1, 1, "", "", "", 50) AS context,
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND
   ...> WHERE t1.frob = ‘bar' AND
   ...> WHERE t1.nob = ‘baz' AND
   ...> vt1 MATCH ‘foo’
   ...> ORDER BY 
   ...> LIMIT 30 OFFSET ;
```

When no t1 columns are prescribed in the constraint, the default count (shown 
above) and default cols are returned with the FTS search

```
sqlite> SELECT Count(*) as num
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0
   ...> vt1 MATCH 'foo’;
20367

sqlite> SELECT t1.id, t1.t1Id, … other cols …,
   ...> snippet(vt1, 1, "", "", "", 50) AS context,
   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0
   ...> vt1 MATCH ‘foo’
   ...> ORDER BY 
   ...> LIMIT 30 OFFSET 0;
```

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] speeding up an FTS5 search with a JOIN

2020-03-09 Thread P Kishor

Hi Dan,


> On Sat Mar 7 13:32:54 UTC 2020,Dan Kennedy danielk1977 at gmail.com wrote:

>> On 7/3/63 14:58, P Kishor wrote: 

[snipped]

>> The actual query, in this case, takes ~47ms. So far so good. But the problem 
>> occurs when I join the two tables 
>>
>> ``` 
>> sqlite> EXPLAIN QUERY PLAN 
>> ...> SELECT Count(*) as num 
>> ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id 
>> ...> WHERE t1.deleted = 0 AND vt1 MATCH 'foo'; 
>>
>> QUERY 
>> PLAN 
>> |--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m 
>> --SEARCH TABLE t1 USING INDEX sqlite_autoindex_t1_1 (t1Id=?) 
>>
>> sqlite> SELECT Count(*) as num 
>> ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id 
>> ...> WHERE t1.deleted = 0 AND vt1 MATCH 'foo'; 
>> 80789 
>>
>> Run Time 
>> : real 26.218 user 1.396376 sys 5.413630 
>> ``` 

>That does seem slow. Are there many rows in table "t1" with t1.deleted set to 
>something other than 0? 

No, none of them have `deleted != 0`. Explanation: `deleted` is a flag field 
that will track (in the future) when any of the rows are tagged as deleted, no 
longer in use. The idea is to exclude “deleted” rows from searches, but not 
really delete them. As of now, all the rows are active, so the column is set to 
0 (false) for all the rows. In other words,

```
sqlite> SELECT count(*) FROM t1 WHERE deleted = 0;
308498

sqlite> SELECT count(*) FROM t1 WHERE deleted != 0;
0
```

> What does:   SELECT count(*) AS num FROM t1 JOIN vt1 ON t1.t1Id=vt1.t1Id 
> WHERE vt1 MATCH 'foo'; return? Dan. 

sqlite> SELECT count(*) AS num FROM t1 JOIN vt1 ON t1.t1Id=vt1.t1Id WHERE vt1 
MATCH 'foo';
80789


[snipped]

Please also see my follow-up email with an update sent yesterday, Mar 8 at 
18:44:14 UTC 2020

Many thanks.

Note: I had accidentally set the wrong mail-delivery flag on my subscription so 
I didn’t get earlier emails. It has been set right now. There is a possibility 
this email may not be a part of the original thread, in which case, my 
apologies in advance.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users