Re: [sqlite] How to open an sqlite3 for reading without being blocked by process writing to it?

2020-03-08 Thread Simon Slavin
[This explanation is much simplified for clarity.  Before filling in missing 
details please consider what the OP wants.  Don't just show off your exhaustive 
knowledge of SQLite.]

A database is normally in delete journal mode, as if you'd executed

PRAGMA journal_mode=DELETE

In this journal mode you can't read from a database while it is being changed.  If 
one process is changing the database, it holds the database locked, and another 
process cannot read it until it is unlocked.  This is because the reading process 
might otherwise read part of a row before it is changed and the rest of the row 
after it is changed.

To fix this, change the journal mode to WAL:

PRAGMA journal_mode=WAL

In this mode one process can read a database while another process is changing 
it.  The process that reads the database gets the data as it was before the 
change.
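
For example, using apsw from Python (a minimal sketch; the 'example.db' file and 
the table 't' are made up for illustration, and the PRAGMA can be issued by any 
connection that is allowed to write):

```python
import apsw

# Writer connection: switch the database to WAL.  The mode is stored in
# the database file itself, so this only has to be done once.
writer = apsw.Connection("example.db")
writer.cursor().execute("PRAGMA journal_mode=WAL")
writer.cursor().execute("CREATE TABLE IF NOT EXISTS t(x)")
writer.cursor().execute("INSERT INTO t VALUES (1)")

# Reader: a separate, read-only connection to the same file.
reader = apsw.Connection("example.db", flags=apsw.SQLITE_OPEN_READONLY)

# Start a write transaction on the writer and leave it open ...
writer.cursor().execute("BEGIN IMMEDIATE")
writer.cursor().execute("INSERT INTO t VALUES (2)")

# ... the reader is not blocked; it sees the data as it was before the
# uncommitted change (only the row with x = 1).
for (x,) in reader.cursor().execute("SELECT x FROM t"):
    print(x)

writer.cursor().execute("COMMIT")
```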

For further details see https://www.sqlite.org/wal.html




Re: [sqlite] How to open an sqlite3 for reading without being blocked by process writing to it?

2020-03-08 Thread Keith Medcalf

On Sunday, 8 March, 2020 21:24, Peng Yu  wrote:

>When I open an sqlite3 db using the following python code,

>conn=apsw.Connection(filepath, flags = apsw.SQLITE_OPEN_READONLY)

>I get the following error:

>Traceback (most recent call last):
>  File "/xxx.py", line 21, in <module>
>    for x in c.execute('SELECT (data) FROM sqlar'):
>  File "src/cursor.c", line 236, in resetcursor
>apsw.BusyError: BusyError: database is locked

>The db file is currently processed by a python script which opens the
>db file for writing in the following way.

>conn = apsw.Connection(filepath)

>Since the first process just reads, I'd like it not to be blocked. What
>is the correct way to do so? Thanks.

Opening a connection with SQLITE_OPEN_READONLY only means that you cannot write 
to the database using that connection.  It does not affect the locking and 
transaction system in any way; it merely means that if you try to "write" using 
the read-only connection, you will get an error to the effect that the connection 
is read-only.

Perhaps you want to set a busy timeout or use WAL journal mode.
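
For example, with apsw (a rough sketch; the 'archive.db' path and the 5-second 
timeout are made up, and the WAL alternative would be set once from a connection 
that can write):

```python
import apsw

# The read-only flag only stops writes through this connection; it does
# not change locking, so a concurrent writer can still cause BusyError
# unless we are prepared to wait.
conn = apsw.Connection("archive.db", flags=apsw.SQLITE_OPEN_READONLY)

# Wait up to 5000 ms for the writer's lock to clear instead of failing
# immediately with "database is locked".
conn.cursor().execute("PRAGMA busy_timeout=5000")

# Alternatively, switching the database to WAL (PRAGMA journal_mode=WAL,
# run once from a writable connection) lets readers proceed without
# waiting at all.

for row in conn.cursor().execute("SELECT data FROM sqlar"):
    pass  # process each row here
```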

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.





[sqlite] How to open an sqlite3 for reading without being blocked by process writing to it?

2020-03-08 Thread Peng Yu
When I open an sqlite3 db using the following python code,

conn=apsw.Connection(filepath, flags = apsw.SQLITE_OPEN_READONLY)

I get the following error:

Traceback (most recent call last):
  File "/xxx.py", line 21, in <module>
    for x in c.execute('SELECT (data) FROM sqlar'):
  File "src/cursor.c", line 236, in resetcursor
apsw.BusyError: BusyError: database is locked

The db file is currently processed by a python script which opens the
db file for writing in the following way.

conn = apsw.Connection(filepath)

Since the first process just reads, I'd like it not to be blocked. What
is the correct way to do so? Thanks.

-- 
Regards,
Peng


Re: [sqlite] speeding up an FTS5 search with a JOIN

2020-03-08 Thread Puneet Kishor
Update: so, after much hitting of my head against the wall of SQL, I came up 
with the following. As noted above, I really have two distinct sets of queries 
that I can run separately, like so:

Q1: (SELECT t1Id FROM t1 WHERE …) AS a

Q2: (SELECT t1Id FROM vt1 WHERE vt1 MATCH 'bar') AS b

Then, I can do the following:

SELECT Count(*) FROM a WHERE a.t1Id IN b

Of course, in reality, I don't do this separately but all in one go, which makes 
for a really messy SQL statement but a really fast query: a couple of hundred ms 
as opposed to more than 25 s.
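
Spelled out, the one-statement version looks roughly like the sketch below, here 
rewritten with a WITH clause and run through apsw purely for concreteness (the 
'db.sqlite' path and the single t1 constraint are placeholders):

```python
import apsw

conn = apsw.Connection("db.sqlite")  # placeholder path

# Q1 and Q2 folded into a single statement: filter t1 first, then keep
# only the ids that the FTS index matched.
sql = """
WITH
  a AS (SELECT t1Id FROM t1  WHERE deleted = 0),        -- Q1: the t1 constraints
  b AS (SELECT t1Id FROM vt1 WHERE vt1 MATCH 'bar')     -- Q2: the FTS match
SELECT Count(*) AS num
FROM a
WHERE a.t1Id IN (SELECT t1Id FROM b)
"""
for (num,) in conn.cursor().execute(sql):
    print(num)
```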

You might notice that in my Q2 above I MATCHed for 'bar' instead of 'foo'. That 
is because 'bar' returns fewer rows than 'foo' does. The problem remains when 
there are too many matches in the FTS query, in which case the FTS query itself 
is slow; for example, 'foo' matches more than 80K rows.

Now, one interesting point of comparison: the same kind of query (from the 
user's point of view) against an ElasticSearch instance (that is, all the rows 
with 'foo' anywhere in the text) is very fast, on the order of sub-hundred ms. 
I realize it might be unfair to compare SQLite with ElasticSearch, but still. 
(Or is it a fair comparison?)


> On Mar 7, 2020, at 8:59 AM, P Kishor  wrote:
> 
> I asked this question on Stackoverflow without much success, and with a 
> suggestion to ask it on the list. So here I am. I have two tables, t1(id, 
> t1Id, … other cols …, fullText) and an FTS5 virtual table vt1(t1Id, fullText)
> 
> ```
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(*) as num FROM t1 WHERE deleted = 0;
> 
> QUERY PLAN
> `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=?)
> 
> sqlite> SELECT Count(*) as num FROM t1 WHERE deleted = 0;
> 308498
> 
> Run Time: real 0.043 user 0.023668 sys 0.009005
> ```
> 
> 
> As can be seen above, the actual query takes ~43ms
> 
> ```
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(*) as num FROM vt1 WHERE vt1 MATCH 'foo';
> 
> QUERY PLAN
> `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 131073:
> 
> sqlite> SELECT Count(*) as num FROM vt1 WHERE vt1 MATCH 'foo';
> 80789
> 
> Run Time: real 0.047 user 0.008021 sys 0.009640
> ```
> 
> The actual query, in this case, takes ~47ms. So far so good. But the problem 
> occurs when I join the two tables
> 
> ```
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(*) as num
>   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
>   ...> WHERE t1.deleted = 0 AND vt1 MATCH 'foo';
> 
> QUERY PLAN
> |--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
> `--SEARCH TABLE t1 USING INDEX sqlite_autoindex_t1_1 (t1Id=?)
> 
> sqlite> SELECT Count(*) as num
>   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
>   ...> WHERE t1.deleted = 0 AND vt1 MATCH 'foo';
> 80789
> 
> Run Time: real 26.218 user 1.396376 sys 5.413630
> ```
> 
> The answer is correct but the query takes more than 26 seconds! Of course, I 
> would like to speed up this query by several orders of magnitude, but I would 
> also like to understand why this join is causing the slowdown.
> 
> Now, the reason I have constructed a query like this is that users can add 
> further constraints on the table t1. For example, 
> 
> ```
> sqlite> SELECT Count(*) as num
>   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
>   ...> WHERE t1.deleted = 0 AND 
>   ...> t1.frob = 'bar' AND 
>   ...> t1.nob = 'baz' AND 
>   ...> vt1 MATCH 'foo';
> ```
> 
> Also, in every operation, for every given constraint, two queries are 
> performed: one that returns the count and the other that returns the actual 
> columns. And, finally, only a subset of the results is returned using LIMIT 
> and OFFSET, but *after* a sort ORDER has been prescribed. So, in reality, the 
> last constraint above would result in the following
> 
> ```
> sqlite> SELECT Count(*) as num
>   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
>   ...> WHERE t1.deleted = 0 AND 
>   ...> t1.frob = 'bar' AND 
>   ...> t1.nob = 'baz' AND 
>   ...> vt1 MATCH 'foo';
> 20367
> 
> sqlite> SELECT t1.id, t1.t1Id, … other cols …, 
>   ...> snippet(vt1, 1, "", "", "", 50) AS context
>   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
>   ...> WHERE t1.deleted = 0 AND 
>   ...> t1.frob = 'bar' AND 
>   ...> t1.nob = 'baz' AND 
>   ...> vt1 MATCH 'foo'
>   ...> ORDER BY 
>   ...> LIMIT 30 OFFSET ;
> ```
> 
> When no t1 columns are prescribed in the constraint, the default count (shown 
> above) and default cols are returned with the FTS search
> 
> ```
> sqlite> SELECT Count(*) as num
>   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
>   ...> WHERE t1.deleted = 0 AND 
>   ...> vt1 MATCH 'foo';
> 20367
> 
> sqlite> SELECT t1.id, t1.t1Id, … other cols …, 
>   ...> snippet(vt1, 1, "", "", "", 50) AS context
>   ...> FROM t1 JOIN vt1 ON t1.t1Id = vt1.t1Id
>   ...> WHERE t1.deleted = 0 AND 
>   ...> vt1 MATCH 'foo'
>   ...> ORDER BY 
>   ...> LIMIT 30 OFFSET 0;
> ```