Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-03 Thread David Raymond
Un-analyzed here's what I'm getting while looking at the db:

With distinct:

sqlite> explain query plan select distinct ppos from move join pos on mto = 
pnum where pcensus = 18 and pmin < pmax;
selectid|order|from|detail
0|0|1|SCAN TABLE pos USING INDEX pipos
0|1|0|SEARCH TABLE move USING COVERING INDEX mrev (mto=?)


Without distict:

sqlite> explain query plan select ppos from move join pos on mto = pnum where 
pcensus = 18 and pmin < pmax;
selectid|order|from|detail
0|0|1|SCAN TABLE pos
0|1|0|SEARCH TABLE move USING COVERING INDEX mrev (mto=?)


With distinct:

sqlite> explain select distinct ppos from move join pos on mto = pnum where 
pcensus = 18 and pmin < pmax;
addr  opcode p1p2p3p4 p5  comment
  -        -  --  -
0 Init   0 22000  Start at 22
1 Null   1 7 008  r[7]=NULL
2 OpenRead   1 4 0 7  00  root=4 iDb=0; pos
3 OpenRead   3 6 0 k(1,)  00  root=6 iDb=0; pipos
4 OpenRead   4 110 k(3,,,)02  root=11 iDb=0; mrev
5 Rewind 3 211 0  00
6   Seek   3 0 100  Move 1 to 3.rowid
7   Column 1 2 100  r[1]=pos.pcensus
8   Ne 2 201 (BINARY)   54  if r[1]!=r[2] goto 
20
9   Column 1 5 3 -9900  r[3]=pos.pmin
10  Column 1 6 4 99 00  r[4]=pos.pmax
11  Ge 4 203 (BINARY)   53  if r[3]>=r[4] goto 
20
12  IdxRowid   3 5 000  r[5]=rowid
13  SeekGE 4 205 1  00  key=r[5]
14IdxGT  4 205 1  00  key=r[5]
15Column 3 0 600  r[6]=pos.ppos
16Eq 6 197 (BINARY)   80  if r[7]==r[6] 
goto 19
17Copy   6 7 000  r[7]=r[6]
18ResultRow  6 1 000  output=r[6]
19  Next   4 14000
20Next   3 6 001
21Halt   0 0 000
22Transaction0 0 155   0  01  usesStmtJournal=0
23Integer182 000  r[2]=18
24Goto   0 1 000


Without distinct:

sqlite> explain select ppos from move join pos on mto = pnum where pcensus = 18 
and pmin < pmax;
addr  opcode p1p2p3p4 p5  comment
  -        -  --  -
0 Init   0 17000  Start at 17
1 OpenRead   1 4 0 7  00  root=4 iDb=0; pos
2 OpenRead   2 110 k(3,,,)02  root=11 iDb=0; mrev
3 Rewind 1 16000
4   Column 1 2 100  r[1]=pos.pcensus
5   Ne 2 151 (BINARY)   54  if r[1]!=r[2] goto 
15
6   Column 1 5 3 -9900  r[3]=pos.pmin
7   Column 1 6 4 99 00  r[4]=pos.pmax
8   Ge 4 153 (BINARY)   53  if r[3]>=r[4] goto 
15
9   Rowid  1 5 000  r[5]=rowid
10  SeekGE 2 155 1  00  key=r[5]
11IdxGT  2 155 1  00  key=r[5]
12Column 1 1 600  r[6]=pos.ppos
13ResultRow  6 1 000  output=r[6]
14  Next   2 11000
15Next   1 4 001
16Halt   0 0 000
17Transaction0 0 155   0  01  usesStmtJournal=0
18Integer182 000  r[2]=18
19Goto   0 1 000


Will run analyze and re-run those when I get time.

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Dominique Devienne
Sent: Friday, February 03, 2017 3:55 AM
To: SQLite mailing list
Subject: Re: [sqlite] "DISTINCT" makes a query take 37 times as long

On Fri, Feb 3, 2017 at 12:27 AM, Kevin O'Gorman <kevinogorm...@gmail.com>
wrote:

> But the big thing is that I took a look at EXPLAIN QUERY PLAN using this
> ...

Maybe somebody can explain them to me, but it doesn't really matter whether
> I ever understand them. Perhaps Mr. Hipp can make use of them.
>

EXPLAIN, and EXPLAIN QUERY PLAN have co

Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-03 Thread Hick Gunter
EXPLAIN dumps SQLite "machine code". Use EXPLAIN QUERY PLAN for the human 
readable version.

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Kevin O'Gorman
Gesendet: Freitag, 03. Februar 2017 00:27
An: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Betreff: Re: [sqlite] "DISTINCT" makes a query take 37 times as long

When I read this, it seemed like it made sense.  The thing is, it does not 
match up with reality.

First, the analysis of what happens when I pipe the results to 'sort'
misses the fact that the sort process executes within the 31 minutes of that 
version.  It would not make a dent in the time of the slow version.

But the big thing is that I took a look at EXPLAIN QUERY PLAN using this
script:
#!/usr/bin/env
python3

"""Output positions that are reachable but unsolved at census 18 See page 76 of 
Qubic log

Last Modified: Thu Feb  2 07:46:03 PST 2017 """

import sqlite3  # https://docs.python.org/3.5/library/sqlite3.html

with sqlite3.connect("917.db") as conn:
print("BEOFRE ANALYZE")
for row in conn.execute("""
EXPLAIN
SELECT DISTINCT ppos
FROM move JOIN pos ON mto = pnum
WHERE pcensus = 18 and pmin < pmax
"""):
print(row)
print()
print()
conn.execute("ANALYZE")
print("AFTER ANALYZE")
for row in conn.execute("""
EXPLAIN
SELECT DISTINCT ppos
FROM move JOIN pos ON mto = pnum
WHERE pcensus = 18 and pmin < pmax
"""):
print(row)

and after waiting most of the day for the analyze to finish, I got two 
identical query plans, neither of which I could decipher:
BEFORE
ANALYZE

(0, 'Init', 0, 25, 0, '', '00', None)
(1, 'Null', 1, 7, 0, '', '08', None)
(2, 'OpenRead', 1, 4, 0, '7', '00', None) (3, 'OpenRead', 3, 6, 0, 'k(1,)', 
'00', None) (4, 'OpenRead', 4, 11, 0, 'k(3,,,)', '02', None) (5, 'Rewind', 3, 
21, 1, '0', '00', None) (6, 'Seek', 3, 0, 1, '', '00', None) (7, 'Column', 1, 
2, 1, '', '00', None) (8, 'Ne', 2, 20, 1, '(BINARY)', '54', None) (9, 'Column', 
1, 5, 3, '-99', '00', None) (10, 'Column', 1, 6, 4, '99', '00', None) (11, 
'Ge', 4, 20, 3, '(BINARY)', '53', None) (12, 'IdxRowid', 3, 5, 0, '', '00', 
None) (13, 'SeekGE', 4, 20, 5, '1', '00', None) (14, 'IdxGT', 4, 20, 5, '1', 
'00', None) (15, 'Column', 3, 0, 6, '', '00', None) (16, 'Eq', 6, 19, 7, 
'(BINARY)', '80', None) (17, 'Copy', 6, 7, 0, '', '00', None) (18, 'ResultRow', 
6, 1, 0, '', '00', None) (19, 'Next', 4, 14, 0, '', '00', None) (20, 'Next', 3, 
6, 0, '', '01', None) (21, 'Close', 1, 0, 0, '', '00', None) (22, 'Close', 3, 
0, 0, '', '00', None) (23, 'Close', 4, 0, 0, '', '00', None) (24, 'Halt', 0, 0, 
0, '', '00', None) (25, 'Transaction', 0, 0, 155, '0', '01', None) (26, 
'TableLock', 0, 4, 0, 'pos', '00', None) (27, 'TableLock', 0, 7, 0, 'move', 
'00', None) (28, 'Integer', 18, 2, 0, '', '00', None) (29, 'Goto', 0, 1, 0, '', 
'00', None)


AFTER ANALYZE
(0, 'Init', 0, 25, 0, '', '00', None)
(1, 'Null', 1, 7, 0, '', '08', None)
(2, 'OpenRead', 1, 4, 0, '7', '00', None) (3, 'OpenRead', 3, 6, 0, 'k(1,)', 
'00', None) (4, 'OpenRead', 4, 11, 0, 'k(3,,,)', '02', None) (5, 'Rewind', 3, 
21, 1, '0', '00', None) (6, 'Seek', 3, 0, 1, '', '00', None) (7, 'Column', 1, 
2, 1, '', '00', None) (8, 'Ne', 2, 20, 1, '(BINARY)', '54', None) (9, 'Column', 
1, 5, 3, '-99', '00', None) (10, 'Column', 1, 6, 4, '99', '00', None) (11, 
'Ge', 4, 20, 3, '(BINARY)', '53', None) (12, 'IdxRowid', 3, 5, 0, '', '00', 
None) (13, 'SeekGE', 4, 20, 5, '1', '00', None) (14, 'IdxGT', 4, 20, 5, '1', 
'00', None) (15, 'Column', 3, 0, 6, '', '00', None) (16, 'Eq', 6, 19, 7, 
'(BINARY)', '80', None) (17, 'Copy', 6, 7, 0, '', '00', None) (18, 'ResultRow', 
6, 1, 0, '', '00', None) (19, 'Next', 4, 14, 0, '', '00', None) (20, 'Next', 3, 
6, 0, '', '01', None) (21, 'Close', 1, 0, 0, '', '00', None) (22, 'Close', 3, 
0, 0, '', '00', None) (23, 'Close', 4, 0, 0, '', '00', None) (24, 'Halt', 0, 0, 
0, '', '00', None) (25, 'Transaction', 0, 0, 155, '0', '01', None) (26, 
'TableLock', 0, 4, 0, 'pos', '00', None) (27, 'TableLock', 0, 7, 0, 'move', 
'00', None) (28, 'Integer', 18, 2, 0, '', '00', None) (29, 'Goto', 0, 1, 0, '', 
'00', None) Maybe somebody can explain them to me, but it doesn't really matter 
whether I ever understand them.
Perhaps Mr. Hipp can make use of them.

Absent some flaw in the above script, I think I'm done with this.  I have a 
solution that works for me, and I'd just as soon get back to my real task.  I 
just wanted to give feedback in case it would be useful.
That's how i say thanks for a really useful product.

Thanks. Mr. Hipp, and anyone else that has contributed to this product.

++ kevin


O

Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-03 Thread Dominique Devienne
On Fri, Feb 3, 2017 at 12:27 AM, Kevin O'Gorman 
wrote:

> But the big thing is that I took a look at EXPLAIN QUERY PLAN using this
> ...

Maybe somebody can explain them to me, but it doesn't really matter whether
> I ever understand them. Perhaps Mr. Hipp can make use of them.
>

EXPLAIN, and EXPLAIN QUERY PLAN have completely different output.
EXPLAIN is basically the SQLite "assembler" code (or its "bytecode" if you
prefer),
while EXPLAIN QUERY PLAN gives you a much more human readable high-level
overview of the plan.

The low-level plans do look identical, but please also share the high-level
plan, should take you only a minute. --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-02 Thread Kevin O'Gorman
When I read this, it seemed like it made sense.  The thing is, it does not
match up with reality.

First, the analysis of what happens when I pipe the results to 'sort'
misses the fact that the sort process executes within the 31 minutes of
that version.  It would not make a dent in the time of the slow version.

But the big thing is that I took a look at EXPLAIN QUERY PLAN using this
script:
#!/usr/bin/env
python3

"""Output positions that are reachable but unsolved at census 18
See page 76 of Qubic log

Last Modified: Thu Feb  2 07:46:03 PST 2017
"""

import sqlite3  # https://docs.python.org/3.5/library/sqlite3.html

with sqlite3.connect("917.db") as conn:
print("BEOFRE ANALYZE")
for row in conn.execute("""
EXPLAIN
SELECT DISTINCT ppos
FROM move JOIN pos ON mto = pnum
WHERE pcensus = 18 and pmin < pmax
"""):
print(row)
print()
print()
conn.execute("ANALYZE")
print("AFTER ANALYZE")
for row in conn.execute("""
EXPLAIN
SELECT DISTINCT ppos
FROM move JOIN pos ON mto = pnum
WHERE pcensus = 18 and pmin < pmax
"""):
print(row)

and after waiting most of the day for the analyze to finish, I got two
identical query plans,
neither of which I could decipher:
BEFORE
ANALYZE

(0, 'Init', 0, 25, 0, '', '00', None)
(1, 'Null', 1, 7, 0, '', '08', None)
(2, 'OpenRead', 1, 4, 0, '7', '00', None)
(3, 'OpenRead', 3, 6, 0, 'k(1,)', '00', None)
(4, 'OpenRead', 4, 11, 0, 'k(3,,,)', '02', None)
(5, 'Rewind', 3, 21, 1, '0', '00', None)
(6, 'Seek', 3, 0, 1, '', '00', None)
(7, 'Column', 1, 2, 1, '', '00', None)
(8, 'Ne', 2, 20, 1, '(BINARY)', '54', None)
(9, 'Column', 1, 5, 3, '-99', '00', None)
(10, 'Column', 1, 6, 4, '99', '00', None)
(11, 'Ge', 4, 20, 3, '(BINARY)', '53', None)
(12, 'IdxRowid', 3, 5, 0, '', '00', None)
(13, 'SeekGE', 4, 20, 5, '1', '00', None)
(14, 'IdxGT', 4, 20, 5, '1', '00', None)
(15, 'Column', 3, 0, 6, '', '00', None)
(16, 'Eq', 6, 19, 7, '(BINARY)', '80', None)
(17, 'Copy', 6, 7, 0, '', '00', None)
(18, 'ResultRow', 6, 1, 0, '', '00', None)
(19, 'Next', 4, 14, 0, '', '00', None)
(20, 'Next', 3, 6, 0, '', '01', None)
(21, 'Close', 1, 0, 0, '', '00', None)
(22, 'Close', 3, 0, 0, '', '00', None)
(23, 'Close', 4, 0, 0, '', '00', None)
(24, 'Halt', 0, 0, 0, '', '00', None)
(25, 'Transaction', 0, 0, 155, '0', '01', None)
(26, 'TableLock', 0, 4, 0, 'pos', '00', None)
(27, 'TableLock', 0, 7, 0, 'move', '00', None)
(28, 'Integer', 18, 2, 0, '', '00', None)
(29, 'Goto', 0, 1, 0, '', '00', None)


AFTER ANALYZE
(0, 'Init', 0, 25, 0, '', '00', None)
(1, 'Null', 1, 7, 0, '', '08', None)
(2, 'OpenRead', 1, 4, 0, '7', '00', None)
(3, 'OpenRead', 3, 6, 0, 'k(1,)', '00', None)
(4, 'OpenRead', 4, 11, 0, 'k(3,,,)', '02', None)
(5, 'Rewind', 3, 21, 1, '0', '00', None)
(6, 'Seek', 3, 0, 1, '', '00', None)
(7, 'Column', 1, 2, 1, '', '00', None)
(8, 'Ne', 2, 20, 1, '(BINARY)', '54', None)
(9, 'Column', 1, 5, 3, '-99', '00', None)
(10, 'Column', 1, 6, 4, '99', '00', None)
(11, 'Ge', 4, 20, 3, '(BINARY)', '53', None)
(12, 'IdxRowid', 3, 5, 0, '', '00', None)
(13, 'SeekGE', 4, 20, 5, '1', '00', None)
(14, 'IdxGT', 4, 20, 5, '1', '00', None)
(15, 'Column', 3, 0, 6, '', '00', None)
(16, 'Eq', 6, 19, 7, '(BINARY)', '80', None)
(17, 'Copy', 6, 7, 0, '', '00', None)
(18, 'ResultRow', 6, 1, 0, '', '00', None)
(19, 'Next', 4, 14, 0, '', '00', None)
(20, 'Next', 3, 6, 0, '', '01', None)
(21, 'Close', 1, 0, 0, '', '00', None)
(22, 'Close', 3, 0, 0, '', '00', None)
(23, 'Close', 4, 0, 0, '', '00', None)
(24, 'Halt', 0, 0, 0, '', '00', None)
(25, 'Transaction', 0, 0, 155, '0', '01', None)
(26, 'TableLock', 0, 4, 0, 'pos', '00', None)
(27, 'TableLock', 0, 7, 0, 'move', '00', None)
(28, 'Integer', 18, 2, 0, '', '00', None)
(29, 'Goto', 0, 1, 0, '', '00', None)
Maybe somebody can explain them to me, but it doesn't really matter whether
I ever understand them.
Perhaps Mr. Hipp can make use of them.

Absent some flaw in the above script, I think I'm done with this.  I have a
solution that works for me,
and I'd just as soon get back to my real task.  I just wanted to give
feedback in case it would be useful.
That's how i say thanks for a really useful product.

Thanks. Mr. Hipp, and anyone else that has contributed to this product.

++ kevin


On Thu, Feb 2, 2017 at 12:27 AM, Hick Gunter  wrote:

> DISTINCT forces the query optimizer to create an intermediate table to
> hold the results and compare each row of the non-distinct result set with
> an automatically created index. It may also affect the query plan in a way
> that chooses inefficient indices, which is more likely if you have not run
> ANALYZE on the fully loaded database.
>
> Using a 3 stage pipe instead you additionally have more CPUs (1 running
> the query, 1 or more sorting the results) working in paralell.
>
> Try EXPLAIN QUERY PLAN to see what the query planner 

Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-02 Thread Richard Hipp
On 2/2/17, Hick Gunter  wrote:
> DISTINCT forces the query optimizer to create an intermediate table to hold
> the results and compare each row of the non-distinct result set with an
> automatically created index. It may also affect the query plan in a way that
> chooses inefficient indices, which is more likely if you have not run
> ANALYZE on the fully loaded database.

Creating an intermediate table is one way in which DISTINCT might be
implemented.  The query planner might also try to force the output
into sorted order, so that it can eliminate duplicates simply by
comparing against the previous output.  The second mechanism can cause
the use of inefficient indexes, if they are present and un-ANALYZED.
The query planner computes an estimated run-time for each of various
techniques it considers, and picks the one it thinks will run the
fastest.  Running ANALYZE helps the query planner to generate better
(more accurate) cost estimates.

I have not yet analyzed this situation sufficiently to tell what is going on.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-02 Thread Hick Gunter
DISTINCT forces the query optimizer to create an intermediate table to hold the 
results and compare each row of the non-distinct result set with an 
automatically created index. It may also affect the query plan in a way that 
chooses inefficient indices, which is more likely if you have not run ANALYZE 
on the fully loaded database.

Using a 3 stage pipe instead you additionally have more CPUs (1 running the 
query, 1 or more sorting the results) working in paralell.

Try EXPLAIN QUERY PLAN to see what the query planner is doing.

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Kevin O'Gorman
Gesendet: Donnerstag, 02. Februar 2017 03:28
An: sqlite-users 
Betreff: [sqlite] "DISTINCT" makes a query take 37 times as long

I have a database of positions and moves in a strategic game, and I'm searching 
for unsolved positions that have been connected to an immediate ancestor.  I'm 
using Python 3.5.2, and the code looks like

#!/usr/bin/env python3
"""Output positions that are reachable but unsolved at census 18 or greater See 
page 76 of Qubic log

Last Modified: Tue Jan 31 12:13:07 PST 2017 """

import sqlite3  # https://docs.python.org/3.5/library/sqlite3.html

with sqlite3.connect("917.db") as conn:
for row in conn.execute("""
SELECT DISTINCT ppos
FROM move JOIN pos ON mto = pnum
WHERE pcensus = 18 and pmin < pmax
"""):
print(row[0])

As written here, this query runs for 1193 minutes (just short of 20 hours).  If 
I remove the "DISTINCT" and instead pipe the result into the sort program that 
comes with Linux "sort --unique" the query and sort takes only 31 minutes.  The 
results are the same, and consist of 4.2 million rows.

This seems extreme.

--
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-01 Thread Kevin O'Gorman
On Wed, Feb 1, 2017 at 6:35 PM, Richard Hipp  wrote:

> On 2/1/17, Kevin O'Gorman  wrote:
> > I have a database of positions and moves in a strategic game, and I'm
> > searching for unsolved positions that have been connected to an immediate
> > ancestor.  I'm using Python 3.5.2, and the code looks like
>
> Please provide us with the following additional information:
>
> (1) In python, run the query: "SELECT sqlite_version(),
> sqlite_source_id();"
>
> (2) In a recent sqlite3 command-line shell (the latest release, not
> whatever 5-year-old release happens to be installed on your system)
> bring up your database and run the command:
>
>  .fullschema --indent
>
> And send in the output.
>
> (3) Download the bundle of command-line tools for your OS, then run
> the command "sqlite3_analyzer" on your database, and send in the
> output.
>
> Thanks.
>
>
>
I am unable to comply with items 2 and 3.  I can download the linux x86
versions, which I expected would run on my x86-64 system, but they don't.
Instead, even when I point right at them, they report "No such file or
directory".  I take this to mean that there is some file they do not find,
like a library, and they report the error code in their return status.

However, my "recent" software reports:
 SQLite version 3.11.0 2016-02-15 17:29:24
Enter ".help" for usage hints.
sqlite> .fullschema --indent
Usage: .fullschema
sqlite> .fullschema
CREATE TABLE base64 (
b64char CHAR NOT NULL PRIMARY KEY,
b64val  INTEGER);
CREATE TABLE pos (
pnum INTEGER PRIMARY KEY AUTOINCREMENT,
ppos CHAR(64) NOT NULL,
pcensus INTEGER NOT NULL,
pscore INTEGER,
pstate CHAR DEFAULT "N" NOT NULL,
pmin INTEGER DEFAULT -99 NOT NULL,
pmax INTEGER DEFAULT 99 NOT NULL,
pmain CHAR(64));
CREATE UNIQUE INDEX pipos ON pos (ppos);
CREATE TABLE move (
mfrom INTEGER NOT NULL,
mto   INTEGER NOT NULL,
mtype CHAR NOT NULL,
mcell INTEGER NOT NULL,
mvalue INTEGER,
ma INTEGER DEFAULT -99,
mb INTEGER DEFAULT 99,
PRIMARY KEY (mfrom, mto, mcell));
CREATE UNIQUE INDEX mrev ON move (mto, mfrom, mcell);
CREATE TABLE expanded (
census INTEGER NOT NULL,
number INTEGER NOT NULL,
pos CHAR(64),
PRIMARY KEY (census, number));
ANALYZE sqlite_master;
ANALYZE sqlite_master;
INSERT INTO sqlite_stat1 VALUES('move','mrev','48329866 2 2 1');
INSERT INTO sqlite_stat1 VALUES('move','sqlite_autoindex_move_1','48329866
38 2 1');
INSERT INTO sqlite_stat1 VALUES('pos','pipos','74409802 1');
INSERT INTO sqlite_stat1 VALUES('base64','sqlite_autoindex_base64_1','64
1');
ANALYZE sqlite_master;
sqlite>

The analyzer is not included in my distribution or its repositiories, as
far as I can tell.  This is Xubuntu, which is a flavor of Ubuntu, which is
derived from Debian.

I'm not sure I want to build your entire software suite.  Perhaps you'd
care to download my database, which I freshly tar-ed and gzip-ed to
http://kosmanor.com/917/917.db.tgz
the databse is 21 GB; the tar is 3.1 GB

> >
> > #!/usr/bin/env python3
> > """Output positions that are reachable but unsolved at census 18 or
> greater
> > See page 76 of Qubic log
> >
> > Last Modified: Tue Jan 31 12:13:07 PST 2017
> > """
> >
> > import sqlite3  # https://docs.python.org/3.5/
> library/sqlite3.html
> >
> > with sqlite3.connect("917.db") as conn:
> > for row in conn.execute("""
> > SELECT DISTINCT ppos
> > FROM move JOIN pos ON mto = pnum
> > WHERE pcensus = 18 and pmin < pmax
> > """):
> > print(row[0])
> >
> > As written here, this query runs for 1193 minutes (just short of 20
> > hours).  If I remove the "DISTINCT" and instead pipe the result into the
> > sort program that comes with Linux "sort --unique" the query and sort
> takes
> > only 31 minutes.  The results are the same, and consist of 4.2 million
> rows.
> >
> > This seems extreme.
> >
> > --
> > word of the year: *kakistocracy*
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
>



-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-01 Thread Richard Hipp
On 2/1/17, Kevin O'Gorman  wrote:
> I have a database of positions and moves in a strategic game, and I'm
> searching for unsolved positions that have been connected to an immediate
> ancestor.  I'm using Python 3.5.2, and the code looks like

Please provide us with the following additional information:

(1) In python, run the query: "SELECT sqlite_version(), sqlite_source_id();"

(2) In a recent sqlite3 command-line shell (the latest release, not
whatever 5-year-old release happens to be installed on your system)
bring up your database and run the command:

 .fullschema --indent

And send in the output.

(3) Download the bundle of command-line tools for your OS, then run
the command "sqlite3_analyzer" on your database, and send in the
output.

Thanks.


>
> #!/usr/bin/env python3
> """Output positions that are reachable but unsolved at census 18 or greater
> See page 76 of Qubic log
>
> Last Modified: Tue Jan 31 12:13:07 PST 2017
> """
>
> import sqlite3  # https://docs.python.org/3.5/library/sqlite3.html
>
> with sqlite3.connect("917.db") as conn:
> for row in conn.execute("""
> SELECT DISTINCT ppos
> FROM move JOIN pos ON mto = pnum
> WHERE pcensus = 18 and pmin < pmax
> """):
> print(row[0])
>
> As written here, this query runs for 1193 minutes (just short of 20
> hours).  If I remove the "DISTINCT" and instead pipe the result into the
> sort program that comes with Linux "sort --unique" the query and sort takes
> only 31 minutes.  The results are the same, and consist of 4.2 million rows.
>
> This seems extreme.
>
> --
> word of the year: *kakistocracy*
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users