EXPLAIN dumps SQLite "machine code". Use EXPLAIN QUERY PLAN for the human 
readable version.

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Kevin O'Gorman
Gesendet: Freitag, 03. Februar 2017 00:27
An: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Betreff: Re: [sqlite] "DISTINCT" makes a query take 37 times as long

When I read this, it seemed like it made sense.  The thing is, it does not 
match up with reality.

First, the analysis of what happens when I pipe the results to 'sort'
misses the fact that the sort process executes within the 31 minutes of that 
version.  It would not make a dent in the time of the slow version.

But the big thing is that I took a look at EXPLAIN QUERY PLAN using this
script:
#!/usr/bin/env
python3

"""Output positions that are reachable but unsolved at census 18 See page 76 of 
Qubic log

Last Modified: Thu Feb  2 07:46:03 PST 2017 """

import sqlite3          # https://docs.python.org/3.5/library/sqlite3.html

with sqlite3.connect("917.db") as conn:
    print("BEOFRE ANALYZE")
    for row in conn.execute("""
                EXPLAIN
                SELECT DISTINCT ppos
                FROM move JOIN pos ON mto = pnum
                WHERE pcensus = 18 and pmin < pmax
            """):
        print(row)
    print()
    print()
    conn.execute("ANALYZE")
    print("AFTER ANALYZE")
    for row in conn.execute("""
                EXPLAIN
                SELECT DISTINCT ppos
                FROM move JOIN pos ON mto = pnum
                WHERE pcensus = 18 and pmin < pmax
            """):
        print(row)

and after waiting most of the day for the analyze to finish, I got two 
identical query plans, neither of which I could decipher:
BEFORE
ANALYZE

(0, 'Init', 0, 25, 0, '', '00', None)
(1, 'Null', 1, 7, 0, '', '08', None)
(2, 'OpenRead', 1, 4, 0, '7', '00', None) (3, 'OpenRead', 3, 6, 0, 'k(1,)', 
'00', None) (4, 'OpenRead', 4, 11, 0, 'k(3,,,)', '02', None) (5, 'Rewind', 3, 
21, 1, '0', '00', None) (6, 'Seek', 3, 0, 1, '', '00', None) (7, 'Column', 1, 
2, 1, '', '00', None) (8, 'Ne', 2, 20, 1, '(BINARY)', '54', None) (9, 'Column', 
1, 5, 3, '-99', '00', None) (10, 'Column', 1, 6, 4, '99', '00', None) (11, 
'Ge', 4, 20, 3, '(BINARY)', '53', None) (12, 'IdxRowid', 3, 5, 0, '', '00', 
None) (13, 'SeekGE', 4, 20, 5, '1', '00', None) (14, 'IdxGT', 4, 20, 5, '1', 
'00', None) (15, 'Column', 3, 0, 6, '', '00', None) (16, 'Eq', 6, 19, 7, 
'(BINARY)', '80', None) (17, 'Copy', 6, 7, 0, '', '00', None) (18, 'ResultRow', 
6, 1, 0, '', '00', None) (19, 'Next', 4, 14, 0, '', '00', None) (20, 'Next', 3, 
6, 0, '', '01', None) (21, 'Close', 1, 0, 0, '', '00', None) (22, 'Close', 3, 
0, 0, '', '00', None) (23, 'Close', 4, 0, 0, '', '00', None) (24, 'Halt', 0, 0, 
0, '', '00', None) (25, 'Transaction', 0, 0, 155, '0', '01', None) (26, 
'TableLock', 0, 4, 0, 'pos', '00', None) (27, 'TableLock', 0, 7, 0, 'move', 
'00', None) (28, 'Integer', 18, 2, 0, '', '00', None) (29, 'Goto', 0, 1, 0, '', 
'00', None)


AFTER ANALYZE
(0, 'Init', 0, 25, 0, '', '00', None)
(1, 'Null', 1, 7, 0, '', '08', None)
(2, 'OpenRead', 1, 4, 0, '7', '00', None) (3, 'OpenRead', 3, 6, 0, 'k(1,)', 
'00', None) (4, 'OpenRead', 4, 11, 0, 'k(3,,,)', '02', None) (5, 'Rewind', 3, 
21, 1, '0', '00', None) (6, 'Seek', 3, 0, 1, '', '00', None) (7, 'Column', 1, 
2, 1, '', '00', None) (8, 'Ne', 2, 20, 1, '(BINARY)', '54', None) (9, 'Column', 
1, 5, 3, '-99', '00', None) (10, 'Column', 1, 6, 4, '99', '00', None) (11, 
'Ge', 4, 20, 3, '(BINARY)', '53', None) (12, 'IdxRowid', 3, 5, 0, '', '00', 
None) (13, 'SeekGE', 4, 20, 5, '1', '00', None) (14, 'IdxGT', 4, 20, 5, '1', 
'00', None) (15, 'Column', 3, 0, 6, '', '00', None) (16, 'Eq', 6, 19, 7, 
'(BINARY)', '80', None) (17, 'Copy', 6, 7, 0, '', '00', None) (18, 'ResultRow', 
6, 1, 0, '', '00', None) (19, 'Next', 4, 14, 0, '', '00', None) (20, 'Next', 3, 
6, 0, '', '01', None) (21, 'Close', 1, 0, 0, '', '00', None) (22, 'Close', 3, 
0, 0, '', '00', None) (23, 'Close', 4, 0, 0, '', '00', None) (24, 'Halt', 0, 0, 
0, '', '00', None) (25, 'Transaction', 0, 0, 155, '0', '01', None) (26, 
'TableLock', 0, 4, 0, 'pos', '00', None) (27, 'TableLock', 0, 7, 0, 'move', 
'00', None) (28, 'Integer', 18, 2, 0, '', '00', None) (29, 'Goto', 0, 1, 0, '', 
'00', None) Maybe somebody can explain them to me, but it doesn't really matter 
whether I ever understand them.
Perhaps Mr. Hipp can make use of them.

Absent some flaw in the above script, I think I'm done with this.  I have a 
solution that works for me, and I'd just as soon get back to my real task.  I 
just wanted to give feedback in case it would be useful.
That's how i say thanks for a really useful product.

Thanks. Mr. Hipp, and anyone else that has contributed to this product.

++ kevin


On Thu, Feb 2, 2017 at 12:27 AM, Hick Gunter <h...@scigames.at> wrote:

> DISTINCT forces the query optimizer to create an intermediate table to
> hold the results and compare each row of the non-distinct result set
> with an automatically created index. It may also affect the query plan
> in a way that chooses inefficient indices, which is more likely if you
> have not run ANALYZE on the fully loaded database.
>
> Using a 3 stage pipe instead you additionally have more CPUs (1
> running the query, 1 or more sorting the results) working in paralell.
>
> Try EXPLAIN QUERY PLAN to see what the query planner is doing.
>
> -----Ursprüngliche Nachricht-----
> Von: sqlite-users
> [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> Im Auftrag von Kevin O'Gorman
> Gesendet: Donnerstag, 02. Februar 2017 03:28
> An: sqlite-users <sqlite-users@mailinglists.sqlite.org>
> Betreff: [sqlite] "DISTINCT" makes a query take 37 times as long
>
> I have a database of positions and moves in a strategic game, and I'm
> searching for unsolved positions that have been connected to an
> immediate ancestor.  I'm using Python 3.5.2, and the code looks like
>
> #!/usr/bin/env python3
> """Output positions that are reachable but unsolved at census 18 or
> greater See page 76 of Qubic log
>
> Last Modified: Tue Jan 31 12:13:07 PST 2017 """
>
> import sqlite3          # https://docs.python.org/3.5/library/sqlite3.html
>
> with sqlite3.connect("917.db") as conn:
>     for row in conn.execute("""
>                 SELECT DISTINCT ppos
>                 FROM move JOIN pos ON mto = pnum
>                 WHERE pcensus = 18 and pmin < pmax
>             """):
>         print(row[0])
>
> As written here, this query runs for 1193 minutes (just short of 20
> hours).  If I remove the "DISTINCT" and instead pipe the result into
> the sort program that comes with Linux "sort --unique" the query and
> sort takes only 31 minutes.  The results are the same, and consist of 4.2 
> million rows.
>
> This seems extreme.
>
> --
> word of the year: *kakistocracy*
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___________________________________________
>  Gunter Hick
> Software Engineer
> Scientific Games International GmbH
> FN 157284 a, HG Wien
> Klitschgasse 2-4, A-1130 Vienna, Austria
> Tel: +43 1 80100 0
> E-Mail: h...@scigames.at
>
> This communication (including any attachments) is intended for the use
> of the intended recipient(s) only and may contain information that is
> confidential, privileged or legally protected. Any unauthorized use or
> dissemination of this communication is strictly prohibited. If you
> have received this communication in error, please immediately notify
> the sender by return e-mail message and delete all copies of the
> original communication. Thank you for your cooperation.
>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



--
word of the year: *kakistocracy*
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to