Hi Frank, On Tue, May 26, 2020 at 9:23 AM Frank Millman <fr...@chagford.com> wrote:
> Hi all > > I have a SELECT that runs over 5 times slower on PostgreSQL compared > with Sql Server and sqlite3. I am trying to understand why. > > I have a table that looks like this (simplified) - > > CREATE TABLE my_table ( > row_id SERIAL PRIMARY KEY, > deleted_id INT DEFAULT 0, > fld_1 INT REFERENCES table_1(row_id), > fld_2 INT REFERENCES table_2(row_id), > fld_3 INT REFERENCES table_3(row_id), > fld_4 INT REFERENCES table_4(row_id), > tran_date DATE, > tran_total DEC(21,2) > ); > > CREATE UNIQUE INDEX my_table_ndx ON my_table (fld_1, fld_2, fld_3, > fld_4, tran_date) WHERE deleted_id = 0; > > The table sizes are - > my_table : 167 rows > table_1 : 21 rows > table_2 : 11 rows > table_3 : 3 rows > table_4 : 16 rows > > Therefore for each tran_date in my_table there are potentially > 21x11x3x16 = 11088 rows. Most will be null. > > I want to select the row_id for the last tran_date for each of those > potential groups. This is my select - > > SELECT ( > SELECT a.row_id FROM my_table a > WHERE a.fld_1 = b.row_id > AND a.fld_2 = c.row_id > AND a.fld_3 = d.row_id > AND a.fld_4 = e.row_id > AND a.deleted_id = 0 > ORDER BY a.tran_date DESC LIMIT 1 > ) > FROM table_1 b, table_2 c, table_3 d, table_4 e > > Out of 11088 rows selected, 103 are not null. > > On identical data, this takes 0.06 sec on SQL Server, 0.04 sec on > sqlite3, and 0.31 sec on PostgreSQL. > SQL Server does a good job at caching data in memory. PostgreSQL does too on consecutive calls to the same table. What execution time do you get if you issue the query a second time? My first guess would be to add an index on my_table.tran_date and check in EXPLAIN that you don't have a SEQUENTIAL SCAN on that table. > I have looked at the EXPLAIN, but I don't really know what to look for. > I can supply it if that would help. > > Thanks for any advice. > > Frank Millman > > -- Olivier Gautherot