Re: [PERFORM] Querying with multicolumn index
På fredag 09. desember 2016 kl. 18:00:16, skrev Eric Jiang <e...@doublemap.com>:

Hi all, I have a query that I *think* should use a multicolumn index, but sometimes isn't, resulting in slow queries.

We have a DB that records GPS coordinates for vehicles:

                             Table "public.updates"
   Column   |           Type           |                      Modifiers
------------+--------------------------+------------------------------------------------------
 id         | integer                  | not null default nextval('updates_id_seq'::regclass)
 driver_id  | integer                  | not null
 latitude   | double precision         | not null
 longitude  | double precision         | not null
 time       | timestamp with time zone | not null default now()
 vehicle_id | integer                  |
Indexes:
    "updates_pkey" PRIMARY KEY, btree (id)
    "ix_updates_time" btree ("time")
    "updates_driver_id_time_idx" btree (driver_id, "time")
    "updates_vehicle_id_time_idx" btree (vehicle_id, "time")

Table has about 15M records across 100 distinct driver_id.

I want to get the last record for a specific driver:

    SELECT * FROM updates WHERE driver_id=123 ORDER BY "time" DESC LIMIT 1;

For some values of driver_id, it does what I expect and uses updates_driver_id_time_idx to fetch the records in 2 ms or less. For other values of driver_id, it does an index scan backwards on ix_updates_time, taking upwards of 2 minutes.

Good plan:

 Limit  (cost=0.11..1.38 rows=1 width=56) (actual time=2.710..2.710 rows=1 loops=1)
   ->  Index Scan Backward using updates_driver_id_time_idx on updates  (cost=0.11..139278.28 rows=110051 width=56) (actual time=2.709..2.709 rows=1 loops=1)
         Index Cond: (driver_id = 17127)
 Total runtime: 2.732 ms
(4 rows)

Bad plan:

 Limit  (cost=0.09..0.69 rows=1 width=56) (actual time=216769.111..216769.112 rows=1 loops=1)
   ->  Index Scan Backward using ix_updates_time on updates  (cost=0.09..272339.04 rows=448679 width=56) (actual time=216769.110..216769.110 rows=1 loops=1)
         Filter: (driver_id = 30132)
         Rows Removed by Filter: 5132087
 Total runtime: 216769.174 ms

From cursory testing, the difference seems to be based on how many total rows there are for a particular driver.
The above query uses updates_driver_id_time_idx for drivers with fewer than about 300K rows, but uses ix_updates_time for drivers with more than about 300K rows. Is there anything we can do to make it do the "right" thing? We are also considering denormalizing the data and keeping a "cache" of the same data in another table.

pgsql version: 9.3.14 and 9.5.3, already tried VACUUM ANALYZE.

Thanks, Eric

You should have this index:

    create index updates_driver_time_idx ON updates(driver_id, "time" DESC);

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
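A concrete sketch of the suggested fix (table and column names are from the thread; whether the planner chooses this index for every driver_id still depends on its cost estimates):

```sql
-- Suggested index: matches the ORDER BY direction exactly, so the newest
-- row for a given driver is the first index entry visited.
CREATE INDEX updates_driver_time_idx ON updates (driver_id, "time" DESC);

-- The query can then stop after a single index entry, regardless of how
-- many rows the driver has:
SELECT *
FROM updates
WHERE driver_id = 123
ORDER BY "time" DESC
LIMIT 1;
```

Note that the existing (driver_id, "time") index scanned backward is logically equivalent; the DESC variant mainly removes the need for a backward scan.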
Re: [PERFORM] Tuning one Recursive CTE
På onsdag 09. november 2016 kl. 15:30:20, skrev Henrik Ekenberg <hen...@ekenberg.pw>:

Hi, I will need to anonymize it before sending it. Do you know if there are any tuning documents related to CTE scans?

You might want to read these:

http://blog.2ndquadrant.com/postgresql-ctes-are-optimization-fences/
https://robots.thoughtbot.com/advanced-postgres-performance-tips#common-table-expressions-and-subqueries
https://www.postgresql.org/message-id/CAPo4y_XUJR1sijvTySy9W%2BShpORwzbhSdEzE9pgtc1%3DcTkvpkw%40mail.gmail.com

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
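For context, a minimal sketch of the "optimization fence" the first link describes (the table and column names here are hypothetical, not from the thread):

```sql
-- In PostgreSQL 9.x a CTE is an optimization fence: the planner evaluates
-- it in full before the outer WHERE can filter it.
WITH recent AS (
    SELECT * FROM activity_log          -- hypothetical table
)
SELECT * FROM recent WHERE user_id = 42;  -- predicate is NOT pushed into the CTE

-- Rewriting the CTE as an inline subquery lets the planner push the
-- predicate down and use an index on user_id:
SELECT *
FROM (SELECT * FROM activity_log) AS recent
WHERE user_id = 42;
```

(PostgreSQL 12 later made non-recursive CTEs inlinable by default, with MATERIALIZED to restore the fence; recursive CTEs, as in this thread, remain fences.)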
Re: [PERFORM] Tuning one Recursive CTE
På onsdag 09. november 2016 kl. 14:05:55, skrev Henrik Ekenberg <hen...@ekenberg.pw>:

Hi, I'm trying to tune a recursive CTE. The explain plan can be found here: https://explain.depesz.com/s/yLVd Can anyone give me a direction to check? //H.

Rule number one: always provide the query in question when asking for help tuning it.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] Very poor performance with Nested Loop Anti Join
På tirsdag 02. august 2016 kl. 01:15:05, skrev Tom Lane <t...@sss.pgh.pa.us>:

Andreas Joseph Krogh <andr...@visena.com> writes:
> This query performs terribly slow (~26 minutes, 1561346.597 ms):

Seems like the key misestimation is on the inner antijoin:

> ->  Hash Anti Join  (cost=654.21..4008.72 rows=1 width=8) (actual time=9.016..40.672 rows=76174 loops=1)
>       Hash Cond: (il.invoice_id = creditnote.credit_against)
>       ->  Seq Scan on onp_crm_invoice_line il  (cost=0.00..3062.01 rows=78001 width=8) (actual time=0.005..11.259 rows=78614 loops=1)
>       ->  Hash  (cost=510.56..510.56 rows=11492 width=8) (actual time=8.940..8.940 rows=372 loops=1)
>             Buckets: 16384  Batches: 1  Memory Usage: 143kB
>             ->  Seq Scan on onp_crm_invoice creditnote  (cost=0.00..510.56 rows=11492 width=8) (actual time=0.014..7.882 rows=11507 loops=1)
>                   Filter: ((sent_date <= '2016-06-27'::date) AND ((status_key)::text = 'INVOICE_STATUS_INVOICED'::text))
>                   Rows Removed by Filter: 149

If it realized that this produces 78k rows, not 1, it'd likely do something smarter at the outer antijoin. I have no idea why that estimate's so far off, though. What PG version is this? Stats all up to date on these two tables?

Sorry for not providing the PG version; this is on 9.5.3. All stats are up to date, or should be, as I've analyzed everything manually.

Are the rows excluded by the filter condition on "creditnote" significantly different from the rest of that table?
This happens also without the filter-cond:

explain analyze
SELECT log.relation_id as company_id
     , sum(log.duration) AS duration
FROM onp_crm_activity_log log
    JOIN onp_crm_person logfor
        ON logfor.onp_user_id = log.logged_for AND logfor.is_resource = FALSE
WHERE 1 = 1
      -- Filter out already invoiced
      AND NOT EXISTS(
          SELECT * FROM onp_crm_calendarentry_invoice_membership cem
              JOIN onp_crm_invoice_line il ON cem.invoice_line_id = il.id
              JOIN onp_crm_invoice inv ON il.invoice_id = inv.entity_id
          WHERE cem.calendar_entry_id = log.id
                AND NOT EXISTS(
                    SELECT * FROM onp_crm_invoice creditnote
                    WHERE il.invoice_id = creditnote.credit_against
                )
      )
GROUP BY log.relation_id;

                                                                 QUERY PLAN
 HashAggregate  (cost=12049.35..12067.11 rows=1421 width=12) (actual time=1386683.646..1386683.858 rows=720 loops=1)
   Group Key: log.relation_id
   ->  Nested Loop Anti Join  (cost=512.08..12039.32 rows=2006 width=12) (actual time=395.017..1386576.756 rows=93480 loops=1)
         Join Filter: (cem.calendar_entry_id = log.id)
         Rows Removed by Join Filter: 12185913244
         ->  Hash Join  (cost=86.56..9757.61 rows=2006 width=20) (actual time=0.165..366.778 rows=181872 loops=1)
               Hash Cond: (log.logged_for = logfor.onp_user_id)
               ->  Seq Scan on onp_crm_activity_log log  (cost=0.00..8957.45 rows=184945 width=24) (actual time=0.003..256.862 rows=182606 loops=1)
               ->  Hash  (cost=39.46..39.46 rows=3768 width=8) (actual time=0.132..0.132 rows=36 loops=1)
                     Buckets: 4096  Batches: 1  Memory Usage: 34kB
                     ->  Bitmap Heap Scan on onp_crm_person logfor  (cost=3.69..39.46 rows=3768 width=8) (actual time=0.033..0.125 rows=36 loops=1)
                           Recheck Cond: (onp_user_id IS NOT NULL)
                           Filter: (NOT is_resource)
                           Rows Removed by Filter: 5
                           Heap Blocks: exact=10
                           ->  Bitmap Index Scan on onp_crm_person_onp_id_idx  (cost=0.00..2.75 rows=41 width=0) (actual time=0.017..0.017 rows=41 loops=1)
         ->  Materialize  (cost=425.53..2251.62 rows=1 width=4) (actual time=0.000..2.544 rows=67003 loops=181872)
               ->  Nested Loop  (cost=425.53..2251.61 rows=1 width=4) (actual time=3.283..320.057 rows=88511 loops=1)
                     ->  Nested Loop  (cost=425.24..2251.30 rows=1 width=8) (actual time=3.241..154.783 rows=88511 loops=1)
                           ->  Hash Anti Join  (cost=424.95..2250.75 rows=1 width=8) (actual time=3.110..30.097 rows=76281 loops=1)
                                 Hash Cond: (il.invoice_id = creditnote.credit_against)
                                 ->  Index Only Scan using origo_invoice_line_id_invoice_idx on onp_crm_invoice_line il  (cost=0.29..1530.95 rows=78707 width=8) (actual time=0.030..13.719 rows=78740 loops=1)
                                       Heap Fetches: 2967
                                 ->  Hash  (cost=278.22..278.22 rows=11715 width=8) (actual time=3.003..3.003 rows=376 loops=1)
                                       Buckets: 16384  Batches: 1  Memory Usage: 143kB
                                       ->  Index Only Scan using origo_invoice_credit_against_idx on onp_crm_invoice creditnote  (cost=0.29..278.22 rows=11715 width=8) (actual time=0.042..2.082 rows=11692 loops=1)
                                             Heap Fetches: 1151
                           ->  Index Only Scan using onp_crm_calendarentry_invoice_invoice_line_id_calendar_entr_key on onp_crm_calendarentry_invoice_membership
Re: [PERFORM] Very poor performance with Nested Loop Anti Join
På mandag 01. august 2016 kl. 15:33:04, skrev Andreas Joseph Krogh <andr...@visena.com>:

I have this schema:

CREATE TABLE onp_crm_person(
    id serial PRIMARY KEY,
    onp_user_id bigint references onp_user(id) deferrable initially deferred,
    is_resource boolean not null default false,
    UNIQUE(onp_user_id)
);

CREATE TABLE onp_crm_activity_log(
    id bigserial PRIMARY KEY,
    relation_id integer REFERENCES onp_crm_relation(entity_id),
    logged_for int references onp_crm_person(onp_user_id),
    duration bigint
);

CREATE TABLE onp_crm_invoice(
    entity_id bigint PRIMARY KEY REFERENCES onp_crm_entity(entity_id),
    status_key VARCHAR NOT NULL,
    credit_against bigint REFERENCES onp_crm_invoice(entity_id),
    sent_date DATE,
    UNIQUE(credit_against) deferrable INITIALLY DEFERRED
    -- invoice_print_template_id is added after creation of origo_invoice_print_template
);

CREATE TABLE onp_crm_invoice_line (
    id SERIAL PRIMARY KEY,
    invoice_id INTEGER NOT NULL REFERENCES onp_crm_invoice (entity_id)
);

CREATE TABLE onp_crm_calendarentry_invoice_membership(
    invoice_line_id INTEGER NOT NULL REFERENCES onp_crm_invoice_line(id) ON DELETE CASCADE,
    calendar_entry_id INTEGER NOT NULL REFERENCES onp_crm_activity_log(id),
    unique (invoice_line_id, calendar_entry_id) DEFERRABLE INITIALLY DEFERRED
);

This query performs terribly slow (~26 minutes, 1561346.597 ms):

explain analyze
SELECT log.relation_id as company_id
     , sum(log.duration) AS duration
FROM onp_crm_activity_log log
    JOIN onp_crm_person logfor
        ON logfor.onp_user_id = log.logged_for AND logfor.is_resource = FALSE
WHERE 1 = 1
      -- Filter out already invoiced before 2016-06-27
      AND NOT EXISTS(
          SELECT * FROM onp_crm_calendarentry_invoice_membership cem
              JOIN onp_crm_invoice_line il ON cem.invoice_line_id = il.id
              JOIN onp_crm_invoice inv ON il.invoice_id = inv.entity_id
          WHERE cem.calendar_entry_id = log.id
                AND inv.status_key = 'INVOICE_STATUS_INVOICED'
                AND inv.sent_date <= '2016-06-27'
                AND NOT EXISTS(
                    SELECT * FROM onp_crm_invoice creditnote
                    WHERE il.invoice_id = creditnote.credit_against
                          AND creditnote.status_key = 'INVOICE_STATUS_INVOICED'
                          AND creditnote.sent_date <= '2016-06-27'
                )
      )
GROUP BY log.relation_id;

Explain output:

                                                                 QUERY PLAN
 HashAggregate  (cost=13778.63..13796.39 rows=1421 width=12) (actual time=1561343.861..1561344.042 rows=724 loops=1)
   Group Key: log.relation_id
   ->  Nested Loop Anti Join  (cost=741.35..13768.63 rows=2000 width=12) (actual time=471.973..1561221.929 rows=96095 loops=1)
         Join Filter: (cem.calendar_entry_id = log.id)
         Rows Removed by Join Filter: 11895758618
         ->  Hash Join  (cost=86.56..9729.03 rows=2000 width=20) (actual time=0.170..668.911 rows=181644 loops=1)
               Hash Cond: (log.logged_for = logfor.onp_user_id)
               ->  Seq Scan on onp_crm_activity_log log  (cost=0.00..8930.98 rows=184398 width=24) (actual time=0.007..538.893 rows=182378 loops=1)
               ->  Hash  (cost=39.46..39.46 rows=3768 width=8) (actual time=0.126..0.126 rows=36 loops=1)
                     Buckets: 4096  Batches: 1  Memory Usage: 34kB
                     ->  Bitmap Heap Scan on onp_crm_person logfor  (cost=3.69..39.46 rows=3768 width=8) (actual time=0.040..0.106 rows=36 loops=1)
                           Recheck Cond: (onp_user_id IS NOT NULL)
                           Filter: (NOT is_resource)
                           Rows Removed by Filter: 5
                           Heap Blocks: exact=10
                           ->  Bitmap Index Scan on onp_crm_person_onp_id_idx  (cost=0.00..2.75 rows=41 width=0) (actual time=0.019..0.019 rows=41 loops=1)
         ->  Materialize  (cost=654.79..4009.60 rows=1 width=4) (actual time=0.000..2.829 rows=65490 loops=181644)
               ->  Nested Loop  (cost=654.79..4009.59 rows=1 width=4) (actual time=9.056..386.835 rows=85668 loops=1)
                     ->  Nested Loop  (cost=654.50..4009.27 rows=1 width=8) (actual time=9.046..165.280 rows=88151 loops=1)
                           ->  Hash Anti Join  (cost=654.21..4008.72 rows=1 width=8) (actual time=9.016..40.672 rows=76174 loops=1)
                                 Hash Cond: (il.invoice_id = creditnote.credit_against)
                                 ->  Seq Scan on onp_crm_invoice_line il  (cost=0.00..3062.01 rows=78001 width=8) (actual time=0.005..11.259 rows=78614 loops=1)
                                 ->  Hash  (cost=510.56..510.56 rows=1149
[PERFORM] Very poor performance with Nested Loop Anti Join
onp_crm_invoice_line il  (cost=0.00..3062.01 rows=78001 width=8) (actual time=0.008..12.476 rows=78622 loops=1)
                                 ->  Hash  (cost=510.56..510.56 rows=11492 width=8) (actual time=12.477..12.477 rows=372 loops=1)
                                       Buckets: 16384  Batches: 1  Memory Usage: 143kB
                                       ->  Seq Scan on onp_crm_invoice creditnote  (cost=0.00..510.56 rows=11492 width=8) (actual time=0.008..10.963 rows=11507 loops=1)
                                             Filter: ((sent_date <= '2016-06-27'::date) AND ((status_key)::text = 'INVOICE_STATUS_INVOICED'::text))
                                             Rows Removed by Filter: 151
 Planning time: 3.510 ms
 Execution time: 338.349 ms
(39 rows)

So my question is: is there something I can do to make PG favor a Hash Anti Join instead of a Nested Loop Anti Join (which I assume is the problem)? Can the nested NOT EXISTS be rewritten to be more performant?

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
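One common workaround for this kind of anti-join misestimation (a sketch, not proposed in the thread) is to materialize the inner set in a temp table and ANALYZE it, so the planner sees the real row count instead of its rows=1 guess:

```sql
-- Materialize the "already invoiced" calendar entries; table names are
-- from the thread, the temp-table approach itself is an assumption.
CREATE TEMP TABLE invoiced_entries AS
SELECT cem.calendar_entry_id
FROM onp_crm_calendarentry_invoice_membership cem
    JOIN onp_crm_invoice_line il ON cem.invoice_line_id = il.id
    JOIN onp_crm_invoice inv ON il.invoice_id = inv.entity_id
WHERE inv.status_key = 'INVOICE_STATUS_INVOICED'
      AND inv.sent_date <= '2016-06-27'
      AND NOT EXISTS (
          SELECT 1 FROM onp_crm_invoice creditnote
          WHERE il.invoice_id = creditnote.credit_against
                AND creditnote.status_key = 'INVOICE_STATUS_INVOICED'
                AND creditnote.sent_date <= '2016-06-27'
      );

ANALYZE invoiced_entries;  -- accurate stats -> sane anti-join strategy

SELECT log.relation_id AS company_id, sum(log.duration) AS duration
FROM onp_crm_activity_log log
    JOIN onp_crm_person logfor
        ON logfor.onp_user_id = log.logged_for AND logfor.is_resource = FALSE
WHERE NOT EXISTS (
          SELECT 1 FROM invoiced_entries ie
          WHERE ie.calendar_entry_id = log.id
      )
GROUP BY log.relation_id;
```

With an accurate row count on invoiced_entries, the planner should prefer a hashed anti-join over the nested loop that dominated the 26-minute run.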
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
På mandag 21. mars 2016 kl. 16:13:07, skrev Oleg Bartunov <obartu...@gmail.com>:

On Mon, Mar 21, 2016 at 5:41 PM, Andreas Joseph Krogh <andr...@visena.com> wrote:

På lørdag 19. mars 2016 kl. 03:44:55, skrev Jeff Janes <jeff.ja...@gmail.com>:

On Wed, Mar 16, 2016 at 6:53 AM, Andreas Joseph Krogh <andr...@visena.com> wrote:

På onsdag 16. mars 2016 kl. 14:37:27, skrev Tom Lane <t...@sss.pgh.pa.us>:

Andreas Joseph Krogh <andr...@visena.com> writes:
> 1. Why isn't folder_id part of the index-cond?
Because a GIN index is useless for sorting.
> 2. Is there a way to make it use the (same) index to sort by received_timestamp?
No.
> 3. Using a GIN-index, is there a way to use the index at all for sorting?
No.
> 4. It doesn't seem like ts_rank uses the index for sorting either.
Same reason.

regards, tom lane

So it's basically impossible to use FTS/GIN with sorting on large datasets? Are there any plans to improve this situation?

I don't see why it would not be possible to create a new execution node type that does an index scan to obtain order (or just to satisfy an equality or range expression), and takes a bitmap (as produced by the FTS/GIN) to apply as a filter. But I don't know of anyone planning on doing that.

Isn't this what Postgres Pro are planning? http://postgrespro.com/roadmap/mssearch

"Unlike external special-purpose search engines, a full-text search engine built in a DBMS is capable of combining full-text and attributive search criteria in SQL query syntax. It is planned to improve the existing PostgreSQL full-text search engine by extending the functionality of Generalized Inverted Index (GIN) to make it capable of storing extra information required for ranging query results. This search acceleration will allow to go back from external full-text search engines, thus facilitating system administration and use, reducing technology risks, and improving information security."

This is a different feature! Actually, we already have a prototype of what Jeff suggested; we called it bitmap filtering, but failed to find a use case where it provides benefits. Teodor will comment on this idea in more detail.

The feature I'm missing is the ability to do FTS (or use GIN in general) and then sort on some other column (also indexed by the same GIN-index, using the btree_gin extension), often of type BIGINT or TIMESTAMP. Are you planning to work on such a feature for GIN?

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
På lørdag 19. mars 2016 kl. 03:44:55, skrev Jeff Janes <jeff.ja...@gmail.com>:

On Wed, Mar 16, 2016 at 6:53 AM, Andreas Joseph Krogh <andr...@visena.com> wrote:

På onsdag 16. mars 2016 kl. 14:37:27, skrev Tom Lane <t...@sss.pgh.pa.us>:

Andreas Joseph Krogh <andr...@visena.com> writes:
> 1. Why isn't folder_id part of the index-cond?
Because a GIN index is useless for sorting.
> 2. Is there a way to make it use the (same) index to sort by received_timestamp?
No.
> 3. Using a GIN-index, is there a way to use the index at all for sorting?
No.
> 4. It doesn't seem like ts_rank uses the index for sorting either.
Same reason.

regards, tom lane

So it's basically impossible to use FTS/GIN with sorting on large datasets? Are there any plans to improve this situation?

I don't see why it would not be possible to create a new execution node type that does an index scan to obtain order (or just to satisfy an equality or range expression), and takes a bitmap (as produced by the FTS/GIN) to apply as a filter. But I don't know of anyone planning on doing that.

Isn't this what Postgres Pro are planning? http://postgrespro.com/roadmap/mssearch

"Unlike external special-purpose search engines, a full-text search engine built in a DBMS is capable of combining full-text and attributive search criteria in SQL query syntax. It is planned to improve the existing PostgreSQL full-text search engine by extending the functionality of Generalized Inverted Index (GIN) to make it capable of storing extra information required for ranging query results. This search acceleration will allow to go back from external full-text search engines, thus facilitating system administration and use, reducing technology risks, and improving information security."
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
På onsdag 16. mars 2016 kl. 14:37:27, skrev Tom Lane <t...@sss.pgh.pa.us>:

Andreas Joseph Krogh <andr...@visena.com> writes:
> 1. Why isn't folder_id part of the index-cond?
Because a GIN index is useless for sorting.
> 2. Is there a way to make it use the (same) index to sort by received_timestamp?
No.
> 3. Using a GIN-index, is there a way to use the index at all for sorting?
No.
> 4. It doesn't seem like ts_rank uses the index for sorting either.
Same reason.

regards, tom lane

So it's basically impossible to use FTS/GIN with sorting on large datasets? Are there any plans to improve this situation?

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
På onsdag 16. mars 2016 kl. 16:04:08, skrev Evgeniy Shishkin <itparan...@gmail.com>:

> On 16 Mar 2016, at 17:52, Evgeniy Shishkin <itparan...@gmail.com> wrote:
>
>> On 16 Mar 2016, at 16:37, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>
>> Andreas Joseph Krogh <andr...@visena.com> writes:
>>> 1. Why isn't folder_id part of the index-cond?
>>
>> Because a GIN index is useless for sorting.
>
> I don't see how GIN's inability to return sorted data relates to the index condition.
> In fact I tried to reproduce the example,
> and if I change folder_id to int from bigint, then the index condition with folder_id is used:
>
> Index Cond: ((fts_all @@ '''hi'''::tsquery) AND (folder_id = 1))

Looks like the documentation at http://www.postgresql.org/docs/9.5/static/btree-gin.html is lying about supporting the int8 type.

Hm, interesting! @Tom: Any idea why BIGINT doesn't work, but INTEGER does?

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
På onsdag 16. mars 2016 kl. 16:07:56, skrev Evgeniy Shishkin <itparan...@gmail.com>:

> On 16 Mar 2016, at 18:04, Evgeniy Shishkin <itparan...@gmail.com> wrote:
>
>> On 16 Mar 2016, at 17:52, Evgeniy Shishkin <itparan...@gmail.com> wrote:
>>
>>> On 16 Mar 2016, at 16:37, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>>
>>> Andreas Joseph Krogh <andr...@visena.com> writes:
>>>> 1. Why isn't folder_id part of the index-cond?
>>>
>>> Because a GIN index is useless for sorting.
>>
>> I don't see how GIN's inability to return sorted data relates to the index condition.
>> In fact I tried to reproduce the example,
>> and if I change folder_id to int from bigint, then the index condition with folder_id is used:
>>
>> Index Cond: ((fts_all @@ '''hi'''::tsquery) AND (folder_id = 1))
>
> Looks like the documentation at http://www.postgresql.org/docs/9.5/static/btree-gin.html is lying about supporting the int8 type.

Uh, it works if I cast to bigint explicitly:

WHERE del.fts_all @@ to_tsquery('simple', 'hi')
      AND del.folder_id = 1::bigint;

results in

Index Cond: ((folder_id = '1'::bigint) AND (fts_all @@ '''hi'''::tsquery))

Hm, this is quite cranky, but thanks for the heads-up!
Tho it looks like it works if prepared, without an explicit cast:

prepare fish AS
SELECT del.id
     , del.received_timestamp
FROM delivery del
WHERE 1 = 1
      AND del.fts_all @@ to_tsquery('simple', $1)
      AND del.folder_id = $2
ORDER BY del.received_timestamp DESC
LIMIT 101 OFFSET 0;

explain analyze execute fish('hi:*', 1);

                                                                 QUERY PLAN
 Limit  (cost=9.13..9.13 rows=1 width=16) (actual time=0.047..0.048 rows=2 loops=1)
   ->  Sort  (cost=9.13..9.13 rows=1 width=16) (actual time=0.045..0.045 rows=2 loops=1)
         Sort Key: received_timestamp DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on delivery del  (cost=7.80..9.12 rows=1 width=16) (actual time=0.034..0.034 rows=2 loops=1)
               Recheck Cond: ((fts_all @@ '''hi'':*'::tsquery) AND (folder_id = '1'::bigint))
               Heap Blocks: exact=1
               ->  Bitmap Index Scan on fts_idx  (cost=0.00..7.80 rows=1 width=0) (actual time=0.023..0.023 rows=2 loops=1)
                     Index Cond: ((fts_all @@ '''hi'':*'::tsquery) AND (folder_id = '1'::bigint))
 Execution time: 0.103 ms
(10 rows)

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
På onsdag 16. mars 2016 kl. 14:53:04, skrev Andreas Joseph Krogh <andr...@visena.com>:

På onsdag 16. mars 2016 kl. 14:37:27, skrev Tom Lane <t...@sss.pgh.pa.us>:

Andreas Joseph Krogh <andr...@visena.com> writes:
> 1. Why isn't folder_id part of the index-cond?
Because a GIN index is useless for sorting.
> 2. Is there a way to make it use the (same) index to sort by received_timestamp?
No.
> 3. Using a GIN-index, is there a way to use the index at all for sorting?
No.
> 4. It doesn't seem like ts_rank uses the index for sorting either.
Same reason.

regards, tom lane

So it's basically impossible to use FTS/GIN with sorting on large datasets? Are there any plans to improve this situation? Thanks.

These slides discuss ORDER BY optimizations for FTS (starting at slides 6 and 7):
http://www.sai.msu.su/~megera/postgres/talks/Next%20generation%20of%20GIN.pdf

This indicates some work is being done in this area. Oleg, if you're listening, do you guys have any exciting news regarding this?

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
På torsdag 17. mars 2016 kl. 18:20:23, skrev Tom Lane <t...@sss.pgh.pa.us>:

Evgeniy Shishkin <itparan...@gmail.com> writes:
> Uh, it works if i cast to bigint explicitly

FWIW, the reason for that is that the int8_ops operator class that btree_gin creates doesn't contain any cross-type operators. Probably wouldn't be that hard to fix if somebody wanted to put in the work.

regards, tom lane

Thanks for the info. Can you explain why it works when using a prepared statement without casting? Does the machinery then know the type, so the "setParameter" call uses the correct type?

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
[PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
Hi all.

I'm doing full-text-search and want to display the results in the order the articles were received (timestamp). I have millions of articles where the newest are the most interesting, and a search may match many articles, so doing the sort using some index would be great.

Take the following example schema:

create extension if not exists btree_gin;

drop table if EXISTS delivery;

create table delivery(
    id BIGSERIAL primary key,
    fts_all TSVECTOR not null,
    folder_id BIGINT NOT NULL,
    received_timestamp TIMESTAMP not null,
    message varchar not null
);

create index fts_idx ON delivery using gin(fts_all, folder_id);

CREATE OR REPLACE FUNCTION update_delivery_tsvector_tf() RETURNS TRIGGER AS
$$
BEGIN
    NEW.fts_all = to_tsvector('simple', NEW.message);
    return NEW;
END;
$$ LANGUAGE PLPGSQL;

CREATE TRIGGER update_delivery_tsvector_t BEFORE INSERT OR UPDATE ON delivery
FOR EACH ROW EXECUTE PROCEDURE update_delivery_tsvector_tf();

insert into delivery(folder_id, received_timestamp, message)
values (1, '2015-01-01', 'Yes hit four')
     , (1, '2014-01-01', 'Hi man')
     , (2, '2013-01-01', 'Hi man')
     , (2, '2013-01-01', 'fish')
;

analyze delivery;
set ENABLE_SEQSCAN to off;

explain analyze
SELECT del.id
     , del.received_timestamp
FROM delivery del
WHERE 1 = 1
      AND del.fts_all @@ to_tsquery('simple', 'hi:*')
      AND del.folder_id = 1
ORDER BY del.received_timestamp DESC
LIMIT 101 OFFSET 0;

I use the btree_gin extension to make folder_id part of the index.
I get the following plan (using 9.6 from master):

                                                                 QUERY PLAN
 Limit  (cost=5.23..5.23 rows=1 width=16) (actual time=0.042..0.043 rows=2 loops=1)
   ->  Sort  (cost=5.23..5.23 rows=1 width=16) (actual time=0.040..0.040 rows=2 loops=1)
         Sort Key: received_timestamp DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on delivery del  (cost=3.90..5.22 rows=1 width=16) (actual time=0.029..0.030 rows=2 loops=1)
               Recheck Cond: (fts_all @@ '''hi'':*'::tsquery)
               Filter: (folder_id = 1)
               Rows Removed by Filter: 1
               Heap Blocks: exact=1
               ->  Bitmap Index Scan on fts_idx  (cost=0.00..3.90 rows=1 width=0) (actual time=0.018..0.018 rows=3 loops=1)
                     Index Cond: (fts_all @@ '''hi'':*'::tsquery)
 Planning time: 0.207 ms
 Execution time: 0.085 ms
(13 rows)

Here is the explain from a real-world query:

EXPLAIN ANALYZE
SELECT del.entity_id
     , del.received_timestamp
FROM origo_email_delivery del
WHERE 1 = 1
      AND del.fts_all @@ to_tsquery('simple', 'andre:*')
      AND del.folder_id = 44964
ORDER BY del.received_timestamp DESC
LIMIT 101 OFFSET 0;

                                                                 QUERY PLAN
 Limit  (cost=92260.66..92260.91 rows=101 width=16) (actual time=347.891..347.907 rows=101 loops=1)
   ->  Sort  (cost=92260.66..92291.08 rows=12167 width=16) (actual time=347.888..347.899 rows=101 loops=1)
         Sort Key: received_timestamp DESC
         Sort Method: top-N heapsort  Memory: 29kB
         ->  Bitmap Heap Scan on origo_email_delivery del  (cost=2480.95..91794.77 rows=12167 width=16) (actual time=152.568..346.132 rows=18257 loops=1)
               Recheck Cond: (fts_all @@ '''andre'':*'::tsquery)
               Filter: (folder_id = 44964)
               Rows Removed by Filter: 264256
               Heap Blocks: exact=80871
               ->  Bitmap Index Scan on temp_fts_idx  (cost=0.00..2477.91 rows=309588 width=0) (actual time=134.903..134.903 rows=282513 loops=1)
                     Index Cond: (fts_all @@ '''andre'':*'::tsquery)
 Planning time: 0.530 ms
 Execution time: 347.967 ms
(13 rows)

1. Why isn't folder_id part of the index-cond?
2. Is there a way to make it use the (same) index to sort by received_timestamp?
3. Using a GIN-index, is there a way to use the index at all for sorting?
4. It doesn't seem like ts_rank uses the index for sorting either.

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] problem with select *
På mandag 24. august 2015 kl. 09:04:07, skrev bhuvan Mitra <bhuvan...@gmail.com>:

Hello, I have a table with 12 columns and 20 million rows. While writing to the table I do not find any problem, but when reading it I face some issues. When I perform a 'select * from table limit 14000000;' (selecting 14 million rows), it works fine. If the limit value is 15000000, it throws an 'out of memory' error. If the query is 'select * from table', the process gets killed, displaying the message 'killed'. Kindly tell me where it is going wrong. I have 6MB cache, 1.6GHz CPU, linux 14.04 OS, 8GB RAM.

In what application are you performing these queries?

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
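For what it's worth, the usual cause of "out of memory" on a huge SELECT * is the client buffering the entire result set. A standard fix (a sketch, not from the thread; the table name is hypothetical) is to fetch in batches through a cursor:

```sql
-- Cursors only exist inside a transaction; fetch in chunks so the client
-- never holds more than one batch in memory.
BEGIN;
DECLARE big_cur CURSOR FOR SELECT * FROM big_table;
FETCH FORWARD 10000 FROM big_cur;   -- process this batch...
FETCH FORWARD 10000 FROM big_cur;   -- ...and repeat until no rows come back
CLOSE big_cur;
COMMIT;
```

In psql, `\set FETCH_COUNT 10000` achieves the same batching transparently for plain SELECT statements.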
Re: [PERFORM] Slow HashAggregate/cache access
=1)
   Filter: (fr01codemp = 1::smallint)
   Buffers: shared hit=2
 Total runtime: 18.528 ms
(35 rows)

Tomorrow I will try to do the same with the other slow query, reporting here.

It will be interesting to see how Oracle and SQL Server perform with the re-written query too. Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] Slow HashAggregate/cache access
På onsdag 05. august 2015 kl. 20:25:25, skrev Maxim Boguk <maxim.bo...@gmail.com>:

[snip]
I think I know where the issue is. The PostgreSQL planner is unable to push join conditions into a subquery with aggregate functions (a well-known limitation).
[snip]

I'm curious; will 9.5 help here, as it has WHERE-clause pushdown in subqueries with window functions?
http://michael.otacoo.com/postgresql-2/postgres-9-5-feature-highlight-where-pushdown-with-window-function/

Are you able to try 9.5 and post the results? Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] Slow HashAggregate/cache access
På onsdag 05. august 2015 kl. 22:53:25, skrev Alexandre de Arruda Paes <adald...@gmail.com>:

Hi Andreas,

Same plan in 9.5, but the execution time was greater than on 9.3 (maybe it needs some tuning):

Thanks for sharing. Maybe some @hackers will chime in and comment.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [PERFORM] New server: SSD/RAID recommendations?
På torsdag 02. juli 2015 kl. 01:06:57, skrev Craig James <cja...@emolecules.com>:

We're buying a new server in the near future to replace an aging system. I'd appreciate advice on the best SSD devices and RAID controller cards available today.

The database is about 750 GB. This is a warehouse server. We load supplier catalogs throughout a typical work week, then on the weekend (after Q/A), integrate the new supplier catalogs into our customer-visible store, which is then copied to a production server where customers see it. So the load is mostly data loading, and essentially no OLTP. Typically there are fewer than a dozen connections to Postgres.

Linux 2.6.32
Postgres 9.3
Hardware:
  2 x INTEL WESTMERE 4C XEON 2.40GHZ
  12GB DDR3 ECC 1333MHz
  3WARE 9650SE-12ML with BBU
  12 x 1TB Hitachi 7200RPM SATA disks
    RAID 1 (2 disks): Linux partition, swap partition, pg_xlog partition
    RAID 10 (8 disks): Postgres database partition

We get 5000-7000 TPS from pgbench on this system.

The new system will have at least as many CPUs, and probably a lot more memory (196 GB). The database hasn't reached 1TB yet, but we'd like room to grow, so we'd like a 2TB file system for Postgres. We'll start with the latest versions of Linux and Postgres.

Intel's products have always received good reports in this forum. Is that still the best recommendation? Or are there good alternatives that are price competitive?

What about a RAID controller? Are RAID controllers even available for PCI-Express SSD drives, or do we have to stick with SATA if we need a battery-backed RAID controller? Or is software RAID sufficient for SSD drives?

Are spinning disks still a good choice for the pg_xlog partition and OS? Is there any reason to get spinning disks at all, or is it better/simpler to just put everything on SSD drives?

Thanks in advance for your advice!

Depends on your SSD drives, but today's enterprise-grade SSDs can handle pg_xlog just fine.
So I'd go full SSD, unless you have many BLOBs in pg_largeobject; then move those to a separate tablespace on archive-grade (spinning) disks.
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
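For the archive-disk tablespace idea, a minimal sketch (tablespace name and path are illustrative, not from the thread): all large objects live in the pg_largeobject system catalog, so moving them means moving that catalog, which on these PostgreSQL versions requires superuser rights and starting the server with allow_system_table_mods enabled. Treat it as a maintenance-window operation and test on a copy first.

```sql
-- Illustrative names/paths; requires superuser and allow_system_table_mods.
CREATE TABLESPACE archive_disks LOCATION '/mnt/spinning_raid/pgdata';

-- pg_largeobject is a system catalog, hence the special server setting:
ALTER TABLE pg_largeobject SET TABLESPACE archive_disks;
```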
[PERFORM] Performance of vacuumlo
Hi all. I have a pg_largeobject of ~300GB size, and when I run vacuumlo -n dbname, I get: "Would remove 82172 large objects from database dbname." So I'm running without -n to do the actual work, but it seems to take forever. The disks are 8 SAS 10K HDD drives in RAID5. Any hints on how long this is supposed to take? Thanks.
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
[PERFORM] Poor performance when deleting from entity-attribute-value type master-table
Hi all. Using PG-9.4.0 I'm seeing this trying to delete from an entity-master table:

*# explain analyze delete from onp_crm_entity where entity_id IN (select tmp.delivery_id from temp_delete_delivery_id tmp);
                                QUERY PLAN
---------------------------------------------------------------------
 Delete on onp_crm_entity  (cost=0.43..5673.40 rows=1770 width=12) (actual time=7.370..7.370 rows=0 loops=1)
   ->  Nested Loop  (cost=0.43..5673.40 rows=1770 width=12) (actual time=0.050..1.374 rows=108 loops=1)
         ->  Seq Scan on temp_delete_delivery_id tmp  (cost=0.00..27.70 rows=1770 width=14) (actual time=0.014..0.080 rows=108 loops=1)
         ->  Index Scan using onp_crm_entity_pkey on onp_crm_entity  (cost=0.43..3.18 rows=1 width=14) (actual time=0.010..0.011 rows=1 loops=108)
               Index Cond: (entity_id = tmp.delivery_id)
 Planning time: 0.314 ms
 Trigger for constraint onp_crm_activity_entity_id_fkey: time=4.141 calls=108
 Trigger for constraint ...
 Trigger for constraint ...
 Trigger for constraint ...

I have lots of tables referencing onp_crm_entity(entity_id), so I expect the poor delete performance is caused by all the triggers firing to check FK constraints. Are there any ways around this, or do people simply avoid having FKs in schemas like this? Thanks.
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
Re: [PERFORM] Poor performance when deleting from entity-attribute-value type master-table
På mandag 09. februar 2015 kl. 22:36:55, skrev Igor Neyman <iney...@perceptron.com>:

From: pgsql-performance-ow...@postgresql.org On Behalf Of Andreas Joseph Krogh
Sent: Monday, February 09, 2015 4:13 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Poor performance when deleting from entity-attribute-value type master-table

[snip]

Andreas, do you have indexes on FK columns in child tables? If not – there is your problem.

Yes, they have indexes, but deleting 1M rows still results in calling the triggers 1M times * number of FKs...
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
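Since the per-row FK trigger calls dominate here, one common workaround for large batch deletes (a sketch, not something proposed in the thread; the child table and column names are illustrative, modeled on the constraint name above) is to drop the FK, delete, and re-add it, so validation is a single pass over each child table instead of one trigger call per deleted row per constraint:

```sql
BEGIN;
ALTER TABLE onp_crm_activity DROP CONSTRAINT onp_crm_activity_entity_id_fkey;

DELETE FROM onp_crm_entity
WHERE entity_id IN (SELECT delivery_id FROM temp_delete_delivery_id);

-- Re-adding validates all remaining child rows in one pass:
ALTER TABLE onp_crm_activity
    ADD CONSTRAINT onp_crm_activity_entity_id_fkey
    FOREIGN KEY (entity_id) REFERENCES onp_crm_entity (entity_id);
COMMIT;
```

The trade-off is that the FK is unenforced inside the transaction, and the ALTER TABLE commands take exclusive locks on the child table.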
[PERFORM] Query performing very bad and sometimes good
time=0.002..0.002 rows=0 loops=354300)
  Index Cond: ((owner_id = 3::bigint) AND (message_id = del.message_id))
->  Index Only Scan using origo_email_folder_message_delivery_id_folder_id_key on origo_email_folder_message fm  (cost=0.42..0.50 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=12870)
      Index Cond: ((delivery_id = del.entity_id) AND (folder_id = 44961::bigint))
      Heap Fetches: 75
->  Index Scan using origo_email_message_pkey on origo_email_message em  (cost=0.42..2.50 rows=1 width=455) (actual time=0.007..0.007 rows=1 loops=75)
      Index Cond: (entity_id = del.message_id)
->  Index Scan using origo_person_entity_id_idx on onp_crm_person pers  (cost=0.28..0.29 rows=1 width=35) (actual time=0.000..0.000 rows=0 loops=75)
      Index Cond: (em.from_entity_id = entity_id)
->  Index Scan using onp_crm_relation_pkey on onp_crm_relation comp  (cost=0.27..0.35 rows=1 width=19) (actual time=0.000..0.000 rows=0 loops=75)
      Index Cond: (entity_id = pers.relation_id)
->  Index Scan using origo_email_address_pkey on origo_email_address ef  (cost=0.43..0.68 rows=1 width=50) (actual time=0.007..0.008 rows=1 loops=75)
      Index Cond: (em.from_id = entity_id)
SubPlan 1
  ->  Nested Loop  (cost=0.85..16.90 rows=1 width=24) (actual time=0.012..0.014 rows=1 loops=75)
        ->  Index Scan using origo_email_address_owner_message_id_recipient_type_address_key on origo_email_address_owner o  (cost=0.42..8.45 rows=1 width=12) (actual time=0.008..0.008 rows=1 loops=75)
              Index Cond: ((message_id = em.entity_id) AND ((recipient_type)::text = 'TO'::text))
        ->  Index Scan using origo_email_address_pkey on origo_email_address a  (cost=0.43..8.45 rows=1 width=28) (actual time=0.002..0.003 rows=1 loops=93)
              Index Cond: (entity_id = o.address_id)
SubPlan 2
  ->  Nested Loop  (cost=0.85..16.90 rows=1 width=26) (actual time=0.007..0.008 rows=1 loops=75)
        ->  Index Scan using origo_email_address_owner_message_id_recipient_type_address_key on origo_email_address_owner o_1  (cost=0.42..8.45 rows=1 width=12) (actual time=0.004..0.004 rows=1 loops=75)
              Index Cond: ((message_id = em.entity_id) AND ((recipient_type)::text = 'TO'::text))
        ->  Index Scan using origo_email_address_pkey on origo_email_address a_1  (cost=0.43..8.45 rows=1 width=30) (actual time=0.002..0.002 rows=1 loops=93)
              Index Cond: (entity_id = o_1.address_id)
SubPlan 3
  ->  Seq Scan on origo_favourite_owner person_fav  (cost=0.00..4.75 rows=1 width=4) (actual time=0.027..0.027 rows=0 loops=75)
        Filter: ((favourite_for = 3::bigint) AND (favourite_item = pers.entity_id))
        Rows Removed by Filter: 183
SubPlan 4
  ->  Seq Scan on origo_favourite_owner company_fav  (cost=0.00..4.75 rows=1 width=4) (actual time=0.023..0.023 rows=0 loops=75)
        Filter: ((favourite_for = 3::bigint) AND (favourite_item = comp.entity_id))
        Rows Removed by Filter: 183
Total runtime: 3481.136 ms
(42 rows)

Does anyone see anything obvious, or have any hints on what to investigate further? Thanks.
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
Re: [PERFORM] OFFSET/LIMIT - Disparate Performance w/ Go application
På torsdag 12. juni 2014 kl. 16:58:06, skrev Tom Lane <t...@sss.pgh.pa.us>:

Matt Silverlock <m...@eatsleeprepeat.net> writes: Hi all. This might be tricky in so much as there's a few moving parts (when isn't there?), but I've tried to test the postgres side as much as possible. Trying to work out a potential database bottleneck with a HTTP application (written in Go):
  Pages that render HTML templates but don't perform DB queries can hit ~36k+ req/s.
  Pages that perform a SELECT on a single row net about ~6.6k req/s: db.Get(l, "SELECT * FROM listings WHERE id = $1 AND expiry_date > current_date", l.Id)
  Pages that SELECT multiple rows with OFFSET and LIMIT conditions struggle to top 1.3k req/s.

You don't show us exactly what you're doing with OFFSET/LIMIT, but I'm going to guess that you're using it to paginate large query results. That's basically always going to suck: Postgres has no way to implement OFFSET except to generate and then throw away that number of initial rows. If you do the same query over again N times with different OFFSETs, it's going to cost you N times as much as the base query would.

Are there any plans to make PG implement OFFSET more efficiently, so it doesn't have to read and throw away? I used SQL Server back in 2011 in a project and seem to remember they implemented offset pretty fast. Paging in a result set of millions was much faster than in PG.
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
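The usual alternative to OFFSET pagination is keyset (a.k.a. "seek") pagination: remember the ordering key of the last row on the previous page and continue from there, so each page is a small index range scan instead of generating and discarding the skipped rows. A sketch against the listings query quoted above (the comparison on expiry_date and the page size are assumptions):

```sql
-- First page:
SELECT * FROM listings
WHERE expiry_date > current_date
ORDER BY id
LIMIT 50;

-- Next page: pass the largest id from the previous page as $1.
SELECT * FROM listings
WHERE expiry_date > current_date
  AND id > $1
ORDER BY id
LIMIT 50;
```

This stays fast at any page depth, at the cost of only supporting next/previous-style navigation rather than jumps to an arbitrary page number.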
Re: [PERFORM] Optimize query for listing un-read messages
På fredag 02. mai 2014 kl. 02:17:58, skrev Craig James <cja...@emolecules.com>:

On Thu, May 1, 2014 at 4:26 AM, Andreas Joseph Krogh <andr...@visena.com> wrote: I have a schema where I have lots of messages and some users who might have read some of them. When a message is read by a user I create an entry in a table message_property holding the property (is_read) for that user. The schema is as follows: [...]

create table person(
    id serial primary key,
    username varchar not null unique
);
create table message(
    id serial primary key,
    subject varchar
);
create table message_property(
    message_id integer not null references message(id),
    person_id integer not null references person(id),
    is_read boolean not null default false,
    unique(message_id, person_id)
);

[...] So, for person 1 there are 10 unread messages, out of a total 1mill. 5 of those unread do not have an entry in message_property and 5 have an entry with is_read set to FALSE.

Here's a possible enhancement: add two columns: an indexed timestamp to the message table, and on the person table a timestamp of the oldest message this user has NOT read. If most users read messages in a timely fashion, this would (in most cases) narrow down the portion of the messages table to a tiny fraction of the total -- just those messages newer than the oldest message this user has not read. When you sign up a new user, you can set his timestamp to the time the account was created, since presumably messages before that time don't apply. Whether this will help depends a lot on actual use patterns, i.e. do users typically read all messages or do they leave a bunch of unread messages sitting around forever?

Thanks for the suggestion. A user must be able to read arbitrarily old messages, and messages don't expire.
[PERFORM] Optimize query for listing un-read messages
Hi all, I'm using PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu. I have a schema where I have lots of messages and some users who might have read some of them. When a message is read by a user I create an entry in a table message_property holding the property (is_read) for that user. The schema is as follows:

drop table if exists message_property;
drop table if exists message;
drop table if exists person;

create table person(
    id serial primary key,
    username varchar not null unique
);

create table message(
    id serial primary key,
    subject varchar
);

create table message_property(
    message_id integer not null references message(id),
    person_id integer not null references person(id),
    is_read boolean not null default false,
    unique(message_id, person_id)
);

insert into person(username) values('user_' || generate_series(0, 999));
insert into message(subject) values('Subject ' || random() || generate_series(0, 99));
insert into message_property(message_id, person_id, is_read) select id, 1, true from message order by id limit 90;
insert into message_property(message_id, person_id, is_read) select id, 1, false from message order by id limit 5 offset 90;
analyze;

So, for person 1 there are 10 unread messages, out of a total 1mill. 5 of those unread messages do not have an entry in message_property, and 5 have an entry with is_read set to FALSE.
I have the following query to list all un-read messages for the person with id=1:

SELECT
    m.id AS message_id,
    prop.person_id,
    coalesce(prop.is_read, FALSE) AS is_read,
    m.subject
FROM message m
    LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND prop.person_id = 1
WHERE 1 = 1
    AND NOT EXISTS(SELECT * FROM message_property pr
                   WHERE pr.message_id = m.id AND pr.person_id = prop.person_id AND prop.is_read = TRUE);

The problem is that it's not quite efficient and performs badly; explain analyze shows:

QUERY PLAN
---------------------------------------------------------------------
 Merge Anti Join  (cost=1.27..148784.09 rows=5 width=40) (actual time=918.906..918.913 rows=10 loops=1)
   Merge Cond: (m.id = pr.message_id)
   Join Filter: (prop.is_read AND (pr.person_id = prop.person_id))
   Rows Removed by Join Filter: 5
   ->  Merge Left Join  (cost=0.85..90300.76 rows=100 width=40) (actual time=0.040..530.748 rows=100 loops=1)
         Merge Cond: (m.id = prop.message_id)
         ->  Index Scan using message_pkey on message m  (cost=0.42..34317.43 rows=100 width=35) (actual time=0.014..115.829 rows=100 loops=1)
         ->  Index Scan using message_property_message_id_person_id_key on message_property prop  (cost=0.42..40983.40 rows=95 width=9) (actual time=0.020..130.728 rows=95 loops=1)
               Index Cond: (person_id = 1)
   ->  Index Only Scan using message_property_message_id_person_id_key on message_property pr  (cost=0.42..40983.40 rows=95 width=8) (actual time=0.024..140.349 rows=95 loops=1)
         Index Cond: (person_id = 1)
         Heap Fetches: 95
 Total runtime: 918.975 ms
(13 rows)

Does anyone have suggestions on how to optimize the query or schema? It's important that any message not having an entry in message_property for a user is considered un-read. Thanks!
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
Re: [PERFORM] Optimize query for listing un-read messages
På torsdag 01. mai 2014 kl. 20:35:07, skrev Jochem Berndsen <joc...@functor.nl>:

Hi Andreas, [New to this list, forgive my ignorance.] [snip] I'm getting better performance with:

SELECT m.id AS message_id, 1 AS person_id, FALSE AS is_read, m.subject
FROM message m
WHERE 1 = 1
    AND NOT EXISTS(SELECT * FROM message_property pr
                   WHERE pr.message_id = m.id AND pr.person_id = 1 AND pr.is_read);

You then lose the distinction between message_property with is_read = FALSE and a nonexistent message_property for the message row. If that is essential, I'm getting a roughly 2x speedup on my non-tuned PostgreSQL with:

SELECT m.id AS message_id, prop.person_id, coalesce(prop.is_read, FALSE) AS is_read, m.subject
FROM message m
    LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND prop.person_id = 1
WHERE not coalesce(prop.is_read, false);

Hi Jochem, thanks for looking at it. I'm still seeing ~500ms being spent, and I was hoping for a way to do this using an index so one could achieve 1-10ms, but maybe that's impossible given the schema? Is there a way to design an equivalent schema to achieve 10ms execution time?
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
Re: [PERFORM] Optimize query for listing un-read messages
På torsdag 01. mai 2014 kl. 21:30:39, skrev Pavel Stehule <pavel.steh...@gmail.com>:

Hello [snip] I had a perfect success on a similar use case with a descending-ordered partial index: http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html

I'm not getting good performance. Are you able to craft an example using my schema and a partial index? Thanks.
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
Re: [PERFORM] Optimize query for listing un-read messages
På torsdag 01. mai 2014 kl. 21:53:32, skrev Pavel Stehule <pavel.steh...@gmail.com>:

2014-05-01 21:39 GMT+02:00 Andreas Joseph Krogh <andr...@visena.com>: [snip] I'm not getting good performance. Are you able to craft an example using my schema and a partial index?

maybe something like

CREATE INDEX ON message_property (person_id, message_id) WHERE is_read

When I am thinking about your schema, it is designed well, but it is not index friendly, so for some fast access you should hold a cache (table) of unread messages.

Ah, that's what I was hoping not to have to do. In my system, messages arrive all the time and having to update a cache for all new messages for all users seems messy... Seems I could just as well create a message_property for all users when a new message arrives, so I can INNER JOIN it and get good performance. But that table will quickly grow *very* large...
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
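Pavel's "cache (table) of unread messages" could look like the sketch below (all names are illustrative, reusing the person/message schema from earlier in the thread): a narrow table holding only the (person, message) pairs that are still unread, inserted when a message becomes visible to a person and deleted when it is read.

```sql
CREATE TABLE message_unread(
    person_id  integer NOT NULL REFERENCES person(id),
    message_id integer NOT NULL REFERENCES message(id),
    PRIMARY KEY (person_id, message_id)
);

-- Listing a person's unread messages becomes a plain index-driven join:
SELECT m.id, m.subject
FROM message_unread u
JOIN message m ON m.id = u.message_id
WHERE u.person_id = 1;
```

Unlike a full message_property row per (message, person) pair, this table only grows with the unread backlog and shrinks again as messages are read.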
Re: [PERFORM] Optimize query for listing un-read messages
På torsdag 01. mai 2014 kl. 23:19:55, skrev David G Johnston <david.g.johns...@gmail.com>:

How does something like:

WITH unreads AS (
    SELECT messageid FROM message
    EXCEPT
    SELECT messageid FROM message_property WHERE personid=1 AND has_read
)
SELECT ... FROM unreads JOIN messages USING (messageid);

perform?

It actually performs worse. The best query so far is:

SELECT m.id AS message_id, prop.person_id, coalesce(prop.is_read, FALSE) AS is_read, m.subject
FROM message m
    LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND prop.person_id = 1
WHERE coalesce(prop.is_read, false) = false;

Giving the plan:

QUERY PLAN
---------------------------------------------------------------------
 Merge Left Join  (cost=4.20..90300.76 rows=50 width=40) (actual time=445.021..445.025 rows=10 loops=1)
   Merge Cond: (m.id = prop.message_id)
   Filter: (NOT COALESCE(prop.is_read, false))
   Rows Removed by Filter: 90
   ->  Index Scan using message_pkey on message m  (cost=0.42..34317.43 rows=100 width=35) (actual time=0.014..113.314 rows=100 loops=1)
   ->  Index Scan using message_property_message_id_person_id_key on message_property prop  (cost=0.42..40983.40 rows=95 width=9) (actual time=0.018..115.019 rows=95 loops=1)
         Index Cond: (person_id = 1)
 Total runtime: 445.076 ms
(8 rows)
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
Re: [PERFORM] Optimize query for listing un-read messages
På torsdag 01. mai 2014 kl. 23:45:49, skrev Tomas Vondra <t...@fuzzy.cz>:

On 1.5.2014 23:19, Andreas Joseph Krogh wrote: [snip]

What you need is a JOIN index, and that is not possible in Postgres. I'm afraid some ugly solution is necessary (when you require extra-fast access). You need a (small) index, and an index requires some existing set; you cannot build an index on the difference between two sets. I expect a flag on the message relation, like "should not be considered unread", could help a little bit and could be used as a condition in a partial index. Another possibility is some variant of partitioning: you can divide the messages and users into distinct sets and then decrease the number of possible combinations.

Just curious: Is such a JOIN index possible in other DBs, and if so, which? Can other DBs do an index on the difference between two sets? Will PG ever have this; is it even possible?

I'm not aware of such a database, but maybe it's possible at least for some cases. But I'd expect that to significantly depend on the schema. And I'm not aware of any such effort in case of PostgreSQL, so don't hold your breath. IMHO the problem with your schema is that while each 'read' message has a matching row in message_property, 'unread' messages may or may not have a matching row. Is there a particular reason for that?

Yes. The point is that maintaining a message_property pair for all messages for all users in the system imposes quite a maintenance headache. As the schema is now, any new message is by definition un-read, and when a user reads it, it gets an entry with is_read=true in message_property. This table holds other properties too. This way I'm avoiding having to do so much book-keeping when a new message arrives and when a new user is created. A message in my system does not necessarily have only one recipient; it might have one, many or none, and might be visible to many.

If you could get rid of this, i.e. require that each pair (message, recipient) has a row in message_property (irrespective of whether the message was read or not), you can do this:

CREATE INDEX message_unread_idx ON message_property(message_id, person_id) WHERE (NOT is_read)

and then simply do the query like this:

SELECT m.id, prop.person_id, prop.is_read, m.subject
FROM message m
JOIN message_property prop ON (m.id = prop.message_id)
WHERE (NOT prop.is_read) AND prop.person_id = :pid

and I'd expect this to use the partial index, and be much faster.

I'm aware of the performance gain using such a plain JOIN query. Thanks for your feedback.
Re: [PERFORM] Optimize query for listing un-read messages
På fredag 02. mai 2014 kl. 00:34:34, skrev Tomas Vondra <t...@fuzzy.cz>:

On 1.5.2014 23:58, Andreas Joseph Krogh wrote: [snip]

So how do you determine who's the recipient of a message? Or is it the case that everyone can read everything (which is why you're displaying them the unread messages, right)?

A message might have a recipient and might be read by others.

I understand you're trying to solve this without storing a row for each possible message-person combination, but won't this eventually happen anyway (with is_read=true for all rows)?

I will end up with that only if all users read all messages, which is not nearly the case.
Re: [PERFORM] Optimize query for listing un-read messages
På fredag 02. mai 2014 kl. 00:55:25, skrev David G Johnston <david.g.johns...@gmail.com>:

Andreas Joseph Krogh-2 wrote: I will end up with that only if all users read all messages, which is not nearly the case.

These observations probably won't help but... You have what amounts to a mathematical sparse matrix problem on your hands... Is there any way to expire messages so that dimension does not grow unbounded?

No, unfortunately...

Per-user caching does seem to be something that is going to be needed... Depending on how many users are being tracked, would storing the reader_id in an indexed array improve matters?

SELECT ... FROM message WHERE NOT (1 = ANY(reader_ids));
UPDATE message SET reader_ids = reader_ids || 1 WHERE messageid = ...;

I'm not that familiar with how well indexes over arrays work or which kind is needed (i.e. gin/gist).

is_read is one of many properties being tracked for a message...
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com
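A sketch of the array idea (column and index names are illustrative): a GIN index on integer[] supports the containment operator @>, so "already read by person 1" is indexable, but the negated predicate "not read by person 1" generally cannot use the index, which is a large part of why this approach is hard to make fast.

```sql
ALTER TABLE message ADD COLUMN reader_ids integer[] NOT NULL DEFAULT '{}';
CREATE INDEX message_reader_ids_idx ON message USING gin (reader_ids);

-- Indexable: messages person 1 HAS read.
SELECT id FROM message WHERE reader_ids @> ARRAY[1];

-- Not index-assisted: the unread set is a negation.
SELECT id FROM message WHERE NOT (reader_ids @> ARRAY[1]);

-- Marking as read appends to the array:
UPDATE message SET reader_ids = reader_ids || 1 WHERE id = 42;
```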
Re: [PERFORM] Optimize query for listing un-read messages
På fredag 02. mai 2014 kl. 01:58:04, skrev David G Johnston <david.g.johns...@gmail.com>:

[snip]

But you don't have to have all of them on the same table. Once you've identified the messages in question, performing a standard join onto a supplemental detail table should be fairly straightforward. Do these other properties have values when is_read is false, or only when is_read is true? Since you already allow for the possibility of a missing record (giving it the meaning of not read), these other properties cannot currently exist in that situation.

A message might hold a property (e.g. is_important) when is_read is FALSE (it might be set back to is_read=FALSE after being read the first time).
-- Andreas Joseph Krogh, CTO / Partner - Visena AS, Mobile: +47 909 56 963, andr...@visena.com, www.visena.com