Re: [PERFORM] Slow update on column that is part of exclusion constraint
> On 14 Apr 2016, at 07:17, Adam Brusselback <adambrusselb...@gmail.com> wrote:
>
> So fair enough, it does seem to be related to the lookup rather than
> maintenance on the index. I was misguided in my initial assumption.
>
> Spent quite a bit of time trying to come up with a self contained test, and
> it seems like I can't make it choose the GiST index unless I remove the
> regular btree index in my test case, though the opposite is true for my
> table in production. Not really sure what that means as far as what I need
> to do though. I've tried a vacuum full, analyze, rebuild index, drop and
> re-add the constraint... It still uses that GiST index for this query.
>
> Hell, a sequential scan is a ton faster even.

As I understand it, Postgres needs a way to find the rows to update. In the
EXPLAIN ANALYZE you provided, we see that it chose the GiST index for that,
and that is a poor choice. I think you need a proper btree index for the
update query to be properly fast, like an index on
(product_id, company_id, date_range) WHERE upper(price_generated_test.active_range) IS NULL.

> On Wed, Apr 13, 2016 at 2:54 PM, Evgeniy Shishkin <itparan...@gmail.com> wrote:
>
>> On 13 Apr 2016, at 20:14, Adam Brusselback <adambrusselb...@gmail.com> wrote:
>>
>> Sorry, brain stopped working and I forgot to include the normal info.
> [snip]

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
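A minimal sketch of the suggested partial btree index (column names are taken
from the plan in this thread; verify them against the actual table definition
before creating it):

```sql
-- Matches the update's lookup columns and folds the
-- upper(active_range) IS NULL filter into the index predicate,
-- so only "active" rows are indexed.
CREATE INDEX price_generated_active_lookup_idx
    ON price_generated (product_id, company_id, date_range)
    WHERE upper(active_range) IS NULL;
```

With this index the nested loop's inner scan can satisfy all three join
conditions from the index instead of filtering ~29k rows per loop.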
Re: [PERFORM] Slow update on column that is part of exclusion constraint
> On 13 Apr 2016, at 20:14, Adam Brusselback wrote:
>
> Sorry, brain stopped working and I forgot to include the normal info.
>
> Postgres version: 9.5.1
> Hardware: 2 core, 4gb Digital Ocean virtual server
> OS: Debian
>
> explain analyze for an example update:
> 'Update on price_generated  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.614..29329.614 rows=0 loops=1)'
> '  ->  Nested Loop  (cost=32.45..644.83 rows=1 width=157) (actual time=29329.608..29329.608 rows=0 loops=1)'
> '        ->  HashAggregate  (cost=32.04..34.35 rows=231 width=52) (actual time=1.137..2.090 rows=231 loops=1)'
> '              Group Key: pti.product_id, pti.company_id, pti.date_range'
> '              ->  Seq Scan on _prices_to_insert pti  (cost=0.00..30.31 rows=231 width=52) (actual time=0.060..0.678 rows=231 loops=1)'
> '        ->  Index Scan using price_generated_company_product_date_active_excl on price_generated  (cost=0.41..2.63 rows=1 width=151) (actual time=126.949..126.949 rows=0 loops=231)'
> '              Index Cond: (date_range = pti.date_range)'
> '              Filter: ((upper(active_range) IS NULL) AND (pti.product_id = product_id) AND (pti.company_id = company_id))'
> '              Rows Removed by Filter: 29460'
> 'Planning time: 3.134 ms'
> 'Execution time: 29406.717 ms'

Well, you see an execution time of 30 seconds because there are 231 index
lookups, each taking 126 ms.

And that lookup is slow because of the filter:

  Filter: ((upper(active_range) IS NULL) AND (pti.product_id = product_id) AND (pti.company_id = company_id))

Can you provide a self-contained example of the update?
I don't see where the upper(active_range) IS NULL condition is coming from.
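The arithmetic behind the 30-second estimate can be checked directly from the
numbers in the plan (231 loops at ~126.949 ms per inner index scan):

```sql
SELECT 231 * 126.949 AS total_inner_ms;  -- 29325.219, i.e. ~29.3 s of the 29.4 s total
```

Essentially all of the runtime is the repeated inner GiST index scan; the
outer seq scan and aggregation contribute only a few milliseconds.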
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
> On 16 Mar 2016, at 16:37, Tom Lane wrote:
>
> Andreas Joseph Krogh writes:
>> 1. Why isn't folder_id part of the index-cond?
>
> Because a GIN index is useless for sorting.

I don't see how GIN's inability to return sorted data relates to the index
condition. In fact I tried to reproduce the example, and if I change
folder_id to int from bigint, then an index condition with folder_id is used:

  Index Cond: ((fts_all @@ '''hi'''::tsquery) AND (folder_id = 1))

>> 2. Is there a way to make it use the (same) index to sort by
>> received_timestamp?
>
> No.
>
>> 3. Using a GIN-index, is there a way to use the index at all for sorting?
>
> No.
>
>> 4. It doesn't seem like ts_rank uses the index for sorting either.
>
> Same reason.
>
>             regards, tom lane
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
> On 16 Mar 2016, at 17:52, Evgeniy Shishkin <itparan...@gmail.com> wrote:
>
> I don't see how GIN's inability to return sorted data relates to the index
> condition. In fact I tried to reproduce the example, and if I change
> folder_id to int from bigint, then an index condition with folder_id is used:
>
>   Index Cond: ((fts_all @@ '''hi'''::tsquery) AND (folder_id = 1))

Looks like the documentation at
http://www.postgresql.org/docs/9.5/static/btree-gin.html
is lying about supporting the int8 type.
Re: [PERFORM] Searching GIN-index (FTS) and sort by timestamp-column
> On 16 Mar 2016, at 18:04, Evgeniy Shishkin <itparan...@gmail.com> wrote:
>
> Looks like the documentation at
> http://www.postgresql.org/docs/9.5/static/btree-gin.html
> is lying about supporting the int8 type.

Uh, it works if I cast to bigint explicitly:

  WHERE del.fts_all @@ to_tsquery('simple', 'hi')
    AND del.folder_id = 1::bigint;

results in

  Index Cond: ((folder_id = '1'::bigint) AND (fts_all @@ '''hi'''::tsquery))
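Putting the thread's finding together as a sketch (the table name `delivery`
behind the `del` alias is an assumption; the column names come from the
thread). With the btree_gin extension, a scalar column can participate in the
GIN index condition, but the literal must match the bigint column type
exactly, hence the explicit cast:

```sql
CREATE EXTENSION IF NOT EXISTS btree_gin;

-- Hypothetical combined index over the tsvector and the scalar key:
CREATE INDEX delivery_fts_folder_idx ON delivery USING gin (fts_all, folder_id);

SELECT *
FROM delivery del
WHERE del.fts_all @@ to_tsquery('simple', 'hi')
  AND del.folder_id = 1::bigint;  -- without ::bigint the int4 literal
                                  -- was not matched to the int8 column
```

Note that even with both columns in the Index Cond, the GIN index still
cannot return rows in received_timestamp order, per Tom Lane's answers above.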
Re: [PERFORM] Query that took a lot of time in Postgresql when not using trim in order by
> What is your Postgres version?
> Do you have correct statistics on these tables?
> Please show your execution plans with buffers, i.e. explain (analyze, buffers) ...

Fast:

  Sort  (cost=193101.41..195369.80 rows=907357 width=129) (actual time=3828.176..3831.261 rows=43615 loops=1)
    Output: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome, (btrim((dim_cliente.tipocliente)::text))
    Sort Key: (btrim((dim_cliente.tipocliente)::text)), dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome
    Sort Method: quicksort  Memory: 13121kB
    ->  HashAggregate  (cost=91970.52..103312.49 rows=907357 width=129) (actual time=2462.690..2496.729 rows=43615 loops=1)
          Output: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome, btrim((dim_cliente.tipocliente)::text)
          ->  Hash Join  (cost=856.30..80628.56 rows=907357 width=129) (actual time=29.524..1533.880 rows=907357 loops=1)

Slow:

  Group  (cost=170417.48..184027.84 rows=907357 width=129) (actual time=36649.329..37235.158 rows=43615 loops=1)
    Output: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome
    ->  Sort  (cost=170417.48..172685.88 rows=907357 width=129) (actual time=36649.315..36786.760 rows=907357 loops=1)
          Output: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome
          Sort Key: dim_cliente.tipocliente, dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome
          Sort Method: quicksort  Memory: 265592kB
          ->  Hash Join  (cost=856.30..80628.56 rows=907357 width=129) (actual time=26.719..1593.693 rows=907357 loops=1)

The difference is at the top of the plans. As we see, the hash join time is
practically the same. But the fast plan uses a HashAggregate first, so only
43k rows require sorting; the slow plan is dominated by sorting 900k rows.

I wonder if increasing cpu_tuple_cost will help, as the cost difference
between the two plans is negligible now.
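A minimal sketch of the suggested experiment (the 0.03 value is an arbitrary
trial setting, not a recommendation): raise cpu_tuple_cost for the session
only, re-run the query, and compare which plan the planner now prefers.

```sql
SHOW cpu_tuple_cost;                   -- default is 0.01
SET cpu_tuple_cost = 0.03;             -- makes per-tuple work (the 900k-row sort) look costlier
EXPLAIN (ANALYZE, BUFFERS) SELECT ...; -- the original query
RESET cpu_tuple_cost;
```

Since the two plans' total cost estimates are nearly equal, a small nudge to
per-tuple costs can be enough to tip the planner toward the HashAggregate
plan.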
Re: [PERFORM] Yet another abort-early plan disaster on 9.3
Sorry for disrupting the thread, but I am wondering: would it be possible to
use BRIN indexes to better estimate the distribution? I mean, create both a
btree index and a BRIN index, probe the BRIN index during planning, and
estimate whether an abort-early plan with the btree index would be better.
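To make the idea concrete (the table and column names here are hypothetical,
and PostgreSQL does not actually consult BRIN indexes for planner estimates;
this only shows the two indexes the proposal would keep side by side):

```sql
-- The btree index serves the abort-early (ORDER BY ... LIMIT) plan.
CREATE INDEX events_ts_btree ON events (created_at);

-- The BRIN index is tiny: it stores min/max summaries per block range,
-- which is exactly the kind of physical-distribution information a planner
-- could, in principle, probe cheaply before committing to the btree plan.
CREATE INDEX events_ts_brin ON events USING brin (created_at);
```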
Re: [PERFORM] Index Scan Backward Slow
On 01 May 2015, at 13:54, David Osborne <da...@qcode.co.uk> wrote:

Hi,

We have a query which finds the latest row_id for a particular code. We've
found a backwards index scan is much slower than a forward one, to the
extent that disabling indexscan altogether actually improves the query time.
Can anyone suggest why this might be, and what's best to do to improve the
query time?

dev=> \d table
     Table "public.table"
 Column | Type         | Modifiers
--------+--------------+-----------
 row_id | integer      |
 code   | character(2) |
Indexes:
    "table_code_idx" btree (code)
    "table_row_idx" btree (row_id)

dev=> select count(*) from table;
  count
---------
 6090254
(1 row)

dev=> select count(distinct(row_id)) from table;
  count
---------
 5421022
(1 row)

dev=> select n_distinct from pg_stats where tablename='table' and attname='row_id';
 n_distinct
------------
  -0.762951
(1 row)

dev=> show work_mem;
 work_mem
-----------
 1249105kB
(1 row)

dev=> select version();
 PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2), 64-bit
(1 row)

The query in question:

dev=> explain (analyse,buffers) select row_id as last_row_id from table where code='XX' order by row_id desc limit 1;

 Limit  (cost=0.43..1.67 rows=1 width=4) (actual time=835.281..835.282 rows=1 loops=1)
   Buffers: shared hit=187961
   ->  Index Scan Backward using table_row_idx on table  (cost=0.43..343741.98 rows=278731 width=4) (actual time=835.278..835.278 rows=1 loops=1)
         Filter: (code = 'XX'::bpchar)
         Rows Removed by Filter: 4050971
         Buffers: shared hit=187961
 Total runtime: 835.315 ms
(7 rows)

http://explain.depesz.com/s/uGC

So we can see it's doing a backwards index scan.

Out of curiosity I tried a forward scan and it was MUCH quicker:

dev=> explain (analyse,buffers) select row_id as first_row_id from table where code='XX' order by row_id asc limit 1;

 Limit  (cost=0.43..1.67 rows=1 width=4) (actual time=19.473..19.474 rows=1 loops=1)
   Buffers: shared hit=26730
   ->  Index Scan using table_row_idx on table  (cost=0.43..343741.98 rows=278731 width=4) (actual time=19.470..19.470 rows=1 loops=1)
         Filter: (code = 'XX'::bpchar)
         Rows Removed by Filter: 62786
         Buffers: shared hit=26730
 Total runtime: 19.509 ms
(7 rows)

http://explain.depesz.com/s/ASxD

I thought adding an index on row_id desc might be the answer but it has little effect:

dev=> create index row_id_desc_idx on table(row_id desc);
CREATE INDEX
Time: 5293.812 ms

dev=> explain (analyse,buffers) select row_id as last_row_id from table where code='XX' order by row_id desc limit 1;

 Limit  (cost=0.43..1.66 rows=1 width=4) (actual time=944.666..944.667 rows=1 loops=1)
   Buffers: shared hit=176711 read=11071
   ->  Index Scan using row_id_desc_idx on table  (cost=0.43..342101.98 rows=278731 width=4) (actual time=944.663..944.663 rows=1 loops=1)
         Filter: (code = 'XX'::bpchar)
         Rows Removed by Filter: 4050971
         Buffers: shared hit=176711 read=11071
 Total runtime: 944.699 ms
(7 rows)

http://explain.depesz.com/s/JStM

In fact, disabling the index scan completely improves matters considerably:

dev=> drop index row_id_desc_idx;
DROP INDEX
dev=> set enable_indexscan to off;
SET
dev=> explain (analyse,buffers) select row_id as last_row_id from table where code='XX' order by row_id desc limit 1;
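A common fix for this pattern, offered here as a suggestion rather than
something from the thread itself: a composite btree index whose leading
column matches the WHERE clause and whose second column matches the ORDER BY,
so the LIMIT 1 can be answered from a single index tuple instead of scanning
backward through millions of row_ids and filtering on code.

```sql
-- "table" is the (anonymized) table name from the thread; quoting is needed
-- because TABLE is a reserved word.
CREATE INDEX table_code_row_idx ON "table" (code, row_id);

-- Both plans in the thread filtered ~4M rows; with this index the scan
-- starts at the last (code='XX') entry directly.
SELECT row_id AS last_row_id
FROM "table"
WHERE code = 'XX'
ORDER BY row_id DESC
LIMIT 1;
```

A btree index is readable in both directions, so the same index also serves
the `ORDER BY row_id ASC` variant; no separate `desc` index is needed.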
Re: [PERFORM] Tuning the configuration
On 16 Dec 2014, at 14:51, Graeme B. Bell <g...@skogoglandskap.no> wrote:

>> I don't understand the logic behind using drives, which are best for
>> random IO, for sequential IO workloads.
>
> Because they are also best for sequential IO. I get 1.3-1.4GB/second from
> 4 SSDs in RAID, or 500MB/s for single disk systems, even with cheap models.
> Are you getting more than that from high-end spinning rust?

I'd rather use SSDs for random IOPS when the database doesn't fit in RAM.
For WAL logs I use a RAID with BBU cache and a couple of SAS drives.
Re: [PERFORM] Tuning the configuration
On 11 Dec 2014, at 15:02, Andrea Suisani <sick...@opinioni.net> wrote:

> On 12/10/2014 11:44 AM, Maila Fatticcioni wrote:
>> 2- I would like to use the two SSDs to store the WAL files. Do you think
>> it is useful, or how should I use them?
>
> I would definitely give it a try.

I don't understand the logic behind using drives which are best for random
IO for sequential IO workloads. Money-wise, it is better to use 10k SAS
drives with a BBU RAID controller for WAL.
Re: [PERFORM] two table join with order by on both tables attributes
On 08 Aug 2014, at 16:29, Marti Raudsepp <ma...@juffo.org> wrote:

> On Fri, Aug 8, 2014 at 4:05 AM, Evgeniy Shishkin <itparan...@gmail.com> wrote:
>> select * from users join notifications on users.id=notifications.user_id
>> ORDER BY users.priority desc, notifications.priority desc limit 10;
>>
>> In my understanding, i need to have two indexes on
>> users(priority desc, id) and notifications(user_id, priority desc)
>>
>> And actually with this kind of query we really want the most wanted
>> notifications, by the user. So we really can rewrite to
>> order by users.priority desc, id asc, notifications.priority desc
>> according to business logic.
>
> You can rewrite it with LATERAL to trick the planner into sorting each
> user's notifications separately. This should give you the nestloop plan
> you expect:
>
> SELECT *
> FROM users,
> LATERAL (
>     SELECT * FROM notifications
>     WHERE notifications.user_id = users.id
>     ORDER BY notifications.priority DESC
> ) AS notifications
> ORDER BY users.priority DESC, users.id

Thank you very much. It would be great if Postgres could do this
transformation automatically.

> There's a partial sort patch in the current CommitFest, which would solve
> the problem partially (it could use the index on users, but the
> notifications sort would have to be done in memory still).
> https://commitfest.postgresql.org/action/patch_view?id=1368
>
> Regards,
> Marti
[PERFORM] two table join with order by on both tables attributes
Hello,

Suppose you have two very simple tables with an FK dependency, by which we
join them, and another attribute for sorting, like this:

  select * from users join notifications on users.id=notifications.user_id
  ORDER BY users.priority desc, notifications.priority desc limit 10;

A very typical web query. No matter which composite indexes I try,
PostgreSQL cannot make an efficient nested loop plan using indexes. It
chooses all sorts of seq scans and hash joins or merge joins, always with a
sort node and then a limit 10. Neither plan provides acceptable performance,
and the tables tend to grow =\

Can anybody suggest something or explain this behavior?
Re: [PERFORM] two table join with order by on both tables attributes
> My question was about the fact that you cannot get fast execution of this
> kind of query in PostgreSQL. With any runtime configuration you just
> switch from seq scan and hash join to merge join, and then you have a sort
> node.
>
> In my understanding, I need to have two indexes on
> users(priority desc, id) and notifications(user_id, priority desc);
> then PostgreSQL would choose a nested loop and get sorted data from the
> indexes. But it won't. I don't understand why.
>
> Do you have any schema and GUCs which perform this kind of query well?
>
> Sorry for top posting.

Can you explain why a nested loop is best for your data? Given my
understanding of the expected priority cardinality, I would expect your
ORDER BY to be extremely inefficient and not all that compatible with a
nested loop approach.

You can use the various parameters listed on this page to force the desired
plan, and then provide EXPLAIN ANALYZE results for the various executed
plans and compare them:
http://www.postgresql.org/docs/9.3/interactive/runtime-config-query.html#RUNTIME-CONFIG-QUERY-ENABLE

And now for the obligatory "read this link":
https://wiki.postgresql.org/wiki/SlowQueryQuestions

If you can show that in fact the nested loop (or some other plan) performs
better than the one chosen by the planner - and can provide data that the
developers can use to replicate the experiment - then improvements can be
made. At worst you will come to understand why the planner is right and can
then explore alternative models.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/two-table-join-with-order-by-on-both-tables-attributes-tp5814135p5814137.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
Re: [PERFORM] two table join with order by on both tables attributes
>> select * from users join notifications on users.id=notifications.user_id
>> ORDER BY users.priority desc, notifications.priority desc limit 10;
>>
>> In my understanding, i need to have two indexes on
>> users(priority desc, id) and notifications(user_id, priority desc),
>> then postgresql would choose nested loop and get sorted data from
>> indexes. But it wont.
>
> Indeed. If you think a bit harder, you'll realize that the plan you
> suggest would *not* produce the sort order requested by this query. It
> would (if I'm not confused myself) produce an ordering like
>     users.priority desc, id asc, notifications.priority desc
> which would only match what the query asks for if there's just a single
> value of id per users.priority value.
>
> Offhand I think that the planner will not recognize a nestloop as
> producing a sort ordering of this kind even if the query did request the
> right ordering. That could perhaps be improved, but I've not seen many if
> any cases where it would be worth the trouble.

Thanks Tom, you are right.

But maybe some sort of skip scan, a la loose index scan, would help, with an
index on notifications(priority desc, user_id)? I know that this is
currently not handled by the native executor. Maybe I can work around it
using a WITH RECURSIVE query?

Also, are there any plans to handle loose index scans in the upcoming
release?
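For reference, a sketch of the loose-index-scan workaround mentioned above,
based on the well-known WITH RECURSIVE pattern for emulating skip scans
(table and column names follow the thread; the exact query shape is an
assumption). It hops from one distinct user_id to the next via an index
whose leading column is user_id, instead of reading every row:

```sql
WITH RECURSIVE skip AS (
    -- seed: the smallest user_id present
    (SELECT user_id FROM notifications ORDER BY user_id LIMIT 1)
    UNION ALL
    -- step: the next user_id strictly greater than the current one;
    -- the subselect returns NULL when none is left, which ends the recursion
    SELECT (SELECT n.user_id
            FROM notifications n
            WHERE n.user_id > skip.user_id
            ORDER BY n.user_id
            LIMIT 1)
    FROM skip
    WHERE skip.user_id IS NOT NULL
)
SELECT user_id FROM skip WHERE user_id IS NOT NULL;
```

Each recursion step is a single index probe, so the cost is proportional to
the number of distinct user_ids rather than the number of rows.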
Re: [PERFORM] two table join with order by on both tables attributes
On 08 Aug 2014, at 03:43, Evgeniy Shishkin <itparan...@gmail.com> wrote:

> Indeed. If you think a bit harder, you'll realize that the plan you
> suggest would *not* produce the sort order requested by this query. It
> would (if I'm not confused myself) produce an ordering like
>     users.priority desc, id asc, notifications.priority desc
> which would only match what the query asks for if there's just a single
> value of id per users.priority value.

And actually, with this kind of query we really want the most-wanted
notifications by user. So per the business logic we really can rewrite to

  order by users.priority desc, id asc, notifications.priority desc

and we would benefit if this case were improved.
[PERFORM] slave wal is ahead of master
Hello,

We have 3 servers with PostgreSQL 9.3.3: one master and two slaves. We run
synchronous replication, and fsync, synchronous_commit and full_page_writes
are all on.

Suddenly the master hung up with a hardware failure - a strange bug in iLO
which we are investigating with HP.

Before the master was rebooted, I ran ps aux on a slave:

  postgres: wal receiver process   streaming 12/F1031DF8

The last message in the slaves' logs was

  2014-03-19 02:41:29.005 GMT,,,7389,,53108c69.1cdd,16029,,2014-02-28 13:17:29 GMT,,0,LOG,0,recovery restart point at 12/DFFBB3E8,last completed transaction was at log time 2014-03-19 02:41:28.886869+00

and then there was silence, because the master hung.

Then the master was rebooted, and the slave wrote in its log:

  2014-03-19 15:36:39.176 GMT,,,7392,,53108c69.1ce0,2,,2014-02-28 13:17:29 GMT,,0,FATAL,XX000,terminating walreceiver due to timeout
  2014-03-19 15:36:39.177 GMT,,,7388,,53108c69.1cdc,6,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,record with zero length at 12/F1031DF8
  2014-03-19 15:36:57.181 GMT,,,12100,,5329b996.2f44,1,,2014-03-19 15:36:54 GMT,,0,FATAL,XX000,could not connect to the primary server: could not connect to server: No route to host
      Is the server running on host 10.162.2.50 and accepting TCP/IP connections on port 5432?

Then the master finally came back, and the slave wrote:

  2014-03-19 15:40:09.389 GMT,,,13121,,5329ba59.3341,1,,2014-03-19 15:40:09 GMT,,0,FATAL,XX000,could not connect to the primary server: FATAL: the database system is starting up
  2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,1,,2014-03-19 15:40:14 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 1
  2014-03-19 15:40:16.468 GMT,,,13136,,5329ba5e.3350,2,,2014-03-19 15:40:14 GMT,,0,FATAL,XX000,could not receive data from WAL stream: ERROR: requested starting point 12/F100 is ahead of the WAL flush position of this server 12/F0FFFCE8

The last message was repeated several times, and then this happened:

  2014-03-19 15:42:04.623 GMT,,,13722,,5329bacc.359a,1,,2014-03-19 15:42:04 GMT,,0,LOG,0,started streaming WAL from primary at 12/F100 on timeline 1
  2014-03-19 15:42:04.628 GMT,,,7388,,53108c69.1cdc,7,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8
  2014-03-19 15:42:04.628 GMT,,,13722,,5329bacc.359a,2,,2014-03-19 15:42:04 GMT,,0,FATAL,57P01,terminating walreceiver process due to administrator command
  2014-03-19 15:42:09.628 GMT,,,7388,,53108c69.1cdc,8,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8
  2014-03-19 15:42:14.628 GMT,,,7388,,53108c69.1cdc,9,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8
  2014-03-19 15:42:19.628 GMT,,,7388,,53108c69.1cdc,10,,2014-02-28 13:17:29 GMT,1/0,0,LOG,0,invalid record length at 12/F1031DF8

and it just repeats forever.

Meanwhile, on the master:

  2014-03-19 15:39:30.957 GMT,,,7115,,5329ba32.1bcb,2,,2014-03-19 15:39:30 GMT,,0,LOG,0,database system was not properly shut down; automatic recovery in progress
  2014-03-19 15:39:30.989 GMT,,,7115,,5329ba32.1bcb,3,,2014-03-19 15:39:30 GMT,,0,LOG,0,redo starts at 12/DFFBB3E8
  2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,4,,2014-03-19 15:39:30 GMT,,0,LOG,0,redo done at 12/F0FFFC38
  2014-03-19 15:39:47.114 GMT,,,7115,,5329ba32.1bcb,5,,2014-03-19 15:39:30 GMT,,0,LOG,0,last completed transaction was at log time 2014-03-19 05:02:29.273138+00
  2014-03-19 15:39:47.115 GMT,,,7115,,5329ba32.1bcb,6,,2014-03-19 15:39:30 GMT,,0,LOG,0,checkpoint starting: end-of-recovery immediate
  2014-03-19 15:40:16.466 GMT,replicator,,7986,10.162.2.52:44336,5329ba5e.1f32,1,idle,2014-03-19 15:40:14 GMT,2/0,0,ERROR,XX000,requested starting point 12/F100 is ahead of the WAL flush position of this server 12/F0FFFCE8,walreceiver

So, both slaves are disconnected from the master, which is somehow behind
its slaves. I decided to promote one of the slaves, so we can have some
snapshot of the data. The relevant logs from this are:

  2014-03-19 16:50:43.118 GMT,,,,,5329cae3.115c,3,,2014-03-19 16:50:43 GMT,,0,LOG,0,redo starts at 12/DFFBB3E8
  2014-03-19 16:50:50.028 GMT,dboperator,postgres,4452,[local],5329caea.1164,1,,2014-03-19 16:50:50 GMT,,0,FATAL,57P03,the database system is starting up
  2014-03-19 16:50:51.128 GMT,,,,,5329cae3.115c,4,,2014-03-19 16:50:43 GMT,,0,LOG,0,invalid contrecord length 5736 at 12/F0FFFC80
  2014-03-19 16:50:51.128 GMT,,,,,5329cae3.115c,5,,2014-03-19 16:50:43 GMT,,0,LOG,0,redo done at 12/F0FFFC38

It is interesting that "redo done at 12/F0FFFC38" appears both on the master
and on the promoted slave.

The main questions are: where is the actual latest data, and how is it
possible that the master is behind its slave with synchronous replication?

Thanks for the help.
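For situations like this, one way to compare WAL positions across the
cluster is with the built-in location functions (these are the 9.3-era
names; this is a generic diagnostic sketch, not something run in the
incident above):

```sql
-- On the master: the current WAL insert position.
SELECT pg_current_xlog_location();

-- On each standby: how far WAL has been received and replayed.
SELECT pg_last_xlog_receive_location(),
       pg_last_xlog_replay_location();

-- On the master, per-standby view of sent/flushed/replayed positions:
SELECT application_name, sent_location, flush_location, replay_location
FROM pg_stat_replication;
```

If a standby's receive location is ahead of the master's current location
after a crash, the master lost WAL it had already streamed, which matches
the "requested starting point ... is ahead of the WAL flush position"
errors in the logs above.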
Re: [PERFORM] Query taking long time
On 07 Mar 2014, at 13:18, acanada <acan...@cnio.es> wrote:

> The table entity2document2 has 30GB.
> In consecutive runs it gets much better... 30 ms approx.

So you are just benchmarking your hard drives' random IOPS.
You need more RAM and faster disks.
Re: [PERFORM] Query taking long time
> Hello Mat,
>
> Setting enable_bitmapscan to off doesn't really help. It gets worse...
>
> x=> SET enable_bitmapscan=off;
> SET
> x=> explain analyze select * from (select * from entity2document2 where name='ranitidine') as a order by a.hepval;
>
>  Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
>    Sort Key: entity2document2.hepval
>    Sort Method: quicksort  Memory: 2301kB
>    ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
>          Index Cond: ((name)::text = 'ranitidine'::text)
>  Total runtime: 79967.705 ms
> (6 rows)
>
> Any other idea?

Please post your hardware configuration. I think that your database is on
disk, and the disks are slow.

> Thank you very much for your help.
> Regards,
> Andrés
>
> El Mar 6, 2014, a las 2:11 PM, desmodemone escribió:
>
>> Il 05/mar/2014 00:36 "Venkata Balaji Nagothi" <vbn...@gmail.com> ha scritto:
>>
>> After looking at the distinct values, yes, the composite index on "name"
>> and "hepval" is not recommended. That it would worsen things is expected.
>> We need to look for another possible workaround. Please drop the above
>> index. Let me see if I can drill further into this.
>>
>> Meanwhile - can you help us know the memory parameters (work_mem,
>> temp_buffers etc.) set? Do you have any other processes affecting this
>> query's performance? Any info about your disk, RAM, CPU would also help.
>>
>> Regards,
>> Venkata Balaji N
>> Fujitsu Australia
>>
>>> On Tue, Mar 4, 2014 at 10:23 PM, acanada <acan...@cnio.es> wrote:
>>>
>>> Hello,
>>> I don't know if this helps to figure out what the problem is, but after
>>> adding the multicolumn index on name and hepval, the performance is even
>>> worse (¿?). Ten times worse...
>>>
>>> explain analyze select * from (select * from entity_compounddict2document where name='progesterone') as a order by a.hepval;
>>>
>>>  Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
>>>    Sort Key: entity_compounddict2document.hepval
>>>    Sort Method: quicksort  Memory: 25622kB
>>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual time=70.789..95519.258 rows=138165 loops=1)
>>>          Recheck Cond: ((name)::text = 'progesterone'::text)
>>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual time=35.174..35.174 rows=138165 loops=1)
>>>                Index Cond: ((name)::text = 'progesterone'::text)
>>>  Total runtime: 95811.838 ms
>>> (8 rows)
>>>
>>> Any ideas please?
>>> Thank you,
>>> Andrés.
>>>
>>> El Mar 4, 2014, a las 12:28 AM, Venkata Balaji Nagothi escribió:
>>>
>>>> On Mon, Mar 3, 2014 at 9:17 PM, acanada <acan...@cnio.es> wrote:
>>>> Hello,
>>>> Thank you for your answer. I have made more changes than a simple
>>>> re-indexing recently. I have moved the sorting field to the table in
>>>> order to avoid the join clause. Now the schema is very simple. The
>>>> query only involves one table:
>>>>
>>>> x=> \d+ entity_compounddict2document;
>>>>          Table "public.entity_compounddict2document"
>>>>    Column    |              Type              | Modifiers | Storage  | Description
>>>>  ------------+--------------------------------+-----------+----------+-------------
>>>>   id         | integer                        | not null  | plain    |
>>>>   document_id| integer                        |           | plain    |
>>>>   name       | character varying(255)         |           | extended |
>>>>   qualifier  | character varying(255)         |           | extended |
>>>>   tagMethod  | character varying(255)         |           | extended |
>>>>   created    | timestamp(0) without time zone |           | plain    |
>>>>   updated    | timestamp(0) without time zone |           | plain    |
>>>>   curation   | integer                        |           | plain    |
>>>>   hepval     | double precision               |           | plain    |
>>>>   cardval    | double precision               |           | plain    |
>>>>   nephval    | double precision
Re: [PERFORM] Query taking long time
On 07 Mar 2014, at 12:46, acanada acan...@cnio.es wrote:

El Mar 7, 2014, a las 10:39 AM, Evgeniy Shishkin escribió:

Hello Mat,
Setting enable_bitmapscan to off doesn't really help. It gets worse...

x=# SET enable_bitmapscan=off;
SET
x=# explain analyze select * from (select * from entity2document2 where name='ranitidine') as a order by a.hepval;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method: quicksort  Memory: 2301kB
   ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
         Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 79967.705 ms
(6 rows)

Any other idea?

Please post your hw configuration. I think that your db is on disk and the disk is slow.

The server has 2 quad-core processors, 10GB of RAM, and the data is located on a 2TB fiber disk. It doesn't seem to be the problem…

And your database size is? Also, do these timings get better in consecutive runs?

Thank you
Andrés

Thank you very much for your help.
Regards,
Andrés

El Mar 6, 2014, a las 2:11 PM, desmodemone escribió:

Il 05/mar/2014 00:36 Venkata Balaji Nagothi vbn...@gmail.com ha scritto:

After looking at the distinct values, yes, the composite index on name and hepval is not recommended. That would make things worse - it's expected. We need to look for another possible workaround. Please drop the above index. Let me see if I can drill further into this. Meanwhile, can you let us know the memory parameters (work_mem, temp_buffers etc.) that are set? Do you have any other processes affecting this query's performance? Any info about your disk, RAM, and CPU would also help.

Regards,
Venkata Balaji N
Sr.
Database Administrator
Fujitsu Australia

On Tue, Mar 4, 2014 at 10:23 PM, acanada acan...@cnio.es wrote:

Hello,
I don't know if this helps to figure out what the problem is, but after adding the multicolumn index on name and hepval, the performance is even worse (¿?). Ten times worse...

explain analyze select * from (select * from entity_compounddict2document where name='progesterone') as a order by a.hepval;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method: quicksort  Memory: 25622kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual time=70.789..95519.258 rows=138165 loops=1)
         Recheck Cond: ((name)::text = 'progesterone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual time=35.174..35.174 rows=138165 loops=1)
               Index Cond: ((name)::text = 'progesterone'::text)
 Total runtime: 95811.838 ms
(8 rows)

Any ideas please?
Thank you,
Andrés.

El Mar 4, 2014, a las 12:28 AM, Venkata Balaji Nagothi escribió:

On Mon, Mar 3, 2014 at 9:17 PM, acanada acan...@cnio.es wrote:

Hello,
Thank you for your answer. I have made more changes than a simple re-indexing recently. I have moved the sorting field into the table in order to avoid the join clause. Now the schema is very simple. The query only involves one table:

x=# \d+ entity_compounddict2document;
            Table "public.entity_compounddict2document"
   Column    |              Type              | Modifiers | Storage  | Description
-------------+--------------------------------+-----------+----------+-------------
 id          | integer                        | not null  | plain    |
 document_id | integer                        |           | plain    |
 name        | character varying(255)         |           | extended |
 qualifier   | character varying(255)         |           | extended |
 tagMethod   | character varying(255)         |           | extended |
 created     | timestamp(0) without time zone |           | plain    |
 updated     | timestamp(0) without time zone |           | plain    |
 curation
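The two follow-up questions above (total database size, and whether repeated runs speed up once pages are cached) can be answered from psql. A sketch of the usual queries — the relation name is taken from the plan above:

```sql
-- Overall database size vs. available cache:
SELECT pg_size_pretty(pg_database_size(current_database()));

-- Size of the table plus all of its indexes:
SELECT pg_size_pretty(pg_total_relation_size('entity2document2'));

-- How much shared cache the server is configured with:
SHOW shared_buffers;
```

If the table plus indexes dwarfs shared_buffers and RAM, consecutive runs will keep paying for disk reads, which is consistent with the ~80 s index scan shown above.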
Re: [PERFORM] Slow SELECT by primary key? Postgres 9.1.2
On 28.05.2013, at 2:17, John Mudd johnbm...@gmail.com wrote: Thanks again. Well, I have two problems with using the CLUSTER option. It's only temporary, since any updates, depending on how much free space is reserved per page, require re-running the CLUSTER. And my primary concern is that it arbitrarily gives an unfair advantage to the primary key SELECT. Still, it's easy to test, so here are the results. The primary key still loses even with the CLUSTER. Granted it is close, but considering this is now an unfair comparison, it still doesn't make sense to me. How can a search for a specific row that should be fairly straightforward take longer than a search that includes an ORDER BY clause? Well, you do just a regular index scan because of LIMIT 1. And then it is just a matter of index size and table organization. I also don't understand why you consider CLUSTER unfair - the way you populated the table was a natural cluster over my_key. But it bothers me why my_key is always better. Can you please test it on different values but the same rows? Because right now these are two different tuples and you are counting every io.
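One way to do the "count every io" comparison Evgeniy asks for is EXPLAIN with the BUFFERS option (available since 9.0, so usable on this 9.1.2 server). A sketch against the test_select table from this thread:

```sql
-- Run each query twice: the first (cold) run shows 'read' buffers hitting
-- disk, the second (warm) run shows 'hit' buffers served from cache.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM test_select
WHERE key1 = 50
ORDER BY key1, key2, key3, id
LIMIT 1;

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM test_select
WHERE id = 50;
```

Comparing "Buffers: shared read=N" between the two plans shows whether the my_key lookup genuinely touches fewer pages than the primary-key lookup, rather than just landing on pages that happen to be cached.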
test=# CLUSTER test_select USING test_select_pkey;
CLUSTER
test=# VACUUM ANALYZE test_select;
VACUUM

(stopped postgres; reset O/S cache; started postgres)

test=# explain analyze SELECT * FROM test_select WHERE key1 = 50 ORDER BY key1, key2, key3, id LIMIT 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..0.08 rows=1 width=21) (actual time=19.430..19.431 rows=1 loops=1)
   ->  Index Scan using my_key on test_select  (cost=0.00..41938.15 rows=42 width=21) (actual time=19.428..19.428 rows=1 loops=1)
         Index Cond: (key1 = 50)
 Total runtime: 19.526 ms

(stopped postgres; reset O/S cache; started postgres)

test=# explain analyze SELECT * FROM test_select WHERE id = 50;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using test_select_pkey on test_select  (cost=0.00..8.36 rows=1 width=21) (actual time=21.070..21.072 rows=1 loops=1)
   Index Cond: (id = 50)
 Total runtime: 21.178 ms

On Mon, May 27, 2013 at 10:59 AM, Evgeny Shishkin itparan...@gmail.com wrote:

On May 27, 2013, at 6:35 PM, John Mudd johnbm...@gmail.com wrote:

Thanks, that's easy enough to test. Didn't seem to help though.

Ok. And if you CLUSTER tables USING PK?
test=# REINDEX index test_select_pkey;
REINDEX
test=# VACUUM ANALYZE test_select;
VACUUM

(stopped postgres; reset O/S cache; started postgres)

test=# explain analyze SELECT * FROM test_select WHERE key1 = 50 ORDER BY key1, key2, key3, id LIMIT 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..0.08 rows=1 width=21) (actual time=16.368..16.369 rows=1 loops=1)
   ->  Index Scan using my_key on test_select  (cost=0.00..41981.16 rows=501333 width=21) (actual time=16.366..16.366 rows=1 loops=1)
         Index Cond: (key1 = 50)
 Total runtime: 16.444 ms

(stopped postgres; reset O/S cache; started postgres)

test=# explain analyze SELECT * FROM test_select WHERE id = 50;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using test_select_pkey on test_select  (cost=0.00..8.36 rows=1 width=21) (actual time=23.072..23.074 rows=1 loops=1)
   Index Cond: (id = 50)
 Total runtime: 23.192 ms

On Mon, May 27, 2013 at 10:21 AM, Evgeny Shishkin itparan...@gmail.com wrote:

On May 27, 2013, at 6:02 PM, John Mudd johnbm...@gmail.com wrote:

Postgres 9.1.2 on Ubuntu 12.04

Any reason why a select by primary key would be slower than a select that includes an ORDER BY? I was really hoping using the primary key would give me a boost.

You created my_key after data loading, and the PK was there all the time. If you REINDEX the PK, i bet it will be as fast.

I stopped the server and cleared the O/S cache using "sync; echo 3 > /proc/sys/vm/drop_caches" between the runs.

test=# VACUUM ANALYZE test_select;
VACUUM

(stopped postgres; reset O/S cache; started postgres)

test=# explain analyze SELECT * FROM test_select WHERE key1 = 50 ORDER BY key1, key2, key3, id LIMIT 1;
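Since these cold-cache timings mostly measure how many index and heap pages each lookup pulls from disk, comparing the on-disk sizes of the two indexes supports the "index size and table organization" explanation. A sketch (index names taken from the plans above):

```sql
-- Compare the physical size of the PK index vs. my_key; an index built
-- incrementally during loading is typically more bloated than one built
-- in bulk afterwards, and REINDEX shrinks it back.
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relname IN ('test_select', 'test_select_pkey', 'my_key')
ORDER BY relname;
```

If test_select_pkey is noticeably larger than my_key before the REINDEX and comparable afterwards, that accounts for the PK lookup needing extra page reads.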
Re: [PERFORM] autovacuum fringe case?
On 23.01.2013, at 20:53, AJ Weber awe...@comcast.net wrote: I have a server that is IO-bound right now (it has 4 cores, and top indicates CPU use rarely hits 25%, but the wait spikes above 25-40% regularly). The server is running postgresql 9.0 and tomcat 6. As I have mentioned in a previous thread, I can't alter the hardware to add disks unfortunately, so I'm going to try and move postgresql off this application server to its own host, but this is a production environment, so in the meantime... Is it possible that some spikes in IO could be attributable to the autovacuum process? Is there a way to check this theory? Try iotop. Would it be advisable (or even permissible to try/test) to disable autovacuum, and schedule a manual vacuumdb in the middle of the night, when this server is mostly idle? Thanks for any tips. I'm in a bit of a jam with my limited hardware. -AJ

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
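Besides iotop, the autovacuum theory can be checked from inside the database before disabling anything. A sketch using the statistics views (column names as in 9.0/9.1, where pg_stat_activity uses procpid/current_query):

```sql
-- Is an autovacuum worker running right now, and on what?
SELECT procpid, current_query
FROM pg_stat_activity
WHERE current_query LIKE 'autovacuum:%';

-- Which tables were autovacuumed recently? Correlate these timestamps
-- with the observed IO-wait spikes.
SELECT relname, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY last_autovacuum DESC NULLS LAST;
```

Rather than disabling autovacuum outright (which risks bloat and, eventually, an anti-wraparound vacuum at the worst time), it can be throttled in postgresql.conf, e.g. by raising autovacuum_vacuum_cost_delay and lowering autovacuum_vacuum_cost_limit so each worker does less IO per second.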