Re: [GENERAL] partitioning query planner almost always scans all tables
Hello,

Your constraint column is of 'timestamp with time zone' and the query uses 'date'. The comparison between them is an operator based on a stable, non-immutable function, so constraint exclusion doesn't work.

SELECT o.oprname, o.oprcode, p.provolatile
  FROM pg_operator o JOIN pg_proc p ON (p.oid = o.oprcode)
 WHERE oprname = '<'
   AND oprleft = 'timestamp with time zone'::regtype
   AND oprright = 'date'::regtype;

 oprname |       oprcode        | provolatile
---------+----------------------+-------------
 <       | timestamptz_lt_date  | s

# '<' is a random selection. Any comparison ops will do.

The following query instead will do what you wanted. Only the right operand of '>=' is changed.

EXPLAIN ANALYZE SELECT * FROM measurement_events
 WHERE measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'
   AND measurement_time >= '2015-01-01 00:00:00+0' LIMIT 1;

The function in the expression used for exclusion is immutable.

SELECT o.oprname, o.oprcode, p.provolatile
  FROM pg_operator o JOIN pg_proc p ON (p.oid = o.oprcode)
 WHERE oprname = '<'
   AND oprleft = 'timestamp with time zone'::regtype
   AND oprright = 'timestamp with time zone'::regtype;

 oprname |    oprcode     | provolatile
---------+----------------+-------------
 <       | timestamptz_lt | i

The details about this are in section "5.9.6 Caveats" of the following page:

http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Jan 2015 06:42:53 -0700, Rob Sargent robjsarg...@gmail.com wrote in 31fed87e-d31b-4cf9-93f8-cc0f131cb...@gmail.com

I don't understand having both UUID and time stamp in your PK? The first is by defn. unique and the second might be.

Sent from my iPhone

On Jan 19, 2015, at 6:12 AM, Spiros Ioannou siv...@inaccess.com wrote:

Hello group, we have a timeseries table, and we tried to partition it by month (with pg_partman). It seems the query planner always reads all tables regardless of WHERE, except when WHERE is equality.

the parent table:

ifms_db=# \dS measurement_events
            Table "public.measurement_events"
        Column         |           Type           | Modifiers
-----------------------+--------------------------+-----------
 measurement_source_id | uuid                     | not null
 measurement_time      | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value     | character varying(200)   | not null
 quality               | character varying(500)   | not null
Indexes:
    "measurement_events_pkey" PRIMARY KEY, btree (measurement_source_id, measurement_time)
Triggers:
    measurement_events_part_trig BEFORE INSERT ON measurement_events FOR EACH ROW EXECUTE PROCEDURE measurement_events_part_trig_func()
Number of child tables: 25 (Use \d+ to list them.)

One of the children tables (2014_3):

ifms_db=# \dS measurement_events_p2014_03
        Table "public.measurement_events_p2014_03"
        Column         |           Type           | Modifiers
-----------------------+--------------------------+-----------
 measurement_source_id | uuid                     | not null
 measurement_time      | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value     | character varying(200)   | not null
 quality               | character varying(500)   | not null
Indexes:
    "measurement_events_p2014_03_pkey" PRIMARY KEY, btree (measurement_source_id, measurement_time)
Check constraints:
    "measurement_events_p2014_03_partition_check" CHECK (measurement_time >= '2014-03-01 00:00:00+02'::timestamp with time zone AND measurement_time < '2014-04-01 00:00:00+03'::timestamp with time zone)
Inherits: measurement_events

The query:

# explain analyze select * from measurement_events where measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND measurement_time >= DATE '2015-01-01' limit 1;

 Limit  (cost=0.00..4.12 rows=1 width=87) (actual time=0.377..0.377 rows=1 loops=1)
   ->  Append  (cost=0.00..2696.08 rows=655 width=87) (actual time=0.376..0.376 rows=1 loops=1)
         ->  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
               Filter: ((measurement_time >= '2015-01-01'::date) AND (measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
         ->  Index Scan using measurement_events_p2014_01_pkey on measurement_events_p2014_01  (cost=0.14..8.16 rows=1 width=966) (actual time=0.005..0.005 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND
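[Editor's note: to make the fix above concrete, here is a sketch against the schema in this thread (not from the original mail). Keeping the literal in the column's own type, e.g. a timestamptz literal, makes the comparison timestamptz-to-timestamptz, whose underlying function is immutable, so exclusion can apply:

-- exclusion can apply: immutable timestamptz-vs-timestamptz comparison
SELECT * FROM measurement_events
 WHERE measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'
   AND measurement_time >= TIMESTAMPTZ '2015-01-01'
 LIMIT 1;

-- no exclusion: timestamptz-vs-date comparison is only stable
-- ... AND measurement_time >= DATE '2015-01-01' ...
]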
[GENERAL] Getting truncated queries from pg_stat_statements
I am trying to get some slow query information and the results from my query are truncated at 2047 characters. Some of my queries are very long, so they get truncated in the select. Is there a way around this? Here is my query:

WITH query_stats AS (
    SELECT query::text,
           (total_time / 1000 / 60) AS total_minutes,
           (total_time / calls) AS average_time,
           calls
      FROM pg_stat_statements
     INNER JOIN pg_database ON pg_database.oid = pg_stat_statements.dbid
     WHERE pg_database.datname = current_database()
)
SELECT query,
       total_minutes,
       average_time,
       calls,
       total_minutes * 100.0 / (SELECT SUM(total_minutes) FROM query_stats) AS total_percent
  FROM query_stats
 WHERE calls >= 10
   AND average_time >= 20
 ORDER BY average_time DESC
 LIMIT 100
Re: [GENERAL] Partitioning
Hi,

19 Jan 2015 06:58:21 -0500, François Beausoleil franc...@teksol.info wrote in cc2fd572-320a-4225-b98c-48f209925...@teksol.info

On 2015-01-18 at 20:58, James Sewell james.sew...@lisasoft.com wrote: Hello, I am using partitioning with around 100 sub-tables. Each sub-table is around 11GB and partitioned on the 'id' column. I have an index on the id column on each sub-table. Is it possible to get a query like the following working using constraint exclusion, or am I doomed to do index/sequential scans of every sub-table? I want to select all rows which have an id which is in another query, so something like:

WITH idlist AS (SELECT id FROM othertable)
SELECT id FROM mastertable WHERE id IN (SELECT id FROM idlist);

I am guessing that I am not getting constraint exclusion to work as the planner doesn't know the outcome of my subquery at plan time?

How many rows in idlist? Can you do two queries? Fetch the ids, then call the 2nd query with those values hard-coded in the query. If necessary, and you know where each partition starts, you could sort in the app and query the correct ranges, in multiple queries.

The strategy's effectiveness depends mainly on how many ids come from the othertable. It wins if there are fewer than a certain number, or if they are concentrated in a few partitions; otherwise a straight join will win. The output of EXPLAIN ANALYZE would allow more precise advice.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
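[Editor's note: a sketch of the two-query approach described above, using the table names from the question; the id literals are placeholders:

-- query 1, run first: collect the ids on the client side
SELECT id FROM othertable;

-- query 2: inline the fetched ids as constants, so the planner can
-- apply constraint exclusion at plan time
SELECT id FROM mastertable WHERE id IN (101, 102, 4711);
]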
Re: [GENERAL] Getting truncated queries from pg_stat_statements
Thanks for that tip. I'll check it out.

On Tue, Jan 20, 2015 at 3:12 PM, Melvin Davidson melvin6...@gmail.com wrote: Since you are on 9.3, you might want to consider using PgBadger as a better way to get information about slow queries. http://sourceforge.net/projects/pgbadger/

On Mon, Jan 19, 2015 at 6:41 PM, Tim Uckun timuc...@gmail.com wrote: Ok thanks. I am still on 9.3 so I'll adjust that setting.

On Tue, Jan 20, 2015 at 12:15 PM, Peter Geoghegan peter.geoghega...@gmail.com wrote: On Mon, Jan 19, 2015 at 2:37 PM, Tim Uckun timuc...@gmail.com wrote: I am trying to get some slow query information and the results from my query are truncated at 2047 characters. Some of my queries are very long so they get truncated in the select. Is there a way around this?

Yes. This is fixed in PostgreSQL 9.4, so that query texts can be of virtually unlimited size and still be stored. Otherwise, it depends on your track_activity_query_size setting.

--
Regards,
Peter Geoghegan

--
*Melvin Davidson* I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.
Re: [GENERAL] Simple Atomic Relationship Insert
On 1/19/2015 4:58 PM, Robert DiFalco wrote: Hometowns get selected and possibly inserted in unpredictable ways even from multiple concurrent sessions. The only way I could figure out how to solve it was to force each INSERT into hometowns to be in its own transaction.

So you fixed it - good. In our situation we were never inserting more than one hometown** record in the same transaction - so I see now that my scenario was not the same as yours - but yours evolved to that. (**using hometown for our record type)

On Mon, Jan 19, 2015 at 1:56 PM, Robert DiFalco robert.difa...@gmail.com wrote: I don't think an advisory lock would remove the deadlock.

For 2 or more hometown inserts per transaction - I agree - it won't fix it. Glad to hear you have it fixed.

Roxanne

On Sun, Jan 18, 2015 at 10:33 PM, Roxanne Reid-Bennett r...@tara-lu.com wrote: On 1/16/2015 2:41 AM, Jim Nasby wrote: On 1/15/15 10:57 PM, Roxanne Reid-Bennett wrote: try this: (if you still get deadlocks, uncomment the advisory lock [thanks Daniel] and try again) Logically I suppose it might run faster to do the select, then insert if. I almost always write these as insert first - because it's the more restrictive lock.

CREATE OR REPLACE FUNCTION select_hometown_id(hometown_name VARCHAR)
RETURNS INTEGER AS $BODY$
DECLARE
    v_id integer;
BEGIN
    --perform pg_advisory_xact_lock(hashtext(hometown_name));
    BEGIN
        insert into hometowns (name)
        select hometown_name
         where not exists (select id from hometowns where name = hometown_name)
        returning id into v_id;

That has a race condition. The only safe way to do this (outside of SSI) is using the example code at http://www.postgresql.org/docs/devel/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

And if the advisory lock is used? That presumably creates an exclusive lock on the asset hometown_name. [in most examples given Portland, OR.] Would not any other process that runs (this function) on the same asset have to wait for this specific transaction to commit or roll back - blocking the race condition?

Roxanne (sorry, I was out of town)

--
[At other schools] I think the most common fault in general is to teach students how to pass exams instead of teaching them the science. Donald Knuth
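[Editor's note: the docs page linked above describes a retry loop around SELECT-then-INSERT that traps unique_violation. A sketch adapted to the thread's hometowns example, assuming a unique constraint on hometowns.name; illustrative, not the poster's code:

CREATE OR REPLACE FUNCTION select_hometown_id(hometown_name VARCHAR)
RETURNS INTEGER AS $BODY$
DECLARE
    v_id integer;
BEGIN
    LOOP
        -- First look for an existing row.
        SELECT id INTO v_id FROM hometowns WHERE name = hometown_name;
        IF found THEN
            RETURN v_id;
        END IF;
        -- Not there, so try to insert it.  If a concurrent session
        -- inserts the same name first, the unique index raises
        -- unique_violation and we loop back to the SELECT.
        BEGIN
            INSERT INTO hometowns (name) VALUES (hometown_name)
            RETURNING id INTO v_id;
            RETURN v_id;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing; retry the SELECT
        END;
    END LOOP;
END;
$BODY$ LANGUAGE plpgsql;
]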
Re: [GENERAL] Getting truncated queries from pg_stat_statements
On Mon, Jan 19, 2015 at 2:37 PM, Tim Uckun timuc...@gmail.com wrote: I am trying to get some slow query information and the results from my query are truncated at 2047 characters. Some of my queries are very long so they get truncated in the select. Is there a way around this?

Yes. This is fixed in PostgreSQL 9.4, so that query texts can be of virtually unlimited size and still be stored. Otherwise, it depends on your track_activity_query_size setting.

--
Regards,
Peter Geoghegan
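[Editor's note: on 9.3 and earlier the setting mentioned above caps stored query texts. A sketch of the workaround — the value here is an arbitrary example, and changing this parameter requires a server restart:

# postgresql.conf
track_activity_query_size = 16384   # bytes kept per statement text; default is 1024
]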
Re: [GENERAL] Getting truncated queries from pg_stat_statements
Ok thanks. I am still on 9.3 so I'll adjust that setting. On Tue, Jan 20, 2015 at 12:15 PM, Peter Geoghegan peter.geoghega...@gmail.com wrote: On Mon, Jan 19, 2015 at 2:37 PM, Tim Uckun timuc...@gmail.com wrote: I am trying to get some slow query information and the results from my query are truncated at 2047 characters. Some of my queries are very long so they get truncated in the select. Is there a way around this? Yes. This is fixed in PostgreSQL 9.4, so that query texts can be of virtually unlimited size and still be stored. Otherwise, it depends on your track_activity_query_size setting. -- Regards, Peter Geoghegan
Re: [GENERAL] Getting truncated queries from pg_stat_statements
Since you are on 9.3, you might want to consider using PgBadger as a better way to get information about slow queries.

http://sourceforge.net/projects/pgbadger/

On Mon, Jan 19, 2015 at 6:41 PM, Tim Uckun timuc...@gmail.com wrote: Ok thanks. I am still on 9.3 so I'll adjust that setting.

On Tue, Jan 20, 2015 at 12:15 PM, Peter Geoghegan peter.geoghega...@gmail.com wrote: On Mon, Jan 19, 2015 at 2:37 PM, Tim Uckun timuc...@gmail.com wrote: I am trying to get some slow query information and the results from my query are truncated at 2047 characters. Some of my queries are very long so they get truncated in the select. Is there a way around this?

Yes. This is fixed in PostgreSQL 9.4, so that query texts can be of virtually unlimited size and still be stored. Otherwise, it depends on your track_activity_query_size setting.

--
Regards,
Peter Geoghegan

--
*Melvin Davidson* I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.
Re: [GENERAL] Partitioning
Yeah, definitely not staying resident once read, although the machine does have 256GB of memory so some will persist in the OS cache.

Actually this brings up another question: if I have a unique integer index of 2.5GB, what percentage would I expect to read for a value that was higher or lower than all index values?

Cheers,
James

On Monday, 19 January 2015, John R Pierce pie...@hogranch.com wrote: On 1/18/2015 11:13 PM, James Sewell wrote: Each index is about 2.5GB, I suspect I am trying to read these into memory in their entirety.

an 11GB table with a (presumably integer) primary key requires a 2.5GB index? 100 of these would need 250GB of shared_buffers to stay resident, not likely.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast

--
James Sewell, PostgreSQL Team Lead / Solutions Architect
Level 2, 50 Queen St, Melbourne VIC 3000
P (+61) 3 8370 8000  W www.lisasoft.com  F (+61) 3 8370 8099
Re: [GENERAL] Partitioning
Hi,

At Mon, 19 Jan 2015 00:14:55 -0800, John R Pierce pie...@hogranch.com wrote in 54bcbcff.5040...@hogranch.com

On 1/18/2015 11:59 PM, James Sewell wrote: Actually this brings up another question, if I have a unique integer index of 2.5GB, what percentage would I expect to read for a value that was higher or lower than all index values?

a couple of 8k blocks. It's a b-tree.

Yeah, scanning the children and reading the highest/lowest values for each of them would make it faster than the first single query.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [GENERAL] Partitioning
On 1/18/2015 11:59 PM, James Sewell wrote: Actually this brings up another question, if I have a unique integer index of 2.5GB, what percentage would I expect to read for a value that was higher or lower than all index values?

a couple of 8k blocks. It's a b-tree.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast
[GENERAL] splitting up tables based on read/write frequency of columns
This is really a theoretical/anecdotal question, as I'm not at a scale yet where this would be measurable. I want to investigate while this is fresh in my mind...

I recall reading that unless a row has columns that are TOASTed, an `UPDATE` is essentially an `INSERT + DELETE`, with the previous row marked for vacuuming.

A few of my tables have the following characteristics:
- The Primary Key has many other tables/columns that FKEY onto it.
- Many columns (30+) of small data size
- Most columns (90%) are 1 WRITE(UPDATE) for 1000 READS
- Some columns (10%) do a bit of internal bookkeeping and are 1 WRITE(UPDATE) for 50 READS

Has anyone done testing/benchmarking on potential efficiency/savings by consolidating the frequent UPDATE columns into their own table?
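[Editor's note: for concreteness, a minimal sketch of the split being asked about, with hypothetical names — a narrow one-to-one side table keyed by the parent's PK:

-- rarely updated, frequently read columns stay on the parent
CREATE TABLE account (
    id   bigint PRIMARY KEY,
    name text NOT NULL
    -- ... the other mostly-static columns ...
);

-- frequently updated bookkeeping columns move to a narrow table,
-- so each UPDATE rewrites only this small row version
CREATE TABLE account_counters (
    account_id bigint PRIMARY KEY REFERENCES account(id),
    view_count integer NOT NULL DEFAULT 0,
    last_seen  timestamptz
);

UPDATE account_counters
   SET view_count = view_count + 1, last_seen = now()
 WHERE account_id = 42;

A possible side benefit: with fewer indexed columns on the narrow table, such updates are more likely to qualify as HOT updates.]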
Re: [GENERAL] Simple Atomic Relationship Insert
Hometowns get selected and possibly inserted in unpredictable ways, even from multiple concurrent sessions. The only way I could figure out how to solve it was to force each INSERT into hometowns to be in its own transaction.

On Mon, Jan 19, 2015 at 1:56 PM, Robert DiFalco robert.difa...@gmail.com wrote: I don't think an advisory lock would remove the deadlock.

On Sun, Jan 18, 2015 at 10:33 PM, Roxanne Reid-Bennett r...@tara-lu.com wrote: On 1/16/2015 2:41 AM, Jim Nasby wrote: On 1/15/15 10:57 PM, Roxanne Reid-Bennett wrote: try this: (if you still get deadlocks, uncomment the advisory lock [thanks Daniel] and try again) Logically I suppose it might run faster to do the select, then insert if. I almost always write these as insert first - because it's the more restrictive lock.

CREATE OR REPLACE FUNCTION select_hometown_id(hometown_name VARCHAR)
RETURNS INTEGER AS $BODY$
DECLARE
    v_id integer;
BEGIN
    --perform pg_advisory_xact_lock(hashtext(hometown_name));
    BEGIN
        insert into hometowns (name)
        select hometown_name
         where not exists (select id from hometowns where name = hometown_name)
        returning id into v_id;

That has a race condition. The only safe way to do this (outside of SSI) is using the example code at http://www.postgresql.org/docs/devel/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

And if the advisory lock is used? That presumably creates an exclusive lock on the asset hometown_name. [in most examples given Portland, OR.] Would not any other process that runs (this function) on the same asset have to wait for this specific transaction to commit or roll back - blocking the race condition?

Roxanne (sorry, I was out of town)
Re: [GENERAL] Simple Atomic Relationship Insert
I don't think an advisory lock would remove the deadlock.

On Sun, Jan 18, 2015 at 10:33 PM, Roxanne Reid-Bennett r...@tara-lu.com wrote: On 1/16/2015 2:41 AM, Jim Nasby wrote: On 1/15/15 10:57 PM, Roxanne Reid-Bennett wrote: try this: (if you still get deadlocks, uncomment the advisory lock [thanks Daniel] and try again) Logically I suppose it might run faster to do the select, then insert if. I almost always write these as insert first - because it's the more restrictive lock.

CREATE OR REPLACE FUNCTION select_hometown_id(hometown_name VARCHAR)
RETURNS INTEGER AS $BODY$
DECLARE
    v_id integer;
BEGIN
    --perform pg_advisory_xact_lock(hashtext(hometown_name));
    BEGIN
        insert into hometowns (name)
        select hometown_name
         where not exists (select id from hometowns where name = hometown_name)
        returning id into v_id;

That has a race condition. The only safe way to do this (outside of SSI) is using the example code at http://www.postgresql.org/docs/devel/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

And if the advisory lock is used? That presumably creates an exclusive lock on the asset hometown_name. [in most examples given Portland, OR.] Would not any other process that runs (this function) on the same asset have to wait for this specific transaction to commit or roll back - blocking the race condition?

Roxanne (sorry, I was out of town)
Re: [GENERAL] splitting up tables based on read/write frequency of columns
Hi

I'm pretty sure PostgreSQL can handle this. But since you asked with a theoretical background, it's probably worthwhile to look at column stores (like [1]).

-S.

[1] http://citusdata.github.io/cstore_fdw/

2015-01-19 22:47 GMT+01:00 Jonathan Vanasco postg...@2xlp.com: This is really a theoretical/anecdotal question, as I'm not at a scale yet where this would be measurable. I want to investigate while this is fresh in my mind... I recall reading that unless a row has columns that are TOASTed, an `UPDATE` is essentially an `INSERT + DELETE`, with the previous row marked for vacuuming. A few of my tables have the following characteristics:
- The Primary Key has many other tables/columns that FKEY onto it.
- Many columns (30+) of small data size
- Most columns (90%) are 1 WRITE(UPDATE) for 1000 READS
- Some columns (10%) do a bit of internal bookkeeping and are 1 WRITE(UPDATE) for 50 READS

Has anyone done testing/benchmarking on potential efficiency/savings by consolidating the frequent UPDATE columns into their own table?
Re: [GENERAL] partitioning query planner almost always scans all tables
I don't understand having both UUID and time stamp in your PK? The first is by defn. unique and the second might be.

Sent from my iPhone

On Jan 19, 2015, at 6:12 AM, Spiros Ioannou siv...@inaccess.com wrote:

Hello group, we have a timeseries table, and we tried to partition it by month (with pg_partman). It seems the query planner always reads all tables regardless of WHERE, except when WHERE is equality.

the parent table:

ifms_db=# \dS measurement_events
            Table "public.measurement_events"
        Column         |           Type           | Modifiers
-----------------------+--------------------------+-----------
 measurement_source_id | uuid                     | not null
 measurement_time      | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value     | character varying(200)   | not null
 quality               | character varying(500)   | not null
Indexes:
    "measurement_events_pkey" PRIMARY KEY, btree (measurement_source_id, measurement_time)
Triggers:
    measurement_events_part_trig BEFORE INSERT ON measurement_events FOR EACH ROW EXECUTE PROCEDURE measurement_events_part_trig_func()
Number of child tables: 25 (Use \d+ to list them.)

One of the children tables (2014_3):

ifms_db=# \dS measurement_events_p2014_03
        Table "public.measurement_events_p2014_03"
        Column         |           Type           | Modifiers
-----------------------+--------------------------+-----------
 measurement_source_id | uuid                     | not null
 measurement_time      | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value     | character varying(200)   | not null
 quality               | character varying(500)   | not null
Indexes:
    "measurement_events_p2014_03_pkey" PRIMARY KEY, btree (measurement_source_id, measurement_time)
Check constraints:
    "measurement_events_p2014_03_partition_check" CHECK (measurement_time >= '2014-03-01 00:00:00+02'::timestamp with time zone AND measurement_time < '2014-04-01 00:00:00+03'::timestamp with time zone)
Inherits: measurement_events

The query:

# explain analyze select * from measurement_events where measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND measurement_time >= DATE '2015-01-01' limit 1;

 Limit  (cost=0.00..4.12 rows=1 width=87) (actual time=0.377..0.377 rows=1 loops=1)
   ->  Append  (cost=0.00..2696.08 rows=655 width=87) (actual time=0.376..0.376 rows=1 loops=1)
         ->  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
               Filter: ((measurement_time >= '2015-01-01'::date) AND (measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
         ->  Index Scan using measurement_events_p2014_01_pkey on measurement_events_p2014_01  (cost=0.14..8.16 rows=1 width=966) (actual time=0.005..0.005 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_02_pkey on measurement_events_p2014_02  (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_03_pkey on measurement_events_p2014_03  (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_04_pkey on measurement_events_p2014_04  (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_05_pkey on measurement_events_p2014_05  (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_06_pkey on measurement_events_p2014_06  (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index
[GENERAL] partitioning query planner almost always scans all tables
Hello group, we have a timeseries table, and we tried to partition it by month (with pg_partman). It seems the query planner always reads all tables regardless of WHERE, except when WHERE is equality.

the parent table:

ifms_db=# \dS measurement_events
            Table "public.measurement_events"
        Column         |           Type           | Modifiers
-----------------------+--------------------------+-----------
 measurement_source_id | uuid                     | not null
 measurement_time      | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value     | character varying(200)   | not null
 quality               | character varying(500)   | not null
Indexes:
    "measurement_events_pkey" PRIMARY KEY, btree (measurement_source_id, measurement_time)
Triggers:
    measurement_events_part_trig BEFORE INSERT ON measurement_events FOR EACH ROW EXECUTE PROCEDURE measurement_events_part_trig_func()
Number of child tables: 25 (Use \d+ to list them.)

One of the children tables (2014_3):

ifms_db=# \dS measurement_events_p2014_03
        Table "public.measurement_events_p2014_03"
        Column         |           Type           | Modifiers
-----------------------+--------------------------+-----------
 measurement_source_id | uuid                     | not null
 measurement_time      | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value     | character varying(200)   | not null
 quality               | character varying(500)   | not null
Indexes:
    "measurement_events_p2014_03_pkey" PRIMARY KEY, btree (measurement_source_id, measurement_time)
Check constraints:
    "measurement_events_p2014_03_partition_check" CHECK (measurement_time >= '2014-03-01 00:00:00+02'::timestamp with time zone AND measurement_time < '2014-04-01 00:00:00+03'::timestamp with time zone)
Inherits: measurement_events

The query:

# explain analyze select * from measurement_events where measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND measurement_time >= DATE '2015-01-01' limit 1;

 Limit  (cost=0.00..4.12 rows=1 width=87) (actual time=0.377..0.377 rows=1 loops=1)
   ->  Append  (cost=0.00..2696.08 rows=655 width=87) (actual time=0.376..0.376 rows=1 loops=1)
         ->  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
               Filter: ((measurement_time >= '2015-01-01'::date) AND (measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
         ->  Index Scan using measurement_events_p2014_01_pkey on measurement_events_p2014_01  (cost=0.14..8.16 rows=1 width=966) (actual time=0.005..0.005 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_02_pkey on measurement_events_p2014_02  (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_03_pkey on measurement_events_p2014_03  (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_04_pkey on measurement_events_p2014_04  (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_05_pkey on measurement_events_p2014_05  (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_06_pkey on measurement_events_p2014_06  (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using measurement_events_p2014_07_pkey on measurement_events_p2014_07  (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
               Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
         ->  Index Scan using
Re: [GENERAL] bdr_init_copy fails when starting 2nd BDR node
On 4 January 2015 at 02:52, John Casey john.ca...@innovisors.com wrote: I'm still experiencing similar problems. I'm not certain what parameter you are referring to when you say 'ehost'. Otherwise, I did want to clarify a couple of things. I have tried several combinations, each one fails in various ways. So ...

(1) What is the exact syntax when calling bdr_init_copy from new nodes when your database name is not 'postgres' and your user name is not 'postgres'. Please note if you supply local or remote host/port in the command.

Use a connection string to identify the remote and the local ends, e.g.:

bdr_init_copy --remote-dbname="host=node1 dbname=mydb" \
              --local-dbname="dbname=mydb" \
              -D datadir

Both --remote-dbname and --local-dbname are libpq connection strings.

(2) Should you do a pg_ctl start on the new node before trying to execute bdr_init_copy. If I don't I get the error I posted earlier.

No, you should not and must not start the server before running bdr_init_copy. (In the current development version of BDR this has all gone away, and bdr_init_copy will make a base backup for you.)
Re: [GENERAL] Partitioning
Sorry,

Yeah, scanning the children and reading the highest/lowest values for each of them would make it faster than the first single query.

Mmm.. no. It is no different from querying the parent table. Please ignore the silly thing.

--
Kyotaro Horiguchi
Re: [GENERAL] Partitioning
On 2015-01-18 at 20:58, James Sewell james.sew...@lisasoft.com wrote: Hello, I am using partitioning with around 100 sub-tables. Each sub-table is around 11GB and partitioned on the 'id' column. I have an index on the id column on each sub-table. Is it possible to get a query like the following working using constraint exclusion, or am I doomed to do index/sequential scans of every sub-table? I want to select all rows which have an id which is in another query, so something like:

WITH idlist AS (SELECT id FROM othertable)
SELECT id FROM mastertable WHERE id IN (SELECT id FROM idlist);

I am guessing that I am not getting constraint exclusion to work as the planner doesn't know the outcome of my subquery at plan time?

How many rows in idlist? Can you do two queries? Fetch the ids, then call the 2nd query with those values hard-coded in the query. If necessary, and you know where each partition starts, you could sort in the app and query the correct ranges, in multiple queries.

Hope that helps!
François Beausoleil
Re: [GENERAL] asynchronous commit
Robert DiFalco robert.difa...@gmail.com wrote: I have several tables that I use for logging and real-time stats. These are not critical and since they are a bottleneck I want transactions against them to always be asynchronous. Is there a way to specify this at a table level or do I have to make sure to call set synchronous_commit='off' every time I insert or update to them? And presumably remember to turn it back on again for safety.

I think you can use unlogged tables instead.

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds)
If I was god, I would recompile penguin with --enable-fly. (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
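[Editor's note: a minimal sketch of that suggestion, with a hypothetical table; see the caveat about crash behaviour discussed downthread:

CREATE UNLOGGED TABLE event_log (
    event text        NOT NULL,
    at    timestamptz NOT NULL DEFAULT now()
);
]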
Re: [GENERAL] asynchronous commit
Andreas, I think UNLOGGED would be something different, but I'm not totally clear. However, it seems to me that an unlogged table would simply disappear (be truncated) after a server crash. That means instead of maybe losing a record or two, I could lose a ton of records. But maybe my understanding is off.

On Mon, Jan 19, 2015 at 11:10 AM, Andreas Kretschmer akretsch...@spamfence.net wrote: Robert DiFalco robert.difa...@gmail.com wrote: I have several tables that I use for logging and real-time stats. These are not critical and since they are a bottleneck I want transactions against them to always be asynchronous. Is there a way to specify this at a table level or do I have to make sure to call set synchronous_commit='off' every time I insert or update to them? And presumably remember to turn it back on again for safety.

I think you can use unlogged tables instead.

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds)
If I was god, I would recompile penguin with --enable-fly. (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
[GENERAL] asynchronous commit
I have several tables that I use for logging and real-time stats. These are not critical and since they are a bottleneck I want transactions against them to always be asynchronous. Is there a way to specify this at a table level or do I have to make sure to call set synchronous_commit='off' every time I insert or update to them? And presumably remember to turn it back on again for safety.
Re: [GENERAL] asynchronous commit
On Mon, Jan 19, 2015 at 10:03 AM, Robert DiFalco robert.difa...@gmail.com wrote: I have several tables that I use for logging and real-time stats. These are not critical and since they are a bottleneck I want transactions against them to always be asynchronous. Is there a way to specify this at a table level or do I have to make sure to call set synchronous_commit='off' every time I insert or update to them? And presumably remember to turn it back on again for safety. There is no clean way to do what you want currently. You have to do the bookkeeping yourself. There was discussion about adding an option for asynchronous commit to tables. Then if all the tables changed during a transaction were asynchronous (and a few other conditions were met), the commit would automatically be asynchronous as well. I'd like to see that, but I don't think anyone is actively working on implementing it. Cheers, Jeff
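[Editor's note: a sketch of the per-transaction bookkeeping described above; SET LOCAL reverts automatically at COMMIT or ROLLBACK, so nothing has to be turned back on. stats_log is a hypothetical table:

BEGIN;
SET LOCAL synchronous_commit = off;  -- applies only to this transaction
INSERT INTO stats_log (event, at) VALUES ('login', now());
COMMIT;

If the logging writes go through a dedicated role, the setting can also be attached there instead:

ALTER ROLE stats_writer SET synchronous_commit = off;
]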