Re: [GENERAL] Slow query with join
On Mon, Mar 16, 2015 at 11:50 AM, Marc Watson mark.wat...@jurisconcept.ca wrote:

> Hello all, I am using PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit, as downloaded from EnterpriseDB, running on my dev system under Win 7 64-bit.
> [...]
> However, when I combine the two queries into one, the result set takes 6742 ms:
>
> explain analyze
> select * from v_actor
> where v_actor.actor_id in (select ir_actor_id from f_intervenant_ref
>                            where ir_dos_id = '5226' order by ir_actor_id);

You might want to consider whether the following is acceptable; but it would depend on the relationship between f_intervenant_ref and v_actor:

SELECT *
FROM v_actor
JOIN f_intervenant_ref ON (actor_id = ir_actor_id)
WHERE ir_dos_id = '5226';

IN has issues due to the necessary consideration of possible NULLs in the list. Furthermore, even in your original query there is no value in incorporating an ORDER BY into the IN subquery. I suspect that this second problem is preventing the planner from pushing the subquery down into the view, so it is forced to perform a Merge Semi Join against the full (and thus expensive) view, while the constants in the second query can be pushed down and the planner is able to choose the Nested Loop Left Join over 8 keys (4 rows), which ends up being very fast.

The other option is to use EXISTS:

SELECT *
FROM v_actor
WHERE EXISTS (SELECT 1 FROM f_intervenant_ref
              WHERE actor_id = ir_actor_id AND ir_dos_id = '5226')

> Merge Semi Join  (cost=71.79..108061.92 rows=8 width=1461) (actual time=7884.994..7927.699 rows=4 loops=1)
>   Merge Cond: (actor.id = f_intervenant_ref.ir_actor_id)
>   ->  Merge Left Join  (cost=0.85..554314.28 rows=20670 width=138) (actual time=2.820..7926.001 rows=3072 loops=1)
>   [...]
>   ->  Materialize  (cost=17.28..17.40 rows=8 width=4) (actual time=0.024..0.027 rows=8 loops=1)
>         ->  Sort  (cost=17.28..17.30 rows=8 width=4) (actual time=0.021..0.022 rows=8 loops=1)
>               Sort Key: f_intervenant_ref.ir_actor_id
>               Sort Method: quicksort  Memory: 25kB
>               ->  Index Scan using ir_dos_id_idx on f_intervenant_ref  (cost=0.28..17.16 rows=8 width=4) (actual time=0.012..0.017 rows=8 loops=1)
>                     Index Cond: ((ir_dos_id)::text = '5226'::text)
> Planning time: 0.820 ms
> Execution time: 7927.838 ms
>
> Any suggestions to help me speed this up will be greatly appreciated.

David J.
[GENERAL] How does one make the following psql statement sql-injection resilient?
psql $SERVICE \
  --echo-queries \
  --set=string_input=${1:-ok_to_return} \
  --set=start=${2:-5} \
  --set=end=${3:-10} \
<<'SQL'
SELECT idx
FROM generate_series(1, 20) gs (idx)
WHERE 'short-circuit' != :'string_input'
AND idx BETWEEN :start AND :end;
SQL
# (6 rows)

--set=end=${3:-10 AND false}
# (0 rows)

Am I forced to represent the input as text (using :'end') and then perform a conversion to integer?

Thanks!

David J.
Re: [GENERAL] Group by range in hour of day
On Mar 16, 2015, at 2:22 PM, David G. Johnston david.g.johns...@gmail.com wrote:

> On Mon, Mar 16, 2015 at 3:13 PM, Adrian Klaver adrian.kla...@aklaver.com wrote:
>> On 03/16/2015 02:57 PM, Israel Brewster wrote:
>>> I have a table with two timestamp columns for the start time and end time of each record (call them start and end). I'm trying to figure out if there is a way to group these records by hour of day, that is the record should be included in the group if the hour of the day for the group falls anywhere in the range [start, end]. Obviously each record may well fall into multiple groups under this scenario. The goal here is to figure out, for each hour of the day, a) what is the total number of active records for that hour, and b) what is the total active time for those records during the hour, with an ultimate goal of figuring out the average active time per record per hour.
>>>
>>> So, for a simplified example, if the table contained three records:
>>>
>>>       start       |       end
>>> ------------------+------------------
>>>  2015-03-15 08:15 | 2015-03-15 10:45
>>>  2015-03-15 09:30 | 2015-03-15 10:15
>>>  2015-03-15 10:30 | 2015-03-15 11:30
>>>
>>> Then the results should break out something like this:
>>>
>>>  hour | count | sum
>>> ------+-------+------
>>>     8 |     1 | 0.75
>>>     9 |     2 | 1.5
>>>    10 |     3 | 1.5
>>>    11 |     1 | 0.5
>>>
>>> I can then easily manipulate these values to get my ultimate goal of the average, which would of course always be less than or equal to 1. Is this doable in Postgres? Or would it be a better idea to simply pull the raw data and post-process in code? Thanks.
>>
>> Do not have an answer for you, but a question: What version of Postgres are you on? This will help determine what tools are available to work with.
>
> The following will give you endpoints for your bounds. Version is important since range types could be very useful in this situation - but you'd still need to generate the bounds info regardless.
>
> SELECT *
> FROM (SELECT * FROM generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1 hour'::interval) start (start_ts)) AS s
> CROSS JOIN (SELECT end_ts + '1 hour'::interval AS end_ts
>             FROM generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1 hour'::interval) e (end_ts)) AS e
>
> You would join this using an ON condition with an OR (start BETWEEN [...] OR end BETWEEN [...]) - range logic will be better and you may want to adjust the upper bound by negative 1 (nano-second?) to allow for easier = logic if using BETWEEN.
>
> David J.

Thanks, that is very helpful, but are you sure CROSS JOIN is what you wanted here? Using that, I get a 625-row result set where each row from the first SELECT is paired up with EVERY row from the second select. I would think I would want the first row of the first SELECT paired up with only the first row of the second, the second row of the first paired with the second row of the second, etc. - i.e. 24 start and end bounds. Or am I missing something?

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---
Re: [GENERAL] bdr replication latency monitoring
Steve,

The relevant change was made during the commit of logical decoding to PostgreSQL 9.4, where the field of interest was renamed from 'xmin' to 'catalog_xmin'. It's around then that pg_stat_logical_decoding was renamed to pg_replication_slots too.

To get lag in bytes, use:

SELECT slot_name, database, active,
       pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn)
FROM pg_replication_slots
WHERE plugin = 'bdr';

The catalog_xmin doesn't really reflect lag at all. Replay may have continued past that xid and fully caught up. Additionally, the commit timestamp records for the catalog xmin may be truncated away, rendering its commit time unknown and causing pg_get_transaction_committime(...) to report the epoch 2000-01-01 as the commit time. So using pg_get_transaction_committime on the catalog xmin isn't as useful as it was in earlier versions of BDR.

I don't currently have a good way to get you a sense of replay lag in wall-clock time and will need to get back to you on that one.

Note that we're in the process of updating all that documentation, moving it into the same SGML format used for PostgreSQL's official documentation and putting it in the BDR source tree. Some of the documentation on the wiki has become outdated since 0.7.x as a result. The coming 0.9.x release will bundle the documentation in the source tree and make the wiki docs obsolete. Thanks for your patience in the mean time.

Please bring up any other issues you encounter, as it'll help make sure I and the rest of the team don't miss anything.
On 14 March 2015 at 03:06, Steve Boyle sbo...@connexity.com wrote:

> I'm trying to follow the BDR monitoring docs: https://wiki.postgresql.org/wiki/BDR_Monitoring
>
> My postgres version string is (it's from the 2nd Quadrant repo): PostgreSQL 9.4.0 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit. My BDR plugin is version 0.8.0beta1.
>
> From the docs, I've come up with this query:
>
> select slot_name, plugin, database, active, xmin, pg_get_transaction_committime(xmin)
> FROM pg_replication_slots;
>
> BDR is working. When I run that query, the 'xmin' value is always null, even though there is activity on the database. I do/can get a catalog_xmin value. Should I expect the 'xmin' value to be null? Is there another way to monitor the replication latency when using BDR?
>
> Thanks,
> Steve Boyle

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general

--
Craig Ringer  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[GENERAL] Group by range in hour of day
I have a table with two timestamp columns for the start time and end time of each record (call them start and end). I'm trying to figure out if there is a way to group these records by "hour of day", that is the record should be included in the group if the hour of the day for the group falls anywhere in the range [start, end]. Obviously each record may well fall into multiple groups under this scenario.

The goal here is to figure out, for each hour of the day, a) what is the total number of "active" records for that hour, and b) what is the total "active" time for those records during the hour, with an ultimate goal of figuring out the average active time per record per hour.

So, for a simplified example, if the table contained three records:

      start       |       end
------------------+------------------
 2015-03-15 08:15 | 2015-03-15 10:45
 2015-03-15 09:30 | 2015-03-15 10:15
 2015-03-15 10:30 | 2015-03-15 11:30

Then the results should break out something like this:

 hour | count | sum
------+-------+------
    8 |     1 | 0.75
    9 |     2 | 1.5
   10 |     3 | 1.5
   11 |     1 | 0.5

I can then easily manipulate these values to get my ultimate goal of the average, which would of course always be less than or equal to 1. Is this doable in Postgres? Or would it be a better idea to simply pull the raw data and post-process in code? Thanks.

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---
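[Editorial aside: the requested aggregation can be prototyped outside the database. A minimal Python sketch, using the three sample records above, that computes the per-hour active-record count and total active time (the all-Python approach corresponds to the "post-process in code" option):]

```python
from datetime import datetime, timedelta

# The three sample records from the post.
records = [
    (datetime(2015, 3, 15, 8, 15), datetime(2015, 3, 15, 10, 45)),
    (datetime(2015, 3, 15, 9, 30), datetime(2015, 3, 15, 10, 15)),
    (datetime(2015, 3, 15, 10, 30), datetime(2015, 3, 15, 11, 30)),
]

day = datetime(2015, 3, 15)
result = {}  # hour -> (active-record count, total active hours)
for h in range(24):
    bucket_start = day + timedelta(hours=h)
    bucket_end = bucket_start + timedelta(hours=1)
    for start, end in records:
        # Clip each record's interval to the hour bucket [h:00, h+1:00).
        lo = max(start, bucket_start)
        hi = min(end, bucket_end)
        if lo < hi:
            count, total = result.get(h, (0, 0.0))
            result[h] = (count + 1, total + (hi - lo).total_seconds() / 3600)

print(result)
# {8: (1, 0.75), 9: (2, 1.5), 10: (3, 1.5), 11: (1, 0.5)}
```

This reproduces exactly the hour/count/sum table in the question, which is a useful cross-check for any SQL formulation.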
Re: [GENERAL] Group by range in hour of day
On Mon, Mar 16, 2015 at 4:16 PM, Israel Brewster isr...@ravnalaska.net wrote:

> On Mar 16, 2015, at 2:22 PM, David G. Johnston david.g.johns...@gmail.com wrote:
>> On Mon, Mar 16, 2015 at 3:13 PM, Adrian Klaver adrian.kla...@aklaver.com wrote:
>>> On 03/16/2015 02:57 PM, Israel Brewster wrote:
>>>> I have a table with two timestamp columns for the start time and end time of each record (call them start and end). I'm trying to figure out if there is a way to group these records by hour of day, that is the record should be included in the group if the hour of the day for the group falls anywhere in the range [start, end]. Obviously each record may well fall into multiple groups under this scenario. The goal here is to figure out, for each hour of the day, a) what is the total number of active records for that hour, and b) what is the total active time for those records during the hour, with an ultimate goal of figuring out the average active time per record per hour.
>>>>
>>>> So, for a simplified example, if the table contained three records:
>>>>
>>>>       start       |       end
>>>> ------------------+------------------
>>>>  2015-03-15 08:15 | 2015-03-15 10:45
>>>>  2015-03-15 09:30 | 2015-03-15 10:15
>>>>  2015-03-15 10:30 | 2015-03-15 11:30
>>>>
>>>> Then the results should break out something like this:
>>>>
>>>>  hour | count | sum
>>>> ------+-------+------
>>>>     8 |     1 | 0.75
>>>>     9 |     2 | 1.5
>>>>    10 |     3 | 1.5
>>>>    11 |     1 | 0.5
>>>>
>>>> I can then easily manipulate these values to get my ultimate goal of the average, which would of course always be less than or equal to 1. Is this doable in Postgres? Or would it be a better idea to simply pull the raw data and post-process in code? Thanks.
>>>
>>> Do not have an answer for you, but a question: What version of Postgres are you on? This will help determine what tools are available to work with.
>>
>> The following will give you endpoints for your bounds. Version is important since range types could be very useful in this situation - but you'd still need to generate the bounds info regardless.
>>
>> SELECT *
>> FROM (SELECT * FROM generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1 hour'::interval) start (start_ts)) AS s
>> CROSS JOIN (SELECT end_ts + '1 hour'::interval AS end_ts
>>             FROM generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1 hour'::interval) e (end_ts)) AS e
>>
>> You would join this using an ON condition with an OR (start BETWEEN [...] OR end BETWEEN [...]) - range logic will be better and you may want to adjust the upper bound by negative 1 (nano-second?) to allow for easier = logic if using BETWEEN.
>
> Thanks, that is very helpful, but are you sure CROSS JOIN is what you wanted here? Using that, I get a 625-row result set where each row from the first SELECT is paired up with EVERY row from the second select. I would think I would want the first row of the first SELECT paired up with only the first row of the second, the second row of the first paired with the second row of the second, etc. - i.e. 24 start and end bounds. Or am I missing something?

No, I rushed things... :( Sorry.

My concept is good though, but indeed you want to end up with a table having only 24 rows (for the sample). LATERAL may work here but I haven't had a chance to play with it yet. A simple ordinal column to join on would be sufficient.

David J.
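[Editorial aside: for reference, a sketch of the intended 24-row bounds table that avoids the join entirely by deriving end_ts from start_ts in a single generate_series (dates taken from the example above); this is an alternative to the ordinal-column fix:]

```sql
SELECT start_ts, start_ts + interval '1 hour' AS end_ts
FROM generate_series('2015-03-15 00:00'::timestamptz,
                     '2015-03-15 23:00'::timestamptz,
                     interval '1 hour') AS gs(start_ts);
```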
Re: [GENERAL] pitr archive_command cp fsync
On 3/14/15 3:27 PM, Миша Тюрин wrote:

> Should we add a disclaimer in the PITR documentation about cp and fsync? cp does not fsync, and dd, for example, can do fsync.

Only on some platforms.
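[Editorial aside: as a sketch only (the archive path is a placeholder, and conv=fsync is a GNU coreutils dd option, not available in every dd implementation), an archive_command that forces the copied WAL segment to disk might look like:]

```
archive_command = 'dd if=%p of=/mnt/archive/%f conv=fsync'
```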
[GENERAL] Update using non-existent fields does not throw an error
Greetings!

An update query is apparently succeeding, even though the query refers to fields that do not exist. Here's the query:

update inventory
set x_coordinate = (select x_coordinate from bases where base = '101'),
    y_coordinate = (select y_coordinate from bases where base = '101')
where charge = 100
-- select x_coordinate, y_coordinate from bases where base = '101'

When I run the update query, it tells me that the query succeeded and that four records were updated, which is what I expect. But when I looked at the inventory table, I found that the four records were unchanged. So, I tried to check the values of the base coordinates by running the select statement shown above. That statement threw an error complaining that x_coordinate and y_coordinate did not exist. This is correct; I should have been querying a view that includes those fields. But why didn't the update statement throw an error?

RobR
Re: [GENERAL] Update using non-existent fields does not throw an error
On 16 March 2015 at 17:02, Rob Richardson rdrichard...@rad-con.com wrote:

> Greetings! An update query is apparently succeeding, even though the query refers to fields that do not exist. Here's the query:
>
> update inventory
> set x_coordinate = (select x_coordinate from bases where base = '101'),
>     y_coordinate = (select y_coordinate from bases where base = '101')
> where charge = 100
> -- select x_coordinate, y_coordinate from bases where base = '101'
>
> When I run the update query, it tells me that the query succeeded and that four records were updated, which is what I expect. But when I looked at the inventory table, I found that the four records were unchanged. So, I tried to check the values of the base coordinates by running the select statement shown above. That statement threw an error complaining that x_coordinate and y_coordinate did not exist. This is correct; I should have been querying a view that includes those fields. But why didn't the update statement throw an error?

Because inventory contains those fields.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
Re: [GENERAL] Update using non-existent fields does not throw an error
Rob Richardson wrote:

> An update query is apparently succeeding, even though the query refers to fields that do not exist. Here's the query:
>
> update inventory
> set x_coordinate = (select x_coordinate from bases where base = '101'),
>     y_coordinate = (select y_coordinate from bases where base = '101')
> where charge = 100
> -- select x_coordinate, y_coordinate from bases where base = '101'
>
> When I run the update query, it tells me that the query succeeded and that four records were updated, which is what I expect. But when I looked at the inventory table, I found that the four records were unchanged. So, I tried to check the values of the base coordinates by running the select statement shown above. That statement threw an error complaining that x_coordinate and y_coordinate did not exist. This is correct; I should have been querying a view that includes those fields. But why didn't the update statement throw an error?

That's an old one. Since there is no x_coordinate in bases, the column will refer to x_coordinate from the outer query. So you set x_coordinate and y_coordinate to their old values.

You can avoid problems like that by using column names that are qualified with the table name.

Yours,
Laurenz Albe
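[Editorial aside: the name-resolution behavior Laurenz describes is standard SQL scoping, not a PostgreSQL quirk; it can be reproduced with, for example, SQLite via Python. The minimal schema below is hypothetical, pared down from the question:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (charge INTEGER, x_coordinate INTEGER)")
conn.execute("CREATE TABLE bases (base TEXT, x INTEGER)")  # no x_coordinate here!
conn.execute("INSERT INTO inventory VALUES (100, 1)")
conn.execute("INSERT INTO bases VALUES ('101', 99)")

# bases has no x_coordinate, so the subquery's unqualified reference
# resolves outward to inventory.x_coordinate: each row is set to itself.
conn.execute("""
    UPDATE inventory
    SET x_coordinate = (SELECT x_coordinate FROM bases WHERE base = '101')
    WHERE charge = 100
""")
row = conn.execute("SELECT x_coordinate FROM inventory").fetchone()
print(row)  # (1,) -- reported as updated, but unchanged
```

Qualifying the inner column (bases.x_coordinate) turns the silent no-op into the expected "column does not exist" error.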
Re: [GENERAL] Update using non-existent fields does not throw an error
Thanks very much. Now that you've explained it, it should have been obvious.

RobR

-----Original Message-----
From: Albe Laurenz [mailto:laurenz.a...@wien.gv.at]
Sent: Monday, March 16, 2015 12:21 PM
To: Rob Richardson; pgsql-general@postgresql.org
Subject: RE: Update using non-existent fields does not throw an error

Rob Richardson wrote:
> An update query is apparently succeeding, even though the query refers to fields that do not exist. Here's the query:
>
> update inventory
> set x_coordinate = (select x_coordinate from bases where base = '101'),
>     y_coordinate = (select y_coordinate from bases where base = '101')
> where charge = 100
> -- select x_coordinate, y_coordinate from bases where base = '101'
>
> When I run the update query, it tells me that the query succeeded and that four records were updated, which is what I expect. But when I looked at the inventory table, I found that the four records were unchanged. So, I tried to check the values of the base coordinates by running the select statement shown above. That statement threw an error complaining that x_coordinate and y_coordinate did not exist. This is correct; I should have been querying a view that includes those fields. But why didn't the update statement throw an error?

That's an old one. Since there is no x_coordinate in bases, the column will refer to x_coordinate from the outer query. So you set x_coordinate and y_coordinate to their old values.

You can avoid problems like that by using column names that are qualified with the table name.

Yours,
Laurenz Albe
[GENERAL] how to recovery your database when stops replicating
Hi all,

I'm testing BDR and have a big problem. When it gets corrupted, after some time of working well, I cannot do a recovery, and the only way to get the system working again seems to be to reset everything. That means:

- Stop all nodes.
- Delete the data directories on both nodes.
- Comment out the BDR entries in postgresql.conf on node A.
- Start node A and restore the database.
- Uncomment the BDR entries on node A.
- Start node B with the BDR entries so that it executes the initial copy.
- It works again.

I think there should be a better way to do it. I have some mails from here telling this:

- Execute this SQL on both nodes:

SELECT slot_name, datoid, database, active
FROM pg_replication_slots
WHERE slot_name LIKE 'bdr_'||(SELECT oid FROM pg_database WHERE datname = current_database())||'_%';

- Execute this SQL to be able to do DDL again:

COPY (SELECT * FROM pg_logical_slot_get_binary_changes('bdr_17911_6120567807158814813_1_16385__', NULL, 1, 'interactive', 'true')) TO '/dev/null';

- Reconnect and do this:

SET bdr.skip_ddl_replication = on;

But that method doesn't work for me. Any help? How do you recover your database when it stops replicating?

Thanks in advance

Ruth Patricia Melendo Ventura
Software Engineer
TELTRONIC, S.A.U.
T: +34 976 465656 Ext. 179
F: +34 976 465722
www.teltronic.es
[GENERAL] psql sqlstate return code access
Hi,

I have a requirement to log into a table the completion code for each statement being submitted in a psql script. I've looked around and can't seem to find how to access the SQLSTATE completion code. Is there a system variable in psql available, or some other trick to get it?

For example:

insert into x select ... from y;
insert into log(message, code, timestamp) values ('insert into x', :SQLSTATE, now());

where :SQLSTATE contains the completion code of the prior statement.

Thanks in advance

Doug Little
Re: [GENERAL] psql sqlstate return code access
On Mon, Mar 16, 2015 at 4:23 PM, Little, Doug C doug.lit...@vend.frb.org wrote:

> insert into x select ... from y;
> insert into log(message, code, timestamp) values ('insert into x', :SQLSTATE, now());

I'm pretty sure you have to wrap it in a plpgsql function: http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html

Moreover, if the command executes correctly there is no exception, and the SQLSTATE will be '00000' (successful completion).

Hope this helps.
Luca
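[Editorial aside: as a sketch of what Luca describes, using an anonymous DO block for ad-hoc scripts; note that in PL/pgSQL the SQLSTATE variable is only available inside an exception handler, and the "..." is left from the original example:]

```sql
DO $$
BEGIN
    INSERT INTO x SELECT ... FROM y;
    INSERT INTO log(message, code, timestamp)
    VALUES ('insert into x', '00000', now());   -- success: SQLSTATE 00000
EXCEPTION WHEN OTHERS THEN
    INSERT INTO log(message, code, timestamp)
    VALUES ('insert into x', SQLSTATE, now());  -- SQLSTATE is valid here
END;
$$;
```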
Re: [GENERAL] Postgresql BDR(Bi-Directional Replication) Latency Monitoring
Hi

I just replied to a fairly similar question here: http://www.postgresql.org/message-id/CAMsr+YEqeFXDS=amqbzmopqdfnt+ho50uv-+s854kw9oa-x...@mail.gmail.com and I'll follow up on that thread as it's essentially the same thing.

On 17 March 2015 at 02:27, Bill Brown bi...@consistentstate.com wrote:

> Hi All,
>
> I'm looking at the BDR monitoring documentation at this location: https://wiki.postgresql.org/wiki/BDR_Monitoring
>
> I understand the query in the documentation:
>
> select slot_name, plugin, database, active, xmin, pg_get_transaction_committime(xmin)
> FROM pg_stat_logical_decoding;
>
> should be:
>
> select slot_name, plugin, database, active, xmin, pg_get_transaction_committime(xmin)
> FROM pg_replication_slots;
>
> I am running PostgreSQL 9.4.0 bits on Centos:
>
> postgresql-bdr94-contrib-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
> postgresql-bdr94-libs-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
> postgresql-bdr94-bdr-0.8.0beta1-1_2ndQuadrant.el6.x86_64
> postgresql-bdr94-2ndquadrant-redhat-1.0-2.noarch
> postgresql-bdr94-devel-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
> postgresql-bdr94-server-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
> postgresql-bdr94-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
>
> We have 3 nodes set up. We would like to understand the latency between nodes. When using http://oltpbenchmark.com/ to create a load on one node, I would expect the 'xmin' value not to be null, but it is.
>
> What is the best way to monitor replication latency when using PostgreSQL BDR?
>
> Regards,
> Bill

--
Craig Ringer  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [GENERAL] How does one make the following psql statement sql-injection resilient?
David G. Johnston wrote:

> Thanks! I got the gist even with the typo. I actually pondered about prepare/execute after hitting send. Am I correct in remembering that CREATE TEMP TABLE cannot be prepared? I was using the actual query with CREATE TEMP TABLE and then issuing \copy to dump the result out to the file. The limitation of copy having to be written on a single line makes the intermediary temporary table seem almost a necessity.

CREATE TEMP TABLE AS EXECUTE

--
Álvaro Herrera  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
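[Editorial aside: spelled out against the generate_series example from this thread (statement and table names are hypothetical), Alvaro's suggestion looks like:]

```sql
PREPARE q(int, int) AS
    SELECT idx FROM generate_series(1, 20) gs(idx)
    WHERE idx BETWEEN $1 AND $2;

CREATE TEMP TABLE tmp_result AS EXECUTE q(5, 10);
```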
Re: [GENERAL] How does one make the following psql statement sql-injection resilient?
On Monday, March 16, 2015, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

> David G. Johnston wrote:
>> Thanks! I got the gist even with the typo. I actually pondered about prepare/execute after hitting send. Am I correct in remembering that CREATE TEMP TABLE cannot be prepared? I was using the actual query with CREATE TEMP TABLE and then issuing \copy to dump the result out to the file. The limitation of copy having to be written on a single line makes the intermediary temporary table seem almost a necessity.
>
> CREATE TEMP TABLE AS EXECUTE

Thanks. Though unless I need to work on the temp table, I think:

PREPARE ...;
\copy (EXECUTE ...) TO '~/temp.csv' ...;

gives the best of all worlds.

David J.
Re: [GENERAL] Group by range in hour of day
On Mon, Mar 16, 2015 at 3:13 PM, Adrian Klaver adrian.kla...@aklaver.com wrote:

> On 03/16/2015 02:57 PM, Israel Brewster wrote:
>> I have a table with two timestamp columns for the start time and end time of each record (call them start and end). I'm trying to figure out if there is a way to group these records by hour of day, that is the record should be included in the group if the hour of the day for the group falls anywhere in the range [start, end]. Obviously each record may well fall into multiple groups under this scenario. The goal here is to figure out, for each hour of the day, a) what is the total number of active records for that hour, and b) what is the total active time for those records during the hour, with an ultimate goal of figuring out the average active time per record per hour.
>>
>> So, for a simplified example, if the table contained three records:
>>
>>       start       |       end
>> ------------------+------------------
>>  2015-03-15 08:15 | 2015-03-15 10:45
>>  2015-03-15 09:30 | 2015-03-15 10:15
>>  2015-03-15 10:30 | 2015-03-15 11:30
>>
>> Then the results should break out something like this:
>>
>>  hour | count | sum
>> ------+-------+------
>>     8 |     1 | 0.75
>>     9 |     2 | 1.5
>>    10 |     3 | 1.5
>>    11 |     1 | 0.5
>>
>> I can then easily manipulate these values to get my ultimate goal of the average, which would of course always be less than or equal to 1. Is this doable in Postgres? Or would it be a better idea to simply pull the raw data and post-process in code? Thanks.
>
> Do not have an answer for you, but a question: What version of Postgres are you on? This will help determine what tools are available to work with.

The following will give you endpoints for your bounds. Version is important since range types could be very useful in this situation - but you'd still need to generate the bounds info regardless.

SELECT *
FROM (SELECT * FROM generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1 hour'::interval) start (start_ts)) AS s
CROSS JOIN (SELECT end_ts + '1 hour'::interval AS end_ts
            FROM generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1 hour'::interval) e (end_ts)) AS e

You would join this using an ON condition with an OR (start BETWEEN [...] OR end BETWEEN [...]) - range logic will be better and you may want to adjust the upper bound by negative 1 (nano-second?) to allow for easier = logic if using BETWEEN.

David J.
Re: [GENERAL] How does one make the following psql statement sql-injection resilient?
On 3/16/2015 4:30 PM, David G. Johnston wrote:

> psql $SERVICE \
>   --echo-queries \
>   --set=string_input=${1:-ok_to_return} \
>   --set=start=${2:-5} \
>   --set=end=${3:-10} \
> <<'SQL'
> SELECT idx
> FROM generate_series(1, 20) gs (idx)
> WHERE 'short-circuit' != :'string_input'
> AND idx BETWEEN :start AND :end;
> SQL
> # (6 rows)
>
> --set=end=${3:-10 AND false}
> # (0 rows)
>
> Am I forced to represent the input as text (using :'end') and then perform a conversion to integer? Thanks! David J.

The --set's make it a little complicated. How about:

string_input=${1:-ok_to_return}
start=${2:-5}
end=${3:-10}

psql $SERVICE --echo-queries <<'SQL'
prepare tmp as
SELECT idx
FROM generate_series(1, 20) gs (idx)
WHERE 'short-circuit' != $1 AND idx BETWEEN $2 AND $3;
execute tmp($string_input, $start, $end);
deallocate tmp;
SQL

That's untested, and probably won't work. The $1, $2 and $3 in the prepare need to be passed to psql as-is, but $string_input, $start and $end need to be replaced in bash before it's sent to psql. Maybe use \$1? Docs here: http://www.postgresql.org/docs/9.4/static/sql-prepare.html

-Andy
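[Editorial aside: the underlying principle here - never splice user input into the SQL text; validate it to the expected type and bind it as a parameter - can be illustrated outside psql with Python's sqlite3. The table gs is a hypothetical stand-in for generate_series(1, 20):]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gs (idx INTEGER)")  # stands in for generate_series(1, 20)
conn.executemany("INSERT INTO gs VALUES (?)", [(i,) for i in range(1, 21)])

evil = "10 AND 0"  # same shape as the "10 AND false" payload above

# Splicing the input into the SQL text lets it rewrite the query: no rows.
bad = conn.execute(
    "SELECT idx FROM gs WHERE idx BETWEEN 5 AND " + evil).fetchall()
print(len(bad))  # 0

# Validate to int first, then bind as a parameter: injection is impossible,
# and bad input fails loudly at int() instead of silently changing the query.
try:
    end = int(evil)          # raises ValueError for "10 AND 0"
except ValueError:
    end = 10                 # or reject the request outright
good = conn.execute(
    "SELECT idx FROM gs WHERE idx BETWEEN ? AND ?", (5, end)).fetchall()
print(len(good))  # 6
```

This mirrors the thread's conclusion: represent the input as text, convert it to integer, and let the driver (or PREPARE/EXECUTE in psql's case) handle the binding.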
Re: [GENERAL] Group by range in hour of day
On Mar 16, 2015, at 2:16 PM, Paul Jungwirth p...@illuminatedcomputing.com wrote:

>> I have a table with two timestamp columns for the start time and end time of each record (call them start and end). I'm trying to figure out if there is a way to group these records by hour of day,
>
> I think you can do this by selecting `FROM generate_series(0, 23) s(h)` and then joining to your table based on `h BETWEEN start AND end`. Whenever I need to write a time-series aggregate query I reach for generate_series. Mostly that's so I have output rows even when COUNT(*) would be 0, but here it also means that a row from your data can feed into multiple output rows. I could probably write this out in more detail if you like, but that's the short version. :-)

I think I can work with that :-) Hadn't considered doing a join there, so that's a new approach I can investigate. Thanks!

> Good luck!
> Paul
Re: [GENERAL] Group by range in hour of day
On 03/16/2015 04:16 PM, Israel Brewster wrote: On Mar 16, 2015, at 2:22 PM, David G. Johnston david.g.johns...@gmail.com mailto:david.g.johns...@gmail.com wrote: On Mon, Mar 16, 2015 at 3:13 PM, Adrian Klaver adrian.kla...@aklaver.com mailto:adrian.kla...@aklaver.comwrote: On 03/16/2015 02:57 PM, Israel Brewster wrote: I have a table with two timestamp columns for the start time and end time of each record (call them start and end).I'm trying to figure out if there is a way to group these records by hour of day, that is the record should be included in the group if the hour of the day for the group falls anywhere in the range [start,end]. Obviously each record may well fall into multiple groups under this scenario. The goal here is to figure out, for each hour of the day, a) what is the total number of active records for that hour, and b) what is the total active time for those records during the hour, with an ultimate goal of figuring out the average active time per record per hour. So, for simplified example, if the table contained three records: start | end --__--- 2015-03-15 08:15 | 2015-03-15 10:45 2015-03-15 09:30 | 2015-03-15 10:15 2015-03-15 10:30 | 2015-03-15 11:30 Then the results should break out something like this: hour | count | sum - 8 |1 | 0.75 9 |2 | 1.5 10 |3 | 1.5 11 |1 | 0.5 I can then easily manipulate these values to get my ultimate goal of the average, which would of course always be less than or equal to 1. Is this doable in postgress? Or would it be a better idea to simply pull the raw data and post-process in code? Thanks. Do not have an answer for you, but a question: What version of Postgres are you on? This will help determine what tools are available to work with. The following will give you endpoints for your bounds. Version is important since range types could be very useful in this situation - but you'd still need to generate the bounds info regardless. 
SELECT * FROM (SELECT * FROM generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1 hour'::interval) start (start_ts)) AS s CROSS JOIN (SELECT end_ts + '1 hour'::interval AS end_ts FROM generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1 hour'::interval) e (end_ts)) AS e

You would join this using an ON condition with an OR (start BETWEEN [...] OR end BETWEEN [...]) - range logic will be better, and you may want to adjust the upper bound by negative 1 (nanosecond?) to allow for easier = logic if using BETWEEN.

Thanks, that is very helpful, but are you sure CROSS JOIN is what you wanted here? Using that, I get a 625-row result set where each row from the first SELECT is paired up with EVERY row from the second SELECT. I would think I would want the first row of the first SELECT paired up with only the first row of the second, the second row of the first paired with the second row of the second, etc. - i.e. 24 start and end bounds. Or am I missing something?
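If paired bounds are what is wanted, a single generate_series call can produce them directly and sidestep the CROSS JOIN entirely (untested sketch):

```sql
-- Each row is one [start_ts, end_ts) hour slot; 24 rows for the day.
SELECT start_ts,
       start_ts + '1 hour'::interval AS end_ts
FROM generate_series('2015-03-15 00:00'::timestamptz,
                     '2015-03-15 23:00'::timestamptz,
                     '1 hour'::interval) AS g(start_ts);
```

Note the series stops at 23:00 rather than running to the next midnight, which is what produced 25 rows per side (and hence 625 pairs) in the CROSS JOIN version.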
Given this:

test=> select * from start_end ;
 id |       start_time       |        end_time
----+------------------------+------------------------
  1 | 2015-03-16 08:15:00-07 | 2015-03-16 09:35:00-07
  2 | 2015-03-16 09:15:00-07 | 2015-03-16 11:05:00-07
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07

using Paul's hints I got:

test=> select * from start_end, generate_series(0, 23) as s(h) where h between extract(hour from start_time) and extract(hour from end_time) ;
 id |       start_time       |        end_time        | h
----+------------------------+------------------------+----
  1 | 2015-03-16 08:15:00-07 | 2015-03-16 09:35:00-07 |  8
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07 |  8
  1 | 2015-03-16 08:15:00-07 | 2015-03-16 09:35:00-07 |  9
  2 | 2015-03-16 09:15:00-07 | 2015-03-16 11:05:00-07 |  9
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07 |  9
  2 | 2015-03-16 09:15:00-07 | 2015-03-16 11:05:00-07 | 10
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07 | 10
  2 | 2015-03-16 09:15:00-07 | 2015-03-16 11:05:00-07 | 11
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07 | 11

test=> select h, count(*) from start_end, generate_series(0, 23) as s(h) where h between extract(hour from start_time) and extract(hour from end_time) group by h order by h;
  h | count
----+-------
  8 |     2
  9 |     3
 10 |     2
 11 |     2

--- Israel Brewster Systems Analyst II Ravn Alaska 5245 Airport Industrial Rd Fairbanks, AK 99709
Re: [GENERAL] Slow query with join
Tomas Vondra tomas.von...@2ndquadrant.com writes: On 16.3.2015 19:50, Marc Watson wrote: I hope someone can help me with a problem I'm having when joining a view with a table. The view is somewhat involved, but I can provide the details if necessary First, get rid of the ORDER BY clauses in the subselects - it's completely pointless, and might prevent proper optimization (e.g. replacing the IN() with optimized joins). I'm suspicious that the cause may be an ORDER BY in the view. It's hard to tell when we've not seen the view definition, but I see that both plans we've been shown are going to produce output sorted by actor.id. Maybe that's happenstance, or maybe not. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How does one make the following psql statement sql-injection resilient?
On 3/16/2015 4:45 PM, Andy Colson wrote: On 3/16/2015 4:30 PM, David G. Johnston wrote:

psql $SERVICE \
  --echo-queries \
  --set=string_input=${1:-ok_to_return} \
  --set=start=${2:-5} \
  --set=end=${3:-10} \
  <<SQL
SELECT idx FROM generate_series(1, 20) gs (idx) WHERE 'short-circuit' != :'string_input' AND idx BETWEEN :start AND :end;
SQL

# (6 rows)

--set=end=${3:-10 AND false}

# (0 rows)

Am I forced to represent the input as text (using :'end') and then perform a conversion to integer? Thanks! David J.

The --set's make it a little complicated. How about:

string_input=${1:-ok_to_return}
start=${2:-5}
end=${3:-10}

psql $SERVICE --echo-queries <<SQL
prepare tmp as SELECT idx FROM generate_series(1, 20) gs (idx) WHERE 'short-circuit' != $1 AND idx BETWEEN $2 AND :$3;
execute tmp($string_input, $start, $end);
deallocate tmp;
SQL

That's untested, and probably won't work. The "execute tmp($1, $2, $3)" needs to be passed to psql as-is, but $string_input, $start and $end need to be replaced in bash before it's sent to psql. Maybe use \$1? Docs here: http://www.postgresql.org/docs/9.4/static/sql-prepare.html -Andy

Wow. Sorry. What a mess. "AND idx BETWEEN $2 AND :$3;" should be: "AND idx BETWEEN $2 AND $3;" That's untested, and probably won't work. The "execute tmp($1, $2, $3)" should be: "execute tmp($string_input, $start, $end);" -Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
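Another variant worth noting (untested sketch): declare the parameter types in the PREPARE itself, so a mangled start/end value fails the integer cast instead of being spliced into the statement text:

```sql
-- Typed prepared statement: "10 AND false" would be rejected by the
-- integer parameter rather than altering the WHERE clause.
PREPARE tmp (text, integer, integer) AS
  SELECT idx
  FROM generate_series(1, 20) gs (idx)
  WHERE 'short-circuit' != $1
    AND idx BETWEEN $2 AND $3;
EXECUTE tmp('ok_to_return', 5, 10);
DEALLOCATE tmp;
```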
Re: [GENERAL] Group by range in hour of day
On 03/16/2015 02:57 PM, Israel Brewster wrote: I have a table with two timestamp columns for the start time and end time of each record (call them start and end). I'm trying to figure out if there is a way to group these records by hour of day, that is the record should be included in the group if the hour of the day for the group falls anywhere in the range [start, end]. Obviously each record may well fall into multiple groups under this scenario. The goal here is to figure out, for each hour of the day, a) what is the total number of active records for that hour, and b) what is the total active time for those records during the hour, with an ultimate goal of figuring out the average active time per record per hour. So, for a simplified example, if the table contained three records:

 start            | end
------------------+------------------
 2015-03-15 08:15 | 2015-03-15 10:45
 2015-03-15 09:30 | 2015-03-15 10:15
 2015-03-15 10:30 | 2015-03-15 11:30

Then the results should break out something like this:

 hour | count | sum
------+-------+------
    8 |     1 | 0.75
    9 |     2 | 1.5
   10 |     3 | 1.5
   11 |     1 | 0.5

I can then easily manipulate these values to get my ultimate goal of the average, which would of course always be less than or equal to 1. Is this doable in Postgres? Or would it be a better idea to simply pull the raw data and post-process in code? Thanks.

Do not have an answer for you, but a question: What version of Postgres are you on? This will help determine what tools are available to work with.

--- Israel Brewster Systems Analyst II Ravn Alaska 5245 Airport Industrial Rd Fairbanks, AK 99709 (907) 450-7293 --- -- Adrian Klaver adrian.kla...@aklaver.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Group by range in hour of day
Assuming 3 things:
Table name - test
Column names - start_time, end_time
Added an id column (int) to distinguish each record in the table

You can go with this. (my apologies for formatting issues)

with slots as (
  select * from generate_series(0, 1439) as s(slot)
),
slots_hours as (
  select slot, slot / 60 as hour from slots
),
minutes as (
  select id,
         date_part('hour', start_time) * 60 + date_part('minute', start_time) as start_minute,
         date_part('hour', end_time) * 60 + date_part('minute', end_time) as end_minute
  from test
),
minute_slots as (
  select id, slot, hour
  from minutes
  join slots_hours
    on minutes.start_minute <= slots_hours.slot
   and minutes.end_minute > slots_hours.slot
)
select hour,
       count(*) / 60.0 as sum,
       count(distinct id) as count
from minute_slots
group by hour

I'm certain there are more elegant solutions possible - but you can grasp each step this way. John

On Mon, Mar 16, 2015 at 2:57 PM, Israel Brewster isr...@ravnalaska.net wrote: I have a table with two timestamp columns for the start time and end time of each record (call them start and end). I'm trying to figure out if there is a way to group these records by hour of day, that is the record should be included in the group if the hour of the day for the group falls anywhere in the range [start, end]. Obviously each record may well fall into multiple groups under this scenario. The goal here is to figure out, for each hour of the day, a) what is the total number of active records for that hour, and b) what is the total active time for those records during the hour, with an ultimate goal of figuring out the average active time per record per hour.
So, for a simplified example, if the table contained three records:

 start            | end
------------------+------------------
 2015-03-15 08:15 | 2015-03-15 10:45
 2015-03-15 09:30 | 2015-03-15 10:15
 2015-03-15 10:30 | 2015-03-15 11:30

Then the results should break out something like this:

 hour | count | sum
------+-------+------
    8 |     1 | 0.75
    9 |     2 | 1.5
   10 |     3 | 1.5
   11 |     1 | 0.5

I can then easily manipulate these values to get my ultimate goal of the average, which would of course always be less than or equal to 1. Is this doable in Postgres? Or would it be a better idea to simply pull the raw data and post-process in code? Thanks. --- Israel Brewster Systems Analyst II Ravn Alaska 5245 Airport Industrial Rd Fairbanks, AK 99709 (907) 450-7293 ---
Re: [GENERAL] Slow query with join
On 16.3.2015 19:50, Marc Watson wrote: Hello all, I am using PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit, as downloaded from EnterpriseDB, and it is running on my dev system under Win 7 64-bit. I hope someone can help me with a problem I'm having when joining a view with a table. The view is somewhat involved, but I can provide the details if necessary. A query on the table is quick (16 ms):

explain analyze select ir_actor_id from f_intervenant_ref where ir_dos_id = '5226' order by ir_actor_id;

Sort (cost=17.28..17.30 rows=8 width=4) (actual time=0.032..0.033 rows=8 loops=1)
  Sort Key: ir_actor_id
  Sort Method: quicksort Memory: 25kB
  -> Index Scan using ir_dos_id_idx on f_intervenant_ref (cost=0.28..17.16 rows=8 width=4) (actual time=0.019..0.024 rows=8 loops=1)
       Index Cond: ((ir_dos_id)::text = '5226'::text)
Planning time: 0.180 ms
Execution time: 0.049 ms

..

ISTM the database is applying the IN() condition last, i.e. it executes SELECT * FROM v_actor and then proceeds to filter the result. I'd bet if you measure time for that (SELECT * FROM v_actor) you'll get ~7 seconds.

First, get rid of the ORDER BY clauses in the subselects - it's completely pointless, and might prevent proper optimization (e.g. replacing the IN() with optimized joins). I.e. try this:

SELECT * FROM v_actor
WHERE v_actor.actor_id IN (SELECT ir_actor_id
                           FROM f_intervenant_ref
                           WHERE ir_dos_id = '5226');

I'd also try replacing this with EXISTS:

SELECT * FROM v_actor
WHERE EXISTS (SELECT 1 FROM f_intervenant_ref
              WHERE (actor_id = ir_actor_id) AND (ir_dos_id = '5226'));

or even an explicit join:

SELECT v_actor.*
FROM v_actor
JOIN f_intervenant_ref ON (actor_id = ir_actor_id)
WHERE ir_dos_id = '5226';

That might behave a bit differently if there are multiple f_intervenant_ref rows matching the actor. If that's the case, a simple DISTINCT should fix that.
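For completeness, the join-plus-DISTINCT variant mentioned above would look something like this (untested sketch; only needed if an actor can match more than one f_intervenant_ref row):

```sql
-- DISTINCT collapses the duplicate v_actor rows the join can introduce.
SELECT DISTINCT v_actor.*
FROM v_actor
JOIN f_intervenant_ref ON (actor_id = ir_actor_id)
WHERE ir_dos_id = '5226';
```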
-- Tomas Vondra  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How does one make the following psql statement sql-injection resilient?
On Mon, Mar 16, 2015 at 2:51 PM, Andy Colson a...@squeakycode.net wrote: On 3/16/2015 4:45 PM, Andy Colson wrote: On 3/16/2015 4:30 PM, David G. Johnston wrote:

psql $SERVICE \
  --echo-queries \
  --set=string_input=${1:-ok_to_return} \
  --set=start=${2:-5} \
  --set=end=${3:-10} \
  <<SQL
SELECT idx FROM generate_series(1, 20) gs (idx) WHERE 'short-circuit' != :'string_input' AND idx BETWEEN :start AND :end;
SQL

# (6 rows)

--set=end=${3:-10 AND false}

# (0 rows)

Am I forced to represent the input as text (using :'end') and then perform a conversion to integer? Thanks! David J.

The --set's make it a little complicated. How about:

string_input=${1:-ok_to_return}
start=${2:-5}
end=${3:-10}

psql $SERVICE --echo-queries <<SQL
prepare tmp as SELECT idx FROM generate_series(1, 20) gs (idx) WHERE 'short-circuit' != $1 AND idx BETWEEN $2 AND :$3;
execute tmp($string_input, $start, $end);
deallocate tmp;
SQL

That's untested, and probably won't work. The "execute tmp($1, $2, $3)" needs to be passed to psql as-is, but $string_input, $start and $end need to be replaced in bash before it's sent to psql. Maybe use \$1? Docs here: http://www.postgresql.org/docs/9.4/static/sql-prepare.html -Andy

Wow. Sorry. What a mess. "AND idx BETWEEN $2 AND :$3;" should be: "AND idx BETWEEN $2 AND $3;" That's untested, and probably won't work. The "execute tmp($1, $2, $3)" should be: "execute tmp($string_input, $start, $end);" -Andy

Thanks! I got the gist even with the typo. I actually pondered prepare/execute after hitting send. Am I correct in remembering that CREATE TEMP TABLE cannot be prepared? I was using the actual query with CREATE TEMP TABLE and then issuing \copy to dump the result out to the file. The limitation of copy having to be written on a single line makes the intermediary temporary table seem almost a necessity. I'd rather write the :'start'::integer than go through the prepare/execute cycle... David J.
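The cast-from-text route would look roughly like this inside the script (untested sketch; :'start' and :'end' are the psql variables set with --set, interpolated as quoted string literals and then cast):

```sql
-- An injected value such as "10 AND false" makes the ::integer cast
-- raise an error instead of rewriting the WHERE clause.
SELECT idx
FROM generate_series(1, 20) gs (idx)
WHERE 'short-circuit' != :'string_input'
  AND idx BETWEEN :'start'::integer AND :'end'::integer;
```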
Re: [GENERAL] Group by range in hour of day
On Mar 16, 2015, at 2:13 PM, Adrian Klaver adrian.kla...@aklaver.com wrote: On 03/16/2015 02:57 PM, Israel Brewster wrote: I have a table with two timestamp columns for the start time and end time of each record (call them start and end). I'm trying to figure out if there is a way to group these records by hour of day, that is the record should be included in the group if the hour of the day for the group falls anywhere in the range [start, end]. Obviously each record may well fall into multiple groups under this scenario. The goal here is to figure out, for each hour of the day, a) what is the total number of active records for that hour, and b) what is the total active time for those records during the hour, with an ultimate goal of figuring out the average active time per record per hour. So, for a simplified example, if the table contained three records:

 start            | end
------------------+------------------
 2015-03-15 08:15 | 2015-03-15 10:45
 2015-03-15 09:30 | 2015-03-15 10:15
 2015-03-15 10:30 | 2015-03-15 11:30

Then the results should break out something like this:

 hour | count | sum
------+-------+------
    8 |     1 | 0.75
    9 |     2 | 1.5
   10 |     3 | 1.5
   11 |     1 | 0.5

I can then easily manipulate these values to get my ultimate goal of the average, which would of course always be less than or equal to 1. Is this doable in Postgres? Or would it be a better idea to simply pull the raw data and post-process in code? Thanks.

Do not have an answer for you, but a question: What version of Postgres are you on? This will help determine what tools are available to work with.

Oh, right. Of course. I'm on 9.4.0.

--- Israel Brewster Systems Analyst II Ravn Alaska 5245 Airport Industrial Rd Fairbanks, AK 99709 (907) 450-7293 --- -- Adrian Klaver adrian.kla...@aklaver.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Group by range in hour of day
I have a table with two timestamp columns for the start time and end time of each record (call them start and end). I'm trying to figure out if there is a way to group these records by hour of day, I think you can do this by selecting `FROM generate_series(0, 23) s(h)` and then joining to your table based on `h BETWEEN start AND end`. Whenever I need to write a time-series aggregate query I reach for generate_series. Mostly that's so I have output rows even when COUNT(*) would be 0, but here it also means that a row from your data can feed into multiple output rows. I could probably write this out in more detail if you like, but that's the short version. :-) Good luck! Paul -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Postgresql BDR(Bi-Directional Replication) Latency Monitoring
Hi All, I'm looking at the BDR monitoring documentation at this location: https://wiki.postgresql.org/wiki/BDR_Monitoring

I understand the query in the documentation:

select slot_name, plugin, database, active, xmin, pg_get_transaction_committime(xmin) FROM pg_stat_logical_decoding ;

Should be:

select slot_name, plugin, database, active, xmin, pg_get_transaction_committime(xmin) FROM pg_replication_slots ;

I am running PostgreSQL 9.4.0 bits on CentOS:

postgresql-bdr94-contrib-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-libs-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-bdr-0.8.0beta1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-2ndquadrant-redhat-1.0-2.noarch
postgresql-bdr94-devel-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-server-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64

We have a 3-node setup. We would like to understand the latency between nodes. When using http://oltpbenchmark.com/ to create a load on one node, I would expect the 'xmin' value not to be NULL, but it is. What is the best way to monitor replication latency when using PostgreSQL BDR? Regards, Bill
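One rough approximation worth trying (untested sketch, using the 9.4-era function names): compare each slot's restart_lsn with the current WAL position on the upstream node to get a byte count of how far behind each peer may be:

```sql
-- Bytes of WAL the downstream may still need, per replication slot.
-- Run on the node generating the load; a growing number suggests lag.
SELECT slot_name,
       pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn) AS lag_bytes
FROM pg_replication_slots;
```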
[GENERAL] Slow query with join
Hello all, I am using PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit, as downloaded from EnterpriseDB, and it is running on my dev system under Win 7 64-bit. I hope someone can help me with a problem I'm having when joining a view with a table. The view is somewhat involved, but I can provide the details if necessary. A query on the table is quick (16 ms):

explain analyze select ir_actor_id from f_intervenant_ref where ir_dos_id = '5226' order by ir_actor_id;

Sort (cost=17.28..17.30 rows=8 width=4) (actual time=0.032..0.033 rows=8 loops=1)
  Sort Key: ir_actor_id
  Sort Method: quicksort Memory: 25kB
  -> Index Scan using ir_dos_id_idx on f_intervenant_ref (cost=0.28..17.16 rows=8 width=4) (actual time=0.019..0.024 rows=8 loops=1)
       Index Cond: ((ir_dos_id)::text = '5226'::text)
Planning time: 0.180 ms
Execution time: 0.049 ms

A query on the view is also quick (31 ms), using the results from the previous query on the table:

explain analyze select * from v_actor where v_actor.actor_id in (77170,77170,77184,77184,77185,77185,77186,77186);

Nested Loop Left Join (cost=0.86..385.18 rows=8 width=138) (actual time=2.819..9.652 rows=4 loops=1)
  Join Filter: (actor.type = 'physical'::business.actor_type)
  -> Nested Loop Left Join (cost=0.58..105.16 rows=8 width=114) (actual time=0.015..0.032 rows=4 loops=1)
       Join Filter: (actor.type = 'moral'::business.actor_type)
       -> Index Scan using actor_pkey on actor (cost=0.29..38.66 rows=8 width=78) (actual time=0.010..0.018 rows=4 loops=1)
            Index Cond: (id = ANY ('{77170,77170,77184,77184,77185,77185,77186,77186}'::integer[]))
            Filter: (deleted IS FALSE)
       -> Index Scan using moral_actor_pkey on moral_actor (cost=0.28..8.30 rows=1 width=40) (actual time=0.002..0.002 rows=0 loops=4)
            Index Cond: (id = actor.id)
  -> Index Scan using physical_actor_pkey on physical_actor (cost=0.29..8.30 rows=1 width=28) (actual time=0.003..0.004 rows=1 loops=4)
       Index Cond: (id = actor.id)
  SubPlan 1
    -> Limit (cost=8.30..8.31 rows=1 width=8) (never executed)
         -> Sort (cost=8.30..8.31 rows=1 width=8) (never executed)
              Sort Key: contact.rank
              -> Index Scan using contact_actor_idx on contact (cost=0.28..8.29 rows=1 width=8) (never executed)
                   Index Cond: (actor_id = actor.id)
                   Filter: (NOT deleted)
  SubPlan 2
    -> Limit (cost=8.30..8.31 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=4)
         -> Sort (cost=8.30..8.31 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=4)
              Sort Key: contact_1.rank
              Sort Method: quicksort Memory: 25kB
              -> Index Scan using contact_actor_idx on contact contact_1 (cost=0.28..8.29 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=4)
                   Index Cond: (actor_id = actor.id)
                   Filter: (NOT deleted)
Planning time: 0.721 ms
Execution time: 9.759 ms

However, when I combine the two queries into one, the result set takes 6742 ms:

explain analyze select * from v_actor where v_actor.actor_id in (select ir_actor_id from f_intervenant_ref where ir_dos_id = '5226' order by ir_actor_id);

Merge Semi Join (cost=71.79..108061.92 rows=8 width=1461) (actual time=7884.994..7927.699 rows=4 loops=1)
  Merge Cond: (actor.id = f_intervenant_ref.ir_actor_id)
  -> Merge Left Join (cost=0.85..554314.28 rows=20670 width=138) (actual time=2.820..7926.001 rows=3072 loops=1)
       Merge Cond: (actor.id = physical_actor.id)
       Join Filter: (actor.type = 'physical'::business.actor_type)
       -> Merge Left Join (cost=0.57..1679.17 rows=20670 width=114) (actual time=0.016..4.768 rows=3072 loops=1)
            Merge Cond: (actor.id = moral_actor.id)
            Join Filter: (actor.type = 'moral'::business.actor_type)
            -> Index Scan using actor_pkey on actor (cost=0.29..1275.50 rows=20670 width=78) (actual time=0.008..3.190 rows=3072 loops=1)
                 Filter: (deleted IS FALSE)
            -> Index Scan using moral_actor_pkey on moral_actor (cost=0.28..268.78 rows=5548 width=40) (actual time=0.006..0.006 rows=1 loops=1)
       -> Index Scan using physical_actor_pkey on physical_actor (cost=0.29..725.98 rows=15122 width=28) (actual time=0.003..3.208 rows=3072 loops=1)
       SubPlan 1
         -> Limit (cost=8.30..8.31 rows=1 width=8) (never executed)
              -> Sort (cost=8.30..8.31 rows=1 width=8) (never executed)
                   Sort Key: contact.rank
                   -> Index Scan using contact_actor_idx on contact (cost=0.28..8.29 rows=1 width=8) (never executed)
                        Index Cond: (actor_id = actor.id)
                        Filter: (NOT deleted)
       SubPlan 2
         -> Limit (cost=8.30..8.31 rows=1 width=8) (actual time=0.006..0.006 rows=0 loops=3072)
              -> Sort