Re: [GENERAL] Slow query with join

2015-03-16 Thread David G. Johnston
On Mon, Mar 16, 2015 at 11:50 AM, Marc Watson mark.wat...@jurisconcept.ca
wrote:

 Hello all,
 I am using PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit, as
 downloaded from EnterpriseDB, and it is running on my dev system under Win 7
 64-bit.

[...]



 However, when I combine the two queries into one, the result set takes
 6742 ms:
 explain analyze select * from v_actor where v_actor.actor_id in(select
 ir_actor_id from f_intervenant_ref where ir_dos_id = '5226' order by
 ir_actor_id);


You might want to consider whether the following is acceptable; but it
would depend on the relationship between f_intervenant_ref and v_actor:

SELECT *
FROM v_actor
JOIN f_intervenant_ref ON (actor_id = ir_actor_id)
WHERE ir_dos_id = '5226';


IN has issues due to necessary consideration of possible NULLs in the
list.
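(For instance, SELECT 1 IN (SELECT NULL::int) yields NULL rather than
false, so the planner has to preserve that possibility.)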

Furthermore, even in your original query there is no value in incorporating
an ORDER BY into the IN subquery.

I suspect that this second problem is preventing the planner from pushing
the subquery down into the view and so is forced to perform a Merge Semi
Join against the full (and thus expensive) view while the constants in the
second query can be pushed down and the planner is able to choose the
Nested Loop Left Join over 8 keys (4 rows) which ends up being very fast.

The other option is to use EXISTS:

SELECT *
FROM v_actor
WHERE EXISTS (SELECT 1 FROM f_intervenant_ref WHERE actor_id = ir_actor_id
AND ir_dos_id = '5226')


Merge Semi Join  (cost=71.79..108061.92 rows=8 width=1461) (actual
 time=7884.994..7927.699 rows=4 loops=1)
   Merge Cond: (actor.id = f_intervenant_ref.ir_actor_id)
   ->  Merge Left Join  (cost=0.85..554314.28 rows=20670 width=138)
 (actual time=2.820..7926.001 rows=3072 loops=1)

[...]


   ->  Materialize  (cost=17.28..17.40 rows=8 width=4) (actual
 time=0.024..0.027 rows=8 loops=1)
 ->  Sort  (cost=17.28..17.30 rows=8 width=4) (actual
 time=0.021..0.022 rows=8 loops=1)
   Sort Key: f_intervenant_ref.ir_actor_id
   Sort Method: quicksort  Memory: 25kB
   ->  Index Scan using ir_dos_id_idx on f_intervenant_ref
 (cost=0.28..17.16 rows=8 width=4) (actual time=0.012..0.017 rows=8 loops=1)
 Index Cond: ((ir_dos_id)::text = '5226'::text)
 Planning time: 0.820 ms
 Execution time: 7927.838 ms

 Any suggestions to help me speed this up will be greatly appreciated.


David J.


[GENERAL] How does one make the following psql statement sql-injection resilient?

2015-03-16 Thread David G. Johnston
psql $SERVICE \
 --echo-queries \
 --set=string_input=${1:-ok_to_return} \
 --set=start=${2:-5} \
 --set=end=${3:-10} \
<<'SQL'
SELECT idx
FROM generate_series(1, 20) gs (idx)
WHERE 'short-circuit' != :'string_input'
AND idx BETWEEN :start AND :end;
SQL

# (6 rows)

--set=end=${3:-10 AND false}

# (0 rows)

Am I forced to represent the input as text (using :'end') and then perform
a conversion to integer?
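
That is, presumably something along these lines, where each value is passed
as a quoted literal and cast back server-side:

AND idx BETWEEN :'start'::integer AND :'end'::integer;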

Thanks!

David J.


Re: [GENERAL] Group by range in hour of day

2015-03-16 Thread Israel Brewster
On Mar 16, 2015, at 2:22 PM, David G. Johnston david.g.johns...@gmail.com 
wrote:
 
 On Mon, Mar 16, 2015 at 3:13 PM, Adrian Klaver adrian.kla...@aklaver.com 
 wrote:
 On 03/16/2015 02:57 PM, Israel Brewster wrote:
 I have a table with two timestamp columns for the start time and end
 time of each record (call them start and end). I'm trying to figure out
 if there is a way to group these records by hour of day, that is the
 record should be included in the group if the hour of the day for the
 group falls anywhere in the range [start,end]. Obviously each record may
 well fall into multiple groups under this scenario.
 
 The goal here is to figure out, for each hour of the day, a) what is the
 total number of active records for that hour, and b) what is the total
 active time for those records during the hour, with an ultimate goal
 of figuring out the average active time per record per hour.
 
 So, for simplified example, if the table contained three records:
 
   start  |   end
 ------------------+------------------
 2015-03-15 08:15  |  2015-03-15 10:45
 2015-03-15 09:30  |  2015-03-15 10:15
 2015-03-15 10:30  |  2015-03-15 11:30
 
 
 Then the results should break out something like this:
 
 hour  |  count  |  sum
 ------+---------+-------
   8   |    1    |  0.75
   9   |    2    |  1.5
  10   |    3    |  1.5
  11   |    1    |  0.5
 
 I can then easily manipulate these values to get my ultimate goal of the
 average, which would of course always be less than or equal to 1. Is
 this doable in Postgres? Or would it be a better idea to simply pull
 the raw data and post-process in code? Thanks.
 
 Do not have an answer for you, but a question:
 
 What version of Postgres are you on?
 
 This will help determine what tools are available to work with.
 
 The following will give you endpoints for your bounds.  Version is important 
 since range types could be very useful in this situation - but you'd still 
 need to generate the bounds info regardless.
 
 SELECT * 
 FROM
 (SELECT * FROM generate_series('2015-03-15'::timestamptz, 
 '2015-03-16'::timestamptz, '1 hour'::interval) start (start_ts)) AS s
 CROSS JOIN
 (SELECT end_ts + '1 hour'::interval AS end_ts FROM 
 generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1 
 hour'::interval) e (end_ts)) AS e
 
 You would join this using an ON condition with an OR (start BETWEEN [...] OR 
 end BETWEEN [...]) - range logic will be better and you may want to adjust 
 the upper bound by negative 1 (nano-second?) to allow for easier <= logic 
 if using BETWEEN.
 

Thanks, that is very helpful, but are you sure CROSS JOIN is what you wanted 
here? Using that, I get a 625-row result set where each row from the first 
SELECT is paired up with EVERY row from the second SELECT. I would think I 
would want the first row of the first SELECT paired up with only the first row 
of the second, the second row of the first paired with the second row of the 
second, etc. - i.e. 24 start and end bounds. Or am I missing something?



---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---

 David J.
 



Re: [GENERAL] bdr replication latency monitoring

2015-03-16 Thread Craig Ringer
Steve,

The relevant change was made during the commit of logical decoding to
PostgreSQL 9.4, where the field of interest was renamed from 'xmin' to
'catalog_xmin'. It's around then that pg_stat_logical_decoding was renamed
to pg_replication_slots too.

To get lag in bytes, use:

SELECT slot_name, database, active,
pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn)
FROM pg_replication_slots
WHERE plugin = 'bdr';
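
For a human-readable figure, the same byte count can be wrapped in
pg_size_pretty, e.g.:

SELECT slot_name, database, active,
pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_insert_location(),
restart_lsn)) AS lag
FROM pg_replication_slots
WHERE plugin = 'bdr';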

The catalog_xmin doesn't really reflect lag at all. Replay may have
continued past that xid and fully caught up. Additionally, the commit
timestamp records for the catalog xmin may be truncated away, rendering its
commit time unknown and causing pg_get_transaction_committime(...) to
report the epoch 2000-01-01 as the commit time. So using
pg_get_transaction_committime on the catalog xmin isn't as useful as it was
in earlier versions of BDR. I don't currently have a good way to get you a
sense of replay lag in wall-clock time and will need to get back to you on
that one.


Note that we're in the process of updating all that documentation, moving
it into the same SGML format used for PostgreSQL's official documentation
and putting it in the BDR source tree. Some of the documentation on the
wiki has become outdated since 0.7.x as a result. The coming 0.9.x release
will bundle the documentation in the source tree and make the wiki docs
obsolete.

Thanks for your patience in the meantime. Please bring up any other issues
you encounter, as it'll help make sure I and the rest of the team don't
miss anything.





On 14 March 2015 at 03:06, Steve Boyle sbo...@connexity.com wrote:

 I'm trying to follow the BDR monitoring docs:
 https://wiki.postgresql.org/wiki/BDR_Monitoring

 My postgres version string is (it's from the 2nd Quadrant repo):
 PostgreSQL 9.4.0 on x86_64-unknown-linux-gnu, compiled by gcc
 (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

 My BDR plugin is version 0.8.0beta1

 From the docs, I've come up with this query:
 select slot_name, plugin, database, active, xmin,
 pg_get_transaction_committime(xmin)
 FROM pg_replication_slots ;

 BDR is working.  When I run that query, the 'xmin' value is always null,
 even though there is activity on the database.  I do/can get a catalog_xmin
 value.  Should I expect the 'xmin' value to be null?  Is there another way
 to monitor the replication latency when using BDR?

 Thanks,
 Steve Boyle









-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[GENERAL] Group by range in hour of day

2015-03-16 Thread Israel Brewster
I have a table with two timestamp columns for the start time and end time of
each record (call them start and end). I'm trying to figure out if there is a
way to group these records by "hour of day", that is the record should be
included in the group if the hour of the day for the group falls anywhere in
the range [start,end]. Obviously each record may well fall into multiple
groups under this scenario.

The goal here is to figure out, for each hour of the day, a) what is the
total number of "active" records for that hour, and b) what is the total
"active" time for those records during the hour, with an ultimate goal of
figuring out the average active time per record per hour.

So, for simplified example, if the table contained three records:

  start            |        end
-------------------+-------------------
 2015-03-15 08:15  |  2015-03-15 10:45
 2015-03-15 09:30  |  2015-03-15 10:15
 2015-03-15 10:30  |  2015-03-15 11:30

Then the results should break out something like this:

hour  |  count  |  sum
------+---------+-------
  8   |    1    |  0.75
  9   |    2    |  1.5
 10   |    3    |  1.5
 11   |    1    |  0.5

I can then easily manipulate these values to get my ultimate goal of the
average, which would of course always be less than or equal to 1. Is this
doable in Postgres? Or would it be a better idea to simply pull the raw data
and post-process in code? Thanks.
---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---




Re: [GENERAL] Group by range in hour of day

2015-03-16 Thread David G. Johnston
On Mon, Mar 16, 2015 at 4:16 PM, Israel Brewster isr...@ravnalaska.net
wrote:

 On Mar 16, 2015, at 2:22 PM, David G. Johnston david.g.johns...@gmail.com
 wrote:


 On Mon, Mar 16, 2015 at 3:13 PM, Adrian Klaver adrian.kla...@aklaver.com
 wrote:

 On 03/16/2015 02:57 PM, Israel Brewster wrote:

 I have a table with two timestamp columns for the start time and end
 time of each record (call them start and end). I'm trying to figure out
 if there is a way to group these records by hour of day, that is the
 record should be included in the group if the hour of the day for the
 group falls anywhere in the range [start,end]. Obviously each record may
 well fall into multiple groups under this scenario.

 The goal here is to figure out, for each hour of the day, a) what is the
 total number of active records for that hour, and b) what is the total
 active time for those records during the hour, with an ultimate goal
 of figuring out the average active time per record per hour.

 So, for simplified example, if the table contained three records:

   start  |   end
 ------------------+------------------
 2015-03-15 08:15  |  2015-03-15 10:45
 2015-03-15 09:30  |  2015-03-15 10:15
 2015-03-15 10:30  |  2015-03-15 11:30


 Then the results should break out something like this:

 hour  |  count  |  sum
 ------+---------+-------
   8   |    1    |  0.75
   9   |    2    |  1.5
  10   |    3    |  1.5
  11   |    1    |  0.5

 I can then easily manipulate these values to get my ultimate goal of the
 average, which would of course always be less than or equal to 1. Is
 this doable in Postgres? Or would it be a better idea to simply pull
 the raw data and post-process in code? Thanks.


 Do not have an answer for you, but a question:

 What version of Postgres are you on?

 This will help determine what tools are available to work with.


 The following will give you endpoints for your bounds.  Version is
 important since range types could be very useful in this situation - but
 you'd still need to generate the bounds info regardless.

 SELECT *
 FROM
 (SELECT * FROM generate_series('2015-03-15'::timestamptz,
 '2015-03-16'::timestamptz, '1 hour'::interval) start (start_ts)) AS s
 CROSS JOIN
 (SELECT end_ts + '1 hour'::interval AS end_ts FROM
 generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1
 hour'::interval) e (end_ts)) AS e

 You would join this using an ON condition with an OR (start BETWEEN [...]
 OR end BETWEEN [...]) - range logic will be better and you may want to
 adjust the upper bound by negative 1 (nano-second?) to allow for easier
 <= logic if using BETWEEN.


 Thanks, that is very helpful, but are you sure CROSS JOIN is what you
 wanted here? using that, I get a 625 row result set where each row from the
 first SELECT is paired up with EVERY row from the second select. I would
 think I would want the first row of the first SELECT paired up with only
 the first row of the second, second row of the first paired with the second
 row of the second, etc - i.e. 24 start and end bounds. Or am I missing
 something?


No, I rushed things... :( Sorry.  My concept is good, though; indeed you
want to end up with a table having only 24 rows (for the sample).

LATERAL may work here but I haven't had a chance to play with it yet.  A
simple ordinal column to join on would be sufficient.
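
For example, a sketch deriving both bounds from a single series, which
yields exactly 24 one-hour windows for the sample day:

SELECT start_ts, start_ts + '1 hour'::interval AS end_ts
FROM generate_series('2015-03-15'::timestamptz,
 '2015-03-16'::timestamptz - '1 hour'::interval,
 '1 hour'::interval) AS s (start_ts);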

David J.


Re: [GENERAL] pitr archive_command cp fsync

2015-03-16 Thread Peter Eisentraut
On 3/14/15 3:27 PM, Миша Тюрин wrote:
 should we add disclaimer in pitr documentation about cp and fsync?
 
 cp does not fsync.
 
 and dd for example can do fsync.

only on some platforms




[GENERAL] Update using non-existent fields does not throw an error

2015-03-16 Thread Rob Richardson
Greetings!

An update query is apparently succeeding, even though the query refers to 
fields that do not exist.  Here's the query:

update inventory set
x_coordinate = (select x_coordinate from bases where base = '101'),
y_coordinate = (select y_coordinate from bases where base = '101')
where charge = 100

-- select x_coordinate, y_coordinate from bases where base = '101'

When I run the update query, it tells me that the query succeeded and that four 
records were updated, which is what I expect.  But when I looked at the 
inventory table, I found that the four records were unchanged.  So, I tried to 
check the values of the base coordinates by running the select statement shown 
above.  That statement threw an error complaining that x_coordinate and 
y_coordinate did not exist.  This is correct; I should have been querying a 
view that includes those fields.  But why didn't the update statement throw an 
error?

RobR


Re: [GENERAL] Update using non-existent fields does not throw an error

2015-03-16 Thread Alban Hertroys
On 16 March 2015 at 17:02, Rob Richardson rdrichard...@rad-con.com wrote:
 Greetings!



 An update query is apparently succeeding, even though the query refers to
 fields that do not exist.  Here's the query:



 update inventory set

 x_coordinate = (select x_coordinate from bases where base = '101'),

 y_coordinate = (select y_coordinate from bases where base = '101')

 where charge = 100



 -- select x_coordinate, y_coordinate from bases where base = '101'



 When I run the update query, it tells me that the query succeeded and that
 four records were updated, which is what I expect.  But when I looked at the
 inventory table, I found that the four records were unchanged.  So, I tried
 to check the values of the base coordinates by running the select statement
 shown above.  That statement threw an error complaining that x_coordinate
 and y_coordinate did not exist.  This is correct; I should have been
 querying a view that includes those fields.  But why didn't the update
 statement throw an error?

Because inventory contains those fields.

-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.




Re: [GENERAL] Update using non-existent fields does not throw an error

2015-03-16 Thread Albe Laurenz
Rob Richardson wrote:
 An update query is apparently succeeding, even though the query refers to 
 fields that do not exist.
 Here's the query:
 
 update inventory set
 x_coordinate = (select x_coordinate from bases where base = '101'),
 y_coordinate = (select y_coordinate from bases where base = '101')
 where charge = 100
 
 -- select x_coordinate, y_coordinate from bases where base = '101'
 
 When I run the update query, it tells me that the query succeeded and that 
 four records were updated,
 which is what I expect.  But when I looked at the inventory table, I found 
 that the four records were
 unchanged.  So, I tried to check the values of the base coordinates by 
 running the select statement
 shown above.  That statement threw an error complaining that x_coordinate and 
 y_coordinate did not
 exist.  This is correct; I should have been querying a view that includes 
 those fields.  But why
 didn't the update statement throw an error?

That's an old one.

Since there is no x_coordinate in bases, the column will refer to 
x_coordinate from
the outer query.  So you set x_coordinate and y_coordinate to their old 
values.

You can avoid problems like that by using column names that are qualified with 
the table name.
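
For example, written with qualified names the same statement fails up front 
instead of silently assigning the old values:

update inventory set
x_coordinate = (select bases.x_coordinate from bases where base = '101'),
y_coordinate = (select bases.y_coordinate from bases where base = '101')
where charge = 100;
-- this raises: ERROR:  column bases.x_coordinate does not exist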

Yours,
Laurenz Albe



Re: [GENERAL] Update using non-existent fields does not throw an error

2015-03-16 Thread Rob Richardson
Thanks very much.  Now that you've explained it, it should have been obvious.

RobR

-Original Message-
From: Albe Laurenz [mailto:laurenz.a...@wien.gv.at] 
Sent: Monday, March 16, 2015 12:21 PM
To: Rob Richardson; pgsql-general@postgresql.org
Subject: RE: Update using non-existent fields does not throw an error

Rob Richardson wrote:
 An update query is apparently succeeding, even though the query refers to 
 fields that do not exist.
 Here's the query:
 
 update inventory set
 x_coordinate = (select x_coordinate from bases where base = '101'), 
 y_coordinate = (select y_coordinate from bases where base = '101') 
 where charge = 100
 
 -- select x_coordinate, y_coordinate from bases where base = '101'
 
 When I run the update query, it tells me that the query succeeded and 
 that four records were updated, which is what I expect.  But when I 
 looked at the inventory table, I found that the four records were 
 unchanged.  So, I tried to check the values of the base coordinates by 
 running the select statement shown above.  That statement threw an 
 error complaining that x_coordinate and y_coordinate did not exist.  This is 
 correct; I should have been querying a view that includes those fields.  But 
 why didn't the update statement throw an error?

That's an old one.

Since there is no x_coordinate in bases, the column will refer to 
x_coordinate from the outer query.  So you set x_coordinate and 
y_coordinate to their old values.

You can avoid problems like that by using column names that are qualified with 
the table name.

Yours,
Laurenz Albe



[GENERAL] how to recover your database when it stops replicating

2015-03-16 Thread Ruth Melendo
Hi all,

 

I'm testing BDR and have a big problem. When it gets corrupted, after a time
of working well, I cannot do a recovery, and the only way to get the system
working again seems to be to reset everything. That means:

 

-  Stop all nodes.

-  Delete data directories in both nodes

-  Comment BDR entries in postgresql.conf in node A.

-  Start node A and restore database.

-  Uncomment BDR entries in node A.

-  Start node B with BDR entries so that executes initial copy.

-  It works again.

 

I think there should be a better way to do it. I have some mails from this
list suggesting the following:

 

-  Execute this sql in both nodes:

 

SELECT slot_name, datoid, database, active FROM pg_replication_slots WHERE
slot_name LIKE 'bdr_'||(SELECT oid FROM pg_database WHERE datname =
current_database())||'_%';

 

-  Execute this sql to be able to run DDL again:

COPY (SELECT * FROM
pg_logical_slot_get_binary_changes('bdr_17911_6120567807158814813_1_16385__'
, NULL, 1, 'interactive', 'true')) TO '/dev/null';

 

-  Reconnect  and do this:

SET bdr.skip_ddl_replication = on;

 

 

But that method doesn't work for me. Any help? How do you recover your
database when it stops replicating?

 

Thanks in advance

 

Ruth Patricia Melendo Ventura

Software Engineer

TELTRONIC, S.A.U.

T: +34 976 465656   Ext. 179

F: +34 976 465722 

www.teltronic.es

 


 

Before printing this e-mail please consider your environmental
responsibility. 

*  DISCLAIMER  *

This message is intended exclusively for the named person. It may contain
confidential, proprietary or legally privileged information. No
confidentiality or privilege is waived or lost by any mistransmission. If
you receive this message in error, please immediately delete it and all
copies of it from your system, destroy any hard copies of it and notify the
sender. You must not, directly or indirectly, use, disclose, distribute,
print, or copy any part of this message if you are not the intended
recipient. Any views expressed in this message are those of the individual
sender, except where the message states otherwise and the sender is
authorised to state them to be the views of TELTRONIC. If the addressee of
this message does not consent to the use of internet e-mail, please
communicate it to us immediately.

 



[GENERAL] psql sqlstate return code access

2015-03-16 Thread Little, Doug C
Hi,

I have a requirement to log into a table the completion code for each statement 
being submitted in a psql script.

I've looked around and can't seem to find how to access the SQLSTATE completion 
code. Is there a system variable available in psql, or some other trick to 
get it?

For example

insert into x select ... from y;

insert into log(message, code,timestamp)  values('insert into 
x',:SQLSTATE,now());

where :SQLSTATE contains the completion code of the prior statement.

Thanks in advance



Doug Little




Re: [GENERAL] psql sqlstate return code access

2015-03-16 Thread Luca Ferrari
On Mon, Mar 16, 2015 at 4:23 PM, Little, Doug C
doug.lit...@vend.frb.org wrote:
 insert into x select … from y;



 insert into log(message, code,timestamp)  values('insert into
 x',:SQLSTATE,now());



I'm pretty sure you have to wrap it into a plpgsql function:
http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html
Moreover, if the command executes correctly you would have no
exception, and the SQLSTATE will be '00000' (success).
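
For example, a minimal sketch using a DO block (anonymous plpgsql) and the
hypothetical tables x, y and log from your message; inside an exception
handler plpgsql exposes the special SQLSTATE variable:

DO $$
BEGIN
    INSERT INTO x SELECT * FROM y;
    -- reached only on success; '00000' is the success code
    INSERT INTO log(message, code, timestamp)
    VALUES ('insert into x', '00000', now());
EXCEPTION WHEN OTHERS THEN
    -- SQLSTATE here holds the error code of the failed statement
    INSERT INTO log(message, code, timestamp)
    VALUES ('insert into x', SQLSTATE, now());
END
$$;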

Hope this helps.

Luca




Re: [GENERAL] Postgresql BDR(Bi-Directional Replication) Latency Monitoring

2015-03-16 Thread Craig Ringer
Hi

I just replied to a fairly similar question here:

http://www.postgresql.org/message-id/CAMsr+YEqeFXDS=amqbzmopqdfnt+ho50uv-+s854kw9oa-x...@mail.gmail.com

and I'll follow up on that thread as it's essentially the same thing.

On 17 March 2015 at 02:27, Bill Brown bi...@consistentstate.com wrote:

 Hi All,

 I'm looking at  BDR monitoring documentation at this location
 https://wiki.postgresql.org/wiki/BDR_Monitoring

 I understand the query in documentation:

 select slot_name, plugin, database, active, xmin,
 pg_get_transaction_committime(xmin)
 FROM pg_stat_logical_decoding ;


 Should be:

 select slot_name, plugin, database, active, xmin,
 pg_get_transaction_committime(xmin)
 FROM pg_replication_slots ;



 I am running PostgreSQL 9.4.0 bits on Centos:

 postgresql-bdr94-contrib-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
 postgresql-bdr94-libs-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
 postgresql-bdr94-bdr-0.8.0beta1-1_2ndQuadrant.el6.x86_64
 postgresql-bdr94-2ndquadrant-redhat-1.0-2.noarch
 postgresql-bdr94-devel-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
 postgresql-bdr94-server-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
 postgresql-bdr94-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64


 We have 3 nodes set-up.  We would like to understand the latency between
 nodes.

 When using http://oltpbenchmark.com/ to create a load on one node,  I
 would expect the 'xmin' value not to be null, but it is.

 What is the best way to monitor replication latency when using Postgresql
 BDR?

 Regards,

 Bill




-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [GENERAL] How does one make the following psql statement sql-injection resilient?

2015-03-16 Thread Alvaro Herrera
David G. Johnston wrote:

 Thanks!  I got the gist even with the typo.  I actually pondered about
 prepare/execute after hitting send.  Am I correct in remembering that
 CREATE TEMP TABLE cannot be prepared?  I was using the actual query with
 CREATE TEMP TABLE and then issuing \copy to dump the result out to the
 file.  The limitation of copy to having to be written on a single line
 makes the intermediary temporary table seem almost a necessity.

CREATE TEMP TABLE AS EXECUTE
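
For example (a sketch with placeholder names):

PREPARE q (int) AS SELECT i FROM generate_series(1, $1) AS g (i);
CREATE TEMP TABLE t AS EXECUTE q (10);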

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] How does one make the following psql statement sql-injection resilient?

2015-03-16 Thread David G. Johnston
On Monday, March 16, 2015, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 David G. Johnston wrote:

  Thanks!  I got the gist even with the typo.  I actually pondered about
  prepare/execute after hitting send.  Am I correct in remembering that
  CREATE TEMP TABLE cannot be prepared?  I was using the actual query
 with
  CREATE TEMP TABLE and then issuing \copy to dump the result out to the
  file.  The limitation of copy to having to be written on a single line
  makes the intermediary temporary table seem almost a necessity.

 CREATE TEMP TABLE AS EXECUTE


Thanks.

Though unless I need to work on the temp table I think:

PREPARE ...;
\copy (EXECUTE ...) TO '~/temp.csv' ...;

Gives the best of all worlds.

David J.


Re: [GENERAL] Group by range in hour of day

2015-03-16 Thread David G. Johnston
On Mon, Mar 16, 2015 at 3:13 PM, Adrian Klaver adrian.kla...@aklaver.com
wrote:

 On 03/16/2015 02:57 PM, Israel Brewster wrote:

 I have a table with two timestamp columns for the start time and end
 time of each record (call them start and end). I'm trying to figure out
 if there is a way to group these records by hour of day, that is the
 record should be included in the group if the hour of the day for the
 group falls anywhere in the range [start,end]. Obviously each record may
 well fall into multiple groups under this scenario.

 The goal here is to figure out, for each hour of the day, a) what is the
 total number of active records for that hour, and b) what is the total
 active time for those records during the hour, with an ultimate goal
 of figuring out the average active time per record per hour.

 So, for simplified example, if the table contained three records:

   start  |   end
 ------------------+------------------
 2015-03-15 08:15  |  2015-03-15 10:45
 2015-03-15 09:30  |  2015-03-15 10:15
 2015-03-15 10:30  |  2015-03-15 11:30


 Then the results should break out something like this:

 hour  |  count  |  sum
 ------+---------+-------
   8   |    1    |  0.75
   9   |    2    |  1.5
  10   |    3    |  1.5
  11   |    1    |  0.5

 I can then easily manipulate these values to get my ultimate goal of the
 average, which would of course always be less than or equal to 1. Is
 this doable in Postgres? Or would it be a better idea to simply pull
 the raw data and post-process in code? Thanks.


 Do not have an answer for you, but a question:

 What version of Postgres are you on?

 This will help determine what tools are available to work with.


The following will give you endpoints for your bounds.  Version is
important since range types could be very useful in this situation - but
you'd still need to generate the bounds info regardless.

SELECT *
FROM
(SELECT * FROM generate_series('2015-03-15'::timestamptz,
'2015-03-16'::timestamptz, '1 hour'::interval) start (start_ts)) AS s
CROSS JOIN
(SELECT end_ts + '1 hour'::interval AS end_ts FROM
generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1
hour'::interval) e (end_ts)) AS e

You would join this using an ON condition with an OR (start BETWEEN [...]
OR end BETWEEN [...]) - range logic will be better and you may want to
adjust the upper bound by negative 1 (nano-second?) to allow for easier
<= logic if using BETWEEN.

David J.


Re: [GENERAL] How does one make the following psql statement sql-injection resilient?

2015-03-16 Thread Andy Colson

On 3/16/2015 4:30 PM, David G. Johnston wrote:

psql $SERVICE \
  --echo-queries \
  --set=string_input=${1:-ok_to_return} \
  --set=start=${2:-5} \
  --set=end=${3:-10} \
<<'SQL'
 SELECT idx
 FROM generate_series(1, 20) gs (idx)
 WHERE 'short-circuit' != :'string_input'
 AND idx BETWEEN :start AND :end;
SQL

# (6 rows)

--set=end=${3:-10 AND false}

# (0 rows)

Am I forced to represent the input as text (using :'end') and then
perform a conversion to integer?

Thanks!

David J.




The --set's make it a little complicated.  How about:

string_input=${1:-ok_to_return}
start=${2:-5}
end=${3:-10}

psql $SERVICE --echo-queries <<'SQL'
 prepare tmp as SELECT idx
 FROM generate_series(1, 20) gs (idx)
 WHERE 'short-circuit' != $1
 AND idx BETWEEN $2 AND :$3;

  execute tmp($string_input, $start, $end);
  deallocate tmp;
SQL

That's untested, and probably won't work.  The execute tmp($1, $2, $3) 
needs to be passed to psql as-is, but $string_input, $start and $end need 
to be replaced in bash before it's sent to psql.  Maybe use \$1?


Docs here:

http://www.postgresql.org/docs/9.4/static/sql-prepare.html


-Andy




Re: [GENERAL] Group by range in hour of day

2015-03-16 Thread Israel Brewster
On Mar 16, 2015, at 2:16 PM, Paul Jungwirth p...@illuminatedcomputing.com 
wrote:
 
 I have a table with two timestamp columns for the start time and end
 time of each record (call them start and end).I'm trying to figure out
 if there is a way to group these records by hour of day,
 
 I think you can do this by selecting `FROM generate_series(0, 23) s(h)` and 
 then joining to your table based on `h BETWEEN start AND end`.
 
 Whenever I need to write a time-series aggregate query I reach for 
 generate_series. Mostly that's so I have output rows even when COUNT(*) would 
 be 0, but here it also means that a row from your data can feed into multiple 
 output rows.
 
 I could probably write this out in more detail if you like, but that's the 
 short version. :-)

I think I can work with that :-) Hadn't considered doing a join there, so 
that's a new approach I can investigate. Thanks!
 
 Good luck!
 
 Paul
 
 
 
 
 
 
 





Re: [GENERAL] Group by range in hour of day

2015-03-16 Thread Adrian Klaver

On 03/16/2015 04:16 PM, Israel Brewster wrote:

On Mar 16, 2015, at 2:22 PM, David G. Johnston
david.g.johns...@gmail.com wrote:


On Mon, Mar 16, 2015 at 3:13 PM, Adrian Klaver
adrian.kla...@aklaver.com wrote:

On 03/16/2015 02:57 PM, Israel Brewster wrote:

I have a table with two timestamp columns for the start time
and end
time of each record (call them start and end). I'm trying to
figure out
if there is a way to group these records by hour of day,
that is the
record should be included in the group if the hour of the day
for the
group falls anywhere in the range [start,end]. Obviously each
record may
well fall into multiple groups under this scenario.

The goal here is to figure out, for each hour of the day, a)
what is the
total number of active records for that hour, and b) what is
the total
active time for those records during the hour, with an
ultimate goal
of figuring out the average active time per record per hour.

So, for simplified example, if the table contained three records:

  start  |   end
------------------+------------------
2015-03-15 08:15  |  2015-03-15 10:45
2015-03-15 09:30  |  2015-03-15 10:15
2015-03-15 10:30  |  2015-03-15 11:30


Then the results should break out something like this:

hour  |  count  |  sum
------+---------+-------
  8   |    1    |  0.75
  9   |    2    |  1.5
 10   |    3    |  1.5
 11   |    1    |  0.5

I can then easily manipulate these values to get my ultimate
goal of the
average, which would of course always be less than or equal to
1. Is
this doable in Postgres? Or would it be a better idea to
simply pull
the raw data and post-process in code? Thanks.


Do not have an answer for you, but a question:

What version of Postgres are you on?

This will help determine what tools are available to work with.


The following will give you endpoints for your bounds.  Version is
important since range types could be very useful in this situation -
but you'd still need to generate the bounds info regardless.

SELECT *
FROM
(SELECT * FROM generate_series('2015-03-15'::timestamptz,
'2015-03-16'::timestamptz, '1 hour'::interval) start (start_ts)) AS s
CROSS JOIN
(SELECT end_ts + '1 hour'::interval AS end_ts FROM
generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz,
'1 hour'::interval) e (end_ts)) AS e

You would join this using an ON condition with an OR (start BETWEEN
[...] OR end BETWEEN [...]) - range logic will be better and you may
want to adjust the upper bound by negative 1 (nano-second?) to allow
for easier <= logic if using BETWEEN.



Thanks, that is very helpful, but are you sure CROSS JOIN is what you
wanted here? using that, I get a 625 row result set where each row from
the first SELECT is paired up with EVERY row from the second select. I
would think I would want the first row of the first SELECT paired up
with only the first row of the second, second row of the first paired
with the second row of the second, etc - i.e. 24 start and end bounds.
Or am I missing something?


Given this:

test=> select * from start_end ;
 id |       start_time       |        end_time
----+------------------------+------------------------
  1 | 2015-03-16 08:15:00-07 | 2015-03-16 09:35:00-07
  2 | 2015-03-16 09:15:00-07 | 2015-03-16 11:05:00-07
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07

using Paul's hints I got:

test=> select *  from start_end, generate_series(0, 23) as s(h) where h 
between extract(hour from start_time) and extract(hour from end_time) ;


 id |       start_time       |        end_time        | h
----+------------------------+------------------------+----
  1 | 2015-03-16 08:15:00-07 | 2015-03-16 09:35:00-07 |  8
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07 |  8
  1 | 2015-03-16 08:15:00-07 | 2015-03-16 09:35:00-07 |  9
  2 | 2015-03-16 09:15:00-07 | 2015-03-16 11:05:00-07 |  9
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07 |  9
  2 | 2015-03-16 09:15:00-07 | 2015-03-16 11:05:00-07 | 10
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07 | 10
  2 | 2015-03-16 09:15:00-07 | 2015-03-16 11:05:00-07 | 11
  3 | 2015-03-16 08:00:00-07 | 2015-03-16 11:45:00-07 | 11


test=> select h, count(*) from start_end, generate_series(0, 23) as s(h) 
where h between extract(hour from start_time) and extract(hour from 
end_time) group by h order by h;


 h  | count
----+-------
  8 | 2
  9 | 3
 10 | 2
 11 | 2





---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709

Re: [GENERAL] Slow query with join

2015-03-16 Thread Tom Lane
Tomas Vondra tomas.von...@2ndquadrant.com writes:
 On 16.3.2015 19:50, Marc Watson wrote:
 I hope someone can help me with a problem I'm having when joining a
 view with a table. The view is somewhat involved, but I can provide the
 details if necessary.

 First, get rid of the ORDER BY clauses in the subselects - it's
 completely pointless, and might prevent proper optimization (e.g.
 replacing the IN() with optimized joins).

I'm suspicious that the cause may be an ORDER BY in the view.  It's
hard to tell when we've not seen the view definition, but I see that both
plans we've been shown are going to produce output sorted by actor.id.
Maybe that's happenstance, or maybe not.

regards, tom lane




Re: [GENERAL] How does one make the following psql statement sql-injection resilient?

2015-03-16 Thread Andy Colson

On 3/16/2015 4:45 PM, Andy Colson wrote:

On 3/16/2015 4:30 PM, David G. Johnston wrote:

psql $SERVICE \
  --echo-queries \
  --set=string_input=${1:-ok_to_return} \
  --set=start=${2:-5} \
  --set=end=${3:-10} \
<<'SQL'
 SELECT idx
 FROM generate_series(1, 20) gs (idx)
 WHERE 'short-circuit' != :'string_input'
 AND idx BETWEEN :start AND :end;
SQL

# (6 rows)

--set=end=${3:-10 AND false}

# (0 rows)

Am I forced to represent the input as text (using :'end') and then
perform a conversion to integer?

Thanks!

David J.




The --set's make it a little complicated.  How about:

string_input=${1:-ok_to_return}
start=${2:-5}
end=${3:-10}

psql $SERVICE --echo-queries <<'SQL'
  prepare tmp as SELECT idx
  FROM generate_series(1, 20) gs (idx)
  WHERE 'short-circuit' != $1
  AND idx BETWEEN $2 AND :$3;

   execute tmp($string_input, $start, $end);
   deallocate tmp;
SQL

That's untested, and probably won't work.  The execute tmp($1, $2, $3)
needs to be passed to psql as-is, but $string_input, $start and $end need
to be replaced in bash before it's sent to psql.  Maybe use \$1?

Docs here:

http://www.postgresql.org/docs/9.4/static/sql-prepare.html


-Andy




Wow.  Sorry.  What a mess.

   AND idx BETWEEN $2 AND :$3;
should be:
   AND idx BETWEEN $2 AND $3;


 That's untested, and probably won't work.  The execute tmp($1, $2, $3)
should be: execute tmp($string_input, $start, $end);

-Andy




Re: [GENERAL] Group by range in hour of day

2015-03-16 Thread Adrian Klaver

On 03/16/2015 02:57 PM, Israel Brewster wrote:

I have a table with two timestamp columns for the start time and end
time of each record (call them start and end). I'm trying to figure out
if there is a way to group these records by hour of day, that is the
record should be included in the group if the hour of the day for the
group falls anywhere in the range [start,end]. Obviously each record may
well fall into multiple groups under this scenario.

The goal here is to figure out, for each hour of the day, a) what is the
total number of active records for that hour, and b) what is the total
active time for those records during the hour, with an ultimate goal
of figuring out the average active time per record per hour.

So, for simplified example, if the table contained three records:

  start  |   end
------------------+------------------
2015-03-15 08:15  |  2015-03-15 10:45
2015-03-15 09:30  |  2015-03-15 10:15
2015-03-15 10:30  |  2015-03-15 11:30


Then the results should break out something like this:

hour  |  count  |  sum
------+---------+-------
  8   |    1    |  0.75
  9   |    2    |  1.5
 10   |    3    |  1.5
 11   |    1    |  0.5

I can then easily manipulate these values to get my ultimate goal of the
average, which would of course always be less than or equal to 1. Is
this doable in Postgres? Or would it be a better idea to simply pull
the raw data and post-process in code? Thanks.


Do not have an answer for you, but a question:

What version of Postgres are you on?

This will help determine what tools are available to work with.



---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Group by range in hour of day

2015-03-16 Thread John W Higgins
Assuming 3 things

Table name - test
Column names - start_time, end_time
Added an id column (int) to distinguish each record in the table

You can go with this. (my apologies for formatting issues)

with
slots as (
    select  *
    from    generate_series(0, 1439) as s(slot)
),
slots_hours as (
    select  slot,
            slot / 60 as hour
    from    slots
),
minutes as (
    select  id,
            date_part('hour', start_time) * 60
              + date_part('minute', start_time) as start_minute,
            date_part('hour', end_time) * 60
              + date_part('minute', end_time) as end_minute
    from    test
),
minute_slots as (
    select  id,
            slot,
            hour
    from    minutes
    join    slots_hours
    on      minutes.start_minute <= slots_hours.slot
    and     minutes.end_minute > slots_hours.slot
)
select    hour,
          count(*) / 60.0 as sum,
          count(distinct id) as count
from      minute_slots
group by  hour

I'm certain there are more elegant solutions possible - but you can grasp
each step this way.

John

On Mon, Mar 16, 2015 at 2:57 PM, Israel Brewster isr...@ravnalaska.net
wrote:

 I have a table with two timestamp columns for the start time and end time
 of each record (call them start and end). I'm trying to figure out if there
 is a way to group these records by hour of day, that is the record should
 be included in the group if the hour of the day for the group falls
 anywhere in the range [start,end]. Obviously each record may well fall into
 multiple groups under this scenario.

 The goal here is to figure out, for each hour of the day, a) what is the
 total number of active records for that hour, and b) what is the total
 active time for those records during the hour, with an ultimate goal of
 figuring out the average active time per record per hour.

 So, for simplified example, if the table contained three records:

  start  |   end
 ------------------+------------------
 2015-03-15 08:15  |  2015-03-15 10:45
 2015-03-15 09:30  |  2015-03-15 10:15
 2015-03-15 10:30  |  2015-03-15 11:30


 Then the results should break out something like this:

 hour  |  count  |  sum
 ------+---------+-------
   8   |    1    |  0.75
   9   |    2    |  1.5
  10   |    3    |  1.5
  11   |    1    |  0.5

 I can then easily manipulate these values to get my ultimate goal of the
 average, which would of course always be less than or equal to 1. Is this
 doable in Postgres? Or would it be a better idea to simply pull the raw
 data and post-process in code? Thanks.

 ---
 Israel Brewster
 Systems Analyst II
 Ravn Alaska
 5245 Airport Industrial Rd
 Fairbanks, AK 99709
 (907) 450-7293
 ---








Re: [GENERAL] Slow query with join

2015-03-16 Thread Tomas Vondra
On 16.3.2015 19:50, Marc Watson wrote:
 Hello all,
 I am using PostgreSQL 9.4.1, compiled by Visual C++ build 1800,
64-bit, as downloaded from EnterpriseDB, and it is running on my dev system
under Win 7 64-bit.
 I hope someone can help me with a problem I'm having when joining a
view with a table. The view is somewhat involved, but I can provide the
details if necessary.
 A query on the table is quick (16 ms):

 explain analyze select ir_actor_id from f_intervenant_ref where ir_dos_id = 
 '5226' order by ir_actor_id;
 
 Sort  (cost=17.28..17.30 rows=8 width=4) (actual time=0.032..0.033 rows=8 
 loops=1)
   Sort Key: ir_actor_id
   Sort Method: quicksort  Memory: 25kB
   ->  Index Scan using ir_dos_id_idx on f_intervenant_ref  (cost=0.28..17.16 
 rows=8 width=4) (actual time=0.019..0.024 rows=8 loops=1)
 Index Cond: ((ir_dos_id)::text = '5226'::text)
 Planning time: 0.180 ms
 Execution time: 0.049 ms
 ..

ISTM the database is applying the IN() condition last, i.e. it executes

   SELECT * FROM v_actor

and then proceeds to filter the result. I'd bet if you measure time for
that (SELECT * FROM v_actor) you'll get ~7 seconds.

First, get rid of the ORDER BY clauses in the subselects - it's
completely pointless, and might prevent proper optimization (e.g.
replacing the IN() with optimized joins).

I.e. try this:

 SELECT * FROM v_actor
  WHERE v_actor.actor_id IN (SELECT ir_actor_id FROM f_intervenant_ref
  WHERE ir_dos_id = '5226');

I'd also try replacing this with EXISTS

 SELECT * FROM v_actor
  WHERE EXISTS (SELECT 1 FROM f_intervenant_ref
 WHERE (actor_id = ir_actor_id)
   AND (ir_dos_id = '5226'));

or even an explicit join

 SELECT v_actor.* FROM v_actor JOIN f_intervenant_ref
ON (actor_id = ir_actor_id)
 WHERE ir_dos_id = '5226';

That might behave a bit differently if there are multiple
f_intervenant_ref rows matching the actor. If that's the case, a simple
DISTINCT should fix that.
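
I.e., a sketch of the join variant above with DISTINCT added:

 SELECT DISTINCT v_actor.* FROM v_actor JOIN f_intervenant_ref
ON (actor_id = ir_actor_id)
 WHERE ir_dos_id = '5226';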

-- 
Tomas Vondra                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] How does one make the following psql statement sql-injection resilient?

2015-03-16 Thread David G. Johnston
On Mon, Mar 16, 2015 at 2:51 PM, Andy Colson a...@squeakycode.net wrote:

 On 3/16/2015 4:45 PM, Andy Colson wrote:

 On 3/16/2015 4:30 PM, David G. Johnston wrote:

 psql $SERVICE \
   --echo-queries \
   --set=string_input=${1:-ok_to_return} \
   --set=start=${2:-5} \
   --set=end=${3:-10} \
  <<'SQL'
  SELECT idx
  FROM generate_series(1, 20) gs (idx)
  WHERE 'short-circuit' != :'string_input'
  AND idx BETWEEN :start AND :end;
 SQL

 # (6 rows)

 --set=end=${3:-10 AND false}

 # (0 rows)

 Am I forced to represent the input as text (using :'end') and then
 perform a conversion to integer?

 Thanks!

 David J.



 The --set's make it a little complicated.  How about:

 string_input=${1:-ok_to_return}
 start=${2:-5}
 end=${3:-10}

 psql $SERVICE --echo-queries <<'SQL'
   prepare tmp as SELECT idx
   FROM generate_series(1, 20) gs (idx)
   WHERE 'short-circuit' != $1
   AND idx BETWEEN $2 AND :$3;

execute tmp($string_input, $start, $end);
deallocate tmp;
 SQL

 That's untested, and probably won't work.  The execute tmp($1, $2, $3)
 needs to be passed to psql as-is, but $string_input, $start and $end need
 to be replaced in bash before it's sent to psql.  Maybe use \$1?

 Docs here:

 http://www.postgresql.org/docs/9.4/static/sql-prepare.html


 -Andy



 Wow.  Sorry.  What a mess.

AND idx BETWEEN $2 AND :$3;
 should be:
AND idx BETWEEN $2 AND $3;


  That's untested, and probably won't work.  The execute tmp($1, $2, $3)
 should be: execute tmp($string_input, $start, $end);

 -Andy


Thanks!  I got the gist even with the typo.  I actually pondered about
prepare/execute after hitting send.  Am I correct in remembering that
CREATE TEMP TABLE cannot be prepared?  I was using the actual query with
CREATE TEMP TABLE and then issuing \copy to dump the result out to the
file.  The limitation of copy to having to be written on a single line
makes the intermediary temporary table seem almost a necessity.

I'd rather write the :'start'::integer than go through the prepare/execute
cycle...

David J.


Re: [GENERAL] Group by range in hour of day

2015-03-16 Thread Israel Brewster
On Mar 16, 2015, at 2:13 PM, Adrian Klaver adrian.kla...@aklaver.com wrote:
 
 On 03/16/2015 02:57 PM, Israel Brewster wrote:
 I have a table with two timestamp columns for the start time and end
 time of each record (call them start and end). I'm trying to figure out
 if there is a way to group these records by hour of day, that is the
 record should be included in the group if the hour of the day for the
 group falls anywhere in the range [start,end]. Obviously each record may
 well fall into multiple groups under this scenario.
 
 The goal here is to figure out, for each hour of the day, a) what is the
 total number of active records for that hour, and b) what is the total
 active time for those records during the hour, with an ultimate goal
 of figuring out the average active time per record per hour.
 
 So, for simplified example, if the table contained three records:
 
  start  |   end
 ------------------+------------------
 2015-03-15 08:15  |  2015-03-15 10:45
 2015-03-15 09:30  |  2015-03-15 10:15
 2015-03-15 10:30  |  2015-03-15 11:30
 
 
 Then the results should break out something like this:
 
 hour  |  count  |  sum
 ------+---------+-------
   8   |    1    |  0.75
   9   |    2    |  1.5
  10   |    3    |  1.5
  11   |    1    |  0.5
 
 I can then easily manipulate these values to get my ultimate goal of the
 average, which would of course always be less than or equal to 1. Is
 this doable in Postgres? Or would it be a better idea to simply pull
 the raw data and post-process in code? Thanks.
 
 Do not have an answer for you, but a question:
 
 What version of Postgres are you on?
 
 This will help determine what tools are available to work with.

Oh, right. Of course. I'm on 9.4.0

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---


 
 
 ---
 Israel Brewster
 Systems Analyst II
 Ravn Alaska
 5245 Airport Industrial Rd
 Fairbanks, AK 99709
 (907) 450-7293
 ---
 
 
 
 
 
 
 
 -- 
 Adrian Klaver
 adrian.kla...@aklaver.com





Re: [GENERAL] Group by range in hour of day

2015-03-16 Thread Paul Jungwirth

I have a table with two timestamp columns for the start time and end
time of each record (call them start and end). I'm trying to figure out
if there is a way to group these records by hour of day,


I think you can do this by selecting `FROM generate_series(0, 23) s(h)` 
and then joining to your table based on `h BETWEEN start AND end`.


Whenever I need to write a time-series aggregate query I reach for 
generate_series. Mostly that's so I have output rows even when COUNT(*) 
would be 0, but here it also means that a row from your data can feed 
into multiple output rows.


I could probably write this out in more detail if you like, but that's 
the short version. :-)


Good luck!

Paul









[GENERAL] Postgresql BDR(Bi-Directional Replication) Latency Monitoring

2015-03-16 Thread Bill Brown
Hi All,

I'm looking at  BDR monitoring documentation at this location
https://wiki.postgresql.org/wiki/BDR_Monitoring

I understand the query in documentation:

select slot_name, plugin, database, active, xmin,
pg_get_transaction_committime(xmin)
FROM pg_stat_logical_decoding ;


Should be:

select slot_name, plugin, database, active, xmin,
pg_get_transaction_committime(xmin)
FROM pg_replication_slots ;



I am running PostgreSQL 9.4.0 bits on Centos:

postgresql-bdr94-contrib-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-libs-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-bdr-0.8.0beta1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-2ndquadrant-redhat-1.0-2.noarch
postgresql-bdr94-devel-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-server-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64


We have 3 nodes set-up.  We would like to understand the latency between
nodes.

When using http://oltpbenchmark.com/ to create a load on one node,  I would
expect the 'xmin' value not to be null, but it is.

What is the best way to monitor replication latency when using Postgresql
BDR?

Regards,

Bill


[GENERAL] Slow query with join

2015-03-16 Thread Marc Watson
Hello all,
I am using PostgreSQL 9.4.1, compiled by Visual C++ build 1800, 64-bit, as 
downloaded from EnterpriseDB, and it is running on my dev system under Win 7 
64-bit.
I hope someone can help me with a problem I'm having when joining a view with a 
table. The view is somewhat involved, but I can provide the details if necessary.
A query on the table is quick (16 ms):
explain analyze select ir_actor_id from f_intervenant_ref where ir_dos_id = 
'5226' order by ir_actor_id;

Sort  (cost=17.28..17.30 rows=8 width=4) (actual time=0.032..0.033 rows=8 
loops=1)
  Sort Key: ir_actor_id
  Sort Method: quicksort  Memory: 25kB
  ->  Index Scan using ir_dos_id_idx on f_intervenant_ref  (cost=0.28..17.16 
rows=8 width=4) (actual time=0.019..0.024 rows=8 loops=1)
Index Cond: ((ir_dos_id)::text = '5226'::text)
Planning time: 0.180 ms
Execution time: 0.049 ms

A query on the view is also quick (31 ms), using the results from the previous 
query on the table:
explain analyze select * from v_actor where v_actor.actor_id 
in(77170,77170,77184,77184,77185,77185,77186,77186);
Nested Loop Left Join  (cost=0.86..385.18 rows=8 width=138) (actual 
time=2.819..9.652 rows=4 loops=1)
  Join Filter: (actor.type = 'physical'::business.actor_type)
  ->  Nested Loop Left Join  (cost=0.58..105.16 rows=8 width=114) (actual 
time=0.015..0.032 rows=4 loops=1)
Join Filter: (actor.type = 'moral'::business.actor_type)
->  Index Scan using actor_pkey on actor  (cost=0.29..38.66 rows=8 
width=78) (actual time=0.010..0.018 rows=4 loops=1)
  Index Cond: (id = ANY 
('{77170,77170,77184,77184,77185,77185,77186,77186}'::integer[]))
  Filter: (deleted IS FALSE)
->  Index Scan using moral_actor_pkey on moral_actor  (cost=0.28..8.30 
rows=1 width=40) (actual time=0.002..0.002 rows=0 loops=4)
  Index Cond: (id = actor.id)
  ->  Index Scan using physical_actor_pkey on physical_actor  (cost=0.29..8.30 
rows=1 width=28) (actual time=0.003..0.004 rows=1 loops=4)
Index Cond: (id = actor.id)
  SubPlan 1
->  Limit  (cost=8.30..8.31 rows=1 width=8) (never executed)
  ->  Sort  (cost=8.30..8.31 rows=1 width=8) (never executed)
Sort Key: contact.rank
->  Index Scan using contact_actor_idx on contact  
(cost=0.28..8.29 rows=1 width=8) (never executed)
  Index Cond: (actor_id = actor.id)
  Filter: (NOT deleted)
  SubPlan 2
->  Limit  (cost=8.30..8.31 rows=1 width=8) (actual time=0.005..0.005 
rows=0 loops=4)
  ->  Sort  (cost=8.30..8.31 rows=1 width=8) (actual time=0.004..0.004 
rows=0 loops=4)
Sort Key: contact_1.rank
Sort Method: quicksort  Memory: 25kB
->  Index Scan using contact_actor_idx on contact contact_1  
(cost=0.28..8.29 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=4)
  Index Cond: (actor_id = actor.id)
  Filter: (NOT deleted)
Planning time: 0.721 ms
Execution time: 9.759 ms

However, when I combine the two queries into one, the result set takes 6742 ms:
explain analyze select * from v_actor where v_actor.actor_id in(select 
ir_actor_id from f_intervenant_ref where ir_dos_id = '5226' order by 
ir_actor_id);
Merge Semi Join  (cost=71.79..108061.92 rows=8 width=1461) (actual 
time=7884.994..7927.699 rows=4 loops=1)
  Merge Cond: (actor.id = f_intervenant_ref.ir_actor_id)
  ->  Merge Left Join  (cost=0.85..554314.28 rows=20670 width=138) (actual 
time=2.820..7926.001 rows=3072 loops=1)
Merge Cond: (actor.id = physical_actor.id)
Join Filter: (actor.type = 'physical'::business.actor_type)
->  Merge Left Join  (cost=0.57..1679.17 rows=20670 width=114) (actual 
time=0.016..4.768 rows=3072 loops=1)
  Merge Cond: (actor.id = moral_actor.id)
  Join Filter: (actor.type = 'moral'::business.actor_type)
  ->  Index Scan using actor_pkey on actor  (cost=0.29..1275.50 
rows=20670 width=78) (actual time=0.008..3.190 rows=3072 loops=1)
Filter: (deleted IS FALSE)
  ->  Index Scan using moral_actor_pkey on moral_actor  
(cost=0.28..268.78 rows=5548 width=40) (actual time=0.006..0.006 rows=1 
loops=1)
->  Index Scan using physical_actor_pkey on physical_actor  
(cost=0.29..725.98 rows=15122 width=28) (actual time=0.003..3.208 rows=3072 
loops=1)
SubPlan 1
  ->  Limit  (cost=8.30..8.31 rows=1 width=8) (never executed)
->  Sort  (cost=8.30..8.31 rows=1 width=8) (never executed)
  Sort Key: contact.rank
  ->  Index Scan using contact_actor_idx on contact  
(cost=0.28..8.29 rows=1 width=8) (never executed)
Index Cond: (actor_id = actor.id)
Filter: (NOT deleted)
SubPlan 2
  ->  Limit  (cost=8.30..8.31 rows=1 width=8) (actual 
time=0.006..0.006 rows=0 loops=3072)
->  Sort