Re: [GENERAL] *** QUESTION *** After successful 'BEGIN;' command -- why PGSQL_TRANSACTION_ACTIVE and not PGSQL_TRANSACTION_INTRANS?

2015-10-12 Thread Steve Petrie, P.Eng.

Hello Rob,

Thanks for your response.

[RS]
I really do not understand "why" you need the SQLSTATE code after
executing a "BEGIN" so as to go into transaction state.

AFAIK you can only retrieve the SQLSTATE error code when an error
actually occurs. So, if your query statement was successful, then PHP
does not expose a SQLSTATE code of 0.


[SP]
Yes, I do know that postgres does not supply a sqlstate value if a command 
completes successfully. That question was resolved earlier.


And yes, I am using pg_send_query(...) to execute the BEGIN; command, so as 
to get a result resource returned. (Please see my forthcoming emailed 
response to Adrian Klaver, wherein I provide the PHP source code that Adrian 
requests.)


* * *
* * *

My present question is not about sqlstate, but about the value returned by 
function pg_transaction_status(...) after a successful BEGIN; command.


As I understand them, the possible return values from function 
pg_transaction_status(...) are:


//
// Return values from function pg_transaction_status($sql_conn);
//
// 0 = PGSQL_TRANSACTION_IDLE    (connection is currently idle, not in a transaction)
// 1 = PGSQL_TRANSACTION_ACTIVE  (command in progress on the connection; a query has been sent via the connection and not yet completed)
// 2 = PGSQL_TRANSACTION_INTRANS (idle, in a transaction block)
// 3 = PGSQL_TRANSACTION_INERROR (idle, in a failed transaction block)
// 4 = PGSQL_TRANSACTION_UNKNOWN (the connection is bad)
//

So I am expecting, after a successful BEGIN; command has completed, that the 
value returned by function pg_transaction_status(...) will be:


  2 == PGSQL_TRANSACTION_INTRANS

But instead, my PHP program is showing:

  1 == PGSQL_TRANSACTION_ACTIVE

Which does not make sense to me, as the BEGIN; command has completed and my 
PHP program has not yet initiated any new command on the postgres 
connection.


Regards,

Steve

* * *

Steve Petrie, P.Eng.

ITS-ETO Consortium
Oakville, Ontario, Canada
(905) 847-3253
apet...@aspetrie.net

- Original Message - 
From: "rob stone" 
To: "Steve Petrie, P.Eng." ; 


Sent: Sunday, October 11, 2015 11:58 PM
Subject: Re: [GENERAL] *** QUESTION *** After successful 'BEGIN;' command --  
why PGSQL_TRANSACTION_ACTIVE and not PGSQL_TRANSACTION_INTRANS?






ETO::0::LOG:  0: connection received: host=127.0.0.1
port=1083
ETO::0::LOCATION:  BackendInitialize,
src\backend\postmaster\postmaster.c:3850
ETO::0::LOG:  0: connection authorized: user=its-eto_pg36
database=eto_sql_db
ETO::0::LOCATION:  PerformAuthentication,
src\backend\utils\init\postinit.c:239
ETO::0::LOG:  0: statement: set client_encoding to 'LATIN1'
ETO::0::LOCATION:  exec_simple_query,
src\backend\tcop\postgres.c:890
ETO::0::LOG:  0: duration: 63.000 ms
ETO::0::LOCATION:  exec_simple_query,
src\backend\tcop\postgres.c:1118
ETO::0::LOG:  0: statement: BEGIN;
ETO::0::LOCATION:  exec_simple_query,
src\backend\tcop\postgres.c:890
ETO::0::LOG:  0: duration: 0.000 ms
ETO::0::LOCATION:  exec_simple_query,
src\backend\tcop\postgres.c:1118
ETO::0::LOG:  0: disconnection: session time: 0:00:00.297
user=its-eto_pg36 database=eto_sql_db host=127.0.0.1 port=1083
ETO::0::LOCATION:  log_disconnections,
src\backend\tcop\postgres.c:
* * *
* * *

Thanks For Any Feedback,

Steve


Hello Steve,

I really do not understand "why" you need the SQLSTATE code after
executing a "BEGIN" so as to go into transaction state.

AFAIK you can only retrieve the SQLSTATE error code when an error
actually occurs. So, if your query statement was successful, then PHP
does not expose a SQLSTATE code of 0.

If I run the following code:-

--


--
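[The PHP snippet was stripped by the list archive. A minimal sketch of the kind of code that would produce the error shown below -- assuming a local connection and a nonexistent table named "rhubarb" -- might look like this:]

```php
<?php
// Hypothetical reconstruction -- the original snippet did not survive the archive.
$cnx = pg_connect("host=localhost dbname=test") or die("connection failed\n");

// Send the query asynchronously so the error comes back via pg_get_result().
pg_send_query($cnx, "SELECT * FROM rhubarb");

$res = pg_get_result($cnx);
if (pg_result_status($res) == PGSQL_FATAL_ERROR) {
    // The SQLSTATE code is only exposed when an error actually occurred.
    echo pg_result_error_field($res, PGSQL_DIAG_SQLSTATE), "\n";
}
pg_close($cnx);
?>
```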

it returns the following:-


ERROR:  42P01: relation "rhubarb" does not exist
LINE 1: SELECT * FROM rhubarb
 ^
LOCATION:  parserOpenTable, parse_relation.c:986


SQLSTATE 42P01 is the error "undefined_table".


Note that you have to use pg_send_query to take advantage of
pg_get_result, etc.


HTH,

Rob





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] checkpoints anatomy

2015-10-12 Thread Achilleas Mantzios

http://dba.stackexchange.com/questions/61822/what-happens-in-postgresql-checkpoint

and the now-classic:
http://www.depesz.com/2011/07/14/write-ahead-log-understanding-postgresql-conf-checkpoint_segments-checkpoint_timeout-checkpoint_warning/

On 12/10/2015 04:39, Richardson Hinestroza wrote:

Hello, please excuse my poor English; I am writing from Colombia and I am a
PostgreSQL fan.

I want to know whether PostgreSQL checkpoints prevent current transactions from
writing to a page while the checkpoint process is flushing that same page to disk.

I would also like to know whether the PostgreSQL checkpoint uses the ARIES
algorithm, and to learn the technical details of PostgreSQL checkpoints.

I could not find answers to these questions on the web.

I would appreciate your answer. Thanks a lot.



--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:

2015-10-12 Thread Albe Laurenz
Victor Blomqvist wrote:
[race condition causes errors due to stale plans immediately after ALTER TABLE 
DROP]
> Note that these errors most of the time only happens very briefly at the same 
> time as the ALTER is
> run. When I did some experiments today the server in total had around 3k 
> req/s with maybe 0.1% of them
> touching the table being updated, and the error then happens maybe 1-10% of 
> the times I try this
> operation. If I do the operation on a table with more load the error will 
> happen more frequently.

As far as I gleaned from reading the source, plan cache invalidation happens by 
signals
sent to the other backends, so I can see why there can be small delays.
I wonder if there is any good way to improve this.

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] *** QUESTION *** After successful 'BEGIN;' command -- why PGSQL_TRANSACTION_ACTIVE and not PGSQL_TRANSACTION_INTRANS?

2015-10-12 Thread Steve Petrie, P.Eng.

Hello Daniel,

Thanks for your response.

Ahhh -- so after a successful query, the PHP program must keep executing 
pg_get_result($cnx) until a NULL result is obtained !!


And ONLY THEN does transaction status transition from 
PGSQL_TRANSACTION_ACTIVE to PGSQL_TRANSACTION_INTRANS.


OK -- makes sense to me -- I will change the code accordingly.

* * *
* * *

BTW -- this PostgreSQL newbie (bye bye mysql) is getting a nice warm fuzzy 
feeling, about PostgreSQL and its amazingly helpful community :)


Best,

Steve

* * *

Steve Petrie, P.Eng.

ITS-ETO Consortium
Oakville, Ontario, Canada
(905) 847-3253
apet...@aspetrie.net

- Original Message - 
From: "Daniel Verite" 

To: "Steve Petrie, P.Eng." 
Cc: "rob stone" ; 
Sent: Monday, October 12, 2015 5:22 AM
Subject: Re: [GENERAL] *** QUESTION *** After successful 'BEGIN;' command --  
why PGSQL_TRANSACTION_ACTIVE and not PGSQL_TRANSACTION_INTRANS?



Steve Petrie, P.Eng. wrote:

And yes, I am using pg_send_query(...) to execute the BEGIN; command, so 
as

to get a result resource returned. (Please see my forthcoming emailed
response to Adrian Klaver, wherein I provide the PHP source code that 
Adrian

requests.)


After successfully calling pg_send_query($cnx, $query),
pg_transaction_status($cnx) will return PGSQL_TRANSACTION_ACTIVE
until the results have been collected by looping over pg_get_result($cnx).

This is for any query, not specifically $query="BEGIN".

Example:



This will output 2, which corresponds to PGSQL_TRANSACTION_INTRANS
as expected.

OTOH if commenting out the call to pg_get_result($dbcnx), then
it outputs 1 (PGSQL_TRANSACTION_ACTIVE) as you mentioned.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general 




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] *** QUESTION *** After successful 'BEGIN;' command -- why PGSQL_TRANSACTION_ACTIVE and not PGSQL_TRANSACTION_INTRANS?

2015-10-12 Thread Daniel Verite
Steve Petrie, P.Eng. wrote:

> And yes, I am using pg_send_query(...) to execute the BEGIN; command, so as 
> to get a result resource returned. (Please see my forthcoming emailed 
> response to Adrian Klaver, wherein I provide the PHP source code that Adrian
> requests.)

After successfully calling pg_send_query($cnx, $query),
pg_transaction_status($cnx) will return PGSQL_TRANSACTION_ACTIVE
until the results have been collected by looping over pg_get_result($cnx).

This is for any query, not specifically $query="BEGIN".

Example:


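[The example code was stripped by the list archive. A plausible sketch of it, assuming a local test database; the host and dbname are illustrative only:]

```php
<?php
$dbcnx = pg_connect("host=localhost dbname=test") or die("connection failed\n");

// Send BEGIN asynchronously; a result resource is pending on the connection.
pg_send_query($dbcnx, "BEGIN");

// Collect all pending results; the transaction status remains
// PGSQL_TRANSACTION_ACTIVE until this loop has drained them.
while (pg_get_result($dbcnx) !== false) {
    // discard
}

// 2 == PGSQL_TRANSACTION_INTRANS
echo pg_transaction_status($dbcnx), "\n";
?>
```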

This will output 2, which corresponds to PGSQL_TRANSACTION_INTRANS
as expected.

OTOH if commenting out the call to pg_get_result($dbcnx), then
it outputs 1 (PGSQL_TRANSACTION_ACTIVE) as you mentioned.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] BDR workers exiting?

2015-10-12 Thread Jim Nasby

On 10/12/15 10:14 AM, Jim Nasby wrote:

On 10/12/15 9:37 AM, Steve Pribyl wrote:

I am loading a 60G database into a BDR database and these "ERRORS"
are in my logs. Is this normal behavior, or is something going bad?

2015-10-12 09:28:59.389 CDT,,,30371,,561bc17d.76a3,1,,2015-10-12
09:19:41 CDT,5/0,0,ERROR,XX000,"data stream ended","bdr
(6204748238611542317,1,16494,): apply"
2015-10-12 09:28:59.390 CDT,,,12693,,561bb1ae.3195,20,,2015-10-12
08:12:14 CDT,,0,LOG,0,"worker process: bdr
(6204748238611542317,1,16494,)->bdr (6204748255428234532,1, (PID
30371) exited with exit code 1",""
2015-10-12 09:29:04.395 CDT,,,12693,,561bb1ae.3195,21,,2015-10-12
08:12:14 CDT,,0,LOG,0,"starting background worker process ""bdr
(6204748238611542317,1,16494,)->bdr (6204748255428234532,1,""",""


Looks like something's going bad, but you need to ask on the BDR mailing
list.


Nevermind, just discovered there is no separate list. Sorry for the noise.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] BDR workers exiting?

2015-10-12 Thread Steve Pribyl
Yup, there is a disconnect on the other side.

This disconnect is preceded by this:
ERROR,XX000,"invalid memory alloc request size 1073741824","slot 
""bdr_16494_6204748238611542317_1_16494__"", output plugin ""bdr"", in the 
change callback, associated LSN 2/FD250E48""bdr 
(6204748238611542317,1,16494,):receive"

Steve Pribyl
Sr. Systems Engineer
steve.pri...@akunacapital.com
Desk: 312-994-4646



From: Andres Freund 
Sent: Monday, October 12, 2015 11:08 AM
To: Steve Pribyl
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] BDR workers exiting?

On 2015-10-12 14:37:07 +, Steve Pribyl wrote:
> I am loading up a 60G database into BDR database and these "ERRORS" are in my 
> logs.  Is not normal behavior or is something going bad.
>
> 2015-10-12 09:28:59.389 CDT,,,30371,,561bc17d.76a3,1,,2015-10-12 09:19:41 
> CDT,5/0,0,ERROR,XX000,"data stream ended","bdr 
> (6204748238611542317,1,16494,): apply"
> 2015-10-12 09:28:59.390 CDT,,,12693,,561bb1ae.3195,20,,2015-10-12 08:12:14 
> CDT,,0,LOG,0,"worker process: bdr (6204748238611542317,1,16494,)->bdr 
> (6204748255428234532,1, (PID 30371) exited with exit code 1",""
> 2015-10-12 09:29:04.395 CDT,,,12693,,561bb1ae.3195,21,,2015-10-12 08:12:14 
> CDT,,0,LOG,0,"starting background worker process ""bdr 
> (6204748238611542317,1,16494,)->bdr (6204748255428234532,1,""",""

There'll possibly be an error message on the other node about ending the
connection.

Do you use SSL? If so, try disabling renegotiation.

Regards,

Andres



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Merge join vs merge semi join against primary key

2015-10-12 Thread Igor Neyman
Hello,

I'm seeing some inexplicable (to me) behavior in PostgreSQL 9.2. I checked
the archives, but I still can't explain it. Apologies if I missed something.

1. When I join two tables with "WHERE id IN (...)" versus with an explicit
join, and the join column for the inner table is a primary key, I would expect
the same behavior in both cases, but the optimizer is choosing a merge join in
one case and a merge semi join in the other. There's at most one customer
with a given id. Why not do a semi join?

2. Even though the join methods are different, I would expect about the same
performance in either case, but one query takes only a few hundred
milliseconds while the other takes hundreds of seconds. Ouch!

Can anyone help me explain this behavior?

Some details are below. Let me know if it would be helpful to gather others.

Sean


production=> select version();
version

 PostgreSQL 9.2.13 on i686-pc-linux-gnu, compiled by gcc (Debian 4.7.2-5) 
4.7.2, 32-bit
(1 row)

production=> \d customers
Table "public.customers"
 Column|   Type   | Modifiers
---+--+---
 id| bigint   | not null default 
nextval('customers_id_seq'::regclass)
 group_id  | bigint   |
...
Indexes:
"customers_pkey" PRIMARY KEY, btree (id)
...

production=> select count(*) from customers;
 count

 473733
(1 row)

production=> \d balances
   Table "public.balances"
Column |   Type   | Modifiers
---+--+--
 id| bigint   | not null default 
nextval('balances_id_seq'::regclass)
 balance   | integer  | not null default 0
 tracking_number   | integer  | not null
 customer_id   | bigint   | not null
...
Indexes:
"balances_pkey" PRIMARY KEY, btree (id)
"balances_customer_tracking_number_index" UNIQUE, btree (customer_id, 
tracking_number)
...

production=> select count(*) from balances;
 count
---
 16876
(1 row)

production=> analyze verbose customers;
INFO:  analyzing "public.customers"
INFO:  "customers": scanned 14280 of 14280 pages, containing 475288 live rows 
and 1949 dead rows; 30 rows in sample, 475288 estimated total rows
ANALYZE

production=> analyze verbose balances;
INFO:  analyzing "public.balances"
INFO:  "balances": scanned 202 of 202 pages, containing 16876 live rows and 0 
dead rows; 16876 rows in sample, 16876 estimated total rows
ANALYZE

production=> explain analyze SELECT * FROM balances where customer_id IN 
(SELECT id from customers WHERE group_id = 45);

QUERY PLAN
-
 Merge Semi Join  (cost=2442.14..19958.30 rows=16876 width=80) (actual 
time=119.905..145.126 rows=7318 loops=1)
   Merge Cond: (balances.customer_id = customers.id)
   ->  Index Scan using balances_customer_id_index on balances  
(cost=0.00..727.79 rows=16876 width=80) (actual time=0.302..9.477 rows=16876 
loops=1)
   ->  Index Scan using customers_pkey on customers  (cost=0.00..64192.97 
rows=184 width=8) (actual time=103.354..126.459 rows=359 loops=1)
 Filter: (group_id = 45)
 Rows Removed by Filter: 141684
 Total runtime: 146.659 ms
(7 rows)

production=> explain analyze SELECT ac.* FROM balances ac join customers o ON 
o.id = ac.customer_id WHERE o.group_id = 45;

QUERY PLAN
---
 Merge Join  (cost=2214.50..20216.86 rows=30 width=80) (actual 
time=185.615..201991.752 rows=7318 loops=1)
   Merge Cond: (ac.customer_id = o.id)
   ->  Index Scan using balances_customer_tracking_number_index on balances ac  
(cost=0.00..1007.49 rows=16876 width=80) (actual time=0.068..25.036 rows=16876 
loops=1)
   ->  Index Scan using customers_pkey on customers o  (cost=0.00..63836.61 
rows=836 width=8) (actual time=159.840..201915.765 rows=7672 loops=1)
 Filter: (group_id = 45)
 Rows Removed by Filter: 212699113
 Total runtime: 201995.044 ms
(7 rows)


What if you rewrite your second query like this:

SELECT ac.*
FROM balances ac JOIN customers o ON (o.id = ac.customer_id AND 
o.group_id = 45);

Regards,
Igor Neyman




Re: [GENERAL] Merge join vs merge semi join against primary key

2015-10-12 Thread Igor Neyman


From: Sean Rhea [mailto:sean.c.r...@gmail.com]
Sent: Friday, October 09, 2015 4:30 PM
To: Igor Neyman 
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Merge join vs merge semi join against primary key

It does the merge (not-semi) join:

production=> explain analyze SELECT ac.* FROM balances ac JOIN customers o ON 
(o.id = ac.customer_id AND o.group_id = 45);
  
QUERY PLAN
--
 Merge Join  (cost=2172.47..19959.82 rows=6 width=80) (actual 
time=114.578..243898.199 rows=7318 loops=1)
   Merge Cond: (ac.customer_id = o.id)
   ->  Index Scan using balances_customer_id_index on balances ac  
(cost=0.00..727.42 rows=16876 width=80) (actual time=0.025..20.972 rows=16876 
loops=1)
   ->  Index Scan using customers_pkey on customers o  (cost=0.00..64811.57 
rows=179 width=8) (actual time=92.174..243813.231 rows=7672 loops=1)
 Filter: (group_id = 45)
 Rows Removed by Filter: 212699113
 Total runtime: 243901.595 ms
(7 rows)

Sean



On Fri, Oct 9, 2015 at 1:09 PM, Igor Neyman 
> wrote:
Hello,

I'm seeing some inexplicable (to me) behavior in PostgreSQL 9.2. I checked
the archives, but I still can't explain it. Apologies if I missed something.

1. When I join two tables with "WHERE id IN (...)" versus with an explicit
join, and the join column for the inner table is a primary key, I would expect
the same behavior in both cases, but the optimizer is choosing a merge join in
one case and a merge semi join in the other. There's at most one customer
with a given id. Why not do a semi join?

2. Even though the join methods are different, I would expect about the same
performance in either case, but one query takes only a few hundred
milliseconds while the other takes hundreds of seconds. Ouch!

Can anyone help me explain this behavior?

Some details are below. Let me know if it would be helpful to gather others.

Sean


production=> select version();
version

 PostgreSQL 9.2.13 on i686-pc-linux-gnu, compiled by gcc (Debian 4.7.2-5) 
4.7.2, 32-bit
(1 row)

production=> \d customers
Table "public.customers"
 Column|   Type   | Modifiers
---+--+---
 id| bigint   | not null default 
nextval('customers_id_seq'::regclass)
 group_id  | bigint   |
...
Indexes:
"customers_pkey" PRIMARY KEY, btree (id)
...

production=> select count(*) from customers;
 count

 473733
(1 row)

production=> \d balances
   Table "public.balances"
Column |   Type   | Modifiers
---+--+--
 id| bigint   | not null default 
nextval('balances_id_seq'::regclass)
 balance   | integer  | not null default 0
 tracking_number   | integer  | not null
 customer_id   | bigint   | not null
...
Indexes:
"balances_pkey" PRIMARY KEY, btree (id)
"balances_customer_tracking_number_index" UNIQUE, btree (customer_id, 
tracking_number)
...

production=> select count(*) from balances;
 count
---
 16876
(1 row)

production=> analyze verbose customers;
INFO:  analyzing "public.customers"
INFO:  "customers": scanned 14280 of 14280 pages, containing 475288 live rows 
and 1949 dead rows; 30 rows in sample, 475288 estimated total rows
ANALYZE

production=> analyze verbose balances;
INFO:  analyzing "public.balances"
INFO:  "balances": scanned 202 of 202 pages, containing 16876 live rows and 0 
dead rows; 16876 rows in sample, 16876 estimated total rows
ANALYZE

production=> explain analyze SELECT * FROM balances where customer_id IN 
(SELECT id from customers WHERE group_id = 45);

QUERY PLAN
-
 Merge Semi Join  (cost=2442.14..19958.30 rows=16876 width=80) (actual 
time=119.905..145.126 rows=7318 loops=1)
   Merge Cond: (balances.customer_id = customers.id)
   ->  Index Scan using balances_customer_id_index on balances  
(cost=0.00..727.79 rows=16876 width=80) (actual time=0.302..9.477 rows=16876 
loops=1)
   ->  Index Scan using customers_pkey on customers  (cost=0.00..64192.97 
rows=184 width=8) (actual time=103.354..126.459 rows=359 loops=1)
 Filter: (group_id = 45)
 Rows Removed by Filter: 141684
 Total 

Re: [GENERAL] BDR workers exiting?

2015-10-12 Thread Andres Freund
On 2015-10-12 14:37:07 +, Steve Pribyl wrote:
> I am loading up a 60G database into BDR database and these "ERRORS" are in my 
> logs.  Is not normal behavior or is something going bad.
> 
> 2015-10-12 09:28:59.389 CDT,,,30371,,561bc17d.76a3,1,,2015-10-12 09:19:41 
> CDT,5/0,0,ERROR,XX000,"data stream ended","bdr 
> (6204748238611542317,1,16494,): apply"
> 2015-10-12 09:28:59.390 CDT,,,12693,,561bb1ae.3195,20,,2015-10-12 08:12:14 
> CDT,,0,LOG,0,"worker process: bdr (6204748238611542317,1,16494,)->bdr 
> (6204748255428234532,1, (PID 30371) exited with exit code 1",""
> 2015-10-12 09:29:04.395 CDT,,,12693,,561bb1ae.3195,21,,2015-10-12 08:12:14 
> CDT,,0,LOG,0,"starting background worker process ""bdr 
> (6204748238611542317,1,16494,)->bdr (6204748255428234532,1,""",""

There'll possibly be an error message on the other node about ending the
connection.

Do you use SSL? If so, try disabling renegotiation.

Regards,

Andres


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:

2015-10-12 Thread Andres Freund
Hi,

On 2015-10-09 14:32:44 +0800, Victor Blomqvist wrote:
> Typically I have user defined functions for all operations, and my table
> and functions follow this pattern:
> 
> CREATE TABLE users (
>   id integer PRIMARY KEY,
>   name varchar NOT NULL,
>   to_be_removed integer NOT NULL
> );
> 
> CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
> $$
> BEGIN
>   RETURN QUERY SELECT * FROM users WHERE id = id_;
> END;
> $$ LANGUAGE plpgsql;
> 
> Then the actual queries are run by our application as
> 
> SELECT id, name FROM select_users(18);
> 
> As you can see the column to_be_removed is not selected. Then to remove the
> column I use:
> 
> ALTER TABLE users DROP COLUMN to_be_removed;
> 
> However, while the system is under load sometimes (more frequently and
> persistent the more load the system is experiencing) I get errors like
> these:
> 
> ERROR #42804 structure of query does not match function result type:
> Number of returned columns (2) does not match expected column count (3).

My guess is that the problem here is that table level locking prevents
modification of the "users" type when the table is used, but there's no
locking preventing the columns to be dropped while the function is
used. So what happens is that 1) the function is parsed & planned 2)
DROP COLUMN is executed 3) the contained statement is executed 4) a
mismatch between the contained statement and the function definition is
detected.

Greetings,

Andres Freund


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:

2015-10-12 Thread Adrian Klaver

On 10/12/2015 05:29 AM, Andres Freund wrote:

Hi,

On 2015-10-09 14:32:44 +0800, Victor Blomqvist wrote:

Typically I have user defined functions for all operations, and my table
and functions follow this pattern:

CREATE TABLE users (
   id integer PRIMARY KEY,
   name varchar NOT NULL,
   to_be_removed integer NOT NULL
);

CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
$$
BEGIN
   RETURN QUERY SELECT * FROM users WHERE id = id_;
END;
$$ LANGUAGE plpgsql;

Then the actual queries are run by our application as

SELECT id, name FROM select_users(18);

As you can see the column to_be_removed is not selected. Then to remove the
column I use:

ALTER TABLE users DROP COLUMN to_be_removed;

However, while the system is under load sometimes (more frequently and
persistent the more load the system is experiencing) I get errors like
these:

 ERROR #42804 structure of query does not match function result type:
Number of returned columns (2) does not match expected column count (3).


My guess is that the problem here is that table level locking prevents
modification of the "users" type when the table is used, but there's no
locking preventing the columns to be dropped while the function is
used. So what happens is that 1) the function is parsed & planned 2)
DROP COLUMN is executed 3) the contained statement is executed 4) a
mismatch between the contained statement and the function definition is
detected.


Except per Albe's[1] example, the above sequence of events does not 
fail. It fails in Victors's case when the server is under load, so it 
seems there is another factor in play.



[1] 
http://www.postgresql.org/message-id/a737b7a37273e048b164557adef4a58b50fb7...@ntex2010i.host.magwien.gv.at


Greetings,

Andres Freund





--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:

2015-10-12 Thread Andres Freund
On 2015-10-12 08:07:54 +, Albe Laurenz wrote:
> Victor Blomqvist wrote:
> [race condition causes errors due to stale plans immediately after ALTER 
> TABLE DROP]
> > Note that these errors most of the time only happens very briefly at the 
> > same time as the ALTER is
> > run. When I did some experiments today the server in total had around 3k 
> > req/s with maybe 0.1% of them
> > touching the table being updated, and the error then happens maybe 1-10% of 
> > the times I try this
> > operation. If I do the operation on a table with more load the error will 
> > happen more frequently.
> 
> As far as I gleaned from reading the source, plan cache invalidation happens 
> by signals
> sent to the other backends, so I can see why there can be small delays.
> I wonder if there is any good way to improve this.

The signal based part is only relevant for idle backends, to wake them
up to process pending invalidations. The aim is to shrink the size of
the invalidation queue.

Normal invalidations are performed whenever a relation is locked:
void
LockRelationOid(Oid relid, LOCKMODE lockmode)
{
LOCKTAG tag;
LockAcquireResult res;

SetLocktagRelationOid(, relid);

res = LockAcquire(, lockmode, false, false);

/*
 * Now that we have the lock, check for invalidation messages, so that 
we
 * will update or flush any stale relcache entry before we try to use 
it.
 * RangeVarGetRelid() specifically relies on us for this.  We can skip
 * this in the not-uncommon case that we already had the same type of 
lock
 * being requested, since then no one else could have modified the
 * relcache entry in an undesirable way.  (In the case where our own 
xact
 * modifies the rel, the relcache update happens via
 * CommandCounterIncrement, not here.)
 */
if (res != LOCKACQUIRE_ALREADY_HELD)
AcceptInvalidationMessages();
}

I've not investigated what the OP's problem is.

Greetings,

Andres Freund


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:

2015-10-12 Thread Andres Freund
On 2015-10-12 06:42:52 -0700, Adrian Klaver wrote:
> >My guess is that the problem here is that table level locking prevents
> >modification of the "users" type when the table is used, but there's no
> >locking preventing the columns to be dropped while the function is
> >used. So what happens is that 1) the function is parsed & planned 2)
> >DROP COLUMN is executed 3) the contained statement is executed 4) a
> >mismatch between the contained statement and the function definition is
> >detected.
> 
> Except per Albe's[1] example, the above sequence of events does not fail. It
> fails in Victors's case when the server is under load, so it seems there is
> another factor in play.

The above sequence is only problematic if 2) happens exactly between 1)
and 3), which is not particularly likely given that 1) is a very quick
operation.

Greetings,

Andres Freund


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:

2015-10-12 Thread Tom Lane
Andres Freund  writes:
> On 2015-10-09 14:32:44 +0800, Victor Blomqvist wrote:
>> CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
>> $$
>> BEGIN
>> RETURN QUERY SELECT * FROM users WHERE id = id_;
>> END;
>> $$ LANGUAGE plpgsql;

> My guess is that the problem here is that table level locking prevents
> modification of the "users" type when the table is used, but there's no
> locking preventing the columns to be dropped while the function is
> used. So what happens is that 1) the function is parsed & planned 2)
> DROP COLUMN is executed 3) the contained statement is executed 4) a
> mismatch between the contained statement and the function definition is
> detected.

The query plan as such does get refreshed, I believe.  The problem is that
plpgsql has no provision for the definition of a named composite type to
change after a function's been parsed.  This applies to variables of named
composite types for sure, and based on this example I think it must apply
to the function result type as well, though I'm too lazy to go check the
code right now.

We have had past discussions about fixing this.  I believe it would
require getting rid of use of plpgsql's "row" infrastructure for named
composites, at least in most cases, and going over to the "record"
infrastructure instead.  In the past the conversations have stalled as
soon as somebody complained that that would probably make some operations
slower.  I don't entirely understand that objection, since (a) some other
operations would probably get faster, and (b) performance does not trump
correctness.  But that's where the discussion stands at the moment.

regards, tom lane




Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:

2015-10-12 Thread Adrian Klaver

On 10/12/2015 06:53 AM, Tom Lane wrote:

Andres Freund  writes:

On 2015-10-09 14:32:44 +0800, Victor Blomqvist wrote:

CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
$$
BEGIN
RETURN QUERY SELECT * FROM users WHERE id = id_;
END;
$$ LANGUAGE plpgsql;



My guess is that the problem here is that table level locking prevents
modification of the "users" type when the table is used, but there's no
locking preventing the columns to be dropped while the function is
used. So what happens is that 1) the function is parsed & planned 2)
DROP COLUMN is executed 3) the contained statement is executed 4) a
mismatch between the contained statement and the function definition is
detected.


The query plan as such does get refreshed, I believe.  The problem is that
plpgsql has no provision for the definition of a named composite type to
change after a function's been parsed.  This applies to variables of named
composite types for sure, and based on this example I think it must apply
to the function result type as well, though I'm too lazy to go check the
code right now.


That makes sense. The problem is that I cannot square that with Albe's 
example, which I tested also:


"
Session 1:

test=> CREATE TABLE users (id integer PRIMARY KEY, name varchar NOT 
NULL, to_be_removed integer NOT NULL);

CREATE TABLE
test=> CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
   $$BEGIN RETURN QUERY SELECT * FROM users WHERE id = id_; END;$$ 
LANGUAGE plpgsql;

CREATE FUNCTION

Session 2:

test=> SELECT id, name FROM select_users(18);
 id | name
----+------
(0 rows)

Ok, now the plan is cached.

Now in Session 1:

test=> ALTER TABLE users DROP COLUMN to_be_removed;
ALTER TABLE

Session2:

test=> SELECT id, name FROM select_users(18);
 id | name
----+------
(0 rows)

No error.  This is 9.4.4.
"



We have had past discussions about fixing this.  I believe it would
require getting rid of use of plpgsql's "row" infrastructure for named
composites, at least in most cases, and going over to the "record"
infrastructure instead.  In the past the conversations have stalled as
soon as somebody complained that that would probably make some operations
slower.  I don't entirely understand that objection, since (a) some other
operations would probably get faster, and (b) performance does not trump
correctness.  But that's where the discussion stands at the moment.

regards, tom lane





--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] BDR workers exiting?

2015-10-12 Thread Steve Pribyl
I am loading a 60G database into a BDR cluster, and these "ERRORS" are in my 
logs.  Is this normal behavior, or is something going bad?

2015-10-12 09:28:59.389 CDT,,,30371,,561bc17d.76a3,1,,2015-10-12 09:19:41 
CDT,5/0,0,ERROR,XX000,"data stream ended","bdr 
(6204748238611542317,1,16494,): apply"
2015-10-12 09:28:59.390 CDT,,,12693,,561bb1ae.3195,20,,2015-10-12 08:12:14 
CDT,,0,LOG,0,"worker process: bdr (6204748238611542317,1,16494,)->bdr 
(6204748255428234532,1, (PID 30371) exited with exit code 1",""
2015-10-12 09:29:04.395 CDT,,,12693,,561bb1ae.3195,21,,2015-10-12 08:12:14 
CDT,,0,LOG,0,"starting background worker process ""bdr 
(6204748238611542317,1,16494,)->bdr (6204748255428234532,1,""",""


Steve Pribyl
Thanks

 [http://www.akunacapital.com/images/akuna.png]
Steve Pribyl | Senior Systems Engineer
Akuna Capital LLC
36 S Wabash, Suite 310 Chicago IL 60603 USA | www.akunacapital.com 

p: +1 312 994 4646 | m:  | f: +1 312 750 1667 | steve.pri...@akunacapital.com

Please consider the environment, before printing this email.

This electronic message contains information from Akuna Capital LLC that may be 
confidential, legally privileged or otherwise protected from disclosure. This 
information is intended for the use of the addressee only and is not offered as 
investment advice to be relied upon for personal or professional use. 
Additionally, all electronic messages are recorded and stored in compliance 
pursuant to applicable SEC rules. If you are not the intended recipient, you 
are hereby notified that any disclosure, copying, distribution, printing or any 
other use of, or any action in reliance on, the contents of this electronic 
message is strictly prohibited. If you have received this communication in 
error, please notify us by telephone at (312)994-4640 and destroy the original 
message.




[GENERAL] Left Join with Limit 1

2015-10-12 Thread Alex Magnum
Hello,

I am trying to extract IP addresses from GeoLite by joining two tables, as
posted below.

When I run the queries individually and use a limit on the GeoLite IP table,
results come back immediately (1-2 ms), but when using the first query it takes
2-3 seconds.

Is there a way to use a limit in the join?

Thanks for any advice on this.

A


SELECT S.referrer_ip,
   I.geoname_id
FROM viewing_stats AS S LEFT JOIN geolite_city_ip4 AS I ON
(S.referrer_ip::inet <<= network)
WHERE viewing_id=74;

  referrer_ip   | geoname_id
----------------+------------
 111.93.173.230 |    1269750
(1 row)

Time: *2609.125 ms*



SELECT geoname_id,network FROM geolite_city_ip4 WHERE
'111.93.173.230'::inet <<= network *LIMIT 1;*
 geoname_id | network
------------+-----------------
    1269750 | 111.93.168.0/21
(1 row)

Time: *1.926 ms*


SELECT geoname_id,network FROM geolite_city_ip4 WHERE
'111.93.173.230'::inet <<= network;
 geoname_id | network
------------+-----------------
    1269750 | 111.93.168.0/21
(1 row)
Time: 645.999 ms

explain
SELECT S.referrer_ip,
   I.geoname_id
FROM viewing_stats AS S LEFT JOIN geolite_city_ip4 AS I ON
(S.referrer_ip::inet <<= network)
WHERE viewing_id=74;
                               QUERY PLAN
------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..242446.05 rows=3746698 width=36)
   Join Filter: ((s.referrer_ip)::inet <<= i.network)
   ->  Seq Scan on viewing_stats s  (cost=0.00..16.62 rows=3 width=32)
 Filter: (viewing_id = 74)
   ->  Materialize  (cost=0.00..74411.99 rows=2497799 width=11)
 ->  Seq Scan on geolite_city_ip4 i  (cost=0.00..49725.99
rows=2497799 width=11)
(6 rows)

Time: 1.326 ms

SELECT S.referrer_ip FROM viewing_stats AS S WHERE viewing_id=74;
  referrer_ip
----------------
 111.93.173.230
(1 row)

Time: *1.268 ms*


Re: [GENERAL] Left Join with Limit 1

2015-10-12 Thread Jim Nasby

On 10/12/15 10:03 AM, Alex Magnum wrote:

Is there a way to use a limit in the join?


SELECT ... FROM table_a a
  LEFT JOIN ( SELECT ... FROM table_b WHERE ... LIMIT 1) b
ON a.blah = b.blah
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Left Join with Limit 1

2015-10-12 Thread Paul Jungwirth

Running the queries individually and using a limit on the golite ip db
results are back immediately 1-2ms but when using the first query it
takes 2-3 seconds.

Is there a way to use a limit in the join?


This sounds like the real issue is a missing/incorrect index, but if 
you're on 9.4+ you can use a lateral join like this:



SELECT S.referrer_ip,
   I.geoname_id
FROM viewing_stats AS S
LEFT JOIN LATERAL (
SELECT *
FROM geolite_city_ip4
WHERE S.referrer_ip::inet <<= network
LIMIT 1
) I
ON true
WHERE viewing_id=74;

You might also want some kind of ordering in that subquery so that the 
results are deterministic.
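For example, if the most specific (longest-prefix) matching network should
win, a hedged sketch along those lines, reusing the tables from earlier in
the thread, might be:

```sql
-- Sketch only: assumes the viewing_stats and geolite_city_ip4 tables
-- shown earlier in this thread.  ORDER BY masklen(network) DESC picks
-- the most specific CIDR block when several contain the address, so the
-- LIMIT 1 result is deterministic.
SELECT S.referrer_ip,
       I.geoname_id
FROM viewing_stats AS S
LEFT JOIN LATERAL (
    SELECT geoname_id
    FROM geolite_city_ip4
    WHERE S.referrer_ip::inet <<= network
    ORDER BY masklen(network) DESC
    LIMIT 1
) I ON true
WHERE viewing_id = 74;
```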



Paul




Re: [GENERAL] BDR workers exiting?

2015-10-12 Thread Jim Nasby

On 10/12/15 9:37 AM, Steve Pribyl wrote:

I am loading a 60G database into a BDR cluster, and these "ERRORS" are in my 
logs.  Is this normal behavior, or is something going bad?

2015-10-12 09:28:59.389 CDT,,,30371,,561bc17d.76a3,1,,2015-10-12 09:19:41 
CDT,5/0,0,ERROR,XX000,"data stream ended","bdr 
(6204748238611542317,1,16494,): apply"
2015-10-12 09:28:59.390 CDT,,,12693,,561bb1ae.3195,20,,2015-10-12 08:12:14 
CDT,,0,LOG,0,"worker process: bdr (6204748238611542317,1,16494,)->bdr 
(6204748255428234532,1, (PID 30371) exited with exit code 1",""
2015-10-12 09:29:04.395 CDT,,,12693,,561bb1ae.3195,21,,2015-10-12 08:12:14 CDT,,0,LOG,0,"starting background 
worker process ""bdr (6204748238611542317,1,16494,)->bdr 
(6204748255428234532,1,""",""


Looks like something's going bad, but you need to ask on the BDR mailing 
list.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Serialization errors despite KEY SHARE/NO KEY UPDATE

2015-10-12 Thread Jim Nasby

On 10/6/15 12:18 PM, Olivier Dony wrote:


We would happily skip the micro-transactions (as a perf workaround) if
there was a way to detect this situation, but we couldn't find a way to
do that in 9.3. <9.3 we used SELECT FOR UPDATE NOWAIT to guard similar
cases.

If there is any way I could help to make the back-patch happen, please
let me know!


I'd say you should probably open a bug about this to make sure it's 
visible if you want it fixed. Or start a thread on -hackers.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




[GENERAL] Constraint allowing value up to 2 times but no more than 2 times

2015-10-12 Thread Dara Unglaube
Greetings.
I'm trying to come up with a way to allow one column to have the same value
up to two times, but no more than two times. I attempted adding a
constraint to check the count of the value in the field (count(trsqqq)
<= 2), but aggregate functions are not allowed in constraints. Is there
another way to do this? Any help would be greatly appreciated.
Thanks in advance.
Dara



*--Dara J. Olson Unglaube*
Aquatic Invasive Species Coordinator, Spatial Database Manager
Great Lakes Indian Fish & Wildlife Commission
P.O. Box 9, 72682 Maple Street
Odanah, WI 54861
(715) 682-6619 ext.2129
d...@glifwc.org 
maps.glifwc.org


Re: [GENERAL] BDR workers exiting?

2015-10-12 Thread Steve Pribyl
The process used to create this:

Start with a clean db
Create host A database with bdr
Join host B with bdr
Load database using psql < file.sql

I was able to get it to work if I do the following:
Start with a clean db
Create host A database
Load data on host A
Join host A to bdr
Join host B to bdr

Glad to have a workaround, but I would like to understand the failure.

Steve Pribyl




From: Steve Pribyl
Sent: Monday, October 12, 2015 11:19 AM
To: Andres Freund
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] BDR workers exiting?

Yup, there is a disconnect on other side.

This disconnect is preceded by this.
ERROR,XX000,"invalid memory alloc request size 1073741824","slot 
""bdr_16494_6204748238611542317_1_16494__"", output plugin ""bdr"", in the 
change callback, associated LSN 2/FD250E48""bdr 
(6204748238611542317,1,16494,):receive"

Steve Pribyl



From: Andres Freund 
Sent: Monday, October 12, 2015 11:08 AM
To: Steve Pribyl
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] BDR workers exiting?

On 2015-10-12 14:37:07 +, Steve Pribyl wrote:
> I am loading a 60G database into a BDR cluster, and these "ERRORS" are in my 
> logs.  Is this normal behavior, or is something going bad?
>
> 2015-10-12 09:28:59.389 CDT,,,30371,,561bc17d.76a3,1,,2015-10-12 09:19:41 
> CDT,5/0,0,ERROR,XX000,"data stream ended","bdr 
> (6204748238611542317,1,16494,): apply"
> 2015-10-12 09:28:59.390 CDT,,,12693,,561bb1ae.3195,20,,2015-10-12 08:12:14 
> CDT,,0,LOG,0,"worker process: bdr (6204748238611542317,1,16494,)->bdr 
> (6204748255428234532,1, (PID 30371) exited with exit code 1",""
> 2015-10-12 09:29:04.395 CDT,,,12693,,561bb1ae.3195,21,,2015-10-12 08:12:14 
> CDT,,0,LOG,0,"starting background worker process ""bdr 
> (6204748238611542317,1,16494,)->bdr (6204748255428234532,1,""",""

There'll possibly be an error message on the other node about ending the
connection.

Do you use SSL? If so, try disabling renegotiation.
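As a hedged sketch of that suggestion: on 9.4 (where the parameter still
exists; it was removed in 9.5), renegotiation can be switched off like this —
verify the setting against your server version before relying on it:

```sql
-- Disable SSL renegotiation entirely (0 = never renegotiate).
ALTER SYSTEM SET ssl_renegotiation_limit = 0;
SELECT pg_reload_conf();
```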

Regards,

Andres



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Effecient time ranges in 9.4/9.5?

2015-10-12 Thread Jim Nasby

On 10/4/15 6:18 AM, Karsten Hilbert wrote:

check whether both TZs are equal,
if so
ignore them
else
convert both operands to UTC,
do "time - time",
return result AT TIME ZONE UTC,
document that UTC is returned,

such that the user can know to apply "AT TIME ZONE ..."
to the result as needed.


The problem there is you can't reliably convert a time to a different 
timezone without knowing what date you're talking about.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




[GENERAL] Pattern match against array elements?

2015-10-12 Thread Israel Brewster
Is there any way to do a pattern match against the elements of an array in 
postgresql (9.4 if the version makes a difference)? I have a grouped query 
that, among other things, returns an array of values, like:

SELECT lognum, array_agg(flightnum) as flightnums FROM logs GROUP BY lognum;

Where the flightnum field is a varchar containing either a text string or a 
three-or-four digit number. Now say I want to select all logs that have a 
flight number starting with an '8' (so '800' or '8000' series flights). My 
first thought was to do something like this:

SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs GROUP 
BY lognum) s1 WHERE '8%' like ANY(flightnums);

But while this doesn't give an error, it also doesn't return any results. I'm 
guessing that this is because the wildcard is on the left of the operator, and 
needs to be on the right. Of course, turning it around to be:

WHERE ANY(flightnum) like '8%'

gives me a syntax error. So is there any way I can run this query such that I 
get any rows containing a flight number that starts with an 8 (or whatever)?

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---





Re: [GENERAL] Constraint allowing value up to 2 times but no more than 2 times

2015-10-12 Thread Jim Nasby

On 10/12/15 12:04 PM, Dara Unglaube wrote:

Greetings.
I'm trying to come up with a way to allow one column to have the same
value up to two times, but no more than two times. I attempted adding a
constraint to check the count of the value in the field (count(trsqqq)
<= 2), but aggregate functions are not allowed in constraints. Is
there another way to do this? Any help would be greatly appreciated.


The safest bet is some kind of unique constraint. That would require 
having a separate count field, which you would limit to being 1 or 2 via 
a CHECK constraint. You could use a trigger to set the value based on 
what's already in the table.


By the way, the issue with doing a simple count is that it's not safe 
from race conditions, like an insert and a delete happening together.
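A minimal sketch of that idea (table and column names hypothetical, apart
from trsqqq from the original post): a copy-number column capped by a CHECK
constraint, a UNIQUE constraint so each (value, copy number) pair exists at
most once, and a trigger that assigns the next free number.

```sql
-- Hypothetical sketch: at most two rows per trsqqq value, enforced by
-- CHECK (copy_number is 1 or 2) plus UNIQUE (trsqqq, copy_number).
CREATE TABLE sections (
    trsqqq      text NOT NULL,
    copy_number smallint NOT NULL CHECK (copy_number BETWEEN 1 AND 2),
    UNIQUE (trsqqq, copy_number)
);

-- Trigger function that assigns the next free copy_number.  Note that
-- two concurrent inserts of the same trsqqq can both compute the same
-- number; one of them will then fail on the UNIQUE constraint and must
-- be retried -- the constraint, not the trigger, is what makes this safe.
CREATE FUNCTION set_copy_number() RETURNS trigger AS $$
BEGIN
    SELECT coalesce(max(copy_number), 0) + 1 INTO NEW.copy_number
    FROM sections WHERE trsqqq = NEW.trsqqq;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sections_copy_number
    BEFORE INSERT ON sections
    FOR EACH ROW EXECUTE PROCEDURE set_copy_number();
```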

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread dinesh kumar
On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster 
wrote:

> Is there any way to do a pattern match against the elements of an array in
> postgresql (9.4 if the version makes a difference)? I have a grouped query
> that, among other things, returns an array of values, like:
>
> SELECT lognum, array_agg(flightnum) as flightnums FROM logs GROUP BY
> lognum;
>
> Where the flightnum field is a varchar containing either a text string or
> a three-or-four digit number. Now say I want to select all logs that have a
> flight number starting with an '8' (so '800' or '8000' series flights). My
> first thought was to do something like this:
>
> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs
> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
>
> But while this doesn't give an error, it also doesn't return any results.
> I'm guessing that this is because the wildcard is on the left of the
> operator, and needs to be on the right. Of course, turning it around to be:
>
> WHERE ANY(flightnum) like '8%'
>
> gives me a syntax error. So is there any way I can run this query such
> that I get any rows containing a flight number that starts with an 8 (or
> whatever)?
>
>
Are you looking for this ?

SELECT lognum, array_agg(flightnum) FROM logs WHERE flightnum ~ '^8' GROUP
BY lognum;




> ---
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293 <#>
> ---
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 

Regards,
Dinesh
manojadinesh.blogspot.com


Re: [GENERAL] Effecient time ranges in 9.4/9.5?

2015-10-12 Thread Karsten Hilbert
On Mon, Oct 12, 2015 at 12:14:10PM -0500, Jim Nasby wrote:

> On 10/4/15 6:18 AM, Karsten Hilbert wrote:
> > check whether both TZs are equal,
> > if so
> > ignore them
> > else
> > convert both operands to UTC,
> > do "time - time",
> > return result AT TIME ZONE UTC,
> > document that UTC is returned,
> >
> >such that the user can know to apply "AT TIME ZONE ..."
> >to the result as needed.
> 
> The problem there is you can't reliably convert a time to a different
> timezone without knowing what date you're talking about.

I didn't realize we were talking time only. My reading of the
thread suggested we were handling timestamps.

Karsten
-- 
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346




Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Jeff Janes
On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster 
wrote:

> Is there any way to do a pattern match against the elements of an array in
> postgresql (9.4 if the version makes a difference)? I have a grouped query
> that, among other things, returns an array of values, like:
>
> SELECT lognum, array_agg(flightnum) as flightnums FROM logs GROUP BY
> lognum;
>
> Where the flightnum field is a varchar containing either a text string or
> a three-or-four digit number. Now say I want to select all logs that have a
> flight number starting with an '8' (so '800' or '8000' series flights). My
> first thought was to do something like this:
>
> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs
> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
>
> But while this doesn't give an error, it also doesn't return any results.
> I'm guessing that this is because the wildcard is on the left of the
> operator, and needs to be on the right.


Right.  The LIKE operator does not have a commutator by default.  (And if
you created one for it, it could not use an index in this case.)


> Of course, turning it around to be:
>
> WHERE ANY(flightnum) like '8%'
>
> gives me a syntax error. So is there any way I can run this query such
> that I get any rows containing a flight number that starts with an 8 (or
> whatever)?
>

I think your best bet is to do a subquery against the unaggregated table.

select * from aggregated a where exists
  (select 1 from unaggregated ua where a.lognum=ua.lognum and flightnum
like '8%')


This is a common problem.  If you find a better solution, I'd love to hear
it!

Cheers,

Jeff


Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Tom Lane
Jeff Janes  writes:
> On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster 
> wrote:
>> My first thought was to do something like this:
>> 
>> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs
>> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
>> 
>> But while this doesn't give an error, it also doesn't return any results.
>> I'm guessing that this is because the wildcard is on the left of the
>> operator, and needs to be on the right.

> Right.  The LIKE operator does not have a commutator by default.  (And if
> you created one for it, it could not use an index in this case.)

Well, it couldn't use an index anyway, given that the query as written
wants to collect groups if *any* member is LIKE '8%', rather than
restricting the data to such flightnums before aggregation occurs.

Personally I'd suggest building a commutator operator (just need a
one-liner SQL or plpgsql function as infrastructure) and away you go.

> I think your best bet is to do a subquery against the unaggregated table.

> select * from aggregated a where exists
>   (select 1 from unaggregated ua where a.lognum=ua.lognum and flightnum
> like '8%')

That would work too, but not sure about performance relative to the other
way.

regards, tom lane




Re: [GENERAL] Constraint allowing value up to 2 times but no more than 2 times

2015-10-12 Thread Jim Nasby

On 10/12/15 1:41 PM, Dara Unglaube wrote:

I created a view with the column of interest and a column of the count.
How do I do a check constraint on a view or do it all at once as a
subquery? Could you provide an example of how to create?


Please don't top-post. And do include the mailing list so others can learn.

The constraint would go on the table, not the view. The column would 
need to not be a count, but a 'record number' or something similar. So 
you'd have one record with 'record_number=1' and the second with 
'record_number=2'.


CREATE TABLE ...(
  ...
  , record_number smallint NOT NULL CONSTRAINT 
record_number_must_be_1_or_2 CHECK( record_number BETWEEN 1 AND 2 )

);
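To actually cap the table at two rows per value, the CHECK above would be
paired with a unique constraint over the value column and the record number —
a hedged sketch (table and column names hypothetical, apart from trsqqq):

```sql
-- With record_number limited to 1 or 2 by the CHECK constraint, this
-- UNIQUE constraint means at most two rows can share a trsqqq value.
ALTER TABLE mytable
    ADD CONSTRAINT trsqqq_at_most_twice UNIQUE (trsqqq, record_number);
```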
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Israel Brewster
In the general case, that might work, however in my actual use case the
inner SQL query (and underlying table structure) is rather more complicated,
making it so I really want the WHERE clause in an outside query, as in my
(non-functional) example. Just to clarify, the actual structure of the query
is more like this:

SELECT ...
FROM
    (SELECT ...,
        (SELECT array_agg(flightnum)
         FROM legdetails
         WHERE logid = logs.id) AS flightnums -- this is where the array comes from that I want to filter on
     FROM logs
     ...
    ) s1
WHERE ...
ORDER BY ...

So the query is noticeably different from the one I originally gave, but the
end result is the same: an array in an inner query, and trying to filter
based on the contents of the array. Sorry if I confused the issue by trying
to simplify the concept too much.
---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---


On Oct 12, 2015, at 10:14 AM, dinesh kumar  wrote:

On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster  wrote:

Is there any way to do a pattern match against the elements of an array in 
postgresql (9.4 if the version makes a difference)? I have a grouped query 
that, among other things, returns an array of values, like:

SELECT lognum, array_agg(flightnum) as flightnums FROM logs GROUP BY lognum;

Where the flightnum field is a varchar containing either a text string or a three-or-four digit number. Now say I want to select all logs that have a flight number starting with an '8' (so '800' or '8000' series flights). My first thought was to do something like this:

SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);

But while this doesn't give an error, it also doesn't return any results. I'm guessing that this is because the wildcard is on the left of the operator, and needs to be on the right. Of course, turning it around to be:

WHERE ANY(flightnum) like '8%'

gives me a syntax error. So is there any way I can run this query such that I get any rows containing a flight number that starts with an 8 (or whatever)?
 Are you looking for this ? SELECT lognum array_agg(flightnum) FROM logs WHERE flightnum ~ '^8' GROUP BY lognum;   
---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---



-- 
Regards,
Dinesh
manojadinesh.blogspot.com



Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Israel Brewster
On Oct 12, 2015, at 10:39 AM, Tom Lane  wrote:
> 
> Jeff Janes  writes:
>> On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster 
>> wrote:
>>> My first thought was to do something like this:
>>> 
>>> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM logs
>>> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
>>> 
>>> But while this doesn't give an error, it also doesn't return any results.
>>> I'm guessing that this is because the wildcard is on the left of the
>>> operator, and needs to be on the right.
> 
>> Right.  The LIKE operator does not have a commutator by default.  (And if
>> you created one for it, it could not use an index in this case.)
> 
> Well, it couldn't use an index anyway, given that the query as written
> wants to collect groups if *any* member is LIKE '8%', rather than
> restricting the data to such flightnums before aggregation occurs.
> 
> Personally I'd suggest building a commutator operator (just need a
> one-liner SQL or plpgsql function as infrastructure) and away you go.

That could work. I'll look into that.

> 
>> I think your best bet is to do a subquery against the unaggregated table.
> 
>> select * from aggregated a where exists
>>  (select 1 from unaggregated ua where a.lognum=ua.lognum and flightnum
>> like '8%')
> 
> That would work too, but not sure about performance relative to the other
> way.
> 
>   regards, tom lane


---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---





Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Jeff Janes
On Mon, Oct 12, 2015 at 11:39 AM, Tom Lane  wrote:

> Jeff Janes  writes:
> > On Mon, Oct 12, 2015 at 10:58 AM, Israel Brewster  >
> > wrote:
> >> My first thought was to do something like this:
> >>
> >> SELECT * FROM (SELECT lognum,array_agg(flightnum) as flightnums FROM
> logs
> >> GROUP BY lognum) s1 WHERE '8%' like ANY(flightnums);
> >>
> >> But while this doesn't give an error, it also doesn't return any
> results.
> >> I'm guessing that this is because the wildcard is on the left of the
> >> operator, and needs to be on the right.
>
> > Right.  The LIKE operator does not have a commutator by default.  (And if
> > you created one for it, it could not use an index in this case.)
>
> Well, it couldn't use an index anyway, given that the query as written
> wants to collect groups if *any* member is LIKE '8%', rather than
> restricting the data to such flightnums before aggregation occurs.
>

I was jumping ahead a bit here.  I was thinking of the case where the
aggregation query was made a materialized view with a gin index on the
aggregated column.  It is not obvious that a "scalar operator
ANY(flightnums)" can't use an index on array_column, but it can't.  My
interest was more in the % operator from pg_trgm, but also the normal text
= operator would be nice to use here (as opposed to the much uglier <@ or
@> in which the scalar needs to be wrapped into a degenerate array.)
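That setup might look like this — a sketch under the thread's schema
assumptions (the logs table with lognum and flightnum columns; view and
index names invented here):

```sql
-- Materialized view aggregating flight numbers per log, with a GIN
-- index on the array column.  Array containment operators (<@, @>)
-- can use this index; "scalar LIKE ANY(array_column)" cannot.
CREATE MATERIALIZED VIEW log_flightnums AS
    SELECT lognum, array_agg(flightnum) AS flightnums
    FROM logs
    GROUP BY lognum;

CREATE INDEX log_flightnums_gin
    ON log_flightnums USING gin (flightnums);

-- Containment lookup that can use the GIN index -- note the scalar
-- wrapped into a one-element array, as described above.
SELECT *
FROM log_flightnums
WHERE flightnums @> ARRAY['800'::varchar];
```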



> Personally I'd suggest building a commutator operator (just need a
> one-liner SQL or plpgsql function as infrastructure) and away you go.
>

Right, something like this, using <~~ here as a placeholder name for the
reversed operator (any valid operator name will do):

create function like_rev (text, text) returns boolean as $$ select $2 like
$1 $$ language SQL;

create operator <~~ (procedure = like_rev,  leftarg=text, rightarg=text);

You can explicitly specify the commutator but it doesn't seem to be
necessary to do so:

create operator <~~ (procedure = like_rev,  leftarg=text,
rightarg=text, commutator = ~~ );



> > I think your best bet is to do a subquery against the unaggregated
> table.
>
> > select * from aggregated a where exists
> >   (select 1 from unaggregated ua where a.lognum=ua.lognum and flightnum
> > like '8%')
>
> That would work too, but not sure about performance relative to the other
> way.
>

In my experience, if the subselect can use an index on the LIKE and if '8%'
is rare, then using the subselect will be vastly better.  And if it is
indexable or not rare, it is still likely to be better, or at least not
worse by much. Disaggregating every array for every row to do the ANY is
pretty inefficient.  Particularly if you are not using a materialized view,
and so have to first aggregate it.
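
Assuming `logs` is the unaggregated table from earlier in the thread, the indexable subselect variant could be sketched as:

```sql
-- text_pattern_ops lets a left-anchored LIKE ('8%') use a btree index
-- regardless of the database collation.
CREATE INDEX logs_lognum_flightnum_idx
    ON logs (lognum, flightnum text_pattern_ops);

SELECT a.*
FROM   aggregated a
WHERE  EXISTS (SELECT 1 FROM logs ua
               WHERE  ua.lognum = a.lognum
               AND    ua.flightnum LIKE '8%');
```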

Of course good enough is good enough, so if scalar  ANY(array)  is good
enough...

Cheers,

Jeff


Re: [GENERAL] Understanding "seq scans"

2015-10-12 Thread Kevin Grittner
On Monday, October 12, 2015 2:52 PM, Lele Gaifax  wrote:

> I'm doing some experiments to find the better layout for
> reimplementing an existing db (MySQL cough!) with PostgreSQL
> 9.4+.
>
> I noticed a strange plan coming out from a simple query joining
> two tables, both containing 10M recs (and both ANALYZEd):

Your best bet to get good advice on this is to follow the steps
outlined here:

https://wiki.postgresql.org/wiki/SlowQueryQuestions

The short answer to the question of why it is picking those plans
is that the optimizer looks at the various plans that can
generate correct results, and picks the one with the lowest
estimated cost based on your costing factors.  To get a plan more
like what you seem to be expecting you might need to adjust cost
factors or create an index that allows more direct access to the
data needed by the query.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Understanding "seq scans"

2015-10-12 Thread Lele Gaifax
Hi all,

I'm doing some experiments to find the best layout for reimplementing
an existing db (MySQL, cough!) with PostgreSQL 9.4+.

I noticed a strange plan coming out from a simple query joining two tables,
both containing 10M recs (and both ANALYZEd):

l10ntest=# \d master;
                            Table "public.master"
 Column |  Type   |                      Modifiers
--------+---------+------------------------------------------------------
 num    | integer | not null default nextval('master_num_seq'::regclass)
Indexes:
    "master_pkey" PRIMARY KEY, btree (num)

l10ntest=# \d master_l10n;
                              Table "public.master_l10n"
 Column |         Type         |                        Modifiers
--------+----------------------+-----------------------------------------------------------
 num    | integer              | not null default nextval('master_l10n_num_seq'::regclass)
 lang   | character varying(2) | not null
 text   | text                 |
Indexes:
    "master_l10n_pkey" PRIMARY KEY, btree (num, lang)
    "l10n_text_index" btree (lower(text) text_pattern_ops)

l10ntest=# EXPLAIN SELECT count(l.num) AS count_1 FROM master_l10n l WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Aggregate  (cost=309315.38..309315.39 rows=1 width=4)
   ->  Bitmap Heap Scan on master_l10n l  (cost=64700.56..307801.65 rows=605492 width=4)
         Filter: (((lang)::text = 'it'::text) AND (lower(text) ~~ 'quattro%'::text))
         ->  Bitmap Index Scan on l10n_text_index  (cost=0.00..64549.19 rows=999662 width=0)
               Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND (lower(text) ~<~ 'quattrp'::text))
(5 rows)

Time: 1.665 ms

l10ntest=# EXPLAIN SELECT count(m.num) AS count_1 FROM master_l10n l JOIN master m ON m.num = l.num WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Aggregate  (cost=676558.14..676558.15 rows=1 width=4)
   ->  Hash Join  (cost=373011.02..675044.41 rows=605492 width=4)
         Hash Cond: (l.num = m.num)
         ->  Bitmap Heap Scan on master_l10n l  (cost=64700.56..307801.65 rows=605492 width=4)
               Filter: (((lang)::text = 'it'::text) AND (lower(text) ~~ 'quattro%'::text))
               ->  Bitmap Index Scan on l10n_text_index  (cost=0.00..64549.19 rows=999662 width=0)
                     Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND (lower(text) ~<~ 'quattrp'::text))
         ->  Hash  (cost=144247.76..144247.76 rows=976 width=4)
               ->  Seq Scan on master m  (cost=0.00..144247.76 rows=976 width=4)
(9 rows)

Time: 1.244 ms

l10ntest=# SELECT count(l.num) AS count_1 FROM master_l10n l WHERE l.lang = 
'it' AND lower(l.text) LIKE 'quattro%';
 count_1 
-
 1101101
(1 row)

Time: 1221.941 ms

l10ntest=# SELECT count(m.num) AS count_1 FROM master_l10n l JOIN master m 
ON m.num = l.num WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
 count_1 
-
 1101101
(1 row)

Time: 3541.852 ms

Why does the join on the master table require a "Seq Scan on master"? I tried
different kinds of "JOIN", but the resulting explanation remains the same.

Am I missing something, or should I stop worrying about that sequential scan?

Thanks in advance,
bye, lele.
-- 
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
l...@metapensiero.it  | -- Fortunato Depero, 1929.





Re: [GENERAL] Understanding "seq scans"

2015-10-12 Thread Adrian Klaver

On 10/12/2015 12:06 PM, Lele Gaifax wrote:

Hi all,

I'm doing some experiments to find the better layout for reimplementing
an existing db (MySQL cough!) with PostgreSQL 9.4+.

I noticed a strange plan coming out from a simple query joining two tables,
both containing 10M recs (and both ANALYZEd):

 [...quoted schema and EXPLAIN output snipped; see Lele's original message above...]

Why does the join on the master table require a "Seq Scan on master"? I tried
different kinds of "JOIN", but the resulting explanation remains the same.

Am I missing something, or should I stop worrying about that sequential scan?


Offhand I would say it is because of this --> count(m.num). Try
count(l.num) instead and see what happens. As your queries above show,
they return the same number.




Thanks in advance,
bye, lele.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Understanding "seq scans"

2015-10-12 Thread Merlin Moncure
On Monday, October 12, 2015, Lele Gaifax  wrote:

> Adrian Klaver writes:
>
> > Off hand I would say it is because of this --> count(m.num). Try
> count(l.num) instead and see
> > what happens. As your queries above show they are the same number.
>
> No, that's another thing I already tried tweaking and should have
> mentioned.
> Neither count(*) nor count(l.num) have any influence on the plan.
>
> Following Kevin's advice (thanks!) I read the SlowQueryQuestions wiki page
> and
> learned about the "buffers" EXPLAIN option:
>
> [...quoted EXPLAIN (analyze,buffers) output and version() snipped; see Lele's message below...]

Your data and indexes are organized such that an index is only marginally
helpful, or so the planner thinks.  Try:

1. Cranking effective_cache_size so the planner might think your data is
cached.

2. Reducing random_page_cost so the planner is less averse to random I/O
(index scans)

3. Temporarily disabling seq scans

4. Composite index for better lookups.
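
A rough translation of those four suggestions into SQL; all values here are illustrative assumptions, not recommendations:

```sql
-- 1 & 2: per-session planner nudges for experimentation
SET effective_cache_size = '8GB';   -- tell the planner more data is likely cached
SET random_page_cost = 1.1;         -- closer to seq_page_cost favors index plans

-- 3: diagnostic only; never leave this off in production
SET enable_seqscan = off;

-- 4: a composite index matching the query's filter columns
CREATE INDEX master_l10n_lang_text_idx
    ON master_l10n (lang, lower(text) text_pattern_ops);
```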

merlin


Re: [GENERAL] Merge join vs merge semi join against primary key

2015-10-12 Thread David Rowley
On 10 October 2015 at 08:52, Sean Rhea  wrote:

>
> 1. When I join two tables with "WHERE id IN (...)" versus with an explicit
> join, and the join column for the inner table is a primary key, I would
> expect
> the same behavior in both cases, but the optimizer is choosing a merge
> join in
> one case and a merge semi join in the other. There's at most one customer
> with a given id. Why not do a semi join?
>
>
Unfortunately the 9.2 planner does not make any checks to verify that
customers.id is unique to perform a semi join. There is a pending patch in
the 9.6 cycle to add this optimisation.


> production=> select count(*) from customers;
>  count
> 
>  473733
> (1 row)
>
> ...


>->  Index Scan using customers_pkey on customers o
>  (cost=0.00..63836.61 rows=836 width=8) (actual time=159.840..201915.765
> rows=7672 loops=1)
>  Filter: (group_id = 45)
>  Rows Removed by Filter: 212699113
>
>
Rows Removed by Filter: 212699113 seems to indicate that your 473733 row
count for "customers" is incorrect.

If you're doing lots of filtering on group_id, then perhaps you should
think about adding an index on customers (group_id,id)
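
For instance (the index name is arbitrary):

```sql
-- Composite index serving both the group_id filter and the id join/lookup.
CREATE INDEX customers_group_id_id_idx ON customers (group_id, id);

-- Re-gather statistics afterwards, since the row-count estimate looked stale:
ANALYZE customers;
```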


--
 David Rowley   http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Training & Services


Re: [GENERAL] Pattern match against array elements?

2015-10-12 Thread Israel Brewster



On Oct 12, 2015, at 11:50 AM, Jeff Janes wrote:

> On Mon, Oct 12, 2015 at 11:39 AM, Tom Lane wrote:
>> [...discussion of the commutator approach, quoted in full above...]
>
> Right, something like:
>
> create function like_rev (text, text) returns boolean as $$ select $2 like $1 $$ language SQL;
>
> create operator  (procedure = like_rev,  leftarg=text, rightarg=text);

...which actually works perfectly for my use case. No, it doesn't use an
index, however even the worst-case scenario on my data, where that is the
ONLY criteria given, "only" takes about 10 seconds. Yes, that is a "long"
time, however a) 99% of the time there will be other criteria used as well,
drastically reducing the result set and speeding up the query, and b) the
query is used as part of a report generator, for which there isn't really a
problem if the user has to wait a few seconds.

> In my experience, if the subselect can use an index on the LIKE and if '8%'
> is rare, then using the subselect will be vastly better. [...] Of course
> good enough is good enough, so if scalar  ANY(array)  is good enough...

Exactly. I think I could make the sub-select work, with some tweaking, and
as it could well improve performance noticeably I may well spend some time
on it, but the commutator operator "just works" and integrates quite nicely
with my existing query structure.

Thanks for the help!


Re: [GENERAL] Understanding "seq scans"

2015-10-12 Thread Lele Gaifax
Adrian Klaver  writes:

> Off hand I would say it is because of this --> count(m.num). Try count(l.num) 
> instead and see
> what happens. As your queries above show they are the same number.

No, that's another thing I already tried tweaking and should have mentioned.
Neither count(*) nor count(l.num) have any influence on the plan.

Following Kevin's advice (thanks!) I read the SlowQueryQuestions wiki page and
learned about the "buffers" EXPLAIN option:

EXPLAIN (analyze,buffers) SELECT count(l.num) AS count_1 FROM master_l10n l JOIN master m ON m.num = l.num WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=676558.14..676558.15 rows=1 width=4) (actual time=4133.991..4133.991 rows=1 loops=1)
   Buffers: shared hit=6 read=84710, temp read=32652 written=32398
   ->  Hash Join  (cost=373011.02..675044.41 rows=605492 width=4) (actual time=1940.285..4074.654 rows=1101101 loops=1)
         Hash Cond: (l.num = m.num)
         Buffers: shared hit=6 read=84710, temp read=32652 written=32398
         ->  Bitmap Heap Scan on master_l10n l  (cost=64700.56..307801.65 rows=605492 width=4) (actual time=201.132..1286.629 rows=1101101 loops=1)
               Filter: (((lang)::text = 'it'::text) AND (lower(text) ~~ 'quattro%'::text))
               Heap Blocks: exact=25621
               Buffers: shared hit=1 read=40464
               ->  Bitmap Index Scan on l10n_text_index  (cost=0.00..64549.19 rows=999662 width=0) (actual time=195.946..195.946 rows=1101101 loops=1)
                     Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND (lower(text) ~<~ 'quattrp'::text))
                     Buffers: shared read=14844
         ->  Hash  (cost=144247.76..144247.76 rows=976 width=4) (actual time=1738.180..1738.180 rows=999 loops=1)
               Buckets: 16384  Batches: 128  Memory Usage: 2778kB
               Buffers: shared hit=2 read=44246, temp written=29000
               ->  Seq Scan on master m  (cost=0.00..144247.76 rows=976 width=4) (actual time=0.006..629.590 rows=999 loops=1)
                     Buffers: shared hit=2 read=44246
 Planning time: 0.493 ms
 Execution time: 4134.144 ms
(19 rows)

# select version();
                                               version
------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 5.2.1-21) 5.2.1 20151003, 64-bit
(1 row)

Thank you,
ciao, lele.
-- 
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
l...@metapensiero.it  | -- Fortunato Depero, 1929.





Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:

2015-10-12 Thread Victor Blomqvist
Do you have any advice on how to design my functions to work around this
problem?

If I understand your conversation correctly, the problem is returning the
rowtype users from the function. If so, I can think of two workarounds
(both quite inconvenient and complex):

1. Use RETURNS TABLE(...) together with not selecting * in the functions.
2. Use RETURNS  also without select * in the functions.
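
A sketch of workaround 1, reusing the `users`/`select_users` example from earlier in the thread (the column list is an assumption):

```sql
-- RETURNS TABLE pins the result to an explicit column list, so the function's
-- result type no longer tracks the users rowtype when columns are dropped.
-- (An existing SETOF users version must be dropped first, since CREATE OR
-- REPLACE cannot change a function's return type.)
CREATE OR REPLACE FUNCTION select_users(id_ integer)
RETURNS TABLE (id integer, name varchar) AS
$$
BEGIN
    RETURN QUERY SELECT u.id, u.name FROM users u WHERE u.id = id_;
END;
$$ LANGUAGE plpgsql;
```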

What do other people do in this situation? For our system the lowest load
is in the late night, 04 - 06, which might have sufficiently low load to
avoid the issue, but I would much prefer to run schema changes when there
are people in the office.

/Victor

On Mon, Oct 12, 2015 at 10:15 PM, Adrian Klaver 
wrote:

> On 10/12/2015 06:53 AM, Tom Lane wrote:
>
>> Andres Freund  writes:
>>
>>> On 2015-10-09 14:32:44 +0800, Victor Blomqvist wrote:
>>>
 CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
 $$
 BEGIN
 RETURN QUERY SELECT * FROM users WHERE id = id_;
 END;
 $$ LANGUAGE plpgsql;

>>>
>> My guess is that the problem here is that table level locking prevents
>>> modification of the "users" type when the table is used, but there's no
>>> locking preventing the columns to be dropped while the function is
>>> used. So what happens is that 1) the function is parsed & planned 2)
>>> DROP COLUMN is executed 3) the contained statement is executed 4) a
>>> mismatch between the contained statement and the function definition is
>>> detected.
>>>
>>
>> The query plan as such does get refreshed, I believe.  The problem is that
>> plpgsql has no provision for the definition of a named composite type to
>> change after a function's been parsed.  This applies to variables of named
>> composite types for sure, and based on this example I think it must apply
>> to the function result type as well, though I'm too lazy to go check the
>> code right now.
>>
>
> That makes sense. The problem is that I cannot square that with Albe's
> example, which I tested also:
>
> "
> Session 1:
>
> test=> CREATE TABLE users (id integer PRIMARY KEY, name varchar NOT NULL,
> to_be_removed integer NOT NULL);
> CREATE TABLE
> test=> CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
>$$BEGIN RETURN QUERY SELECT * FROM users WHERE id = id_; END;$$
> LANGUAGE plpgsql;
> CREATE FUNCTION
>
> Session 2:
>
> test=> SELECT id, name FROM select_users(18);
>  id | name
> +--
> (0 rows)
>
> Ok, now the plan is cached.
>
> Now in Session 1:
>
> test=> ALTER TABLE users DROP COLUMN to_be_removed;
> ALTER TABLE
>
> Session2:
>
> test=> SELECT id, name FROM select_users(18);
>  id | name
> +--
> (0 rows)
>
> No error.  This is 9.4.4.
> "
>
>
>> We have had past discussions about fixing this.  I believe it would
>> require getting rid of use of plpgsql's "row" infrastructure for named
>> composites, at least in most cases, and going over to the "record"
>> infrastructure instead.  In the past the conversations have stalled as
>> soon as somebody complained that that would probably make some operations
>> slower.  I don't entirely understand that objection, since (a) some other
>> operations would probably get faster, and (b) performance does not trump
>> correctness.  But that's where the discussion stands at the moment.
>>
>> regards, tom lane
>>
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: [GENERAL] BDR: no free replication state could be found

2015-10-12 Thread Craig Ringer
On 10 October 2015 at 02:53, Selim Tuvi  wrote:
> node: deliver_sing (the problem node):
>
> postgres=# SELECT * FROM pg_catalog.pg_replication_identifier;
>  riident | riname
> -+
>1 | bdr_6197393155020108291_1_47458_16385_
>2 | bdr_6199712740068695651_1_16385_16385_
>3 | bdr_6197393155020108291_1_47458_17167_
>4 | bdr_6199712740068695651_1_16385_17167_
>5 | bdr_6199712740068695651_1_18817_17951_
>6 | bdr_6197393155020108291_1_48609_17951_
>7 | bdr_6197393155020108291_1_48609_19685_
>8 | bdr_6199712740068695651_1_18817_19685_
> (8 rows)


> On 9 October 2015 at 06:54, Selim Tuvi  wrote:

>> "recovered replication state of node 6 to 0/59F35A8",""
>> "no free replication state could be found, increase
>> max_replication_slots",""

The number of supported replication identifiers (in bdr 9.4) is
controlled by max_replication_slots, hence the error message. This
should be documented; I'll amend the docs appropriately.

https://github.com/2ndQuadrant/bdr/issues/133
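
In the meantime the workaround is to raise the limit on every node; the value below is an assumption — size it above the number of BDR peers, with slack:

```
# postgresql.conf on each node; changing it requires a restart
max_replication_slots = 16
```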

The identifiers aren't currently dropped during node part, which
should be changed. It hasn't come up before now because frequent node
addition and removal is something to be avoided, and because most
deployments configure room for more slots than needed to avoid future
restarts.

https://github.com/2ndQuadrant/bdr/issues/134

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [GENERAL] Understanding "seq scans"

2015-10-12 Thread Alvaro Herrera
Lele Gaifax wrote:
> Hi all,
> 
> I'm doing some experiments to find the better layout for reimplementing
> an existing db (MySQL cough!) with PostgreSQL 9.4+.
> 
> I noticed a strange plan coming out from a simple query joining two tables,
> both containing 10M recs (and both ANALYZEd):

>  ->  Bitmap Index Scan on l10n_text_index  (cost=0.00..64549.19 
> rows=999662 width=0)
>Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND 
> (lower(text) ~<~ 'quattrp'::text))

So 10% of your rows in the master_l10n table start with "quattro"?
That's pretty odd, isn't it?  How did you manufacture these data?

I wonder if the l10n_text_index index should be on (lang, lower(text)).
How often are you going to look for translated text without specifying a
language?
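
If queries always constrain the language, the index Álvaro suggests might look like this (the index name is arbitrary):

```sql
CREATE INDEX l10n_lang_text_idx
    ON master_l10n (lang, lower(text) text_pattern_ops);

-- Lets a query like the following resolve with one index range scan:
-- SELECT count(*) FROM master_l10n
-- WHERE lang = 'it' AND lower(text) LIKE 'quattro%';
```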

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] BDR workers exiting?

2015-10-12 Thread Craig Ringer
BDR is currently memory-limited for extremely large transactions. At a
guess, I'd say one of your big tables is large enough that the logical
decoding facility BDR uses can't keep track of the transaction
properly.

There's no hard limit, it depends on details of the transaction and a
number of other variables, but "many tens or hundreds of GB" is
generally too much.

If I was to load such a big DB, I'd probably do it with ETL tools that
could split up the load and do it progressively.
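
One minimal way to keep each transaction small, sketched with psql's `\copy` (table and file names are made up):

```sql
-- Each \copy outside an explicit BEGIN/COMMIT runs in its own transaction,
-- so BDR's logical decoding never has to buffer the entire load at once.
\copy big_table FROM 'chunk_0001.csv' WITH (FORMAT csv)
\copy big_table FROM 'chunk_0002.csv' WITH (FORMAT csv)
-- ...one chunk at a time, e.g. files produced by split(1)
```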

