Re: [GENERAL] INSERT - UPDATE throughput oscillating and SSD activity after stopping the client

2016-12-06 Thread Tom DalPozzo
Hi,
I did: pkill -x postgres
so it should send SIGTERM.
Regards
Pupillo

Re: [GENERAL] INSERT - UPDATE throughput oscillating and SSD activity after stopping the client

2016-12-06 Thread Jeff Janes
On Tue, Dec 6, 2016 at 2:44 AM, Tom DalPozzo  wrote:

> Hi,
> about SSD light:
>


> I guessed it was WAL -> actual db files data traffic. It explains why the
> light stops blinking after shutting down the server (I did it via kill
> command) .
>

Do you kill with -15 (the default) or -9?  And which process, the postgres
master itself or just some random child?


> But if so, I expected the light to restart blinking after restarting
> the server (in order to continue WAL->db activity).
>

The normal checkpoint is paced.  So trickling out data slowly will keep the
light on, but not actually stress the system.

When you shutdown the system, it does a fast checkpoint.  This gets the
data written out as quickly as possible (since you are shutting down, it
doesn't worry about interfering with performance for other users, as there
are none), so once it is done you don't see the light anymore.  If you do a
clean shutdown (kill -15 of the postgres master) this fast checkpoint
happens upon shutdown.  If you do an abort (kill -9) then the fast
checkpoint happens upon start-up, once recovery is finished but before the
database is opened for regular use.
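
A sketch of forcing the same unpaced write-out by hand, without a shutdown
(the CHECKPOINT command requires superuser):

CHECKPOINT;  -- immediate, unpaced checkpoint, like the one at clean shutdown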

 Cheers,

Jeff


Re: [GENERAL] INSERT - UPDATE throughput oscillating and SSD activity after stopping the client

2016-12-06 Thread Tom DalPozzo
Hi,
about SSD light:
I guessed it was WAL -> actual db files data traffic. It explains why the
light stops blinking after shutting down the server (I did it via kill
command) . But if so, I expected the light to restart blinking after
restarting the server (in order to continue WAL->db activity).
Regards





2016-12-05 20:02 GMT+01:00 Jeff Janes :

> On Fri, Dec 2, 2016 at 9:40 AM, Tom DalPozzo  wrote:
>
>> Hi,
>> I've two tables, t1 and t2, both with one bigint id indexed field and one
>> 256 char data field; t1 has always got 1 row, while t2 is increasing as
>> explained in the following.
>>
>> My libpq client continuously updates one row in t1 (every time targeting
>> a different row) and inserts a new row in t2. All this in blocks of 1000
>> update-insert pairs per commit, in order to get better performance.
>> wal_sync_method is fsync, fsync is on; my conf file is attached.
>> I've a 3.8 GHz laptop with an EVO SSD.
>>
>> Performance is  measured every two executed blocks and related to these
>> blocks.
>>
>> Over the first few minutes performance is around 10Krow/s then it slowly
>> drops, over next few minutes to 4Krow/s, then it slowly returns high and so
>> on, like a wave.
>> I don't understand this behaviour. Is it normal? What does it depend on?
>>
>
> Yes, that is normal.  It is also very complicated.  It depends on pretty
> much everything.  PostgreSQL, kernel, filesystem, IO controller, firmware,
> hardware, other things going on on the computer simultaneously, etc.
>
>
>>
>> Also, when I stop the client I see the SSD light still heavily working.
>>
>
> This is normal.  It writes out critical data to a WAL log first, and then
> leisurely writes out the changes to the actual data files later.  In the
> case of a crash, the WAL will be used to replay the data file changes which
> may or may not have made it to disk.
>
> It would last quite a while unless I stop the postgresql server, in this
>> case it suddenly stops.
>>
>
> Do you stop postgresql with fast or immediate shutdown?
>
>
>> If I restart the server it remains off.
>> I'm wondering if it's normal. I'd like to be sure that my data are safe
> once committed.
>>
>
> If your kernel/fs/SSD doesn't lie about syncing the data, then your data
> is safe once committed. (It is possible there are bugs in PostgreSQL, of
> course, but nothing you report indicates you have found one).
>
> If you really want to be sure that the full stack, from PostgreSQL down to
> the hardware on the SSD, is crash safe, the only real way is to do some
> "pull the plug" tests.
>
> Cheers,
>
> Jeff
>


Re: [GENERAL] INSERT - UPDATE throughput oscillating and SSD activity after stopping the client

2016-12-05 Thread Jeff Janes
On Fri, Dec 2, 2016 at 9:40 AM, Tom DalPozzo  wrote:

> Hi,
> I've two tables, t1 and t2, both with one bigint id indexed field and one
> 256 char data field; t1 has always got 1 row, while t2 is increasing as
> explained in the following.
>
> My libpq client continuously updates one row in t1 (every time targeting
> a different row) and inserts a new row in t2. All this in blocks of 1000
> update-insert pairs per commit, in order to get better performance.
> wal_sync_method is fsync, fsync is on; my conf file is attached.
> I've a 3.8 GHz laptop with an EVO SSD.
>
> Performance is  measured every two executed blocks and related to these
> blocks.
>
> Over the first few minutes performance is around 10Krow/s then it slowly
> drops, over next few minutes to 4Krow/s, then it slowly returns high and so
> on, like a wave.
> I don't understand this behaviour. Is it normal? What does it depend on?
>

Yes, that is normal.  It is also very complicated.  It depends on pretty
much everything.  PostgreSQL, kernel, filesystem, IO controller, firmware,
hardware, other things going on on the computer simultaneously, etc.


>
> Also, when I stop the client I see the SSD light still heavily working.
>

This is normal.  It writes out critical data to a WAL log first, and then
leisurely writes out the changes to the actual data files later.  In the
case of a crash, the WAL will be used to replay the data file changes which
may or may not have made it to disk.

It would last quite a while unless I stop the postgresql server, in this
> case it suddenly stops.
>

Do you stop postgresql with fast or immediate shutdown?


> If I restart the server it remains off.
> I'm wondering if it's normal. I'd like to be sure that my data are safe
> once committed.
>

If your kernel/fs/SSD doesn't lie about syncing the data, then your data is
safe once committed. (It is possible there are bugs in PostgreSQL, of
course, but nothing you report indicates you have found one).

If you really want to be sure that the full stack, from PostgreSQL down to
the hardware on the SSD, is crash safe, the only real way is to do some
"pull the plug" tests.

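A sketch of watching that WAL-first behavior from SQL (names as of 9.x; in
v10 pg_current_xlog_location was renamed pg_current_wal_lsn):

SELECT pg_current_xlog_location();   -- WAL advances as commits land

SELECT checkpoints_timed, checkpoints_req,
       buffers_checkpoint, buffers_clean, buffers_backend
FROM   pg_stat_bgwriter;             -- dirty buffers trickle out later
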
Cheers,

Jeff


Re: [GENERAL] INSERT - UPDATE throughput oscillating and SSD activity after stopping the client

2016-12-05 Thread Tom DalPozzo
I tried to tune some parameters without appreciable changes in this
behaviour.
I tried to play with:
checkpoint_timeout
max_wal_size
shared_buffers
commit_delay
checkpoint_completion_target

No meaningful info found in the log file.

Regards



2016-12-04 4:02 GMT+01:00 Tomas Vondra :

> On Fri, 2016-12-02 at 13:45 -0800, Adrian Klaver wrote:
> >
> > On 12/02/2016 09:40 AM, Tom DalPozzo wrote:
> > >
> > >
> > > Hi,
> > > I've two tables, t1 and t2, both with one bigint id indexed field
> > > and
> > > one 256 char data field; t1 has always got 1 row, while t2 is
> > > increasing as explained in the following.
> > >
> > > My libpq client continuously updates one row in t1 (every time
> > > targeting a different row) and inserts a new row in t2. All this in
> > > blocks of 1000 update-insert pairs per commit, in order to get better
> > > performance.
> > > wal_sync_method is fsync, fsync is on; my conf file is attached.
> > > I've a 3.8 GHz laptop with an EVO SSD.
> > >
> > > Performance is  measured every two executed blocks and related to
> > > these
> > > blocks.
> > >
> > > Over the first few minutes performance is around 10Krow/s then it
> > > slowly
> > > drops, over next few minutes to 4Krow/s, then it slowly returns
> > > high and
> > > so on, like a wave.
> > > I don't understand this behaviour. Is it normal? What does it
> > > depend on?
> > Have you looked at the Postgres log entries that cover these
> > episodes?
> >
> > Is there anything of interest there?
> >
> In particular look at checkpoints. In the config file you've changed
> checkpoint_timeout, but you haven't changed max_wal_size, so my guess
> is the checkpoints happen every few minutes, and run for about 1/2 the
> time (thanks to checkpoint_completion_target=0.5). That would be consistent
> with the pattern of good/bad performance.
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>


Re: [GENERAL] INSERT - UPDATE throughput oscillating and SSD activity after stopping the client

2016-12-03 Thread Tomas Vondra
On Fri, 2016-12-02 at 13:45 -0800, Adrian Klaver wrote:
> 
> On 12/02/2016 09:40 AM, Tom DalPozzo wrote:
> > 
> > 
> > Hi,
> > I've two tables, t1 and t2, both with one bigint id indexed field
> > and
> > one 256 char data field; t1 has always got 1 row, while t2 is
> > increasing as explained in the following.
> > 
> > My libpq client continuously updates one row in t1 (every time
> > targeting a different row) and inserts a new row in t2. All this in
> > blocks of 1000 update-insert pairs per commit, in order to get better
> > performance.
> > wal_sync_method is fsync, fsync is on; my conf file is attached.
> > I've a 3.8 GHz laptop with an EVO SSD.
> > 
> > Performance is  measured every two executed blocks and related to
> > these
> > blocks.
> > 
> > Over the first few minutes performance is around 10Krow/s then it
> > slowly
> > drops, over next few minutes to 4Krow/s, then it slowly returns
> > high and
> > so on, like a wave.
> > I don't understand this behaviour. Is it normal? What does it
> > depend on?
> Have you looked at the Postgres log entries that cover these
> episodes?
> 
> Is there anything of interest there?
> 
In particular look at checkpoints. In the config file you've changed
checkpoint_timeout, but you haven't changed max_wal_size, so my guess
is the checkpoints happen every few minutes, and run for about 1/2 the
time (thanks to checkpoint_completion_target=0.5). That would be consistent
with the pattern of good/bad performance.
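
As an illustration of that kind of tuning, a sketch (the values are
arbitrary examples, not recommendations; ALTER SYSTEM needs 9.4+, and both
settings take effect on reload):

ALTER SYSTEM SET max_wal_size = '4GB';                -- fewer forced checkpoints
ALTER SYSTEM SET checkpoint_completion_target = 0.9;  -- spread writes out
SELECT pg_reload_conf();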

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [GENERAL] INSERT - UPDATE throughput oscillating and SSD activity after stopping the client

2016-12-02 Thread Adrian Klaver

On 12/02/2016 09:40 AM, Tom DalPozzo wrote:

Hi,
I've two tables, t1 and t2, both with one bigint id indexed field and
one 256 char data field; t1 has always got 1 row, while t2 is
increasing as explained in the following.

My libpq client continuously updates one row in t1 (every time
targeting a different row) and inserts a new row in t2. All this in
blocks of 1000 update-insert pairs per commit, in order to get better performance.
wal_sync_method is fsync, fsync is on; my conf file is attached.
I've a 3.8 GHz laptop with an EVO SSD.

Performance is  measured every two executed blocks and related to these
blocks.

Over the first few minutes performance is around 10Krow/s then it slowly
drops, over next few minutes to 4Krow/s, then it slowly returns high and
so on, like a wave.
I don't understand this behaviour. Is it normal? What does it depend on?


Have you looked at the Postgres log entries that cover these episodes?

Is there anything of interest there?



Also, when I stop the client I see the SSD light still heavily working.
It would last quite a while unless I stop the postgresql server, in this
case it suddenly stops. If I restart the server it remains off.
I'm wondering if it's normal. I'd like to be sure that my data are safe
once committed.

Regards
Pupillo

P.S.: I put this question in general questions as my concern is not if
the performance is high or not.















--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] INSERT - UPDATE throughput oscillating and SSD activity after stopping the client

2016-12-02 Thread Tom DalPozzo
Hi,
I've two tables, t1 and t2, both with one bigint id indexed field and one
256 char data field; t1 has always got 1 row, while t2 is increasing as
explained in the following.

My libpq client continuously updates one row in t1 (every time targeting a
different row) and inserts a new row in t2. All this in blocks of 1000
update-insert pairs per commit, in order to get better performance.
wal_sync_method is fsync, fsync is on; my conf file is attached.
I've a 3.8 GHz laptop with an EVO SSD.

Performance is  measured every two executed blocks and related to these
blocks.

Over the first few minutes performance is around 10Krow/s then it slowly
drops, over next few minutes to 4Krow/s, then it slowly returns high and so
on, like a wave.
I don't understand this behaviour. Is it normal? What does it depend on?

Also, when I stop the client I see the SSD light still heavily working. It
would last quite a while unless I stop the postgresql server, in this case
it suddenly stops. If I restart the server it remains off.
I'm wondering if it's normal. I'd like to be sure that my data are safe
once committed.

Regards
Pupillo

P.S.: I put this question in general questions as my concern is not if the
performance is high or not.


postgresql.conf
Description: Binary data



Re: [GENERAL] Using UPDATE ... RETURNING in a custom SQL function, which RETURNS integer

2016-12-02 Thread Adrian Klaver

On 12/02/2016 04:23 AM, Alexander Farber wrote:

Thank you, Rob -

On Fri, Dec 2, 2016 at 11:12 AM, Rob Sargent  wrote:


> On Dec 2, 2016, at 2:52 AM, Alexander Farber
> wrote:
>
> CREATE OR REPLACE FUNCTION words_unban_user(
> in_uid integer)
> RETURNS integer AS
> $func$
> UPDATE words_users SET
> banned_until = null,
> banned_reason = null
> WHERE uid = in_uid
> RETURNING uid;  -- returns the user to be notified
>
> $func$ LANGUAGE sql;
>
> words=> SELECT uid FROM words_unban_user(1);
> ERROR:  column "uid" does not exist
> LINE 1: SELECT uid FROM words_unban_user(1);
>^
>

select words_unban_user(1) as uid;
Your function returns an int not a table.


this has worked well.

However if I rewrite the same function as "language plpgsql" - then
suddenly both ways of calling work:

CREATE OR REPLACE FUNCTION words_unban_user(
in_uid integer,
OUT out_uid integer)
RETURNS integer AS
$func$
BEGIN
UPDATE words_users SET
banned_until = null,
banned_reason = null
WHERE uid = in_uid
RETURNING uid into out_uid;
END
$func$ LANGUAGE plpgsql;

words=> select out_uid AS uid from words_unban_user(1);
 uid
-
   1
(1 row)

words=> select words_unban_user(1) AS uid;
 uid
-
   1
(1 row)

I am curious, why is it so...


In the SQL function you are just using RETURNING to pop out the uid, 
which is not actually assigned to any output variable name but just 
returned as an integer.


In the plpgsql case you actually assign uid to an output variable name.

FYI, you can have OUT in SQL functions also:

https://www.postgresql.org/docs/9.5/static/xfunc-sql.html#XFUNC-OUTPUT-PARAMETERS
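
A sketch of the SQL-language version with an OUT parameter, per the page
above (same schema as in the thread):

CREATE OR REPLACE FUNCTION words_unban_user(
        in_uid integer,
        OUT out_uid integer)
AS $func$
        UPDATE words_users SET
                banned_until = null,
                banned_reason = null
        WHERE uid = in_uid
        RETURNING uid;
$func$ LANGUAGE sql;

-- after which both call styles work:
-- SELECT out_uid AS uid FROM words_unban_user(1);
-- SELECT words_unban_user(1) AS uid;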



Regards
Alex








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Using UPDATE ... RETURNING in a custom SQL function, which RETURNS integer

2016-12-02 Thread Alexander Farber
Thank you, Rob -

On Fri, Dec 2, 2016 at 11:12 AM, Rob Sargent  wrote:

>
> > On Dec 2, 2016, at 2:52 AM, Alexander Farber 
> wrote:
> >
> > CREATE OR REPLACE FUNCTION words_unban_user(
> > in_uid integer)
> > RETURNS integer AS
> > $func$
> > UPDATE words_users SET
> > banned_until = null,
> > banned_reason = null
> > WHERE uid = in_uid
> > RETURNING uid;  -- returns the user to be notified
> >
> > $func$ LANGUAGE sql;
> >
> > words=> SELECT uid FROM words_unban_user(1);
> > ERROR:  column "uid" does not exist
> > LINE 1: SELECT uid FROM words_unban_user(1);
> >^
> >
>
> select words_unban_user(1) as uid;
> Your function returns an int not a table.


this has worked well.

However if I rewrite the same function as "language plpgsql" - then
suddenly both ways of calling work:

CREATE OR REPLACE FUNCTION words_unban_user(
in_uid integer,
OUT out_uid integer)
RETURNS integer AS
$func$
BEGIN
UPDATE words_users SET
banned_until = null,
banned_reason = null
WHERE uid = in_uid
RETURNING uid into out_uid;
END
$func$ LANGUAGE plpgsql;

words=> select out_uid AS uid from words_unban_user(1);
 uid
-
   1
(1 row)

words=> select words_unban_user(1) AS uid;
 uid
-
   1
(1 row)

I am curious, why is it so...

Regards
Alex


Re: [GENERAL] Using UPDATE ... RETURNING in a custom SQL function, which RETURNS integer

2016-12-02 Thread rob stone


> And finally here is the failing usage of the function :
> 
> words=> SELECT uid FROM words_unban_user(1);
> ERROR:  column "uid" does not exist
> LINE 1: SELECT uid FROM words_unban_user(1);
>    ^
> 
> 
Shouldn't you be doing:-

SELECT words_unban_user(1);






[GENERAL] Using UPDATE ... RETURNING in a custom SQL function, which RETURNS integer

2016-12-02 Thread Alexander Farber
Hello,

why does this fail in PostgreSQL 9.5 please?

Here is my custom SQL function :

CREATE OR REPLACE FUNCTION words_unban_user(
in_uid integer)
RETURNS integer AS
$func$
UPDATE words_users SET
banned_until = null,
banned_reason = null
WHERE uid = in_uid
RETURNING uid;  -- returns the user to be notified

$func$ LANGUAGE sql;

Here is my table:

words=> TABLE words_users;
-[ RECORD 1 ]-+-----------------------------
uid           | 1
created       | 2016-12-02 10:33:59.0761+01
visited       | 2016-12-02 10:36:36.3521+01
ip            | 127.0.0.1
vip_until     |
grand_until   |
banned_until  | 2016-12-09 10:34:09.9151+01
banned_reason | ban user 1
win           | 0
loss          | 0
draw          | 0
elo           | 1500
medals        | 0
green         | 0
red           | 0
coins         | 0
(1 row)

And finally here is the failing usage of the function :

words=> SELECT uid FROM words_unban_user(1);
ERROR:  column "uid" does not exist
LINE 1: SELECT uid FROM words_unban_user(1);
   ^

The background is that this is a websockets-based game and some of the custom
functions should return a list of user ids to be notified about changes
(like player was banned, opponent has resigned, ...)

In the custom plpgsql functions I use OUT parameters or return table with
RETURN NEXT and it works fine.

But in the above sql function this does not work...

Regards
Alex


Re: [GENERAL] FOR UPDATE

2016-11-28 Thread said assemlal
> Why do you want to lock these results?

Because we are migrating very old pipeline where we save data in file
texts. So we have decided to keep some logics inside perl code. That means
to update a record:
 1. we have to read/lock it
 2. return the result to the application
 3. the application does its business and return the new value to the
database
 4. we update the record.


> What do you mean "empty"?  If you pass the empty string, the query will
> return rows that have the empty string in those columns.  Note that the
> NULL value is not the same as the empty string.  If the function is
> defined as STRICT (also spelled RETURNS NULL ON NULL INPUT), then the
> function is not even called if you pass NULL arguments, and it simply
> returns NULL without locking anything.

By empty I mean: ''

> Not necessarily ... depends on what you want to happen.

It is a migration case, so I was thinking to take the safer way.



So I think the behaviour is clear to me now.

Thank you.

On Mon, Nov 28, 2016 at 4:37 PM, Alvaro Herrera 
wrote:

> said assemlal wrote:
> > Hello,
> >
> > PG: 9.4
> > CentOS 6
> >
> > I am writing functions to lock results.
> >
> > Let's take an example:
> >
> > CREATE OR REPLACE FUNCTION usp_locking_my_result(VARCHAR(50),
> > VARCHAR(50))
> > RETURNS TEXT AS $$
> > SELECT value
> > FROM my_table
> > WHERE field1 = $1 AND field2 = $2 FOR UPDATE;
> > $$ LANGUAGE SQL;
>
> Why do you want to lock these results?
>
> > What happens if one of those arguments is empty and the database finds
> > results?  (I think they will be locked)
>
> What do you mean "empty"?  If you pass the empty string, the query will
> return rows that have the empty string in those columns.  Note that the
> NULL value is not the same as the empty string.  If the function is
> defined as STRICT (also spelled RETURNS NULL ON NULL INPUT), then the
> function is not even called if you pass NULL arguments, and it simply
> returns NULL without locking anything.
>
> > Should I check the inputs and raise an exception if they are empty?
>
> Not necessarily ... depends on what you want to happen.
>
> --
> Álvaro Herrera    https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [GENERAL] FOR UPDATE

2016-11-28 Thread Alvaro Herrera
said assemlal wrote:
> Hello,
> 
> PG: 9.4
> CentOS 6
> 
> I am writing functions to lock results.
> 
> Let's take an example:
> 
> CREATE OR REPLACE FUNCTION usp_locking_my_result(VARCHAR(50),
> VARCHAR(50))
> RETURNS TEXT AS $$
> SELECT value
> FROM my_table
> WHERE field1 = $1 AND field2 = $2 FOR UPDATE;
> $$ LANGUAGE SQL;

Why do you want to lock these results?

> What happens if one of those arguments is empty and the database finds
> results?  (I think they will be locked)

What do you mean "empty"?  If you pass the empty string, the query will
return rows that have the empty string in those columns.  Note that the
NULL value is not the same as the empty string.  If the function is
defined as STRICT (also spelled RETURNS NULL ON NULL INPUT), then the
function is not even called if you pass NULL arguments, and it simply
returns NULL without locking anything.

> Should I check the inputs and raise an exception if they are empty?

Not necessarily ... depends on what you want to happen.
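
A minimal sketch of the STRICT behavior described above (the function and
table here are hypothetical):

CREATE FUNCTION lock_one(varchar(50), varchar(50))
RETURNS text AS $$
    SELECT value FROM my_table
    WHERE field1 = $1 AND field2 = $2 FOR UPDATE;
$$ LANGUAGE sql STRICT;

SELECT lock_one('', 'x');    -- '' is a real value: matching rows are locked
SELECT lock_one(NULL, 'x');  -- returns NULL; the body never runs, nothing locked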

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[GENERAL] FOR UPDATE

2016-11-28 Thread said assemlal
Hello,

PG: 9.4
CentOS 6

I am writing functions to lock results.

Let's take an example:

CREATE OR REPLACE FUNCTION usp_locking_my_result(VARCHAR(50),
VARCHAR(50))
RETURNS TEXT AS $$
SELECT value
FROM my_table
WHERE field1 = $1 AND field2 = $2 FOR UPDATE;
$$ LANGUAGE SQL;


What happens if one of those arguments is empty and the database finds
results?  (I think they will be locked)

Should I check the inputs and raise an exception if they are empty?

Thanks,
Saïd


Re: [GENERAL] Partial update on an postgres upsert violates constraint

2016-11-23 Thread Tom Lane
Andreas Terrius  writes:
> Is there any way to check whether the row already exists before checking
> constraints ? I still want it to fail if it turns out to be a new row
> (which would violate the not null constraint), but updates the row if it
> already exists.

I'm not really sure why you expect this to be able to work.  The data
you are supplying is insufficient to satisfy the INSERT case, so why do
you think that an insert-or-update should be possible?

ISTM that you could try the UPDATE first and notice whether it updates
any rows or not.  I'm not sure what you're going to do if it doesn't,
though, since you still haven't got enough data to do an INSERT.
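
A sketch of that update-first approach, using the jobs table from this
thread:

UPDATE jobs SET phone_number = '12345' WHERE id = 1 RETURNING id;
-- zero rows returned means the row didn't exist; since there still
-- isn't enough data for an INSERT, treat that as an error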

regards, tom lane




Re: [GENERAL] Partial update on an postgres upsert violates constraint

2016-11-23 Thread John R Pierce

On 11/21/2016 2:32 AM, Andreas Terrius wrote:
Is there any way to check whether the row already exists before 
checking constraints ? I still want it to fail if it turns out to be a 
new row (which would violate the not null constraint), but updates the 
row if it already exists.


just do an update.  if the row doesn't exist, it will fail, you then 
rollback the transaction or savepoint.


Since if that is not possible, I would need to do a query to determine 
whether the row exists in the database which kinda eliminates the use 
of upsert. (in this case, partial upsert).


in general, anything that relies on those sorts of checks will fail 
under concurrent loads.




--
john r pierce, recycling bits in santa cruz





Re: [GENERAL] Partial update on an postgres upsert violates constraint

2016-11-23 Thread Andreas Terrius
Is there any way to check whether the row already exists before checking
constraints ? I still want it to fail if it turns out to be a new row
(which would violate the not null constraint), but updates the row if it
already exists.

Since if that is not possible, I would need to do a query to determine
whether the row exists in the database which kinda eliminates the use of
upsert. (in this case, partial upsert).



On Sun, Nov 20, 2016 at 3:57 AM, Tom Lane  wrote:

> Adrian Klaver  writes:
> > ... So looks like constraints are checked before you get to the ON
> CONFLICT section.
>
> Right.  ON CONFLICT is a means for dealing with duplicate-key errors in
> the specified (or inferred) unique index.  It is *not* an all-purpose
> error catcher.  In the case at hand, the given INSERT request fails due
> to not-null constraints that are unrelated to what the ON CONFLICT clause
> tests for.
>
> regards, tom lane
>


Re: [GENERAL] Partial update on an postgres upsert violates constraint

2016-11-21 Thread Adrian Klaver

On 11/21/2016 02:32 AM, Andreas Terrius wrote:

Is there any way to check whether the row already exists before checking
constraints ? I still want it to fail if it turns out to be a new row
(which would violate the not null constraint), but updates the row if it
already exists.

Since if that is not possible, I would need to do a query to determine
whether the row exists in the database which kinda eliminates the use of
upsert. (in this case, partial upsert).


Before UPSERT appeared in 9.5, folks came up with alternate methods 
of doing this. I would suggest searching on:


postgres upsert cte


You might be able to modify the examples to get what you want.





On Sun, Nov 20, 2016 at 3:57 AM, Tom Lane  wrote:

Adrian Klaver  writes:
> ... So looks like constraints are checked before you get to the ON
CONFLICT section.

Right.  ON CONFLICT is a means for dealing with duplicate-key errors in
the specified (or inferred) unique index.  It is *not* an all-purpose
error catcher.  In the case at hand, the given INSERT request fails due
to not-null constraints that are unrelated to what the ON CONFLICT
clause
tests for.

regards, tom lane





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Partial update on an postgres upsert violates constraint

2016-11-19 Thread Tom Lane
Adrian Klaver  writes:
> ... So looks like constraints are checked before you get to the ON CONFLICT 
> section.

Right.  ON CONFLICT is a means for dealing with duplicate-key errors in
the specified (or inferred) unique index.  It is *not* an all-purpose
error catcher.  In the case at hand, the given INSERT request fails due
to not-null constraints that are unrelated to what the ON CONFLICT clause
tests for.

regards, tom lane




Re: [GENERAL] Partial update on an postgres upsert violates constraint

2016-11-19 Thread Adrian Klaver
On 11/19/2016 11:33 AM, Kim Rose Carlsen wrote:
>> AFAIK, EXCLUDED is only available in a trigger function:
> 
>>
>> https://www.postgresql.org/docs/9.5/static/trigger-definition.html
>>
>> You are using EXCLUDED in a regular function so it would not be found.
>> 
>> Can you also show the failure for your alternate method?
> 
> From the manual
> https://www.postgresql.org/docs/9.5/static/sql-insert.html
> 
> "
> conflict_action
> conflict_action specifies an alternative ON CONFLICT action. It can be
> either DO NOTHING, or a DO UPDATE clause specifying the exact details of
> the UPDATE action to be performed in case of a conflict. The SET and
> WHERE clauses in ON CONFLICT DO UPDATE have access to the existing row
> using the table's name (or an alias), and to rows proposed for insertion
> using the special excluded table. SELECT privilege is required on any
> column in the target table where corresponding excluded columns are read.
> "
> 


Oops, my mistake. I should have spent more time on the examples.

Changing the function to;

CREATE OR REPLACE FUNCTION public.upsert_job(job jsonb)
 RETURNS void
 LANGUAGE plpgsql
 SECURITY DEFINER
AS $function$
BEGIN
INSERT INTO jobs AS origin VALUES(
(job->>'id')::INTEGER,
COALESCE(job->>'employee_name'::TEXT, 'test_name'),
COALESCE(job->>'address'::TEXT, 'test_address'),
job->>'phone_number'::TEXT
) ON CONFLICT (id) DO UPDATE SET
employee_name = COALESCE(EXCLUDED.employee_name, origin.employee_name),
address = COALESCE(EXCLUDED.address, origin.address),
phone_number = COALESCE(EXCLUDED.phone_number, origin.phone_number);
END;
$function$
;

makes it work. So looks like constraints are checked before you get to the ON 
CONFLICT section.
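
With this version the previously failing call goes through, though note
that the placeholder defaults are what EXCLUDED then carries into the UPDATE:

SELECT upsert_job('{"id" : 1,  "phone_number" : "12345"}'::jsonb);
-- no constraint error, but employee_name/address are set to
-- 'test_name'/'test_address' rather than being left unchanged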


-- 
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Partial update on an postgres upsert violates constraint

2016-11-19 Thread Kim Rose Carlsen
> AFAIK, EXCLUDED is only available in a trigger function:

>
> https://www.postgresql.org/docs/9.5/static/trigger-definition.html
>
> You are using EXCLUDED in a regular function so it would not be found.
>
> Can you also show the failure for your alternate method?

From the manual
https://www.postgresql.org/docs/9.5/static/sql-insert.html

"
conflict_action
conflict_action specifies an alternative ON CONFLICT action. It can be either 
DO NOTHING, or a DO UPDATE clause specifying the exact details of the UPDATE 
action to be performed in case of a conflict. The SET and WHERE clauses in ON 
CONFLICT DO UPDATE have access to the existing row using the table's name (or 
an alias), and to rows proposed for insertion using the special excluded table. 
SELECT privilege is required on any column in the target table where 
corresponding excluded columns are read.
"



Re: [GENERAL] Partial update on an postgres upsert violates constraint

2016-11-19 Thread Adrian Klaver

On 11/17/2016 10:13 PM, Andreas Terrius wrote:

Hi,
Basically I wanted to do a partial update inside pg (9.5), but it seems
that a partial update fails when not all constraints are fulfilled
(such as the not null constraint)

Below are the sql queries I used,

CREATE TABLE jobs (
    id integer PRIMARY KEY,
    employee_name TEXT NOT NULL,
    address TEXT NOT NULL,
    phone_number TEXT
);

CREATE OR REPLACE FUNCTION upsert_job(job JSONB)
RETURNS VOID AS $$
BEGIN
INSERT INTO jobs AS origin VALUES(
    (job->>'id')::INTEGER,
    job->>'employee_name'::TEXT,
    job->>'address'::TEXT,
    job->>'phone_number'::TEXT
) ON CONFLICT (id) DO UPDATE SET
    employee_name = COALESCE(EXCLUDED.employee_name, origin.employee_name),
    address = COALESCE(EXCLUDED.address, origin.address),
    phone_number = COALESCE(EXCLUDED.phone_number, origin.phone_number);
END;
$$ LANGUAGE PLPGSQL SECURITY DEFINER;

--Full insert (OK)
SELECT upsert_job('{"id" : 1, "employee_name" : "AAA", "address" : "City, x street no.y", "phone_number" : "123456789"}'::jsonb);

--Partial update that fulfills constraint (Ok)
SELECT upsert_job('{"id" : 1, "employee_name" : "BBB", "address" : "City, x street no.y"}'::jsonb);

--Partial update that doesn't fulfill constraint (FAILS)
SELECT upsert_job('{"id" : 1, "phone_number" : "12345"}'::jsonb);
--ERROR: null value in column "employee_name" violates not-null constraint
--DETAIL: Failing row contains (1, null, null, 12345).

I also tried explicitly stating the columns that I wanted to insert, and
it also fails. How do I go around doing this ?


AFAIK, EXCLUDED is only available in a trigger function:

https://www.postgresql.org/docs/9.5/static/trigger-definition.html

You are using EXCLUDED in a regular function so it would not be found.

Can you also show the failure for your alternate method?



Thank you



--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] Partial update on an postgres upsert violates constraint

2016-11-18 Thread Andreas Terrius
Hi,
Basically I wanted to do a partial update inside pg (9.5), but it seems
that a partial update fails when not all constraints are fulfilled (such
as the not null constraint)

Below are the sql queries I used,

CREATE TABLE jobs (
id integer PRIMARY KEY,
employee_name TEXT NOT NULL,
address TEXT NOT NULL,
phone_number TEXT);
CREATE OR REPLACE FUNCTION upsert_job(job JSONB)
RETURNS VOID AS $$
BEGIN
INSERT INTO jobs AS origin VALUES(
    (job->>'id')::INTEGER,
    job->>'employee_name'::TEXT,
    job->>'address'::TEXT,
    job->>'phone_number'::TEXT
) ON CONFLICT (id) DO UPDATE SET
    employee_name = COALESCE(EXCLUDED.employee_name, origin.employee_name),
    address = COALESCE(EXCLUDED.address, origin.address),
    phone_number = COALESCE(EXCLUDED.phone_number, origin.phone_number);
END;
$$ LANGUAGE PLPGSQL SECURITY DEFINER;

--Full insert (OK)
SELECT upsert_job('{"id" : 1, "employee_name" : "AAA", "address" : "City, x street no.y", "phone_number" : "123456789"}'::jsonb);

--Partial update that fulfills constraint (Ok)
SELECT upsert_job('{"id" : 1,  "employee_name" : "BBB", "address" : "City, x street no.y"}'::jsonb);

--Partial update that doesn't fulfill constraint (FAILS)
SELECT upsert_job('{"id" : 1,  "phone_number" : "12345"}'::jsonb);
--ERROR:  null value in column "employee_name" violates not-null constraint
--DETAIL:  Failing row contains (1, null, null, 12345).

I also tried explicitly stating the columns that I wanted to insert, and it
also fails. How do I go around doing this ?

Thank you


Re: [GENERAL] Stats update difference between VACUUM ANALYZE and ANALYZE in 9.2?

2016-09-26 Thread pbj
On Monday, September 26, 2016 9:44 AM, Tom Lane  wrote:

 >> Paul Jones  writes:
 >> For a freshly pg_restore'd 9.2 database, would VACUUM ANALYZE update
 >> statistics any better than just an ANALYZE?

 >
 > VACUUM would have caused the page-all-visible flags to get set for all
 > pages of unchanging tables.  I don't recall whether ANALYZE has any side
 > effects on those flags at all, but it certainly would not have set them
 > for pages it didn't even visit, which would be most.
 >
 > Net result is that the pg_class.relallvisible fractions didn't get high
 > enough to persuade the planner that index-only scans would be effective.
 > I guess you could call that a statistic, but it's really about the
 > contents of the tables' free space maps.
 > 
 > regards, tom lane

This is good to know.  I think we will be running VACUUM ANALYZE from 
now on after restore instead of just ANALYZE.
 
I do note that sect. 49.11 claims that ANALYZE updates 
pg_class.relallvisible.  I don't know if this is a documentation problem 
in light of what you explained.

PJ



Re: [GENERAL] Stats update difference between VACUUM ANALYZE and ANALYZE in 9.2?

2016-09-26 Thread Tom Lane
Paul Jones  writes:
> For a freshly pg_restore'd 9.2 database, would VACUUM ANALYZE update
> statistics any better than just an ANALYZE?

Not as far as the statistics kept in pg_stat go.

> After a restore, we ran a bunch of ANALYZEs on each table individually
> using GNU 'parallel' (for speed).  Many of these tables are child tables
> in a partition.  Following the ANALYZEs, a join with the parent table
> showed all of the child tables scanned sequentially.

> After running VACUUM ANALYZE on the whole database, the same join used
> index-only scans on the child tables.

VACUUM would have caused the page-all-visible flags to get set for all
pages of unchanging tables.  I don't recall whether ANALYZE has any side
effects on those flags at all, but it certainly would not have set them
for pages it didn't even visit, which would be most.

Net result is that the pg_class.relallvisible fractions didn't get high
enough to persuade the planner that index-only scans would be effective.
I guess you could call that a statistic, but it's really about the
contents of the tables' free space maps.

Another possible theory is that you skipped ANALYZE'ing the partition
parent tables in your first pass, but I'm betting on the all-visible
fractions as being the issue.
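
A quick way to check the fractions in question (a sketch):

SELECT relname, relpages, relallvisible
FROM   pg_class
WHERE  relkind = 'r'
ORDER  BY relpages DESC
LIMIT  10;
-- relallvisible close to relpages is what makes index-only scans attractive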

regards, tom lane




[GENERAL] Stats update difference between VACUUM ANALYZE and ANALYZE in 9.2?

2016-09-26 Thread Paul Jones
For a freshly pg_restore'd 9.2 database, would VACUUM ANALYZE update
statistics any better than just an ANALYZE?

After a restore, we ran a bunch of ANALYZEs on each table individually
using GNU 'parallel' (for speed).  Many of these tables are child tables
in a partition.  Following the ANALYZEs, a join with the parent table
showed all of the child tables scanned sequentially.

After running VACUUM ANALYZE on the whole database, the same join used
index-only scans on the child tables.

An examination of the fine manual implies there may be some difference
(or a documentation conflict?) between running ANALYZE manually on
individual tables and an unqualified ANALYZE on the whole database.

5.9.6:
"If you are using manual VACUUM or ANALYZE commands, don't forget
that you need to run them on each partition individually. A
command like:
ANALYZE measurement;
will only process the master table."

ANALYZE:
"If the table being analyzed has one or more children, ANALYZE
will gather statistics twice: once on the rows of the parent table
only, and a second time on the rows of the parent table with all
of its children. This second set of statistics is needed when
planning queries that traverse the entire inheritance tree. The
autovacuum daemon, however, will only consider inserts or updates
on the parent table itself when deciding whether to trigger
an automatic analyze for that table. If that table is rarely
inserted into or updated, the inheritance statistics will not
be up to date unless you run ANALYZE manually."

Can anyone explain what's going on here?

Thanks,
Paul Jones




Re: [GENERAL] Comibining UPDATE ... SET ... FROM (SELECT ...) with a JOIN

2016-09-10 Thread Alexander Farber
Thank you Brian and others, but -

On Fri, Sep 9, 2016 at 5:22 PM, Brian Dunavant  wrote:

> I'm making the assumption that you only have one ip/user in words_users.
>
> with lockrow as (
>    SELECT g.gid, u.ip
>    FROM   words_games g join words_users u
>             ON (g.player1 = u.uid)
>    WHERE  g.finished IS NULL
>    AND    g.player1 <> in_uid
>    AND    g.played1 IS NOT NULL
>    AND    g.player2 IS NULL
>    LIMIT  1
>    FOR UPDATE SKIP LOCKED
> ), do_the_update as (
>    UPDATE words_games g1
>    SET    player2 = in_uid
>    FROM   lockrow g2
>    WHERE  g1.gid = g2.gid
>    RETURNING g1.gid, g1.player2
> )
> select m.gid, u.ip into out_gid, out_uip
> from   do_the_update m
>        join lockrow u using (gid)
> ;
>
> The general idea being lock the row in the first CTE, update it in the
> second, returning your values, and then query against those in the
> final select to get the ip.  If it didn't update anything, you'll get
> no results.
>

unfortunately, the above query does not seem to ensure that players with
the same ip cannot join the same game, which is actually my question...

But thanks for showing the CTE for UPDATE ... RETURNING - that is probably
the way to go for me

Regards
Alex


[GENERAL] Comibining UPDATE ... SET ... FROM (SELECT ...) with a JOIN

2016-09-09 Thread Alexander Farber
Good afternoon,

I have a question please.

In one table I store user ids and their IP addresses -

CREATE TABLE words_users (
uid SERIAL PRIMARY KEY,
ip inet NOT NULL
);

And in another table I keep 2-player games and timestamps of last moves
(NULL if a player hasn't played yet):

CREATE TABLE words_games (
gid SERIAL PRIMARY KEY,
finished timestamptz,

player1 integer REFERENCES words_users(uid) ON DELETE
CASCADE NOT NULL,
player2 integer REFERENCES words_users(uid) ON DELETE
CASCADE,

played1 timestamptz,
played2 timestamptz
);

When a user wants to start a new game, I first check if there is maybe a
new game already available - with just 1 player while the other "seat" is
vacant:

UPDATE words_games g1
SET    player2 = in_uid
FROM (
    SELECT gid
    FROM   words_games
    WHERE  finished IS NULL
    AND    player1 <> in_uid
    AND    played1 IS NOT NULL
    AND    player2 IS NULL
    LIMIT  1
    FOR UPDATE SKIP LOCKED
) g2
WHERE g1.gid = g2.gid
RETURNING g1.gid
INTO  out_gid;

This code works great, but now I am trying to add a check (obviously not
solving all cheating/proxy/etc. problems) that the IP addresses of both
users must be different.

Fetching "ip" in the internal SELECT statement is trivial with:

UPDATE words_games g1
SET    player2 = in_uid
FROM (
    SELECT g.gid, u.ip
    FROM   words_games g
    JOIN   words_users u ON (g.player1 = u.uid)
    WHERE  g.finished IS NULL
    AND    g.player1 <> in_uid
    AND    g.played1 IS NOT NULL
    AND    g.player2 IS NULL
    LIMIT  1
    FOR UPDATE SKIP LOCKED
) g2
WHERE g1.gid = g2.gid
RETURNING g1.gid
INTO  out_gid;

But how to fetch the "ip" column in the surrounding UPDATE statement?

Thank you
Alex


Re: [GENERAL] plpgsql update row from record variable

2016-04-03 Thread Adrian Klaver

On 04/02/2016 06:07 PM, Seamus Abshere wrote:

hi,

I want to write a function that updates arbitrary columns and here's my
pseudocode:

CREATE OR REPLACE FUNCTION update_pet(id int, raw_data jsonb) RETURNS
VOID AS $$
DECLARE
   data record;
BEGIN
   SELECT jsonb_populate_record(null::pets, raw_data) INTO data;
   UPDATE pets [... from data ...] WHERE id = id; -- probably impossible
END;
$$ LANGUAGE plpgsql;

e.g.

SELECT update_pets(1, '{ "name" : "Jerry", "age": 9 }'::jsonb);

Back in 2004, Tom showed how to insert from a plpgsql record:

http://www.postgresql.org/message-id/17840.1087670...@sss.pgh.pa.us

Is there any way to "update *" from a record?


What version of Postgres?

In 9.5 you have the following syntax available:

http://www.postgresql.org/docs/9.5/interactive/sql-update.html

UPDATE accounts SET (contact_first_name, contact_last_name) =
(SELECT first_name, last_name FROM salesmen
 WHERE salesmen.id = accounts.sales_id);

or its almost(see docs for notes) pre-9.5 equivalent:

UPDATE accounts SET contact_first_name = first_name,
contact_last_name = last_name
  FROM salesmen WHERE salesmen.id = accounts.sales_id;


So on a 9.4 instance here:

test=# \d company
   Table "public.company"
 Column  | Type  | Modifiers
---------+---------------+-----------
 id  | integer   | not null
 name| text  | not null
 age | integer   | not null
 address | character(50) |
 salary  | real  |
Indexes:
"company_pkey" PRIMARY KEY, btree (id)


UPDATE
company
SET
salary = jb.salary
FROM
(
SELECT
id,
salary
FROM
jsonb_populate_record ( NULL::company,
'{"id": 1, "age": 32, "name": "Paul", "salary": 1,
  "address": "California"}' ::jsonb ) )
AS
jb
WHERE
company.id = jb.id;
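
Folding that back into the original update_pet() shape, a sketch (columns
must be enumerated, since there is no "SET * = record"; assuming pets has
id/name/age, and renaming the parameter because "WHERE id = id" would be
ambiguous):

CREATE OR REPLACE FUNCTION update_pet(pet_id int, raw_data jsonb)
RETURNS void AS $$
UPDATE pets p
SET    name = COALESCE(jb.name, p.name),
       age  = COALESCE(jb.age,  p.age)
FROM   jsonb_populate_record(NULL::pets, raw_data) AS jb
WHERE  p.id = pet_id;
$$ LANGUAGE sql;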




Thanks!
Seamus

PS. Whether I **should** do this is another matter, I just want to know
if it's possible.

--
Seamus Abshere, SCEA
https://github.com/seamusabshere
http://linkedin.com/in/seamusabshere





--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] plpgsql update row from record variable

2016-04-02 Thread Seamus Abshere
hi,

I want to write a function that updates arbitrary columns and here's my
pseudocode:

CREATE OR REPLACE FUNCTION update_pet(id int, raw_data jsonb) RETURNS
VOID AS $$
DECLARE
  data record;
BEGIN
  SELECT jsonb_populate_record(null::pets, raw_data) INTO data;
  UPDATE pets [... from data ...] WHERE id = id; -- probably impossible
END;
$$ LANGUAGE plpgsql;

e.g.

SELECT update_pets(1, '{ "name" : "Jerry", "age": 9 }'::jsonb);

Back in 2004, Tom showed how to insert from a plpgsql record:

http://www.postgresql.org/message-id/17840.1087670...@sss.pgh.pa.us

Is there any way to "update *" from a record?

Thanks!
Seamus

PS. Whether I **should** do this is another matter, I just want to know
if it's possible.

--
Seamus Abshere, SCEA
https://github.com/seamusabshere
http://linkedin.com/in/seamusabshere




[GENERAL] Multiple Update queries

2015-09-15 Thread sri harsha
Hello,

 I am using a foreign data wrapper where I get a portion of my data
pre-loaded, i.e. I get a set of rows beforehand. So now I want to run
multiple update queries on this loaded data, write the changes to a file,
load the next set, and continue with updates again.

 How should I try to achieve this requirement?  At what point in
Postgres should I use the hook and change my execution?

Thanks,
Harsha


[GENERAL] With Update From ... vs. Update ... From (With)

2015-04-13 Thread David G. Johnston
Hello!

Is there any non-functional difference between these two forms of Update?

WITH name AS ( SELECT )
UPDATE tbl SET ...
FROM name
WHERE tbl.id = name.id

and

UPDATE tbl SET ...
FROM ( WITH qry AS ( SELECT ) SELECT * FROM qry ) AS name
WHERE tbl.id = name.id

They both better give the same results but does the backend treat them
differently?

Does the answer to my question depend on the version of PostgreSQL?

Thanks!

David J.


Re: [GENERAL] With Update From ... vs. Update ... From (With)

2015-04-13 Thread Jim Nasby

On 4/13/15 8:12 PM, David G. Johnston wrote:

Hello!

Is there any non-functional difference between these two forms of Update?

WITH name AS ( SELECT )
UPDATE tbl SET ...
FROM name
WHERE tbl.id = name.id

and

UPDATE tbl SET ...
FROM ( WITH qry AS ( SELECT ) SELECT * FROM qry ) AS name
WHERE tbl.id = name.id

They both better give the same results but does the backend treat them
differently?

Does the answer to my question depend on the version of PostgreSQL?


Well, they're both ugly... but see what EXPLAIN or EXPLAIN VERBOSE shows.
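
For example, a sketch of that comparison (tbl/src/val are hypothetical):

EXPLAIN (VERBOSE)
WITH name AS (SELECT id, val FROM src)
UPDATE tbl SET val = name.val
FROM name
WHERE tbl.id = name.id;
-- then run the FROM ( WITH ... ) spelling and compare the two plans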
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




[GENERAL] Incremental update for matview - automatic trigger-in-c generator

2014-12-08 Thread Nguyễn Trần Quốc Vinh
Dear sir/madam.

We have built a program that can automatically generate triggers in C
for synchronous incremental matview updates. It supports queries with the
following limitations:
- inner join
- group by with sum, count, avg, min, max
- the join has to be of the form: t1 1-n t2 1-n t3 1-n t4

The binary and source can be found at
http://it.ued.vn/myprojects/pgTriggerGen/.

This is the first version. We are now building another version with some
improvements.

Anyone can use it if they find it useful. All comments, recommendations,
improvements are welcome.

Corresponding address: ntquocv...@ued.vn.

Thank you for your attention.


[GENERAL] which Update quicker

2014-09-23 Thread Emi Lu

Hello list,

For a big table with more than 1,000,000 records, may I know which 
update is quicker please?


(1) update t1
      set c1 = a.c1
      from a
      where pk and
            t1.c1 <> a.c1;
    ..
    update t1
      set c_N = a.c_N
      from a
      where pk and
            t1.c_N <> a.c_N;


(2) update t1
      set c1 = a.c1,
          c2 = a.c2,
          ...
          c_N = a.c_N
      from a
      where pk AND
            ( t1.c1 <> a.c1 OR t1.c2 <> a.c2 ... OR t1.c_N <> a.c_N )


Or other quicker way for update action?

Thank you
Emi






Re: [GENERAL] which Update quicker

2014-09-23 Thread Daniele Varrazzo
On Tue, Sep 23, 2014 at 8:35 PM, Emi Lu em...@encs.concordia.ca wrote:
 Hello list,

 For a big table with more than 1,000,000 records, may I know which update is
 quicker please?

 (1) update t1
       set c1 = a.c1
       from a
       where pk and
             t1.c1 <> a.c1;
     ..
     update t1
       set c_N = a.c_N
       from a
       where pk and
             t1.c_N <> a.c_N;


 (2) update t1
       set c1 = a.c1,
           c2 = a.c2,
           ...
           c_N = a.c_N
       from a
       where pk AND
             ( t1.c1 <> a.c1 OR t1.c2 <> a.c2 ... OR t1.c_N <> a.c_N )

Definitely the second, and it produces less bloat too.


 Or other quicker way for update action?

You may express the comparison as (t1.c1, t1.c2, ... t1.cN) <> (t2.c1,
t2.c2, ... t2.cN).
It's not going to be faster but maybe it's more readable.
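
Spelled out in full, a sketch of that row-comparison form (t1/a/pk as in
the thread's shorthand):

update t1
set    c1 = a.c1, c2 = a.c2, c_N = a.c_N
from   a
where  pk
and    (t1.c1, t1.c2, t1.c_N) <> (a.c1, a.c2, a.c_N);
-- with NULLable columns, IS DISTINCT FROM is the null-safe spelling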


-- Daniele




Re: [GENERAL] which Update quicker

2014-09-23 Thread Steve Crawford

On 09/23/2014 12:35 PM, Emi Lu wrote:

Hello list,

For a big table with more than 1,000,000 records, may I know which 
update is quicker please?


(1) update t1
      set c1 = a.c1
      from a
      where pk and
            t1.c1 <> a.c1;
    ..
    update t1
      set c_N = a.c_N
      from a
      where pk and
            t1.c_N <> a.c_N;


(2) update t1
      set c1 = a.c1,
          c2 = a.c2,
          ...
          c_N = a.c_N
      from a
      where pk AND
            ( t1.c1 <> a.c1 OR t1.c2 <> a.c2 ... OR t1.c_N <> a.c_N )




We don't have any info about table structures, index availability and 
usage for query optimization, whether or not the updated columns are 
part of an index, amount of memory available, disk speed, portion of t1 
that will be updated, PostgreSQL settings, etc. so it's really anyone's 
guess. A million rows is pretty modest so I was able to try a couple 
variants of update...from... on million row tables on my aging desktop 
without coming close to the 60-second mark.


*Usually* putting statements into a single transaction is better (as 
would happen automatically in case 2). Also, to the extent that a given 
tuple would have multiple columns updated you will have less bloat and 
I/O using the query that updates the tuple once rather than multiple 
times. But a lot will depend on the efficiency of looking up the 
appropriate data in a.


Cheers,
Steve








[GENERAL] Correct update statement

2014-05-15 Thread Khangelani Gama
Hi



Please help, we are using PostgreSQL 9.2.4. I need to update over 9000
rows. See the query below: a table called contact has got the addr_id field
as null, which is incorrect. So now I need to update the contact table for
each account (cus_acno is in the cus table) where contact.addr_id is null.
For example, using the first of the results below: I need to set addr_id
(in the contact table) to 187479 where cus_acno = 243492 and con_id = 119360.





Example:



select distinct(cus_acno), contact.con_id, address.addr_id from address
join person using (addr_id) join  cus using (per_id) join link_contact
using (cus_acno) join contact using (con_id) where contact.addr_id is null;
 cus_acno | con_id | addr_id
----------+--------+---------
   243492 | 119360 |  187479
   393701 | 119824 |  458532
   388538 | 118413 |  453178





Thanks





Re: [GENERAL] Correct update statement

2014-05-15 Thread Sim Zacks

update contact set addr_id=b.addr_id
from
  (select distinct(cus_acno), contact.con_id, address.addr_id from
  address join person using (addr_id) join cus using (per_id) join
  link_contact using (cus_acno) join contact using (con_id) where
  contact.addr_id is null) b
where contact.con_id=b.con_id

On 05/15/2014 09:19 AM, Khangelani Gama wrote:

    Hi

    Please help, we are using PostgreSQL 9.2.4. I need to update over 9000
    rows. See the query below: a table called contact has got the addr_id
    field as null, which is incorrect. So now I need to update the contact
    table for each account (cus_acno is in the cus table) where
    contact.addr_id is null. For example, using the first of the results
    below: I need to set addr_id (in the contact table) to 187479 where
    cus_acno = 243492 and con_id = 119360.

    Example:

    select distinct(cus_acno), contact.con_id, address.addr_id from address
    join person using (addr_id) join cus using (per_id) join link_contact
    using (cus_acno) join contact using (con_id) where contact.addr_id is null;
     cus_acno | con_id | addr_id
    ----------+--------+---------
       243492 | 119360 |  187479
       393701 | 119824 |  458532
       388538 | 118413 |  453178

    Thanks


Re: [GENERAL] Correct update statement

2014-05-15 Thread Khangelani Gama
Thanks very much



From: pgsql-general-ow...@postgresql.org [mailto:
pgsql-general-ow...@postgresql.org] On Behalf Of Sim Zacks
Sent: Thursday, May 15, 2014 10:42 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Correct update statement



update contact set addr_id=b.addr_id
from
(select distinct(cus_acno), contact.con_id, address.addr_id from address
join person using (addr_id) join cus using (per_id) join link_contact
using (cus_acno) join contact using (con_id) where contact.addr_id is null)
b
where contact.con_id=b.con_id

On 05/15/2014 09:19 AM, Khangelani Gama wrote:

Hi



Please help, we are using PostgreSQL 9.2.4. I need to update over 9000
rows. See the query below: a table called contact has got the addr_id field
as null, which is incorrect. So now I need to update the contact table for
each account (cus_acno is in the cus table) where contact.addr_id is null.
For example, using the first of the results below: I need to set addr_id
(in the contact table) to 187479 where cus_acno = 243492 and con_id = 119360.





Example:



select distinct(cus_acno), contact.con_id, address.addr_id from address
join person using (addr_id) join  cus using (per_id) join link_contact
using (cus_acno) join contact using (con_id) where contact.addr_id is null;
 cus_acno | con_id | addr_id
----------+--------+---------
   243492 | 119360 |  187479
   393701 | 119824 |  458532
   388538 | 118413 |  453178





Thanks











Re: [GENERAL] simple update query stuck

2014-04-02 Thread Si Chen
Ok, thanks.  I'll keep that in mind.


On Tue, Apr 1, 2014 at 7:45 PM, Andrew Sullivan a...@crankycanuck.ca wrote:

 On Tue, Apr 01, 2014 at 07:00:16PM -0400, Tom Lane wrote:

  one of the clients, in a way that isn't visible to the deadlock detector.
  One way for that to happen without any external interconnections is if
 the
  client is waiting for a NOTIFY that will never arrive because the
 would-be
  sender is blocked.

 I bet the case I was thinking of was the NOTIFY example.  That never
 occurred to me as an explanation, but now that you mention it, it
 seems quite likely to me.

 More generally (and for the OP's problem), my experience is that lots
 of updates against the same rows in an unpredictable order is an
 excellent way to run into trouble, and long-running transactions are a
 major source of these problems.  Without a more detailed report about
 what is going on in the present case, I don't think it's going to be
 possible to diagnose better than has been done.

 A

 --
 Andrew Sullivan
 a...@crankycanuck.ca






-- 
Si Chen
Open Source Strategies, Inc.
sic...@opensourcestrategies.com
http://www.OpenSourceStrategies.com
LinkedIn: http://www.linkedin.com/in/opentaps
Twitter: http://twitter.com/opentaps


[GENERAL] simple update query stuck

2014-04-01 Thread Si Chen
Hello,

I'm using postgresql 9.0.13, and I have a simple query that seems to be
stuck.  I did a
postgres=# select procpid, query_start, waiting, current_query from
pg_stat_activity;

 procpid |          query_start          | waiting | current_query
---------+-------------------------------+---------+---------------
   32605 | 2014-04-01 12:39:46.957666-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10
   32685 | 2014-04-01 12:25:10.378481-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10

As you can see this is a pretty simple update query, but it's been running
for 24 minutes in one thread and nearly 10 minutes in the other. What's
also strange is that it's not triggering a transaction timeout either.

I've already restarted the database recently, and there aren't too many
threads executing -- just 38 threads total. Is it possible that the table
is corrupted or needs repair?

-- 
Si Chen
Open Source Strategies, Inc.
sic...@opensourcestrategies.com
http://www.OpenSourceStrategies.com
LinkedIn: http://www.linkedin.com/in/opentaps
Twitter: http://twitter.com/opentaps


Re: [GENERAL] simple update query stuck

2014-04-01 Thread Paul Jungwirth
Do these queries update more than one row? I ran into a similar issue
a year ago, where two multi-row updates would deadlock because they
processed rows in a different order. I'd love to see UPDATE support
ORDER BY to fix this, but it doesn't yet. (If I ever try contributing
to Postgres, this is a feature I'd love to add. It seems like it might
be an easy one for a newbie to tackle.)

Paul


On Tue, Apr 1, 2014 at 12:51 PM, Si Chen
sic...@opensourcestrategies.com wrote:
 Hello,

 I'm using postgresql 9.0.13, and I have a simple query that seems to be
 stuck.  I did a
 postgres=# select procpid, query_start, waiting, current_query from
 pg_stat_activity;

  procpid |          query_start          | waiting | current_query
 ---------+-------------------------------+---------+---------------
    32605 | 2014-04-01 12:39:46.957666-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10
    32685 | 2014-04-01 12:25:10.378481-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10

 As you can see this is a pretty simple update query, but it's been running
 for 24 minutes in one thread and nearly 10 minutes in the other. What's
 also strange is that it's not triggering a transaction timeout either.

 I've already restarted the database recently, and there aren't too many
 threads executing -- just 38 threads total. Is it possible that the table
 is corrupted or needs repair?

 --
 Si Chen
 Open Source Strategies, Inc.
 sic...@opensourcestrategies.com
 http://www.OpenSourceStrategies.com
 LinkedIn: http://www.linkedin.com/in/opentaps
 Twitter: http://twitter.com/opentaps




-- 
_
Pulchritudo splendor veritatis.




Re: [GENERAL] simple update query stuck

2014-04-01 Thread Igor Neyman


From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Si Chen
Sent: Tuesday, April 01, 2014 3:51 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] simple update query stuck

Hello,

I'm using postgresql 9.0.13, and I have a simple query that seems to be stuck.  
I did a
postgres=# select procpid, query_start, waiting, current_query from 
pg_stat_activity;

 procpid |          query_start          | waiting | current_query
---------+-------------------------------+---------+---------------
   32605 | 2014-04-01 12:39:46.957666-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10
   32685 | 2014-04-01 12:25:10.378481-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10

As you can see this is a pretty simple update query, but it's been running
for 24 minutes in one thread and nearly 10 minutes in the other. What's
also strange is that it's not triggering a transaction timeout either.

I've already restarted the database recently, and there aren't too many
threads executing -- just 38 threads total. Is it possible that the table
is corrupted or needs repair?

-- 
Si Chen
Open Source Strategies, Inc.
sic...@opensourcestrategies.com
http://www.OpenSourceStrategies.com
LinkedIn: http://www.linkedin.com/in/opentaps
Twitter: http://twitter.com/opentaps

Both queries are waiting.  Your table must be locked.
Check pg_locks.

Regards,
Igor Neyman
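One sketch of such a pg_locks check, pairing each waiting lock request with
a granted lock on the same object. This is along the lines of the wiki
lock-monitoring query mentioned later in the thread, not necessarily the
exact query from that page:

    SELECT waiting.pid AS waiting_pid,
           holder.pid  AS holding_pid,
           waiting.locktype,
           waiting.relation::regclass AS relation
      FROM pg_locks waiting
      JOIN pg_locks holder
        ON  holder.locktype = waiting.locktype
        AND holder.database      IS NOT DISTINCT FROM waiting.database
        AND holder.relation      IS NOT DISTINCT FROM waiting.relation
        AND holder.page          IS NOT DISTINCT FROM waiting.page
        AND holder.tuple         IS NOT DISTINCT FROM waiting.tuple
        AND holder.virtualxid    IS NOT DISTINCT FROM waiting.virtualxid
        AND holder.transactionid IS NOT DISTINCT FROM waiting.transactionid
        AND holder.classid       IS NOT DISTINCT FROM waiting.classid
        AND holder.objid         IS NOT DISTINCT FROM waiting.objid
        AND holder.objsubid      IS NOT DISTINCT FROM waiting.objsubid
        AND holder.granted
        AND holder.pid <> waiting.pid
     WHERE NOT waiting.granted;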




Re: [GENERAL] simple update query stuck

2014-04-01 Thread Si Chen
Thanks for writing back, but I don't think so.  There's no message of a
deadlock in the log, and the first query started at 12:25, the next one
12:31, 12:39, 12:50, 12:54, so there's plenty of time in between.


On Tue, Apr 1, 2014 at 1:01 PM, Hoover, Jeffrey jhoo...@jcvi.org wrote:

 Could they both be trying to update the same row, resulting in a deadlock?



 *From:* pgsql-general-ow...@postgresql.org [mailto:
 pgsql-general-ow...@postgresql.org] *On Behalf Of *Si Chen
 *Sent:* Tuesday, April 01, 2014 3:51 PM
 *To:* pgsql-general@postgresql.org
 *Subject:* [GENERAL] simple update query stuck



 Hello,



 I'm using postgresql 9.0.13, and I have a simple query that seems to be
 stuck.  I did a

 postgres=# select procpid, query_start, waiting, current_query from
 pg_stat_activity;



  procpid |          query_start          | waiting | current_query
 ---------+-------------------------------+---------+---------------
    32605 | 2014-04-01 12:39:46.957666-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10
    32685 | 2014-04-01 12:25:10.378481-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10

 As you can see this is a pretty simple update query, but it's been running
 for 24 minutes in one thread and nearly 10 minutes in the other. What's
 also strange is that it's not triggering a transaction timeout either.

 I've already restarted the database recently, and there aren't too many
 threads executing -- just 38 threads total. Is it possible that the table
 is corrupted or needs repair?



 --
 Si Chen
 Open Source Strategies, Inc.
 sic...@opensourcestrategies.com
 http://www.OpenSourceStrategies.com
 LinkedIn: http://www.linkedin.com/in/opentaps
 Twitter: http://twitter.com/opentaps




-- 
Si Chen
Open Source Strategies, Inc.
sic...@opensourcestrategies.com
http://www.OpenSourceStrategies.com
LinkedIn: http://www.linkedin.com/in/opentaps
Twitter: http://twitter.com/opentaps


Re: [GENERAL] simple update query stuck

2014-04-01 Thread Si Chen
You are right.  That was the problem.  I tried the query from
http://wiki.postgresql.org/wiki/Lock_Monitoring and found a COMMIT
transaction that was blocking it.

I restarted postgresql again, and (it seems) everything went back to
normal.  Was there another way to unlock the table then?


On Tue, Apr 1, 2014 at 1:10 PM, Igor Neyman iney...@perceptron.com wrote:



 From: pgsql-general-ow...@postgresql.org [mailto:
 pgsql-general-ow...@postgresql.org] On Behalf Of Si Chen
 Sent: Tuesday, April 01, 2014 3:51 PM
 To: pgsql-general@postgresql.org
 Subject: [GENERAL] simple update query stuck

 Hello,

 I'm using postgresql 9.0.13, and I have a simple query that seems to be
 stuck.  I did a
 postgres=# select procpid, query_start, waiting, current_query from
 pg_stat_activity;

  procpid |          query_start          | waiting | current_query
 ---------+-------------------------------+---------+---------------
    32605 | 2014-04-01 12:39:46.957666-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10
    32685 | 2014-04-01 12:25:10.378481-07 | t       | UPDATE public.GL_ACCOUNT_ORGANIZATION SET ROLE_TYPE_ID=$1, FROM_DATE=$2, THRU_DATE=$3, POSTED_BALANCE=$4, LAST_UPDATED_STAMP=$5, LAST_UPDATED_TX_STAMP=$6, CREATED_STAMP=$7, CREATED_TX_STAMP=$8 WHERE GL_ACCOUNT_ID=$9 AND ORGANIZATION_PARTY_ID=$10

 As you can see this is a pretty simple update query, but it's been running
 for 24 minutes in one thread and nearly 10 minutes in the other. What's
 also strange is that it's not triggering a transaction timeout either.

 I've already restarted the database recently, and there aren't too many
 threads executing -- just 38 threads total. Is it possible that the table
 is corrupted or needs repair?

 --
 Si Chen
 Open Source Strategies, Inc.
 sic...@opensourcestrategies.com
 http://www.OpenSourceStrategies.com
 LinkedIn: http://www.linkedin.com/in/opentaps
 Twitter: http://twitter.com/opentaps

 Both queries are waiting.  Your table must be locked.
 Check pg_locks.

 Regards,
 Igor Neyman




-- 
Si Chen
Open Source Strategies, Inc.
sic...@opensourcestrategies.com
http://www.OpenSourceStrategies.com
LinkedIn: http://www.linkedin.com/in/opentaps
Twitter: http://twitter.com/opentaps


Re: [GENERAL] simple update query stuck

2014-04-01 Thread Andrew Sullivan
On Tue, Apr 01, 2014 at 01:37:17PM -0700, Si Chen wrote:
 You are right.  That was the problem.  I tried the query from
 http://wiki.postgresql.org/wiki/Lock_Monitoring and found a COMMIT
 transaction that was blocking it.
 
 I restarted postgresql again, and (it seems) everything went back to
 normal.  Was there another way to unlock the table then?

Probably you could have killed one of the queries.  But it sounds like
what's happening is that you have multiple queries that are all trying
to update the same rows in a different order.  It may be that none of
these is strictly deadlocked, in that no query is waiting on a lock
that another query has, but rather is waiting on a lock that another
query will release only when _it_ gets a lock that another query has
and so on.  (Maybe things have gotten better, but in my experience
it's possible to set up a chain of locks such that it doesn't look
like a deadlock to the detector, but the lock chain is such that no
query will ever be able to release.)
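For reference, killing one of the queries, as suggested above, can be done
from SQL with the standard admin functions (pids taken from the
pg_stat_activity output earlier in the thread):

    SELECT pg_cancel_backend(32685);     -- cancel just the running statement
    -- or, more drastically, disconnect the whole backend:
    SELECT pg_terminate_backend(32685);
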

I suspect you need to get your locks in a consistent order or you'll
continue to have this problem.

A


-- 
Andrew Sullivan
a...@crankycanuck.ca




Re: [GENERAL] simple update query stuck

2014-04-01 Thread David Johnston
Andrew Sullivan-8 wrote
 On Tue, Apr 01, 2014 at 01:37:17PM -0700, Si Chen wrote:
 You are right.  That was the problem.  I tried the query from
 http://wiki.postgresql.org/wiki/Lock_Monitoring and found a COMMIT
 transaction that was blocking it.
 
 I restarted postgresql again, and (it seems) everything went back to
 normal.  Was there another way to unlock the table then?

http://www.postgresql.org/docs/9.2/interactive/functions-admin.html#FUNCTIONS-ADMIN-SIGNAL-TABLE

I am curious why your original scan of pg_stat_activity did not show this
offending session/transaction... did you omit it by chance?


 Probably you could have killed one of the queries.  But it sounds like
 what's happening is that you have multiple queries that are all trying
 to update the same rows in a different order.  It may be that none of
 these is strictly deadlocked, in that no query is waiting on a lock
 that another query has, but rather is waiting on a lock that another
 query will release only when _it_ gets a lock that another query has
 and so on.  (Maybe things have gotten better, but in my experience
 it's possible to set up a chain of locks such that it doesn't look
 like a deadlock to the detector, but the lock chain is such that no
 query will ever be able to release.)
 
 I suspect you need to get your locks in a consistent order or you'll
 continue to have this problem.

It sounds more like a long-running transaction (or something effectively
similar) is blocking updates of the target table through normal locking.

To be honest we seem to be lacking enough information, including the query
and locks state of the server during the stall, to make an informed guess as
to the real cause.

David J.







Re: [GENERAL] simple update query stuck

2014-04-01 Thread Tom Lane
Andrew Sullivan a...@crankycanuck.ca writes:
 Probably you could have killed one of the queries.  But it sounds like
 what's happening is that you have multiple queries that are all trying
 to update the same rows in a different order.  It may be that none of
 these is strictly deadlocked, in that no query is waiting on a lock
 that another query has, but rather is waiting on a lock that another
 query will release only when _it_ gets a lock that another query has
 and so on.  (Maybe things have gotten better, but in my experience
 it's possible to set up a chain of locks such that it doesn't look
 like a deadlock to the detector, but the lock chain is such that no
 query will ever be able to release.)

AFAIK the deadlock detector will catch any case where the waits-for
loop is entirely internal to the database; if you've got a reproducible
case where it doesn't, I'd like to see it.  However, it's certainly true
that you can get deadlock situations when some of the blocking conditions
exist outside the database --- that is, to the server it looks like some
transaction(s) are blocked on another session that is idle-in-transaction,
but the client attached to that session is somehow waiting for another
one of the clients, in a way that isn't visible to the deadlock detector.
One way for that to happen without any external interconnections is if the
client is waiting for a NOTIFY that will never arrive because the would-be
sender is blocked.

regards, tom lane




Re: [GENERAL] simple update query stuck

2014-04-01 Thread Andrew Sullivan
On Tue, Apr 01, 2014 at 07:00:16PM -0400, Tom Lane wrote:

 one of the clients, in a way that isn't visible to the deadlock detector.
 One way for that to happen without any external interconnections is if the
 client is waiting for a NOTIFY that will never arrive because the would-be
 sender is blocked.

I bet the case I was thinking of was the NOTIFY example.  That never
occurred to me as an explanation, but now that you mention it, it
seems quite likely to me.  

More generally (and for the OP's problem), my experience is that lots
of updates against the same rows in an unpredictable order is an
excellent way to run into trouble, and long-running transactions are a
major source of these problems.  Without a more detailed report about
what is going on in the present case, I don't think it's going to be
possible to diagnose better than has been done.

A

-- 
Andrew Sullivan
a...@crankycanuck.ca




Re: [GENERAL] Unexpected update behaviour

2013-12-14 Thread Tom Lane
Victor Yegorov vyego...@gmail.com writes:
 Could you kindly explain to me why the query as it is updates no records?

It's a bug, that's why.  See
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=324577f39bc8738ed0ec24c36c5cb2c2f81ec660
or for 9.2,
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=5d545b7ed53a8b2058c1152bd1c9ae227b3280e3

regards, tom lane




[GENERAL] Unexpected update behaviour

2013-12-13 Thread Victor Yegorov
Greetings.

I've hit a strange issue. This is a simplified setup.
First, run create.sql to get a couple of tables.
Then, run the update query.

Tested on 9.2.6 and 9.3.2.

Now:
- if I remove the UNION ALL part of the inner query, UPDATE works;
- if I move the `raw` subquery into the CTE, UPDATE works (but takes
  hours on my full data);
- if I convert this UPDATE into a SELECT, I get the row to be updated.

Could you kindly explain to me why the query as it is updates no records?


-- 
Victor Y. Yegorov


create.sql
Description: Binary data


update.sql
Description: Binary data



[GENERAL] [Q] Update from a table

2013-12-05 Thread Ladislav Lenart
Hello.

What happens if I issue UPDATE SET FROM... but with incomplete/buggy WHERE
condition and thus SEVERAL rows from the from_list match ONE row to update?

Thank you in advance,

Ladislav Lenart





Re: [GENERAL] [Q] Update from a table

2013-12-05 Thread Tom Lane
Ladislav Lenart lenart...@volny.cz writes:
 What happens if I issue UPDATE SET FROM... but with incomplete/buggy WHERE
 condition and thus SEVERAL rows from the from_list match ONE row to update?

Any given row will be updated at most once.  However, the from_list row
it gets updated against will be unpredictable, depending on the
implementation of the join.

regards, tom lane
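A self-contained illustration of that unpredictability (hypothetical tables):

    CREATE TABLE t   (id int PRIMARY KEY, val text);
    CREATE TABLE src (id int, val text);
    INSERT INTO t   VALUES (1, 'old');
    INSERT INTO src VALUES (1, 'a'), (1, 'b');  -- two from_list rows match id = 1

    UPDATE t SET val = src.val FROM src WHERE t.id = src.id;

    SELECT val FROM t WHERE id = 1;  -- 'a' or 'b', depending on the join plan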




Re: [GENERAL] [Q] Update from a table

2013-12-05 Thread Ladislav Lenart
On 5.12.2013 15:14, Tom Lane wrote:
 Ladislav Lenart lenart...@volny.cz writes:
 What happens if I issue UPDATE SET FROM... but with incomplete/buggy WHERE
 condition and thus SEVERAL rows from the from_list match ONE row to update?
 
 Any given row will be updated at most once.  However, the from_list row
 it gets updated against will be unpredictable, depending on the
 implementation of the join.
 
   regards, tom lane

Makes perfect sense.

Thank you,

Ladislav Lenart





[GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?

2013-10-29 Thread Matt
I have a relatively simple data load script, which upserts (UPDATE existing
rows, INSERT new rows), which should be supported by the primary key index,
the only index on this table:

UPDATE destination SET ... FROM staging WHERE staging.pk = destination.pk;
INSERT INTO destination SELECT ... FROM staging
    WHERE NOT EXISTS (SELECT * FROM destination WHERE pk = staging.pk);

In most cases, EXPLAIN and runtime tell me the index is utilized. However,
sometimes on back-to-back runs (5 minute intervals) the runtime explodes and
EXPLAIN shows me that the PK index is not used, as both the staging table
and destination table are sequentially scanned. The source data maintains the
same pattern, and this load job is the only write activity in this
particular database.

A left anti-join in the above pseudo-query explains the same, and seems to
make the problem occur more frequently.

What could cause PostgreSQL (9.1) to stop using an index in a case like
this, when it normally and expectedly uses it?


Re: [GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?

2013-10-29 Thread Tom Lane
Matt bsg...@gmail.com writes:
 In most cases, EXPLAIN and runtime tell me the index is utilized. However,
 sometimes on back-to-back runs (5 minute intervals) the runtime explodes and
 EXPLAIN shows me that the PK index is not used, as both the staging table
 and destination table are sequentially scanned.

You haven't explained what you do with the staging table, but maybe you
need to ANALYZE it after you've loaded fresh data into it.  The stats for
the main table are presumably somewhat stable, but the stats for the
staging table maybe not.  The apparent correlation to consecutive runs
lends some support to this theory, as that would reduce the time window
for auto-ANALYZE to fix it for you.

regards, tom lane
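A sketch of the load cycle with the suggested ANALYZE added. Table and
column names here are placeholders, not the poster's actual schema:

    TRUNCATE staging;
    COPY staging FROM '/path/to/batch.dat' (FORMAT csv, DELIMITER '|');
    ANALYZE staging;  -- refresh stats before the planner sees the upsert

    UPDATE destination d SET payload = s.payload
      FROM staging s WHERE s.pk = d.pk;

    INSERT INTO destination (pk, payload)
    SELECT s.pk, s.payload FROM staging s
     WHERE NOT EXISTS (SELECT 1 FROM destination d WHERE d.pk = s.pk);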




Re: [GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?

2013-10-29 Thread Matt
Since this is an upsert, the staging table simply accepts a copy of
pipe-delimited data via a COPY FROM STDIN, and a subsequent SQL script
performs the UPDATE/INSERT. The staging table is then truncated (or delete
without where) for the next run: Truncate staging, COPY into staging,
update then insert destination from staging, repeat on next 5 minute clock
cycle.

Since the staging table is fully loaded into the destination table, a full
scan was not a concern, but the lack of index usage on the destination. For
example, this is normally supported by the PK index, and performs well:

INSERT INTO destination (…)
SELECT (…) FROM staging
WHERE NOT EXISTS (SELECT * FROM destination WHERE destination.id = staging.id)

The above runs in expected time when the explain plan shows an index scan
on the destination primary key index, and a seq scan on the staging table.
This will continue for many runs, until something causes the engine to stop
using the destination PK index, and begin scanning both the destination and
staging. What triggers this is the interesting question. The only system
event I can (loosely) correlate with the problem is the start of a nightly
pg_dump, but as a read-only process, this would not be changing any index
stats, correct?

An ANALYZE on staging is possible after each load cycle, but time does not
permit on the destination table. I have been focusing on the destination
because it has the index that is not being used. Will stats on the staging
table affect index selection on the destination in a case like this?

In the process of attempting to diagnose this, both tables involved have
been vacuumed (full), analyzed. I have also moved staging to an SSD volume,
and created an equivalent index on staging - which is not used in any plan,
nor do I expect it to be, as there are no filter criteria on staging, and the index
maintenance on staging would seem to be an unneeded overhead. But in this
case, is there any advantage to an index on staging?

For reference, it is possible (not enough test cycles to verify) that left
anti-join makes this situation worse, even though the explain plans appear
identical:

INSERT INTO destination (…)
SELECT (…) FROM staging
LEFT JOIN destination ON destination.id = staging.id
WHERE destination.id IS NULL


On 29 Oct 2013, at 9:45, Tom Lane wrote:

 Matt bsg...@gmail.com writes:
 In most cases, EXPLAIN and runtime tell me the index is utilized. However,
 sometimes on back-to-back runs (5 minute intervals) the runtime explodes and
 EXPLAIN shows me that the PK index is not used, as both the staging table
 and destination table are sequentially scanned.

 You haven't explained what you do with the staging table, but maybe you
 need to ANALYZE it after you've loaded fresh data into it.  The stats for
 the main table are presumably somewhat stable, but the stats for the
 staging table maybe not.  The apparent correlation to consecutive runs
 lends some support to this theory, as that would reduce the time window
 for auto-ANALYZE to fix it for you.

 regards, tom lane


[GENERAL] Pretend update

2013-10-01 Thread Perry Smith
With make I can do make -n and it just tells me what it would do but 
doesn't actually do anything.

How could I do that with SQL?

I want to write a really complicated (for me) SQL UPDATE statement.  I'm sure I 
won't get it right the first time.  Is there an easy way to not really make the 
changes?

I've thought about starting a transaction and then roll it back.  That would 
undo the changes.  But I won't be able to tell what the changes were.

Thank you for your time,
Perry





Re: [GENERAL] Pretend update

2013-10-01 Thread Adrian Klaver

On 10/01/2013 10:16 AM, Perry Smith wrote:

With make I can do make -n and it just tells me what it would do but 
doesn't actually do anything.

How could I do that with SQL?

I want to write a really complicated (for me) SQL UPDATE statement.  I'm sure I 
won't get it right the first time.  Is there an easy way to not really make the 
changes?

I've thought about starting a transaction and then roll it back.  That would 
undo the changes.  But I won't be able to tell what the changes were.


Assuming you are not doing this in a function, you can. Do UPDATE, then 
SELECT to see your changes or not and then ROLLBACK.




Thank you for your time,
Perry




--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] Pretend update

2013-10-01 Thread Perry Smith

On Oct 1, 2013, at 12:23 PM, Adrian Klaver adrian.kla...@gmail.com wrote:

 On 10/01/2013 10:16 AM, Perry Smith wrote:
 With make I can do make -n and it just tells me what it would do but 
 doesn't actually do anything.
 
 How could I do that with SQL?
 
 I want to write a really complicated (for me) SQL UPDATE statement.  I'm 
 sure I won't get it right the first time.  Is there an easy way to not 
 really make the changes?
 
 I've thought about starting a transaction and then roll it back.  That would 
 undo the changes.  But I won't be able to tell what the changes were.
 
 Assuming you are not doing this in a function, you can. Do UPDATE, then 
 SELECT to see your changes or not and then ROLLBACK.

Ah... yes.  I forgot you can see the changes within the same transaction.  
Dohhh...

Thank you very much
Perry





Re: [GENERAL] Pretend update

2013-10-01 Thread Victor Yegorov
2013/10/1 Perry Smith pedz...@gmail.com


 On Oct 1, 2013, at 12:23 PM, Adrian Klaver adrian.kla...@gmail.com
 wrote:

  Assuming you are not doing this in a function, you can. Do UPDATE, then
 SELECT to see your changes or not and then ROLLBACK.

 Ah... yes.  I forgot you can see the changes within the same transaction.
  Dohhh...


It is possible to use RETURNING clause of the UPDATE statement and avoid
SELECT.

And yes, it is necessary to do this within a transaction and roll it back
after.
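A minimal sketch of that pattern (hypothetical table):

    BEGIN;
    UPDATE accounts
       SET balance = balance * 1.01
     WHERE region = 'EU'
    RETURNING id, balance;  -- shows exactly what would change
    ROLLBACK;               -- and nothing is kept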

-- 
Victor Y. Yegorov


Re: [GENERAL] BEFORE UPDATE trigger doesn't change column value

2013-04-07 Thread Clemens Eisserer
Hi Kevin,

Sorry, that's the wrong way around.  I should have said:

 Your BEFORE UPDATE trigger could leave the synced value in NEW
 alone if force_sync was **true**, and set synced to false
 otherwise.  It could then set NEW.force_sync to false, to leave you
 ready for the next update.


Thanks for your advice (and the patience on this list in general).
Instead of using two columns, I now use an integer-column and set it to a
value taken from an incrementing sequence.

Thanks again, Clemens


Re: [GENERAL] BEFORE UPDATE trigger doesn't change column value

2013-04-06 Thread Kevin Grittner
Clemens Eisserer linuxhi...@gmail.com wrote:

 Here is what I am trying to achieve: Set synced to false at any
 update, except when it has been set explicitly to true.
 This does not seem to be possible, without checking the value SET
 by UPDATE?

Right; since there is no way to check whether a 'true' value there
was explicitly set or just carrying over from the old version of
the row without being set, you need some other mechanism for
handling this.  You could, for example, add a force_sync column
which could be tested in a trigger.  Your BEFORE UPDATE trigger
could leave the synced value in NEW alone if force_sync was
false, and set synced to false otherwise.  It could then set
NEW.force_sync to false, to leave you ready for the next update.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] BEFORE UPDATE trigger doesn't change column value

2013-04-06 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:

 Your BEFORE UPDATE trigger could leave the synced value in NEW
 alone if force_sync was false, and set synced to false
 otherwise.  It could then set NEW.force_sync to false, to leave you
 ready for the next update.

Sorry, that's the wrong way around.  I should have said:

Your BEFORE UPDATE trigger could leave the synced value in NEW
alone if force_sync was **true**, and set synced to false
otherwise.  It could then set NEW.force_sync to false, to leave you
ready for the next update.
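Put together, the corrected scheme looks roughly like this, assuming the
table is called t and synced is the existing boolean column; t and the
trigger name are placeholders:

    ALTER TABLE t ADD COLUMN force_sync boolean NOT NULL DEFAULT false;

    CREATE OR REPLACE FUNCTION update_synced_column()
      RETURNS trigger AS
    $BODY$
    BEGIN
        IF NOT NEW.force_sync THEN
            NEW.synced := false;   -- ordinary update: mark the row unsynced
        END IF;
        NEW.force_sync := false;   -- reset the flag for the next update
        RETURN NEW;
    END;
    $BODY$
      LANGUAGE plpgsql;

    CREATE TRIGGER t_update_synced
    BEFORE UPDATE ON t
    FOR EACH ROW EXECUTE PROCEDURE update_synced_column();

    -- to set synced explicitly:
    -- UPDATE t SET synced = true, force_sync = true WHERE ...;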

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[GENERAL] BEFORE UPDATE trigger doesn't change column value

2013-04-05 Thread Clemens Eisserer
Hi,

Sorry for this newbie question; I have been trying for quite some time now to get
the following trigger-function to work properly:

CREATE OR REPLACE FUNCTION update_synced_column()
  RETURNS trigger AS
$BODY$
BEGIN
   IF NEW.synced IS NULL THEN
      NEW.synced := false;
   END IF;
   RETURN NEW;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

The idea is basically to always update the synced column to false, unless
a value has been provided manually in the UPDATE-clause.
Synced is defined as BOOLEAN DEFAULT FALSE;
I execute this function in a BEFORE UPDATE trigger FOR EACH ROW, however it
doesn't seem to have any effect.

Any ideas what could be wrong here?

Thanks, Clemens


Re: [GENERAL] BEFORE UPDATE trigger doesn't change column value

2013-04-05 Thread Kevin Grittner
Clemens Eisserer linuxhi...@gmail.com wrote:

 I am trying for quite some time now to get the following
 trigger-function to work properly:
 
 CREATE OR REPLACE FUNCTION update_synced_column()
   RETURNS trigger AS
 $BODY$
 BEGIN
    IF NEW.synced IS NULL THEN
         NEW.synced := false;
    END IF;
    RETURN NEW;
 END;
 $BODY$
   LANGUAGE plpgsql VOLATILE
   COST 100;
 
 The idea is basically to always update the synced column to
 false, unless a value has been provided manually in the
 UPDATE-clause.
 Synced is defined as BOOLEAN DEFAULT FALSE;
 I execute this function in a BEFORE UPDATE trigger FOR EACH ROW,
 however it doesn't seem to have any effect.
 
 Any ideas what could be wrong here?

NEW reflects what the row will look like after the UPDATE.  There
is no way to tell which columns were specified in the SET clause of
the UPDATE; a column which is omitted from that clause will look
exactly the same as a column which is set to the value it already
had.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] BEFORE UPDATE trigger doesn't change column value

2013-04-05 Thread Gavan Schneider

On 5/4/13 at 6:59 AM, Clemens Eisserer wrote:


Sorry for this newbie-question, I am trying for quite some time now to get
the following trigger-function to work properly:

CREATE OR REPLACE FUNCTION update_synced_column()
RETURNS trigger AS
$BODY$
BEGIN
    IF NEW.synced IS NULL THEN
        NEW.synced := false;
    END IF;
    RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;

The idea is basically to always update the synced column to false, unless
a value has been provided manually in the UPDATE-clause.
Synced is defined as BOOLEAN DEFAULT FALSE;
I execute this function in a BEFORE UPDATE trigger FOR EACH ROW, however it
doesn't seem to have any effect.

You could also add 'NOT NULL' to the declaration of synced so 
this column can never be set to NULL and this would further 
ensure the trigger function has nothing to do.


By way of sanity testing, do you get any rows when doing 
something like:

SELECT * FROM relevant_table WHERE synced IS NULL;


Any ideas what could be wrong here?

If the above does not apply and at the risk of being too obvious 
(specifically not wishing to cause offence):


Has the trigger itself been declared?
refer: http://www.postgresql.org/docs/9.2/static/sql-createtrigger.html

Is the function ever getting called?
refer: 
http://www.postgresql.org/docs/9.2/static/plpgsql-errors-and-messages.html

Regards
Gavan Schneider





Re: [GENERAL] BEFORE UPDATE trigger doesn't change column value

2013-04-05 Thread Clemens Eisserer
Hi,

NEW reflects what the row will look like after the UPDATE.  There
 is no way to tell which columns were specified in the SET clause of
 the UPDATE; a column which is omitted from that clause will look
 exactly the same as a column which is set to the value it already
 had.


Thanks a lot for clarifying this ... my logic was flawed.
At stackoverflow.com I found an example which suggested testing against
NULL would allow this and I just relied on it.

Here is what I am trying to achieve: Set synced to false at any update,
except when it has been set explicitly to true.
This does not seem to be possible, without checking the value SET by UPDATE?

Regards, Clemens


Re: [GENERAL] Running update in chunks?

2013-01-27 Thread Jasen Betts
On 2013-01-25, Tim Uckun timuc...@gmail.com wrote:
 I agree that seems like the most likely cause. Each update to the
 row holding the hstore column requires adding new index entries for
 all the hstore elements, and autovacuum will need to clean up the
 old ones in the background. The best solution would be to either
 normalize the data instead of using hstore, or move the hstore to a
 separate table which is referenced by some sort of ID from the
 frequently-updated table.


 That's very interesting. I can certainly split up the table, no big
 deal there.  So would the index be redone even if I am not updating
 the hstore field itself?

Absolutely! see MVCC.

 http://www.postgresql.org/docs/current/static/mvcc-intro.html


-- 
⚂⚃ 100% natural





Re: [GENERAL] Running update in chunks?

2013-01-26 Thread Tim Uckun

 That would be one solution, but I think a better one would be to not
 store make_id in imports in the first place, but instead to always
 fetch it by joining imports to models at query time.


My problem here is that the incoming data is quite messy so the join
conditions become weird (lots of ORs and such). A multi pass approach
seems to work better.




Re: [GENERAL] Running update in chunks?

2013-01-25 Thread Tim Uckun
Sorry I haven't been responsive for a little while.

I ran your script, but creating a new schema instead of my existing
schema. My timings were similar to yours (more or less) except for the
vacuums, which took roughly 147891 ms.


On Tue, Jan 22, 2013 at 3:21 PM, Kevin Grittner kgri...@mail.com wrote:
 Kevin Grittner wrote:

 update imports set make_id = 0

 Query returned successfully: 98834 rows affected, 45860 ms execution time.

 For difficult problems, there is nothing like a self-contained test
 case, that someone else can run to see the issue. Here's a starting
 point:

 create extension if not exists hstore;
 create schema cars;
 drop table if exists cars.imports;
 CREATE TABLE cars.imports
 (
 id serial NOT NULL,
 target_id integer,
 batch_id integer,
 make_id integer,
 model_id integer,
 date timestamp without time zone,
 division_id integer,
 dealer_id integer,
 data hstore,
 created_at timestamp without time zone NOT NULL,
 updated_at timestamp without time zone NOT NULL,
 CONSTRAINT imports_pkey PRIMARY KEY (id)
 );
 CREATE INDEX index_imports_on_data ON cars.imports USING gin (data);
 CREATE INDEX index_imports_on_dealer_id ON cars.imports USING btree 
 (dealer_id);
 CREATE INDEX index_imports_on_division_id ON cars.imports USING btree 
 (division_id);
 CREATE INDEX index_imports_on_make_id ON cars.imports USING btree (make_id);
 CREATE INDEX index_imports_on_model_id ON cars.imports USING btree (model_id);
 insert into cars.imports (created_at, updated_at) select now(), now() from 
 (select generate_series(1, 100000)) x;
 vacuum freeze analyze;
 \timing on
 update cars.imports set make_id = 0;
 vacuum analyze;
 update cars.imports set make_id = 0;

 ... and here's what I get when I run it on my desktop computer with
 ordinary disk drives and a completely default configuration:

 test=# create extension if not exists hstore;
 CREATE EXTENSION
 Time: 48.032 ms
 test=# create schema cars;
 CREATE SCHEMA
 Time: 8.150 ms
 test=# drop table if exists cars.imports;
 NOTICE:  table imports does not exist, skipping
 DROP TABLE
 Time: 0.205 ms
 test=# CREATE TABLE cars.imports
 test-# (
 test(# id serial NOT NULL,
 test(# target_id integer,
 test(# batch_id integer,
 test(# make_id integer,
 test(# model_id integer,
 test(# date timestamp without time zone,
 test(# division_id integer,
 test(# dealer_id integer,
 test(# data hstore,
 test(# created_at timestamp without time zone NOT NULL,
 test(# updated_at timestamp without time zone NOT NULL,
 test(# CONSTRAINT imports_pkey PRIMARY KEY (id)
 test(# );
 NOTICE:  CREATE TABLE will create implicit sequence imports_id_seq for 
 serial column imports.id
 NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index imports_pkey 
 for table imports
 CREATE TABLE
 Time: 152.677 ms
 test=# CREATE INDEX index_imports_on_data ON cars.imports USING gin (data);
 CREATE INDEX
 Time: 6.391 ms
 test=# CREATE INDEX index_imports_on_dealer_id ON cars.imports USING btree 
 (dealer_id);
 CREATE INDEX
 Time: 64.668 ms
 test=# CREATE INDEX index_imports_on_division_id ON cars.imports USING btree 
 (division_id);
 CREATE INDEX
 Time: 65.573 ms
 test=# CREATE INDEX index_imports_on_make_id ON cars.imports USING btree 
 (make_id);
 CREATE INDEX
 Time: 64.959 ms
 test=# CREATE INDEX index_imports_on_model_id ON cars.imports USING btree 
 (model_id);
 CREATE INDEX
 Time: 64.906 ms
 test=# insert into cars.imports (created_at, updated_at) select now(), now() 
 from (select generate_series(1, 100000)) x;
 INSERT 0 100000
 Time: 2516.559 ms
 test=# vacuum freeze analyze;
 VACUUM
 Time: 3357.778 ms
 test=# \timing on
 Timing is on.
 test=# update cars.imports set make_id = 0;
 UPDATE 100000
 Time: 2937.241 ms
 test=# vacuum analyze;
 VACUUM
 Time: 2097.426 ms
 test=# update cars.imports set make_id = 0;
 UPDATE 100000
 Time: 3935.939 ms

 Ubuntu 12.10
 i7-3770 CPU @ 3.40GHz with 16GB RAM
 Linux Kevin-Desktop 3.5.0-22-generic #34-Ubuntu SMP Tue Jan 8 21:47:00 UTC 
 2013 x86_64 x86_64 x86_64 GNU/Linux
 A pair of WDC WD10EALX-009BA0 (15.01H51) 7200 RPM drives in RAID 1.

  PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 
 4.7.2-2ubuntu1) 4.7.2, 64-bit

 test=# SELECT name, current_setting(name), source
 test-#   FROM pg_settings
 test-#   WHERE source NOT IN ('default', 'override');
             name             |  current_setting   |       source
 -----------------------------+--------------------+--------------------
  application_name             | psql               | client
  client_encoding              | UTF8               | client
  DateStyle                    | ISO, MDY           | configuration file
  default_text_search_config   | pg_catalog.english | configuration file
  lc_messages                  | en_US.UTF-8        | configuration file
  lc_monetary                  | en_US.UTF-8        | configuration file
  lc_numeric                   | en_US.UTF-8        | configuration file
  lc_time                      | en_US.UTF-8        | configuration file

Re: [GENERAL] Running update in chunks?

2013-01-25 Thread Tim Uckun

 What if you do:
 alter table cars.imports set (fillfactor=50);
 Before the vacuum full, and then try the update again?


This makes a dramatic difference when combined with a vacuum.

UPDATE 98834
Time: 3408.210 ms

Ten times faster!




Re: [GENERAL] Running update in chunks?

2013-01-25 Thread Tim Uckun
relid | 26710
schemaname| cars
relname   | imports
seq_scan  | 280
seq_tup_read  | 25873543
idx_scan  | 4
idx_tup_fetch | 2749
n_tup_ins | 98926
n_tup_upd | 6350466
n_tup_del | 92
n_tup_hot_upd | 625286
n_live_tup| 98834
n_dead_tup| 0
last_vacuum   | 2013-01-25 21:55:36.078614+13
last_autovacuum   | 2013-01-25 21:58:40.850546+13
last_analyze  | 2013-01-25 21:55:36.305967+13
last_autoanalyze  | 2013-01-25 21:51:54.307639+13
vacuum_count  | 6
autovacuum_count  | 32
analyze_count | 6
autoanalyze_count | 25


On Wed, Jan 23, 2013 at 7:50 AM, Steve Crawford
scrawf...@pinpointresearch.com wrote:
 On 01/21/2013 05:02 PM, Tim Uckun wrote:

 I already posted the schema earlier. It's a handful of integer fields
 with one hstore field.


 Oh well. I can't find it but maybe it got lost in shipping or eaten by a
 spam filter.




 On Tue, Jan 22, 2013 at 1:23 PM, Steve Crawford
 scrawf...@pinpointresearch.com wrote:

 select * from pg_stat_user_tables where relname='yourtable';



 Messy output

 Don't know if you are in psql but if so, expanded display works for this.
 I.e.:
 steve@[local] = \x
 Expanded display is on.
 steve@[local] = select * from pg_stat_user_tables where relname='footest';
 -[ RECORD 1 ]-+--
 relid | 781691
 schemaname| public
 relname   | footest
 seq_scan  | 3
 seq_tup_read  | 609
 idx_scan  |
 idx_tup_fetch |
 n_tup_ins | 609
 n_tup_upd | 0
 n_tup_del | 0
 n_tup_hot_upd | 0
 n_live_tup| 301
 n_dead_tup| 0
 last_vacuum   |
 last_autovacuum   |
 last_analyze  |
 last_autoanalyze  | 2012-12-19 08:42:23.347368-08
 vacuum_count  | 0
 autovacuum_count  | 0
 analyze_count | 0
 autoanalyze_count | 2





 relid,schemaname,relname,seq_scan,seq_tup_read,idx_scan,idx_tup_fetch,n_tup_ins,n_tup_upd,n_tup_del,n_tup_hot_upd,n_live_tup,n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,vacuum_count,autovacuum_count,analyze_count,autoanalyze_count

 26710,chrysler,imports,274,25280539,4,2749,98926,5757462,92,327542,98834,0,2013-01-22
 12:28:29.01505+13,2013-01-22 12:32:29.249588+13,2013-01-22
 12:28:29.173772+13,2013-01-22 12:32:44.123493+13,3,30,3,24


 So at least autovacuum is running (and some manual vacuum and analyze as
 well).

 Cheers,
 Steve





Re: [GENERAL] Running update in chunks?

2013-01-25 Thread Tim Uckun
 I agree that seems like the most likely cause. Each update to the
 row holding the hstore column requires adding new index entries for
 all the hstore elements, and autovacuum will need to clean up the
 old ones in the background. The best solution would be to either
 normalize the data instead of using hstore, or move the hstore to a
 separate table which is referenced by some sort of ID from the
 frequently-updated table.


That's very interesting. I can certainly split up the table, no big
deal there.  So would the index be redone even if I am not updating
the hstore field itself?




Re: [GENERAL] Running update in chunks?

2013-01-25 Thread Richard Huxton

On 25/01/13 08:57, Tim Uckun wrote:

What if you do:
alter table cars.imports set (fillfactor=50);
Before the vacuum full, and then try the update again?


This makes a dramatic difference when combined with a vacuum.

UPDATE 98834
Time: 3408.210 ms

Ten times faster!
That suggests (to me, at least) that it is related to index updating. 
Again, your GIN index seems the primary candidate.


A fillfactor of 50% means row updates probably stay on the same 
disk-block as their previous version. This implies fewer index updates.


Try running iostat (I think that's available on a Mac) with/without the 
fillfactor and with/without the GIN index while you do the updates. It's 
possible your SSD is just behaving oddly under stress.


--
  Richard Huxton
  Archonet Ltd




Re: [GENERAL] Running update in chunks?

2013-01-25 Thread Tim Uckun

 That suggests (to me, at least) that it is related to index updating. Again,
 your GIN index seems primary candidate.

 Try running iostat (I think that's available on a Mac) with/without the
 fillfactor and with/without the GIN index while you do the updates. It's
 possible your SSD is just behaving oddly under stress.



I dropped the index and the numbers shot up tenfold or more.  I don't
know why postgres feels the need to update the GIN index on the hstore
field when I am only updating an integer field, but it looks like I
need to split the hstore into a different table.




Re: [GENERAL] Running update in chunks?

2013-01-25 Thread Richard Huxton

On 25/01/13 11:38, Tim Uckun wrote:

That suggests (to me, at least) that it is related to index updating. Again,
your GIN index seems primary candidate.

Try running iostat (I think that's available on a Mac) with/without the
fillfactor and with/without the GIN index while you do the updates. It's
possible your SSD is just behaving oddly under stress.



I dropped the index and the numbers shot up tenfold or more.  I don't
know why postgres feels the need to update the GIN index on the hstore
field when I am only updating an integer field but it looks like I
need to split the hstore into a different table.
If the row moves to a different block, then it has no choice. The old 
index entry will point to an invalid block. There are some optimisations 
(HOT - http://pgsql.tapoueh.org/site/html/misc/hot.html) but they rely 
on (iirc) the update staying on the same block and also on not updating any 
indexed fields (and you were, I think).


A GIN index is very expensive to update compared to btree too.
--
  Richard Huxton
  Archonet Ltd




Re: [GENERAL] Running update in chunks?

2013-01-25 Thread Albe Laurenz
Tim Uckun wrote:
 I dropped the index and the numbers shot up tenfold or more.  I don't
 know why postgres feels the need to update the GIN index on the hstore
 field when I am only updating an integer field but it looks like I
 need to split the hstore into a different table.

Every UPDATE that is not HOT will create a row version with
a new row id.  That means that all indexes referencing that
row will have to get updated.

That is consistent with better performance with low fillfactor
(which makes HOT more likely).

Yours,
Laurenz Albe
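That effect can be observed directly with the statistics view already used in
this thread. A sketch against the thread's test table; re-updating make_id to
the value it already holds keeps the update HOT-eligible even though make_id
is indexed:

    ALTER TABLE cars.imports SET (fillfactor = 50);
    VACUUM FULL cars.imports;  -- rewrite so pages start out half empty

    UPDATE cars.imports SET make_id = 0;

    SELECT n_tup_upd, n_tup_hot_upd  -- the HOT share should now be high
      FROM pg_stat_user_tables
     WHERE schemaname = 'cars' AND relname = 'imports';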



Re: [GENERAL] Running update in chunks?

2013-01-25 Thread Jeff Janes
On Fri, Jan 25, 2013 at 3:38 AM, Tim Uckun timuc...@gmail.com wrote:

 That suggests (to me, at least) that it is related to index updating. Again,
 your GIN index seems primary candidate.

 Try running iostat (I think that's available on a Mac) with/without the
 fillfactor and with/without the GIN index while you do the updates. It's
 possible your SSD is just behaving oddly under stress.



 I dropped the index and the numbers shot up tenfold or more.  I don't
 know why postgres feels the need to update the GIN index on the hstore
 field when I am only updating an integer field


When the row gets updated, it might move to some place else.  An index
maps data values to row locations.  So if the location changes, all
indexes need to be updated, even if the data value for that index did
not change.  (Well, I shouldn't say they *need* to change.  The
database could have been designed, with considerable difficulty and
consequences, to leave behind permanent redirect pointers to the new
location.  But it wasn't)

There is a mechanism called HOT update (Heap-Only Tuple) which can
prevent this under certain conditions.

1) Either none of the fields being updated are indexed, or any that
are both updated and indexed are updated to the value they already
have.

2) There is room for a new copy of the tuple on the same page as the old one.

lowering the fillfactor helps with requirement 2, especially since
your tuples are probably wide (because of the hstore column) and so
not many fit on a page.

Note that if you update a field to have the same value as it already
does, it still makes a new copy of the entire tuple anyway.  (It
detects that the :old = :new for HOT-eligibility purposes if the field
is indexed, but not for suppression of copying purposes.  And if the
tuple needs to be copied but there is no room on that page, then it
isn't eligible for HOT after all).

So you should add a where clause to the UPDATE to filter out things
that are unchanged.
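Applied to the query from this thread, that extra clause would look like:

    UPDATE cars.imports i
       SET make_id = md.make_id
      FROM cars.models md
     WHERE i.model_id = md.id
       AND i.make_id IS DISTINCT FROM md.make_id;  -- skip already-correct rows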


 but it looks like I
 need to split the hstore into a different table.

That would be one solution, but I think a better one would be to not
store make_id in imports in the first place, but instead to always
fetch it by joining imports to models at query time.

Cheers,

Jeff




Re: [GENERAL] Running update in chunks?

2013-01-23 Thread Kevin Grittner
Jeff Janes wrote:

 one hstore field can easily be equivalent to 50 text fields with
 an index on each one.
 
 I'm pretty sure that that is your bottleneck.

I agree that seems like the most likely cause. Each update to the
row holding the hstore column requires adding new index entries for
all the hstore elements, and autovacuum will need to clean up the
old ones in the background. The best solution would be to either
normalize the data instead of using hstore, or move the hstore to a
separate table which is referenced by some sort of ID from the
frequently-updated table.

-Kevin
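One possible shape of that split, keeping the thread's cars.imports schema
for everything except the hstore. A sketch; the copy step assumes a one-off
migration, and the new table name is made up:

    CREATE TABLE cars.import_data (
        import_id integer PRIMARY KEY REFERENCES cars.imports (id),
        data      hstore
    );
    INSERT INTO cars.import_data (import_id, data)
    SELECT id, data FROM cars.imports;
    CREATE INDEX index_import_data_on_data ON cars.import_data USING gin (data);
    ALTER TABLE cars.imports DROP COLUMN data;  -- drops the GIN index with it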




Re: [GENERAL] Running update in chunks?

2013-01-22 Thread Steve Crawford

On 01/21/2013 05:02 PM, Tim Uckun wrote:

I already posted the schema earlier. It's a handful of integer fields
with one hstore field.


Oh well. I can't find it but maybe it got lost in shipping or eaten by a 
spam filter.





On Tue, Jan 22, 2013 at 1:23 PM, Steve Crawford
scrawf...@pinpointresearch.com wrote:

select * from pg_stat_user_tables where relname='yourtable';



Messy output
Don't know if you are in psql but if so, expanded display works for 
this. I.e.:

steve@[local] = \x
Expanded display is on.
steve@[local] = select * from pg_stat_user_tables where relname='footest';
-[ RECORD 1 ]-+--
relid | 781691
schemaname| public
relname   | footest
seq_scan  | 3
seq_tup_read  | 609
idx_scan  |
idx_tup_fetch |
n_tup_ins | 609
n_tup_upd | 0
n_tup_del | 0
n_tup_hot_upd | 0
n_live_tup| 301
n_dead_tup| 0
last_vacuum   |
last_autovacuum   |
last_analyze  |
last_autoanalyze  | 2012-12-19 08:42:23.347368-08
vacuum_count  | 0
autovacuum_count  | 0
analyze_count | 0
autoanalyze_count | 2




relid             | 26710
schemaname        | chrysler
relname           | imports
seq_scan          | 274
seq_tup_read      | 25280539
idx_scan          | 4
idx_tup_fetch     | 2749
n_tup_ins         | 98926
n_tup_upd         | 5757462
n_tup_del         | 92
n_tup_hot_upd     | 327542
n_live_tup        | 98834
n_dead_tup        | 0
last_vacuum       | 2013-01-22 12:28:29.01505+13
last_autovacuum   | 2013-01-22 12:32:29.249588+13
last_analyze      | 2013-01-22 12:28:29.173772+13
last_autoanalyze  | 2013-01-22 12:32:44.123493+13
vacuum_count      | 3
autovacuum_count  | 30
analyze_count     | 3
autoanalyze_count | 24


So at least autovacuum is running (and some manual vacuum and analyze as 
well).


Cheers,
Steve





Re: [GENERAL] Running update in chunks?

2013-01-22 Thread Steve Crawford

On 01/21/2013 06:21 PM, Kevin Grittner wrote:

Kevin Grittner wrote:
...
  shared_buffers | 32MB   | configuration file
...

I did absolutely no tuning from the default configuration.

But Tim has his shared_buffers set to 1600kB 
(shared_buffers;1600kB;configuration file), roughly 1/20 of the 
typical 32MB default, which is itself a very low starting value on a 
machine populated with 8GB of RAM.


I'd like to see how it runs with a more reasonable shared_buffers 
setting. At a very minimum, the 32MB default.
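
For example, in postgresql.conf (the value is hypothetical; a common
rule of thumb is on the order of a quarter of RAM, and the change
requires a server restart):

shared_buffers = 2GB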


Cheers,
Steve





Re: [GENERAL] Running update in chunks?

2013-01-22 Thread Alvaro Herrera
Steve Crawford wrote:
 On 01/21/2013 05:02 PM, Tim Uckun wrote:
 I already posted the schema earlier. It's a handful of integer fields
 with one hstore field.
 
 Oh well. I can't find it but maybe it got lost in shipping or eaten
 by a spam filter.

This is what we have the archives for:

http://www.postgresql.org/message-id/flat/50fedf66.7050...@pinpointresearch.com#50fedf66.7050...@pinpointresearch.com

The original message is at the top of the page (obviously).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [GENERAL] Running update in chunks?

2013-01-22 Thread Steve Crawford

On 01/22/2013 10:59 AM, Alvaro Herrera wrote:

Steve Crawford wrote:

On 01/21/2013 05:02 PM, Tim Uckun wrote:

I already posted the schema earlier. It's a handful of integer fields
with one hstore field.

Oh well. I can't find it but maybe it got lost in shipping or eaten
by a spam filter.

This is what we have the archives for:

http://www.postgresql.org/message-id/flat/50fedf66.7050...@pinpointresearch.com#50fedf66.7050...@pinpointresearch.com

The original message is at the top of the page (obviously).

Didn't notice that the information was over on the github site (which, 
of course, also makes it impossible to search for in my email and 
unavailable to the mail archives for those wishing to view it in the 
future).


Cheers,
Steve





Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Richard Huxton

On 21/01/13 08:04, Tim Uckun wrote:

This is the query I am running

update cars.imports i
 set make_id = md.make_id
 from cars.models md where i.model_id = md.id;


Here is the analyse

Looks like it's the actual update that's taking all the time.

This query takes fifty seconds on a macbook air with i7 processor and
eight gigs of RAM and SSD hard drive.  I am using postgres 9.2
installed with homebrew using the standard conf file.
Can you try a couple of things just to check timings. Probably worth 
EXPLAIN ANALYSE.


SELECT count(*) FROM cars.imports i JOIN cars.models md ON i.model_id = 
md.id;


CREATE TEMP TABLE tt AS SELECT i.* FROM cars.imports i JOIN cars.models 
md ON i.model_id = md.id;


Now the first one should take half a second judging by your previous 
explain. If the second one takes 50 seconds too then that's just the 
limit of your SSD's write speed. If it's much faster then something 
else is happening.



--
  Richard Huxton
  Archonet Ltd




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Tim Uckun
 Can you try a couple of things just to check timings. Probably worth EXPLAIN
 ANALYSE.

 SELECT count(*) FROM cars.imports i JOIN cars.models md ON i.model_id =
 md.id;


Takes about 300 ms


 CREATE TEMP TABLE tt AS SELECT i.* FROM cars.imports i JOIN cars.models md
 ON i.model_id = md.id;

Takes about 300 ms


 Now the first one should take half a second judging by your previous
 explain. If the second one takes 50 seconds too then that's just the limit
 of your SSD's write speed. If it's much faster then something else is happening.




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Richard Huxton

On 21/01/13 10:30, Tim Uckun wrote:

Can you try a couple of things just to check timings. Probably worth EXPLAIN
ANALYSE.

SELECT count(*) FROM cars.imports i JOIN cars.models md ON i.model_id =
md.id;


Takes about 300 ms


CREATE TEMP TABLE tt AS SELECT i.* FROM cars.imports i JOIN cars.models md
ON i.model_id = md.id;

Takes about 300 ms
OK - so writing all the data takes well under one second, but updating 
the same amount takes 50 seconds.


The only differences I can think of are WAL logging (transaction log) 
and index updates (the temp table has no indexes).


1. Try CREATE TABLE rather than CREATE TEMP TABLE - if that's still 
quick then it's not the time taken to write WAL.
2. Run the update query against your new tt table and see how long that 
takes.
3. Add indexes and repeat (in particular I'd be suspicious of the gin 
index on data)


My guess is that it's the time taken to update the data index - gin 
indexes can be slow to rebuild (although 50 seconds seems *very* slow). 
If so there are a few options:
1. Split the table and put whatever this data is into an import_data 
table - assuming it doesn't change often.
2. Try a fill-factor of 50% or less - keeping the updates on the same 
data page as the original might help
3. Drop the gin index before doing your bulk update and rebuild it at 
the end. This is a common approach with bulk-loading / updates (a 
rough sketch follows below).
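
For option 3, assuming the gin index is on imports(data) and is named
imports_data_idx (both names invented here; \d imports will show the
real ones), the pattern is roughly:

DROP INDEX imports_data_idx;
-- ... run the bulk UPDATE here ...
CREATE INDEX imports_data_idx ON cars.imports USING gin (data);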


Oh - I'm assuming you're only updating those rows whose id has changed - 
that seemed to be the suggestion in your first message. If not, simply 
adding AND make_id <> md.make_id should help. Also (and you may well 
have considered this) - for a normalised setup you'd just have the 
model-id in imports and look up the make-id through the models table.


--
  Richard Huxton
  Archonet Ltd




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Kevin Grittner
Richard Huxton wrote:

 The only differences I can think of are WAL logging (transaction
 log) and index updates (the temp table has no indexes).

What about foreign keys? Are there any tables which reference the
updated column in a foreign key declaration? Do they have indexes
on that column?

-Kevin




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Tim Uckun
Just to close this up and give some guidance to future googlers...

There are no foreign keys. The table is empty when I start. I copy the
data into it from a text file.

Removing indexes didn't help much (made a very slight difference).

running a query CREATE TEMP TABLE tt AS SELECT ... using a massive
join takes about 8 seconds. I presume that's the baseline for the disk
and RAM given my current postgres configuration.  Note that this is
not a satisfactory option for me because I can't do what I want in one
step (the update I specified is one of many).

running a very simple update UPDATE imports set make_id = null
takes over 50 seconds so that's the minimum amount of time any update
is going to take.

Running a complex update where I join all the tables together and
update all the fields takes about 106 seconds.

Just running a complex select with the joins takes no time at all.

I tried chunking the updates using chunks of 100 records and 1000
records (where ID between X and Y repeatedly) and it was even slower.
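
The chunked form was roughly this (bounds hypothetical):

UPDATE imports SET make_id = null WHERE id BETWEEN 1 AND 1000;
UPDATE imports SET make_id = null WHERE id BETWEEN 1001 AND 2000;
-- ... and so on over the whole id range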

Conclusion.  Updates on postgres are slow (given the default
postgresql.conf).  I presume this is due to MVCC or the WAL or
something and there are probably some things I can do to tweak the
conf file to make them go faster but out of the box running an update
on a table with lots of rows is going to cost you a lot. Removing the
indexes doesn't help that much.

Suggestion for the PG team.  Deliver a more realistic postgresql.conf by
default. The default one seems to be aimed at ten-year-old PCs with
very little RAM and disk space. At least deliver additional conf files
for small, medium, large, and huge setups.




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Richard Huxton

On 21/01/13 20:09, Tim Uckun wrote:

Just to close this up and give some guidance to future googlers...

Careful, future googlers.

Conclusion.  Updates on postgres are slow

Nope.


(given the default
postgresql.conf).  I presume this is due to MVCC or the WAL or
something and there are probably some things I can do to tweak the
conf file to make them go faster but out of the box running an update
on a table with lots of rows is going to cost you a lot.
Unlikely. Do you really think that a PostgreSQL installation typically 
runs 100 times slower on updates than inserts and every other user has 
just said "oh, that's ok then"? Or is it more likely that something 
peculiar is broken on your setup.



  Removing the indexes doesn't help that much.

Suggestion for the PG team.  Deliver a more realistic postgresql.conf by
default. The default one seems to be aimed at ten-year-old PCs with
very little RAM and disk space. At least deliver additional conf files
for small, medium, large, and huge setups.

--
  Richard Huxton




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread patrick keshishian
On Mon, Jan 21, 2013 at 1:23 PM, Richard Huxton d...@archonet.com wrote:
 On 21/01/13 20:09, Tim Uckun wrote:

 Just to close this up and give some guidance to future googlers...

 Careful, future googlers.

 Conclusion.  Updates on postgres are slow

 Nope.


 (given the default
 postgresql.conf).  I presume this is due to MVCC or the WAL or
 something and there are probably some things I can do to tweak the
 conf file to make them go faster but out of the box running an update
 on a table with lots of rows is going to cost you a lot.

 Unlikely. Do you really think that a PostgreSQL installation typically runs
 100 times slower on updates than inserts and every other user has just said
 "oh, that's ok then"? Or is it more likely that something peculiar is broken
 on your setup.


   Removing the indexes doesn't help that much.

 Suggestion for the PG team.  Deliver a more realistic postgresql.conf by
 default. The default one seems to be aimed at ten-year-old PCs with
 very little RAM and disk space. At least deliver additional conf files
 for small, medium, large, and huge setups.

I'd be curious to see results of the same update on a standard HDD
vs the SSD, and maybe on more typical database deployment hardware
vs a macbook air.

--patrick




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Tim Uckun

 Nope.

If you have any suggestions I am all ears. For the purposes of this
discussion we can narrow down the problem this update statement.

Update imports set make_id = null.

There are 98K records in the table. There is no index on the make_id
field. Standard untouched postgresql.conf from the brew install of
postgres.


 Unlikely. Do you really think that a PostgreSQL installation typically runs
 100 times slower on updates than inserts and every other user has just said
 "oh, that's ok then"? Or is it more likely that something peculiar is broken
 on your setup.

I really don't know. That's why I am here asking.  I don't think
anything particular is broken with my system.  As mentioned above the
setup is really simple. Standard postgres install, the default conf
file, update one field on one table. It takes fifty plus seconds.

I concede that if I were to go into postgresql.conf and make some
changes it would probably run faster (maybe much faster), but I wanted
to exhaust other factors before I went messing with the default
install.




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Tim Uckun
 I'd be curious to see results of the same update on a standard HDD
 vs the SSD, and maybe on a more typical database deployment hardware
 vs a macbook air.



I haven't tried it on any other machine yet.  CREATE TEMP TABLE tt as
SELECT ...  takes eight seconds so presumably the disk is not the
choke point.




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Kevin Grittner
Richard Huxton wrote:
 On 21/01/13 20:09, Tim Uckun wrote:

 Just to close this up and give some guidance to future
 googlers...

 Careful, future googlers.

+1

 Conclusion. Updates on postgres are slow

 Nope.

Agreed.

 (given the default postgresql.conf). I presume this is due to
 MVCC or the WAL or something and there are probably some things
 I can do to tweak the conf file to make them go faster but out
 of the box running an update on a table with lots of rows is
 going to cost you a lot.

 Unlikely. Do you really think that a PostgreSQL installation
 typically runs 100 times slower on updates than inserts and every
 other user has just said "oh, that's ok then"? Or is it more
 likely that something peculiar is broken on your setup.

As someone who has managed hundreds of databases ranging up to over
3TB without seeing this except with some very specific cause, I agree
that there is something wonky on Tim's setup which he hasn't told
us about. Then again, I'm not sure we've pushed hard enough for the
relevant details.

Tim, if you're still interested in resolving this, please post the
results from running the SQL code on this page:

http://wiki.postgresql.org/wiki/Server_Configuration
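
The code there amounts to listing every setting changed from its
default; something along these lines (reproduced from memory here,
so prefer the wiki's version):

SELECT name, current_setting(name), source
  FROM pg_settings
  WHERE source NOT IN ('default', 'override');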

It might be worthwhile to read through this page:

http://wiki.postgresql.org/wiki/Slow_Query_Questions

... and try some of the ideas there. Base disk performance numbers
would help put the results in perspective.

The cause could be anything from table bloat due to inadequate
vacuuming to hardware problems.

-Kevin




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Kevin Grittner
Tim Uckun wrote:

 If you have any suggestions I am all ears. For the purposes of this
 discussion we can narrow down the problem this update statement.
 
 Update imports set make_id = null.

Well, that simplifies things.

First off, what does it say for rows affected? (Hint, if you really
are using a default configuration and it doesn't say 0 rows
affected, please show us the actual query used.)

Second, try connecting to the database as a superuser and running:

VACUUM ANALYZE imports;   -- (show us the results)
VACUUM FULL imports;
VACUUM FREEZE ANALYZE;    -- (don't specify a table)

Then try your query and see whether performance is any different.

-Kevin




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread patrick keshishian
On Mon, Jan 21, 2013 at 1:48 PM, Tim Uckun timuc...@gmail.com wrote:
 I'd be curious to see results of the same update on a standard HDD
 vs the SSD, and maybe on a more typical database deployment hardware
 vs a macbook air.



 I haven't tried it on any other machine yet.  CREATE TEMP TABLE tt as
 SELECT ...  takes eight seconds so presumably the disk is not the
 choke point.

You are making an assumption that a fresh write is the same as a
re-write. Try the test.

--patrick




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Kevin Grittner
Kevin Grittner wrote:

 First off, what does it say for rows affected? (Hint, if you really
 are using a default configuration and it doesn't say 0 rows
 affected, please show us the actual query used.)

Never mind that bit -- I got myself confused. Sorry for the noise.

-Kevin




Re: [GENERAL] Running update in chunks?

2013-01-21 Thread Tim Uckun
 First off, what does it say for rows affected? (Hint, if you really
 are using a default configuration and it doesn't say 0 rows
 affected, please show us the actual query used.)

update imports set make_id = null

Query returned successfully: 98834 rows affected, 49673 ms execution time.


vacuum analyze imports

Query returned successfully with no result in 4138 ms.

VACUUM FULL imports;

Query returned successfully with no result in 38106 ms.

VACUUM FREEZE ANALYZE;

Query returned successfully with no result in 184635 ms

update imports set make_id = 0

Query returned successfully: 98834 rows affected, 45860 ms execution time.


So all the vacuuming saved about four seconds of execution time.

Here is the postgresql.conf, completely untouched from the default
install: https://gist.github.com/4590590



