Re: [GENERAL] Catalog bloat (again)

2016-01-27 Thread John R Pierce

On 1/27/2016 2:54 PM, Ivan Voras wrote:
For blocking reasons, we'd like to avoid vacuum fulls on these tables 
(as it seems like touching them will lock up everything else).


vacuum full isn't nearly as nasty in 9.x as it was in much older 
releases, so a lot of the caveats no longer apply.


with ~10,000 actual records, I'd expect a vacuum full of those system 
catalogs to go in a matter of seconds, and afaik the only thing that 
would be blocked would be metadata changes (e.g., creating or altering 
tables, etc).


disabling autovacuum is a very bad idea for exactly these reasons. If 
you have specific large tables that autovacuum is causing performance 
problems on, tune the autovacuum settings on those specific tables.


--
john r pierce, recycling bits in santa cruz



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] A contradiction in 13.2.1

2016-01-27 Thread Dane Foster
Hello,

I'm trying to understand concurrency in PostgreSQL so I'm slowly reading
through chapter 13 of the fine manual and I believe I've found a
contradiction in section 13.2.1.

​My understanding of the second sentence of the first paragraph is that
read committed mode never sees "changes committed during query execution by
concurrent transactions". For example let's assume two transactions, A & B,
and the following:

   - A started before B
   - B starts before A commits

My understanding of the second sentence means that if A commits before B
then any updates made by A will continue to be invisible to B because B's
snapshot was before A committed. Now if I'm wrong about this then there is
no contradiction forthcoming.
The final sentence of the first paragraph is where I find the
contradiction. It says: "Also note that two successive SELECT commands can
see different data, even though they are within a single transaction, if
other transactions commit changes after the first SELECT starts and before
the second SELECT starts."

So the mental model I've built based on the first four sentences of the
first paragraph is that when a transaction starts in read committed mode a
snapshot is taken of the (database) universe as it exists at the moment of
its creation and that it's only updated by changes made by the transaction
that created the snapshot. So for successive SELECTs to see different data
because of updates outside of the transaction that created the snapshot is
a contradiction.

Now my guess is that I'm thinking about it all wrong so if someone in the
know could shed some light on where/how my mental model breaks down I would
appreciate it.

Regards,

Dane


Re: [GENERAL] Catalog bloat (again)

2016-01-27 Thread Andrew Sullivan
On Wed, Jan 27, 2016 at 11:54:37PM +0100, Ivan Voras wrote:
> So, question #1: WTF? How could this happen, on a regularly vacuumed
> system? Shouldn't the space be reused, at least after a VACUUM? The issue
> here is not the absolute existence of the bloat space, it's that it's
> constantly growing for *system* tables.

Some system tables (particularly pg_attribute) are heavily changed by
a lot of temp table use.  You need to amp up the vacuum frequency on
them, and have a lot of workers, or you don't get to them until it's
too late.
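The knobs Andrew mentions live in postgresql.conf; a sketch of a starting point (values are illustrative and need tuning per workload):

```
# postgresql.conf -- illustrative values, not recommendations
autovacuum = on
autovacuum_max_workers = 6      # default 3; needs a restart to change
autovacuum_naptime = 15s        # default 1min; check tables more often
```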

> Question #2: What can be done about it?

You may end up taking an outage in effect, because you need to compact
them at least once.  If you can flip to a replica, that is the easiest
way to fix it.

A

-- 
Andrew Sullivan
a...@crankycanuck.ca





Re: [GENERAL] A contradiction in 13.2.1

2016-01-27 Thread Dane Foster
On Wed, Jan 27, 2016 at 5:59 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wed, Jan 27, 2016 at 3:23 PM, Dane Foster  wrote:
>
>> Hello,
>>
>> I'm trying to understand concurrency in PostgreSQL so I'm slowly reading
>> through chapter 13 of the fine manual and I believe I've found a
>> contradiction in section 13.2.1.
>>
>> ​My understanding of the second sentence of the first paragraph is that
>> read committed mode never sees "changes committed during query execution by
>> concurrent transactions". For example let's assume two transactions, A & B,
>> and the following:
>>
>>- A started before B
>>- B starts before A commits
>>
>> My understanding of the second sentence means that if A commits before B
>> then any updates made by A will continue to be invisible to B because B's
>> snapshot was before A committed. Now if I'm wrong about this then there is
>> no contradiction forthcoming.
>> The final sentence of the first paragraph is where I find the
>> contradiction. It says: "Also note that two successive SELECT commands
>> can see different data, even though they are within a single transaction,
>> if other transactions commit changes after the first SELECT starts and
>> before the second SELECT starts."
>>
>> So the mental model I've built based on the first four sentences of the
>> first paragraph is that when a transaction starts in read committed mode a
>> snapshot is taken of the (database) universe as it exists at the moment of
>> its creation and that it's only updated by changes made by the transaction
>> that created the snapshot. So for successive SELECTs to see different data
>> because of updates outside of the transaction that created the snapshot is
>> a contradiction.
>>
>> Now my guess is that I'm thinking about it all wrong so if someone in the
>> know could shed some light on where/how my mental model breaks down I would
>> appreciate it.
>>
>> Regards,
>>
>> Dane
>>
>
> ​The main thing to remember is that "query != transaction".​
>
> A1 - BEGIN;
> ​A1 - SELECT FROM a
> B1 - BEGIN;
> B2 - UPDATE a
> B3 - COMMIT;
> A2 - SELECT FROM a - again
> A3 - COMMIT;
>
> Since the commit in B3 occurs before the second select A2 in READ
> COMMITTED the query A2 *will see* the update made in B2.  But B3 must
> complete in its entirety for A2 to see it otherwise "it never sees [...]
> changes committed during query execution by concurrent transactions".  The
> concurrency is with the individual statement A2 and not the entire A
> transaction.  This is why it is called "READ COMMITTED" because within
> transaction A externally committed data is able to be read.
>
> David J.
>
>
You are correct, I was conflating query w/ transaction. But it's clear
now. Thank you.


Dane



Re: [GENERAL] Multi-Table Insert/Update Strategy - Use Functions/Procedures?

2016-01-27 Thread Don Parris
On Wed, Jan 27, 2016 at 6:24 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wed, Jan 27, 2016 at 4:03 PM, Don Parris  wrote:
>
> 


> ​Parameter passing and variables are client-side considerations.  You
> haven't told us how you plan to execute the SQL.
>
> ​IMO the most straight-forward API is a function.  Whether you implement
> that function using a updating CTE or a sequence of separate SQL commands
> is up to you to decide and, if performance matters, benchmark.
>
> Comparing a CTE and function in general doesn't really do much good.
> There are many non-performance concerns involved and the specific usage
> pattern involved will matter greatly in determining overhead.
>
>
Thanks David, that makes sense.  My main front-end - at the moment - is
LibreOffice Base.  With Base, I can probably just create forms using the
underlying tables.  That said, I may also want to write a Python front-end,
in which case I would prefer to let the database do the work it was
designed to do.

Don
-- 
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
http://dcparris.net/ 

GPG Key ID: F5E179BE


[GENERAL] Catalog bloat (again)

2016-01-27 Thread Ivan Voras
Hi,

I've done my Googling, and it looks like this is a fairly common problem.
In my case, there's a collection of hundreds of databases (10 GB+) with
apps which are pretty much designed (a long time ago) with heavy use of
temp tables - so a non-trivial system.

The databases are vacuumed (not-full) daily, from cron (autovacuum was
turned off some time ago for performance reasons), and still their size
increases unexpectedly. By using some of the queries floating around on the
wiki and stackoverflow[*], I've discovered that the bloat is not, as was
assumed, in the user tables, but in the system tables, mostly in
pg_attribute and pg_class.

This is becoming a serious problem, as I've seen instances of these tables
grow to 6 GB+ (on a 15 GB total database), while still effectively
containing on the order of 10,000 records or so. This is quite abnormal.

For blocking reasons, we'd like to avoid vacuum fulls on these tables (as
it seems like touching them will lock up everything else).

So, question #1: WTF? How could this happen, on a regularly vacuumed
system? Shouldn't the space be reused, at least after a VACUUM? The issue
here is not the absolute existence of the bloat space, it's that it's
constantly growing for *system* tables.

Question #2: What can be done about it?

This is PostgreSQL 9.3, migrating soon to 9.4.



[*] https://wiki.postgresql.org/wiki/Show_database_bloat ,
http://stackoverflow.com/questions/13931989/postgresql-automating-vacuum-full-for-bloated-tables


Re: [GENERAL] A contradiction in 13.2.1

2016-01-27 Thread Hannes Erven

Dane,


> So the mental model I've built based on the first four sentences of
> the first paragraph is that when a transaction starts in read
> committed mode a snapshot is taken of the (database) universe as it
> exists at the moment of its creation and that it's only updated by
> changes made by the transaction that created the snapshot.

This is (almost) true for the REPEATABLE READ or SERIALIZABLE modes.
(However, the snapshot is taken when the first non-transaction-control 
statement in the transaction starts, not at BEGIN.)



For "READ COMMITTED", the docs read (emphasis added):
"In effect, a SELECT query sees a snapshot of the database as of the 
instant the *query* begins to run. However, SELECT does see the effects 
of previous updates executed within its own transaction, even though 
they are not yet committed."


( 
http://www.postgresql.org/docs/9.5/static/transaction-iso.html#XACT-READ-COMMITTED 
)



So if you re-run the same query multiple times within a single 
transaction, you might get different results depending on what other 
transactions have committed in the meantime.



best regards,

-hannes




Re: [GENERAL] Multi-Table Insert/Update Strategy - Use Functions/Procedures?

2016-01-27 Thread David G. Johnston
On Wed, Jan 27, 2016 at 4:03 PM, Don Parris  wrote:

> On Wed, Jan 27, 2016 at 4:25 PM, Joshua Berkus  wrote:
>
>>
>>
>> > Am I on the right track, or is there some better way to set this up?  My
>> > understanding is that views really aren't meant for insert/update
>> > operations, and I have seen on the web that using views to
>> insert/update is
>> > a bit tricky - and still requires a procedure with a rule on the view.
>>
>> Why not use updatable CTEs?  That's what they're for.
>>
>>
> Sounds great.  But can I use variables, and allow the db user to enter the
> data when the CTE is called?  I've used variables in Python scripts for
> insert/update/delete, but honestly, I've never used a variable in my
> queries in PostgreSQL.  So, instead of 'Joe', as in your example below,
> maybe something like first_name?
>
>
>> WITH update_contact as (
>>INSERT INTO contacts ( contact_id, name )
>>VALUES ( nextval('contacts_id_seq'), 'Joe' )
>>RETURNING contact_id ),
>> new_cont_ids AS (
>>SELECT contact_id FROM update_contact
>> ),
>> insert_phones AS (
>>INSERT INTO phones ( phone_id, contact_id, phone_no )
>>SELECT nextval('phone_id_seq'), contact_id, '415-555-1212'
>>FROM new_cont_ids
>>RETURNING phone_id
>> ) ...
>>
>> I think you get the idea.  On 9.3 or later, this is the way to go.
>>
>>
>

​Parameter passing and variables are client-side considerations.  You
haven't told us how you plan to execute the SQL.

​IMO the most straight-forward API is a function.  Whether you implement
that function using a updating CTE or a sequence of separate SQL commands
is up to you to decide and, if performance matters, benchmark.

Comparing a CTE and function in general doesn't really do much good.  There
are many non-performance concerns involved and the specific usage pattern
involved will matter greatly in determining overhead.
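To make the "function as API" suggestion concrete for the contact/phone case discussed in this thread, a minimal sketch (table and sequence names follow Josh's earlier example; everything else is illustrative, not a definitive implementation):

```sql
-- Wrap the multi-table insert in one function; parameters replace
-- the hard-coded 'Joe' and phone number from the CTE example.
CREATE OR REPLACE FUNCTION add_contact(p_name text, p_phone_no text)
RETURNS integer AS
$$
DECLARE
  v_contact_id integer;
BEGIN
  INSERT INTO contacts (contact_id, name)
  VALUES (nextval('contacts_id_seq'), p_name)
  RETURNING contact_id INTO v_contact_id;

  INSERT INTO phones (phone_id, contact_id, phone_no)
  VALUES (nextval('phone_id_seq'), v_contact_id, p_phone_no);

  RETURN v_contact_id;
END;
$$ LANGUAGE plpgsql;
```

Any client (LibreOffice Base, Python, psql) then just issues `SELECT add_contact('Joe', '415-555-1212');`, and both inserts happen atomically inside the function.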

David J.


Re: [GENERAL] A motion

2016-01-27 Thread Berend Tober

Adrian Klaver wrote:

Motion:

The Coc  discussion be moved to its own list where those who care can
argue to their hearts content and leave the rest of us to deal with
technical questions. Upon a decision on said list the result be posted
to the Postgres web site for consideration.


Been suggested already, and rejected:

http://www.postgresql.org/message-id/56970135.6060...@computer.org








Re: [GENERAL] A contradiction in 13.2.1

2016-01-27 Thread David G. Johnston
On Wed, Jan 27, 2016 at 3:23 PM, Dane Foster  wrote:

> Hello,
>
> I'm trying to understand concurrency in PostgreSQL so I'm slowly reading
> through chapter 13 of the fine manual and I believe I've found a
> contradiction in section 13.2.1.
>
> ​My understanding of the second sentence of the first paragraph is that
> read committed mode never sees "changes committed during query execution by
> concurrent transactions". For example let's assume two transactions, A & B,
> and the following:
>
>- A started before B
>- B starts before A commits
>
> My understanding of the second sentence means that if A commits before B
> then any updates made by A will continue to be invisible to B because B's
> snapshot was before A committed. Now if I'm wrong about this then there is
> no contradiction forthcoming.
> The final sentence of the first paragraph is where I find the
> contradiction. It says: "Also note that two successive SELECT commands
> can see different data, even though they are within a single transaction,
> if other transactions commit changes after the first SELECT starts and
> before the second SELECT starts."
>
> So the mental model I've built based on the first four sentences of the
> first paragraph is that when a transaction starts in read committed mode a
> snapshot is taken of the (database) universe as it exists at the moment of
> its creation and that it's only updated by changes made by the transaction
> that created the snapshot. So for successive SELECTs to see different data
> because of updates outside of the transaction that created the snapshot is
> a contradiction.
>
> Now my guess is that I'm thinking about it all wrong so if someone in the
> know could shed some light on where/how my mental model breaks down I would
> appreciate it.
>
> Regards,
>
> Dane
>

​The main thing to remember is that "query != transaction".​

A1 - BEGIN;
​A1 - SELECT FROM a
B1 - BEGIN;
B2 - UPDATE a
B3 - COMMIT;
A2 - SELECT FROM a - again
A3 - COMMIT;

Since the commit in B3 occurs before the second select A2 in READ COMMITTED
the query A2 *will see* the update made in B2.  But B3 must complete in its
entirety for A2 to see it otherwise "it never sees [...] changes committed
during query execution by concurrent transactions".  The concurrency is
with the individual statement A2 and not the entire A transaction.  This is
why it is called "READ COMMITTED" because within transaction A externally
committed data is able to be read.
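The schedule above, made concrete with a hypothetical table `a(v int)` preloaded with one row `v = 1` (two psql sessions, interleaved top to bottom):

```sql
-- Session A                        -- Session B
BEGIN;                              -- A1
SELECT v FROM a;  -- A1: sees 1
                                    BEGIN;               -- B1
                                    UPDATE a SET v = 2;  -- B2
                                    COMMIT;              -- B3
SELECT v FROM a;  -- A2: sees 2, because B3 committed before A2 began
COMMIT;           -- A3
```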

David J.






Re: [GENERAL] Catalog bloat (again)

2016-01-27 Thread Bill Moran
On Wed, 27 Jan 2016 23:54:37 +0100
Ivan Voras  wrote:
> 
> I've done my Googling, and it looks like this is a fairly common problem.
> In my case, there's a collection of hundreds of databases (10 GB+) with
> apps which are pretty much designed (a long time ago) with heavy use of
> temp tables - so a non-trivial system.
> 
> The databases are vacuumed (not-full) daily, from cron

Vacuuming once a day is seldom often enough, except on very quiet
databases.

> (autovacuum was
> turned off some time ago for performance reasons), and still their size
> increases unexpectedly. By using some of the queries floating around on the
> wiki and stackoverflow[*], I've discovered that the bloat is not, as was
> assumed, in the user tables, but in the system tables, mostly in
> pg_attribute and pg_class.

The size increase isn't really unexpected. If you're only vacuuming once
per day, it's very easy for activity to cause active tables to bloat quite
a bit.

> This is becoming a serious problem, as I've seen instances of these tables
> grow to 6 GB+ (on a 15 GB total database), while still effectively
> containing on the order of 10.000 records or so. This is quite abnormal.
> 
> For blocking reasons, we'd like to avoid vacuum fulls on these tables (as
> it seems like touching them will lock up everything else).

It will. But to get them back down to a reasonable size, you're going to
have to do a VACUUM FULL at least _once_. If you retune things correctly,
you shouldn't need any more FULLs after that 1 time.

> So, question #1: WTF? How could this happen, on a regularly vacuumed
> system? Shouldn't the space be reused, at least after a VACUUM? The issue
> here is not the absolute existence of the bloat space, it's that it's
> constantly growing for *system* tables.

With a lot of activity, once a day probably isn't regular enough.

> Question #2: What can be done about it?

I highly recommend turning autovacuum back on, then tweaking the autovacuum
parameters to prevent any performance issues.

However, if you're dead set against autovacuum, find out (using the queries
that are available all over the internet) which tables are bloating the
worst, and schedule additional vacuums via cron that vacuum _only_ the
problem tables. How often is something that will require some guesswork
and/or experimenting, but I would recommend at least once per hour. Since
you're only vacuuming selected tables, the performance impact should be
minimal.

You'll have to do a VACUUM FULL on the bloated tables _once_ to get the size
back down, but if you pick a good schedule or use autovacuum with appropriate
settings, they shouldn't need a VACUUM FULL again after that.
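What the one-time compaction plus the recurring targeted vacuums might look like, assuming pg_attribute and pg_class are the problem tables (run as a superuser; the schedule is a starting point, not a recommendation):

```sql
-- One-time compaction: takes an ACCESS EXCLUSIVE lock, so schedule
-- a maintenance window or fail over to a replica first.
VACUUM FULL VERBOSE pg_catalog.pg_attribute;
VACUUM FULL VERBOSE pg_catalog.pg_class;

-- Afterwards, something like this run hourly from cron keeps the
-- bloat from coming back, with minimal impact on other activity:
VACUUM (VERBOSE, ANALYZE) pg_catalog.pg_attribute;
VACUUM (VERBOSE, ANALYZE) pg_catalog.pg_class;
```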

-- 
Bill Moran




Re: [GENERAL] Catalog bloat (again)

2016-01-27 Thread Jerry Sievers
Ivan Voras  writes:

> On 28 January 2016 at 00:13, Bill Moran  wrote:
>
> On Wed, 27 Jan 2016 23:54:37 +0100
> Ivan Voras  wrote:
>
> > So, question #1: WTF? How could this happen, on a regularly vacuumed
> > system? Shouldn't the space be reused, at least after a VACUUM? The issue
> > here is not the absolute existence of the bloat space, it's that it's
> > constantly growing for *system* tables.
>
> With a lot of activity, once a day probably isn't regular enough.
>
> I sort of see what you are saying. I'm curious, though, what goes wrong with 
> the following list of expectations:
>
>  1. Day-to-day load is approximately the same
>  2. So, at the end of the first day there will be some amount of bloat
>  3. Vacuum will mark that space re-usable
>  4. Within the next day, this space will actually be re-used
>  5. ... so the bloat won't grow.
>
> Basically, I'm wondering why is it growing after vacuums, not why it exists 
> in the first place?

Probably just a classic case of long-open transactions.

And/or vacuum running as an unprivileged user and thus can't vacuum
catalogs... perhaps  with a naive batch job launcher that sends stderr
to /dev/null.

>

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
p: 312.241.7800




Re: [GENERAL] Request - repeat value of \pset title during \watch interations

2016-01-27 Thread David G. Johnston
On Wed, Jan 27, 2016 at 9:13 PM, Michael Paquier 
wrote:

> On Thu, Jan 28, 2016 at 9:34 AM, David G. Johnston
>  wrote:
> > So how about:
> >
> > + snprintf(title, strlen(myopt.title) + 50,
> > + _("Watch every %lds\t%s\t%s"),
> > +  sleep, head_title, asctime(localtime()));
>
> I would just keep the timestamp and the title separated so what do you
> think about that instead?
> Watch every Xs   $timestamp
> $head_title
>

​That works.  I like having the title immediately above the table.

The other option that came to mind would be to place the time information
after the table display while leaving the title before it.  On an output
that requires more vertical space than is available in the terminal one
would no longer have to scroll up to confirm last execution time.  If doing
this I'd probably get rid of any logic that attempts to center the time
information on the table and simply leave it left-aligned.

David J.​


Re: [GENERAL] Multi-Table Insert/Update Strategy - Use Functions/Procedures?

2016-01-27 Thread Berend Tober

Don Parris wrote:

I have several tables...
and want db users to be able to add or update ...
... in one step, and get all the information
into the correct tables.

I think I am ok with setting the privileges on the tables and columns as
appropriate to allow each group to select, insert and update the
appropriate data, and I can create appropriate views for them ...

Ideally, the db user just says "I want to enter Joe Public, and Joe is
affiliated with the Widget Corp entity, and has the phone numbers..."

Am I on the right track, or is there some better way to set this up?  My
understanding is that views really aren't meant for insert/update
operations, and I have seen on the web that using views to insert/update
is a bit tricky - and still requires a procedure with a rule on the view.


The way I do it for the insert case is to define an INSTEAD OF INSERT 
trigger on the view:


CREATE OR REPLACE VIEW protected.bond_ask AS
 SELECT ...
   FROM private.bond_ask
   JOIN private.order_book ON ...
 ;

CREATE OR REPLACE FUNCTION protected.bond_ask_iit()
  RETURNS trigger AS
$BODY$
BEGIN
...

  INSERT INTO private.order_book (...)
VALUES (...) RETURNING order_book_id INTO new.order_book_id;

  INSERT INTO private.bond_ask (...)
VALUES (...)
RETURNING bond_id into new.bond_id;
  RETURN NEW;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE SECURITY DEFINER;

CREATE TRIGGER bond_ask_iit
  INSTEAD OF INSERT
  ON protected.bond_ask
  FOR EACH ROW
  EXECUTE PROCEDURE protected.bond_ask_iit();

And then grant insert privilege on the view.

You can probably do something similar for updates.
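A hypothetical sketch of the UPDATE case, following the same pattern (the column names `price` and `quantity` are illustrative, not from the schema above):

```sql
-- INSTEAD OF UPDATE: route view updates to the two base tables.
CREATE OR REPLACE FUNCTION protected.bond_ask_iut()
  RETURNS trigger AS
$BODY$
BEGIN
  UPDATE private.order_book
     SET price = NEW.price
   WHERE order_book_id = OLD.order_book_id;

  UPDATE private.bond_ask
     SET quantity = NEW.quantity
   WHERE bond_id = OLD.bond_id;

  RETURN NEW;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE SECURITY DEFINER;

CREATE TRIGGER bond_ask_iut
  INSTEAD OF UPDATE
  ON protected.bond_ask
  FOR EACH ROW
  EXECUTE PROCEDURE protected.bond_ask_iut();
```

Then grant UPDATE on the view, just as with the insert case.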

--B






Re: [GENERAL] Catalog bloat (again)

2016-01-27 Thread Joshua D. Drake

On 01/27/2016 03:37 PM, Ivan Voras wrote:



On 28 January 2016 at 00:13, Bill Moran wrote:

On Wed, 27 Jan 2016 23:54:37 +0100
Ivan Voras wrote:

> So, question #1: WTF? How could this happen, on a regularly vacuumed
> system? Shouldn't the space be reused, at least after a VACUUM? The issue
> here is not the absolute existence of the bloat space, it's that it's
> constantly growing for *system* tables.

With a lot of activity, once a day probably isn't regular enough.


I sort of see what you are saying. I'm curious, though, what goes wrong
with the following list of expectations:

 1. Day-to-day load is approximately the same
 2. So, at the end of the first day there will be some amount of bloat
 3. Vacuum will mark that space re-usable
 4. Within the next day, this space will actually be re-used
 5. ... so the bloat won't grow.

Basically, I'm wondering why is it growing after vacuums, not why it
exists in the first place?


If something is causing the autovacuum to be aborted you can have this 
problem.


JD


--
Command Prompt, Inc.  http://the.postgres.company/
 +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.




Re: [GENERAL] Catalog bloat (again)

2016-01-27 Thread Bill Moran
On Thu, 28 Jan 2016 00:37:54 +0100
Ivan Voras  wrote:

> On 28 January 2016 at 00:13, Bill Moran  wrote:
> 
> > On Wed, 27 Jan 2016 23:54:37 +0100
> > Ivan Voras  wrote:
> >
> > > So, question #1: WTF? How could this happen, on a regularly vacuumed
> > > system? Shouldn't the space be reused, at least after a VACUUM? The issue
> > > here is not the absolute existence of the bloat space, it's that it's
> > > constantly growing for *system* tables.
> >
> > With a lot of activity, once a day probably isn't regular enough.
> >
> >
> I sort of see what you are saying. I'm curious, though, what goes wrong
> with the following list of expectations:
> 
>1. Day-to-day load is approximately the same
>2. So, at the end of the first day there will be some amount of bloat
>3. Vacuum will mark that space re-usable
>4. Within the next day, this space will actually be re-used
>5. ... so the bloat won't grow.
> 
> Basically, I'm wondering why is it growing after vacuums, not why it exists
> in the first place?

To add to what others have said: are you 100% sure that vacuum is
completing successfully each time it runs? I.e. does your cron job
trap and report failures of vacuum to complete? If it fails occasionally
for whatever reason, it's liable to bloat a lot over 48 hours (i.e.
assuming it succeeds the next time).

Additionally, there's the problem with active transactions causing it to
not clean up quite everything.

Not to belabour the point, but these hiccups are best handled by enabling
autovacuum and allowing it to monitor tables and take care of them for you.
I'm curious of claims of autovacuum causing performance issues, as I've
never seen it have much impact. Generally, if you can't run autovacuum
due to performance issues, your hardware is undersized for your workload
and anything else you do is just going to have problems in a different way.

-- 
Bill Moran




Re: [GENERAL] Request - repeat value of \pset title during \watch interations

2016-01-27 Thread Michael Paquier
On Thu, Jan 28, 2016 at 9:34 AM, David G. Johnston
 wrote:
> So how about:
>
> + snprintf(title, strlen(myopt.title) + 50,
> + _("Watch every %lds\t%s\t%s"),
> +  sleep, head_title, asctime(localtime()));

I would just keep the timestamp and the title separated so what do you
think about that instead?
Watch every Xs   $timestamp
$head_title
-- 
Michael




Re: [GENERAL] Performance options for CPU bound multi-SUM query

2016-01-27 Thread David Rowley
On 28 January 2016 at 08:41, Matt  wrote:
> Moving from NUMERIC to FLOAT(8) did indeed lower query times by about 20%.
>
> I will try fixeddecimal and agg() as time permits.

That's surprisingly little gain. Please note that you'll not gain any
further improvement from the fixeddecimal type beyond what you'll have
already gotten from float8. My tests showed that it's very slightly
slower than float8, which is possibly due to float8 addition not
having overflow checks.
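A hypothetical micro-benchmark of the kind being discussed, for anyone who wants to reproduce the numeric-vs-float8 gap locally (run in psql; the table and sizes are illustrative):

```sql
-- Same data aggregated as numeric and as float8.
CREATE TEMP TABLE t AS
SELECT (random() * 100)::numeric(10,2) AS n
FROM generate_series(1, 1000000);

\timing on
SELECT sum(n)         FROM t;  -- numeric aggregation
SELECT sum(n::float8) FROM t;  -- float8: faster, but inexact, and the
                               -- result can vary with addition order
```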


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [GENERAL] Catalog bloat (again)

2016-01-27 Thread Scott Mead


--
Scott Mead via mobile
IPhone : +1-607-765-1395
Skype  : scottm.openscg
Gtalk: sco...@openscg.com

> On Jan 27, 2016, at 22:11, Joshua D. Drake  wrote:
> 
>> On 01/27/2016 03:37 PM, Ivan Voras wrote:
>> 
>> 
>> On 28 January 2016 at 00:13, Bill Moran wrote:
>>On Wed, 27 Jan 2016 23:54:37 +0100
>>    Ivan Voras wrote:
>> 
>>> So, question #1: WTF? How could this happen, on a regularly vacuumed
>>> system? Shouldn't the space be reused, at least after a VACUUM? The issue
>>> here is not the absolute existence of the bloat space, it's that it's
>>> constantly growing for *system* tables.
>> 
>>With a lot of activity, once a day probably isn't regular enough.
>> 
>> 
>> I sort of see what you are saying. I'm curious, though, what goes wrong
>> with the following list of expectations:
>> 
>> 1. Day-to-day load is approximately the same
>> 2. So, at the end of the first day there will be some amount of bloat
>> 3. Vacuum will mark that space re-usable
>> 4. Within the next day, this space will actually be re-used
>> 5. ... so the bloat won't grow.
>> 
>> Basically, I'm wondering why is it growing after vacuums, not why it
>> exists in the first place?
> 
> If something is causing the autovacuum to be aborted you can have this 
> problem.
Is it long-running transactions / idle in transaction / prepared xacts?

  Have you considered slowing down on temp tables?  Typically, when bleeding, 
it's good to find the wound and stitch it up instead of just getting more 
towels.


> 
> JD
> 
> 
> -- 
> Command Prompt, Inc.  http://the.postgres.company/
> +1-503-667-4564
> PostgreSQL Centered full stack support, consulting and development.
> 
> 




Re: [GENERAL] Multi-Table Insert/Update Strategy - Use Functions/Procedures?

2016-01-27 Thread Dane Foster
On Wed, Jan 27, 2016 at 4:25 PM, Joshua Berkus  wrote:

>
>
> > Am I on the right track, or is there some better way to set this up?  My
> > understanding is that views really aren't meant for insert/update
> > operations, and I have seen on the web that using views to insert/update
> is
> > a bit tricky - and still requires a procedure with a rule on the view.
>
> Why not use updatable CTEs?  That's what they're for.
>
> WITH update_contact as (
>    INSERT INTO contacts ( contact_id, name )
>    VALUES ( nextval('contacts_id_seq'), 'Joe' )
>    RETURNING contact_id ),
> new_cont_ids AS (
>    SELECT contact_id FROM update_contact
> ),
> insert_phones AS (
>    INSERT INTO phones ( phone_id, contact_id, phone_no )
>    SELECT nextval('phone_id_seq'), contact_id, '415-555-1212'
>    FROM new_cont_ids
>    RETURNING phone_id
> ) ...
>
> I think you get the idea.  On 9.3 or later, this is the way to go.
>
> --
> Josh Berkus
> Red Hat OSAS
> (opinions are my own)
>
>
In general, do updatable CTEs have lower overhead than functions?

Dane


Re: [GENERAL] Multi-Table Insert/Update Strategy - Use Functions/Procedures?

2016-01-27 Thread Don Parris
On Wed, Jan 27, 2016 at 4:25 PM, Joshua Berkus  wrote:

>
>
> > Am I on the right track, or is there some better way to set this up?  My
> > understanding is that views really aren't meant for insert/update
> > operations, and I have seen on the web that using views to insert/update
> is
> > a bit tricky - and still requires a procedure with a rule on the view.
>
> Why not use updatable CTEs?  That's what they're for.
>
>
Sounds great.  But can I use variables, and allow the db user to enter the
data when the CTE is called?  I've used variables in Python scripts for
insert/update/delete, but honestly, I've never used a variable in my
queries in PostgreSQL.  So, instead of 'Joe', as in your example below,
maybe something like first_name?


> WITH update_contact as (
>    INSERT INTO contacts ( contact_id, name )
>    VALUES ( nextval('contacts_id_seq'), 'Joe' )
>    RETURNING contact_id ),
> new_cont_ids AS (
>    SELECT contact_id FROM update_contact
> ),
> insert_phones AS (
>    INSERT INTO phones ( phone_id, contact_id, phone_no )
>    SELECT nextval('phone_id_seq'), contact_id, '415-555-1212'
>    FROM new_cont_ids
>    RETURNING phone_id
> ) ...
>
> I think you get the idea.  On 9.3 or later, this is the way to go.
>
>
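
For instance, is something like this (an untested sketch, reusing your
table and sequence names) the right way to parameterize it?

```sql
-- Untested sketch: wrapping the CTE chain in a function so callers
-- supply first_name and a phone number as parameters.
CREATE OR REPLACE FUNCTION add_contact(first_name text, phone text)
RETURNS integer AS $$
    WITH new_contact AS (
        INSERT INTO contacts ( contact_id, name )
        VALUES ( nextval('contacts_id_seq'), first_name )
        RETURNING contact_id
    ), new_phone AS (
        INSERT INTO phones ( phone_id, contact_id, phone_no )
        SELECT nextval('phone_id_seq'), contact_id, phone
        FROM new_contact
    )
    SELECT contact_id FROM new_contact;
$$ LANGUAGE sql;

-- From a client driver, the bare CTE works too, with $1/$2
-- placeholders in a prepared statement instead of function arguments.
-- SELECT add_contact('Joe', '415-555-1212');
```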


Re: [GENERAL] Catalog bloat (again)

2016-01-27 Thread Ivan Voras
On 28 January 2016 at 00:13, Bill Moran  wrote:

> On Wed, 27 Jan 2016 23:54:37 +0100
> Ivan Voras  wrote:
>
> > So, question #1: WTF? How could this happen, on a regularly vacuumed
> > system? Shouldn't the space be reused, at least after a VACUUM? The issue
> > here is not the absolute existence of the bloat space, it's that it's
> > constantly growing for *system* tables.
>
> With a lot of activity, once a day probably isn't regular enough.
>
>
I sort of see what you are saying. I'm curious, though, what goes wrong
with the following list of expectations:

   1. Day-to-day load is approximately the same
   2. So, at the end of the first day there will be some amount of bloat
   3. Vacuum will mark that space re-usable
   4. Within the next day, this space will actually be re-used
   5. ... so the bloat won't grow.

Basically, I'm wondering why is it growing after vacuums, not why it exists
in the first place?
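
For reference, this is roughly how I'm watching the bloat (standard
statistics views; nothing exotic). If n_dead_tup falls after each vacuum
but total size still grows day over day, the freed space isn't being
reused:

```sql
-- Dead tuples vs. physical size of the system catalogs, largest first.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       pg_size_pretty(pg_total_relation_size(
           ('pg_catalog.' || relname)::regclass)) AS total_size
FROM pg_stat_sys_tables
WHERE schemaname = 'pg_catalog'
ORDER BY pg_total_relation_size(('pg_catalog.' || relname)::regclass) DESC
LIMIT 10;
```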


Re: [GENERAL] Request - repeat value of \pset title during \watch interations

2016-01-27 Thread David G. Johnston
On Mon, Jan 11, 2016 at 9:55 PM, Michael Paquier 
wrote:

> On Tue, Jan 12, 2016 at 1:15 AM, David G. Johnston
>  wrote:
> > On Mon, Jan 11, 2016 at 9:03 AM, Tom Lane  wrote:
> >>
> >> "David G. Johnston"  writes:
> >> > On Mon, Jan 11, 2016 at 8:14 AM, Tom Lane  wrote:
> >> >> Perhaps we should replace the "Watch every Ns" text by the user-given
> >> >> title if a title has been set?  That would conserve screen space.
> >>
> >> > The extra line doesn't both me and given the length of the timestamp I
> >> > suspect many titles would cause the combined line length to exceed
> >> > terminal
> >> > width and cause wrapping anyway.  In my specific case it would though
> I
> >> > am
> >> > using an abnormally narrow width.
> >>
> >> You speak as though the title will be chosen without any regard for the
> >> context it's used in, which I rather doubt.  Wouldn't people pick the
> >> title for a \watch query so that it fits?  (In any case they could
> >> force the issue by including a \n in their title...)
> >>
> >
> > True that.
> >
> > I don't have a strong opinion either way.  Having a single, condensed,
> title
> > line would be nice though using two in order to not be cryptic has its
> own
> > appeal.
>
> Looking at that, I just hacked the attached, which enforces \n at
> the end of the user-defined title (that's easily changeable):
> =# \pset title 'foo bar'
> Title is "foo bar".
> =# select 1;
>  foo bar
>  ?column?
> ----------
>         1
> (1 row)
> =# \watch 1
> foo bar
> Watch every 1s        Tue Jan 12 13:54:04 2016
>
>  ?column?
> ----------
>         1
> (1 row)
> --
> Michael
>

On the code side, calling it "caption" would make for less confusing
reading, since "title" is already defined and "head_title" just doesn't
distinguish enough.

There doesn't seem to be any way to let the user decide (by adding a
newline to the end themselves), so a decision will have to be reached.

So how about:

+ snprintf(title, strlen(myopt.title) + 50,
+  _("Watch every %lds\t%s\t%s"),
+  sleep, head_title, asctime(localtime(&timer)));

David J


[GENERAL] pgpool II, streaming replication and HA

2016-01-27 Thread Alexander Pyhalov

Hi.

I'd like to setup HA PostgreSQL streaming replication cluster with two 
nodes. The main issue which I'd like to protect from is HW node failure 
or reboot. I supposed to use repmgr to control PostgreSQL cluster and 
two pgpool instances as load balancers/DB proxies.


The issue which I still don't understand how to handle is temporary 
master failure. Let's say, we have host A (master) and host B (slave). 
Now A reboots. Pgpool (or repmgr, doesn't matter) promotes B to master. 
Now A comes online (and still considers itself master). How can Pgpool 
determine that A host is "bad one" and stop routing queries to A?

--
Best regards,
Alexander Pyhalov,
system administrator of Southern Federal University IT department




Re: [GENERAL] Performance options for CPU bound multi-SUM query

2016-01-27 Thread Matt
Moving from NUMERIC to FLOAT(8) did indeed lower query times by about 
20%.


I will try fixeddecimal and agg() as time permits.


On 25 Jan 2016, at 4:44, David Rowley wrote:


On 25 January 2016 at 15:45, Matt  wrote:

I have a warehousing case where data is bucketed by a key of an hourly
timestamp and 3 other columns. In addition there are 32 numeric columns.
The tables are partitioned on regular date ranges, and aggregated to the
lowest resolution usable.

The principal use case is to select over a range of dates or hours, filter
by the other key columns, and SUM() all 32 of the other columns. The
execution plan shows the primary key index limits row selection
efficiently, but the query appears CPU bound in performing all of those
32 SUM() aggregates.



SUM(numeric) also has to work quite a bit harder than an aggregate
like sum(float8), since the addition in numeric is implemented in
software.
It depends on the use case, but for some cases the float4 or float8
types might be an option and it will offer much faster aggregation.
There is also https://github.com/2ndQuadrant/fixeddecimal which may be
of some use if you need fixed precision up to a predefined scale. We
found that using fixeddecimal instead of numeric for the TPC-H
benchmark improved performance of query 1 significantly.
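
The ~20% Matt measured is this tradeoff in miniature. Outside the
database, the same contrast shows up between IEEE doubles (what float8
is) and software decimal arithmetic (roughly what numeric does). A
sketch in Python, not a PostgreSQL benchmark:

```python
from decimal import Decimal

# Hardware binary floating point (float8-equivalent): fast, but 0.1 has
# no exact binary representation, so a long SUM() drifts slightly.
float_sum = sum([0.1] * 1_000_000)

# Software decimal arithmetic (numeric-equivalent): exact, but every
# addition runs in software, which is why SUM(numeric) costs more CPU.
dec_sum = sum([Decimal("0.1")] * 1_000_000)

print(float_sum)  # close to, but not exactly, 100000.0
print(dec_sum)    # exactly 100000.0
assert float_sum != 100000.0
assert dec_sum == Decimal("100000.0")
```

Whether that drift matters depends on the use case, which is why the
float types are only "an option" and not a drop-in replacement.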

--
David Rowley   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services





[GENERAL] Multi-Table Insert/Update Strategy - Use Functions/Procedures?

2016-01-27 Thread Don Parris
I have several tables related to people and their contact information, and
want db users to be able to add or update a given person and their
respective contact information in one step, and get all the information
into the correct tables.

I think I am ok with setting the privileges on the tables and columns as
appropriate to allow each group to select, insert and update the
appropriate data, and I can create appropriate views for them to view data
they need to see. However, I am not really sure about the best way to allow
someone to actually insert/update the data.

For instance, given the following tables:
core.category
contact.entity
contact.person
contact.entity_category --linking table between entity and category
contact.person_entity --linking table between entity & person
... --other tables for address and phone contact info

I haven't really given much thought as to how such a procedure might look,
but I'm guessing something along the lines of:
CREATE FUNCTION record_insert() RETURNS integer AS $$
BEGIN
INSERT statements... --need PK from entity & category tables to insert into
entity_category table.
END
$$
language plpgsql;

Ideally, the db user just says "I want to enter Joe Public, and Joe is
affiliated with the Widget Corp entity, and has the phone numbers..."

Am I on the right track, or is there some better way to set this up?  My
understanding is that views really aren't meant for insert/update
operations, and I have seen on the web that using views to insert/update is
a bit tricky - and still requires a procedure with a rule on the view.


Thanks,
Don
-- 
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
http://dcparris.net/ 

GPG Key ID: F5E179BE


Re: [GENERAL] Multi-Table Insert/Update Strategy - Use Functions/Procedures?

2016-01-27 Thread Bret Stern
On this track (possibly unrelated)... can a view be used as part of the
multi-table update?
Just curious.


 On Wed, 2016-01-27 at 14:48 -0500, Don Parris wrote:
> I have several tables related to people and their contact information,
> and want db users to be able to add or update a given person and their
> respective contact information in one step, and get all the
> information into the correct tables.
> 
> 
> 
> I think I am ok with setting the privileges on the tables and columns
> as appropriate to allow each group to select, insert and update the
> appropriate data, and I can create appropriate views for them to view
> data they need to see. However, I am not really sure about the best
> way to allow someone to actually insert/update the data.
> 
> 
> 
> For instance, given the following tables:
> 
> 
> core.category
> contact.entity
> 
> 
> contact.person
> 
> contact.entity_category --linking table between entity and category
> 
> 
> contact.person_entity --linking table between entity & person
> 
> ... --other tables for address and phone contact info
> 
> 
> 
> I haven't really given much thought as to how such a procedure might
> look, but I'm guessing something along the lines of:
> CREATE FUNCTION record_insert() RETURNS integer AS $$
> 
> BEGIN
> INSERT statements... --need PK from entity & category tables to insert
> into entity_category table.
> 
> END
> 
> $$
> 
> language plpgsql;
> 
> 
> Ideally, the db user just says "I want to enter Joe Public, and Joe is
> affiliated with the Widget Corp entity, and has the phone numbers..."
> 
> 
> 
> Am I on the right track, or is there some better way to set this up?
> My understanding is that views really aren't meant for insert/update
> operations, and I have seen on the web that using views to
> insert/update is a bit tricky - and still requires a procedure with a
> rule on the view.
> 
> 
> 
> Thanks,
> 
> Don
> 
> -- 
> 
> D.C. Parris, FMP, Linux+, ESL Certificate
> Minister, Security/FM Coordinator, Free Software Advocate
> 
> http://dcparris.net/
> GPG Key ID: F5E179BE



Re: [GENERAL] Multi-Table Insert/Update Strategy - Use Functions/Procedures?

2016-01-27 Thread David G. Johnston
On Wed, Jan 27, 2016 at 1:28 PM, Bret Stern <
bret_st...@machinemanagement.com> wrote:

> On this track (possibly unrelated)...can a view be used as part of the
> multi table update
> Just curious
>

Never done so myself, but I believe so, though doing it over a one-to-many
relationship can be tricky.

That said, it's probably no trickier than doing it via a function.  In both
scenarios you need to use arrays to approximate sets of records and then
decompose the array to insert the relevant records into the "many" table.
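
A sketch of that array approach (table, column, and sequence names
borrowed from Josh's example elsewhere in the thread):

```sql
-- Sketch: pass the "many" side as an array and decompose it with
-- unnest(), inserting one phones row per array element.
WITH new_contact AS (
    INSERT INTO contacts ( contact_id, name )
    VALUES ( nextval('contacts_id_seq'), 'Joe' )
    RETURNING contact_id
)
INSERT INTO phones ( phone_id, contact_id, phone_no )
SELECT nextval('phone_id_seq'), c.contact_id, p.phone_no
FROM new_contact AS c,
     unnest(ARRAY['415-555-1212', '415-555-1313']) AS p(phone_no);
```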

David J.


Re: [GENERAL] Multi-Table Insert/Update Strategy - Use Functions/Procedures?

2016-01-27 Thread Joshua Berkus


> Am I on the right track, or is there some better way to set this up?  My
> understanding is that views really aren't meant for insert/update
> operations, and I have seen on the web that using views to insert/update is
> a bit tricky - and still requires a procedure with a rule on the view.

Why not use updatable CTEs?  That's what they're for.

WITH update_contact as (
   INSERT INTO contacts ( contact_id, name )
   VALUES ( nextval('contacts_id_seq'), 'Joe' )
   RETURNING contact_id ),
new_cont_ids AS (
   SELECT contact_id FROM update_contact
),
insert_phones AS (
   INSERT INTO phones ( phone_id, contact_id, phone_no )
   SELECT nextval('phone_id_seq'), contact_id, '415-555-1212'
   FROM new_cont_ids
   RETURNING phone_id
) ...

I think you get the idea.  On 9.3 or later, this is the way to go.
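
One detail worth spelling out: the data-modifying CTEs only run when the
trailing "..." becomes a primary statement closing the WITH. Something
like this (bodies elided, just the shape):

```sql
-- The WITH list must end in a primary statement that ties it off,
-- e.g. a SELECT over the RETURNING results of the inner CTEs.
WITH update_contact AS ( /* ... */ ),
     new_cont_ids AS ( /* ... */ ),
     insert_phones AS ( /* ... */ )
SELECT c.contact_id, p.phone_id
FROM new_cont_ids AS c, insert_phones AS p;
```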

-- 
Josh Berkus
Red Hat OSAS
(opinions are my own)




Re: [GENERAL] pgpool II, streaming replication and HA

2016-01-27 Thread Alexander Pyhalov

On 01/27/2016 11:53, Tatsuo Ishii wrote:


There's a file called "pool_status" which records the previous status
of the DB servers. So the file records that "A is down, B is up". In
your case, when A reboots, the old primary server on A may start as a
primary, but from the file pgpool knows that the PostgreSQL on A was
down and never automatically regards it online. So any query sent to
pgpool will be routed to the new primary running on B, never routed to
A.

To make the old primary on A usable, you need to shutdown the
PostgreSQL (if it's running) and turn it to a new standby connecting
to the current primary on B. You can do it by either "online recovery"
feature of pgpool or do it manually using pg_basebackup etc. In the
latter case, you need to "attach" the new standby to make it online
by using pcp_attach_node.


Thank you for clarification.


--
Best regards,
Alexander Pyhalov,
system administrator of Southern Federal University IT department




Re: [GENERAL] pgpool II, streaming replication and HA

2016-01-27 Thread Tatsuo Ishii
I don't know anything about repmgr, so I'll leave the repmgr-specific questions aside.

> I'd like to setup HA PostgreSQL streaming replication cluster with two
> nodes. The main issue which I'd like to protect from is HW node
> failure or reboot. I supposed to use repmgr to control PostgreSQL
> cluster and two pgpool instances as load balancers/DB proxies.
> 
> The issue which I still don't understand how to handle is temporary
> master failure. Let's say, we have host A (master) and host B
> (slave). Now A reboots. Pgpool (or repmgr, doesn't matter) promotes B
> to master. Now A comes online (and still considers itself master). How
> can Pgpool determine that A host is "bad one" and stop routing queries
> to A?

There's a file called "pool_status" which records the previous status
of the DB servers. So the file records that "A is down, B is up". In
your case, when A reboots, the old primary server on A may start as a
primary, but from the file pgpool knows that the PostgreSQL on A was
down and never automatically regards it online. So any query sent to
pgpool will be routed to the new primary running on B, never routed to
A.

To make the old primary on A usable, you need to shutdown the
PostgreSQL (if it's running) and turn it to a new standby connecting
to the current primary on B. You can do it by either "online recovery"
feature of pgpool or do it manually using pg_basebackup etc. In the
latter case, you need to "attach" the new standby to make it online
by using pcp_attach_node.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

