Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Brendan Jurd
On 15 June 2013 14:43, Craig Ringer cr...@2ndquadrant.com wrote:
 The #1 question I see on Stack Overflow has to be confusion about
 pg_hba.conf, mostly from people who have no idea it exists, don't understand
 how to configure it, etc.

The totally non-obvious name of the file probably has something to do
with that.  It should be called 'auth.conf'.

Cheers,
BJ




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Craig Ringer
On 06/15/2013 02:08 PM, Brendan Jurd wrote:
 On 15 June 2013 14:43, Craig Ringer cr...@2ndquadrant.com wrote:
 The #1 question I see on Stack Overflow has to be confusion about
 pg_hba.conf, mostly from people who have no idea it exists, don't understand
 how to configure it, etc.
 The totally non-obvious name of the file probably has something to do
 with that.  It should be called 'auth.conf'.
Not convinced; since it only controls one facet of auth - it doesn't
define users, passwords, grants, etc - that'd probably be just as confusing.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Brendan Jurd
On 15 June 2013 16:18, Craig Ringer cr...@2ndquadrant.com wrote:
 On 06/15/2013 02:08 PM, Brendan Jurd wrote:
 On 15 June 2013 14:43, Craig Ringer cr...@2ndquadrant.com wrote:
 The #1 question I see on Stack Overflow has to be confusion about
 pg_hba.conf, mostly from people who have no idea it exists, don't understand
 how to configure it, etc.
 The totally non-obvious name of the file probably has something to do
 with that.  It should be called 'auth.conf'.
 Not convinced; since it only controls one facet of auth - it doesn't
 define users, passwords, grants, etc ...

When somebody is setting up postgres for the first time, and they list
the contents of the config directory, you want them to have some idea
what each of the files is for.  If they see something called
'auth.conf', they'll get the right general idea.  An understanding of
the nuances (like that it doesn't control user accounts) will come
once they open up the file -- which they may well do, because it is
called 'auth.conf', and 'auth' is a thing you want to configure.

If they see something called 'pg_hba.conf', they may very reasonably
assume that it is some internal/advanced stuff that they don't need to
worry about just yet, because what the heck is a 'pg_hba'?  The 'pg'
is unnecessary and the 'hba' is an internal jargon term that we've
ill-advisedly allowed to leak out into the filename.

If you really feel that 'auth.conf' is too imprecise, maybe something
like 'conn-auth.conf' would be more your style.

Cheers,
BJ




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Stefan Drees

On 2013-06-15 08:44 CEST, Brendan Jurd wrote:

On 15 June 2013 16:18, Craig Ringer ... wrote:

On 06/15/2013 02:08 PM, Brendan Jurd wrote:

On 15 June 2013 14:43, Craig Ringer ... wrote:

The #1 question I see on Stack Overflow has to be confusion about
pg_hba.conf, mostly from people who have no idea it exists, don't understand
how to configure it, etc.

The totally non-obvious name of the file probably has something to do
with that.  It should be called 'auth.conf'.

Not convinced; since it only controls one facet of auth - it doesn't
define users, passwords, grants, etc ...


When somebody is setting up postgres for the first time, and they list
the contents of the config directory, you want them to have some idea


as they may not have read up to section 19.1 "The pg_hba.conf File" inside 
chapter 19 "Client Authentication" of part III "Server Administration" :-?, 
which states (as of 9.2.4):


Client authentication is controlled by a configuration file, which 
traditionally is named pg_hba.conf and is stored in the database 
cluster's data directory. (HBA stands for host-based authentication.) A 
default pg_hba.conf file is installed when the data directory is 
initialized by initdb. It is possible to place the authentication 
configuration file elsewhere, however; see the hba_file configuration 
parameter. ...



;-) Thanks to hyperlinks this is quite close to the start, but I was 
surprised that I did not find it by skimming the text and following the 
hyperlinks, but only by already knowing the filename and entering it 
(pg_hba.conf) into the "Search Documentation" text field in the top 
right corner of http://www.postgresql.org/docs/9.2/interactive/index.html.
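
(For what it's worth, a running server will also tell you where its 
client-authentication file lives; a minimal SQL sketch, assuming you can 
already connect as a superuser:

  SHOW hba_file;
  -- or equivalently:
  SELECT setting FROM pg_settings WHERE name = 'hba_file';

so at least the file is discoverable from inside the database, if not 
from the docs.)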


Maybe we could find a better place for the whatever-its-name-will-be file 
inside the part of the docs that even people in a TL;DR mood might read? A 
paragraph or two, spiced up with some of the catchy StackOverflow-inspired 
terms that people who need to configure this authentication aspect might 
expect, could also go into INSTALL-like docs, or be directly reachable on 
the hyperlinked path from part I "Tutorial", chapter 1 "Getting Started", 
section 1.1 "Installation", all the way down to chapter 15 "Installation 
from Source Code". But of course only if this is wanted behavior.


If I read section 1.1 "Installation" (again 9.2.4), I have the 
impression that in our case it mostly conveys the message "you are the 
site admin, deal with it, read the docs" -- or am I not reading it 
right? (I am a non-native English reader.)



what each of the files is for.  If they see something called
'auth.conf', they'll get the right general idea.  An understanding of
the nuances (like that it doesn't control user accounts) will come
once they open up the file -- which they may well do, because it is
called 'auth.conf', and 'auth' is a thing you want to configure.


That may well be; I do not know how people who prefer reading folder 
and file names over the manuals written for them grok text, as I read 
the docs, promised ;-)



If they see something called 'pg_hba.conf', they may very reasonably
assume that it is some internal/advanced stuff that they don't need to
worry about just yet, because what the heck is a 'pg_hba'?  The 'pg'
is unnecessary and the 'hba' is an internal jargon term that we've
ill-advisedly allowed to leak out into the filename.


Around 1995, when I started using Postgres95, it sure took some time to 
find that pg_hba.conf file, but I then perceived it to be very well 
documented, and also felt a bit guilty, as its name occurred in the 
INSTALL file, cf. 
ftp://ftp-archives.postgresql.org/pub/source/v7.2/postgresql-7.2.tar.gz 
and the INSTALL file therein, buried inside step 1 of "If You Are 
Upgrading" ...



If you really feel that 'auth.conf' is too imprecise, maybe something
like 'conn-auth.conf' would be more your style.


I think you guys did and still do a fantastic job with PostgreSQL and 
esp. its documentation, but in this case I doubt that any renaming of 
config files will really have an impact on usability in the shady area 
of TL;DR -- at least for the next twenty years or so -- as it still 
holds that from a false start (e.g. not reading the documentation that 
was written) anything may follow.


But as usability is a practical concern, I (as a user) would be +0 on 
renaming it, if the scenario we actually want is people who fail to find 
it under the old name finding and then editing it under the new one.


All the best,
Stefan.





Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Sawada Masahiko
On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
 Hello,

 We have already started a discussion on pgsql-hackers for the problem of
 taking fresh backup during the failback operation here is the link for that:


 http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtb
 jgwrfu513...@mail.gmail.com

 Let me again summarize the problem we are trying to address.

 When the master fails, last few WAL files may not reach the standby. But
 the master may have gone ahead and made changes to its local file system
 after  flushing WAL to the local storage.  So master contains some file
 system level changes that standby does not have.  At this point, the data
 directory of  master is ahead of standby's data directory.
 Subsequently, the standby will be promoted as new master.  Later when the
 old master wants to be a standby of the new master, it can't just join the
 setup since there is inconsistency in between these two servers. We need
 to take the fresh backup from the new master.  This can happen in both the
 synchronous as well as asynchronous replication.

 Fresh backup is also needed in case of clean switch-over because in the
 current HEAD, the master does not wait for the standby to receive all the
 WAL
 up to the shutdown checkpoint record before shutting down the connection.
 Fujii Masao has already submitted a patch to handle clean switch-over case,
 but the problem is still remaining for failback case.

 The process of taking fresh backup is very time consuming when databases
 are of very big sizes, say several TB's, and when the servers are connected
 over a relatively slower link.  This would break the service level
 agreement of disaster recovery system.  So there is need to improve the
 process of
 disaster recovery in PostgreSQL.  One way to achieve this is to maintain
 consistency between master and standby which helps to avoid need of fresh
 backup.

 So our proposal on this problem is that we must ensure that master should
 not make any file system level changes without confirming that the
 corresponding WAL record is replicated to the standby.

   How will you take care of extra WAL on old master during recovery. If it
 plays the WAL which has not reached new-master, it can be a problem.

You mean that it is possible for the old master's data to be ahead of the
new master's data, so there would be inconsistent data between those
servers at failback, right?
If so, that inconsistency is not possible, because if you use the GUC option
as he proposes (i.e., failback_safe_standby_mode = remote_flush), then
while the old master is working fine, no file-system-level changes are
made before the WAL is replicated.

--
Regards,

---
Sawada Masahiko




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Joshua D. Drake


On 06/14/2013 11:18 PM, Craig Ringer wrote:


On 06/15/2013 02:08 PM, Brendan Jurd wrote:

On 15 June 2013 14:43, Craig Ringer cr...@2ndquadrant.com wrote:

The #1 question I see on Stack Overflow has to be confusion about
pg_hba.conf, mostly from people who have no idea it exists, don't understand
how to configure it, etc.

The totally non-obvious name of the file probably has something to do
with that.  It should be called 'auth.conf'.

Not convinced; since it only controls one facet of auth - it doesn't
define users, passwords, grants, etc - that'd probably be just as confusing.



Yeah, this one is not making the grade. pg_hba is just that, host-based 
auth, but I think we are bikeshedding now.


JD





Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Joshua D. Drake


On 06/14/2013 11:44 PM, Brendan Jurd wrote:


If they see something called 'pg_hba.conf', they may very reasonably
assume that it is some internal/advanced stuff that they don't need to
worry about just yet, because what the heck is a 'pg_hba'?  The 'pg'


Only the uneducated. Look, I am not trying to be an ass, but seriously: 
read the docs. I will argue vigorously against the idea of us designing 
a system around people NOT reading the docs.


JD






Re: [HACKERS] MD5 aggregate

2013-06-15 Thread Dean Rasheed
On 13 June 2013 10:35, Dean Rasheed dean.a.rash...@gmail.com wrote:
 Hi,

 Attached is a patch implementing a new aggregate function md5_agg() to
 compute the aggregate MD5 sum across a number of rows. This is
 something I've wished for a number of times. I think the primary use
 case is to do a quick check that 2 tables, possibly on different
 servers, contain the same data, using a query like

   SELECT md5_agg(foo.*::text) FROM (SELECT * FROM foo ORDER BY id) foo;

 or

   SELECT md5_agg(foo.*::text ORDER BY id) FROM foo;

There seem to be 2 separate directions that this could go, which
really meet different requirements:

1). Produce an unordered sum for SQL to compare 2 tables regardless of
the order in which they are scanned. A possible approach to this might
be something like an aggregate

md5_total(text/bytea) returns text

that returns the sum of the md5 values of each input value, treating
each md5 value as an unsigned 128-bit integer, and then producing the
hexadecimal representation of the final sum. This should out-perform a
solution based on numeric addition, and in typical cases, the result
wouldn't be much longer than a regular md5 sum, and so would be easy
to eyeball for differences.
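
As a point of comparison, here is a rough sketch of the kind of
order-independent check that is possible today with only built-in
functions -- much weaker than summing full 128-bit MD5 values, since it
only sums the first 32 bits of each row's hash ('foo' is just an example
table):

  SELECT sum(('x' || left(md5(foo::text), 8))::bit(32)::int)
  FROM foo;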

2). Produce an ordered MD5 sum compatible with COPY, whose result
would match that of running unix md5sum on the COPY output. Given all
the possible COPY options that would affect the result (format,
delimiters, headers, quoting, escaping, ...), I think that such a
thing would only reasonably be possible as an extension to the COPY
command itself.

I guess in its simplest form this would just be a new option MD5 to
COPY that would cause it to pipe its output to the md5 aggregator and
then send the final sum to the COPY destination at the end (e.g.,
COPY foo TO STDOUT MD5 would produce the ordered MD5 sum of the data
in foo).


I still think my original md5_agg() has its uses, since what it
produces is comparable with external md5 sums, and is directly
available to SQL, but (1) is probably the most useful for quickly
comparing 2 tables. I'm much less convinced about the value of (2),
but on the face of it, it doesn't seem like it would be hard to
implement.

Thoughts?

Regards,
Dean




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Craig Ringer
On 06/15/2013 03:53 PM, Joshua D. Drake wrote:

 Yeah this one is not making the grade. pg_hba is just that host based
 auth but I think we are bikeshedding now.

Agreed... Even as I posted, I realised I shouldn't have mentioned the
last point, since everything else has been ignored.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Martijn van Oosterhout
On Sat, Jun 15, 2013 at 12:43:10PM +0800, Craig Ringer wrote:
 Bloat
 --
 
 Table bloat. Table bloat has been a major issue with PostgreSQL
 users/admins for years. Anyone care to explain to me in a simple
 paragraph how to find out if you have table or index bloat issues in
 your database and what to do about it? (Maybe we need
 pg_catalog.pg_index_bloat and pg_catalog.pg_table_bloat views
 including FILLFACTOR correction?)
 
 I think I'll draft up a patch to add exactly that.

Nice list btw. I monitor this by using the excellent check_postgres
Nagios plugin, which has stuff to check for things like this.
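
(A minimal sketch of the kind of first-pass check that is already possible
from plain SQL -- dead-tuple counters are only a rough proxy for bloat and
say nothing about index bloat or fillfactor:

  SELECT relname, n_live_tup, n_dead_tup,
         round(100.0 * n_dead_tup / greatest(n_live_tup + n_dead_tup, 1), 1)
           AS dead_pct
  FROM pg_stat_user_tables
  ORDER BY n_dead_tup DESC
  LIMIT 10;

which is exactly the sort of thing a proper pg_table_bloat view could do
more accurately.)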

Which makes me think that it might be possible to add some other checks
like this, in for example pg_ctl.  A big fat warning 'your data may be
eaten' might get noticed at startup.

(A minor annoyance is that in recent versions of PostgreSQL you have to
give check_postgres admin rights, otherwise it can't warn you about
idle-in-transaction problems.)
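
(For reference, the condition being checked can be expressed directly in
SQL on 9.2 and later:

  SELECT pid, usename, now() - xact_start AS xact_age
  FROM pg_stat_activity
  WHERE state = 'idle in transaction'
  ORDER BY xact_age DESC;

the privileges issue above is presumably about seeing other users' sessions.)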

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Andres Freund
On 2013-06-14 21:56:52 -0400, Robert Haas wrote:
 I don't think we need it.  I think what we need is to decide is which
 algorithm is legally OK to use.  And then put it in.
 
 In the past, we've had a great deal of speculation about that legal
 question from people who are not lawyers.  Maybe it would be valuable
 to get some opinions from people who ARE lawyers.  Tom and Heikki both
 work for real big companies which, I'm guessing, have substantial
 legal departments; perhaps they could pursue getting the algorithms of
 possible interest vetted.  Or, I could try to find out whether it's
 possible do something similar through EnterpriseDB.

I personally don't think the legal argument holds all that much water
for snappy and lz4. But then the opinion of a European non-lawyer doesn't
hold much weight either.
Both are widely used by a large number of open and closed projects, some of
which have patent grant clauses in their licenses. E.g. Hadoop and
Cassandra use lz4, and I'd be surprised if the companies behind those
have opened themselves up to litigation.

I think we should preliminarily decide which algorithm to use before we
get lawyers involved. I'd be surprised if they can make such an analysis
faster than we can rule out one of them via benchmarks.

Will post an updated patch that includes lz4 as well.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Craig Ringer

On 06/15/2013 05:57 PM, Martijn van Oosterhout wrote:
 On Sat, Jun 15, 2013 at 12:43:10PM +0800, Craig Ringer wrote:
 Bloat --
 
 Table bloat. Table bloat has been a major issue with PostgreSQL 
 users/admins for years. Anyone care to explain to me in a simple 
 paragraph how to find out if you have table or index bloat issues
 in your database and what to do about it? (Maybe we need 
 pg_catalog.pg_index_bloat and pg_catalog.pg_table_bloat
 views including FILLFACTOR correction?)
 
 I think I'll draft up a patch to add exactly that.
 
 Nice list btw. I monitor this by using the excellent check_progres 
 nagios plugin, which has stuff to check for things like this.

It's been a vital tool for me too. It probably serves as a reasonable
guide for some things we could really usefully expose in system views.
Once in core we can document them in the main user manual, making them
reasonably discoverable.

Argh. I think my TODO has some kind of horrible disease, it keeps
growing uncontrollably.

 Which makes me think that it might be possible to add some other
 checks like this, in for example pg_ctl.  A big fat warning 'your
 data may be eaten' might get noticed at startup.

The users who have this kind of issue aren't the ones running pg_ctl.
They'll usually be using launchd, systemctl, upstart, sysv init
scripts, etc ... whatever, something that sends the warning straight
to the system logs that they likely never read.

I don't have tons of sympathy for these people, but I do think making
fsync=off so easy to set without understanding it is kind of like
handing a grenade to a toddler.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Cédric Villemain

Andrew Dunstan and...@dunslane.net wrote:


On 06/14/2013 08:35 AM, Peter Eisentraut wrote:
 On 6/13/13 9:20 PM, amul sul wrote:
 Agree, only if we consider these contrib module is always gonna
deployed with the postgresql.
 But, what if user going to install such module elsewhere i.e. not
from contrib directory of pg source.
 Why would anyone do that?



Maybe they wouldn't.

I do think we need to make sure that we have at least buildfarm
coverage
of pgxs module building and testing. I have some coverage of a few
extensions I have written, which exercise that, so maybe that will
suffice. If not, maybe we need to have one module that only builds via
pgxs and is build after an install (i.e. not via the standard contrib
build).

I agree; I found it very useful to have all the provided extensions build with 
PGXS, for debugging it.
It also offers a good set of natural regression tests.

I don't really like the directory layout we use for these modules
anyway, so I'm not sure they constitute best practice for extension
builders. Lately I have been using an extension skeleton that looks
something like this:

 License
 Readme.md
 META.json (for pgxn)
 extension.control
 Makefile
 doc/extension.md (soft linked to ../Readme.md)

This makes it mandatory to have a MODULEDIR defined, or a rule to rename it 
with the extension name suffixed.

 src/extension.c
 sql/extension.sql

It is (was) the default place for regression tests... I am not sure it is a 
good thing to shuffle that.
Also, you don't do 'c/source.c'.



--
Sent from my phone; please excuse the brevity.




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Hannu Krosing
On 06/15/2013 03:56 AM, Robert Haas wrote:
 On Fri, Jun 14, 2013 at 8:45 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-06-14 17:35:02 -0700, Josh Berkus wrote:
 No. I think as long as we only have pglz and one new algorithm (even if
 that is lz4 instead of the current snappy) we should just always use the
 new algorithm. Unless I missed it nobody seemed to have voiced a
 contrary position?
 For testing/evaluation the guc seems to be sufficient.
 Then it's not pluggable, is it?  It's upgradable compression
 support, if anything.  Which is fine, but let's not confuse people.
 The point is that it's pluggable on the storage level in the sense of
 that several different algorithms can coexist and new ones can
 relatively easily added.
 That part is what seems to have blocked progress for quite a while
 now. So fixing that seems to be the interesting thing.

 I am happy enough to do the work of making it configurable if we want it
 to be... But I have zap interest of doing it and throw it away in the
 end because we decide we don't need it.
 I don't think we need it.  I think what we need is to decide is which
 algorithm is legally OK to use.  And then put it in.
If it were truly pluggable - that is, just load a .dll, set a GUC and start
using it - then the selection of algorithms would be much
wider, as several slow-but-high-compression ones under the GPL could be
used as well, similar to how PostGIS currently works.

gzip and bzip2 are the first two that come to mind, but I am sure there
are more.

 In the past, we've had a great deal of speculation about that legal
 question from people who are not lawyers.  Maybe it would be valuable
 to get some opinions from people who ARE lawyers. 
Making a truly pluggable compression API delegates this question
to end users.

Delegation is good, as it lets you get more done :)


-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Andrew Dunstan


On 06/15/2013 06:24 AM, Cédric Villemain wrote:

Andrew Dunstan and...@dunslane.net a écrit :


On 06/14/2013 08:35 AM, Peter Eisentraut wrote:

On 6/13/13 9:20 PM, amul sul wrote:

Agree, only if we consider these contrib module is always gonna

deployed with the postgresql.

But, what if user going to install such module elsewhere i.e. not

from contrib directory of pg source.

Why would anyone do that?



Maybe they wouldn't.

I do think we need to make sure that we have at least buildfarm
coverage
of pgxs module building and testing. I have some coverage of a few
extensions I have written, which exercise that, so maybe that will
suffice. If not, maybe we need to have one module that only builds via
pgxs and is build after an install (i.e. not via the standard contrib
build).

I agree, I found very useful to have all the provided extensions build with 
PGXS to debug it.
It also offers a good set of natural regression tests.


I don't really like the directory layout we use for these modules
anyway, so I'm not sure they constitute best practice for extension
builders. Lately I have been using an extension skeleton that looks
something like this:

 License
 Readme.md
 META.json (for pgxn)
 extension.control
 Makefile
 doc/extension.md (soft linked to ../Readme.md)

This makes mandatory to have a MODULEDIR defined or a rule to rename it with 
the extension name suffixed.



Of course, for extension foo this would actually be foo.md. It installs 
just fine like that. The makefile template has:


DOCS = $(wildcard doc/*.md)




 src/extension.c
 sql/extension.sql

It is (was) the default place for regression testsI am not sure it is a 
good thing to shuffle that.
Also, you don't do 'c/source.c'




The sql here is the sql to install the extension, not part of the build 
nor part of the tests.


Some time ago I fixed pg_regress to honor --inputdir and --outputdir 
properly, so my Makefile template has this:


   REGRESS_OPTS = --inputdir=test --outputdir=test \
                  --load-extension=$(EXTENSION)
   ...
   override pg_regress_clean_files = test/results/ \
                  test/regression.diffs test/regression.out tmp_check/ log/


That keeps the testing stuff out of the way quite nicely.

You might not like this pattern, but I find it much saner than what we 
currently use. I certainly don't claim it's perfect.



cheers

andrew





Re: [HACKERS] pluggable compression support

2013-06-15 Thread Hannu Krosing
On 06/15/2013 02:20 AM, Andres Freund wrote:
 On 2013-06-14 17:12:01 -0700, Josh Berkus wrote:
 On 06/14/2013 04:01 PM, Andres Freund wrote:
 It still contains a guc as described in the above message to control the
 algorithm used for compressing new tuples but I think we should remove
 that guc after testing.
 Did you add the storage attribute?
 No. I think as long as we only have pglz and one new algorithm (even if
 that is lz4 instead of the current snappy) we should just always use the
 new algorithm. Unless I missed it nobody seemed to have voiced a
 contrary position?
 For testing/evaluation the guc seems to be sufficient.
If not significantly harder than what you currently do, I'd prefer a
true pluggable compression support which is

a) dynamically configurable, say by using a GUC

and

b) self-describing, that is, the compressed data should have enough
info to determine how to decompress it.

additionally it *could* have the property Simon proposed earlier
of *uncompressed* pages having some predetermined size, so we
could retain optimisations of substring() even on compressed TOAST
values.

the latter of course could also be achieved by adding offset
column to toast tables as well.

One more idea - if we are already changing the toast table structure, we
could introduce the notion of a compression block, which could run over
several storage pages for much improved compression compared
to compressing only a single page at a time.

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Cédric Villemain
  I don't really like the directory layout we use for these modules
  anyway, so I'm not sure they constitute best practice for extension
  builders. Lately I have been using an extension skeleton that looks
  
  something like this:
   License
   Readme.md
   META.json (for pgxn)
   extension.control
   Makefile
   doc/extension.md (soft linked to ../Readme.md)
  
  This makes mandatory to have a MODULEDIR defined or a rule to rename it
  with the extension name suffixed.
 
 Of course, for extension foo this would actually be foo.md. It installs
 just fine like that. The makefile template has:
 
  DOCS = $(wildcard doc/*.md)

Oh! yes, I missed the soft link.

   src/extension.c
   sql/extension.sql
  
  It is (was) the default place for regression testsI am not sure it is
  a good thing to shuffle that. Also, you don't do 'c/source.c'
 
 The sql here is the sql to install the extension, not part of the build
 nor part of the tests.

I am interested in this topic: since we have extensions, we invite users to 
increase their usage of them, so going a step forward with a better layout is 
definitely something to do.

What do you suggest for the previous usage? We have a hard rule to try to put 
libdir in *.sql.in files, for example.

 Some time ago I fixed pg_regress to honor --inputdir and --outputdir
 properly, so my Makefile template has this:
 
 REGRESS_OPTS = --inputdir=test --outputdir=test \
--load-extension=$(EXTENSION)
 ...
 override pg_regress_clean_files = test/results/
 test/regression.diffs test/regression.out tmp_check/ log/
 
 
 That keeps the testing stuff out of the way quite nicely.
 
 You might not like this pattern, but I find it much saner that what we
 currently use. I certainly don't claim it's perfect.

As I said, going a step forward with a better layout is definitely something 
to do. I have no strong opinion on what the ideal layout is; 'your' layout and 
the pgxn layout are both good, and I won't vote against suggesting their use 
(and improving PGXS to match those suggestions).

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: 24x7 Support - Development, Expertise and Training




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Andres Freund
On 2013-06-15 13:25:49 +0200, Hannu Krosing wrote:
 On 06/15/2013 02:20 AM, Andres Freund wrote:
  On 2013-06-14 17:12:01 -0700, Josh Berkus wrote:
  On 06/14/2013 04:01 PM, Andres Freund wrote:
  It still contains a guc as described in the above message to control the
  algorithm used for compressing new tuples but I think we should remove
  that guc after testing.
  Did you add the storage attribute?
  No. I think as long as we only have pglz and one new algorithm (even if
  that is lz4 instead of the current snappy) we should just always use the
  new algorithm. Unless I missed it nobody seemed to have voiced a
  contrary position?
  For testing/evaluation the guc seems to be sufficient.

 If not significantly harder than what you currently do, I'd prefer a
 true pluggable compression support which is

 a) dynamically configurable , say by using a GUG

 and

 b) self-describing, that is, the compressed data should have enough
 info to determine how to decompress it.

Could you perhaps actually read the description and the discussion
before making wild suggestions? Possibly even the patch.
Compressed toast datums now *do* have an identifier of the compression
algorithm used. That's how we can discern between pglz and whatever
algorithm we come up with.

But those identifiers should be *small* (since they are added to all
Datums) and they need to be stable, even across pg_upgrade. So I think
making this user configurable would be a grave error at this point.

 additionally it *could* have the property Simon proposed earlier
 of *uncompressed* pages having some predetermined size, so we
 could retain optimisations of substring() even on compressed TOAST
 values.

We are not changing the toast format here, so I don't think that's
applicable. That's a completely separate feature.

 the latter of course could also be achieved by adding offset
 column to toast tables as well.

 One more idea - if we are already changing toast table structure, we
 could introduce a notion of compress block, which could run over
 several storage pages for much improved compression compared
 to compressing only a single page at a time.

We aren't changing the toast table structure. And we can't easily do so;
think of pg_upgrade.
Besides, toast already stores compressed datums across several chunks. What
would be beneficial would be to compress in a way that lets you compress
several datums together, but that's several magnitudes more complex and
is unrelated to this feature.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Andres Freund
On 2013-06-15 13:11:47 +0200, Hannu Krosing wrote:
 If it were truly pluggable - that is just load a .dll, set a GUG and start
 using it

Ok. I officially rechristen the patchset to 'extensible compression
support'.

 - then the selection of algorithms would be much
 wider as several slow-but-high-compression ones under GPL could be
 used as well, similar to how currently PostGIS works.

 gzip and bzip2 are the first two that came in mind, but I am sure there
 are more.

gzip barely has a higher compression ratio than lz4 and is a magnitude
slower decompressing, so I don't think it's interesting.
I personally think bzip2 is too slow to be useful, even for
decompression. What might be useful is something like lzma, but its
implementation is so complex that I don't really want to touch it.

  In the past, we've had a great deal of speculation about that legal
  question from people who are not lawyers.  Maybe it would be valuable
  to get some opinions from people who ARE lawyers. 
 Making a truly pluggable compression API delegates this question
 to end users.
 
 Delegation is good, as it lets you get done more :)

No. It increases the feature's complexity by a magnitude. That's not
good. And it means that practically nobody but a few expert users will benefit
from it, so I am pretty strongly opposed.

You suddenly need to solve the question of how the identifiers for
compression formats are allocated and preserved across pg_upgrade and
across machines.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Hannu Krosing
On 06/15/2013 01:56 PM, Andres Freund wrote:
 On 2013-06-15 13:25:49 +0200, Hannu Krosing wrote:
 On 06/15/2013 02:20 AM, Andres Freund wrote:
 On 2013-06-14 17:12:01 -0700, Josh Berkus wrote:
 On 06/14/2013 04:01 PM, Andres Freund wrote:
 It still contains a guc as described in the above message to control the
 algorithm used for compressing new tuples but I think we should remove
 that guc after testing.
 Did you add the storage attribute?
 No. I think as long as we only have pglz and one new algorithm (even if
 that is lz4 instead of the current snappy) we should just always use the
 new algorithm. Unless I missed it nobody seemed to have voiced a
 contrary position?
 For testing/evaluation the guc seems to be sufficient.
 If not significantly harder than what you currently do, I'd prefer a
 true pluggable compression support which is
 a) dynamically configurable , say by using a GUG
 and
 b) self-describing, that is, the compressed data should have enough
 info to determine how to decompress it.
 Could you perhaps actually read the the description and the discussion
 before making wild suggestions? Possibly even the patch.
 Compressed toast datums now *do* have an identifier of the compression
 algorithm used. 
 That's how we can discern between pglz and whatever
 algorithm we come up with.
Claiming that the algorithm will be one of only two (the current one and
"whatever algorithm we come up with") suggests that it is
only one bit, which is undoubtedly too little for a pluggable
compression API :)


 But those identifiers should be *small* (since they are added to all
 Datums) 
If there is any alignment at all between the datums, then
one byte will be lost in the noise (remember: nobody will ever need
more than 256 compression algorithms).
OTOH, if you plan to put these format markers in the compressed
stream and change the compression algorithm while reading it, I am lost.
 and they need to be stable, even across pg_upgrade. So I think
 making this user configurable would be grave error at this point.
at this point in what sense ?

 additionally it *could* have the property Simon proposed earlier
 of *uncompressed* pages having some predetermined size, so we
 could retain optimisations of substring() even on compressed TOAST
 values.
 We are not changing the toast format here, so I don't think that's
 applicable. That's a completely separate feature.

 the latter of course could also be achieved by adding offset
 column to toast tables as well.
 One more idea - if we are already changing toast table structure, we
 could introduce a notion of compress block, which could run over
 several storage pages for much improved compression compared
 to compressing only a single page at a time.
 We aren't changing the toast table structure. And we can't easily do so,
 think of pg_upgrade.
Where is the page for features rejected because of pg_upgrade? ;)
 Besides, toast always has compressed datums over several chunks. What
 would be beneficial would be to compress in a way that you can compress
 several datums together, but that's several magnitudes more complex and
 is unrelated to this feature.

 Greetings,

 Andres Freund



-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] pluggable compression support

2013-06-15 Thread Hannu Krosing
On 06/15/2013 02:02 PM, Andres Freund wrote:
 On 2013-06-15 13:11:47 +0200, Hannu Krosing wrote:
 If it were truly pluggable - that is just load a .dll, set a GUG and start
 using it
 Ok. I officially rechristen the patchset to 'extensible compression
 support'.

 - then the selection of algorithms would be much
 wider as several slow-but-high-compression ones under GPL could be
 used as well, similar to how currently PostGIS works.
 gzip and bzip2 are the first two that came in mind, but I am sure there
 are more.
 gzip barely has a higher compression ratio than lz4 and is a magnitude
 slower decompressing, so I don't think it's interesting.
 I personally think bzip2 is too slow to be useful, even for
 decompression. 
With low compression settings, gzip and bzip2 seem to decompress at the
same speed:
http://pokecraft.first-world.info/wiki/Quick_Benchmark:_Gzip_vs_Bzip2_vs_LZMA_vs_XZ_vs_LZ4_vs_LZO

(an interesting thing there is memory usage, but I guess it is just an
artefact of outer layers around the algorithm)

And whether better compression translates to more speed depends heavily on
disk speeds:

http://www.citusdata.com/blog/64-zfs-compression claims quite big
performance increases from using gzip, even with its slow decompression


 What might be useful is something like lzma, but it's
 implementation is so complex that I don't really want to touch it.

 In the past, we've had a great deal of speculation about that legal
 question from people who are not lawyers.  Maybe it would be valuable
 to get some opinions from people who ARE lawyers. 
 Making a truly pluggable compression API delegates this question
 to end users.

 Delegation is good, as it lets you get done more :)
 No. It increases the features complexity by a magnitude. That's not
 good. And it means that about nobody but a few expert users will benefit
 from it, so I am pretty strongly opposed.

 You suddently need to solve the question of how the identifiers for
 compression formats are allocated and preserved across pg_upgrade and
 across machines.
This is similar to something we already need to do for any non-builtin type
OID.

 Greetings,

 Andres Freund



-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] pluggable compression support

2013-06-15 Thread Andres Freund
On 2013-06-15 14:11:54 +0200, Hannu Krosing wrote:
  Could you perhaps actually read the the description and the discussion
  before making wild suggestions? Possibly even the patch.
  Compressed toast datums now *do* have an identifier of the compression
  algorithm used. 
  That's how we can discern between pglz and whatever
  algorithm we come up with.
 Claiming that the algorithm will be one of only two (current and
 whatever algorithm we come up with ) suggests that it is
 only one bit, which is undoubtedly too little for having a pluggable
 compression API :)

No, I am thinking 127 + 2 possible algorithms for now. If we need more,
the space used for the algorithm identifier can be extended transparently at
that point.

  But those identifiers should be *small* (since they are added to all
  Datums) 

 if there will be any alignment at all between the datums, then
 one byte will be lost in the noise (remember: nobody will need
 more than 256 compression algorithms)

No. There's no additional alignment involved here.

 OTOH, if you plan to put these format markers in the compressed
 stream and change the compression algorithm while reading it, I am
 lost.

The marker *needs* to be in the compressed stream. When decompressing a
datum we only get passed the varlena.

  and they need to be stable, even across pg_upgrade. So I think
  making this user configurable would be grave error at this point.

 at this point in what sense ?

If we add another algorithm with different tradeoffs, we can have a column
attribute for choosing the algorithm. If there proves to be a need to
add more configurability, we can do that at that point.

Greetings,

Andres Freund




Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Amit kapila

On Saturday, June 15, 2013 1:19 PM Sawada Masahiko wrote:
On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
 Hello,

 We have already started a discussion on pgsql-hackers for the problem of
 taking fresh backup during the failback operation here is the link for that:


 http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtb
 jgwrfu513...@mail.gmail.com

 Let me again summarize the problem we are trying to address.


   How will you take care of extra WAL on old master during recovery. If it
 plays the WAL which has not reached new-master, it can be a problem.

 you means that there is possible that old master's data ahead of new
 master's data.

  What I mean to say is that the WAL of the old master can be ahead of the new
master's. I understood that the data files of the old master can't be ahead,
but I think the WAL can be ahead.

 so there is inconsistent data between those server when fail back. right?
 if so , there is not possible inconsistent. because if you use GUC option
 as his propose (i.g., failback_safe_standby_mode = remote_flush),
 when old master is working fine, all file system level changes aren't
 done  before WAL replicated.

Will the proposed patch take care, in some way, that the old master's WAL is 
also not ahead?
If yes, I think I am missing some point.
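
(To illustrate what "ahead" means here -- a minimal check one can run by
hand, assuming both servers are still reachable:

  -- on the old master, before it failed or was shut down:
  SELECT pg_current_xlog_location();
  -- on the new master, i.e. the promoted standby:
  SELECT pg_last_xlog_replay_location();

If the first location is ahead of the second, the old master has WAL that
the new master never received.)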

With Regards,
Amit Kapila.



Re: [HACKERS] pluggable compression support

2013-06-15 Thread Amit kapila
On Saturday, June 15, 2013 3:50 PM Andres Freund wrote:
On 2013-06-14 21:56:52 -0400, Robert Haas wrote:
 I don't think we need it.  I think what we need is to decide is which
 algorithm is legally OK to use.  And then put it in.

 In the past, we've had a great deal of speculation about that legal
 question from people who are not lawyers.  Maybe it would be valuable
 to get some opinions from people who ARE lawyers.  Tom and Heikki both
 work for real big companies which, I'm guessing, have substantial
 legal departments; perhaps they could pursue getting the algorithms of
 possible interest vetted.  Or, I could try to find out whether it's
 possible do something similar through EnterpriseDB.

 I personally don't think the legal arguments holds all that much water
 for snappy and lz4. But then the opinion of a european non-lawyer doesn't
 hold much either.
 Both are widely used by a large number open and closed projects, some of
 which have patent grant clauses in their licenses. E.g. hadoop,
 cassandra use lz4, and I'd be surprised if the companies behind those
 have opened themselves to litigation.

 I think we should preliminarily decide which algorithm to use before we
 get lawyers involved. I'd surprised if they can make such a analysis
 faster than we can rule out one of them via benchmarks.
  
I have also tried to use snappy for the patch "Performance Improvement by reducing 
WAL for Update Operation".
It has shown very good results and performed very well for all the tests asked 
for by Heikki.
Results are at the link below:
http://www.postgresql.org/message-id/009001ce2c6e$9bea4790$d3bed6b0$@kap...@huawei.com

I think if we can get snappy into core, it can be used for more things. 
I wanted to try it for FPW (full-page writes) as well.

With Regards,
Amit Kapila.




Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
I wrote:
 Richard Poole rich...@2ndquadrant.com writes:
 This behaviour appears in 6ac7facdd3990baf47efc124e9d7229422a06452 as a
 side-effect of speeding things up by getting rid of setitimer() calls;
 it's not obvious what's a good way to fix it without losing the benefits
 of that commit.

 Ugh.  It doesn't sound very practical to try to guarantee that every
 single kernel call in the backend is set up to recover from EINTR,
 even though ideally they should all be able to cope.

On reflection though, we *do* need to make them cope, because even
without lazy SIGALRM disable, any such place is still at risk.  We
surely must allow for the possibility of SIGHUP arriving at any instant,
for example.

Have you identified the specific place that's giving you trouble?

regards, tom lane




Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Sawada Masahiko
On Sat, Jun 15, 2013 at 10:34 PM, Amit kapila amit.kap...@huawei.com wrote:

 On Saturday, June 15, 2013 1:19 PM Sawada Masahiko wrote:
 On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
 Hello,

 We have already started a discussion on pgsql-hackers for the problem of
 taking fresh backup during the failback operation here is the link for 
 that:


 http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtb
 jgwrfu513...@mail.gmail.com

 Let me again summarize the problem we are trying to address.


   How will you take care of extra WAL on old master during recovery. If it
 plays the WAL which has not reached new-master, it can be a problem.

 you means that there is possible that old master's data ahead of new
 master's data.

   I mean to say is that WAL of old master can be ahead of new master. I 
 understood that
   data files of old master can't be ahead, but I think WAL can be ahead.

 so there is inconsistent data between those server when fail back. right?
 if so , there is not possible inconsistent. because if you use GUC option
 as his propose (i.g., failback_safe_standby_mode = remote_flush),
 when old master is working fine, all file system level changes aren't
 done  before WAL replicated.

 Would the propose patch will take care that old master's WAL is also not 
 ahead in some way?
 If yes, I think i am missing some point.

Yes, it can happen that the old master's WAL is ahead of the new master's WAL,
as you said. But I think that we can solve that by deleting all WAL files when
the old master starts as the new standby.
Thoughts?

Regards,

---
Sawada Masahiko




Re: [HACKERS] [PATCH] Revive line type

2013-06-15 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Complete the implementations of line_in, line_out, line_recv,
 line_send.  Remove comments and error messages about the line type not
 being implemented.  Add regression tests for existing line operators
 and functions.
 ---
 This just revives existing functionality, doesn't add anything new.
 One thing that the original code did not settle was how to convert a
 line in form Ax+By+C=0 to the two-points output form.  Obviously, you
 can just pick to random points on the line, but I wonder whether there
 is a more standard solution.

ISTM printing a line as two points is an unfortunate representational
choice altogether.  If the internal form is Ax+By+C=0, shouldn't the
text form expose A,B,C directly?  What you've got here necessarily
suffers a lot of roundoff error during I/O conversions, and so it seems
pretty likely to fail dump-and-reload tests.
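
(For reference, the conversions at issue are just standard analytic geometry:
a line through (x1,y1) and (x2,y2) corresponds to A = y2 - y1, B = x1 - x2,
C = x2*y1 - x1*y2, while going the other way requires picking two arbitrary
points satisfying Ax + By + C = 0 -- and that second step is where the
round-off creeps in.)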

regards, tom lane




Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Andres Freund
On 2013-06-15 10:45:28 -0400, Tom Lane wrote:
 I wrote:
  Richard Poole rich...@2ndquadrant.com writes:
  This behaviour appears in 6ac7facdd3990baf47efc124e9d7229422a06452 as a
  side-effect of speeding things up by getting rid of setitimer() calls;
  it's not obvious what's a good way to fix it without losing the benefits
  of that commit.
 
  Ugh.  It doesn't sound very practical to try to guarantee that every
  single kernel call in the backend is set up to recover from EINTR,
  even though ideally they should all be able to cope.
 
 On reflection though, we *do* need to make them cope, because even
 without lazy SIGALRM disable, any such place is still at risk.  We
 surely must allow for the possibility of SIGHUP arriving at any instant,
 for example.

All the signal handlers we register set SA_RESTART, including SIGHUP's, except
the one for SIGALRM... I wonder if we can rejigger things so we don't
need that? I am not sure if there's still a reason for that decision
inside the backend.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote:
 Alvaro,

 This sounds really interesting, and I can see the possibilities.
 However ...

 Value changes in columns that are part of a minmax index, and tuple insertion
 in summarized pages, would invalidate the stored min/max values.  To support
 this, each minmax index has a validity map; a range can only be considered 
 in a
 scan if it hasn't been invalidated by such changes (A range not considered 
 in
 the scan needs to be returned in whole regardless of the stored min/max 
 values,
 that is, it cannot be pruned per query quals).  The validity map is very
 similar to the visibility map in terms of performance characteristics: quick
 enough that it's not contentious, allowing updates and insertions to proceed
 even when data values violate the minmax index conditions.  An invalidated
 range can be made valid by re-summarization (see below).

 This begins to sound like these indexes are only useful on append-only
 tables.  Not that there aren't plenty of those, but ...

The index is basically using the index-only scan mechanism. The
"only useful on append-only tables" comment would/should apply also to
index-only scans. I can't see a reason to raise that specifically for
this index type.


 Re-summarization is relatively expensive, because the complete page range has
 to be scanned.

 Why?  Why can't we just update the affected pages in the index?

Again, same thing as index-only scans. For IOS, we reset the
visibility info at vacuum. The route proposed here follows exactly the
same timing, same mechanism. I can't see a reason for any difference
between the two.


  To avoid this, a table having a minmax index would be
 configured so that inserts only go to the page(s) at the end of the table; 
 this
 avoids frequent invalidation of ranges in the middle of the table.  We 
 provide
 a table reloption that tweaks the FSM behavior, so that summarized pages are
 not candidates for insertion.

 We haven't had an index type which modifies table insertion behavior
 before, and I'm not keen to start now; imagine having two indexes on the
 same table each with their own, conflicting, requirements.  This is
 sounding a lot more like a candidate for our prospective pluggable
 storage manager.  Also, the above doesn't help us at all with UPDATEs.

 If we're going to start adding reloptions for specific table behavior,
 I'd rather think of all of the optimizations we might have for a
 prospective append-only table and bundle those, rather than tying it
 to whether a certain index exists or not.

I agree that the FSM behaviour shouldn't be linked to index existence.

IMHO that should be a separate table parameter, WITH (fsm_mode = append)

Index only scans would also benefit from that.
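
(A sketch of how that might look at the SQL level -- purely hypothetical,
since the reloption doesn't exist yet, and 'events' is an invented table
whose 'ts' column correlates with insertion order:

  ALTER TABLE events SET (fsm_mode = append);
  -- range queries over the correlated column are the target use case:
  SELECT count(*) FROM events WHERE ts >= '2013-06-01' AND ts < '2013-06-02';

The reloption would steer new rows away from already-summarized pages, so
the block-range summaries stay valid.)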


 Also, I hate the name ... if this feature goes ahead, I'm going to be
 lobbying to change it.  But that's pretty minor compared to the update
 issues.

This feature has already had 3 different names. I don't think the name
is crucial, but it makes sense to give it a name up front. So if you
want to lobby for that then you'd need to come up with a name soon, so
poor Alvaro can cope with name #4.

(There's no consistency in naming from any other implementation either).

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-06-15 10:45:28 -0400, Tom Lane wrote:
 On reflection though, we *do* need to make them cope, because even
 without lazy SIGALRM disable, any such place is still at risk.  We
 surely must allow for the possibility of SIGHUP arriving at any instant,
 for example.

 All signal handlers we register, including SIGHUP, set SA_RESTART, except
 the one for SIGALRM... I wonder if we can rejigger things so we don't
 need that? I am not sure if there's still a reason for that decision
 inside the backend.

Hmm.  Personally I'd rather go in the other direction:
http://www.postgresql.org/message-id/12819.1183306...@sss.pgh.pa.us
but maybe the path of least resistance is to set SA_RESTART for SIGALRM
too.  Given our current usage of it, there seems no worse harm in
allowing kernel calls to restart after a SIGALRM than any other signal.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 00:07, Tom Lane t...@sss.pgh.pa.us wrote:

 We've talked a lot about index-organized tables in the past.  How much
 of the use case for this would be subsumed by a feature like that?

IOTs would only work for one specific organisation, acting like a
multi-column btree. So you could use IOTs for this but only in a
limited way.

MinMax indexes cover multiple columns, possibly even all columns if
desired. So MMIs have much greater usefulness, as well as not
requiring a full reload of the data when you decide you want one.

Personally, I think IOTs are not worth pursuing. IOTs for Postgres
would require major changes to the definition of the heap, so we're a
long way from implementing them directly and coping with all of the
detailed implementation horrors. Grouped Item Tuple indexes are
theoretically equivalent in terms of size and number of block accesses
required to access data. Heikki's earlier work on that could easily be
revived. IOTs would be a multi-year effort with conflicting use cases.

Note that SQL Server and Sybase use block range indexes for primary
indexes, neatly avoiding the hassle Oracle walked into when they chose
to do IOTs. I think we should learn that same lesson ourselves.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote:
 If we're going to start adding reloptions for specific table behavior,
 I'd rather think of all of the optimizations we might have for a
 prospective append-only table and bundle those, rather than tying it
 to whether a certain index exists or not.

 I agree that the FSM behaviour shouldn't be linked to index existence.
 IMHO that should be a separate table parameter, WITH (fsm_mode = append)
 Index only scans would also benefit from that.

-1 ... I cannot believe that such a parameter would ever get turned on
in production by anyone.  If your table has a significant update rate,
the resulting table bloat would make such behavior completely
infeasible.  If you have few enough updates to make such a behavior
practical, then you can live with the expensive index updates instead.

I also find the analogy to index-only scans to be bogus, because those
didn't require any user tuning.

There's a nearby thread complaining bitterly about our willingness to
create hard-to-use, hard-to-tune features.  In its current form,
this will be another one of those.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Andres Freund
On 2013-06-15 11:29:45 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-06-15 10:45:28 -0400, Tom Lane wrote:
  On reflection though, we *do* need to make them cope, because even
  without lazy SIGALRM disable, any such place is still at risk.  We
  surely must allow for the possibility of SIGHUP arriving at any instant,
  for example.
 
  All signal handlers we register, including SIGHUP, set SA_RESTART, except
  the one for SIGALRM... I wonder if we can rejigger things so we don't
  need that? I am not sure if there's still a reason for that decision
  inside the backend.
 
 Hmm.  Personally I'd rather go in the other direction:
 http://www.postgresql.org/message-id/12819.1183306...@sss.pgh.pa.us
 but maybe the path of least resistance is to set SA_RESTART for SIGALRM
 too.  Given our current usage of it, there seems no worse harm in
 allowing kernel calls to restart after a SIGALRM than any other signal.

I am not actually objecting to that reasoning; I think it would be rather
useful to get there.
But I don't think it's realistic to do that at this point in the
9.3 cycle. It seems like we would fight bugs around that for quite a
while. We have a large number of syscall sites where we don't retry on
EINTR/EAGAIN. And, as you note, that's not even talking about third
party code.
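
To make concrete what each unprotected call site would need, here is a minimal
sketch of the EINTR retry pattern (illustrative only, not actual backend code):

#include <errno.h>
#include <unistd.h>

static ssize_t
read_retry_on_eintr(int fd, void *buf, size_t count)
{
	ssize_t		rc;

	/*
	 * Without SA_RESTART any signal can interrupt the call, so retry on
	 * EINTR.  (EAGAIN on a non-blocking fd would additionally need to wait
	 * for readiness before retrying.)
	 */
	do
	{
		rc = read(fd, buf, count);
	} while (rc < 0 && errno == EINTR);

	return rc;
}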

I'd be happy if we decide to go that way at the beginning of the 9.4
cycle. I.e. do it right now on HEAD for all syscalls. That way we have a
chance to find most of the bad callsites before releasing.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] request a new feature in fuzzystrmatch

2013-06-15 Thread Josh Berkus
Liming,

I'm putting this in the commitfest regardless, so that it at least gets
a review.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-06-15 11:29:45 -0400, Tom Lane wrote:
 Hmm.  Personally I'd rather go in the other direction:
 http://www.postgresql.org/message-id/12819.1183306...@sss.pgh.pa.us

 I am not actually objecting that reasoning, I think it would be rather
 useful to get there.
 But I don't think it's realistic to do that at this point in the
 9.3 cycle. It seems like we would fight bugs around that for quite a
 while. We have a large number of syscall sites where we don't retry on
 EINTR/EAGAIN. And, as you note, that's not even talking about third
 party code.

Yeah, it's the issue of third-party code within the backend (perl,
python, etc etc etc etc) that really makes complete EINTR-proofing seem
a bit impractical.

Back in the day I was also worried about platforms that didn't have
SA_RESTART, but that's probably pretty much the empty set by now (is
anyone aware of a modern platform on which configure fails to set
HAVE_POSIX_SIGNALS?).  Also, our switch to latches for sleeping purposes
should have ameliorated the issue of signals failing to wake processes
when we wanted them to.

Let's turn on SA_RESTART for SIGALRM in HEAD and 9.3 and see what beta
testing says.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Greg Stark
 fsync=off

synchronous_commit=off replaced fsync=off in almost every use case
where fsync=off might have been useful. The only remaining use case is
for the initial build of a database. In that case what the user really
wants is to turn off WAL logging entirely though. Having a WAL log and
not fsyncing it is kind of pointless. I guess it lets you replicate
the database but it doesn't let you use the WAL log for recovery
locally.

 Bloat
 --

 Table bloat. Table bloat has been a major issue with PostgreSQL users/admins
 for years. Anyone care to explain to me in a simple paragraph how to find
 out if you have table or index bloat issues in your database and what to do
 about it? (Maybe we need pg_catalog.pg_index_bloat and
 pg_catalog.pg_table_bloat views including FILLFACTOR correction?)

A nice view that exposes a summary of information from the fsm per
table would be pretty handy.

In general there's a lot of data tied up in things like the fsm that
could be usefully exposed to users.



-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Postgres 9.3beta2 coming week after next

2013-06-15 Thread Tom Lane
FYI, the core committee is planning to wrap 9.3beta2 on Monday June 24
for release Thursday the 27th.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Alvaro Herrera
Craig Ringer wrote:
 On 06/13/2013 11:16 AM, Peter Eisentraut wrote:
  This has served no purpose except to
 
  1. take up space
  2. confuse users
  3. produce broken external extension modules that take contrib as an example
  4. break builds of PostgreSQL when users try to fix 3. by exporting USE_PGXS
 
 Weak -1 from me; I find being able to build contrib modules using PGXS
 from the source tree extremely useful in development. It allows me to
 easily install an extension from a newer Pg version into an older
 server, test against older servers, etc.

This use case seems too narrow to me to justify the burden of keeping
PGXS-enabled makefiles in contrib.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Jeff Davis
On Fri, 2013-06-14 at 18:27 +0200, Andres Freund wrote:
 I'd like to see a comment around the memcpys in XLogSaveBufferForHint()
 that mentions that they are safe in a non std buffer due to
 XLogCheckBuffer setting an appropriate hole/offset. Or make an explicit
 change of the copy algorithm there.

Done.

 Btw, if you touch that code, I'd vote for renaming XLOG_HINT to XLOG_FPI
 or something like that. I find the former name confusing...

Also done.

Patch attached. Also, since we branched, I think this should be
back-patched to 9.3 as well.

Regards,
Jeff Davis

*** a/src/backend/access/hash/hash.c
--- b/src/backend/access/hash/hash.c
***
*** 287,293  hashgettuple(PG_FUNCTION_ARGS)
  			/*
  			 * Since this can be redone later if needed, mark as a hint.
  			 */
! 			MarkBufferDirtyHint(buf);
  		}
  
  		/*
--- 287,293 
  			/*
  			 * Since this can be redone later if needed, mark as a hint.
  			 */
! 			MarkBufferDirtyHint(buf, true);
  		}
  
  		/*
*** a/src/backend/access/heap/pruneheap.c
--- b/src/backend/access/heap/pruneheap.c
***
*** 262,268  heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
  		{
  			((PageHeader) page)->pd_prune_xid = prstate.new_prune_xid;
  			PageClearFull(page);
! 			MarkBufferDirtyHint(buffer);
  		}
  	}
  
--- 262,268 
  		{
  			((PageHeader) page)->pd_prune_xid = prstate.new_prune_xid;
  			PageClearFull(page);
! 			MarkBufferDirtyHint(buffer, true);
  		}
  	}
  
*** a/src/backend/access/nbtree/nbtinsert.c
--- b/src/backend/access/nbtree/nbtinsert.c
***
*** 413,421  _bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
  	 * crucial. Be sure to mark the proper buffer dirty.
  	 */
  	if (nbuf != InvalidBuffer)
! 		MarkBufferDirtyHint(nbuf);
  	else
! 		MarkBufferDirtyHint(buf);
  }
  			}
  		}
--- 413,421 
  	 * crucial. Be sure to mark the proper buffer dirty.
  	 */
  	if (nbuf != InvalidBuffer)
! 		MarkBufferDirtyHint(nbuf, true);
  	else
! 		MarkBufferDirtyHint(buf, true);
  }
  			}
  		}
*** a/src/backend/access/nbtree/nbtree.c
--- b/src/backend/access/nbtree/nbtree.c
***
*** 1052,1058  restart:
  				opaque->btpo_cycleid == vstate->cycleid)
  			{
  				opaque->btpo_cycleid = 0;
! 				MarkBufferDirtyHint(buf);
  			}
  		}
  
--- 1052,1058 
  				opaque->btpo_cycleid == vstate->cycleid)
  			{
  				opaque->btpo_cycleid = 0;
! 				MarkBufferDirtyHint(buf, true);
  			}
  		}
  
*** a/src/backend/access/nbtree/nbtutils.c
--- b/src/backend/access/nbtree/nbtutils.c
***
*** 1789,1795  _bt_killitems(IndexScanDesc scan, bool haveLock)
  	if (killedsomething)
  	{
  		opaque->btpo_flags |= BTP_HAS_GARBAGE;
! 		MarkBufferDirtyHint(so->currPos.buf);
  	}
  
  	if (!haveLock)
--- 1789,1795 
  	if (killedsomething)
  	{
  		opaque->btpo_flags |= BTP_HAS_GARBAGE;
! 		MarkBufferDirtyHint(so->currPos.buf, true);
  	}
  
  	if (!haveLock)
*** a/src/backend/access/rmgrdesc/xlogdesc.c
--- b/src/backend/access/rmgrdesc/xlogdesc.c
***
*** 82,92  xlog_desc(StringInfo buf, uint8 xl_info, char *rec)
  		appendStringInfo(buf, "restore point: %s", xlrec->rp_name);
  
  	}
! 	else if (info == XLOG_HINT)
  	{
  		BkpBlock   *bkp = (BkpBlock *) rec;
  
! 		appendStringInfo(buf, "page hint: %s block %u",
  		 relpathperm(bkp->node, bkp->fork),
  		 bkp->block);
  	}
--- 82,92 
  		appendStringInfo(buf, "restore point: %s", xlrec->rp_name);
  
  	}
! 	else if (info == XLOG_FPI)
  	{
  		BkpBlock   *bkp = (BkpBlock *) rec;
  
! 		appendStringInfo(buf, "full-page image: %s block %u",
  		 relpathperm(bkp->node, bkp->fork),
  		 bkp->block);
  	}
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 7681,7692  XLogRestorePoint(const char *rpName)
   * records. In that case, multiple copies of the same block would be recorded
   * in separate WAL records by different backends, though that is still OK from
   * a correctness perspective.
-  *
-  * Note that this only works for buffers that fit the standard page model,
-  * i.e. those for which buffer_std == true
   */
  XLogRecPtr
! XLogSaveBufferForHint(Buffer buffer)
  {
  	XLogRecPtr	recptr = InvalidXLogRecPtr;
  	XLogRecPtr	lsn;
--- 7681,7689 
   * records. In that case, multiple copies of the same block would be recorded
   * in separate WAL records by different backends, though that is still OK from
   * a correctness perspective.
   */
  XLogRecPtr
! XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
  {
  	XLogRecPtr	recptr = InvalidXLogRecPtr;
  	XLogRecPtr	lsn;
***
*** 7708,7714  XLogSaveBufferForHint(Buffer buffer)
  	 * and reset rdata for any actual WAL record insert.
  	 */
  	rdata[0].buffer = buffer;
! 	rdata[0].buffer_std = true;
  
  	/*
  	 * Check buffer while not holding an exclusive lock.
--- 7705,7711 
  	 * and reset rdata for any 

Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Joe Conway

On 06/15/2013 11:28 AM, Alvaro Herrera wrote:
 Craig Ringer wrote:
 On 06/13/2013 11:16 AM, Peter Eisentraut wrote:
 This has served no purpose except to
 
 1. take up space 2. confuse users 3. produce broken external
 extension modules that take contrib as an example 4. break
 builds of PostgreSQL when users try to fix 3. by exporting
 USE_PGXS
 
 Weak -1 from me; I find being able to build contrib modules using
 PGXS from the source tree extremely useful in development. It
 allows me to easily install an extension from a newer Pg version
 into an older server, test against older servers, etc.
 
 This use case seems too narrow to me to justify the burden of
 keeping PGXS-enabled makefiles in contrib.

What was the burden of it?

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 16:39, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote:
 If we're going to start adding reloptions for specific table behavior,
 I'd rather think of all of the optimizations we might have for a
 prospective append-only table and bundle those, rather than tying it
 to whether a certain index exists or not.

 I agree that the FSM behaviour shouldn't be linked to index existence.
 IMHO that should be a separate table parameter, WITH (fsm_mode = append)
 Index only scans would also benefit from that.

 -1 ... I cannot believe that such a parameter would ever get turned on
 in production by anyone.

It's an extremely common use case, especially with larger databases.

 If your table has a significant update rate,
 the resulting table bloat would make such behavior completely
 infeasible.  If you have few enough updates to make such a behavior
 practical, then you can live with the expensive index updates instead.

Depends. What I had in mind was that append mode would allow updates
normally within the last 1 GB of a table, but relocate updates should
they occur in earlier segments. That would cover the majority of use
cases I see.

 I also find the analogy to index-only scans to be bogus, because those
 didn't require any user tuning.

I don't think anybody is suggesting there is a requirement for user
tuning on this type of index.

But I think having one parameter would make it the same as say, GIN
indexes which also have a single tuning parameter.

I think the idea that index-only scans "just work" is itself bogus. A
mild random update pattern will reduce the benefit of IOS scans
considerably, which is hard to understand or control, especially
without any way of measuring it. We could eliminate that problem if we
chose, but it's not even documented as a potential issue for some
reason, so it's hard to discuss. Not being able to see or control a
known problem is not the same thing as "doesn't require user tuning".

 There's a nearby thread complaining bitterly about our willingness to
 create hard-to-use, hard-to-tune features.  In its current form,
 this will be another one of those.

Not based upon anything mentioned here so far, or that I'm aware of.

I think it would be interesting to see some anonymous voting on
feature complaints, so we can see what people really think without
needing to risk offending the main authors of each of the various
parts of our software.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread David E. Wheeler
On Jun 15, 2013, at 4:12 AM, Andrew Dunstan and...@dunslane.net wrote:

   REGRESS_OPTS = --inputdir=test --outputdir=test \
  --load-extension=$(EXTENSION)
   ...
   override pg_regress_clean_files = test/results/
   test/regression.diffs test/regression.out tmp_check/ log/
 
 
 That keeps the testing stuff out of the way quite nicely.

I don't suppose there could be a way for the makefile to notice the --outputdir 
option and add those files to the clean target itself, could there? Having it 
hard-coded is slightly annoying. Maybe it could ask pg_regress where to find 
them?

Thanks,

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Batch API for After Triggers

2013-06-15 Thread Simon Riggs
On 9 June 2013 14:56, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 On 8 June 2013 22:25, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change. So we'd need to invent some semantics for a
 linking identifier of some description. Which once we've done it
 would be used by people to join them back together again, which is
 what we already had in the first place. So my take is that it sounds
 expressive, but definitely not performant.

 I have used a feature like this in other database products, and can
 say from experience that these relations in a statement trigger can
 be very useful without the linkage you propose.  I can see how the
 linkage could potentially be useful, but if that is the only
 hang-up, we would be adding a powerful feature without it.

What I'm trying to do is tune FKs, which are currently implemented as
after row triggers. Hence why I'm looking at ways to speed up after
row triggers.

Suggesting I work on after statement triggers would imply you think
that FKs should/could be rewritten as after statement triggers. Is
that what you mean?

Doing it that way would mean rewriting a lot of code and would still
have problems 2 and 3 identified above. I can't imagine anybody would
go for that, but if you have a sketch of how it might work we can
consider it.

 Since my objective is performance, not standards, I don't see a reason
 to work on that, yet. I might have time to work on it later, let's see.

 This seems like it has some overlap with the delta relations I will
 need to generate for incremental maintenance of materialized views,
 so we should coordinate on those efforts if they happen to occur
 around the same time.

IMHO you should use Andres' logical changeset extraction for
incremental maintenance of mat views. Doing mat view maintenance
using triggers would be choosing the lower-performance option.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
I wrote:
 ... Also, our switch to latches for sleeping purposes
 should have ameliorated the issue of signals failing to wake processes
 when we wanted them to.

 Let's turn on SA_RESTART for SIGALRM in HEAD and 9.3 and see what beta
 testing says.

I experimented with this a bit on my old HPUX box (which is one of the
ones where SA_RESTART signals don't terminate a select()), and soon
found an annoying behavior:

regression=# \timing
Timing is on.
regression=# set statement_timeout TO 2000;
SET
Time: 4.983 ms
regression=# select generate_series(1,100);
ERROR:  canceling statement due to statement timeout
Time: 2015.015 ms
regression=# select pg_sleep(10);
ERROR:  canceling statement due to statement timeout
Time: 3009.932 ms

This happens because pg_sleep() is implemented with a loop around
pg_usleep(1000000), and the latter is no longer terminated by a SIGALRM.

It seems like it'd be a good idea to replace the pg_sleep implementation
with something involving WaitLatch, which would not only ensure prompt
response to SIGALRM (and other signals, eg SIGQUIT), but would eliminate
useless process wakeups during a sleep.
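
A rough sketch of such a loop, assuming the 9.3-era latch API and reusing the
endtime/GetNowFloat() bookkeeping already present in pg_sleep() (illustrative
only, not a tested patch):

	for (;;)
	{
		float8		secs_left = endtime - GetNowFloat();

		if (secs_left <= 0.0)
			break;

		(void) WaitLatch(&MyProc->procLatch,
						 WL_LATCH_SET | WL_TIMEOUT,
						 (long) (secs_left * 1000.0));
		ResetLatch(&MyProc->procLatch);
		CHECK_FOR_INTERRUPTS();	/* react promptly to SIGALRM, SIGINT, etc. */
	}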

In general, we might want to consider replacing long sleep intervals
with WaitLatch operations.  I thought for a bit about trying to turn
pg_usleep itself into a WaitLatch call; but it's also used in frontend
code where that wouldn't work, and anyway it's not clear this would be
a good thing for short sleeps.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] C++ compiler

2013-06-15 Thread Gaetano Mendola
I have read Peter Eisentraut's blog entry about Moving to C++, and I fully
agree with what he wrote.

Is there any interest or work in progress in making the entire PostgreSQL
code base compilable by a C++ compiler?

Regards
Gaetano Mendola

-- 
cpp-today.blogspot.com


Re: [HACKERS] Batch API for After Triggers

2013-06-15 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 On 9 June 2013 14:56, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 On 8 June 2013 22:25, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change. So we'd need to invent some semantics for a
 linking identifier of some description. Which once we've done it
 would be used by people to join them back together again, which is
 what we already had in the first place. So my take is that it sounds
 expressive, but definitely not performant.

 I have used a feature like this in other database products, and can
 say from experience that these relations in a statement trigger can
 be very useful without the linkage you propose.  I can see how the
 linkage could potentially be useful, but if that is the only
 hang-up, we would be adding a powerful feature without it.

 What I'm trying to do is tune FKs, which are currently implemented as
 after row triggers. Hence why I'm looking at ways to speed up after
 row triggers.

 Suggesting I work on after statement triggers would imply you think
 that FKs should/could be rewritten as after statement triggers. Is
 that what you mean?

Well, I didn't make any suggestion -- I was asking questions, and
then when you commented on a hypothetical feature I responded.

That said, I was asking the question because I used SQL Server for
years when it had triggers but no foreign key definitions.  There
was a pattern for enforcing foreign key relationships in AFTER EACH
STATEMENT triggers using the old and new relations (with no linkage
between particular updated rows) which we used so heavily I could
write the triggers for a given foreign key mechanically in just a
minute or two.  This technique had good performance and seems to
have all the semantics you're looking for, so I was wondering
whether that might be a better approach to this problem.  It sure
seems like it would have fewer moving parts.  I know it was very
reliable with S2PL concurrency control, but there may be problems
with adapting it to MVCC that I'm not seeing without a deeper look.

If you're interested, I could try to prod those areas of my memory
to recall the pattern, or find examples of it somewhere.

 Doing it that way would mean rewriting a lot of code and would still
 have problems 2 and 3 identified above. I can't imagine anybody would
 go for that, but if you have a sketch of how it might work we can
 consider it.

I don't, but if there is interest I could probably sketch the outlines.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-15 Thread Fabien COELHO


Hello Greg,

I've done some more testing with the test patch. I have not seen any 
spike at the end of the throttled run.


The attached version 11 patch does ensure that throttle-added sleeps are
not included in latency measures (-r) and that throttling is performed
right at the beginning of a transaction. There is an ugly goto to do that.


I think there is still a latent bug in the code around "listen", which should
be set back to 0 in other places, but this bug is already there.


--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 1303217..2fc6f6a 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -205,6 +211,7 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		throttled;  /* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +229,10 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+int64   throttle_trigger;  /* previous/next throttling (us) */
+	int64   throttle_lag;  /* total transaction lag behind throttling */
+	int64   throttle_lag_max;  /* max transaction lag */
+
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,6 +241,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -355,6 +368,8 @@ usage(void)
 		 -n   do not run VACUUM before tests\n
 		 -N   do not update tables \pgbench_tellers\ and \pgbench_branches\\n
 		 -r   report average latency per command\n
+		 -R SPEC, --rate SPEC\n
+		  target rate in transactions per second\n
 		 -s NUM   report this scale factor in output\n
 		 -S   perform SELECT-only transactions\n
 	   -t NUM   number of transactions each client runs (default: 10)\n
@@ -898,17 +913,56 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		do_throttle = false;
 
 top:
 	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle_delay && ! st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 = 13.8 .. 0 multiplier
+		 *  100000 = 11.5 .. 0
+		 *   10000 =  9.2 .. 0
+		 *    1000 =  6.9 .. 0
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64 wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
+
+		thread->throttle_trigger += wait;
+
+		st->until = thread->throttle_trigger;
+		st->sleeping = 1;
+		st->throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling "INT64_FORMAT" us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{			/* are we sleeping? */
 		instr_time	now;
+		int64 now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (throttle_delay && st->state == 0)
+			{
+				/* measure lag of throttled transaction */
+				int64 lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ -1095,6 +1149,9 @@ top:
 			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = sql_files[st->use_file];
+			st->throttled = false;
+			st->listen = 0;
+			do_throttle = (throttle_delay > 0);
 		}
 	}
 
@@ -1113,6 +1170,12 @@ top:
 		INSTR_TIME_ACCUM_DIFF(*conn_time, end, start);
 	}
 
+	if (do_throttle) {
+		/* delay throttling after reopening the connection */
+		do_throttle = false;
+		goto top;
+	}
+
 	/* Record transaction start time if logging is enabled */
 	if (logfile && st->state == 0)
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
@@ -2017,7 +2080,8 @@ process_builtin(char *tb)
 static void
 printResults(int ttype, int normal_xacts, int nclients,
 			 TState *threads, int nthreads,
-			 instr_time total_time, instr_time conn_total_time)
+			 instr_time 

Re: [HACKERS] pg_restore -l with a directory archive

2013-06-15 Thread Fujii Masao
On Sat, Jun 15, 2013 at 1:40 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 When I ran pg_restore -l with a directory archive as input, I found that
 its format is wrongly reported as UNKNOWN.

 $ pg_dump -F d -f hoge
 $ pg_restore -l hoge
 ;
 ; Archive created at Sat Jun 15 01:38:14 2013
 ; dbname: postgres
 ; TOC Entries: 9
 ; Compression: -1
 ; Dump Version: 1.12-0
 ; Format: UNKNOWN
 ; Integer: 4 bytes
 ; Offset: 8 bytes
 ; Dumped from database version: 9.3beta1
 ; Dumped by pg_dump version: 9.3beta1
 ;
 ;
 ; Selected TOC Entries:


 In this case, the format should be reported as DIRECTORY.
 The attached patch fixes this problem.

Committed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add visibility map information to pg_freespace.

2013-06-15 Thread Simon Riggs
On 14 June 2013 15:22, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Kyotaro HORIGUCHI wrote:
 Hello,

 I've added visibility map information to pg_freespace for my
 utility.

 This makes sense to me.  I only lament the fact that this makes the
 module a misnomer.  Do we want to 1) rename the module (how
 inconvenient), 2) create a separate module for this (surely not
 warranted), or 3) accept it and move on?

I'm not sure why this is suggested as being part of pg_freespace and
not part of pageinspect (which is where all the other inspection
tools live).

If I wanted to see the vismap (and I do...) then I'd like to see the
whole vismap, not just the part that relates to blocks currently in
cache.

If you do want that, you can just join the two things together
(function to see vismap joined to pg_freespace).

(Having said that, I don't have a major objection to it being in
pg_freespace as well).

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-15 Thread Jeff Davis
On Sun, 2013-03-24 at 20:15 -0400, Nicholas White wrote:

 I've redone the leadlag function changes to use datumCopy as you
 suggested. However, I've had to remove the NOT_USED #ifdef around
 datumFree so I can use it to avoid building up large numbers of copies
 of Datums in the memory context while a query is executing. I've
 attached the revised patch...
 
Comments:

WinGetResultDatumCopy() calls datumCopy, which will just copy in the
current memory context. I think you want it in the per-partition memory
context, otherwise the last value in each partition will stick around
until the query is done (so many partitions could be a problem). That
should be easy enough to do by switching to the
winobj->winstate->partcontext memory context before calling datumCopy,
and then switching back.
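
Concretely, the suggested change would look something like this sketch (the
local variable names are placeholders; partcontext is the per-partition
context mentioned above):

	MemoryContext oldcontext;

	oldcontext = MemoryContextSwitchTo(winobj->winstate->partcontext);
	copied = datumCopy(value, typbyval, typlen);
	MemoryContextSwitchTo(oldcontext);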

For that matter, why store the datum again at all? You can just store
the offset of the last non-NULL value in that partition, and then fetch
it using WinGetFuncArgInPartition(), right? We really want to avoid any
per-tuple allocations.

Alternatively, you might look into setting a mark when you get a
non-NULL value. Then you could just always fetch the oldest one.
Unfortunately, I think that only works with const_offset=true... so the
previous idea might be better.

I'll leave it to someone else to review the grammar changes.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Andres Freund
On 2013-06-15 11:36:54 -0700, Jeff Davis wrote:
 On Fri, 2013-06-14 at 18:27 +0200, Andres Freund wrote:
  I'd like to see a comment around the memcpys in XLogSaveBufferForHint()
  that mentions that they are safe in a non std buffer due to
  XLogCheckBuffer setting an appropriate hole/offset. Or make an explicit
  change of the copy algorithm there.
 
 Done.
 Also done.

Thanks! Looks good to me.

 Patch attached. Also, since we branched, I think this should be
 back-patched to 9.3 as well.

Absolutely.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Andrew Dunstan


On 06/15/2013 02:43 PM, David E. Wheeler wrote:

On Jun 15, 2013, at 4:12 AM, Andrew Dunstan and...@dunslane.net wrote:


   REGRESS_OPTS = --inputdir=test --outputdir=test \
  --load-extension=$(EXTENSION)
   ...
   override pg_regress_clean_files = test/results/
   test/regression.diffs test/regression.out tmp_check/ log/


That keeps the testing stuff out of the way quite nicely.

I don't suppose there could be a way for the makefile to notice the --outputdir 
option and add those files to the clean target itself, could there? Having it 
hard-coded is slightly annoying. Maybe it could ask pg_regress where to find 
them?




That doesn't sound like a promising line of development to me. Better 
would be to provide a PGXS option to specify where tests are based, and 
set the clean target accordingly.


Then instead of the above you'd just be able to say something like

MODULETEST = test
REGRESS_OPTS = --load-extension=$(EXTENSION)

Which would be a good deal cleaner.

cheers

andrew







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-15 Thread Alvaro Herrera
Nicholas White escribió:

 For the parsing changes, it seems I can either make RESPECT and IGNORE
 reserved keywords, or add a lookahead to construct synthetic RESPECT NULLS
 and IGNORE NULLS keywords. The grammar wouldn't compile if RESPECT and
 IGNORE were just normal unreserved keywords due to ambiguities after a
 function definition (e.g. select abs(1) respect; - which is currently a
 valid statement).

Well, making them reserved keywords is not that great, so maybe the
lookahead thingy is better.  However, this patch introduces the third
and fourth uses of the "save the lookahead token" pattern in the
default switch cases.  Can we refactor that bit somehow, to avoid so
many duplicates?  (For a minute I thought that Andrew Gierth's WITH
ORDINALITY patch would add another one, but it seems not.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pluggable compression support

2013-06-15 Thread Robert Haas
On Sat, Jun 15, 2013 at 8:11 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 Claiming that the algorithm will be one of only two (current and
 whatever algorithm we come up with ) suggests that it is
 only one bit, which is undoubtedly too little for having a pluggable
 compression API :)

See 
http://www.postgresql.org/message-id/20130607143053.gj29...@alap2.anarazel.de

 But those identifiers should be *small* (since they are added to all
 Datums)
 if there will be any alignment at all between the datums, then
 one byte will be lost in the noise (remember: nobody will need
 more than 256 compression algorithms)
 OTOH, if you plan to put these format markers in the compressed
 stream and change the compression algorithm while reading it, I am lost.

The above-linked email addresses this point as well: there are bits
available in the toast pointer.  But there aren't MANY bits without
increasing the storage footprint, so trying to do something that's
more general than we really need is going to cost us in terms of
on-disk footprint.  Is that really worth it?  And if so, why?  I don't
find the idea of a trade-off between compression/decompression speed
and compression ratio to be very exciting.  As Andres says, bzip2 is
impractically slow for ... almost everything.  If there's a good
BSD-licensed algorithm available, let's just use it and be done.  Our
current algorithm has lasted us a very long time; I see no reason to
think we'll want to change this again for another 10 years, and by
that time, we may have redesigned the storage format altogether,
making the limited extensibility of our current TOAST pointer format
moot.
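
For illustration only: reserving a couple of spare high bits of an existing
32-bit size field for a compression-method id could look like the sketch
below.  The field semantics and macro names are invented here, not the actual
TOAST pointer layout.

#define VARCOMPRESS_BITS	0xC0000000
#define VARCOMPRESS_SHIFT	30

/* two bits = up to four methods, with id 0 meaning the current pglz format */
#define VARDATA_COMPRESS_ID(extsize) \
	(((extsize) & VARCOMPRESS_BITS) >> VARCOMPRESS_SHIFT)
#define VARDATA_EXTSIZE(extsize) \
	((extsize) & ~VARCOMPRESS_BITS)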

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Alvaro Herrera
Tom Lane wrote:

 In general, we might want to consider replacing long sleep intervals
 with WaitLatch operations.  I thought for a bit about trying to turn
 pg_usleep itself into a WaitLatch call; but it's also used in frontend
 code where that wouldn't work, and anyway it's not clear this would be
 a good thing for short sleeps.

How about having a #ifdef !FRONTEND code path that uses the latch, and
sleep otherwise?  And maybe use plain sleep for short sleeps in the
backend also, to avoid the latch overhead.  I notice we already have
three implementations of pg_usleep.
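
A sketch of that shape (the function name and the 10 ms cutoff are arbitrary,
and the 9.3-era latch API is assumed):

void
pg_usleep_sketch(long microsec)
{
#ifndef FRONTEND
	/* backend, longer nap: sleep on the process latch so signals end it early */
	if (microsec >= 10000L)
	{
		(void) WaitLatch(&MyProc->procLatch, WL_LATCH_SET | WL_TIMEOUT,
						 microsec / 1000L);
		ResetLatch(&MyProc->procLatch);
		return;
	}
#endif
	/* frontend callers and short backend naps: plain select()-based sleep */
	{
		struct timeval delay;

		delay.tv_sec = microsec / 1000000L;
		delay.tv_usec = microsec % 1000000L;
		(void) select(0, NULL, NULL, NULL, &delay);
	}
}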

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pluggable compression support

2013-06-15 Thread Robert Haas
On Sat, Jun 15, 2013 at 8:22 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 You suddently need to solve the question of how the identifiers for
 compression formats are allocated and preserved across pg_upgrade and
 across machines.
 This is something similar we already need to do for any non-builtin type
 OID.

That's true, but that code has already been written.  And it's not
trivial.  The code involved is CREATE/ALTER/DROP TYPE plus all the
corresponding pg_dump mechanism.  To do what you're proposing here,
we'd need CREATE/ALTER/DROP COMPRESSION METHOD, and associated pg_dump
--binary-upgrade support.  I think Andres is entirely right to be
skeptical about that.  It will make this project about 4 times as hard
for almost no benefit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Tom Lane wrote:
  In general, we might want to consider replacing long sleep intervals
  with WaitLatch operations.  I thought for a bit about trying to turn
  pg_usleep itself into a WaitLatch call; but it's also used in frontend
  code where that wouldn't work, and anyway it's not clear this would be
  a good thing for short sleeps.
 
 How about having a #ifdef !FRONTEND code path that uses the latch, and
 sleep otherwise?  And maybe use plain sleep for short sleeps in the
 backend also, to avoid the latch overhead.  I notice we already have
 three implementations of pg_usleep.

Is there really serious overhead from using latches..?  I thought much
of the point of that approach was specifically to minimize overhead...

Thanks,

Stephen




Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 Is there really serious overhead from using latches..?

IIRC there's at least one gettimeofday() involved in WaitLatch.  There
are platforms on which that already costs more than you want to spend
for, say, CommitDelay.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] askpass program for libpq

2013-06-15 Thread Daniel Farina
On Fri, May 17, 2013 at 2:03 PM, Daniel Farina dan...@heroku.com wrote:
 On Wed, Jan 9, 2013 at 5:17 AM, Peter Eisentraut pete...@gmx.net wrote:
 I would like to have something like ssh-askpass for libpq.  The main
 reason is that I don't want to have passwords in plain text on disk,
 even if .pgpass is read protected.  By getting the password from an
 external program, I can integrate libpq tools with the host system's key
 chain or wallet thing, which stores passwords encrypted.

 I'm thinking about adding a new connection option askpass with
 environment variable PGASKPASS.  One thing I haven't quite figured out
 is how to make this ask for passwords only if needed.  Maybe it needs
 two connection options, one to say which program to use and one to say
 whether to use it.

 Ideas?

 Okay, I have a patch that does something *like* (but not the same) as
 this, and whose implementation is totally unreasonable, but it's
 enough to get a sense of how the whole thing feels.  Critically, it
 goes beyond askpass, instead allowing a shell-command based hook for
 arbitrary interpretation and rewriting of connection info...such as
 the 'host' libpq keyword.  I have called it, without much thought, a
 'resolver'.  In this way, it's closer to the libpq 'service' facility,
 except with addition of complete control of the interpretation of
 user-provided notation.

Hello everyone,

I'm sort of thinking of attacking this problem again, does anyone have
an opinion or any words of (en/dis)couragement to continue?  The
implementation I posted is bogus but is reasonable to feel around
with, but I'm curious besides its obvious defects as to what the
temperature of opinion is.

Most generally, I think the benefits are strongest in dealing with:

* Security: out-of-band secrets will just prevent people from pasting
  important stuff all over the place, as I see despairingly often
  today.

* Client-side Proxies: pgbouncer comes to mind, a variation being used
  on production applications right now that uses full-blown
  preprocessing of the user environment (only possible in a
  environment with certain assumptions like Heroku)
  https://github.com/gregburek/heroku-buildpack-pgbouncer seems very
  promising and effective, but it'd be nice to confer the same
  benefits to everyone else, too.

* HA: one of the most annoying problems in HA is naming things.  Yes,
  this could be solved with other forms of common dynamic binding DNS
  or Virtual IP (sometimes), but these both are pretty complicated and
  carry baggage and pitfalls, but as long as there is dynamic binding
  of the credentials, I'm thinking it may make sense to have dynamic
  binding of net locations, too.

* Cross-server references

  This is basically the issues seen in HA and Security, but on
  (horrible) steroids: the spate of features making Postgres work
  cross-server (older features like dblink, but now also new ones like
  FDWs and Writable FDWs) make complex interconnection between servers
  more likely and problematic, especially if one has standbys where
  there is a delay in catalog propagation from a primary to standby
  with new connection info.

  So, an out of band way where one can adjust the dynamic binding
  seems useful there.

Knowing those, am I barking up the wrong tree?  Can I do something
else entirely?  I've considered DNS and SSL certs, but these seem
much, much harder and limited, too.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] askpass program for libpq

2013-06-15 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 Okay, I have a patch that does something *like* (but not the same) as
 this, and whose implementation is totally unreasonable, but it's
 enough to get a sense of how the whole thing feels.  Critically, it
 goes beyond askpass, instead allowing a shell-command based hook for
 arbitrary interpretation and rewriting of connection info...such as
 the 'host' libpq keyword.  I have called it, without much thought, a
 'resolver'.  In this way, it's closer to the libpq 'service' facility,
 except with addition of complete control of the interpretation of
 user-provided notation.

 Hello everyone,

 I'm sort of thinking of attacking this problem again, does anyone have
 an opinion or any words of (en/dis)couragement to continue?  The
 implementation I posted is bogus but is reasonable to feel around
 with, but I'm curious besides its obvious defects as to what the
 temperature of opinion is.

 Most generally, I think the benefits are strongest in dealing with:

 * Security: out-of-band secrets will just prevent people from pasting
   important stuff all over the place, as I see despairingly often
   today.

 * Client-side Proxies: pgbouncer comes to mind, a variation being used
   on production applications right now that uses full-blown
   preprocessing of the user environment (only possible in a
   environment with certain assumptions like Heroku)
   https://github.com/gregburek/heroku-buildpack-pgbouncer seems very
   promising and effective, but it'd be nice to confer the same
   benefits to everyone else, too.

 * HA: one of the most annoying problems in HA is naming things.  Yes,
   this could be solved with other forms of common dynamic binding DNS
   or Virtual IP (sometimes), but these both are pretty complicated and
   carry baggage and pitfalls, but as long as there is dynamic binding
   of the credentials, I'm thinking it may make sense to have dynamic
   binding of net locations, too.

 * Cross-server references

   This is basically the issues seen in HA and Security, but on
   (horrible) steroids: the spate of features making Postgres work
   cross-server (older features like dblink, but now also new ones like
   FDWs and Writable FDWs) make complex interconnection between servers
   more likely and problematic, especially if one has standbys where
   there is a delay in catalog propagation from a primary to standby
   with new connection info.

   So, an out of band way where one can adjust the dynamic binding
   seems useful there.

TBH, I see no clear reason to think that a connection-string rewriter
solves any of those problems.  At best it would move them somewhere else.
Nor is it clear that any of this should be libpq's business, as opposed
to something an application might do before invoking libpq.  Also,
I think a facility dependent on invoking a shell command is (a) wide
open for security problems, and (b) not likely to be portable to
Windows.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] askpass program for libpq

2013-06-15 Thread Daniel Farina
On Sat, Jun 15, 2013 at 8:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 Okay, I have a patch that does something *like* (but not the same) as
 this, and whose implementation is totally unreasonable, but it's
 enough to get a sense of how the whole thing feels.  Critically, it
 goes beyond askpass, instead allowing a shell-command based hook for
 arbitrary interpretation and rewriting of connection info...such as
 the 'host' libpq keyword.  I have called it, without much thought, a
 'resolver'.  In this way, it's closer to the libpq 'service' facility,
 except with addition of complete control of the interpretation of
 user-provided notation.

 Hello everyone,

 I'm sort of thinking of attacking this problem again, does anyone have
 an opinion or any words of (en/dis)couragement to continue?  The
 implementation I posted is bogus but is reasonable to feel around
 with, but I'm curious besides its obvious defects as to what the
 temperature of opinion is.

 Most generally, I think the benefits are strongest in dealing with:

 * Security: out-of-band secrets will just prevent people from pasting
   important stuff all over the place, as I see despairingly often
   today.

 * Client-side Proxies: pgbouncer comes to mind, a variation being used
   on production applications right now that uses full-blown
   preprocessing of the user environment (only possible in a
   environment with certain assumptions like Heroku)
   https://github.com/gregburek/heroku-buildpack-pgbouncer seems very
   promising and effective, but it'd be nice to confer the same
   benefits to everyone else, too.

 * HA: one of the most annoying problems in HA is naming things.  Yes,
   this could be solved with other forms of common dynamic binding DNS
   or Virtual IP (sometimes), but these both are pretty complicated and
   carry baggage and pitfalls, but as long as there is dynamic binding
   of the credentials, I'm thinking it may make sense to have dynamic
   binding of net locations, too.

 * Cross-server references

   This is basically the issues seen in HA and Security, but on
   (horrible) steroids: the spate of features making Postgres work
   cross-server (older features like dblink, but now also new ones like
   FDWs and Writable FDWs) make complex interconnection between servers
   more likely and problematic, especially if one has standbys where
   there is a delay in catalog propagation from a primary to standby
   with new connection info.

   So, an out of band way where one can adjust the dynamic binding
   seems useful there.

 TBH, I see no clear reason to think that a connection-string rewriter
 solves any of those problems.  At best it would move them somewhere else.

Yes, that's exactly what I want to achieve: moving them somewhere else
that can be held in common by client applications.

 Nor is it clear that any of this should be libpq's business, as opposed
 to something an application might do before invoking libpq.

Yes, it's unclear.  I have only arrived at seriously exploring this
after trying my very best to meditate on other options.  In addition,
sometimes 'the application' is Postgres, and with diverse access paths
like FDWs and dblink, which may not be so easy to adjust, and it would
seem strange to adjust them in a way that can't be shared in common
with regular non-backend-linked client applications.

Also, it seems like a very high bar to set for an application to make
use of environment keychains or environment-specific high-availability
retargeting.  This general approach you mention is used
in Greg Burek's heroku-pgbouncer buildpack, but it took significant
work to iron out (and probably still needs more ironing, although it
seems to work great) and only serves the Heroku-verse.

Basically, the needs seem very similar, so abstracting seems to me
profitable.

This is not that different in principle from pgpass and the services
file in that regard, except taking the final step to delegate their
function...and deliver full control over notation.
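
For comparison, the indirection that already exists today looks roughly
like this (the entries below are made-up examples of the standard
pg_service.conf and .pgpass formats):

    # ~/.pg_service.conf -- looked up via the 'service' conninfo keyword
    [orders-db]
    host=db1.internal
    port=5432
    dbname=orders

    # ~/.pgpass -- hostname:port:database:username:password
    db1.internal:5432:orders:app_user:sekret

    $ psql "service=orders-db user=app_user"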

Although I don't know much about it, I seem to recall that VMWare felt
inclined to instigate some kind of vaguely related solution to solve a
similar problem in a custom libpq.  After initial recoil and over a
year of contemplation, I think the reasons are better justified
than I originally thought and it'd be nice to de-weirdify such
approaches.

 Also, I think a facility dependent on invoking a shell command is
 (a) wide open for security problems, and (b) not likely to be
 portable to Windows.

Yeah, those things occurred to me; I think a dlopen-based mechanism is
a more likely solution than the shell-command one.  The latter just
let me get started quickly to experiment.  Would a rigorous proposal
about how to do that help the matter?  I mostly wanted to get the
temperature before thinking about Real Mechanisms.
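
For the dlopen flavor, I imagine something shaped roughly like the sketch
below; the symbol name and calling convention are invented here just to
show the moving parts, not a proposal for the actual API:

    /* Hypothetical resolver hook -- none of these names exist in libpq. */
    #include <dlfcn.h>
    #include <stddef.h>

    typedef int (*pq_resolver_fn) (const char *conninfo,
                                   char *rewritten, size_t rewritten_len);

    static int
    run_resolver(const char *sopath, const char *conninfo,
                 char *rewritten, size_t rewritten_len)
    {
        void       *handle = dlopen(sopath, RTLD_NOW | RTLD_LOCAL);
        pq_resolver_fn resolve;

        if (handle == NULL)
            return -1;
        resolve = (pq_resolver_fn) dlsym(handle, "pq_resolve");
        if (resolve == NULL)
        {
            dlclose(handle);
            return -1;
        }
        /* e.g. rewrite "host=orders-logical" into "host=10.0.0.5 port=6432" */
        return resolve(conninfo, rewritten, rewritten_len);
    }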


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] request a new feature in fuzzystrmatch

2013-06-15 Thread Josh Berkus
Liming,

I've added your proposal into the commitfest so that you can get a
review in this commitfest.  However, you need to generate a context diff
and submit it to this mailing list for it to really get reviewed.
That's a requirement.
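
One common way to produce a context diff from a git checkout is with
patchutils' filterdiff; the branch and file names below are just
placeholders:

    $ git diff master..my-feature-branch > feature.patch
    $ filterdiff --format=context feature.patch > feature.context.diff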

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Amit kapila
On Saturday, June 15, 2013 8:29 PM Sawada Masahiko wrote:
On Sat, Jun 15, 2013 at 10:34 PM, Amit kapila amit.kap...@huawei.com wrote:

 On Saturday, June 15, 2013 1:19 PM Sawada Masahiko wrote:
 On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
 Hello,

 We have already started a discussion on pgsql-hackers for the problem of
 taking a fresh backup during the failback operation; here is the link for
 that:


 http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtb
 jgwrfu513...@mail.gmail.com

 Let me again summarize the problem we are trying to address.


   How will you take care of extra WAL on the old master during recovery?
   If it replays the WAL which has not reached the new master, it can be a problem.

 you mean that it is possible that the old master's data is ahead of the new
 master's data.

   What I mean to say is that the WAL of the old master can be ahead of
   the new master's.  I understood that the data files of the old master
   can't be ahead, but I think the WAL can be ahead.

 so there is inconsistent data between those servers when failing back, right?
 if so, that inconsistency is not possible, because if you use the GUC option
 as he proposes (i.e., failback_safe_standby_mode = remote_flush),
 then while the old master is working fine, no file-system-level changes are
 done before the WAL is replicated.

 Would the proposed patch take care that the old master's WAL is also not
 ahead in some way?
 If yes, I think I am missing some point.

 yes, it can happen that the old master's WAL is ahead of the new master's
 WAL, as you said,
 but I think that we can solve that by deleting all WAL files when the old
 master starts as the new standby.

I think ideally it should reset the WAL location to the point where the new
master forked off.
In such a scenario it would be difficult for a user who wants to get a dump
of some data in the old master which hasn't gone to the new master.  I am
not sure whether such a need exists for real users, but if it does, then
this solution will have some drawbacks.
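
To illustrate the divergence (the LSNs below are made up): after promotion,
the new master records the point where it forked off in a timeline history
file, and any WAL the old master generated past that point on the old
timeline is exactly the part that must not be replayed:

    $ cat $PGDATA/pg_xlog/00000002.history
    1	0/5000090	no recovery target specified

    # The old master may still have WAL beyond 0/5000090 on timeline 1 that
    # the new master never received; that is the WAL which would need to be
    # discarded (or rewound) before it rejoins as a standby.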

With Regards,
Amit Kapila.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] request a new feature in fuzzystrmatch

2013-06-15 Thread Liming Hu

On 6/15/2013 9:51 PM, Josh Berkus wrote:

Liming,

I've added your proposal in to the commitfest so that you can get a
review this commitfest.  However, you need to generate a context diff
and submit it to this mailing list for it to really get reviewed.
That's a requirement.


Hi Josh,

Thanks. Where can I see the proposal? Thanks.

Liming


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers