Re: [HACKERS] Hash id in pg_stat_statements
On Mon, Oct 1, 2012 at 12:57 AM, Magnus Hagander wrote:
> Can we please expose the internal hash id of the statements in
> pg_stat_statements?
>
> I know there was discussion about it earlier, and it wasn't done, with an
> argument of it not being stable between releases (IIRC). I think we can
> live with that drawback, assuming of course that we document this
> properly.
>
> I've now run into multiple customer installations where it would be very
> useful to have. The use case is mainly storing snapshots of the
> pg_stat_statements output over time and analyzing those. Weird things
> happen, for example, when the query text is the same but the hash is
> different (which can happen, for example, when a table is dropped and
> recreated). And even without that, in order to do anything useful with
> it, you end up hashing the query text anyway - so using the already
> existing hash would be easier and more useful.

I have a similar problem; however, I am not sure the hash generated is ideal. Putting aside the mechanical reasons (versioning, shutdown/stats-file handling, etc.) given in the main branch of the thread, I also have the feeling that it is not what I want. Consider the following case:

SELECT * FROM users WHERE id = ?
SELECT * FROM users WHERE id = ?

In the intervening time, an entry with an equivalent hash could still be evicted and reintroduced and the statistics silently reset, and that will befuddle principled tools. This is worse than merely less useful, because it can lead to drastic underestimations that otherwise pass inspection.

Instead, I think it makes sense to assign a number -- arbitrarily, but uniquely -- to the generation of a new row in pg_stat_statements, and, on the flip side, whenever a row is retired its number should be retired, practically, forever. This way re-introductions between two samplings of pg_stat_statements cannot be confused with a contiguously maintained statistic on a query.
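A minimal sketch of the confusion described above, with made-up numbers (this is illustrative Python, not anything in the patch): a snapshot-diffing tool keyed only by the query hash cannot tell a contiguously maintained counter from an evict-and-reintroduce reset, whereas a never-reused row id makes the discontinuity explicit.

```python
# Hypothetical snapshots of pg_stat_statements, reduced to hash -> calls.
def delta(prev, curr):
    """Per-query new calls between two snapshots keyed by hash alone."""
    out = {}
    for key, calls in curr.items():
        before = prev.get(key, 0)
        # A reintroduced entry restarts its counter, so a naive tool either
        # sees a negative delta or (as here) clamps to the new count.
        out[key] = calls - before if calls >= before else calls
    return out

snap1 = {1234: 900}   # hash 1234 seen with 900 calls
# ...entry evicted, the query runs ~500 more times, row recreated at 40...
snap2 = {1234: 40}
assert delta(snap1, snap2) == {1234: 40}   # silently understates the activity

# With a never-reused row id in the key, the reset is detectable:
snap1_id = {(17, 1234): 900}   # (row_id, hash) -> calls
snap2_id = {(93, 1234): 40}    # new row_id => known discontinuity, not a delta
assert set(snap1_id) != set(snap2_id)
```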
--
fdr

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PQping command line tool
I was wondering recently whether there is any command line tool that utilizes PQping() or PQpingParams(). I searched the code and couldn't find anything, and was wondering whether there would be any interest in having something like this included. I wrote something for my own purposes of performing a health check, which also supports Nagios-style status output. It's probably convenient for scripting purposes as well. It's not currently ready for submission to a commitfest, but if there were interest I would clean it up so that it would be.
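For illustration only (this is not the proposed tool, which would be C using libpq directly): the Nagios-style part amounts to mapping libpq's four PGPing results onto the standard Nagios exit codes 0/1/2/3. A sketch in Python, with the PGPing constants mirrored by hand:

```python
# libpq's PGPing results, mirrored as plain constants for illustration.
PQPING_OK = 0           # server is accepting connections
PQPING_REJECT = 1       # server alive but refusing connections (e.g. startup)
PQPING_NO_RESPONSE = 2  # could not establish contact
PQPING_NO_ATTEMPT = 3   # bad parameters; no attempt was made

# Nagios plugin convention: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
NAGIOS_EXIT = {
    PQPING_OK: (0, "OK - server accepting connections"),
    PQPING_REJECT: (1, "WARNING - server rejecting connections"),
    PQPING_NO_RESPONSE: (2, "CRITICAL - no response from server"),
    PQPING_NO_ATTEMPT: (3, "UNKNOWN - no connection attempt made"),
}

def nagios_report(ping_result):
    """Print a one-line status and return the exit code a plugin would use."""
    code, message = NAGIOS_EXIT[ping_result]
    print(message)
    return code
```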
Re: [HACKERS] [9.1] 2 bugs with extensions
Excerpts from Dimitri Fontaine's message of lun oct 01 04:44:28 -0300 2012:
> Hi,
>
> Alvaro Herrera writes:
> >> Same for 9.2, attached. master needs yet another patch because of the
> >> recent headers reorg, it seems, that's for another day though.
> >
> > No, just remove the RELKIND_UNCATALOGUED case in that switch.
>
> Oh. As in the attached? :)

That seems to work, yes.

While testing this out I noticed that this silently does nothing:

SELECT pg_catalog.pg_extension_config_dump('my_config', NULL);

i.e. the table is not marked as configuration, but no error or warning appears, either. This seems rather surprising.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_upgrade does not completely honor --new-port
Applied to head and 9.2.

---

On Wed, Sep 26, 2012 at 10:06:50PM -0400, Bruce Momjian wrote:
> On Tue, Sep 25, 2012 at 05:36:54PM +0300, Devrim Gunduz wrote:
> >
> > Hi,
> >
> > I just performed a test upgrade from 9.1 to 9.2, and used the --new-port
> > variable. However, analyze_new_cluster.sh does not include the new port,
> > thus when I run it, it fails. Any chance to add the port number to the
> > script?
>
> Well, the reason people normally use the port number is to do a live
> check, but obviously when the script is created it isn't doing a check.
> I am worried that if I do embed the port number in there, then if they
> change the port after the upgrade, they now can't use the script. I
> assume users would have PGPORT set before running the script, no?
>
> > Also, is it worth adding the value specified in --new-bindir as a prefix
> > to the vacuumdb command in the same script?
>
> Wow, I never thought of adding a path to those scripts, but it certainly
> makes sense. I have created the attached patch which does this.
>
> --
>   Bruce Momjian        http://momjian.us
>   EnterpriseDB         http://enterprisedb.com
>
>   + It's impossible for everything to be true. +
>
> diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
> index bed10f8..2785eb7 100644
> --- a/contrib/pg_upgrade/check.c
> +++ b/contrib/pg_upgrade/check.c
> @@ -477,7 +477,7 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
>  			ECHO_QUOTE, ECHO_QUOTE);
>  	fprintf(script, "echo %sthis script and run:%s\n",
>  			ECHO_QUOTE, ECHO_QUOTE);
> -	fprintf(script, "echo %svacuumdb --all %s%s\n", ECHO_QUOTE,
> +	fprintf(script, "echo %s\"%s/vacuumdb\" --all %s%s\n", ECHO_QUOTE, new_cluster.bindir,
>  			/* Did we copy the free space files? */
>  			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
>  			"--analyze-only" : "--analyze", ECHO_QUOTE);
> @@ -498,7 +498,7 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
>  			ECHO_QUOTE, ECHO_QUOTE);
>  	fprintf(script, "echo %s--%s\n",
>  			ECHO_QUOTE, ECHO_QUOTE);
> -	fprintf(script, "vacuumdb --all --analyze-only\n");
> +	fprintf(script, "\"%s/vacuumdb\" --all --analyze-only\n", new_cluster.bindir);
>  	fprintf(script, "echo%s\n", ECHO_BLANK);
>  	fprintf(script, "echo %sThe server is now available with minimal optimizer statistics.%s\n",
>  			ECHO_QUOTE, ECHO_QUOTE);
> @@ -519,7 +519,7 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
>  			ECHO_QUOTE, ECHO_QUOTE);
>  	fprintf(script, "echo %s---%s\n",
>  			ECHO_QUOTE, ECHO_QUOTE);
> -	fprintf(script, "vacuumdb --all --analyze-only\n");
> +	fprintf(script, "\"%s/vacuumdb\" --all --analyze-only\n", new_cluster.bindir);
>  	fprintf(script, "echo%s\n\n", ECHO_BLANK);
>
>  #ifndef WIN32
> @@ -532,7 +532,7 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
>  			ECHO_QUOTE, ECHO_QUOTE);
>  	fprintf(script, "echo %s-%s\n",
>  			ECHO_QUOTE, ECHO_QUOTE);
> -	fprintf(script, "vacuumdb --all %s\n",
> +	fprintf(script, "\"%s/vacuumdb\" --all %s\n", new_cluster.bindir,
>  			/* Did we copy the free space files? */
>  			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
>  			"--analyze-only" : "--analyze");

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

  + It's impossible for everything to be true. +
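As an aside, the quoting the patch adds around %s/vacuumdb matters: an unquoted bindir containing spaces would be word-split by the shell running the generated script. A throwaway sketch of the line the patched fprintf emits (the path here is made up):

```python
# Mirrors the patched format string "\"%s/vacuumdb\" --all %s": the quotes
# keep a bindir containing spaces as a single shell word.
def vacuumdb_line(bindir, analyze_only=True):
    flag = "--analyze-only" if analyze_only else "--analyze"
    return '"%s/vacuumdb" --all %s' % (bindir, flag)

print(vacuumdb_line("/opt/postgres 9.2/bin"))
# "/opt/postgres 9.2/bin/vacuumdb" --all --analyze-only
```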
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
This case points to a weakness in many programming languages: not having a clear iff (if and only if) versus if construction. The sane use case for CREATE SCHEMA foo IF NOT EXISTS is building a database dynamically, where any of several points may be the first to put a table in a schema, and schemas should not be created if there are no objects (the create/search/drop design pattern having its own problems). Thus the construction should default to one behavior, and have an option for the second, e.g.

create schema foo if not exists (will not be done if foo existed)
create schema foo if not exists FORCE (will be done even if foo existed)

This would even allow for mixing the two, e.g.

create schema foo if not exists (tables that should be created once and not again) FORCE (objects a routine will add even if the schema exists)

On Oct 2, 2012, at 6:33 PM, Fabrízio de Royes Mello wrote:
> 2012/10/2 Alvaro Herrera
> > Excerpts from Andrew Dunstan's message of mar oct 02 17:24:38 -0300 2012:
> > > On 10/02/2012 03:48 PM, Tom Lane wrote:
> > > > Alvaro Herrera writes:
> > > > > Well, if that's the rationale then you end up with no schema foo at all
> > > > > (i.e. both die), which seems even more surprising (though I admit it has
> > > > > the advantage of being a simple rule to document.)
> > > > I think we should just disallow putting any contained objects in the
> > > > statement when IF NOT EXISTS is used. It's simple to understand, simple
> > > > to document and implement, and I think it covers all the sane use-cases
> > > > anyway.
> > > I thought we'd already agreed on this.
> > Well, it's not what the latest proposed patch implements.
>
> You're right... the latest proposed patch doesn't implement it.
>
> I'll change the patch and send it soon...
>
> Regards,
>
> --
> Fabrízio de Royes Mello
> Consultoria/Coaching PostgreSQL
> >> Blog sobre TI: http://fabriziomello.blogspot.com
> >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
> >> Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
2012/10/2 Alvaro Herrera
> Excerpts from Andrew Dunstan's message of mar oct 02 17:24:38 -0300 2012:
> > On 10/02/2012 03:48 PM, Tom Lane wrote:
> > > Alvaro Herrera writes:
> > > > Well, if that's the rationale then you end up with no schema foo at all
> > > > (i.e. both die), which seems even more surprising (though I admit it has
> > > > the advantage of being a simple rule to document.)
> > > I think we should just disallow putting any contained objects in the
> > > statement when IF NOT EXISTS is used. It's simple to understand, simple
> > > to document and implement, and I think it covers all the sane use-cases
> > > anyway.
> > I thought we'd already agreed on this.
> Well, it's not what the latest proposed patch implements.

You're right... the latest proposed patch doesn't implement it.

I'll change the patch and send it soon...

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] Hash id in pg_stat_statements
On Tue, Oct 02, 2012 at 12:58:15PM -0400, Stephen Frost wrote:
> > I simply do not understand objections to the proposal. Have I missed
> > something?
>
> It was my impression that the concern is the stability of the hash value
> and ensuring that tools which operate on it don't mistakenly lump two
> different queries into one because they had the same hash value (caused
> by a change in our hashing algorithm or its input over time, e.g. in a
> point release). I was hoping to address that to allow this proposal to
> move forward.

That makes no sense, though. The moment you talk about a "hash" you accept the possibility of lumping together things that aren't the same. Any tools using these hashes must have realised this.

Fortunately, the statistics are better than the birthday paradox suggests. The chance that the two most important queries in your system end up having the same hash is minuscule. As mentioned elsewhere, a system with more than 10,000 different queries sounds rare to me (once you exclude query parameters, of course).

Have a nice day,

--
Martijn van Oosterhout        http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts. -- Arthur Schopenhauer
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
Excerpts from Andrew Dunstan's message of mar oct 02 17:24:38 -0300 2012:
> On 10/02/2012 03:48 PM, Tom Lane wrote:
> > Alvaro Herrera writes:
> > > Well, if that's the rationale then you end up with no schema foo at all
> > > (i.e. both die), which seems even more surprising (though I admit it has
> > > the advantage of being a simple rule to document.)
> > I think we should just disallow putting any contained objects in the
> > statement when IF NOT EXISTS is used. It's simple to understand, simple
> > to document and implement, and I think it covers all the sane use-cases
> > anyway.
> I thought we'd already agreed on this.

Well, it's not what the latest proposed patch implements.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
On 10/02/2012 03:48 PM, Tom Lane wrote:
> Alvaro Herrera writes:
> > Excerpts from David E. Wheeler's message of mar oct 02 16:16:37 -0300 2012:
> > > On Oct 2, 2012, at 12:08 PM, Alvaro Herrera wrote:
> > > > create schema if not exists foo create table first (a int);
> > > > create schema if not exists foo create table second (a int);
> > > >
> > > > As far as I can see, with the patch as it currently stands, you would
> > > > end up with only table "first" in the schema, which seems very
> > > > surprising to me.
> > > Yeah, I think the second should die. CINE should only work if there are
> > > no other objects created as part of the statement, IMHO.
> > Well, if that's the rationale then you end up with no schema foo at all
> > (i.e. both die), which seems even more surprising (though I admit it has
> > the advantage of being a simple rule to document.)
> I think we should just disallow putting any contained objects in the
> statement when IF NOT EXISTS is used. It's simple to understand, simple
> to document and implement, and I think it covers all the sane use-cases
> anyway.

I thought we'd already agreed on this. +1.

cheers

andrew
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
On Oct 2, 2012, at 12:48 PM, Tom Lane wrote:
> I think we should just disallow putting any contained objects in the
> statement when IF NOT EXISTS is used. It's simple to understand, simple
> to document and implement, and I think it covers all the sane use-cases
> anyway.

+1

David
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
Excerpts from David E. Wheeler's message of mar oct 02 16:37:30 -0300 2012:
> On Oct 2, 2012, at 12:30 PM, Alvaro Herrera wrote:
> > How about this for precedent:
> >
> > mkdir -p /tmp/foo/bar
> > mkdir -p /tmp/foo/baz
> >
> > In this case you end up with directory "foo" and at least two subdirs in
> > it, bar and baz. This works even if /tmp/foo existed previously and even
> > if there was some other stuff in it.
>
> Well, what about this, then?
>
> create schema if not exists foo create table second (a int);
> create schema if not exists foo create table second (b int);

Yes, exactly -- what about this case? This is precisely the reason we don't have CREATE TABLE IF NOT EXISTS. I don't know what the best answer is. Most people seem to think that the answer ought to be that you end up with a single column second.a, and the second command errors out. So if you do this:

create schema if not exists foo create table first (a int);
create schema if not exists foo create table first (b int), create table second (a int);

you end up with *only* the first table, because the second command errors out when the first table is observed to exist. Now, what if you were to do this instead:

create schema if not exists foo create table if not exists first (a int);
create schema if not exists foo create table if not exists first (b int), create table if not exists second (a int);

Then you end up with first.a and second.a.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
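The two outcomes enumerated above can be checked against a toy model (plain Python, deliberately not PostgreSQL semantics: the schema is a dict, and an error aborts the rest of the statement but not what was already created):

```python
def create_schema_ine(db, schema, tables, table_ine=False):
    """Toy CREATE SCHEMA IF NOT EXISTS with contained tables.

    tables: list of (name, cols). Raises on the first pre-existing table
    unless table_ine (per-table IF NOT EXISTS) is set."""
    db.setdefault(schema, {})        # schema part never errors
    for name, cols in tables:
        if name in db[schema]:
            if table_ine:
                continue             # skip silently, keep processing
            raise ValueError("table %r already exists" % name)
        db[schema][name] = cols

# Without per-table IF NOT EXISTS: second command errors out entirely.
db = {}
create_schema_ine(db, "foo", [("first", ["a"])])
try:
    create_schema_ine(db, "foo", [("first", ["b"]), ("second", ["a"])])
except ValueError:
    pass
assert db == {"foo": {"first": ["a"]}}   # only the first table exists

# With per-table IF NOT EXISTS: first.a is kept, second.a gets created.
db2 = {}
create_schema_ine(db2, "foo", [("first", ["a"])], table_ine=True)
create_schema_ine(db2, "foo", [("first", ["b"]), ("second", ["a"])],
                  table_ine=True)
assert db2 == {"foo": {"first": ["a"], "second": ["a"]}}
```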
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
Alvaro Herrera writes:
> Excerpts from David E. Wheeler's message of mar oct 02 16:16:37 -0300 2012:
> > On Oct 2, 2012, at 12:08 PM, Alvaro Herrera wrote:
> > > create schema if not exists foo create table first (a int);
> > > create schema if not exists foo create table second (a int);
> > >
> > > As far as I can see, with the patch as it currently stands, you would
> > > end up with only table "first" in the schema, which seems very
> > > surprising to me.
> > Yeah, I think the second should die. CINE should only work if there are
> > no other objects created as part of the statement, IMHO.
> Well, if that's the rationale then you end up with no schema foo at all
> (i.e. both die), which seems even more surprising (though I admit it has
> the advantage of being a simple rule to document.)

I think we should just disallow putting any contained objects in the statement when IF NOT EXISTS is used. It's simple to understand, simple to document and implement, and I think it covers all the sane use-cases anyway.

			regards, tom lane
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
On Oct 2, 2012, at 12:30 PM, Alvaro Herrera wrote:
> How about this for precedent:
>
> mkdir -p /tmp/foo/bar
> mkdir -p /tmp/foo/baz
>
> In this case you end up with directory "foo" and at least two subdirs in
> it, bar and baz. This works even if /tmp/foo existed previously and even
> if there was some other stuff in it.

Well, what about this, then?

create schema if not exists foo create table second (a int);
create schema if not exists foo create table second (b int);

David
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
Excerpts from David E. Wheeler's message of mar oct 02 16:16:37 -0300 2012:
> On Oct 2, 2012, at 12:08 PM, Alvaro Herrera wrote:
> > create schema if not exists foo create table first (a int);
> > create schema if not exists foo create table second (a int);
> >
> > As far as I can see, with the patch as it currently stands, you would
> > end up with only table "first" in the schema, which seems very
> > surprising to me.
>
> Yeah, I think the second should die. CINE should only work if there are no
> other objects created as part of the statement, IMHO.

Well, if that's the rationale then you end up with no schema foo at all (i.e. both die), which seems even more surprising (though I admit it has the advantage of being a simple rule to document.)

How about this for precedent:

mkdir -p /tmp/foo/bar
mkdir -p /tmp/foo/baz

In this case you end up with directory "foo" and at least two subdirs in it, bar and baz. This works even if /tmp/foo existed previously and even if there was some other stuff in it.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
On Oct 2, 2012, at 12:08 PM, Alvaro Herrera wrote:
> create schema if not exists foo create table first (a int);
> create schema if not exists foo create table second (a int);
>
> As far as I can see, with the patch as it currently stands, you would
> end up with only table "first" in the schema, which seems very
> surprising to me.

Yeah, I think the second should die. CINE should only work if there are no other objects created as part of the statement, IMHO.

Best,

David
Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing?
> -----Original Message-----
> From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
> ow...@postgresql.org] On Behalf Of Noah Misch
> Sent: Tuesday, October 02, 2012 3:02 PM
> To: Craig Ringer
> Cc: PostgreSQL Hackers
> Subject: Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing?
>
> On Tue, Aug 21, 2012 at 02:31:29PM +0800, Craig Ringer wrote:
> > It'd really help if REVOKE consistently raised warnings when it didn't
> > actually revoke anything.
>
> +1
>
> This will invite the same mixed feelings as the CREATE x IF NOT EXISTS
> notices, but I think it's worthwhile.
>
> > Even better, a special case for REVOKEs on objects that only have
> > owner and public permissions could say:
> >
> > WARNING: REVOKE didn't remove any permissions for user <user>. This
> > <object> has default permissions, so there were no GRANTs for user
> > <user> to revoke. See the documentation for REVOKE for more
> > information.
>
> The extra aid from saying those particular things is not clear to me.
>
> It might be overkill, but we could report any other roles indirectly
> conveying access to the named role.

Having been bitten by this myself, I do see the value in such a warning. It is not uncommon for someone using REVOKE to believe they are installing a block rather than removing an allowance, especially as it interacts with default permissions.

That said (and this is an off-the-cuff thought), the entire UI for permissions, and its treatment in the documentation, seems to be fact oriented. The system is well documented, but actually getting up to speed to learn and use it is still a matter of reading the documentation and figuring out how everything fits together. I haven't given it that much thought, but I am curious whether others are of the same opinion. IOW, this proposal is an attempt to fix a symptom without addressing the root cause.

Food for thought.

David J.
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
Excerpts from Volker Grabsch's message of sáb sep 29 06:32:13 -0300 2012:
> Dickson S. Guedes schrieb:
> > - https://commitfest.postgresql.org/action/patch_view?id=907
> >
> > The patch is small and implements a new syntax for CREATE SCHEMA that
> > allows the creation of a schema to be skipped when IF NOT EXISTS is
> > used.
> >
> > [...]
> >
> > - Should this patch implement other INEs, like ADD COLUMN IF NOT EXISTS?
>
> If there's still a chance to improve the patch, I'd love to see the
> following INEs implemented. Several real-world database upgrade scripts
> would benefit from those:

I don't see that this patch is responsible for such new commands. If you want them, feel free to submit separate patches for them (or have someone else do it for you). But see the thread starting at http://archives.postgresql.org/message-id/467881.79137.qm%40web27104.mail.ukl.yahoo.com

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Hash id in pg_stat_statements
On 2 October 2012 17:58, Stephen Frost wrote:
> Right, and that's all I'm trying to address here - how do we provide a
> value for a given query which can be relied upon by outside sources,
> even in the face of a point release which changes what our internal hash
> value for a given query is.

I don't know of a way. Presumably, we'd hope to avoid this, and would look for alternatives to anything that would necessitate bumping PGSS_FILE_HEADER, while not going so far as to let pg_stat_statements contort things in the core system. If I were aware of a case where this would have come up had pg_stat_statements fingerprinting been around at the time, perhaps I could give a better answer than that.

--
Peter Geoghegan               http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
The fundamental issue with this patch hasn't been answered sufficiently, I think. Consider the following sequence of commands:

create schema if not exists foo create table first (a int);
create schema if not exists foo create table second (a int);

As far as I can see, with the patch as it currently stands, you would end up with only table "first" in the schema, which seems very surprising to me. I think this needs more thought, and in any case it needs more comprehensive regression tests and documentation (i.e. at least the examples ought to explain what would happen in such cases).

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing?
On Tue, Aug 21, 2012 at 02:31:29PM +0800, Craig Ringer wrote:
> It'd really help if REVOKE consistently raised warnings when it didn't
> actually revoke anything.

+1

This will invite the same mixed feelings as the CREATE x IF NOT EXISTS notices, but I think it's worthwhile.

> Even better, a special case for REVOKEs on objects that only have owner
> and public permissions could say:
>
> WARNING: REVOKE didn't remove any permissions for user <user>. This
> <object> has default permissions, so there were no GRANTs for user
> <user> to revoke. See the documentation for REVOKE for more information.

The extra aid from saying those particular things is not clear to me.

It might be overkill, but we could report any other roles indirectly conveying access to the named role.
Re: [HACKERS] Hash id in pg_stat_statements
On 2 October 2012 18:16, Tom Lane wrote:
> 1. Why isn't something like md5() on the reported query text an equally
> good solution for users who want a query hash?

Because that does not uniquely identify the entry. The very first thing that the docs say on search_path is "Qualified names are tedious to write, and it's often best not to wire a particular schema name into applications anyway". Presumably, the reason it's best not to wire schema names into apps is that it might be useful to modify search_path in a way that dynamically made the same queries in some application reference what are technically distinct relations. If anyone does this, and it seems likely that many do for various reasons, they will be out of luck when using some kind of pg_stat_statements aggregation. This was the behaviour that I intended for pg_stat_statements all along, and I think it's better than a solution that matches query strings.

> 2. If people are going to accumulate stats on queries over a long period
> of time, is a 32-bit hash really good enough for the purpose? If I'm
> doing the math right, the chance of collision is already greater than 1%
> at 10,000 queries, and rises to about 70% for 100,000 queries; see
> http://en.wikipedia.org/wiki/Birthday_paradox
> We discussed this issue and decided it was okay for pg_stat_statements's
> internal hash table, but it's not at all clear to me that it's sensible
> to use 32-bit hashes for external accumulation of query stats.

Well, forgive me for pointing this out, but I did propose that the hash be a 64-bit value (which would have necessitated adapting hash_any() to produce 64-bit values), but you rejected the proposal. I arrived at the same probability for a collision as you did and posted it to the list, in discussion shortly after the normalisation stuff was committed.
A more sensible way of assessing the risk of a collision would be to try to come up with the probability of a collision that someone actually ends up caring about, which is considerably less than the 1% for 10,000 entries. I'm not being glib - people are very used to the idea that aggregating information on query costs is a lossy process. Prior to 9.2, the only way execution costs could reasonably be measured at the query granularity on a busy system was to set log_min_duration_statement to something like 1 second.

I am also unconvinced by the idea that aggregating historical data (with the same hash value) in a separate application is likely to make the collision situation appreciably worse. People are going to be using something like an RRD circular buffer to aggregate the information, and I can't see anyone caring about detailed information that is more than a couple of weeks in the past. The point of aggregation isn't to store more queries, it's to construct time-series data from snapshots. Besides, do most applications really even have more than 10,000 distinct queries?

--
Peter Geoghegan               http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
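For reference, the figures being traded in this exchange follow from the standard birthday-paradox approximation, p ≈ 1 - exp(-n(n-1)/2d), for n hashes drawn uniformly from d = 2^bits values:

```python
import math

def collision_prob(n, bits=32):
    """Approximate probability that at least two of n uniformly random
    bits-wide hashes collide (birthday-paradox approximation)."""
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * 2 ** bits))

print("%.3f" % collision_prob(10000))    # 0.012 - just over the 1% quoted
print("%.3f" % collision_prob(100000))   # 0.688 - roughly the 70% quoted
print("%.6f" % collision_prob(10000, bits=64))  # 0.000000 - a 64-bit hash
```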
Re: [HACKERS] Prevent restored WAL files from being archived again Re: Unnecessary WAL archiving after failover
On Wed, Oct 3, 2012 at 3:11 AM, Simon Riggs wrote:
> On 2 October 2012 19:06, Fujii Masao wrote:
> > On Sat, Aug 11, 2012 at 2:19 AM, Fujii Masao wrote:
> > > On Thu, Aug 9, 2012 at 8:08 AM, Simon Riggs wrote:
> > > > On 29 July 2012 16:01, Fujii Masao wrote:
> > > > > Attached patch changes the startup process so that it creates a
> > > > > .done file whenever a WAL file is successfully restored, whether
> > > > > archive mode is enabled or not. The restored WAL files will not
> > > > > be archived again because of the .done file.
> > > > The proposed patch works, for archiving only, but I don't like the
> > > > code. It's a partial refactoring of existing code. I prefer to go
> > > > for a full refactoring version for HEAD, and a zero-refactoring
> > > > version for 9.2, since we're deep into beta.
> > Isn't it time to push the full refactoring version to HEAD? If there
> > is no such version yet, what about pushing the zero-refactoring
> > version for now?
> If you send a rebased patch, I'll review,

Okay. Will do. The patch needs to be revised to correspond with the recent split of xlog.c.

> but it's not high on my radar right now unless you can explain why it
> should be higher.

It may not be high, but I'm just worried that we are likely to forget to apply that change to HEAD if we postpone it further.

Regards,

--
Fujii Masao
Re: [HACKERS] Prevent restored WAL files from being archived again Re: Unnecessary WAL archiving after failover
On 2 October 2012 19:06, Fujii Masao wrote:
> On Sat, Aug 11, 2012 at 2:19 AM, Fujii Masao wrote:
> > On Thu, Aug 9, 2012 at 8:08 AM, Simon Riggs wrote:
> > > On 29 July 2012 16:01, Fujii Masao wrote:
> > > > Attached patch changes the startup process so that it creates a
> > > > .done file whenever a WAL file is successfully restored, whether
> > > > archive mode is enabled or not. The restored WAL files will not be
> > > > archived again because of the .done file.
> > > The proposed patch works, for archiving only, but I don't like the
> > > code. It's a partial refactoring of existing code.
> > >
> > > I prefer to go for a full refactoring version for HEAD, and a zero
> > > refactoring version for 9.2 since we're deep into beta.
>
> Isn't it time to push the full refactoring version to HEAD? If there is
> no such version yet, what about pushing the zero refactoring version
> for now?

If you send a rebased patch, I'll review, but it's not high on my radar right now unless you can explain why it should be higher.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] review: pgbench - aggregation of info written into log
Hello I did review of pgbench patch http://archives.postgresql.org/pgsql-hackers/2012-09/msg00737.php * the patch applies cleanly * I had a problem with doc make[1]: Entering directory `/home/pavel/src/postgresql/doc/src/sgml' openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -c /usr/share/sgml/docbook/dsssl-stylesheets/catalog -d stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml openjade:pgbench.sgml:767:8:E: document type does not allow element "TITLE" here; missing one of "ABSTRACT", "AUTHORBLURB", "MSGSET", "CALLOUTLIST", "ITEMIZEDLIST", "ORDEREDLIST", "SEGMENTEDLIST", "VARIABLELIST", "CAUTION", "IMPORTANT", "NOTE", "TIP", "WARNING", "FORMALPARA", "BLOCKQUOTE", "EQUATION", "EXAMPLE", "FIGURE", "TABLE", "PROCEDURE", "SIDEBAR", "QANDASET", "REFSECT3" start-tag make[1]: *** [HTML.index] Error 1 make[1]: *** Deleting file `HTML.index' make[1]: Leaving directory `/home/pavel/src/postgresql/doc/src/sgml * pgbench is compiled without warnings * there are a few issues in the source code + + /* should we aggregate the results or not? */ + if (use_log_agg) + { + + /* are we still in the same interval? if yes, accumulate the +* values (print them otherwise) */ + if (agg->start_time + use_log_agg >= INSTR_TIME_GET_DOUBLE(now)) + { + * the real time range for aggregation is dynamic (pgbench is not a realtime application), so I am not sure the following constraint makes sense + if ((duration > 0) && (use_log_agg > 0) && (duration % use_log_agg != 0)) { + fprintf(stderr, "duration (%d) must be a multiple of aggregation interval (%d)\n", duration, use_log_agg); + exit(1); + } * it doesn't flush the last aggregated data (it is mentioned by Tomas: "Also, I've realized the last interval may not be logged at all - I'll take a look into this in the next version of the patch."). And it can be significant for higher values of "use_log_agg" * the name of the variable "use_log_agg" is not good - maybe "log_agg_interval" ? 
Regards Pavel Stehule -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Prevent restored WAL files from being archived again Re: Unnecessary WAL archiving after failover
On Sat, Aug 11, 2012 at 2:19 AM, Fujii Masao wrote: > On Thu, Aug 9, 2012 at 8:08 AM, Simon Riggs wrote: >> On 29 July 2012 16:01, Fujii Masao wrote: >> >>> Attached patch changes the startup process so that it creates .done file >>> whenever WAL file is successfully restored, whether archive mode is >>> enabled or not. The restored WAL files will not be archived again because >>> of .done file. >> >> The proposed patch works, for archiving only, but I don't like the >> code. It's a partial refactoring of existing code. >> >> I prefer to go for a full re-factoring version for HEAD, and a zero >> refactoring version for 9.2 since we're deep into beta. Isn't it time to push the full re-factoring version to HEAD? If there is no such version yet, what about pushing the zero refactoring version for now? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrect behaviour when using a GiST index on points
On Tue, Aug 28, 2012 at 05:04:09PM -0400, Tom Lane wrote: > Robert Haas writes: > > On Mon, Aug 27, 2012 at 7:43 PM, Tom Lane wrote: > >> There's also the big-picture question of whether we should just get rid > >> of fuzzy comparisons in the geometric types instead of trying to hack > >> indexes to work around them. > > > +1 for that approach, but only if I don't have to do the work. I agree in the abstract; why should a point (position on a 2D plane) compare fuzzily while a float8 (position on a 1D number line) does not? But ... > > Otherwise, +1 for doing the simplest thing that we're sure will > > eliminate wrong answers. > > What we're forced to speculate about here is how many applications out > there are relying on fuzzy comparison to get answers they like, versus > how many are getting answers they don't like because of it. The fact > that the underlying storage is float8 not numeric suggests there are > probably some cases where fuzzy is helpful. ... yes. Having never used these types in practice, I won't venture a guess. Anyone else? > I've never cared for the particulars of the way the fuzzy comparisons > are done, in any case: using an absolute rather than relative error > threshold is wrong according to every numerical analysis principle > I know. Definitely. > The long and the short of it is that it will probably take a significant > investment of work to make something that's clearly better. If that > weren't the case, we'd have done something long ago. In any event, I think we should entertain a patch to make the GiST operator class methods bug-compatible with corresponding operators. Even if we decide to change operator behavior in HEAD, the back branches could use it. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash id in pg_stat_statements
Stephen Frost writes: > * Peter Geoghegan (pe...@2ndquadrant.com) wrote: >> I simply do not understand objections to the proposal. Have I missed >> something? > It was my impression that the concern is the stability of the hash value > and ensuring that tools which operate on it don't mistakenly lump two > different queries into one because they had the same hash value (caused > by a change in our hashing algorithm or input into it over time, eg a > point release). I was hoping to address that to allow this proposal to > move forward.. I think there are at least two questions that ought to be answered: 1. Why isn't something like md5() on the reported query text an equally good solution for users who want a query hash? 2. If people are going to accumulate stats on queries over a long period of time, is a 32-bit hash really good enough for the purpose? If I'm doing the math right, the chance of collision is already greater than 1% at 10,000 queries, and rises to about 70% for 100,000 queries; see http://en.wikipedia.org/wiki/Birthday_paradox We discussed this issue and decided it was okay for pg_stat_statements's internal hash table, but it's not at all clear to me that it's sensible to use 32-bit hashes for external accumulation of query stats. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash id in pg_stat_statements
* Peter Geoghegan (pe...@2ndquadrant.com) wrote: > On 1 October 2012 18:05, Stephen Frost wrote: > > You're going to have to help me here, 'cause I don't see how there can > > be duplicates if we include the PGSS_FILE_HEADER as part of the hash, > > unless we're planning to keep PGSS_FILE_HEADER constant while we change > > what the hash value is for a given query, yet that goes against the > > assumptions that were laid out, aiui. > > Well, they wouldn't be duplicates if you think that the fact that one > query was executed before some point release and another after ought > to differentiate queries. I do not. This would only be if we happened to change what hash was generated for a given query during such a point release, where I share your feeling that it ought to be quite rare. I'm not suggesting we do this for every point release... > By invalidate, I mean that when we go to open the saved file, if the > header doesn't match, the file is considered corrupt, and we simply > log that the file could not be read, before unlinking it. This would > be necessary in the unlikely event of there being some substantive > change in the representation of query trees in a point release. I am > not aware of any precedent for this, though Tom said that there was > one. Right, and that's all I'm trying to address here: how do we provide a value for a given query which can be relied upon by outside sources, even in the face of a point release which changes what our internal hash value for a given query is. > I don't want to get too hung up on what we'd do if this problem > actually occurred, because that isn't what this thread is about. [...] > I simply do not understand objections to the proposal. Have I missed > something? 
It was my impression that the concern is the stability of the hash value and ensuring that tools which operate on it don't mistakenly lump two different queries into one because they had the same hash value (caused by a change in our hashing algorithm or input into it over time, eg a point release). I was hoping to address that to allow this proposal to move forward.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] xmalloc => pg_malloc
On Tuesday, October 02, 2012 06:30:33 PM Tom Lane wrote: > Andres Freund writes: > >> pg_calloc (randomly different API for pg_malloc0) > > > > Do we need this? > > I thought about getting rid of it, but there are some dozens of calls > scattered across several files, so I wasn't sure it was worth it. I always found calloc to be a pointless API, but by now I have learned what it means, so I don't have a strong opinion. > > I wonder whether the same set of functions should also be available in > > the backend with ereport(EC_OUT_OF_MEMORY, ...) behaviour as well. > > In the backend, you almost always ought to be using palloc instead. > The only places where it's really appropriate to be using malloc > directly are where you don't want an error thrown for out-of-memory. > So I think providing these in the backend would do little except to > encourage bad programming. We seem to have 100+ usages of malloc() anyway. Several of those are in helper libraries like regex/* though. A quick grep shows most of the others to be valid in my opinion. Mostly it's allocating memory which is never deallocated but is too big to allocate unconditionally. The quick grep showed that at least src/backend/utils/misc/ps_status.c doesn't properly check the return value. Others (mctx.c) use Asserts... Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xmalloc => pg_malloc
On Tue, Oct 2, 2012 at 12:30 PM, Tom Lane wrote: > Andres Freund writes: >>> pg_calloc(randomly different API for pg_malloc0) > >> Do we need this? > > I thought about getting rid of it, but there are some dozens of calls > scattered across several files, so I wasn't sure it was worth it. > Anybody else have an opinion? I think having more than 1 function that does the same thing is generally a bad idea. It sounds like it is going to cause confusion and provide no real benefit. > >> I wonder whether the same set of functions should also be available in the >> backend with ereport(EC_OUT_OF_MEMORY, ...) behaviour as well. > > In the backend, you almost always ought to be using palloc instead. > The only places where it's really appropriate to be using malloc > directly are where you don't want an error thrown for out-of-memory. > So I think providing these in the backend would do little except to > encourage bad programming. > > regards, tom lane > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xmalloc => pg_malloc
Andres Freund writes: >> pg_calloc(randomly different API for pg_malloc0) > Do we need this? I thought about getting rid of it, but there are some dozens of calls scattered across several files, so I wasn't sure it was worth it. Anybody else have an opinion? > I wonder whether the same set of functions should also be available in the > backend with ereport(EC_OUT_OF_MEMORY, ...) behaviour as well. In the backend, you almost always ought to be using palloc instead. The only places where it's really appropriate to be using malloc directly are where you don't want an error thrown for out-of-memory. So I think providing these in the backend would do little except to encourage bad programming. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xmalloc => pg_malloc
On Tue, Oct 2, 2012 at 12:02:13PM -0400, Tom Lane wrote: > While looking around to fix the pg_malloc(0) issue, I noticed that > various other pieces of code such as pg_basebackup have essentially > identical functions, except they're called xmalloc(). I propose to > standardize all these things on this set of names: > > pg_malloc > pg_malloc0 (for malloc-and-zero behavior) > pg_calloc (randomly different API for pg_malloc0) > pg_realloc > pg_free > pg_strdup > > Any objections? Please standardize. I am totally confused by the many memory implementations we have. (Pg_upgrade uses pg_malloc().) -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xmalloc => pg_malloc
On Tuesday, October 02, 2012 06:02:13 PM Tom Lane wrote: > While looking around to fix the pg_malloc(0) issue, I noticed that > various other pieces of code such as pg_basebackup have essentially > identical functions, except they're called xmalloc(). I propose to > standardize all these things on this set of names: > > pg_malloc > pg_malloc0 (for malloc-and-zero behavior) > pg_calloc (randomly different API for pg_malloc0) Do we need this? > pg_realloc > pg_free > pg_strdup I wonder whether the same set of functions should also be available in the backend with ereport(EC_OUT_OF_MEMORY, ...) behaviour as well. As noted before, there are quite a few malloc() calls around. Not all of them should be replaced, but I think quite a few could be. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] xmalloc => pg_malloc
While looking around to fix the pg_malloc(0) issue, I noticed that various other pieces of code such as pg_basebackup have essentially identical functions, except they're called xmalloc(). I propose to standardize all these things on this set of names: pg_malloc pg_malloc0 (for malloc-and-zero behavior) pg_calloc (randomly different API for pg_malloc0) pg_realloc pg_free pg_strdup Any objections? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Tue, Sep 25, 2012 at 09:10:33AM -0400, Bruce Momjian wrote: > > lc_collate cluster values do not match: old "zh_CN.utf8", new "zh_CN.UTF-8" > > Failure, exiting > > > > zh_CN.utf8 is provided by the installer and zh_CN.UTF-8 is my system > > default. > > OK, this tells us that the canonicalization code used in initdb is not > going to help us in pg_upgrade, at least not on your system, and not on > mine. > > I think we should apply the patch that fixes the TOAST problem with > information_schema, and the patch that outputs the old/new values for > easier debugging. Other than that, I don't know what else we can do > except to ignore dashes when comparing locale names, which I am told is > unacceptable. Based on this great bug report and submitter leg-work, I have applied three patches to pg_upgrade in head and 9.2, all attached: * try to get the canonical locale names, and report old/new values on mismatch * update query to skip toast tables for system objects * improve error reporting when the object counts don't match None of these bugs caused pg_upgrade to produce an incorrect upgraded cluster, so I am not going to panic and try to force them into 9.1, which probably isn't being used by many people anymore anyway. I think this closes this report. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c new file mode 100644 index 74b13e7..9d08f41 *** a/contrib/pg_upgrade/info.c --- b/contrib/pg_upgrade/info.c *** get_rel_infos(ClusterInfo *cluster, DbIn *** 269,302 */ snprintf(query, sizeof(query), ! "SELECT c.oid, n.nspname, c.relname, " ! " c.relfilenode, c.reltablespace, %s " "FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n " " ON c.relnamespace = n.oid " ! " LEFT OUTER JOIN pg_catalog.pg_tablespace t " ! " ON c.reltablespace = t.oid " ! 
"WHERE relkind IN ('r','t', 'i'%s) AND " /* exclude possible orphaned temp tables */ " ((n.nspname !~ '^pg_temp_' AND " "n.nspname !~ '^pg_toast_temp_' AND " ! "n.nspname NOT IN ('pg_catalog', 'information_schema', 'binary_upgrade') AND " " c.oid >= %u) " " OR (n.nspname = 'pg_catalog' AND " ! "relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index'%s) )) " ! /* we preserve pg_class.oid so we sort by it to match old/new */ ! "ORDER BY 1;", ! /* 9.2 removed the spclocation column */ ! (GET_MAJOR_VERSION(cluster->major_version) <= 901) ? ! "t.spclocation" : "pg_catalog.pg_tablespace_location(t.oid) AS spclocation", /* see the comment at the top of old_8_3_create_sequence_script() */ (GET_MAJOR_VERSION(old_cluster.major_version) <= 803) ? "" : ", 'S'", - /* this oid allows us to skip system toast tables */ FirstNormalObjectId, /* does pg_largeobject_metadata need to be migrated? */ (GET_MAJOR_VERSION(old_cluster.major_version) <= 804) ? "" : ", 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index'"); res = executeQueryOrDie(conn, "%s", query); ntups = PQntuples(res); --- 269,327 */ snprintf(query, sizeof(query), ! "CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid " "FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n " " ON c.relnamespace = n.oid " ! "WHERE relkind IN ('r', 'i'%s) AND " /* exclude possible orphaned temp tables */ " ((n.nspname !~ '^pg_temp_' AND " "n.nspname !~ '^pg_toast_temp_' AND " ! /* skip pg_toast because toast index have relkind == 'i', not 't' */ ! "n.nspname NOT IN ('pg_catalog', 'information_schema', " ! " 'binary_upgrade', 'pg_toast') AND " " c.oid >= %u) " " OR (n.nspname = 'pg_catalog' AND " ! "relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index'%s) ));", /* see the comment at the top of old_8_3_create_sequence_script() */ (GET_MAJOR_VERSION(old_cluster.major_version) <= 803) ? "" : ", 'S'", FirstNormalObjectId, /* does pg_largeobject_metadata need to be migrated? 
*/ (GET_MAJOR_VERSION(old_cluster.major_version) <= 804) ? "" : ", 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index'"); + PQclear(executeQueryOrDie(conn, "%s", query)); + + /* + * Get TOAST tables and indexes; we have to gather the TOAST tables in + * later steps because we can't schema-qualify TOAST tables. + */ + PQclear(executeQueryOrDie(conn, + "INSERT INTO info_rels " + "SELECT reltoastrelid " + "FROM info_rels i JOIN pg_catalog.pg_class c " + " ON i.reloid = c.oid")); + PQclear(executeQueryOrDie(conn, + "INSERT INTO info_rels " + "SELECT reltoastidxid " + "FROM info_rels i JOIN pg_catalog.pg_class c " + " ON i.reloid = c.oid")); + + snprintf(que
Re: [HACKERS] Doc patch to note which system catalogs have oids
On 09/25/2012 12:28:13 AM, Karl O. Pinc wrote: > On 09/23/2012 08:57:45 PM, Karl O. Pinc wrote: > > > The attached patch documents the oid column of those > > system catalogs having an oid. > > Don't use the first version of this patch (oid_doc.patch) > without discarding the last hunk. The last hunk > introduces an error by duplicating the > documentation of the pg_roles.oid column. > > (I am reluctant to post a revised version > since there's already 3 versions floating > around representing 2 different approaches > and no consensus as to which approach, if any, should > be taken.) I have figured out how to use the commitfest pages to track what should be reviewed. Submitting a corrected version of the very first patch, which treats the oids as regular columns. I am now submitting patches to the commitfest for review. (I'm not sure how I missed this.) Regards, Karl Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index f999190..babb11c 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -427,6 +427,13 @@ + oid + oid + + Row identifier + + + amname name @@ -683,6 +690,13 @@ + oid + oid + + Row identifier + + + amopfamily oid pg_opfamily.oid @@ -819,6 +833,13 @@ + oid + oid + + Row identifier + + + amprocfamily oid pg_opfamily.oid @@ -902,6 +923,13 @@ + oid + oid + + Row identifier + + + adrelid oid pg_class.oid @@ -1257,6 +1285,14 @@ + + + oid + oid + + Row identifier + + rolname name @@ -1462,6 +1498,13 @@ + oid + oid + + Row identifier + + + castsource oid pg_type.oid @@ -1577,6 +1620,13 @@ + oid + oid + + Row identifier + + + relname name @@ -1984,6 +2034,13 @@ + oid + oid + + Row identifier + + + conname name @@ -2250,6 +2307,13 @@ + oid + oid + + Row identifier + + + collname name @@ -2350,6 +2414,13 @@ + oid + oid + + Row identifier + + + conname name @@ -2443,6 +2514,13 @@ + oid + oid + + Row identifier + + + datname name @@ -2652,6 
+2730,13 @@ + oid + oid + + Row identifier + + + defaclrole oid pg_authid.oid @@ -3005,6 +3090,13 @@ + oid + oid + + Row identifier + + + enumtypid oid pg_type.oid @@ -3078,6 +3170,13 @@ + oid + oid + + Row identifier + + + extname name @@ -3174,6 +3273,13 @@ + oid + oid + + Row identifier + + + fdwname name @@ -3266,6 +3372,13 @@ + oid + oid + + Row identifier + + + srvname name @@ -3675,6 +3788,13 @@ + oid + oid + + Row identifier + + + lanname name @@ -3875,6 +3995,13 @@ + oid + oid + + Row identifier + + + lomowner oid pg_authid.oid @@ -3927,6 +4054,13 @@ + oid + oid + + Row identifier + + + nspname name @@ -3995,6 +4129,13 @@ + oid + oid + + Row identifier + + + opcmethod oid pg_am.oid @@ -4093,6 +4234,13 @@ + oid + oid + + Row identifier + + + oprname name @@ -4243,6 +4391,13 @@ + oid + oid + + Row identifier + + + opfmethod oid pg_am.oid @@ -4427,6 +4582,13 @@ + oid + oid + + Row identifier + + + proname name @@ -4819,6 +4981,13 @@ + oid + oid + + Row identifier + + + rulename name @@ -5488,6 +5657,13 @@ + oid + oid + + Row identifier + + + spcname name @@ -5556,6 +5732,13 @@ + oid + oid + + Row identifier + + + tgrelid oid pg_class.oid @@ -5741,6 +5924,13 @@ + oid + oid
Re: [HACKERS] Hash id in pg_stat_statements
On Oct 2, 2012 5:04 PM, "Euler Taveira" wrote: > > On 02-10-2012 10:15, Peter Geoghegan wrote: > > There are other, similar tools that exist in proprietary databases. > > They expose a hash value, which is subject to exactly the same caveats > > as our own. They explicitly encourage the type of aggregation by > > third-party tools that I anticipate will happen with > > pg_stat_statements. I want to facilitate this, but I'm confident that > > the use of (dbid, userid, querytext) as a "candidate key" by these > > tools is going to cause confusion, in subtle ways. By using the hash > > value, those tools are exactly as well off as pg_stat_statements is, > > which is to say, as well off as possible. > > > It depends on how you will use this information. If it is for a rapid > analysis, it is fine to use a hash because the object internals won't change > (I hope not). However, if you want to analyze query statistics for a long > period of time (say a few months) or your environment is distributed, you > can't use the hash. There isn't a perfect solution but I see both cases being > useful for analysis tools. Having the hash available in no way prevents you from using the query string as your key. Not having the hash certainly prevents you from using it. /Magnus
Re: [HACKERS] Hash id in pg_stat_statements
On 02-10-2012 10:15, Peter Geoghegan wrote: > There are other, similar tools that exist in proprietary databases. > They expose a hash value, which is subject to exactly the same caveats > as our own. They explicitly encourage the type of aggregation by > third-party tools that I anticipate will happen with > pg_stat_statements. I want to facilitate this, but I'm confident that > the use of (dbid, userid, querytext) as a "candidate key" by these > tools is going to cause confusion, in subtle ways. By using the hash > value, those tools are exactly as well off as pg_stat_statements is, > which is to say, as well off as possible. > It depends on how you will use this information. If it is for a rapid analysis, it is fine to use a hash because the object internals won't change (I hope not). However, if you want to analyze query statistics for a long period of time (say a few months) or your environment is distributed, you can't use the hash. There isn't a perfect solution but I see both cases being useful for analysis tools. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Missing OID define
Thom Brown and I were doing some hacking the other day and came across this missing define. We argued over who was going to send the patch in and I lost. So here it is. pg_type_uuid_oid.diff Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] MVCC and large objects
While I'm referring to the 8.4 release, it's a slightly more general question. We're moving away from using pg_largeobject as fast as we can, and while doing so I'm seeing interesting behaviour. I'd like to confirm that this is expected, or perhaps I don't have tuning parameters set quite correctly. I believe we have standard autovacuum running. * Table Foo has an oid "data" column that points to a valid BLOB. We modified this table to also have a "datafilepath" column. * We wrote a throttleable "copier" process which walks rows, reads the BLOB data out to a file and then UPDATEs the "datafilepath" column with where it wrote it. We did not alter the BLOB data in any way. When asked for byte data, the higher level code will first return it from the datafilepath if it's there, and fall back on the lo otherwise. While the above was working away, we nearly missed the fact that the "public.pg_largeobject" table seemed to be growing commensurate with what we were exporting! As we were doing this while the primary disk was nearly out of space, it was fortunate I could pause this work. We were able to move the entire system and it's now continuing along, but my question: Is this expected? I'm a little surprised. My theory is that MVCC seems to be including the referenced pg_largeobject data as part of the row, and even though we're not updating the BLOB at all, a snapshot is getting created. *Is this expected*? Many thanks - single-link RTFM answers welcome, I have seen the MVCC through pictures, and I get it - just not how a BLOB fits into MVCC here. ./scc -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash id in pg_stat_statements
On 1 October 2012 18:05, Stephen Frost wrote: > * Peter Geoghegan (pe...@2ndquadrant.com) wrote: >> That won't really help matters. There'd still be duplicate entries, >> from before and after the change, even if we make it immediately >> obvious which is which. The only reasonable solution in that scenario >> is to bump PGSS_FILE_HEADER, which will cause all existing entries to >> be invalidated. > > You're going to have to help me here, 'cause I don't see how there can > be duplicates if we include the PGSS_FILE_HEADER as part of the hash, > unless we're planning to keep PGSS_FILE_HEADER constant while we change > what the hash value is for a given query, yet that goes against the > assumptions that were laid out, aiui. Well, they wouldn't be duplicates if you think that the fact that one query was executed before some point release and another after ought to differentiate queries. I do not. > If there's a change that results in a given query X no longer hashing to > a value A, we need to change PGSS_FILE_HEADER to invalidate statistics > which were collected for value A (or else we risk an independent query Y > hashing to value A and ending up with completely invalid stats..). > Provided we apply that consistently and don't reuse PGSS_FILE_HEADER > values along the way, a combination of PGSS_FILE_HEADER and the hash > value for a given query should be unique over time. > > We do need to document that the hash value for a given query may > change.. By invalidate, I mean that when we go to open the saved file, if the header doesn't match, the file is considered corrupt, and we simply log that the file could not be read, before unlinking it. This would be necessary in the unlikely event of there being some substantive change in the representation of query trees in a point release. I am not aware of any precedent for this, though Tom said that there was one. Tom: Could you please describe the precedent you had in mind? I would like to better understand this risk. 
I don't want to get too hung up on what we'd do if this problem actually occurred, because that isn't what this thread is about. Exposing the hash is a completely unrelated question, except that to do so might make pg_stat_statements (including its limitations) better understood. In my opinion, the various log parsing tools have taught people to think about this in the wrong way - the query string is just a representation of a query (and an imperfect one at that). There are other, similar tools that exist in proprietary databases. They expose a hash value, which is subject to exactly the same caveats as our own. They explicitly encourage the type of aggregation by third-party tools that I anticipate will happen with pg_stat_statements. I want to facilitate this, but I'm confident that the use of (dbid, userid, querytext) as a "candidate key" by these tools is going to cause confusion, in subtle ways. By using the hash value, those tools are exactly as well off as pg_stat_statements is, which is to say, as well off as possible. I simply do not understand objections to the proposal. Have I missed something? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
2012/10/1 Heikki Linnakangas :
> On 27.09.2012 23:58, Pavel Stehule wrote:
>>
>> Hello
>>
>> I reduced this patch as just plpgsql (SPI) problem solution.
>>
>> Correct generic solution needs a discussion about enhancing our libpq
>> interface - we can take a number of returned rows, but we should to
>> get number of processed rows too. A users should to parse this number
>> from completationTag, but this problem is not too hot and usually is
>> not blocker, because anybody can process completationTag usually.
>>
>> So this issue is primary for PL/pgSQL, where this information is not
>> possible. There is precedent - CREATE AS SELECT (thanks for sample
>> :)), and COPY should to have same impact on ROW_COUNT like this
>> statement (last patch implements it).
>
> The comment where CREATE AS SELECT does this says:
>
>> /*
>>  * CREATE TABLE AS is a messy special case for historical
>>  * reasons.  We must set _SPI_current->processed even though
>>  * the tuples weren't returned to the caller, and we must
>>  * return a special result code if the statement was spelled
>>  * SELECT INTO.
>>  */
>
> That doesn't sound like a very good precedent that we should copy to COPY.
> I don't have a problem with this approach in general, though. But if we're
> going to make it normal behavior, rather than an ugly historical kluge,
> that utility statements can return a row count without returning the
> tuples, we should document that. The above comment ought to be changed,
> and the manual section about SPI_execute needs to mention the possibility
> that SPI_processed != 0, but SPI_tuptable == NULL.
>
> Regarding the patch itself:
>
>> + else if (IsA(stmt, CopyStmt))
>> + {
>> +     /*
>> +      * usually utility statements doesn't return a number
>> +      * of processed rows, but COPY does it.
>> +      */
>> +     Assert(strncmp(completionTag, "COPY ", 5) == 0);
>> +     _SPI_current->processed = strtoul(completionTag + 5, NULL, 10);
>> + }
>> else
>>     res = SPI_OK_UTILITY;
>
> 'res' is not set for a CopyStmt, and there's an extra space in "COPY " in
> the Assert.

fixed patch

Regards

Pavel Stehule

> - Heikki

copy-processed-rows-simple2.patch
Description: Binary data
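The core of the patch is pulling the row count back out of the command completion tag. A minimal standalone sketch of that parsing, assuming a tag of the form "COPY <rowcount>" (the helper name here is hypothetical; in the patch this logic lives inside the SPI code):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Recover the processed-row count from a COPY command completion tag
 * such as "COPY 42".  Standalone illustration of the patch's approach,
 * not the actual SPI code.
 */
static unsigned long
copy_processed_rows(const char *completionTag)
{
    /* the completion tag for COPY is "COPY <rowcount>" */
    assert(strncmp(completionTag, "COPY ", 5) == 0);
    return strtoul(completionTag + 5, NULL, 10);
}
```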
Re: [HACKERS] small LDAP error message change
On Tue, Oct 2, 2012 at 2:57 AM, Peter Eisentraut wrote:
> I'm proposing to make the attached change to some LDAP error messages.
> Aside from fixing a pluralization issue, I want to separate fact (search
> resulted in != 1 result) from interpretation (LDAP user does not exist,
> and that's a problem).

Looks good to me, +1.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On 02.10.2012 10:36, Amit kapila wrote:
> On Monday, October 01, 2012 4:08 PM Heikki Linnakangas wrote:
>> So let's think how this should ideally work from a user's point of view.
>> I think there should be just two settings: walsender_timeout and
>> walreceiver_timeout. walsender_timeout specifies how long a walsender
>> will keep a connection open if it doesn't hear from the walreceiver, and
>> walreceiver_timeout is the same for walreceiver. The system should
>> figure out itself how often to send keepalive messages so that those
>> timeouts are not reached.
>
> By this it implies that we should remove wal_receiver_status_interval.
> Currently it is also used incase of reply message of data sent by sender
> which contains till what point receiver has flushed. So if we remove this
> variable receiver might start sending that message sonner than required.
> Is that okay behavior?

I guess we should keep that setting, then, so that you can get status updates more often than would be required for heartbeat purposes.

>> In walsender, after half of walsender_timeout has elapsed and we haven't
>> received anything from the client, the walsender process should send a
>> "ping" message to the client. Whenever the client receives a Ping, it
>> replies. The walreceiver does the same; when half of walreceiver_timeout
>> has elapsed, send a Ping message to the server. Each Ping-Pong roundtrip
>> resets the timer in both ends, regardless of which side initiated it, so
>> if e.g walsender_timeout < walreceiver_timeout, the client will never
>> have to initiate a Ping message, because walsender will always reach the
>> walsender_timeout/2 point first and initiate the heartbeat message.
>
> Just to clarify, walsender should reset timer after it gets reply from
> receiver of the message it sent.

Right.

> walreceiver should reset timer after sending reply for heartbeat message.
> Similar to above timers will be reset when receiver sent the heartbeat
> message.

walreceiver should reset the timer when it *receives* any message from walsender. If it sends the reply right away, I guess that's the same thing, but I'd phrase it so that it's the reception of a message from the other end that resets the timer.

>> The Ping/Pong messages don't necessarily need to be new message types,
>> we can use the message types we currently have, perhaps with an
>> additional flag attached to them, to request the other side to reply
>> immediately.
>
> Can't we make the decision to send reply immediately based on message
> type, because these message types will be unique. To clarify my
> understanding,
> 1. the heartbeat message from walsender side will be keepalive message
> ('k') and from walreceiver side it will be Hot Standby feedback message
> ('h').
> 2. the reply message from walreceiver side will be current reply message
> ('r').

Yep. I wonder why we need separate message types for Hot Standby Feedback 'h' and Reply 'r', though. Seems it would be simpler to have just one message type that includes all the fields from both messages.

> 3. currently there is no reply kind of message from walsender, so do we
> need to introduce one new message for it or can use some existing message
> only? if new, do we need to send any additional information along with
> it, for existing messages can we use keepalive message it self as reply
> message but with an additional byte to indicate it is reply?

Hmm, I think I'd prefer to use the existing Keepalive message 'k', with an additional flag.

- Heikki
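The proposed rule (ping the peer when half of your own timeout has elapsed without hearing anything, drop the connection at the full timeout) can be sketched as a pure decision function. The enum and function names are hypothetical, not actual walsender/walreceiver code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the heartbeat rule proposed in this thread: each side
 * measures time since it last *received* a message from the peer,
 * pings at half its own timeout, and gives up at the full timeout.
 */
typedef enum
{
    KEEP_WAITING,       /* nothing to do yet */
    SEND_PING,          /* half the timeout has passed; request a reply */
    DROP_CONNECTION     /* full timeout reached; peer presumed dead */
} HeartbeatAction;

static HeartbeatAction
heartbeat_action(int ms_since_last_receive, int timeout_ms, bool ping_sent)
{
    if (ms_since_last_receive >= timeout_ms)
        return DROP_CONNECTION;
    if (ms_since_last_receive >= timeout_ms / 2 && !ping_sent)
        return SEND_PING;
    return KEEP_WAITING;
}
```

Because each side applies the same rule to its own timeout, whichever side has the smaller timeout reaches its half-way point first and initiates the heartbeat, which is why, with walsender_timeout < walreceiver_timeout, the walreceiver never has to ping.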
Re: [HACKERS] date_in and buffer overrun
On 02.10.2012 01:30, Hitoshi Harada wrote:
> It seems date_in() has a risk of buffer overrun. If the input is '.', it
> sets field[0] to the beginning of workbuf and goes into DecodeDate().
> This function checks null-termination of the head of string, but it can
> go beyond the end of string inside the first loop and replace some bytes
> with zero. The worst scenario we've seen is overwrite of the stack frame,
> in which the compiler rearranged the memory allocation of local variables
> in date_in() and work_buf is at lower address than field. I tried to
> attach a patch file but failed somehow, so I paste the fix here.

Thanks, applied to master and all supported back-branches.

- Heikki
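The bug class described here is a loop that tests for the terminator only at its head and then advances past delimiters unconditionally, which can walk beyond the end of the buffer on degenerate input like ".". A generic, hedged sketch of the safe pattern (this is an illustration of the overrun class, not the actual DecodeDate() code):

```c
#include <assert.h>

/*
 * Count '.'-separated fields in a string, re-checking for the NUL
 * terminator at every advance.  A variant that skips "the delimiter
 * and whatever follows" without re-checking would run past the end
 * of the buffer on inputs like "." - the class of bug fixed here.
 */
static int
count_fields(const char *str)
{
    int nf = 0;

    while (*str != '\0')
    {
        /* consume one (possibly empty) field */
        while (*str != '\0' && *str != '.')
            str++;
        nf++;
        if (*str == '.')
            str++;      /* skip delimiter; NUL is re-checked next loop */
    }
    return nf;
}
```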
[HACKERS] Patch for removng unused targets
Hi!

Attached patch removes unused targets which are used only for ORDER BY when the data already comes in the right order. It introduces a resorderbyonly flag in TargetEntry which indicates that the entry is used only for the ORDER BY clause. If the data comes in the right order then such entries are removed in the grouping_planner function. This is my first patch on the planner. Probably, I did it in the wrong way. But I think it is a worthwhile optimization and you could give me direction to rework the patch.

Actually we met the need for this optimization when ranking full-text search in a GIN index (it isn't published yet, will post a prototype soon). But here is a synthetic example illustrating the benefit from the patch.

CREATE OR REPLACE FUNCTION slow_func(x float8, y float8) RETURNS float8 AS $$
BEGIN
    PERFORM pg_sleep(0.01);
    RETURN x + y;
END;
$$ IMMUTABLE LANGUAGE plpgsql;

CREATE TABLE test AS (SELECT random() AS x, random() AS y FROM generate_series(1,1000));
CREATE INDEX test_idx ON test(slow_func(x,y));

Without patch:

test=# EXPLAIN (ANALYZE, VERBOSE) SELECT * FROM test ORDER BY slow_func(x,y) LIMIT 10;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..3.09 rows=10 width=16) (actual time=11.344..103.443 rows=10 loops=1)
   Output: x, y, (slow_func(x, y))
   ->  Index Scan using test_idx on public.test  (cost=0.00..309.25 rows=1000 width=16) (actual time=11.341..103.422 rows=10 loops=1)
         Output: x, y, slow_func(x, y)
 Total runtime: 103.524 ms
(5 rows)

With patch:

test=# EXPLAIN (ANALYZE, VERBOSE) SELECT * FROM test ORDER BY slow_func(x,y) LIMIT 10;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..3.09 rows=10 width=16) (actual time=0.062..0.093 rows=10 loops=1)
   Output: x, y
   ->  Index Scan using test_idx on public.test  (cost=0.00..309.25 rows=1000 width=16) (actual time=0.058..0.085 rows=10 loops=1)
         Output: x, y
 Total runtime: 0.164 ms
(5 rows)

--
With best regards,
Alexander Korotkov.

unused-targets-1.patch
Description: Binary data
[HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Monday, October 01, 2012 8:36 PM Robert Haas wrote:
> On Mon, Oct 1, 2012 at 6:38 AM, Heikki Linnakangas wrote:
>> Hmm, I think we need to step back a bit. I've never liked the way
>> replication_timeout works, where it's the user's responsibility to set
>> wal_receiver_status_interval < replication_timeout. It's not very
>> user-friendly. I'd rather not copy that same design to this walreceiver
>> timeout. If there's two different timeouts like that, it's even worse,
>> because it's easy to confuse the two.
>
> I agree, but also note that wal_receiver_status_interval serves
> another user-visible purpose as well.

By the above, do you mean that wal_receiver_status_interval is used for the reply to data sent by the server, indicating up to what point the receiver has flushed data, or something else?

With Regards,
Amit Kapila.
[HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Monday, October 01, 2012 4:08 PM Heikki Linnakangas wrote:
> On 21.09.2012 14:18, Amit kapila wrote:
>> On Tuesday, September 18, 2012 6:02 PM Fujii Masao wrote:
>>> On Mon, Sep 17, 2012 at 4:03 PM, Amit Kapila wrote:
>>>> Approach-2 : Provide a variable wal_send_status_interval, such that if
>>>> this is 0, then the current behavior would prevail and if its non-zero
>>>> then KeepAlive message would be send maximum after that time. The
>>>> modified code of WALSendLoop will be as follows:
>>>>
>>>> Which way you think is better or you have any other idea to handle.
>>>
>>> I think #2 is better because it's more intuitive to a user.
>>
>> Please find a patch attached for implementation of Approach-2.
>
> So let's think how this should ideally work from a user's point of view.
> I think there should be just two settings: walsender_timeout and
> walreceiver_timeout. walsender_timeout specifies how long a walsender
> will keep a connection open if it doesn't hear from the walreceiver, and
> walreceiver_timeout is the same for walreceiver. The system should
> figure out itself how often to send keepalive messages so that those
> timeouts are not reached.

By this it implies that we should remove wal_receiver_status_interval. Currently it is also used in case of the reply message for data sent by the sender, which contains up to what point the receiver has flushed. So if we remove this variable the receiver might start sending that message sooner than required. Is that okay behavior?

> In walsender, after half of walsender_timeout has elapsed and we haven't
> received anything from the client, the walsender process should send a
> "ping" message to the client. Whenever the client receives a Ping, it
> replies. The walreceiver does the same; when half of walreceiver_timeout
> has elapsed, send a Ping message to the server. Each Ping-Pong roundtrip
> resets the timer in both ends, regardless of which side initiated it, so
> if e.g walsender_timeout < walreceiver_timeout, the client will never
> have to initiate a Ping message, because walsender will always reach the
> walsender_timeout/2 point first and initiate the heartbeat message.

Just to clarify:
walsender should reset its timer after it gets the reply from the receiver for the message it sent.
walreceiver should reset its timer after sending the reply for the heartbeat message.
Similar to the above, the timers will be reset when the receiver sends the heartbeat message.

> The Ping/Pong messages don't necessarily need to be new message types,
> we can use the message types we currently have, perhaps with an
> additional flag attached to them, to request the other side to reply
> immediately.

Can't we make the decision to send a reply immediately based on the message type, because these message types will be unique. To clarify my understanding:
1. the heartbeat message from the walsender side will be the keepalive message ('k') and from the walreceiver side it will be the Hot Standby feedback message ('h').
2. the reply message from the walreceiver side will be the current reply message ('r').
3. currently there is no reply kind of message from walsender, so do we need to introduce one new message for it, or can we use some existing message? If new, do we need to send any additional information along with it? For existing messages, can we use the keepalive message itself as the reply message, but with an additional byte to indicate it is a reply?

With Regards,
Amit Kapila.
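Heikki's preference downthread is to reuse the existing 'k' keepalive message with an additional flag rather than invent a new type. A hypothetical wire layout for that idea might look like this; the field order and helper name are invented for illustration and are not the actual replication protocol definition:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Build a hypothetical keepalive message: type byte 'k', the sender's
 * WAL position, and one extra flag byte asking the peer to reply
 * immediately.  Illustration only; not the actual protocol.
 */
static size_t
build_keepalive(char *buf, uint64_t wal_end, bool reply_requested)
{
    size_t off = 0;

    buf[off++] = 'k';                               /* message type */
    memcpy(buf + off, &wal_end, sizeof(wal_end));   /* sender's WAL end */
    off += sizeof(wal_end);
    buf[off++] = reply_requested ? 1 : 0;           /* reply-request flag */
    return off;
}
```

The same message then serves both as the periodic status report and, with the flag set, as the heartbeat "ping", so no new message type is needed.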
Re: [HACKERS] Visual Studio 2012 RC
On Fri, Sep 14, 2012 at 01:26:30AM +0200, Brar Piening wrote:
> Heikki Linnakangas wrote:
>> I don't think we can realistically support VS2012 until Microsoft
>> releases the gratis Visual Studio Express 2012 for Windows Desktop.
>
> As they've released now I've updated my Patch with docs and ask for review.

I used this patch to build 64-bit PostgreSQL under Windows 7 Professional, using Visual Studio 2012 Express for Windows Desktop. I did not provide a config.pl, so this build used the defaults -- in particular, it did not exercise linking to optional external projects like perl and libxml2. A "vcregress check" passed. The build emitted no warnings beyond those seen on buildfarm member "chough" (VS 2008 Express).

My build log filled 8.8 MiB, a large increase from the 432 KiB of the "chough" build log. This isn't strictly a problem, but do you happen to have ideas for curbing the noise?

I find no functional problems with the patch, but some comment updates and other trivia need attention. The patch itself was reversed; it applied cleanly with "patch -R". I regenerated it in the usual direction for the portions I quote below.

src/tools/msvc/MSBuildProject.pm:4:
# Package that encapsulates a MSBuild (Visual C++ 2010) project file

Say something like "Visual C++ 2010 or greater".

src/tools/msvc/VSObjectFactory.pm:93:
# we use nmake as it has existed for a long time and still exists in visual studio 2010

Likewise, modify this comment to be open-ended. If nmake ever disappears, we'll be updating this code and can see to change the comment.

> *** a/doc/src/sgml/install-windows.sgml
> --- b/doc/src/sgml/install-windows.sgml
> *** 22,28 ****
>     Microsoft tools is to install a supported version of the
>     Microsoft Windows SDK and use the included
>     compiler.  It is also possible to build with the full
> !   Microsoft Visual C++ 2005, 2008 or 2010.  In some cases
>     that requires the installation of the Windows SDK
>     in addition to the compiler.
> --- 22,28 ----
>     Microsoft tools is to install a supported version of the
>     Microsoft Windows SDK and use the included
>     compiler.  It is also possible to build with the full
> !   Microsoft Visual C++ 2005, 2008, 2010 or 2012.  In some cases
>     that requires the installation of the Windows SDK
>     in addition to the compiler.

I think this paragraph should be more like the one in the next patch hunk: call out Visual Studio 2012 Express for Windows Desktop and Windows SDK 7.1 as the main recommendations. Perhaps even demote the SDK entirely and just recommend VS 2012. It'd be odd to recommend only a past-version tool if a current-version tool works just as well.

> *** 77,90 ****
>     Visual Studio Express or some versions of the
>     Microsoft Windows SDK. If you do not already have a
>     Visual Studio environment set up, the easiest
> !   way is to use the compilers in the Windows SDK,
> !   which is a free download from Microsoft.
>
>     PostgreSQL is known to support compilation using the compilers
>     shipped with Visual Studio 2005 to
> !   Visual Studio 2010 (including Express editions),
>     as well as standalone Windows SDK releases 6.0 to 7.1.
>     64-bit PostgreSQL builds are only supported with
>     Microsoft Windows SDK version 6.0a and above or
> --- 77,91 ----
>     Visual Studio Express or some versions of the
>     Microsoft Windows SDK. If you do not already have a
>     Visual Studio environment set up, the easiest
> !   ways are to use the compilers in the Windows SDK

I would write "Windows SDK 7.1" here and remove the parenthesized bit. There's a later mention of support for older versions.

> !   (<= 7.1) or those from Visual Studio Express 2012 for Windows
> !   Desktop, which are both free downloads from Microsoft.
>
>     PostgreSQL is known to support compilation using the compilers
>     shipped with Visual Studio 2005 to
> !   Visual Studio 2012 (including Express editions),
>     as well as standalone Windows SDK releases 6.0 to 7.1.
>     64-bit PostgreSQL builds are only supported with
>     Microsoft Windows SDK version 6.0a and above or

> *** 157,162 ****
$ENV{PATH}=$ENV{PATH} . ';c:\some\where\bison\bin';
> --- 158,165 ----
>     If you install the Windows SDK
>     including the Visual C++ Compilers,
>     you don't need Visual Studio to build.
> +   Note that as of Version 8.0a the Windows SDK no longer ships with a
> +   complete command-line build environment.

The part ending here looks like this:

    Microsoft Windows SDK

    It is recommended that you upgrade to the latest supported version
    of the Microsoft Windows SDK (currently version 7.1),
    available for download from http://www.microsoft.com/downloads/.

    You must always include the Windows Headers and Libraries pa