Re: [HACKERS] [GENERAL] How do I bump a row to the front of sort efficiently
On Mon, Feb 2, 2015 at 1:16 AM, Sam Saffron sam.saff...@gmail.com wrote:
> However, the contortions on the above query make it very un-ORM friendly as I would need to define a view for it but would have no clean way to pass limits and offsets in.

This is why ORMs are bad. They make hard problems *much* harder, and the only benefit is that they maybe make easy problems a little quicker. The cost/savings is *heavily* skewed toward the cost, since there's no upper bound on the cost and there is a pretty small lower bound on the savings.

Micro-ORMs tend to do a better job of not shielding you from (or rather, getting in the way of) the SQL while still providing some good result-to-object translation. Whether even that is necessary depends on your language, though. (For example, in Python, psycopg2 has a built-in way of spitting out namedtuples, which means you get result-to-object translation out of the box. That makes even a micro-ORM pretty unnecessary. On the other hand, a micro-ORM that does this well without blocking you from the SQL, such as PetaPOCO, is a boon in .NET.)

If you can, your best bet would probably be to find a way to get your ORM to execute raw SQL (with good parametrization to prevent injection attacks) and be done with it. It took me way too much experience fighting with an ORM on complicated queries to realize that.
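The raw-SQL-with-parameters-plus-namedtuples approach can be sketched in stdlib Python. This uses sqlite3 purely as a stand-in for psycopg2 (which provides the same idea against PostgreSQL via its NamedTupleCursor); the table and column names are made up for illustration:

```python
import sqlite3
from collections import namedtuple

def namedtuple_factory(cursor, row):
    # Build a namedtuple type from the cursor's column names,
    # giving attribute-style access to rows without any ORM.
    Row = namedtuple("Row", [col[0] for col in cursor.description])
    return Row(*row)

conn = sqlite3.connect(":memory:")
conn.row_factory = namedtuple_factory
conn.execute("CREATE TABLE topics (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO topics (title) VALUES (?)", ("hello",))

# Raw SQL with a bound parameter -- the driver handles quoting,
# so user-supplied values cannot inject SQL.
rows = conn.execute(
    "SELECT id, title FROM topics WHERE id = ?", (1,)
).fetchall()
print(rows[0].title)  # attribute access, no ORM required
```

The point is that result-to-object translation is a few lines of driver configuration, while the SQL itself stays fully under your control.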
Re: [HACKERS] Fetch zero result rows when executing a query?
On 2/3/15 5:26 AM, Shay Rojansky wrote:
> Sorry if this has been asked before, couldn't find any mention... I'm working on Npgsql, the .NET driver for PostgreSQL, and am trying to find a way to execute a query without fetching any rows. The Execute message has a maximum result-row count, but zero is documented to mean "fetch all rows". The use case would be sending a query which might modify or might not (e.g. UPDATE), where we know that the user is uninterested in any result row. My current workaround is to specify maxrows=1; I was wondering if I missed a better alternative.

You might be able to add something like WHERE FALSE to the RETURNING clause, but I agree with Andres; this seems like premature optimization.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On 2/3/15 5:08 PM, Tom Lane wrote:
> Jim Nasby jim.na...@bluetreble.com writes:
>> On 2/3/15 9:20 AM, Tom Lane wrote:
>>> Well, the object type is not an optional part of the command. It's *necessary*. I was thinking more like REINDEX { INDEX | TABLE | etc } name [ ( option [, option ...] ) ]
>> VACUUM puts the options before the table name, so ISTM it'd be best to keep that with REINDEX. Either REINDEX (options) {INDEX | ...} or REINDEX {INDEX | ...} (options).
> Well, I really really don't like the first of those. IMO the command name is REINDEX INDEX etc, so sticking something in the middle of that is bogus.

Actually, is there a reason we can't just accept all 3? Forcing people to remember exact ordering of options has always struck me as silly.
Re: [HACKERS] pg_dump's aborted transactions
Stephen Frost sfr...@snowman.net writes:
> All, We recently had a client complain that check_postgres' commitratio check would alert about relatively unused databases. As it turns out, the reason is that they automate running pg_dump against their databases (surely a good thing..), but pg_dump doesn't close out its transaction cleanly, leading to rolled-back transactions. At first blush, at least, this strikes me as an oversight which we should probably fix and possibly backpatch.

No, somebody should fix check_postgres to count rollbacks as well as commits as activity (as they obviously are). This is not an oversight, it's 100% intentional. The reason pg_dump aborts rather than commits is to make entirely sure that it does not commit any changes to the database. I would be against removing that safety feature, considering that pg_dump is typically run as superuser. We have frequently worried about security exploits that involve hijacking superuser activities, and this behavior provides at least a small increment of safety against such holes.

			regards, tom lane
Re: [HACKERS] Table description in the data file (Re: pg_rawdump)
On 2/3/15 1:43 AM, Stephen R. van den Berg wrote:
> In relation to the talk and discussions at FOSDEM regarding helping data recovery, I searched the archives for the old thread after I performed my last recovery; for reference: http://www.postgresql.org/message-id/20101019201223.ga15...@cuci.nl I haven't checked yet if the proposed space there is still available in the current disk format, but it might serve as a starter and reminder to get the discussion going once more.

Rather than trying to wedge this into a heap page, ISTM it'd be better to use a fork, presumably storing regular tuples that carry the essential data from pg_class and pg_attribute (and maybe pg_type).
Re: [HACKERS] [GENERAL] How do I bump a row to the front of sort efficiently
On Tue, Feb 3, 2015 at 9:33 PM, BladeOfLight16 bladeofligh...@gmail.com wrote:
> This is why ORMs are bad. They make hard problems *much* harder, and the only benefit is that they maybe make easy problems a little quicker. The cost/savings is *heavily* skewed toward the cost, since there's no upper bound on the cost and there is a pretty small lower bound on the savings. [snip]

Er, *pretty small upper bound* on the savings.
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
Jim Nasby jim.na...@bluetreble.com writes:
> On 2/3/15 5:08 PM, Tom Lane wrote:
>> Jim Nasby jim.na...@bluetreble.com writes:
>>> VACUUM puts the options before the table name, so ISTM it'd be best to keep that with REINDEX. Either REINDEX (options) {INDEX | ...} or REINDEX {INDEX | ...} (options).
>> Well, I really really don't like the first of those. IMO the command name is REINDEX INDEX etc, so sticking something in the middle of that is bogus.
> Actually, is there a reason we can't just accept all 3? Forcing people to remember exact ordering of options has always struck me as silly.

And that's an even worse idea. Useless flexibility in syntax tends to lead to unfortunate consequences like having to reserve keywords.

			regards, tom lane
[HACKERS] pg_dump's aborted transactions
All,

We recently had a client complain that check_postgres' commitratio check would alert about relatively unused databases. As it turns out, the reason is that they automate running pg_dump against their databases (surely a good thing..), but pg_dump doesn't close out its transaction cleanly, leading to rolled-back transactions.

At first blush, at least, this strikes me as an oversight which we should probably fix and possibly backpatch.

Thoughts?

Thanks,

Stephen
[HACKERS] Re: [COMMITTERS] pgsql: Add API functions to libpq to interrogate SSL related stuff.
On 02/04/2015 02:23 AM, Tom Lane wrote:
> Heikki Linnakangas heikki.linnakan...@iki.fi writes:
>> Add API functions to libpq to interrogate SSL related stuff.
> This patch is one large brick shy of a load: it creates exported libpq functions but fails to ensure they always exist. That's why jacana is unhappy; though TBH I'm astonished that any non-ssl-enabled builds are passing. Apparently missing library functions are less of a hard error on Linux than they ought to be.

Yeah, that's surprising. I don't see any option in the ld man page to make it warn either.

This also shows that we don't have any regression tests for this function. It's trivial, so I'm not worried about that, but in general it would be good to have a regression test suite specifically for libpq. There are probably many other things that are not exercised by psql.

> I think probably the exported functions need to be defined in fe-exec.c or fe-connect.c, with bodies along the lines of
>
>     #ifdef USE_OPENSSL
>         call OpenSSL-specific function
>     #else
>         return NULL
>     #endif
>
> (or whatever's appropriate when no SSL support). We do want these functions to exist even in non-SSL-enabled builds.

Sure. There are dummy versions of all the other SSL-related functions, I just missed PQsslAttributes. The OpenSSL versions are in fe-secure-openssl.c, and the dummy ones are in fe-secure.c, within a #ifndef USE_SSL block. Fixed now.

- Heikki
Re: [HACKERS] Redesigning checkpoint_segments
On 02/02/2015 04:21 PM, Andres Freund wrote:
> Hi,
>
> On 2015-02-02 08:36:41 -0500, Robert Haas wrote:
>> Also, I'd like to propose that we set the default value of max_checkpoint_segments/checkpoint_wal_size to something at least an order of magnitude larger than the current default setting.
> +1

I don't agree with that principle. I wouldn't mind increasing it a little bit, but not by an order of magnitude. For better or worse, *all* our defaults are tuned toward small systems, so that PostgreSQL doesn't hog all the resources. We shouldn't make an exception for this.

> I think we need to increase checkpoint_timeout too - that's actually just as important for the default experience from my pov. 5 minutes often just unnecessarily generates FPWs en masse.
>
>> I'll open the bidding at 1600MB (aka 100).
> Fine with me.

I wouldn't object to raising it a little bit, but that's way too high. It's entirely possible to have a small database that generates a lot of WAL. A table that has only a few rows, but is updated very very frequently, for example. And checkpointing such a database is quick too, so frequent checkpoints are not a problem. You don't want to end up with 1.5 GB of WAL on a 100 MB database.

- Heikki
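For reference, the settings under discussion would look something like this in postgresql.conf. This is a sketch of the proposal only; checkpoint_wal_size was a proposed name and not a shipped GUC at the time (the redesign later landed as max_wal_size in 9.5), and the values are the ones being debated above, not recommendations:

```
# Sketch of Robert's opening bid (hypothetical, not shipped defaults):
checkpoint_wal_size = 1600MB   # ~100 old-style 16MB checkpoint_segments
checkpoint_timeout  = 5min     # Andres argues this default should rise too
```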
[HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
Hi,

I think these days there's no reason for the split between the archive and hot_standby wal levels. The split was made out of volume and stability concerns. I think we can by now be confident about the wal_level = hot_standby changes (note I'm not proposing hot_standby = on). So let's remove the split. It just gives users a choice between two options that don't have a meaningful difference.

Additionally I think we should change the default for wal_level to hot_standby and max_wal_senders (maybe to 5). That way users can use pg_basebackup and set up streaming standbys without having to restart the primary. I think that'd be an important step in making setup easier.

Previously there have been arguments against changing the default of wal_level because it'd mean the regression tests wouldn't exercise minimal anymore. That might be true, but then right now we just don't exercise the more complex levels. If we're really concerned we can just force a different value during the tests, just as we do for prepared xacts.

Comments?

Additionally, more complex and further into the future, I wonder if we couldn't also get rid of wal_level = logical by automatically switching to it whenever logical slots are active.

Greetings,

Andres Freund

-- 
Andres Freund	                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
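Concretely, the proposed out-of-the-box defaults would amount to something like this in postgresql.conf (a sketch of the proposal, not the shipped defaults of the era; at the time wal_level defaulted to minimal and max_wal_senders to 0):

```
# Sketch of Andres' proposed defaults:
wal_level = hot_standby   # replaces 'minimal'; also subsumes 'archive'
max_wal_senders = 5       # pg_basebackup / streaming standbys without a restart
```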
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Feb 3, 2015 at 9:09 AM, Sawada Masahiko sawada.m...@gmail.com wrote:
> On Tue, Feb 3, 2015 at 12:32 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Sawada Masahiko sawada.m...@gmail.com writes:
>>> On Mon, Feb 2, 2015 at 9:21 PM, Michael Paquier michael.paqu...@gmail.com wrote:
>>>> Now, I think that it may be better to provide the keyword VERBOSE before the type of object reindexed, as REINDEX [ VERBOSE ] object.
>>> Actually, my first WIP version of the patch added the VERBOSE word before the type of object. I'm finding it difficult to decide the position of the VERBOSE word in the REINDEX statement.
>> The way that FORCE was added to REINDEX was poorly thought out; let's not double down on that with another option added without any consideration for future expansion. I'd be happier if we adopted something similar to the modern syntax for VACUUM and EXPLAIN, ie, comma-separated options in parentheses.
> I understood. I'm imagining the new REINDEX syntax as follows:
>
>     REINDEX (INDEX, VERBOSE) hoge_idx;
>     REINDEX (TABLE) hoge_table;
>
> i.e., I will add the following syntax format:
>
>     REINDEX ( { INDEX | TABLE | SCHEMA | SYSTEM | DATABASE } , [VERBOSE] ) name [FORCE];
>
> Thoughts?

I don't think the keywords INDEX, TABLE, SCHEMA, SYSTEM and DATABASE are options... they are part of the command IMHO. Maybe something like:

    REINDEX { INDEX | TABLE | SCHEMA | SYSTEM | DATABASE } [ ( [ FORCE ], [VERBOSE] ) ] name;

And maintain the old syntax for compatibility, of course.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
Re: [HACKERS] Release note bloat is getting out of hand
On Tue, Feb 03, 2015 at 09:08:45AM -0500, Andrew Dunstan wrote:
> On 02/03/2015 08:55 AM, Kevin Grittner wrote:
>> Tom Lane t...@sss.pgh.pa.us wrote:
>>> Josh Berkus j...@agliodbs.com writes:
>>>> On 02/02/2015 05:39 PM, Peter Eisentraut wrote:
>>>>> I share the sentiment that the release notes *seem* too big, but the subsequent discussion shows that it's not clear why that's really a problem. Exactly what problem are we trying to fix?
>>>> At a rough count of lines, the release notes for unsupported versions are about 18% of documentation overall (47K out of 265K lines). So they're not insubstantial. Compared to the total size of the tarball, though ...
>>> It would not make that much of a difference in tarball size, agreed. It *would* make a difference in the build time and output size of the SGML docs --- as I mentioned at the outset, the release notes currently account for 25% of the SGML source linecount.
>> I run `make -s -j4 world` on my i7 fairly often, and it is often the doc build that I wind up waiting for at the end.
> I realize this is slightly OT, but I wonder if it might be worth having targets that build and install everything but the docs.

That'd be great.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On Mon, Feb 2, 2015 at 9:10 AM, Andres Freund and...@2ndquadrant.com wrote:
> On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
>> On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas robertmh...@gmail.com wrote:
>>> On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut pete...@gmx.net wrote:
>>>> If REINDEX cannot work without an exclusive lock, we should invent some other qualifier, like WITH FEWER LOCKS.
>>> What he said.
>> But more to the point: why, precisely, can't this work without an AccessExclusiveLock? And can't we fix that instead of settling for something clearly inferior?
> So, here's an alternative approach for how to get rid of the AEL locks. They're required because we want to switch the relfilenodes around. I've pretty much no confidence in any of the schemes anybody has come up with to avoid that.
>
> So, let's not switch relfilenodes around. I think we should instead just use the new index, repoint the dependencies onto the new oid, and then afterwards, when dropping, rename the new index onto the old one. That means the oid of the index will change, and some less-than-pretty grovelling around dependencies, but it still seems preferable to what we're discussing here otherwise.
>
> Does anybody see a fundamental problem with that approach?

I'm not sure whether that will work out, but it seems worth a try.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
Andres Freund and...@2ndquadrant.com writes:
> Additionally I think we should change the default for wal_level to hot_standby and max_wal_senders (maybe to 5). That way users can use pg_basebackup and set up streaming standbys without having to restart the primary. I think that'd be an important step in making setup easier.

I always thought the reason for defaulting to minimal was performance. I'd like to see proof that the impact of wal_level = hot_standby is negligible before we consider doing this.

The argument that having to change one more GUC is an undue burden while configuring hot standby seems ridiculous from here. HS is not nearly "push the EASY button and you're done", and this change wouldn't make it so.

			regards, tom lane
Re: [HACKERS] Release note bloat is getting out of hand
On Mon, Feb 2, 2015 at 4:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> So I'm bemused by Robert's insistence that he wants that format to support searches. As I said, I find it far more convenient to search the output of git log and/or src/tools/git_changelog --- I keep text files of those around for exactly that purpose.
>> I normally search in one of two ways. Sometimes I grep the sgml; other times, I go to, say, http://www.postgresql.org/docs/devel/static/release-9-4.html and then edit the URL to take me back to 9.3, 9.2, 9.1, etc. It's true that 'git log' is often the place to go searching for stuff, but there are times when it's easier to find out what release introduced a feature by looking at the release notes, and it's certainly more useful if you want to send a link to someone who is not git-aware illustrating the results of your search. Well, maybe I'm the only one who is doing this and it's not worth worrying about it just for me. But I do it, all the same.
> I'm not out to take away a feature you need. I'm just wondering why it has to be supported in exactly the way it's done now. Wouldn't a separately maintained release-notes-only document serve the purpose fine?

Well, I'd like to have all the release notes on the web site as they are now, and I'd like to have all the SGML in the build tree as it is now. If someone wants to create a "make docs-fast" target that omits older release notes, that's obviously not a problem. What I don't want is for information that is easily available now (put the right two digits into the URL bar and you are done) to suddenly require more thought to get at ("OK, so back through 9.0 is on the web site, and then there's this other place you can go to get the older stuff, if you can remember where it is").
Re: [HACKERS] Redesigning checkpoint_segments
Heikki Linnakangas hlinnakan...@vmware.com wrote:
> On 02/02/2015 04:21 PM, Andres Freund wrote:
>> On 2015-02-02 08:36:41 -0500, Robert Haas wrote:
>>> Also, I'd like to propose that we set the default value of max_checkpoint_segments/checkpoint_wal_size to something at least an order of magnitude larger than the current default setting.
>> +1
> I don't agree with that principle. I wouldn't mind increasing it a little bit, but not by an order of magnitude.

Especially without either confirming that this effect is no longer present, or having an explanation for it: http://www.postgresql.org/message-id/4A44E58C022500027FCF@gw.wicourts.gov

Note that Greg Smith found the same effect on a machine without any write caching, which shoots down my theory, at least on his machine: http://www.postgresql.org/message-id/flat/4bccdad5.3040...@2ndquadrant.com#4bccdad5.3040...@2ndquadrant.com

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] How about to have relnamespace and relrole?
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
> Most OID types have corresponding reg* types. These are very convenient when looking into the system catalogs/views, but there are no such types for user IDs and namespace IDs. What do you think about having these new OID types?

I'm not really excited about that. That line of thought would imply that we should have reg* types for every system catalog, which is surely overkill. The existing reg* types were chosen to handle the cases where objects have schema-qualified (and/or type-overloaded) names, so that correct lookup is not merely a matter of (select oid from pg_foo where name = 'bar') or vice versa. I think the policy is, or should be, that we create reg* types for exactly the cases where lookup isn't that simple. In particular, both of the cases you mention are hardly new. They existed when we made the current set of reg* types and were consciously not added then.

>     SELECT relnamespace::regnamespace, relname, relowner::regrole
>     FROM pg_class
>     WHERE relnamespace = 'public'::regnamespace;

Two reasons this isn't terribly compelling are (1) it's creating a join in a place where the planner can't possibly see it and optimize it, and (2) you risk MVCC anomalies because the reg* output routines would not be using the same snapshot as the calling query. We already have problem (2) with the existing reg* functions, so I'm not that excited about doubling down on the concept.

			regards, tom lane
Re: [HACKERS] Release note bloat is getting out of hand
On 2/3/15 4:04 PM, David Fetter wrote:
> On Mon, Feb 02, 2015 at 08:56:19PM -, Greg Sabino Mullane wrote:
>> Robert Haas wrote:
>>> but there are times when it's easier to find out what release introduced a feature by looking at the release notes, and it's certainly more useful if you want to send a link to someone who is not git-aware illustrating the results of your search. Well, maybe I'm the only one who is doing this and it's not worth worrying about it just for me. But I do it, all the same.
>> I do this *all the time*. Please don't mess with the release notes. Except to put them all on one page for easy searching. That would be awesome.
> Truly awesome. When supporting older versions, being able to find precisely when a feature was introduced in a single search through a file, or find-in-page for those using web browsers, would really smooth off some burrs.

And now that we're on the subject of ponies, it would be nice if the relevant git hashes were included as well.

.m
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 02/03/2015 03:44 AM, Jim Nasby wrote:
> [snip]
>> The alternative could be some long LOB (HugeOBject?) using the equivalent to serial8, whereas regular LOBs would use serial4.
> Well, it depends on how we did this. We could (for example) add a field to pg_class that determines what type to use for toast pointers; OID, int, or bigint. That could then be taken into account in the *toast* functions. But as others have pointed out, we haven't even had any real complaints about toast using OIDs as being an issue until now, so I think it's premature to start messing with this. At most it's just something to keep in mind so we don't preclude doing this in the future.

A patch creating those HOBs (Huge Objects) might well make sense *after* the sequence refactoring got merged. Removing the bottleneck due to the OID allocator for this use case will be definitively welcome (I don't dare to code that just yet, but here's hoping someone will step in O:-)

> BTW, regarding the size of what gets toasted; I've often thought it would be useful to allow a custom size limit on columns so that you could easily force data to be toasted if you knew you were very unlikely to access it. Basically, a cheap form of vertical partitioning.

Hmmm... ALTER COLUMN SET STORAGE EXTERNAL / SET STORAGE EXTENDED? From http://www.postgresql.org/docs/9.4/static/sql-altertable.html :

    ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }

This would do what you described, right?

HTH,

/ J.L.
Re: [HACKERS] [GENERAL] 4B row limit for CLOB tables
Matthew Kelly mke...@tripadvisor.com writes:
> However, I do have active databases where the current oid is between 1 billion and 2 billion. They were last dump-restored for a hardware upgrade a couple years ago and were a bit more than half the size. I therefore can imagine that I have tables which are keyed by ~8,000,000 consecutive oids. I would argue that when it wraps there will be a single insert that will probably block for 2-5 minutes while it tries to accomplish ~8,000,000 index scans inside of GetNewOidWithIndex. Even partitioning doesn't protect you from this potential problem.

That may be a hazard, but ...

> That being said, I'd be perfectly happy merely giving each TOAST table its own sequence, as that almost entirely mitigates the risk of an unexpected lock up on reasonably sized tables/partitions, and provides a functional work around for those of us with larger than average installs.

... this fix would actually make things enormously worse. With the single counter feeding all tables, you at least have a reasonable probability that there are not enormously long runs of consecutive OIDs in any one toast table. With a sequence per table, you are nearly guaranteed that there are such runs, because inserts into other tables don't create a break. (This effect is also why you're wrong to claim that partitioning can't fix it.)

			regards, tom lane
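Tom's run-length argument can be illustrated with a toy simulation (a hypothetical workload sketch, not the actual allocator): a shared counter interleaves allocations across tables, so any one table's OIDs have gaps, while per-table sequences hand each table one unbroken run.

```python
import random

def longest_consecutive_run(oids):
    """Length of the longest run of consecutive integers in the set."""
    oids = sorted(oids)
    best = run = 1
    for prev, cur in zip(oids, oids[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best

random.seed(42)
tables, inserts = 10, 10_000

# Shared counter: inserts into other tables break up each table's runs.
shared = [[] for _ in range(tables)]
counter = 0
for _ in range(inserts):
    shared[random.randrange(tables)].append(counter)
    counter += 1

# Per-table sequences: every table's values are perfectly consecutive.
per_table = [list(range(inserts // tables)) for _ in range(tables)]

print(max(longest_consecutive_run(t) for t in shared))     # short runs
print(max(longest_consecutive_run(t) for t in per_table))  # one unbroken run
```

The longest run bounds how many GetNewOidWithIndex probes a wrapped counter can be forced through before finding a free value, which is why long runs are the dangerous case.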
Re: [HACKERS] Release note bloat is getting out of hand
On Mon, Feb 02, 2015 at 08:56:19PM -, Greg Sabino Mullane wrote:
> Robert Haas wrote:
>> but there are times when it's easier to find out what release introduced a feature by looking at the release notes, and it's certainly more useful if you want to send a link to someone who is not git-aware illustrating the results of your search. Well, maybe I'm the only one who is doing this and it's not worth worrying about it just for me. But I do it, all the same.
> I do this *all the time*. Please don't mess with the release notes. Except to put them all on one page for easy searching. That would be awesome.

Truly awesome. When supporting older versions, being able to find precisely when a feature was introduced in a single search through a file, or find-in-page for those using web browsers, would really smooth off some burrs.

Cheers,
David.
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
Sawada Masahiko sawada.m...@gmail.com writes:
> On Tue, Feb 3, 2015 at 12:32 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> The way that FORCE was added to REINDEX was poorly thought out; let's not double down on that with another option added without any consideration for future expansion. I'd be happier if we adopted something similar to the modern syntax for VACUUM and EXPLAIN, ie, comma-separated options in parentheses.
> I understood. I'm imagining the new REINDEX syntax as follows:
>
>     REINDEX (INDEX, VERBOSE) hoge_idx;
>     REINDEX (TABLE) hoge_table;
>
> i.e., I will add the following syntax format:
>
>     REINDEX ( { INDEX | TABLE | SCHEMA | SYSTEM | DATABASE } , [VERBOSE] ) name [FORCE];

Well, the object type is not an optional part of the command. It's *necessary*. I was thinking more like

    REINDEX { INDEX | TABLE | etc } name [ ( option [, option ...] ) ]

    option := FORCE | VERBOSE

We'd still keep the historical syntax where you can write FORCE outside parens, but it'd be deprecated. Where to insert the parenthesized option list is a judgment call, but I'd lean to keeping it at the end where FORCE used to be.

			regards, tom lane
Re: [HACKERS] Unnecessary pointer-NULL checks in pgp-pgsql.c
On Sun, Feb 1, 2015 at 9:14 PM, Michael Paquier michael.paqu...@gmail.com wrote:
> Coverity is pointing out that we are doing pointer-NULL checks on things that cannot be NULL in decrypt_internal():
>
>     out:
>     -	if (src)
>     -		mbuf_free(src);
>     -	if (ctx)
>     -		pgp_free(ctx);
>     +	Assert(ctx != NULL && src != NULL && dst != NULL);
>     +	mbuf_free(src);
>     +	pgp_free(ctx);
>     	if (err)
>     	{
>     		px_set_debug_handler(NULL);
>     -		if (dst)
>     -			mbuf_free(dst);
>     +		mbuf_free(dst);
>
> src, dst and ctx are created respectively from mbuf_create_from_data, mbuf_create and pgp_init, which never return NULL as they are palloc'd all the time. I think that we could simplify things with the patch attached; note that I added an assertion for correctness, but I don't really think that it is much necessary.

Yeah, I'd drop the assertion. Also, how about changing things around slightly so that we lose the goto-label construct? There's only one goto, and it's only about 6 lines before the label, so we could just flip the sense of the if-test and put the code that gets skipped inside the if-block.
Re: [HACKERS] Release note bloat is getting out of hand
On Tue, Feb 3, 2015 at 3:51 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Feb 2, 2015 at 4:07 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Feb 2, 2015 at 3:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: So I'm bemused by Robert's insistence that he wants that format to support searches. As I said, I find it far more convenient to search the output of git log and/or src/tools/git_changelog --- I keep text files of those around for exactly that purpose. I normally search in one of two ways. Sometimes I grep the sgml; other times, I go to, say, http://www.postgresql.org/docs/devel/static/release-9-4.html and then edit the URL to take me back to 9.3, 9.2, 9.1, etc. It's true that 'git log' is often the place to go searching for stuff, but there are times when it's easier to find out what release introduced a feature by looking at the release notes, and it's certainly more useful if you want to send a link to someone who is not git-aware illustrating the results of your search. Well, maybe I'm the only one who is doing this and it's not worth worrying about it just for me. But I do it, all the same. I'm not out to take away a feature you need. I'm just wondering why it has to be supported in exactly the way it's done now. Wouldn't a separately maintained release-notes-only document serve the purpose fine? Well, I'd like to have all the release notes on the web site as they are now, and I'd like to have all the SGML in the build tree as it is now. If someone wants to create a make docs-fast target that omits older release notes, that's obviously not a problem. What I don't want to do is have information that is easily available now (put the right two digits into the URL bar and you are done) suddenly require more thought to get at (OK, so back through 9.0 is on the web site, and then there's this other place you can go to get the older stuff, if you can remember where it is). 
Not sure how feasible that is through build targets, but perhaps we could also include it in the HTML docs for the website per above, but skip it in the PDFs, which would make them a *lot* easier to deal with, I bet. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On 2015-02-03 10:20:03 -0500, Tom Lane wrote: Sawada Masahiko sawada.m...@gmail.com writes: On Tue, Feb 3, 2015 at 12:32 AM, Tom Lane t...@sss.pgh.pa.us wrote: The way that FORCE was added to REINDEX was poorly thought out; let's not double down on that with another option added without any consideration for future expansion. I'd be happier if we adopted something similar to the modern syntax for VACUUM and EXPLAIN, ie, comma-separated options in parentheses. I understood. I'm imagining the new REINDEX syntax as follows: - REINDEX (INDEX, VERBOSE) hoge_idx; - REINDEX (TABLE) hoge_table; i.e., I will add the following syntax format, REINDEX ( { INDEX | TABLE | SCHEMA | SYSTEM | DATABASE } , [VERBOSE] ) name [FORCE]; Well, the object type is not an optional part of the command. It's *necessary*. I was thinking more like REINDEX { INDEX | TABLE | etc } name [ ( option [, option ...] ) ] option := FORCE | VERBOSE We'd still keep the historical syntax where you can write FORCE outside parens, but it'd be deprecated. Why would we allow force inside the parens, given it's a backward compat only thing afaik? Don't get me wrong, I'm not at all against an extensible syntax, I just don't see a point in further cargo culting FORCE. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Redesigning checkpoint_segments
On Tue, Feb 3, 2015 at 7:31 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/02/2015 03:36 PM, Robert Haas wrote: Second, I *think* that these settings are symmetric and, if that's right, then I suggest that they ought to be named symmetrically. Basically, I think you've got min_checkpoint_segments (the number of recycled segments we keep around always) and max_checkpoint_segments (the maximum number of segments we can have between checkpoints), essentially splitting the current role of checkpoint_segments in half. I'd go so far as to suggest we use exactly that naming. It would be reasonable to allow the value to be specified in MB rather than in 16MB units, and to specify it that way by default, but maybe a unit-less value should have the old interpretation since everybody's used to it. That would require adding GUC_UNIT_XSEG or similar, but that seems OK. Works for me. However, note that max_checkpoint_segments = 10 doesn't mean the same as current checkpoint_segments = 10. With checkpoint_segments = 10 you end up with about 2x-3x as much WAL as with max_checkpoint_segments = 10. So the "everybody's used to it" argument doesn't hold much water. Hmm, that's surprising. Why does that happen? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] 4B row limit for CLOB tables
Hmm 2^32 times approx. 2kB (as per usual heuristics, ~4 rows per heap page) is 8796093022208 (~9e13) bytes ... which results in 8192 1GB segments :O 8192 1GB segments is just 8TB, it's not _that_ large. At TripAdvisor we’ve been using a NoSQL solution to do session storage. We are looking to probably swap that out to be Postgres (every other db backing the site is Postgres). Essentially, what I’m building is a system with 1 logical table that maps session id to a 2KB+ grab bag of ever changing session attributes which is partially normalized, partially json. 315 million uniques a month multiplied by the retention policy means I need to hold 2-4 billion session objects (and somehow expire old ones). Additionally, most http calls can update the session, so between maintenance windows I expect to take around 20 billion 'upserts’. Obviously, I will have to shard and partition the table in practice, but this weekend I ran a test that demonstrated that a single table on a 9.4 server + logical replication + Dell 730xd can handle 4x that workload. Well, it can for 38 hours… until you wrap xid’s on the toast table. :P I’ll be the first to admit that isn’t the normal use case though. I’m happy to have found this thread, however, because I’m going to have to build around the global oid counter, explicitly to prevent the problem I explain below regarding clustering. Anybody actually reaching this limit out there? Well, it's not the 4 billion row limit that concerns me, it's the global shared counter in conjunction with pg_restore/clustering that is actually pretty concerning. Just checked through all of TripAdvisor’s normal databases and the max tuples I see in a single toast table is 17,000,000, so that is still a couple of orders of magnitude too small. (however, close enough that it’ll be a concern in a few years). However, I do have active databases where the current oid is between 1 billion and 2 billion. 
They were last dump-restored for a hardware upgrade a couple years ago and were a bit more than half the size. I therefore can imagine that I have tables which are keyed by ~8,000,000 consecutive oids. I would argue that when it wraps there will be a single insert that will probably block for 2-5 minutes while it tries to accomplish ~8,000,000 index scans inside of GetNewOidWithIndex. Even partitioning doesn’t protect you from this potential problem. What's even more weird is that this issue can be triggered by consuming too many oid’s in a different database in the same cluster (i.e. creating large amounts of temp tables). The problem with changing the id from 32 to 64 bits is that the storage *for everybody else* doubles, making the implementation slower for most, though this might actually not be that important. Well, you aren’t doubling the storage. Even if you have to store the key in 4 places, you are adding 16 bytes per TOAST tuple. If we work off the 2KB estimate for each TOAST tuple, then you are only increasing the storage by 0.7%. I’m sure there are more hidden costs but we are really only talking about a low single digit percent increase. In exchange, you get to drop one index scan per toast insert; an index scan looking in the only hot part of the index. That being said I’d be perfectly happy merely giving each TOAST table its own sequence as that almost entirely mitigates the risk of an unexpected lock up on reasonably sized tables/partitions, and provides a functional work around for those of us with larger than average installs. - Matt K -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Missing markup in pg_receivexlog.sgml
On 02/03/2015 10:27 AM, Michael Paquier wrote: On Tue, Feb 3, 2015 at 4:25 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Flush the WAL data to disk immediately after it's being received. Also send a status packet back to the server immediately after flushing, regardless of <literal>--status-interval</> Yes, that's indeed better. As long as I am on it, attached is a patch for that... Ok, thanks :-). Applied. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Performance difference between kernel 3.10 and 3.12
I see a significant performance difference between systems using Linux kernel 3.10 and 3.12. We did tests on CentOS7 (kernel 3.10) and SuSE Enterprise Linux Server 12 (kernel 3.12). Both systems run on VMWare ESXi 5.5.0. Xeon CPU E52650 2GHz x 2 (16 cores in total) Mem 96GB, 10,000 RPM SAS drive Guest OS setting: CPU 4 Mem 6GB file system ext4 PostgreSQL 9.3.5 compiled from source code (all parameters are default) The benchmark tables are created by pgbench -i -s 500. pgbench -S -c NUM -T 500 runs 3 times for each NUM concurrent users and we calculate the average TPS for each run. We ran pgbench in 4 patterns: 1) SLES12 (kernel 3.12) 2) CentOS7 (kernel 3.10 with default deadline scheduler) 3) CentOS7 (kernel 3.10 with cfq scheduler) 4) CentOS7 (kernel replaced with 3.12 with cfq scheduler) The result is roughly: #1 > #4 > #2 > #3 So it seems the performance difference is dominated by the kernel version difference (3.10 vs. 3.12). Does anybody know the cause? Note that the kernel parameter sched_autogroup (which is available only with the 3.10 kernel) did not give much difference. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Missing markup in pg_receivexlog.sgml
On Tue, Feb 3, 2015 at 4:25 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Flush the WAL data to disk immediately after it's being received. Also send a status packet back to the server immediately after flushing, regardless of <literal>--status-interval</> Yes, that's indeed better. As long as I am on it, attached is a patch for that... -- Michael diff --git a/doc/src/sgml/ref/pg_receivexlog.sgml b/doc/src/sgml/ref/pg_receivexlog.sgml index be321b5..0c99744 100644 --- a/doc/src/sgml/ref/pg_receivexlog.sgml +++ b/doc/src/sgml/ref/pg_receivexlog.sgml @@ -139,9 +139,9 @@ PostgreSQL documentation <term><option>--synchronous</option></term> <listitem> <para> -Issue sync commands as soon as there is WAL data which has not been -flushed yet. Also status packets are sent back to the server just after -WAL data is flushed whatever <literal>--status-interval</> is set to. +Flush the WAL data to disk immediately after it has been received. Also +send a status packet back to the server immediately after flushing, +regardless of <literal>--status-interval</>. </para> </listitem> </varlistentry> -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Small memory leak in execute.c of ECPG driver
On 02/03/2015 08:58 AM, Michael Paquier wrote: Hi all, In exactly 3 places of the ECPG driver (for numeric, for interval and for date), we do something as follows: /* Allocation of mallocedval */ if (!(mallocedval = ecpg_strdup("array [", lineno))) return false; for (element = 0; element < var->arrsize; element++) { int result; ptr = stuff_alloc(); if (!ptr) return false; <= Leak here of mallocedval It happens that if the allocation done within this for loop fails we leak mallocedval that was previously allocated. Attached is a patch to fix this issue spotted by Coverity. I think there are more similar leaks nearby. After the first hunk, there's another if-check with return false that also leaks mallocedval. Right after the two other hunks, if the ecpg_realloc fails, we again leak mallocedval. I wonder why Coverity didn't warn about those? Maybe it would've, after fixing the first ones. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Fetch zero result rows when executing a query?
Sorry if this has been asked before, couldn't find any mention... I'm working on Npgsql, the .NET driver for PostgreSQL, and am trying to find a way to execute a query but without fetching any rows. The Execute message has a maximum result-row count, but zero is documented to mean fetch all rows. The use case would be sending a query which might modify or might not (e.g. UPDATE), but we know that the user is uninterested in any result row. My current workaround is to specify maxrows=1, was wondering if I missed a better alternative. Thanks, Shay
Re: [HACKERS] Fetch zero result rows when executing a query?
Hi, On 2015-02-03 12:26:33 +0100, Shay Rojansky wrote: Sorry if this has been asked before, couldn't find any mention... I'm working on the Npgsql, the .NET driver for PostgreSQL, and am trying to find a way to execute a query but without fetching any rows. The Execute message has a maximum result-row count, but zero is documented to mean fetch all rows. The use case would be sending a query which might modify or might not (e.g. UPDATE), but we know that the user is uninterested in any result row. My current workaround is to specify maxrows=1, was wondering if I missed a better alternative. Is this really a relevant optimization? If the user doesn't want results, RETURNING shouldn't be specified... Sure, sometimes the same query will be reused over cases where you want the results and those where you don't, but I doubt this is worthy of optimization. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On 2/3/15 9:20 AM, Tom Lane wrote: i.e., I will add the following syntax format, REINDEX ( { INDEX | TABLE | SCHEMA | SYSTEM | DATABASE } , [VERBOSE] ) name [FORCE]; Well, the object type is not an optional part of the command. It's *necessary*. I was thinking more like REINDEX { INDEX | TABLE | etc } name [ ( option [, option ...] ) ] VACUUM puts the options before the table name, so ISTM it'd be best to keep that with REINDEX. Either REINDEX (options) {INDEX | ...} or REINDEX {INDEX | ...} (options). -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 2/3/15 9:50 AM, David Steele wrote: On 2/3/15 10:01 AM, José Luis Tallón wrote: Hmmm alter column set storage external / set storage extended ? From http://www.postgresql.org/docs/9.4/static/sql-altertable.html : ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN } This would do what you described, right? EXTENDED is the default for most TOAST-able types and is still subject to TOAST_TUPLE_THRESHOLD which is normally 2K. EXTERNAL is the same but with no compression. See: http://www.postgresql.org/docs/9.4/static/storage-toast.html Right. I'd like to be able to set per-column TOAST_TUPLE_THRESHOLD. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] 4B row limit for CLOB tables
On 2/3/15 9:01 AM, Tom Lane wrote: Matthew Kelly mke...@tripadvisor.com writes: However, I do have active databases where the current oid is between 1 billion and 2 billion. They were last dump-restored for a hardware upgrade a couple years ago and were a bit more than half the size. I therefore can imagine that I have tables which are keyed by ~8,000,000 consecutive oids. I would argue that when it wraps there will be a single insert that will probably block for 2-5 minutes while it tries to accomplish ~8,000,000 index scans inside of GetNewOidWithIndex. Even partitioning doesn’t protect you from this potential problem. That may be a hazard, but ... That being said I’d be perfectly happy merely giving each TOAST table its own sequence as that almost entirely mitigates the risk of an unexpected lock up on reasonably sized tables/partitions, and provides a functional work around for those of us with larger than average installs. ... this fix would actually make things enormously worse. With the single counter feeding all tables, you at least have a reasonable probability that there are not enormously long runs of consecutive OIDs in any one toast table. With a sequence per table, you are nearly guaranteed that there are such runs, because inserts into other tables don't create a break. (This effect is also why you're wrong to claim that partitioning can't fix it.) That's assuming that toasting is evenly spread between tables. In my experience, that's not a great bet... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema
On Mon, Feb 2, 2015 at 2:44 PM, Ryan Kelly rpkell...@gmail.com wrote: The attached patch adds a LINE: ... hint when schemaname.typename results in a schema which does not exist. I came across this when a missing comma in a SELECT list resulted in an error without a location in a query a few thousand lines long. Before: (postgres@[local]:5432 14:41:25) [postgres] select test.id 'all' as example from test; ERROR: 3F000: schema test does not exist LOCATION: get_namespace_oid, namespace.c:2826 After: (postgres@[local]:5433 14:42:32) [postgres] select test.id 'all' as example from test; ERROR: 3F000: schema test does not exist LINE 1: select test.id 'all' as example from test; ^ LOCATION: LookupTypeName, parse_type.c:171 -Ryan Kelly Please add your patch to https://commitfest.postgresql.org/action/commitfest_view/open so we don't forget about it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
On 2015-02-03 10:41:04 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: Additionally I think we should change the default for wal_level to hot_standby and max_wal_senders (maybe to 5). That way users can use pg_basebackup and setup streaming standbys without having to restart the primary. I think that'd be an important step in making setup easier. I always thought the reason for defaulting to minimal was performance. I'd like to see proof that the impact of wal_level = hot_standby is negligible before we consider doing this. Well, it really depends on what you're doing. The cases where minimal is beneficial are when you COPY into a table that's been created in the same (sub)xact or rewrite it. Other than that there's not really a difference? The argument that having to change one more GUC is an undue burden while configuring hot standby seems ridiculous from here. HS is not nearly push the EASY button and you're done, and this change wouldn't make it so. But pg_basebackup is pretty close to being that easy, isn't it? There's still pg_hba.conf to deal with, but other than that... And with a little bit more work (safely autocreating replication slots, so WAL files aren't getting removed prematurely), we can make HS simpler as well. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
Andres Freund and...@2ndquadrant.com writes: On 2015-02-03 10:20:03 -0500, Tom Lane wrote: Well, the object type is not an optional part of the command. It's *necessary*. I was thinking more like REINDEX { INDEX | TABLE | etc } name [ ( option [, option ...] ) ] option := FORCE | VERBOSE We'd still keep the historical syntax where you can write FORCE outside parens, but it'd be deprecated. Why would we allow force inside the parens, given it's a backward compat only thing afaik? Don't get me wrong, I'm not at all against an extensible syntax, I just don't see a point in further cargo culting FORCE. Ah, I'd forgotten that that option was now a no-op. Yeah, there's no reason to support it in the new syntax. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Release note bloat is getting out of hand
On 2/3/15 10:11 AM, Marko Tiikkaja wrote: And now that we're on the subject of ponies, it would be nice if the relevant git hashes were included as well. That's probably not going to happen. A release-note entry is often the combination of many commits, and accurately tracking those is a lot of work. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
On Tue, Feb 3, 2015 at 7:43 AM, Andres Freund and...@2ndquadrant.com wrote: I think these days there's no reason for the split between the archive and hot_standby wal levels. The split was made out of volume and stability concerns. I think we can by now be confident about the wal_level = hot_standby changes (note I'm not proposing hot_standby = on). So let's remove the split. It just gives users choice between two options that don't have a meaningful difference. Additionally I think we should change the default for wal_level to hot_standby and max_wal_senders (maybe to 5). That way users can use pg_basebackup and setup streaming standbys without having to restart the primary. I think that'd be an important step in making setup easier. Previously there have been arguments against changing the default of wal_level because it'd mean the regression tests wouldn't exercise minimal anymore. That might be true, but then right now we just don't exercise the more complex levels. If we're really concerned we can just force a different value during the tests, just as we do for prepared xacts. Comments? Additionally, more complex and further into the future, I wonder if we couldn't also get rid of wal_level = logical by automatically switching to it whenever logical slots are active. I think my vote is to maintain the status quo. What you're basically proposing to do is ship the system half-configured for replication, and I don't see the point of that. The people who want replication still have to do the rest of the setup anyway, and the people who don't want replication are losing the benefits of wal_level=minimal for no real gain. In particular, they are losing the ability to skip WAL-logging when bulk-loading a just-created table, which is not a small thing. I'm fairly sure we have customers who benefit significantly from that behavior. I agree that wal_level=archive doesn't serve much purpose at this point. 
I guess I wouldn't object to removing that, although I can't see much benefit to doing so, either. Crazy ideas: Could we make wal_level something other than PGC_POSTMASTER? PGC_SIGHUP would be nice... Could we, maybe, even make it a derived value rather than one that is explicitly configured? Like, if you set max_wal_senders > 0, you automatically get wal_level=hot_standby? If you register a logical replication slot, you automatically get wal_level=logical? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of LSEG.m
On 02/03/2015 06:47 PM, Tom Lane wrote: Or perhaps we should just remove both the field and the ifdef'd assignments. That's a bit more drastic but I can't really see this code ever coming back to life ... especially since the notion of a field that's not stored on disk but is valid in in-memory copies seems impossibly error-prone. Most functions can have no idea whether their input is residing in a disk buffer or not. And adding the bookkeeping to determine that would surely cost more than just recomputing the slope when needed. +1 for removing it altogether. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Release note bloat is getting out of hand
On 2/3/15 12:32 AM, Tom Lane wrote: It would not make that much of a difference in tarball size, agreed. It *would* make a difference in the build time and output size of the SGML docs --- as I mentioned at the outset, the release notes currently account for 25% of the SGML source linecount. We could make the release notes a separate top-level document. Then it could be built in parallel with the documentation. There are only two links from the rest of the docs into the release notes, so there wouldn't be much harm. Doing this would also solve an unrelated issue: Sometimes, when I run a spell or markup checker over the documentation, the old release notes tend to get flagged for many things, because we don't edit those after the release. Separating the actively maintained documentation from the it-is-what-it-is release notes would make that distinction clearer. The web site would need to be reorganized slightly, but I can imagine that in the long run this could also be a win. Note also that you only need to present the release notes from the latest stable release branch on the web site, as opposed to documentation for each branch. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema
Ryan Kelly rpkell...@gmail.com writes: The attached patch adds a LINE: ... hint when schemaname.typename results in a schema which does not exist. I came across this when a missing comma in a SELECT list resulted in an error without a location in a query a few thousand lines long. I think this is a good problem to attack, but the proposed patch seems a bit narrow and brute-force. Surely this isn't the only place where we're missing an errposition pointer on a no such schema error? It's probably not reasonable to add a pstate argument to LookupExplicitNamespace itself, given that namespace.c isn't part of the parser. But you could imagine adding an interface function in the parser that calls LookupExplicitNamespace and then throws a position-aware error on failure, and then making sure that all schema lookups in the parser go through that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Getting rid of LSEG.m
I noticed a Coverity complaint about the m field of LSEG being uninitialized, which indeed it is because all the places that would set it are ifdef'd out. So why didn't the field itself get ifdef'd out as well? Or perhaps we should just remove both the field and the ifdef'd assignments. That's a bit more drastic but I can't really see this code ever coming back to life ... especially since the notion of a field that's not stored on disk but is valid in in-memory copies seems impossibly error-prone. Most functions can have no idea whether their input is residing in a disk buffer or not. And adding the bookkeeping to determine that would surely cost more than just recomputing the slope when needed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Release note bloat is getting out of hand
Peter Eisentraut pete...@gmx.net writes: Note also that you only need to present the release notes from the latest stable release branch on the web site, as opposed to documentation for each branch. Yeah, JD suggested the same upthread. If we went over to a separate document containing all the historical notes, then it would make sense for the main documentation to contain only release notes for the current branch, which would further reduce its build time. My thread-starting proposal of keeping the last five branches was based on the assumption that we didn't need any whole-history document, but if we're keeping one separately then this seems to make the most sense. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 2/3/15 10:01 AM, José Luis Tallón wrote: Hmmm alter column set storage external / set storage extended ? From http://www.postgresql.org/docs/9.4/static/sql-altertable.html : ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN } This would do what you described, right? EXTENDED is the default for most TOAST-able types and is still subject to TOAST_TUPLE_THRESHOLD which is normally 2K. EXTERNAL is the same but with no compression. See: http://www.postgresql.org/docs/9.4/static/storage-toast.html -- - David Steele da...@pgmasters.net
Re: [HACKERS] Redesigning checkpoint_segments
On Tue, Feb 3, 2015 at 10:44 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Works for me. However, note that max_checkpoint_segments = 10 doesn't mean the same as current checkpoint_segments = 10. With checkpoint_segments = 10 you end up with about 2x-3x as much WAL as with max_checkpoint_segments = 10. So the everybody's used to it argument doesn't hold much water. Hmm, that's surprising. Why does that happen? That's the whole point of this patch. max_checkpoint_segments = 10, or max_checkpoint_segments = 160 MB, means that the system will begin a checkpoint so that when the checkpoint completes, and it truncates away or recycles old WAL, the total size of pg_xlog is 160 MB. That's different from our current checkpoint_segments setting. With checkpoint_segments, the documented formula for calculating the disk usage is (2 + checkpoint_completion_target) * checkpoint_segments * 16 MB. That's a lot less intuitive to set. Hmm, that's different from what I was thinking. We probably shouldn't call that max_checkpoint_segments, then. I got confused and thought you were just trying to decouple the number of segments that it takes to trigger a checkpoint from the number we keep preallocated. But I'm confused: how can we know how much new WAL will be written before the checkpoint completes? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Release note bloat is getting out of hand
Peter Eisentraut pete...@gmx.net writes: On 2/3/15 10:11 AM, Marko Tiikkaja wrote: And now that we're on the subject of ponies, it would be nice if the relevant git hashes were included as well. That's probably not going to happen. A release-note entry is often the combination of many commits, and accurately tracking those is a lot of work. Well, actually, modern release note entries tend to look like

<!--
Author: Andres Freund <and...@anarazel.de>
Branch: master [3fabed070] 2015-01-07 00:19:37 +0100
Branch: REL9_4_STABLE [7da102154] 2015-01-07 00:24:58 +0100

Author: Andres Freund <and...@anarazel.de>
Branch: master [31912d01d] 2015-01-07 00:18:00 +0100
Branch: REL9_4_STABLE [84911ff51] 2015-01-07 00:24:47 +0100
-->
<listitem>
 <para>
  Assorted fixes for logical decoding (Andres Freund)
 </para>
</listitem>

because they're prepared on the basis of src/tools/git_changelog output and so having the git hashes is merely a matter of not deleting that info after we're done writing the user-visible text. So I could imagine some tool that presents that info. I'm not volunteering to write it though ... regards, tom lane
Re: [HACKERS] Redesigning checkpoint_segments
On 03/02/15 16:50, Robert Haas wrote: On Tue, Feb 3, 2015 at 10:44 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: That's the whole point of this patch. max_checkpoint_segments = 10, or max_checkpoint_segments = 160 MB, means that the system will begin a checkpoint so that when the checkpoint completes, and it truncates away or recycles old WAL, the total size of pg_xlog is 160 MB. That's different from our current checkpoint_segments setting. With checkpoint_segments, the documented formula for calculating the disk usage is (2 + checkpoint_completion_target) * checkpoint_segments * 16 MB. That's a lot less intuitive to set. Hmm, that's different from what I was thinking. We probably shouldn't call that max_checkpoint_segments, then. I got confused and thought you were just trying to decouple the number of segments that it takes to trigger a checkpoint from the number we keep preallocated. But I'm confused: how can we know how much new WAL will be written before the checkpoint completes? The preallocation is based on estimated size of next checkpoint which is basically running average of the previous checkpoints with some additional adjustments for unsteady behavior (last checkpoint has higher weight in the formula). (we also still internally have the CheckPointSegments which is calculated the way Heikki described above) In any case, I don't like the max_checkpoint_segments naming too much, and I don't even like the number of segments as limit too much, I think the ability to set this in actual size is quite nice property of this patch and as Heikki says the numbers don't map that well to the old ones in practice. I did some code reading and I do like the patch. Basically only negative thing I can say is that I am not big fan of _logSegNo variable name but that's not new in this patch, we use it all over the place in xlog. 
I would vote for a bigger default for checkpoint_wal_size (or whatever it will be named) though, since the current one is not much bigger in practice than what we have now, and that one is way too conservative. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Redesigning checkpoint_segments
On 02/03/2015 05:19 PM, Robert Haas wrote: On Tue, Feb 3, 2015 at 7:31 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/02/2015 03:36 PM, Robert Haas wrote: Second, I *think* that these settings are symmetric and, if that's right, then I suggest that they ought to be named symmetrically. Basically, I think you've got min_checkpoint_segments (the number of recycled segments we keep around always) and max_checkpoint_segments (the maximum number of segments we can have between checkpoints), essentially splitting the current role of checkpoint_segments in half. I'd go so far as to suggest we use exactly that naming. It would be reasonable to allow the value to be specified in MB rather than in 16MB units, and to specify it that way by default, but maybe a unit-less value should have the old interpretation since everybody's used to it. That would require adding GUC_UNIT_XSEG or similar, but that seems OK. Works for me. However, note that max_checkpoint_segments = 10 doesn't mean the same as current checkpoint_segments = 10. With checkpoint_segments = 10 you end up with about 2x-3x as much WAL as with max_checkpoint_segments = 10. So the "everybody's used to it" argument doesn't hold much water. Hmm, that's surprising. Why does that happen? That's the whole point of this patch. max_checkpoint_segments = 10, or max_checkpoint_segments = 160 MB, means that the system will begin a checkpoint so that when the checkpoint completes, and it truncates away or recycles old WAL, the total size of pg_xlog is 160 MB. That's different from our current checkpoint_segments setting. With checkpoint_segments, the documented formula for calculating the disk usage is (2 + checkpoint_completion_target) * checkpoint_segments * 16 MB. That's a lot less intuitive to set. - Heikki
Re: [HACKERS] SSL information view
On Tue, Feb 3, 2015 at 6:42 AM, Michael Paquier michael.paqu...@gmail.com wrote: Where are we on this patch? No new version has been provided and there have been comments provided by Heikki here (5491e547.4040...@vmware.com) and by Alexei here (87ppbqz00h@commandprompt.com). Yeah, it's on my shame list. I'm definitely planning to get a new version in before the next CF and to try to work quicker with it that time to finish it off on time. Thanks for the reminder! -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] PageRepairFragmentation performance
On 01/31/2015 01:49 AM, Peter Geoghegan wrote: The refactoring patch certainly looks very reasonable. Ok, committed the refactoring part for now. Thanks for the review. - Heikki
Re: [HACKERS] Small memory leak in execute.c of ECPG driver
On Tue, Feb 3, 2015 at 5:35 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: I think there are more similar leaks nearby. After the first hunk, there's another if-check with return false that also leaks mallocedval. Right after the two other hunks, if the ecpg_realloc fails, we again leak mallocedval. Yes, I found some extra ones by re-reading the code again with newcopy (2) as well as mallocedval (1) as you mentioned. I wonder why Coverity didn't warn about those? Maybe it would've, after fixing the first ones. Hard to say. Perhaps it gives up after finding one failure in a code path, or perhaps it would have found it after a version update.. In any case, an updated patch is attached. -- Michael

From 4e195f162d879bf563fec052710dd10ccdc4a89a Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Tue, 3 Feb 2015 15:48:16 +0900
Subject: [PATCH] Fix memory leak in ecpg driver

Issue pointed out by Coverity.
---
 src/interfaces/ecpg/ecpglib/execute.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/interfaces/ecpg/ecpglib/execute.c b/src/interfaces/ecpg/ecpglib/execute.c
index 8a3dd75..5d26af7 100644
--- a/src/interfaces/ecpg/ecpglib/execute.c
+++ b/src/interfaces/ecpg/ecpglib/execute.c
@@ -803,7 +803,10 @@ ecpg_store_input(const int lineno, const bool force_indicator, const struct vari
 			mallocedval = quote_postgres(newcopy, quote, lineno);
 			if (!mallocedval)
+			{
+				ecpg_free(newcopy);
 				return false;
+			}
 
 			*tobeinserted_p = mallocedval;
 		}
@@ -835,7 +838,10 @@ ecpg_store_input(const int lineno, const bool force_indicator, const struct vari
 			mallocedval = quote_postgres(newcopy, quote, lineno);
 			if (!mallocedval)
+			{
+				ecpg_free(newcopy);
 				return false;
+			}
 
 			*tobeinserted_p = mallocedval;
 		}
@@ -859,7 +865,10 @@ ecpg_store_input(const int lineno, const bool force_indicator, const struct vari
 				nval = PGTYPESnumeric_new();
 				if (!nval)
+				{
+					ecpg_free(mallocedval);
 					return false;
+				}
 
 				if (var->type == ECPGt_numeric)
 					result = PGTYPESnumeric_copy((numeric *) ((var + var->offset * element)->value), nval);
@@ -869,6 +878,7 @@ ecpg_store_input(const int lineno, const bool force_indicator, const struct vari
 				if (result != 0)
 				{
 					PGTYPESnumeric_free(nval);
+					ecpg_free(mallocedval);
 					return false;
 				}
@@ -940,7 +950,10 @@ ecpg_store_input(const int lineno, const bool force_indicator, const struct vari
 				{
 					str = quote_postgres(PGTYPESinterval_to_asc((interval *) ((var + var->offset * element)->value)), quote, lineno);
 					if (!str)
+					{
+						ecpg_free(mallocedval);
 						return false;
+					}
 
 					slen = strlen(str);
 
 					if (!(mallocedval = ecpg_realloc(mallocedval, strlen(mallocedval) + slen + 2, lineno)))
@@ -991,7 +1004,10 @@ ecpg_store_input(const int lineno, const bool force_indicator, const struct vari
 				{
 					str = quote_postgres(PGTYPESdate_to_asc(*(date *) ((var + var->offset * element)->value)), quote, lineno);
 					if (!str)
+					{
+						ecpg_free(mallocedval);
 						return false;
+					}
 
 					slen = strlen(str);
 
 					if (!(mallocedval = ecpg_realloc(mallocedval, strlen(mallocedval) + slen + 2, lineno)))
-- 
2.2.2
Re: [HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
On 2015-02-03 13:51:25 +0100, Magnus Hagander wrote: Those who want to optimize their WAL size can set it back to minimal, but let's make the default the one that makes life *easy* for people. Precisely. New users won't usually have tremendous stuff to load in the specific circumstances in which minimal makes a difference. And pretty much everyone uses base backups / replication outside of initial loading anyway. The other option, which would be more complicated (I have a semi-finished patch that I never got time to clean up) would be for pg_basebackup to be able to dynamically raise the value of wal_level during its run. It would not help with the streaming standby part, but it would help with pg_basebackup. That could be useful independently - for those who prefer using wal_level=minimal and also pg_basebackup.. There are some ugly corner cases in making that happen. I'm sure we could do it, but I'm really not convinced it's worth the effort. Previously there have been arguments against changing the default of wal_level because it'd mean the regression tests wouldn't exercise minimal anymore. That might be true, but then right now we just don't exercise the more complex levels. If we're really concerned we can just force a different value during the tests, just as we do for prepared xacts. Seems we should focus our tests on the stuff that people actually use in reality? :) And if we change the default, then even more people will use that level. Agreed. It's not my argument ;) But it would definitely be a good idea to have some buildfarm animals set up to test each one. .oO(make it random in the buildfarm code) Additionally, more complex and further into the future, I wonder if we couldn't also get rid of wal_level = logical by automatically switching to it whenever logical slots are active. If it can be safely done online, I definitely think that would be a good goal to have. I think it could at least be made to work on the primary. 
I don't really see how on a standby (which we don't support yet anyway). If we could do the same for hot_standby if you had physical slots, that might also be a good idea? I think it's slightly more complicated there. We'd have to delay slot creation till after a checkpoint and such, which doesn't seem that desirable. I think it's more interesting for logical anyway - there are some common workloads where it actually does imply some (not large, mind you) overhead. So we can't just change to it as a default. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
On Tue, Feb 3, 2015 at 9:43 PM, Andres Freund and...@2ndquadrant.com wrote: I think these days there's no reason for the split between the archive and hot_standby wal levels. The split was made out of volume and stability concerns. I think we can by now be confident about the wal_level = hot_standby changes (note I'm not proposing hot_standby = on). +1. So let's remove the split. It just gives users choice between two options that don't have a meaningful difference. The last time I mentioned something similar (purely removing archive from wal_level, CA+TgmoaTG9U4=a_bs8sbdemm2+fapqhzujhj7f-npfy+bns...@mail.gmail.com), two additional suggestions were made as well: - Keep archive and make it mean archive = hot_standby - Do nothing, to still let the users do what they think is better and not what we think is better. Perhaps times have changed since... I guess that you mean making both values become equivalent, right? -- Michael
Re: [HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
On 03/02/15 13:51, Magnus Hagander wrote: On Tue, Feb 3, 2015 at 1:43 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, I think these days there's no reason for the split between the archive and hot_standby wal levels. The split was made out of volume and stability concerns. I think we can by now be confident about the wal_level = hot_standby changes (note I'm not proposing hot_standby = on). So let's remove the split. It just gives users choice between two options that don't have a meaningful difference. +1. +1 too Additionally I think we should change the default for wal_level to hot_standby and max_wal_senders (maybe to 5). That way users can use pg_basebackup and set up streaming standbys without having to restart the primary. I think that'd be an important step in making setup easier. Yes, please! Those who want to optimize their WAL size can set it back to minimal, but let's make the default the one that makes life *easy* for people. The other option, which would be more complicated (I have a semi-finished patch that I never got time to clean up) would be for pg_basebackup to be able to dynamically raise the value of wal_level during its run. It would not help with the streaming standby part, but it would help with pg_basebackup. That could be useful independently - for those who prefer using wal_level=minimal and also pg_basebackup.. This is not that easy to do, let's do it one step at a time. Comments? Additionally, more complex and further into the future, I wonder if we couldn't also get rid of wal_level = logical by automatically switching to it whenever logical slots are active. If it can be safely done online, I definitely think that would be a good goal to have. If we could do the same for hot_standby if you had physical slots, that might also be a good idea? +many for the logical, physical would be nice but I think it's again in the category of not so easy and maybe better as a next step, if at all. 
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
On 2015-02-03 21:58:44 +0900, Michael Paquier wrote: On Tue, Feb 3, 2015 at 9:43 PM, Andres Freund and...@2ndquadrant.com wrote: I think these days there's no reason for the split between the archive and hot_standby wal levels. The split was made out of volume and stability concerns. I think we can by now be confident about the wal_level = hot_standby changes (note I'm not proposing hot_standby = on). +1. So let's remove the split. It just gives users choice between two options that don't have a meaningful difference. The last time I mentioned something similar (purely removing archive from wal_level CA+TgmoaTG9U4=a_bs8sbdemm2+fapqhzujhj7f-npfy+bns...@mail.gmail.com), two additional suggestions were made as well: - Keep archive and make it mean archive = hot_standby That's actually what I was thinking of implementing. I.e. accept archive, but output hot_standby. - Do nothing, to still let the users do what they think is better and not what we think is better. I'd rather remove the supporting code that takes different branches based on those. Perhaps times have changed since... I don't think the arguments back then made all that much sense. But perhaps they're also less important if we change the default. For me, removing an option whose only effects are saving a couple of bytes every now and then and adding to the testing matrix isn't 'nannyism'. It's removing pointless choice. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Unlikely-to-happen crash in ecpg driver caused by NULL-pointer check not done
On Tue, Feb 3, 2015 at 7:50 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Hmm. Since the ecpg_add_mem call is done after setting (*(void **) var), that's left to point to already-free'd memory. The other call sites have a similar issue. I haven't analyzed the code to check if that's harmless or not, but seems better to not do that. Right, it is an error to allocate this memory if there is a risk that it may be freed. Hence fixed. In ecpg_add_mem, the ecpg_raise() call is unnecessary, since ecpg_alloc already does that on failure. Right, check. (It would be less error-prone to have an ecpg_alloc_auto() function that combines ecpg_alloc+ecpg_add_mem in one call.) It makes sense. /* Here are some methods used by the lib. */ /* Returns a pointer to a string containing a simple type name. */ That second comment is completely bogus. It's not this patch's fault, it's been like that forever, but I just happened to notice.. Corrected. All those things are addressed in the patch attached. -- Michael

From 206dc0dd95a83e056968ac822f7b9331ffcf82cd Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Tue, 3 Feb 2015 16:21:50 +0900
Subject: [PATCH] Fix unlikely-to-happen crash in ecpg_add_mem

This routine called ecpg_alloc to allocate memory but did not actually
check the returned pointer, which is the result of a malloc call and
can therefore be NULL. Issue noted by Coverity, though I guessed at the
legwork needed here.
---
 src/interfaces/ecpg/ecpglib/descriptor.c | 6 ++
 src/interfaces/ecpg/ecpglib/execute.c    | 8 +++-
 src/interfaces/ecpg/ecpglib/extern.h     | 4 ++--
 src/interfaces/ecpg/ecpglib/memory.c     | 22 +-
 4 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/src/interfaces/ecpg/ecpglib/descriptor.c b/src/interfaces/ecpg/ecpglib/descriptor.c
index b2990ca..956c035 100644
--- a/src/interfaces/ecpg/ecpglib/descriptor.c
+++ b/src/interfaces/ecpg/ecpglib/descriptor.c
@@ -432,7 +432,7 @@ ECPGget_desc(int lineno, const char *desc_name, int index,...)
 		/* allocate storage if needed */
 		if (arrsize == 0 && *(void **) var == NULL)
 		{
-			void *mem = (void *) ecpg_alloc(offset * ntuples, lineno);
+			void *mem = (void *) ecpg_auto_alloc(offset * ntuples, lineno);
 
 			if (!mem)
 			{
@@ -440,7 +440,6 @@ ECPGget_desc(int lineno, const char *desc_name, int index,...)
 				return false;
 			}
 			*(void **) var = mem;
-			ecpg_add_mem(mem, lineno);
 			var = mem;
 		}
@@ -510,7 +509,7 @@ ECPGget_desc(int lineno, const char *desc_name, int index,...)
 			/* allocate storage if needed */
 			if (data_var.ind_arrsize == 0 && data_var.ind_value == NULL)
 			{
-				void *mem = (void *) ecpg_alloc(data_var.ind_offset * ntuples, lineno);
+				void *mem = (void *) ecpg_auto_alloc(data_var.ind_offset * ntuples, lineno);
 
 				if (!mem)
 				{
@@ -518,7 +517,6 @@ ECPGget_desc(int lineno, const char *desc_name, int index,...)
 					return false;
 				}
 				*(void **) data_var.ind_pointer = mem;
-				ecpg_add_mem(mem, lineno);
 				data_var.ind_value = mem;
 			}
diff --git a/src/interfaces/ecpg/ecpglib/execute.c b/src/interfaces/ecpg/ecpglib/execute.c
index 8a3dd75..46d884f 100644
--- a/src/interfaces/ecpg/ecpglib/execute.c
+++ b/src/interfaces/ecpg/ecpglib/execute.c
@@ -398,11 +398,10 @@ ecpg_store_result(const PGresult *results, int act_field,
 		}
 		ecpg_log("ecpg_store_result on line %d: allocating memory for %d tuples\n", stmt->lineno, ntuples);
-		var->value = (char *) ecpg_alloc(len, stmt->lineno);
+		var->value = (char *) ecpg_auto_alloc(len, stmt->lineno);
 		if (!var->value)
 			return false;
 		*((char **) var->pointer) = var->value;
-		ecpg_add_mem(var->value, stmt->lineno);
 	}
 
 	/* allocate indicator variable if needed */
@@ -410,11 +409,10 @@ ecpg_store_result(const PGresult *results, int act_field,
 	{
 		int			len = var->ind_offset * ntuples;
 
-		var->ind_value = (char *) ecpg_alloc(len, stmt->lineno);
+		var->ind_value = (char *) ecpg_auto_alloc(len, stmt->lineno);
 		if (!var->ind_value)
 			return false;
 		*((char **) var->ind_pointer) = var->ind_value;
-		ecpg_add_mem(var->ind_value, stmt->lineno);
 	}
 
 	/* fill the variable with the tuple(s) */
diff --git a/src/interfaces/ecpg/ecpglib/extern.h b/src/interfaces/ecpg/ecpglib/extern.h
index 3836007..2b670e0 100644
--- a/src/interfaces/ecpg/ecpglib/extern.h
+++ b/src/interfaces/ecpg/ecpglib/extern.h
@@ -136,8 +136,7 @@ extern struct var_list *ivlist;
 
 /* Here are some methods used by the lib. */
 
-/* Returns a pointer to a string containing a simple type name. */
-void ecpg_add_mem(void *ptr, int lineno);
+bool ecpg_add_mem(void *ptr, int lineno);
 
 bool ecpg_get_data(const PGresult *, int, int, int, enum ECPGttype type,
 		enum ECPGttype, char *, char *, long, long, long,
@@ -148,6 +147,7 @@ void ecpg_pthreads_init(void);
 #endif
 struct connection *ecpg_get_connection(const char *);
 char
Re: [HACKERS] Getting rid of wal_level=archive and default to hot_standby + wal_senders
On Tue, Feb 3, 2015 at 1:43 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, I think these days there's no reason for the split between the archive and hot_standby wal levels. The split was made out of volume and stability concerns. I think we can by now be confident about the wal_level = hot_standby changes (note I'm not proposing hot_standby = on). So let's remove the split. It just gives users choice between two options that don't have a meaningful difference. +1. Additionally I think we should change the default for wal_level to hot_standby and max_wal_senders (maybe to 5). That way users can use pg_basebackup and set up streaming standbys without having to restart the primary. I think that'd be an important step in making setup easier. Yes, please! Those who want to optimize their WAL size can set it back to minimal, but let's make the default the one that makes life *easy* for people. The other option, which would be more complicated (I have a semi-finished patch that I never got time to clean up) would be for pg_basebackup to be able to dynamically raise the value of wal_level during its run. It would not help with the streaming standby part, but it would help with pg_basebackup. That could be useful independently - for those who prefer using wal_level=minimal and also pg_basebackup.. Previously there have been arguments against changing the default of wal_level because it'd mean the regression tests wouldn't exercise minimal anymore. That might be true, but then right now we just don't exercise the more complex levels. If we're really concerned we can just force a different value during the tests, just as we do for prepared xacts. Seems we should focus our tests on the stuff that people actually use in reality? :) And if we change the default, then even more people will use that level. But it would definitely be a good idea to have some buildfarm animals set up to test each one. Comments? 
Additionally, more complex and further into the future, I wonder if we couldn't also get rid of wal_level = logical by automatically switching to it whenever logical slots are active. If it can be safely done online, I definitely think that would be a good goal to have. If we could do the same for hot_standby if you had physical slots, that might also be a good idea? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Release note bloat is getting out of hand
Tom Lane t...@sss.pgh.pa.us wrote: Josh Berkus j...@agliodbs.com writes: On 02/02/2015 05:39 PM, Peter Eisentraut wrote: I share the sentiment that the release notes *seem* too big, but the subsequent discussion shows that it's not clear why that's really a problem. Exactly what problem are we trying to fix? At a rough count of lines, the release notes for unsupported versions are about 18% of documentation overall (47K out of 265K lines). So they're not insubstantial. Compared to the total size of the tarball, though ... It would not make that much of a difference in tarball size, agreed. It *would* make a difference in the build time and output size of the SGML docs --- as I mentioned at the outset, the release notes currently account for 25% of the SGML source linecount. I run `make -s -j4 world` on my i7 fairly often, and it is often the doc build that I wind up waiting for at the end. FWIW, my preference would be that unless you choose a special "all release notes" build, it only build release notes for supported versions and only up to the point that the branch being built split off. That way, you don't have to work at it to see whether the release notes from an older branch duplicate the same bug fixes as a later branch that is listed. I think it makes sense to put the "all release notes" page up on our web site. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Release note bloat is getting out of hand
On 02/03/2015 08:55 AM, Kevin Grittner wrote: Tom Lane t...@sss.pgh.pa.us wrote: Josh Berkus j...@agliodbs.com writes: On 02/02/2015 05:39 PM, Peter Eisentraut wrote: I share the sentiment that the release notes *seem* too big, but the subsequent discussion shows that it's not clear why that's really a problem. Exactly what problem are we trying to fix? At a rough count of lines, the release notes for unsupported versions are about 18% of documentation overall (47K out of 265K lines). So they're not insubstantial. Compared to the total size of the tarball, though ... It would not make that much of a difference in tarball size, agreed. It *would* make a difference in the build time and output size of the SGML docs --- as I mentioned at the outset, the release notes currently account for 25% of the SGML source linecount. I run `make -s -j4 world` on my i7 fairly often, and it is often the doc build that I wind up waiting for at the end. I realize this is slightly OT, but I wonder if it might be worth having targets that build and install everything but the docs. cheers andrew
Re: [HACKERS] Redesigning checkpoint_segments
On 02/02/2015 03:36 PM, Robert Haas wrote: Second, I *think* that these settings are symmetric and, if that's right, then I suggest that they ought to be named symmetrically. Basically, I think you've got min_checkpoint_segments (the number of recycled segments we keep around always) and max_checkpoint_segments (the maximum number of segments we can have between checkpoints), essentially splitting the current role of checkpoint_segments in half. I'd go so far as to suggest we use exactly that naming. It would be reasonable to allow the value to be specified in MB rather than in 16MB units, and to specify it that way by default, but maybe a unit-less value should have the old interpretation since everybody's used to it. That would require adding GUC_UNIT_XSEG or similar, but that seems OK. Works for me. However, note that max_checkpoint_segments = 10 doesn't mean the same as current checkpoint_segments = 10. With checkpoint_segments = 10 you end up with about 2x-3x as much WAL as with max_checkpoint_segments = 10. So the "everybody's used to it" argument doesn't hold much water. - Heikki
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
On Tue, Feb 3, 2015 at 2:05 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: TRAP: FailedAssertion(!(node->spec != SPEC_INSERT || node->arbiterIndex != ((Oid) 0)), File: nodeModifyTable.c, Line: 1619) Is that just because of the hack in parse_clause.c? Yes. I never built with assertions and so didn't see this, but it doesn't matter. With assertions disabled, count_upsert_exclusion.pl ran successfully to the end. I also tried running VACUUM FREEZE upsert_race_test in a loop in another session at the same time, but it didn't make a difference. How quickly do you see the errors? I also tried applying crash_REL9_5.patch from the jjanes_upsert kit, and set jj_xid=1 to increase XID burn rate, but I'm still not seeing any errors. Did you build fully optimized, assertion-free code? I've been doing that. I found it necessary to recreate some of the bugs Jeff's tool caught. I also think that I might have needed an 8 core box to see it, but less sure about that. -- Peter Geoghegan
Re: [HACKERS] PQgetssl() and alternative SSL implementations
On 01/28/2015 08:05 PM, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: Right, that was the idea. I wanted it to include the word OpenSSL, to make it clear in the callers that it's specific to OpenSSL. And SSL, because that's the name of the struct. I agree it looks silly, though. One idea is to have two separate arguments: the implementation name, and the struct name. PQgetSSLstruct(ssl, "OpenSSL", "SSL") would look less silly. That's probably overkill. Why not establish a convention that the main API struct for the library doesn't have to be named? So it's just PQgetSSLstruct(ssl, "OpenSSL"), and you only need strange naming if you're dealing with a library that actually has more than one API object that needs to be fetched this way. (That set is likely empty...) Works for me. Committed that way. - Heikki
[HACKERS] Re: [COMMITTERS] pgsql: Process 'die' interrupts while reading/writing from the client s
On 02/03/2015 11:51 PM, Andres Freund wrote: + * This is called just after low-level writes. That might be after the read + * finished successfully, or it was interrupted via interrupt. 'blocked' tells + * us whether the Looks like you forgot to complete that sentence before pushing... Also, nightjar is unhappy with this: TRAP: FailedAssertion(!(InterruptHoldoffCount == 0 && CritSectionCount == 0), File: /pgbuild/root/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c, Line: 574) I think that assertion in ProcessClientWriteInterrupt() should be simply removed. ProcessClientWriteInterrupt() copes just fine, i.e. does nothing, if we're in a critical section or interrupts are held. - Heikki
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Feb 3, 2015 at 8:26 PM, Jim Nasby jim.na...@bluetreble.com wrote: On 2/3/15 9:20 AM, Tom Lane wrote: e.g., I will add the following syntax format, REINDEX ( { INDEX | TABLE | SCHEMA | SYSTEM | DATABASE } , [VERBOSE] ) name [FORCE]; Well, the object type is not an optional part of the command. It's *necessary*. I was thinking more like REINDEX { INDEX | TABLE | etc } name [ ( option [, option ...] ) ] VACUUM puts the options before the table name, so ISTM it'd be best to keep that with REINDEX. Either REINDEX (options) {INDEX | ...} or REINDEX {INDEX | ...} (options). Makes sense... +1 Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
[HACKERS] GRANT USAGE on FOREIGN SERVER exposes passwords
When a user is granted USAGE on a foreign server, the psql command \deu+ will show them the username and password bound to the applicable user mapping. To demonstrate (9.3+):

(as a superuser)
# create extension postgres_fdw ;
# create foreign server loopback_server foreign data wrapper postgres_fdw options(host '127.0.0.1', port '5432') ;
# create user mapping for public server loopback_server options(user 'foo', password 'bar') ;

(as a normal user)
\deu+
        List of user mappings
     Server      | User name | FDW Options
-----------------+-----------+-------------
 loopback_server | public    |
(1 row)

So far so good?

select * from dblink('loopback_server', 'select current_date') as x(column1 date) ;
ERROR:  permission denied for foreign server loopback_server

OK, I can't do that now. Let's fix it:

# grant usage on foreign server loopback_server to public ;

select * from dblink('loopback_server', 'select current_date') as x(column1 date) ;
  column1
------------
 2015-02-03
(1 row)

Sweet! But...

\deu+
                List of user mappings
     Server      | User name |         FDW Options
-----------------+-----------+------------------------------
 loopback_server | public    | (user 'foo', password 'bar')
(1 row)

Crap. (FWIW, it doesn't matter whether you grant to PUBLIC or to a specific user, the result is the same.) The obvious objection is, well you should just use foreign tables instead of dblink(). I'll cut a long story short by saying that doesn't work for us. We are using postgres_fdw to allow our analysts to run queries against AWS Redshift and blend those results with tables in our OLTP schema. If you know anything about Redshift, or about analysts, you'll realize immediately why foreign tables are not a viable solution. Surely there are many others in a similar position, where the flexibility offered by dblink() makes it preferable to fixed foreign tables. S... what gives? This seems like a really obvious security hole. I've searched the mailing list archives repeatedly and found zero discussion of this issue. -- Noah Yetter Data Architect/DBA @ Craftsy
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
Jim Nasby jim.na...@bluetreble.com writes: On 2/3/15 9:20 AM, Tom Lane wrote: Well, the object type is not an optional part of the command. It's *necessary*. I was thinking more like REINDEX { INDEX | TABLE | etc } name [ ( option [, option ...] ) ] VACUUM puts the options before the table name, so ISTM it'd be best to keep that with REINDEX. Either REINDEX (options) {INDEX | ...} or REINDEX {INDEX | ...} (options). Well, I really really don't like the first of those. IMO the command name is REINDEX INDEX etc, so sticking something in the middle of that is bogus. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Process 'die' interrupts while reading/writing from the client s
On 2015-02-04 00:47:03 +0200, Heikki Linnakangas wrote: On 02/03/2015 11:51 PM, Andres Freund wrote: + * This is called just after low-level writes. That might be after the read + * finished successfully, or it was interrupted via interrupt. 'blocked' tells + * us whether the Looks like you forgot to complete that sentence before pushing... Also, nightjar is unhappy with this: TRAP: FailedAssertion(!(InterruptHoldoffCount == 0 && CritSectionCount == 0), File: /pgbuild/root/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c, Line: 574) I think that assertion in ProcessClientWriteInterrupt() should be simply removed. ProcessClientWriteInterrupt() copes just fine, i.e. does nothing, if we're in a critical section or interrupts are held. Yea, it really was just a brainfart on my end. It's perfectly normal to be called that way during error handling... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
On 02/03/2015 08:54 PM, Andres Freund wrote: * Previously the patchset didn't handle SIGTERM while InteractiveBackend() was reading from the client. It did handle ctrl-c/d, but since getc() isn't interruptible and can't be replaced with latches... The fix for that isn't super pretty: die(): if (DoingCommandRead && whereToSendOutput != DestRemote) ProcessInterrupts(); but imo acceptable for single user mode. That smells an awful lot like ImmediateInterruptOK. Could you use WaitLatchOrSocket to sleep on stdin? It would probably need to be renamed or copied to WaitLatchOrStdin(), or a new flag added to wait on stdin instead of a socket, at least on Windows, but it should work in theory. Might not be worth it if it gets complicated - I can live with the above - but it's a thought. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
On 2015-02-03 22:17:22 +0200, Heikki Linnakangas wrote: On 02/03/2015 08:54 PM, Andres Freund wrote: * Previously the patchset didn't handle SIGTERM while InteractiveBackend() was reading from the client. It did handle ctrl-c/d, but since getc() isn't interruptible and can't be replaced with latches... The fix for that isn't super pretty: die(): if (DoingCommandRead && whereToSendOutput != DestRemote) ProcessInterrupts(); but imo acceptable for single user mode. That smells an awful lot like ImmediateInterruptOK. It awfully does, yes. But we're only exiting, and only in single user mode. And only during command read. We don't need all the other complicated stuff around sinval, async, deadlock checking, etc. Could you use WaitLatchOrSocket to sleep on stdin? Unfortunately I don't think so. poll() etc. only work properly on network handles, pipes etc. - but stdin can be a file :(. And I think what exactly happens if it's a file fd isn't super well defined. On Linux the file is always marked as ready, which would probably actually work... I think this is better solved in the long run by getting rid of single user mode, akin to the patch Tom had a year or two back. Adding more complications for it doesn't seem worth it. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unlikely-to-happen crash in ecpg driver caused by NULL-pointer check not done
On 02/03/2015 09:28 AM, Michael Paquier wrote: Hi all, In ecpg_add_mem of memory.c, we use ecpg_alloc but there is actually no NULL-pointer check. If an OOM shows up exactly at this point, this is likely to cause a crash. Attached patch adds some extra processing to ecpg_add_mem to check if the allocation fails, and to fail properly if an OOM appears.

--- a/src/interfaces/ecpg/ecpglib/descriptor.c
+++ b/src/interfaces/ecpg/ecpglib/descriptor.c
@@ -440,7 +440,12 @@ ECPGget_desc(int lineno, const char *desc_name, int index,...)
 			return false;
 		}
 		*(void **) var = mem;
-		ecpg_add_mem(mem, lineno);
+		if (!ecpg_add_mem(mem, lineno))
+		{
+			ecpg_free(mem);
+			va_end(args);
+			return false;
+		}
 		var = mem;
 	}

Hmm. Since the ecpg_add_mem call is done after setting (*(void **) var), that's left to point to already-free'd memory. The other call sites have a similar issue. I haven't analyzed the code to check if that's harmless or not, but it seems better to not do that. In ecpg_add_mem, the ecpg_raise() call is unnecessary, since ecpg_alloc already does that on failure. (It would be less error-prone to have an ecpg_alloc_auto() function that combines ecpg_alloc+ecpg_add_mem in one call.)

/* Here are some methods used by the lib. */
/* Returns a pointer to a string containing a simple type name. */
bool	ecpg_add_mem(void *ptr, int lineno);
bool	ecpg_get_data(const PGresult *, int, int, int, enum ECPGttype type, enum ECPGttype, char *, char *, long, long, long, enum ARRAY_TYPE, enum COMPAT_MODE, bool);

That second comment is completely bogus. It's not this patch's fault, it's been like that forever, but I just happened to notice. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Feb 3, 2015 at 12:32 AM, Tom Lane t...@sss.pgh.pa.us wrote: Sawada Masahiko sawada.m...@gmail.com writes: On Mon, Feb 2, 2015 at 9:21 PM, Michael Paquier michael.paqu...@gmail.com wrote: Now, I think that it may be better to provide the keyword VERBOSE before the type of object reindexed as REINDEX [ VERBOSE ] object. Actually, my first WIP version of the patch added the VERBOSE word before the type of object. I'm finding it difficult to decide on the position of the VERBOSE word in the REINDEX statement. The way that FORCE was added to REINDEX was poorly thought out; let's not double down on that with another option added without any consideration for future expansion. I'd be happier if we adopted something similar to the modern syntax for VACUUM and EXPLAIN, ie, comma-separated options in parentheses. Understood. I'm imagining the new REINDEX syntax as follows: - REINDEX (INDEX, VERBOSE) hoge_idx; - REINDEX (TABLE) hoge_table; e.g., I will add the following syntax format, REINDEX ( { INDEX | TABLE | SCHEMA | SYSTEM | DATABASE } , [VERBOSE] ) name [FORCE]; Thoughts? Regards, --- Sawada Masahiko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
On 2/3/15 5:27 PM, Jim Nasby wrote: On 2/3/15 9:50 AM, David Steele wrote: EXTENDED is the default for most TOAST-able types and is still subject to TOAST_TUPLE_THRESHOLD which is normally 2K. EXTERNAL is the same but with no compression. See: http://www.postgresql.org/docs/9.4/static/storage-toast.html Right. I'd like to be able to set per-column TOAST_TUPLE_THRESHOLD. No argument there. There are some columns that I would prefer to always TOAST because even 2K can be very big for some use cases. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Performance improvement for joins where outer side is unique
Hi, I had a look at this patch. Although I haven't understood all of the stuff and all of the related things, I will comment as much as I can.

Performance: I looked at the performance gain this patch gives. For several on-memory joins, I saw gains of about 3% for merge join, 5% for hash join, and 10% for nestloop (@CentOS6), for simple 1-level joins with aggregation, similar to what you mentioned in a previous mail, like this:

=# SELECT count(*) FROM t1 JOIN t2 USING (id);

Of course, the gain would be trivial when returning many tuples, or when scans go to disk. I haven't measured the loss from the additional computation when the query is not regarded as a single join.

Explain representation: I don't like that 'Unique Join:' occupies its own line in the explain output; moreover, it doesn't show the meaning clearly. What about a representation like the following? Or, it might not be necessary to appear there at all.

 Nested Loop ...
   Output: ...
-  Unique Join: Yes
   ->  Seq Scan on public.t2  (cost=...
-  ->  Seq Scan on public.t1  (cost=...
+  ->  Seq Scan on public.t1 (unique)  (cost=...

Coding: The style looks OK. It can be applied on master, and it looks to work as you expected for some cases. Random comments follow.

- Looking at specialjoin_is_unique_join, you seem to assume that !delay_upper_joins is a sufficient condition for not being a unique join. The former simply indicates "don't commute with upper OJs", while the latter seems to indicate that the RHS is unique. Could you tell me what kind of relationship there is between them?

- The naming "unique join" seems a bit obscure to me, but I don't have a better idea :( However, the member name has_unique_join would be better as is_unique_join.

- eclassjoin_is_unique_join involves a seemingly semi-exhaustive scan over ec members for every element in joinlist. Even if not, it might be a bit too expensive for the gain. Could you achieve the equivalent with smaller code? 
regards, Jan 2015 00:37:19 +1300, David Rowley dgrowle...@gmail.com wrote in caaphdvod_ucmoupovdpxbnkw50o14x3wwkojmzlxkbbn71z...@mail.gmail.com On 1 January 2015 at 02:47, David Rowley dgrowle...@gmail.com wrote: Hi, I've been hacking a bit at the join code again... This time I've been putting some effort into optimising the case where the inner side of the join is known to be unique. For example, given the tables: create table t1 (id int primary key); create table t2 (id int primary key); And query such as: select * from t1 left outer join t2 on t1.id=t2.id; It is possible to deduce at planning time that for each row in the outer relation, only 0 or 1 rows can exist in the inner relation, (inner being t2) I've been hacking at this unique join idea again and I've now got it working for all join types -- Patch attached. Here's how the performance is looking: postgres=# create table t1 (id int primary key); CREATE TABLE postgres=# create table t2 (id int primary key); CREATE TABLE postgres=# insert into t1 select x.x from generate_series(1,100) x(x); INSERT 0 100 postgres=# insert into t2 select x.x from generate_series(1,100) x(x); INSERT 0 100 postgres=# vacuum analyze; VACUUM postgres=# \q Query: select count(t1.id) from t1 inner join t2 on t1.id=t2.id; With Patch on master as of 32bf6ee D:\Postgres\install\binpgbench -f d:\unijoin3.sql -T 60 -n postgres transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 60 s number of transactions actually processed: 78 latency average: 769.231 ms tps = 1.288260 (including connections establishing) tps = 1.288635 (excluding connections establishing) Master as of 32bf6ee D:\Postgres\install\binpgbench -f d:\unijoin3.sql -T 60 -n postgres transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 60 s number of transactions actually processed: 70 latency average: 857.143 ms tps = 1.158905 (including 
connections establishing) tps = 1.159264 (excluding connections establishing) That's a 10% performance increase. I still need to perform more thorough benchmarking with different data types. One weird thing that I noticed before is that in an earlier revision of the patch in the executor's join Initialise node code, I had set the unique_inner to true for semi joins and replaced the SEMI_JOIN check with a unique_join check in the execute node for each join method. With this the performance results barely changed from standard... I've yet to find out why. I don't even know what you did precisely, but if you do such a thing, you perhaps should change the name of unique_inner to something representing that it reads up to one row from the inner for one outer row. (Sorry I have no good idea for this..) The patch also has added a property to the EXPLAIN (VERBOSE)
Re: [HACKERS] Hot Standby WAL reply uses heavyweight session locks, but doesn't have enough infrastructure set up
On 2015-02-03 14:18:02 +0900, Michael Paquier wrote: - "RecoveryLockList contains entry for lock no longer recorded by lock manager: xid %u database %u relation %u", - lock->xid, lock->dbOid, lock->relOid); + "RecoveryLockList contains entry for lock no longer recorded by lock manager: xid %u", + lock->xid); This patch is making the information provided less verbose, and I think that it is useful to have some information not only about the lock held, but also about the database and the relation. It's debug4 or impossible stuff that lock.c already warned about - I doubt anybody has ever actually looked at it in a long while, if ever. If we really want to provide something more we can use something like LOCK_PRINT() - but I really doubt it's worth either the notational or the verbosity overhead. Also, ISTM that StandbyAcquireLock should still use a database OID and a relation OID instead of only a LOCKTAG, and SET_LOCKTAG_RELATION should be set in StandbyAcquireLock while ResolveRecoveryConflictWithLock is extended only with the lock mode as new argument. (Patch 2 adds many calls to SET_LOCKTAG_RELATION, btw, justifying keeping the API changes minimal.) But there are now callers acquiring other locks than relation locks, like dbase_redo() acquiring an object lock. And we need to acquire those via the standby mechanism to avoid races around release. We could add a separate wrapper for relation locks, but imo the locktag move to the callers saved about as many lines in some places as it cost in others. In patch 2, isn't it necessary to bump XLOG_PAGE_MAGIC? I don't think so, there's no incompatible change. Thanks for having a look! Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] File based Incremental backup v9
Il 02/02/15 22:28, Magnus Hagander ha scritto: On Mon, Feb 2, 2015 at 10:06 PM, Robert Haas robertmh...@gmail.com mailto:robertmh...@gmail.com wrote: On Sat, Jan 31, 2015 at 6:47 PM, Marco Nenciarini marco.nenciar...@2ndquadrant.it mailto:marco.nenciar...@2ndquadrant.it wrote: Il 31/01/15 17:22, Erik Rijkers ha scritto: On Sat, January 31, 2015 15:14, Marco Nenciarini wrote: 0001-public-parse_filename_for_nontemp_relation.patch 0002-copydir-LSN-v2.patch 0003-File-based-incremental-backup-v8.patch Hi, It looks like it only compiles with assert enabled. It is due to a typo (assert instead of Assert). You can find the updated patch attached to this message. I would sure like it if you would avoid changing the subject line every time you post a new version of this patch. It breaks the threading for me. +1 - it does break gmail. Ok, sorry for that. It seems to have also broken it for the CommitFest app, which thinks v3 is the last version. I was not able to attach the new version. The CF app has detected that it's the same thread, because of the headers (gmail is the buggy one here - the headers of the email are perfectly correct). It does not, however, pick up and show the change of subject there (but you can see it if you click the link for the latest version in the archives - the links under latest or latest attachment both go to the v9 patch). When I clicked on attach thread without having logged in, it took me to a bad URL. When I clicked on it after having logged in, it Clearly a bug. purported to work, but AFAICS, it didn't actually do anything. That's because the thread is already there, and you're adding it again. Of course, it wouldn't hurt if it actually told you that :) I'm also confused by the (Patch: No) part at the end of every line if you expand the last attachment line. Every message shown here contains one or more patches attached. 
Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it signature.asc Description: OpenPGP digital signature
[HACKERS] How about to have relnamespace and relrole?
Hello, Most OID types have corresponding reg* alias types. These are very convenient when looking into the system catalogs/views, but there are no such types for user IDs and namespace IDs. What do you think about having these new OID types? The usefulness of regnamespace is doubtful, but regrole should be useful because a column like 'owner' appears in many places. For example, this works as follows.

SELECT relnamespace::regnamespace, relname, relowner::regrole
  FROM pg_class WHERE relnamespace = 'public'::regnamespace;
 relnamespace | relname | relowner
--------------+---------+----------
 public       | t1      | horiguti
(1 row)

What do you think about this? regards, -- Kyotaro Horiguchi NTT Open Source Software Center

From cd1137f940aa63c3597e76fd740b05778654d3bb Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp
Date: Wed, 21 Jan 2015 11:38:21 +0900
Subject: [PATCH 1/2] Add regnamespace.
---
 doc/src/sgml/datatype.sgml            | 19 ---
 src/backend/utils/adt/regproc.c       | 97 +++
 src/include/catalog/pg_cast.h         |  7 +++
 src/include/catalog/pg_proc.h         | 10
 src/include/catalog/pg_type.h         |  5 ++
 src/include/utils/builtins.h          |  5 ++
 src/test/regress/expected/regproc.out | 22
 src/test/regress/sql/regproc.sql      |  4 ++
 8 files changed, 163 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index edf636b..9799e18 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4422,6 +4422,13 @@ SELECT * FROM pg_attribute
    </row>

    <row>
+    <entry><type>regnamespace</></entry>
+    <entry><structname>pg_namespace</></entry>
+    <entry>namespace (schema) name</entry>
+    <entry><literal>pg_catalog</></entry>
+   </row>
+
+   <row>
     <entry><type>regtype</></entry>
     <entry><structname>pg_type</></entry>
     <entry>data type name</entry>
@@ -4446,12 +4453,12 @@ SELECT * FROM pg_attribute
 </table>

 <para>
-All of the OID alias types accept schema-qualified names, and will
-display schema-qualified names on output if the object would not
-be found in the current search path without being qualified.
-The <type>regproc</> and <type>regoper</> alias types will only
-accept input names that are unique (not overloaded), so they are
-of limited use; for most uses <type>regprocedure</> or
+All of the OID alias types except regnamespace accept schema-qualified
+names, and will display schema-qualified names on output if the object
+would not be found in the current search path without being qualified.
+The <type>regproc</> and <type>regoper</> alias types will only accept
+input names that are unique (not overloaded), so they are of limited use;
+for most uses <type>regprocedure</> or
 <type>regoperator</> are more appropriate. For <type>regoperator</>,
 unary operators are identified by writing <literal>NONE</> for the unused
 operand.

diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index 3d1bb32..b09aa2a 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -1159,6 +1159,103 @@ regclasssend(PG_FUNCTION_ARGS)

 /*
+ * regnamespacein		- converts namespace name to namespace OID
+ *
+ * We also accept a numeric OID, for symmetry with the output routine.
+ *
+ * '-' signifies unknown (OID 0).  In all other cases, the input must
+ * match an existing pg_namespace entry.
+ */
+Datum
+regnamespacein(PG_FUNCTION_ARGS)
+{
+	char	   *nsp_name_or_oid = PG_GETARG_CSTRING(0);
+	Oid			result = InvalidOid;
+
+	/* '-' ? */
+	if (strcmp(nsp_name_or_oid, "-") == 0)
+		PG_RETURN_OID(InvalidOid);
+
+	/* Numeric OID? */
+	if (nsp_name_or_oid[0] >= '0' &&
+		nsp_name_or_oid[0] <= '9' &&
+		strspn(nsp_name_or_oid, "0123456789") == strlen(nsp_name_or_oid))
+	{
+		result = DatumGetObjectId(DirectFunctionCall1(oidin,
+							CStringGetDatum(nsp_name_or_oid)));
+		PG_RETURN_OID(result);
+	}
+
+	/* Else it's a name */
+	result = get_namespace_oid(nsp_name_or_oid, false);
+
+	PG_RETURN_OID(result);
+}
+
+/*
+ * to_regnamespace		- converts nspname to namespace OID
+ *
+ * If the name is not found, we return NULL. 
+ */
+Datum
+to_regnamespace(PG_FUNCTION_ARGS)
+{
+	char	   *nsp_name = PG_GETARG_CSTRING(0);
+	Oid			result;
+
+	result = get_namespace_oid(nsp_name, true);
+
+	if (OidIsValid(result))
+		PG_RETURN_OID(result);
+	else
+		PG_RETURN_NULL();
+}
+
+/*
+ * regnamespaceout		- converts namespace OID to nspname
+ */
+Datum
+regnamespaceout(PG_FUNCTION_ARGS)
+{
+	Oid			nspid = PG_GETARG_OID(0);
+	char	   *result;
+
+	if (nspid == InvalidOid)
+	{
+		result = pstrdup("-");
+		PG_RETURN_CSTRING(result);
+	}
+
+	result = get_namespace_name(nspid);
+	if (result)
+		PG_RETURN_CSTRING(result);
+	else
+		PG_RETURN_NULL();
+}
+
+/*
+ * regnamespacerecv		- converts external binary format to regnamespace
+ */
+Datum
+regnamespacerecv(PG_FUNCTION_ARGS)
+{
+	/* Exactly the same as oidrecv, so share code
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
On 01/30/2015 01:38 AM, Peter Geoghegan wrote: On the stress-testing front, I'm still running Jeff Janes' tool [1], while also continuing to use his Postgres modifications to artificially increase the XID burn rate. [1] https://github.com/petergeoghegan/jjanes_upsert I followed the instructions in README.md to reproduce this. I downloaded the tool, applied the upsert patchset, applied the hack to parse_clause.c as instructed in the README.md file, installed btree_gist, and ran count_upsert_exclusion.pl. It failed immediately with an assertion failure: TRAP: FailedAssertion(!(node->spec != SPEC_INSERT || node->arbiterIndex != ((Oid) 0)), File: nodeModifyTable.c, Line: 1619) Is that just because of the hack in parse_clause.c? With assertions disabled, count_upsert_exclusion.pl ran successfully to the end. I also tried running VACUUM FREEZE upsert_race_test in a loop in another session at the same time, but it didn't make a difference. How quickly do you see the errors? I also tried applying crash_REL9_5.patch from the jjanes_upsert kit, and set jj_xid=1 to increase XID burn rate, but I'm still not seeing any errors. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Redesigning checkpoint_segments
On 02/03/2015 07:50 AM, Robert Haas wrote: On Tue, Feb 3, 2015 at 10:44 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: That's the whole point of this patch. max_checkpoint_segments = 10, or max_checkpoint_segments = 160 MB, means that the system will begin a checkpoint so that when the checkpoint completes, and it truncates away or recycles old WAL, the total size of pg_xlog is 160 MB. That's different from our current checkpoint_segments setting. With checkpoint_segments, the documented formula for calculating the disk usage is (2 + checkpoint_completion_target) * checkpoint_segments * 16 MB. That's a lot less intuitive to set. Hmm, that's different from what I was thinking. We probably shouldn't call that max_checkpoint_segments, then. I got confused and thought you were just trying to decouple the number of segments that it takes to trigger a checkpoint from the number we keep preallocated. Wait, what? Because the new setting is an actual soft maximum, we *shouldn't* call it a maximum? Or are you saying something else? On 02/03/2015 04:25 AM, Heikki Linnakangas wrote: On 02/02/2015 04:21 PM, Andres Freund wrote: I think we need to increase checkpoint_timeout too - that's actually just as important for the default experience from my pov. 5 minutes often just unnecessarily generates FPWs en masse. I have yet to see any serious benchmarking on checkpoint_timeout. It does seem that for some workloads on some machines a longer timeout is better, but I've also seen workloads where a longer timeout decreases throughput or raises IO. So absent some hard numbers, I'd be opposed to changing the default. I'll open the bidding at 1600MB (aka 100). Fine with me. I wouldn't object to raising it a little bit, but that's way too high. It's entirely possible to have a small database that generates a lot of WAL. A table that has only a few rows, but is updated very very frequently, for example. 
And checkpointing such a database is quick too, so frequent checkpoints are not a problem. You don't want to end up with 1.5 GB of WAL on a 100 MB database. I suggest 192MB instead (12 segments). That almost doubles our current real default, without requiring huge disk space which might surprise some users. In practice, checkpoint_segments is impossible to automatically tune correctly. So let's be conservative. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers