Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-20 Thread Amit Kapila
On Wednesday, June 19, 2013 10:45 PM Sawada Masahiko wrote:
On Tuesday, June 18, 2013, Amit Kapila wrote:
On Tuesday, June 18, 2013 12:18 AM Sawada Masahiko wrote:
 On Sun, Jun 16, 2013 at 2:00 PM, Amit kapila amit.kap...@huawei.com
 wrote:
  On Saturday, June 15, 2013 8:29 PM Sawada Masahiko wrote:
  On Sat, Jun 15, 2013 at 10:34 PM, Amit kapila
 amit.kap...@huawei.com wrote:
 
  On Saturday, June 15, 2013 1:19 PM Sawada Masahiko wrote:
  On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila
 amit.kap...@huawei.com wrote:
  On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
  Hello,
 

 I think that we can dump the data before all the WAL files are deleted.  All
 WAL files are deleted when the old master starts as a new standby.

  Can we dump data without starting the server?
Sorry, I made a mistake. We can't do that.

  this proposed patch needs to be able to handle such a scenario in the future as well.

I am not sure the proposed patch can handle it so easily, but I think if others
also feel it is important, then a method should be
provided to the user for extracting the last committed data.


With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add visibility map information to pg_freespace.

2013-06-20 Thread Simon Riggs
On 20 June 2013 04:26, Satoshi Nagayasu sn...@uptime.jp wrote:
 I'm looking into this patch as a reviewer.


 (2013/06/19 18:03), Simon Riggs wrote:

 On 19 June 2013 09:19, Kyotaro HORIGUCHI
 horiguchi.kyot...@lab.ntt.co.jp wrote:

 It should be useful in other respects, but it seems a bit complicated
 just to know the visibility bits for certain blocks.


 With your current patch you can only see the visibility info for
 blocks in cache, not for all blocks. So while you may think it is
 useful, it is also unnecessarily limited in scope.

 Let's just have something that is easy to use that lets us see the
 visibility state for a block, not just blocks in freespace.


 I think we can also have these visibility map statistics
 in the pgstattuple function (as a new column) for this purpose.

 IMHO, we have several modules for different purposes.

 - pageinspect provides several functions for debugging purposes.
 - pg_freespace provides a view for monitoring purposes.
 - pgstattuple provides several functions for collecting
   specific table/index statistics.

 So, we can have similar features in different modules.

 Any comments?

+1
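
To make the proposal concrete, a sketch of what such a query could look
like; the all_visible column below is hypothetical and not part of any
shipped pg_freespace:

-- per-block output as pg_freespace produces it today:
SELECT * FROM pg_freespace('pgbench_accounts') LIMIT 3;

-- with the proposed visibility map information added:
SELECT blkno, avail, all_visible
FROM pg_freespace('pgbench_accounts')
ORDER BY blkno;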

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-20 Thread Simon Riggs
On 20 June 2013 06:45, Craig Ringer cr...@2ndquadrant.com wrote:

 I think a good starting point would be to use the Intel and IBM
 libraries to implement basic DECIMAL32/64/128 to see if they perform
 better than the gcc builtins tested by Pavel by adapting his extension.

 If the performance isn't interesting it may still be worth adding for
 compliance reasons, but if we can only add IEEE-compliant decimal FP by
 using non-SQL-standard type names I don't think that's super useful.

I think we should be adding a datatype that is IEEE compliant, even if
it doesn't have space and/or performance advantages. We might hope
it does, but if not, it may in the future.

It seems almost certain that the SQL standard would adopt the IEEE
datatypes in the future.

 If
 there are significant performance/space gains to be had, we could
 consider introducing DECIMAL32/64/128 types with the same names used by
 DB2, so people could explicitly choose to use them where appropriate.

Type names are easily set up if compatibility is required, so that's not a problem.

We'd want to use the name the SQL standard people assign.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change authentication error message (patch)

2013-06-20 Thread Markus Wanner
On 06/20/2013 12:51 AM, Jeff Janes wrote:
 I think we need to keep the first "password".  Password authentication
 is a single thing, it is the authentication method attempted.  It is the
 password method (which includes MD5) which failed, as opposed to the
 LDAP method or the Peer method or one of the other methods.

That's against the rule of not revealing any more knowledge than a
potential attacker already has, no? For that reason, I'd rather go with
just "authentication failed".

 Without this level of explicitness, it might be hard to figure out which
 row in pg_hba.conf was the one that PostgreSQL glommed onto to use for
 authentication.

As argued before, that should go into the logs for diagnosis by the
sysadmin, but should not be revealed to an attacker.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-20 Thread Thomas Munro
On 20 June 2013 06:45, Craig Ringer cr...@2ndquadrant.com wrote:

 I think a good starting point would be to use the Intel and IBM
 libraries to implement basic DECIMAL32/64/128 to see if they perform
 better than the gcc builtins tested by Pavel by adapting his extension.


Just a few notes:

Not sure if this has already been mentioned, but GCC is using the IBM
decNumber library to implement those built-ins so the performance should be
nearly identical.

Unfortunately, many GCC builds shipped by Linux distributions don't
actually seem to have those built-ins configured anyway!

Also, the IBM 'xlc' compiler supports those built-ins (IBM being behind all
of this stuff...), and generates code using hardware instructions for
POWER6/POWER7, or software otherwise (quite possibly the same code again).

One further (undeveloped) thought: the IBM decNumber library doesn't just
support the 754-2008 types, it also supports a more general decNumber type
with arbitrary precision (well, up to 999,999,999 significant figures), so
if it were to finish up being used by core PG then it could also have other
uses.  I have no idea how decNumber (which encodes significant figures in
an integer coefficient, so one decimal digit per 3.2(?) bits) compares to
PG's DECIMAL (which encodes each digit in 4 bits, BCD style), in terms of
arithmetic performance and other trade-offs.


 If the performance isn't interesting it may still be worth adding for
 compliance reasons, but if we can only add IEEE-compliant decimal FP by
 using non-SQL-standard type names I don't think that's super useful. If
 there are significant performance/space gains to be had, we could
 consider introducing DECIMAL32/64/128 types with the same names used by
 DB2, so people could explicitly choose to use them where appropriate.


+1 for using the DB2 names.

I am interested in this topic as a user of both Postgres and DB2, and an
early adopter of 754-2008 in various software.  Actually I had started
working on my own DECFLOAT types for Postgres using decNumber in 2010 as I
mentioned on one of the lists, but life got in the way.  I had a very basic
extension sort of working though, and core support didn't seem necessary,
although I hadn't started on what I considered to be the difficult bit,
interactions with the other numerical types (i.e. deciding which conversions
and promotions would make sense and be safe).

Finally, I recently ran into a 3rd software implementation of 754-2008:
libmpdec (the other two being IBM decNumber and Intel's library), but I
haven't looked into it yet.

Thomas Munro


Re: [HACKERS] Patch for removng unused targets

2013-06-20 Thread Etsuro Fujita
 From: Hitoshi Harada [mailto:umi.tan...@gmail.com]

 I guess the patch works fine, but what I'm saying is it might be limited to
 small use cases.  Another instance of this that I can think of is the ORDER BY
 clause of window specifications, which you may want to remove from the target
 list as well, in addition to the ORDER BY of the query.  It will just not be
 removed by this approach, simply because it is looking only at
 parse->sortClause.  Certainly you can add more rules to the new function to
 look at the window specification, but then I'm not sure what we are missing.

Yeah, I thought about extending this to the window ORDER BY case, too.  But I'm
not sure it's worth complicating the code, considering that the objective of this
optimization is to improve full-text-search-related queries, if I understand
correctly, though a general solution would be desirable, as you mentioned.
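
To make the use case concrete, a sketch (assuming a hypothetical table
docs(title text, tsv tsvector)): the rank expression below is computed only
for sorting, so it need not be carried through to the final target list once
the sort is done:

SELECT title
FROM docs
ORDER BY ts_rank(tsv, to_tsquery('postgres')) DESC;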

 So, as it stands it doesn't have a
 critical issue, but a more generalized approach would be desirable.  That said,
 I don't have a strong objection to the current patch, and am just posting one
 thought to see if others have the same opinion.

OK.  I'll also wait for others' comments.  For review, an updated version of the
patch is attached, which fixes the bug using the approach that directly uses the
clause information in the parse tree.

Thanks,

Best regards,
Etsuro Fujita


unused-targets-20130620.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LEFT JOIN LATERAL can remove rows from LHS

2013-06-20 Thread Vik Fearing
On 06/18/2013 01:52 AM, Jeremy Evans wrote:
 Maybe I am misunderstanding how LATERAL is supposed to work, but my
 expectation is that doing a LEFT JOIN should not remove rows from
 the LHS.

I have added this to the list of 9.3 blockers.

https://wiki.postgresql.org/wiki/PostgreSQL_9.3_Open_Items
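
For reference, a minimal self-contained illustration of the expectation,
using VALUES lists instead of real tables: the left-hand row with x = 2
should still appear, with NULL for the lateral subquery's column:

SELECT *
FROM (VALUES (1), (2)) AS a(x)
LEFT JOIN LATERAL
    (SELECT y FROM (VALUES (1)) AS b(y) WHERE b.y = a.x) AS s ON true;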


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-20 Thread Thomas Munro
On 20 June 2013 08:05, Thomas Munro mu...@ip9.org wrote:

 On 20 June 2013 06:45, Craig Ringer cr...@2ndquadrant.com wrote:

 If the performance isn't interesting it may still be worth adding for

 compliance reasons, but if we can only add IEEE-compliant decimal FP by
 using non-SQL-standard type names I don't think that's super useful. If
 there are significant performance/space gains to be had, we could
 consider introducing DECIMAL32/64/128 types with the same names used by
 DB2, so people could explicitly choose to use them where appropriate.


 +1 for using the DB2 names.


On reflection, I should offer more than +1.  I think that the IBM name
DECFLOAT(16) is better than DECIMAL64 because:

1)  The number of significant decimal digits is probably of greater
importance to a typical end user than the number of binary digits used to
store it.
2)  Other SQL types are parameterised with this notation, such as
VARCHAR(6) and DECIMAL(6, 2).
3)  IEEE 754 has rather different semantics from SQL DECIMAL (I'm thinking
mainly of the behaviour of special values), so using a name like DECFLOAT(n)
instead of DECIMAL64 would draw greater attention to that fact (i.e. it's not
just a fixed-size DECIMAL).
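
To illustrate the naming question, the two spellings side by side; neither
type exists in PostgreSQL today, so both declarations are hypothetical:

CREATE TABLE prices (
    amount_ibm  DECFLOAT(16),  -- IBM style: 16 significant decimal digits
    amount_ieee DECIMAL64      -- named after the 64-bit storage format
);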

Also, IBM was here first, and I *guess* they will propose DECFLOAT for
standardisation (they are behind proposals to add support to many other
languages), though I have no information on that.


Re: [HACKERS] Implementing incremental backup

2013-06-20 Thread Magnus Hagander
On Thu, Jun 20, 2013 at 12:18 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Claudio Freire wrote:
 On Wed, Jun 19, 2013 at 6:20 PM, Stephen Frost sfr...@snowman.net wrote:
  * Claudio Freire (klaussfre...@gmail.com) wrote:
  I don't see how this is better than snapshotting at the filesystem
  level. I have no experience with TB scale databases (I've been limited
  to only hundreds of GB), but from my limited mid-size db experience,
  filesystem snapshotting is pretty much the same thing you propose
  there (xfs_freeze), and it works pretty well. There are even automated
  tools to do that, like Bacula, and they can handle incremental
  snapshots.
 
  Large databases tend to have multiple filesystems and getting a single,
  consistent snapshot across all of them while under load is...
  'challenging'.  It's fine if you use pg_start/stop_backup() and you're
  saving the XLOGs off, but if you can't do that...

 Good point there.

 I still don't like the idea of having to mark each modified page. The
 WAL compressor idea sounds a lot more workable. As in scalable.

 There was a project that removed useless WAL records from the stream,
 to make it smaller and useful for long-term archiving.  It only removed
 FPIs as far as I recall.  It's dead now, and didn't compile on recent
 (9.1?) Postgres because of changes in the WAL structs, IIRC.

 This doesn't help if you have a large number of UPDATEs that touch the same
 set of rows over and over, though.  Tatsuo-san's proposal would allow
 this use-case to work nicely because you only keep one copy of such
 data, not one for each modification.

 If you have the two technologies, you could teach them to work in
 conjunction: you set up WAL replication, and tell the WAL compressor to
 prune updates for high-update tables (avoid useless traffic), then use
 incremental backup to back these up.  This seems like it would have a
 lot of moving parts and be rather bug-prone, though.

Just as a datapoint, I think this is basically what at least one
other database engine (SQL Server) calls "incremental" vs.
"differential" backup.

A differential backup keeps track of which blocks have changed (one
way or another - maybe as simply as by the LSN, but it doesn't really
matter how) and backs up just those blocks (diffed against the base
backup).

An incremental backup covers the transaction log, which is basically what
we do with log archiving, except it's not done in real time - it's all
saved on the master until the backup command runs.

Of course, it's been quite a few years since I set up one of those
in anger, so disclaimer for that info being out of date :)

Didn't pg_rman try to do something based on the page LSN to achieve
something similar to this?
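
For what it's worth, the page LSN such a tool would compare is already
readable with pageinspect; a sketch, assuming a table t exists (any block
whose LSN is newer than the base backup's starting LSN would need to be
re-copied):

CREATE EXTENSION pageinspect;

SELECT lsn FROM page_header(get_raw_page('t', 0));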

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] event trigger API documentation?

2013-06-20 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 Random observation in this general area:  Regular triggers provide the
 field Trigger *tg_trigger in the trigger data, which allows you to get
 the trigger name, OID, and such.  Event triggers don't expose anything
 comparable.  That should perhaps be added.

Agreed.

Basically we ran out of time to add in any sort of useful information,
so that's all 9.4 material.

 Also, as I'm maybe about the fourth person ever to write an actual event
 trigger, I have a usability report of sorts.  I found it confusing that
 the trigger timing is encoded into the event name.  So instead of "ON
 ddl_command_start", I was expecting something more like "BEFORE
 ddl_command".  There might be a reason for this design choice, but I
 found it a confusing departure from known trigger concepts.

Your proposal here matches what I did propose in the 9.2 cycle. I even
wanted to go as far as having BEFORE, AFTER and INSTEAD OF command
triggers.

The problem was to find the right place in the code for each different
command, and while I did work on that and proposed command-specific
integration points, Robert had the idea of having something more
flexible and not tied too much with commands, hence events.

The idea is to be able to provide events such as table_rewrite or
insert_into_unknown_table etc.

How we got from that decision to preferring ddl_command_start over BEFORE
ddl_command is still unclear to me. We could have kept the grammar and
turned it internally into a before_ddl_command event.
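
For comparison, here is the grammar as it ended up in 9.3 next to the
BEFORE-style spelling discussed above; the second form is hypothetical and
was never implemented:

CREATE FUNCTION log_ddl() RETURNS event_trigger AS $$
BEGIN
    RAISE NOTICE 'command start: %', TG_TAG;
END
$$ LANGUAGE plpgsql;

-- what 9.3 actually accepts:
CREATE EVENT TRIGGER on_ddl ON ddl_command_start
    EXECUTE PROCEDURE log_ddl();

-- the spelling discussed here (not implemented):
-- CREATE EVENT TRIGGER on_ddl BEFORE ddl_command
--     EXECUTE PROCEDURE log_ddl();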

But then you certainly want to be able to say BEFORE CREATE TABLE or ALTER
TABLE, or BEFORE ANY EVENT, things like that. In the patches I sent
containing that kind of implementation, it was said to be too much
grammar to maintain, as the patch needed to list all the supported
commands in the grammar, and each time support for a new command was
added you needed to edit that grammar list…

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change authentication error message (patch)

2013-06-20 Thread Marko Tiikkaja

On 20/06/2013 08:47, Markus Wanner wrote:

On 06/20/2013 12:51 AM, Jeff Janes wrote:

I think we need to keep the first "password".  Password authentication
is a single thing, it is the authentication method attempted.  It is the
password method (which includes MD5) which failed, as opposed to the
LDAP method or the Peer method or one of the other methods.


That's against the rule of not revealing any more knowledge than a
potential attacker already has, no? For that reason, I'd rather go with
just "authentication failed".


My understanding is that the attacker would already have that 
information since the server would have sent an 
AuthenticationMD5Password message to get to the error in the first 
place.  And we still reveal the authentication method to the frontend in 
all other cases ("peer authentication failed", for example).



Without this level of explicitness, it might be hard to figure out which
row in pg_hba.conf was the one that PostgreSQL glommed onto to use for
authentication.


As argued before, that should go into the logs for diagnosis by the
sysadmin, but should not be revealed to an attacker.


Isn't the point of this patch exactly that we didn't want to go down 
that road?  I.e. "password authentication failed" didn't say that the
password might've expired, but some people thought just logging a 
WARNING/LOG wasn't enough.



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Config reload/restart preview

2013-06-20 Thread Thom Brown
Hi,

I've noticed that there's no easy way of checking which settings will
change if the config is reloaded, and I think not being able to do this can
cause some unfortunate problems.

For example, a customer went to change their configuration, just setting
log_autovacuum_min_duration to about 20 seconds, and reloaded the server.
 However, the log file swelled to over 5GB in size before they realised
something was wrong, and then reverted the change.  It transpired that the
reload also pulled in a log_statements change from 'ddl' to 'all' that
someone must have changed at some point without applying it.

Should we have a way of previewing changes that would be applied if we
reloaded/restarted the server?

For example:

pg_ctl previewconfig

SIGHUP: log_statements will change from 'ddl' to 'all'
SIGHUP: log_vacuum_min_duration will change from -1 to 2
POSTMASTER: fsync will change from 'on' to 'off'

I'm not proposing this specifically, but something that would provide such
information.

-- 
Thom


Re: [HACKERS] Bugfix and new feature for PGXS

2013-06-20 Thread Cédric Villemain
On Thursday, June 20, 2013 at 05:26:21, Peter Eisentraut wrote:
 On Wed, 2013-06-19 at 20:58 +0200, Cédric Villemain wrote:
  I believe he answered the proposal to put all headers on the same flat
  directory, instead of a tree.
 
 The headers are used as
 
  #include "hstore.h"
  #include "ltree.h"
 etc.
 
 in the existing source code.
 
 If you want to install them for use by others, you can do one of three
 things:
 
 1) Install them into $(pg_config --includedir-server), so other users
 will just include them in the same way as shown above.

I don't like this one because an extension can overwrite server headers.

 2) Install them in a different directory, but keep the same #include
 lines.  That would require PGXS changes, perhaps a new pg_config option,
 or something that produces the right -I option to find them.

Patch of PGXS is not a problem.
pg_config patch is a requirement only if users set their own 
$(includedir_contrib) variable. I didn't though about it, but it is probably 
better to add that. This looks trivial too.

To include the hstore header we still write #include "hstore.h", but we add
-I$(includedir_contrib)/hstore to the FLAGS in the makefile of the extension
which requires hstore. It changes nothing in the hstore Makefile itself.

The main difference is that before we had to use -I$(top_srcdir)/../contrib/hstore
*iff* we were not building with PGXS (otherwise we needed the hstore sources
available somewhere); now we can use -I$(includedir_contrib)/hstore with PGXS
(hstore needs to be installed first, as we USE_PGXS).

Then PGXS can handle both cases transparently, so we can put
-I$(include_contrib_header) in the Makefile,
and pgxs.mk takes care of adjusting include_contrib_header (or whatever its
name ends up being) according to the USE_PGXS value.

 3) Install them in a different directory and require a different
 #include line.  But then you have to change the existing uses as well,
 which would probably require moving files around.

Having to change existing source code to keep the same behavior is not
attractive at all.

 Both 2 and 3 require a lot of work, possibly compatibility breaks, for
 no obvious reason.

2 is a good solution and not a lot of work, IMO.

Removing the need to set -I$(include...) in the contrib Makefile can be
done later by adding a PGXS variable to define the contrib requirements; this
is something different from the current feature (building a contrib module that
depends on another without the full source tree).
-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Config reload/restart preview

2013-06-20 Thread Magnus Hagander
On Thu, Jun 20, 2013 at 1:04 PM, Thom Brown t...@linux.com wrote:
 Hi,

 I've noticed that there's no easy way of checking which settings will change
 if the config is reloaded, and I think not being able to do this can cause
 some unfortunate problems.

 For example, a customer went to change their configuration, just setting
 log_autovacuum_min_duration to about 20 seconds, and reloaded the server.
 However, the log file swelled to over 5GB in size before they realised
 something was wrong, and then reverted the change.  It transpired that the
 reload also pulled in a log_statements change from 'ddl' to 'all' that
 someone must have changed at some point without applying it.

 Should we have a way of previewing changes that would be applied if we
 reloaded/restarted the server?

 For example:

 pg_ctl previewconfig

 SIGHUP: log_statements will change from 'ddl' to 'all'
 SIGHUP: log_vacuum_min_duration will change from -1 to 2
 POSTMASTER: fsync will change from 'on' to 'off'

 I'm not proposing this specifically, but something that would provide such
 information.

Yes, we should.

This would go well with something I started working on some time ago
(but haven't actually gotten far on at all), which is the ability for
pg_ctl to be able to give feedback at all. Meaning a pg_ctl reload
should also be able to tell you which parameters were changed, without
having to go to the log. Obviously that's almost exactly the same
feature.

The problem today is that pg_ctl just sends off a SIGHUP when it does
a reload. We'd have to give it an actual interface that could return
data back as well, such as a socket of some kind. So it does take some
work to come up with. But I definitely think we should have something
like this.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change authentication error message (patch)

2013-06-20 Thread Markus Wanner
On 06/20/2013 12:27 PM, Marko Tiikkaja wrote:
 My understanding is that the attacker would already have that
 information since the server would have sent an
 AuthenticationMD5Password message to get to the error in the first
 place.  And we still reveal the authentication method to the frontend in
 all other cases (peer authentication failed, for example).

Oh, right, I wasn't aware of that. Never mind, then.

+1 for keeping the message mentioning "password authentication" explicitly.

However, thinking about this a bit more: other authentication methods
may also provide password (or even account) expiration times, and may
fail to authenticate a user for entirely different reasons.

Given that, I wonder if "password expired" is such a special case worth
mentioning in the case of the password auth method. If we go down that
path, don't we also have to include "auth server unreachable" as a
possible cause of authentication failure for methods that use an
external server?

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Config reload/restart preview

2013-06-20 Thread Pavan Deolasee
On Thu, Jun 20, 2013 at 4:34 PM, Thom Brown t...@linux.com wrote:

 Hi,

 I've noticed that there's no easy way of checking which settings will
 change if the config is reloaded, and I think not being able to do this can
 cause some unfortunate problems.

 For example, a customer went to change their configuration, just setting
 log_autovacuum_min_duration to about 20 seconds, and reloaded the server.
  However, the log file swelled to over 5GB in size before they realised
 something was wrong, and then reverted the change.  It transpired that the
 reload also pulled in a log_statements change from 'ddl' to 'all' that
 someone must have changed at some point without applying it.

 Should we have a way of previewing changes that would be applied if we
 reloaded/restarted the server?

 For example:

 pg_ctl previewconfig

  SIGHUP: log_statements will change from 'ddl' to 'all'
 SIGHUP: log_vacuum_min_duration will change from -1 to 2
 POSTMASTER: fsync will change from 'on' to 'off'

 I'm not proposing this specifically, but something that would provide such
 information.


Maybe we can have a nice little utility which can show a configuration diff
between two running servers, or between a running server and its modified
conf file?

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] Bugfix and new feature for PGXS

2013-06-20 Thread Cédric Villemain
 Opinions all?
 
 * Give up on being able to use one extension's header from another
 entirely, except in the special case of in-tree builds

There are already a good number of use cases with just hstore and intarray;
I'm in favor of this feature.

 * Install install-enabled contrib headers into an include/contrib/
 that's always on the pgxs include path, so you can still just #include
 "hstore.h". For in-tree builds, explicitly add other modules' contrib dirs
 as Peter has done in the proposed plperl_hstore and plpython_hstore.
 (Use unqualified names).

I don't like the idea of sharing a flat directory among different header files;
filenames can overlap.

 * Install install-enabled contrib headers to
 include/contrib/[modulename]/. Within the module, use the unqualified
 header name. When referring to it from outside the module, use #include
 "contrib/modulename/header.h". Add $(top_srcdir) to the include path
 when building extensions in-tree.

Not this one either; see my other mail. We still #include "hstore.h" when we
want; we just add the header so it is available for PGXS builds. The contrib
Makefile still needs -I$(includedir_contrib)/hstore. What's new is that the
header is installed, not that we no longer have to set FLAGS.

 Personally, I'm all for just using the local path when building the
 module, and the qualified path elsewhere. It requires only a relatively
 simple change, and I don't think that using "hstore.h" within hstore,
 and "contrib/hstore/hstore.h" elsewhere is of great concern.

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Config reload/restart preview

2013-06-20 Thread Thom Brown
On 20 June 2013 13:16, Pavan Deolasee pavan.deola...@gmail.com wrote:




 On Thu, Jun 20, 2013 at 4:34 PM, Thom Brown t...@linux.com wrote:

 Hi,

 I've noticed that there's no easy way of checking which settings will
 change if the config is reloaded, and I think not being able to do this can
 cause some unfortunate problems.

 For example, a customer went to change their configuration, just setting
 log_autovacuum_min_duration to about 20 seconds, and reloaded the server.
  However, the log file swelled to over 5GB in size before they realised
 something was wrong, and then reverted the change.  It transpired that the
 reload also pulled in a log_statements change from 'ddl' to 'all' that
 someone must have changed at some point without applying it.

 Should we have a way of previewing changes that would be applied if we
 reloaded/restarted the server?

 For example:

 pg_ctl previewconfig

  SIGHUP: log_statements will change from 'ddl' to 'all'
 SIGHUP: log_vacuum_min_duration will change from -1 to 2
 POSTMASTER: fsync will change from 'on' to 'off'

 I'm not proposing this specifically, but something that would provide
 such information.


 May be we can have a nice little utility which can show configuration diff
 between two running servers, or a running server and its modified conf file
 ?


Well, checking for configuration differences between two running servers is
kind of a separate feature, and I'm not sure it's going to be needed by
that many DBAs.  If they wanted to see differences, they could just create
a foreign table pointing at the other server and do an anti-join, as in the
sketch below.
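
A sketch using postgres_fdw; the server name, connection options, and the
settings comparison are illustrative only:

CREATE EXTENSION postgres_fdw;
CREATE SERVER other_pg FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'otherhost', dbname 'postgres');
CREATE USER MAPPING FOR CURRENT_USER SERVER other_pg;
CREATE FOREIGN TABLE remote_settings (name text, setting text)
    SERVER other_pg
    OPTIONS (schema_name 'pg_catalog', table_name 'pg_settings');

SELECT l.name, l.setting AS local, r.setting AS remote
FROM pg_settings l
JOIN remote_settings r USING (name)
WHERE l.setting IS DISTINCT FROM r.setting;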

However, comparing the difference between a running server and its
modified conf file is the case I'm describing.  I'd personally prefer not
to have a whole new utility if it can be avoided.  Magnus mentioned making
the necessary changes to pg_ctl so that it can provide unlogged feedback,
and pg_ctl does feel more like the go-to tool for such a feature.  Apache
has apache2ctl which is responsible for starting, stopping and restarting
the server, but also provides a configtest option to check that a restart
won't bring the service down.  This seems to be a feature in the same kind
of area.  In fact that's pretty much the only action apache2ctl does that
pg_ctl doesn't do.

-- 
Thom


Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread MauMau

First, thank you for the review.

From: Alvaro Herrera alvhe...@2ndquadrant.com

This seems reasonable.  Why 10 seconds?  We could wait 5 seconds, or 15.
Is there a rationale behind the 10?  If we said 60, that would fit
perfectly well within the already existing 60-second loop in postmaster,
but that seems way too long.


There is no good rationale.  I arbitrarily chose a short period because this
is an immediate shutdown.  I felt more than 10 seconds was long.  I think 5
seconds may be better.  Although not directly related to this fix, these
points influenced my choice:


1. According to the man page of init, init sends SIGKILL to all remaining 
processes 5 seconds after it sends SIGTERM to them.


2. At computer shutdown, Windows proceeds with a forced shutdown after
waiting 20 seconds for services to terminate.




I have only one concern about this patch, which is visible in the
proposed documentation change:

   <para>
   This is the <firstterm>Immediate Shutdown</firstterm> mode.
   The master <command>postgres</command> process will send a
-  <systemitem>SIGQUIT</systemitem> to all child processes and exit
-  immediately, without properly shutting itself down. The child processes
-  likewise exit immediately upon receiving
-  <systemitem>SIGQUIT</systemitem>. This will lead to recovery (by
+  <systemitem>SIGQUIT</systemitem> to all child processes, wait for
+  them to terminate, and exit. The child processes
+  exit immediately upon receiving
+  <systemitem>SIGQUIT</systemitem>. If any of the child processes
+  does not terminate within 10 seconds for some unexpected reason,
+  the master postgres process will send a <systemitem>SIGKILL</systemitem>
+  to all remaining ones, wait for their termination
+  again, and exit. This will lead to recovery (by
   replaying the WAL log) upon next start-up. This is recommended
   only in emergencies.
   </para>

Note that the previous text said that postmaster will send SIGQUIT, then
terminate without checking anything.  In the new code, postmaster sends
SIGQUIT, then waits, then SIGKILL, then waits again.  If there's an
unkillable process (say because it's stuck in a noninterruptible sleep)
postmaster might never exit.  I think it should send SIGQUIT, then wait,
then SIGKILL, then exit without checking.


At first I thought the same, but decided not to do that.  The purpose of
this patch is to make immediate shutdown reliable.  Here, "reliable"
means that the database server is certainly shut down when pg_ctl returns,
not telling the lie "I shut down the server processes for you, so you do
not have to worry that some postgres process might still remain and
write to disk".  I suppose reliable shutdown is crucial especially in an HA
cluster.  If pg_ctl stop -mi gets stuck forever when there is an unkillable
process (in what situations does this happen? an OS bug, or an NFS hard
mount?), I think the DBA has to notice this situation from the unfinished
pg_ctl, investigate the cause, and take corrective action.  Anyway, in an HA
cluster, the clusterware will terminate the node with STONITH, not leaving
pg_ctl running forever.




I have tweaked the patch a bit and I'm about to commit as soon as we
resolve the above two items.


I appreciate your tweaking, especially in the documentation and source code 
comments, as English is not my mother tongue.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements

2013-06-20 Thread Fabrízio de Royes Mello
On Thu, Jun 20, 2013 at 1:52 AM, Amit Langote amitlangot...@gmail.com
wrote:

 Is it possible to:

 CREATE [ OR REPLACE | IF NOT EXISTS ] OPERATOR CLASS

 I am in a situation where I need to conditionally create an operator
 class (that is, create it only if it does not already exist).

 [...]


The intention is to cover all CREATE OPERATOR variants. See my planning [1].

Regards,

[1]
https://docs.google.com/spreadsheet/ccc?key=0Ai7oCVcVQiKFdEctQUxNNlR1R2xRTUpJNFNDcFo4MUE&usp=sharing
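
Until that lands, the usual workaround is a conditional DO block; a sketch
only, where the opclass definition just mirrors the built-in int4_ops and
all the names are made up:

DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pg_opclass WHERE opcname = 'my_int4_ops') THEN
        CREATE OPERATOR CLASS my_int4_ops FOR TYPE int4 USING btree AS
            OPERATOR 1 <,  OPERATOR 2 <=, OPERATOR 3 =,
            OPERATOR 4 >=, OPERATOR 5 >,
            FUNCTION 1 btint4cmp(int4, int4);
    END IF;
END
$$;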

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Bugfix and new feature for PGXS

2013-06-20 Thread Peter Eisentraut
On 6/20/13 1:21 AM, Craig Ringer wrote:
 As you note, the other option is to just refer to extension headers by
 their unqualified name. I'm *really* not a fan of that idea due to the
 obvious clash possibilities with Pg's own headers or system headers,
 especially given that the whole idea of extensions is that they're user
 supplied. Not having any kind of namespacing is worrying.

We already install all PostgreSQL server header files into a separate
directory, so the only clashes you have to worry about are with other
extensions and with the backend.  And you have to do that anyway,
because you will have namespace issues for the shared libraries, the
symbols in those libraries, and the extension names.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bugfix and new feature for PGXS

2013-06-20 Thread Peter Eisentraut
On 6/20/13 1:21 AM, Craig Ringer wrote:
 Personally, I'm all for just using the local path when building the
 module, and the qualified path elsewhere. It requires only a relatively
 simple change, and I don't think that using "hstore.h" within hstore,
 and "contrib/hstore/hstore.h" elsewhere is of great concern.

It doesn't work if those header files include other header files (as in
the case of plpython, for example).



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Config reload/restart preview

2013-06-20 Thread Dimitri Fontaine
Magnus Hagander mag...@hagander.net writes:
 Should we have a way of previewing changes that would be applied if we
 reloaded/restarted the server?

 Yes, we should.

+1

 This would go well with something I started working on some time ago
 (but haven't actually gotten far on at all), which is the ability for
 pg_ctl to be able to give feedback at all. Meaning a pg_ctl reload
 should also be able to tell you which parameters were changed, without
 having to go to the log. Obviously that's almost exactly the same
 feature.

It could probably connect to the server and issue the SQL command to
reload, and that one could probably get enhanced to return the modified
variables as NOTICEs, or be run with the right client_min_messages:

SELECT pg_reload_conf();

The pg_ctl client would then have to know to display the messages sent
back by the server.

Then, back to what Thom is actually asking: I guess that implementing an
SRF that parses the configuration file and returns something very much like
what we get from pg_settings should be possible, and then it's a matter
of doing an anti-join, as already proposed.
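
Something along these lines; pg_file_settings() here stands for the
hypothetical SRF, not an existing function:

SELECT s.name, f.setting AS file_value, s.setting AS live_value
FROM pg_file_settings() f
JOIN pg_settings s USING (name)
WHERE f.setting IS DISTINCT FROM s.setting;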

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements

2013-06-20 Thread Amit Langote
On Thu, Jun 20, 2013 at 9:48 PM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:

 On Thu, Jun 20, 2013 at 1:52 AM, Amit Langote amitlangot...@gmail.com
 wrote:

 Is it possible to:

 CREATE [ OR REPLACE | IF NOT EXISTS ] OPERATOR CLASS

 I am in a situation where I need to conditionally create an operator
 class (that is, create it only if it does not already exist).

 [...]


 The intention is cover all CREATE OPERATOR variants. See my planning [1].


 Regards,

 [1]
 https://docs.google.com/spreadsheet/ccc?key=0Ai7oCVcVQiKFdEctQUxNNlR1R2xRTUpJNFNDcFo4MUE&usp=sharing

Hmm, okay. Last time I checked, the CREATE OPERATOR CLASS row was
empty, so I asked.


--
Amit Langote


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] add long options to pgbench (submission 1)

2013-06-20 Thread Fabien COELHO



Please fix that and re-send the patch.


Find attached a diff wrt the current master.

--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 1303217..f39fed3 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -2109,6 +2109,30 @@ int
 main(int argc, char **argv)
 {
 	static struct option long_options[] = {
+		/* systematic long/short named options */
+		{"client", required_argument, NULL, 'c'},
+		{"connect", no_argument, NULL, 'C'},
+		{"debug", no_argument, NULL, 'd'},
+		{"define", required_argument, NULL, 'D'},
+		{"file", required_argument, NULL, 'f'},
+		{"fill-factor", required_argument, NULL, 'F'},
+		{"host", required_argument, NULL, 'h'},
+		{"initialize", no_argument, NULL, 'i'},
+		{"jobs", required_argument, NULL, 'j'},
+		{"log", no_argument, NULL, 'l'},
+		{"no-vacuum", no_argument, NULL, 'n'},
+		{"port", required_argument, NULL, 'p'},
+		{"query-mode", required_argument, NULL, 'M'},
+		{"quiet-log", no_argument, NULL, 'q'},
+		{"report-latencies", no_argument, NULL, 'r'},
+		{"scale", required_argument, NULL, 's'},
+		{"select-only", no_argument, NULL, 'S'},
+		{"skip-some-update", no_argument, NULL, 'N'},
+		{"time", required_argument, NULL, 'T'},
+		{"transactions", required_argument, NULL, 't'},
+		{"username", required_argument, NULL, 'U'},
+		{"vacuum-all", no_argument, NULL, 'v'},
+		/* long-named only options */
 		{"foreign-keys", no_argument, &foreign_keys, 1},
 		{"index-tablespace", required_argument, NULL, 3},
 		{"tablespace", required_argument, NULL, 2},
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 8775606..3394b93 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -150,6 +150,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-i</option></term>
+  <term><option>--initialize</option></term>
   <listitem>
    <para>
     Required to invoke initialization mode.
@@ -159,6 +160,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-F</option> <replaceable>fillfactor</></term>
+  <term><option>--fill-factor</option> <replaceable>fillfactor</></term>
   <listitem>
    <para>
     Create the <structname>pgbench_accounts</>,
@@ -171,6 +173,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-n</option></term>
+  <term><option>--no-vacuum</option></term>
   <listitem>
    <para>
     Perform no vacuuming after initialization.
@@ -180,6 +183,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-q</option></term>
+  <term><option>--quiet-log</option></term>
   <listitem>
    <para>
     Switch logging to quiet mode, producing only one progress message per 5
@@ -191,6 +195,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-s</option> <replaceable>scale_factor</></term>
+  <term><option>--scale</option> <replaceable>scale_factor</></term>
   <listitem>
    <para>
     Multiply the number of rows generated by the scale factor.
@@ -259,6 +264,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-c</option> <replaceable>clients</></term>
+  <term><option>--client</option> <replaceable>clients</></term>
   <listitem>
    <para>
     Number of clients simulated, that is, number of concurrent database
@@ -269,6 +275,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-C</option></term>
+  <term><option>--connect</option></term>
   <listitem>
    <para>
     Establish a new connection for each transaction, rather than
@@ -280,6 +287,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-d</option></term>
+  <term><option>--debug</option></term>
   <listitem>
    <para>
     Print debugging output.
@@ -289,6 +297,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-D</option> <replaceable>varname</><literal>=</><replaceable>value</></term>
+  <term><option>--define</option> <replaceable>varname</><literal>=</><replaceable>value</></term>
   <listitem>
    <para>
     Define a variable for use by a custom script (see below).
@@ -299,6 +308,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-f</option> <replaceable>filename</></term>
+  <term><option>--file</option> <replaceable>filename</></term>
   <listitem>
    <para>
     Read transaction script from <replaceable>filename</>.
@@ -311,6 +321,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
  <varlistentry>
   <term><option>-j</option> <replaceable>threads</></term>
+  <term><option>--jobs</option> <replaceable>threads</></term>
   <listitem>
    <para>
     Number of worker threads within <application>pgbench</application>.
@@ -324,6 +335,7 @@ pgbench <optional> <replaceable>options</> </optional>

Re: [HACKERS] [PATCH] add long options to pgbench (submission 1)

2013-06-20 Thread Fabrízio de Royes Mello
On Thu, Jun 20, 2013 at 9:59 AM, Fabien COELHO coe...@cri.ensmp.fr wrote:


  Please fix that and re-send the patch.


 Find attached diff wrt current master.


Thanks.


-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Config reload/restart preview

2013-06-20 Thread Magnus Hagander
On Thu, Jun 20, 2013 at 2:54 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Magnus Hagander mag...@hagander.net writes:
 Should we have a way of previewing changes that would be applied if we
 reloaded/restarted the server?

 Yes, we should.

 +1

 This would go well with something I started working on some time ago
 (but haven't actually gotten far on at all), which is the ability for
 pg_ctl to be able to give feedback at all. Meaning a pg_ctl reload
 should also be able to tell you which parameters were changed, without
 having to go to the log. Obviously that's almost exactly the same
 feature.

 It could probably connect to the server and issue the SQL command to
 reload, and that one could probably get enhanced to return the modified
 variables as NOTICEs, or be run with the right client_min_messages:

 SELECT pg_reload_conf();

 The pg_ctl client would then have to know to display the messages sent
 back by the server.

The problem with that is that now you must *always* have your system
set up in pg_hba.conf to allow the postgres user to log in, or it
fails.

But yes, one option would be to use SQL instead of opening a socket.
Maybe that's a better idea - have pg_ctl try to use that if available,
and if not send a regular signal and not try to collect the output.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MVCC catalog access

2013-06-20 Thread Robert Haas
On Mon, Jun 17, 2013 at 8:12 AM, Andres Freund and...@2ndquadrant.com wrote:
 So, the biggest issue with the patch seems to be performance worries. I
 tried to create a worst case scenario:
 postgres (patched and HEAD) running with:
 -c shared_buffers=4GB \
 -c max_connections=2000 \
 -c maintenance_work_mem=2GB \
 -c checkpoint_segments=300 \
 -c wal_buffers=64MB \
 -c synchronous_commit=off \
 -c autovacuum=off \
 -p 5440

 With one background pgbench running:
 pgbench -p 5440 -h /tmp -f /tmp/readonly-busy.sql -c 1000 -j 10 -T 100 
 postgres
 readonly-busy.sql:
 BEGIN;
 SELECT txid_current();
 SELECT pg_sleep(0.0001);
 COMMIT;

 I measured the performance of one other pgbench:
 pgbench -h /tmp -p 5440 postgres -T 10 -c 100 -j 100 -n -f 
 /tmp/simplequery.sql -C
 simplequery.sql:
 SELECT * FROM af1, af2 WHERE af1.x = af2.x;
 tables:
 create table af1 (x) as select g from generate_series(1,4) g;
 create table af2 (x) as select g from generate_series(4,7) g;

 With that setup one can create quite a noticeable overhead for the mvcc
 patch (best of 5):

 master-optimize:
 tps = 1261.629474 (including connections establishing)
 tps = 15121.648834 (excluding connections establishing)

 dev-optimize:
 tps = 773.719637 (including connections establishing)
 tps = 2804.239979 (excluding connections establishing)

 Most of the time in both, patched and unpatched is by far spent in
 GetSnapshotData. I think the reason this shows a far higher overhead
 than what you previously measured is that a) in your test the other
 backends were idle, in mine they actually modify PGXACT which causes
 noticeable cacheline bouncing b) I have a higher number of connections &
 #max_connections

 A quick test shows that even with max_connections=600, 400 background,
 and 100 foreground pgbenches there's noticeable overhead:
 master-optimize:
 tps = 2221.226711 (including connections establishing)
 tps = 31203.259472 (excluding connections establishing)
 dev-optimize:
 tps = 1629.734352 (including connections establishing)
 tps = 4754.449726 (excluding connections establishing)

 Now I grant that's a somewhat harsh test for postgres, but I don't
 think it's entirely unreasonable and the performance impact is quite
 stark.

It's not entirely unreasonable, but it *is* mostly unreasonable.  I
mean, nobody is going to run 1000 connections in the background that
do nothing but thrash PGXACT on a real system.  I just can't get
concerned about that.  What I am concerned about is that there may be
other, more realistic workloads that show similar regressions.  But I
don't know how to find out whether that's actually the case.  On the
IBM POWER box where I tested this, it's not even GetSnapshotData()
that kills you; it's the system CPU scheduler.

The thing about this particular test is that it's artificial -
normally, any operation that wants to modify PGXACT will spend a lot
more time fighting over WALInsertLock and maybe waiting for disk I/O
than is the case here.  Of course, with Heikki's WAL scaling patch and
perhaps other optimizations we expect that other overhead to go down,
which might make the problems here more visible; and some of Heikki's
existing testing has shown significant contention around ProcArrayLock
as things stand.  But I'm still on the fence about whether this is
really a valid test.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic background workers

2013-06-20 Thread Robert Haas
On Mon, Jun 17, 2013 at 10:45 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Fri, 2013-06-14 at 17:00 -0400, Robert Haas wrote:
 Alvaro's work on 9.3, we now have the ability to configure background
 workers via shared_preload_libraries.  But if you don't have the right
 library loaded at startup time, and subsequently wish to add a
 background worker while the server is running, you are out of luck.

 We could tweak shared_preload_libraries so that it reacts sensibly to
 reloads.  I basically gave up on that by writing
 session_preload_libraries, but if there is more general use for that, we
 could try.

 (That doesn't invalidate your work, but it's a thought.)

Yeah, I thought about that.  But it doesn't seem possible to do
anything all that sane.  You can't unload libraries if they've been
removed; you can potentially load new ones if they've been added.  But
that's a bit confusing, if the config file says that what's loaded
is bar, and what's actually loaded is foo, bar, baz, bletch, and quux.

Some variant of this might still be worth doing, but figuring out the
details sounded like more than I wanted to get into, so I punted.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic background workers

2013-06-20 Thread Markus Wanner
Robert,

On 06/14/2013 11:00 PM, Robert Haas wrote:
 Parallel query, or any subset of that project such as parallel sort,
 will require a way to start background workers on demand.

thanks for continuing this, very much appreciated. Postgres-R, and thus
TransLattice, have successfully used a similar approach for years now.

I have only had a quick glance over the patch so far. Some comments on the design:

 This requires some communication channel from ordinary workers to the
 postmaster, because it is the postmaster that must ultimately start
 the newly-registered workers.  However, that communication channel has
 to be designed pretty carefully, lest a shared memory corruption take
 out the postmaster and lead to inadvertent failure to restart after a
 crash.  Here's how I implemented that: there's an array in shared
 memory of a size equal to max_worker_processes.  This array is
 separate from the backend-private list of workers maintained by the
 postmaster, but the two are kept in sync.  When a new background
 worker registration is added to the shared data structure, the backend
 adding it uses the existing pmsignal mechanism to kick the postmaster,
 which then scans the array for new registrations.

That sounds like a good simplification. Even if it's an O(n) operation,
the n in question here has relatively low practical limits. It's
unlikely to be much of a concern, I guess.

Back then, I solved it by having a fork request slot. After starting,
the bgworker then had to clear that slot and register with a coordinator
process (i.e. the original requestor), so that one learned its fork
request was successful. At some point I expanded that to multiple
request slots to better handle multiple concurrent fork requests.
However, it was difficult to get right and required more IPC than your
approach.

On the pro side: The shared memory area used by the postmaster was very
small in size and read-only to the postmaster. These were my main goals,
which I'm not sure were the best ones, now that I read your concept.

 I have attempted to
 make the code that transfers the shared_memory state into the
 postmaster's private state as paranoid as humanly possible.  The
 precautions taken are documented in the comments.  Conversely, when a
 background worker flagged as BGW_NEVER_RESTART is considered for
 restart (and we decide against it), the corresponding slot in the
 shared memory array is marked as no longer in use, allowing it to be
 reused for a new registration.

Sounds like the postmaster is writing to shared memory. Not sure why
I've been trying so hard to avoid that, though. After all, it can hardly
hurt itself *writing* to shared memory.

 Since the postmaster cannot take locks, synchronization between the
 postmaster and other backends using the shared memory segment has to
 be lockless.  This mechanism is also documented in the comments.  An
 lwlock is used to prevent two backends that are both registering a new
 worker at about the same time from stomping on each other, but the
 postmaster need not care about that lwlock.
 
 This patch also extends worker_spi as a demonstration of the new
 interface.  With this patch, you can CREATE EXTENSION worker_spi and
 then call worker_spi_launch(int4) to launch a new background worker,
 or combine it with generate_series() to launch a bunch at once.  Then
 you can kill them off with pg_terminate_backend() and start some new
 ones.  That, in my humble opinion, is pretty cool.

It definitely is. Thanks again.
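
Spelled out, the sequence described above would look something like this;
the pg_stat_activity filter is only a guess at how the demo workers would
be identified:

CREATE EXTENSION worker_spi;

-- launch four dynamic background workers at once:
SELECT worker_spi_launch(i) FROM generate_series(1, 4) AS i;

-- and kill them off again:
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE query LIKE 'worker_spi%';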

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-20 Thread Robert Haas
On Tue, Jun 18, 2013 at 6:27 PM, Nicholas White n.j.wh...@gmail.com wrote:
 Thanks for the reviews. I've attached a revised patch that has the lexer
 refactoring Alvaro mentions (arbitrarily using a goto rather than a bool
 flag) and uses Jeff's idea of just storing the index of the last non-null
 value (if there is one) in the window function's context (and not the Datum
 itself).

 However, Robert's right that SELECT ... ORDER BY respect NULLS LAST will now
 fail. An earlier iteration of the patch had RESPECT and IGNORE as reserved,
 but that would have broken tables with columns called "respect" (etc.),
 which the current version allows. Is this backwards incompatibility
 acceptable?
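
To spell out the conflict, here are the two statements at stake, assuming a
table t(val, ts, respect):

-- the standard window syntax the patch adds:
SELECT lag(val) IGNORE NULLS OVER (ORDER BY ts) FROM t;

-- a pre-existing query that must keep working if RESPECT/IGNORE
-- stay unreserved:
SELECT * FROM t ORDER BY respect NULLS LAST;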

I think it's better to add new partially reserved keywords than to
have this kind of random breakage.  When you make something a
partially-reserved keyword, then things break, but in a fairly
well-defined way.  Lexer hacks can break things in ways that are much
subtler, which we may not even realize for a long time, and which in
that case would mean that the word "respect" needs to be quoted in
some contexts but not others.  That's going to cause a lot of
headaches, because pg_dump etc. know about quoting reserved keywords,
but they don't know anything about quoting unreserved keywords in
contexts where they might happen to be followed by the wrong next
word.

 If not, maybe I should try doing a two-token lookahead in the
 token-aggregating code between the lexer and the parser (i.e. make a
 RESPECT_NULLS token out of a sequence of RESPECT NULLS OVER tokens,
 remembering to replace the OVER token)? Or what about adding an %expect
 statement to the Bison grammar, confirming that the shift / reduce conflicts
 caused by using the RESPECT, IGNORE and NULL_P tokens in the out_clause rule
 are OK?
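
For context, PostgreSQL already plays a trick of this kind: base_yylex()
in src/backend/parser/parser.c merges NULLS followed by FIRST/LAST into a
single token using one-token lookahead. A simplified sketch of what the
analogous merge could look like follows; RESPECT_P, RESPECT_NULLS and
push_back_token() are hypothetical names, and the scanner types are the
real core-scanner ones but the surrounding plumbing is elided.

/* hypothetical: stashes one token to be returned by the next call */
static void push_back_token(int token, core_YYSTYPE *val, YYLTYPE *loc);

/*
 * Simplified sketch of a lookahead filter between lexer and grammar,
 * modeled on base_yylex() in src/backend/parser/parser.c.
 */
static int
filtered_yylex(core_YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
{
    int     cur_token = core_yylex(lvalp, llocp, yyscanner);

    if (cur_token == RESPECT_P)
    {
        core_YYSTYPE ahead_val;
        YYLTYPE     ahead_loc;
        int         ahead = core_yylex(&ahead_val, &ahead_loc, yyscanner);

        if (ahead == NULLS_P)
            return RESPECT_NULLS;   /* fold the two words into one token */

        /*
         * Not the pattern we were looking for: stash the lookahead token
         * so the next call returns it (the real code keeps a one-slot
         * buffer for exactly this purpose).
         */
        push_back_token(ahead, &ahead_val, &ahead_loc);
    }
    return cur_token;
}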

These lines of inquiry don't seem promising to me.  It's going to be
complicated and unmaintainable and may just move the failure scenarios
to cases that are too obscure for us to reason about.

I think the question is whether this feature is really worth adding
new reserved keywords for.  I have a hard time saying we shouldn't
support something that's part of the SQL standard, but personally,
it's not something I've seen come up prior to this thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MVCC catalog access

2013-06-20 Thread Andres Freund
On 2013-06-20 09:45:26 -0400, Robert Haas wrote:
  With that setup one can create quite a noticeable overhead for the mvcc
  patch (best of 5):
 
  master-optimize:
  tps = 1261.629474 (including connections establishing)
  tps = 15121.648834 (excluding connections establishing)
 
  dev-optimize:
  tps = 773.719637 (including connections establishing)
  tps = 2804.239979 (excluding connections establishing)
 
  Most of the time in both, patched and unpatched is by far spent in
  GetSnapshotData. I think the reason this shows a far higher overhead
  than what you previously measured is that a) in your test the other
  backends were idle, in mine they actually modify PGXACT which causes
 noticeable cacheline bouncing, b) I have a higher number of connections
 and a higher #max_connections
 
  A quick test shows that even with max_connection=600, 400 background,
  and 100 foreground pgbenches there's noticeable overhead:
  master-optimize:
  tps = 2221.226711 (including connections establishing)
  tps = 31203.259472 (excluding connections establishing)
  dev-optimize:
  tps = 1629.734352 (including connections establishing)
  tps = 4754.449726 (excluding connections establishing)
 
  Now I grant that's a somewhat harsh test for postgres, but I don't
  think it's entirely unreasonable and the performance impact is quite
  stark.
 
 It's not entirely unreasonable, but it *is* mostly unreasonable. 

Well, sure. So are the tests that you ran. But that's *completely*
fine. In contrast to evaluating whether a performance improvement is
worth its complexity, we're not trying to measure real-world
improvements. We're trying to test the worst cases we can think of, even
if they aren't really interesting, by stressing potential pain points. If
we can't find a relevant regression for those using something akin to
microbenchmarks, it's less likely that there are performance regressions.

The "not entirely unreasonable" point is just about making sure you're
not testing something entirely irrelevant. Say, performance of a 1TB
database when shared_buffers is set to 64k. Or testing DDL performance
while locking pg_class exclusively.

The test was specifically chosen to:
* do uncached syscache lookups (-C) to measure the impact of the added
  GetSnapshotData() calls
* make individual GetSnapshotData() calls slower. (all processes have an
  xid)
* contend on ProcArrayLock but not much else (high number of clients in
  the background)

 I
 mean, nobody is going to run 1000 connections in the background that
 do nothing but thrash PGXACT on a real system.  I just can't get
 concerned about that.

In the original mail I did retry it with 400 and the regression is still
pretty big. And the background things could also be doing something
that's not that likely to be blocked by global locks. Say, operate on
temporary or unlogged tables. Or just acquire a single row level lock
and then continue to do readonly work in a read committed transaction.

I think we both can come up with scenarios where at least part of the
above scenario is present. But imo that doesn't really matter.

 What I am concerned about is that there may be
 other, more realistic workloads that show similar regressions.  But I
 don't know how to find out whether that's actually the case.

So, given the results from that test and the profile I got, where
GetSnapshotData was by far the most expensive thing, a more
representative test might be something like a read-only pgbench with a
moderately high number of short-lived connections. I wouldn't be
surprised if that still showed performance problems.

If that's not enough something like:
BEGIN;
SELECT * FROM my_client WHERE client_id = :id FOR UPDATE;
SELECT * FROM key_table WHERE key = :random
...
SELECT * FROM key_table WHERE key = :random
COMMIT;

will surely still show the problem.

  On the
 IBM POWER box where I tested this, it's not even GetSnapshotData()
 that kills you; it's the system CPU scheduler.

I haven't tried yet, but I'd guess the above setup shows the difference
with less than 400 clients. Might make it more reasonable to run there.

 But I'm still on the fence about whether this is really a valid test.

I think it shows that we need to be careful and do further performance
evaluations and/or alleviate the pain by making things cheaper (say, a
ddl counter in shared mem, allowing to cache snapshots for the
syscache). If that artificial test hadn't shown problems I'd have voted
for just going ahead and not worry further.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic background workers

2013-06-20 Thread Robert Haas
On Thu, Jun 20, 2013 at 9:57 AM, Markus Wanner mar...@bluegap.ch wrote:
 That sounds like a good simplification. Even if it's an O(n) operation,
 the n in question here has relatively low practical limits. It's
 unlikely to be much of a concern, I guess.

The constant factor is also very small.  Generally, I would expect
num_worker_processes ~ # CPUs, and scanning a 32, 64, or even 128
element array is not a terribly time-consuming operation.  We might
need to re-think this when systems with 4096 processors become
commonplace, but considering how many other things would also need to
be fixed to work well in that universe, I'm not too concerned about it
just yet.

One thing I think we probably want to explore in the future, for both
worker backends and regular backends, is pre-forking.  We could avoid
some of the latency associated with starting up a new backend or
opening a new connection in that way.  However, there are quite a few
details to be thought through there, so I'm not eager to pursue that
just yet.  Once we have enough infrastructure to implement meaningful
parallelism, we can benchmark it and find out where the bottlenecks
are, and which solutions actually help most.

 Back then, I solved it by having a fork request slot. After starting,
 the bgworker then had to clear that slot and register with a coordinator
 process (i.e. the original requestor), so that one learned its fork
 request was successful. At some point I expanded that to multiple
 request slots to better handle multiple concurrent fork requests.
 However, it was difficult to get right and requires more IPC than your
 approach.

I do think we need a mechanism to allow the backend that requested the
bgworker to know whether or not the bgworker got started, and whether
it unexpectedly died.  Once we get to the point of calling user code
within the bgworker process, it can use any number of existing
mechanisms to make sure that it won't die without notifying the
backend that started it (short of a PANIC, in which case it won't
matter anyway).  But we need a way to report failures that happen
before that point.  I have some ideas about that, but decided to leave
them for future passes.  The remit of this patch is just to make it
possible to dynamically register bgworkers.  Allowing a bgworker to be
tied to the session that requested it via some sort of feedback loop
is a separate project - which I intend to tackle before CF2, assuming
this gets committed (and so far nobody is objecting to that).

 I have attempted to
 make the code that transfers the shared_memory state into the
 postmaster's private state as paranoid as humanly possible.  The
 precautions taken are documented in the comments.  Conversely, when a
 background worker flagged as BGW_NEVER_RESTART is considered for
 restart (and we decide against it), the corresponding slot in the
 shared memory array is marked as no longer in use, allowing it to be
 reused for a new registration.

 Sounds like the postmaster is writing to shared memory. Not sure why
 I've been trying so hard to avoid that, though. After all, it can hardly
 hurt itself *writing* to shared memory.

I think there's ample room for paranoia about postmaster interaction
with shared memory, but all it's doing is setting a flag, which is no
different from what CheckPostmasterSignal() already does.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum/visibility is busted

2013-06-20 Thread Andres Freund
On 2013-06-19 15:01:44 -0700, Jeff Janes wrote:
 On Thu, Feb 7, 2013 at 12:01 PM, Andres Freund and...@2ndquadrant.com wrote:
 
  On 2013-02-07 11:15:46 -0800, Jeff Janes wrote:
  
   Does anyone have suggestions on how to hack the system to make it
   fast-forward the current transaction id? It would certainly make
   testing this kind of thing faster if I could make transaction id
   increment by 100 each time a new one is generated.  Then wrap-around
   could be approached in minutes rather than hours.
 
  I had various plpgsql functions to do that, but those still took quite
  some time. As I needed it before I just spent some minutes hacking up a
  contrib module to do the job.
 
 
 Hi Andres,
 
 Your patch needs the file xidfuncs--1.0.sql, but does not include it.
 
 I could probably guess what needs to be in that file, but do you still have
 a copy of it?

Hm. Sorry. Not sure how that happened. The commit in my local branch for
that seems to have it ;). Attached.

  I doubt it really makes sense as a contrib module on its own
  though?

 Maybe PGXN?

Hm. As of now I am not yet that convinced of PGXN for extensions
containing C.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From b32cc0dca23f0477b26cb9d733b1d7ef391a77cc Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 20 Feb 2013 17:20:00 +0100
Subject: [PATCH] add xidfuncs

---
 contrib/Makefile                   |  3 ++-
 contrib/xidfuncs/Makefile          | 18 ++++++++++++++++++
 contrib/xidfuncs/xidfuncs--1.0.sql |  5 +++++
 contrib/xidfuncs/xidfuncs.c        | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 contrib/xidfuncs/xidfuncs.control  |  5 +++++
 5 files changed, 80 insertions(+), 1 deletion(-)
 create mode 100644 contrib/xidfuncs/Makefile
 create mode 100644 contrib/xidfuncs/xidfuncs--1.0.sql
 create mode 100644 contrib/xidfuncs/xidfuncs.c
 create mode 100644 contrib/xidfuncs/xidfuncs.control

diff --git a/contrib/Makefile b/contrib/Makefile
index fcd7c1e..64d10cc 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -51,7 +51,8 @@ SUBDIRS = \
 		tsearch2	\
 		unaccent	\
 		vacuumlo	\
-		worker_spi
+		worker_spi	\
+		xidfuncs
 
 ifeq ($(with_openssl),yes)
 SUBDIRS += sslinfo
diff --git a/contrib/xidfuncs/Makefile b/contrib/xidfuncs/Makefile
new file mode 100644
index 000..6977a3b
--- /dev/null
+++ b/contrib/xidfuncs/Makefile
@@ -0,0 +1,18 @@
+# contrib/xidfuncs/Makefile
+
+MODULE_big	= xidfuncs
+OBJS		= xidfuncs.o
+
+EXTENSION = xidfuncs
+DATA = xidfuncs--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/xidfuncs
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/xidfuncs/xidfuncs--1.0.sql b/contrib/xidfuncs/xidfuncs--1.0.sql
new file mode 100644
index 000..db7c9ed
--- /dev/null
+++ b/contrib/xidfuncs/xidfuncs--1.0.sql
@@ -0,0 +1,5 @@
+CREATE FUNCTION burnxids(numxids int4)
+RETURNS int8
+VOLATILE
+AS 'MODULE_PATHNAME', 'burnxids'
+LANGUAGE C;
diff --git a/contrib/xidfuncs/xidfuncs.c b/contrib/xidfuncs/xidfuncs.c
new file mode 100644
index 000..d24a48f
--- /dev/null
+++ b/contrib/xidfuncs/xidfuncs.c
@@ -0,0 +1,50 @@
+/*-
+ * contrib/xidfuncs/xidfuncs.c
+ *
+ * Copyright (c) 2013, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/xidfuncs/xidfuncs.c
+ *
+ *-
+ */
+
+#include "postgres.h"
+
+#include "access/transam.h"
+#include "access/xact.h"
+#include "funcapi.h"
+#include "storage/proc.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(burnxids);
+
+extern Datum burnxids(PG_FUNCTION_ARGS);
+
+Datum
+burnxids(PG_FUNCTION_ARGS)
+{
+	int nxids = PG_GETARG_INT32(0);
+	int i;
+	TransactionId last;
+
+	if (nxids <= 1)
+		elog(ERROR, "can't burn a negative amount of xids");
+
+	if (nxids > 1<<30)
+		elog(ERROR, "burning too many xids is dangerous");
+
+	if (GetTopTransactionIdIfAny() != InvalidTransactionId)
+		elog(ERROR, "can't burn xids in a transaction with xid");
+
+	for (i = 0; i < nxids; i++)
+	{
+		last = GetNewTransactionId(false);
+	}
+
+	/* don't keep xid as assigned */
+	MyPgXact->xid = InvalidTransactionId;
+
+	return Int64GetDatum((int64) last);
+}
diff --git a/contrib/xidfuncs/xidfuncs.control b/contrib/xidfuncs/xidfuncs.control
new file mode 100644
index 000..bca7194
--- /dev/null
+++ b/contrib/xidfuncs/xidfuncs.control
@@ -0,0 +1,5 @@
+# xidfuncs extension
+comment = 'xid debugging functions'
+default_version = '1.0'
+module_pathname = '$libdir/xidfuncs'
+relocatable = true
-- 
1.8.2.rc2.4.g7799588.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] dump difference between 9.3 and master after upgrade

2013-06-20 Thread Robert Haas
On Tue, Jun 18, 2013 at 12:18 PM, Andrew Dunstan and...@dunslane.net wrote:
 As I was updating my cross version upgrade tester to include support for the
 9.3 branch, I noted this dump difference between the dump of the original
 9.3 database and the dump of the converted database, which looks odd. Is it
 correct?

It looks sketchy to me, but I'm not sure exactly what you're comparing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic background workers

2013-06-20 Thread Markus Wanner
On 06/20/2013 04:41 PM, Robert Haas wrote:
 The constant factor is also very small.  Generally, I would expect
 num_worker_processes ~ # CPUs

That assumption might hold for parallel querying, yes. In case of
Postgres-R, it doesn't. In the worst case, i.e. with a 100% write load,
a cluster of n nodes, each with m backends performing transactions, all
of them replicated to all other (n-1) nodes, you end up with ((n-1) * m)
bgworkers. Which is pretty likely to be way above the # CPUs on any
single node.

I can imagine other extensions or integral features like autonomous
transactions that might possibly want many more bgworkers as well.

 and scanning a 32, 64, or even 128
 element array is not a terribly time-consuming operation.

I'd extend that to say scanning an array with a few thousand elements is
not terribly time-consuming, either. IMO the simplicity is worth it,
ATM. It's all relative to your definition of ... eh ... terribly.

.oO( ... premature optimization ... all evil ... )

 We might
 need to re-think this when systems with 4096 processors become
 commonplace, but considering how many other things would also need to
 be fixed to work well in that universe, I'm not too concerned about it
 just yet.

Agreed.

 One thing I think we probably want to explore in the future, for both
 worker backends and regular backends, is pre-forking.  We could avoid
 some of the latency associated with starting up a new backend or
 opening a new connection in that way.  However, there are quite a few
 details to be thought through there, so I'm not eager to pursue that
 just yet.  Once we have enough infrastructure to implement meaningful
 parallelism, we can benchmark it and find out where the bottlenecks
 are, and which solutions actually help most.

Do you mean pre-forking and connecting to a specific database? Or really
just the forking?

 I do think we need a mechanism to allow the backend that requested the
 bgworker to know whether or not the bgworker got started, and whether
 it unexpectedly died.  Once we get to the point of calling user code
 within the bgworker process, it can use any number of existing
 mechanisms to make sure that it won't die without notifying the
 backend that started it (short of a PANIC, in which case it won't
 matter anyway).  But we need a way to report failures that happen
 before that point.  I have some ideas about that, but decided to leave
 them for future passes.  The remit of this patch is just to make it
 possible to dynamically register bgworkers.  Allowing a bgworker to be
 tied to the session that requested it via some sort of feedback loop
 is a separate project - which I intend to tackle before CF2, assuming
 this gets committed (and so far nobody is objecting to that).

Okay, sounds good. Given my background, I considered that a solved
problem. Thanks for pointing it out.

 Sounds like the postmaster is writing to shared memory. Not sure why
 I've been trying so hard to avoid that, though. After all, it can hardly
 hurt itself *writing* to shared memory.
 
 I think there's ample room for paranoia about postmaster interaction
 with shared memory, but all it's doing is setting a flag, which is no
 different from what CheckPostmasterSignal() already does.

Sounds good to me.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements

2013-06-20 Thread Robert Haas
On Wed, Jun 12, 2013 at 3:00 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 6/12/13 1:29 PM, Fabrízio de Royes Mello wrote:
 The attached patch add support to IF NOT EXISTS to CREATE statements
 listed below:

 - CREATE AGGREGATE [ IF NOT EXISTS ] ...
 - CREATE CAST [ IF NOT EXISTS ] ...
 - CREATE COLLATION [ IF NOT EXISTS ] ...
 - CREATE OPERATOR [ IF NOT EXISTS ] ...
 - CREATE TEXT SEARCH {PARSER | DICTIONARY | TEMPLATE | CONFIGURATION} [
 IF NOT EXISTS ] ...
 - CREATE TYPE [ IF NOT EXISTS ] ... [AS [{ENUM | RANGE}] (...)]

 I'm wondering where IF NOT EXISTS and OR REPLACE will meet.

I kind of don't see the point of having IF NOT EXISTS for things that
have OR REPLACE, and am generally in favor of implementing OR REPLACE
rather than IF NOT EXISTS where possible.  The point is usually to get
the object to a known state, and OR REPLACE will generally accomplish
that better than IF NOT EXISTS.  However, if the object has complex
structure (like a table that contains data) then replacing it is a
bad plan, so IF NOT EXISTS is really the best you can do - and it's
still useful, even if it does require more care.

 Btw., I also want REPLACE BUT DO NOT CREATE.

That's a mouthful.  What's it good for?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MVCC catalog access

2013-06-20 Thread Robert Haas
On Thu, Jun 20, 2013 at 10:35 AM, Andres Freund and...@2ndquadrant.com wrote:
 But I'm still on the fence about whether this is really a valid test.

 I think it shows that we need to be careful and do further performance
 evaluations and/or alleviate the pain by making things cheaper (say, a
 ddl counter in shared mem, allowing to cache snapshots for the
 syscache). If that artificial test hadn't shown problems I'd have voted
 for just going ahead and not worry further.

I tried a general snapshot counter that rolls over every time any
transaction commits, but that doesn't help much.  It's a small
improvement on general workloads, but it's not effective against this
kind of hammering.  A DDL counter would be a bit more expensive
because we'd have to insert an additional branch into
GetSnapshotData() while ProcArrayLock is held, but it might be
tolerable.  Do you have code for this (or some portion of it) already?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MVCC catalog access

2013-06-20 Thread Andres Freund
On 2013-06-20 10:58:59 -0400, Robert Haas wrote:
 On Thu, Jun 20, 2013 at 10:35 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  But I'm still on the fence about whether this is really a valid test.
 
  I think it shows that we need to be careful and do further performance
  evaluations and/or alleviate the pain by making things cheaper (say, a
  ddl counter in shared mem, allowing to cache snapshots for the
  syscache). If that artificial test hadn't shown problems I'd have voted
  for just going ahead and not worry further.
 
 I tried a general snapshot counter that rolls over every time any
 transaction commits, but that doesn't help much.  It's a small
 improvement on general workloads, but it's not effective against this
 kind of hammering.  A DDL counter would be a bit more expensive
 because we'd have to insert an additional branch into
 GetSnapshotData() while ProcArrayLock is held, but it might be
 tolerable.

I actually wasn't thinking of adding it at that level. More like just
not fetching a new snapshot in systable_* if a) the global ddl counter
hasn't been increased and b) our transaction hasn't performed any
ddl. Instead, reuse the snapshot we fetched the last time for that
purpose.

The global ddl counter could be increased every time we commit and the
commit has invalidations queued. The local one every time we execute
local cache invalidation messages (i.e. around CommandCounterIncrement).

I think something roughly along those lines should be doable without
adding new overhead to global paths, except maybe some check in
snapmgr.c for that new, longer-living snapshot.
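
To make the idea concrete, a rough sketch of that bookkeeping could look
like this. Every name here (SharedDdlCounter, GetCatalogSnapshot, and so
on) is hypothetical; this is not code from any patch, and a real version
would need proper shared-memory setup and snapshot registration.

/*
 * Sketch of the proposed catalog-snapshot caching.
 */
#include "postgres.h"
#include "utils/snapmgr.h"

/* In shared memory: bumped at commit whenever invalidations were queued. */
static volatile uint64 *SharedDdlCounter;

/* Backend-local: bumped when we execute our own invalidation messages. */
static uint64 LocalDdlCounter = 0;

/* Counter values captured when the cached snapshot was taken. */
static uint64 SnapGlobalDdl = 0;
static uint64 SnapLocalDdl = 0;
static Snapshot CachedCatalogSnapshot = NULL;

static Snapshot
GetCatalogSnapshot(void)
{
    uint64      global = *SharedDdlCounter;

    if (CachedCatalogSnapshot != NULL &&
        global == SnapGlobalDdl &&
        LocalDdlCounter == SnapLocalDdl)
        return CachedCatalogSnapshot;   /* no DDL anywhere since last time */

    /*
     * Someone committed invalidations, or we ran DDL ourselves: take a
     * fresh snapshot.  A real version would also need to register this
     * longer-lived snapshot with snapmgr.c, as noted above.
     */
    CachedCatalogSnapshot = GetTransactionSnapshot();
    SnapGlobalDdl = global;
    SnapLocalDdl = LocalDdlCounter;
    return CachedCatalogSnapshot;
}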

 Do you have code for this (or some portion of it) already?

Nope, sorry.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Frontend/backend protocol improvements proposal (request).

2013-06-20 Thread Dmitriy Igrishin
Hackers,

While developing a C++ client library for Postgres I felt the lack of extra
information in the command tags in the CommandComplete (B) message
for the following commands:
  PREPARE;
  DEALLOCATE;
  DECLARE;
  CLOSE;
  LISTEN;
  UNLISTEN;
  SET;
  RESET.
Namely, for example, users of my library can prepare statements by using
the protocol directly or via the PREPARE command. Since the protocol does
not support prepared-statement deallocation, I wrote a wrapper over the
DEALLOCATE command. The library knows about all prepared statements and
invalidates them automatically when the user calls the deallocate() wrapper.
But users can issue the DEALLOCATE command directly, and in these cases
I need to query the database to get the list of currently prepared
statements whenever a CommandComplete message with the DEALLOCATE command
tag is consumed. Moreover, I need to do it *synchronously*, and this breaks
the asynchronous API.
I propose to include the name of the object in the CommandComplete (B)
message for the above commands.


-- 
// Dmitriy.


Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-20 Thread Andres Freund
On 2013-06-20 13:45:24 +0800, Craig Ringer wrote:
 On 06/12/2013 07:51 PM, Andres Freund wrote:
  On 2013-06-12 19:47:46 +0800, Craig Ringer wrote:
  On 06/12/2013 05:55 PM, Greg Stark wrote:
  On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer cr...@2ndquadrant.com 
  wrote:
  The main thing I'm wondering is how/if to handle backward compatibility 
  with
  the existing NUMERIC and its DECIMAL alias
  If it were 100% functionally equivalent you could just hide the
  implementation internally. Have a bit that indicates which
  representation was stored and call the right function depending.
  That's what I was originally wondering about, but as Tom pointed out it
  won't work. We'd still need to handle scale and precision greater than
  that offered by _Decimal128 and wouldn't know in advance how much
  scale/precision they wanted to preserve. So we'd land up upcasting
  everything to NUMERIC whenever we did anything with it anyway, only to
  then convert it back into the appropriate fixed size decimal type for
  storage.

  Well, you can limit the upcasting to the cases where we would exceed
  the precision.

 How do you determine that for, say, DECIMAL '4' / DECIMAL '3'? Or
 sqrt(DECIMAL '2') ?

Well, the suggestion above was not to actually implement them as
separate types. If you only store the precision inside the Datum you can
limit the upcasting to whatever you need.

 I think a good starting point would be to use the Intel and IBM
 libraries to implement basic DECIMAL32/64/128 to see if they perform
 better than the gcc builtins tested by Pavel by adapting his extension.

Another good thing to investigate early on is whether there's actually a
need for the feature beyond complying with the standard.

  Pretty pointless, and made doubly so by the fact that if we're
  not using a nice fixed-width type and have to support VARLENA we miss
  out on a whole bunch of performance benefits.
  I rather doubt that using a 1byte varlena - which it will be for
  reasonably sized Datums - will be a relevant bottleneck here. Maybe if
  you only have 'NOT NULL', fixed width columns, but even then...
 That's good to know - if I've overestimated the cost of using VARLENA
 for this, that's really quite good news.

From what I remember seeing in profiles, the biggest overhead is that the
short varlenas (not long ones though) frequently need to be copied
around so they are placed at an aligned address. I think with some care
numeric.c could be made to avoid that for the most common cases, which
should speed up things nicely.
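
The copy in question is the one the standard detoast macro makes; the
packed variants read the datum in place. A minimal illustration using the
stock fmgr macros follows; the function name is made up for the example.

/*
 * Illustration of packed vs. aligned varlena access.  The macros are
 * the stock ones from fmgr.h/postgres.h; varlena_len_nocopy is a
 * made-up example function.
 */
#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(varlena_len_nocopy);

Datum
varlena_len_nocopy(PG_FUNCTION_ARGS)
{
    /*
     * PG_DETOAST_DATUM_PACKED() hands back a short-header varlena in
     * place, at whatever alignment it has; plain PG_DETOAST_DATUM()
     * would memcpy it into an aligned, 4-byte-header copy -- the
     * overhead described above.
     */
    struct varlena *v = PG_DETOAST_DATUM_PACKED(PG_GETARG_DATUM(0));

    /* The _ANY_ macros cope with both header formats. */
    PG_RETURN_INT32((int32) VARSIZE_ANY_EXHDR(v));
}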

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dump difference between 9.3 and master after upgrade

2013-06-20 Thread Andrew Dunstan


On 06/20/2013 10:43 AM, Robert Haas wrote:

On Tue, Jun 18, 2013 at 12:18 PM, Andrew Dunstan and...@dunslane.net wrote:

As I was updating my cross version upgrade tester to include support for the
9.3 branch, I noted this dump difference between the dump of the original
9.3 database and the dump of the converted database, which looks odd. Is it
correct?

It looks sketchy to me, but I'm not sure exactly what you're comparing.



Essentially, cross version upgrade testing runs pg_dumpall from the new 
version on the old cluster, runs pg_upgrade, and then runs pg_dumpall on 
the upgraded cluster, and compares the two outputs. This is what we get 
when the new version is HEAD and the old version is 9.3.


The reason this hasn't been caught by the standard same-version upgrade 
tests is that this module uses a more extensive cluster, which has had 
not only the core regression tests run but also all the contrib and pl 
regression tests, and this problem seems to be exposed by the 
postgres_fdw tests.


At first glance it looks to me like pg_dump in binary-upgrade mode is not 
suppressing something that it should be suppressing.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic background workers

2013-06-20 Thread Robert Haas
On Thu, Jun 20, 2013 at 10:59 AM, Markus Wanner mar...@bluegap.ch wrote:
 On 06/20/2013 04:41 PM, Robert Haas wrote:
 The constant factor is also very small.  Generally, I would expect
 num_worker_processes ~ # CPUs

 That assumption might hold for parallel querying, yes. In case of
 Postgres-R, it doesn't. In the worst case, i.e. with a 100% write load,
 a cluster of n nodes, each with m backends performing transactions, all
 of them replicated to all other (n-1) nodes, you end up with ((n-1) * m)
 bgworkers. Which is pretty likely to be way above the # CPUs on any
 single node.

 I can imagine other extensions or integral features like autonomous
 transactions that might possibly want many more bgworkers as well.

Yeah, maybe.  I think in general it's not going to work great to have
zillions of backends floating around, because eventually the OS
scheduler overhead - and the memory overhead - are going to become
pain points.  And I'm hopeful that autonomous transactions can be
implemented without needing to start a new backend for each one,
because that sounds pretty expensive.  Some users of other database
products will expect autonomous transactions to be cheap; aside from
that, cheap is better than expensive.  But we will see.  At any rate I
think your basic point is that people might end up creating a lot more
background workers than I'm imagining, which is certainly a fair
point.

 and scanning a 32, 64, or even 128
 element array is not a terribly time-consuming operation.

 I'd extend that to say scanning an array with a few thousand elements is
 not terribly time-consuming, either. IMO the simplicity is worth it,
 ATM. It's all relative to your definition of ... eh ... terribly.

 .oO( ... premature optimization ... all evil ... )

Yeah, that thing.

 One thing I think we probably want to explore in the future, for both
 worker backends and regular backends, is pre-forking.  We could avoid
 some of the latency associated with starting up a new backend or
 opening a new connection in that way.  However, there are quite a few
 details to be thought through there, so I'm not eager to pursue that
 just yet.  Once we have enough infrastructure to implement meaningful
 parallelism, we can benchmark it and find out where the bottlenecks
 are, and which solutions actually help most.

 Do you mean pre-forking and connecting to a specific database? Or really
 just the forking?

I've considered both at various times, although in this context I was
mostly thinking about just the forking. Pre-connecting to a specific
database would save an unknown but possibly significant amount of
additional latency.  Against that, it's more complex (because we've
got to track which preforked workers are associated with which
databases) and there's some cost to guessing wrong (because then we're
keeping workers around that we can't use, or maybe even having to turn
around and kill them to make slots for the workers we actually need).
I suspect we'll want to pursue the idea at some point but it's not
near the top of my list.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic background workers

2013-06-20 Thread Andres Freund
On 2013-06-20 11:29:27 -0400, Robert Haas wrote:
  Do you mean pre-forking and connecting to a specific database? Or really
  just the forking?
 
 I've considered both at various times, although in this context I was
 mostly thinking about just the forking. Pre-connecting to a specific
 database would save an unknown but possibly significant amount of
 additional latency.  Against that, it's more complex (because we've
 got to track which preforked workers are associated with which
 databases) and there's some cost to guessing wrong (because then we're
 keeping workers around that we can't use, or maybe even having to turn
 around and kill them to make slots for the workers we actually need).
 I suspect we'll want to pursue the idea at some point but it's not
 near the top of my list.

Just as a datapoint, if you benchmark the number of forks per second that
can be performed by a single process (i.e. postmaster), the number is
easily in the tens of thousands. Now forking that much has some
scalability implications inside the kernel, but still.
I'd be surprised if the actual fork is more than 5-10% of the current
cost of starting a new backend.
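
For anyone who wants to reproduce that datapoint, a crude, self-contained
microbenchmark along these lines will do; it is not PostgreSQL code, and
the numbers vary widely by kernel and hardware.

/*
 * Crude fork-rate microbenchmark: forks n children that exit
 * immediately, one at a time, and reports forks per second.
 */
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(void)
{
    const int n = 10000;
    struct timeval start, end;
    double secs;

    gettimeofday(&start, NULL);
    for (int i = 0; i < n; i++)
    {
        pid_t pid = fork();

        if (pid == 0)
            _exit(0);               /* child: cheapest possible case */
        else if (pid < 0)
        {
            perror("fork");
            return 1;
        }
        waitpid(pid, NULL, 0);      /* serialize: one forking parent */
    }
    gettimeofday(&end, NULL);

    secs = (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;
    printf("%d forks in %.2f s (%.0f forks/sec)\n", n, secs, n / secs);
    return 0;
}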

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements

2013-06-20 Thread Peter Eisentraut
On 6/20/13 11:04 AM, Robert Haas wrote:
 I kind of don't see the point of having IF NOT EXISTS for things that
 have OR REPLACE, and am generally in favor of implementing OR REPLACE
 rather than IF NOT EXISTS where possible.

I tend to agree.

  Btw., I also want REPLACE BUT DO NOT CREATE.
 That's a mouthful.  What's it good for?

If you run an upgrade SQL script that is supposed to replace, say, a
bunch of functions with new versions, you'd want the behavior that it
replaces the existing function if it exists, but errors out if it
doesn't, because then you're perhaps connected to the wrong database.

It's a marginal feature, and I'm not going to pursue it, but if someone
wanted to make the CREATE commands fully featured, there is use for this.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread Alvaro Herrera
MauMau escribió:
 First, thank you for the review.
 
 From: Alvaro Herrera alvhe...@2ndquadrant.com
 This seems reasonable.  Why 10 seconds?  We could wait 5 seconds, or 15.
 Is there a rationale behind the 10?  If we said 60, that would fit
 perfectly well within the already existing 60-second loop in postmaster,
 but that seems way too long.
 
 There is no good rationale.  I arbitrarily chose a short period
 because this is an immediate shutdown.  I felt more than 10 seconds
 was long.  I think 5 seconds may be better.  Although not directly
 related to this fix, these influenced my choice:
 
 1. According to the man page of init, init sends SIGKILL to all
 remaining processes 5 seconds after it sends SIGTERM to them.
 
 2. At computer shutdown, Windows proceeds with a forced shutdown after
 waiting 20 seconds for services to terminate.

Well, as for the first of these, I don't think it matters whether
processes get their SIGKILL from postmaster or from init, when a
shutdown or runlevel is changing.  Now, one would think that this sets a
precedent.  I think 20 seconds might be too long; perhaps it's expected
in an operating system that forcibly has a GUI.  I feel safe to ignore
that.

I will go with 5 seconds, then.


 Note that the previous text said that postmaster will send SIGQUIT, then
 terminate without checking anything.  In the new code, postmaster sends
 SIGQUIT, then waits, then SIGKILL, then waits again.  If there's an
 unkillable process (say because it's stuck in a noninterruptible sleep)
 postmaster might never exit.  I think it should send SIGQUIT, then wait,
 then SIGKILL, then exit without checking.
 
 At first I thought the same, but decided not to do that.  The
 purpose of this patch is to make the immediate shutdown reliable.

My point is that there is no difference.  For one thing, once we enter
immediate shutdown state, and sigkill has been sent, no further action
is taken.  Postmaster will just sit there indefinitely until processes
are gone.  If we were to make it repeat SIGKILL until they die, that
would be different.  However, repeating SIGKILL is pointless, because if
they didn't die when they first received it, they will still not die
when they receive it a second time.  Also, if they're in uninterruptible
sleep and don't die, then they will die as soon as they get out of that
state; no further queries will get processed, no further memory access
will be done.  So there's no harm in their remaining there until the
underlying storage returns to life, ISTM.

 Here, "reliable" means that the database server is certainly shut
 down when pg_ctl returns, not telling a lie that "I shut down the
 server processes for you", so you do not have to be worried that some
 postgres process might still remain and write to disk.  I suppose
 reliable shutdown is crucial especially in HA cluster.  If pg_ctl
 stop -mi gets stuck forever when there is an unkillable process (in
 what situations does this happen? OS bug, or NFS hard mount?), I
 think the DBA has to notice this situation from the unfinished
 pg_ctl, investigate the cause, and take corrective action.

So you're suggesting that keeping postmaster up is a useful sign that
the shutdown is not going well?  I'm not really sure about this.  What
do others think?

 I have tweaked the patch a bit and I'm about to commit as soon as we
 resolve the above two items.
 
 I appreciate your tweaking, especially in the documentation and
 source code comments, as English is not my mother tongue.

IIRC the only other interesting tweak I did was rename the
SignalAllChildren() function to TerminateChildren().  I did this because
it doesn't really signal all children; syslogger and dead_end backends
are kept around.  So the original name was a bit misleading.  And we
couldn't really name it SignalAlmostAllChildren(), could we ..

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench --startup option

2013-06-20 Thread Jeff Janes
On Sun, Jun 16, 2013 at 9:42 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Thu, May 2, 2013 at 11:25 AM, Fabien COELHO coe...@cri.ensmp.frwrote:





 Thanks for doing the review.  I'm not sure what things to go change
 without further feedback/discussion, except point 4.  I'll wait a day to
 see if I get more feedback on the other issues and submit a new patch.



I've fixed a conflict, and I've removed extraneous semicolons from the C.

I've left in the fixing of some existing bad indenting in the existing
code, which is not strictly related to my change.

I hope my defenses of the other points were persuasive.

Cheers,

Jeff


pgbench_startup-v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix pgstattuple/pgstatindex to use regclass-type as the argument

2013-06-20 Thread Fujii Masao
On Thu, Jun 20, 2013 at 11:43 AM, Satoshi Nagayasu sn...@uptime.jp wrote:
 (2013/06/17 4:02), Fujii Masao wrote:

 On Sat, Mar 9, 2013 at 3:23 PM, Satoshi Nagayasu sn...@uptime.jp wrote:

 It is obviously easy to keep two types of function interfaces,
 one with regclass-type and another with text-type, in the next
 release for backward-compatibility like below:

 pgstattuple(regclass)  -- safer interface.
 pgstattuple(text)  -- will be depreciated in the future release.


 So you're thinking to remove pgstattuple(oid) soon?


 AFAIK, a regclass type argument would accept an OID value,
 which means the regclass type is upward-compatible with the
 oid type.

 So, even if the declaration is changed, compatibility could
 be kept actually. This test case (in sql/pgstattuple.sql)
 confirms that.

 
 select * from pgstatindex('myschema.test_pkey'::regclass::oid);
  version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
 ---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
        2 |          0 |       8192 |             1 |              0 |          1 |           0 |             0 |             0.79 |                  0
 (1 row)
 


 Having both interfaces for a while would allow users to have enough
 time to rewrite their applications.

 Then, we will be able to obsolete (or just drop) old interfaces
 in the future release, maybe 9.4 or 9.5. I think this approach
 would minimize an impact of such interface change.

 So, I think we can clean up function arguments in the pgstattuple
 module, and also we can have two interfaces, both regclass and text,
 for the next release.

 Any comments?


 In the document, you should mark old functions as deprecated.


 I'm still considering changing the function name as Tom pointed
 out. How about pgstatbtindex?

Or just adding pgstatindex(regclass)?

 In fact, pgstatindex supports only BTree indexes.
 So, pgstatbtindex seems to be more appropriate for this function.

Can most ordinary users realize "bt" means "btree"?

 We can keep having both (old) pgstatindex and (new) pgstatbtindex
 during next 2-3 major releases, and the old one will be deprecated
 after that.

Since pg_relpages(oid) doesn't exist, pg_relpages() is in the same
situation as pgstatindex(), i.e., we cannot just replace pg_relpages(text)
with pg_relpages(regclass) for backward-compatibility reasons. How do you
think we should solve the pg_relpages() problem? Rename? Just
add pg_relpages(regclass)?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench --startup option

2013-06-20 Thread Fabien COELHO



I've fixed a conflict, and I've removed extraneous semicolons from the C.

I've left in the fixing of some existing bad indenting in the existing
code, which is not strictly related to my change.


There are still unrelated changes: spacing on the -c and -t options' help.
The pgindent command is passed on the sources from time to time, so
there should be no reason to change this in this commit.


The updated string for PQerrorMessage does not bring much, and the message
does not seem an improvement. "Command failed with ERROR", indeed.


  Command failed with ERROR:  syntax error at or near ;
  LINE 1: set synchronous_commit=on;set synchronous_;

The preceding result seems both simpler and fine:

  ERROR:  syntax error at or near ;
  LINE 1: set synchronous_commit=on;set synchronous_;

Otherwise I've tested the patch with one set, two sets and a syntax 
error, and it worked as expected.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)

2013-06-20 Thread Bruce Momjian
On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote:
  But, couldn't that be solved by deprecating that function and
  providing a more sensible alternatively named version?
 
  And what would you name that function?  array_dims2?  I can't think of
  a name that makes the difference in behaviour apparent.  Can you
  imagine the documentation for that?
 
 I don't know the answer to that, but I think it's hard to argue that
 deprecating and documenting a few functions is a heavier burden on
 your users than having to sift through older arcane code before
 upgrading to the latest version of the database.  We're not the only
 ones stuck with lousy old functions (C finally ditched gets() in the
 2011 standard).  I also happen to think the current array_api function
 names are not particularly great (especially array_upper/array_lower)
 so I won't shed too many tears.

Sorry to be late on this, but are you saying people have code that is
testing:

select array_dims('{}'::int[])

for a NULL return, and they would need to change that to test for zero?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-20 Thread Dean Rasheed
On 17 June 2013 06:36, David Fetter da...@fetter.org wrote:
   Please find attached two versions of a patch which provides optional
   FILTER clause for aggregates (T612, Advanced OLAP operations).
  
   The first is intended to be applied on top of the previous patch, the
   second without it.  The first is, I believe, clearer in what it's
   doing.  Rather than simply mechanically visiting every place a
   function call might be constructed, it visits a central one to change
   the default, then goes only to the places where it's relevant.
  
   The patches are both early WIP as they contain no docs or regression
   tests yet.
 
  Docs and regression tests added, makeFuncArgs approach dropped for
  now, will re-visit later.

 Regression tests added to reflect bug fixes in COLLATE.

 Rebased vs. master.


Hi,

I've been looking at this patch, which adds support for the SQL
standard feature of applying a filter to rows used in an aggregate.
The syntax matches the spec:

  agg_fn ( args [ ORDER BY sort_clause ] ) [ FILTER ( WHERE qual ) ]

and this patch makes FILTER a new reserved keyword, usable as a
function or type, but not in other contexts, e.g., as a table name or
alias.

I'm not sure if that might be a problem for some people, but I can't
see any way round it, since otherwise there would be no way for the
parser to distinguish a filter clause from an alias expression.


The feature appears to work per-spec, and the code and doc changes
look reasonable. However, it looks a little light on regression tests,
and so I think some necessary changes have been overlooked.

In my testing of sub-queries in the FILTER clause (an extension to the
spec), I was able to produce the following error:

CREATE TABLE t1(a1 int);
CREATE TABLE t2(a2 int);
INSERT INTO t1 SELECT * FROM generate_series(1,10);
INSERT INTO t2 SELECT * FROM generate_series(3,6);

SELECT array_agg(a1) FILTER (WHERE a1 IN (SELECT a2 FROM t2)) FROM t1;

ERROR:  plan should not reference subplan's variable

which looks to be related to subselect.c's handling of sub-queries in
aggregates.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)

2013-06-20 Thread Merlin Moncure
On Thu, Jun 20, 2013 at 2:58 PM, Bruce Momjian br...@momjian.us wrote:
 On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote:
  But, couldn't that be solved by deprecating that function and
  providing a more sensible alternatively named version?
 
  And what would you name that function?  array_dims2?  I can't think of
  a name that makes the difference in behaviour apparent.  Can you
  imagine the documentation for that?

 I don't know the answer to that, but I think it's hard to argue that
 deprecating and documenting a few functions is a heavier burden on
 your users than having to sift through older arcane code before
 upgrading to the latest version of the database.  We're not the only
 ones stuck with lousy old functions (C finally ditched gets() in the
 2011 standard).  I also happen to think the current array_api function
 names are not particularly great (especially array_upper/array_lower)
 so I won't shed too many tears.

 Sorry to be late on this, but are you saying people have code that is
 testing:

 select array_dims('{}'::int[])

 for a NULL return, and they would need to change that to test for zero?

Kinda -- what I'm saying is you just don't go around changing function
behaviors to make them 'better' unless the affected behavior was
specifically reserved as undefined.  The fact is nobody knows how many
users will be affected and the extent of the ultimate damage (pro tip:
it's always more and worse than expected); I'm astonished it's even
being considered.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread MauMau

From: Alvaro Herrera alvhe...@2ndquadrant.com

I will go with 5 seconds, then.


OK, I agree.



My point is that there is no difference.  For one thing, once we enter
immediate shutdown state, and sigkill has been sent, no further action
is taken.  Postmaster will just sit there indefinitely until processes
are gone.  If we were to make it repeat SIGKILL until they die, that
would be different.  However, repeating SIGKILL is pointless, because it
they didn't die when they first received it, they will still not die
when they receive it second.  Also, if they're in uninterruptible sleep
and don't die, then they will die as soon as they get out of that state;
no further queries will get processed, no further memory access will be
done.  So there's no harm in they remaining there until underlying
storage returns to life, ISTM.


Here, "reliable" means that the database server is certainly shut
down when pg_ctl returns, not telling a lie that "I shut down the
server processes for you", so you do not have to be worried that some
postgres process might still remain and write to disk.  I suppose
reliable shutdown is crucial especially in HA cluster.  If pg_ctl
stop -mi gets stuck forever when there is an unkillable process (in
what situations does this happen? OS bug, or NFS hard mount?), I
think the DBA has to notice this situation from the unfinished
pg_ctl, investigate the cause, and take corrective action.


So you're suggesting that keeping postmaster up is a useful sign that
the shutdown is not going well?  I'm not really sure about this.  What
do others think?


I think you are right, and there is no harm in leaving postgres processes in 
an unkillable state.  I'd like to leave the decision to you and/or others.


One concern is that umount would fail in such a situation because postgres 
has some open files on the filesystem, which is on the shared disk in case 
of a traditional HA cluster.  However, STONITH should resolve the problem by 
terminating the stuck node...  I just feel it is strange for umount to fail 
due to remaining postgres processes, because pg_ctl stop -mi reported success.



IIRC the only other interesting tweak I did was rename the
SignalAllChildren() function to TerminateChildren().  I did this because
it doesn't really signal all children; syslogger and dead_end backends
are kept around.  So the original name was a bit misleading.  And we
couldn't really name it SignalAlmostAllChildren(), could we ..


I see.  thank you.

Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] changeset generation v5-01 - Patches git tree

2013-06-20 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 0007: Adjust Satisfies* interface: required, mechanical,

 Version v5-01 attached

I'm still working on a review and hope to post something more
substantive by this weekend, but when applying patches in numeric
order, this one did not compile cleanly.

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-I../../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2   -c -o 
allpaths.o allpaths.c -MMD -MP -MF .deps/allpaths.Po
vacuumlazy.c: In function ‘heap_page_is_all_visible’:
vacuumlazy.c:1725:3: warning: passing argument 1 of ‘HeapTupleSatisfiesVacuum’ 
from incompatible pointer type [enabled by default]
In file included from vacuumlazy.c:61:0:
../../../src/include/utils/tqual.h:84:20: note: expected ‘HeapTuple’ but 
argument is of type ‘HeapTupleHeader’

Could you post a new version of that?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] isolationtester and 'specs' subdirectory

2013-06-20 Thread Josh Kupershmidt
Hi all,
I have a Debian machine with gcc 4.7.2-5 where make check-world fails
in the isolation check, like so:

...
make[2]: Leaving directory `/home/josh/src/postgresql/src/test/regress'
make -C isolation check

[snip]

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g -I.
-I../../../src/interfaces/libpq -I./../regress -I../../../src/include
-D_GNU_SOURCE   -c -o isolation_main.o isolation_main.c
gcc: error: ./specs: Is a directory
make[2]: *** [isolation_main.o] Error 1
...

I eventually tracked down the cause of this failure to a trailing ':'
in my $LIBRARY_PATH, which causes gcc to look inside the current
directory for a 'specs' file [1] among other things. Although I
probably don't need that trailing ':', it seems like we should avoid
naming this directory 'specs' nonetheless to avoid confusion with gcc.

Renaming the 'specs' directory to something like 'isolation_specs' and
adjusting isolation_main.c accordingly lets me pass `make
check-world`. Proposed patch attached.

Josh

[1] http://gcc.gnu.org/ml/gcc-help/2010-05/msg00292.html


rename_specs.diff.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)

2013-06-20 Thread Noah Misch
On Thu, Jun 20, 2013 at 12:33:25PM -0400, Alvaro Herrera wrote:
MauMau escribió:
  Here, "reliable" means that the database server is certainly shut
  down when pg_ctl returns, not telling a lie that "I shut down the
  server processes for you", so you do not have to be worried that some
  postgres process might still remain and write to disk.  I suppose
  reliable shutdown is crucial especially in HA cluster.  If pg_ctl
  stop -mi gets stuck forever when there is an unkillable process (in
  what situations does this happen? OS bug, or NFS hard mount?), I
  think the DBA has to notice this situation from the unfinished
  pg_ctl, investigate the cause, and take corrective action.
 
 So you're suggesting that keeping postmaster up is a useful sign that
 the shutdown is not going well?  I'm not really sure about this.  What
 do others think?

It would be valuable for "pg_ctl -w -m immediate stop" to have the property
that a subsequent start attempt will not fail due to the presence of some
backend still attached to shared memory.  (Maybe that's true anyway or can be
achieved a better way; I have not investigated.)

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)

2013-06-20 Thread Bruce Momjian
On Thu, Jun 20, 2013 at 03:33:24PM -0500, Merlin Moncure wrote:
 On Thu, Jun 20, 2013 at 2:58 PM, Bruce Momjian br...@momjian.us wrote:
  On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote:
   But, couldn't that be solved by deprecating that function and
   providing a more sensible alternatively named version?
  
   And what would you name that function?  array_dims2?  I can't think of
   a name that makes the difference in behaviour apparent.  Can you
   imagine the documentation for that?
 
  I don't know the answer to that, but I think it's hard to argue that
  deprecating and documenting a few functions is a heavier burden on
  your users than having to sift through older arcane code before
  upgrading to the latest version of the database.  We're not the only
  ones stuck with lousy old functions (C finally ditched gets() in the
  2011 standard).  I also happen to think the current array_api function
  names are not particularly great (especially array_upper/array_lower)
  so I won't shed too many tears.
 
  Sorry to be late on this, but are you saying people have code that is
  testing:
 
  select array_dims('{}'::int[])
 
  for a NULL return, and they would need to change that to test for zero?
 
 Kinda -- what I'm saying is you just don't go around changing function
 behaviors to make them 'better' unless the affected behavior was
 specifically reserved as undefined.  The fact is nobody knows how many
 users will be affected and the extent of the ultimate damage (pro tip:
 it's always more and worse than expected); I'm astonished it's even
 being considered.

Well, I think the question is how many people have such arrays that will
be affected.  If we don't do something, we live with this odd behavior
forever.  We have been willing to make some bold decisions in the past
to improve user experience, and it mostly has worked out well.  I
disagree that it is always worse than expected.
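
For anyone following along, the behavior in question is easy to see in a
fresh session (a quick illustration of the status quo, not of any proposed
change):

-- current behavior: empty arrays report NULL dimensions, not zero
SELECT array_dims('{}'::int[]);       -- NULL
SELECT array_length('{}'::int[], 1);  -- NULL rather than 0
SELECT array_upper('{}'::int[], 1);   -- NULL
-- compare with a one-element array
SELECT array_dims(ARRAY[42]);         -- [1:1]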

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)

2013-06-20 Thread Merlin Moncure
On Thu, Jun 20, 2013 at 6:40 PM, Bruce Momjian br...@momjian.us wrote:
 Kinda -- what I'm saying is you just don't go around changing function
 behaviors to make them 'better' unless the affected behavior was
 specifically reserved as undefined.  The fact is nobody knows how many
 users will be affected and the extent of the ultimate damage (pro tip:
 it's always more and worse than expected); I'm astonished it's even
 being considered.

 Well, I think the question is how many people have such arrays that will
 be affected.  If we don't do something, we live with this odd behavior
 forever.  We have been willing to make some bold decisions in the past
 to improve user experience, and it mostly has worked out well.  I
 disagree that it is always worse than expected.

Well, you can have the last word (although 'bold' was an interesting
word choice, heh)  -- I feel guilty enough about beating up Brendan
already.  I feel this way every time compatibility changes come up, so
it's nothing specific to this patch really.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)

2013-06-20 Thread Bruce Momjian
On Thu, Jun 20, 2013 at 07:13:48PM -0500, Merlin Moncure wrote:
 On Thu, Jun 20, 2013 at 6:40 PM, Bruce Momjian br...@momjian.us wrote:
  Kinda -- what I'm saying is you just don't go around changing function
  behaviors to make them 'better' unless the affected behavior was
  specifically reserved as undefined.  The fact is nobody knows how many
  users will be affected and the extent of the ultimate damage (pro tip:
  it's always more and worse than expected); I'm astonished it's even
  being considered.
 
  Well, I think the question is how many people have such arrays that will
  be affected.  If we don't do something, we live with this odd behavior
  forever.  We have been willing to make some bold decisions in the past
  to improve user experience, and it mostly has worked out well.  I
  disagree that it is always worse than expected.
 
 Well, you can have the last word (although 'bold' was an interesting
 word choice, heh)  -- I feel guilty enough about beating up Brendan
 already.  I feel this way every time compatibility changes come up, so
 it's nothing specific to this patch really.

Well, sometimes we underestimate the impact of changes, sometimes we
overestimate.  The big problem is that we weigh the short-term problems of
a change but not its long-term benefit.  This array problem
goes back to at least 2008:

http://www.postgresql.org/message-id/28026.1224611...@sss.pgh.pa.us

so we have at least five years of confusion by not changing it then.  I
am not saying we need to change it, but do think we need to weigh both
issues.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)

2013-06-20 Thread Josh Berkus
Bruce,

 Well, sometimes we underestimate the impact of changes, sometimes we
 overestimate.  The big problem is weighing the short-term problems of
 change but not the long-term benefit of a change.  This array problem
 goes back to at least 2008:
 
   http://www.postgresql.org/message-id/28026.1224611...@sss.pgh.pa.us
 
 so we have at least five years of confusion by not changing it then.  I
 am not saying we need to change it, but do think we need to weigh both
 issues.

As much as I hate the current behavior (my first response was yeah, fix
those babies!), I think we have no choice but to create new function
names and then wait three years to deprecate the old ones.
We really can't afford to put obstacles in the way of people upgrading,
especially over an issue as minor as this one.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Why can't I use windowing functions over ordered aggregates?

2013-06-20 Thread Josh Berkus
Hackers,

So, I can create a custom aggregate first() and do this:

SELECT first(val order by ts desc) ...

And I can do this:

SELECT first_value(val) OVER (order by ts desc)

... but I can't do this:

SELECT first_value(val order by ts desc)

... even though under the hood, it's the exact same operation.
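
For reference, the first form above relies on a hand-rolled aggregate along
these lines (a minimal sketch of the generic first() definition that
circulates on the PostgreSQL wiki; first_agg is just a conventional name):

-- transition function: keep the first value seen, ignore the rest
CREATE OR REPLACE FUNCTION first_agg(anyelement, anyelement)
RETURNS anyelement
LANGUAGE sql IMMUTABLE STRICT AS 'SELECT $1';

CREATE AGGREGATE first (anyelement) (
    sfunc = first_agg,
    stype = anyelement
);

-- now the ordered-aggregate form works:
-- SELECT first(val ORDER BY ts DESC) FROM some_table;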

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays (Was: Re: Should array_length() Return NULL)

2013-06-20 Thread Bruce Momjian
On Thu, Jun 20, 2013 at 06:28:07PM -0700, Josh Berkus wrote:
 Bruce,
 
  Well, sometimes we underestimate the impact of changes, sometimes we
  overestimate.  The big problem is weighing the short-term problems of
  change but not the long-term benefit of a change.  This array problem
  goes back to at least 2008:
  
  http://www.postgresql.org/message-id/28026.1224611...@sss.pgh.pa.us
  
  so we have at least five years of confusion by not changing it then.  I
  am not saying we need to change it, but do think we need to weigh both
  issues.
 
 As much as I hate the current behavior (my first response was yeah, fix
 those babies!), I think we have no choice but to create new function
 names and then wait three years to deprecate the old ones.
 We really can't afford to put obstacles in the way of people upgrading,
 especially over an issue as minor as this one.

Perhaps we need to mark the TODO item as "will not fix".

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] trgm regex index peculiarity

2013-06-20 Thread Erik Rijkers
9.4devel (but same in 9.3)

In a 112 MB test table (containing randomly generated text) with a trgm index 
(gin_trgm_ops), I consistently get these timings:

select txt from azjunk6 where txt ~ '^abcd';
   130 ms


select txt from azjunk6
where txt ~ 'abcd' and substr(txt,1,4) = 'abcd';
   3 ms

(a similar performance difference occurs when using a regex, i.e. 'abc[de]'  )

This difference is so large that I wonder if there is not something wrong in 
the first case. (The returned results are correct, though.)

Here are the two explains:

explain analyze select txt from azjunk6 where txt ~ '^abcd';
 QUERY PLAN
-
 Bitmap Heap Scan on azjunk6  (cost=108.78..484.93 rows=100 width=81) (actual 
time=129.557..129.742 rows=1 loops=1)
   Recheck Cond: (txt ~ '^abcd'::text)
   Rows Removed by Index Recheck: 17
   -  Bitmap Index Scan on azjunk6_trgm_re_idx  (cost=0.00..108.75 rows=100 
width=0) (actual time=129.503..129.503 rows=18
loops=1)
 Index Cond: (txt ~ '^abcd'::text)
 Total runtime: 130.008 ms
(6 rows)

explain analyze select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) 
= 'abcd';
   QUERY PLAN
-
 Bitmap Heap Scan on azjunk6  (cost=56.75..433.40 rows=1 width=81) (actual 
time=2.064..3.379 rows=1 loops=1)
   Recheck Cond: (txt ~ 'abcd'::text)
   Rows Removed by Index Recheck: 14
   Filter: (substr(txt, 1, 4) = 'abcd'::text)
   Rows Removed by Filter: 112
   -  Bitmap Index Scan on azjunk6_trgm_re_idx  (cost=0.00..56.75 rows=100 
width=0) (actual time=1.911..1.911 rows=127
loops=1)
 Index Cond: (txt ~ 'abcd'::text)
 Total runtime: 3.409 ms
(8 rows)


The results in both cases are correct, but does this difference not almost 
amount to a bug?

(Interestingly, the variant WHERE txt ~ 'abcd$'
is as fast as the non-anchored variant.)
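
A self-contained setup along these lines should reproduce the pattern (a
sketch only: it assumes the pg_trgm extension is installed, and uses md5
hex text as a stand-in for the original random data, so absolute timings
will differ):

-- build a disposable test table with a trigram GIN index
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE TABLE azjunk6 AS
    SELECT md5(random()::text) || md5(random()::text) AS txt
    FROM generate_series(1, 500000);
CREATE INDEX azjunk6_trgm_re_idx ON azjunk6 USING gin (txt gin_trgm_ops);
ANALYZE azjunk6;

-- slow case: left-anchored regex pushed into the index
EXPLAIN ANALYZE SELECT txt FROM azjunk6 WHERE txt ~ '^abcd';
-- fast case: unanchored regex plus an explicit prefix filter
EXPLAIN ANALYZE SELECT txt FROM azjunk6
WHERE txt ~ 'abcd' AND substr(txt, 1, 4) = 'abcd';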


Thanks,

Erik Rijkers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-20 Thread Jeff Davis
On Sun, 2013-06-16 at 23:53 -0500, Jon Nelson wrote:
 Please find attached v5 of the patch, with the above correction in place.
 The only change from the v4 patch is wrapping the if
 (wal_use_fallocate) block in an #ifdef USE_POSIX_FALLOCATE.
 Thanks!

Thank you.

Greg, are you still working on those tests?

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-20 Thread Craig Ringer
Hi all

Since 8.4, PostgreSQL has had extremely useful window function support -
but support for RANGE PRECEDING / FOLLOWING windows was dropped late
in 8.4's development in order to get the rest of the feature in, per
http://archives.postgresql.org/pgsql-hackers/2010-02/msg00540.php.

It looks like there was discussion of requiring a new opclass to be
declared for types or otherwise extending opclasses to provide the
information required for RANGE ... PRECEDING / FOLLOWING (
http://www.postgresql.org/message-id/20100211201444.ga28...@svana.org )
. I can't find any sign that it went anywhere beyond some broad
discussion:
http://www.postgresql.org/message-id/13993.1265920...@sss.pgh.pa.us at
the time.

I've missed this feature more than once, and am curious about whether
any more recent changes may have made it cleaner to tackle this, or
whether consensus can be formed on adding the new entries to btree's
opclass to avoid the undesirable explicit lookups of the '+' and '-'
operators.

Some question seems to remain open about how ranges over
timestamps/intervals should work, but this wasn't elaborated on.

There's been interest in this, eg:

http://pgsql.hackers.free-usenet.eu/[HACKERS]-range-intervals-in-window-function-frames_T66085695_S1

http://grokbase.com/t/postgresql/pgsql-general/105a89gm2n/postgresql-9-0-support-for-range-value-preceding-window-functions
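
To make the missing piece concrete, this is the kind of query the feature
would allow (SQL-standard syntax that PostgreSQL currently rejects; the
measurements table and its columns are purely illustrative):

-- one-hour moving average; the interval-valued RANGE offset is the
-- unsupported part
SELECT ts,
       avg(val) OVER (ORDER BY ts
                      RANGE BETWEEN '1 hour'::interval PRECEDING
                            AND CURRENT ROW) AS hourly_avg
FROM measurements;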



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-20 Thread Ian Link
I am currently looking into this feature. However, as I am quite new to
Postgres, I think it might take me a while to get up to speed. Anyway, I
would also appreciate another round of discussion on the future of the
windowing functions.

Ian Link






[HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)

2013-06-20 Thread Alvaro Herrera
Noah Misch escribió:
 On Thu, Jun 20, 2013 at 12:33:25PM -0400, Alvaro Herrera wrote:
  MauMau escribió:
   Here, reliable means that the database server is certainly shut
   down when pg_ctl returns, not telling a lie that I shut down the
   server processes for you, so you do not have to be worried that some
   postgres process might still remain and write to disk.  I suppose
   reliable shutdown is crucial especially in HA cluster.  If pg_ctl
   stop -mi gets stuck forever when there is an unkillable process (in
   what situations does this happen? OS bug, or NFS hard mount?), I
   think the DBA has to notice this situation from the unfinished
   pg_ctl, investigate the cause, and take corrective action.
  
  So you're suggesting that keeping postmaster up is a useful sign that
  the shutdown is not going well?  I'm not really sure about this.  What
  do others think?
 
 It would be valuable for pg_ctl -w -m immediate stop to have the property
 that a subsequent start attempt will not fail due to the presence of some
 backend still attached to shared memory.  (Maybe that's true anyway or can be
 achieved a better way; I have not investigated.)

Well, the only case where a process that's been SIGKILLed does not go
away, as far as I know, is when it is in some uninterruptible sleep due
to in-kernel operations that get stuck.  Personally I have never seen
this happen in any other case than some network filesystem getting
disconnected, or a disk that doesn't respond.  And whenever the
filesystem starts to respond again, the process gets out of its sleep
only to die due to the signal.

So a subsequent start attempt will either find that the filesystem is
not responding, in which case it'll probably fail to work properly
anyway (presumably the filesystem corresponds to part of the data
directory), or that it has revived in which case the old backends have
already gone away.

If we leave postmaster running after SIGKILLing its children, the only
thing we can do is have it continue to SIGKILL processes continuously
every few seconds until they die (or just sit around doing nothing until
they all die).  I don't think this will have a different effect than
postmaster going away trusting the first SIGKILL to do its job
eventually.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-20 Thread Craig Ringer
On 06/21/2013 10:31 AM, Ian Link wrote:
 I am currently looking into this feature. However, as I am quite new to
 Postgres, I think it might take me a while to get up to speed. Anyway,
 I would also appreciate another round of discussion on the future of the
 windowing functions.

Good to know, and welcome.

I hope the links to the archived discussions on the matter were useful
to you.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread Alvaro Herrera
MauMau escribió:
 From: Alvaro Herrera alvhe...@2ndquadrant.com

 One concern is that umount would fail in such a situation because
 postgres has some open files on the filesystem, which is on the
 shared disk in the case of a traditional HA cluster.

See my reply to Noah.  If postmaster stays around, would this be any
different?  I don't think so.

 IIRC the only other interesting tweak I did was rename the
 SignalAllChildren() function to TerminateChildren().  I did this because
 it doesn't really signal all children; syslogger and dead_end backends
 are kept around.  So the original name was a bit misleading.  And we
 couldn't really name it SignalAlmostAllChildren(), could we ..
 
 I see.  thank you.

Actually, in further testing I noticed that the fast-path you introduced
in BackendCleanup (or was it HandleChildCrash?) in the immediate
shutdown case caused postmaster to fail to clean up properly after
sending the SIGKILL signal, so I had to remove that as well.  Was there
any reason for that?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-20 Thread Ian Link
Thanks! The discussions have been useful, although I am currently just
reviewing the code. I think a good starting point will be to
refactor/improve the WinGetFuncArgInPartition and WinGetFuncArgInFrame
functions. Tom Lane wrote this about them before committing the patch:

I'm not terribly happy with the changes you made in
WinGetFuncArgInPartition and WinGetFuncArgInFrame to force the window
function mark to not go past frame start in some modes.  Not only is
that pretty ugly, but I think it can mask bugs in window functions:
it's an error for a window function to fetch a row before what it has
set its mark to be, but in some cases that wouldn't be detected because
of this change.  I think it would be better to revert those changes and
find another method of protecting fetches needed to determine the frame
head.  One idea is to create a separate read pointer that tracks the
frame head whenever actual fetches of the frame head might be needed by
update_frameheadpos.  I committed it without changing that, but I think
this should be revisited before trying to add the RANGE value
PRECEDING/FOLLOWING options, because those will substantially expand
the number of cases where that hack affects the behavior.

I am honestly not 100% certain why these functions have issues, but this
seems a good place to start investigating.

Ian Link






Re: [HACKERS] single-user vs standalone in docs and messages

2013-06-20 Thread Peter Eisentraut
On Thu, 2013-06-13 at 15:10 -0700, Jeff Janes wrote:
 Some places in the docs and elog hints refer to standalone backends,
 while the official name as used in app-postgres.html is single-user
 mode, and in fact standalone does not appear on that page.

 This tries to standardize the other locations to use single-user.  I
 think I did the right thing with the message translation files, but I
 can't figure out how to test that.
 
 I made no attempt to change code-comments, just the user-facing
 parts. 

committed with some adjustments



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] trgm regex index peculiarity

2013-06-20 Thread Tom Lane
Erik Rijkers e...@xs4all.nl writes:
 In a 112 MB test table (containing randomly generated text) with a trgm index 
 (gin_trgm_ops), I consistently get these timings:
 select txt from azjunk6 where txt ~ '^abcd';
130 ms
 select txt from azjunk6
 where txt ~ 'abcd' and substr(txt,1,4) = 'abcd';
3 ms

Hm, could you provide a self-contained test case?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-20 Thread David Fetter
On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:
 On 17 June 2013 06:36, David Fetter da...@fetter.org wrote:
Please find attached two versions of a patch which provides optional
FILTER clause for aggregates (T612, Advanced OLAP operations).
   
The first is intended to be applied on top of the previous patch, the
second without it.  The first is, I believe, clearer in what it's
doing.  Rather than simply mechanically visiting every place a
function call might be constructed, it visits a central one to change
the default, then goes only to the places where it's relevant.
   
The patches are both early WIP as they contain no docs or regression
tests yet.
  
   Docs and regression tests added, makeFuncArgs approached dropped for
   now, will re-visit later.
 
  Regression tests added to reflect bug fixes in COLLATE.
 
  Rebased vs. master.
 
 Hi,
 
 I've been looking at this patch, which adds support for the SQL
 standard feature of applying a filter to rows used in an aggregate.
 The syntax matches the spec:
 
   agg_fn ( args [ ORDER BY sort_clause ] ) [ FILTER ( WHERE qual ) ]
 
 and this patch makes FILTER a new reserved keyword, usable as a
 function or type, but not in other contexts, e.g., as a table name or
 alias.
 
 I'm not sure if that might be a problem for some people, but I can't
 see any way round it, since otherwise there would be no way for the
 parser to distinguish a filter clause from an alias expression.
 
 The feature appears to work per-spec, and the code and doc changes
 look reasonable. However, it looks a little light on regression tests,

What tests do you think should be there that aren't?

 and so I think some necessary changes have been overlooked.
 
 In my testing of sub-queries in the FILTER clause (an extension to the
 spec), I was able to produce the following error:

Per the spec, 

B) A filter clause shall not contain a query expression, a window 
function, or an outer reference.

I'd be happy to see about adding one despite this, though.  That the
spec's authors didn't figure out how doesn't mean we shouldn't
eventually, ideally in time for 9.4.
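
For anyone skimming the thread, the spec-conformant core of the feature
looks like this (the emp table and its columns are hypothetical,
illustration only):

-- conditional aggregation without wrapping everything in CASE
SELECT dept,
       count(*)                               AS headcount,
       count(*) FILTER (WHERE salary > 50000) AS well_paid
FROM emp
GROUP BY dept;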

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Pglife and back-branch commits

2013-06-20 Thread Bruce Momjian
Alvaro asked me to add the display of post minor-release commits to
PgLife (http://pglife.momjian.us/).  I have added a link to that
information as a plus-sign after each minor release number, e.g. Stable:
9.2.4+, 9.1.9+, 9.0.13+, 8.4.17+.

Hopefully this will help us schedule minor releases.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-20 Thread Alvaro Herrera
David Fetter escribió:
 On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:

  In my testing of sub-queries in the FILTER clause (an extension to the
  spec), I was able to produce the following error:
 
 Per the spec, 
 
 B) A filter clause shall not contain a query expression, a window 
 function, or an outer reference.

If this is not allowed, I think there should be a clearer error message.
What Dean showed is an internal (not user-facing) error message.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-20 Thread David Fetter
On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote:
 David Fetter escribió:
  On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:
 
   In my testing of sub-queries in the FILTER clause (an extension
   to the spec), I was able to produce the following error:
  
  Per the spec, 
  
  B) A filter clause shall not contain a query expression, a
  window function, or an outer reference.
 
 If this is not allowed, I think there should be a clearer error
 message.  What Dean showed is an internal (not user-facing) error
 message.

Folding to lower isn't allowed by the spec either, and then there's
the matter of arrays...

Before going to lots of trouble to figure out how to throw an error
that says, only the spec and no further, I'd like to investigate how
to make this work.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-20 Thread Jeff Davis
On Thu, 2013-06-20 at 10:03 -0400, Robert Haas wrote:
 I think the question is whether this feature is really worth adding
 new reserved keywords for.  I have a hard time saying we shouldn't
 support something that's part of the SQL standard, but personally,
 it's not something I've seen come up prior to this thread.

What's the next step here?

The feature sounds useful to me. If the grammar is unacceptable, does
someone have an alternative idea, like using new function names instead
of grammar? If so, what are reasonable names to use?

Also, I think someone mentioned this already, but what about
first_value() and last_value()? Shouldn't we do those at the same time?
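
For context, the SQL-standard null treatment syntax at issue looks like
this (not accepted by stock PostgreSQL today; table t is illustrative):

-- IGNORE NULLS skips over NULL inputs; RESPECT NULLS is the default
SELECT ts,
       lag(val)  IGNORE NULLS  OVER (ORDER BY ts) AS prev_non_null,
       lead(val) RESPECT NULLS OVER (ORDER BY ts) AS next_val
FROM t;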

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread Alvaro Herrera
Actually, I think it would be cleaner to have a new state in pmState,
namely PM_IMMED_SHUTDOWN which is entered when we send SIGQUIT.  When
we're in this state, postmaster is only waiting for the timeout to
expire; and when it does, it sends SIGKILL and exits.  Pretty much the
same as you have, except that instead of checking AbortStartTime we check
the pmState variable.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Config reload/restart preview

2013-06-20 Thread Gurjeet Singh
On Thu, Jun 20, 2013 at 9:33 AM, Magnus Hagander mag...@hagander.net wrote:

 On Thu, Jun 20, 2013 at 2:54 PM, Dimitri Fontaine
 dimi...@2ndquadrant.fr wrote:
  Magnus Hagander mag...@hagander.net writes:
  Should we have a way of previewing changes that would be applied if we
  reloaded/restarted the server?
 
  Yes, we should.
 
  +1
 
  This would go well with something I started working on some time ago
  (but haven't actually gotten far on at all), which is the ability for
  pg_ctl to be able to give feedback at all. Meaning a pg_ctl reload
  should also be able to tell you which parameters were changed, without
  having to go to the log. Obviously that's almost exactly the same
  feature.
 
  It could probably connect to the server and issue the SQL command to
  reload, and that one could probably get enhanced to return modified
  variable as NOTICE, or be run with the right client_min_message:
 
  SELECT pg_reload_conf();
 
  The pg_ctl client would then have to know to display the messages sent
  back by the server.

 The problem with that is that now you must *always* have your system
 set up to allow the postgres user to log in in pg_hba.conf or it
 fails.

 But yes, one option would be to use SQL instead of opening a socket.
 Maybe that's a better idea - have pg_ctl try to use that if available,
 and if not send a regular signal and not try to collect the output.


I started working on it yesterday after Thom proposed this idea internally
at EDB. The discussion until now doesn't seem to be hostile to a SQL
interface, so attached is a hack attempt, which hopefully will serve at
least as a POC. A sample session is shown below, while changing a few
values in postgresql.conf files.

The central idea is to use the SIGHUP processing function to do the work
for us and report potential changes via DEBUG2, instead of having to write
a new parsing engine. The (GUC-nesting + PGC_S_TEST) is nice to have since
it avoids the current session from adopting the values that are different
in conf file. This approach is susceptible to the fact that the connected
superuser may have its GUC values picked up from user/database/session
level settings (ALTER USER/DATABASE .. SET ; or SET param TO val;).

$ pgsql
Expanded display is used automatically.
psql (9.4devel)
Type help for help.

postgres=# show work_mem;
 work_mem
--
 1MB
(1 row)

postgres=# set client_min_messages = debug2;
SET
postgres=# select pg_test_reload_conf();
DEBUG:  parameter work_mem changed to 70MB
 pg_test_reload_conf
-
 t
(1 row)

postgres=# show work_mem;
 work_mem
--
 1MB
(1 row)

postgres=# select pg_test_reload_conf();
DEBUG:  parameter shared_buffers cannot be changed without restarting the
server
DEBUG:  configuration file
/home/gurjeet/dev/pgdbuilds/report_guc_chanege_pre_reload/db/data/postgresql.conf
contains errors; unaffected changes were applied
 pg_test_reload_conf
-
 t
(1 row)

postgres=# select pg_test_reload_conf();
DEBUG:  parameter log_min_messages removed from configuration file, reset
to default
 pg_test_reload_conf
-
 t
(1 row)

Best regards,
-- 
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.


report_conf_changes_without_applying.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-20 Thread David Fetter
On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote:
 David Fetter escribió:
  On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:
 
   In my testing of sub-queries in the FILTER clause (an extension to the
   spec), I was able to produce the following error:
  
  Per the spec, 
  
  B) A filter clause shall not contain a query expression, a window 
  function, or an outer reference.
 
 If this is not allowed, I think there should be a clearer error message.
 What Dean showed is an internal (not user-facing) error message.

Please find attached a patch which allows subqueries in the FILTER
clause and adds regression testing for same.
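
With the patch applied, a query along these lines becomes legal (an
illustration only; the orders and flagged_ids tables are hypothetical):

-- sub-select inside the filter qualification, beyond what the spec requires
SELECT count(*) FILTER (WHERE o.id IN (SELECT order_id FROM flagged_ids))
           AS flagged_orders
FROM orders o;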

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 68309ba..b289a3a 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -594,10 +594,13 @@ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...]
    </para>
 
    <para>
-    Aggregate functions, if any are used, are computed across all rows
+    In the absence of a <literal>FILTER</literal> clause,
+    aggregate functions, if any are used, are computed across all rows
     making up each group, producing a separate value for each group
     (whereas without <literal>GROUP BY</literal>, an aggregate
     produces a single value computed across all the selected rows).
+    When a <literal>FILTER</literal> clause is present, only those
+    rows matching the FILTER clause are included.
     When <literal>GROUP BY</literal> is present, it is not valid for
     the <command>SELECT</command> list expressions to refer to
     ungrouped columns except within aggregate functions or if the
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index b139212..c4d5f33 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1562,24 +1562,26 @@ sqrt(2)
     syntax of an aggregate expression is one of the following:
 
 <synopsis>
-<replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] )
-<replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] )
-<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] )
-<replaceable>aggregate_name</replaceable> ( * )
+<replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
+<replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
+<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
+<replaceable>aggregate_name</replaceable> ( * ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
 </synopsis>
 
     where <replaceable>aggregate_name</replaceable> is a previously
     defined aggregate (possibly qualified with a schema name),
-    <replaceable>expression</replaceable> is
-    any value expression that does not itself contain an aggregate
-    expression or a window function call, and
-    <replaceable>order_by_clause</replaceable> is a optional
-    <literal>ORDER BY</> clause as described below.
+    <replaceable>expression</replaceable> is any value expression that
+    does not itself contain an aggregate expression or a window
+    function call, <replaceable>order_by_clause</replaceable> is a
+    optional <literal>ORDER BY</> clause as described below.  The
+    <replaceable>aggregate_name</replaceable> can also be suffixed
+    with <literal>FILTER</literal> as described below.
    </para>
 
    <para>
-    The first form of aggregate expression invokes the aggregate
-    once for each input row.
+    The first form of aggregate expression invokes the aggregate once
+    for each input row, or when a FILTER clause is present, each row
+    matching same.
     The second form is the same as the first, since
     <literal>ALL</literal> is the default.
     The third form invokes the aggregate once for each distinct value
@@ -1607,6 +1609,21 @@ sqrt(2)
    </para>
 
    <para>
+    Adding a FILTER clause to an aggregate specifies which values of
+    the expression being aggregated to evaluate.  For example:
+<programlisting>
+SELECT
+    count(*) AS unfiltered,
+    count(*) FILTER (WHERE i < 5) AS filtered
+FROM generate_series(1,10) AS s(i);
+ unfiltered | filtered 
+------------+----------
+         10