Re: [HACKERS] jsonb_set: update or upsert default?

2015-05-23 Thread Andrew Dunstan


On 05/23/2015 04:03 PM, Petr Jelinek wrote:

On 23/05/15 17:59, David E. Wheeler wrote:

On May 22, 2015, at 7:22 PM, Andrew Dunstan and...@dunslane.net wrote:

The proposed flag for jsonb_set (the renamed jsonb_replace) in the 
patch I recently published is set to false, meaning that the default 
behaviour is to require all elements of the path including the last 
to be present. What that does is effectively UPDATE for jsonb. If 
the flag is true, then the last element can be absent, in which case 
it's created, so this is basically UPSERT for jsonb. The question is 
which should be the default. We got into the weeds on this with 
suggestions of throwing errors on missing paths, but that's going 
nowhere, and I want to get discussion back onto the topic of what 
should be the default.


Here’s JavaScript in Chrome, FWIW:

var f = {}
f["foo"][0] = "bar"
Uncaught TypeError: Cannot set property '0' of undefined
 at anonymous:2:13
 at Object.InjectedScript._evaluateOn (anonymous:895:140)
 at Object.InjectedScript._evaluateAndWrap (anonymous:828:34)
 at Object.InjectedScript.evaluate (anonymous:694:21)



As I understand it, that's not really the same as what Andrew says. 
The real example of that is

 var f = {}
 f["foo"] = "bar"
 f
{ foo: 'bar' }



Yeah, more or less.



which works fine in JavaScript and most other dynamic languages like 
Python or Perl. So my opinion is that the default should be true here.



OK, although Perl at least will autovivify the whole path:

   [andrew@emma ~]$ perl -e 'my %x; $x{foo}{bar}{baz} = 1; use
   Data::Dumper; print Dumper(\%x);'
   $VAR1 = {
             'foo' => {
                        'bar' => {
                                   'baz' => 1
                                 }
                      }
           };

But since, as David's example shows, JS doesn't do that, we seem to be on 
solid ground not doing it either.




Another thing I noticed is that while the following works as expected:
# select jsonb_set('{"baz":1}'::jsonb, '{foo}', '"bar"', true);
        jsonb_set
--------------------------
 {"baz": 1, "foo": "bar"}
(1 row)

If I use an empty jsonb object it does not work anymore:
# select jsonb_set('{}', '{foo}', '"bar"', true);
 jsonb_set
-----------
 {}
(1 row)





Oh, that looks like a bug. Will check. Thanks.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compiler warning about overflow in xlog.c

2015-05-23 Thread Tom Lane
Petr Jelinek p...@2ndquadrant.com writes:
 my compiler complains about overflow in xlog.c.

Yeah, Peter E. reported this yesterday.  Since Heikki didn't do
anything about that yet, I pushed your fix.

regards, tom lane




Re: [HACKERS] xid wrap / optimize frozen tables?

2015-05-23 Thread Jeff Janes
On Sat, May 23, 2015 at 6:46 AM, Nils Goroll sl...@schokola.de wrote:

 Hi,

 as many before, I ran into the issue of a postgresql database (8.4.1)
 - committing many transactions
 - to huge volume tables (3-figure GB in size)
 - running the xid wrap vacuum (to freeze tuples)

 where the additional read IO load has negative impact to the extent of the
 system becoming unusable.


(9.4.1 noted)

Are you sure it is the read IO that causes the problem?  It should be
pretty light, as it would mostly be satisfied by read-ahead (unless you
have GIN indexes) and for pages that aren't automatically prefetched, it
just waits patiently for the requested data to arrive.  (As opposed to
writing, in which it runs around dirtying things that other processes need,
clogging up their IO rather than just its own).

What monitoring techniques do you use to determine the source of the
slowdown?


 Besides considering the fact that this can be worked around by exchanging
 printed sheets of paper or plastic (hello to .au) for hardware, I'd very
 much
 appreciate answers to these questions:

 * have I missed any more recent improvements regarding this problem? My
 understanding is that the full scan for unfrozen tuples can be made less
 likely
 (by reducing the number of transactions and tuning the autovac), but that
 it is
 still required. Is this correct?

 * A pretty obvious idea seems to be to add special casing for fully frozen
 tables: If we could register the fact that a table is fully frozen (has
 no new
 tuples after the complete-freeze xid), a vacuum would get reduced to just
 increasing that last frozen xid.

 It seems like Alvaro Herrera had implemented a patch along the lines of
 this
 idea but I fail to find any other references to it:

 http://grokbase.com/t/postgresql/pgsql-hackers/0666gann3t/how-to-avoid-transaction-id-wrap#200606113hlzxtcuzrcsfwc4pxjimyvwgu

 Does anyone have pointers what happened to the patch?



I don't know what happened to that, but there is another patch waiting for
review and testing:

https://commitfest.postgresql.org/5/221/

Cheers,

Jeff


[HACKERS] Re: fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-23 Thread Christoph Berg
Re: To PostgreSQL Hackers 2015-05-23 20150523172627.ga24...@msg.df7cb.de
 Hi,
 
 the new fsync-pgdata-on-recovery code tries to open all files using
 O_RDWR. At least on 9.1, this can make recovery fail:
 
 * launch postgres, hit ^\ (or otherwise shut down uncleanly)
 * touch foo; chmod 444 foo
 * launch postgres
 
 LOG:  database system was interrupted; last known up at 2015-05-23 19:18:36 
 CEST
 FATAL:  could not open file /home/cbe/9.1/foo: Permission denied
 LOG:  startup process (PID 27305) exited with exit code 1
 LOG:  aborting startup due to startup process failure
 
 The code on 9.4 looks similar to me, but I couldn't trigger the
 problem there.

Correction: 9.4 is equally broken. (I was still running 9.4.1 when I
tried first.)

 I think this is a real-world problem:
 
 1) In older releases, the SSL certs were read from the data directory,
 and at least the default Debian installation creates symlinks from
 PGDATA/server.* to /etc/ssl/ where PostgreSQL can't write
 
 2) It's probably a pretty common scenario that the root user will edit
 postgresql.conf, and make backups or create other random files there
 that are not writable. Even a non-writable postgresql.conf itself or
 recovery.conf was not a problem previously.

3) The .postgresql.conf.swp files created by (root's) vim are 0600.

 To me, this is a serious regression because it prevents automatic
 startup of a server that would otherwise just run.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] [COMMITTERS] pgsql: Allow GiST distance function to return merely a lower-bound.

2015-05-23 Thread Tom Lane
I wrote:
 Heikki Linnakangas heikki.linnakan...@iki.fi writes:
 Allow GiST distance function to return merely a lower-bound.

 I wondered how come this patch did not touch nodeIndexonlyscan.c.

 After some investigation, it seems the only reason that this patch even
 appears to work is that none of the built-in or contrib opclasses support
 both lossy ordering operators and GIST fetch functions.  Otherwise we
 would do an index-only scan and the executor would simply ignore the
 recheck flag.

 I doubt we can ship this in this state; even if the core code doesn't
 exercise the problematic combination, surely third-party opclasses
 will want to?

On further thought, maybe it's OK: we'd only be using an index-only scan
if the index can return the exact value of the indexed column (as the
circle and polygon GIST opclasses cannot).  If it can do that, then it
should be able to compute exact distances too --- worst case, it could
reconstitute the column value and apply the distance operator itself.
So maybe we don't need to add a pile of code for that.

We do need a not-implemented error check though, so I added one.

regards, tom lane




Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-23 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 So from my perspective anything which requires going off standard
 PostgreSQL packages, and encourages users to go off standard PostgreSQL
 packages, is actually a fairly high cost even if the code is
 non-invasive.

Agreed.

 I would be more open to a GUC which limited the auth
 mechanisms available (requiring restart to change), for example, than a
 compile flag.

But how would that fix Volker's scenario?  GUCs are even easier to change
than pg_hba.conf --- in fact, now that we have ALTER SYSTEM, you couldn't
even use configuration management of postgresql.conf to prevent somebody
from altering the value of such a GUC.

I think the real bottom line is this: our code is not designed to prevent
DBAs from doing things that are contrary to local policy, and I for one
am not terribly excited about trying to make it do so.  The list of things
that might be contrary to local policy is just too long, and the number
of ways a DBA might get around any particular restriction is too great.

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Allow GiST distance function to return merely a lower-bound.

2015-05-23 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@iki.fi writes:
 Allow GiST distance function to return merely a lower-bound.

I wondered how come this patch did not touch nodeIndexonlyscan.c.

After some investigation, it seems the only reason that this patch even
appears to work is that none of the built-in or contrib opclasses support
both lossy ordering operators and GIST fetch functions.  Otherwise we
would do an index-only scan and the executor would simply ignore the
recheck flag.

I doubt we can ship this in this state; even if the core code doesn't
exercise the problematic combination, surely third-party opclasses
will want to?

I thought about hacking the planner to not select index-only scans,
but there's no way for it to know whether the opclass might return
recheck = true at runtime.  I think the only real fix is to actually
propagate all the changes in nodeIndexscan.c into nodeIndexonlyscan.c.
Testing it would be problematic without a suitable opclass, though :-(

A short-term hack might be to throw a not implemented error in
nodeIndexonlyscan.c if it sees the distance-recheck flag set.

regards, tom lane




[HACKERS] Compiler warning about overflow in xlog.c

2015-05-23 Thread Petr Jelinek

Hi,

my compiler complains about overflow in xlog.c.

There is a variable defined as char partialfname[MAXFNAMELEN]; but it is used 
as snprintf(partialfname, MAXPGPATH, "%s.partial", origfname);


There is no practical issue, as the actual filename length is never over 
MAXFNAMELEN even with the .partial suffix, but the code should still be 
fixed, which is what the attached one-line patch does.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


xlog-overflow-fix.diff
Description: binary/octet-stream



Re: [HACKERS] jsonb_set: update or upsert default?

2015-05-23 Thread Petr Jelinek

On 23/05/15 17:59, David E. Wheeler wrote:

On May 22, 2015, at 7:22 PM, Andrew Dunstan and...@dunslane.net wrote:


The proposed flag for jsonb_set (the renamed jsonb_replace) in the patch I 
recently published is set to false, meaning that the default behaviour is to 
require all elements of the path including the last to be present. What that 
does is effectively UPDATE for jsonb. If the flag is true, then the last 
element can be absent, in which case it's created, so this is basically UPSERT 
for jsonb. The question is which should be the default. We got into the weeds 
on this with suggestions of throwing errors on missing paths, but that's going 
nowhere, and I want to get discussion back onto the topic of what should be the 
default.


Here’s JavaScript in Chrome, FWIW:

var f = {}
f["foo"][0] = "bar"
Uncaught TypeError: Cannot set property '0' of undefined
 at anonymous:2:13
 at Object.InjectedScript._evaluateOn (anonymous:895:140)
 at Object.InjectedScript._evaluateAndWrap (anonymous:828:34)
 at Object.InjectedScript.evaluate (anonymous:694:21)



As I understand it, that's not really the same as what Andrew says. The 
real example of that is

 var f = {}
 f["foo"] = "bar"
 f
{ foo: 'bar' }

which works fine in JavaScript and most other dynamic languages like 
Python or Perl. So my opinion is that the default should be true here.


Another thing I noticed is that while the following works as expected:
# select jsonb_set('{"baz":1}'::jsonb, '{foo}', '"bar"', true);
        jsonb_set
--------------------------
 {"baz": 1, "foo": "bar"}
(1 row)

If I use an empty jsonb object it does not work anymore:
# select jsonb_set('{}', '{foo}', '"bar"', true);
 jsonb_set
-----------
 {}
(1 row)


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-23 Thread Tom Lane
Christoph Berg m...@debian.org writes:
 the new fsync-pgdata-on-recovery code tries to open all files using
 O_RDWR. At least on 9.1, this can make recovery fail:

Hm.  I wonder whether it would be all right to just skip files for which
we get EPERM on open().  The argument being that if we can't write to the
file, we should not be held responsible for fsync'ing it either.  But
I'm not sure whether EPERM would be the only relevant errno, or whether
there are cases where this would mask real problems.

regards, tom lane




Re: [HACKERS] fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-23 Thread Christoph Berg
Re: Tom Lane 2015-05-23 2284.1432413...@sss.pgh.pa.us
 Christoph Berg m...@debian.org writes:
  the new fsync-pgdata-on-recovery code tries to open all files using
  O_RDWR. At least on 9.1, this can make recovery fail:
 
 Hm.  I wonder whether it would be all right to just skip files for which
 we get EPERM on open().  The argument being that if we can't write to the
 file, we should not be held responsible for fsync'ing it either.  But
 I'm not sure whether EPERM would be the only relevant errno, or whether
 there are cases where this would mask real problems.

Maybe logging WARNINGs instead of FATAL would be enough of a fix?

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] RFC: Non-user-resettable SET SESSION AUTHORISATION

2015-05-23 Thread Noah Misch
On Tue, May 19, 2015 at 04:49:26PM -0400, Robert Haas wrote:
 On Tue, May 19, 2015 at 3:00 PM, Simon Riggs si...@2ndquadrant.com wrote:
  As long as the cookie is randomly generated for each use, then I don't see a
  practical problem with that approach.
 
 If the client sets the cookie via an SQL command, that command would
 be written to the log, and displayed in pg_stat_activity.  A malicious
 user might be able to get it from one of those places.
 
 A malicious user might also be able to just guess it.  I don't really
 want to create a situation where any weakess in pgpool's random number
 generation becomes a privilege-escalation attack.
 
 A protocol extension avoids all of that trouble, and can be target for
 9.6 just like any other approach we might come up with.  I actually
 suspect the protocol extension will be FAR easier to fully secure, and
 thus less work, not more.

All true.  Here's another idea.  Have the pooler open one additional
connection, for out-of-band signalling.  Add a pair of functions:

  pg_userchange_grant(recipient_pid int, user oid)
  pg_userchange_accept(sender_pid int, user oid)

To change the authenticated user of a pool connection, the pooler would call
pg_userchange_grant in the signalling connection and pg_userchange_accept in
the target connection.  This requires no protocol change or confidential
nonce.  The inevitably-powerful signalling user is better insulated from other
users, because the pool backends have no need to become that user at any
point.  Bugs in the pooler's protocol state machine are much less likely to
enable privilege escalation.  On the other hand, it can't be quite as fast as
the other ideas on this thread.




Re: [HACKERS] fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-23 Thread Andres Freund
On 2015-05-23 16:33:29 -0400, Tom Lane wrote:
 Christoph Berg m...@debian.org writes:
  the new fsync-pgdata-on-recovery code tries to open all files using
  O_RDWR. At least on 9.1, this can make recovery fail:
 
 Hm.  I wonder whether it would be all right to just skip files for which
 we get EPERM on open().  The argument being that if we can't write to the
 file, we should not be held responsible for fsync'ing it either.  But
 I'm not sure whether EPERM would be the only relevant errno, or whether
 there are cases where this would mask real problems.

We could even try doing the fsync with a read-only fd as a fallback,
but that's also pretty hacky.

How about, to avoid masking actual problems, we have a more
differentiated logic for the toplevel data directory? I think we could
just skip all non-directory files in the data directory itself. None
of the files in the toplevel directory, with the exception of
postgresql.auto.conf, will ever get written to by PG itself.  And if
there's readonly files somewhere in a subdirectory, I won't feel
particularly bad.

Greetings,

Andres Freund




Re: [HACKERS] Run pgindent now?

2015-05-23 Thread Bruce Momjian
On Sat, May 23, 2015 at 12:32:34PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Are we ready for a pgindent run?  Back branches?
 
 I think we could do it in HEAD, but it doesn't seem like we have consensus
 about whether to touch the back branches.  Suggest just HEAD for now and
 we can continue to argue about the back branches.

pgindent run on HEAD and committed.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Run pgindent now?

2015-05-23 Thread Andres Freund
On 2015-05-23 21:36:50 -0400, Bruce Momjian wrote:
 pgindent run on HEAD and committed.

-   if (IsA(node, Aggref) || IsA(node, GroupingFunc))
+   if (IsA(node, Aggref) ||IsA(node, GroupingFunc))

There's a bunch of changes like this. Looks rather odd to me? I don't
recall seeing much code looking like that?

Also, does somebody have an idea to keep pgindent from butchering inline
asm like:
/*
 * Perform cmpxchg and use the zero flag which it implicitly sets when
 * equal to measure the success.
 */
-   __asm__ __volatile__(
-       "   lock                \n"
-       "   cmpxchgl    %4,%5   \n"
-       "   setz        %2      \n"
-:      "=a" (*expected), "=m"(ptr->value), "=q" (ret)
-:      "a" (*expected), "r" (newval), "m"(ptr->value)
-:      "memory", "cc");
+   __asm__ __volatile__(
+       "   lock                \n"
+       "   cmpxchgl    %4,%5   \n"
+   "   setz        %2      \n"
+   :   "=a"(*expected), "=m"(ptr->value), "=q"(ret)
+   :   "a"(*expected), "r"(newval), "m"(ptr->value)
+   :   "memory", "cc");
+
return (bool) ret;


Andres




Re: [HACKERS] Run pgindent now?

2015-05-23 Thread Bruce Momjian
On Sun, May 24, 2015 at 04:16:07AM +0200, Andres Freund wrote:
 On 2015-05-23 21:36:50 -0400, Bruce Momjian wrote:
  pgindent run on HEAD and committed.
 
 -   if (IsA(node, Aggref) || IsA(node, GroupingFunc))
 +   if (IsA(node, Aggref) ||IsA(node, GroupingFunc))
 
 There's a bunch of changes like this. Looks rather odd to me? I don't
 recall seeing much code looking like that?

Wow, that is quite odd.  I saw another case where it might have done
this because the line was 80 characters without it, but not in this
case.

 Also, does somebody have an idea to keep pgindent from butchering inline
 asm like:
 /*
  * Perform cmpxchg and use the zero flag which it implicitly sets when
  * equal to measure the success.
  */
 -   __asm__ __volatile__(
 -       "   lock                \n"
 -       "   cmpxchgl    %4,%5   \n"
 -       "   setz        %2      \n"
 -:      "=a" (*expected), "=m"(ptr->value), "=q" (ret)
 -:      "a" (*expected), "r" (newval), "m"(ptr->value)
 -:      "memory", "cc");
 +   __asm__ __volatile__(
 +       "   lock                \n"
 +       "   cmpxchgl    %4,%5   \n"
 +   "   setz        %2      \n"
 +   :   "=a"(*expected), "=m"(ptr->value), "=q"(ret)
 +   :   "a"(*expected), "r"(newval), "m"(ptr->value)
 +   :   "memory", "cc");
 +
 return (bool) ret;

Ah, we have a file, /pgtop/src/tools/pgindent/exclude_file_patterns, which
lists excluded files, and s_lock.h is one of them.  I think
/include/port/atomics/arch-x86.h needs to be added, and the pgindent
commit on the file reverted.  Are there any other files with __asm__
lines like that?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Change pg_cancel_*() to ignore current backend

2015-05-23 Thread Michael Paquier
On Sat, May 23, 2015 at 7:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@anarazel.de writes:
 This whole discussion seems to be about making it easier to run SELECT
 pg_cancel_backend(pid) FROM pg_stat_activity;. But that shouldn't be
 made easier! If anything harder.

 Indeed.  I find it hard to believe that there's a real problem here, and
 I absolutely will resist breaking backwards compatibility to solve it.

+1. Reading this thread I don't see why there is actually a problem...
The whole discussion is about moving the check at SQL-level with
pg_backend_pid(), that people are used to for a couple of releases
now, into pg_cancel_backend() or within a new function. I am not
really convinced that it is worth having a new interface for that.
-- 
Michael




Re: [HACKERS] Issues in Replication Progress Tracking

2015-05-23 Thread Amit Kapila
On Sat, May 23, 2015 at 11:19 AM, Andres Freund and...@anarazel.de wrote:

 On May 22, 2015 10:23:06 PM PDT, Amit Kapila amit.kapil...@gmail.com
wrote:
 On Fri, May 22, 2015 at 11:20 PM, Andres Freund and...@anarazel.de
 wrote:
 
  On 2015-05-21 09:40:58 +0530, Amit Kapila wrote:
   On Thu, May 21, 2015 at 12:42 AM, Andres Freund
 and...@anarazel.de
 wrote:
   
On 2015-05-20 19:27:05 +0530, Amit Kapila wrote:
   
 13.
 In function replorigin_session_setup() and or
 replorigin_session_advance(), don't we need to WAL log the
 use of Replication state?
   
No, the point is that the replication progress is persisted via
 an
 extra
data block in the commit record. That's important for both
 performance
and correctness, because otherwise it gets hard to tie a
 transaction
made during replay with the update to the progress. Unless you
 use 2PC
which isn't really an alternative.
   
  
   Okay, but what triggered this question was the difference of those
 functions
   as compare to when user call function
 pg_replication_origin_advance().
   pg_replication_origin_advance() will WAL log the information during
 that
   function call itself (via replorigin_advance()).  So even if the
 transaction
   issuing pg_replication_origin_advance() function will abort, it
 will
 still
   update
   the Replication State, why so?
 
  I don't see a problem here. pg_replication_origin_advance() is for
  setting up the initial position/update the position upon
 configuration
  changes.
 
 Okay, I am not aware how exactly these API's will be used for
 replication
 but let me try to clarify what I have in mind related to this API
 usage.
 
 Can we use pg_replication_origin_advance()  for node where Replay has
 to happen, if Yes, then Let us say user of
 pg_replication_origin_advance()
 API set the lsn position to X for the node N1 on which replay has to
 happen, so now replay will proceed from X + 1 even though the
 information related to X is not persisted, so now it could so happen
 X will get written after the replay of X + 1 which might lead to
 problem
 after crash recovery?

 Then you'd use the session variant - which does tie into transactions. Is
 there a reason that'd be unsuitable for you?


I don't know, because I have not thought about how one can
use these APIs in various ways in the design of a replication
system, but I think ideally we should prohibit the use of an API
in circumstances where it could lead to problems, or if that is
difficult, not possible, or not worthwhile, then at least we can mention
the same in the docs.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] xid wrap / optimize frozen tables?

2015-05-23 Thread Nils Goroll
Hi,

as many before, I ran into the issue of a postgresql database (8.4.1)
- committing many transactions
- to huge volume tables (3-figure GB in size)
- running the xid wrap vacuum (to freeze tuples)

where the additional read IO load has negative impact to the extent of the
system becoming unusable.

Besides considering the fact that this can be worked around by exchanging
printed sheets of paper or plastic (hello to .au) for hardware, I'd very much
appreciate answers to these questions:

* have I missed any more recent improvements regarding this problem? My
understanding is that the full scan for unfrozen tuples can be made less likely
(by reducing the number of transactions and tuning the autovac), but that it is
still required. Is this correct?

* A pretty obvious idea seems to be to add special casing for fully frozen
tables: If we could register the fact that a table is fully frozen (has no new
tuples after the complete-freeze xid), a vacuum would get reduced to just
increasing that last frozen xid.

It seems like Alvaro Herrera had implemented a patch along the lines of this
idea but I fail to find any other references to it:
http://grokbase.com/t/postgresql/pgsql-hackers/0666gann3t/how-to-avoid-transaction-id-wrap#200606113hlzxtcuzrcsfwc4pxjimyvwgu

Does anyone have pointers what happened to the patch?

Thanks, Nils




Re: [HACKERS] Run pgindent now?

2015-05-23 Thread Bruce Momjian
On Fri, May 22, 2015 at 12:02:11PM -0400, Tom Lane wrote:
 I guess in the
 scenario you're describing, the most helpful thing would be if the
 pgindent commit put the typedef list it had used into the tree, and then
 we just use that (plus manual additions) when generating the I' commit.

I have always added the typedef list I used as part of pgindent commit
runs.

Are we ready for a pgindent run?  Back branches?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Run pgindent now?

2015-05-23 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Are we ready for a pgindent run?  Back branches?

I think we could do it in HEAD, but it doesn't seem like we have consensus
about whether to touch the back branches.  Suggest just HEAD for now and
we can continue to argue about the back branches.

regards, tom lane




[HACKERS] fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-23 Thread Christoph Berg
Hi,

the new fsync-pgdata-on-recovery code tries to open all files using
O_RDWR. At least on 9.1, this can make recovery fail:

* launch postgres, hit ^\ (or otherwise shut down uncleanly)
* touch foo; chmod 444 foo
* launch postgres

LOG:  database system was interrupted; last known up at 2015-05-23 19:18:36 CEST
FATAL:  could not open file /home/cbe/9.1/foo: Permission denied
LOG:  startup process (PID 27305) exited with exit code 1
LOG:  aborting startup due to startup process failure

The code on 9.4 looks similar to me, but I couldn't trigger the
problem there.

I think this is a real-world problem:

1) In older releases, the SSL certs were read from the data directory,
and at least the default Debian installation creates symlinks from
PGDATA/server.* to /etc/ssl/ where PostgreSQL can't write

2) It's probably a pretty common scenario that the root user will edit
postgresql.conf, and make backups or create other random files there
that are not writable. Even a non-writable postgresql.conf itself or
recovery.conf was not a problem previously.

To me, this is a serious regression because it prevents automatic
startup of a server that would otherwise just run.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] xid wrap / optimize frozen tables?

2015-05-23 Thread Bruce Momjian
On Sat, May 23, 2015 at 03:46:33PM +0200, Nils Goroll wrote:
 Hi,
 
 as many before, I ran into the issue of a postgresql database (8.4.1)

Uh, did you mean 9.4.1, as 8.4.1 is end-of-lifed?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] jsonb_set: update or upsert default?

2015-05-23 Thread David E. Wheeler
On May 22, 2015, at 7:22 PM, Andrew Dunstan and...@dunslane.net wrote:

 The proposed flag for jsonb_set (the renamed jsonb_replace) in the patch I 
 recently published is set to false, meaning that the default behaviour is to 
 require all elements of the path including the last to be present. What that 
 does is effectively UPDATE for jsonb. If the flag is true, then the last 
 element can be absent, in which case it's created, so this is basically 
 UPSERT for jsonb. The question is which should be the default. We got into 
 the weeds on this with suggestions of throwing errors on missing paths, but 
 that's going nowhere, and I want to get discussion back onto the topic of 
 what should be the default.

Here’s JavaScript in Chrome, FWIW:

var f = {}
f["foo"][0] = "bar"
Uncaught TypeError: Cannot set property '0' of undefined
at anonymous:2:13
at Object.InjectedScript._evaluateOn (anonymous:895:140)
at Object.InjectedScript._evaluateAndWrap (anonymous:828:34)
at Object.InjectedScript.evaluate (anonymous:694:21)

Best,

David
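To make the two candidate defaults concrete, here is a minimal standalone C sketch (toy types and hypothetical names, nothing to do with the actual jsonb code) of an update-only set versus a flag-controlled upsert on a one-level map:

```c
/* Toy one-level string map; illustrative only, not the jsonb implementation. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 8

struct toy_map
{
    int  nkeys;
    char keys[MAX_KEYS][32];
    char vals[MAX_KEYS][32];
};

/*
 * Set key to val.  With create_missing = false (UPDATE semantics) the call
 * fails unless the key already exists; with true (UPSERT semantics) a
 * missing key is created.
 */
static bool
toy_set(struct toy_map *m, const char *key, const char *val,
        bool create_missing)
{
    for (int i = 0; i < m->nkeys; i++)
        if (strcmp(m->keys[i], key) == 0)
        {
            snprintf(m->vals[i], sizeof m->vals[i], "%s", val);
            return true;        /* existing key: always updatable */
        }

    if (!create_missing || m->nkeys >= MAX_KEYS)
        return false;           /* UPDATE mode refuses to create */

    snprintf(m->keys[m->nkeys], sizeof m->keys[0], "%s", key);
    snprintf(m->vals[m->nkeys], sizeof m->vals[0], "%s", val);
    m->nkeys++;
    return true;
}
```

With `create_missing = true`, a set of a missing last path element behaves like the single-level assignment in JavaScript, Python, or Perl discussed here; with `false` it is a pure update that fails on a missing key.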





Re: [HACKERS] Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

2015-05-23 Thread Andrew Dunstan


On 05/23/2015 01:29 AM, Amit Kapila wrote:
On Fri, May 15, 2015 at 11:51 AM, Amit Kapila amit.kapil...@gmail.com wrote:


 On Thu, May 14, 2015 at 10:29 PM, Andrew Dunstan and...@dunslane.net wrote:

 
 
  On 05/14/2015 10:52 AM, Robert Haas wrote:
 
   On Thu, May 14, 2015 at 12:12 AM, Amit Kapila amit.kapil...@gmail.com wrote:

 
   On Thu, May 14, 2015 at 2:10 AM, Andrew Dunstan and...@dunslane.net wrote:

 
  How about if we simply abort if we find a non-symlink where we 
want the
  symlink to be, and only remove something that is actually a 
symlink (or a

  junction point, which is more or less the same thing)?
 
  We can do that way and for that I think we need to use rmdir
  instead of rmtree in the code being discussed (recovery path),
  OTOH we should try to minimize the errors raised during
  recovery.
 
  I'm not sure I understand this issue in detail, but why would using
  rmtree() on something you expect to be a symlink ever be a good idea?
  It seems like if things are the way you expect them to be, it has no
  benefit, but if they are different from what you expect, you might
  blow away a ton of important data.
 
  Maybe I am just confused.
 
 
 
  The suggestion is to get rid of using rmtree. Instead, if we find 
a non-symlink in pg_tblspc we'll make the user clean it up before we 
can continue. So your instinct is in tune with my suggestion.

 

 Find the patch which gets rid of rmtree usage.  I have made it as
 a separate function because the same code is used from
 create_tablespace_directories() as well.  I thought of extending the
 same API for using it from destroy_tablespace_directories() as well,
 but due to special handling (especially for ENOENT) in that function,
 I left it as of now.


Does it make sense to track this in 9.5 Open Items [1]?


[1] - https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items





Sure. It's on my list of things to get to, but by all means put it there.

cheers

andrew





Re: [HACKERS] xid wrap / optimize frozen tables?

2015-05-23 Thread Tom Lane
Nils Goroll sl...@schokola.de writes:
 as many before, I ran into the issue of a postgresql database (8.4.1)

*Please* tell us that was a typo.

If it wasn't, there were possibly-relevant fixes in 8.4.6, 8.4.9,
and perhaps later; I got tired of scanning the release notes.

8.4.x is out of support entirely now, of course, but it ought to be fairly
painless to drop in the last 8.4.x release (8.4.22) and see if that makes
things any better.  If not, you ought to be thinking about an upgrade to
something 9.recent rather than trying to hack up 8.4.

regards, tom lane




Re: [HACKERS] xid wrap / optimize frozen tables?

2015-05-23 Thread Nils Goroll
On 23/05/15 16:50, Tom Lane wrote:
  as many before, I ran into the issue of a postgresql database (8.4.1)
 *Please* tell us that was a typo.

Yes it was, my sincere apologies. It's 9.4.1


Nils




Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2015-05-23 Thread Noah Misch
On Sat, May 08, 2010 at 09:24:45PM -0400, Tom Lane wrote:
 hgonza...@gmail.com writes:
  http://sources.redhat.com/bugzilla/show_bug.cgi?id=649
 
  The last explains why they do not consider it a bug:
 
  ISO C99 requires for %.*s to only write complete characters that fit below
  the precision number of bytes. If you are using say UTF-8 locale, but ISO-8859-1
  characters as shown in the input file you provided, some of the strings are
  not valid UTF-8 strings, therefore sprintf fails with -1 because of the
  encoding error. That's not a bug in glibc.
 
 Yeah, that was about the position I thought they'd take.

GNU libc eventually revisited that conclusion and fixed the bug through commit
715a900c9085907fa749589bf738b192b1a2bda5.  RHEL 7.1 is fixed, but RHEL 6.6 and
RHEL 5.11 are still affected; the bug will be relevant for another 8+ years.

 So the bottom line here is that we're best off to avoid %.*s because
 it may fail if the string contains data that isn't validly encoded
 according to libc's idea of the prevailing encoding.

Yep.  Immediate precisions like %.10s trigger the bug as effectively as %.*s,
so tarCreateHeader() [_tarWriteHeader() in 9.2 and earlier] is also affected.
Switching to strlcpy(), as attached, fixes the bug while simplifying the code.
The bug symptom is error 'pg_basebackup: unrecognized link indicator 0' when
the name of a file in the data directory is not a valid multibyte string.


Commit 6dd9584 introduced a new use of %.*s, to pg_upgrade.  It works reliably
for now, because it always runs in the C locale.  pg_upgrade never calls
set_pglocale_pgservice() or otherwise sets its permanent locale.  It would be
natural for us to fix that someday, at which point non-ASCII database names
would perturb this status output.

It would be good to purge the code of precisions on %s conversion specifiers,
then Assert(!pointflag) in fmtstr() to catch new introductions.  I won't plan
to do it myself, but it would be a nice little defensive change.
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 0e4bd12..f46c5fc 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -17,6 +17,12 @@ command_fails(
	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
	'pg_basebackup fails because of hba');
 
+# Some Windows ANSI code pages may reject this filename, in which case we
+# quietly proceed without this bit of test coverage.
+open BADCHARS, ">>$tempdir/pgdata/FOO\xe0\xe0\xe0BAR";
+print BADCHARS "test backup of file with non-UTF8 name\n";
+close BADCHARS;
+
 open HBA, ">>$tempdir/pgdata/pg_hba.conf";
 print HBA "local replication all trust\n";
 print HBA "host replication all 127.0.0.1/32 trust\n";
diff --git a/src/port/tar.c b/src/port/tar.c
index 4721df3..72fd4e1 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -68,7 +68,7 @@ tarCreateHeader(char *h, const char *filename, const char *linktarget,
	memset(h, 0, 512);	/* assume tar header size */
 
	/* Name 100 */
-	sprintf(&h[0], "%.99s", filename);
+	strlcpy(&h[0], filename, 100);
if (linktarget != NULL || S_ISDIR(mode))
{
/*
@@ -110,7 +110,7 @@ tarCreateHeader(char *h, const char *filename, const char *linktarget,
	/* Type - Symbolic link */
	sprintf(&h[156], "2");
	/* Link Name 100 */
-	sprintf(&h[157], "%.99s", linktarget);
+	strlcpy(&h[157], linktarget, 100);
}
else if (S_ISDIR(mode))
/* Type - directory */
@@ -127,11 +127,11 @@ tarCreateHeader(char *h, const char *filename, const char *linktarget,
 
	/* User 32 */
	/* XXX: Do we need to care about setting correct username? */
-	sprintf(&h[265], "%.31s", "postgres");
+	strlcpy(&h[265], "postgres", 32);
 
	/* Group 32 */
	/* XXX: Do we need to care about setting correct group name? */
-	sprintf(&h[297], "%.31s", "postgres");
+	strlcpy(&h[297], "postgres", 32);
 
	/* Major Dev 8 */
	sprintf(&h[329], "%07o ", 0);

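For anyone wondering why a fixed-width copy is immune here: strlcpy() moves bytes without asking libc to interpret them as multibyte characters, so it cannot fail on a name that is invalid in the current locale the way %.99s can under an affected glibc. A minimal standalone sketch follows; my_strlcpy is a local stand-in, since strlcpy is a BSD extension rather than ISO C (PostgreSQL carries its own copy in src/port):

```c
#include <stddef.h>
#include <string.h>

/*
 * Minimal strlcpy-style copy: fill dst with at most siz-1 bytes of src,
 * always NUL-terminate (when siz > 0), and return the length of src so
 * the caller can detect truncation.  No locale or encoding dependence.
 */
static size_t
my_strlcpy(char *dst, const char *src, size_t siz)
{
    size_t len = strlen(src);

    if (siz > 0)
    {
        size_t copy = (len >= siz) ? siz - 1 : len;

        memcpy(dst, src, copy);
        dst[copy] = '\0';
    }
    return len;                 /* total length it tried to create */
}
```

Writing a tar name field is then `my_strlcpy(&h[0], filename, 100);`: the field gets at most 99 bytes plus a terminator, regardless of what byte sequence the filename contains.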


Re: [HACKERS] Asynchronous DRAM Self-Refresh

2015-05-23 Thread Jeremy Harris
On 22/05/15 22:30, Josh Berkus wrote:
 At CoreOS Fest, Intel presented about a technology which they used to
 improve write times for the nonrelational data store Etcd.  It's called
 Asynchronous DRAM Self-Refresh, or ADR.  This is supposedly a feature of
 all of their chips since E5 which allows users to designate a small area
 of memory (16 to 64MB) which is somehow guaranteed to be flushed to disk
 in the event of a power loss (the exact mechanism was not explained).
 
 So my thought was "Hello!  wal_buffers?"  Theoretically, this feature
 could give us the benefits of asynchronous commit without the penalties
 ... *if* it actually works.
 
 However, since then I've been able to find zero documentation on ADR.
 There's a bunch of stuff in the Intel press releases, but zero I can
 find in their technical docs.  Anyone have a clue on this?
 
Are you certain "disk" was mentioned?  The wording at

 http://www.intel.com/design/intarch/iastorage/xeon.htm

"to preserve critical data in RAM during a power fail"
implies not.  I assume there's a battery backup for
just that memory region, and the chip does its own refresh
rather than needing a controller; the data should still
be recoverable on next boot.
-- 
Cheers,
  Jeremy





Re: [HACKERS] Asynchronous DRAM Self-Refresh

2015-05-23 Thread Jeremy Harris
On 22/05/15 22:30, Josh Berkus wrote:
 At CoreOS Fest, Intel presented about a technology which they used to
 improve write times for the nonrelational data store Etcd.  It's called
 Asynchronous DRAM Self-Refresh, or ADR.  This is supposedly a feature of
 all of their chips since E5 which allows users to designate a small area
 of memory (16 to 64MB) which is somehow guaranteed to be flushed to disk
 in the event of a power loss (the exact mechanism was not explained).
 
 So my thought was "Hello!  wal_buffers?"  Theoretically, this feature
 could give us the benefits of asynchronous commit without the penalties
 ... *if* it actually works.
 
 However, since then I've been able to find zero documentation on ADR.
 There's a bunch of stuff in the Intel press releases, but zero I can
 find in their technical docs.  Anyone have a clue on this?
 

http://snia.org/sites/default/files/NVDIMM%20Messaging%20and%20FAQ%20Jan%2020143.pdf

describes it as a (minor) processor feature to flush
memory-pipeline buffers to RAM on powerloss detection.
The context there is for a flash-backup on-DIMM for the
RAM.

It's unclear whether any dirty data in cpu cache gets written...
Are Intel caches never write-behind?




Re: [HACKERS] Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file

2015-05-23 Thread Amit Kapila
On Sat, May 23, 2015 at 9:28 PM, Andrew Dunstan and...@dunslane.net wrote:


 On 05/23/2015 01:29 AM, Amit Kapila wrote:
  Find the patch which gets rid of rmtree usage.  I have made it as
  a separate function because the same code is used from
  create_tablespace_directories() as well.  I thought of extending the
  same API for using it from destroy_tablespace_directories() as well,
  but due to special handling (especially for ENOENT) in that function,
  I left it as of now.
 

 Does it make sense to track this in 9.5 Open Items [1]?


 [1] - https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items




 Sure. It's on my list of things to get to, but by all means put it there.


Thanks for tracking.  I have added to open items as well.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Run pgindent now?

2015-05-23 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Sun, May 24, 2015 at 04:16:07AM +0200, Andres Freund wrote:
 -   if (IsA(node, Aggref) || IsA(node, GroupingFunc))
 +   if (IsA(node, Aggref) ||IsA(node, GroupingFunc))
 
 There's a bunch of changes like this. Looks rather odd to me? I don't
 recall seing much code looking like that?

 Wow, that is quite odd.

No, pgindent has *always* been wonky about lines that contain a typedef
name but are not variable declarations.  I've gotten in the habit of
breaking IsA tests like these into two lines:

if (IsA(node, Aggref) ||
IsA(node, GroupingFunc))

just so that it doesn't look weird when pgindent gets done with it.
You can see similar weirdness around sizeof usages, if you look.

 Also, does somebody have an idea to keep pgindent from butchering inline
 asm like:

 Ah, we have a file /pgtop/src/tools/pgindent/exclude_file_patterns which
 has excluded files, and s_lock.h is one of them.  I think
 /include/port/atomics/arch-x86.h needs to be added, and the pgindent
 commit on the file reverted.

Yeah, probably :-(

regards, tom lane




Re: [HACKERS] extend pgbench expressions with functions

2015-05-23 Thread Fabien COELHO


v7: rebase after pgindent run for 9.6.
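The evaluator change in the attached patch reduces to a recursive walk with a new function-node case; here is a standalone sketch with simplified toy types (illustrative only, not the patch code itself):

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy expression tree: a constant, or abs() applied to a sub-expression. */
typedef enum
{
    ENODE_CONST,
    ENODE_FUNC_ABS
} etype_t;

typedef struct expr
{
    etype_t      etype;
    int64_t      ival;          /* used by ENODE_CONST */
    struct expr *arg1;          /* used by ENODE_FUNC_ABS */
} expr_t;

/* Evaluate e into *retval; return false on failure, as the patch does. */
static bool
eval(const expr_t *e, int64_t *retval)
{
    switch (e->etype)
    {
        case ENODE_CONST:
            *retval = e->ival;
            return true;

        case ENODE_FUNC_ABS:
        {
            int64_t arg1;

            if (!eval(e->arg1, &arg1))   /* evaluate the argument first */
                return false;
            *retval = (arg1 > 0) ? arg1 : -arg1;
            return true;
        }
    }
    return false;               /* unknown node type */
}
```

Adding further one-argument functions would mean one more enum member and one more case, which is essentially the table-plus-switch shape the patch uses.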

-- 
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index a808546..d09b2bf 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -762,7 +762,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>dbname</replaceable>
   references to variables <literal>:</literal><replaceable>variablename</replaceable>,
   and expressions composed of unary (<literal>-</literal>) or binary operators
   (<literal>+</literal>, <literal>-</literal>, <literal>*</literal>, <literal>/</literal>, <literal>%</literal>)
-  with their usual associativity, and parentheses.
+  with their usual associativity, function <literal>abs</literal> and parentheses.
  </para>
 
  <para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index e68631e..035322d 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -20,6 +20,8 @@ static PgBenchExpr *make_integer_constant(int64 ival);
 static PgBenchExpr *make_variable(char *varname);
 static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 		PgBenchExpr *rexpr);
+static int find_func(const char * fname);
+static PgBenchExpr *make_func(const int fnumber, PgBenchExpr *arg1);
 
 %}
 
@@ -34,10 +36,10 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 }
 
 %type <expr> expr
-%type <ival> INTEGER
-%type <str> VARIABLE
+%type <ival> INTEGER function
+%type <str> VARIABLE FUNCTION
 
-%token INTEGER VARIABLE
+%token INTEGER VARIABLE FUNCTION
 %token CHAR_ERROR /* never used, will raise a syntax error */
 
 /* Precedence: lowest to highest */
@@ -59,6 +61,10 @@ expr: '(' expr ')'			{ $$ = $2; }
 	| expr '%' expr			{ $$ = make_op('%', $1, $3); }
 	| INTEGER{ $$ = make_integer_constant($1); }
 	| VARIABLE { $$ = make_variable($1); }
+	| function '(' expr ')' { $$ = make_func($1, $3); }
+	;
+
+function: FUNCTION			{ $$ = find_func($1); pg_free($1); }
 	;
 
 %%
@@ -95,4 +101,41 @@ make_op(char operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr)
 	return expr;
 }
 
+static struct {
+	char * fname;
+	int nargs;
+	PgBenchFunction tag;
+} PGBENCH_FUNCTIONS[] = {
+	{ "abs", 1, PGBENCH_ABS }
+};
+
+static int
+find_func(const char * fname)
+{
+	int nfunctions = sizeof(PGBENCH_FUNCTIONS) / sizeof(PGBENCH_FUNCTIONS[0]);
+	int i;
+
+	for (i = 0; i < nfunctions; i++)
+		if (pg_strcasecmp(fname, PGBENCH_FUNCTIONS[i].fname) == 0)
+			return i;
+
+	expr_yyerror_more("unexpected function name", fname);
+
+	/* not reached */
+	return -1;
+}
+
+static PgBenchExpr *
+make_func(const int fnumber, PgBenchExpr *arg1)
+{
+	PgBenchExpr *expr = pg_malloc(sizeof(PgBenchExpr));
+
+	Assert(fnumber >= 0);
+
+	expr->etype = ENODE_FUNCTION;
+	expr->u.function.function = PGBENCH_FUNCTIONS[fnumber].tag;
+	expr->u.function.arg1 = arg1;
+	return expr;
+}
+
#include "exprscan.c"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 5331ab7..0e07bfe 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -57,6 +57,11 @@ space			[ \t\r\f]
 	yylval.ival = strtoint64(yytext);
 	return INTEGER;
 }
+[a-zA-Z]+   {
+	yycol += yyleng;
+	yylval.str = pg_strdup(yytext);
+	return FUNCTION;
+}
 
 [\n]			{ yycol = 0; yyline++; }
 {space}+		{ yycol += yyleng; /* ignore */ }
@@ -71,10 +76,16 @@ space			[ \t\r\f]
 %%
 
 void
-yyerror(const char *message)
+expr_yyerror_more(const char *message, const char *more)
 {
 	syntax_error(expr_source, expr_lineno, expr_full_line, expr_command,
- message, NULL, expr_col + yycol);
+ message, more, expr_col + yycol);
+}
+
+void
+yyerror(const char *message)
+{
+	expr_yyerror_more(message, NULL);
 }
 
 /*
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 6f35db4..de0855a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -978,6 +978,26 @@ evaluateExpr(CState *st, PgBenchExpr *expr, int64 *retval)
 return false;
 			}
 
+		case ENODE_FUNCTION:
+			{
+				switch (expr->u.function.function)
+				{
+					case PGBENCH_ABS:
+					{
+						int64		arg1;
+
+						if (!evaluateExpr(st, expr->u.function.arg1, &arg1))
+							return false;
+
+						*retval = arg1 > 0 ? arg1 : -arg1;
+						return true;
+					}
+					default:
+						fprintf(stderr, "bad function (%d)\n",
+								expr->u.function.function);
+						return false;
+				}
+			}
+
 		default:
 			break;
 	}
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 42e2aae..356353e 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -15,11 +15,18 @@ typedef enum PgBenchExprType
 {
 	ENODE_INTEGER_CONSTANT,
 	ENODE_VARIABLE,
-	ENODE_OPERATOR
+	ENODE_OPERATOR,
+	ENODE_FUNCTION
 } PgBenchExprType;
 
 typedef struct PgBenchExpr PgBenchExpr;
 
+typedef enum PgBenchFunction
+{
+	PGBENCH_NONE,
+	PGBENCH_ABS
+} PgBenchFunction;
+
 struct PgBenchExpr
 {
 	PgBenchExprType etype;
@@ -39,6 +46,11 @@ struct PgBenchExpr
 			PgBenchExpr *lexpr;
 			PgBenchExpr *rexpr;
 		}			operator;
+		struct
+		{
+			PgBenchFunction function;
+			PgBenchExpr *arg1;
+		}