[HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Pavel Stehule
Hello

Assert is usually implemented as a custom function and invoked via a PERFORM
statement today:

-- usual current solution
PERFORM Assert(some expression)

I would like to implement Assert as a plpgsql internal statement, because that
gives bigger possibilities for designing the syntax and internal
implementation, now and in the future. Moreover, a plpgsql statement is
compatible with any current code, because there is no collision between the
SQL and PL/pgSQL namespaces. So this design doesn't break any current code.

I propose following syntax with following ecosystem:

ASSERT [ NOTICE, WARNING, EXCEPTION ]
  [ string expression or literal - explicit message ]
  [ USING clause - same as RAISE stmt (possible in future) ]
  ( ROW_COUNT ( = | <> ) ( 1 | 0 ) |
  ( QUERY some query should not be empty ) |
  ( CHECK some expression should be true ) |
  ( IS NOT NULL expression should not be null ) )

Every variant (ROW_COUNT, QUERY, CHECK, IS NOT NULL) has its own default message.

These asserts can be controlled by a set of options (by default, asserts are
enabled):

#option asserts_disable
#option asserts_disable_notice .. don't check "thin" (NOTICE-level) asserts
#option asserts_not_stop .. raise a warning instead of an exception
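The proposal does not spell out exactly how these options combine with an
assert's level, so the following is only a toy model of one plausible reading
(the option names are from the proposal; the combination rules are my own
guess, not part of the patch):

```python
# Toy model (not PostgreSQL code) of how the proposed per-function options
# might interact with an assert's level.
def effective_action(level, options):
    """level: 'notice' | 'warning' | 'exception'
    options: set of #option pragma names in effect for the function."""
    if "asserts_disable" in options:
        return "skip"                       # all asserts compiled out
    if level == "notice" and "asserts_disable_notice" in options:
        return "skip"                       # "thin" asserts skipped
    if level == "exception" and "asserts_not_stop" in options:
        return "warning"                    # downgraded; execution continues
    return level

assert effective_action("exception", set()) == "exception"
assert effective_action("exception", {"asserts_not_stop"}) == "warning"
```

Under this reading, asserts_not_stop only affects EXCEPTION-level asserts,
while asserts_disable wins over everything else.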

some examples:

UPDATE tab SET c = 1 WHERE pk = somevar;
ASSERT ROW_COUNT = 1; -- applies to the previous DML or dynamic DML statement

ASSERT CHECK a < 100;

ASSERT IS NOT NULL pk;

ASSERT QUERY SELECT id FROM tab WHERE x = 1;

ASSERT CHECK 2 = (SELECT count(*) FROM tab WHERE x = 1);

ASSERT WARNING 'data are there' QUERY SELECT ...

A shorter variant should work too:

CREATE OR REPLACE FUNCTION assert(boolean)
RETURNS void AS $$
BEGIN
  ASSERT CHECK $1;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION assert(boolean, text)
RETURNS void AS $$
BEGIN
  ASSERT $2 CHECK $1;
END;
$$ LANGUAGE plpgsql;

Usage:

PERFORM assert(a > 10);
PERFORM assert(a > 10, 'a should be 10');

Comments, notices?

Regards

Pavel

This design should not break any current solution, it allows static
analysis, and it doesn't close the door on future enhancement.


Re: [HACKERS] Re: proposal: ignore null fields in not relation type composite type based constructors

2014-09-05 Thread Jeevan Chalke
On Thu, Sep 4, 2014 at 11:41 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 I am sory

 too much patches


:)

Patch looks good to me.
Marking Ready for Committer.

Thanks



 Regards

 Pavel



-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: proposal: ignore null fields in not relation type composite type based constructors

2014-09-05 Thread Pavel Stehule
Thank you

Regards

Pavel


2014-09-05 8:29 GMT+02:00 Jeevan Chalke jeevan.cha...@enterprisedb.com:




 On Thu, Sep 4, 2014 at 11:41 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:

 I am sory

 too much patches


 :)

 Patch looks good to me.
 Marking Ready for Committer.

 Thanks



 Regards

 Pavel



 --
 Jeevan B Chalke
 Principal Software Engineer, Product Development
 EnterpriseDB Corporation
 The Enterprise PostgreSQL Company




Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-05 Thread Kyotaro HORIGUCHI
Hello,

  - This patch introduced redundant socket emulation for the win32
 backend, but the win32 bare socket for Port is already nonblocking
 as described, so it doesn't seem to be a serious performance
 problem. In addition, since I don't know the reason why
 win32/socket.c provides the blocking-mode socket emulation, I
 decided to preserve win32/socket.c's blocking socket
 emulation. Possibly it can be removed.
 
 On Windows, the backend has an emulation layer for POSIX signals,
 which uses threads and Windows events. The reason win32/socket.c
 always uses non-blocking mode internally is that it needs to wait for
 the socket to become readable/writeable, and for the signal-emulation
 event, at the same time. So no, we can't remove it.

I see, thank you.

 The approach taken in the first patch seems sensible. I changed it to
 not use FD_SET, though. A custom array seems better, that way we don't
 need the pgwin32_nonblockset_init() call, we can just use initialize
 the variable. It's a little bit more code, but it's well-contained in
 win32/socket.c. Please take a look, to double-check that I didn't
 screw up.

Thank you. I felt some qualms about abusing fd_set. A bit more code
is not a problem.

I had a close look at your patch.

Both 'nonblocking' and 'noblock' appear in the function names
pgwin32_set_socket_block/noblock/is_nonblocking(). I would prefer a
nonblocking/blocking pair, but I'm satisfied as long as they are in a
uniform style. (Though I didn't do so either ;p)

pgwin32_set_socket_block() leaves garbage in
nonblocking_sockets[], but that is no problem in practice. You also
removed the code that switches a socket back to blocking mode, but I agree
that is correct, because the fds of nonclosed sockets won't be reused anyway.
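For readers following along without the patch, the bookkeeping being discussed
can be modeled in a few lines. This is a Python sketch of the idea only; the
real implementation is C in the win32 port, and the names here are invented
for illustration:

```python
# Track which sockets were switched to non-blocking mode using a plain
# array instead of an fd_set, so no special initialization call is needed.
nonblocking_sockets = []   # plain list; starts empty with no init function

def set_socket_nonblocking(fd):
    """Remember that fd was switched to non-blocking mode."""
    if fd not in nonblocking_sockets:
        nonblocking_sockets.append(fd)

def set_socket_blocking(fd):
    """Forget fd.  Stale entries would be harmless as long as a closed
    fd is not reused while still listed -- the point made above."""
    if fd in nonblocking_sockets:
        nonblocking_sockets.remove(fd)

def socket_is_nonblocking(fd):
    return fd in nonblocking_sockets
```

The design choice being discussed is exactly this: a small custom container is
self-initializing, whereas an fd_set would need an explicit init call.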

pg_set_block/noblock() made me laugh. Yes, you're correct. Sorry
for the broken (but workable) code.

After all, the patch looks pretty good.
I'll continue fitting the other patch onto this.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ODBC Driver performance comparison

2014-09-05 Thread Heikki Linnakangas
I just replied to this on pgsql-odbc mailing list. Since we're talking 
about client performance, please keep the discussion there.


On 09/05/2014 08:54 AM, Vladimir Romanov wrote:

Hello all!
I did some tests with the ODBC drivers for PostgreSQL, TimesTen & MySQL,
comparing performance on a very simple request. The database is always located
on the same PC as the test application. Test PC: Lenovo T500, CentOS 6.5
64-bit, 8 GB RAM, SSD.
I found that the PostgreSQL ODBC driver is the slowest in the comparison.
IMHO the problems are related to the protocol used. I can't use an SHM
connection to the server, or even a UNIX socket.

perftool report - http://freepcrf.com/files/db_test_pq.pdf
chart 1 (w/o timesten)  - http://freepcrf.com/files/drv_comp1.png
chart 2 - http://freepcrf.com/files/drv_comp2.png
test source - https://github.com/vvromanov/db_test




- Heikki





Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK

2014-09-05 Thread Joel Jacobson
 On 3 sep 2014, at 16:20, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Sep 1, 2014 at 5:08 AM, Joel Jacobson j...@trustly.com wrote:
 On Sat, Jul 26, 2014 at 8:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Basically my point is that this just seems like inventing another way to
 do what one can already do with RAISE, and it doesn't have much redeeming
 social value to justify the cognitive load of inventing another construct.

 The main difference is that with RAISE EXCEPTION 'OK'; you cannot know
 whether it was *your* line of code which threw the 'OK' exception or
 whether it came from some other function called in the same block of code.

 The real problem here is that if you're using PL/pgsql exceptions for
 control-flow reasons, you are taking a huge performance hit for that
 notational convenience.  I do agree that the syntax of PL/pgsql is
 clunky and maybe we should fix that anyway, but I honestly can't
 imagine too many people actually wanting to do this once they realize
 what it does to the run time of their procedure (and in some cases,
 the XID-consumption rate of their database).

Exceptions in plpgsql are indeed an exception themselves :-)

There are a few use cases where they are crucial, though. I would say I
use this code pattern in 0.1% of all functions, but still, when I need
it, it gets ugly.

Glad to hear you might consider the idea of fixing this.


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




Re: [HACKERS] Allowing implicit 'text' -> xml|json|jsonb (was: PL/pgSQL 2)

2014-09-05 Thread Craig Ringer
On 09/05/2014 12:04 AM, Pavel Stehule wrote:
 
 But some missing casts are good - I found a lot of performance issues based
 on using the wrong data types - integers and dates stored in text columns.

Of course! That's why the default implicit casts were removed, and for
good reason. I'm only talking about a narrow class of a few specific types.

I think maybe a _few_ types need to be implicitly convertible from text,
but that's about it.

text -> jsonb
text -> json
text -> xml
text -> hstore
text -> uuid
text -> (user defined enum)

... mainly because otherwise app devs need frustrating workarounds (or
giving up on the PostgreSQL native types and just using 'text' columns),
and because in all these cases PostgreSQL will validate the input.
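The "PostgreSQL will validate the input" point is the crux of the argument:
such a cast cannot silently accept garbage, because the target type's input
function rejects malformed text. A rough model of that validate-on-cast
behavior, with Python's json module standing in for jsonb's input parsing
(the function name is invented for illustration):

```python
import json

# Model of "the target type validates the input": a text value is only
# accepted where json is expected if it actually parses, mirroring how
# jsonb's input function rejects malformed text.
def accept_text_as_json(text):
    try:
        json.loads(text)   # stands in for jsonb's input parsing
    except ValueError:
        # PostgreSQL would raise: invalid input syntax for type json
        raise ValueError("invalid input syntax for type json") from None
    return text
```

This is why an implicit text cast to these particular types is less dangerous
than, say, an implicit text-to-integer cast applied in a comparison: the
conversion either yields a valid value or fails loudly.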

I've raised this before in other threads:

http://www.postgresql.org/message-id/edda5c6d-77e3-4c56-b33b-277e7fb32...@hub.org

http://www.postgresql.org/message-id/cactajfz8+hg_kom6qivba94kx9l3xuqz99rdushbfksb1mo...@mail.gmail.com

... even from ages ago:

http://www.postgresql.org/message-id/4cfdaee0.10...@postnewspapers.com.au




It's easy to object to this on type-purist grounds, but from a pragmatic
real-user's point of view what we currently do is outright painful, and
unless we can go and fix every language binding, every query generator,
every ORM, etc. to handle things just how PostgreSQL expects, some
compromise may be warranted.

It's easy to dismiss the problem by saying "pass 'unknown'-typed
literals via your language binding". That even works if you're willing
to jump through some hoops and are using raw JDBC. Good luck doing that
via EclipseLink, Hibernate, ActiveRecord, SQLAlchemy, MyBatis, Django
ORM, or any of the things people use to talk to PostgreSQL on a day to
day basis though.

Right now it's really painful to use some of PostgreSQL's best features
without hacking around the type system by manually creating implicit
casts. Another option is to work around it by completely removing the
benefit of the strict casting even when it's obviously right (e.g.
refusing to cast text to date) with the JDBC connection parameter
stringtype=unknown .

I'd like to get rid of the need for users to add possibly-buggy custom
casts or bypass type checking of text types, by relaxing the casts where
appropriate.


Here's a partial collection of real world user complaints I've seen
about this issue, in addition to the links above.

http://stackoverflow.com/q/20339580/398670
http://stackoverflow.com/q/15974474/398670
http://stackoverflow.com/q/17310219/398670
http://stackoverflow.com/q/14858783/398670

Here's an example of someone working around it by passing all strings as
'unknown':

http://stackoverflow.com/q/12050945/398670

A workaround someone had to do with an ETL tool:

http://stackoverflow.com/q/24038287/398670

For uuid:

http://stackoverflow.com/q/13346089/398670

Someone trying to handle it portably:

http://stackoverflow.com/q/22242630/398670

The kind of work you need to work around PostgreSQL's strictness with enums:

http://stackoverflow.com/q/7603500/398670
http://stackoverflow.com/q/851758/398670
http://stackoverflow.com/q/10898369/398670
http://stackoverflow.com/q/14884955/398670
http://stackoverflow.com/q/10898369/398670



... and that's just what I can find in a few minutes' searching on one site.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Patch for psql History Display on MacOSX

2014-09-05 Thread Noah Misch
On Thu, Sep 04, 2014 at 09:54:37AM -0400, Robert Haas wrote:
 One point to note is that not back-patching this doesn't really fix
 anything.  Will a user be annoyed when .psql_history fails to reload
 properly on a new minor release, but utterly indifferent to whether it
 reloads in a new major release?

Users won't be utterly indifferent, but they will be less alarmed.  We
frequently use a major-version debut for bug fixes posing compatibility
hazards.  About half the items listed in the Migration to Version 9.4
release notes section fit that description.

 What if they run multiple major
 releases of PostgreSQL on the same machine, using the psql executable
 for each version when talking to that version?  (Yeah, I know it's
 backward compatible, but not everyone may realize that, or care.)

Sure.  Had I authored the patch, I probably would have withdrawn it pending
development of a thorough plan for minimizing these problems.  I don't care to
sell that level of conservatism to the rest of you.  If Tom is unconcerned
about these problems and wants to move forward with the current patch for 9.5,
that works for me.

 Given that, if we're going to do it this way at all, I favor
 back-patching: at least then the newest releases of all supported
 branches will be compatible with each other.

That's a fair point.  A back-patch is better for hackers, who occasionally run
each supported branch but rarely run outdated back-branch code.  (When I built
PostgreSQL on OS X, I used GNU readline.  I suppose some hackers do use
libedit, though.)  Not sure about ordinary users, though.

 But I'm still fuzzy on
 why we need to give up the ability to read the old format in the first
 place.  Can't we just fix that and be done with this?

Sort of.  I see no free-lunch fix, but there are alternatives to the current
proposed patch that resolve the compromises differently.




Re: [HACKERS] Patch for psql History Display on MacOSX

2014-09-05 Thread Noah Misch
On Thu, Sep 04, 2014 at 07:51:03AM -0700, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  I tried your patches against libedit-28.  Wherever a command contains a
  newline, unpatched psql writes the three bytes \^A to the history file,
  and patched psql writes the four bytes \012.  Unpatched psql correctly reads
  either form of the history file.  Patched psql misinterprets a history file
  created by unpatched psql, placing 0x01 bytes in the recalled command where
  it should have newlines.  That's a worrisome compatibility break.
 
 I think you got the test cases backwards, or maybe neglected the aspect
 about how unpatched psql will only translate ^J to ^A in the oldest
 (or maybe the newest? too pressed for time to recheck right now) history
 entry.

I, too, had more-productive uses for this time, but morbid curiosity
prevailed.  It was the latter: I was testing a one-command history file.
Under libedit-28, unpatched psql writes ^A for newlines in the oldest
command and \012 for newlines in subsequent commands.  Patched psql writes
\012 for newlines in the oldest command and ^A for newlines in subsequent
commands.  (Surely a comment is in order if that's intentional.  Wasn't the
point to discontinue making the oldest command a special case?)  Here's the
matrix of behaviors when recalling history under libedit-28:

  master using master-written history:
oldest command: ok
rest: ok
  patched using master-written history:
oldest command: broken if it contained a newline
rest: ok
  master using patched-written history
oldest command: ok
rest: each broken if it contained a newline
  patched using patched-written history
oldest command: ok
rest: ok
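The incompatibility in that matrix can be reproduced in miniature. This is a
toy model only (the per-entry special-casing and exact escape handling of
libedit are simplified away): one vintage stores newlines in a history entry
as a literal 0x01 byte ("^A"), the other as the escape sequence \012, and a
reader that understands only one convention corrupts entries written under
the other:

```python
# Two incompatible on-disk conventions for newlines inside one history entry.
def write_caret_a(cmd):        # old-style writer: newline -> literal 0x01
    return cmd.replace("\n", "\x01")

def write_octal(cmd):          # new-style writer: newline -> "\012" escape
    return cmd.replace("\n", "\\012")

def read_caret_a(entry):       # old-style reader: only understands 0x01
    return entry.replace("\x01", "\n")

cmd = "SELECT 1,\n2;"
# Matching vintages round-trip cleanly:
assert read_caret_a(write_caret_a(cmd)) == cmd
# Mismatched vintages leave the other convention's marker in the command:
assert read_caret_a(write_octal(cmd)) != cmd
```

A reader built for the 0x01 convention hands back the literal `\012` text in
the recalled command, which is exactly the kind of corruption the matrix
reports for mixed-version history files.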

Corrupting just the oldest history entry, only when it happens to contain a
newline, is acceptable.  If one assumes that users who deploy multiple major
releases use a consistent vintage of minor release, the compatibility problems
after back-patching this change are negligible.  That assumption has moderate
credibility.

 We do not escape a problem by refusing to fix it.

I have not recommended a general refusal of fixes for this bug.




Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Marko Tiikkaja

On 2014-09-05 08:16, Pavel Stehule wrote:

Assert is usually implemented as custom functions and used via PERFORM
statement now

-- usual current solution
PERFORM Assert(some expression)

I would to implement Assert as plpgsql internal statement due bigger
possibilities to design syntax and internal implementation now and in
future. More - as plpgsql statement should be compatible with any current
code - because there should not be collision between SQL and PLpgSQL space.
So this design doesn't break any current code.


It does require making ASSERT an unreserved keyword, no?  That would 
break code where someone used assert as a variable name, for example.



I propose following syntax with following ecosystem:

ASSERT [ NOTICE, WARNING, EXCEPTION ]
   [ string expression or literal - explicit message ]
   [ USING clause - same as RAISE stmt (possible in future ) ]
   ( ROW_COUNT ( = | <> ) ( 1 | 0 ) |
   ( QUERY some query should not be empty ) |
   ( CHECK some expression should be true )
   ( IS NOT NULL expression should not be null )



UPDATE tab SET c = 1 WHERE pk = somevar;
ASSERT ROW_COUNT = 1; -- knows what is previous DML or Dynamic DML
ASSERT CHECK a < 100;
ASSERT IS NOT NULL pk;
ASSERT QUERY SELECT id FROM tab WHERE x = 1;
ASSERT CHECK 2 = (SELECT count(*) FROM tab WHERE x = 1);


I don't see the need for specialized syntax.  If the syntax was just 
ASSERT (expr), these could be written as:


ASSERT (row_count = 1); -- assuming we provide a special variable 
instead of having to do GET DIAGNOSTICS
ASSERT (a < 100);  -- or perhaps ASSERT((a < 100) IS TRUE); depending on 
how NULLs are handled

ASSERT (pk IS NOT NULL);
ASSERT (EXISTS(SELECT id FROM tab WHERE x = 1));
ASSERT (2 = (SELECT count(*) FROM tab WHERE x = 1));

the idea being that it gets turned into  SELECT expr;  and then evaluated.
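The "depending on how NULLs are handled" caveat is the interesting part: under
SQL's three-valued logic a comparison against NULL yields NULL, so a bare
ASSERT (expr) must decide whether NULL passes or fails. A small model, with
Python's None standing in for NULL; which of the two rules the real statement
should adopt is exactly the open question here:

```python
# None stands in for SQL NULL; sql_gt models "a > 100" under SQL's
# three-valued logic.
def sql_gt(a, b):
    if a is None or b is None:
        return None            # any comparison with NULL yields NULL
    return a > b

def passes_bare(cond):
    """One possible bare-ASSERT rule: only FALSE fails, so NULL slips by."""
    return cond is not False

def passes_is_true(cond):
    """ASSERT ((expr) IS TRUE): only a genuine TRUE passes."""
    return cond is True

assert passes_bare(sql_gt(None, 100))            # NULL passes silently
assert not passes_is_true(sql_gt(None, 100))     # NULL fails the assert
```

Wrapping the condition in IS TRUE makes an unexpectedly-NULL expression fail
the assertion instead of sneaking past it, which is usually what one wants
from an assert.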


ASSERT WARNING data are there QUERY SELECT ...


I think this could still be parsed correctly, though I'm not 100% sure 
on that:


ASSERT WARNING (EXISTS(SELECT ..)), 'data are there';

For extra points the error detail could work similarly to 
print_strict_params.  e.g.  ASSERT(row_count = 1);  would display the 
value of row_count in the DETAIL line, since row_count was a parameter 
to the expression.




.marko




Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Pavel Stehule
2014-09-05 9:52 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 2014-09-05 08:16, Pavel Stehule wrote:

 Assert is usually implemented as custom functions and used via PERFORM
 statement now

 -- usual current solution
 PERFORM Assert(some expression)

 I would to implement Assert as plpgsql internal statement due bigger
 possibilities to design syntax and internal implementation now and in
 future. More - as plpgsql statement should be compatible with any current
 code - because there should not be collision between SQL and PLpgSQL
 space.
 So this design doesn't break any current code.


 It does require making ASSERT an unreserved keyword, no?  That would break
 code where someone used assert as a variable name, for example.


sure, sorry




  I propose following syntax with following ecosystem:

 ASSERT [ NOTICE, WARNING, EXCEPTION ]
[ string expression or literal - explicit message ]
[ USING clause - same as RAISE stmt (possible in future ) ]
( ROW_COUNT ( = | <> ) ( 1 | 0 ) |
( QUERY some query should not be empty ) |
( CHECK some expression should be true )
( IS NOT NULL expression should not be null )


  UPDATE tab SET c = 1 WHERE pk = somevar;
 ASSERT ROW_COUNT = 1; -- knows what is previous DML or Dynamic DML
 ASSERT CHECK a < 100;
 ASSERT IS NOT NULL pk;
 ASSERT QUERY SELECT id FROM tab WHERE x = 1;
 ASSERT CHECK 2 = (SELECT count(*) FROM tab WHERE x = 1);


 I don't see the need for specialized syntax.  If the syntax was just
 ASSERT (expr), these could be written as:

 ASSERT (row_count = 1); -- assuming we provide a special variable instead
 of having to do GET DIAGNOSTICS
 ASSERT (a < 100);  -- or perhaps ASSERT((a < 100) IS TRUE); depending on
 how NULLs are handled
 ASSERT (pk IS NOT NULL);
 ASSERT (EXISTS(SELECT id FROM tab WHERE x = 1));
 ASSERT (2 = (SELECT count(*) FROM tab WHERE x = 1));


I disagree. Your design is an expression-based design, with the following
disadvantages:

a) there is only one possible default message -- "Assertion failure"

b) there is no possibility to show the statement for ASSERT ROW_COUNT

c) any static analysis is blocked, because there is no clean semantics

d) in the PL/pgSQL language the syntax STATEMENT '(' expression ')' is new --
there is nothing like it yet -- it has been under discussion since yesterday --
I am still speaking about plpgsql -- I don't want to refactor the plpgsql
parser.

e) for your proposal we don't need anything special -- you can do it as a
custom function, so there is no sense in defining it. At most it could live as
an extension in some shared repository.


 the idea being that it gets turned into  SELECT expr;  and then
 evaluated.

  ASSERT WARNING data are there QUERY SELECT ...


 I think this could still be parsed correctly, though I'm not 100% sure on
 that:

 ASSERT WARNING (EXISTS(SELECT ..)), 'data are there';


PL/pgSQL uses ';' or some plpgsql keyword as the SQL statement delimiter.
That is the reason for RETURN QUERY ... ';'. So in this case it is practical
to place the SQL statement at the end of the plpgsql statement.

Parentheses are not practical, because it is hard to identify bugs.

The simplicity of integrating SQL and PL/pgSQL lies in using smart keywords -
it is more verbose, but it allows good diagnostics.



 For extra points the error detail could work similarly to
 print_strict_params.  e.g.  ASSERT(row_count = 1);  would display the value
 of row_count in the DETAIL line, since row_count was a parameter to the
 expression.



 .marko



Re: [HACKERS] postgresql latency bgwriter not doing its job

2014-09-05 Thread didier
hi

On Thu, Sep 4, 2014 at 7:01 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Sep 4, 2014 at 3:09 AM, Ants Aasma a...@cybertec.at wrote:
 On Thu, Sep 4, 2014 at 12:36 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
 It's imo quite clearly better to keep it allocated. For one after
 postmaster started the checkpointer successfully you don't need to be
 worried about later failures to allocate memory if you allocate it once
 (unless the checkpointer FATALs out which should be exceedingly rare -
 we're catching ERRORs). It's much much more likely to succeed
 initially. Secondly it's not like there's really that much time where no
 checkpointer isn't running.

 In principle you could do the sort with the full sized array and then
 compress it to a list of buffer IDs that need to be written out. This
 way most of the time you only need a small array and the large array
 is only needed for a fraction of a second.

 It's not the size of the array that's the problem; it's the size of
 the detonation when the allocation fails.

You can use a file-backed memory array.
Or, because it's only a hint, and:
- the keys are in the buffers (BufferTag), right?
- the only transition is from 'data I care about' to 'data I don't care
about' if a buffer is concurrently evicted while sorting,

you can use a preallocated buffer-index array and read the keys from the
buffers while sorting, without memory barriers, spinlocks, or anything else.
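The sort-then-compress idea quoted upthread (sort using the full-sized array,
then keep only the buffer IDs that actually need writing, so the big array is
needed only briefly) can be sketched as follows. This is a toy model only:
buffer tags are reduced to a single integer sort key standing in for the
on-disk location:

```python
# "Sort with the full-sized array, then compress to a list of buffer IDs":
# the large working array is needed only for the sort itself; what the
# checkpointer keeps afterwards is the small compacted list.
def checkpoint_write_list(buffers):
    """buffers: list of (buffer_id, sort_key, needs_write) tuples, where
    sort_key stands in for the buffer's on-disk location.
    Returns the IDs of buffers to write, in sorted (write) order."""
    full = sorted(buffers, key=lambda b: b[1])            # full-sized sort
    return [bid for bid, _key, dirty in full if dirty]    # compressed list

bufs = [(0, 42, True), (1, 7, False), (2, 13, True), (3, 5, True)]
assert checkpoint_write_list(bufs) == [3, 2, 0]
```

The payoff is that the large allocation's lifetime shrinks to a fraction of a
second, though, as noted above, the real concern is what happens if that
allocation fails at all.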




Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-05 Thread Kyotaro HORIGUCHI
Hello, attached is a new and far simpler version of this
patch. It no longer depends on the win32 nonblocking patch, since the
socket is in blocking mode again.

 On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote:
  - Preventing protocol violation.
 
 To prevent protocol violation, secure_write sets
 ClientConnectionLost when SIGTERM detected, then
 internal_flush() and ProcessInterrupts() follow the
 instruction.
 
 Oh, hang on. Now that I look at pqcomm.c more closely, it already has
 a mechanism to avoid writing a message in the middle of another
 message. See pq_putmessage and PqCommBusy. Can we rely on that?

Hmm, under the current implementation it gracefully returns up to
ExecProcNode(), and PqCommBusy is turned off on the way, in
pq_putmessage(). So PqCommBusy is already false before it runs into
ProcessInterrupts().

Allowing ImmediateInterruptOK when signalled during send(), and setting
whereToSendOutput to DestNone if PqCommBusy is true, will do. We
can also distinguish reads from writes by looking at
DoingCommandRead. The ImmediateInterruptOK section can be delimited
by enclosing it in prepare_for_client_read/client_read_ended.

  - Single pg_terminate_backend surely kills the backend.
 
 secure_raw_write() uses non-blocking socket and a loop of
 select() with timeout to surely detects received
 signal(SIGTERM).
 
 I was going to suggest using WaitLatchOrSocket instead of sleeping in
 1 second increment, but I see that WaitLatchOrSocket() doesn't
 currently support waiting for a socket to become writeable, without
 also waiting for it to become readable. I wonder how difficult it
 would be to lift that restriction.

It seems quite difficult, judging from the discussion that follows.

 I also wonder if it would be simpler to keep the socket in blocking
 mode after all, and just close() in the signal handler if PqCommBusy
 == true. If the signal arrives while sleeping in send(), the effect
 would be the same as with your patch. If the signal arrives while
 sending, but not sleeping, you would not get the chance to send the
 already-buffered data to the client. But maybe that's OK, whether or
 not you're blocked is not very deterministic anyway.

Hmm. We're back around to my first patch, which immediately
closes the socket, and the win32 layer patch has become irrelevant.
Anyway, it sounds reasonable.

The attached patch is a quick hack, but it seems to work as
expected at a glance.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 7fcb6ef2e66231605e49bd51cd09d275b40cfd57 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp
Date: Fri, 5 Sep 2014 17:21:48 +0900
Subject: [PATCH] Simplly cutting off the socket if signalled during sending to client.

---
 src/backend/libpq/be-secure.c |   14 +++---
 src/backend/libpq/pqcomm.c|6 ++
 src/backend/tcop/postgres.c   |   40 +---
 src/include/libpq/libpq.h |1 +
 4 files changed, 39 insertions(+), 22 deletions(-)

diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 41ec1ad..329812b 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -145,11 +145,11 @@ secure_raw_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 
	n = recv(port->sock, ptr, len, 0);
 
-	client_read_ended();
+	client_comm_ended();
 
 	return n;
 }
@@ -178,5 +178,13 @@ secure_write(Port *port, void *ptr, size_t len)
 ssize_t
 secure_raw_write(Port *port, const void *ptr, size_t len)
 {
-	return send(port->sock, ptr, len, 0);
+	ssize_t		n;
+
+	prepare_for_client_comm();
+
+	n = send(port->sock, ptr, len, 0);
+
+	client_comm_ended();
+
+	return n;
 }
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 605d891..8f84f67 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -1342,6 +1342,12 @@ pq_is_send_pending(void)
	return (PqSendStart < PqSendPointer);
 }
 
+bool
+pq_is_busy(void)
+{
+	return PqCommBusy;
+}
+
 /* 
  * Message-level I/O routines begin here.
  *
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7b5480f..7a4c483 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -303,16 +303,16 @@ InteractiveBackend(StringInfo inBuf)
  *
  * Even though we are not reading from a client process, we still want to
  * respond to signals, particularly SIGTERM/SIGQUIT.  Hence we must use
- * prepare_for_client_read and client_read_ended.
+ * prepare_for_client_comm and client_comm_ended.
  */
 static int
 interactive_getc(void)
 {
 	int			c;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 	c = getc(stdin);
-	client_read_ended();
+	client_comm_ended();
 	return c;
 }
 
@@ -487,7 +487,7 @@ ReadCommand(StringInfo inBuf)
 }
 
 /*
- * prepare_for_client_read -- set up to possibly block on client input
+ * prepare_for_client_comm -- set up to possibly block 

Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Marko Tiikkaja

On 9/5/14 10:09 AM, Pavel Stehule wrote:

I think this could still be parsed correctly, though I'm not 100% sure on
that:

ASSERT WARNING (EXISTS(SELECT ..)), 'data are there';



PLpgSQL uses a ';' or some plpgsql keyword as SQL statement delimiter. It
reason why RETURN QUERY ... ';' So in this case can practical to place SQL
statement on the end of plpgsql statement.


*shrug*  There are lots of cases where a comma is used as well, e.g. 
RAISE NOTICE '..', expr, expr;



parenthesis are not practical, because it is hard to identify bug ..


I don't see why.  The PL/PgSQL SQL parser goes to great lengths to 
identify unmatched parentheses.  But the parens probably aren't 
necessary in the first place; you could just omit them and keep parsing 
until the next comma, AFAICT.  So the syntax would be:


RAISE [ NOTICE | WARNING | EXCEPTION/ASSERT/WHATEVER ]
boolean_expr [, error_message [, error_message_param [, ... ] ] ];


A simplicity of integration SQL and PLpgSQL is in using smart keywords -
It is more verbose, and it allow to well diagnostics


I disagree.  The new keywords provide nothing of value here.  They even 
encourage the use of quirky syntax in *exchange* for verbosity (IS NOT 
NULL pk? really?).



.marko




Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-05 Thread Kyotaro HORIGUCHI
Sorry, that patch contains a silly bug. Please find the
attached one.

 Hello, attached is the new-and-far-simple version of this
 patch. It no longer depends on win32 nonblocking patch since the
 socket is in blocking mode again.
 
  On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote:
   - Preventing protocol violation.
  
  To prevent protocol violation, secure_write sets
  ClientConnectionLost when SIGTERM detected, then
  internal_flush() and ProcessInterrupts() follow the
  instruction.
  
  Oh, hang on. Now that I look at pqcomm.c more closely, it already has
  a mechanism to avoid writing a message in the middle of another
  message. See pq_putmessage and PqCommBusy. Can we rely on that?
 
 Hmm, it gracefully returns up to ExecProcNode() and PqCommBusy is
 turned off on the way at pq_putmessage() under current
 implement. So PqCommBusy is already false before it runs into
 ProcessInterrupts().
 
 Allowing ImmediateInterruptOK on signalled during send(), setting
 whereToSendOutput to DestNone if PqCommBusy is true will do. We
 can also distinguish read and write by looking
 DoingCommandRead. The ImmediateInterruptOK section can be defined
 enclosing by prepare_for_client_read/client_read_end.
 
   - Single pg_terminate_backend surely kills the backend.
  
  secure_raw_write() uses non-blocking socket and a loop of
  select() with timeout to surely detects received
  signal(SIGTERM).
  
  I was going to suggest using WaitLatchOrSocket instead of sleeping in
  1 second increment, but I see that WaitLatchOrSocket() doesn't
  currently support waiting for a socket to become writeable, without
  also waiting for it to become readable. I wonder how difficult it
  would be to lift that restriction.
 
 It seems quite difficult, judging from the following discussion.
 
  I also wonder if it would be simpler to keep the socket in blocking
  mode after all, and just close() in the signal handler if PqCommBusy
  == true. If the signal arrives while sleeping in send(), the effect
  would be the same as with your patch. If the signal arrives while
  sending, but not sleeping, you would not get the chance to send the
  already-buffered data to the client. But maybe that's OK, whether or
  not you're blocked is not very deterministic anyway.
 
 Hmm. We're back around to my first patch, which immediately
 closes the socket, and it is no longer related to the win32 layer
 patch. Anyway, it sounds reasonable.
 
 The attached patch is a quick hack, but it seems to work as
 expected at a glance.
From 11da4bc3c214490671d27379910a667f06cc35af Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp
Date: Fri, 5 Sep 2014 17:21:48 +0900
Subject: [PATCH] Simply cutting off the socket if signalled during sending to client.

---
 src/backend/libpq/be-secure.c |   14 --
 src/backend/libpq/pqcomm.c|6 
 src/backend/tcop/postgres.c   |   53 -
 src/include/libpq/libpq.h |1 +
 4 files changed, 44 insertions(+), 30 deletions(-)

diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 41ec1ad..329812b 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -145,11 +145,11 @@ secure_raw_read(Port *port, void *ptr, size_t len)
 {
 	ssize_t		n;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 
 	n = recv(port->sock, ptr, len, 0);
 
-	client_read_ended();
+	client_comm_ended();
 
 	return n;
 }
@@ -178,5 +178,13 @@ secure_write(Port *port, void *ptr, size_t len)
 ssize_t
 secure_raw_write(Port *port, const void *ptr, size_t len)
 {
-	return send(port->sock, ptr, len, 0);
+	ssize_t		n;
+
+	prepare_for_client_comm();
+
+	n = send(port->sock, ptr, len, 0);
+
+	client_comm_ended();
+
+	return n;
 }
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 605d891..8f84f67 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -1342,6 +1342,12 @@ pq_is_send_pending(void)
 	return (PqSendStart < PqSendPointer);
 }
 
+bool
+pq_is_busy(void)
+{
+	return PqCommBusy;
+}
+
 /* 
  * Message-level I/O routines begin here.
  *
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7b5480f..15627c3 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -303,16 +303,16 @@ InteractiveBackend(StringInfo inBuf)
  *
  * Even though we are not reading from a client process, we still want to
  * respond to signals, particularly SIGTERM/SIGQUIT.  Hence we must use
- * prepare_for_client_read and client_read_ended.
+ * prepare_for_client_comm and client_comm_ended.
  */
 static int
 interactive_getc(void)
 {
 	int			c;
 
-	prepare_for_client_read();
+	prepare_for_client_comm();
 	c = getc(stdin);
-	client_read_ended();
+	client_comm_ended();
 	return c;
 }
 
@@ -487,7 +487,7 @@ ReadCommand(StringInfo inBuf)
 }
 
 /*
- * prepare_for_client_read -- set up to possibly block on client 

Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Pavel Stehule
2014-09-05 10:33 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 9/5/14 10:09 AM, Pavel Stehule wrote:

 I think this could still be parsed correctly, though I'm not 100% sure on
 that:

 ASSERT WARNING (EXISTS(SELECT ..)), 'data are there';


 PLpgSQL uses a ';' or some plpgsql keyword as the SQL statement delimiter. That
 is the reason for RETURN QUERY ... ';'. So in this case it can be practical to place the SQL
 statement at the end of the plpgsql statement.


 *shrug*  There are lots of cases where a comma is used as well, e.g. RAISE
 NOTICE '..', expr, expr;

  parentheses are not practical, because it makes bugs hard to identify ..


 I don't see why.  The PL/PgSQL SQL parser goes to great lengths to
 identify unmatched parentheses.  But the parens probably aren't necessary
 in the first place; you could just omit them and keep parsing until the
 next comma AFAICT.  So the syntax would be:

 RAISE [ NOTICE | WARNING | EXCEPTION/ASSERT/WHATEVER ]
 boolean_expr [, error_message [, error_message_param [, ... ] ] ];


extending RAISE with an ASSERT level has minimal new value




  The simplicity of integrating SQL and PL/pgSQL comes from using smart keywords -
 it is more verbose, and it allows better diagnostics


 I disagree.  The new keywords provide nothing of value here.  They even
 encourage the use of quirky syntax in *exchange* for verbosity ("IS NOT
 NULL pk"? really?).


It is about the semantics and consistency of the proposed tools. Sure, everything can be
reduced to ASSERT(expr) .. but where is the benefit over a function call?

I am able to do this as a plpgsql extension without any change to the language - there
is no need to change anything for too thin a proposal




 .marko



Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Marko Tiikkaja

On 9/5/14 10:40 AM, Pavel Stehule wrote:

2014-09-05 10:33 GMT+02:00 Marko Tiikkaja ma...@joh.to:

I don't see why.  The PL/PgSQL SQL parser goes to great lengths to
identify unmatched parentheses.  But the parens probably aren't necessary
in the first place; you could just omit them and keep parsing until the
next comma AFAICT.  So the syntax would be:

RAISE [ NOTICE | WARNING | EXCEPTION/ASSERT/WHATEVER ]
boolean_expr [, error_message [, error_message_param [, ... ] ] ];



extending RAISE with an ASSERT level has minimal new value


Oops.  I meant to type ASSERT there, instead of RAISE.  Does that make 
more sense?



I disagree.  The new keywords provide nothing of value here.  They even
encourage the use of quirky syntax in *exchange* for verbosity ("IS NOT
NULL pk"? really?).



It is about the semantics and consistency of the proposed tools. Sure, everything can be
reduced to ASSERT(expr) .. but where is the benefit over a function call?


I see several benefits:

  1) Standard syntax, available anywhere
  2) Since the RAISE EXCEPTION happens at the caller's site, we can 
provide information not available to an ordinary function, such as the 
values of the parameters passed to it

  3) We can make the exception uncatchable
  4) ASSERTs can be easily disabled (if we choose to allow that), even 
per-function



I am able to do this as a plpgsql extension without any change to the language - there
is no need to change anything for too thin a proposal


What *exactly* about my proposal is too thin?  What does your thing do 
that mine doesn't?  If you're saying your suggestion allows us to give a 
better error message, I disagree:



  ( ROW_COUNT ( = | <> ) ( 1 | 0 ) |

I've already addressed this: we can print the parameters and their 
values automatically, so printing the row count here doesn't give any 
additional value.


  ( QUERY some query should not be empty ) |

With this definition, absolutely zero value over ASSERT EXISTS(..);

  ( CHECK some expression should be true )

No additional value; it's either NULL, FALSE or TRUE and both syntaxes 
can display what the expression evaluated to.


  ( IS NOT NULL expression should not be null )

It's either NULL or it isn't.  No additional value.



.marko




Re: [HACKERS] Allowing implicit 'text' -> xml|json|jsonb

2014-09-05 Thread Marko Tiikkaja

On 9/5/14 9:04 AM, Craig Ringer wrote:

It's easy to object to this on type-purist grounds, but from a pragmatic
real-users point of view what we currently do is outright painful, and
unless we can go and fix every language binding, every query generator,
every ORM, etc to handle things just how PostgreSQL expects, some
compromise may be warranted.

It's easy to dismiss the problem by saying "pass 'unknown' typed
literals via your language binding". That even works if you're willing
to jump through some hoops and are using raw JDBC. Good luck doing that
via EclipseLink, Hibernate, ActiveRecord, SQLAlchemy, MyBatis, Django
ORM, or any of the things people use to talk to PostgreSQL on a day to
day basis though.

Right now it's really painful to use some of PostgreSQL's best features
without hacking around the type system by manually creating implicit
casts. Another option is to work around it by completely removing the
benefit of the strict casting even when it's obviously right (e.g.
refusing to cast text to date) with the JDBC connection parameter
stringtype=unknown .

I'd like to get rid of the need for users to add possibly-buggy custom
casts or bypass type checking of text types, by relaxing the casts where
appropriate.


I really don't like the idea of relaxing casts.  And I really object to 
the notion of casting from text to date being obviously right.


The problem here seems to be only related to mistyped parameters.  Can 
we contain the damage to that part only somehow?  Or make this optional 
(defaulting to off, I hope)?




.marko




Re: [HACKERS] Allowing implicit 'text' -> xml|json|jsonb

2014-09-05 Thread Craig Ringer
On 09/05/2014 05:04 PM, Marko Tiikkaja wrote:
 
 I really don't like the idea of relaxing casts.  And I really object to
 the notion of casting from text to date being obviously right.

Gah. It's obviously right to *reject* implicit conversions like
text->date. I specifically do _not_ want to add such a conversion, and
gave a list of types for which I think conversions from text are
appropriate.

 The problem here seems to be only related to mistyped parameters.  Can
 we contain the damage to that part only somehow?  Or make this optional
 (defaulting to off, I hope)?

I'd love to make it affect only parameters, actually, for v3 protocol
bind/parse/execute. That would be ideal.

Right now the main workaround is to send all string-typed parameters as
'unknown'-typed, but that causes a mess with function overload
resolution, and it's wrong most of the time when the parameter really is
just text.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Pavel Stehule
2014-09-05 10:57 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 9/5/14 10:40 AM, Pavel Stehule wrote:

 2014-09-05 10:33 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 I don't see why.  The PL/PgSQL SQL parser goes to great lengths to
 identify unmatched parentheses.  But the parens probably aren't necessary
 in the first place; you could just omit them and keep parsing until the
 next comma AFAICT.  So the syntax would be:

 RAISE [ NOTICE | WARNING | EXCEPTION/ASSERT/WHATEVER ]
 boolean_expr [, error_message [, error_message_param [, ... ] ] ];


 extending RAISE with an ASSERT level has minimal new value


 Oops.  I meant to type ASSERT there, instead of RAISE.  Does that make
 more sense?


for me, the basic limitation is the boolean_expr



  I disagree.  The new keywords provide nothing of value here.  They even
 encourage the use of quirky syntax in *exchange* for verbosity ("IS NOT
 NULL pk"? really?).


 It is about the semantics and consistency of the proposed tools. Sure, everything can be
 reduced to ASSERT(expr) .. but where is the benefit over a function call?


 I see several benefits:

   1) Standard syntax, available anywhere
   2) Since the RAISE EXCEPTION happens at the caller's site, we can
 provide information not available to an ordinary function, such as the
 values of the parameters passed to it
   3) We can make the exception uncatchable
   4) ASSERTs can be easily disabled (if we choose to allow that), even
 per-function


All points can be solved by an extension on PGXN .. and your proposed syntax
can be better processed at the Postgres level than at the PL/pgSQL level.



  I am able to do this as a plpgsql extension without any change to the language -
 there
 is no need to change anything for too thin a proposal


 What *exactly* about my proposal is too thin?  What does your thing do
 that mine doesn't?  If you're saying your suggestion allows us to give a
 better error message, I disagree:


Too thin means there is no possibility of extensibility, no
possibility to define defaults, no possibility to use it for static analysis.




   ( ROW_COUNT ( = | <> ) ( 1 | 0 ) |

 I've already addressed this: we can print the parameters and their values
 automatically, so printing the row count here doesn't give any additional
 value.


In this case, I can use the plpgsql parser for static analysis - I can emit a
warning if the related query does not filter on the PK, and similar.



   ( QUERY some query should not be empty ) |

 With this definition, absolutely zero value over ASSERT EXISTS(..);

   ( CHECK some expression should be true )

 No additional value; it's either NULL, FALSE or TRUE and both syntaxes can
 display what the expression evaluated to.

   ( IS NOT NULL expression should not be null )

 It's either NULL or it isn't.  No additional value.



 .marko



Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Marko Tiikkaja

On 9/5/14 11:08 AM, Pavel Stehule wrote:

2014-09-05 10:57 GMT+02:00 Marko Tiikkaja ma...@joh.to:

Oops.  I meant to type ASSERT there, instead of RAISE.  Does that make
more sense?


for me, the basic limitation is the boolean_expr


I don't understand what you mean by this.


It is about the semantics and consistency of the proposed tools. Sure, everything can be
reduced to ASSERT(expr) .. but where is the benefit over a function call?



I see several benefits:

   1) Standard syntax, available anywhere
   2) Since the RAISE EXCEPTION happens at the caller's site, we can
provide information not available to an ordinary function, such as the
values of the parameters passed to it
   3) We can make the exception uncatchable
   4) ASSERTs can be easily disabled (if we choose to allow that), even
per-function



All points can be solved by an extension on PGXN ..


#3 probably.  Being able to satisfy #1 through PGXN is ridiculous.  #2 
and #4 would require at least hooking into PL/PgSQL, which might be 
possible, but it's intrusive and honestly feels fragile.


But perhaps more to the point, why would that criticism apply to my 
suggestion, but not yours?  Why don't we just reject any ASSERT syntax?



and your proposed syntax
can be better processed at the Postgres level than at the PL/pgSQL level.


*shrug*  Doing it in SQL would probably break more stuff.  I'm trying to 
contain the damage.  And arguably, this is mostly only useful in PL/PgSQL.



  I am able to do this as a plpgsql extension without any change to the language -

there
is no need to change anything for too thin a proposal



What *exactly* about my proposal is too thin?  What does your thing do
that mine doesn't?  If you're saying your suggestion allows us to give a
better error message, I disagree:



Too thin means there is no possibility of extensibility, no
possibility to define defaults, no possibility to use it for static analysis.


Again, why does this criticism only apply to my suggestion and not yours?


   ( ROW_COUNT ( = | <> ) ( 1 | 0 ) |

I've already addressed this: we can print the parameters and their values
automatically, so printing the row count here doesn't give any additional
value.



In this case, I can use the plpgsql parser for static analysis - I can emit a
warning if the related query does not filter on the PK, and similar.


You can do that anyway, you just need to work a bit harder.  I don't see 
the problem, really.



.marko




Re: [HACKERS] Join push-down support for foreign tables

2014-09-05 Thread Atri Sharma
On Fri, Sep 5, 2014 at 2:20 AM, Robert Haas robertmh...@gmail.com wrote:

 On Thu, Sep 4, 2014 at 11:56 AM, Bruce Momjian br...@momjian.us wrote:
  I am thinking eventually we will need to cache the foreign server
  statistics on the local server.
 
  Wouldn't that lead to issues where the statistics get outdated and we
 have to
  anyway query the foreign server before planning any joins? Or are you
 thinking
  of dropping the foreign table statistics once the foreign join is
 complete?
 
  I am thinking we would eventually have to cache the statistics, then get
  some kind of invalidation message from the foreign server.  I am also
  thinking that cache would have to be global across all backends, I guess
  similar to our invalidation cache.

 Maybe ... but I think this isn't really related to the ostensible
 topic of this thread.  We can do join pushdown just fine without the
 ability to do anything like this.

 I'm in full agreement that we should probably have a way to cache some
 kind of statistics locally, but making that work figures to be tricky,
 because (as I'm pretty sure Tom has pointed out before) there's no
 guarantee that the remote side's statistics look anything like
 PostgreSQL statistics, and so we might not be able to easily store
 them or make sense of them.  But it would be nice to at least have the
 option to store such statistics if they do happen to be something we
 can store and interpret.


I agree that we need local statistics too (full agreement with Bruce's
proposal), but playing the Devil's advocate here and trying to figure out how
things like invalidation and, as you mentioned, cross-compatibility
will work.



 It's also coming to seem to me more and more that we need a way to
 designate several PostgreSQL machines as a cooperating cluster.  This
 would mean they'd keep connections to each other open and notify each
 other about significant events, which could include hey, I updated
 the statistics on this table, you might want to get the new ones or
 hey, i've replicated your definition for function X so it's safe to
 push it down now as well as hey, I have just been promoted to be the
 new master or even automatic negotiation of which of a group of
 machines should become the master after a server failure.


That's a brilliant idea, and shouldn't be too much of a problem. One race
condition that is possible is that multiple backends may try to globally
propagate different statistics for the same table, but I think that any
standard logical ordering algorithm should handle that. Also, the automatic
master promotion seems like a brilliant idea, and is also great since we
have time-tested standard algorithms for that.

One thing I would like to see, assuming all the interacting nodes do
not have identical schemas, is whether we can somehow maintain cross-node
statistics and use them for planning cross-node joins. That would lead to
similar problems as the ones already noted for having local statistics for
foreign databases, but if we solve those anyway for storing local
statistics, we could potentially look at having cross-node relation
statistics as well.


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Allowing implicit 'text' -> xml|json|jsonb

2014-09-05 Thread Marko Tiikkaja

On 9/5/14 11:08 AM, Craig Ringer wrote:

On 09/05/2014 05:04 PM, Marko Tiikkaja wrote:


I really don't like the idea of relaxing casts.  And I really object to
the notion of casting from text to date being obviously right.


Gah. It's obviously right to *reject* implicit conversions like
text->date. I specifically do _not_ want to add such a conversion, and
gave a list of types for which I think conversions from text are
appropriate.


Oh, *strict casting* is obviously right.  My apologies, I completely 
misparsed that.



.marko




Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Pavel Stehule
2014-09-05 11:17 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 9/5/14 11:08 AM, Pavel Stehule wrote:

 2014-09-05 10:57 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 Oops.  I meant to type ASSERT there, instead of RAISE.  Does that make
 more sense?


  for me, the basic limitation is the boolean_expr


 I don't understand what you mean by this.

  It is about the semantics and consistency of the proposed tools. Sure, everything can be
 reduced to ASSERT(expr) .. but where is the benefit over a function
 call?


 I see several benefits:

1) Standard syntax, available anywhere
2) Since the RAISE EXCEPTION happens at the caller's site, we can
 provide information not available to an ordinary function, such as the
 values of the parameters passed to it
3) We can make the exception uncatchable
4) ASSERTs can be easily disabled (if we choose to allow that), even
 per-function


 All points can be solved by an extension on PGXN ..


 #3 probably.  Being able to satisfy #1 through PGXN is ridiculous.  #2 and
 #4 would require at least hooking into PL/PgSQL, which might be possible,
 but it's intrusive and honestly feels fragile.

 But perhaps more to the point, why would that criticism apply to my
 suggestion, but not yours?  Why don't we just reject any ASSERT syntax?

  and your proposed syntax
 can be better processed at the Postgres level than at the PL/pgSQL level.


 *shrug*  Doing it in SQL would probably break more stuff.  I'm trying to
 contain the damage.  And arguably, this is mostly only useful in PL/PgSQL.

I am able to do this as a plpgsql extension without any change to the language -

 there
 is no need to change anything for too thin a proposal


 What *exactly* about my proposal is too thin?  What does your thing do
 that mine doesn't?  If you're saying your suggestion allows us to give a
 better error message, I disagree:


 Too thin means there is no possibility of extensibility, no
 possibility to define defaults, no possibility to use it for static analysis.


 Again, why does this criticism only apply to my suggestion and not yours?


There is one strong difference - it is clear what the target of the
assertion is, because there are defined semantics.

When everything is just an arbitrary expression, there are no specified semantics


 ( ROW_COUNT ( = | <> ) ( 1 | 0 ) |

 I've already addressed this: we can print the parameters and their values
 automatically, so printing the row count here doesn't give any additional
 value.


 In this case, I can use the plpgsql parser for static analysis - I can emit a
 warning if the related query does not filter on the PK, and similar.


 You can do that anyway, you just need to work a bit harder.  I don't see
 the problem, really.


A bit harder, with the possibility of false identification




 .marko



Re: [HACKERS] Scaling shared buffer eviction

2014-09-05 Thread Amit Kapila
On Wed, Sep 3, 2014 at 1:45 AM, Robert Haas robertmh...@gmail.com wrote:

 On Thu, Aug 28, 2014 at 7:11 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
  I have updated the patch to address the feedback.  Main changes are:
 
  1. For populating freelist, have a separate process (bgreclaimer)
  instead of doing it by bgwriter.
  2. Autotune the low and high threshold values for buffers
  in freelist. I have used the formula as suggested by you upthread.
  3. Cleanup of locking regimen as discussed upthread (completely
  eliminated BufFreelist Lock).
  4. Improved comments and general code cleanup.

 Overall this looks quite promising to me.

 I had thought to call the new process just "bgreclaim" rather than
 "bgreclaimer", but perhaps your name is better after all.  At least,
 it matches what we do elsewhere.  But I don't care for the use of
 "Bgreclaimer"; let's do "BgReclaimer" if we really need mixed-case, or
 else "bgreclaimer".

Changed it to bgreclaimer.


 This is unclear:

 +buffers for replacement.  Earlier to protect freelist, we use LWLOCK as
that
 +is needed to perform clock sweep which is a longer operation, however
now we
 +are using two spinklocks freelist_lck and victimbuf_lck to perform
operations
 +on freelist and run clock sweep respectively.

 I would drop the discussion of what was done before and say something
 like this: The data structures relating to buffer eviction are
 protected by two spinlocks.  freelist_lck protects the freelist and
 related data structures, while victimbuf_lck protects information
 related to the current clock sweep condition.

Changed, but I have not used the exact wording mentioned above; let me know
if the new wording is okay.

 +always in this list.  We also throw buffers into this list if we consider
 +their pages unlikely to be needed soon; this is done by background
process
 +reclaimer.  The list is singly-linked using fields in the

 I suggest: Allocating pages from this list is much cheaper than
 running the clock sweep algorithm, which may encounter many buffers
 that are poor candidates for eviction before finding a good candidate.
 Therefore, we have a background process called bgreclaimer which works
 to keep this list populated.

Changed as per your suggestion.

 +Background Reclaimer's Processing
 +-

 I suggest titling this section "Background Reclaim".

 +The background reclaimer is designed to move buffers to freelist that are

 I suggest replacing the first three words of this sentence with
bgreclaimer.

As per discussion in thread, I have kept it as it is.

 +and move the the unpinned and zero usage count buffers to freelist.  It
 +keep on doing this until the number of buffers in freelist become equal
 +high threshold of freelist.

 s/keep/keeps/
 s/become equal/reaches the/
 s/high threshold/high water mark/
 s/of freelist//

Changed as per your suggestion.

 Please change the other places that say "threshold" to use the "water
 mark" terminology.

 +if (StrategyMoveBufferToFreeListEnd (bufHdr))

 Extra space.

 + * buffers in consecutive cycles.

 s/consecutive/later/

 +/* Execute the LRU scan */

 s/LRU scan/clock sweep/ ?

Changed as per your suggestion.


 +while (tmp_num_to_free  0)

 I am not sure it's a good idea for this value to be fixed at loop
 start and then just decremented.  Shouldn't we loop and do the whole
 thing over once we reach the high watermark, only stopping when
 StrategySyncStartAndEnd() says num_to_free is 0?

Okay, changed the loop as per your suggestion.

 +/* choose next victim buffer to clean. */

 This process doesn't clean buffers; it puts them on the freelist.

Right. Changed it to match what it does.

 + * high threshold of freelsit), we drasticaly reduce the odds for

 Two typos.

Fixed.

 + * of buffers in freelist fall below low threshold of freelist.

 s/fall/falls/

Changed as per your suggestion.

 In freelist.c, it seems like a poor idea to have two spinlocks as
 consecutive structure members; they'll be in the same cache line,
 leading to false sharing.  If we merge them into a single spinlock,
 does that hurt performance?  If we put them further apart, e.g. by
 moving the freelist_lck to the start of the structure, followed by the
 latches, and leaving victimbuf_lck where it is, does that help
 performance?

As per discussion, I have kept them as they are and added a comment
indicating that we can consider putting the two locks in separate
cache lines.

 +/*
 + * If the buffer is pinned or has a nonzero usage_count,
 we cannot use
 + * it; discard it and retry.  (This can only happen if
VACUUM put a
 + * valid buffer in the freelist and then someone else
 used it before
 + * we got to it.  It's probably impossible altogether as
 of 8.3, but
 + * we'd better check anyway.)
 + */
 +

 This comment is clearly obsolete.

Removed.

  I have not yet added statistics 

Re: [HACKERS] Scaling shared buffer eviction

2014-09-05 Thread Amit Kapila
On Fri, Sep 5, 2014 at 8:42 AM, Mark Kirkwood mark.kirkw...@catalyst.net.nz
wrote:

 On 04/09/14 14:42, Amit Kapila wrote:

 On Thu, Sep 4, 2014 at 8:00 AM, Mark Kirkwood 
mark.kirkw...@catalyst.net.nz
 wrote:



 Hi Amit,

 Results look pretty good. Does it help in the read-write case too?


 Last time I ran the tpc-b test of pgbench (results of which are
 posted earlier in this thread), there doesn't seem to be any major
 gain for that, however for cases where read is predominant, you
 might see better gains.

  I am again planning to take that data in the next few days.


 FWIW below are some test results on the 60 core beast with this patch
applied to 9.4. I'll need to do more runs to iron out the variation,
 but it looks like the patch helps the standard (write heavy) pgbench
workload a little, and clearly helps the read only case.


Thanks for doing the test.  I think if possible you can take
the performance data with a higher scale factor (4000), as it
seems your machine has 1TB of RAM.  You might want to use the
latest patch I have posted today.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Jan Wieck

On 09/05/2014 04:40 AM, Pavel Stehule wrote:




2014-09-05 10:33 GMT+02:00 Marko Tiikkaja ma...@joh.to
mailto:ma...@joh.to:

On 9/5/14 10:09 AM, Pavel Stehule wrote:

I think this could still be parsed correctly, though I'm not
100% sure on
that:

ASSERT WARNING (EXISTS(SELECT ..)), 'data are there';


PLpgSQL uses a ';' or some plpgsql keyword as the SQL statement
delimiter. That is the reason for RETURN QUERY ... ';'. So in this
case it can be practical to place the SQL statement at the end of
the plpgsql statement.


*shrug*  There are lots of cases where a comma is used as well, e.g.
RAISE NOTICE '..', expr, expr;

parentheses are not practical, because it makes bugs hard to identify ..


I don't see why.  The PL/PgSQL SQL parser goes to great lengths to
identify unmatched parentheses.  But the parens probably aren't
necessary in the first place; you could just omit them and keep
parsing until the next comma AFAICT.  So the syntax would be:

RAISE [ NOTICE | WARNING | EXCEPTION/ASSERT/WHATEVER ]
boolean_expr [, error_message [, error_message_param [, ... ] ] ];


extending RAISE with an ASSERT level has minimal new value


Adding a WHEN clause to RAISE would have the benefit of not needing any 
new keywords at all.


RAISE EXCEPTION 'format' [, expr ...] WHEN row_count <> 1;
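In use, the suggested WHEN clause would read naturally after a DML statement. This is a hypothetical syntax sketch; the WHEN clause on RAISE is not implemented, although GET DIAGNOSTICS is existing PL/pgSQL:

```plpgsql
-- row_count is assumed to be a declared integer variable
UPDATE tab SET c = 1 WHERE pk = somevar;
GET DIAGNOSTICS row_count = ROW_COUNT;
RAISE EXCEPTION 'expected one row, got %', row_count WHEN row_count <> 1;
```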


Regards,
Jan




The simplicity of integrating SQL and PL/pgSQL comes from using smart
keywords -
it is more verbose, and it allows better diagnostics


I disagree.  The new keywords provide nothing of value here.  They
even encourage the use of quirky syntax in *exchange* for verbosity
("IS NOT NULL pk"? really?).


It is about the semantics and consistency of the proposed tools. Sure, everything can be
reduced to ASSERT(expr) .. but where is the benefit over a function call?

I am able to do this as a plpgsql extension without any change to the language -
there is no need to change anything for too thin a proposal



.marko





--
Jan Wieck
Senior Software Engineer
http://slony.info




Re: [HACKERS] A mechanism securing web applications in DBMS

2014-09-05 Thread Geoff Montee
On Fri, Sep 5, 2014 at 12:21 AM, Laurence Rowe l...@lrowe.co.uk wrote:



 Here my_auth_function would validate the ticket and would need to be able
 to do two things not currently possible with a SECURITY DEFINER function:

 1. Call SET SESSION AUTHORIZATION / SET ROLE to become a user the
 connection user is not otherwise allowed to become.

 2. Dynamically set which roles are 'inherited' by the user it is becoming.


 Laurence


I've been testing a similar setup using security barrier views for RLS.

It would be a stretch to call my initialization function an
authentication function. My application server actually does the
authentication using client certificate validation. My database
initialization function is primarily for authorization (i.e. it determines
what the user should be able to see at a fine-grained level, and it
performs the work to grant the right accesses).

Anyway, I also encountered the two issues you mention, and I designed my
initialization function to get around them:

1.) My initialization function dynamically creates a role for the user. It
returns the role name (as the special "name" text type). The application
checks the returned row, and it does the SET ROLE. (And later does a RESET
ROLE, since we are using connection pooling.)

2.) My initialization function dynamically grants roles to the user
accounts as needed.

Good authentication between the application server and the database server
is extra important in my case, since the application is trusted to do the
authentication and it is granted access to security definer functions that
dynamically create and grant roles.

I just thought I'd share. As a DBA, it's interesting to read ideas about
how to push more of this work to the database server.

Geoff Montee


Re: [HACKERS] pgbench throttling latency limit

2014-09-05 Thread Jan Wieck

On 08/27/2014 04:08 AM, Heikki Linnakangas wrote:

That model might make some sense if you think e.g. of a web application,
where the web server has a timeout for how long it waits to get a
database connection from a pool, but once a query is started, the
transaction is considered a success no matter how long it takes. The
latency limit would be that timeout. But I think a more useful model is
that when the user clicks a button, he waits at most X seconds for the
result. If that deadline is exceeded, the web server will give a 404, or
the user will simply get bored and go away, and the transaction is
considered a failure.


Correct, the whole TPC-B model better fits an application where client 
requests enter a queue at the specified TPS rate and that queue is 
processed.


While we are at it,

Note that in the original TPC-B specification the transaction duration 
measured is the time from receiving the client request (in current 
pgbench under throttling that is when the transaction is scheduled) 
to when the request is answered. This is the client visible response 
time, which has nothing to do with the database latency.


As per TPC-B, the entire test is only valid if 90% of all client 
response times are within 2 seconds.


It would be useful if pgbench would

A) measure and report that client response time in the per transaction
   log files and

B) report at the end what percentage of transactions finished within
   a specified response time constraint (default 2 seconds).
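Both suggestions amount to post-processing a per-transaction log of client response times. A minimal sketch of the pass-rate report (Python, hypothetical log format - not pgbench's actual per-transaction log layout):

```python
# Sketch of the TPC-B style validity check Jan describes: what fraction
# of client response times (in seconds) fall within the limit, and
# whether the run meets the 90%-within-2-seconds requirement.

def fraction_within_limit(response_times, limit=2.0):
    """Return the fraction of transactions answered within `limit` seconds."""
    if not response_times:
        return 0.0
    ok = sum(1 for t in response_times if t <= limit)
    return ok / len(response_times)

def run_is_valid(response_times, limit=2.0, required=0.90):
    """TPC-B style criterion: at least `required` of responses within `limit`."""
    return fraction_within_limit(response_times, limit) >= required
```

Reporting the complement (as the current patch does, per Fabien below) is just `1 - fraction_within_limit(...)`.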


Regards,
Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info




Re: [HACKERS] pgbench throttling latency limit

2014-09-05 Thread Fabien COELHO



That model might make some sense if you think e.g. of a web application,
where the web server has a timeout for how long it waits to get a
database connection from a pool, but once a query is started, the
transaction is considered a success no matter how long it takes. The
latency limit would be that timeout. But I think a more useful model is
that when the user clicks a button, he waits at most X seconds for the
result. If that deadline is exceeded, the web server will give a 404, or
the user will simply get bored and go away, and the transaction is
considered a failure.


Correct, the whole TPC-B model better fits an application where client 
requests enter a queue at the specified TPS rate and that queue is processed.


While we are at it,

Note that in the original TPC-B specification the transaction duration 
measured is the time from receiving the client request (in current pgbench 
under throttling that is when the transaction is scheduled) to when the 
request is answered. This is the client visible response time, which has 
nothing to do with the database latency.


Ok. This corresponds to the definition used in the current patch. However, 
ISTM that the tpc-b bench runs as fast as possible; there is no underlying 
schedule as with the throttled pgbench.


As per TPC-B, the entire test is only valid if 90% of all client response 
times are within 2 seconds.


It would be useful if pgbench would

A) measure and report that client response time in the per transaction
  log files and


I never used the per transaction log file. I think that it may already be 
the case that it contains this information when not throttled. When under 
throttling, I do not know.



B) report at the end what percentage of transactions finished within
  a specified response time constraint (default 2 seconds).


What is currently reported is the complement (% of transactions completed 
over the time limit).


Note that despite pg's appalling latency performance, it may stay well over 
the 90% limit, or even 100%: when things are going well a lot of 
transactions run in about a millisecond, while when things are going badly 
transactions take a long time (although possibly under or about 1s anyway), 
*but* very few transactions get through - the throughput is very small. The 
fact that during 15 seconds only 30 transactions are processed is a detail 
that does not show up in the metric.
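Fabien's point can be made concrete with a small sketch (Python, illustrative numbers only): both runs below pass the 90%-within-2-seconds criterion, yet one has collapsed throughput.

```python
# The "% within limit" metric is blind to throughput: a stalled run that
# barely processes anything can still score 100% on response time.

def summarize(run_seconds, response_times, limit=2.0):
    """Compute throughput and the percentage of responses within `limit`."""
    within = sum(1 for t in response_times if t <= limit)
    return {
        "tps": len(response_times) / run_seconds,
        "pct_within_limit": 100.0 * within / len(response_times),
    }

good_run = summarize(15.0, [0.005] * 1500)   # healthy: 1500 txns in 15 s
stalled_run = summarize(15.0, [0.9] * 30)    # stalled: only 30 txns in 15 s
```

Both report 100% within the limit; only the tps figure exposes the difference.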


--
Fabien.




Re: [HACKERS] pgbench throttling latency limit

2014-09-05 Thread Jan Wieck

On 09/05/2014 10:12 AM, Fabien COELHO wrote:

Note that despite pg appaling latency performance, in may stay well over
the 90% limit, or even 100%: when things are going well a lot of
transaction run in about ms, while when things are going bad transactions
would take a long time (although possibly under or about 1s anyway), *but*
very few transactions are passed, the throughput is very small. The fact
that during 15 seconds only 30 transactions are processed is a detail that
does not show up in the metric.


I haven't used the real pgbench for a long time. I will have to look at 
your patch and see what the current version actually does or does not do.


What I have been using is a Python version of pgbench that I wrote for 
myself when I started learning that language. That one does record both 
values, the DB transaction latency and the client response time (time 
from the request being entered into the Queue until transaction commit). 
When I look at those results it is possible to have an utterly failing 
run, with 60% of client response times being within 2 seconds, but all 
the DB transactions are still in milliseconds.
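The two measurements Jan describes can be sketched as follows (a hedged Python illustration of the idea, not his actual tool): a transaction can have millisecond DB latency yet a multi-second client response time if it sat in the queue first.

```python
# DB transaction latency = commit - start of execution.
# Client response time  = commit - scheduled arrival in the request queue.

def latencies(scheduled, started, committed):
    """All arguments are timestamps in seconds."""
    return {
        "db_latency": committed - started,
        "response_time": committed - scheduled,
    }

# Request entered the queue at t=0.0, a backend only picked it up at
# t=2.5, and it committed at t=2.51: 10 ms of DB work, 2.51 s response.
m = latencies(scheduled=0.0, started=2.5, committed=2.51)
```

With only `db_latency` recorded, this run looks perfectly healthy; `response_time` is what exposes the utterly failing run Jan mentions.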


As said, I'll have to take a look at it. Since I am on vacation next 
week, getting ready for my first day at EnterpriseDB, this may actually 
happen.



Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info




Re: [HACKERS] implement subject alternative names support for SSL connections

2014-09-05 Thread Alexey Klyukin
On Thu, Sep 4, 2014 at 10:23 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

 Hmm. Perhaps we should use X509_NAME_get_index_by_NID + X509_NAME_get_entry
 instead of X509_NAME_get_text_by_NID. You could then pass the ASN1_STRING
 object to the certificate_name_entry_validate_match() function, and have it
 do the ASN1_STRING_length() and ASN1_STRING_data() calls too.
...
 I think we should:

 1. Check if there's a common name, and if so, print that
 2. Check if there is exactly one SAN, and if so, print that
 3. Just print an error without mentioning names.

 There's a lot of value in printing the name if possible, so I'd really like
 to keep that. But I agree that printing all the names if there are several
 would get complicated and the error message could become very long. Yeah,
 the error message might need to be different for cases 1 and 2. Or maybe
phrase it "server certificate's name \"%s\" does not match host name
\"%s\"", which would be reasonable for both 1. and 2.

Thank you, I've implemented both suggestions in the attached new
version of the patch.
On the error message, I've decided to show either a single name, or
the first examined name and the number of other names present in the
certificate, i.e:

 psql: server name "example.com" and 2 other names from server SSL certificate 
 do not match host name "example-foo.com"

The error does not state whether the name comes from the CN or from
the SAN section.
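For illustration, the message format described above could be composed like this (hypothetical Python sketch; the real implementation is C inside libpq, and the exact wording is taken from the example above):

```python
# Compose the mismatch error: either a single certificate name, or the
# first examined name plus a count of the other names present.

def name_mismatch_message(first_name, total_names, host):
    if total_names <= 1:
        return ('server name "%s" from server SSL certificate does not '
                'match host name "%s"' % (first_name, host))
    return ('server name "%s" and %d other names from server SSL '
            'certificate do not match host name "%s"'
            % (first_name, total_names - 1, host))
```

This keeps the message bounded in length no matter how many SANs the certificate carries, which was the concern with printing all names.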

This version also checks for the availability of the subject name; it
looks like RFC 5280 does not require it at all.

4.1.2.6.  Subject

   The subject field identifies the entity associated with the public
   key stored in the subject public key field.  The subject name MAY be
   carried in the subject field and/or the subjectAltName extension.

The pattern of allocating the name in the subroutine and then
reporting it (and releasing when necessary) in the main function is a
little bit ugly, but it looks to me the least ugly of anything else I
could come up with (i.e. extracting those names at the time the error
message is shown).

Regards,

-- 
Alexey Klyukin
diff --git a/src/interfaces/libpq/fe-secure-openssl.c 
b/src/interfaces/libpq/fe-secure-openssl.c
new file mode 100644
index f950fc3..a4ad305
*** a/src/interfaces/libpq/fe-secure-openssl.c
--- b/src/interfaces/libpq/fe-secure-openssl.c
***
*** 60,68 
--- 60,73 
  #ifdef USE_SSL_ENGINE
  #include <openssl/engine.h>
  #endif
+ #include <openssl/x509v3.h>
  
  static bool verify_peer_name_matches_certificate(PGconn *);
  static int	verify_cb(int ok, X509_STORE_CTX *ctx);
+ static int	certificate_name_entry_validate_match(PGconn *conn,
+ 												  ASN1_STRING *name,
+ 												  bool *match,
+ 												  char **store_name);
  static void destroy_ssl_system(void);
  static int	initialize_SSL(PGconn *conn);
  static PostgresPollingStatusType open_client_SSL(PGconn *);
*** wildcard_certificate_match(const char *p
*** 473,540 
  
  
  /*
!  *	Verify that common name resolves to peer.
   */
! static bool
! verify_peer_name_matches_certificate(PGconn *conn)
  {
!	char	   *peer_cn;
!	int			r;
!	int			len;
!	bool		result;
!
!	/*
!	 * If told not to verify the peer name, don't do it. Return true
!	 * indicating that the verification was successful.
!	 */
!	if (strcmp(conn->sslmode, "verify-full") != 0)
!		return true;
  
!	/*
!	 * Extract the common name from the certificate.
!	 *
!	 * XXX: Should support alternate names here
!	 */
!	/* First find out the name's length and allocate a buffer for it. */
!	len = X509_NAME_get_text_by_NID(X509_get_subject_name(conn->peer),
!									NID_commonName, NULL, 0);
!	if (len == -1)
!	{
!		printfPQExpBuffer(&conn->errorMessage,
!						  libpq_gettext("could not get server common name from server certificate\n"));
!		return false;
!	}
!	peer_cn = malloc(len + 1);
!	if (peer_cn == NULL)
	{
		printfPQExpBuffer(&conn->errorMessage,
!						  libpq_gettext("out of memory\n"));
!		return false;
	}
  
!	r = X509_NAME_get_text_by_NID(X509_get_subject_name(conn->peer),
!								  NID_commonName, peer_cn, len + 1);
!	if (r != len)
	{
-		/* Got different length than on the first call. Shouldn't happen. */
		printfPQExpBuffer(&conn->errorMessage,

Re: [HACKERS] PL/pgSQL 2

2014-09-05 Thread Merlin Moncure
On Thu, Sep 4, 2014 at 6:40 PM, Florian Pflug f...@phlo.org wrote:
 Cost of hidden IO cast is negative too. If we can change it, then we can
 increase speed.

 But the whole power of PL/pgSQL comes from the fact that it allows you to
 use the full set of postgres data types and operatores, and that it seamlessly
 integrated with SQL. Without that, PL/pgSQL is about as appealing as BASIC
 as a programming language...

Right, and it's exactly those types and operators that are the cause
of the performance issues.  A compiled pl/pgsql would only get serious
benefit for scenarios involving tons of heavy iteration or funky local
data structure manipulation.  Those scenarios are somewhat rare in
practice for database applications and often better handled in
another pl should they happen.

plv8 is emerging as the best non-sql pl: it's JIT compiled by the plv8
runtime, the javascript language is designed for embedding, and the
json data structure has nice similarities with postgres's arrays and
types.  In fact, if I *were* to attempt pl/pgsql compiling, I'd
probably translate the code to plv8 and hand it off to the llvm
engine.  You'd still have to let postgres handle most of the operator
and cast operations but you could pull some things into the plv8
engine.  Probably, this would be a net loser since plv8 (unlike
plpgsql) has to run everything through SPI.

IMO, what needs to happen first would be for the data type routines to
be pulled out of the main library so that client side applications and pls
could link against it, allowing for guaranteed sql semantics without
having to call into the backend -- at least for the standard types.

merlin




Re: [HACKERS] postgresql latency bgwriter not doing its job

2014-09-05 Thread Robert Haas
On Fri, Sep 5, 2014 at 4:17 AM, didier did...@gmail.com wrote:
 It's not the size of the array that's the problem; it's the size of
 the detonation when the allocation fails.

 You can use a file backed memory array
 Or because it's only a hint and
 - keys are in buffers (BufferTag), right?
 - transition is only from 'data I care' to 'data I don't care' if a
 buffer is concurrently evicted when sorting.

 Use a pre-allocated buffer index array and read keys from buffers when
 sorting, without memory barriers, spinlocks, whatever.

If by a file-backed memory array you mean mmap() something, that
doesn't do anything to address the possibility of failure.  mmap(),
like malloc(), can fail.

But I think we don't really need to have this argument.  Andres's
proposal to put this in the main shared memory segment sounds fine to
me.  If allocating that fails, you'll know to reduce shared_buffers
and try again.  If it succeeds, you should be safe after that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] postgresql latency bgwriter not doing its job

2014-09-05 Thread Claudio Freire
On Fri, Sep 5, 2014 at 3:09 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Sep 5, 2014 at 4:17 AM, didier did...@gmail.com wrote:
 It's not the size of the array that's the problem; it's the size of
 the detonation when the allocation fails.

 You can use a file backed memory array
 Or because it's only a hint and
 - keys are in buffers (BufferTag), right?
 - transition is only from 'data I care' to 'data I don't care' if a
 buffer is concurrently evicted when sorting.

 Use a pre-allocated buffer index array and read keys from buffers when
 sorting, without memory barriers, spinlocks, whatever.

 If by a file-backed memory array you mean mmap() something, that
 doesn't do anything to address the possibility of failure.  mmap(),
 like malloc(), can fail.

It's worse. If your process needs to page from disk when touching a
byte in the mmap'd array, and the disk somehow fails to work, you get
a segmentation fault.




Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins

2014-09-05 Thread Tomas Vondra
Hi everyone,

as Heikki mentioned in his commitfest status message, this patch still
has no reviewers :-( Is there anyone willing to pick up that, together
with the 'dense allocation' patch (as those two are closely related)?

I'm looking in Robert's direction, as he's the one who came up with the
dense allocation idea, and also commented on the hashjoin bucket resizing
patch ...

I'd ask Pavel Stehule, but as he's sitting next to me in the office he's
not really independent ;-) I'll do some reviews on the other patches
over the weekend, to balance this of course.

regards
Tomas




[HACKERS] VIP: explain of running query

2014-09-05 Thread Pavel Stehule
Hi

I am sending a prototype with a basic implementation of explain of a running
query specified by pid.

It can show more than the execution plan. There are examples below of the full
query text and the running completion tag.

This patch is in an early stage - I know there is one race condition.

I hoped I could use the new shm_mq API, but it is not prepared for usage where
receiver and sender are mutable.

How it works:

postgres=# select pg_cmdstatus(pid,1) from pg_stat_activity where pid <>
pg_backend_pid();

pg_cmdstatus
---
 Query Text: select * from pg_class, pg_attribute limit 400;
 Limit  (cost=0.00..8795.58 rows=697380 width=403)
   ->  Nested Loop  (cost=0.00..8795.58 rows=697380 width=403)
         ->  Seq Scan on pg_attribute  (cost=0.00..66.64 rows=2364 width=203)
         ->  Materialize  (cost=0.00..12.42 rows=295 width=200)
               ->  Seq Scan on pg_class  (cost=0.00..10.95 rows=295 width=200)
(6 rows)

postgres=# select pg_cmdstatus(pid,2) from pg_stat_activity where pid <>
pg_backend_pid();
pg_cmdstatus
-
 select * from pg_class, pg_attribute limit 400;
(1 row)

postgres=# select pg_cmdstatus(pid,3) from pg_stat_activity where pid <>
pg_backend_pid();
 pg_cmdstatus
---
 SELECT 144427
(1 row)

postgres=# select pg_cmdstatus(pid,3) from pg_stat_activity where pid <>
pg_backend_pid();
 pg_cmdstatus
---
 SELECT 209742
(1 row)

postgres=# select pg_cmdstatus(pid,3) from pg_stat_activity where pid <>
pg_backend_pid();
 pg_cmdstatus
---
 SELECT 288472
(1 row)

In the future the function can be replaced by a statement EXPLAIN pid WITH
autocomplete - it can show a subset of EXPLAIN ANALYZE -- but that needs
some parametrization of the executor environment.

The first discussion of this topic was a year ago

http://www.postgresql.org/message-id/cafj8pra-duzkmdtu52ciugb0p7tvri_b8ltjmjfwcnr1lpt...@mail.gmail.com

http://www.postgresql.org/message-id/CAFj8pRDEo24joEg4UFRDYeFADFTw-jw_=t=kPwOyDW=v=g1...@mail.gmail.com

Regards

Pavel
commit 8dfcd8daee84cbe60dbf552073f883751274faf4
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Fri Sep 5 21:29:06 2014 +0200

vip prototype - explain and status of another PostgreSQL process

diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index 22f116b..d5ec3ae 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -13,7 +13,7 @@ top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
 OBJS = aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o  \
-	collationcmds.o constraint.o conversioncmds.o copy.o createas.o \
+	cmdstatus.o collationcmds.o constraint.o conversioncmds.o copy.o createas.o \
 	dbcommands.o define.o discard.o dropcmds.o \
 	event_trigger.o explain.o extension.o foreigncmds.o functioncmds.o \
 	indexcmds.o lockcmds.o matview.o operatorcmds.o opclasscmds.o \
diff --git a/src/backend/commands/cmdstatus.c b/src/backend/commands/cmdstatus.c
new file mode 100644
index 000..f2d0d9c
--- /dev/null
+++ b/src/backend/commands/cmdstatus.c
@@ -0,0 +1,468 @@
+/*-------------------------------------------------------------------------
+ *
+ * cmdstatus.c
+ *
+ * Utility code for reporting the command status of other backends.
+ *
+ * Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/commands/cmdstatus.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "miscadmin.h"
+
+#include "access/htup_details.h"
+#include "commands/cmdstatus.h"
+#include "commands/explain.h"
+#include "lib/stringinfo.h"
+#include "storage/proc.h"
+#include "storage/procarray.h"
+#include "storage/shmem.h"
+#include "tcop/dest.h"
+#include "tcop/pquery.h"
+#include "utils/builtins.h"
+
+
+#define CMDINFO_SLOTS		100
+#define BUFFER_SIZE		(8 * 1024)
+
+typedef struct {
+bool is_valid;
+bool is_done;
+int target_pid;
+int sender_pid;
+int request_type;
+int result_code;
+} CmdStatusInfoEntry;
+
+typedef struct {
+	LWLock		*lock;			/* protect slots - search/modification */
+	CmdStatusInfoEntry *slots;
+	LWLock		*buffer_lock;		/* protect buffer handling */
+	void		*buffer;		/* result data */
+	Size		buffer_size;
+	int		target_pid;
+	int		sender_pid;
+	bool		buffer_is_free;		/* buffer is generally available */
+	bool		buffer_holds_data;	/* buffer holds a valid data */
+} CmdStatusInfo;
+
+static CmdStatusInfo *cmd_status_info = NULL;
+
+/*
+ * Prepare explain of query
+ *
+ */
+static StringInfo
+explain_query(QueryDesc *queryDesc)
+{
+	ExplainState es;
+
+	ExplainInitState(&es);
+	es.analyze = false;
+	es.verbose = false;
+	es.buffers = false;
+	es.format = EXPLAIN_FORMAT_TEXT;
+
+	ExplainBeginOutput(&es);
+	ExplainQueryText(&es, queryDesc);
+	ExplainPrintPlan(&es, queryDesc);
+
+	ExplainEndOutput(&es);
+
+	/* Remove last line break */
+	if (es.str->len

Re: [HACKERS] Allowing implicit 'text' - xml|json|jsonb (was: PL/pgSQL 2)

2014-09-05 Thread Merlin Moncure
On Fri, Sep 5, 2014 at 2:04 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 On 09/05/2014 12:04 AM, Pavel Stehule wrote:

 But some missing casts are well - I found lot performance issues based
 on using wrong data types - integers, dates in text column.

 Of course! That's why the default implicit casts were removed, and for
 good reason. I'm only talking about a narrow class of a few specific types.

 I think maybe a _few_ types need to be implicitly convertable from text,
 but that's about it.

 text - jsonb
 text - json
 text - xml
 text - hstore
 text - uuid
 text - (user defined enum)

 ... mainly because otherwise app devs need frustrating workarounds (or
 giving up on the PostgreSQL native types and just using 'text' columns),
 and because in all these cases PostgreSQL will validate the input.

That seems pretty reasonable.  If you do that along with redefining
certain functions like lpad() to take any instead of text you'd
eliminate most of the headaches.

merlin




[HACKERS] [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread Oskari Saarenmaa
I wrote the attached patch to optionally emit warnings when column or table
aliases are used without the AS keyword after errors caused by typos in
statements turning unintended things into aliases came up twice this week. 
First in a discussion with a colleague who was surprised by a 1 row result
for the query 'SELECT COUNT(*) files' and again in the pl/pgsql 2 thread
as plpgsql currently doesn't throw an error if there are more result columns
than output columns (SELECT a b INTO f1, f2).

The patch is still missing documentation and it needs another patch to
modify all the statements in psql & co to use AS so you can use things like
\d and tab-completion without triggering the warnings.  I can implement
those changes if others think this patch makes sense.

/ Oskari
From 478e694e5281a3bf91e44177ce925e6625cb44a5 Mon Sep 17 00:00:00 2001
From: Oskari Saarenmaa o...@ohmu.fi
Date: Fri, 5 Sep 2014 22:31:22 +0300
Subject: [PATCH] parser: optionally warn about missing AS for column and table
 aliases

Add a new GUC missing_as_warning (defaults to false, the previous
behavior) to raise a WARNING when a column or table alias is used without
the AS keyword.

This allows catching some types of errors where another keyword or a comma
was missing and a label is being used as an alias instead of something else,
for example cases like:

SELECT COUNT(*) files;
SELECT * FROM files users;
SELECT path size FROM files INTO f_path, f_size;
---
 src/backend/parser/gram.y|  24 +
 src/backend/utils/misc/guc.c |  11 +++
 src/include/parser/parser.h  |   2 +
 src/test/regress/expected/select.out | 170 +++
 src/test/regress/sql/select.sql  |  47 ++
 5 files changed, 254 insertions(+)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b46dd7b..06a71dd 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -65,6 +65,10 @@
 #include utils/xml.h
 
 
+/* GUCs */
+bool missing_as_warning = false;
+
+
 /*
  * Location tracking support --- simpler than bison's default, since we only
  * want to track the start position not the end position of each nonterminal.
@@ -10119,12 +10123,20 @@ alias_clause:
				}
			| ColId '(' name_list ')'
				{
+					if (missing_as_warning)
+						ereport(WARNING,
+								(errmsg("alias without explicit AS and missing_as_warning enabled"),
+								 parser_errposition(@3)));
					$$ = makeNode(Alias);
					$$->aliasname = $1;
					$$->colnames = $3;
				}
			| ColId
				{
+					if (missing_as_warning)
+						ereport(WARNING,
+								(errmsg("alias without explicit AS and missing_as_warning enabled"),
+								 parser_errposition(@1)));
					$$ = makeNode(Alias);
					$$->aliasname = $1;
				}
@@ -10156,6 +10168,10 @@ func_alias_clause:
			| ColId '(' TableFuncElementList ')'
				{
					Alias *a = makeNode(Alias);
+					if (missing_as_warning)
+						ereport(WARNING,
+								(errmsg("alias without explicit AS and missing_as_warning enabled"),
+								 parser_errposition(@1)));
					a->aliasname = $1;
					$$ = list_make2(a, $3);
				}
@@ -10244,6 +10260,10 @@ relation_expr_opt_alias: relation_expr %prec UMINUS
			| relation_expr ColId
				{
					Alias *alias = makeNode(Alias);
+					if (missing_as_warning)
+						ereport(WARNING,
+								(errmsg("alias without explicit AS and missing_as_warning enabled"),
+								 parser_errposition(@2)));
					alias->aliasname = $2;
					$1->alias = alias;
					$$ = $1;
@@ -12577,6

Re: [HACKERS] [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread Marko Tiikkaja

On 2014-09-05 22:38, Oskari Saarenmaa wrote:

I wrote the attached patch to optionally emit warnings when column or table
aliases are used without the AS keyword after errors caused by typos in
statements turning unintended things into aliases came up twice this week.
First in a discussion with a colleague who was surprised by a 1 row result
for the query 'SELECT COUNT(*) files' and again in the pl/pgsql 2 thread
as plpgsql currently doesn't throw an error if there are more result columns
than output columns (SELECT a b INTO f1, f2).

The patch is still missing documentation and it needs another patch to
modify all the statements in psql & co to use AS so you can use things like
\d and tab-completion without triggering the warnings.  I can implement
those changes if others think this patch makes sense.


I think this is only problematic for column aliases.  I wouldn't want 
these two to be put into the same category, as I always omit the AS 
keyword for table aliases (and will continue to do so), but never omit 
it for column aliases.



.marko




Re: [HACKERS] [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread Oskari Saarenmaa

05.09.2014 23:45, Marko Tiikkaja kirjoitti:

On 2014-09-05 22:38, Oskari Saarenmaa wrote:

I wrote the attached patch to optionally emit warnings when column or
table
aliases are used without the AS keyword after errors caused by typos in
statements turning unintended things into aliases came up twice this
week.
First in a discussion with a colleague who was surprised by a 1 row
result
for the query 'SELECT COUNT(*) files' and again in the pl/pgsql 2
thread
as plpgsql currently doesn't throw an error if there are more result
columns
than output columns (SELECT a b INTO f1, f2).

The patch is still missing documentation and it needs another patch to
modify all the statements in psql & co to use AS so you can use things
like
\d and tab-completion without triggering the warnings.  I can implement
those changes if others think this patch makes sense.


I think this is only problematic for column aliases.  I wouldn't want
these two to be put into the same category, as I always omit the AS
keyword for table aliases (and will continue to do so), but never omit
it for column aliases.


I prefer using AS for both, but I can see the case for requiring AS in 
table aliases being a lot weaker.  Not emitting warnings for table 
aliases would also reduce the changes required in psql & co as they seem 
to be using aliases mostly (only?) for tables.


What'd be a good name for the GUC controlling this, 
missing_column_as_warning?


/ Oskari





Re: [HACKERS] [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread Marko Tiikkaja

On 2014-09-05 22:56, Oskari Saarenmaa wrote:

What'd be a good name for the GUC controlling this,
missing_column_as_warning?


I don't know about others, but I've previously had the idea of having 
more warnings like this (my go-to example is  DELETE FROM foo WHERE bar 
IN (SELECT bar FROM barlesstable);  where barlesstable doesn't have a 
column bar).  Perhaps it would be better to have a guc with a list of 
warnings, similarly to what we did for plpgsql.extra_warnings?


Now that I start thinking about it, more ideas for such warnings start 
springing up..



.marko




Re: [HACKERS] [PATCH] Filter error log statements by sqlstate

2014-09-05 Thread Oskari Saarenmaa

17.01.2014 19:07, Tom Lane kirjoitti:

Oskari Saarenmaa o...@ohmu.fi writes:

I don't know about others, but filtering by individual SQLSTATE is
exactly what I need - I don't want to suppress all plpgsql errors or
constraint violations, but I may want to suppress plpgsql RAISE
EXCEPTION and CHECK violations.


[...]


It hasn't really been proven that SQLSTATE-class filtering would work
conveniently, but AFAICS the only way to prove or disprove that is for
somebody to code it up and use it in production.


[...]


Another thought here is that if you're willing to have the filter
only able to *prevent* logging, and not to let it *cause* logging
of messages that would otherwise be suppressed by log_min_messages,
it could be implemented as a loadable module using the emit_log_hook.


So this is what we ended up doing: a module with emit_log_hook to allow 
upgrading log_min_messages and log_min_error_statement values per 
sqlstate.  I'm now using this in production and it has had a positive 
impact in reducing the volume of (unwanted) logs being collected and 
allowing a kludgy rsyslog.conf filter to be removed (which didn't really 
work that well - it only dropped the first part of a multipart log 
entry, writing partial pg log entries in syslog).


https://github.com/ohmu/pgloggingfilter

I'm not super happy about the syntax it uses, but at least it should be 
obvious that it works just like the core GUCs but is settable per 
sqlstate.  I've been planning to sketch a proposal for a better way to 
configure log verbosity and details based on sqlstate, statement 
duration or other variables, but unfortunately haven't had time to do it 
yet.
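Conceptually, the per-sqlstate filtering works like this (an illustrative Python sketch of the hook's decision, not the actual C module; the severity ordering and override syntax here are assumptions for illustration):

```python
# Each log message carries a SQLSTATE and a severity level. The filter
# can raise the minimum severity required for particular SQLSTATEs, so
# e.g. plpgsql RAISE EXCEPTION (P0001) and CHECK violations (23514) are
# dropped from the log unless they are FATAL or worse.

SEVERITY = {"DEBUG": 10, "LOG": 15, "NOTICE": 20, "WARNING": 30,
            "ERROR": 40, "FATAL": 50, "PANIC": 60}

def should_emit(sqlstate, elevel, default_min="WARNING", overrides=None):
    """Emit only if elevel reaches the minimum configured for this sqlstate."""
    min_level = (overrides or {}).get(sqlstate, default_min)
    return SEVERITY[elevel] >= SEVERITY[min_level]

# Hypothetical configuration for the use case described above:
overrides = {"P0001": "FATAL", "23514": "FATAL"}
```

This is only able to *suppress* messages relative to the global setting, matching Tom's observation that an emit_log_hook filter cannot cause logging of messages log_min_messages already dropped.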


/ Oskari




Re: [HACKERS] [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread Pavel Stehule
2014-09-05 23:06 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 2014-09-05 22:56, Oskari Saarenmaa wrote:

 What'd be a good name for the GUC controlling this,
 missing_column_as_warning?


 I don't know about others, but I've previously had the idea of having more
 warnings like this (my go-to example is "DELETE FROM foo WHERE bar IN
 (SELECT bar FROM barlesstable);" where barlesstable doesn't have a column
 bar).  Perhaps it would be better to have a GUC with a list of warnings,
 similarly to what we did for plpgsql.extra_warnings?

 Now that I start thinking about it, more ideas for such warnings start
 springing up..
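The quoted DELETE pitfall is easy to reproduce directly. The tables here are purely illustrative:

```sql
CREATE TABLE foo (bar int);
CREATE TABLE barlesstable (baz int);  -- note: no column named "bar"
INSERT INTO foo VALUES (1), (2), (3);
INSERT INTO barlesstable VALUES (42);

-- The inner "bar" does not exist in barlesstable, so it silently
-- resolves to the outer foo.bar, turning the condition into the
-- always-true "bar IN (bar)" for every row with a non-NULL bar:
DELETE FROM foo WHERE bar IN (SELECT bar FROM barlesstable);
-- deletes all three rows instead of raising the error the author
-- probably expected
```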


I had the same idea - maybe some general "prune human errors" mode

Pavel




 .marko






[HACKERS] Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread David G Johnston
Marko Tiikkaja-4 wrote
 On 2014-09-05 22:38, Oskari Saarenmaa wrote:
 I wrote the attached patch to optionally emit warnings when column or
 table
 aliases are used without the AS keyword after errors caused by typos in
 statements turning unintended things into aliases came up twice this
 week.
 First in a discussion with a colleague who was surprised by a 1 row
 result
 for the query 'SELECT COUNT(*) files' and again in the pl/pgsql 2
 thread
 as plpgsql currently doesn't throw an error if there are more result
 columns
 than output columns (SELECT a b INTO f1, f2).

 The patch is still missing documentation and it needs another patch to
 modify all the statements in psql & co to use AS so you can use things
 like
 \d and tab-completion without triggering the warnings.  I can implement
 those changes if others think this patch makes sense.
 
 I think this is only problematic for column aliases.  I wouldn't want to 
 put these two to be put into the same category, as I always omit the AS 
 keyword for tables aliases (and will continue to do so), but never omit 
 it for column aliases.
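The two surprises described in the quoted mail are easy to reproduce (sketches; the names are only illustrative):

```sql
-- A dropped FROM turns the intended table name into a column alias:
-- instead of counting rows of a table "files", this returns one row
-- with a single column named "files" whose value is 1.
SELECT COUNT(*) files;
-- (the user almost certainly meant: SELECT COUNT(*) FROM files)

-- Likewise in PL/pgSQL, a missing comma makes "b" an alias for "a",
-- so a single result column feeds two INTO targets:
--   SELECT a b INTO f1, f2;   -- f2 is silently set to NULL
```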

I agree on being able to pick the behavior independently for select-list
items and the FROM clause items.

Using this to mitigate the pl/pgsql column mis-match issue seems like too
broad of a solution.

I probably couldn't mount a convincing defense of my opinion but at first
blush I'd say we should pass on this.  Not with prejudice - looking at the
issue periodically has merit - but right now it seems to be mostly adding
clutter to the code without significant benefit.

Tangential - I'd rather see something like EXPLAIN (STATIC) that would
allow a user to quickly invoke a built-in static SQL analyzer on their query
and have it point this kind of thing out.  Adding a catalog for STATIC
configurations in much the same way we have text search configurations would
allow extensions and users to define their own rulesets that could be
attached to their ROLE GUC: default_static_analyzer_ruleset and also
passed in as a modifier in the EXPLAIN invocation.

Stuff like correlated subqueries without alias use could be part of that (to
avoid typos that result in constant true) and also duplicate column names
floating to the top-most select-list, or failure to use an alias on a
function call result.  There are probably others though I'm stretching to
even find these...

Anyway, the idea of using a GUC and evaluating every query that is written
(the added if statements), is not that appealing even if the impact of one
more check is likely to be insignificant (is it?)

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PATCH-parser-optionally-warn-about-missing-AS-for-column-and-table-aliases-tp5817971p5817980.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.




[HACKERS] Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread David G Johnston
David G Johnston wrote
 
 Marko Tiikkaja-4 wrote
 On 2014-09-05 22:38, Oskari Saarenmaa wrote:
 I wrote the attached patch to optionally emit warnings when column or
 table
 aliases are used without the AS keyword after errors caused by typos in
 statements turning unintended things into aliases came up twice this
 week.
 First in a discussion with a colleague who was surprised by a 1 row
 result
 for the query 'SELECT COUNT(*) files' and again in the pl/pgsql 2
 thread
 as plpgsql currently doesn't throw an error if there are more result
 columns
 than output columns (SELECT a b INTO f1, f2).

 The patch is still missing documentation and it needs another patch to
  modify all the statements in psql & co to use AS so you can use things
 like
 \d and tab-completion without triggering the warnings.  I can implement
 those changes if others think this patch makes sense.
 
 I think this is only problematic for column aliases.  I wouldn't want to 
 put these two to be put into the same category, as I always omit the AS 
 keyword for tables aliases (and will continue to do so), but never omit 
 it for column aliases.
 I agree on being able to pick the behavior independently for select-list
 items and the FROM clause items.
 
 Using this to mitigate the pl/pgsql column mis-match issue seems like too
 broad of a solution.
 
 I probably couldn't mount a convincing defense of my opinion but at first
 blush I'd say we should pass on this.  Not with prejudice - looking at the
 issue periodically has merit - but right now it seems to be mostly adding
 clutter to the code without significant benefit.
 
 Tangential - I'd rather see something like EXPLAIN (STATIC) that would
 allow a user to quickly invoke a built-in static SQL analyzer on their
 query and have it point this kind of thing out.  Adding a catalog for
 STATIC configurations in much the same way we have text search
 configurations would allow extensions and users to define their own
 rulesets that could be attached to their ROLE GUC:
 default_static_analyzer_ruleset and also passed in as a modifier in the
 EXPLAIN invocation.
 
 Stuff like correlated subqueries without alias use could be part of that
 (to avoid typos that result in constant true) and also duplicate column
 names floating to the top-most select-list, or failure to use an alias on
 a function call result.  There are probably others though I'm stretching
 to even find these...
 
 Anyway, the idea of using a GUC and evaluating every query that is written
 (the added if statements), is not that appealing even if the impact of one
 more check is likely to be insignificant (is it?)

To protect users on every query they write there would need to be some kind
of "always explain first and only execute if no warnings are thrown"
mode...and ideally some way to then override that on a per-query basis if
you know you are correct and don't want to fix the SQL...

If the static check fails the query itself would error and the detail would
contain the status result of the static analysis; otherwise the query should
return as normal.

This at least avoids having to introduce 10 different GUCs just to
accommodate this feature and neatly bundles them into named packages.

A warn-and-execute mode would be possible as well.  I would make it
incompatible with autocommit mode so that a user doing this would have a
chance to evaluate the warnings and rollback even if the statement ran to
completion successfully.

David J.







Re: [HACKERS] Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread Bruce Momjian
On Fri, Sep  5, 2014 at 02:33:00PM -0700, David G Johnston wrote:
 This at least avoids having to introduce 10 different GUCs just to
 accommodate this feature and neatly bundles them into named packages.
 
 A warn-and-execute mode would be possible as well.  I would make it
 incompatible with autocommit mode so that a user doing this would have a
 chance to evaluate the warnings and rollback even if the statement ran to
 completion successfully.

Our TODO list had the idea of a novice mode that would warn about such
things.


-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB                       http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread Marko Tiikkaja

On 2014-09-05 11:19 PM, David G Johnston wrote:

Marko Tiikkaja-4 wrote

I think this is only problematic for column aliases.  I wouldn't want to
put these two to be put into the same category, as I always omit the AS
keyword for tables aliases (and will continue to do so), but never omit
it for column aliases.


I agree on being able to pick the behavior independently for select-list
items and the FROM clause items.

Using this to mitigate the pl/pgsql column mis-match issue seems like too
broad of a solution.


The PL/PgSQL column mismatch can be solved via other methods; see for example my 
suggestion of a PL/PgSQL warning, num_into_expressions, in the current 
commit fest (and the non-backwards-compatible solution of throwing a 
run-time error I suggested).  But some people run SQL queries from their 
apps, and I've seen people make this mistake and get confused.



I probably couldn't mount a convincing defense of my opinion but at first
blush I'd say we should pass on this.  Not with prejudice - looking at the
issue periodically has merit - but right now it seems to be mostly adding
clutter to the code without significant benefit.


Are you talking about this particular option, or the notion of parser 
warnings in general?  Because if we're going to add a handful of 
warnings, I don't see why this couldn't be on that list.



Tangential - I'd rather see something like EXPLAIN (STATIC) that would
allow a user to quickly invoke a built-in static SQL analyzer on their query
and have it point this kind of thing out.  Adding a catalog for STATIC
configurations in much the same way we have text search configurations would
allow extensions and users to define their own rulesets that could be
attached to their ROLE GUC: default_static_analyzer_ruleset and also
passed in as a modifier in the EXPLAIN invocation.


If you knew there's a good chance you might make a mistake, you could 
probably avoid doing it in the first place.  I make mistakes all the 
time, would I then always execute two commands when I want to do 
something?  Having to run a special "check my query please" command 
seems silly.



Anyway, the idea of using a GUC and evaluating every query that is written
(the added if statements), is not that appealing even if the impact of one
more check is likely to be insignificant (is it?)


I'm not sure how you would do this differently in the EXPLAIN (STATIC) 
case.  Those ifs have to be somewhere, and that's always a non-zero cost.


That being said, yes, performance costs of course have to be evaluated 
carefully.



.marko




Re: [HACKERS] Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread David Johnston
On Fri, Sep 5, 2014 at 6:27 PM, Marko Tiikkaja ma...@joh.to wrote:

 On 2014-09-05 11:19 PM, David G Johnston wrote:

 Marko Tiikkaja-4 wrote


  I probably couldn't mount a convincing defense of my opinion but at first
 blush I'd say we should pass on this.  Not with prejudice - looking at the
 issue periodically has merit - but right now it seems to be mostly adding
 clutter to the code without significant benefit.


 Are you talking about this particular option, or the notion of parser
 warnings in general?  Because if we're going to add a handful of warnings,
 I don't see why this couldn't be on that list.


This option as implemented in this specific manner.


  Tangential - I'd rather see something like EXPLAIN (STATIC) that would
 allow a user to quickly invoke a built-in static SQL analyzer on their
 query
 and have it point this kind of thing out.  Adding a catalog for STATIC
 configurations in much the same way we have text search configurations
 would
 allow extensions and users to define their own rulesets that could be
 attached to their ROLE GUC: default_static_analyzer_ruleset and also
 passed in as a modifier in the EXPLAIN invocation.


 If you knew there's a good chance you might make a mistake, you could
 probably avoid doing it in the first place.  I make mistakes all the time,
 would I then always execute two commands when I want to do something?
 Having to run a special "check my query please" command seems silly.


My follow-on post posited a solution that still uses the EXPLAIN mechanism
but configured in a way so users can have it be always on if desired.



  Anyway, the idea of using a GUC and evaluating every query that is written
 (the added if statements), is not that appealing even if the impact of one
 more check is likely to be insignificant (is it?)


 I'm not sure how you would do this differently in the EXPLAIN (STATIC)
 case.  Those ifs have to be somewhere, and that's always a non-zero cost.

 That being said, yes, performance costs of course have to be evaluated
 carefully.


If you add lots more checks that is lots more ifs compared to a single if
to see if static analysis should be attempted in the first place.  For a
single option it's largely a wash.  Beyond that I don't know enough to say
whether having the parser dispatch the query differently based upon it
being preceded with EXPLAIN (STATIC) or a standard query would be any
faster than a single IF for a single check.

The more options you can think of for a novice mode the more appealing a
formal static analysis interface becomes - both for execution and also
because the number of options being introduced and managed can be made into
formal configurations instead of an independent but related set of GUC
variables.

David J.


Re: [HACKERS] Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread Marko Tiikkaja

On 2014-09-05 11:33 PM, David G Johnston wrote:

To protect users on every query they write there would need to be some kind
of "always explain first and only execute if no warnings are thrown"
mode...and ideally some way to then override that on a per-query basis if
you know you are correct and don't want to fix the SQL...

If the static check fails the query itself would error and the detail would
contain the status result of the static analysis; otherwise the query should
return as normal.


This feels even sillier.  Instinctively, if you can contain the 
functionality into the EXPLAIN path only, I don't see why you couldn't 
do it in a single "if (..)" for every query.  I doubt you could ever 
measure that difference.



This at least avoids having to introduce 10 different GUCs just to
accommodate this feature and neatly bundles them into named packages.


I disagree.  Even if there was such a static analysis mode, I think 
there would have to be some way of filtering some of them out.


But 10 different GUCs can still be avoided; see 
plpgsql.extra_warnings, for example.
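plpgsql.extra_warnings (added in 9.4) takes a comma-separated list of warning names, so a single GUC covers many checks. A minimal example with the real shadowed_variables warning:

```sql
SET plpgsql.extra_warnings TO 'shadowed_variables';

CREATE FUNCTION f(f1 int) RETURNS int AS $$
DECLARE
  f1 int;  -- WARNING:  variable "f1" shadows a previously defined variable
BEGIN
  RETURN f1;
END;
$$ LANGUAGE plpgsql;
```

A parser-level equivalent could grow new names in the same list (or a sibling GUC) rather than one boolean GUC per check.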



.marko




Re: [HACKERS] Re: [BUGS] Re: BUG #9555: pg_dump for tables with inheritance recreates the table with the wrong order of columns

2014-09-05 Thread Bruce Momjian
On Wed, Sep  3, 2014 at 12:07:55PM -0400, Bruce Momjian wrote:
 On Mon, Sep  1, 2014 at 04:40:11PM -0400, Bruce Momjian wrote:
  On Mon, Sep  1, 2014 at 04:06:58PM -0400, Tom Lane wrote:
   Bruce Momjian br...@momjian.us writes:
NOTICE:  moving and merging column c with inherited definition
DETAIL:  user-specified column moved to the location of the inherited column
   
   Dept of nitpicking: errdetail messages are supposed to be complete
   sentences, properly capitalized and punctuated.  Please re-read the
   style guidelines if you have forgotten them.
  
  Oh, yeah;  updated patch attached.
 
 OK, patch applied.  This will warn about reordering that happens via
 SQL, and via pg_dump restore.  Do we want to go farther and preserve
 column ordering by adding ALTER TABLE [constraint] ISLOCAL and have
 pg_dump reuse binary-upgrade mode?

OK, hearing nothing, I will consider the improved notice message as
sufficient and this item closed.



Re: [HACKERS] PL/pgSQL 2

2014-09-05 Thread Andrew Dunstan


On 09/05/2014 12:37 PM, Merlin Moncure wrote:

On Thu, Sep 4, 2014 at 6:40 PM, Florian Pflug f...@phlo.org wrote:

Cost of hidden IO cast is negative too. If we can change it, then we can
increase speed.

But the whole power of PL/pgSQL comes from the fact that it allows you to
use the full set of postgres data types and operators, and that it is seamlessly
integrated with SQL. Without that, PL/pgSQL is about as appealing as BASIC
as a programming language...

Right, and it's exactly those types and operators that are the cause
of the performance issues.  A compiled pl/pgsql would only get serious
benefit for scenarios involving tons of heavy iteration or funky local
data structure manipulation.  Those scenarios are somewhat rare in
practice for database applications and often better handled in a
another pl should they happen.

plv8 is emerging as the best non-sql choice: it's JIT compiled by the plv8
runtime, the javascript language is designed for embedding, and the
json data structure has nice similarities with postgres's arrays and
types.  In fact, if I *were* to attempt pl/pgsql compiling, I'd
probably translate the code to plv8 and hand it off to the llvm
engine.  You'd still have to let postgres handle most of the operator
and cast operations but you could pull some things into the plv8
engine.  Probably, this would be a net loser since plv8 (unlike
plpgsql) has to run everything through SPI.


plpgsql makes extensive use of SPI. Just look at the source code if you 
don't believe me.


plv8 also has a nice find_function gadget that lets you find and call 
another plv8 function directly instead of having to use an SPI call.


It has two serious defects in my view, that it inherits from v8.

First, and foremost, it has the old really really horrible Javascript 
scoping rules for variables. This makes it totally unsuitable for 
anything except trivially short functions. There is good news and bad 
news on this front: modern versions of v8 have code to allow proper 
lexical scoping as provided for in the draft ECMASCRIPT6 standard (the 
feature is named harmony scoping). Example of command line use:


   [andrew@vpncli plv8js]$ d8 --use-strict --harmony
   V8 version 3.14.5.10 [console: readline]
   d8> var i = 10; for (let i = 0; i < 3; i++) { let j = i; for (let i = 4; i < 6; i++) { print("j " + j + " i " + i); } }
   j 0 i 4
   j 0 i 5
   j 1 i 4
   j 1 i 5
   j 2 i 4
   j 2 i 5
   d8> print(i);
   10
   d8>

The bad news is that neither Hitoshi nor I (yet) know how to allow 
setting these flags for the plv8 embedded engine.


The other defect is that its string handling is just awful. It has 
neither multiline strings nor interpolation into strings.


The good news is that the new draft standard addresses these issues too, 
with something called template strings. The bad news is that V8 doesn't 
yet have code to support the feature, AFAICT. The Mozilla people are a 
bit ahead here, and this feature is due in a release of their rhino 
javascript library that will be in Mozilla 34, due out in November, 
AIUI. Let's hope that the V8 guys get their act together on this.


cheers

andrew




Re: [HACKERS] pg_upgrade and epoch

2014-09-05 Thread Bruce Momjian
On Tue, Sep  2, 2014 at 10:38:55PM -0700, Sergey Konoplev wrote:
 On Tue, Sep 2, 2014 at 7:59 PM, Bruce Momjian br...@momjian.us wrote:
  On Wed, Apr 23, 2014 at 12:41:41PM -0700, Sergey Konoplev wrote:
  On Wed, Apr 23, 2014 at 5:26 AM, Bruce Momjian br...@momjian.us wrote:
   Sergey, are you seeing a problem only because you are
   interacting with other systems that didn't reset their epoch?
 
  I faced this after upgrading clusters with PgQ Skytools3 installed
  only. They didn't interact with any other systems.
 
  I have developed the attached patch which causes pg_upgrade to preserve
  the transaction epoch.  I plan to apply this for PG 9.5.
 
 That is a great news. Thank you, Bruce.

Patch applied.



Re: [HACKERS] Display of timestamp in pg_dump custom format

2014-09-05 Thread Bruce Momjian
On Thu, Sep  4, 2014 at 01:19:31PM -0400, Bruce Momjian wrote:
 Uh, not sure what I was thinking --- strftime() is the way to go.  Here
 is the new output:
 
   ;
  ; Archive created at 2014-09-04 13:00:15 -0400   <---
   ; dbname: test
   ; TOC Entries: 8
   ; Compression: -1
   ; Dump Version: 1.12-0
   ; Format: CUSTOM
   ; Integer: 4 bytes
   ; Offset: 8 bytes
   ; Dumped from database version: 9.5devel
   ; Dumped by pg_dump version: 9.5devel
 
 I found two other places in our dump code that use strftime with a
 similar format, but they had problems with the timezone string on
 Windows, so I switched those over to use a numeric timezone offset as
 well.

Patch applied.



Re: [HACKERS] pg_upgrade and epoch

2014-09-05 Thread Greg Stark
On Wed, Sep 3, 2014 at 3:59 AM, Bruce Momjian br...@momjian.us wrote:
 I have developed the attached patch which causes pg_upgrade to preserve
 the transaction epoch.  I plan to apply this for PG 9.5.

I would say this is a simple bug and should be back patched to 9.4 and
9.3. We're only going to continue to get complaints from people
running into this.


-- 
greg




Re: [HACKERS] pg_upgrade and epoch

2014-09-05 Thread Bruce Momjian
On Sat, Sep  6, 2014 at 12:26:55AM +0100, Greg Stark wrote:
 On Wed, Sep 3, 2014 at 3:59 AM, Bruce Momjian br...@momjian.us wrote:
  I have developed the attached patch which causes pg_upgrade to preserve
  the transaction epoch.  I plan to apply this for PG 9.5.
 
 I would say this is a simple bug and should be back patched to 9.4 and
 9.3. We're only going to continue to get complaints from people
 running into this.

Yes, I did think about that, but it seems like a behavior change. 
However, it is tempting to avoid future bug reports about this.



Re: [HACKERS] Need Multixact Freezing Docs

2014-09-05 Thread Bruce Momjian
On Wed, Sep  3, 2014 at 05:17:17PM -0400, Robert Haas wrote:
  I had a look at this and came upon a problem --- there is no multi-xid
  SQL data type, and in fact the system catalogs that store mxid values
  use xid, e.g.:
 
   relminmxid | xid   | not null
 
  With no mxid data type, there is no way to do function overloading to
  cause age to call the mxid variant.
 
  Should we use an explicit mxid_age() function name?  Add an mxid data
  type?
 
 Maybe both.  But mxid_age() is probably the simpler way forward just to start.

OK, patch applied using mxid_age() and no new data type.
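With the function installed, multixact freeze age can be checked from SQL. A sketch (column and function names per the patch; the relkind filter is just one reasonable choice):

```sql
-- Tables with the oldest pg_class.relminmxid, i.e. the most overdue
-- for a whole-table multixact freeze scan:
SELECT relname, mxid_age(relminmxid)
FROM pg_class
WHERE relkind = 'r'
ORDER BY 2 DESC
LIMIT 10;
```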

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
new file mode 100644
index cf174f0..d692308
*** a/doc/src/sgml/maintenance.sgml
--- b/doc/src/sgml/maintenance.sgml
*** HINT:  Stop the postmaster and vacuum th
*** 640,646 
   possible multixact ID still appearing in any tuple of that table.
   If this value is older than
   xref linkend=guc-vacuum-multixact-freeze-table-age, a whole-table
!  scan is forced.  Whole-table commandVACUUM/ scans, regardless of
   what causes them, enable advancing the value for that table.
   Eventually, as all tables in all databases are scanned and their
   oldest multixact values are advanced, on-disk storage for older
--- 640,651 
   possible multixact ID still appearing in any tuple of that table.
   If this value is older than
   xref linkend=guc-vacuum-multixact-freeze-table-age, a whole-table
!  scan is forced.  functionmxid_age()/ can be used on
!  structnamepg_class/.structfieldrelminmxid/ to find its age.
! /para
! 
! para
!  Whole-table commandVACUUM/ scans, regardless of
   what causes them, enable advancing the value for that table.
   Eventually, as all tables in all databases are scanned and their
   oldest multixact values are advanced, on-disk storage for older
diff --git a/src/backend/utils/adt/xid.c b/src/backend/utils/adt/xid.c
new file mode 100644
index 602a9e5..ecb3cf5
*** a/src/backend/utils/adt/xid.c
--- b/src/backend/utils/adt/xid.c
***
*** 16,21 
--- 16,22 
  
  #include limits.h
  
+ #include access/multixact.h
  #include access/transam.h
  #include access/xact.h
  #include libpq/pqformat.h
*** xid_age(PG_FUNCTION_ARGS)
*** 100,105 
--- 101,121 
  		PG_RETURN_INT32(INT_MAX);
  
  	PG_RETURN_INT32((int32) (now - xid));
+ }
+ 
+ /*
+  *		mxid_age			- compute age of a multi XID (relative to latest stable mxid)
+  */
+ Datum
+ mxid_age(PG_FUNCTION_ARGS)
+ {
+ 	TransactionId xid = PG_GETARG_TRANSACTIONID(0);
+ 	MultiXactId now = ReadNextMultiXactId();
+ 
+ 	if (!MultiXactIdIsValid(xid))
+ 		PG_RETURN_INT32(INT_MAX);
+ 
+ 	PG_RETURN_INT32((int32) (now - xid));
  }
  
  /*
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
new file mode 100644
index e1b62a5..e0875b7
*** a/src/include/catalog/catversion.h
--- b/src/include/catalog/catversion.h
***
*** 53,58 
   */
  
  /*			mmddN */
! #define CATALOG_VERSION_NO	201408281
  
  #endif
--- 53,58 
   */
  
  /*			mmddN */
! #define CATALOG_VERSION_NO	201409021
  
  #endif
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 5176ed0..09e138b
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*** DATA(insert OID = 1180 (  abstime		   PG
*** 1277,1282 
--- 1277,1284 
  DESCR(convert timestamp with time zone to abstime);
  DATA(insert OID = 1181 (  age			   PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 23 28 _null_ _null_ _null_ _null_ xid_age _null_ _null_ _null_ ));
  DESCR(age of a transaction ID, in transactions before current transaction);
+ DATA(insert OID = 3218 (  mxid_age		   PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 23 28 _null_ _null_ _null_ _null_ mxid_age _null_ _null_ _null_ ));
+ DESCR(age of a multi-transaction ID, in multi-transactions before current multi-transaction);
  
  DATA(insert OID = 1188 (  timestamptz_mi   PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 1186 1184 1184 _null_ _null_ _null_ _null_ timestamp_mi _null_ _null_ _null_ ));
  DATA(insert OID = 1189 (  timestamptz_pl_interval PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 1184 1184 1186 _null_ _null_ _null_ _null_ timestamptz_pl_interval _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
new file mode 100644
index 78cc0a0..d88e7a3
*** a/src/include/utils/builtins.h
--- b/src/include/utils/builtins.h
*** extern Datum xidrecv(PG_FUNCTION_ARGS);
*** 845,850 
--- 845,851 
  extern Datum xidsend(PG_FUNCTION_ARGS);
  extern Datum xideq(PG_FUNCTION_ARGS);
  extern Datum xid_age(PG_FUNCTION_ARGS);
+ extern Datum 

[HACKERS] Adding a nullable DOMAIN column w/ CHECK

2014-09-05 Thread Marko Tiikkaja

Hi,

First of all, sorry about breaking the thread; I don't subscribe to 
-general so I can't copy the original email.  This is in response to the 
problem here: 
http://www.postgresql.org/message-id/CACfv+p+8dToaR7h06_M_YMjpV5Na-CQq7kN=kgjq_k84h7u...@mail.gmail.com


Attached is a very ugly proof-of-concept patch of how this could work. 
With this applied, the ALTER TABLE becomes fast:


=# create domain test_domain numeric check (value > 0);
CREATE DOMAIN
Time: 2,443 ms
=# create table test_domain_table as select generate_series(1, 200);
SELECT 200
Time: 1802,681 ms
=# alter table test_domain_table add column t test_domain;
ALTER TABLE
Time: 4,991 ms

The patch is obviously a load of horse crap, but does anyone have any 
objections to the general approach of making this pattern faster?  To do 
this optimization we do have to assume that CHECKs in DOMAINs are at 
least STABLE, but I don't see that as a problem; those should be 
IMMUTABLE anyway, I think.
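The reason one evaluation suffices: every row added by the ALTER gets the same value, NULL, so coercing NULL through the domain once answers the question for all rows. A domain whose CHECK rejects NULL would (and should) still fail immediately; an illustrative counter-example:

```sql
-- With the proposed fast path, the one-time coercion of NULL
-- violates this domain's CHECK, so the ALTER errors right away
-- rather than silently adding an invalid column:
CREATE DOMAIN strict_domain AS numeric CHECK (VALUE IS NOT NULL);
ALTER TABLE test_domain_table ADD COLUMN u strict_domain;
-- ERROR:  value for domain strict_domain violates check constraint
```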



.marko
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 4795,4815  ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
  			Oid			baseTypeId;
  			int32		baseTypeMod;
  			Oid			baseTypeColl;
  
  			baseTypeMod = typmod;
  			baseTypeId = getBaseTypeAndTypmod(typeOid, baseTypeMod);
  			baseTypeColl = get_typcollation(baseTypeId);
! 			defval = (Expr *) makeNullConst(baseTypeId, baseTypeMod, baseTypeColl);
! 			defval = (Expr *) coerce_to_target_type(NULL,
! 	(Node *) defval,
! 	baseTypeId,
! 	typeOid,
! 	typmod,
! 	COERCION_ASSIGNMENT,
! 	COERCE_IMPLICIT_CAST,
! 	-1);
! 			if (defval == NULL) /* should not happen */
  elog(ERROR, failed to coerce base type to domain);
  		}
  
  		if (defval)
--- 4795,4833 
  			Oid			baseTypeId;
  			int32		baseTypeMod;
  			Oid			baseTypeColl;
+ 			Expr	   *nullval;
+ 			ExprState  *nullvalState;
+ 			ExprContext *econtext;
+ 			bool		isnull;
+ 			Snapshot	snapshot;
+ 			EState	   *estate;
  
  			baseTypeMod = typmod;
  			baseTypeId = getBaseTypeAndTypmod(typeOid, baseTypeMod);
  			baseTypeColl = get_typcollation(baseTypeId);
! 			nullval = (Expr *) makeNullConst(baseTypeId, baseTypeMod, baseTypeColl);
! 			nullval = (Expr *) coerce_to_target_type(NULL,
! 	 (Node *) nullval,
! 	 baseTypeId,
! 	 typeOid,
! 	 typmod,
! 	 COERCION_ASSIGNMENT,
! 	 COERCE_IMPLICIT_CAST,
! 	 -1);
! 			if (nullval == NULL) /* should not happen */
  elog(ERROR, failed to coerce base type to domain);
+ 			nullval = expression_planner(nullval);
+ 			nullvalState = ExecInitExpr(nullval, NULL);
+ 			estate = CreateExecutorState();
+ 			econtext = GetPerTupleExprContext(estate);
+ 			snapshot = RegisterSnapshot(GetLatestSnapshot());
+ 			(void) ExecEvalExpr(nullvalState,
+ econtext,
+ &isnull,
+ NULL);
+ 			(void) isnull;
+ 			UnregisterSnapshot(snapshot);
+ 			FreeExecutorState(estate);
  		}
  
  		if (defval)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A mechanism securing web applications in DBMS

2014-09-05 Thread Stephen Frost
Zhaomo,

  As an FYI- we generally prefer inline responses rather than
  top-posting on the PostgreSQL mailing lists.  Thanks.

* Zhaomo Yang (zhy...@cs.ucsd.edu) wrote:
 (1) Two philosophies important to our design
 - Try to not force web application developers to make too many changes
 to their apps if they wa.

That's certainly fair.

 - Try to make our mechanism as simple as possible.
 Web application developers have all kinds of backgrounds. If the
 security mechanism is too alien to them, they wouldn't use it.

I'm surprised to hear this and a suggestion to use stored procedures in
the same email- SPs are generally considered 'foreign' to the web
developers that I've talked to. :)  That said, I'll grant that there are
generally two camps: those who expect a database to only have BerkeleyDB
level key/value capabilities, and those who know what they're doing and
what relational databases and SQL are all about.  The latter (and clear
minority) group will take advantage of these capabilities, certainly,
regardless of how they are expressed and are likely already comfortable
using stored procedures and database-level roles.

 (2) Why we need to cache application-level users' identifiers
 We want to differentiate application-level users in DBMS, but not by
 creating a DB user (or role in PG's terminology ) for every
 application-level user, otherwise there will be all sorts of problems
 when the number of application-level users is greater than a threshold
 (e.g. catalog, as you mentioned). 

While I agree that this can be an issue when things scale up, you *can*
address it by sharding the database based on user.  Even so though, I
agree that PG would do well to improve the situation around this.

 Instead, we still use one DB user
 representing all the application-level users, just as how web apps
 work now. Besides, the identifiers (attributes) of an application-level
 user are stored in some private place of the corresponding session
 (e.g. temp table) when the application-level user authenticates so
 that the DBMS can differentiate application-level users. (Connection
 pooling should be fine as long as an application session doesn't
 return its connection until it finishes. )

Fair enough, and the RLS capabilities which are being added to PG will
support this approach.  If a temp table is being used then dynamic SQL
may be required and therefore a plpgsql function will be involved to
handle looking up the current user, as you won't be using PG roles.

 Normally, a web application authenticates an application-level user by
 making a SELECT query with the user provided user id and password on
 the password table to see if there is a match (Of course this is an
 over simplified version of how authentication works. ).  Using our
 mechanism, the web application instead calls the authentication
 function, which does a SELECT on the table first, and store the
 identifiers of that application-level user somewhere if a match is found.
 The identifiers of the current application-level user are referenced
 by the policies so that fine-grained access control can be enforced.

That 'somewhere' is certainly something that PG could improve upon- we
don't have SQL-level variable capability today and this means that temp
tables have to be used, which is certainly unfortunate.  I'd love to see
work done to improve this situation.
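
One workaround that exists today (a sketch of common practice, not something proposed in this thread; all names invented) is to emulate session variables with custom GUCs via set_config()/current_setting():

```sql
-- Stash the application-level user id for this session; the final
-- 'false' makes the setting session-local rather than transaction-local.
SELECT set_config('myapp.current_user_id', '42', false);

-- A helper that security policies could reference:
CREATE OR REPLACE FUNCTION app_current_user_id() RETURNS bigint AS $$
    SELECT current_setting('myapp.current_user_id')::bigint;
$$ LANGUAGE sql STABLE;

SELECT app_current_user_id();  -- 42
```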

 (3) CREATE AUTHENTICATION FUNCTION
 In our mechanism, we ask web application developers to provide an
 authentication function which normally takes user id and password as
 inputs and returns a row containing all the identifiers (attributes)
 of the corresponding application-level user. Let us call the place
 storing the current application-level user's identifiers as
 identifier store.

I would *strongly* advocate *against* passing the password to the
database in any (non-hashed) form.  You are much better off using a
one-way hash as early as possible in the stack (ideally, in whatever
system initially receives the password on the server side) and then
comparing that one-way hash.  Of course, passwords in general are not
considered secure; one-time passwords, hardware tokens, or PIV /
HSPD12 / CAC cards with client-side certificates are preferable.

 The whole point of this CREATE AUTHENTICATION FUNCTION syntax is to
 reduce developers' work.  By giving developers very specific
 instructions on how to write an authentication function, we hope they
 would find it easy to write one. Admittedly, however, what CREATE
 AUTHENTICATION FUNCTION does can be achieved by CREATE FUNCTION.

I don't see how this is particularly better than simply providing a
function-creating-function (if there is really a concern that creating
two functions instead of just the one is a serious complication..) or,
better yet, creating an extension which creates all the functions,
tables, etc necessary for this system.

Thanks!

Stephen




Re: [HACKERS] A mechanism securing web applications in DBMS

2014-09-05 Thread Zhaomo Yang
Stephen,

There are several things I want to explain:

(1) Two philosophies important to our design
- Try to not force web application developers to make too many changes
to their apps if they wa.
- Try to make our mechanism as simple as possible.
Web application developers have all kinds of backgrounds. If the
security mechanism is too alien to them, they wouldn't use it.

(2) Why we need to cache application-level users' identifiers
We want to differentiate application-level users in DBMS, but not by
creating a DB user (or role in PG's terminology ) for every
application-level user, otherwise there will be all sorts of problems
when the number of application-level users is greater than a threshold
(e.g. catalog, as you mentioned). Instead, we still use one DB user
representing all the application-level users, just as how web apps
work now. Besides, the identifiers (attributes) of an application-level
user are stored in some private place of the corresponding session
(e.g. temp table) when the application-level user authenticates so
that the DBMS can differentiate application-level users. (Connection
pooling should be fine as long as an application session doesn't
return its connection until it finishes. )

Normally, a web application authenticates an application-level user by
making a SELECT query with the user provided user id and password on
the password table to see if there is a match (Of course this is an
over simplified version of how authentication works. ).  Using our
mechanism, the web application instead calls the authentication
function, which does a SELECT on the table first, and store the
identifiers of that application-level user somewhere if a match is found.
The identifiers of the current application-level user are referenced
by the policies so that fine-grained access control can be enforced.

(3) CREATE AUTHENTICATION FUNCTION
In our mechanism, we ask web application developers to provide an
authentication function which normally takes user id and password as
inputs and returns a row containing all the identifiers (attributes)
of the corresponding application-level user. Let us call the place
storing the current application-level user's identifiers as
identifier store.

For example,

This is an authentication function provided by a app developer.

CREATE AUTHENTICATION FUNCTION auth (user_id TEXT, password TEXT)
RETURNS table(uid BIGINT, permissions TEXT[]) AS $$
...
$$ LANGUAGE plpgsql SECURITY DEFINER;

Under the hood, two regular functions will be defined

-- the same function with a different name
CREATE FUNCTION _auth (user_id TEXT, password TEXT)
RETURNS table(uid BIGINT, permissions TEXT[]) AS $$
  -- copy the function body from the CREATE AUTHENTICATION FUNCTION above
$$ LANGUAGE plpgsql SECURITY DEFINER;

-- the function which is actually called in the server code
CREATE FUNCTION auth (user_id TEXT, password TEXT)
RETURNS table(uid BIGINT, permissions TEXT[]) AS $$
  -- clear the identifier store
  -- execute function _auth and insert the result into the identifier store
  -- return the row in the identifier store
$$ LANGUAGE plpgsql SECURITY DEFINER;
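
A plausible expansion of that wrapper, assuming the identifier store is a per-session temp table (all names here are hypothetical, filling in the commented steps above):

```sql
CREATE FUNCTION auth (user_id TEXT, password TEXT)
RETURNS table(uid BIGINT, permissions TEXT[]) AS $$
BEGIN
    -- (Re)create and clear the per-session identifier store.
    CREATE TEMP TABLE IF NOT EXISTS identifier_store
        (uid BIGINT, permissions TEXT[]);
    DELETE FROM identifier_store;

    -- Run the real authentication logic and cache its result.
    INSERT INTO identifier_store
        SELECT * FROM _auth(user_id, password);

    -- Hand the cached identifiers back to the caller.
    RETURN QUERY SELECT * FROM identifier_store;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
```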

The whole point of this CREATE AUTHENTICATION FUNCTION syntax is to
reduce developers' work.  By giving developers very specific
instructions on how to write an authentication function, we hope they
would find it easy to write one. Admittedly, however, what CREATE
AUTHENTICATION FUNCTION does can be achieved by CREATE FUNCTION.

Please let me know if you have any other questions.

Zhaomo

On Thu, Sep 4, 2014 at 6:53 PM, Stephen Frost sfr...@snowman.net wrote:
 Zhaomo,

 * Zhaomo Yang (zhy...@cs.ucsd.edu) wrote:
 I am a graduate student from UC San Diego. My adviser, Dr. Kirill
 Levchenko, and I have been working on a web/DB security project for
 the last few months. Since fine-grained access control in DBMS is part
 of our project and the PostgreSQL community is also working on it now,
 we would like to exchange some ideas on the issue with you.

 Fantastic!  Very interested.

 1. Background
 Nowadays people entrust more and more sensitive data to web
 applications, while security vulnerabilities are common in these
 applications. The normal structure of web applications consists of
 three tiers: client-side code, server-side code and a database. In
 server-side code a database user representing all the
 application-level users interacts with the database with full
 privileges. Since currently database built-in access control is too
 coarse, no access control mechanism is used in most of web
 applications. This situation is not ideal since a malicious
 application user can tamper with other users’ data by exploiting
 vulnerabilities in the application, and if the application is
 completely under the malicious user’s control so is the database.

 Agreed- we are certainly working to improve that situation, though
 consideration must also be given to how our catalogs are structured and
 that they are unlikely to support web-scale numbers of 

Re: [HACKERS] PL/pgSQL 1.2

2014-09-05 Thread Marko Tiikkaja

On 2014-09-04 2:28 PM, I wrote:

On 9/4/14 2:04 PM, Pavel Stehule wrote:

for example best practices for PL/SQL by Steven Feuerstein


I'll spend some time with that book to have a better idea on where
you're coming from.


I've read through this book twice now.  Some observations on things we 
don't follow:


  - We don't use the exact hungarian notation -ish convention for 
naming stuff.  I don't see that as a bad thing.
  - Granted, we could be using the "myfield tablename.columnname%TYPE;" 
 probably more.  On the other hand, sometimes you would prefer to not 
have all your types in your functions change transparently after an 
ALTER TABLE.
  - The book takes the single exit point thinking to an extreme.  I 
don't agree with that, regardless of the language (and thus I might not 
necessarily always follow it).
  - The book says "Encapsulate INSERT, UPDATE, and DELETE statements 
behind procedure calls", which quite directly contradicts what you said 
earlier.


The rest of the stuff we follow in our codebase as far as I can tell 
(except the Oracle-specific stuff, obviously).


But further, even if we did follow every single one of the above points 
perfectly, it wouldn't change the point we're trying to make.  What 
we're doing is following what the book dedicated an entire chapter to: 
Defensive Programming.  Enforcing that that UPDATE affected exactly one 
row?  Defensive Programming.
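
For concreteness, the check in question spelled out in stock PL/pgSQL (function and table names invented for illustration):

```sql
CREATE OR REPLACE FUNCTION rename_user(p_id bigint, p_name text)
RETURNS void AS $$
DECLARE
    v_count bigint;
BEGIN
    UPDATE users SET name = p_name WHERE id = p_id;
    GET DIAGNOSTICS v_count = ROW_COUNT;
    -- Defensive programming: exactly one row must have been touched.
    IF v_count <> 1 THEN
        RAISE EXCEPTION 'expected to update 1 row in users, got %', v_count;
    END IF;
END;
$$ LANGUAGE plpgsql;
```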



.marko




Re: [HACKERS] PL/pgSQL 2

2014-09-05 Thread Marko Tiikkaja

On 2014-09-02 8:52 PM, Kevin Grittner wrote:

Marko Tiikkaja ma...@joh.to wrote:


Sounds like in this case you'd only use set-oriented programming
at the end of the transaction, no?


I guess -- more properly I would say in the final database
transaction for that financial transaction.


Yes, I should have said financial transaction, but I hit send a bit 
too early.



And no, that never
made me wish that plpgsql functions defaulted to throwing errors
for DML statements that affected more than one row.


Fine.  But you should still be able to see the point we're trying to 
make.  The number one is special, and it's present everywhere.  If you 
want to program defensively, you have to go through a lot of pain right 
now.  We're looking for a way to alleviate that pain.  Defaulting to 
throwing errors would be one way to do it, but that's not what's being 
suggested here anymore.


You can dismiss what we're doing by saying that it doesn't follow the 
best practices or we just want an interface for a key-value store or 
whatever.  And yes, to some extent, a simple interface for a key-value 
store would come in handy.  But we still have the 5-15% (big part of it 
being the reporting we need to do) of the code that *doesn't* want that, 
*and* we want to use all of the Postgres features where applicable.



.marko


--


Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-09-05 Thread Peter Geoghegan
On Wed, Sep 3, 2014 at 2:44 PM, Peter Geoghegan p...@heroku.com wrote:
 I guess it should still be a configure option, then. Or maybe there
 should just be a USE_ABBREV_KEYS macro within pg_config_manual.h.

Attached additional patches are intended to be applied on top off most
of the patches posted on September 2nd [1]. Note that you should not
apply patch 0001-* from that set to master, since it has already been
committed to master [2]. However, while rebasing I revised
patch/commit 0005-* to abbreviation used on all platforms, including
32-bit platforms (the prior 0005-* patch just re-enabled the
optimization on Darwin/Apple), so you should discard the earlier
0005-* patch. In a later commit I also properly formalize the idea
that we always do opportunistic memcmp() == 0 checks, no matter what
context a sortsupport-accelerated text comparison occurs in. That
seems like a good idea, but it's broken out in a separate commit in
case you are not in agreement.

While I gave serious consideration to your idea of having a dedicated
abbreviation comparator, and not duplicating sortsupport state when
abbreviated keys are used (going so far as to almost fully implement
the idea), I ultimately decided that my vote says we don't do that. It
seemed to me that there were negligible benefits for increased
complexity. In particular, I didn't want to burden tuplesort with
having to worry about whether or not abbreviation was aborted during
tuple copying, or was not used by the opclass in the first place -
implementing your scheme makes that distinction relevant. It's very
convenient to have comparetup_heap() compare the leading sort key
(that specifically looks at SortTuple.datum1 pairs) indifferently,
using the same comparator for abbreviated and not abbreviated
cases indifferently. comparetup_heap() does not seem like a great
place to burden with caring about each combination any more than
strictly necessary.

I like that I don't have to care about every combination, and can
treat abbreviation abortion as the special case with the extra step,
in line with how I think of the optimization conceptually. Does that
make sense? Otherwise, there'd have to be a ApplySortComparator()
*and* ApplySortComparatorAbbreviated() call with SortTuple.datum1
pairs passed, as appropriate for each opclass (and abortion state), as
well as a heap_getattr() tie-breaker call for the latter case alone
(when we got an inconclusive answer, OR when abbreviation was
aborted). Finally, just as things are now, there'd have to be a loop
where the second or subsequent attributes are dealt with by
ApplySortComparator()'ing. So AFAICT under your scheme there are 4
ApplySortComparator* call sites required, rather than 3 as under mine.

Along similar lines, I thought about starting from nkey = 0 within
comparetup_heap() when abortion occurs (so that there'd only be 2
ApplySortComparator() call sites - no increase from master) , but that
turns out to be messy, plus I like those special tie-breaker
assertions.

I will be away for much of next week, and will have limited access to
e-mail. I will be around tomorrow, though. I hope that what I've
posted is suitable to commit without further input from me.

[1] 
http://www.postgresql.org/message-id/cam3swztetqckc24lhwkdlasjf-b-ccnn4q0oyjhgbx+ncpn...@mail.gmail.com
[2] 
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d8d4965dc29263462932be03d4206aa694e2cd7e
-- 
Peter Geoghegan
From 889d615218d5cac9097c11bec33e2d8e70112922 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan p...@heroku.com
Date: Fri, 5 Sep 2014 16:43:56 -0700
Subject: [PATCH 7/7] Soften wording about nodeAgg.c single column support

Support for forcing nodeAgg.c to use a heap tuplesort rather than a
Datum tuplesort, even when sorting on a single attribute, may be useful.
Datum tuplesorts do not and cannot support abbreviated keys, since Datum
tuplesorting necessitates preserving the original representation for
later re-use by certain datum tuplesort clients.  When using a heap
tuplesort will make the difference between using and not using
abbreviated keys, it is likely significantly faster to prefer a heap
tuplesort.  On the other hand, nodeAgg.c's current preference for a
datum tuplesort is likely to result in superior performance for
non-abbreviated cases.

Since it is now apparent that support for forcing a heap tuplesort
within nodeAgg.c will follow in a later, separately reviewed patch,
soften the wording in comments within nodeAgg.c to reflect this.  This
doesn't seem unreasonable, since sortsupport optimization is already
used inconsistently.
---
 src/backend/executor/nodeAgg.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 3b335e7..95143c3 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -365,9 +365,9 @@ initialize_aggregates(AggState *aggstate,
 			 * otherwise sort the full tuple.  (See comments 

Re: [HACKERS] PL/pgSQL 1.2

2014-09-05 Thread Pavel Stehule
2014-09-06 4:25 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 2014-09-04 2:28 PM, I wrote:

 On 9/4/14 2:04 PM, Pavel Stehule wrote:

 for example best practices for PL/SQL by Steven Feuerstein


 I'll spend some time with that book to have a better idea on where
 you're coming from.


 I've read through this book twice now.  Some observations on things we
 don't follow:

   - We don't use the exact hungarian notation -ish convention for naming
 stuff.  I don't see that as a bad thing.
   - Granted, we could be using the "myfield tablename.columnname%TYPE;"
 probably more.  On the other hand, sometimes you would prefer to not have
 all your types in your functions change transparently after an ALTER TABLE.
   - The book takes the single exit point thinking to an extreme.  I
 don't agree with that, regardless of the language (and thus I might not
 necessarily always follow it).
   - The book says "Encapsulate INSERT, UPDATE, and DELETE statements
 behind procedure calls", which quite directly contradicts what you said
 earlier.


Not necessarily -- it says complex SQL should not be repeated in
code, but it does not say that it must be stored in trivial
functions. Complex queries should be wrapped in views instead - that doesn't
block the optimizer.

There is a strong warning against breaking the optimizer.



 The rest of the stuff we follow in our codebase as far as I can tell
 (except the Oracle-specific stuff, obviously).


Ten years ago I wrote an article
http://postgres.cz/wiki/PL/pgSQL_%28en%29#Recommendation_for_design_of_stored_procedures_in_PL.2FpqSQL_language
based on Steven Feuerstein's, Joe Celko's, and others' presentations and books



There is a point: don't uselessly enclose SQL commands in trivial functions.

Here is the problem.

People can prepare simple functions like you did:

...

CREATE OR REPLACE FUNCTION user_list(int)
RETURNS SETOF int AS $$
BEGIN
  RETURN QUERY SELECT id FROM user WHERE .. some = $1;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION update_user(int)
RETURNS void AS $$
BEGIN
  UPDATE user SET .. WHERE id = $1;
END;
$$ LANGUAGE plpgsql;

And then  use it in mass operations:

BEGIN
  FOR company IN SELECT * FROM company_list()
  LOOP
    FOR id IN SELECT * FROM user_list(company)
    LOOP
      PERFORM update_user(id);
    END LOOP;
  END LOOP;
END;

Or use it in application same style.

It is safe .. sure, and I accept it. But it is terribly slow.

If you are lucky and have some knowledge, you can use an SQL function in
Postgres. It is a macro, so it is not a black box for the optimizer, but I am
not sure whether the Postgres optimizer can do a good job in this case too.

This is Joe Celko's favorite theme.
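
A sketch of the distinction (table and column names invented): a simple SQL-language function can be expanded by the planner like a macro, while a plpgsql function forces row-at-a-time execution:

```sql
-- Simple, single-SELECT SQL function: a candidate for planner inlining.
CREATE OR REPLACE FUNCTION user_ids_for(p_company int)
RETURNS SETOF int AS $$
    SELECT id FROM users WHERE company_id = p_company;
$$ LANGUAGE sql STABLE;

-- The planner may inline the function body into the outer query and
-- optimize the whole join, instead of calling it once per company:
SELECT uid
FROM companies c, user_ids_for(c.id) AS uid;
```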


 But further, even if we did follow every single one of the above points
 perfectly, it wouldn't change the point we're trying to make.  What we're
 doing is following what the book dedicated an entire chapter to: Defensive
 Programming.  Enforcing that that UPDATE affected exactly one row?
 Defensive Programming.


Your strategy is defensive. 100%. But then I don't understand your
resistance to verbosity. It is one of the basic stones of Ada's design.

The problem with a defensive strategy in stored procedures is the possibility
of blocking the optimizer, and the result can be terribly slow. In the end, it
needs a complex clustering solution, a complex 24/7 HA solution, and higher
complexity ~ less safety.

This is not a problem for low-load or low-data applications.

Banking applications are safe (and I accept that there it is necessary), but
they are not famous for speed.

Pavel




 .marko



Re: [HACKERS] Re: [PATCH] parser: optionally warn about missing AS for column and table aliases

2014-09-05 Thread Pavel Stehule
2014-09-06 0:53 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 2014-09-05 11:33 PM, David G Johnston wrote:

 To protect users on every query they write there would need to be some
 kind
 of "always explain first and only execute if no warnings are thrown"
 mode...and ideally some way to then override that on a per-query basis if
 you know you are correct and don't want to fix the SQL...

 If the static check fails the query itself would error and the detail
 would
 contain the status result of the static analysis; otherwise the query
 should
 return as normal.


 This feels even sillier.  Instinctively, if you can contain the
 functionality into the EXPLAIN path only, I don't see why you couldn't do
 it in a single  if (..)  for every query.  I doubt you could ever measure
 that difference.

  This at least avoids having to introduce 10 different GUC just to
 accommodate this feature and neatly bundles them into named packages.


 I disagree.  Even if there was such a static analysis mode, I think
 there would have to be some way of filtering some of them out.

 But 10 different GUCs can still be avoided; see plpgsql.extra_warnings,
 for example.


You used a good name for this mode in the other thread: "defensive". We can
support some "defensive mode".  Personally I am sure that if somebody wants
to use it, they will use it as a whole. Too much granularity increases the
complexity of Postgres configuration, and we don't want that.

"Novice mode" is not a correct name and can sound degrading.

In this mode people will maybe get more very specific warnings (DBAs don't
like that, because the logs are massive), and developers will have to use
explicit casting much more (developers don't like explicit casting in SQL).
But when we use a good name, then they can accept it and use it. It is a
paradox: "defensive mode" is mainly for professionals with experience - an
absolute beginner will probably hate this mode due to its restrictiveness.

Regards

Pavel



 .marko






Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-05 Thread Pavel Stehule
2014-09-05 14:35 GMT+02:00 Jan Wieck j...@wi3ck.info:

 On 09/05/2014 04:40 AM, Pavel Stehule wrote:




 2014-09-05 10:33 GMT+02:00 Marko Tiikkaja ma...@joh.to
 mailto:ma...@joh.to:

 On 9/5/14 10:09 AM, Pavel Stehule wrote:

 I think this could still be parsed correctly, though I'm not
 100% sure on
 that:

 ASSERT WARNING (EXISTS(SELECT ..)), 'data are there';


 PLpgSQL uses a ';' or some plpgsql keyword as the SQL statement
 delimiter. That is the
 reason why RETURN QUERY ends with ';'. So in this case it is practical to
 place the SQL
 statement at the end of the plpgsql statement.


 *shrug*  There are lots of cases where a comma is used as well, e.g.
 RAISE NOTICE '..', expr, expr;

 parenthesis are not practical, because it is hard to identify bug
 ..


 I don't see why.  The PL/PgSQL SQL parser goes to great lengths to
 identify unmatched parenthesis.  But the parens probably aren't
 necessary in the first place; you could just omit them and keep
 parsing until the next comma AFAICT.  So the syntax would be:

 RAISE [ NOTICE | WARNING | EXCEPTION/ASSERT/WHATEVER ]
 boolean_expr [, error_message [, error_message_param [, ... ] ] ];


 extending RAISE with an ASSERT level has minimal new value


 Adding a WHEN clause to RAISE would have the benefit of not needing any
 new keywords at all.

 RAISE EXCEPTION 'format' [, expr ...] WHEN row_count <> 1;


It was one my older proposal.

Can we find an agreement there?

Pavel




 Regards,
 Jan




 The simplicity of integrating SQL and PLpgSQL is in using smart
 keywords -
 it is more verbose, and it allows good diagnostics


 I disagree.  The new keywords provide nothing of value here.  They
 even encourage the use of quirky syntax in *exchange* for verbosity
 (IS NOT NULL pk? really?).


 It is about the semantics and conformity of the proposed tools. Sure, all can
 be reduced to ASSERT(expr) .. but where is the benefit over a function call?

 I am able to do this without any change to the language, as a plpgsql
 extension - there is no need to make any change for too thin a proposal



 .marko




 --
 Jan Wieck
 Senior Software Engineer
 http://slony.info



[HACKERS] plpgsql defensive mode

2014-09-05 Thread Pavel Stehule
Hi

There was a long discussion about future of PLpgSQL.

I accept that Joel and Marko have good ideas based on probably strong
experience from their domain. I can't accept their implementation and
proposals as defaults for PLpgSQL now and in the future. They try to mix wine
and vodka concepts, and that has no good end.

I understand their proposal as a restrictive subset of PLpgSQL.
"Restrictive subset" is not a good name, though. We can implement some
features without impact on syntax, as a block at the function level. Marko
likes "defensive programming", so we can use the name "defensive_mode".

In this mode, all DML statements should return EXACTLY ONE row, with the
exception of CURSORs and FOR LOOP cycles where more rows are expected. But in
that case we can raise a NODATA exception if there is no row.
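
For what it's worth, PL/pgSQL already has a statement-level form of this for single-row queries: SELECT ... INTO STRICT, which raises NO_DATA_FOUND or TOO_MANY_ROWS unless exactly one row comes back. A defensive mode would effectively extend that behavior to all DML (example names invented):

```sql
CREATE OR REPLACE FUNCTION get_user_name(p_id bigint)
RETURNS text AS $$
DECLARE
    v_name text;
BEGIN
    -- STRICT: error out unless the query returns exactly one row.
    SELECT name INTO STRICT v_name FROM users WHERE id = p_id;
    RETURN v_name;
END;
$$ LANGUAGE plpgsql;
```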

In this mode late IO casting will be disabled. We can disallow implicit
casting too.

We can talk about which plpgsql warnings will be critical in this mode.

This mode can be enabled for function by option

#option defensive

or for all plpgsql functions by GUC

SET plpgsql.defensive = on

In this moment I don't see a necessity to change or enhance syntax.

I have no plan to make it the default, but anybody can do that simply by
changing one GUC in the Postgres configuration. Defensive mode (strict
defensive mode) is a good strategy, but it is not for everyone.

Regards

Pavel