Re: [HACKERS] Patch to support SEMI and ANTI join removal

2014-11-23 Thread David Rowley
On Wed, Nov 19, 2014 at 11:49 PM, David Rowley dgrowle...@gmail.com wrote:

 On Sun, Nov 16, 2014 at 12:19 PM, David Rowley dgrowle...@gmail.com
 wrote:

 On Sun, Nov 16, 2014 at 10:09 AM, Simon Riggs si...@2ndquadrant.com
 wrote:


 I propose that we keep track of whether there are any potentially
 skippable joins at the top of the plan. When we begin execution we do
 a single if test to see if there is run-time work to do. If we pass
 the run-time tests we then descend the tree and prune the plan to
 completely remove unnecessary nodes. We end with an EXPLAIN and
 EXPLAIN ANALYZE that looks like this

  QUERY PLAN
  --
   Aggregate (actual rows=1 loops=1)
   ->  Seq Scan on t1 (actual rows=100 loops=1)

 Doing that removes all the overheads and complexity; it also matches
 how join removal currently works.




I've attached an updated patch which works in this way. All of the skipping
code that I had added to the executor's join functions has now been removed.
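
To give an idea of the shape of that step: the prune is just a recursive walk
over the finished plan tree, something like the sketch below (not code from
the patch; join_is_skippable() stands in for the run-time checks, only hash
joins are shown for brevity, and field names follow nodes/plannodes.h):

static Plan *
prune_skippable_joins(Plan *plan)
{
    if (plan == NULL)
        return NULL;

    /* recurse first so that nested removable joins collapse bottom-up */
    plan->lefttree = prune_skippable_joins(plan->lefttree);
    plan->righttree = prune_skippable_joins(plan->righttree);

    /* splice out a join whose inner side is proven redundant */
    if (IsA(plan, HashJoin) && join_is_skippable((Join *) plan))
        return plan->lefttree;

    return plan;
}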

Here's an example output with the plan trimmed, and then untrimmed.

set constraints b_c_id_fkey deferred;
explain (costs off) select b.* from b inner join c on b.c_id = c.id;
  QUERY PLAN
---
 Seq Scan on b
(1 row)

-- add an item to the trigger queue by updating a referenced record.
update c set id = 2 where id=1;
explain (costs off) select b.* from b inner join c on b.c_id = c.id;
  QUERY PLAN
--
 Hash Join
   Hash Cond: (b.c_id = c.id)
   ->  Seq Scan on b
   ->  Hash
         ->  Seq Scan on c
(5 rows)
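
The run-time test at the heart of such a join_is_skippable() check is cheap;
conceptually it is little more than this (the wrapper is invented, but
AfterTriggerPendingOnRel() is the existing trigger-queue check in
commands/trigger.c):

static bool
fk_currently_trustworthy(Oid fkrelid)
{
    /*
     * A queued FK trigger event means the constraint may currently be
     * violated, so the proof that the join is redundant no longer holds
     * and the join must run as planned.
     */
    return !AfterTriggerPendingOnRel(fkrelid);
}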

A slight quirk with the patch as it stands is that I'm unconditionally NOT
removing Sort nodes that sit below a MergeJoin node. The reason for this is
that I've not quite figured out a way to determine whether the sort order
is still required.

An example of this can be seen in the regression tests:

-- check merge join nodes are removed properly
set enable_hashjoin = off;
-- this should remove joins to b and c.
explain (costs off)
select COUNT(*) from a inner join b on a.b_id = b.id left join c on a.id =
c.id;
QUERY PLAN
---
 Aggregate
   ->  Sort
         Sort Key: a.b_id
         ->  Seq Scan on a
(4 rows)

As the patch stands there are still a couple of FIXMEs in there, so there's
still a bit of work to do yet.

Comments are welcome

Regards

David Rowley


inner_join_removals_2014-11-24_7cde1e4.patch
Description: Binary data



[HACKERS] make check-world regress failed

2014-11-23 Thread Vladimir Koković

Hi,

PostgreSQL check-world regress failed with current git HEAD on my Kubuntu
14.10.


uname -a
Linux vlD-kuci 3.16.0-24-generic #32-Ubuntu SMP Tue Oct 28 13:13:18 UTC  
2014 i686 athlon i686 GNU/Linux


gdb -d /home/src/postgresql-devel/postgresql-git/postgresql/src -c core
...
Loaded symbols for  
/home/src/postgresql-devel/dev-build/src/test/regress/regress.so

(gdb) bt
#0  0xb76ecc7c in __kernel_vsyscall ()
#1  0xb7075577 in __GI_raise (sig=sig@entry=6) at  
../nptl/sysdeps/unix/sysv/linux/raise.c:56

#2  0xb7076cf3 in __GI_abort () at abort.c:89
#3  0x084c326a in ?? ()
#4  0x0a56c3b8 in ?? ()
#5  0xb76d232f in pg_atomic_init_u64 (ptr=0xbfa16fd4, val=0) at  
/home/src/postgresql-devel/postgresql-git/postgresql/src/include/port/atomics.h:445
#6  0xb76d50e4 in test_atomic_uint64 () at  
/home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress/regress.c:1022
#7  0xb76d5756 in test_atomic_ops (fcinfo=0xa57c76c) at  
/home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress/regress.c:1114

#8  0x0825bfee in ?? ()
...

My patch fixes the check-world regress failure.
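
For reference, my understanding of the failure (the actual change is in the
attached generic-gcc.diff): the i386 ABI only guarantees 4-byte alignment for
a 64-bit stack variable, so the 8-byte alignment assertion inside
pg_atomic_init_u64() can trip. Reduced to its essence:

#include "port/atomics.h"

static void
alignment_illustration(void)
{
    pg_atomic_uint64 var;   /* may be only 4-byte aligned on i386 ABIs */

    /* asserts 8-byte alignment of &var; aborts when the assert fails */
    pg_atomic_init_u64(&var, 0);
}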

Best regards
Vladimir Kokovic DP senior

generic-gcc.diff
Description: Binary data



Re: [HACKERS] proposal: plpgsql - Assert statement

2014-11-23 Thread Tom Lane
I wrote:
 The core of that complaint is that we'd have to make ASSERT a plpgsql
 reserved word, which is true enough as things stand today.  However,
 why is it that plpgsql statement-introducing keywords need to be
 reserved?  The only reason for that AFAICS is to allow us to distinguish
 the statements from assignments.  But it seems like that could possibly
 be gotten around with some work.

Attached is a draft patch that de-reserves 17 of plpgsql's existing
reserved words, leaving 24 still reserved (of which only 9 are not also
reserved in the core grammar).  This would make it painless to invent an
ASSERT statement, as well as any other new statement type that's not
associated with looping or block structure.  (The limiting factor on those
is that if a statement could have an opt_block_label, its keyword still
has to be reserved, unless we want to complicate matters a bunch more.)

regards, tom lane

diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 82378c7..4af28ea 100644
*** a/src/pl/plpgsql/src/pl_gram.y
--- b/src/pl/plpgsql/src/pl_gram.y
*** unreserved_keyword	:
*** 2315,2346 
--- 2315,2360 
  | K_ALIAS
  | K_ARRAY
  | K_BACKWARD
+ | K_CLOSE
+ | K_COLLATE
  | K_COLUMN
  | K_COLUMN_NAME
  | K_CONSTANT
  | K_CONSTRAINT
  | K_CONSTRAINT_NAME
+ | K_CONTINUE
  | K_CURRENT
  | K_CURSOR
  | K_DATATYPE
  | K_DEBUG
+ | K_DEFAULT
  | K_DETAIL
+ | K_DIAGNOSTICS
  | K_DUMP
+ | K_ELSIF
  | K_ERRCODE
  | K_ERROR
+ | K_EXCEPTION
+ | K_EXIT
+ | K_FETCH
  | K_FIRST
  | K_FORWARD
+ | K_GET
  | K_HINT
  | K_INFO
+ | K_INSERT
  | K_IS
  | K_LAST
  | K_LOG
  | K_MESSAGE
  | K_MESSAGE_TEXT
+ | K_MOVE
  | K_NEXT
  | K_NO
  | K_NOTICE
+ | K_OPEN
  | K_OPTION
+ | K_PERFORM
  | K_PG_CONTEXT
  | K_PG_DATATYPE_NAME
  | K_PG_EXCEPTION_CONTEXT
*** unreserved_keyword	:
*** 2349,2356 
--- 2363,2372 
  | K_PRINT_STRICT_PARAMS
  | K_PRIOR
  | K_QUERY
+ | K_RAISE
  | K_RELATIVE
  | K_RESULT_OID
+ | K_RETURN
  | K_RETURNED_SQLSTATE
  | K_REVERSE
  | K_ROW_COUNT
diff --git a/src/pl/plpgsql/src/pl_scanner.c b/src/pl/plpgsql/src/pl_scanner.c
index 6a5a04b..db01ba5 100644
*** a/src/pl/plpgsql/src/pl_scanner.c
--- b/src/pl/plpgsql/src/pl_scanner.c
*** IdentifierLookup plpgsql_IdentifierLooku
*** 31,51 
   *
   * We keep reserved and unreserved keywords in separate arrays.  The
   * reserved keywords are passed to the core scanner, so they will be
!  * recognized before (and instead of) any variable name.  Unreserved
!  * words are checked for separately, after determining that the identifier
   * isn't a known variable name.  If plpgsql_IdentifierLookup is DECLARE then
   * no variable names will be recognized, so the unreserved words always work.
   * (Note in particular that this helps us avoid reserving keywords that are
   * only needed in DECLARE sections.)
   *
   * In certain contexts it is desirable to prefer recognizing an unreserved
!  * keyword over recognizing a variable name.  Those cases are handled in
!  * pl_gram.y using tok_is_keyword().
   *
!  * For the most part, the reserved keywords are those that start a PL/pgSQL
!  * statement (and so would conflict with an assignment to a variable of the
!  * same name).  We also don't sweat it much about reserving keywords that
!  * are reserved in the core grammar.  Try to avoid reserving other words.
   */
  
  /*
--- 31,58 
   *
   * We keep reserved and unreserved keywords in separate arrays.  The
   * reserved keywords are passed to the core scanner, so they will be
!  * recognized before (and instead of) any variable name.  Unreserved words
!  * are checked for separately, usually after determining that the identifier
   * isn't a known variable name.  If plpgsql_IdentifierLookup is DECLARE then
   * no variable names will be recognized, so the unreserved words always work.
   * (Note in particular that this helps us avoid reserving keywords that are
   * only needed in DECLARE sections.)
   *
   * In certain contexts it is desirable to prefer recognizing an unreserved
!  * keyword over recognizing a variable name.  In particular, at the start
!  * of a statement we should prefer unreserved keywords unless the statement
!  * looks like an assignment (i.e., first token is followed by ':=' or '[').
!  * This rule allows most statement-introducing keywords to be kept unreserved.
!  * (We still have to reserve initial keywords that might follow a block
!  * label, unfortunately, since the method used to determine if we are at
!  * start of statement doesn't recognize such cases.  We'd also have to
!  * reserve any keyword that could legitimately be followed by ':=' or '['.)
!  * Some additional cases are handled in pl_gram.y using tok_is_keyword().
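
The start-of-statement test described in that comment amounts to a single
token of lookahead; schematically, with an invented helper name (COLON_EQUALS
is the core scanner's token for ':='):

static bool
looks_like_assignment(int tok_after_name)
{
    /* "foo := ..." or "foo[2] := ..." must be an assignment to foo */
    return (tok_after_name == COLON_EQUALS || tok_after_name == '[');
}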

Re: [HACKERS] Turning recovery.conf into GUCs

2014-11-23 Thread Andres Freund
On 2014-11-21 10:59:23 -0800, Josh Berkus wrote:
 On 11/21/2014 10:54 AM, Stephen Frost wrote:
  * Josh Berkus (j...@agliodbs.com) wrote:
  Either way, from the code it is clear that we only stay in recovery if
  standby_mode is directly turned on.  This makes the whole check for a
  specially named file unnecessary, IMO: we should just check the value of
  standby_mode (which is off by default).
 
  So, what happens when someone does pg_ctl promote?  Somehow
  standby_mode needs to get set to off.  Maybe we write standby_mode =
  off to postgresql.auto.conf?
  
  Uhh, and then not work for the sane folks who disable
  postgresql.auto.conf?  No thanks..
 
 Other ideas then, without reverting to the old system?  Being able to
 promote servers over port 5432 will be a huge advantage for
 container-based systems, so I don't want to give that up as a feature.

Why is a trigger file making that impossible? Adding the code from
pg_ctl promote into a SQL-callable function is a couple of minutes'
worth of work?
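
Something like the following, say (an untested sketch, and the function name
is invented; PROMOTE_SIGNAL_FILE is the backend's name for the trigger file
that pg_ctl promote creates):

#include "postgres.h"

#include <signal.h>

#include "access/xlog.h"
#include "fmgr.h"
#include "miscadmin.h"

PG_FUNCTION_INFO_V1(pg_promote_sketch);

Datum
pg_promote_sketch(PG_FUNCTION_ARGS)
{
    FILE       *promote_file;

    if (!RecoveryInProgress())
        ereport(ERROR,
                (errmsg("server is not in recovery")));

    /* create the same trigger file "pg_ctl promote" creates ... */
    promote_file = fopen(PROMOTE_SIGNAL_FILE, "w");
    if (promote_file == NULL)
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not create file \"%s\": %m",
                        PROMOTE_SIGNAL_FILE)));
    fclose(promote_file);

    /* ... and poke the postmaster, as pg_ctl does */
    kill(PostmasterPid, SIGUSR1);

    PG_RETURN_VOID();
}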

A GUC alone won't work very well - standby_mode is checked in specific
places; you can't just turn it off and expect that to result in speedy
promotion. And it'd break people using scripts like pg_standby. And it also
has other effects.

So, no.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] superuser() shortcuts

2014-11-23 Thread Andres Freund
On 2014-11-21 10:12:40 -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  I still think this change makes the error message more verbose, without
  any win in clarity.
 
 Can we agree that there should be consistency?

Consistency with what? Are you thinking of the messages in
aclchk.c:no_priv_msg? I don't think that's really comparable. A
permission denied on a relation is much easier to understand than
replication permissions and such.

It'd surely not be better if pg_basebackup gave an error message without
any actually helpful information. Btw, the replication permission use in
postinit.c isn't related to slots.

 I'm not really particular about which way we go with the specific
 wording (suggestions welcome..) but the inconsistency should be dealt
 with.

Meh.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] group locking: incomplete patch, just for discussion

2014-11-23 Thread Andres Freund
On 2014-11-21 08:31:01 -0500, Robert Haas wrote:
 On Thu, Nov 20, 2014 at 8:31 PM, Andres Freund and...@2ndquadrant.com wrote:
  I can't follow that logic.  Why do you think self-exclusive locks are
  the only ones that matter?
 
  All the others can be acquired by jumping in front of the lock's
  waitqueue?
 
 That's true, as a practical matter, in many cases.  But from the
 point of view of the lock manager, it's really not true.  It is almost
 true if you assume that the lock level acquired by the parallel
 backend will be weaker than the lock level held by the user backend,
 but not quite, because RowExclusiveLock conflicts with ShareLock.

The assumption that a parallel backend will acquire a lock level <= the
user backend's lock imo is a pretty sane one - we obviously can't predict
things if not. And I don't think any of the other presented approaches
can do that.

I'm not really bothered by RowExclusiveLock vs. ShareLock. Unless I'm
missing something, that can only be problematic if the primary acquires a
ShareLock and the worker tries a RowExclusiveLock. That just can't be sane.

 The assumption that the parallel backend's lock level will always be
 weaker seems shaky.  It's probably true for relation locks, but if we
 ever want to support parallel workers that can write any data
 whatsoever, they're going to need to be able to take relation
 extension locks.

I don't think extension locks are a good argument here. As you noted
upthread, they really need to be used consistently across processes. And
they shouldn't/wouldn't be granted by your suggested mechanism either,
right?

 If one backend in a parallel group attempts to
 acquire the relation extension lock while another worker holds it, the
 rule that all group members should be regarded as waiting for each
 other will result in instantaneous deadlock.  That's pretty
 undesirable, and suggests to me that the whole model is defective in
 some way.

Shouldn't the deadlock checker verify that all involved processes are
actually waiting for a lock? It seems a bit pointless to declare a
deadlock when one node is still making progress. Spurious deadlocks like
that seem like they'd be an absolute PITA for anything involving system
tables and such.
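
Schematically, something like this before concluding that a group is
deadlocked (purely illustrative - the members array is bookkeeping a
group-locking patch would have to add; waitStatus/STATUS_WAITING are
existing PGPROC state):

#include "storage/proc.h"

static bool
whole_group_is_waiting(PGPROC **members, int nmembers)
{
    int     i;

    for (i = 0; i < nmembers; i++)
    {
        /* STATUS_WAITING means blocked in the lock manager */
        if (members[i]->waitStatus != STATUS_WAITING)
            return false;   /* somebody can still make progress */
    }
    return true;
}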

   I don't buy the plan time/execution time argument. We'll need to be able
   to adapt plans to the availability of bgworker slots *anyway*. Otherwise
   we either need to set the number of bgworkers to MaxConnections *
   MaxParallelism or live with errors as soon as too many processes use
   parallelism. The former is clearly unrealistic given PG's per-backend
   overhead and the latter will be a *much* bigger PITA than all the
   deadlock scenarios talked about here. It'll constantly fail, even if
   everything is ok.
 
  I certainly think that any parallel node needs to be able to cope with
  getting fewer workers than it would ideally like to have.  But that's
  different from saying that when our own backend takes new locks, we
  have to invalidate plans.  Those seem like largely unrelated problems.
 
  No need to invalidate them if you have the infrastructure to run with no
  parallel workers - just use the path that's used if there are no workers
  available.
 
 Well, it's OK for a query to run with less parallelism than the
 planner thought optimal.  It's not OK for it to attempt to run with
 the requested degree of parallelism and deadlock.  I think if the
 deadlock detector gets involved it's quite unlikely that you're going
 to be able to make things degrade smoothly to non-parallel operation.

That was in response to your argument about plan invalidation - which I
don't buy because it'd only reuse the graceful degradation capabilities
we need anyway.

  The heavyweight locking issue is really killing me, though.
 
  I don't really understand why you're not content with just detecting
  deadlocks for now. Everything else seems like bells and whistles for
  later.
 
 I don't think it's OK for a parallel query to just randomly deadlock.

I think that should be the end goal, yes.

 It's OK for parallel query to excuse itself politely and retire from
 the field, but I don't think it's OK for it to say, oh, sure, I can
 parallelize that for you, and then fail by deadlocking with itself.

But I also think that right now it doesn't necessarily need to be the
focus immediately. This is a topic that I think will be much easier to
approach once some demonstration of actual parallel users is
available. It doesn't have to be committable or such, but I think it'll
help to shape how this has to look.  I think unrecognized deadlocks will
make things too annoying to use, but I don't think it needs to be
guaranteed deadlock-free or such.

Greetings,

Andres Freund




Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-11-23 Thread Dilip kumar
On 23 November 2014 14:45, Amit Kapila wrote:

Thanks a lot for debugging and fixing the issue..

The stack trace of the crash is as below:

#0  0x0080108cf3a4 in .strlen () from /lib64/libc.so.6
#1  0x0080108925bc in ._IO_vfprintf () from /lib64/libc.so.6
#2  0x0080108bc1e0 in .__GL__IO_vsnprintf_vsnprintf () from 
/lib64/libc.so.6
#3  0x0fff7e581590 in .appendPQExpBufferVA () from
/data/akapila/workspace/master/installation/lib/libpq.so.5
#4  0x0fff7e581774 in .appendPQExpBuffer () from
/data/akapila/workspace/master/installation/lib/libpq.so.5
#5  0x10003748 in .run_parallel_vacuum ()
#6  0x10003f60 in .vacuum_parallel ()
#7  0x10002ae4 in .main ()
(gdb) f 5
#5  0x10003748 in .run_parallel_vacuum ()

So now the real reason here is that the list of tables passed to the
function is corrupted.  The below code seems to be the real
culprit:

vacuum_parallel()
{
..
if (!tables || !tables->head)
{
SimpleStringList dbtables = {NULL, NULL};
...
..
tables = &dbtables;
}
..
}

In the above code dbtables is local to the if block, and the code
uses its address to assign to tables, which is then used outside the
if block's scope.  Moving the declaration to the outer scope fixes
the problem in my environment.  Find the updated patch that fixes
this problem attached with this mail.  Let me know your opinion about
the same.

Yes, that's the reason for the corruption; it must be causing both issues:
sending a corrupted query to the server as well as the crash on the client
side.
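
For the record, the shape of the fix is simply this (a sketch; the real
function takes more arguments):

static void
vacuum_parallel(SimpleStringList *tables /* , ... */)
{
    /* function scope: stays alive as long as "tables" may point at it */
    SimpleStringList dbtables = {NULL, NULL};

    if (!tables || !tables->head)
    {
        /* ... fill dbtables with every table in the database ... */
        tables = &dbtables;
    }

    /* "tables" now remains valid for the rest of the function */
}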


While looking at this problem, I have noticed a couple of other
improvements:

a. In the prepare_command() function, the patch initializes the sql
buffer (initPQExpBuffer(sql);), which I think is not required, as both
callers of this function already do it.  I think this will lead to a
memory leak.

Fixed..

b. In the prepare_command() function, for fixed strings you can
use appendPQExpBufferStr(), which is what the original code used
as well.

Changed as per comment..

c.
run_parallel_vacuum()
{
..
prepare_command(connSlot[free_slot].connection, full, verbose,
and_analyze, analyze_only, freeze, sql);

appendPQExpBuffer(sql, " %s", cell->val);
..
}

I think it is better to end the command with ';' by using
appendPQExpBufferStr(sql, ";"); in the above code.

Done
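
So with (b) and (c) together the tail of the command assembly now looks
roughly like this (fragment for illustration; buffer setup and the
surrounding loop omitted):

prepare_command(connSlot[free_slot].connection, full, verbose,
                and_analyze, analyze_only, freeze, sql);

appendPQExpBuffer(sql, " %s", cell->val);   /* formatted text */
appendPQExpBufferStr(sql, ";");             /* fixed text: no format parsing */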

Latest patch is attached, please have a look.

Regards,
Dilip Kumar






vacuumdb_parallel_v18.patch
Description: vacuumdb_parallel_v18.patch



Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-11-23 Thread Amit Kapila
On Mon, Nov 24, 2014 at 7:34 AM, Dilip kumar dilip.ku...@huawei.com wrote:

 On 23 November 2014 14:45, Amit Kapila Wrote



 Thanks a lot for debugging and fixing the issue..



 Latest patch is attached, please have a look.


I think some of the comments given upthread are still not handled:

a.  About cancel handling
b.  Setting of the inAbort flag for the case where PQCancel is successful
c.  Performance data for the --analyze-in-stages switch
d.  Make one pass over the comments in the patch.  I can still see some
wrongly formatted multiline comments.  Refer below:
+  /* Otherwise, we got a stage from vacuum_all_databases(), so run
+   * only that one. */
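
For reference, the project style puts the opening /* on its own line:

/*
 * Otherwise, we got a stage from vacuum_all_databases(), so run
 * only that one.
 */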


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] postgres_fdw behaves oddly

2014-11-23 Thread Ashutosh Bapat
On Sun, Nov 23, 2014 at 2:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
  (2014/11/19 18:21), Ashutosh Bapat wrote:
  Ok. I added that comment to the commitfest and changed the status to
  ready for committer.

  Many thanks!

 I committed this with some cosmetic adjustments, and one not-so-cosmetic
 one: I left it continuing to allow CTID conditions to be sent to the
 remote server.  That works today and we shouldn't disable it, especially
 not in the back branches where it could mean a performance regression
 in a minor release.


Thanks.


 As for the question of whether reltargetlist or tlist should be examined,
 reltargetlist is the correct thing.  I do not see a need to modify
 use_physical_tlist either.  If you trace through what happens in e.g.
 postgres_fdw, you'll notice that it only bothers to retrieve actually-used
 columns (which it computes correctly) from the remote side.  It then
 constructs a scan tuple that corresponds to the foreign table's physical
 column set, inserting NULLs for any unfetched columns.  This is handed
 back to execScan.c and matches what a heapscan or indexscan would produce.
 The point of the use_physical_tlist optimization is to avoid a projection
 step *on this tuple* if we don't really need to do one (ie, if it'd be
 just as good for the scan node's output tuple to be identical to the row's
 physical tuple).  If we disabled use_physical_tlist then we'd be forcing
 a projection step that would have the effect of removing the NULLs for
 unfetched columns, but it would not actually be of value, just as it isn't
 in the corresponding cases for heap/index scans.

For the current patch, that's fair.

In grouping_planner() the targetlist of the topmost plan node of the join
tree is changed to the set of expressions requested in the query:

    else
    {
        /*
         * Otherwise, just replace the subplan's flat tlist with
         * the desired tlist.
         */
        result_plan->targetlist = sub_tlist;
    }

This means that for a simple SELECT from the foreign table, projection
would be necessary in ExecScan if all the columns are not required.

So, the strategy of emulating a heap or index scan wastes extra cycles and
memory by adding NULL columns which are not required and then filtering
them out during projection (within ExecScan itself). It also limits the
ability of foreign scans to fetch shippable expressions when those
expressions are part of the targetlist and fetching their values (instead
of the columns participating in the expressions) reduces the size of the
fetched row.

These cases do not exist for heap scans or index scans because the tuples
are obtained directly from the buffer, so no extra memory is required for
the extra columns, and projection as well as expression evaluation is
required anyway.


 BTW, given that there wasn't any very good way to test the createplan.c
 fix except by adding test cases to postgres_fdw, I didn't see much value
 in splitting the patch in two.  The committed patch includes tests that
 the createplan.c bug is fixed as well as the postgres_fdw one.

 regards, tom lane




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] [TODO] Track number of files ready to be archived in pg_stat_archiver

2014-11-23 Thread Michael Paquier
Hearing nothing from the original author, this patch, which had been in
state "Waiting on Author" for a couple of days, has been switched to
"Returned with feedback".
Regards,
-- 
Michael