Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility

2012-10-17 Thread Simon Riggs
On 16 October 2012 23:03, Josh Berkus j...@agliodbs.com wrote:

 Can you explain in more detail how this would be used on the receiving
 side?  I'm unable to picture it from your description.

This will allow implementation of pgq in core, as discussed many times
at cluster hackers meetings.

 I'm also a bit reluctant to call this a message queue, since it lacks
 the features required for it to be used as an application-level queue.

It's the input end of an application-level queue. In this design the
queue is like a table, so we need SQL grammar to support this new type
of object. "Replication message" doesn't describe this, since it has
little if anything to do with replication, and if anything it's a
message type, not a message.

You're right that Hannu needs to specify the rest of the design and
outline the API. The storage of the queue is in WAL, which raises
questions about how the API will guarantee we read just once from the
queue, and what happens when the queue overflows. The simple answer
would be that we put everything in a table somewhere else, but that
needs more careful specification to show we have both ends of the
queue and a working design.

Do we need a new object at all? Can we not just define a record type,
then define messages using that type? At the moment I think the
named-object approach works better, but we should consider that.
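
For concreteness, here is roughly how the two approaches might look.
Neither syntax exists today; the CREATE QUEUE grammar and the
pg_send_message() function are purely hypothetical illustrations:

    -- (a) named-object approach: the queue is a first-class object
    CREATE QUEUE user_events (ev_type text, ev_data text);   -- hypothetical
    INSERT INTO user_events VALUES ('signup', '{"id": 42}');

    -- (b) record-type approach: an ordinary composite type plus a function
    CREATE TYPE user_event AS (ev_type text, ev_data text);
    SELECT pg_send_message(ROW('signup', '{"id": 42}')::user_event);  -- hypothetical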

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] How to avoid base backup in automated failover

2012-10-17 Thread Amit Kapila
On Wednesday, October 17, 2012 11:22 AM chinnaobi wrote:
 Hey Haas,
 
 Does the standby server have to wait for replication to catch up
 before promoting? Is there any parameter to configure this?
 
 Few more questions on this part
 
 1. How could we ensure the standby has received all transactions sent
 by the primary up to the point the primary server died? (Meaning the
 dead primary and the standby are exactly the same, so that when the
 dead primary comes back it can be turned into a standby without any
 issues.)

 2. When the dead primary is turned into a standby, streaming does not
 happen, because the current WAL location on that standby is ahead of
 wal_sent_location. In this case how can I start streaming without
 taking a fresh base backup?

 3. When the dead primary comes back, the DB still accepts data, so it
 goes out of sync with the current primary and streaming won't start.
 Is there any solution for this case?

I think points 2 & 3 can be addressed by the new feature being
implemented by Heikki, provided the standby has received all WAL of the
primary up to the point it went dead.
 
https://commitfest.postgresql.org/action/patch_view?id=933
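
For point 1, a minimal sketch of how the positions can be compared,
using functions that already exist in 9.1 (this assumes both servers
can still be queried shortly before the failover decision):

    -- on the primary: current WAL write position
    SELECT pg_current_xlog_location();

    -- on the standby: last WAL received and last WAL replayed
    SELECT pg_last_xlog_receive_location(),
           pg_last_xlog_replay_location();

If the standby's receive location matches the primary's last known
write position, the standby has everything the primary sent; if the
primary dies between samples, there is no way to be certain without
synchronous replication.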





Re: [HACKERS] Global Sequences

2012-10-17 Thread Markus Wanner
Simon,

On 10/16/2012 02:36 PM, Simon Riggs wrote:
 Where else would you put the hook? The hook's location as described
 won't change whether you decide you want 1, 2 or 3.

You assume we want an API that supports all three options. In that case,
yes, the hooks need to be very general.

Given that option 3 got by far the most support, I question whether we
need such a highly general API. I envision an API that keeps the
bookkeeping and cache lookup functionality within Postgres. So we have a
single, combined-effort, known working implementation for that.

What remains to be done within the plugin is effectively the consensus
problem: it all boils down to the question of which node gets the next
chunk of N sequence numbers. Where N can be 1 (the default CACHE setting
in Postgres) or any higher number for better performance (reducing the
total communication overhead by a factor of N - or at least pretty close
to that, if you take into account lost chunks due to node failures).

A plugin providing that has to offer a method to request a global
ordering, and would have to trigger a callback upon reaching consensus
with the other nodes on who gets the next chunk of sequence numbers.
That works for all N >= 1, and properly implements option 3 (but doesn't
allow implementations of options 1 or 2, which I claim we don't need,
anyway).
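
A minimal sketch of that division of labor in plpgsql, assuming a
hypothetical plugin-provided consensus function acquire_chunk(seqname)
that returns the (lo, hi) bounds of a freshly assigned chunk; the
bookkeeping table and all names are illustrative only, and each
sequence is assumed to have a pre-created row in the cache table:

    CREATE TABLE seq_cache (seqname  text PRIMARY KEY,
                            next_val bigint,   -- next value to hand out
                            last_val bigint);  -- end of current chunk

    CREATE FUNCTION global_nextval(p_seq text) RETURNS bigint AS $$
    DECLARE
        v  bigint;
        lo bigint;
        hi bigint;
    BEGIN
        -- fast path: hand out a value from the locally cached chunk
        UPDATE seq_cache SET next_val = next_val + 1
         WHERE seqname = p_seq AND next_val <= last_val
         RETURNING next_val - 1 INTO v;

        IF v IS NULL THEN
            -- chunk exhausted: block until the plugin's consensus step
            -- assigns this node a fresh chunk of N numbers
            SELECT * INTO lo, hi FROM acquire_chunk(p_seq);
            UPDATE seq_cache SET next_val = lo + 1, last_val = hi
             WHERE seqname = p_seq;
            v := lo;
        END IF;
        RETURN v;
    END $$ LANGUAGE plpgsql;

Only the acquire_chunk() step needs inter-node communication; the rest
is ordinary local bookkeeping, which is the part that could live in core.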

 Implementations will be similar, differing mostly in the topology and
 transport layer

I understand that different users have different needs WRT transport
layers - moving the hooks as outlined above still allows flexibility in
that regard.

What different topologies do you have in mind? I'd broadly categorize
this all as multi-master. Do you need finer grained differentiation? Or
do you somehow include slaves (i.e. read-only transactions) in this process?

As you yourself say, implementations will differ only in that way, so
let's keep the common code the same, and not require plugins to
duplicate it. (This also allows us to use the system catalogs for
bookkeeping, as another benefit.)

 which means it's not going to be possible to provide
 such a thing initially without slowing it down to the point we don't
 actually get it at all.

Sorry, I don't quite understand what you are trying to say, here.

Overall, thanks for bringing this up. I'm glad to see something
happening in this area, after all.

Regards

Markus Wanner




Re: [HACKERS] Global Sequences

2012-10-17 Thread Markus Wanner
Tom,

On 10/16/2012 06:15 PM, Tom Lane wrote:
 I challenge you to find anything in the SQL standard that suggests that
 sequences have any nonlocal behavior.  If anything, what you propose
 violates the standard, it doesn't make us follow it more closely.

If you look at a distributed database as a transparent equivalent of a
single-node system, I'd say the SQL standard applies to the entire
distributed system. From that point of view, I'd rather argue that any
local-only behavior violates the standard.

Regards

Markus Wanner




Re: [HACKERS] Global Sequences

2012-10-17 Thread Simon Riggs
On 17 October 2012 09:10, Markus Wanner mar...@bluegap.ch wrote:
 Simon,

 On 10/16/2012 02:36 PM, Simon Riggs wrote:
 Where else would you put the hook? The hook's location as described
 won't change whether you decide you want 1, 2 or 3.

 You assume we want an API that supports all three options. In that case,
 yes, the hooks need to be very general.

I'm not assuming that, so much of what you say is moot, though it is
good and welcome input.

 Given that option 3 got by far the most support, I question whether we
 need such a highly general API. I envision an API that keeps the
 bookkeeping and cache lookup functionality within Postgres. So we have a
 single, combined-effort, known working implementation for that.

IMHO an API is required for "give me the next allocation of numbers",
essentially a bulk equivalent of nextval().

Anything lower level is going to depend upon implementation details
that I don't think we should expose.

I'm sure there will be much commonality between 2 similar
implementations, just as there is similar code in each index type. But
maintaining modularity is important and ahead of us actually seeing 2
implementations, trying to prejudge that is going to slow us all down
and likely screw us up.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Global Sequences

2012-10-17 Thread Markus Wanner
Simon,

On 10/17/2012 10:34 AM, Simon Riggs wrote:
 IMHO an API is required for "give me the next allocation of numbers",
 essentially a bulk equivalent of nextval().

Agreed. That matches almost exactly what I described (and what's
implemented in Postgres-R). The API then only needs to be called every N
invocations of nextval(), because otherwise nextval() can simply return
a cached number previously allocated in a single step, eliminating a lot
of the communication overhead.

You realize an API at that level doesn't allow for an implementation of
options 1 and 2? (Which I'm convinced we don't need, so that's fine with
me).

 Anything lower level is going to depend upon implementation details
 that I don't think we should expose.

Exactly. Just like we shouldn't expose other implementation details,
like writing to system catalogs or WAL.

 I'm sure there will be much commonality between 2 similar
 implementations, just as there is similar code in each index type. But
 maintaining modularity is important and ahead of us actually seeing 2
 implementations, trying to prejudge that is going to slow us all down
 and likely screw us up.

Agreed. Let me add that modularity only serves a purpose if the
boundaries between the modules are chosen wisely. It sounds like we are
on the same page, though.

To back this up: IMHO an API for setval() is required to invalidate all
nodes' caches and set a new initial value, as a starting point for the
next bulk of numbers that nextval() will return.

currval() doesn't need to be changed or hooked at all, because it's a
read-only operation.

Regards

Markus Wanner




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Dimitri Fontaine
Peter Geoghegan pe...@2ndquadrant.com writes:
 Clearly deprecating rules implies some loss of functionality - there
 is no exact, drop-in equivalent to something that magically rewrites
 SQL that isn't equally baroque and problematic. If that's the bar,
 then detractors of rules should stop wasting their breath, because the
 bar has been set impossibly high.

I believe an advice system is a good contender here, as already
proposed in:

  http://archives.postgresql.org/pgsql-hackers/2012-10/msg00610.php

See "defadvice" in Emacs Lisp and "The Standard Method Combination" of
the Common Lisp Object System as sources of inspiration here.

  
http://www.gnu.org/software/emacs/manual/html_node/elisp/Advising-Functions.html
  http://www.gigamonkeys.com/book/object-reorientation-generic-functions.html

It would basically be rules without the multiple-evaluation risks, yet
still with the multiple-evaluation feature when you need it, and with
explicit control over it.

Then if you insist on comparing this to a macro facility, as we're
talking about dynamic code rewriting, maybe we need to compare RULEs to
the Lisp-style macro facility, which is nothing like a pre-processor
facility (in Lisp, that's the reader, I think).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] How to avoid base backup in automated failover

2012-10-17 Thread chinnaobi
Hey Amitkapila,

Thank you for the quick reply.

How can I apply this patch on Windows? I am using the Windows
PostgreSQL 9.1.1 application.







Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Simon Riggs
On 12 October 2012 10:08, Daniel Farina dan...@heroku.com wrote:
 On Thu, Oct 11, 2012 at 11:55 PM, Simon Riggs si...@2ndquadrant.com wrote:
 As regards cost/benefit analysis, this is a low importance feature,
 but then that is why I proposed a low effort fix that is flexible to
 the needs of users affected.

 Is there any feature that is more loathed and more narrowly used than
 rules?

I doubt it.

Would you or someone else be able to come up with some words of
caution for us to put in the manual that would be helpful to
developers?

There isn't even a list of caveats for rules.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Global Sequences

2012-10-17 Thread Dimitri Fontaine
Simon Riggs si...@2ndquadrant.com writes:
 On 16 October 2012 15:15, Tom Lane t...@sss.pgh.pa.us wrote:
 What you really want is something vaguely like nextval but applied to
 a distinct type of object.  That is, I think we first need a different
 kind of object called a global sequence with its own DDL operations.

 hence a different solution. CREATE SEQUENCE is SQL-standard and used
 by SERIAL, many people's SQL, SQL generation tools, etc. My objective
 is to come up with something that makes the standard code work
 correctly in a replicated environment.

I think we can still have both. I like Tom's suggestion better, as it
provides for a cleaner implementation in the long run.

Now, the way I see to get a GLOBAL SEQUENCE by default when creating a
SERIAL column would be with an Event Trigger. To get there, we need to
implement some kind of INSTEAD OF Event Trigger, and the good news is
that we only need to do that for one very specialized command, not as a
generic facility. At least as a starter.

    CREATE EVENT TRIGGER global_sequences_please
        ON ddl_create_sequence
        WHEN context IN ('generated')
        EXECUTE PROCEDURE create_global_sequence_instead_thanks();

That would take care of any SERIAL or BIGSERIAL column creation and
leave manual CREATE SEQUENCE commands alone, as those would have a
context of 'toplevel' as opposed to 'generated'.

This context exposing and filtering is included in my current patch for
Event Triggers, which I'm submitting quite soon to the next commitfest.

We still need to implement the ddl_create_sequence event, which fires
just before CREATE SEQUENCE, and to refrain from creating the sequence
if it had an event trigger attached, whatever that trigger did.
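
Purely as an illustration of what the replacement procedure might look
like, under this patch's assumptions: the event_trigger return type,
the way the sequence name reaches the function (tg_objectname below is
a placeholder, by analogy with TG_TAG), and the CREATE GLOBAL SEQUENCE
command are all hypothetical at this point:

    CREATE FUNCTION create_global_sequence_instead_thanks()
    RETURNS event_trigger AS $$
    BEGIN
        -- issue the replacement DDL instead of the plain CREATE SEQUENCE
        EXECUTE 'CREATE GLOBAL SEQUENCE ' || quote_ident(tg_objectname);
    END $$ LANGUAGE plpgsql;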

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility

2012-10-17 Thread Hannu Krosing

On 10/17/2012 12:03 AM, Josh Berkus wrote:

Hannu,

Can you explain in more detail how this would be used on the receiving
side?  I'm unable to picture it from your description.

It would be used similarly to how the event tables in pgQ (from
skytools) are used - as a source of events to be replayed on the
subscriber side.

(For discussion's sake let's just call this a LOGGED ONLY TABLE, as
opposed to the UNLOGGED TABLE we already have.)

The simplest usage would be implementing remote log tables, that is,
tables where you do an INSERT on the master side, but it inserts only
a logical WAL record and nothing else.

On the subscriber side your replay process reads this WAL record as an
insert event and, if the table is declared as an ordinary table on the
subscriber, it performs an insert there.

This would make it trivial to implement a persistent remote log table
with the minimal required amount of writing on the master side.
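
A hypothetical sketch of that usage (no such syntax exists yet; CREATE
QUEUE stands in for whatever the grammar ends up being):

    -- master: a log-only table; INSERT writes only a logical WAL record
    CREATE QUEUE user_events (ev_type text, ev_data text);  -- hypothetical
    INSERT INTO user_events VALUES ('signup', '{"id": 42}');

    -- subscriber: the same name declared as an ordinary table; the
    -- replay process turns each logged insert into a real insert here
    CREATE TABLE user_events (ev_type text, ev_data text);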

We could even implement a log table which also captures entries from
aborted transactions, by treating ROLLBACK as COMMIT for this table.

But the subscriber side could also do other things instead of (or in
addition to) filling a log table. For example, it could create a
partitioned table instead of the plain table defined on the provider
side.

There is support, and several example replay agents, in the skytools
package which do this based on pgQ.

Or you could do computations / materialised views based on events from
the table.


Or you could use the insert events/WAL records as a base for some
other remote processing, like sending out e-mails.

There is also support for these kinds of things in skytools.


I'm also a bit reluctant to call this a message queue, since it lacks
the features required for it to be used as an application-level queue.
REPLICATION MESSAGE, maybe?


Initially I'd just stick with LOG ONLY TABLE or QUEUE, based on what
it does, not on how it could be used.

LOGGED ONLY TABLE is a very technical description of the
implementation - I'd prefer it to work as much like a table as
possible, similar to how VIEW currently works: for all usages that
make sense, you can simply substitute it for a TABLE.

QUEUE emphasizes the aspect of a logged-only table that it accepts
records in a certain order, persists these, and then guarantees that
they can be read out in exactly the same order - all this being
guaranteed by existing WAL mechanisms.

It is not meant to be a full implementation of an application-level
queuing system, though, but just the capture, persistence and
distribution parts.

Using this as an application-level queue needs a set of interface
functions to extract the events and also to keep track of the processed
events. As there is no general consensus on what these should be (for
example, whether processing the same event twice is allowed), this part
is left to specific queue consumer implementations.
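
For reference, this is roughly what such consumer-side interface
functions look like in pgQ today (function names are from the skytools
package; the batch id is carried in a psql variable here for brevity):

    -- producer side
    SELECT pgq.create_queue('user_events');
    SELECT pgq.insert_event('user_events', 'signup', '{"id": 42}');

    -- consumer side: fetch a batch, process it, mark it done
    SELECT pgq.register_consumer('user_events', 'mailer');
    SELECT pgq.next_batch('user_events', 'mailer');  -- batch id, or NULL
    SELECT ev_id, ev_type, ev_data
      FROM pgq.get_batch_events(:batch_id);
    SELECT pgq.finish_batch(:batch_id);              -- batch is done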


Hannu Krosing






Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-10-17 Thread Shigeru HANADA
Hi Tomas,

Sorry for the late reply.

On Sat, Aug 25, 2012 at 7:36 AM, Tomas Vondra t...@fuzzy.cz wrote:
 attached is a patch that improves performance when dropping multiple
 tables within a transaction. Instead of scanning the shared buffers for
 each table separately, the patch removes this and evicts all the tables
 in a single pass through shared buffers.

Here are my review comments.

Submission
==========
The patch is in unified diff format, and can be applied to the head of
master.  It doesn't include any test nor document, but it seems ok
because the patch doesn't change visible behavior.

Usability
=========
The patch intends to improve performance of bulk DROP TABLEs within a
transaction.  Such an improvement would be useful in cases such as 1)
dropping a set of partitioned tables as periodic maintenance work, and
2) dropping a lot of work tables.  The patch doesn't change any SQL
syntax or built-in objects, but just changes internal behavior.

Feature test
============
The patch doesn't provide any visible functionality, and existing
regression tests passed.

Performance test
================
I tested the 1000-table case (each table a copy of pgbench_branches
with 10 rows) on a server with 1GB shared_buffers.  Please note that I
tested on a MacBook Air, i.e. the storage is not HDD but SSD.  Here is
the test procedure:

1) loop 1000 times
 1-1) create copy table of pgbench_accounts as accounts$i
 1-2) load 10 rows
 1-3) add primary key
 1-4) select all rows to cache pages in shared buffer
2) BEGIN
3) loop 1000 times
 3-1) DROP TABLE accounts$i
4) COMMIT
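
A rough SQL transcription of that procedure (a sketch; it assumes a
pgbench-initialized database, and the COMMIT column below measures
step 4):

    DO $$
    BEGIN
        FOR i IN 1..1000 LOOP
            EXECUTE format('CREATE TABLE accounts%s (LIKE pgbench_accounts)', i);
            EXECUTE format('INSERT INTO accounts%s SELECT * FROM pgbench_accounts LIMIT 10', i);
            EXECUTE format('ALTER TABLE accounts%s ADD PRIMARY KEY (aid)', i);
            EXECUTE format('SELECT * FROM accounts%s', i);  -- warm shared buffers
        END LOOP;
    END $$;

    BEGIN;
    -- DROP TABLE accounts1; ... DROP TABLE accounts1000;
    COMMIT;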

The numbers below are average of 5 trials.

---------+----------+-------------
  Build  |  DROP *  |   COMMIT
---------+----------+-------------
 Master  | 0.239 ms | 1220.421 ms
 Patched | 0.240 ms |  432.209 ms
---------+----------+-------------
* time elapsed for one DROP TABLE statement

IIUC the patch's target is improving COMMIT performance by avoiding the
repeated buffer-search loop, so these results show that the patch
achieves its goal.

Coding review
=============
I have some comments about coding.

* Some cosmetic changes are necessary.
* Variable j in DropRelFileNodeAllBuffersList seems redundant.
* The RelFileNodeBackendIsTemp() macro is provided for checking whether
a relation is local, so using it would be better.

Please see attached patch for changes above.

* As Robert commented, this patch adds DropRelFileNodeAllBuffersList by
copying code from DropRelFileNodeAllBuffers.  Please refactor it to
avoid code duplication.
* In smgrDoPendingDeletes, you free srels explicitly.  Can't we leave
them to the memory context machinery?  Even if it is required, at least
pfree must be called in the nrels == 0 case too.
* In smgrDoPendingDeletes, the srels buffer is expanded on every
iteration.  This seems a bit inefficient.  How about doubling the
capacity when it is used up?  This requires an additional variable, but
reduces repalloc calls considerably.
* Just an idea, but if we can remove entries for local relations from
the rnodes array before the buffer loop in
DropRelFileNodeAllBuffersList, the following bsearch might be more
efficient, though dropping many temporary tables might be rare.

 Our system creates a lot of working tables (even 100.000) and we need
 to perform garbage collection (dropping obsolete tables) regularly. This
 often took ~ 1 hour, because we're using big AWS instances with lots of
 RAM (which tends to be slower than RAM on bare hw). After applying this
 patch and dropping tables in groups of 100, the gc runs in less than 4
 minutes (i.e. a 15x speed-up).

Hm, my environment seems very different from yours.  Could you show the
setting of shared_buffers in your environment?  I'd like to make my test
environment as similar as possible to yours.

 This is not likely to improve usual performance, but for systems like
 ours, this patch is a significant improvement.

I'll test the performance of bare DROP TABLEs (not surrounded by BEGIN
and COMMIT) tomorrow to confirm that the patch doesn't introduce
performance degradation.

Regards,
-- 
Shigeru HANADA
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 993bc49..86bca04 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -335,6 +335,10 @@ smgrDoPendingDeletes(bool isCommit)
PendingRelDelete *pending;
PendingRelDelete *prev;
PendingRelDelete *next;
+
+   SMgrRelation   *srels = palloc(sizeof(SMgrRelation));
+   int nrels = 0,
+   i = 0;
 
prev = NULL;
for (pending = pendingDeletes; pending != NULL; pending = next)
@@ -358,14 +362,26 @@ smgrDoPendingDeletes(bool isCommit)
SMgrRelation srel;
 
srel = smgropen(pending->relnode, 
pending->backend);
-   smgrdounlink(srel, false);
-   smgrclose(srel);
+
+   srels = repalloc(srels, 

Re: [HACKERS] How to avoid base backup in automated failover

2012-10-17 Thread Amit Kapila
On Wednesday, October 17, 2012 3:09 PM chinnaobi wrote:
 Hey Amitkapila,
 
 Thank you for the quick reply.
 
 How can I apply this patch on Windows? I am using the Windows
 PostgreSQL 9.1.1 application.
 

If the patch provides the feature you require, then once it gets
committed (there are a few bugs yet to be resolved), the feature will be
available on Windows as well.
As for the version, I think it will be available only in 9.3.
If you are in urgent need of it, you could merge it into your own copy
of 9.1.1; however, that has its own implications.

With Regards,
Amit Kapila.





Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-10-17 Thread Tomas Vondra

Hi,

thanks for the review. I'll look into that in ~2 weeks, once the
pgconf.eu conference is over. A few comments in the text below.

On 17.10.2012 12:34, Shigeru HANADA wrote:

Performance test
================

I tested the 1000-table case (each table a copy of pgbench_branches
with 10 rows) on a server with 1GB shared_buffers.  Please note that I
tested on a MacBook Air, i.e. the storage is not HDD but SSD.  Here is
the test procedure:

1) loop 1000 times
 1-1) create copy table of pgbench_accounts as accounts$i
 1-2) load 10 rows
 1-3) add primary key
 1-4) select all rows to cache pages in shared buffer
2) BEGIN
3) loop 1000 times
 3-1) DROP TABLE accounts$i
4) COMMIT


I don't think the 'load rows' and 'select all rows' steps are really
necessary. And AFAIK sequential scans use a small circular buffer so as
not to pollute shared buffers, so I'd guess the pages are not cached in
shared buffers anyway.

Have you verified that, e.g. by pg_buffercache?


The numbers below are average of 5 trials.

---------+----------+-------------
  Build  |  DROP *  |   COMMIT
---------+----------+-------------
 Master  | 0.239 ms | 1220.421 ms
 Patched | 0.240 ms |  432.209 ms
---------+----------+-------------
* time elapsed for one DROP TABLE statement

IIUC the patch's target is improving COMMIT performance by avoiding the
repeated buffer-search loop, so these results show that the patch
achieves its goal.

Coding review
=============
I have some comments about coding.

* Some cosmetic changes are necessary.
* Variable j in DropRelFileNodeAllBuffersList seems redundant.
* The RelFileNodeBackendIsTemp() macro is provided for checking whether
a relation is local, so using it would be better.

Please see attached patch for changes above.

* As Robert commented, this patch adds DropRelFileNodeAllBuffersList by
copying code from DropRelFileNodeAllBuffers.  Please refactor it to
avoid code duplication.
* In smgrDoPendingDeletes, you free srels explicitly.  Can't we leave
them to the memory context machinery?  Even if it is required, at least
pfree must be called in the nrels == 0 case too.
* In smgrDoPendingDeletes, the srels buffer is expanded on every
iteration.  This seems a bit inefficient.  How about doubling the
capacity when it is used up?  This requires an additional variable, but
reduces repalloc calls considerably.
* Just an idea, but if we can remove entries for local relations from
the rnodes array before the buffer loop in
DropRelFileNodeAllBuffersList, the following bsearch might be more
efficient, though dropping many temporary tables might be rare.


Yes, I plan to do all of this.

Our system creates a lot of working tables (even 100.000) and we need
to perform garbage collection (dropping obsolete tables) regularly. This
often took ~ 1 hour, because we're using big AWS instances with lots of
RAM (which tends to be slower than RAM on bare hw). After applying this
patch and dropping tables in groups of 100, the gc runs in less than 4
minutes (i.e. a 15x speed-up).


Hm, my environment seems very different from yours.  Could you show the
setting of shared_buffers in your environment?  I'd like to make my test
environment as similar as possible to yours.


We're using m2.4xlarge instances (70G of RAM) with 10GB shared buffers.

Tomas




Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-10-17 Thread Amit Kapila
 On Monday, October 15, 2012 3:43 PM Heikki Linnakangas wrote:
 On 13.10.2012 19:35, Fujii Masao wrote:
  On Thu, Oct 11, 2012 at 11:52 PM, Heikki Linnakangas
  hlinnakan...@vmware.com  wrote:
  Ok, thanks. Committed.
 
  I found one typo. The attached patch fixes that typo.
 
 Thanks, fixed.
 
  ISTM you need to update the protocol.sgml because you added
  the field 'replyRequested' to WalSndrMessage and StandbyReplyMessage.


 
  Is it worth adding the same mechanism (send back the reply immediately
  if walsender request a reply) into pg_basebackup and pg_receivexlog?
 
 Good catch. Yes, they should be taught about this too. I'll look into
 doing that too.

If you have not started and have no objection, I can pick this up and
complete it.

For both (pg_basebackup and pg_receivexlog), we need to get a timeout
parameter from the user on the command line, as there is no conf file
here. The new option can be -t (the parameter name can be recvtimeout).

The main changes will be in the function ReceiveXlogStream(); it is a
common function for both pg_basebackup and pg_receivexlog. Handling will
be done the same way as we did it in walreceiver.

Suggestions/Comments?

With Regards,
Amit Kapila.





Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-10-17 Thread Amit Kapila
On Wednesday, October 17, 2012 5:16 PM Amit Kapila wrote:
  On Monday, October 15, 2012 3:43 PM Heikki Linnakangas wrote:
  On 13.10.2012 19:35, Fujii Masao wrote:
   On Thu, Oct 11, 2012 at 11:52 PM, Heikki Linnakangas
   hlinnakan...@vmware.com  wrote:
   Ok, thanks. Committed.
  
   I found one typo. The attached patch fixes that typo.
 
  Thanks, fixed.
 
   ISTM you need to update the protocol.sgml because you added
   the field 'replyRequested' to WalSndrMessage and
 StandbyReplyMessage.
 
 
 
   Is it worth adding the same mechanism (send back the reply
 immediately
   if walsender request a reply) into pg_basebackup and pg_receivexlog?
 
  Good catch. Yes, they should be taught about this too. I'll look into
  doing that too.
 
  If you have not started and have no objection, I can pick this up and
  complete it.
 
  For both (pg_basebackup and pg_receivexlog), we need to get a timeout
  parameter from the user on the command line, as there is no conf file
  here. The new option can be -t (the parameter name can be
  recvtimeout).
 
  The main changes will be in the function ReceiveXlogStream(); it is a
  common function for both pg_basebackup and pg_receivexlog. Handling
  will be done the same way as we did it in walreceiver.

Some more functions that receive the data files also need similar
handling in pg_basebackup.

With Regards,
Amit Kapila.





Re: [HACKERS] Doc patch only relevant - relevant only

2012-10-17 Thread Andrew Dunstan


On 10/16/2012 11:24 PM, Karl O. Pinc wrote:

Hi,

As long as I'm sending in trivial fixes
to the docs here's a bit of wording that's been bugging me.

In a number of places the docs read "only relevant";
this patch reverses this to read "relevant only".

I believe this reads better because it quickly
answers the question "is what?" with "is relevant",
making the sentence less of a strain to read.
"Only relevant" would be better if you really wanted
to emphasize the "only", which I don't think is called
for.

(Sending in such trivial patches makes me feel like
I'm bikeshedding.  Feel free to ignore them without comment.)




This doesn't appear to correct any ambiguity, nor any grammatical
error. I find these sentences perfectly readable as they are. Not
everything in the docs conforms to my personal style either, but I'm
not in favor of taking this sort of patch, which is just a matter of
substituting your stylistic judgment for that of the original author.
If we do that we'll never stop.


cheers

andrew




Re: [HACKERS] timezone change not in changelog ?

2012-10-17 Thread Tom Lane
Laurent Laborde kerdez...@gmail.com writes:
 Friendly greetings !
 There is a change between 9.1 and 9.2 that doesn't seem to be in the
 changelog:
 http://www.postgresql.org/docs/9.2/static/runtime-config-client.html#GUC-TIMEZONE
 http://www.postgresql.org/docs/9.1/static/runtime-config-client.html#GUC-TIMEZONE

ITYM this:

  <listitem>
   <para>
    Identify the server time zone during <application>initdb</>, and set
    <filename>postgresql.conf</filename> entries
    <link linkend="guc-timezone"><varname>timezone</></link> and
    <link linkend="guc-log-timezone"><varname>log_timezone</></link>
    accordingly (Tom Lane)
   </para>

   <para>
    This avoids expensive time zone probes during server start.
   </para>
  </listitem>

regards, tom lane




Re: [HACKERS] [PATCH] Enforce that INSERT...RETURNING preserves the order of multi rows

2012-10-17 Thread Merlin Moncure
On Wed, Oct 17, 2012 at 7:38 AM, P. Christeas x...@linux.gr wrote:
 It has been a fact that the RETURNING clause on an INSERT will return
 multiple rows in the same order as the multiple VALUES were fed.

Is that defined in the standard?

merlin




Re: [HACKERS] [PATCH] Enforce that INSERT...RETURNING preserves the order of multi rows

2012-10-17 Thread Tom Lane
P. Christeas x...@linux.gr writes:
 It has been a fact that the RETURNING clause on an INSERT will return
 multiple rows in the same order as the multiple VALUES were fed.

 eg: INSERT INTO tbl1(code) VALUES ('abc'), ('def'), ('agh')
       RETURNING id, code;

 is expected to yield:

     id | code
    ----+------
      1 | abc
      2 | def
      3 | agh

 Clarify that in the documentation, and also write a test case that will
 prevent us from breaking the rule in the future.

I don't believe this is a good idea in the slightest.  Yeah, the current
implementation happens to act like that, but there is no reason that we
should make it guaranteed behavior.  Nor is a regression test case going
to stop someone from changing it, anyway.

regards, tom lane




Re: [HACKERS] [PATCH] Enforce that INSERT...RETURNING preserves the order of multi rows

2012-10-17 Thread Peter Geoghegan
On 17 October 2012 14:53, Merlin Moncure mmonc...@gmail.com wrote:
 Is that defined in the standard?

RETURNING isn't even defined in the standard.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] [PATCH]Tablesample Submission

2012-10-17 Thread Alvaro Herrera
Hitoshi Harada wrote:

 Patch does not apply cleanly against latest master.  outfuncs.c,
 allpath.c and cost.h have rejected parts.  The make check failed in a
 lot of cases up to 26 out of 133.  I didn't look into each issue but I
 suggest rebasing on the latest master and making sure the regression
 test passes.

We've been waiting for a rebase for long enough, so I've marked this
patch as Returned with Feedback (for which we thank Hitoshi Harada).
Since this is said to be useful functionality, please make sure you
update the patch and resubmit to the next commitfest.  Thanks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] September 2012 commitfest

2012-10-17 Thread Alvaro Herrera
A week ago, I wrote:

 Some numbers: we got 65 patches this time, of which we rejected 4 and
 returned 3 with feedback.  14 patches have already been committed, and
 13 are waiting on their respective authors.  25 patches need review, and
 6 are said to be ready for committers.

A week later, the numbers have changed, but not by much.  We now have 66
patches (one patch from the previous commitfest was supposed to be moved
to this one but wasn't).  Rejected patches still number 4; there are now
7 returned with feedback.  17 are committed, and 10 are waiting on
authors.  21 patches need review.

Most of the remaining Waiting-on-author patches have seen recent
activity, which is why I didn't close them as returned.  Authors should
speak up soon -- there is no strict policy, but I don't think I'm going
to wait past this Friday for a final version to be submitted that can
be considered ready for committer.  If more work is needed than simple
fixes, authors are encouraged to close such patches as returned with
feedback themselves and resubmit during the next commitfest.


Most worrying to me are the large number of patches waiting for a review
-- in some cases they are waiting even for an initial review.  Here's a
list:

Server Features

* Timeout framework extension and lock_timeout
* Patch to compute Max LSN of Data Pages
* FOR KEY SHARE foreign keys
* [PoC] Writable Foreign Tables
* Incorrect behaviour when using a GiST index on points

Performance

* Skip checkpoint on promoting from streaming replication
* Decrease GiST bloat when penalties are identical
* Range Types statistics
* Performance Improvement by reducing WAL for Update Operation
* Adjacent in SP-GiST for range-types
* 2d-mapping based GiST for ranges
* Performance Improvement in Buffer Management for Select operation

Security

* Row-Level Security
* Extend argument of OAT_POST_CREATE

System Administration

* New statistics for WAL buffer dirty writes
* Switching timeline over streaming replication

Miscellaneous

* support INSERT INTO...RETURNING with partitioned table using rule
* Reworks for generic ALTER commands
* copy result to psql variables
* FDW for PostgreSQL
* pgbench - custom logging step, estimate of remaining time

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] embedded list

2012-10-17 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Here's the final version.  I think this is ready to go in.

Committed.

There are several uses of SHM_QUEUE in the backend code which AFAICS can
be replaced with dlist.  If someone's looking for an easy project,
here's one.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Enforce that INSERT...RETURNING preserves the order of multi rows

2012-10-17 Thread P. Christeas
On Wednesday 17 October 2012, you wrote:
 P. Christeas x...@linux.gr writes:
  It has been a fact that the RETURNING clause on an INSERT will return
  multiple rows in the same order as the multiple VALUES were fed.

 I don't believe this is a good idea in the slightest.  Yeah, the current
 implementation happens to act like that, but there is no reason that we
 should make it guaranteed behavior.  

That's my point: to push you to decide on that behavior and clarify it
in the documentation.

So far, it's very tempting for me to use this behavior, since I can
avoid multiple INSERTs (= saving bandwidth) and also the burden of
figuring out which of the returned ids associates with which inserted
row.
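
One way to keep the bandwidth saving without relying on row order (a
sketch): match on a returned natural key instead of on position.

    INSERT INTO tbl1 (code) VALUES ('abc'), ('def'), ('agh')
    RETURNING id, code;

The client then pairs each returned id with its input row via the code
column, which stays correct whatever order the rows come back in.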

Having a discussion (or argument or a vote) like this, I think, is useful.


FYI, there is also a Stack Overflow question on this:
http://stackoverflow.com/questions/5439293/is-insert-returning-guaranteed-to-return-things-in-the-right-order

-- 
Say NO to spam and viruses. Stop using Microsoft Windows!




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Joshua D. Drake


On 10/17/2012 02:48 AM, Simon Riggs wrote:


Would you or someone else be able to come up with some words of
caution for us to put in the manual that would be helpful to
developers?

There isn't even a list of caveats for rules.


I think we need the inverse: some documentation on why to use rules,
and this basically boils down to the problem. Can anyone tell me a
reason to use explicit rules over a trigger-and-function combination?


And that is the crux of the issue. If we can't identify a reason for
the feature to exist, and we have a suitable and better replacement,
the feature should be deprecated and removed.


My suggestion for docs is:

Note: Do not use; use triggers with functions instead <link>

Sincerely,

Joshua D. Drake



--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Simon Riggs
On 17 October 2012 18:02, Joshua D. Drake j...@commandprompt.com wrote:

 Note: Do not use, use Triggers with Functions instead link

Agreed, something simple is required. I suggest expanding that just a little...

Rules are a non-SQL-standard feature, and where possible we recommend
that you write your applications using triggers, views and functions
instead. Although not likely to be fully deprecated soon, the use of
rules is now actively discouraged.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Enforce that INSERT...RETURNING preserves the order of multi rows

2012-10-17 Thread Merlin Moncure
On Wed, Oct 17, 2012 at 9:29 AM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 17 October 2012 14:53, Merlin Moncure mmonc...@gmail.com wrote:
 Is that defined in the standard?

 RETURNING isn't even defined in the standard.

Right. Point being, assumptions based on implementation ordering are
generally to be avoided unless they are explicitly defined in the
standard or elsewhere.

merlin




Re: [HACKERS] Reduce palloc's in numeric operations.

2012-10-17 Thread Alvaro Herrera
Kyotaro HORIGUCHI wrote:
 Hello, I propose reducing pallocs in numeric operations.
 
 The numeric operations are slow by nature, but usually that is not
 a problem for on-disk operations. However, the slowdown is more
 pronounced in in-memory operations.
 
 I inspected them and found some very short-term pallocs. These
 pallocs are used as temporary storage for the digits of unpacked
 numerics.

This looks like a neat little patch.  Some feedback has been provided by
Heikki (thanks!) and since we're still waiting for an updated version, I
have marked this Returned with Feedback for the time being.  Please make
sure to address the remaining issues and submit to the next commitfest.
Thanks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-17 Thread Greg Stark
On Wed, Oct 17, 2012 at 11:26 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 The simplest usage would be implementing remote log tables, that is,
 tables where you do an INSERT on the master side, but it inserts only
 a logical WAL record and nothing else.

 On the subscriber side your replay process reads this WAL record as an
 insert event and, if the table is declared as an ordinary table on the
 subscriber, it performs an insert there.

What kinds of applications would need that?

-- 
greg




Re: [HACKERS] Doc patch only relevant - relevant only

2012-10-17 Thread Abhijit Menon-Sen
At 2012-10-17 09:19:58 -0400, and...@dunslane.net wrote:

 This doesn't appear to correct any ambiguity, nor any grammatical
 error.

FWIW, it's quite standard and uncontroversial good writing advice to
push "only" as far right as it can go. It does correct an ambiguity,
but in this case the ambiguity is silly and harmless, so fixing it
seems like nitpicking when you read the patch.

-- Abhijit

P.S. I would fix it anyway.




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Hannu Krosing

On 10/17/2012 11:31 AM, Dimitri Fontaine wrote:

Peter Geoghegan pe...@2ndquadrant.com writes:

Clearly deprecating rules implies some loss of functionality - there
is no exact, drop-in equivalent to something that magically rewrites
SQL that isn't equally baroque and problematic.

Maybe we can upgrade STATEMENT triggers to a level where they
cover all practical uses of rules.

Currently we can create FOR EACH STATEMENT triggers on
INSERT OR UPDATE OR DELETE, but they are pretty useless for
anything other than recording that such an event took place, as
both OLD and NEW are empty for these.

Perhaps we can make them much more useful by exposing
more of the original statement to the called function.
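
For concreteness, a sketch of the current state (table and function
names are illustrative): the trigger fires once per statement, but
sees none of the affected rows.

    CREATE FUNCTION note_stmt() RETURNS trigger AS $$
    BEGIN
        -- TG_OP and TG_TABLE_NAME are all we get; no OLD/NEW rows here
        RAISE NOTICE 'statement % on %', TG_OP, TG_TABLE_NAME;
        RETURN NULL;  -- return value is ignored for statement triggers
    END $$ LANGUAGE plpgsql;

    CREATE TRIGGER items_stmt
        AFTER INSERT OR UPDATE OR DELETE ON items
        FOR EACH STATEMENT EXECUTE PROCEDURE note_stmt();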


If that's the bar,
then detractors of rules should stop wasting their breath, because the
bar has been set impossibly high.

I believe an advice system is a good contender here, as already
proposed in:

   http://archives.postgresql.org/pgsql-hackers/2012-10/msg00610.php

See "defadvice" in Emacs Lisp and "The Standard Method Combination" of
the Common Lisp Object System as sources of inspiration here.

   
http://www.gnu.org/software/emacs/manual/html_node/elisp/Advising-Functions.html
   http://www.gigamonkeys.com/book/object-reorientation-generic-functions.html

It would basically be rules without the multiple-evaluation risks, yet
still with the multiple-evaluation feature when you need it, and with
explicit control over it.

Then if you insist on comparing this to a macro facility, as we're
talking about dynamic code rewriting, maybe we need to compare RULEs to
the Lisp-style macro facility, which is nothing like a pre-processor
facility

Is it something closer to decorators in some languages - that is,
functions that wrap other functions in some extra functionality?


(in Lisp, that's the reader, I think).

Regards,






Re: [HACKERS] Identity projection

2012-10-17 Thread Alvaro Herrera
Kyotaro HORIGUCHI wrote:
 Hello, sorry for the long absence.
 
 #  I had unexpected and urgent time-consuming tasks... :-(
 
 I have done a somewhat more precise inspection from two aspects, and
 it seemed to show that the difference should be the execution time of
 ExecProject.
 
 I'll be able to be back fully next week with a revised patch, and to
 take some other patches to review...

Hi, I've marked this patch as Returned with Feedback (thanks Tom).
Please submit an updated version to the upcoming commitfest.  Thanks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Josh Berkus
All,

For the record, I like RULEs and would prefer if someone fixed the
issues with them instead of deprecating them.  However, I also
acknowledge that that is unlikely to happen.

 Would you or someone else be able to come up with some words of
 caution for us to put in the manual that would be helpful to
 developers?

We could start with:

==========

Warning: RULEs are tricky to use correctly and difficult to understand,
even for users with a lot of PostgreSQL experience.  For most purposes,
you want a trigger <link> and not a RULE.  It is also likely that the
RULEs feature will be deprecated in some future release of PostgreSQL,
when all RULE functionality can be replaced by other mechanisms.

The PostgreSQL Project recommends that you use triggers <link> instead
of RULEs unless you have a specific reason to use RULEs.

==========

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Greg Stark
I dislike both of the explanations above, which don't actually explain
why people shouldn't use rules (Josh does say they're tricky, which is
a start). Just telling people we hate parts of the system doesn't
really come off well and leaves them wondering why.

I would suggest something like

Warning: RULES are tricky to use correctly. They rewrite the original
query into a new query before it is run, and it is very hard to
correctly anticipate and rewrite every possible input query into the
desired result. There are also unexpected interactions with other
components when RULES do something unexpected, such as rewriting a
single query to return two result sets.

For most applications it's much simpler and more predictable to use
TRIGGERs. ROW level triggers are evaluated for each row the original
query is about to process (or has just finished processing) and this
makes them much easier to follow. Statement level TRIGGERs can be used
for audit logs and similar operations which need to run once per
statement.




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Andrew Dunstan


On 10/17/2012 01:02 PM, Joshua D. Drake wrote:


On 10/17/2012 02:48 AM, Simon Riggs wrote:


Would you or someone else be able to come up with some words of
caution for us to put in the manual that would be helpful to
developers?

There isn't even a list of caveats for rules.


I think we need the inverse. Some documentation on why to use rules 
and this basically boils down to the problem. Can anyone tell me a 
reason to use explicit rules over a trigger and function combination?





I don't know how many times I have to say this: people are not 
listening. Tom has already given a case for it upthread:



Triggers necessarily operate on a row-at-a-time basis.  In theory,
for at least some bulk operations, a rule could greatly outperform
a trigger.  It's difficult to walk away from that - unless somebody
can prove that the advantage doesn't ever accrue in practice.





People can keep ignoring that if they like, but some of us won't. This
mantra of "there is no reason at all to use rules" is like climate
change denial - no matter how many times you say it, that won't make it
true.



cheers

andrew





Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Michael Nolan
On 10/12/12, Josh Berkus j...@agliodbs.com wrote:

 I realize you weren't around when we removed row OIDs, but I was *still*
 getting flack from that in 2008.  And we lost entire OSS projects to
 other databases because of removing row OIDs.  And those were marked
 deprecated for 3 years before we removed them.

FWIW, the documentation for 9.2 still mentions OIDs and the
'default_with_oids' parameter. In what release was it announced they
would be removed, and in what release were they removed?
--
Mike Nolan




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Peter Geoghegan
On 17 October 2012 18:50, Andrew Dunstan and...@dunslane.net wrote:
 I don't know how many times I have to say this: people are not listening.
 Tom has already given a case for it upthread:


 Triggers necessarily operate on a row-at-a-time basis.  In theory,
 for at least some bulk operations, a rule could greatly outperform
 a trigger.  It's difficult to walk away from that - unless somebody
 can prove that the advantage doesn't ever accrue in practice.

Fair point. I'm just not sure that that is a good enough reason not to
deprecate rules. I mean, if experienced hackers cannot figure out
whether that's actually a useful facet of rules, what hope is there for
anyone else?

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Peter Geoghegan
On 17 October 2012 18:46, Greg Stark st...@mit.edu wrote:
 I would suggest something like

 Warning: RULES are tricky to use correctly. They rewrite the original
 query into a new query before it is run, and it is very hard to
 correctly anticipate and rewrite every possible input query into the
 desired result. There are also unexpected interactions with other
 components when RULES do something unexpected, such as rewriting a
 single query to return two result sets.

+1 to that sort of wording.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Josh Berkus
Greg,

 Warning: RULES are tricky to use correctly. They rewrite the original
 query into a new query before it is run, and it is very hard to
 correctly anticipate and rewrite every possible input query into the
 desired result. There are also unexpected interactions with other
 components when RULES do something unexpected, such as rewriting a
 single query to return two result sets.
 
 For most applications it's much simpler and more predictable to use
 TRIGGERs. ROW level triggers are evaluated for each row the original
 query is about to process (or has just finished processing) and this
 makes them much easier to follow. Statement level TRIGGERs can be used
 for audit logs and similar operations which need to run once per
 statement.

This is excellent.  However, if we are actually considering deprecating
them, we should add the sentence "RULEs may be deprecated in a future
release of PostgreSQL."


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 I dislike both of the explanations above which don't actually explain
 why people shouldn't use rules (Josh does say they're tricky which is
 a start). Just telling people we hate parts of the system doesn't
 really come off well and leaves them wondering why.

Agreed.  I think that by far the most common problem people hit with
rules has to do with unexpected multiple evaluations of volatile
functions (eg nextval).  If we're going to put in some warning text
I think it would be smart to explain that and maybe even show an
example.  It'd be easy enough to generate an example involving,
say, a rule that's meant to log rows that are inserted.
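
For instance, a sketch of exactly that pitfall (table names are
illustrative):

    CREATE TABLE t     (id serial PRIMARY KEY, val text);
    CREATE TABLE t_log (id int, val text);

    CREATE RULE log_insert AS ON INSERT TO t
        DO ALSO INSERT INTO t_log VALUES (NEW.id, NEW.val);

    INSERT INTO t (val) VALUES ('x');
    -- t gets id = 1, but t_log records id = 2: the rule expands NEW.id
    -- to the column's default expression nextval('t_id_seq'), which is
    -- then evaluated a second time in the rule's action.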

regards, tom lane




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Joshua D. Drake


On 10/17/2012 10:46 AM, Greg Stark wrote:


I dislike both of the explanations above which don't actually explain
why people shouldn't use rules (Josh does say they're tricky which is
a start). Just telling people we hate parts of the system doesn't
really come off well and leaves them wondering why.

I would suggest something like

Warning: RULES are tricky to use correctly. They rewrite the original
query into a new query before it is run, and it is very hard to
correctly anticipate and rewrite every possible input query into the
desired result. There are also unexpected interactions with other
components when RULES do something unexpected, such as rewriting a
single query to return two result sets.

For most applications it's much simpler and more predictable to use
TRIGGERs. ROW level triggers are evaluated for each row the original
query is about to process (or has just finished processing) and this
makes them much easier to follow. Statement level TRIGGERs can be used
for audit logs and similar operations which need to run once per
statement.



I am not sure where to stick it, but we should also include the fact that
rules are almost always slower than a comparable trigger/function.


Sincerely,

JD








--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Josh Berkus

 I am not sure where to stick it, but we should also include the fact that
 rules are almost always slower than a comparable trigger/function.

That wouldn't be accurate, actually.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread John R Pierce

On 10/17/12 2:31 AM, Dimitri Fontaine wrote:

Then if you insist on comparing to a macro facility, as we're talking
about dynamic code rewriting, maybe we need to compare RULEs to the lisp
style macro facility, which is nothing like a pre-processor facility (in
lisp, that's the reader, I think).


Except the vast majority of the audience likely have no idea what the 
'lisp macro system' is like or what this comparison implies.. Even 
though I dabbled in a little LISP almost 40 years ago, I don't remember 
anything about LISP macros, just Assembler and C macros :)


Any such comparison should be with something that is common knowledge, 
not something even more obscure than the primary subject matter.





--
john r pierceN 37, W 122
santa cruz ca mid-left coast



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Simon Riggs
On 17 October 2012 18:46, Greg Stark st...@mit.edu wrote:

 I would suggest something like

 Warning: RULES are tricky to use correctly. They rewrite the original
 query into a new query before it is run and it is very hard to
 correctly anticipate and rewrite every possible input query into the
 desired result. There are also unexpected interactions with other
 components when RULES do something unexpected such as rewrite a single
 query to return two result sets.

 For most applications it's much simpler and more predictable to use
 TRIGGERs. ROW level triggers are evaluated for each row the original
 query is about to process (or has just finished processing) and this
 makes them much easier to follow. Statement level TRIGGERs can be used
 for audit logs and similar operations which need to run once per
 statement.

I like this very much. Thank you.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-17 Thread Christopher Browne
Well, replication is arguably a relevant case.

For Slony, the origin/master node never cares about logged changes - that
data is only processed on replicas.  Now, that's certainly a little
weaselly - the log data (sl_log_*) has got to get read to get to the
replica.

This suggests, nonetheless, a curiously different table structure than is
usual, and I could see this offering interesting possibilities.

The log tables are only useful to read in transaction order, which is
pretty well the order data gets written to WAL, so perhaps we could have
savings by only writing data to WAL...

It occurs to me that this notion might exist as a special sort of table,
interesting for pgq as well as Slony, which consists of:

- table data is stored only in WAL
- an index supports quick access to this data, residing in WAL
- TOASTing perhaps unneeded?
- index might want to be on additional attributes
- the triggers-on-log-tables thing Slony 2.2 does means we want these
tables to support triggers
- if data is only held in WAL, we need to hold the WAL until (mumble,
later, when known to be replicated)
- might want to mix local updates with updates imported from remote nodes

I think it's a misnomer to think this is about having the data not locally
accessible.  Rather, it has a pretty curious access and storage pattern.

And a slick pgq queue would likely make a good Slony log, too.


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread John R Pierce

On 10/17/12 10:46 AM, Greg Stark wrote:

Warning: RULES are tricky to use correctly. They rewrite the original
query into a new query before it is run and it is very hard to
correctly anticipate and rewrite every possible input query into the
desired result. There are also unexpected interactions with other
components when RULES do something unexpected such as rewrite a single
query to return two result sets.

For most applications it's much simpler and more predictable to use
TRIGGERs. ROW level triggers are evaluated for each row the original
query is about to process (or has just finished processing) and this
makes them much easier to follow. Statement level TRIGGERs can be used
for audit logs and similar operations which need to run once per
statement.


as a relative novice, I concur, this is clear, concise, and to the point.



--
john r pierceN 37, W 122
santa cruz ca mid-left coast



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Daniel Farina
On Wed, Oct 17, 2012 at 10:50 AM, Andrew Dunstan and...@dunslane.net wrote:
 Triggers necessarily operate on a row-at-a-time basis.  In theory,
 for at least some bulk operations, a rule could greatly outperform
 a trigger.  It's difficult to walk away from that - unless somebody
 can prove that the advantage doesn't ever accrue in practice.

 People can keep ignoring that if they like, but some of us won't. This
 mantra of "there is no reason at all to use rules" is like climate change
 denial - no matter how many times you say it, that won't make it true.

I think there is an assumed presumption on behalf of those
vigorously opposing the deprecation of rules that everyone understands
what the use cases for rules are and their respective commonality.  So
far, the discussion has been pretty unenlightening to me, and I find
the notion that those in favor of deprecation are just skirting
well-known questions ill-justified.  Just because an "in theory..." case
works better is not in and of itself enough to warrant a vigorous
defense -- perhaps I missed the email where people said "yes, I see
that all the time when rules are involved and wouldn't want to go
without it."

You and Josh seem to be strong proponents of rules for reasons other
than I just don't want to break applications.  That's not too many
to ask both of you: can you itemize your use cases and how important
you feel they are?

I'll cost-size it for you: for me, as of my current understanding, if
but one more defect can be removed per year by dropping all
maintenance of RULES in exchange, I'd take that trade, as I understand
things right now.

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Joshua D. Drake


On 10/17/2012 11:32 AM, Josh Berkus wrote:




I am not sure where to stick it, but we should also include the fact that
rules are almost always slower than a comparable trigger/function.


That wouldn't be accurate, actually.


Let me add: when used with partitioning. I should have been more explicit.

JD






--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Andrew Dunstan


On 10/17/2012 03:06 PM, Daniel Farina wrote:

On Wed, Oct 17, 2012 at 10:50 AM, Andrew Dunstan and...@dunslane.net wrote:

Triggers necessarily operate on a row-at-a-time basis.  In theory,
for at least some bulk operations, a rule could greatly outperform
a trigger.  It's difficult to walk away from that - unless somebody
can prove that the advantage doesn't ever accrue in practice.

People can keep ignoring that if they like, but some of us won't. This
mantra of "there is no reason at all to use rules" is like climate change
denial - no matter how many times you say it, that won't make it true.

I think there is an assumed presumption on behalf of those
vigorously opposing the deprecation of rules that everyone understands
what the use cases for rules are and their respective commonality.  So
far, the discussion has been pretty unenlightening to me, and I find
the notion that those in favor of deprecation are just skirting well
known questions ill justified.  Just because an in theory... case
works better is not in and of itself enough to warrant a vigorous
defense -- perhaps I missed the email where people said yes, I see
that all the time when rules are involved and wouldn't want to go
without it.

You and Josh seem to be strong proponents of rules for reasons other
than I just don't want to break applications.  That's not too many
to ask both of you: can you itemize your use cases and how important
you feel they are?

I'll cost-size it for you: for me, as of my current understanding, if
but one more defect can be removed per year by dropping all
maintenance of RULES in exchange, I'd take that trade, as I understand
things right now.




I'll give you one case, although I still think Tom is right - the onus 
of proof is on those proposing to remove a feature, not the other way 
around.


Some years ago I was partitioning a large data store. By far the fastest
way to do this, by about an order of magnitude, turned out to be using a
partitioning rule. In testing it was way faster than using a trigger,
even one written in C, or pulling out the individual partitions one by
one. And I don't think writing triggers in C is an acceptable
replacement for rules anyway.
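
The shape of the thing, as a sketch (hypothetical table names; the
rule form is roughly what the partitioning docs show):

    CREATE RULE route_2012 AS ON INSERT TO measurements
        WHERE NEW.logdate >= DATE '2012-01-01'
          AND NEW.logdate <  DATE '2013-01-01'
        DO INSTEAD INSERT INTO measurements_2012 VALUES (NEW.*);

Because the rewriter turns the whole bulk INSERT into one statement per
partition, the redirection runs set-at-a-time rather than once per row.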


Once I had the data partitioned, I dropped the rule and put a trigger in
place.


Now I'd be fairly miffed if we just removed that capability. I 
personally feel that the bar for removing features should be pretty darn 
high.


cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Daniel Farina
On Wed, Oct 17, 2012 at 12:43 PM, Andrew Dunstan and...@dunslane.net wrote:

 On 10/17/2012 03:06 PM, Daniel Farina wrote:

 On Wed, Oct 17, 2012 at 10:50 AM, Andrew Dunstan and...@dunslane.net
 wrote:

 Triggers necessarily operate on a row-at-a-time basis.  In theory,
 for at least some bulk operations, a rule could greatly outperform
 a trigger.  It's difficult to walk away from that - unless somebody
 can prove that the advantage doesn't ever accrue in practice.

 People can keep ignoring that if they like, but some of us won't. This
 mantra of "there is no reason at all to use rules" is like climate change
 denial - no matter how many times you say it, that won't make it true.

 I think there is an assumed presumption on behalf of those
 vigorously opposing the deprecation of rules that everyone understands
 what the use cases for rules are and their respective commonality.  So
 far, the discussion has been pretty unenlightening to me, and I find
 the notion that those in favor of deprecation are just skirting well
 known questions ill justified.  Just because an in theory... case
 works better is not in and of itself enough to warrant a vigorous
 defense -- perhaps I missed the email where people said yes, I see
 that all the time when rules are involved and wouldn't want to go
 without it.

 You and Josh seem to be strong proponents of rules for reasons other
 than I just don't want to break applications.  That's not too many
 to ask both of you: can you itemize your use cases and how important
 you feel they are?

 I'll cost-size it for you: for me, as of my current understanding, if
 but one more defect can be removed per year by dropping all
 maintenance of RULES in exchange, I'd take that trade, as I understand
 things right now.



 I'll give you one case, although I still think Tom is right - the onus of
 proof is on those proposing to remove a feature, not the other way around.

I'll have to register my disagreement then, in the special case where
a feature becomes so obscure that many people don't have a wide-spread
intuition at what it's good at or used for.  Tom also said "build the
replacement", and without itemization of use cases, I don't even know
what that would look like -- perhaps such knowledge is assumed, but I
think it's assumed wrongly, so perhaps there just needs to be some
education.  At best you could define what to build somewhat
tautologically from the mechanism used by RULES, and that's not a very
good way to go about it, methinks.

Thank you for humoring me and fleshing out your case anyway.

 Some years ago I was partitioning a large data store. By far the fastest way
 to do this, by about an order of magnitude, turned out to be using a
 partitioning rule. In testing it was way faster than using a trigger, even
 one written in C, or pulling out the individual partitions one by one. And I
 don't think writing triggers in C is an acceptable replacement for rules
 anyway.

 Once I had the data partitioned, I dropped the rule and put a trigger in
 place.

That's a good one.  So, would a more legitimate partitioning feature be
enough to let user-visible rules support go?  Or are there other cases?

 Now I'd be fairly miffed if we just removed that capability. I personally
 feel that the bar for removing features should be pretty darn high.

The bar for quality is also high.  Like I said: to my needs, one less
bug outweighs the advantages of rules, especially if that advantage is
carried over multiple years.  I still lose quite a bit in the
deprecation regardless: if even 0.1% of the customer base uses rules,
a sudden deprecation will cause us a lot of pain.  However, a slow
deprecation is a lot more manageable and, if it pays off in one more
bug solved a year or a better positioned feature maintained with
equivalent effort it will have been worth it.

That's another thing that has not come up for discussion: those who
maintain rules -- are they happy to do it? What is the investment of
time like?  I have been presuming a cost of maintenance, but I have
never heard someone who actually maintains rules regularly or
implements features that become more complex because of it try to size
the benefit one way or another.

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Josh Berkus

 You and Josh seem to be strong proponents of rules for reasons other
 than I just don't want to break applications.  That's not too many
 to ask both of you: can you itemize your use cases and how important
 you feel they are?

Well, my main issue is actually that I don't want to break people's
applications.  I seldom use RULEs myself.

The one area where I do is handling bulk loading for partitioning or
other table redirection.  For this case, rules are much, much, much (as
in 100X) faster than row-at-a-time triggers.
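
For contrast, the row-at-a-time shape that accounts for that gap -- a
sketch with hypothetical names, roughly the form the partitioning docs
show for triggers:

    CREATE FUNCTION route_row() RETURNS trigger AS $$
    BEGIN
        INSERT INTO measurements_2012 VALUES (NEW.*);
        RETURN NULL;  -- suppress the insert into the parent table
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER route_insert BEFORE INSERT ON measurements
        FOR EACH ROW EXECUTE PROCEDURE route_row();

The function body runs once per incoming row, where the equivalent rule
fires once per statement.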

Could statement triggers and MERGE be improved to replace this?  Pretty
much certainly.  Have they been?  No.

Also, I'll point out that surveying -hackers for feature usage is a
pretty skewed group.  A bunch of the people on this list (maybe most of
them) don't develop applications -- some never have.  If we think
surveying usage affects our plans for deprecation, then we ought to
survey a less select group of users.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2012-10-17 Thread Alvaro Herrera
Kohei KaiGai wrote:
 2012/10/5 Alvaro Herrera alvhe...@2ndquadrant.com:

 The attached patch fixes the messaging issue.
 I added a new function, func_signature_string_oid(), which returns a
 compatible function signature but takes the object's OID instead.
 
 So, the error message is now constructed as:
 +   case OBJECT_AGGREGATE:
 +   case OBJECT_FUNCTION:
 +   errorm = format_elog_string(function %s already exists in
 schema \%s\,
 +   func_signature_string_oid(objectId),
 +   get_namespace_name(namespaceId));
 +break;

Thanks, yeah, this works for me.

I am now wondering if it would make sense to merge the duplicate-name
error cases in AlterObjectNamespace_internal and
AlterObjectRename_internal.  The former only works when there is a name
catcache for the object type.  Maybe we can create a single function to
which we give the object type, name/args, oid, etc, and it uses a
catcache if available and falls back to get_object_address (with the
IMO ugly name list manipulations) if not.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Josh Berkus
On 10/17/12 12:57 PM, Daniel Farina wrote:
 I'll have to register my disagreement then, in the special case where
 a feature becomes so obscure that many people don't have a wide-spread
 intuition at what it's good at or used for.  Tom also said "build the
 replacement", and without itemization of use cases, I don't even know
 what that would look like -- perhaps such knowledge is assumed, but I
 think it's assumed wrongly, so perhaps there just needs to be some
 education.  At best you could define what to build somewhat
 tautologically from the mechanism used by RULES, and that's not a very
 good way to go about it, methinks.

Well, there are the cases for which RULEs are actually the superior/only
mechanism (probably a fairly small set) and the cases where they are
not, but are used anyway (a much larger set).  For the latter group,
those cases need to be (a) identified, and (b) migration documented.

For example, one can currently create an ON UPDATE rule to make a view
updatable.  It is now also possible to create a trigger to do the same
thing, and its results would be more predictable.  However, nobody has
documented how one would migrate an existing UPDATE rule to a new ON
UPDATE trigger.
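
As a sketch of what such a migration note might show, for a hypothetical
view v over a table base(id, val):

    -- the old rule form:
    CREATE RULE v_upd AS ON UPDATE TO v
        DO INSTEAD UPDATE base SET val = NEW.val WHERE id = OLD.id;

    -- the 9.1+ INSTEAD OF trigger form:
    CREATE FUNCTION v_upd_fn() RETURNS trigger AS $$
    BEGIN
        UPDATE base SET val = NEW.val WHERE id = OLD.id;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER v_upd_trg INSTEAD OF UPDATE ON v
        FOR EACH ROW EXECUTE PROCEDURE v_upd_fn();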

Putting it as "Andrew and Josh need to enumerate these cases, or forever
be silent" is quite unfair to our users.  Andrew and I hardly represent
the entire scope of PostgreSQL app developers.  Enumerating the cases,
finding replacements for them, and documenting migrations needs to be a
group effort.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-17 Thread Josh Berkus

 It is not meant to be a full implementation of an application-level queuing
 system, though, but just the capture, persistence and distribution parts.
 
 Using this as an application level queue needs a set of interface
 functions to extract the events and also to keep track of the processed
 events. As there is no general consensus on what these should be (like
 whether processing the same event twice is allowed), this part is left
 for specific queue consumer implementations.

Well, but AFAICT, you've already prohibited features through your design
which are essential to application-level queues, and are implemented by,
for example, pgQ.

1. your design only allows the queue to be read on replicas, not on the
node where the item was inserted.

2. if you can't UPDATE or DELETE queue items -- or LOCK them -- how on
earth would a client know which items they have executed and which they
haven't?

3. Double-down on #2 in a multithreaded environment.

For an application-level queue, the base functionality is:

ADD ITEM
READ NEXT (#) ITEM(S)
LOCK ITEM
DELETE ITEM

More sophisticated and useful queues also allow:

READ NEXT UNLOCKED ITEM
LOCK NEXT UNLOCKED ITEM
UPDATE ITEM
READ NEXT (#) UNSEEN ITEM(S)

The design you describe seems to prohibit pretty much all of the above
operations after READ NEXT.  This makes it completely useless as an
application-level queue.
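
(For illustration, against a plain table with a made-up schema, the base
operations are just:

    CREATE TABLE q (id bigserial PRIMARY KEY, payload text, locked_by text);

    INSERT INTO q (payload) VALUES ('work');            -- ADD ITEM
    SELECT * FROM q ORDER BY id LIMIT 10;               -- READ NEXT ITEMS
    UPDATE q SET locked_by = 'worker-1' WHERE id = 42;  -- LOCK ITEM
    DELETE FROM q WHERE id = 42;                        -- DELETE ITEM

None of the last three have any obvious analogue against an insert-only,
WAL-resident queue.)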

And, for that matter, if your new queue only accepts INSERTs, why not
just improve LISTEN/NOTIFY so that it's readable on replicas?  What does
this design buy you that that doesn't?

Quite possibly you have plans which answer all of the above, but they
aren't at all clear in your RFC.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-10-17 Thread Kevin Grittner
Simon Riggs wrote:
 On 6 October 2012 00:56, Tom Lane t...@sss.pgh.pa.us wrote:
 
 2. DROP INDEX CONCURRENTLY doesn't bother to do
 TransferPredicateLocksToHeapRelation until long after it's
 invalidated the index. Surely that's no good? Is it even possible
 to do that correctly, when we don't have a lock that will prevent
 new predicate locks from being taken out meanwhile?
 
 No idea there. Input appreciated.

[Sorry for delayed response; fighting through a backlog.]
 
If the creation of a new tuple by insert or update would not perform
the related index tuple insertion, and the lock has not yet been
transferred to the heap relation, yes we have a problem.  Will take a
look at the code.

Creation of new predicate locks while in this state has no bearing on
the issue as long as locks are transferred to the heap relation after
the last scan using the index has completed.

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bugs in planner's equivalence-class processing

2012-10-17 Thread Martijn van Oosterhout
On Tue, Oct 16, 2012 at 11:56:52AM -0400, Tom Lane wrote:
 Is anybody concerned about the compatibility implications of fixing this
 bug in the back branches?  I'm worried about people complaining that we
 broke their application in a minor release.  Maybe they were depending
 on incorrect behavior, but they might complain anyway.  On the other
 hand, the fact that this hasn't been reported from the field in nine
 years suggests that not many people write queries like this.

Nice detective work. I'd personally say that it should be fixed. I
personally haven't written these kinds of queries so I'm not affected,
but I don't like the idea of known bugs being unfixed.

It's a pity we can't have a system that can somehow independently
check the results of the planner...

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-17 Thread Christopher Browne
On Wed, Oct 17, 2012 at 4:25 PM, Josh Berkus j...@agliodbs.com wrote:

 It is not meant to be a full implementation of an application-level queuing
 system, though, but just the capture, persistence and distribution parts.

 Using this as an application level queue needs a set of interface
 functions to extract the events and also to keep track of the processed
 events. As there is no general consensus on what these should be (like
 whether processing the same event twice is allowed), this part is left
 for specific queue consumer implementations.

 Well, but AFAICT, you've already prohibited features through your design
 which are essential to application-level queues, and are implemented by,
 for example, pgQ.

 1. your design only allows the queue to be read on replicas, not on the
 node where the item was inserted.

I commented separately on this; I'm pretty sure there needs to be a
way to read the queue on a replica, yes, indeed.

 2. if you can't UPDATE or DELETE queue items -- or LOCK them -- how on
 earth would a client know which items they have executed and which they
 haven't?

If the items are actually stored in WAL, then it seems well and truly
impossible to do any of those three things directly.

What could be done, instead, would be to add successor items to
indicate that they have been dealt with, in effect, back-references.

You don't get to UPDATE or DELETE; instead, you do something like:

   INSERT into queue (reference_to_xid, reference_to_id_in_xid, action)
values (old_xid_1, old_id_within_xid_1, 'COMPLETED'), (old_xid_2,
old_id_within_xid_2, 'CANCELLED');

In a distributed context, it's possible that multiple nodes could be
reading from the same queue, so that while "process at least once" is
no trouble, "process at most once" is just plain troublesome.
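
A consumer working that way could compute its outstanding items with an
anti-join against the back-references -- a sketch, assuming a
hypothetical event layout matching the INSERT above:

    SELECT e.*
      FROM queue e
     WHERE e.action = 'NEW'
       AND NOT EXISTS (SELECT 1
                         FROM queue ack
                        WHERE ack.reference_to_xid = e.xid
                          AND ack.reference_to_id_in_xid = e.id_within_xid);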
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order

2012-10-17 Thread Alvaro Herrera
Joel Jacobson wrote:
 On Thu, Jul 5, 2012 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  You may in fact need a new field --- I'm just saying it should be in the
  object-type-specific struct, eg FuncInfo, not DumpableObject.
 
 
 I suggest adding char *funcsig to FuncInfo, and moving the funcsig =
 format_function_arguments(finfo, funciargs) code from dumpFunc to getFuncs.
 
 Because dumpFunc is called after sortDumpableObjectsByTypeName, setting
 funcsig in the FuncInfo struct in dumpFunc wouldn't work, as it needs to be
 available when entering sortDumpableObjectsByTypeName.

Uh, the patch you posted keeps the pg_get_function_identity_arguments
call in dumpFunc, but there is now also a new one in getFuncs.  Do we
need to remove the second one?

Here's an updated patch for your consideration.  I was about to push
this when I noticed the above.  The only change here is that the extra
code that tests for new remoteVersions in the second else if branch of
getFuncs and getAggregates has been removed, since it cannot ever be
reached.

(I tested the new pg_dump with 8.2 and HEAD and also verified it passes
pg_upgrade's make check.  I didn't test with other server versions.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
***
*** 3534,3539  getAggregates(Archive *fout, int *numAggs)
--- 3534,3540 
  	int			i_proargtypes;
  	int			i_rolname;
  	int			i_aggacl;
+ 	int			i_proiargs;
  
  	/* Make sure we are in proper schema */
  	selectSourceSchema(fout, pg_catalog);
***
*** 3543,3553  getAggregates(Archive *fout, int *numAggs)
  	 * rationale behind the filtering logic.
  	 */
  
! 	if (fout-remoteVersion = 80200)
  	{
  		appendPQExpBuffer(query, SELECT tableoid, oid, proname AS aggname, 
  		  pronamespace AS aggnamespace, 
  		  pronargs, proargtypes, 
  		  (%s proowner) AS rolname, 
  		  proacl AS aggacl 
  		  FROM pg_proc p 
--- 3544,3555 
  	 * rationale behind the filtering logic.
  	 */
  
! 	if (fout-remoteVersion = 80400)
  	{
  		appendPQExpBuffer(query, SELECT tableoid, oid, proname AS aggname, 
  		  pronamespace AS aggnamespace, 
  		  pronargs, proargtypes, 
+ 		  pg_catalog.pg_get_function_identity_arguments(oid) AS proiargs,
  		  (%s proowner) AS rolname, 
  		  proacl AS aggacl 
  		  FROM pg_proc p 
***
*** 3565,3576  getAggregates(Archive *fout, int *numAggs)
--- 3567,3594 
  			  deptype = 'e'));
  		appendPQExpBuffer(query, ));
  	}
+ 	else if (fout-remoteVersion = 80200)
+ 	{
+ 		appendPQExpBuffer(query, SELECT tableoid, oid, proname AS aggname, 
+ 		  pronamespace AS aggnamespace, 
+ 		  pronargs, proargtypes, 
+ 		  NULL::text AS proiargs,
+ 		  (%s proowner) AS rolname, 
+ 		  proacl AS aggacl 
+ 		  FROM pg_proc p 
+ 		  WHERE proisagg AND (
+ 		  pronamespace != 
+ 		  (SELECT oid FROM pg_namespace 
+ 		  WHERE nspname = 'pg_catalog')),
+ 		  username_subquery);
+ 	}
  	else if (fout-remoteVersion = 70300)
  	{
  		appendPQExpBuffer(query, SELECT tableoid, oid, proname AS aggname, 
  		  pronamespace AS aggnamespace, 
  		  CASE WHEN proargtypes[0] = 'pg_catalog.\any\'::pg_catalog.regtype THEN 0 ELSE 1 END AS pronargs, 
  		  proargtypes, 
+ 		  NULL::text AS proiargs, 
  		  (%s proowner) AS rolname, 
  		  proacl AS aggacl 
  		  FROM pg_proc 
***
*** 3585,3590  getAggregates(Archive *fout, int *numAggs)
--- 3603,3609 
  		  0::oid AS aggnamespace, 
    CASE WHEN aggbasetype = 0 THEN 0 ELSE 1 END AS pronargs, 
  		  aggbasetype AS proargtypes, 
+ 		  NULL::text AS proiargs, 
  		  (%s aggowner) AS rolname, 
  		  '{=X}' AS aggacl 
  		  FROM pg_aggregate 
***
*** 3600,3605  getAggregates(Archive *fout, int *numAggs)
--- 3619,3625 
  		  0::oid AS aggnamespace, 
    CASE WHEN aggbasetype = 0 THEN 0 ELSE 1 END AS pronargs, 
  		  aggbasetype AS proargtypes, 
+ 		  NULL::text AS proiargs, 
  		  (%s aggowner) AS rolname, 
  		  '{=X}' AS aggacl 
  		  FROM pg_aggregate 
***
*** 3623,3628  getAggregates(Archive *fout, int *numAggs)
--- 3643,3649 
  	i_proargtypes = PQfnumber(res, proargtypes);
  	i_rolname = PQfnumber(res, rolname);
  	i_aggacl = PQfnumber(res, aggacl);
+ 	i_proiargs = PQfnumber(res, proiargs);
  
  	for (i = 0; i  ntups; i++)
  	{
***
*** 3642,3647  getAggregates(Archive *fout, int *numAggs)
--- 3663,3669 
  		agginfo[i].aggfn.lang = InvalidOid;		/* not currently interesting */
  		agginfo[i].aggfn.prorettype = InvalidOid;		/* not saved */
  		agginfo[i].aggfn.proacl = pg_strdup(PQgetvalue(res, i, i_aggacl));
+ 		agginfo[i].aggfn.proiargs = pg_strdup(PQgetvalue(res, i, i_proiargs));
  		agginfo[i].aggfn.nargs = 

Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Daniel Farina
On Wed, Oct 17, 2012 at 1:12 PM, Josh Berkus j...@agliodbs.com wrote:
 On 10/17/12 12:57 PM, Daniel Farina wrote:
 I'll have to register my disagreement then, in the special case where
 a feature becomes so obscure that many people don't have a wide-spread
 intuition at what it's good at or used for.  Tom also said "build the
 replacement", and without itemization of use cases, I don't even know
 what that would look like -- perhaps such knowledge is assumed, but I
 think it's assumed wrongly, so perhaps there just needs to be some
 education.  At best you could define what to build somewhat
 tautologically from the mechanism used by RULES, and that's not a very
 good way to go about it, methinks.

[use case, redacted, although worth independent consideration]

 Putting it as Andrew and Josh need to enumerate these cases, or forever
 be silent is quite unfair to our users.  Andrew and I hardly represent
 the entire scope of PostgreSQL app developers.  Enumerating the cases,
 finding replacements for them, and documenting migrations needs to be a
 group effort.

Unfortunately I myself see little evidence of the vast, vast --
several nines of vast -- majority of folks using rules, and as I said:
as a thought experiment, merely one solved bug is worth more to me
than rules from what I know at this time.  If I had a wealth of user
pain to draw upon, I would have in opposition to their deprecation.
But, I don't, so I am cautiously in favor of pipelining a slow
deprecation, even though I can only be hurt by the process tactically
-- strategically, I can be helped, e.g. by solving even one defect
that lowers *my* maintenance cost.  You can consider my sentiment the
result of some evidence of absence, if you will.  I can probably
refine this intuition if it would change someone's mind, but given the
tone of conversation, I'd probably simply be given a no-true-scotsman
retort -- which is true, Heroku's user base is not provably
representative of all users.  But what else is there to go on, besides
experiences of others, such as yours and Andrew's, or others?

Both of you have given some well-considered use cases now, but the
conversation was a quagmire a while because it seems like the thing to
do was dismiss those charitable to the idea of deprecation rather than
even tersely list out use cases that are in danger.  If the project
suffered a vast number of deprecation requests I could understand the
'silence is not consent' argument, because who has the time to defend
all territory all the time?  But as-is such conversations are so rare
that I think positive identification of use cases is a worthwhile use of
time, if it can result in but a chance of eliminating maintenance
burden and surface area.

Features do not stay for free, especially not at the level of quality
the project demands, and my personal sanity benefits from that
quality.  While nobody has given a cost of the maintenance of rules, I
would surmise it is non-zero, and consuming resources on potentially
lousy features is not a service to users either, and I do not wish
that to be ignored.

Finally, putting aside the use cases you are able to positively
identify from your personal experience, I think it's reasonable to
put in a message of intent-to-deprecate and reverse or revise course
as more data appears.  Perhaps the thinking should be: intent to
aggressively gather data to enable deprecation rather than a final
deprecation decision and plan, full stop.  The most direct route may
be to package such a request into error messages or warnings in the
database, because I do not think release notes or announcements are
enough.

Contrast this with the sudden change to VACUUM FULL: from no doubling
in space usage to a doubling in space usage temporarily.  That's
nothing to sneeze at, and who knows, thousands of administrative
scripts happily VACUUM FULLing could have blown up terribly.  But, it
was changed anyway, because the feature was pretty much deemed not
that useful to people relating their needs.  How is this reasoning
consistent with that change?

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Christopher Browne
On Wed, Oct 17, 2012 at 5:45 PM, Daniel Farina dan...@heroku.com wrote:
 retort -- which is true, Heroku's user base is not provably
 representative of all users.  But what else is there to go on, besides
 experiences of others, such as yours and Andrew's, or others?

Well, Heroku doesn't support Slony + Londiste (and I am not overly
indignant at your reasoning for not supporting them), so I think we can
guarantee that anything involving those trigger-based replication
systems will be absent from Heroku's user base.  Which is not to
express indignation, but merely to say that there are a few knowable
biases about your users.  And I'd expect quite a few unknow(n|able) ones.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Josh Berkus
Daniel,

 Unfortunately I myself see little evidence of the vast, vast --
 several nines of vast -- majority of folks using rules, and as I said:
 as a thought experiment, merely one solved bug is worth more to me
 than rules from what I know at this time. 

Again, the answer to this is to run an aggressively promoted survey, so
that we can have data, rather than speculation by -hackers.

 Finally, putting aside the use cases you are able to positively
 identify from your personal experience, I think it's reasonable to
 put in a message of intent-to-deprecate and reverse or revise course
 as more data appears.  Perhaps the thinking should be: intent to
 aggressively gather data to enable deprecation rather than a final
 deprecation decision and plan, full stop.

Exactly.

In fact, I'll go further and say that I believe we will be deprecating
RULEs eventually.  It's merely a question of how long that will take and
what we need to document, announce and implement before then.

I would tend to say "well, they're not hurting anyone, why not keep
them?"  Except that we're gathering an increasing number of features
(RETURNING, FDWs, CTEs, Command triggers) which don't work well together
with RULEs.  That puts us in danger of turning into MySQL ("Sorry, you
can't use Full Text Search with transactions"), which is not a direction
we want to go in.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-10-17 Thread Kevin Grittner
Kevin Grittner wrote:
 Simon Riggs wrote:
 On 6 October 2012 00:56, Tom Lane t...@sss.pgh.pa.us wrote:
  
 2. DROP INDEX CONCURRENTLY doesn't bother to do
 TransferPredicateLocksToHeapRelation until long after it's
 invalidated the index. Surely that's no good? Is it even possible
 to do that correctly, when we don't have a lock that will prevent
 new predicate locks from being taken out meanwhile?
 
 No idea there. Input appreciated.
 
 If the creation of a new tuple by insert or update would not
 perform the related index tuple insertion, and the lock has not yet
 been transferred to the heap relation, yes we have a problem.  Will
 take a look at the code.
 
 Creation of new predicate locks while in this state has no bearing
 on the issue as long as locks are transferred to the heap relation
 after the last scan using the index has completed.
 
To put that another way, it should be done at a time when it is sure
that no query sees indisvalid = true and no query has yet seen
indisready = false.  Patch attached.  Will apply if nobody sees a
problem with it.

-Kevin
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***
*** 1316,1321  index_drop(Oid indexId, bool concurrent)
--- 1316,1326 
  	 * table lock strong enough to prevent all queries on the table from
  	 * proceeding until we commit and send out a shared-cache-inval notice
  	 * that will make them update their index lists.
+ 	 *
+ 	 * All predicate locks on the index are about to be made invalid. Promote
+ 	 * them to relation locks on the heap. For correctness this must be done
+ 	 * after the index was last seen with indisready = true and before it is
+ 	 * seen with indisvalid = false.
  	 */
  	heapId = IndexGetRelation(indexId, false);
  	if (concurrent)
***
*** 1349,1354  index_drop(Oid indexId, bool concurrent)
--- 1354,1361 
  		 */
  		indexRelation = heap_open(IndexRelationId, RowExclusiveLock);
  
+ 		TransferPredicateLocksToHeapRelation(userIndexRelation);
+ 
  		tuple = SearchSysCacheCopy1(INDEXRELID,
  	ObjectIdGetDatum(indexId));
  		if (!HeapTupleIsValid(tuple))
***
*** 1438,1449  index_drop(Oid indexId, bool concurrent)
  		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
  		userIndexRelation = index_open(indexId, AccessExclusiveLock);
  	}
! 
! 	/*
! 	 * All predicate locks on the index are about to be made invalid. Promote
! 	 * them to relation locks on the heap.
! 	 */
! 	TransferPredicateLocksToHeapRelation(userIndexRelation);
  
  	/*
  	 * Schedule physical removal of the files
--- 1445,1452 
  		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
  		userIndexRelation = index_open(indexId, AccessExclusiveLock);
  	}
! 	else
! 		TransferPredicateLocksToHeapRelation(userIndexRelation);
  
  	/*
  	 * Schedule physical removal of the files

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Simon Riggs
On 17 October 2012 23:24, Josh Berkus j...@agliodbs.com wrote:

 In fact, I'll go further and say that I believe we will be deprecating
 RULEs eventually.  It's merely a question of how long that will take and
 what we need to document, announce and implement before then.

 I would tend to say "well, they're not hurting anyone, why not keep
 them?"  Except that we're gathering an increasing number of features
 (RETURNING, FDWs, CTEs, Command triggers) which don't work well together
 with RULEs.  That puts us in danger of turning into MySQL ("Sorry, you
 can't use Full Text Search with transactions"), which is not a direction
 we want to go in.

I don't really understand. We *are* already in the position you say we
don't want to go towards. It's not a danger, its a current reality.

So what do we do? I've got the doc changes now. Let's agree the rest
of the plan...

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bugs in planner's equivalence-class processing

2012-10-17 Thread Tom Lane
I wrote:
 So this is essentially an oversight in the patch that added tracking of
 nullable_relids.  I got confused about the difference between
 outerjoin_delayed (this clause, as a whole, is not outerjoin_delayed
 because its natural semantic level would be at the join anyway) and
 having nonempty nullable_relids, and thought that equivalence-class
 processing would never see such clauses because it doesn't accept
 outerjoin_delayed clauses.  So there's no code in equivclass.c to
 compute nullable_relids sets for constructed clauses.  At some point
 it might be worth adding same, but for now I'm just going to tweak
 distribute_qual_to_rels to not believe that clauses with nonempty
 nullable_relids can be equivalence clauses.

I tried the latter, with the first patch attached below, and it fixed
the incorrect-plan problem.  However, further testing showed that this
method also breaks one of the optimizations implemented in equivclass.c,
which is the ability to push constants through full joins, as in

select * from tenk1 a full join tenk1 b using (unique1)
where unique1 = 42;

We don't seem to have a regression test case for that optimization,
which is probably because it predates the availability of EXPLAIN's
COSTS OFF option.  (I've added one in the second patch below.)
I thought for a minute about accepting that as fallout, but I know
for certain that we'd get pushback on it, because we did last time
I broke it :-(.

So it appears that what we have to do is upgrade the equivalence-class
machinery to handle nullable_relids properly.  The second patch attached
below does that.  It's considerably larger than the quick-hack patch,
though not as large as I'd originally feared.

Anyway, the question now is whether to back-patch this.  We do have
pretty much the same equivalence-class infrastructure in all versions
since 8.3, so it's possible to do it ... but it's a bit nervous-making.
On the other hand, letting a known bug go unfixed isn't attractive
either.

regards, tom lane

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3c7fa632b8ebe26c1d91e83e0ba1fa9c03b9e919..1e1a9b236aeb0672645828b8d7dfd458a6ef3710 100644
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*** distribute_qual_to_rels(PlannerInfo *roo
*** 988,993 
--- 988,1010 
  			if (check_redundant_nullability_qual(root, clause))
  return;
  		}
+ 		else if (!bms_is_empty(nullable_relids))
+ 		{
+ 			/*
+ 			 * Although the qual doesn't have to be delayed to above its
+ 			 * natural syntactic level, it does contain references to rels
+ 			 * that are nulled by lower outer joins, so don't treat it as a
+ 			 * possible equivalence clause.
+ 			 *
+ 			 * XXX in the future this restriction could be relaxed, but there
+ 			 * are a couple of stumbling blocks.  First, we'd have to verify
+ 			 * that the no-delay property applies to each side of the equality
+ 			 * separately, and second, the equivalence machinery would have to
+ 			 * be improved to create output RestrictInfos with appropriate
+ 			 * (possibly non-empty) nullable_relids values.
+ 			 */
+ 			maybe_equivalence = false;
+ 		}
  		else
  		{
  			/*
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 624b7455f365fd1b49c582a368e692cc2f584f74..58606fa95af4f9497d06ce9a95d3e57407e46845 100644
*** a/src/test/regress/expected/join.out
--- b/src/test/regress/expected/join.out
*** SELECT qq, unique1
*** 2554,2560 
   ->  Hash
 ->  Seq Scan on int8_tbl b
 ->  Index Scan using tenk1_unique2 on tenk1 c
!  Index Cond: (unique2 = COALESCE((COALESCE(a.q1, 0::bigint)), (COALESCE(b.q2, (-1)::bigint
  (8 rows)
  
  SELECT qq, unique1
--- 2554,2560 
   ->  Hash
 ->  Seq Scan on int8_tbl b
 ->  Index Scan using tenk1_unique2 on tenk1 c
!  Index Cond: (COALESCE((COALESCE(a.q1, 0::bigint)), (COALESCE(b.q2, (-1)::bigint))) = unique2)
  (8 rows)
  
  SELECT qq, unique1
*** select b.unique1 from
*** 2833,2838 
--- 2833,2882 
  (5 rows)
  
  --
+ -- test handling of potential equivalence clauses above outer joins
+ --
+ explain (costs off)
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2
+   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
+   QUERY PLAN  
+ --
+  Nested Loop Left Join
+    Filter: ((COALESCE(b.thousand, 123) = a.q1) AND (a.q1 = COALESCE(b.hundred, 123)))
+    ->  Seq Scan on int8_tbl a
+    ->  Index Scan using tenk1_unique2 on tenk1 b
+          Index Cond: (a.q1 = unique2)
+ (5 rows)
+ 
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2

Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Daniel Farina
On Wed, Oct 17, 2012 at 3:24 PM, Josh Berkus j...@agliodbs.com wrote:
 Daniel,

 Unfortunately I myself see little evidence of the vast, vast --
 several nines of vast -- majority of folks using rules, and as I said:
 as a thought experiment, merely one solved bug is worth more to me
 than rules from what I know at this time.

 Again, the answer to this is to run an aggressively promoted survey, so
 that we can have data, rather than speculation by -hackers.

I think that's great, but I am cynical enough to believe that after
such surveys we should be prepared to turn back if the insertion
of a deprecation warning into Postgres generates more data (e.g.
complaints).  I'm quite happy with a long and even uncertain process;
depending on what happens, deprecation may have to be put off for a very
long time.

I don't usually like to push so insistently, but I felt inclined to
because I did not feel that, in the beginning, those proposing
that we even talk about the idea got a very evenhanded response.  Your
sentiments may vary, but I feel this is a justified one, now.

 Finally, putting aside the use cases you are able to positively
 identify from your personal experience, I think it's reasonable to
 put in a message of intent-to-deprecate and reverse or revise course
 as more data appears.  Perhaps the thinking should be: intent to
 aggressively gather data to enable deprecation rather than a final
 deprecation decision and plan, full stop.

 Exactly.

 In fact, I'll go further and say that I believe we will be deprecating
 RULEs eventually.  It's merely a question of how long that will take and
 what we need to document, announce and implement before then.

 I would tend to say "well, they're not hurting anyone, why not keep
 them?"  Except that we're gathering an increasing number of features
 (RETURNING, FDWs, CTEs, Command triggers) which don't work well together
 with RULEs.  That puts us in danger of turning into MySQL ("Sorry, you
 can't use Full Text Search with transactions"), which is not a direction
 we want to go in.

Sounds very reasonable to me.  Also, contains some good reasons for
deprecation I had not thought of.

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 I would tend to say well, they're not hurting anyone, why not keep
 them? Except that we're gathering an increasing number of features
 (RETURNING, FDWs, CTEs, Command triggers) which don't work well together
 with RULEs.

Really?  On what do you base that claim?  The only one of those that I
might believe is command triggers, but AFAIK we only have/plan command
triggers for DDL, so there's no overlap.

I'm fairly annoyed by the entire tenor of this conversation, because
the people who are hollering the loudest seem to be people who have
never actually touched any of the rules code, but nonetheless seem
prepared to tell those of us who have what to spend our time on.

Now having said that, I would definitely like to see rules in their
current form go away eventually.  But not without a substitute.
Triggers are not a complete replacement, and no amount of wishful
thinking makes them so.

Perhaps it would be more profitable to try to identify the pain points
that make people so eager to get rid of rules, and then see if we could
alleviate them.  One big problem I know about offhand is the
multiple-evaluation risk, which seems at least in principle fixable.
What others are there?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-10-17 Thread Tom Lane
Kevin Grittner kgri...@mail.com writes:
 To put that another way, it should be done at a time when it is sure
 that no query sees indisvalid = true and no query has yet seen
 indisready = false.  Patch attached.  Will apply if nobody sees a
 problem with it.

The above statement of the requirement doesn't seem to match what you
put in the comment:

 +  * All predicate locks on the index are about to be made invalid. 
 Promote
 +  * them to relation locks on the heap. For correctness this must be done
 +  * after the index was last seen with indisready = true and before it is
 +  * seen with indisvalid = false.

and the comment is rather vaguely worded too ("last seen" by what?).
Please wordsmith that a bit more.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Identity projection

2012-10-17 Thread Kyotaro HORIGUCHI
Ah, it's too late. I'll re-submit updated versions of my patches
left over from the last CF.

 Hi, I've marked this patch as Returned with Feedback (thanks Tom).
 Please submit an updated version to the upcoming commitfest.  Thanks.

I'm sorry and thank you.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order

2012-10-17 Thread Joachim Wieland
On Wed, Oct 17, 2012 at 5:43 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 (I tested the new pg_dump with 8.2 and HEAD and also verified it passes
 pg_upgrade's make check.  I didn't test with other server versions.)

I also tested against 8.3 and 8.4 since 8.4 is the version that
introduced pg_get_function_identity_arguments. The included testcase
fails on 8.3 and succeeds on 8.4 (pg_dump succeeds in both cases of
course but it's only ordered deterministically in 8.4+).


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-10-17 Thread 花田 茂
Hi Tomas,

On 2012/10/17, at 20:45, Tomas Vondra t...@fuzzy.cz wrote:
 
 On 17.10.2012 12:34, Shigeru HANADA wrote:
 Performance test
 
 I tested 1000 tables case (each is copy of pgbench_branches with 10
 rows) on 1GB shared_buffers server.  Please note that I tested on
 MacBook air, i.e. storage is not HDD but SSD.  Here is the test procedure:
 
 1) loop 1000 times
 1-1) create copy table of pgbench_accounts as accounts$i
 1-2) load 10 rows
 1-3) add primary key
 1-4) select all rows to cache pages in shared buffer
 2) BEGIN
 3) loop 1000 times
 3-1) DROP TABLE accounts$i
 4) COMMIT
 
 I don't think the 'load rows' and 'select all rows' steps are really necessary.
 And AFAIK sequential scans use a small circular buffer so as not to pollute
 shared buffers, so I'd guess the pages are not cached in shared buffers anyway.
 Have you verified that, e.g. by pg_buffercache?

Oops, you're right.  I omitted 1-3 and 1-4 in the actual measurement, but IMO
loading data is necessary to fill shared buffers up, because the number of
buffers dropped during COMMIT is the major factor for this patch.  And, yes, I
verified that all shared buffers were in use after all loading had finished.

 
 Our system creates a lot of working tables (even 100.000) and we need
 to perform garbage collection (dropping obsolete tables) regularly. This
 often took ~ 1 hour, because we're using big AWS instances with lots of
 RAM (which tends to be slower than RAM on bare hw). After applying this
 patch and dropping tables in groups of 100, the gc runs in less than 4
 minutes (i.e. a 15x speed-up).
 
 Hm, my environment seems very different from yours.  Could you show the
 setting of shared_buffers in your environment?  I'd like to make my test
 environment as similar as possible to yours.
 
 We're using m2.4xlarge instances (70G of RAM) with 10GB shared buffers.

Thank you, it's much larger than I expected.  I'm not sure whether my boss
will allow me to use such a rich environment... :(


Here are results of additional measurements on my MBA.

* stats of 1000 bare DROP TABLE statements

The 90th percentile of the patched PG is just 2% slower than master, so it
would be acceptable.

 |  Patched   |   Master
-++
 Average |   1.595 ms |   1.634 ms
 Median  |   1.791 ms |   1.900 ms
 90%ile  |   2.517 ms |   2.477 ms
 Max |  37.526 ms |  24.553 ms

* Total time to complete 1000 DROP TABLEs and COMMIT

   | Patched |  Master
---+-+-
 Bare  | 1595 ms | 1634 ms
 In TX |  672 ms | 1459 ms

Regards,
--
Shigeru HANADA
shigeru.han...@gmail.com






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-10-17 Thread Kevin Grittner
Kevin Grittner wrote:

 It took me a while to spot it, but yeah -- I reversed the field
 names in the comment. :-( The email was right; I fixed the comment.

Hmm. The comment is probably better now, but I've been re-checking
the code, and I think my actual code change is completely wrong. Give
me a bit to sort this out.

-Kevin




Re: [HACKERS] September 2012 commitfest

2012-10-17 Thread Amit Kapila
On Wednesday, October 17, 2012 9:38 PM Alvaro Herrera wrote:
 A week ago, I wrote:
 
  Some numbers: we got 65 patches this time, of which we rejected 4 and
  returned 3 with feedback.  14 patches have already been committed, and
  13 are waiting on their respective authors.  25 patches need review,
 and
  6 are said to be ready for committers.
 
 A week later, numbers have changed but not by much.  We now have 66
 patches (one patch from the previous commitfest was supposed to be moved
 to this one but didn't).  Rejected patches are still 4; there are now 7
 returned with feedback.  17 are committed, and 10 are waiting on
 authors.  21 patches need review.
 
 Most of the remaining Waiting-on-author patches have seen recent
 activity, which is why I didn't close them as returned.  Authors should
 speak up soon -- there is no strict policy, but I don't think I'm going
 to wait beyond this Friday for a final version to be submitted
 that can be considered ready for committer.  

For the patch Trim trailing NULL columns, I have provided the requested
performance data and completed the review. There are only a few review
comments left to be addressed.
So would it be acceptable for me to address them myself and mark the patch
as Ready for Committer, or what else would be the way to proceed with this
patch if the author doesn't respond?

With Regards,
Amit Kapila.





Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-10-17 Thread Kevin Grittner
Kevin Grittner wrote:
 Hmm. The comment is probably better now, but I've been re-checking
 the code, and I think my actual code change is completely wrong.
 Give me a bit to sort this out.

I'm having trouble seeing a way to make this work without rearranging
the code for concurrent drop so that it reaches a state where it has set
indisvalid = false, made that visible to all processes, and ensured
that all scans of the index are complete -- while indisready is still
true. That is the point where TransferPredicateLocksToHeapRelation()
could be safely called. Then we would need to set indisready = false,
make that visible to all processes, and ensure that all access to the
index is complete. I can't see how it can work to set both flags at
the same time. I want to sleep on it to see if I can come up with any
other way, but right now that's the only way I'm seeing to make DROP
INDEX CONCURRENTLY compatible with SERIALIZABLE transactions. :-(
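
To spell out that ordering, here is a comment-style sketch (everything
other than TransferPredicateLocksToHeapRelation() is a descriptive
placeholder, not actual API):

    /*
     * Sketch of the only ordering that seems safe:
     *
     * 1. Set indisvalid = false; make it visible to all backends; wait
     *    until no scan can still be reading the index.  Writers keep
     *    maintaining the index because indisready is still true.
     *
     * 2. TransferPredicateLocksToHeapRelation(indexRelation);
     *    -- safe exactly here: no reader can take new predicate locks
     *    -- on the index, but the index is still a valid lock target
     *
     * 3. Set indisready = false; make it visible; wait for all
     *    remaining access to the index to finish.
     *
     * 4. Physically drop the index.
     */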

-Kevin




Re: [HACKERS] Deprecating RULES

2012-10-17 Thread Neil Tiffin

On Oct 17, 2012, at 4:45 PM, Daniel Farina wrote:

 On Wed, Oct 17, 2012 at 1:12 PM, Josh Berkus j...@agliodbs.com wrote:
 On 10/17/12 12:57 PM, Daniel Farina wrote:
 I'll have to register my disagreement then, in the special case where
 a feature becomes so obscure that many people no longer have a widespread
 intuition about what it's good for or how it's used.  Tom also said build the
 replacement, and without an itemization of use cases, I don't even know
 what that would look like -- perhaps such knowledge is assumed, but I
 think it's assumed wrongly, so perhaps there just needs to be some
 education.  At best you could define what to build somewhat
 tautologically from the mechanism used by RULES, and that's not a very
 good way to go about it, methinks.
 
 [use case, redacted, although worth independent consideration]
 
 Putting it as "Andrew and Josh need to enumerate these cases, or forever
 be silent" is quite unfair to our users.  Andrew and I hardly represent
 the entire scope of PostgreSQL app developers.  Enumerating the cases,
 finding replacements for them, and documenting migrations needs to be a
 group effort.
 
 Unfortunately I myself see little evidence of the vast, vast --
 several nines of vast -- majority of folks using rules, and as I said:
 as a thought experiment, merely one solved bug is worth more to me
 than rules, from what I know at this time.  If I had a wealth of user
 pain to draw upon, I would have argued in opposition to their deprecation.
 But I don't, so I am cautiously in favor of pipelining a slow
 deprecation, even though I can only be hurt by the process tactically

I am a lurker here, and as such understand that I have no standing.  But I do 
write internal applications using PostgreSQL, and it seems to me that the 
direction forward is clear.  I've just gone back and read the 9.2 documentation 
on Rules.  It presents Rules as a currently supported, even preferred, solution 
to many problems.  So, as previously stated and I think pretty much agreed, the 
docs must be changed.  I did not pick up from the docs any hint of the problems 
mentioned in the various emails.

With that said, having read each email, there are some politics here that do 
not make sense to me.

Are these the facts?

1. Rules are required in core.  For example, that is how views are 
implemented (see the sketch just after this list).
2. There are some, possibly fringe, use cases where Rules are the best solution.
3. There are many uses of Rules whose implementation is fragile, or even 
broken.

4. There is a desire to make Rules internal core functionality only.
or
5. There is a desire to eliminate Rules altogether.

6. There is new functionality that does not work correctly in the presence of 
Rules (i.e., the Rules code is not kept up to date).
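
A quick illustration of #1 -- every view is just a relation with an ON
SELECT rewrite rule attached (sketch; any recent psql session):

    CREATE TABLE t (a int);
    CREATE VIEW v AS SELECT a FROM t;

    -- the view's SELECT behavior is literally a rule named _RETURN
    SELECT ev_class::regclass AS view, rulename
    FROM pg_rewrite
    WHERE ev_class = 'v'::regclass;

    --  view | rulename
    -- ------+----------
    --  v    | _RETURN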

Given #1 and #2, it seems foolish to me (not understanding the politics) to 
consider deprecation.

The real issue is: should Rules be visible to users?

As an application developer, I do not use Rules, because they are nonstandard 
and my code will be used with different back ends, so personally I have no skin 
in this decision.  But logically, I think it is silly to consider deprecation 
at this time.  The time to consider deprecation is when no core functionality 
depends on Rules.  Until then, there is nothing to be gained by deprecation, 
and there is no reason to piss off users by deprecating code that has to be 
maintained anyway.

So I would move the docs to the internals section, state that Rules are not 
recommended for use in user SQL and may be deprecated in the future, then 
leave things alone for a couple of years until the way forward becomes clear.  
If developers want to deprecate Rules, then write the code that eliminates 
Rules from being required for core functionality.

It seems to me that eventually either Rules will suffer bit rot and it will 
become clear that it is time to remove all traces, or Rules will be maintained 
(albeit possibly with reduced scope) and will continue as core functionality 
based on need.

Neil


[HACKERS] hash_search and out of memory

2012-10-17 Thread Hitoshi Harada
If an OOM happens during expand_table() in hash_search_with_hash_value()
for RelationCacheInsert, the hash table entry is allocated and stored
in the hash table, but idhentry->reldesc remains NULL.  Since the OOM
causes AbortTransaction(), AtEOXact_RelationCache() then dereferences
this NULL pointer and we hit SIGSEGV.

The fix would be either to catch the OOM error with PG_TRY() and undo the
hash entry, or to do the expansion first and insert the entry afterwards.
The latter is a bit ugly because we would have to recalculate the hash
bucket after deciding to expand, so the former looks better.  Can you
think of other solutions?
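
A rough sketch of the former, from the caller's side (a fragment only;
the names follow relcache.c conventions, but this is illustrative, not a
proposed patch):

    RelIdCacheEnt *idhentry;
    bool           found;

    PG_TRY();
    {
        idhentry = (RelIdCacheEnt *) hash_search(RelationIdCache,
                                                 (void *) &(relation->rd_id),
                                                 HASH_ENTER, &found);
        idhentry->reldesc = relation;
    }
    PG_CATCH();
    {
        /*
         * The entry may or may not have made it into the table;
         * HASH_REMOVE is harmless if it is absent.
         */
        hash_search(RelationIdCache, (void *) &(relation->rd_id),
                    HASH_REMOVE, NULL);
        PG_RE_THROW();
    }
    PG_END_TRY();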

Thanks,
-- 
Hitoshi Harada

