[HACKERS] automatically updating security barrier views

2014-04-07 Thread Drew Crawford
Hello list,

I am posting here since Craig Ringer suggested user feedback on this feature at 
this time may make a difference for 9.4 inclusion.  I apologize if I am in the 
wrong place.

Row-level security is probably THE major area where the FOSS databases lag 
behind proprietary databases like Oracle or DB2.  I understand that RLS 
was originally targeted for 9.3, slipped, was retargeted for 9.4, and will slip 
again.  This is unfortunate, but I realize it's a hard problem.

It is possible for me to get about 90% of the way there by clever use of views. 
 However, views are not intrinsically secure, and well-crafted queries can leak 
the underlying table information.  It is not immediately clear to me, as a 
user, whether there is a way to lock down the user account hard enough to prevent 
this, but I suspect the answer is no.

In theory security-barrier views solve this problem, but since you cannot 
INSERT or UPDATE a security-barrier view, this only works for read-only 
applications.  For read-write scenarios, some other solution must be found for 
the write path.
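For context, the read-only workaround looks roughly like this (a minimal sketch; the table, view, and role names are hypothetical, not from the thread):

```sql
create table documents (id int, owner text, body text);

-- security_barrier forces the owner filter to run before any
-- user-supplied (potentially leaky) functions in a query
create view my_documents with (security_barrier) as
    select id, owner, body
    from documents
    where owner = current_user;

grant select on my_documents to app_user;  -- hypothetical role
```

Reads through such a view cannot leak filtered rows via clever quals, but as noted above, INSERT/UPDATE against it is rejected in 9.3, so writes need another path.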

If UPDATE/INSERT on security-barrier views slips to 9.5, the impact to me 
personally is that I will have to create some kind of MITM proxy or write some 
kind of privilege-elevated process to handle the write path.

I am willing to take on some tasks in support of solving this problem in 9.4, 
but my unfamiliarity with the codebase means that realistically I’m not of much 
help.  Certainly, if there is something I could be doing, please let me know.

Drew






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Autonomous Transaction (WIP)

2014-04-07 Thread Pavel Stehule
Hello

+1 for the feature
-1 for the Oracle syntax - it is highly inconsistent with Postgres

Autonomous transactions should be usable everywhere - not only in plpgsql

Regards

Pavel


2014-04-07 6:06 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com:

  I would like to propose an “Autonomous Transaction” feature for 9.5.
 Details are mentioned below:



 What is an Autonomous Transaction?

 An autonomous transaction has its own COMMIT and ROLLBACK scope, ensuring
 that its outcome does not affect the caller's uncommitted changes.
 Additionally, COMMITs and ROLLBACKs in the calling transaction do
 not affect the changes that were finalized when the autonomous
 transaction completed. The properties of an autonomous transaction are:

 1.  The autonomous transaction does not see uncommitted changes made
 by the main transaction, and does not share locks or resources with the main
 transaction.

 2.  Changes made in an autonomous transaction are visible to other
 transactions upon its commit. Thus, users can
 access the updated information without having to wait for the main
 transaction to commit.

 3.  Autonomous transactions can start other autonomous transactions.
 There is no limit, other than resource limits, on how many levels of
 autonomous transactions can be started.



 Use-case:

 There are many use-cases for this feature. One of them is
 illustrated below.

 Say a procedure is defined which performs some operation on the
 database and, in case of failure of the operation on the main table, records
 the failure information in a separate relation. Because of current
 transaction behavior, once the main table operation fails, the whole
 transaction rolls back, and the error logged in the error relation is
 lost as well, even though it might be required for future analysis.

 To solve this, we can use an autonomous transaction as
 shown below:

 CREATE OR REPLACE FUNCTION operation(err_msg IN VARCHAR) RETURNS void AS $$
 BEGIN
     INSERT INTO at_test(id, description) VALUES (998, 'Description for 998');
     INSERT INTO at_test(id, description) VALUES (999, NULL);
 EXCEPTION
     WHEN OTHERS THEN
         PRAGMA AUTONOMOUS_TRANSACTION;
         INSERT INTO error_logs(id, log_time, err_msg)
             VALUES (nextval('errno'), timenow(), err_msg);
         COMMIT;
         RAISE not_null_violation;
 END;
 $$ LANGUAGE plpgsql;

 So once we execute the above function, the second INSERT fails; the
 exception handler then starts an autonomous transaction, logs the
 error information in a separate table, and commits it. So although the
 operation on table at_test fails and rolls back, the error information
 persists in the error_logs table. After execution of the function, the
 records in the two tables will be as below:

 postgres=# select * from error_logs;
  id |      log_time       | err_msg
 ----+---------------------+---------
   5 | 2014-01-17 19:57:11 | error
 (1 row)

 postgres=# select * from at_test;
  id | description
 ----+-------------
 (0 rows)



 Syntax:

 The syntax to start an autonomous transaction could be:

     PRAGMA AUTONOMOUS_TRANSACTION;

 This can be used with independent SQL commands, from procedures, and from triggers.



 Implementation:

 The implementation of autonomous transactions is based on the existing
 sub-transaction and main-transaction machinery. Most of that implementation
 is reused for autonomous transactions as well. Brief details about
 the same:



 Autonomous Transaction Storage:

 As for the main transaction, structure PGXACT is used to store main
 transactions, which are created in shared memory of size:

 (number of processes) * sizeof(struct PGXACT)
 Similarly, a new structure will be defined to store autonomous transactions:

 struct PGAutonomousXACT
 {
     TransactionId   xid;
     TransactionId   xmin;
     /* stores the level below the main transaction, as for sub-transactions */
     int             nestingLevel;
     struct XidCache subxids;
     bool            overflowed;
     bool            delaychkpt;
     uint            nxids;
 };

 All structure members of PGAutonomousXACT are the same as in PGXACT
 except nestingLevel, which stores the nesting level of the
 transaction.

 Similar to the main transaction, the memory allocated to store autonomous
 transactions will be:

 (number of processes) * sizeof(struct PGAutonomousXACT) * MAX_AUTO_TX_LEVEL

 where MAX_AUTO_TX_LEVEL is the maximum number of nested autonomous
 transaction levels.

 Unlike the main transaction, an autonomous transaction cannot be accessed
 directly. It is accessed using an offset:

 (process number) * MAX_AUTO_TX_LEVEL + (current autonomous transaction level)

 

Re: [HACKERS] Autonomous Transaction (WIP)

2014-04-07 Thread Craig Ringer
On 04/07/2014 12:06 PM, Rajeev rastogi wrote:


 Syntax to create autonomous transaction can be as:
 
 PRAGMA AUTONOMOUS TRANSACTION;

Wouldn't you want to use SET TRANSACTION for this?

Or a suffix on BEGIN, like BEGIN AUTONOMOUS TRANSACTION ?

What's the logic behind introducing PRAGMA ?


If you wanted to use that syntax for Oracle compatibility you'd need to use:

PRAGMA AUTONOMOUS_TRANSACTION;

(note the underscore). But really, this would not be a pragma at all;
PostgreSQL doesn't really have the concept. Calling it that would just
be misleading.




 Starting an Autonomous Transaction:
 
 Starting of autonomous transaction will be exactly same as starting
 sub-transaction.

If you don't want it to dirty read data from the parent tx, or inherit
parent locks, then it cannot be the same at all.

 2.  Freeing of all resource and popping of previous transaction
 happens in the same way as sub-transaction.

I'm not sure what you mean here.


Overall, this looks like a HUGE job to make work well. I know some
others have been doing work along the same lines, so hopefully you'll be
able to collaborate and share ideas.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Pending 9.4 patches

2014-04-07 Thread Craig Ringer
On 04/05/2014 03:57 AM, Andres Freund wrote:

 c07) Updatable security barrier views.
  This needs a serious look by a committer.


I've been exercising it via row security and it's been looking pretty
solid. It isn't a huge or intrusive patch, and it's seen several rounds
of discussion during its development and refinement. (Thanks Dean).

In some ways it'd be nicer to do it by splitting resultRelation into two
(row read source, and relation to write modified tuples into), but this
turns out to be rather complex and exceedingly intrusive. We might need
to do it someday - at that point, it wouldn't be overly hard to change
updatable s.b. views over to working that way, but only once the major
surgery required to remove the assumptions about resultRelation was done.

Having this in place in 9.4 would allow people to build row-security
like features in applications, and ease the path of row security into 9.5.

 r04) Row-security based on Updatable security barrier views
  This one's fate seems to be hard to judge without c07.

Open issues remain with this patch, and resources for working on it in
9.4 have run out.

It is not ready for commit. A core bugfix with locking in security
barrier views is required before the regression tests can be fixed up
properly, for one thing. Tom also expressed concerns about how plan
invalidation works, though it's not yet clear whether that was just
miscommunication about how it works on my part or whether there's a
concrete problem there.

I'd really love to finish this off for 9.4, but other projects have to
come first.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Autonomous Transaction (WIP)

2014-04-07 Thread Ian Barwick

On 07/04/14 15:50, Craig Ringer wrote:

On 04/07/2014 12:06 PM, Rajeev rastogi wrote:



Syntax to create autonomous transaction can be as:

 PRAGMA AUTONOMOUS TRANSACTION;


Wouldn't you want to use SET TRANSACTION for this?

Or a suffix on BEGIN, like BEGIN AUTONOMOUS TRANSACTION ?

What's the logic behind introducing PRAGMA ?


If you wanted to use that syntax for Oracle compatibility you'd need to use:

 PRAGMA AUTONOMOUS_TRANSACTION;

(note underscore).


FWIW, the implementation in the patch uses PRAGMA 
AUTONOMOUS_TRANSACTION; the space is presumably a typo.



Regards

Ian Barwick


--
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Including replication slot data in base backups

2014-04-07 Thread Michael Paquier
On Fri, Apr 4, 2014 at 9:57 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 For 9.4, clearly yes; this would change the semantics of recovery, and
 this is not something wise to do at the end of a development cycle.
 For 9.5 though, this is a different story. It clearly depends on whether
 this is thought useful enough to change how recovery fetches WAL
 files (in this case by scanning existing repslots). There are other
 things to consider as well, for example: do we reset the
 restart_lsn of a repslot if the needed WAL files are not there anymore, or
 abort recovery? I haven't worked much with repslots though...
Coming back to that, I am still wondering whether, for the time being, it
would not be better to note in the pg_basebackup documentation that
replication slot information is not included in a backup, as in the attached
patch.
Regards,
-- 
Michael
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml 
b/doc/src/sgml/ref/pg_basebackup.sgml
index 6ce0c8c..4305788 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -590,6 +590,13 @@ PostgreSQL documentation
or an older major version, down to 9.1. However, WAL streaming mode (-X
stream) only works with server version 9.3.
   /para
+
+  para
+   The backup will not include information about replication slots
+   (see xref linkend=streaming-replication-slots) as it is not
+   guaranteed that a node in recovery will have WAL files required for
+   a given slot.
+  /para
  /refsect1
 
  refsect1

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] polymorphic SQL functions has a problem with domains

2014-04-07 Thread Pavel Stehule
2014-04-02 17:19 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:

 Pavel Stehule pavel.steh...@gmail.com writes:
  I was told that it is impossible to use polymorphic functions together
  with domain types; see:

   create domain xx as numeric(15);

   create or replace function g(anyelement, anyelement)
   returns anyelement as
   $$ select $1 + $2 $$
   language sql immutable;

   postgres=# select g(1::xx, 2::xx);
   ERROR:  return type mismatch in function declared to return xx
   DETAIL:  Actual return type is numeric.
   CONTEXT:  SQL function "g" during inlining

 That example doesn't say you can't use polymorphic functions with domains.
 It says that this particular polymorphic function definition is wrong:
 it is not making sure its result is of the expected data type.  I don't
 recall right now whether SQL functions will apply an implicit cast on the
 result for you, but even if they do, an upcast from numeric to some domain
 over numeric wouldn't be implicit.


I thought about this issue again, and I think it is a PostgreSQL bug.

We can do a safe transformation from the parent type to the domain,

and returning the result requires the same transformation (in this case) - so
enforcing a cast (not only a binary cast) should be safe.

Otherwise, CAST(var AS var) would be useful and could help too.

Regards

Pavel Stehule



 regards, tom lane



Re: [HACKERS] Get more from indices.

2014-04-07 Thread Etsuro Fujita

Hi Horiguchi-san,

Sorry for not reviewing this patch in the last CF.

(2014/03/10 16:21), Kyotaro HORIGUCHI wrote:

Oops! I found a bug in this patch. The previous v8 patch missed
the case where build_index_pathkeys() could build partial
pathkeys from the index tlist.

This causes the following situation:

===
=# \d cu11
     Table "public.cu11"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
 c      | integer |
 d      | text    |
Indexes:
    "cu11_a_b_idx" UNIQUE, btree (a, b)

=# explain (costs off) select * from cu11 order by a, c, d;
               QUERY PLAN
---------------------------------------
 Index Scan using cu11_a_b_idx on cu11
(1 row)
===

Here a simple ORDER BY a, c, d on a table with an index on
columns (a, b) results in a plain index scan, which cannot produce the
requested order.

The attached v9 patch fixes this by adding a check for this case to
index_pathkeys_are_extensible(), and rebases to current HEAD.


Good catch!

ISTM that the biggest concern about this patch is whether it's 
worth complicating the code, because the range of application of the 
patch is not so wide as it stands.  So, all we need to do is show important use 
cases that prove the effectiveness of the patch.  Sorry, I can't come up 
with a good idea, though.


Thanks,

Best regards,
Etsuro Fujita




Re: [HACKERS] PQputCopyData dont signal error

2014-04-07 Thread Martijn van Oosterhout
On Tue, Apr 01, 2014 at 01:53:13PM -0400, Robert Haas wrote:
  One of the things you mentioned is "I often find it necessary to refer to
  existing examples of code when trying to figure out how to do things
  correctly."  I couldn't agree more.  Haven't seen one yet, but found plenty
  of discussion that tap danced around one or more of the components of the
  copy, put, end paradigm.  Maybe I should have just asked for a sample code
  snippet, but didn't after a day or so of frustration trying to piece
  together other people's incomplete samples.
 
 FWIW, I've generally found that the best examples are what's in the
 core distribution.  I'd go and look at a tool like psql or pg_restore
 and find the code that handles this, and then copy it and cut it down
 to what you need.

To move the conversation along:

https://github.com/postgres/postgres/blob/master/src/bin/psql/copy.c#L664

Seems possibly even more robust than most people will code, but it's
had a lot of real world testing.
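For reference, a cut-down sketch of that copy/put/end sequence (a sketch only, not the psql code itself: error handling is minimal, and the connection plus the COPY ... FROM STDIN command are assumed to have been issued already):

```c
#include <stdio.h>
#include <libpq-fe.h>

/* Send one buffer through an already-started COPY ... FROM STDIN.
 * Returns 0 on success, -1 on error (details via PQerrorMessage). */
int
copy_rows(PGconn *conn, const char *data, size_t len)
{
    if (PQputCopyData(conn, data, (int) len) != 1)
        return -1;                      /* send failed */

    if (PQputCopyEnd(conn, NULL) != 1)  /* NULL = end COPY normally */
        return -1;

    /* The COPY's final status arrives as a regular result; skipping
     * this step is exactly why errors appear to go unsignalled. */
    PGresult *res = PQgetResult(conn);
    int ok = (PQresultStatus(res) == PGRES_COMMAND_OK);
    if (!ok)
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
    PQclear(res);

    /* drain any remaining results, as psql's copy handling does */
    while ((res = PQgetResult(conn)) != NULL)
        PQclear(res);

    return ok ? 0 : -1;
}
```

The key point matching this thread's subject: PQputCopyData can return 1 even when the server will reject the data; the error only surfaces in the PGresult fetched after PQputCopyEnd.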

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] Autonomous Transaction (WIP)

2014-04-07 Thread Rajeev rastogi
On 07 April 2014 12:20, Craig Ringer wrote:
 
  Syntax to create autonomous transaction can be as:
 
  PRAGMA AUTONOMOUS TRANSACTION;
 
 Wouldn't you want to use SET TRANSACTION for this?
 
 Or a suffix on BEGIN, like BEGIN AUTONOMOUS TRANSACTION ?
 
 What's the logic behind introducing PRAGMA ?
 
 
 If you wanted to use that syntax for Oracle compatibility you'd need to
 use:
 
 PRAGMA AUTONOMOUS_TRANSACTION;
 
 (note the underscore). But really, this would not be a pragma at all;
 PostgreSQL doesn't really have the concept. Calling it that would just
 be misleading.

Actually it is the same as Oracle (i.e. PRAGMA AUTONOMOUS_TRANSACTION); it was just 
a typo in the previous mail.
But if that is not acceptable either, we can discuss and come up with a syntax 
everyone agrees on.

 
  Starting an Autonomous Transaction:
 
  Starting of autonomous transaction will be exactly same as starting
  sub-transaction.
 
 If you don't want it to dirty read data from the parent tx, or inherit
 parent locks, then it cannot be the same at all.

While starting a sub-transaction, we just initialize the required resources 
and
link them to the parent transaction, which we also require for an autonomous 
transaction.
I am not able to see the issue you mention above with this.
Please let me know if I am missing something or have misunderstood your concern.

  2.  Freeing of all resource and popping of previous transaction
  happens in the same way as sub-transaction.
 
 I'm not sure what you mean here.

It means that during commit of an autonomous transaction, all resources are 
freed in the same way as for a sub-transaction.
Also, the current autonomous transaction is popped and we point back to the 
parent transaction in the same way as for a sub-transaction.
 
 Overall, this looks like a HUGE job to make work well. I know some
 others have been doing work along the same lines, so hopefully you'll
 be able to collaborate and share ideas.

Yes, it is a huge amount of work, so I have proposed it at the beginning of 9.5 
so that we can have multiple rounds of discussion and address
all concerns.
I have also proposed finishing this feature in multiple stages: in the first 
patch, we can support autonomous transactions from
standalone SQL commands only, which will set up the infrastructure for future 
work in this area.

Using the WIP patch sent, I have done basic testing and it works fine.

Any comments?

Thanks and Regards,
Kumar Rajeev Rastogi







Re: [HACKERS] Autonomous Transaction (WIP)

2014-04-07 Thread Rajeev rastogi
On 07 April 2014 12:12, Pavel Stehule wrote:

+1 for feature
Thanks

-1 for Oracle syntax - it is hardly inconsistent with Postgres
We can discuss and come up with a syntax everyone agrees on.
Autonomous transactions should be used everywhere - not only in plpgsql

Yes, you are right. I am not planning to support this only via plpgsql.  Initially 
we can support it
using standalone SQL commands, and then later build on this 
infrastructure
to support it from plpgsql and triggers.

Thanks and Regards,
Kumar Rajeev Rastogi



Re: [HACKERS] Autonomous Transaction (WIP)

2014-04-07 Thread Pavel Stehule
2014-04-07 11:59 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com:

  On 07 April 2014 12:12, Pavel Stehule wrote:

  +1 for feature

 Thanks



 -1 for Oracle syntax - it is hardly inconsistent with Postgres

 We can discuss and come out with the syntax based on everyone agreement.

 Autonomous transactions should be used everywhere - not only in plpgsql



 Yes you are right. I am not planning to support only using plpgsql.
 Initially we can support this

 Using the standalone SQL-commands and then later we can enhance based on
 this infrastructure

 to be used using plpgsql, triggers.


ok

I have thought about this feature for a long time.

I think this should be a fully isolated transaction - it should not
be a subtransaction, because then you could break database consistency (RI).

I am happy that someone is doing this job

Regards

Pavel




 Thanks and Regards,

 Kumar Rajeev Rastogi





Re: [HACKERS] Autonomous Transaction (WIP)

2014-04-07 Thread Atri Sharma
On Mon, Apr 7, 2014 at 3:41 PM, Pavel Stehule pavel.steh...@gmail.comwrote:




 2014-04-07 11:59 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com:

   On 07 April 2014 12:12, Pavel Stehule wrote:

  +1 for feature

 Thanks



 -1 for Oracle syntax - it is hardly inconsistent with Postgres

 We can discuss and come out with the syntax based on everyone agreement.

 Autonomous transactions should be used everywhere - not only in plpgsql



 Yes you are right. I am not planning to support only using plpgsql.
 Initially we can support this

 Using the standalone SQL-commands and then later we can enhance based on
 this infrastructure

 to be used using plpgsql, triggers.


 ok

 long time I though about this feature.

 I am thinking so this should be fully isolated transaction - it should not
 be subtransaction, because then you can break database consistency - RI



I am missing something here, but how does making it a subtransaction break
consistency? Isn't that what should actually be happening, so that the
autonomous transaction's changes are actually visible until the parent
transaction commits?

What am I missing here?

Regards,

Atri


Re: [HACKERS] Autonomous Transaction (WIP)

2014-04-07 Thread Andres Freund
On 2014-04-07 15:46:42 +0530, Atri Sharma wrote:
 On Mon, Apr 7, 2014 at 3:41 PM, Pavel Stehule pavel.steh...@gmail.comwrote:
 I am missing something here, but how does making it a subtransaction break
 consistency? Isn't that what should actually be happening so that the
 autonomous transaction's changes are actually visible till the parent
 transaction commits?
 
 What am I missing here?

START TRANSACTION;
INSERT INTO referenced_to_table ... id = 1;
START AUTONOMOUS SUBTRANSACTION;
INSERT INTO referencing_table id = 1 ...;
COMMIT AUTONOMOUS SUBTRANSACTION;
ROLLBACK;
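Spelled out with concrete tables (the AUTONOMOUS SUBTRANSACTION syntax above is illustrative only; nothing like it exists, and the table names here are made up):

```sql
create table parent (id int primary key);
create table child  (id int references parent);

begin;
insert into parent values (1);      -- visible only inside this transaction

-- If the autonomous transaction were a plain subtransaction, it would
-- see the uncommitted parent row and accept this insert:
--   start autonomous subtransaction;
insert into child values (1);
--   commit autonomous subtransaction;  -- child(1) becomes durable now

rollback;  -- parent(1) vanishes, leaving child(1) referencing nothing
```

Because the autonomous commit is durable regardless of the outer transaction's fate, the FK check must not have been satisfied by the outer transaction's uncommitted data - hence full isolation rather than a subtransaction.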

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Autonomous Transaction (WIP)

2014-04-07 Thread Pavel Stehule
2014-04-07 12:16 GMT+02:00 Atri Sharma atri.j...@gmail.com:




 On Mon, Apr 7, 2014 at 3:41 PM, Pavel Stehule pavel.steh...@gmail.comwrote:




 2014-04-07 11:59 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com:

   On 07 April 2014 12:12, Pavel Stehule wrote:

  +1 for feature

 Thanks



 -1 for Oracle syntax - it is hardly inconsistent with Postgres

 We can discuss and come out with the syntax based on everyone agreement.

 Autonomous transactions should be used everywhere - not only in plpgsql



 Yes you are right. I am not planning to support only using plpgsql.
 Initially we can support this

 Using the standalone SQL-commands and then later we can enhance based on
 this infrastructure

 to be used using plpgsql, triggers.


 ok

 long time I though about this feature.

 I am thinking so this should be fully isolated transaction - it should
 not be subtransaction, because then you can break database consistency - RI



 I am missing something here, but how does making it a subtransaction break
 consistency? Isnt that what should actually be happening so that the
 autonomous transaction's changes are actually visible till the parent
 transaction commits?


the commit of an autonomous transaction doesn't depend on the outer transaction. So
anything you do in it should be independent of the outer transaction.

Pavel




 What am I missing here?

 Regards,

 Atri



Re: [HACKERS] automatically updating security barrier views

2014-04-07 Thread Stephen Frost
* Drew Crawford (d...@sealedabstract.com) wrote:
 I am willing to take on some tasks in support of solving this problem in 9.4, 
 but my unfamiliarity with the codebase means that realistically I’m not of 
 much help.  Certainly, if there is something I could be doing, please let me 
 know.

Download the patch, build PG with it, test it and try to break it.

Review the docs.

Thanks,

Stephen




Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-04-07 Thread Florian Pflug
On Apr5, 2014, at 09:55 , Dean Rasheed dean.a.rash...@gmail.com wrote:
 On 5 April 2014 08:38, Dean Rasheed dean.a.rash...@gmail.com wrote:
 [snip] releasing it in this state feels a little half-baked
 to me.
 
 
 I regret writing that almost as soon as I sent it. The last of those
 queries is now over 10 times faster than HEAD, which is certainly
 worthwhile. What bugs me is that there is so much more potential in
 this patch.

Well, the perfect is the enemy of the good as they say. By all means,
let's get rid of the O(n^2) behaviour in 9.5, but let's get a basic
version into 9.4.

 I think it's pretty close to being committable, but I fear that time
 is running out for 9.4.

I'm not aware of any open issues in the basic patch, other than
scaling the reported calling statistics by nloops, which should be
trivial. I'll post an updated patch later today.

I don't really expect all the add-on patches to make it into 9.4 -
they don't seem to have gotten much attention yet - but at least
the inverse transition functions for the basic arithmetic aggregates
should be doable, I hope.
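For readers following along, the basic patch exposes inverse transition functions through CREATE AGGREGATE's moving-aggregate options; a minimal integer sum using the built-in int4pl/int4mi functions might look like this (a sketch assuming the option names committed for 9.4):

```sql
CREATE AGGREGATE my_sum (int4) (
    sfunc    = int4pl,   -- forward transition
    stype    = int4,
    msfunc   = int4pl,   -- moving-aggregate forward transition
    minvfunc = int4mi,   -- inverse transition: subtract a departing row
    mstype   = int4
);

-- In a moving window frame, departing rows are removed via int4mi
-- instead of recomputing the whole aggregate for every frame:
SELECT my_sum(x) OVER (ORDER BY x ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
FROM generate_series(1, 4) AS t(x);
```

This is what turns the O(n^2) restart-per-frame behaviour into a single forward pass for aggregates that can define an inverse.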

best regards,
Florian Pflug





Re: [HACKERS] Proposal: COUNT(*) (and related) speedup

2014-04-07 Thread Rajeev rastogi
On 04 April 2014 18:09, Joshua Yanovski wrote:

 The counter would be updated only by VACUUM, which would perform the 
 same computation performed by the COUNT operation but add it 
 permanently to counter just before it set the visible_map bit to 1 (I 
 am not certain whether this would require unusual changes to WAL 
 replay or not).

I think there is one more disadvantage with this (or maybe I have missed 
something): 
the user may not use count(*), or may use it just once after creating the 
proposed new kind of index, but VACUUM will have to keep performing an operation 
equivalent to an iterative count(*), which might degrade VACUUM performance.

Thanks and Regards,
Kumar Rajeev Rastogi




Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-04-07 Thread Tom Lane
Amit Kapila amit.kapil...@gmail.com writes:
 On Sat, Apr 5, 2014 at 8:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 How's that going to work during pg_ctl stop?  There's no -o switch
 provided.

 As there's no -o switch, there won't be a problem of getting the wrong event
 source name from the server due to command-line options, which you mentioned
 upthread - or is there something I am missing about it?

AFAICS, the sole argument for trying to do things this way is to log to
the same event_source the running postmaster is using.  But -C will not
provide that; it will only tell you what the postmaster *would* use if
it were freshly started right now.  If -o had been used at postmaster
start time, an inquiry done by pg_ctl stop (or reload, etc) won't match.
Another, possibly more plausible, failure mode is if the entry in
postgresql.conf has been edited since the running postmaster looked at it.

Admittedly, all of these cases are kind of unusual.  But that's all the
more reason, IMO, to be wary of sending our error messages to an
unexpected place in such cases.  Somebody who's trying to debug a
configuration problem doesn't need the added complication of trying to
figure out where pg_ctl's error messages might have gone.

 You are right that with the current patch approach, we will miss many
 failure cases, and the way you suggest below (a new switch)
 is more appropriate to fix this issue. Another thought that occurred to me
 is: why not change the failures which occur before the appropriate
 event_source is set up to report on the console? The main reason for using
 the event log to report errors in pg_ctl is that it can be used as a service
 (register/unregister/...) on Windows, and all the work we do before setting
 the event_source is not related to its usage as a service.

AFAICS, pg_ctl already reports to stderr if stderr is a tty.  This whole
issue only comes up when pg_ctl itself is running as a service (which
I guess primarily means starting/stopping the postgresql service?).
So that puts extra weight on trying to be sure that error messages go
to a predictable place; the user's going to be hard-pressed to debug
background failures without that.

regards, tom lane




Re: [HACKERS] PQputCopyData dont signal error

2014-04-07 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 To move the conversation along:

 https://github.com/postgres/postgres/blob/master/src/bin/psql/copy.c#L664

 Seems possibly even more robust than most people will code, but it's
 had a lot of real world testing.

Note that the looping behavior there is actually rather new, and is
probably unnecessary for 99% of applications; it would depend on what
your ambitions are for dealing gracefully with connection loss (and
even then, I think we were forced into this by pre-existing decisions
elsewhere in psql about where and how we handle connection loss).

The important thing for this question is the error reporting that
occurs at lines 692-694.

regards, tom lane




Re: [HACKERS] Pending 9.4 patches

2014-04-07 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 On 04/05/2014 03:57 AM, Andres Freund wrote:
 r04) Row-security based on Updatable security barrier views
 This one's fate seems to be hard to judge without c07.

 Open issues remain with this patch, and resources for working on it in
 9.4 have run out.

 It is not ready for commit. A core bugfix with locking in security
 barrier views is required before the regression tests can be fixed up
 properly, for one thing. Tom also expressed concerns about how plan
 invalidation works, though it's not yet clear whether that was just
 miscommunication about how it works on my part or whether there's a
 concrete problem there.

 I'd really love to finish this off for 9.4, but other projects have to
 come first.

Given that, I think we should go ahead and mark this one Returned With
Feedback.  It's past time to be punting anything that doesn't have a
serious chance of getting committed for 9.4.

regards, tom lane




Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD

2014-04-07 Thread Robert Haas
On Sun, Apr 6, 2014 at 1:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2014-04-06%2017%3A04%3A00

 TRAP: FailedAssertion("!(FastPathStrongRelationLocks->count[fasthashcode] > 0)", File: "lock.c", Line: 1240)
 [53418a51.6a08:2] LOG:  server process (PID 27631) was terminated by signal 6
 [53418a51.6a08:3] DETAIL:  Failed process was running: create table gc1() 
 inherits (c1);

Uggh.  That's unfortunate, but not terribly surprising: I didn't think
that missing volatile was very likely to be the cause of this.  Have
we been getting random failures of this type since the fastlock stuff
went in, and we're only just now noticing?  Or did some recent change
expose this problem?  I'm a bit suspicious of the patches to
static-ify stuff, since that might cause the compiler to think it
could move things across function calls that it hadn't thought
move-able before, but FastPathStrongLocks references would seem to be
the obvious candidate for that, and volatile-izing it ought to have
fixed it.  I would think.

One thing I noticed, looking at this particular failure, is that at
the time that "create table gc1() inherits (c1)" failed an assertion,
another backend was inside "select blockme()", and specifically inside
of "select count(*) from tenk1 a, tenk1 b, tenk1 c".  I can't
help but suspect that the bug is somehow concurrency-related, so the
presence of concurrent activity seems like a clue, but I can't figure
out the relationship.  blockme() shouldn't be taking any strong
relation locks.  Unless AV decided to truncate a table just then, the
process that failed should be the only one in the system with any
strong relation lock, so if there's a race, what is it racing against?

In the failure on prairiedog back on March 25th...

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2014-03-25%2003%3A15%3A03

...there's a lot more concurrent activity.  The process that failed
the assertion was running CREATE TABLE t3 (name TEXT, n INTEGER);
concurrently, the following other queries were running:

SELECT oid AS clsoid, relname, relnatts + 10 AS x
   INTO selinto_schema.tmp2
   FROM pg_class WHERE relname like '%b%';

CREATE TEMP TABLE foo (id integer);

create temp table t3 as select generate_series(-1000,1000) as x;

CREATE TEMPORARY TABLE bitwise_test(
  i2 INT2,
  i4 INT4,
  i8 INT8,
  i INTEGER,
  x INT2,
  y BIT(4)
);

DROP TABLE savepoints;

create temp table tt1(f1 int);

CREATE TEMP TABLE arrtest2 (i integer ARRAY[4], f float8[], n
numeric[], t text[], d timestamp[]);

COMMIT PREPARED 'regress-one';

All but the first and last of those take a strong relation lock, so
some kind of race could certainly account for that failure.  It's also
interesting that COMMIT PREPARED is shown as being involved here; that
code is presumably much more rarely executed than the code for the
regular commit or abort paths, and might therefore be thought more
likely to harbor a bug.  In particular, there's this code in
LockRefindAndRelease:

/*
 * Decrement strong lock count.  This logic is needed only for 2PC.
 */
if (decrement_strong_lock_count &&
    ConflictsWithRelationFastPath(&lock->tag, lockmode))
{
    uint32      fasthashcode = FastPathStrongLockHashPartition(hashcode);

    SpinLockAcquire(&FastPathStrongRelationLocks->mutex);
    FastPathStrongRelationLocks->count[fasthashcode]--;
    SpinLockRelease(&FastPathStrongRelationLocks->mutex);
}

I notice that this code lacks an
Assert(FastPathStrongRelationLocks->count[fasthashcode] > 0).  I think
we should add one.  If this code is somehow managing to decrement one
of the counts when it's already zero, the next process whose lock gets
mapped to this partition would increment the count from the maximum
value that can be stored back to zero.  Then, when it goes to release
the strong lock, it finds that the count is already zero and goes
boom.  This theory could even explain the new crash, since the COMMIT
PREPARED stuff has already happened by the point where rover_firefly
failed; since the lock tags are hashed to create fasthashcode,
variation in which objects got which OIDs due to concurrency in the
regression test could cause the failure to move around or even escape
detection altogether.  Now, even if the 2PC code is the problem, that
doesn't explain exactly what's wrong with the above logic, but it
would help narrow down where to look.
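The wraparound failure mode described here is easy to model outside the server. The sketch below is a hypothetical Python illustration, not PostgreSQL code: it treats the per-partition counter as a uint32, so one spurious decrement at zero wraps to the maximum value, the next strong-lock acquisition brings it back to zero, and the following release then finds a zero count.

```python
# Toy model of the suspected bug: the strong-lock partition counter is
# an unsigned 32-bit integer, so an extra decrement at zero wraps
# around.  Illustration only, not server code.
UINT32_MAX = 2**32 - 1

def wrap32(n):
    # emulate C unsigned 32-bit arithmetic
    return n % (2**32)

count = 0                      # partition counter: no strong locks held
count = wrap32(count - 1)      # hypothesized spurious decrement (2PC path?)
assert count == UINT32_MAX     # counter has wrapped to the maximum

count = wrap32(count + 1)      # next backend takes a strong lock here...
assert count == 0              # ...and the counter is back at zero

# At release time, an Assert(count > 0) would now fail, matching the
# TRAP seen on the buildfarm.
```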

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tds_fdw for Sybase and MS SQL Server

2014-04-07 Thread Mike Blackwell
Excellent!  I have an application for this.  I'll give it a look.

Thanks!
Mike

__
Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RR Donnelley
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com
http://www.rrdonnelley.com




On Sun, Apr 6, 2014 at 9:03 AM, Geoff Montee geoff.mon...@gmail.com wrote:

 If anyone is interested, I've developed a foreign data wrapper that can be
 used to connect to Sybase databases and Microsoft SQL Server. You can get
 it on GitHub here:

 https://github.com/GeoffMontee/tds_fdw

 I did my testing with FreeTDS, an open source TDS library.

 I've talked to Greg Smith about this in the past. I don't really have the
 expertise or resources to develop and test this much more than I already
 have. If anyone likes it and would like to take over ownership of the
 code, feel free to do so.

 I actually developed this over a year ago, so it doesn't support write
 operations added in PostgreSQL 9.3.

 By the way, thanks to everyone who spoke at PGConf NYC 2014. It was very
 informative.

 Geoff Montee




Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD

2014-04-07 Thread Andres Freund
On 2014-04-07 10:06:07 -0400, Robert Haas wrote:
 I'm a bit suspicious of the patches to
 static-ify stuff, since that might cause the compiler to think it
 could move things across function calls that it hadn't thought
 move-able before, but FastPathStrongLocks references would seem to be
 the obvious candidate for that, and volatile-izing it ought to have
 fixed it.  I would think.

Hm. It generally might be interesting to get a few !X86 buildfarms
running builds with LTO enabled. That might expose some dangerous
assumptions more easily.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Windows exit code 128 ... it's baaack

2014-04-07 Thread Andres Freund
On 2014-04-05 11:05:09 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-02-27 19:14:13 -0500, Tom Lane wrote:
  I looked at the postmaster log for the ongoing issue on narwhal
  (to wit, that the contrib/dblink test dies the moment it tries
  to do anything dblink-y), and looky here what the postmaster
  has logged:
 
  One interesting bit about this is that it seems to work in 9.3...
 
 Well, yeah, it seems to have been broken somehow by the Windows
 linking changes we did awhile back to try to ensure that missing
 PGDLLIMPORT markers would be detected reliably.  Which we did not
 back-patch.

Hard to say, since there have been no working HEAD builds on narwhal
for so long :(.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] ipc_test

2014-04-07 Thread Robert Haas
On Fri, Apr 4, 2014 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-04 09:31:01 -0400, Robert Haas wrote:
 Does anybody care about being able to compile ipc_test as a standalone
 binary any more?

 I don't.

 I can't remember the last time I had use for it either.  +1 for removal.

OK, done.  One less thing to worry about when committing!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 10:20 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-04-07 10:06:07 -0400, Robert Haas wrote:
 I'm a bit suspicious of the patches to
 static-ify stuff, since that might cause the compiler to think it
 could move things across function calls that it hadn't thought
 move-able before, but FastPathStrongLocks references would seem to be
 the obvious candidate for that, and volatile-izing it ought to have
 fixed it.  I would think.

 Hm. It generally might be interesting to get a few !X86 buildfarms
 running builds with LTO enabled. That might expose some dangerous
 assumptions more easily.

I strongly suspect that will break stuff all over the place.  We can
either get compiler barriers working for real, or we can start
volatile-izing every reference in an LWLock-protected critical
section.  Hint: the second one is insane.

That might be off-topic for this issue at hand, though...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-04-07 Thread Robert Haas
On Sat, Apr 5, 2014 at 10:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 So basically, I think having pg_ctl try to do what this patch proposes
 is a bad idea.

I'm not a Windows person either, but I tend to agree.  I can't think
that this is going to be very robust ... and if it's not going to be
robust, what's the point?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD

2014-04-07 Thread Andres Freund
On 2014-04-07 10:45:51 -0400, Robert Haas wrote:
 On Mon, Apr 7, 2014 at 10:20 AM, Andres Freund and...@2ndquadrant.com wrote:
  Hm. It generally might be interesting to get a few !X86 buildfarms
  running builds with LTO enabled. That might expose some dangerous
  assumptions more easily.
 
 I strongly suspect that will break stuff all over the place.  We can
 either get compiler barriers working for real, or we can start
 volatile-izing every reference in an LWLock-protected critical
 section.  Hint: the second one is insane.

You don't have to convince me. The way there is where I am not sure
we're agreeing.

It didn't break a few months back for x86 on light loads, btw. Not that
that's saying much.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] gsoc knn spgist

2014-04-07 Thread Robert Haas
On Fri, Apr 4, 2014 at 11:56 AM, Костя Кузнецов chapae...@yandex.ru wrote:
 I want to implement kNN for SP-GiST. I don't have questions about kNN
 itself, but I have questions about implementing the interface.

 I modified pg_am.h (set amcanorderbyop to true for the SP-GiST index),
 and also modified pg_amop.h, adding:

 DATA(insert (4015   600 600 15 o 517 4000 1970 ));

 explain SELECT * FROM quad_point_tbl ORDER BY p <-> '-2,50'

  Sort  (cost=1219.31..1246.82 rows=11003 width=16)
    Sort Key: ((p <-> '(-2,50)'::point))
    ->  Index Only Scan using sp_quad_ind on quad_point_tbl
        (cost=0.15..480.70 rows=11003 width=16)
 what am I doing wrong?

I don't know.  I think there are a lot of possibilities based on the
information provided.  A simple explanation would be that the planner
considered the plan you want and found the other one cheaper, but the
real cause could well be something else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD

2014-04-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Apr 6, 2014 at 1:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2014-04-06%2017%3A04%3A00

 Uggh.  That's unfortunate, but not terribly surprising: I didn't think
 that missing volatile was very likely to be the cause of this.

Yeah.  That was a bug, but evidently it's not the bug we're looking for.

 Have
 we been getting random failures of this type since the fastlock stuff
 went in, and we're only just now noticing?  Or did some recent change
 expose this problem?

Not sure.  I used to rely on the pgbuildfarm-status-green daily digests
to cue me to look at transient buildfarm failures, but that list has been
AWOL for months.  However, I'm pretty sure this has not been happening
ever since 9.2, so yeah, it's at least become more probable in the last
few months.

 I'm a bit suspicious of the patches to
 static-ify stuff, since that might cause the compiler to think it
 could move things across function calls that it hadn't thought
 move-able before, but FastPathStrongLocks references would seem to be
 the obvious candidate for that, and volatile-izing it ought to have
 fixed it.  I would think.

Keep in mind also that prairiedog is running a pretty old gcc (4.0.1 if
memory serves), so I'd not expect it to be doing any crazy optimizations.
I suspect we are looking at some plain old logic bug, but as you say it's
hard to guess where exactly.

 [ LockRefindAndRelease ] lacks an
 Assert(FastPathStrongRelationLocks->count[fasthashcode] > 0).  I think
 we should add one.

Absolutely.

regards, tom lane




Re: [HACKERS] GSoC 2014: Implementing clustering algorithms in MADlib

2014-04-07 Thread Hai Qian
Hi Maxence,

This is really really good. Hopefully we will have a fruitful summer, and
make MADlib a better product.

Thanks

Hai

--
Pivotal (http://www.gopivotal.com/)
A new platform for a new era


On Sat, Apr 5, 2014 at 7:14 AM, Maxence Ahlouche maxence.ahlou...@gmail.com
 wrote:

 Hi all,

 2014-04-03 9:04 GMT+02:00 Hai Qian hq...@gopivotal.com:

 Hi Maxence,

 We are very glad that you are enthusiastic about MADlib. Your proposal
 looks very nice. The algorithms that you proposed will be very useful, and can
 serve as a good start. After learning about the C++ abstraction layer and
 testing suite, you should be able to quickly grasp how to program for
 MADlib. We can fill in the details of the plan later. The conflict in your
 time schedule shouldn't be a problem.

 It would be nice if you could provide a little bit more introduction to
 these algorithms and their importance in the proposal.

 Overall it is very good. And thank you again

 Hai


 As you requested, I've described the clustering algorithms on my GitHub
 [0]. I'm copying it at the end of this email, just in case.
 In the case of OPTICS, I still need some time to grasp the details of how
 it works, so I've only included a basic description of this one.

 Regards,
 Maxence

 [0] https://github.com/viodlen/gsoc_2014/blob/master/algorithm_details.rst


 Details about the different clustering algorithms
 =================================================

 This file describes the k-means algorithm, which is currently the only
 clustering algorithm implemented in MADlib, and the k-medoids and
 OPTICS algorithm, which I plan to implement this summer, as a GSoC
 project. I'll also explain the DBSCAN algorithm, as OPTICS is only an
 extension of this one.

 The goal of all these algorithms is to determine clusters in a set of
 points. While finding an optimal clustering is NP-hard, these
 algorithms produce good approximate solutions efficiently, even though
 the result is usually not perfect.

 I've tried not to add too many details, and taken some shortcuts, so
 there may be some inaccuracies. I wanted to keep this readable (not
 sure that goal was achieved, though), but if anyone wants more
 details, I'll gladly add them.

 k-means
 -------

 The k-means algorithm is the one implemented in MADlib. It selects
 *k* points to use as initial centroids (centers of clusters), either
 randomly from the dataset or as set by the user. *k* must be
 defined by the user; the algorithm is unable to guess the number of
 clusters.

 All the points in the dataset are then assigned to the nearest
 centroid, thus making *k* clusters.

 The next step is to compute the new centroids as a mean of all the
 points in a cluster. Then reassign all the points to the new
 centroids, and start all over again until there is no change in
 cluster assignation.

 This algorithm is usually pretty fast (even though it can be made to
 take an exponential time to converge). Still, it is easy to get a
 wrong result because of local convergence, e.g. having one cluster
 split in two parts by the algorithm, or two clusters merged. It is
 also pretty sensitive to outliers (points that don't obviously belong
 to any cluster), and the final result depends greatly on the initial
 centroids.

 This algorithm doesn't work well with non-euclidean distance, in
 general.

 Another thing to note is that k-means results in Voronoi-cell-shaped
 clusters, which is not always what we want.
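The loop described above can be sketched in a few lines (a toy illustration of the classic algorithm, not MADlib's in-database implementation):

```python
import random

def kmeans(points, k, iterations=100):
    """Toy k-means on 2-D points: assign each point to its nearest
    centroid, recompute centroids as cluster means, repeat until stable."""
    centroids = random.sample(points, k)    # random initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # squared Euclidean distance to each centroid
            i = min(range(k),
                    key=lambda j: (p[0] - centroids[j][0]) ** 2
                                  + (p[1] - centroids[j][1]) ** 2)
            clusters[i].append(p)
        new_centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[j]          # keep old centroid if cluster empty
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:      # assignments stable: converged
            break
        centroids = new_centroids
    return centroids, clusters
```

As the text notes, the result depends on the random initial centroids, so real implementations typically run several restarts and keep the best.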

 k-medoids
 ---------

 The k-medoids algorithm works the same way as k-means, save for a few
 exceptions.

 With k-medoids, the centroids are always points of the dataset (and
 are then called medoids). The new medoid of each cluster is the point
 that minimizes the sum of pairwise dissimilarities (in other words,
 the point for which the average distance to other points in the
 cluster is minimal). This makes the algorithm less sensitive to
 outliers than k-means.

 This algorithm is computationally more intensive than k-means, but
 still fast.

 As for k-means, it is possible to run the algorithm several times with
 different initial centroids, and get the best result (i.e. the one
 that minimizes the sum of distances from points to their centroids).
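The medoid-update step described above — choose the cluster member with minimal summed distance to the other members — can be sketched as follows (toy code, not MADlib's; `dist` is any caller-supplied dissimilarity function):

```python
def update_medoid(cluster, dist):
    """Return the cluster member whose total distance to all other
    members is minimal; that member becomes the new medoid."""
    return min(cluster, key=lambda m: sum(dist(m, p) for p in cluster))

# With 1-D points and absolute difference as the dissimilarity,
# the medoid of [1, 2, 10] is 2 (total distance 1 + 0 + 8 = 9).
```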

 DBSCAN
 ------

 DBSCAN (*Density-Based Spatial Clustering of Applications with Noise*)
 is a clustering algorithm based on the density of clusters.

 This addresses several limitations of k-means and k-medoids: it does
 not assign a cluster to outliers, and allows the detection of
 weird-shaped clusters. Moreover, it doesn't need to be told the number
 of clusters.

 A point is called "dense" if enough other points are near enough to
 it, where "enough other points" and "near enough" are defined by the
 parameters *min_points* and *epsilon*.

 *min_points* therefore represents the minimum number of points
 required to make a cluster, and *epsilon* is the maximum distance a
 point can be from a cluster to be considered part of this cluster.

 For every unassigned point, count 
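The density test at the heart of DBSCAN — a point is "dense" (a core point) when at least *min_points* neighbours lie within *epsilon* of it — could be sketched like this (illustrative toy code only):

```python
def is_core_point(p, points, epsilon, min_points, dist):
    """True when at least min_points members of `points` (including p
    itself, if present) lie within epsilon of p."""
    neighbours = [q for q in points if dist(p, q) <= epsilon]
    return len(neighbours) >= min_points
```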

Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 10:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 [ LockRefindAndRelease ] lacks an
 Assert(FastPathStrongRelationLocks->count[fasthashcode] > 0).  I think
 we should add one.

 Absolutely.

Turns out there were two places missing such an assertion: the 2PC
path, and the abort-strong-lock-acquire path.  I added an assertion to
both.  In theory, if the problem is coming from either of those
places, this might even increase the frequency of buildfarm failures,
since it removes the necessity for another normal-path release to hit
the same partition afterwards.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Robert Haas
On Fri, Apr 4, 2014 at 4:29 PM, Greg Stark st...@mit.edu wrote:
 On Fri, Apr 4, 2014 at 12:15 PM, Robert Haas robertmh...@gmail.com wrote:
  Perhaps I shouldn't lay my own guilt trip on other committers --- but
  I think it would be a bad precedent to not deal with the existing patch
  queue first.
 
  +1

 +1

 I don't think we have to promise a strict priority queue and preempt all
 other development. But I agree loosely that processing patches that have
 been around should be a higher priority.

 I've been meaning to do more review for a while and just took a skim through
 the queue. There are only a couple I feel I can contribute with so I'm going
 to work on those and then if it's still before the feature freeze I would
 like to go ahead with Peter's patch. I think it's generally a good patch.

To be honest, I think that's just flat-out inappropriate.  There were
over 100 patches in this CommitFest and there's not a single committed
patch that has your name on it even as a reviewer, let alone a
committer.  When a committer says, hey, I'm going to commit XYZ, that
basically forces anybody who might have an objection to it to drop
what they're doing and object fast, before it's too late.  In other
words, the people who just said that they are too busy reviewing
patches that were timely submitted and don't want to divert effort
from that to handle patches that weren't are going to have to do that
anyway, or lose their right to object.  I think that's unfair.  You're
essentially leveraging a commit bit that you haven't used in more than
three years to try to push a patch that was submitted months too late
to the head of the review queue - and, just to put icing on the cake,
it just so happens that you and the patch author work for the same
employer.  I have no objection to people committing patches written by
others who work at the same company, but only if those patches have
gone through a full, fair, and public review, with ample opportunity
for other people to complain if they don't like it.  That is obviously
not the case here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-04-07 Thread Dean Rasheed
On 7 April 2014 14:09, Florian Pflug f...@phlo.org wrote:
 On Apr5, 2014, at 09:55 , Dean Rasheed dean.a.rash...@gmail.com wrote:
 On 5 April 2014 08:38, Dean Rasheed dean.a.rash...@gmail.com wrote:
 [snip] releasing it in this state feels a little half-baked
 to me.


 I regret writing that almost as soon as I sent it. The last of those
 queries is now over 10 times faster than HEAD, which is certainly
 worthwhile. What bugs me is that there is so much more potential in
 this patch.

 Well, the perfect is the enemy of the good as they say. By all means,
 let's get rid of the O(n^2) behaviour in 9.5, but let's get a basic
 version into 9.4.

 I think it's pretty close to being committable, but I fear that time
 is running out for 9.4.

 I'm not aware of any open issues in the basic patch, other than
 scaling the reported calling statistics by nloops, which should be
 trivial. I'll post an updated patch later today.

 I don't really expect all the add-on patches to make it into 9.4  -
 they don't seem to have gotten much attention yet - but at least
 the inverse transition functions for the basic arithmetic aggregates
 should be doable I hope.


I've just finished reading through all the other patches, and they all
look OK to me. It's mostly straightforward stuff, so despite the size
it's hopefully all committable once the base patch goes in.

I think that you're probably right that optimising
window_gettupleslot() to eliminate the O(n^2) behaviour can be left to
a later patch --- the existing performance benefits of this patch are
enough to justify its inclusion IMO. It would be nice to include the
trivial optimisation to window_gettupleslot() that I posted upthread,
since it gives such a big improvement for only a few lines of code
changed.

If you post a new patch set, I'll mark it as ready for committer attention.

Regards,
Dean




Re: Fwd: [HACKERS] Proposal: variant of regclass

2014-04-07 Thread Robert Haas
On Sat, Apr 5, 2014 at 1:10 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 The reason for this behavior is that in the out functions (regclassout), we return
 the OID as-is in case it doesn't exist.  One way to fix this is, in the case of
 OID input parameters, to check whether the passed OID exists in the to_* functions
 and return NULL if it doesn't.  I think this way we can retain
 the similarity of input parameters between the types and the to_* functions, and
 making the to_* functions return NULL when the OID doesn't exist makes it
 similar to the case when the user passed a name.

We could do that, but that's not my preferred fix.

 My suggestion is to
 restructure the code so that to_regclass() only accepts a name, not an
 OID, and make its charter precisely to perform a name lookup and
 return an OID (as regclass) or NULL if there's no match.

 How will we restrict users from passing some number in string form?
 Do you mean that in case the user has passed an OID, the to_* functions
 should return NULL, or, if the input is found to be an OID, that the
 to_* functions should return an error?

Each of the _guts functions first handles the case where the input is
exactly "-"; then it checks for an all-numeric value, which is
interpreted as an OID; then it has a lengthy chunk of logic to handle the
case where we're in bootstrap mode; and if none of those cases handle
the situation, then it ends by doing the lookup in the normal way,
in each case marked with a comment that says "Normal case".  I think
that we should do only the last part - what in each case follows the
"Normal case" comment - for the to_reg* functions.

In other words, let's revert the whole refactoring of this file to
create reg*_guts functions, and instead just copy the relevant logic
for the name lookups into the new functions.  For to_regproc(), for
example, it would look like this (untested):

  names = stringToQualifiedNameList(pro_name_or_oid);
  clist = FuncnameGetCandidates(names, -1, NIL, false, false, false);
  if (clist == NULL || clist->next != NULL)
      result = InvalidOid;
  else
      result = clist->oid;
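The behaviour Robert is after — NULL rather than an error or a passed-through OID when the name doesn't resolve to exactly one object — can be modelled in a few lines (hypothetical sketch; `candidates` stands in for the result of FuncnameGetCandidates):

```python
def to_regproc_lookup(candidates):
    """Mimic the proposed to_regproc(): return the single matching OID,
    or None (SQL NULL, i.e. InvalidOid) when there are zero or several
    candidates.  `candidates` is a hypothetical list of candidate OIDs."""
    if len(candidates) != 1:
        return None
    return candidates[0]
```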

With that change, this patch will actually get a whole lot smaller,
change less already-existing code, and deliver cleaner behavior.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Useless Replica Identity: NOTHING noise from psql \d

2014-04-07 Thread Robert Haas
On Sat, Apr 5, 2014 at 7:12 AM, Andres Freund and...@anarazel.de wrote:
 On 2014-04-03 14:49:54 -0400, Andrew Dunstan wrote:
 I've been kind of hoping that someone would step up on both these items, but
 the trail seems to have gone cold.

 I'm going to put out the new buildfarm release with the new module to run
 test_decoding check. But of course It won't have MSVC support.

 These can go on my long TODO list unless someone else gets there first.

 So, I was thinking on how we can improve this situation. There's
 basically one bigger remaining problem besides make check vs. make
 installcheck:
 Currently contrib modules can't easily run isolationtester in a general
 fashion. That's why test_decoding's Makefile has to rig that all
 itself. How about simply improving the contrib support to recognize
 tests in ISOLATION_REGRESS in addition to the current REGRESS? Then we
 can simply train vcregress.pl to pick them up generally, without
 special support for test_decoding.

IMHO, that's a fine idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Including replication slot data in base backups

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 3:04 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Fri, Apr 4, 2014 at 9:57 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 For 9.4, clearly yes, this would change the semantic of recovery and
 this is not something wise to do at the end of a development cycle.
 For 9.5 though, this is a different story. It clearly depends on if
 this is thought useful enough to change how recovery fetches WAL
 files (in this case by scanning existing repslots). There are other
 things to consider as well like for example: do we reset the
 restart_lsn of a repslot if needed WAL files are not here anymore or
 abort recovery? I haven't worked much with repslots though...
 Coming back to that, I am still wondering if for the time being it
 would not be better to add to the pg_basebackup documentation that
 replication slot information is not included in a backup, as per the
 attached patch.

Not sure if this is exactly the right way to do it, but I agree that
something along those lines is a good idea.  I also think, maybe even
importantly, that we should probably document that people using
file-copy based hot backups should strongly consider removing the
replication slots by hand before using the backup.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Fwd: [HACKERS] Proposal: variant of regclass

2014-04-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 In other words, let's revert the whole refactoring of this file to
 create reg*_guts functions, and instead just copy the relevant logic
 for the name lookups into the new functions.

The main discomfort I'd had with this patch was the amount of refactoring
it did; that made it hard to verify and I wasn't feeling like it made for
much net improvement in cleanliness.  If we can do it without that, and
have only as much duplicate code as you suggest, then +1.

regards, tom lane




Re: Fwd: [HACKERS] Proposal: variant of regclass

2014-04-07 Thread Andres Freund
On 2014-04-04 11:18:10 -0400, Robert Haas wrote:
 On Wed, Apr 2, 2014 at 11:27 PM, Amit Kapila amit.kapil...@gmail.com wrote:
  Right, it will get reset in error. However still we need to free for 
  missing_ok
  case and when it is successful in getting typeid. So don't you think it is
  better to just free once before calling LookupTypeName()?
 
  The code is right in its current form as well, it's just a minor suggestion
  for improvement, so if you think current way the code written is okay, then
  ignore this suggestion.
 
 I see.  Here's an updated patch with a bit of minor refactoring to
 clean that up, and some improvements to the documentation.
 
 I was all ready to commit this when I got cold feet.  What's bothering
 me is that the patch, as written, mimics the exact behavior of the
 text-regproc cast, including the fact that supplying an OID,
 written as a number, will always return that OID, whether it exists or
 not:
 
 rhaas=# select to_regclass('1259'), to_regclass('pg_class');
  to_regclass | to_regclass
 -------------+-------------
  pg_class| pg_class
 (1 row)
 
 rhaas=# select to_regclass('12590'), to_regclass('does_not_exist');
  to_regclass | to_regclass
 -------------+-------------
  12590   |
 (1 row)
 
 I think that's unacceptably weird behavior.  My suggestion is to
 restructure the code so that to_regclass() only accepts a name, not an
 OID, and make its charter precisely to perform a name lookup and
 return an OID (as regclass) or NULL if there's no match.
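
Under that proposal, the behavior would be something like this (a sketch
of the suggested semantics only, not of any committed code):

```sql
SELECT to_regclass('pg_class');       -- name lookup succeeds: pg_class
SELECT to_regclass('does_not_exist'); -- no such relation: NULL
SELECT to_regclass('12590');          -- numeric strings no longer pass through as OIDs
```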

There's actually another good reason to not copy regclass's behaviour:

postgres=# CREATE TABLE "123"();
CREATE TABLE
postgres=# SELECT '123'::regclass;
 regclass
----------
 123
(1 row)

I don't think that's fixable for ::regclass, but we shouldn't copy it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Including replication slot data in base backups

2014-04-07 Thread Fabrízio de Royes Mello
Em segunda-feira, 7 de abril de 2014, Robert Haas robertmh...@gmail.com
escreveu:

 On Mon, Apr 7, 2014 at 3:04 AM, Michael Paquier
  michael.paqu...@gmail.com wrote:
  On Fri, Apr 4, 2014 at 9:57 PM, Michael Paquier
  michael.paqu...@gmail.com wrote:
  For 9.4, clearly yes, this would change the semantic of recovery and
  this is not something wise to do at the end of a development cycle.
  For 9.5 though, this is a different story. It clearly depends on if
  this is thought useful enough to change how recovery fetches WAL
  files (in this case by scanning existing repslots). There are other
  things to consider as well like for example: do we reset the
  restart_lsn of a repslot if needed WAL files are not here anymore or
  abort recovery? I haven't worked much with repslots though...
  Coming back to that, I am still wondering if for the time being it
  would not be better to add in pg_basebackup documentation that
  replication slot information is not added in a backup, as per the
  attached patch.

 Not sure if this is exactly the right way to do it, but I agree that
 something along those lines is a good idea.  I also think, maybe even
 importantly, that we should probably document that people using
 file-copy based hot backups should strongly consider removing the
 replication slots by hand before using the backup.


+1

Fabrízio


-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Fri, Apr 4, 2014 at 4:29 PM, Greg Stark st...@mit.edu wrote:
  I've been meaning to do more review for a while and just took a skim through
  the queue. There are only a couple I feel I can contribute with so I'm going
  to work on those and then if it's still before the feature freeze I would
  like to go ahead with Peter's patch. I think it's generally a good patch.
 
 To be honest, I think that's just flat-out inappropriate.  There were
 over 100 patches in this CommitFest and there's not a single committed
 patch that has your name on it even as a reviewer, let alone a
 committer.  

I haven't got any either (except for my little one), which frustrates
me greatly.  Not because I'm looking for credit on the time that I've
spent in discussions, doing reviews, and I could have sworn there was
some patch that I did commit, but because I've not been able to find
the larger chunks of time required to get the more complex patches in.

 When a committer says, hey, I'm going to commit XYZ, that
 basically forces anybody who might have an objection to it to drop
 what they're doing and object fast, before it's too late.  In other
 words, the people who just said that they are too busy reviewing
 patches that were timely submitted and don't want to divert effort
 from that to handle patches that weren't are going to have to do that
 anyway, or lose their right to object.  

I don't agree that this is the case.  We do revert patches from time to
time, when necessary, and issues with this particular patch seem likely
to be found during testing, well in advance of any release, and it's
self contained enough to be reverted pretty easily.  Perhaps it's not
fair to expect everyone to have realized that, but at least any
committer looking at it would, I believe, weigh those against it being a
late patch.

 I think that's unfair.  You're
 essentially leveraging a commit bit that you haven't used in more than
 three years to try to push a patch that was submitted months too late
 to the head of the review queue - and, just to put icing on the cake,
 it just so happens that you and the patch author work for the same
 employer.  I have no objection to people committing patches written by
 others who work at the same company, but only if those patches have
 gone through a full, fair, and public review, with ample opportunity
 for other people to complain if they don't like it.  That is obviously
 not the case here.

As for this- it's disingenuous at best and outright accusatory at worst.
The reason for this discussion is entirely because of PGConf.NYC, imv,
where Peter brought this patch up with at least Greg, Magnus and I.
Also during PGConf.NYC, Greg was specifically asking me about patches
which were in the commitfest that could be reviewed, ideally without
having to go back through threads hundreds of messages long and dealing
with complex parts of the code.  Sadly, as is often the case, the easy
ones get handled pretty quickly and the difficult ones get left behind.

One bad part of the commitfest, imv, is that when a clearly good,
small patch does manage to show up, we don't formally take any of that
into consideration when we are prioritizing patches.  They end up being
informally prioritized and get in quickly at the beginning, but that
doesn't address the reality that those smaller patches likely would get
in without any kind of commitfest process and not letting them in
because of the commitfest process doesn't generally mean that the larger
patches are any more likely to get in- iow, it's not a zero-sum game.
We have quite a few part-time committers (or at least, committers who
have disproportionate amounts of time, or perhaps the difference is in
the size of the blocks of time which can be dedicated to PG) and saying
no, you can't help unless you tackle the hard problems doesn't
particularly move us forward.

All that said, I don't have any particularly good idea of how to fix
any of this- it's not fair to tell the committers who have more time (or
larger blocks of time, etc) you must work the hard problems only
either.  I don't feel Greg's interest in this patch has anything to do
with his current employment and everything to do with the side-talks in
NYC, and to that point, I'm very tempted to go look at this patch myself
because it sounds like an exciting improvement with minimal effort.
Would I feel bad for doing so, with the CustomScan API and Updatable
Security Barrier Views patches still pending?  Sure, but it's the
difference between finding an hour and finding 8.  The hour will come
pretty easily (probably spent half that on this email..), while an
8-hour block, which would likely turn into more, is nigh-on impossible
til at least this weekend.  And, no, 8x one-hour blocks would not be
worthwhile; I've tried that before.

Thanks,

Stephen




Re: Fwd: [HACKERS] Proposal: variant of regclass

2014-04-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 There's actually another good reason to not copy regclass's behaviour:

 postgres=# CREATE TABLE "123"();
 CREATE TABLE
 postgres=# SELECT '123'::regclass;
  regclass
 ----------
  123
 (1 row)

 I don't think that's fixable for ::regclass, but we shouldn't copy it.

I think that's not proving what you thought; the case is correctly handled
if you quote:

regression=# create table "123"(z int);
CREATE TABLE
regression=# select '123'::regclass;
 regclass 
----------
 123
(1 row)

regression=# select '"123"'::regclass;
 regclass 
----------
 "123"
(1 row)

But I agree that we don't want these functions accepting numeric OIDs,
even though ::regclass must.

regards, tom lane




Re: Fwd: [HACKERS] Proposal: variant of regclass

2014-04-07 Thread Andres Freund
On 2014-04-07 12:59:36 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  There's actually another good reason to not copy regclass's behaviour:
 
  postgres=# CREATE TABLE "123"();
  CREATE TABLE
  postgres=# SELECT '123'::regclass;
   regclass
  ----------
   123
  (1 row)
 
  I don't think that's fixable for ::regclass, but we shouldn't copy it.
 
 I think that's not proving what you thought; the case is correctly handled
 if you quote:

I know, but it's a pretty easy mistake to make. Many if not most of the
usages of regclass I have seen were wrong in that way.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 1:01 PM, Stephen Frost sfr...@snowman.net wrote:
 All that said, I don't have any particularly good idea of how to fix
 any of this- it's not fair to tell the committers who have more time (or
 larger blocks of time, etc) you must work the hard problems only
 either.  I don't feel Greg's interest in this patch has anything to do
 with his current employment and everything to do with the side-talks in
 NYC, and to that point, I'm very tempted to go look at this patch myself
 because it sounds like an exciting improvement with minimal effort.
 Would I feel bad for doing so, with the CustomScan API and Updatable
 Security Barrier Views patches still pending?  Sure, but it's the
 difference between finding an hour and finding 8.  The hour will come
 pretty easily (probably spent half that on this email..), while an
 8-hour block, which would likely turn into more, is nigh-on impossible
 til at least this weekend.  And, no, 8x one-hour blocks would not be
 worthwhile; I've tried that before.

If it's only going to take you an hour to address this patch (or 8 to
address those other ones) then you spend a heck of a lot less time on
review for a patch of a given complexity level than I do.  I agree
that it's desirable to slip things in, from time to time, when they're
uncontroversial and obviously meritorious, but I'm not completely
convinced that this is such a case.  As an utterly trivial point, I
find the naming to be less than ideal: "poorman" is not a term I want
to enshrine in our code.  That's not very descriptive of what the
patch is actually doing even if you know what the idiom means, and
people whose first language is not English - many of whom do significant
work on our code - may not.

Now the point is not that that's a serious flaw in and of itself.  The
point is that these kinds of issues deserve to be discussed and agreed
on, and the process should be structured in a way that permits that.
And that discussion will require the time not only of the people who
find this patch more interesting than any other, but also of the
people who just said that they're busy with other things right now,
unless those people want to forfeit their right to an opinion.
Experience has shown that it's a whole lot easier for anyone here to
get a patch changed before it's committed than after it's committed,
so I don't buy your argument that the timing there doesn't matter.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Peter Geoghegan
On Mon, Apr 7, 2014 at 10:23 AM, Robert Haas robertmh...@gmail.com wrote:
 As an utterly trivial point, I
 find the naming to be less than ideal: "poorman" is not a term I want
 to enshrine in our code.  That's not very descriptive of what the
 patch is actually doing even if you know what the idiom means, and
 people whose first language is not English - many of whom do significant
 work on our code - may not.

I didn't come up with the idea, or the name.

-- 
Peter Geoghegan




Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-04-07 Thread Alvaro Herrera
KONDO Mitsumasa wrote:
 Hi all,
 
 I think this patch is completely forgotten, and feel very unfortunate:(
 
 Min, max, and stdev is basic statistics in general monitoring tools,
 So I'd like to push it.

I just noticed that this patch not only adds min,max,stddev, but it also
adds the ability to reset an entry's counters.  This hasn't been
mentioned in this thread at all; there has been no discussion on whether
this is something we want to have, nor on whether this is the right API.

What it does is add a new function pg_stat_statements_reset_time() which
resets the min and max values from all entries, without touching the
stddev state variables or the number of calls.  Note there's no ability
to reset the values for a single entry.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Peter Geoghegan
On Mon, Apr 7, 2014 at 10:42 AM, Andres Freund and...@2ndquadrant.com wrote:
 I didn't come up with the idea, or the name.

 Doesn't mean it needs to be enshrined everywhere. I don't think Robert's
 against putting it in some comments.

That seems reasonable. If someone wants to call what I have here
semi-reliable normalized keys, for example, I have no objection
whatsoever. In fact, I think that's probably a good idea.


-- 
Peter Geoghegan




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 1:23 PM, Robert Haas robertmh...@gmail.com wrote:
 As an utterly trivial point, I
 find the naming to be less than ideal: "poorman" is not a term I want
 to enshrine in our code.  That's not very descriptive of what the
 patch is actually doing even if you know what the idiom means, and
 people whose first language is not English - many of whom do significant
 work on our code - may not.

To throw out one more point that I think is problematic, Peter's
original email on this thread gives a bunch of examples of strxfrm()
normalization that all differ in the first few bytes - but so do
the underlying strings.  I *think* (but don't have time to check right
now) that on my MacOS X box, strxfrm() spits out 3 bytes of header
junk and then 8 bytes per character in the input string - so comparing
the first 8 bytes of the strxfrm()'d representation would amount to
comparing part of the first byte.  If for any reason the first byte is
the same (or similar enough) on many of the input strings, then this
will probably work out to be slower rather than faster.  Even if other
platforms are more space-efficient (and I think at least some of them
are), I think it's unlikely that this optimization will ever pay off
for strings that don't differ in the first 8 bytes.  And there are
many cases where that could be true a large percentage of the time
throughout the input, e.g. YYYY-MM-DD HH:MM:SS timestamps stored as
text.  It seems likely that the patch pessimizes those cases, though of
course there's no way to know without testing.
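
The shared-prefix scenario is easy to sketch in SQL (hypothetical sample
data; the point is that any fixed-size key prefix resolves none of these
comparisons):

```sql
-- Text timestamps from the same day agree on their first 8 bytes.
SELECT substr(t, 1, 8) AS first_8_bytes, count(*)
FROM (VALUES ('2014-04-07 09:15:01'),
             ('2014-04-07 13:01:52'),
             ('2014-04-07 18:44:30')) AS v(t)
GROUP BY 1;
-- first_8_bytes = '2014-04-' for all three rows
```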

Now it *may well be* that after doing some research and performance
testing we will conclude that either no commonly-used platforms show
any regressions or that the regressions that do occur are discountable
in view of the benefits to more common cases.  I just
don't think mid-April is the right time to start those discussions
with the goal of a 9.4 commit; and I also don't think committing
without having those discussions is very prudent.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Andres Freund
On 2014-04-07 10:29:53 -0700, Peter Geoghegan wrote:
 On Mon, Apr 7, 2014 at 10:23 AM, Robert Haas robertmh...@gmail.com wrote:
  As an utterly trivial point, I
  find the naming to be less than ideal: "poorman" is not a term I want
  to enshrine in our code.  That's not very descriptive of what the
  patch is actually doing even if you know what the idiom means, and
  people whose first language is not English - many of whom do significant
  work on our code - may not.
 
 I didn't come up with the idea, or the name.

Doesn't mean it needs to be enshrined everywhere. I don't think Robert's
against putting it in some comments.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Andres Freund
On 2014-04-07 13:01:52 -0400, Stephen Frost wrote:
 I haven't got any either (except for my little one), which frustrates
 me greatly.  Not because I'm looking for credit on the time that I've
 spent in discussions, doing reviews, and I could have sworn there was
 some patch that I did commit, but because I've not been able to find
 the larger chunks of time required to get the more complex patches in.

I am a bit confused. To my eyes there's been a huge number of actually
trivial patches in this commitfest? Even now, there's some:

* Bugfix for timeout in LDAP connection parameter resolution
* Problem with displaying wide tables in psql
* Enable CREATE FOREIGN TABLE (... LIKE ... )
* Add min, max, and stdev execute statement time in pg_stat_statement
* variant of regclass etc.
* vacuumdb: Add option --analyze-in-stages

Are all small patches that don't need major changes before getting committed.

That's after three months. And after a high number of smaller patches
committed by Tom on Friday.

  When a committer says, hey, I'm going to commit XYZ, that
  basically forces anybody who might have an objection to it to drop
  what they're doing and object fast, before it's too late.  In other
  words, the people who just said that they are too busy reviewing
  patches that were timely submitted and don't want to divert effort
  from that to handle patches that weren't are going to have to do that
  anyway, or lose their right to object.
 
 I don't agree that this is the case.  We do revert patches from time to
 time, when necessary, and issues with this particular patch seem likely
 to be found during testing, well in advance of any release, and it's
 self contained enough to be reverted pretty easily.

Given the track record the project seems to have with testing, I don't
have much faith in that claim. But even if it does, it'll only get you
testing on 2-3 platforms, without noticing portability issues.

I think it'd be a different discussion if this were CF-1 or so. But
we're nearly *2* months after the *end* of the last CF.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 If it's only going to take you an hour to address this patch (or 8 to
 address those other ones) then you spend a heck of a lot less time on
 review for a patch of a given complexity level than I do.  

Eh, I don't really time it and I'm probably completely off in reality,
it's more of a relative feeling thing wrt how long it'd take.  I was
also thinking about it from a 'initial review and provide feedback'
standpoint, actually getting to a point of committing something
certainly takes much longer, but I can also kick off tests and do
individual testing in those smaller time blocks.  Reading the code and
understanding it, writing up the feedback email, etc, is what requires
the larger single block.  Perhaps I can work on improving that for
myself and maybe find a way to do it in smaller chunks, but that hasn't
happened in the however-many-years, and there's the whole 'old dog, new
tricks' issue.

 I agree
 that it's desirable to slip things in, from time to time, when they're
 uncontroversial and obviously meritorious, but I'm not completely
 convinced that this is such a case.  As an utterly trivial point, I
 find the naming to be less than ideal: "poorman" is not a term I want
 to enshrine in our code.  That's not very descriptive of what the
 patch is actually doing even if you know what the idiom means, and
 people whose first language is not English - many of whom do significant
 work on our code - may not.

Fair enough.

 Now the point is not that that's a serious flaw in and of itself.  The
 point is that these kinds of issues deserve to be discussed and agreed
 on, and the process should be structured in a way that permits that.

The issue of it being called "poorman"?  That doesn't exactly strike me
as needing a particularly long discussion, nor that it would be
difficult to change later.  I agree that there may be other issues, and
it'd be great to get buy-in from everyone before anything goes in, but
there's really zero hope of that being a reality.

 And that discussion will require the time not only of the people who
 find this patch more interesting than any other, but also of the
 people who just said that they're busy with other things right now,
 unless those people want to forfeit their right to an opinion.

I certainly hope that no committer feels that they forfeit their right
to an opinion about a piece of code because they didn't object to it
before it was committed.  My experience is that committed code gets
reviewed and concerns are raised, at which point it's usually on the
original committer to go back and fix it; which I'm certainly glad for.

 Experience has shown that it's a whole lot easier for anyone here to
 get a patch changed before it's committed than after it's committed,
 so I don't buy your argument that the timing there doesn't matter.

Once code has been released and there are external dependencies on it,
that's obviously an issue.  Ahead of that, I feel like the issue is
really more one of interest- everyone is very interested when code is
about to go in, but once it's in, for whatever reason, the interest
becomes much less to review and comment on it.  I feel like that's a
very long-standing issue as it relates to our 'beta' period because
doing code review just simply isn't fun and the motivation is reduced
once it's been committed.

That makes me wonder about having beta review-fests (in fact, I feel
like that may have been proposed before...), where commits are assigned
out to be reviewed by someone other than the committer/author/original
reviewer, as a way to motivate individuals to go review what has gone
in before we get to release.  That might help us ensure that more of the
committed code *gets* another review before release and reduce the
issues we have post-release.  Just a thought.

Thanks,

Stephen




Re: [HACKERS] Firing trigger if only

2014-04-07 Thread Fabrízio de Royes Mello
On Thu, Apr 3, 2014 at 8:50 AM, Gabriel yu1...@gmail.com wrote:

 Good afternoon all. I have a problem with triggers on PostgreSQL 8.4. I
 have a trigger on a specific table (articles) that fires on update
 statements:

 CREATE OR REPLACE FUNCTION trigger_articles_update()
   RETURNS trigger AS
 $BODY$
 BEGIN
   INSERT INTO article_change(article_id, change_type)
   VALUES (OLD.article_id, 2);
   RETURN NULL;
 END
 $BODY$
   LANGUAGE 'plpgsql' VOLATILE
   COST 100;
 ALTER FUNCTION trigger_articles_update() OWNER TO postgres;

 I have 2 different applications that performs update on table
 articles(written in Delphi and using ZeosDB). My problem is that I want
 trigger to fire only when performing update with first application, but not
 with second.I know that triggers supposed to fire on every change on table,
 but this is a specific problem that I have.Any hint appreciated. 8)


Since version 9.0 of PostgreSQL you can set 'application_name' on the
connection [1] and test it in your trigger using a query like:

regress=# SELECT application_name FROM pg_stat_activity WHERE pid =
pg_backend_pid();
 application_name
------------------
 psql
(1 row)

I strongly recommend that you upgrade from 8.4, because its EOL [2] is
July 2014.

But if an upgrade isn't an option for now, then you can use the old
custom_variable_classes mechanism [3] to identify your application, e.g.:

1) Add to your postgresql.conf:

custom_variable_classes = 'foo'
foo.application_name = 'undefined'

2) Reload your PostgreSQL

3) You can use the following functions to get/set the
'foo.application_name' custom variable:

-- get
SELECT current_setting('foo.application_name');

-- set (the third argument, false, makes the setting session-wide)
SELECT set_config('foo.application_name', 'myapp', false);

4) Now you can use these functions to register the name of your
application and use it in your trigger.
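
Putting the pieces together, the trigger body could test the setting like
this (a sketch only; 'app_one' is a hypothetical application name, and on
8.4 you would read 'foo.application_name' instead of 'application_name'):

```sql
CREATE OR REPLACE FUNCTION trigger_articles_update() RETURNS trigger AS
$BODY$
BEGIN
  -- Only audit updates made by the first application.
  IF current_setting('application_name') = 'app_one' THEN
    INSERT INTO article_change(article_id, change_type)
    VALUES (OLD.article_id, 2);
  END IF;
  RETURN NULL;
END
$BODY$ LANGUAGE plpgsql;
```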

Regards,

[1]
http://www.postgresql.org/docs/current/static/runtime-config-logging.html#GUC-APPLICATION-NAME
[2] http://www.postgresql.org/support/versioning/
[3] http://www.postgresql.org/docs/8.4/static/runtime-config-custom.html
-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2014-04-07 13:01:52 -0400, Stephen Frost wrote:
  I haven't got any either (except for my little one), which frustrates
  me greatly.  Not because I'm looking for credit on the time that I've
  spent in discussions, doing reviews, and I could have sworn there was
  some patch that I did commit, but because I've not been able to find
  the larger chunks of time required to get the more complex patches in.
 
 I am a bit confused. To my eyes there's been a huge number of actually
 trivial patches in this commitfest? Even now, there's some:
 
 * Bugfix for timeout in LDAP connection parameter resolution

I can take a look at that (if no one else wants to speak up about it).

 * Problem with displaying wide tables in psql

That's not without controversy, as I understand it, but I admit that I
haven't been following it terribly closely.

 * Enable CREATE FOREIGN TABLE (... LIKE ... )

This has definitely got issues which are not trivial; see Tom's recent
email on the subject.

 * Add min, max, and stdev execute statement time in pg_stat_statement

This was also quite controversial.  If we've finally settled on this as
being acceptable then perhaps it can get in pretty easily.

 * variant of regclass etc.

This was recently being discussed also.

 * vacuumdb: Add option --analyze-in-stages

Haven't looked at this at all.

 Are all small patches that don't need major changes before getting committed.

That strikes me as optimistic.  I do plan to go do another pass through
the commitfest patches before looking at other things (as Greg also said
he would do); thanks for bringing up the ones you feel are more
manageable - it'll help me focus on them.

 Given the track record the project seems to have with testing, I don't
 have much faith in that claim. But even if it does, it'll only get you
 testing on 2-3 platforms, without noticing portability issues.

This would be another case where it'd be nice if we could give people
access to the buildfarm w/o having to actually commit something.

 I think it'd be a different discussion if this were CF-1 or so. But
 we're nearly *2* months after the *end* of the last CF.

There wouldn't be any discussion if it was CF-1 as I doubt anyone would
object to it going in (or at least not as strongly..), even if it was
submitted after CF-1 was supposed to be over with remaining patches.
It's the threat of getting punted to the next release that really makes
the difference here, imv.

Thanks,

Stephen




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 1:58 PM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 If it's only going to take you an hour to address this patch (or 8 to
 address those other ones) then you spend a heck of a lot less time on
 review for a patch of a given complexity level than I do.

 Eh, I don't really time it and I'm probably completely off in reality,
 it's more of a relative feeling thing wrt how long it'd take.  I was
 also thinking about it from a 'initial review and provide feedback'
 standpoint, actually getting to a point of committing something
 certainly takes much longer, but I can also kick off tests and do
 individual testing in those smaller time blocks.  Reading the code and
 understanding it, writing up the feedback email, etc, is what requires
 the larger single block.  Perhaps I can work on improving that for
 myself and maybe find a way to do it in smaller chunks, but that hasn't
 happened in the however-many-years, and there's the whole 'old dog, new
 tricks' issue.

It takes me a big block of time, too, at least for the initial review.

 The issue on it being called poorman?  That doesn't exactly strike me
 as needing a particularly long discussion, nor that it would be
 difficult to change later.  I agree that there may be other issues, and
 it'd be great to get buy-in from everyone before anything goes in, but
 there's really zero hope of that being a reality.

Really?  I think there have been just about zero patches that have
gone in in the last (ugh) three months that have had significant
unaddressed objections from anyone at the time they were committed.
There has certainly been an enormous amount of work by a whole lot of
people to address objections that have been raised, and generally that
has gone well.  I will not pretend that every patch that has gone in
is completely well-liked by everyone and I am sure that is not the
case.  Nevertheless I think we've done pretty well.  Now the people
whose stuff has not got in - and may not get in - may well feel that
their stuff got short shrift, and I'm not going to deny that there's a
problem there.  But breaking from our usual procedure for this patch
is just adding to that unfairness, not making anything better.

 And that discussion will require the time not only of the people who
 find this patch more interesting than any other, but also of the
 people who just said that they're busy with other things right now,
 unless those people want to forfeit their right to an opinion.

 I certainly hope that no committer feels that they forfeit their right
 to an opinion about a piece of code because they didn't object to it
 before it was committed.  My experience is that committed code gets
 reviewed and concerns are raised, at which point it's usually on the
 original committer to go back and fix it; which I'm certainly glad for.

Sure, people can object to anything whenever they like.  But it
becomes an uphill battle once it goes in, unless the breakage is
rather flagrant.

Regardless, if this patch had had multiple, detailed reviews finding
lots of issues that were then addressed, I'd probably be keeping my
trap shut and hoping for the best.  But it hasn't.  Any patch of this
size is going to have nitpicky stuff that needs to be addressed, if
nothing else, and points requiring some modicum of public discussion,
and there hasn't been any of that.  That suggests to me that it just
hasn't been thoroughly reviewed yet, at least not in public, and that
should happen long before anyone talks about picking it up for commit.
 If this patch had been timely submitted and I'd picked it up right
away, I would have expected at least three weeks to elapse between the
time my first review was posted and the time all the details were
nailed down, even assuming no more serious problems were found, and
starting that ball rolling now means looking forward to a commit
around the end of April assuming things go great. That seems way too
late to me; IMHO, we should already be mostly in mop-up mode now
(hence my recent DSM patch cleanup patch, which no one has commented
on...).

And I think it's likely that, in fact, we'll find cases where this
regresses performance rather badly, for reasons sketched in an email I
posted a bit ago.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 To throw out one more point that I think is problematic, Peter's
 original email on this thread gives a bunch of examples of strxfrm()
 normalization that all different in the first few bytes - but so do
 the underlying strings.  I *think* (but don't have time to check right
 now) that on my MacOS X box, strxfrm() spits out 3 bytes of header
 junk and then 8 bytes per character in the input string - so comparing
 the first 8 bytes of the strxfrm()'d representation would amount to
comparing part of the first character.  If for any reason the first byte is
 the same (or similar enough) on many of the input strings, then this
 will probably work out to be slower rather than faster.  Even if other
 platforms are more space-efficient (and I think at least some of them
 are), I think it's unlikely that this optimization will ever pay off
 for strings that don't differ in the first 8 bytes.  And there are
 many cases where that could be true a large percentage of the time
throughout the input, e.g. YYYY-MM-DD HH:MM:SS timestamps stored as
text.  It seems likely that the patch pessimizes those cases, though of
 course there's no way to know without testing.

Portability and performance concerns were exactly what worried me as
well.  It was my hope/understanding that this was a clear win which was
vetted by other large projects across multiple platforms.  If that's
actually in doubt and it isn't a clear win then I agree that we can't be
trying to squeeze it in at this late date.

 Now it *may well be* that after doing some research and performance
 testing we will conclude that either no commonly-used platforms show
 any regressions or that the regressions that do occur are discountable
 in view of the benefits to more common cases.  I just
 don't think mid-April is the right time to start those discussions
 with the goal of a 9.4 commit; and I also don't think committing
 without having those discussions is very prudent.

I agree with this in concept- but I'd be willing to spend a bit of time
researching it, given that it's from a well known and respected author
who I trust has done much of this research already.

Thanks,

Stephen




[HACKERS] WAL replay bugs

2014-04-07 Thread Heikki Linnakangas


I've been playing with a little hack that records a before and after 
image of every page modification that is WAL-logged, and writes the 
images to a file along with the LSN of the corresponding WAL record. I 
set up a master-standby replication with that hack in place in both 
servers, and ran the regression suite. Then I compared the after images 
after every WAL record, as written on master, and as replayed by the 
standby.


The idea is that the page content in the standby after replaying a WAL 
record should be identical to the page in the master, when the WAL 
record was generated. There are some known cases where that doesn't 
hold, but it's a useful sanity check. To reduce noise, I've been 
focusing on one access method at a time, filtering out others.


I did that for GIN first, and indeed found a bug in my new 
incomplete-split code, see commit 594bac42. After fixing that, and 
zeroing some padding bytes (38a2b95c), I'm now getting a clean run with 
that.



Next, I took on GiST, and lo-and-behold found a bug there pretty quickly 
as well. This one has been there ever since we got Hot Standby: the redo 
of a page update (e.g. an insertion) resets the right-link of the page. 
If there is a concurrent scan, in a hot standby server, that scan might 
still need the rightlink, and will hence miss some tuples. This can be 
reproduced like this:


1. in master, create test table.

CREATE TABLE gisttest (id int4);
CREATE INDEX gisttest_idx ON gisttest USING gist (id);
INSERT INTO gisttest SELECT g * 1000 from generate_series(1, 10) g;

-- Test function. Starts a scan, fetches one row from it, then waits 10
-- seconds before fetching the rest of the rows.

-- Returns the number of rows scanned. Should be 10 if you follow
-- these test instructions.
CREATE OR REPLACE FUNCTION gisttestfunc() RETURNS int AS
$$
declare
  i int4;
  t text;
  cur CURSOR FOR SELECT 'foo' FROM gisttest WHERE id >= 0;
begin
  set enable_seqscan=off; set enable_bitmapscan=off;

  i = 0;
  OPEN cur;
  FETCH cur INTO t;

  perform pg_sleep(10);

  LOOP
EXIT WHEN NOT FOUND; -- this is bogus on first iteration
i = i + 1;
FETCH cur INTO t;
  END LOOP;
  CLOSE cur;
  RETURN i;
END;
$$ LANGUAGE plpgsql;

2. in standby

SELECT gisttestfunc();
-- blocks here (sleeping in pg_sleep for 10 seconds)

3. Quickly, before the scan in standby continues, cause some page splits:

INSERT INTO gisttest SELECT g * 1000+1 from generate_series(1, 10) g;

4. The scan in standby finishes. It should return 10, but will 
return a lower number if you hit the bug.



At a quick glance, I think fixing that is just a matter of not resetting 
the right-link. I'll take a closer look tomorrow, but for now I just 
wanted to report what I've been doing. I'll post the scripts I've been 
using later too - nag me if I don't.


- Heikki




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Peter Geoghegan
On Mon, Apr 7, 2014 at 10:47 AM, Robert Haas robertmh...@gmail.com wrote:
 To throw out one more point that I think is problematic, Peter's
 original email on this thread gives a bunch of examples of strxfrm()
 normalization that all differ in the first few bytes - but so do
 the underlying strings.  I *think* (but don't have time to check right
 now) that on my MacOS X box, strxfrm() spits out 3 bytes of header
 junk and then 8 bytes per character in the input string - so comparing
 the first 8 bytes of the strxfrm()'d representation would amount to
 comparing part of the first character.  If for any reason the first byte is
 the same (or similar enough) on many of the input strings, then this
 will probably work out to be slower rather than faster.  Even if other
 platforms are more space-efficient (and I think at least some of them
 are), I think it's unlikely that this optimization will ever pay off
 for strings that don't differ in the first 8 bytes.

Why would any platform have header bytes in the resulting binary
strings? That doesn't make any sense. Are you sure you aren't thinking
of the homogeneous trailing bytes that you can also see in my example?

The only case that this patch could possibly regress is where there
are strings that differ beyond about the first 8 bytes, but are not
identical (we chance a memcmp() == 0 before doing a full strcoll()
when tie-breaking on the semi-reliable initial comparison). We still
always avoid fmgr-overhead (and shim overhead, which I've measured),
as in your original patch - you put that at adding 7% at the time,
which is likely to make up for otherwise-regressed cases. There is
nothing at all contrived about my test-case.

You have to have an awfully large number of significantly similar but
not identical strings in order to possibly lose out. Even if you have
such a case, and the fmgr-trampoline-elision doesn't make up for it
(doesn't make up for having to do a separate heapattr lookup on the
minimal tuple, an optimization not too relevant for pass by reference
types), which is quite a stretch, it seems likely that you have other
cases that do benefit, which in aggregate makes up for it. The
benefits I've shown, on the first test case I picked are absolutely
enormous.

Now, let's assume that I'm wrong about all this, and that in fact
there is a plausible case where all of those tricks don't work out,
and someone has a complaint about a regression. What are we going to
do about that? Not accelerate text sorting by at least a factor of 3
for the benefit of some very narrow use-case? That's the only measure
I can see that you could take to not regress that case.

-- 
Peter Geoghegan




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Andres Freund
On 2014-04-07 14:12:09 -0400, Stephen Frost wrote:
 I can take a look at that (if no one else wants to speak up about it).
 
  * Problem with displaying wide tables in psql
 
 That's not without controvery, as I understand it, but I admit that I
 haven't been following it terribly closely.

There didn't seem to be any conflicts here? I am talking about
http://archives.postgresql.org/message-id/CAJTaR32A1_d0DqP25T4%3DLwE3RpmhNf3oY%3Dr0-ksejepfPv6O%3Dw%40mail.gmail.com

  * Enable CREATE FOREIGN TABLE (... LIKE ... )
 
 This has definitely got issues which are not trival, see Tom's recent
 email on the subject..

Yea. Among others, he confirmed my comments. The issue there was
basically that I didn't like something and others disagreed. It still
needs a committer's input.

  * Add min, max, and stdev execute statement time in pg_stat_statement
 
 This was also quite controversal.  If we've finally settled on this as
 being acceptable then perhaps it can get in pretty easily.

The minimal variant (just stddev) didn't seem to be all that
controversial.

  I think it'd be a different discussion if this were CF-1 or so. But
  we're nearly *2* months after the *end* of the last CF.
 
 There wouldn't be any discussion if it was CF-1 as I doubt anyone would
 object to it going in (or at least not as strongly..), even if it was
 submitted after CF-1 was supposed to be over with remaining patches.
 It's the threat of getting punted to the next release that really makes
 the difference here, imv.

I can understand that for feature patches which your company/client
needs, but for a localized performance improvement? Unconvinced.

And sorry, if "the threat of getting punted to the next release" plays
a significant role here, the patch *needs* to be punted. The only
reason I can see at this point is "ah, trivial enough, let's just do
this now".

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Alvaro Herrera
Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:

  I think it'd be a different discussion if this were CF-1 or so. But
  we're nearly *2* months after the *end* of the last CF.
 
 There wouldn't be any discussion if it was CF-1 as I doubt anyone would
 object to it going in (or at least not as strongly..), even if it was
 submitted after CF-1 was supposed to be over with remaining patches.
 It's the threat of getting punted to the next release that really makes
 the difference here, imv.

That's why we have this rule that CF4 should only receive patches that
were already reviewed in previous commitfests.  I, too, find the
fast-tracking of this patch completely outside of the CF process to be
distasteful.  We summarily reject much smaller patches at the end of
each cycle process, even when the gain is as obvious as is claimed to
be for this patch.

TBH I don't see why we're even discussing this.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 2:12 PM, Stephen Frost sfr...@snowman.net wrote:
 I think it'd be a different discussion if this were CF-1 or so. But
 we're nearly *2* months after the *end* of the last CF.

 There wouldn't be any discussion if it was CF-1 as I doubt anyone would
 object to it going in (or at least not as strongly..), even if it was
 submitted after CF-1 was supposed to be over with remaining patches.
 It's the threat of getting punted to the next release that really makes
 the difference here, imv.

The point is that if this had been submitted for CF-1, CF-2, CF-3, or
CF-4, and I had concerns about it (which I do), then I would have
budgeted time to record those concerns so that they could be discussed
and, if necessary, addressed.  Since it wasn't, I assumed I didn't
need to worry about studying the patch, figuring out which of my
concerns were actually legitimate, searching for other areas of
potential concern, and putting together a nice write-up until June -
and maybe not even then, because at that point other people might also
be studying the patch and might cover those areas in sufficient detail
as to obviate my own concerns.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Mon, Apr 7, 2014 at 1:58 PM, Stephen Frost sfr...@snowman.net wrote:
  The issue on it being called poorman?  That doesn't exactly strike me
  as needing a particularly long discussion, nor that it would be
  difficult to change later.  I agree that there may be other issues, and
  it'd be great to get buy-in from everyone before anything goes in, but
  there's really zero hope of that being a reality.
 
 Really?  I think there have been just about zero patches that have
 gone in in the last (ugh) three months that have had significant
 unaddressed objections from anyone at the time they were committed.

That implies that everyone has been reviewing everything, which is
certainly not entirely the case..  If that's happening then I'm not sure
I understand the point of the commitfest, and feel I must be falling
down on the job here.  Certainly there are patches which have been
picked up by committers and committed that I've not reviewed.  I also
didn't object, but if I find issue with them in the future, I'd
certainly bring them up, even post-commit.

 There has certainly been an enormous amount of work by a whole lot of
 people to address objections that have been raised, and generally that
 has gone well.  I will not pretend that every patch that has gone in
 is completely well-liked by everyone and I am sure that is not the
 case.  Nevertheless I think we've done pretty well.  Now the people
 whose stuff has not got in - and may not get in - may well feel that
 their stuff got short shrift, and I'm not going to deny that there's a
 problem there.  But breaking from our usual procedure for this patch
 is just adding to that unfairness, not making anything better.

This goes back to the point which I was trying to make earlier-
prioritization.  Small+clearly-good patches should be prioritized
higher, but identifying those is no small matter.  Of course, we would
need to address the starvation risk when it comes to larger patches,
which I've got no answer for.

 Regardless, if this patch had had multiple, detailed reviews finding
 lots of issues that were then addressed, I'd probably be keeping my
 trap shut and hoping for the best.  But it hasn't.  Any patch of this
 size is going to have nitpicky stuff that needs to be addressed, if
 nothing else, and points requiring some modicum of public discussion,
 and there hasn't been any of that.  That suggests to me that it just
 hasn't been thoroughly reviewed yet, at least not in public, and that
 should happen long before anyone talks about picking it up for commit.

Perhaps my memory is foggy, but I recall at least some amount of
discussion and review of this, along with fixes going in, over the past
couple of weeks.  More may be needed, of course, but if so, I'd expect
any committer looking at it would realize that and bounce it at this
point, given the feature freeze deadline which is like next week..

  If this patch had been timely submitted and I'd picked it up right
 away, I would have expected at least three weeks to elapse between the
 time my first review was posted and the time all the details were
 nailed down, even assuming no more serious problems were found, and
 starting that ball rolling now means looking forward to a commit
 around the end of April assuming things go great. That seems way too
 late to me; IMHO, we should already be mostly in mop-up mode now
 (hence my recent DSM patch cleanup patch, which no one has commented
 on...).

I agree with this.

 And I think it's likely that, in fact, we'll find cases where this
 regresses performance rather badly, for reasons sketched in an email I
 posted a bit ago.

I've not researched it enough myself to say.

Thanks,

Stephen




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 That's why we have this rule that CF4 should only receive patches that
 were already reviewed in previous commitfests.

I, at least, always understood that rule to apply to 'large' patches,
which this didn't strike me as.

 I, too, find the
 fast-tracking of this patch completely outside of the CF process to be
 distasteful.  We summarily reject much smaller patches at the end of
 each cycle, even when the gain is as obvious as is claimed to
 be for this patch.

In the past, we've also committed large patches which were submitted for
the first time to CF-4.

 TBH I don't see why we're even discussing this.

Think I'm about done, personally.  I can't comment more without actually
looking at it and doing some research on it myself and I don't know that
I'll be able to do that any time soon, as I told Peter when he asked me
about it in NYC.  That said, for my part, I don't like telling Greg that
he either has to review something else which was submitted but that he's
got no interest in (or which would take much longer), or not do
anything.

Thanks,

Stephen




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Andres Freund
On 2014-04-07 14:35:23 -0400, Stephen Frost wrote:
 That said, for my part, I don't like telling Greg that
 he either has to review something else which was submitted but that he's
 got no interest in (or which would take much longer), or not do
 anything.

Reviewing and committing are two very different things, imo. This
discussion wasn't about it getting reviewed before the next CF, but
about committing it into 9.4.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 2:19 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Apr 7, 2014 at 10:47 AM, Robert Haas robertmh...@gmail.com wrote:
 To throw out one more point that I think is problematic, Peter's
 original email on this thread gives a bunch of examples of strxfrm()
 normalization that all differ in the first few bytes - but so do
 the underlying strings.  I *think* (but don't have time to check right
 now) that on my MacOS X box, strxfrm() spits out 3 bytes of header
 junk and then 8 bytes per character in the input string - so comparing
 the first 8 bytes of the strxfrm()'d representation would amount to
 comparing part of the first character.  If for any reason the first byte is
 the same (or similar enough) on many of the input strings, then this
 will probably work out to be slower rather than faster.  Even if other
 platforms are more space-efficient (and I think at least some of them
 are), I think it's unlikely that this optimization will ever pay off
 for strings that don't differ in the first 8 bytes.

 Why would any platform have header bytes in the resulting binary
 strings? That doesn't make any sense. Are you sure you aren't thinking
 of the homogeneous trailing bytes that you can also see in my example?

No, I'm not sure of that at all.  I haven't looked at this topic in a
while, but I'm happy to budget some to time to do so - for the June
CommitFest.

 The only case that this patch could possibly regress is where there
 are strings that differ beyond about the first 8 bytes, but are not
 identical (we chance a memcmp() == 0 before doing a full strcoll()
 when tie-breaking on the semi-reliable initial comparison). We still
 always avoid fmgr-overhead (and shim overhead, which I've measured),
 as in your original patch - you put that at adding 7% at the time,
 which is likely to make up for otherwise-regressed cases. There is
 nothing at all contrived about my test-case.

It's not a question of whether your test case is contrived.  Your test
case can be (and likely is) extremely realistic and still not account
for other cases when the patch regresses performance.  If I understand
correctly, and I may not because I wasn't planning to spend time on
this patch until the next CommitFest, the patch basically uses the
bytes available in datum1 to cache what you refer to as a normalized
or poor man's sort key which, it is hoped, will break most ties.
However, on any workload where it fails to break ties, you're going to
incur the additional overheads of (1) comparing the poor-man's sort
key, (2) memcmping the strings (based on what you just said), and then
(3) digging the correct datum back out of the tuple.  I note that
somebody thought #3 was important enough to be worth creating datum1
for in the first place, so I don't think it can or should be assumed
that undoing that optimization in certain cases will turn out to be
cheap enough not to matter.

In short, I don't see any evidence that you've made an attempt to
construct a worst-case scenario for this patch, and that's a basic
part of performance testing.  I had to endure having Andres beat the
snot out of me over cases where the MVCC snapshot patch regressed
performance, and as painful as that was, it led to a better patch.  If
I'd committed the first thing that did well on my own tests, which
*did* include attempts (much less successful than Andres's) to find
regressions, we'd be significantly worse off today than we are.

 You have to have an awfully large number of significantly similar but
 not identical strings in order to possibly lose out. Even if you have
 such a case, and the fmgr-trampoline-elision doesn't make up for it
 (doesn't make up for having to do a separate heapattr lookup on the
 minimal tuple, an optimization not too relevant for pass by reference
 types), which is quite a stretch, it seems likely that you have other
 cases that do benefit, which in aggregate makes up for it. The
 benefits I've shown, on the first test case I picked are absolutely
 enormous.

Testing the cases where your patch wins and hand-waving that the
losses won't be that bad in other cases - without actually testing -
is not the right methodology.

And I think it would be quite a large error to assume that tables
never contain large numbers of similar but not identical strings.  I
bet there are many people who have just that.  That's also quite an
inaccurate depiction of the cases where this will regress things; a
bunch of strings that share the first few characters might not be
similar in any normal human sense of that term, while still slipping
through the proposed sieve.  In your OP, a six-byte string blew up
to 20 bytes after being strxfrm()'d, which means that you're only
comparing the first 2-3 bytes.  On a platform where Datums are only 4
bytes, you're only comparing the first 1-2 bytes.  Arguing that nobody
anywhere has a table where not even the first character or two are
identical across most or all of the strings in the 

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2014-04-07 14:35:23 -0400, Stephen Frost wrote:
  That said, for my part, I don't like telling Greg that
  he either has to review something else which was submitted but that he's
  got no interest in (or which would take much longer), or not do
  anything.
 
 Reviewing and committing are two very different shoes imo. This
 discussion wasn't about it getting reviewed before the next CF, but
 about committing it into 9.4.

I'm not entirely sure why.  No one is giving grief to the people
bringing up new things which are clearly for 9.5 at this point, yet we
have a ton of patches that still need to be reviewed.  I don't think
that we want to tell individuals who are volunteering how to spend
their time, but what we're saying is that they can do anything
except actually commit stuff, because we must have ordering to what
gets committed; an ordering which hasn't really got any prioritization
to it based on the patch itself but entirely depends on when it was
submitted.

Thanks,

Stephen




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 2:35 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-04-07 14:35:23 -0400, Stephen Frost wrote:
 That said, for my part, I don't like telling Greg that
 he either has to review something else which was submitted but that he's
 got no interest in (or which would take much longer), or not do
 anything.

 Reviewing and committing are two very different shoes imo. This
 discussion wasn't about it getting reviewed before the next CF, but
 about committing it into 9.4.

Yes.  I did not object to this patch being posted in the midst of
trying to nail down this release, and I certainly do not object to
Greg, or Stephen, or anyone else reviewing it.  My note was
specifically prompted not by someone say they intended to *review* the
patch, but that they intended to *commit* it when it hasn't even
really been reviewed yet.  There are patches that are trivial enough
that it's fine for someone to commit them without a public review
first, but this isn't remotely close to being in that category.  If
nothing else, the fact that it extends the definition of the btree
opclass is sufficient reason to merit a public review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 There are patches that are trivial enough
 that it's fine for someone to commit them without a public review
 first, but this isn't remotely close to being in that category.  If
 nothing else, the fact that it extends the definition of the btree
 opclass is sufficient reason to merit a public review.

hrmpf.  You have a good point about that- I admit that I didn't consider
that as much as I should have.

Thanks,

Stephen




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Greg Stark
On Mon, Apr 7, 2014 at 11:37 AM, Robert Haas robertmh...@gmail.com wrote:
 You're
 essentially leveraging a commit bit that you haven't used in more than
 three years to try to push a patch that was submitted months too late


I'm not leveraging anything and I'm not going to push something unless
people are on board. That's *why* I sent that message. And I started
the email by saying I was going to go work on patches from the
commitfest first.

-- 
greg




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Peter Geoghegan
On Mon, Apr 7, 2014 at 11:59 AM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 There are patches that are trivial enough
 that it's fine for someone to commit them without a public review
 first, but this isn't remotely close to being in that category.  If
 nothing else, the fact that it extends the definition of the btree
 opclass is sufficient reason to merit a public review.

 hrmpf.  You have a good point about that- I admit that I didn't consider
 that as much as I should have.

Actually, contrary to the original subject of this thread, that isn't
the case. I have not added a support function 3, which I ultimately
concluded was a bad idea. This is all sort support.


-- 
Peter Geoghegan




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Peter Geoghegan
On Mon, Apr 7, 2014 at 11:56 AM, Greg Stark st...@mit.edu wrote:
 On Mon, Apr 7, 2014 at 11:37 AM, Robert Haas robertmh...@gmail.com wrote:
 You're
 essentially leveraging a commit bit that you haven't used in more than
 three years to try to push a patch that was submitted months too late


 I'm not leveraging anything and I'm not going to push something unless
 people are on board. That's *why* I sent that message. And I started
 the email by saying I was going to go work on patches from the
 commitfest first.

Exactly.

I was of the opinion, as someone familiar with the subject matter, that
this rose to the level of deserving special consideration. I'm glad
that there does seem to be a general recognition that such a category
exists. Given the reservations of Robert and others, this isn't going
to happen for 9.4. It was never going to happen under a cloud of
controversy. I only broached the idea.

Special consideration is not something I ask for lightly. I must admit
that it's hard to see things as I do if you aren't as familiar with
the problem. I happen to think that this is the wrong decision, but
I'll leave it at that.

I'm sure that whatever we come up with for 9.5 will be a lot better
than what I have here, because it will probably be generalized to
other important cases.

-- 
Peter Geoghegan




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 2:56 PM, Greg Stark st...@mit.edu wrote:
 On Mon, Apr 7, 2014 at 11:37 AM, Robert Haas robertmh...@gmail.com wrote:
 You're
 essentially leveraging a commit bit that you haven't used in more than
 three years to try to push a patch that was submitted months too late

 I'm not leveraging anything and I'm not going to push something unless
 people are on board. That's *why* I sent that message. And I started
 the email by saying I was going to go work on patches from the
 commitfest first.

I said that a lot more harshly than I should have, and I impugned you
unfairly.  Sorry.

I'm going to try again:

I don't doubt that your desire to move this patch forward is motivated
by anything other than the best of possible motivations.  However,
whether you intend it or not, trying to move this patch toward a 9.4
commit, or even trying to get people to express an opinion on whether
this is suitable for a 9.4 commit, is inevitably going to cause senior
reviewers who think they might have concerns about it to need to spend
time on it.  Inevitably, that time will come at the expense of patches
that were timely submitted, and that is unfair to the people who
submitted those patches.

Of course, if you want to review this patch now, I'm 100% OK with
that.  If you want to review other pending patches, for 9.4 or 9.5, I
think that's great, too.  But if there's talk of committing this
patch, I think that seems both quite a bit too late (relative to the
timing of CF4) and quite a bit too early (relative to the amount of
review and testing done thus far).

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Stephen Frost
* Peter Geoghegan (p...@heroku.com) wrote:
 Actually, contrary to the original subject of this thread, that isn't
 the case. I have not added a support function 3, which I ultimately
 concluded was a bad idea. This is all sort support.

Well, as apparently no one is objecting to Greg reviewing it, I'd
suggest he do that and then articulate his post-review opinion of the
patch, exactly what it changes, and whether he feels it needs public
comment, rather than all this speculation by folks who aren't looking
at the patch.

In other words, in hindsight, Greg was rather premature in suggesting
that he might commit it; rather than suggesting such, he should have
just said he was going to review it and then come back with a detailed
email arguing the case for it to go in.

I don't particularly fault Greg for that, but perhaps some of this could
be avoided in the future.

Thanks,

Stephen




[HACKERS] Transaction local statistics are incorrect at speed

2014-04-07 Thread Tom Lane
My Salesforce colleague Teja Mupparti found an interesting bug.
Consider the following example:

drop table if exists test;
create table test(i int);

insert into test values(1);

select pg_sleep(1);

begin;
insert into test values(2);
insert into test values(3);
select pg_stat_get_xact_tuples_inserted('test'::regclass);
commit;

select pg_sleep(1);

begin;
insert into test values(4);
insert into test values(5);
select pg_stat_get_xact_tuples_inserted('test'::regclass);
commit;

If you do this by hand, or with the above script verbatim, the
pg_stat_get_xact_tuples_inserted() calls both report 2, which
is what you'd expect: the counts are supposed to reflect rows
inserted in the current transaction.

However, if you take out the pg_sleep calls, you get entirely
different results, and soon realize that the counts are including
the previous transactions!

The reason for this is that pgstat_report_stat() includes a delay
check, such that it doesn't ship off statistics counts to the collector
unless at least 500 ms have elapsed since the last report.  Without
the sleeps, the later transactions execute while the previous
transactions' counts are still being held locally, *and those counts
get included into the reported totals*.

This seems like a pretty clear bug to me; does anyone want to argue
that it isn't?

In the case of pg_stat_get_xact_tuples_inserted and a couple of other
routines, it would be entirely trivial to fix: just ignore
tabentry->t_counts.t_tuples_inserted (which is the count held over from
previous transactions) and only total the trans->tuples_inserted counters.
However, for a number of other counters such as blocks_fetched, we don't
store transaction-local counts separately, and would have to start doing
so if we wanted to make these functions work as documented.

Thoughts?  I have other things to do right now than fix this myself.

regards, tom lane




[HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers

2014-04-07 Thread Andres Freund
Hi,

I need to persist the result of an index_getnext() in a tuple
slot. I don't want to unnecessarily duplicate the tuple data itself, so
I'd like to use the ExecStoreTuple(buffer = real_buffer) notion. But since
the next index_getnext()/index_endscan() will overwrite/release the
heap tuple, I need to copy the HeapTupleData.
It'd be rather useful to be able to do ExecStoreTuple(tuple, slot,
some_buffer, true), i.e. have the HeapTupleData struct freed *and* the
buffer pinned. There's an Assert preventing that though.

Why?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Transaction local statistics are incorrect at speed

2014-04-07 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 This seems like a pretty clear bug to me; does anyone want to argue
 that it isn't?

I'd agree that it's a bug.

 In the case of pg_stat_get_xact_tuples_inserted and a couple of other
 routines, it would be entirely trivial to fix: just ignore
 tabentry->t_counts.t_tuples_inserted (which is the count held over from
 previous transactions) and only total the trans->tuples_inserted counters.
 However, for a number of other counters such as blocks_fetched, we don't
 store transaction-local counts separately, and would have to start doing
 so if we wanted to make these functions work as documented.

I've been thinking for a while that we need more of this (storing and
aggregating statistics data locally first) or we're never going to be
able to add more information like the stddev, etc, that's been
requested.  I'd really like to see us collecting a great deal more
regarding stats and that's not going to happen unless we can cache and
aggregate locally first.

 Thoughts?  I have other things to do right now than fix this myself.

While a bug, I don't feel that it's a particularly high priority one.
What worries me more is the risk of breaking things if we do backpatch
a fix that postpones updating some information that used to be
reflected immediately.  It seems like we'd need/want some kind of
generalized structure for this too, which would be even more work and
which has zero chance for 9.4, meaning that it'll need to be fixed one
way now and then whacked around in 9.5 (assuming someone has the time,
etc..).

Not a good situation. :/

Thanks,

Stephen




Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers

2014-04-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I need to persist the result of an index_getnext() in a tuple
 slot. I don't want to unnecessarily duplicate the tuple data itself, so
 I'd like to use the ExecStoreTuple(buffer = real_buffer) notion. But since
 the next index_getnext()/index_endscan() will overwrite/release the
 heap tuple, I need to copy the HeapTupleData.
 It'd be rather useful to be able to do ExecStoreTuple(tuple, slot,
 some_buffer, true), i.e. have the HeapTupleData struct freed *and* the
 buffer pinned. There's an Assert preventing that though.

There's an assumption that if you are asking to pin a buffer, the tuple
pointer you're passing is pointing into that buffer (and is therefore not
something that could be pfree'd).  If it isn't pointing into a buffer,
the tuple slot is not the place to be keeping the buffer reference.

I'm disinclined to remove that Assert because failing to pin a buffer
when you *are* passing a pointer into it is a very bad, hard-to-find bug.
Admittedly the Assert is only a partial defense against that problem,
but it's better than nothing.

regards, tom lane




Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers

2014-04-07 Thread Andres Freund
On 2014-04-07 15:58:31 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I need to persist the result of an index_getnext() in a tuple
  slot. I don't want to unnecessarily duplicate the tuple data itself, so
  I'd like to use the ExecStoreTuple(buffer = real_buffer) notion. But since
  the next index_getnext()/index_endscan() will overwrite/release the
  heap tuple, I need to copy the HeapTupleData.
  It'd be rather useful to be able to do ExecStoreTuple(tuple, slot,
  some_buffer, true), i.e. have the HeapTupleData struct freed *and* the
  buffer pinned. There's an Assert preventing that though.
 
 There's an assumption that if you are asking to pin a buffer, the tuple
 pointer you're passing is pointing into that buffer (and is therefore not
 something that could be pfree'd).  If it isn't pointing into a buffer,
 the tuple slot is not the place to be keeping the buffer reference.

Yea. I *have* a HeapTupleData struct pointing into the buffer. It's just
that the lifetime of the index scan's xs_ctup isn't sufficient for my
case. So I have to allocate a new HeapTupleData struct, *again* pointing
into the buffer. I'd like to manage the lifetime of that HeapTupleData
struct (*not* the entire HeapTupleHeader blob) via the slot.
That doesn't sound too crazy to me?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-04-07 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 I just noticed that this patch not only adds min,max,stddev, but it also
 adds the ability to reset an entry's counters.  This hasn't been
 mentioned in this thread at all; there has been no discussion on whether
 this is something we want to have, nor on whether this is the right API.

 What it does is add a new function pg_stat_statements_reset_time() which
 resets the min and max values from all functions' entries, without
 touching the stddev state variables or the number of calls.  Note
 there's no ability to reset the values for a single function.  

That seems pretty bizarre.  At this late date, the best thing would
probably be to strip out the undiscussed functionality.  It can get
resubmitted for 9.5 if there's a real use-case for it.

regards, tom lane




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I am a bit confused. To my eyes there's been a huge number of actually
 trivial patches in this commitfest? Even now, there are some:

 * Bugfix for timeout in LDAP connection parameter resolution
 * Problem with displaying wide tables in psql
 * Enable CREATE FOREIGN TABLE (... LIKE ... )
 * Add min, max, and stdev execute statement time in pg_stat_statement
 * variant of regclass etc.
 * vacuumdb: Add option --analyze-in-stages

 All of these are small patches that don't need major changes before getting committed.

FWIW, I think the reason most of those aren't in is that there's not
consensus that it's a feature we want.  Triviality of the patch itself
doesn't make it easier to get past that.  (Indeed, when a feature patch
is actually trivial, that usually suggests to me that it's not been
thought through fully ...)

The LDAP one requires both LDAP and Windows expertise, which means the
pool of qualified committers for it is pretty durn small.  I think
Magnus promised to deal with it, though.

 I think it'd be a different discussion if this were CF-1 or so. But
 we're nearly *2* months after the *end* of the last CF.

Yeah.  At this point the default decision has to be to reject (or
more accurately, punt to 9.5).  I think we can still get a few of
these in and meet the mid-April target date, but many of them will
not make it.

regards, tom lane




Re: [HACKERS] CREATE FOREIGN TABLE ( ... LIKE ... )

2014-04-07 Thread Andres Freund
On 2014-04-05 11:46:16 -0400, Tom Lane wrote:
 ISTM this is because the proposed feature is wrongheaded.  The basic
 concept of CREATE TABLE LIKE is that you're copying properties from
 another object of the same type.  You might or might not want every
 property, but there's no question of whether you *could* copy every
 property.  In contrast, what this is proposing to do is copy properties
 from (what might be) a plain table to a foreign table, and those things
 aren't even remotely the same kind of object.
 
 It would make sense to me to restrict LIKE to copy from another foreign
 table, and then there would be a different set of INCLUDING/EXCLUDING
 options that would be relevant (options yes, indexes no, for example).

I actually think it's quite useful to create a foreign table that's the
same shape as a local table. And the patch's approach of refusing to
copy things that aren't supported sounds sane to me.
Consider e.g. moving older partitioned data off to an archiving
server. New local partitions are often created using CREATE TABLE LIKE,
but that's not possible for the foreign ones.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers

2014-04-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-07 15:58:31 -0400, Tom Lane wrote:
 There's an assumption that if you are asking to pin a buffer, the tuple
 pointer you're passing is pointing into that buffer (and is therefore not
 something that could be pfree'd).  If it isn't pointing into a buffer,
 the tuple slot is not the place to be keeping the buffer reference.

 Yea. I *have* a HeapTupleData struct pointing into the buffer. It's just
  that the lifetime of the index scan's xs_ctup isn't sufficient for my
 case. So I have to allocate a new HeapTupleData struct, *again* pointing
 into the buffer. I'd like to manage the lifetime of that HeapTupleData
 struct (*not* the entire HeapTupleHeader blob) via the slot.
 That doesn't sound too crazy to me?

In that case you should have another tuple slot of your own and let it
keep the tuple (and buffer pin).

regards, tom lane




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Peter Geoghegan
On Mon, Apr 7, 2014 at 11:47 AM, Robert Haas robertmh...@gmail.com wrote:
 It's not a question of whether your test case is contrived.  Your test
 case can be (and likely is) extremely realistic and still not account
 for other cases when the patch regresses performance.  If I understand
 correctly, and I may not because I wasn't planning to spend time on
 this patch until the next CommitFest, the patch basically uses the
 bytes available in datum1 to cache what you refer to as a normalized
 or poor man's sort key which, it is hoped, will break most ties.
 However, on any workload where it fails to break ties, you're going to
 incur the additional overheads of (1) comparing the poor-man's sort
 key, (2) memcmping the strings (based on what you just said), and then
 (3) digging the correct datum back out of the tuple.  I note that
 somebody thought #3 was important enough to be worth creating datum1
 for in the first place, so I don't think it can or should be assumed
 that undoing that optimization in certain cases will turn out to be
 cheap enough not to matter.

The much earlier datum1 optimization is mostly compelling for
pass-by-value types, for reasons that prominently involve
cache/locality considerations. That's probably why this patch is so
compelling - it makes those advantages apply to pass-by-reference
types too.

 Testing the cases where your patch wins and hand-waving that the
 losses won't be that bad in other cases - without actually testing -
 is not the right methodology.

Okay. Here is a worst case, with the pgbench script the same as my
original test case, but with almost maximally unsympathetic data
to sort:

[local]/postgres=# update customers set firstname =
'padding-padding-padding-padding' || firstname;
UPDATE 2

Master:

pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 100 s
number of transactions actually processed: 323
latency average: 309.598 ms
tps = 3.227745 (including connections establishing)
tps = 3.227784 (excluding connections establishing)

Patch:

pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 100 s
number of transactions actually processed: 307
latency average: 325.733 ms
tps = 3.066256 (including connections establishing)
tps = 3.066313 (excluding connections establishing)

That seems like a pretty modest regression for a case where 100% of
what I've done goes to waste. If something that I'd done worked out
10% of the time, we'd still be well ahead.

-- 
Peter Geoghegan




Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers

2014-04-07 Thread Andres Freund
On 2014-04-07 16:29:57 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-04-07 15:58:31 -0400, Tom Lane wrote:
  There's an assumption that if you are asking to pin a buffer, the tuple
  pointer you're passing is pointing into that buffer (and is therefore not
  something that could be pfree'd).  If it isn't pointing into a buffer,
  the tuple slot is not the place to be keeping the buffer reference.

  Yea. I *have* a HeapTupleData struct pointing into the buffer. It's just
  that the lifetime of the index scan's xs_ctup isn't sufficient for my
  case. So I have to allocate a new HeapTupleData struct, *again* pointing
  into the buffer. I'd like to manage the lifetime of that HeapTupleData
  struct (*not* the entire HeapTupleHeader blob) via the slot.
  That doesn't sound too crazy to me?

 In that case you should have another tuple slot of your own and let it
 keep the tuple (and buffer pin).

Maybe I am just being dense here and misunderstanding tuple slots. The
executor isn't exactly my strong suit.

But say I am doing something like:

slot = ExecInitExtraTupleSlot(estate);
ExecSetSlotDescriptor(slot, RelationGetDescr(rel));
...
while ((scantuple = index_getnext(scan, ForwardScanDirection)) != NULL)
{
 /* some check */
 ...
 if (blurb)
 ExecStoreTuple(scantuple, slot, scan->xs_cbuf, true);
}

...
index_endscan(scan); /* possibly */

/* now use the stored tuple via slot->tts_tuple */

that's not going to work, because scantuple might be freed or pointing
to another tuple after the next index_getnext() call. Right?
So what I now do is essentially:
while ((scantuple = index_getnext(scan, ForwardScanDirection)) != NULL)
{
...
ht = palloc(sizeof(HeapTupleData)); /* in the right context */
memcpy(ht, scantuple, sizeof(HeapTupleData));
ExecStoreTuple(ht, slot, scan->xs_cbuf, false);
slot->tts_shouldFree = true;
...
}

That will a) keep the buffer pinned long enough, b) keep the
HeapTupleData struct around long enough.

That works, but seems pretty ugly. Thus I am wondering a) whether
there's a way to avoid the outside manipulation of tts_shouldFree, and
b) why there's no builtin operation doing the palloc, memcpy and store...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 4:35 PM, Peter Geoghegan p...@heroku.com wrote:
 The much earlier datum1 optimization is mostly compelling for
 pass-by-value types, for reasons that prominently involve
 cache/locality considerations.

I agree.

 That's probably why this patch is so
 compelling - it makes those advantages apply to pass-by-reference
 types too.

Well, whether the patch is compelling is actually precisely what we
need to figure out.  I feel like you're asserting your hoped-for
conclusion prematurely.

 Testing the cases where your patch wins and hand-waving that the
 losses won't be that bad in other cases - without actually testing -
 is not the right methodology.

 Okay. Here is a worst case, with the pgbench script the same as my
 original test case, but with almost maximally unsympathetic data
 to sort:

 [local]/postgres=# update customers set firstname =
 'padding-padding-padding-padding' || firstname;
 UPDATE 2

 Master:

 pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
 transaction type: Custom query
 scaling factor: 1
 query mode: simple
 number of clients: 1
 number of threads: 1
 duration: 100 s
 number of transactions actually processed: 323
 latency average: 309.598 ms
 tps = 3.227745 (including connections establishing)
 tps = 3.227784 (excluding connections establishing)

 Patch:

 pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
 transaction type: Custom query
 scaling factor: 1
 query mode: simple
 number of clients: 1
 number of threads: 1
 duration: 100 s
 number of transactions actually processed: 307
 latency average: 325.733 ms
 tps = 3.066256 (including connections establishing)
 tps = 3.066313 (excluding connections establishing)

 That seems like a pretty modest regression for a case where 100% of
 what I've done goes to waste. If something that I'd done worked out
 10% of the time, we'd still be well ahead.

Now that is definitely interesting, and it does seem to demonstrate
that the worst case for this patch might not be as bad as I had feared
- it's about a 5% regression: not great, but perhaps tolerable.  It's
not actually a worse case unless firstname is a fair ways into a tuple
with lots of variable-length columns before it, because part of what
the datum1 thing saves you is the cost of repeatedly walking through
the tuple's column list.

But I still think that a lot more could be done - and I'd challenge
you (or others) to do it - to look for cases where this might be a
pessimization.  I get that the patch has an upside, but nearly every
patch that anybody proposes does, and the author usually points out
those cases quite prominently, as you have, and right so.  But what
makes for a much more compelling submission is when the author *also*
tries really hard to break the patch, and hopefully fails.  I agree
that the test result shown above is good news for this patch's future
prospects, but I *don't* agree that it nails the door shut.  What
about other locales?  Other operating systems?  Other versions of
libc?  Longer strings?  Wider tuples?  Systems where datums are only
32-bits?  Sure, you can leave those things to the reviewer and/or
committer to worry about, but that's not the way to expedite the
patch's trip into the tree.

I have to admit, I didn't really view the original postings on this
topic to be anything more than, hey, we've got something promising
here, it's worth more study.  That's part of why I was so taken aback
by Greg's email.  There's certainly good potential here, but I think
there's quite a bit of work left to do before you can declare
victory...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-07 Thread Peter Geoghegan
On Mon, Apr 7, 2014 at 3:49 PM, Robert Haas robertmh...@gmail.com wrote:
 Now that is definitely interesting, and it does seem to demonstrate
 that the worst case for this patch might not be as bad as I had feared
 - it's about a 5% regression: not great, but perhaps tolerable.  It's
 not actually a worst case unless firstname is a fair ways into a tuple
 with lots of variable-length columns before it, because part of what
 the datum1 thing saves you is the cost of repeatedly walking through
 the tuple's column list.

I only use strxfrm() when the leading key is text. Otherwise, it's
pretty much just what your original sort support for text patch from
2012 does, and we don't bother with anything other than fmgr-elision.
The only thing that stops the above test case from being perfectly pessimal
is that we don't always chance a memcmp(), due to some comparisons
realizing that len1 != len2. Also, exactly one comparison actually
benefits from that same chance a memcmp() trick in practice (there
is one duplicate). But it really is vanishingly close to perfectly
pessimal on my dev system, or that is at least my sincere belief.

I think that we're going to find ourselves with more and more cases
where risking lots of compute bandwidth to save a little
memory bandwidth is, counter-intuitively, a useful optimization. My
opportunistic memcmp() is probably an example of this. I'm aware of
others. This is perhaps a contributing factor to how inexpensive it is
to waste all the effort that goes into strxfrm() and so on. Having
said that, this is an idea from the 1960s, which might explain
something about the name of the technique.

 But I still think that a lot more could be done - and I'd challenge
 you (or others) to do it - to look for cases where this might be a
 pessimization.  I get that the patch has an upside, but nearly every
 patch that anybody proposes does, and the author usually points out
 those cases quite prominently, as you have, and right so.  But what
 makes for a much more compelling submission is when the author *also*
 tries really hard to break the patch, and hopefully fails.  I agree
 that the test result shown above is good news for this patch's future
 prospects, but I *don't* agree that it nails the door shut.  What
 about other locales?  Other operating systems?  Other versions of
 libc?  Longer strings?  Wider tuples?  Systems where datums are only
 32-bits?  Sure, you can leave those things to the reviewer and/or
 committer to worry about, but that's not the way to expedite the
 patch's trip into the tree.

I don't think that 32-bit systems are all that high a priority for
performance work. Cellphones will come out with 64-bit processors this
year. Even still, it's hard to reason about how much difference that
will make, except to say that the worst case cannot be that bad. As
for other locales, it only gets better for this patch, because I
believe en_US.UTF-8 is one of the cheapest to compare locales (which
is to say, requires fewest passes). If you come up with some test case
with complicated collation rules (I'm thinking of hu_HU, Hungarian),
it surely makes the patch look much better. Having said that, I still
don't do anything special with the C locale (just provide a
non-fmgr-accessed comparator), which should probably be fixed, on the
grounds that sorting using the C locale is probably now more expensive
than with the en_US.UTF-8 collation.

 I have to admit, I didn't really view the original postings on this
 topic to be anything more than, hey, we've got something promising
 here, it's worth more study.  That's part of why I was so taken aback
 by Greg's email.  There's certainly good potential here, but I think
 there's quite a bit of work left to do before you can declare
 victory...

I think that Greg's choice of words was a little imprudent, but must
be viewed in the context of an offline discussion during the hallway
track of pgConf NYC. Clearly Greg wasn't about to go off and
unilaterally commit this. FWIW, I think I put him off the idea a few
hours after he made those remarks, without intending for what I'd said
to have that specific effect (or the opposite effect).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREIGN TABLE ( ... LIKE ... )

2014-04-07 Thread Michael Paquier
On Tue, Apr 8, 2014 at 5:24 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-04-05 11:46:16 -0400, Tom Lane wrote:
 ISTM this is because the proposed feature is wrongheaded.  The basic
 concept of CREATE TABLE LIKE is that you're copying properties from
 another object of the same type.  You might or might not want every
 property, but there's no question of whether you *could* copy every
 property.  In contrast, what this is proposing to do is copy properties
 from (what might be) a plain table to a foreign table, and those things
 aren't even remotely the same kind of object.

 It would make sense to me to restrict LIKE to copy from another foreign
 table, and then there would be a different set of INCLUDING/EXCLUDING
 options that would be relevant (options yes, indexes no, for example).

 I actually think it's quite useful to create a foreign table that's the
 same shape as a local table. And the patch's approach of refusing to
 copy things that aren't supported sounds sane to me.
This could be improved as well: it would be useful to be able to copy
the column options of another foreign table.

 Consider e.g. moving older partitioned data off to an archiving
 server. New local partitions are often created using CREATE TABLE LIKE,
 but that's not possible for the foreign ones.
Definitely a use case.
-- 
Michael




Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers

2014-04-07 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-07 16:29:57 -0400, Tom Lane wrote:
 In that case you should have another tuple slot of your own and let it
 keep the tuple (and buffer pin).

 that's not going to work, because scantuple might be free'd or pointing
 to another tuple after the next index_getnext() call. Right?

It's true that scantuple is probably pointing at the xs_ctup field of the
IndexScanDesc, so you need to put that data somewhere else if you want
to hold onto it past the current loop iteration.

 So what I now do is essentially:
 while ((scantuple = index_getnext(scan, ForwardScanDirection)) != NULL)
 {
 ...
 ht = palloc(sizeof(HeapTupleData)); /* in the right context */
 memcpy(ht, scantuple, sizeof(HeapTupleData));
 ExecStoreTuple(ht, slot, scan->xs_cbuf, false);
 slot->tts_shouldFree = true;
 ...
 }

Well, that is certainly messy.  I think you could just use a local
HeapTupleData variable instead of palloc'ing every time, where "local"
means it has a lifespan similar to the slot pointer.  That is

  HeapTupleData my_htup;

  while ((scantuple = index_getnext(scan, ForwardScanDirection)) != NULL)
  {
    memcpy(&my_htup, scantuple, sizeof(HeapTupleData));
    ExecStoreTuple(&my_htup, slot, scan->xs_cbuf, false);
  }

If my_htup is just a local, you'd want to clear the slot before my_htup
goes out of scope, just to be sure it doesn't try to dereference a
dangling pointer.  There's some vaguely similar hacking near the end of
ExecDelete.

regards, tom lane




Re: [HACKERS] WAL replay bugs

2014-04-07 Thread Josh Berkus
On 04/07/2014 02:16 PM, Heikki Linnakangas wrote:
 I've been playing with a little hack that records a before and after
 image of every page modification that is WAL-logged, and writes the
 images to a file along with the LSN of the corresponding WAL record. I
 set up a master-standby replication with that hack in place in both
 servers, and ran the regression suite. Then I compared the after images
 after every WAL record, as written on master, and as replayed by the
 standby.

This is awesome ... thank you for doing this.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] Default gin operator class of jsonb failing with index row size maximum reached

2014-04-07 Thread Michael Paquier
Hi all,

While doing some tests with jsonb, I found a failure as told in $subject:
=# create table data_jsonb (data jsonb);
CREATE TABLE
=# insert into data_jsonb ... tuple in the script attached
INSERT 1
=# create index data_index on data_jsonb using gin(data);
ERROR:  54000: index row size 1808 exceeds maximum 1352 for index data_index
LOCATION:  GinFormTuple, ginentrypage.c:110
=# create index data_index2 on data_jsonb using gin (data jsonb_hash_ops);
CREATE INDEX

The data creating the failure is a tuple in a dump of geonames
(http://www.geonames.org/export/), listing some geographical data, and
it seems to be caused by some Arabic characters used to provide
translations for a given geographical location. The encoding of the
database on which I have done the tests is UTF-8. Japanese and Chinese
equivalents worked fine with this operator class, by the way.

The jsonb documentation says that jsonb documents should be kept at a
reasonable size to reduce lock contention, but there is no mention of a
size limitation for indexes:
http://www.postgresql.org/docs/devel/static/datatype-json.html

A test case is attached, note as well that only the default gin
operator class is failing, jsonb_hash_ops worked well.
Regards,
-- 
Michael


jsonb_index_error.sql



Re: [HACKERS] WAL replay bugs

2014-04-07 Thread Michael Paquier
On Tue, Apr 8, 2014 at 3:16 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

 I've been playing with a little hack that records a before and after image
 of every page modification that is WAL-logged, and writes the images to a
 file along with the LSN of the corresponding WAL record. I set up a
 master-standby replication with that hack in place in both servers, and ran
 the regression suite. Then I compared the after images after every WAL
 record, as written on master, and as replayed by the standby.
Assuming that adding some dedicated hooks in core, able to run actions
before and after a page modification occurs, is not *that* costly
(though I imagine the overhead would not be acceptable in terms of
performance for production use), could it be possible to get that in
the shape of an extension that could be used to test WAL record
consistency? This may be an idea to think about...
-- 
Michael




Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-04-07 Thread Robert Haas
On Mon, Apr 7, 2014 at 4:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 I just noticed that this patch not only adds min,max,stddev, but it also
 adds the ability to reset an entry's counters.  This hasn't been
 mentioned in this thread at all; there has been no discussion on whether
 this is something we want to have, nor on whether this is the right API.

 What it does is add a new function pg_stat_statements_reset_time() which
 resets the min and max values from all functions' entries, without
 touching the stddev state variables nor the number of calls.  Note
 there's no ability to reset the values for a single function.

 That seems pretty bizarre.  At this late date, the best thing would
 probably be to strip out the undiscussed functionality.  It can get
 resubmitted for 9.5 if there's a real use-case for it.

If I'm understanding correctly, that actually seems reasonably
sensible.  I mean, the big problem with min/max is that you might be
taking averages and standard deviations over a period of months, but
the maximum and minimum are not that meaningful over such a long
period.  You might well want to keep a longer-term average while
seeing the maximum and minimum for, say, today.  I don't know exactly
how useful that is, but it doesn't seem obviously dumb.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-04-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Apr 7, 2014 at 4:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 What it does is add a new function pg_stat_statements_reset_time() which
 resets the min and max values from all functions' entries, without
 touching the stddev state variables nor the number of calls.  Note
 there's no ability to reset the values for a single function.

 That seems pretty bizarre.  At this late date, the best thing would
 probably be to strip out the undiscussed functionality.  It can get
 resubmitted for 9.5 if there's a real use-case for it.

 If I'm understanding correctly, that actually seems reasonably
 sensible.  I mean, the big problem with min/max is that you might be
 taking averages and standard deviations over a period of months, but
 the maximum and minimum are not that meaningful over such a long
 period.  You might well want to keep a longer-term average while
 seeing the maximum and minimum for, say, today.  I don't know exactly
 how useful that is, but it doesn't seem obviously dumb.

Well, maybe it's okay, but not being able to reset the stddev state seems
odd, and the chosen function name seems odder.  In any case, the time to
be discussing the design of such functionality was several months ago.
I'm still for ripping it out for 9.4, rather than being stuck with
behavior that's not been adequately discussed.

regards, tom lane




Re: [HACKERS] four minor proposals for 9.5

2014-04-07 Thread Amit Kapila
On Mon, Apr 7, 2014 at 12:16 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2014-04-04 6:51 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:
 On Tue, Apr 1, 2014 at 11:42 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  2014-03-27 17:56 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:
  So I'll prepare some prototypes in the next month for
 
  1. log a execution time for cancelled queries,
  2. track a query lock time
 Yes. Initially I thought only about cancelled queries, but now I am
 thinking a wider solution can be better. Sometimes long queries can be
 stopped by Postgres, or by the system, and info about their duration
 can be just as useful.

Right.


 One more thing: I think currently we can also find that in a crude way
 (pg_stat_activity has the query_start time and log_line_prefix has the
 end time), but having it in one place, the log, will be more useful.
 ??

I just wanted to say that if someone wants to calculate the duration
of cancelled query (or other error'd query), you can do that by checking
the start time from pg_stat_activity and end time from log (using
log_line_prefix), but this is of course cumbersome.

  I would use the same technique
  for printing lock time - it can be printed in place of the symbol %L.

 Above you have mentioned that you are planning to have three different
 lock times (Table, Tuple and others), so will this one symbol (%L) enable
 all three lock times?


 My idea is to start with %L as total lock time, which is useful for a
 wide group of users, and later (or at the same time) we can enhance it
 with two-character prefixed symbols

So do you want to print lock time just for error'd statements? Wouldn't
it be better to do it for non-error'd statements as well - or rather, I
feel it can be more useful for non-error statements. Do we already have
some easy way to get the wait time for non-error statements?

 so

 %L .. total lock time
 %Lr .. lock relation
 %Lt .. lock tuples
 %Lo .. lock others

 L = Lr + Lt + Lo


 Are you planning to add new logs for logging this or planning to use
 existing infrastructure?


 I don't have a prototype yet, so I don't know what will be necessary

Okay, then I think it's better to discuss this after your initial
analysis/prototype, but you might need to add some infrastructure code
to make it possible: locks on database objects (relation, tuple, ...)
and other locks (buffers, ...) have different locking strategies, so to
get the total wait time for a statement across the different kinds of
locks you need to accumulate the different wait times.
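The accumulation described above can be sketched as a small set of
per-category counters. This is a hypothetical illustration of the
bookkeeping (the names are made up, not proposed Postgres code),
mirroring the %L = %Lr + %Lt + %Lo breakdown from earlier in the
thread:

```c
#include <assert.h>

/* Hypothetical per-statement lock wait accumulator: relation, tuple
 * and "other" lock waits are tracked separately, matching the proposed
 * %Lr / %Lt / %Lo escapes, and summed on demand for the total %L. */
typedef enum { LOCK_RELATION, LOCK_TUPLE, LOCK_OTHER, LOCK_NKINDS } LockKind;

typedef struct LockWaitStats
{
    double wait_ms[LOCK_NKINDS];
} LockWaitStats;

static void
lock_wait_accumulate(LockWaitStats *s, LockKind kind, double ms)
{
    s->wait_ms[kind] += ms;
}

/* Total lock time: L = Lr + Lt + Lo */
static double
lock_wait_total(const LockWaitStats *s)
{
    double total = 0.0;

    for (int i = 0; i < LOCK_NKINDS; i++)
        total += s->wait_ms[i];
    return total;
}
```

Each lock-acquisition path would call lock_wait_accumulate() with its
category; the log emitter then reads the per-category values or the
total, depending on which escape was requested.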

 One general point: won't it be a bit difficult to parse a log line
 containing so many times? Should we consider logging with some special
 marker for each time (for example: tup_lock_wait_time:1000ms)?


 We should optimize log_line_prefix processing instead.

 The proposed options are interesting for enterprise use, when you have
 smarter tools for log entry processing and need a complex view of the
 performance of billions of queries - when cancel time and lock time are
 important pieces in the mosaic of a server's fitness.

Exactly - though this might not be directly related to this patch,
having it would be useful.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

