Re: [HACKERS] Proposal: PL/PgSQL strict_mode
2013/9/14 chris travers ch...@2ndquadrant.com

A few thoughts about this.

On 14 September 2013 at 05:28 Marko Tiikkaja ma...@joh.to wrote:

Hi,

After my previous suggestion for adding a STRICT keyword got shot down[1], I've been thinking about an idea Andrew Gierth tossed out: adding a new "strict mode" into PL/PgSQL. In this mode, any query which processes more than one row will raise an exception. This is a bit similar to specifying INTO STRICT .. for every statement, except processing no rows does not trigger the exception.

The need for this mode comes from a few observations I make almost every day:

1) The majority of statements only deal with exactly 0 or 1 rows.
2) Checking row_count for a statement is ugly and cumbersome, so often it just isn't checked. I often use RETURNING TRUE INTO STRICT _OK for DML, but that a) requires an extra variable, and b) isn't possible if 0 rows affected is not an error in the application logic.
3) SELECT .. INTO only fetches one row and ignores the rest. Even row_count is always set to 0 or 1, so there's no way to fetch a value *and* check that there would not have been more rows. This creates bugs which make your queries return wrong results and which could go undetected for a long time.

I am going to suggest that STRICT is semantically pretty far from what is meant here in common speech. I think STRICT here would be confusing. This would be really pretty severe for people coming from Perl or MySQL backgrounds. May I suggest SINGLE as a keyword instead? It might be worth having it attached to INSERT, UPDATE, and DELETE statements.

I don't think SINGLE is better. There is a risk of confusion with the LIMIT clause. (Moreover, we use STRICT now, so a new keyword increases the potential for confusion.) When I look at the translation of STRICT into Czech, I think STRICT is good enough. If we do make a change, then I propose a slightly COBOL-like solution: a CHECK ONE ROW EXPECTED clause.
instead of

  DELETE FROM foo WHERE f1 > 1000

do

  DELETE FROM foo WHERE f1 > 1000 CHECK ONE ROW EXPECTED;

Regards Pavel

I am thinking something like:

  DELETE SINGLE FROM foo WHERE f1 > 1000;

would be clearer. Similarly one could have:

  INSERT SINGLE INTO foo SELECT * from foo2;

and

  UPDATE SINGLE foo

You could even use SELECT SINGLE, but I am not sure where the use case is there where unique indexes are not sufficient.

Attached is a proof-of-concept patch (no docs, probably some actual code problems too) to implement this as a compile option:

=# create or replace function footest() returns void as $$
$# #strict_mode strict
$# begin
$#   -- not allowed to delete more than one row
$#   delete from foo where f1 > 100;
$# end$$ language plpgsql;
CREATE FUNCTION
=# select footest();
ERROR: query processed more than one row
CONTEXT: PL/pgSQL function footest() line 5 at SQL statement

Now while I think this is a step in the right direction, I do have a couple of problems with this patch:

1) I'm not sure what would be the correct behaviour with EXECUTE. I'm tempted to just leave EXECUTE alone, as it has slightly different rules anyway.

2) If you're running in strict mode and you want to insert/update/delete more than one row, things get a bit uglier; a wCTE would work for some cases. If strict mode doesn't affect EXECUTE (see point 1 above), that could work too. Or maybe there could be a new command which runs a query, discards the results and ignores the number of rows processed.

Yeah, I am worried about this one. I am concerned that if you can't disable it on a statement-by-statement basis, then you have a problem where you end up removing the mode from the function, and then it becomes a lot harder for everyone maintaining the function to have a clear picture of what is going on. I am further worried that hacked, ugly ways around it will introduce plenty of other maintenance pain points that will be worse than what you are solving.
I'll be adding this to the open commitfest in hopes of getting some feedback on this idea (I'm prepared to hear a lot of "you're crazy!"), but feel free to comment away any time you please.

Well, I don't know if my feedback above is helpful, but there it is ;-)

Regards, Marko Tiikkaja

[1]: http://www.postgresql.org/message-id/510bf731.5020...@gmx.net

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Best Wishes, Chris Travers http://www.2ndquadrant.com PostgreSQL Services, Training, and Support
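The wCTE workaround mentioned in point 2 above might look something like the following sketch. Note that `#strict_mode strict` is from the proof-of-concept patch, not from any released PostgreSQL version, and `purge_foo` is a hypothetical function name:

```sql
-- Sketch: inside a function compiled with the proposed strict mode,
-- a multi-row DELETE can be pushed into a writable CTE; the outer
-- SELECT then always produces exactly one row, so strict mode is happy.
CREATE OR REPLACE FUNCTION purge_foo() RETURNS bigint AS $$
#strict_mode strict
DECLARE
  _deleted bigint;
BEGIN
  WITH gone AS (
    DELETE FROM foo WHERE f1 > 1000 RETURNING 1
  )
  SELECT count(*) INTO STRICT _deleted FROM gone;  -- exactly one row
  RETURN _deleted;
END
$$ LANGUAGE plpgsql;
```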
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
On Monday, July 08, 2013 5:16 PM Andres Freund wrote:
On 2013-07-08 17:10:43 +0530, Amit Kapila wrote:
On Monday, July 08, 2013 4:26 PM Andres Freund wrote:
On 2013-07-08 16:17:54 +0530, Hari Babu wrote:

+ This utility can also be used to decide whether backup is required or not when the data page
+ in old-master precedes the last applied LSN in old-standby (i.e., new-master) at the
+ moment of the failover.
+ </para>
+ </refsect1>

I don't think this is safe in any interesting set of cases. Am I missing something?

No, you are not missing anything. It can only be used to find the max LSN in a database, which can avoid further corruption.

Why is the patch submitted documenting it as a use-case then? I find it rather scary if the *patch authors* document a known unsafe use case as one of the known use-cases.

I understand the problem which can occur with the specified use case. I have removed the wrong use case specified above. Thanks for the review; please find the updated patch attached to this mail.

The patch was not compiling on Windows, so I have made the following changes:
a. updated the patch to fix the Windows build
b. a few documentation changes in pg_resetxlog.sgml for a spelling mistake and a minor line change
c. corrected the Copyright year in the file pg_computemaxlsn.c

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com

pg_computemaxlsn_v11.patch Description: Binary data
Re: [HACKERS] [PoC] pgstattuple2: block sampling to reduce physical read
(2013/07/23 20:02), Greg Smith wrote:
On 7/23/13 2:16 AM, Satoshi Nagayasu wrote:

I've been working on a new pgstattuple function to allow block sampling [1] in order to reduce block reads while scanning a table. A PoC patch is attached.

Take a look at all of the messages linked in https://commitfest.postgresql.org/action/patch_view?id=778 Jaime and I tried to do what you're working on then, including a random block sampling mechanism modeled on the stats_target mechanism. We didn't do that as part of pgstattuple though, which was a mistake. Noah created some test cases as part of his thorough review that were not computing the correct results. Getting the results correct for all of the various types of PostgreSQL tables and indexes ended up being much harder than the sampling part. See http://www.postgresql.org/message-id/20120222052747.ge8...@tornado.leadboat.com in particular for that.

Thanks for the info. I have read the previous discussion. I'm looking forward to seeing more feedback on this approach, in terms of design and performance improvement. So, I have submitted this for the next CF.

This new function, pgstattuple2(), samples only 3,000 blocks (which amounts to 24MB) from the table at random, and estimates several parameters of the entire table.

There should be an input parameter to the function for how much sampling to do, and if it's possible to make the scale for it look like ANALYZE, that's helpful too. I have a project for this summer that includes reviving this topic and making sure it works on some real-world systems. If you want to work on this too, I can easily combine that project into what you're doing.

Yeah, I'm interested in that. Is there something that can be shared?

Regards, -- Satoshi Nagayasu sn...@uptime.jp Uptime Technologies, LLC. http://www.uptime.jp
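As described above, the PoC function would be called like plain pgstattuple but would read only a bounded sample. A sketch of the intended usage (the function name comes from the proposed patch, not a released API; `pgbench_accounts` is just an example table):

```sql
-- pgstattuple2() per the PoC patch: read a random sample of at most
-- 3,000 blocks and scale the measured counts up to the whole table,
-- instead of reading every block as pgstattuple() does.
SELECT * FROM pgstattuple2('pgbench_accounts');
```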
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Fri, Sep 13, 2013 at 2:59 PM, Peter Geoghegan p...@heroku.com wrote:

I would suggest letting those other individuals speak for themselves too. Particularly if you're going to name someone who is on vacation like that.

You seem to be under the impression that I'm mentioning Tom's name, or Andres's, because I need to win some kind of an argument. I don't.

We're not going to accept a patch that uses lwlocks in the way that you are proposing.

I mean, if we do the promise tuple thing, and there are multiple unique indexes, what happens when an inserter needs to block pending the outcome of another transaction? They had better go clean up the promise tuples from the other unique indexes that they're trying to insert into, because they cannot afford to hold value locks for a long time, no matter how they're implemented.

As Andres already pointed out, this is not correct. Just to add to what he said, we already have long-lasting value locks in the form of SIREAD locks. SIREAD locks can exist at different levels of granularity, but one of those levels is index-page-level granularity, where they have the function of guarding against concurrent insertions of values that would fall within that page, which just so happens to be the same thing you want to do here. The difference between those locks and what you're proposing here is that they are implemented differently. That is why those were acceptable and this is not.

That could take much longer than just releasing a shared buffer lock, since for each unique index the promise tuple must be re-found from scratch. There are huge issues with additional complexity and bloat. Oh, and now your lightweight locks aren't so lightweight any more.

Yep, totally agreed.

If you simply lock the buffer, or take some other action which freezes out all concurrent modifications to the page, then re-finding the lock is much simpler.
On the other hand, it's much simpler precisely because you've reduced concurrency to the degree necessary to make it simple. And reducing concurrency is bad.

Similarly, complexity and bloat are not great things taken in isolation, but many of our existing locking schemes are already very complex. Tuple locks result in a complex jig that involves locking the tuple via the heavyweight lock manager, performing a WAL-logged modification to the page, and then releasing the lock in the heavyweight lock manager. As here, that is way more expensive than simply grabbing and holding a share-lock on the page. But we get a number of important benefits out of it. The backend remains interruptible while the tuple is locked, the protocol for granting locks is FIFO to prevent starvation, we don't suppress page eviction while the lock is held, we can simultaneously lock arbitrarily large numbers of tuples, and deadlocks are detected and handled cleanly. If those requirements were negotiable, we would surely have negotiated them away already, because the performance benefits would be immense.

If the value locks were made interruptible through some method, such as the promise tuples approach, does that really make deadlocking acceptable?

Yes. It's not possible to prevent all deadlocks. It IS possible to make sure that they are properly detected and that precisely one of the transactions involved is rolled back to resolve the deadlock.

You can hardly compare a buffer's LWLock with a system one that protects critical shared memory structures. We're talking about a shared lock on a single btree leaf page per unique index involved in upserting.

Actually, I can and I am. Buffers ARE critical shared memory structures.

A further problem is that a backend which holds even one lwlock can't be interrupted.
We've had this argument before, and it seems that you don't think that non-interruptibility is a problem, but it is project policy to allow for timely interrupts in all parts of the backend, and we're not going to change that policy for this patch.

I don't think non-interruptibility is a problem? Really, do you think that this kind of inflammatory rhetoric helps anybody? I said nothing of the sort. I recall saying something about an engineering trade-off. Of course I value interruptibility.

I don't see what's inflammatory about that statement. The point is that this isn't the first time you've proposed a change which would harm interruptibility, and it isn't the first time I've objected on precisely that basis. Interruptibility is not a nice-to-have that we can trade away from time to time; it's essential and non-negotiable.

If you're concerned about non-interruptibility, consider XLogFlush(). That does rather a lot of work with WALWriteLock exclusively locked. On a busy system, some backend is very frequently going to experience a non-interruptible wait for the duration of however long it takes to write and flush perhaps a whole segment. All other flushing backends are stuck in non-interruptible waits waiting for that backend to
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
I haven't read the patch, and the btree code is an area I really don't know, so take this for what it's worth.

It seems to me that the nature of the problem is that there will unavoidably be a nexus between the two parts of the code here. We can try to isolate it as much as possible, but we're going to need a bit of a compromise.

I'm imagining a function that takes two target heap buffers and a btree key. It would descend the btree and, holding the leaf page lock, do a try_lock on the heap pages. If it fails to get the locks, then it releases whatever it got and returns for the heap update to find new pages and try again.

This still leaves the potential problem with page splits, and I assume it would still be tricky to call it without unsatisfactorily mixing executor and btree code. But that's as far as I got.

-- greg
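The function described above could be sketched roughly as follows. This is pseudocode only; the name and structure are invented here for illustration, and nothing like it exists in the tree:

```
btree_lock_value_and_heap(index, key, heapbuf1, heapbuf2):
    descend the btree to the leaf page covering key; hold the leaf lock
    for buf in (heapbuf1, heapbuf2):
        if not try_lock(buf):          # conditional acquire, never waits
            release the leaf lock and any heap locks taken so far
            return RETRY               # caller finds new pages, tries again
    return OK                          # leaf page and both heap pages locked
```

The conditional try_lock is what avoids deadlock between the btree descent order and the heap-page locking order, at the cost of occasional retries.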
Re: [HACKERS] Proposal: PL/PgSQL strict_mode
On 14/09/2013 06:53, chris travers wrote:

I am going to suggest that STRICT is semantically pretty far from what is meant here in common speech. I think STRICT here would be confusing. This would be really pretty severe for people coming from Perl or MySQL backgrounds.

The name of the feature doesn't really matter. I chose STRICT because the behaviour is somewhat similar to INTO .. STRICT.

May I suggest SINGLE as a keyword instead? It might be worth having it attached to INSERT, UPDATE, and DELETE statements.

I already suggested the "attach a keyword to every statement" approach, and got shot down quite quickly.

Regards, Marko Tiikkaja
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
This patch needs documentation. At a minimum, the new calls_underest field needs to be listed in the description of pg_stat_statements.

I have attached a version which includes documentation. pg_stat_statements-identification-v4.patch.gz http://postgresql.1045698.n5.nabble.com/file/n5770844/pg_stat_statements-identification-v4.patch.gz

Pardon for not following the discussion: what exactly does the calls_underest field mean? I couldn't decipher it from the patch. What can an admin do with the value? How does it compare with just bumping up pg_stat_statements.max?

Paraphrasing the documentation: there is a possibility that queries which are tracked intermittently could have their statistics silently reset. The calls_underest field gives an indication that a query has been tracked intermittently and consequently has a higher probability of erroneous statistics. Queries tracked constantly will have a zero value for calls_underest, indicating zero probability of erroneous statistics. A greater value of calls_underest indicates a greater degree of inconsistent tracking, which in turn means a greater possibility of erroneous statistics due to statistics being reset while the query was not being tracked.

Increasing pg_stat_statements.max reduces the possibility of a query being tracked intermittently, but does not address the problem of indicating the probability of erroneous statistics if the query is indeed being tracked intermittently. I do not believe the admin can influence the value of calls_underest, as it is not a GUC parameter.

We have a need to see this patch committed, hence the revived interest in this thread.

regards Sameer

-- View this message in context: http://postgresql.1045698.n5.nabble.com/pg-stat-statements-calls-under-estimation-propagation-tp5738128p5770844.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
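Given the semantics described above, an admin might use the field like this. The calls_underest column comes from the proposed patch, not from any released pg_stat_statements, so this is only a sketch:

```sql
-- Queries with a high calls_underest have been evicted and re-entered
-- most often, so their accumulated statistics are the least trustworthy.
SELECT query, calls, total_time, calls_underest
FROM pg_stat_statements
ORDER BY calls_underest DESC, calls DESC
LIMIT 20;
```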
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Peter == Peter Eisentraut pete...@gmx.net writes:

Peter> Please fix compiler warnings:

Someone should do the same in WaitForBackgroundWorkerStartup so that building with -Werror works. New patch coming shortly.

-- Andrew.
Re: [HACKERS] record identical operator
On 2013-09-13 19:20:11 -0700, Kevin Grittner wrote:
Andres Freund and...@2ndquadrant.com wrote:

Not one that's dependent on padding bytes, null bitmaps that can or cannot be present, and such.

Can you provide an example of where that's an issue with this patch?

I haven't yet tested your patch, but what I am talking about is that e.g.:

  SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3];

obviously should be true. But both arrays don't have the same binary representation, since the former has a null bitmap and the latter does not. So, if you had a composite type like (int4[]) and compared that without invoking operators, you'd return false in some cases because of the null bitmaps.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] unaccent module - two params function should be immutable
2013/9/11 Bruce Momjian br...@momjian.us

On Tue, Feb 19, 2013 at 08:30:29AM +0100, Pavel Stehule wrote:

Hello

There was a proposal to change the function's flag to immutable, so it can be used in indexes:

  CREATE FUNCTION unaccent(regdictionary, text)
    RETURNS text
    AS 'MODULE_PATHNAME', 'unaccent_dict'
    LANGUAGE C STABLE STRICT;

is there any progress?

I have developed the attached patch based on your suggestion. I did not see anything in the code that would make it STABLE, except a lookup of a dictionary library:

  dictOid = get_ts_dict_oid(stringToQualifiedNameList("unaccent"), false);

Yes, it is a risk, but it is similar to time zones and other external data.

Regards Pavel

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
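The point of marking the two-argument unaccent IMMUTABLE is to allow expression indexes like the following sketch (the `people` table and `name` column are hypothetical; this requires the unaccent extension and the flag change discussed above, since only IMMUTABLE functions may appear in index expressions):

```sql
CREATE EXTENSION IF NOT EXISTS unaccent;

-- With unaccent(regdictionary, text) marked IMMUTABLE, it can be
-- used in an expression index:
CREATE INDEX people_name_unaccented_idx
    ON people (unaccent('unaccent'::regdictionary, name));

-- A query written with the same expression can then use the index:
SELECT * FROM people
WHERE unaccent('unaccent'::regdictionary, name) = 'Jose';
```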
Re: [HACKERS] review: pgbench progress report improvements
2013/9/13 Fabien COELHO coe...@cri.ensmp.fr

Hello,

About patch eols:

  postgresql> patch -p1 < ../pgbench-measurements-v2.patch
  patching file contrib/pgbench/pgbench.c
  patching file doc/src/sgml/pgbench.sgml

it can depend on the OS. I did tests on Fedora 14, and to patch without warnings I had to use the dos2unix tool.

Hmmm. I use a Linux Ubuntu laptop, so generating DOS end of lines is unlikely if it is not there at the beginning. Running dos2unix on the patch file locally does not seem to change anything. So I assume that the patch encoding was changed somewhere along the path you used to get it.

It is possible, but this is only a minor issue.

Pavel

-- Fabien.
Re: [HACKERS] review: pgbench progress report improvements
2013/9/12 Fabien COELHO coe...@cri.ensmp.fr

Hello Pavel,

Thanks for your review.

* patched with minor warning
* compiles cleanly
* zero impact on PostgreSQL server functionality
* it does what was in the proposal
** change 5sec progress as default (instead of no progress)
** finalize rate limit support - fixes the latency calculation

Just a point about the motivation: the rationale for having a continuous progress report is that benchmarking is subject to possibly long warmup times, and thus a test may have to run for hours to be significant. I find running a command for hours without any hint about what is going on quite annoying.

* code is clean
* documentation is included
* there are no voices against this patch, and this patch increases pgbench's usability

I have only one question. When I tested this patch with throttling, I got very similar values of lag.

Yep. That is just good!

What is the sense, or what is the semantics, of this value?

The lag measures the health of the stochastic process. Actually, it measures how far behind schedule the clients are when performing throttled transactions. If it were to increase, that would mean that something is amiss, possibly not enough client threads, or other issues. If it is small, then all is well.

It is not documented in detail.

It is documented in the section about the --rate option, see http://www.postgresql.org/docs/devel/static/pgbench.html

ok, I see it now. So this patch is ready for commit.

Regards Pavel

Should this value be printed in this form on every row? We can print some warning when lag is higher than latency instead?

Hmmm... what is important is when the lag changes values. Generally one would indeed expect it to be smaller than the latency, but that is not really possible when transactions are very fast, say under -S with read-only queries that hit the memory cache.
Also, the problem with printing warnings is that it changes the output format, and it seems to me more useful to print the value, so that it can be processed automatically and simply.

Also, from a remote client perspective, say a web application, the overall latency is the lag plus the transaction latency: you first wait to get through the database (lag), and then you can perform your transaction (latency).

Or we can use this value, but it should be better documented, please.

Is the documentation pointed to above enough?

-- Fabien.
[HACKERS] GUC for data checksums
Attached is a small patch to add a new GUC to report whether data checksums for a particular cluster are enabled. The only way to get this info, AFAIK, is to look into pg_control and the version number used, but I'd welcome a way to access this remotely, too. If there aren't any objections I'll add this to the CF.

-- Thanks Bernd

data_checksums_guc.patch Description: Binary data
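With such a read-only GUC in place, checking the setting remotely would be as simple as the following sketch (the parameter name `data_checksums` is assumed from the proposal, not confirmed here):

```sql
-- A read-only GUC exposing the cluster's checksum state could be
-- queried over any ordinary connection:
SHOW data_checksums;

-- or, equivalently, from within a query:
SELECT current_setting('data_checksums');
```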
Re: [HACKERS] record identical operator
Andres Freund and...@2ndquadrant.com wrote:

what I am talking about is that e.g.: SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3]; obviously should be true.

The patch does not change the behavior of the = operator for any type under any circumstances.

But both arrays don't have the same binary representation since the former has a null bitmap, the latter not. So, if you had a composite type like (int4[]) and would compare that without invoking operators you'd return something false in some cases because of the null bitmaps.

Not for the = operator. The new identical operator would find them to not be identical, though. Since the new operator is only for the record type, I need to wrap the values in your example:

test=# SELECT row((ARRAY[1,2,3,NULL])[1:3])::record
test-#      = row(ARRAY[1,2,3])::record;
 ?column?
----------
 t
(1 row)

test=# SELECT row((ARRAY[1,2,3,NULL])[1:3])::record
test-#      === row(ARRAY[1,2,3])::record;
 ?column?
----------
 f
(1 row)

Or, to borrow from the citext example using arrays:

test=# CREATE TABLE array_table (
test(#   id serial primary key,
test(#   nums int4[]
test(# );
CREATE TABLE
test=# INSERT INTO array_table (nums)
test-# VALUES ((ARRAY[1,2,3,NULL])[1:3]), (ARRAY[1,2,3]),
test-#        (ARRAY[1,2,3]), (NULL), (NULL);
INSERT 0 5
test=# CREATE MATERIALIZED VIEW array_matview AS
test-# SELECT * FROM array_table;
SELECT 5
test=# CREATE UNIQUE INDEX array_matview_id
test-# ON array_matview (id);
CREATE INDEX
test=# select * from array_matview;
 id |  nums
----+---------
  1 | {1,2,3}
  2 | {1,2,3}
  3 | {1,2,3}
  4 |
  5 |
(5 rows)

Note that the on-disk representation of the row where id = 1 differs from the on-disk representation where id in (2,3), both in the table and the matview.

test=# SELECT *
test-# FROM array_matview m
test-# FULL JOIN array_table t ON (t.id = m.id AND t === m)
test-# WHERE t.id IS NULL OR m.id IS NULL;
 id | nums | id | nums
----+------+----+------
(0 rows)

... so the query looking for work for the RMVC statement finds nothing to do.
test=# UPDATE array_table SET nums = (ARRAY[1,2,3,NULL])[1:3]
test-# WHERE id between 1 and 2;
UPDATE 2

Now we have added an unnecessary bitmap to the on-disk storage of the value where id = 2.

test=# SELECT *
test-# FROM array_matview m
test-# FULL JOIN array_table t ON (t.id = m.id AND t === m)
test-# WHERE t.id IS NULL OR m.id IS NULL;
 id |  nums   | id |  nums
----+---------+----+---------
    |         |  2 | {1,2,3}
  2 | {1,2,3} |    |
(2 rows)

... and the query sees that they differ.

test=# REFRESH MATERIALIZED VIEW CONCURRENTLY array_matview;
REFRESH MATERIALIZED VIEW
test=# SELECT *
test-# FROM array_matview m
test-# FULL JOIN array_table t ON (t.id = m.id AND t === m)
test-# WHERE t.id IS NULL OR m.id IS NULL;
 id | nums | id | nums
----+------+----+------
(0 rows)

The REFRESH causes them to match again, and later REFRESH runs won't see a need to do any work there unless the on-disk representation changes again.

As far as I can see, we have four choices:

(1) Never update values that are equal, even if they appear different to the users, as was demonstrated with the citext example.

(2) Require every data type which can be used in a matview to implement some new operator or function for "identical". Perhaps that could be mitigated to only implement it if equal values can have user-visible differences.

(3) Embed special cases into "record identical" tests for types known to allow multiple on-disk representations which have no user-visible differences.

(4) Base the need to update a matview column on whether its on-disk representation is identical to what a new run of the defining query would generate. If this causes performance problems for use of a given type in a matview, one possible solution would be to modify that particular type to use a canonical format when storing a value into a record. For example, storing an array which has a bitmap of null values even though there are no nulls in the array could strip the bitmap as it is stored to the record.

Currently we are using (4).
I only included (3) for completeness; even just typing it as a hypothetical made me want to take a shower. (1) seems pretty horrid, too. (2) isn't evil, exactly, but any types which allowed user-visible differences in equal values would not exhibit correct behavior in matviews until and unless an "identical" operator was added. It might also perform noticeably worse than (4). Option (4) has the advantage of showing correct logical behavior in all types immediately, and restricting any performance fixes to the types with the inconsistent storage formats.

-- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] git apply vs patch -p1
Folks,

Lately I've been running into a lot of reports of false conflicts reported by git apply. The most recent one was the points patch, which git apply rejected for completely fictitious reasons (it claimed that the patch was trying to create a new file where a file already existed, which it wasn't).

I think we should modify the patch review and developer instructions to recommend always using patch -p1 (or -p0, depending), even if the patch was produced with git diff.

Thoughts?

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
On Sat, 2013-09-14 at 03:51 -0700, samthakur74 wrote:

We have a need to see this patch committed, hence the revived interest in this thread

You have added this email to the commit fest, but it contains no patch. Please add the email with the actual patch. Maybe the author should be given a chance to update the patches, though, because they are quite old.
[HACKERS] Assertions in PL/PgSQL
Hi,

Attached is a patch for supporting assertions in PL/PgSQL. These are similar to the Assert() backend macro: they can be disabled at compile time, but when enabled, abort execution if the passed expression is not true. A simple example:

CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
BEGIN
  DELETE FROM users WHERE users.username = delete_user.username;
  ASSERT FOUND;
END
$$ LANGUAGE plpgsql;

SELECT delete_user('mia');
ERROR: Assertion on line 4 failed
CONTEXT: PL/pgSQL function delete_user(text) line 4 at ASSERT

Again, I'll add this to the open commitfest, but feedback is greatly appreciated.

Regards, Marko Tiikkaja

*** a/doc/src/sgml/plpgsql.sgml
--- b/doc/src/sgml/plpgsql.sgml
***************
*** 3528,3533 ****
--- 3528,3596 ----
  RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id;
     </para>
    </note>

+  <sect2 id="plpgsql-assert">
+   <title>Assertions</title>
+
+   <para>
+    <literal>Assertions</literal> provide a way to check that the
+    internal state of a function is as expected.  For example:
+ <programlisting>
+ CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
+ BEGIN
+   DELETE FROM users WHERE users.username = delete_user.username;
+   ASSERT FOUND;
+ END
+ $$ LANGUAGE plpgsql;
+
+ SELECT delete_user('mia');
+ ERROR: Assertion on line 4 failed
+ CONTEXT: PL/pgSQL function delete_user(text) line 4 at ASSERT
+ </programlisting>
+
+    One could implement the equivalent functionality with a conditional
+    RAISE EXCEPTION statement, but assertions have two major differences:
+   <itemizedlist>
+    <listitem>
+     <para>
+      They're a lot faster to write than the equivalent
+      IF .. THEN RAISE EXCEPTION .. END IF constructs.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      They can be (and are by default) disabled in production
+      environments.  A disabled assertion only incurs a negligible
+      compile-time overhead and no execution time overhead, so you
+      can write complex checks for development environments without
+      having to worry about performance.
+     </para>
+    </listitem>
+   </itemizedlist>
+  </para>
+
+  <para>
+   The configuration parameter <varname>plpgsql.enable_assertions</varname>
+   controls whether assertions are enabled.  Note that the value of
+   this parameter only affects subsequent compilations of
+   <application>PL/pgSQL</application> functions, but not statements
+   already compiled in the current session.
+  </para>
+
+  <para>
+   It is also possible to enable assertions in a single function
+   (possibly overriding the global setting) by using a compile
+   option:
+ <programlisting>
+ CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
+ #enable_assertions
+ BEGIN
+   DELETE FROM users WHERE users.username = delete_user.username;
+   ASSERT FOUND;
+ END
+ $$ LANGUAGE plpgsql;
+ </programlisting>
+  </para>
+ </sect2>
+
  </sect1>

  <sect1 id="plpgsql-trigger">

*** a/src/pl/plpgsql/src/pl_comp.c
--- b/src/pl/plpgsql/src/pl_comp.c
*************** do_compile(FunctionCallInfo fcinfo,
*** 351,356 ****
--- 351,357 ----
  	function->fn_cxt = func_cxt;
  	function->out_param_varno = -1;		/* set up for no OUT param */
  	function->resolve_option = plpgsql_variable_conflict;
+ 	function->enable_assertions = plpgsql_enable_assertions;

  	if (is_dml_trigger)
  		function->fn_is_trigger = PLPGSQL_DML_TRIGGER;

*************** plpgsql_compile_inline(char *proc_source)
*** 847,852 ****
--- 848,854 ----
  	function->fn_cxt = func_cxt;
  	function->out_param_varno = -1;		/* set up for no OUT param */
  	function->resolve_option = plpgsql_variable_conflict;
+ 	function->enable_assertions = plpgsql_enable_assertions;

  	plpgsql_ns_init();
  	plpgsql_ns_push(func_name);

*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***************
*** 133,138 ****
--- 133,140 ----
  static int exec_stmt_dynexecute(PLpgSQL_execstate *estate,
  				PLpgSQL_stmt_dynexecute *stmt);
  static int exec_stmt_dynfors(PLpgSQL_execstate *estate,
  				PLpgSQL_stmt_dynfors *stmt);
+ static int exec_stmt_assert(PLpgSQL_execstate *estate,
+ 				PLpgSQL_stmt_assert *stmt);

  static void plpgsql_estate_setup(PLpgSQL_execstate *estate,
  				PLpgSQL_function *func,

*************** exec_stmt(PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt)
*** 1466,1471 ****
--- 1468,1477 ----
  			rc = exec_stmt_close(estate, (PLpgSQL_stmt_close *) stmt);
  			break;

+ 		case PLPGSQL_STMT_ASSERT:
+ 			rc = exec_stmt_assert(estate, (PLpgSQL_stmt_assert *) stmt);
+ 			break;
+
Re: [HACKERS] Assertions in PL/PgSQL
On 14/09/2013 20:47, I wrote: These are similar to the Assert() backend macro: they can be disabled during compile time, but when enabled, abort execution if the passed expression is not true. And by compile time here, I mean at the time when the PL/PgSQL function is compiled, not the postgres server binary. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
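For readers following along, the hand-rolled equivalent that the proposed ASSERT is meant to replace looks roughly like this (a sketch in standard PL/pgSQL; the users table and the message text are illustrative):

```sql
CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
BEGIN
    DELETE FROM users WHERE users.username = delete_user.username;
    -- The manual spelling of "ASSERT FOUND": raise if the DELETE
    -- matched no rows.
    IF NOT FOUND THEN
        RAISE EXCEPTION 'assertion failed: no user named %', username;
    END IF;
END
$$ LANGUAGE plpgsql;
```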
Re: [HACKERS] record identical operator
On 2013-09-14 11:25:52 -0700, Kevin Grittner wrote:

Andres Freund and...@2ndquadrant.com wrote:

what I am talking about is that e.g.:

SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3];

obviously should be true.

The patch does not change the behavior of the = operator for any type under any circumstances.

Yes, sure. I wasn't thinking you would. But both arrays don't have the same binary representation since the former has a null bitmap and the latter does not. So, if you had a composite type like (int4[]) and would compare that without invoking operators, you'd return false in some cases because of the null bitmaps.

Not for the = operator. The new identical operator would find them to not be identical, though.

Yep. And I think that's a problem if exposed to SQL. People won't understand the hazards and end up using it because it's faster or some such.

Since the new operator is only for the record type, I need to wrap the values in your example:

Yes.

The REFRESH causes them to match again, and later REFRESH runs won't see a need to do any work there unless the on-disk representation changes again.

Yes, I understand that the matview code itself will just perform superfluous work. We use such comparisons in other parts of the code similarly.

As far as I can see, we have four choices:

(1) Never update values that are equal, even if they appear different to the users, as was demonstrated with the citext example.

I think introducing a noticeable amount of infrastructure for this just because of citext is a bad idea. At some point we need to replace citext with proper case-insensitive collation support - then it really might become necessary.

(2) Require every data type which can be used in a matview to implement some new operator or function for identical. Perhaps that could be mitigated to only implement it if equal values can have user-visible differences.

That basically would require adding a new member to btree opclasses that btrees don't need themselves... Hm.
(3) Embed special cases into record identical tests for types known to allow multiple on-disk representations which have no user-visible differences.

I think this is a complete no-go: a) I don't foresee that we know of all these cases, and b) it wouldn't be extensible.

Oh. Now that I've read further, I see you feel the same. Good ;)

(4) Base the need to update a matview column on whether its on-disk representation is identical to what a new run of the defining query would generate. If this causes performance problems for use of a given type in a matview, one possible solution would be to modify that particular type to use a canonical format when storing a value into a record. For example, storing an array which has a bitmap of null values even though there are no nulls in the array could strip the bitmap as it is stored to the record.

If matview refreshes weren't using plain SQL and thus wouldn't require exposing that operator to SQL, I wouldn't have a problem with this... There's the ungodly ugly choice of having a matview_equal function (or operator) that checks if we're doing a refresh atm...

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
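The representational issue under discussion can be seen from plain SQL, reusing the example from earlier in the thread (per that discussion, the slice result carries a null bitmap internally while the literal does not):

```sql
-- Both sides contain the elements {1,2,3}; the = operator compares
-- contents, not on-disk representation, so this is true.
SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3];  -- true

-- The proposed "identical" comparison would report these as different,
-- since only one of the two values carries a null bitmap.
```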
Re: [HACKERS] git apply vs patch -p1
On 09/14/2013 02:37 PM, Josh Berkus wrote:

Folks, Lately I've been running into a lot of reports of false conflicts reported by git apply. The most recent one was the points patch, which git apply rejected for completely fictitious reasons (it claimed that the patch was trying to create a new file where a file already existed, which it wasn't). I think we should modify the patch review and developer instructions to recommend always using patch -p1 (or -p0, depending), even if the patch was produced with git diff. Thoughts?

FWIW that's what I invariably use. You do have to be careful to git-add/git-rm any added/deleted files, which git-apply does for you (as well as renames) - I've been caught by that a couple of times.

cheers

andrew
Re: [HACKERS] Proposal: PL/PgSQL strict_mode
(Oops, just realized I only replied to a part of your message. I'm blaming it on my lack of sleep.)

On 14/09/2013 06:53, chris travers wrote:

2) If you're running in strict mode and you want to insert/update/delete more than one row, things get a bit uglier; a wCTE would work for some cases. If strict mode doesn't affect EXECUTE (see point 1 above), that could work too. Or maybe there could be a new command which runs a query, discards the results and ignores the number of rows processed.

Yeah, I am worried about this one. I am concerned that if you can't disable on a statement by statement basis, then you have a problem where you end up removing the mode from the function and then it becomes a lot harder for everyone maintaining the function to have a clear picture of what is going on. I am further worried that hacked ugly code ways around it will introduce plenty of other maintenance pain points that will be worse than what you are solving.

I completely agree. If you can't turn the option on globally, you're bound to hit problems at some point. My thinking currently is that this mode would require adding a new type of statement (I'd call it PERFORM, but I know that one's been taken, so please don't get hung up on that):

PERFORM UPDATE foo SET ..;

or

PERFORM SELECT foo(id) FROM ..;

This statement would allow you to run any statement in strict mode without hitting the "more than one row processed" error message.

I'll be adding this to the open commitfest in hopes of getting some feedback on this idea (I'm prepared to hear a lot of "you're crazy!"), but feel free to comment away any time you please.

Well, I don't know if my feedback above is helpful, but there it is ;-)

Thanks for your input!

Regards,
Marko Tiikkaja
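For comparison, the workaround available today checks the row count by hand; the PERFORM form at the end is the hypothetical syntax from this proposal, not something PostgreSQL currently accepts (table and column names are illustrative):

```sql
CREATE FUNCTION archive_user(uid int) RETURNS VOID AS $$
DECLARE
    _rows bigint;
BEGIN
    UPDATE users SET archived = true WHERE id = uid;
    -- Manual one-row check, as described earlier in the thread.
    GET DIAGNOSTICS _rows = ROW_COUNT;
    IF _rows <> 1 THEN
        RAISE EXCEPTION 'expected 1 row, got %', _rows;
    END IF;
END
$$ LANGUAGE plpgsql;

-- Hypothetical, under this proposal: run a multi-row statement inside a
-- strict-mode function without triggering the one-row check.
-- PERFORM UPDATE users SET archived = true WHERE last_login < '2010-01-01';
```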
Re: [HACKERS] git apply vs patch -p1
On 2013-09-14 15:03:52 -0400, Andrew Dunstan wrote:

On 09/14/2013 02:37 PM, Josh Berkus wrote:

Folks, Lately I've been running into a lot of reports of false conflicts reported by git apply. The most recent one was the points patch, which git apply rejected for completely fictitious reasons (it claimed that the patch was trying to create a new file where a file already existed, which it wasn't). I think we should modify the patch review and developer instructions to recommend always using patch -p1 (or -p0, depending), even if the patch was produced with git diff. Thoughts?

FWIW that's what I invariably use. You do have to be careful to git-add/git-rm any added/deleted files, which git-apply does for you (as well as renames) - I've been caught by that a couple of times.

git reset?

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] json docs fixup
While writing slides for pgopen next week, I noticed that the JSON docs on json_populate_record and json_populate_recordset contain this sentence: A column may only be specified once. IIRC we removed that restriction during development, so unless there is a squawk I am going to simply remove that sentence where it occurs. Or should we say something like: If a column is specified more than once, the last value is used. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
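A quick way to check which wording matches the actual behaviour (a sketch; the type and values are illustrative, and the last-value-wins outcome is the one the suggested wording above describes):

```sql
CREATE TYPE kv AS (a int, b text);

-- "a" is specified twice; per the suggested wording, the last value wins.
SELECT * FROM json_populate_record(null::kv, '{"a": 1, "a": 2, "b": "x"}');
-- expected under that reading: a = 2, b = x
```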
Re: [HACKERS] Assertions in PL/PgSQL
On Sat, Sep 14, 2013 at 1:52 PM, Marko Tiikkaja ma...@joh.to wrote:

On 14/09/2013 20:47, I wrote:

These are similar to the Assert() backend macro: they can be disabled during compile time, but when enabled, abort execution if the passed expression is not true.

Hi,

That's very good, thanks.

And by "compile time" here, I mean at the time when the PL/PgSQL function is compiled, not the postgres server binary.

And "compile time" means when the function is created or replaced? Or the first time it is used?

If the second, why not have a plpgsql.enable_assert variable, or a directive you can use per function? Then I can keep the ASSERTs and activate them by replacing the function.

--
Jaime Casanova	www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566	Cell: +593 987171157
Re: [HACKERS] git apply vs patch -p1
On 09/14/2013 03:08 PM, Andres Freund wrote:

On 2013-09-14 15:03:52 -0400, Andrew Dunstan wrote:

On 09/14/2013 02:37 PM, Josh Berkus wrote:

Folks, Lately I've been running into a lot of reports of false conflicts reported by git apply. The most recent one was the points patch, which git apply rejected for completely fictitious reasons (it claimed that the patch was trying to create a new file where a file already existed, which it wasn't). I think we should modify the patch review and developer instructions to recommend always using patch -p1 (or -p0, depending), even if the patch was produced with git diff. Thoughts?

FWIW that's what I invariably use. You do have to be careful to git-add/git-rm any added/deleted files, which git-apply does for you (as well as renames) - I've been caught by that a couple of times.

git reset?

Yes, of course you can roll back as long as you haven't published your commits. But it's a nuisance.

cheers

andrew
[HACKERS] PL Code Archive Proposal
Hi,

In my efforts to allow PostgreSQL users to be able to fully use the server even when not granted file system level access to it, came the question of PL lib code management. Where do you manage the library code you need, those parts of your code that are not exposed at the SQL level?

For example, when doing plpython you still need to install files on the server's file system each time you want to be able to import packages from your stored procedures. The Python community seems to have finally solved that problem and now offers a facility (called wheel) comparable to Java .jar files, see details at https://pypi.python.org/pypi/wheel. I don't know about the Perl and TCL communities.

When thinking about a way to benefit from those facilities in our PL infrastructure, we would need to be able to upload an archive file in a suitable format and I guess register per-PL handlers for those archives: storage and loading have to be considered.

CREATE ARCHIVE schema.name
    LANGUAGE plpythonu
    AS $$ binary blob here, maybe base64 encoded, PL dependent $$;

The standard saith that in the case of PL/Java a specific function's classpath is composed of all those JAR archives that you've been registering against the same schema as where you put the function in. If we choose to follow that model then any function created in the same schema and language as any given archive is going to be able to import things from it: the archive will be LOADed (whatever that means in your PL of choice) when the function is compiled and used.

Now, uploading a binary file and storing it in $PGDATA looks a lot like what we're still talking about for the DSO modules bits. So here's another way to think about it, where we don't need any language feature:

CREATE ARCHIVE schema.name
    LANGUAGE plpythonu
    WITH 'path/to/component.py' AS $$ … $$,
         'path/to/__init__.py' AS $$ … $$,
         …;

That would just upload given text/plain contents on the file system and arrange for the language runtime to be able to use it.
For Python, that means tweaking PYTHONPATH.

Regards,
--
Dimitri Fontaine	http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] information schema parameter_default implementation
Here is an updated patch which fixes the bug you have pointed out.

On Thu, 2013-01-31 at 18:59 +0500, Ali Dar wrote:

I checked out your patch. There seems to be an issue when we have OUT parameters after the DEFAULT values.

Fixed.

Some other minor observations:

1) Some variables are not aligned in pg_get_function_arg_default().

Are you referring to indentation issues? I think the indentation is good, but pgindent will fix it anyway.

2) I found the following check a bit confusing, maybe you can make it better: if (!argmodes || argmodes[i] == PROARGMODE_IN || argmodes[i] == PROARGMODE_INOUT || argmodes[i] == PROARGMODE_VARIADIC)

Factored that out into a separate helper function.

2) inputargn can be assigned in declaration.

I'd prefer to initialize it close to where it is used.

3) Function level comment for pg_get_function_arg_default() is missing.

I think the purpose of the function is clear.

4) You can also add comments inside the function, for example the comment for the line: nth = inputargn - 1 - (proc->pronargs - proc->pronargdefaults);

Suggestion?

5) I think the line added in the documentation (information_schema.sgml) is very long. Consider revising. Maybe change from: "The default expression of the parameter, or null if none or if the function is not owned by a currently enabled role." to: "The default expression of the parameter, or null if none was specified. It will also be null if the function is not owned by a currently enabled role." I don't know what exactly you mean by "function is not owned by a currently enabled role"?

I think this style is used throughout the documentation of the information schema. We need to keep the descriptions reasonably compact, but I'm willing to entertain other opinions.
From 36f25fa2ec96879bda1993818db9a9632d8ac340 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pete...@gmx.net>
Date: Sat, 14 Sep 2013 15:55:54 -0400
Subject: [PATCH] Implement information_schema.parameters.parameter_default column

Reviewed-by: Ali Dar <ali.munir@gmail.com>
---
 doc/src/sgml/information_schema.sgml            |  9
 src/backend/catalog/information_schema.sql      |  9 +++-
 src/backend/utils/adt/ruleutils.c               | 72 +
 src/include/catalog/catversion.h                |  2 +-
 src/include/catalog/pg_proc.h                   |  2 +
 src/include/utils/builtins.h                    |  1 +
 src/test/regress/expected/create_function_3.out | 33 +++-
 src/test/regress/sql/create_function_3.sql      | 24 +
 8 files changed, 148 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index 22e17bb..22f43c8 100644
--- a/doc/src/sgml/information_schema.sgml
+++ b/doc/src/sgml/information_schema.sgml
@@ -3323,6 +3323,15 @@ <title><literal>parameters</literal> Columns</title>
        in future versions.)
        </entry>
      </row>
+
+     <row>
+      <entry><literal>parameter_default</literal></entry>
+      <entry><type>character_data</type></entry>
+      <entry>
+       The default expression of the parameter, or null if none or if the
+       function is not owned by a currently enabled role.
+      </entry>
+     </row>
    </tbody>
   </tgroup>
  </table>

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index c5f7a8b..fd706e3 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -1133,10 +1133,15 @@ CREATE VIEW parameters AS
            CAST(null AS sql_identifier) AS scope_schema,
            CAST(null AS sql_identifier) AS scope_name,
            CAST(null AS cardinal_number) AS maximum_cardinality,
-           CAST((ss.x).n AS sql_identifier) AS dtd_identifier
+           CAST((ss.x).n AS sql_identifier) AS dtd_identifier,
+           CAST(
+             CASE WHEN pg_has_role(proowner, 'USAGE')
+                  THEN pg_get_function_arg_default(p_oid, (ss.x).n)
+                  ELSE NULL END
+             AS character_data) AS parameter_default
     FROM pg_type t, pg_namespace nt,
-         (SELECT n.nspname AS n_nspname, p.proname, p.oid AS p_oid,
+         (SELECT n.nspname AS n_nspname, p.proname, p.oid AS p_oid, p.proowner,
                  p.proargnames, p.proargmodes,
                  _pg_expandarray(coalesce(p.proallargtypes, p.proargtypes::oid[])) AS x
          FROM pg_namespace n, pg_proc p

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 9a1d12e..5a05396 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -2265,6 +2265,78 @@ static char *generate_function_name(Oid funcid, int nargs,
 	return argsprinted;
 }

+static bool
+is_input_argument(int nth, const char *argmodes)
+{
+	return (!argmodes
+			|| argmodes[nth] == PROARGMODE_IN
+			|| argmodes[nth] == PROARGMODE_INOUT
+			|| argmodes[nth] == PROARGMODE_VARIADIC);
+}
+
+Datum
+pg_get_function_arg_default(PG_FUNCTION_ARGS)
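With the patch applied, the new column should be queryable like any other attribute of information_schema.parameters; a sketch (the function name and schema are illustrative, and the exact output text depends on the patch):

```sql
CREATE FUNCTION add_defaults(a int, b int DEFAULT 42) RETURNS int
    LANGUAGE sql AS 'SELECT a + b';

SELECT parameter_name, parameter_default
FROM information_schema.parameters
WHERE specific_schema = 'public'
  AND specific_name LIKE 'add_defaults%'
ORDER BY ordinal_position;
-- "a" should show a null parameter_default; "b" should show the
-- default expression, 42.
```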
[HACKERS] Where to load modules from?
Hi,

This topic comes back at every release, more often now that we have proper Extensions with the ability to dump & restore. Lately the guys from the Open Shift project (a Red Hat team) have asked for a way to load DSO module files from a user-owned directory. The way they make that safe is by using cgroups and SELinux, IIUC.

We can attack the problem in several ways:

- have an initdb switch to tweak the library path per cluster,

- have a superuser-only GUC to tweak the library path,

- consider on-disk extensions as templates and move their module files somewhere private in $PGDATA and load the code from there. That would allow OS upgrades not to impact running instances until they do ALTER EXTENSION UPDATE; and allowing co-existence of different versions of the same extension in different clusters of the same major version, and maybe in separate databases of the same cluster in some cases (depends on the extension's module specifics),

- do nothing, even though the current solution is clearly broken, as in not allowing us to answer several user needs and preventing us from implementing full support (e.g. base backups, hot standby) for extensions.

This proposal comes with no patch because I think we are able to understand it without that, so that it would only be a waste of everybody's time to attach code for a random solution on the list here to that email. Or consider that the fourth point is currently the only one addressed in this very proposal…

Regards,
--
Dimitri Fontaine	http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
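For context, the closest existing knob is the dynamic_library_path GUC, which already lets a superuser prepend directories to the module search path; the options above go beyond it. A config sketch (the directory is illustrative):

```
# postgresql.conf
# Colon-separated list; $libdir expands to the installation's
# standard module directory.
dynamic_library_path = '/srv/pg/cluster1/modules:$libdir'
```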
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 21:55, Jaime Casanova wrote:

On Sat, Sep 14, 2013 at 1:52 PM, Marko Tiikkaja ma...@joh.to wrote:

And by "compile time" here, I mean at the time when the PL/PgSQL function is compiled, not the postgres server binary.

And "compile time" means when the function is created or replaced? Or the first time it is used?

Uhh.. I have to admit, I'm not sure. I think this would be when you CREATE the function for the backend that created the function, and on the first call in other backends.

If the second, why not have a plpgsql.enable_assert variable, or a directive you can use per function? Then I can keep the ASSERTs and activate them by replacing the function.

The patch supports a compiler option to override the global option and enable it per function:

create function foof() returns void as $$
#enable_assertions
begin
  -- failure
  assert 1 > 2;
end
$$ language plpgsql;

Does this address your wishes?

Regards,
Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
Hello

There is a significant issue - a new reserved keyword. There is a high probability that a lot of users have functions named assert.

I like this functionality, but I dislike any compatibility break for a feature that can be implemented as an extension without any loss of compatibility or functionality.

So can you redesign this without a new keyword?

Regards

Pavel

2013/9/14 Marko Tiikkaja ma...@joh.to

Hi,

Attached is a patch for supporting assertions in PL/PgSQL. These are similar to the Assert() backend macro: they can be disabled during compile time, but when enabled, abort execution if the passed expression is not true. A simple example:

CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
BEGIN
    DELETE FROM users WHERE users.username = delete_user.username;
    ASSERT FOUND;
END
$$ LANGUAGE plpgsql;

SELECT delete_user('mia');
ERROR:  Assertion on line 4 failed
CONTEXT:  PL/pgSQL function delete_user(text) line 4 at ASSERT

Again, I'll add this to the open commitfest, but feedback is greatly appreciated.

Regards,
Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
On 14/09/2013 15:18, Marko Tiikkaja ma...@joh.to wrote:

On 2013-09-14 21:55, Jaime Casanova wrote:

On Sat, Sep 14, 2013 at 1:52 PM, Marko Tiikkaja ma...@joh.to wrote:

And by "compile time" here, I mean at the time when the PL/PgSQL function is compiled, not the postgres server binary.

And "compile time" means when the function is created or replaced? Or the first time it is used?

Uhh.. I have to admit, I'm not sure. I think this would be when you CREATE the function for the backend that created the function, and on the first call in other backends.

If the second, why not have a plpgsql.enable_assert variable, or a directive you can use per function? Then I can keep the ASSERTs and activate them by replacing the function.

The patch supports a compiler option to override the global option and enable it per function:

create function foof() returns void as $$
#enable_assertions
begin
  -- failure
  assert 1 > 2;
end
$$ language plpgsql;

Does this address your wishes?

That's exactly what I wanted, thanks.

--
Jaime Casanova
2ndQuadrant: Your PostgreSQL partner
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 22:24, Pavel Stehule wrote:

There is a significant issue - a new reserved keyword. There is a high probability that a lot of users have functions named assert.

Someone may prove me wrong here, but to me it looks like this would only prevent ASSERT from being used as the name of a PL/PgSQL variable. That's still a backwards compatibility break, but the case you were worried about should still work:

=# create function assert(boolean) returns boolean as $$ begin if $1 is not true then raise exception 'custom assert()'; end if; return true; end $$ language plpgsql;
CREATE FUNCTION
=# create function f() returns int as $$
$# begin
$#   assert false;
$#   perform assert(true);
$#   if assert(true) then
$#     perform assert(false);
$#   end if;
$# end
$# $$ language plpgsql;
CREATE FUNCTION
=# select f();
ERROR:  custom assert()
CONTEXT:  SQL statement "SELECT assert(false)"
PL/pgSQL function f() line 6 at PERFORM

I like this functionality, but I dislike any compatibility break for a feature that can be implemented as an extension without any loss of compatibility or functionality.

I don't see how this could be implemented as an extension, even if you write it in C. I think being able to turn assertions off in production with no execution time overhead is what justifies having this in-core. The nicer syntax isn't enough (compared to, say, PERFORM assert(..)). And if we're only breaking code for people who use assert as the variable name, I'd say we go for it.

So can you redesign this without a new keyword?

I don't see an easy way to do that, but I'm not too familiar with PL/PgSQL's parser so maybe I'm just missing something.

Regards,
Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 22:40, I wrote:

Someone may prove me wrong here, but to me it looks like this would only prevent ASSERT from being used as the name of a PL/PgSQL variable.

The comment at the beginning of pl_scanner.c seems to suggest the same.

Regards,
Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
2013/9/14 Marko Tiikkaja ma...@joh.to

On 2013-09-14 22:40, I wrote:

Someone may prove me wrong here, but to me it looks like this would only prevent ASSERT from being used as the name of a PL/PgSQL variable.

The comment at the beginning of pl_scanner.c seems to suggest the same.

Yes, there are not too many possibilities for how to do it. An effective implementation needs a special PL/pgSQL statement - but I am not 100% happy with the proposed syntax. We introduce a new special keyword just for one simple construct. Some languages have a generic PRAGMA keyword, so I would be much happier with something like

PRAGMA Assert(found);

It is much closer to Ada, and it allows reuse of the new keyword for any other usage in the future (your proposal is too simple, without the possibility of opening new doors in the future). And we can define some standard predefined asserts too - like Assert, AssertNotNull, AssertNotEmpty, ...

Another issue - assert macros usually have one or two parameters. You should support a two-parameter assert (with a message).

Regards,
Marko Tiikkaja
Re: [HACKERS] [RFC] Extend namespace of valid guc names
On 2013-02-25 22:15:33 +0100, Andres Freund wrote:

Currently guc-file.c allows the following as guc names:

ID            {LETTER}{LETTER_OR_DIGIT}*
QUALIFIED_ID  {ID}"."{ID}

That is either one token starting with a letter followed by numbers or letters, or exactly two of those separated by a dot. Those restrictions exist for neither SET/set_config() via SQL nor for postgres -c styles of setting options. I propose loosening those restrictions to a) allow repeatedly qualified names like a.b.c, b) allow variables to start with a digit from the second level onwards.

The attached patch does a) but not b), as Tom argued it's not allowed in a plain fashion in SQL either. There you need to quote the variable name.

Additionally, should we perhaps enforce the same rules for -c and set_config()/SET?

The discussion seems to have concluded that this isn't necessary, so I haven't included this.

The patch still is pretty trivial - I've looked around and I really didn't see anything else that requires changes. I haven't even found documentation to adapt.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

From 9fa69ccc79c7bc42be554bc3f4e740c4e77d50d7 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Sat, 14 Sep 2013 23:15:10 +0200
Subject: [PATCH] Allow custom GUCs to be nested more than one level in config files

Doing so was allowed via SET, postgres -c and set_config() before. Allowing this is useful for grouping together variables inside an extension's toplevel namespace.

There still are differences in the accepted variable names between the different methods of setting GUCs after this commit, but the consensus is that those are acceptable.
---
 src/backend/utils/misc/guc-file.l | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index b730a12..ff4202d 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -78,7 +78,7 @@ LETTER			[A-Za-z_\200-\377]
 LETTER_OR_DIGIT [A-Za-z_0-9\200-\377]

 ID				{LETTER}{LETTER_OR_DIGIT}*
-QUALIFIED_ID	{ID}"."{ID}
+QUALIFIED_ID	{ID}("."{ID})+

 UNQUOTED_STRING {LETTER}({LETTER_OR_DIGIT}|[-._:/])*
 STRING			\'([^'\\\n]|\\.|\'\')*\'
--
1.8.4.21.g992c386.dirty
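To see the asymmetry the patch removes: a multiply-qualified name has long been accepted through SET and set_config(), and the one-line lexer change above makes the same spelling legal in postgresql.conf as well (the extension name is illustrative):

```sql
-- Accepted even before the patch:
SET myext.logging.level = 'debug';
SELECT current_setting('myext.logging.level');  -- debug

-- With the patch, the equivalent postgresql.conf line also parses:
--   myext.logging.level = 'debug'
```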
Re: [HACKERS] PL/pgSQL, RAISE and error context
2013/9/14 Marko Tiikkaja ma...@joh.to

On 23/08/2013 10:36, I wrote:

On 8/23/13 8:38 AM, Pavel Stehule wrote:

Are you preparing a patch?

I should have the time to produce one for the September commitfest, but if you (or anyone else) want to work on this, I won't object. My opinion at this very moment is that we should leave the DEFAULT verbosity alone and add a new one (call it COMPACT or such) with the suppressed context for non-ERRORs.

Well, it turns out there isn't really any way to preserve complete backwards compatibility if we want to make this change. The attached patch (based on Pavel's patch) changes the default to be slightly more verbose (the CONTEXT lines which were previously omitted will be visible), but adds a new PGVerbosity called COMPACT which suppresses CONTEXT in non-error messages. Now DEFAULT will be more useful when debugging PL/PgSQL, and people who are annoyed by the new behaviour can use the COMPACT mode.

Any thoughts?

+1

Regards

Pavel

Regards,
Marko Tiikkaja
Re: [HACKERS] Where to load modules from?
On 2013-09-14 22:15:58 +0200, Dimitri Fontaine wrote:

The way they make that safe is by using cgroups and SELinux, IIUC. We can attack the problem in several ways:

- have an initdb switch to tweak the library path per cluster,

That sounds like an utterly horrible idea without any advantages.

- have a superuser-only GUC to tweak the library path,

Hm. I think we might want to make it a PGC_POSTMASTER/postgresql.conf variable instead. Is that stopping use cases of yours? That's what I vote for.

- consider on-disk extensions as templates and move their module files somewhere private in $PGDATA and load the code from there

I don't understand what that does to address the security concerns.

That would allow OS upgrades not to impact running instances until they do ALTER EXTENSION UPDATE; and allowing co-existence of different versions of the same extension in different clusters of the same major version, and maybe in separate databases of the same cluster in some cases (depends on the extension's module specifics),

And it would be an upgrade nightmare.

This proposal comes with no patch because I think we are able to understand it without that, so that it would only be a waste of everybody's time to attach code for a random solution on the list here to that email. Or consider that the fourth point is currently the only one addressed in this very proposal…

Yeah, the code issues seem to be small here.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] bitmap indexes
Hi Abhijit, On 2013-09-14 23:44:24 +0530, Abhijit Menon-Sen wrote: I've been working on this patch for a while, and have made some progress towards (a) general fixing, and (b) a working VACUUM implementation (the major remaining piece). Unfortunately, I've been busy moving house, and the latter is not complete (and not in this patch). I will continue working on the code, and I'll post updates. I expect to have more to show in just a few days. Nevertheless, I'm posting it for review now as I keep working. Given the size and age of the patch, I would appreciate any comments, no matter how nitpicky. It'd be nice if you could quickly sketch out the plan for vacuum, that will make reviewing more useful and easier. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 23:05, Pavel Stehule wrote: Some languages have a generic PRAGMA keyword. So I would be much happier with something like PRAGMA Assert(found); It is much closer to Ada, and it allows reusing the new keyword for other purposes in the future (your proposal is too simple, with no possibility of opening new doors later). And we could define some standard predefined asserts too - like Assert, AssertNotNull, AssertNotEmpty, ... I don't see why e.g. PRAGMA AssertNotEmpty(foo); would be better than ASSERT NotEmpty(foo); and the NotNull version is even sillier considering the expression is arbitrary SQL, and we'd have to do all kinds of different versions or people would be disappointed (AssertNull, AssertNotNull, AssertExists, AssertNotExists, etc.). I see what you're trying to do, but I don't think crippling new features just because we might do something similar at some point is a good idea. I'm guessing this is what happened with the row_count syntax, which made the feature an absolute nightmare to use. Another issue - assert macros usually have one or two parameters. You should support a two-parameter assert (with a message). That I think is worth looking into. Regards, Marko Tiikkaja
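To make the two competing proposals concrete, here is a hypothetical sketch of both forms under discussion. Neither syntax exists in PL/pgSQL at the time of this thread; all function, table, and pragma names are illustrative only.

```sql
-- Hypothetical syntax sketches; nothing below is implemented.

-- Marko's direction: a dedicated ASSERT statement taking an arbitrary
-- boolean SQL expression, with an optional second "message" parameter:
CREATE FUNCTION withdraw(acct int, amount numeric) RETURNS void AS $$
BEGIN
    UPDATE accounts SET balance = balance - amount WHERE id = acct;
    ASSERT FOUND, 'no account with this id';  -- two-parameter form
END
$$ LANGUAGE plpgsql;

-- Pavel's direction: a generic PRAGMA that could later also carry
-- tracing or profiling directives:
--     PRAGMA Assert(found);
--     PRAGMA Trace('entering hot path');
```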
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On 2013-09-14 09:57:43 +0100, Greg Stark wrote: It seems to me that the nature of the problem is that there will unavoidably be a nexus between the two parts of the code here. We can try to isolate it as much as possible but we're going to need a bit of a compromise. I think Robert's point, and mine, is that there are several ways to approach this without doing that. Greetings, Andres Freund
Re: [HACKERS] Assertions in PL/PgSQL
On 14. 9. 2013 23:35, Marko Tiikkaja ma...@joh.to wrote: On 2013-09-14 23:05, Pavel Stehule wrote: Some languages have a generic PRAGMA keyword. So I would be much happier with something like PRAGMA Assert(found); It is much closer to Ada, and it allows reusing the new keyword for other purposes in the future (your proposal is too simple, with no possibility of opening new doors later). And we could define some standard predefined asserts too - like Assert, AssertNotNull, AssertNotEmpty, ... I don't see why e.g. PRAGMA AssertNotEmpty(foo); would be better than ASSERT NotEmpty(foo); and the NotNull version is even sillier considering the expression is arbitrary SQL, and we'd have to do all kinds of different versions or people would be disappointed (AssertNull, AssertNotNull, AssertExists, AssertNotExists, etc.). I see what you're trying to do, but I don't think crippling new features just because we might do something similar at some point is a good idea. I'm guessing this is what happened with the row_count syntax, which made the feature an absolute nightmare to use. Having more than one assert form may be my personal preference (it is not important). But introducing a reserved keyword for one very special purpose (without extensibility) is not prudent. plpgsql still has a lot in common with PL/SQL and Ada, and I don't think we have to invent new, original syntax every time. Another issue - assert macros usually have one or two parameters. You should support a two-parameter assert (with a message). That I think is worth looking into. Regards, Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
On 14. 9. 2013 23:55, Pavel Stehule pavel.steh...@gmail.com wrote: On 14. 9. 2013 23:35, Marko Tiikkaja ma...@joh.to wrote: On 2013-09-14 23:05, Pavel Stehule wrote: Some languages have a generic PRAGMA keyword. So I would be much happier with something like PRAGMA Assert(found); It is much closer to Ada, and it allows reusing the new keyword for other purposes in the future (your proposal is too simple, with no possibility of opening new doors later). And we could define some standard predefined asserts too - like Assert, AssertNotNull, AssertNotEmpty, ... I don't see why e.g. PRAGMA AssertNotEmpty(foo); would be better than ASSERT NotEmpty(foo); and the NotNull version is even sillier considering the expression is arbitrary SQL, and we'd have to do all kinds of different versions or people would be disappointed (AssertNull, AssertNotNull, AssertExists, AssertNotExists, etc.). I see what you're trying to do, but I don't think crippling new features just because we might do something similar at some point is a good idea. I'm guessing this is what happened with the row_count syntax, which made the feature an absolute nightmare to use. Having more than one assert form may be my personal preference (it is not important). But introducing a reserved keyword for one very special purpose (without extensibility) is not prudent. plpgsql still has a lot in common with PL/SQL and Ada, and I don't think we have to invent new, original syntax every time. This is an opportunity to introduce a new hook and make it possible to implement assertions and similar tasks in a generic form (as an extension) - assertions, tracing, profiling. I like integrated assertions, but I would not close the door to future enhancements (there will probably be no possibility of introducing a new keyword for tracing - although I would like to implement a distinction between development and production usage). So I am against your proposal - it doesn't allow any extensibility.
Another issue - assert macros usually have one or two parameters. You should support a two-parameter assert (with a message). That I think is worth looking into. Regards, Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-14 23:55, Pavel Stehule wrote: but introducing a reserved keyword for one very special purpose (without extensibility) is not prudent. How about using an existing keyword then? ASSERTION used to be reserved in the SQL standard but is unreserved in postgres. CHECK might work and is fully reserved. CONSTRAINT? IS? Regards, Marko Tiikkaja
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On 2013-09-13 14:41:46 -0700, Peter Geoghegan wrote: On Fri, Sep 13, 2013 at 12:23 PM, Andres Freund and...@2ndquadrant.com wrote: The reasons I wasn't saying this will never get accepted are twofold: a) I don't want to stifle alternative ideas to the promises idea, just because I think it's the way to go. That might stop a better idea from being articulated. b) I am not actually in the position to say it's not going to be accepted. Well, the reality is that the promises idea hasn't been described in remotely enough detail to compare it to what I have here. I've pointed out plenty of problems with it. Even if you disagree, I still think that doesn't matter in the least. You say: I think that the details of how this approach compares to others are totally pertinent. For me, that's the whole point - getting towards something that will balance all of these concerns and be acceptable. Well, the two other people involved in the discussion so far have gone on the record saying that the presented approach is not acceptable to them. And you haven't started reacting to that. Yes, it's entirely possible that that could look quite different to what I have here. I do not want to reduce all this to a question of is this one design acceptable or not?. But the way you're discussing it so far is exactly reducing it that way. If you want the discussion to be about *how* we can implement it so that the various concerns are addressed: fsck*ing great. I am with you there. In the end, even though I have my usual strong opinions about which is the best way, I don't care which algorithm gets pursued further. At least, if, and only if, it has a fighting chance of getting committed. Which this doesn't. After all, it was the first thing that I considered, and I'm on the record talking about it in the 2012 dev meeting. I didn't take that approach for many good reasons. 
Well, I wasn't there when you said that ;) The reason I ended up here is not because I didn't get the memo about holding buffer locks across complex operations being a bad thing. At least grant me that. I'm here because in all these years no one has come up with a suggestion that doesn't have some very major downsides. Like, even worse than this. I think you're massively, massively, massively overstating the dangers of bloat here. It's a known problem that's *NOT* getting worse under any of the other proposals if you compare it with the loop/lock/catch implementation of upsert that we have today as the only option. And we *DO* have infrastructure to deal with bloat, even if it could use some improvement. We *don't* have infrastructure to deal with deadlocks on lwlocks. And we're not going to get that infrastructure, because it would even further remove the lw part of lwlocks. As to the rules you refer to, you must mean These locks are intended to be short-term: they should not be held for long. I don't think that they will ever be held for long. At least, once I've managed the amount of work that a heap_insert() can do better. I expect to produce a revision where toasting doesn't happen with the locks held soon. Actually, I've already written the code, I just need to do some testing. I personally think - and have stated so before - that doing a heap_insert() while holding the btree lock is unacceptable. Presumably your reason is essentially that we exclusive lock a heap buffer (exactly one heap buffer) while holding shared locks on btree index buffers. It's that it interleaves an already complex but local locking scheme that required several years to become correct with another that is just the same. That's an utterly horrid idea. Is that really so different to holding an exclusive lock on a btree buffer while holding a shared lock on a heap buffer? Because that's what _bt_check_unique() does today. Yes, it is different. 
But, in my opinion, _bt_check_unique() doing so is a bug that needs fixing. Not something that we want to extend. (Note that _bt_check_unique() already needs to deal with the fact that it reads an unlocked page, because it moves right in some cases) And, as you say: Now, I'll grant you that there is one appreciable difference, which is that multiple unique indexes may be involved. But limiting ourselves to the primary key or something like that remains an option. And I'm not sure that it's really any worse anyway. I don't think that's an acceptable limitation. If it were something we could lift in a release or two, maybe, but that's not what you're talking about. At this point I am a bit confused why you are asking for review. I am asking for us, collectively, through consensus, to resolve the basic approach to doing this. That was something I stated right up front, pointing out details of where the discussion had gone in the past. That was my explicit goal. There has been plenty of discussion on this down through the years, but nothing ever came from it. At the moment ISTM
Re: [HACKERS] GUC for data checksums
Hi, On 2013-09-14 18:33:38 +0200, Bernd Helmle wrote: Attached is a small patch to add a new GUC to report whether data checksums for a particular cluster are enabled. The only way to get this info afaik is to look into pg_control and the version number used, but I'd welcome a way to access this remotely, too. If there aren't any objections I'll add this to the CF. Looks like a good idea to me. The implementation looks sane as well, except that I am not sure if we really need to introduce that faux variable. If the variable cannot be set and we have a SHOW hook, do we need it? Greetings, Andres Freund
Re: [HACKERS] Assertions in PL/PgSQL
On 2013-09-15 00:09, Pavel Stehule wrote: this is an opportunity to introduce a new hook and make it possible to implement assertions and similar tasks in a generic form (as an extension). It can be assertions, tracing, profiling. You can already do tracing and profiling in an extension. I don't see what you would put inside the function body for these two, either. Regards, Marko Tiikkaja
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Sat, Sep 14, 2013 at 12:22 AM, Robert Haas robertmh...@gmail.com wrote: I mean, if we do the promise tuple thing, and there are multiple unique indexes, what happens when an inserter needs to block pending the outcome of another transaction? They had better go clean up the promise tuples from the other unique indexes that they're trying to insert into, because they cannot afford to hold value locks for a long time, no matter how they're implemented. As Andres already pointed out, this is not correct. While not doing this is not incorrect, it certainly would be useful for preventing deadlocks and unnecessary contention. In a world where people expect either an insert or an update, we ought to try and reduce contention across multiple unique indexes. I can understand why that doesn't matter today, though - if you're going to insert duplicates indifferent to whether or not there will be conflicts, that's a kind of abuse, and not worth optimizing - seems probable that most transactions will commit. However, it seems much less probable that most upserters will insert. People may well wish to upsert all the time where an insert is hardly ever necessary, which is one reason why I have doubts about other proposals. Note that today there is no guarantee that the original waiter for a duplicate-inserting xact to complete will be the first one to get a second chance, so I think it's hard to question this on correctness grounds. Even if they are released in FIFO order, there is no reason to assume that the first waiter will win the race with a second. Most obviously, the second waiter may not even ever get the chance to block on the same xid at all (so it's not really a waiter at all) and still be able to insert, if the blocking-xact aborts after the second waiter starts its descent but before it checks uniqueness. All this, even though the second waiter arrived maybe minutes after the first. 
What I'm talking about here is really unlikely to result in lock starvation, because the original waiter typically gets to observe the other waiter go through, and that's reason enough to give up entirely. Now, it's kind of weird that the original waiter will still end up blocking on the xid that caused it to wait in the first instance. So there should be more thought put into that, like remembering the xid and only waiting on it on a retry, or some similar scheme. Maybe you could contrive a scenario where this causes lock starvation, but I suspect you could do the same thing for the present btree insertion code. Just to add to what he said, we already have long-lasting value locks in the form of SIREAD locks. SIREAD can exist at different levels of granularity, but one of those levels is index-page-level granularity, where they have the function of guarding against concurrent insertions of values that would fall within that page, which just so happens to be the same thing you want to do here. The difference between those locks and what you're proposing here is that they are implemented differently. That is why those were acceptable and this is not. As the implementer of this patch, I'm obligated to put some checks in unique index insertion that everyone has to care about. There is no way around that. Complexity issues aside, I think that an argument could be made for this approach *reducing* the impact on concurrency relative to other approaches, if there isn't too many unique indexes to deal with, which is the case the vast majority of the time. I mean, those other approaches necessitate doing so much more with *exclusive* locks held. Like inserting, maybe doing a page split, WAL-logging, all with the lock, and then either updating in place or killing the promise tuple, and WAL-logging that, with an exclusive lock held the second time around. Plus searching for everything twice. 
I think that frequently killing all of those broken-promise tuples could have deleterious effects on concurrency and/or index bloat of the kind only remedied by REINDEX. Do you update the freespace map too? More exclusive locks! Or if you leave it up to VACUUM (and just set the xid to InvalidXid, which is still extra work), autovacuum has to care about a new *class* of bloat - index-only bloat. Plus lots of dead duplicates are bad for performance in btrees generally. As here, that is way more expensive than simply grabbing and holding a share-lock on the page. But we get a number of important benefits out of it. The backend remains interruptible while the tuple is locked, the protocol for granting locks is FIFO to prevent starvation, we don't suppress page eviction while the lock is held, we can simultaneously lock arbitrarily large numbers of tuples, and deadlocks are detected and handled cleanly. If those requirements were negotiable, we would surely have negotiated them away already, because the performance benefits would be immense. False equivalence. We only need to lock as many unique index *values* (not
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Sat, Sep 14, 2013 at 3:15 PM, Andres Freund and...@2ndquadrant.com wrote: Well, the reality is that the promises idea hasn't been described in remotely enough detail to compare it to what I have here. I've pointed out plenty of problems with it. Even if you disagree, I still think that doesn't matter in the least. It matters if you care about getting this feature. You say: I think that the details of how this approach compares to others are totally pertinent. For me, that's the whole point - getting towards something that will balance all of these concerns and be acceptable. Well, the two other people involved in the discussion so far have gone on the record saying that the presented approach is not acceptable to them. And you haven't started reacting to that. Uh, yes I have. I'm not really sure what you could mean by that. What am I refusing to address? Yes, it's entirely possible that that could look quite different to what I have here. I do not want to reduce all this to a question of is this one design acceptable or not?. But the way you're discussing it so far is exactly reducing it that way. The fact that I was motivated to do things this way serves to illustrate the problems generally. If you want the discussion to be about *how* we can implement it so that the various concerns are addressed: fsck*ing great. I am with you there. Isn't that what we were doing? There has been plenty of commentary on alternative approaches. In the end, even though I have my usual strong opinions about which is the best way, I don't care which algorithm gets pursued further. At least, if, and only if, it has a fighting chance of getting committed. Which this doesn't. I don't think that any design that has been described to date doesn't have serious problems. Causing excessive bloat, particularly in indexes, is a serious problem also. The reason I ended up here is not because I didn't get the memo about holding buffer locks across complex operations being a bad thing. 
At least grant me that. I'm here because in all these years no one has come up with a suggestion that doesn't have some very major downsides. Like, even worse than this. I think you're massively, massively, massively overstating the dangers of bloat here. It's a known problem that's *NOT* getting worse under any of the other proposals if you compare it with the loop/lock/catch implementation of upsert that we have today as the only option. Why would I compare it with that? That's terrible, and very few of our users actually know about it anyway. Also, will an UPDATE followed by an INSERT really bloat all that much anyway? And we *DO* have infrastructure to deal with bloat, even if it could use some improvement. We *don't* have infrastructure to deal with deadlocks on lwlocks. And we're not going to get that infrastructure, because it would even further remove the lw part of lwlocks. Everything I said so far is predicated on LWLocks not deadlocking here, so I'm not really sure why you'd say that. If I can't find a way to prevent deadlock, then clearly the approach is doomed. It's that it interleaves an already complex but local locking scheme that required several years to become correct with another that is just the same. That's an utterly horrid idea. You're missing my point, which is that it may be possible, with relatively modest effort, to analyze things to ensure that deadlock is impossible - regardless of the complexities of the two systems - because they're reasonably well encapsulated. See below, under I'll say it again. Now, I can certainly understand why you wouldn't be willing to accept that at face value. The idea isn't absurd, though. You could think of the heap_insert() call as being under the control of the btree code (just as, say, heap_hot_search() is), even though the code isn't at all structured that way, and that's awkward. I'm actually slightly tempted to structure it that way. 
Is that really so different to holding an exclusive lock on a btree buffer while holding a shared lock on a heap buffer? Because that's what _bt_check_unique() does today. Yes, it is different. But, in my opinion, _bt_check_unique() doing so is a bug that needs fixing. Not something that we want to extend. Well, I think you know that that's never going to happen. There are all kinds of reasons why it works that way that cannot be disavowed. My definition of a bug includes a user being affected. At this point I am a bit confused why you are asking for review. I am asking for us, collectively, through consensus, to resolve the basic approach to doing this. That was something I stated right up front, pointing out details of where the discussion had gone in the past. That was my explicit goal. There has been plenty of discussion on this down through the years, but nothing ever came from it. At the moment ISTM you're not conceding on *ANY* points. That's not very often the way to find consensus.
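The "loop/lock/catch implementation of upsert that we have today" that Andres compares against is the retry-loop pattern from the PostgreSQL documentation's exception-trapping example. A sketch, with illustrative table and column names:

```sql
-- The retry-loop upsert available today (per the documentation's
-- merge_db example); table and column names are illustrative.
CREATE FUNCTION merge_db(k int, v text) RETURNS void AS $$
BEGIN
    LOOP
        -- first try to update an existing row
        UPDATE db SET data = v WHERE key = k;
        IF found THEN
            RETURN;
        END IF;
        -- not there: try to insert; another session may beat us to it
        BEGIN
            INSERT INTO db (key, data) VALUES (k, v);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- concurrent insert won the race; loop back to the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

The EXCEPTION block sets up a subtransaction on every iteration, and an aborted INSERT leaves behind a dead heap tuple plus index entries - which is exactly the bloat being argued about above.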
Re: [HACKERS] GUC for data checksums
--On 15. September 2013 00:25:34 +0200 Andres Freund and...@2ndquadrant.com wrote: Looks like a good idea to me. The implementation looks sane as well, except that I am not sure if we really need to introduce that faux variable. If the variable cannot be set and we have a SHOW hook, do we need it? It's in line with the other informational variables like block_size et al. Do you want to have a function instead, or what's your intention? One benefit is having them all in SHOW ALL, which can be used to compare database/cluster settings, to mention one use case I have in mind. -- Thanks Bernd
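For context, what such an informational GUC buys you is remote visibility through the ordinary GUC machinery. A sketch of the intended usage, assuming the variable ends up named data_checksums (the thread itself does not fix the name):

```sql
-- Assuming a read-only variable named data_checksums:
SHOW data_checksums;                       -- 'on' or 'off', over any connection
SELECT current_setting('data_checksums');  -- same value, usable inside queries

-- Today, by contrast, only pg_controldata run on the server host
-- reveals whether checksums are enabled.

-- Since the variable is report-only, setting it would be rejected:
-- SET data_checksums = off;   -- expected to fail with an error
```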
Re: [HACKERS] plpgsql.print_strict_params
On Fri, 2013-09-13 at 23:56 +0200, Marko Tiikkaja wrote: Attached is a patch for optionally printing more information on STRICT failures in PL/PgSQL: Please fix the tabs in the SGML files.
Re: [HACKERS] patch: add MAP_HUGETLB to mmap() where supported (WIP)
On Sat, 2013-09-14 at 00:41 +0100, Richard Poole wrote: The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory on systems that support it. Please fix the tabs in the SGML files.
Re: [HACKERS] PL/pgSQL, RAISE and error context
On Sat, 2013-09-14 at 04:58 +0200, Marko Tiikkaja wrote: The attached patch (based on Pavel's patch) changes the default to be slightly more verbose (the CONTEXT lines which were previously omitted will be visible), but adds a new PGVerbosity called COMPACT which suppresses CONTEXT in non-error messages. Now DEFAULT will be more useful when debugging PL/PgSQL, and people who are annoyed by the new behaviour can use the COMPACT mode. Your patch fails the regression tests.
Re: [HACKERS] [PoC] pgstattuple2: block sampling to reduce physical read
On Sat, 2013-09-14 at 16:18 +0900, Satoshi Nagayasu wrote: I'm looking forward to seeing more feedback on this approach, in terms of design and performance improvement. So, I have submitted this for the next CF. Your patch fails to build: pgstattuple.c: In function ‘pgstat_heap_sample’: pgstattuple.c:737:13: error: ‘SnapshotNow’ undeclared (first use in this function) pgstattuple.c:737:13: note: each undeclared identifier is reported only once for each function it appears in
[HACKERS] Proposal: json_populate_record and nested json objects
Hi all; Currently json_populate_record and json_populate_recordset cannot work with nested json objects. This creates two fundamental problems when trying to use JSON as an interface format.

The first problem is that you can't easily embed a json data type in a json object and have it populate a record. This means that storing extended attributes in the database is somewhat problematic if you accept the whole row as a json object.

The second problem is that nested data structures and json don't go together well. You can't have a composite type which has as an attribute an array of another composite type and populate this from a json object. This makes json largely an alternative to hstore for interfaces in its current shape.

I would propose handling json_populate_record and friends as such:

1. Don't throw errors initially as a pre-check if the json object is nested.
2. If one comes to a nested fragment, check the attribute type it is going into first.
2.1 If it is a json type, put the nested fragment there.
2.2 If it is a composite type (i.e. anything in pg_class), push it through another json_populate_record run.
2.3 If it is neither, then see if a json::[type] cast exists; if so, call it.
2.4 Otherwise raise an exception.

I have a few questions before I go on to look at creating a patch.

1. Are there any problems anyone spots with this approach?
2. Is anyone working on something like this?
3. Would it be preferable to build something like this first as an extension (perhaps with different function names) or first as a patch?

Best Wishes, Chris Travers http://www.2ndquadrant.com PostgreSQL Services, Training, and Support
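The limitation and the proposed rules can be illustrated with a small example. The type and field names below are made up for illustration; the exact failure mode on a nested object depends on the server version.

```sql
-- Illustrative types only.
CREATE TYPE inner_t AS (a int, b text);
CREATE TYPE outer_t AS (id int, detail inner_t, tags json);

-- Today, the nested "detail" object is not recursed into, so this
-- call does not populate detail as a composite value:
SELECT *
FROM json_populate_record(null::outer_t,
       '{"id": 1, "detail": {"a": 2, "b": "x"}, "tags": {"k": "v"}}');

-- Under the proposal:
--   "tags"   is of type json, so the raw fragment is stored as-is (rule 2.1);
--   "detail" is a composite, so the fragment is pushed through a
--            recursive json_populate_record run (rule 2.2).
```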
Re: [HACKERS] [PATCH] Revive line type
Here is a new patch for the line type, with a new input/output format {A,B,C}, as discussed in this thread. From 837fcf5d9b1ee8e589ef4b19f7d6e575229ca758 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut pete...@gmx.net Date: Sun, 15 Sep 2013 00:02:06 -0400 Subject: [PATCH] Revive line type Change the input/output format to {A,B,C}, to match the internal representation. Complete the implementations of line_in, line_out, line_recv, line_send. Remove comments and error messages about the line type not being implemented. Add regression tests for existing line operators and functions. Reviewed-by: rui hua 365507506...@gmail.com --- doc/src/sgml/datatype.sgml | 42 - doc/src/sgml/func.sgml | 6 + src/backend/utils/adt/geo_ops.c| 219 ++- src/include/catalog/pg_type.h | 3 +- src/include/utils/geo_decls.h | 7 - src/test/regress/expected/geometry.out | 3 - src/test/regress/expected/line.out | 271 + src/test/regress/expected/sanity_check.out | 3 +- src/test/regress/output/misc.source| 3 +- src/test/regress/parallel_schedule | 2 +- src/test/regress/serial_schedule | 1 + src/test/regress/sql/geometry.sql | 4 - src/test/regress/sql/line.sql | 87 + 13 files changed, 503 insertions(+), 148 deletions(-) create mode 100644 src/test/regress/expected/line.out create mode 100644 src/test/regress/sql/line.sql diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml index 87668ea..07f0385 100644 --- a/doc/src/sgml/datatype.sgml +++ b/doc/src/sgml/datatype.sgml @@ -3051,9 +3051,7 @@ titleGeometric Types/title para Geometric data types represent two-dimensional spatial objects. xref linkend=datatype-geo-table shows the geometric -types available in productnamePostgreSQL/productname. The -most fundamental type, the point, forms the basis for all of the -other types. +types available in productnamePostgreSQL/productname. 
/para table id=datatype-geo-table @@ -3063,8 +3061,8 @@ titleGeometric Types/title row entryName/entry entryStorage Size/entry -entryRepresentation/entry entryDescription/entry +entryRepresentation/entry /row /thead tbody @@ -3077,8 +3075,8 @@ titleGeometric Types/title row entrytypeline/type/entry entry32 bytes/entry -entryInfinite line (not fully implemented)/entry -entry((x1,y1),(x2,y2))/entry +entryInfinite line/entry +entry{A,B,C}/entry /row row entrytypelseg/type/entry @@ -3153,6 +3151,38 @@ titlePoints/title /sect2 sect2 +titleLines/title + +indexterm + primaryline/primary +/indexterm + +para + Lines (typeline/type) are represented by the linear equation Ax + By + + C = 0, where A and B are not both zero. Values of + type typeline/type is input and output in the following form: +synopsis +{ replaceableA/replaceable, replaceableB/replaceable, replaceableC/replaceable } +/synopsis + + Alternatively, any of the following forms can be used for input: + +synopsis +[ ( replaceablex1/replaceable , replaceabley1/replaceable ) , ( replaceablex2/replaceable , replaceabley2/replaceable ) ] +( ( replaceablex1/replaceable , replaceabley1/replaceable ) , ( replaceablex2/replaceable , replaceabley2/replaceable ) ) + ( replaceablex1/replaceable , replaceabley1/replaceable ) , ( replaceablex2/replaceable , replaceabley2/replaceable ) +replaceablex1/replaceable , replaceabley1/replaceable , replaceablex2/replaceable , replaceabley2/replaceable +/synopsis + + where + literal(replaceablex1/replaceable,replaceabley1/replaceable)/literal + and + literal(replaceablex2/replaceable,replaceabley2/replaceable)/literal + are two (different) points on the line. 
+/para + /sect2 + + sect2 titleLine Segments/title indexterm diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index ee1c957..8f60c56 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -8123,6 +8123,12 @@ titleGeometric Type Conversion Functions/title entryliteralcircle(polygon '((0,0),(1,1),(2,0))')/literal/entry /row row +entryliteralfunctionline(typepoint/type, typepoint/type)/function/literal/entry +entrytypeline/type/entry +entrypoints to line/entry +entryliteralline(point '(-1,0)', point '(1,0)')/literal/entry + /row + row entry indexterm primarylseg/primary diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c index 5d0b596..6bfe6d7 100644 --- a/src/backend/utils/adt/geo_ops.c +++ b/src/backend/utils/adt/geo_ops.c @@ -926,42 +926,82 @@ box_diagonal(PG_FUNCTION_ARGS)
Re: [HACKERS] Assertions in PL/PgSQL
2013/9/15 Marko Tiikkaja ma...@joh.to

> On 2013-09-15 00:09, Pavel Stehule wrote:
>> this is a possibility for introducing a new hook, so that assertions
>> and similar tasks can be implemented in a generic form (as an
>> extension). It could cover assertions, tracing, profiling.
>
> You can already do tracing and profiling in an extension. I don't see
> what you would put inside the function body for these two, either.

You cannot mark tracing points explicitly in the current (now
unsupported) extensions. These functions share the same pattern:

CREATE OR REPLACE FUNCTION assert(boolean)
RETURNS void AS $$
BEGIN
  IF current_setting('plpgsql.assertions') = 'on' THEN
    IF NOT $1 THEN
      RAISE EXCEPTION 'Assert fails';
    END IF;
  END IF;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION trace(text)
RETURNS void AS $$
BEGIN
  IF current_setting('plpgsql.trace') = 'on' THEN
    RAISE WARNING 'trace: %', $1;
  END IF;
END;
$$ LANGUAGE plpgsql;

Depending on usage, these functions will not be extremely slow compared
to a built-in solution - they could be made faster by implementing them
in C, and a little bit faster still as internal PL/pgSQL statements. But
if the function runs even one non-trivial query, the overhead is
probably not significant. You just have to watch some global state
variable and then execute (or not) the functionality.

Regards

Pavel

> Regards,
> Marko Tiikkaja
Re: [HACKERS] git apply vs patch -p1
On Sat, 2013-09-14 at 11:37 -0700, Josh Berkus wrote:
> Lately I've been running into a lot of reports of false conflicts
> reported by git apply. The most recent one was the points patch, which
> git apply rejected for completely fictitious reasons (it claimed that
> the patch was trying to create a new file where a file already
> existed, which it wasn't).

Every file in that patch contains "new file mode 100644", which means it
is creating a new file. I would review how that patch was created.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
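The behavior being described is easy to reproduce. The snippet below is a self-contained sketch (throwaway directory and repo, hypothetical file names) showing why git apply refuses a patch whose header says "new file mode" when the target file already exists in the working tree:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
echo 'some content' > f.txt

# Build a patch that *creates* f.txt from nothing; --no-index exits
# non-zero when the files differ, so ignore its status.
git diff --no-index /dev/null f.txt > create.patch || true

# The patch header declares a brand-new file:
grep 'new file mode' create.patch
# -> new file mode 100644

# git apply rejects the patch, because f.txt is already present in the
# working tree (the error reads along the lines of
# "f.txt: already exists in working directory"):
if ! git apply create.patch 2>/dev/null; then
    echo "git apply rejected the patch"
fi
```

So the rejection is not a false conflict: a patch created this way genuinely claims to add the file, and `git apply --check` would flag it the same way before touching anything.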
[HACKERS] Questions about checksum feature in 9.3
I am getting a new server ready for production and saw the release note
on the new checksum feature. It sounded like something we might want,
and then after reading I realized we have to initdb with the feature on.
I figured I'd better look into it a little more, since changing later
might be a bit of a hassle, and found notes on getting a vectorized
version running for better performance.

My attempts to compile it vectorized on OS X seem to have failed, since
I don't find any vector instructions in the .o file, even though the
options -msse4.1 -funroll-loops -ftree-vectorize should be supported
according to the man page for Apple's llvm-gcc.

So, has anyone compiled the checksum code vectorized on OS X? Is there
any performance data that would indicate whether or not I should worry
about this in the first place? So far we are pretty happy with the
performance of 9.2.4; we have noticed a few situations where it's a
little slower than we might like, but these instances are rare. I'd
accept a small performance hit if we can get better reliability and
awareness of potential problems.

Thanks,

Kevin