Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Andres Freund-3 wrote
 On 2013-11-04 11:27:33 -0500, Robert Haas wrote:
 On Mon, Nov 4, 2013 at 11:24 AM, Claudio Freire <klaussfreire@...> wrote:
  Such a thing would help COPY, so maybe it's worth a look
 
 I have little doubt that a deferred insertion buffer of some kind
 could help performance on some workloads, though I suspect the buffer
 would have to be pretty big to make it worthwhile on a big COPY that
 generates mostly-random insertions.
 
 Even for random data, presorting the to-be-inserted data appropriately
 could result in much better I/O patterns.

Mmh, I'm afraid that the buffer would have to be huge to get some real
advantage. You have to buffer enough values to avoid touching entire pages,
which is not that easy. The LSM-tree is much more complicated than a simple
memory buffer + delayed inserts. The other index types that are supposed to
help in the random-insertion workload rely on sequential insertions (at the
expense of more writing, and slower reads) rather than re-using the btree
pattern.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5776963.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Simon Riggs wrote
 Everybody on this thread is advised to look closely at Min Max indexes
 before starting any further work.
 
 MinMax will give us access to many new kinds of plan, plus they are
 about as close to perfectly efficient, by which I mean almost zero
 overhead, with regard to inserts as it is possible to get.

Simon, I don't understand how minmax indexes would help in a random-inserts
scenario.
While I would love to use minmax for other columns (since we also partition
and search based on a timestamp, which is usually well clustered), I thought
a minmax index would be perfect in a mostly-incremental-values scenario.
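For example, something like this is what I had in mind for the timestamp
column (sketch only: the table/column names are invented, and the syntax
assumes the minmax patch exposes a regular access method):

CREATE INDEX calls_ts_minmax ON calls USING minmax (call_time);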



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5776964.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.

2013-11-05 Thread Albe Laurenz
Arul Shaji Arulappan wrote:
 Attached is a patch that implements the first set of changes discussed
 in this thread originally. They are:
 
 (i) Implements NCHAR/NVARCHAR as distinct data types, not as synonyms so
 that:
   - psql \d can display the user-specified data types.
   - pg_dump/pg_dumpall can output NCHAR/NVARCHAR columns as-is,
 not as CHAR/VARCHAR.
   - Groundwork to implement additional features for NCHAR/NVARCHAR
 in the future (For eg: separate encoding for nchar columns).
 (ii) Support for NCHAR/NVARCHAR in ECPG
 (iii) Documentation changes to reflect the new data type

If I understood the discussion correctly the use case is that
there are advantages to having a database encoding different
from UTF-8, but you'd still want some UTF-8 columns.

Wouldn't it be a better design to allow specifying the encoding
per column?  That would give you more flexibility.
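Just to illustrate the idea (this is not something PostgreSQL implements
today), the SQL standard already has per-column syntax along these lines:

CREATE TABLE product (
    id    integer,
    name  varchar(100) CHARACTER SET "UTF8",  -- hypothetical per-column encoding
    descr varchar(400)                        -- uses the database encoding
);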

I know that NCHAR/NVARCHAR is SQL Standard, but I still think
that it is a wart.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Simon Riggs
On 5 November 2013 08:32, Leonardo Francalanci m_li...@yahoo.it wrote:
 Simon Riggs wrote
 Everybody on this thread is advised to look closely at Min Max indexes
 before starting any further work.

 MinMax will give us access to many new kinds of plan, plus they are
 about as close to perfectly efficient, by which I mean almost zero
 overhead, with regard to inserts as it is possible to get.

 Simon, I don't understand how minmax indexes would help in a random-inserts
 scenario.
 While I would love to use minmax for other columns (since we also partition
 and search based on a timestamp, which is usually well clustered), I thought
 a minmax index would be perfect in a mostly-incremental-values scenario.

Minmax indexes seem to surprise many people, so broad generalisations
aren't likely to be useful.

I think the best thing to do is to publish some SQL requests that
demonstrate in detail what you are trying to achieve and test them
against minmax indexes. That way we can discuss what does work and
what doesn't work well enough yet.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions

2013-11-05 Thread Sameer Kumar
 Hello,

   With this index, you will get a different plan like this,
  
  Exactly my point, can we look at making windowing functions
  smart and make use of available indexes?

 I might have guessed..


   Does this satisfies your needs?
  
  Not exactly. If I have missed to mention, this is not a
  production issue for me. I am trying to see if PostgreSQL
  planner produces best plans for Data Warehouse and mining
  oriented queries.

 I agree to the point.

  I think Hashes can be efficiently used for sorting (and I
  believe they are used for joins too when a pre-sorted data set
  is not available via indexes). This again could my
  misinterpretation.

 It is true if 'Sorting' means 'key classification without
 orderings'. Hashes should always appear at the inner side of a join,
 I'm convinced. The 'ordered' nature is not required for the case
 where the outer side is already ordered. If not, a separate sort will
 be needed.

  I lost you somewhere here. Maybe this is above my pay-grade :-)

 Sorry for my clumsy English :-)


No, it was not your English. :-)
When I read it again and try to relate, I get your point. Actually true:
a hash must be performed as the last option (if that is what you too
meant), and if there are a few other operations it must be the last one to
be performed, especially after sorting/grouping. Hashes must somehow make
use of already sorted data (I think this is something even you indicated).


 Well, at least with Oracle and DB2 planners I have seen that
  the plan produced with dense_rank performs better than a series
  of nested SELECT MAX().

 I see your point. Although I don't know what plans they
 generate, and I don't see how to do ordering and ranking without
 sorting.  Could you let me see what they look like?

 # Nevertheless, I don't have the confidence that I can be of some
 # help..

 I will do that if I get a DB2 system or Oracle system running. I will try
to replicate the same 2 test cases and share the plan. One thing which I am
sure of is that the below part of the plan

QUERY PLAN | Subquery Scan on __unnamed_subquery_0
 (cost=12971.39..16964.99 rows=614 width=43) (actual
time=2606.075..2953.937 rows=558 loops=1)

would be generated as a RID scan in DB2 (which I have seen perform better
than normal subquery scans in DB2).
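To make the comparison concrete, the two query shapes I mean are roughly
these (a sketch with made-up table and column names, not the actual test
case):

-- window-function form:
SELECT dept, empname, salary
FROM (SELECT dept, empname, salary,
             dense_rank() OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk
      FROM emp) s
WHERE rnk = 1;

-- nested-MAX form:
SELECT dept, empname, salary
FROM emp e
WHERE salary = (SELECT max(salary) FROM emp i WHERE i.dept = e.dept);

-- an index of this shape is what the window-function form would ideally use
-- to avoid the explicit sort:
CREATE INDEX ON emp (dept, salary DESC);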



Regards
Sameer
Ashnik Pte Ltd


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Simon Riggs wrote
 Minmax indexes seem to surprise many people, so broad generalisations
 aren't likely to be useful.
 
 I think the best thing to do is to publish some SQL requests that
 demonstrate in detail what you are trying to achieve and test them
 against minmax indexes. That way we can discuss what does work and
 what doesn't work well enough yet.

While I do believe in testing (since "in theory there is no difference
between theory and practice; in practice there is"), I would like to know
the properties of the minmax index before trying it.
What is it supposed to be good at? What are the pros/cons? We can't ask all
the users to just try the index and see if it works for them.
As I said, my understanding is that it is very efficient (both in insertion
and in searching) when data is somehow ordered in the table. But maybe I got
it wrong...

Anyway, the SQL scenario is simple: a table with 4 bigint indexes; data in
the fields is a random bigint in the range 1-1000. Insertion is 5-10K
rows/second. One search every 1-5 seconds, by one of the indexed fields
(only equality, no range). There's also an index on a timestamp field, but
that's not random so it doesn't give that many problems (it's actually the
one where I wanted to try the minmax).
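To make that concrete, the shape of the thing is roughly this (table and
column names are made up for illustration, not the real schema):

CREATE TABLE calls (
    caller    bigint,
    callee    bigint,
    cell_from bigint,
    cell_to   bigint,
    call_time timestamptz
);
CREATE INDEX ON calls (caller);
CREATE INDEX ON calls (callee);
CREATE INDEX ON calls (cell_from);
CREATE INDEX ON calls (cell_to);
CREATE INDEX ON calls (call_time);

-- typical lookup, equality only:
SELECT * FROM calls WHERE caller = 123456;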




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5776976.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Oskari Saarenmaa
This makes it easy to see if the binaries were built from a real release
or if they were built from a custom git tree.
---
 configure.in | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/configure.in b/configure.in
index a4baeaf..7c5b3ce 100644
--- a/configure.in
+++ b/configure.in
@@ -29,7 +29,14 @@ AC_CONFIG_AUX_DIR(config)
 AC_PREFIX_DEFAULT(/usr/local/pgsql)
 AC_SUBST(configure_args, [$ac_configure_args])
 
-AC_DEFINE_UNQUOTED(PG_VERSION, "$PACKAGE_VERSION", [PostgreSQL version as a string])
+# Append git tag based version to PG_VERSION if we're building from a git
+# checkout, but don't touch PACKAGE_VERSION which is used to create other
+# version variables (major version and numeric version.)
+PG_VERSION="$PACKAGE_VERSION"
+if test -d .git; then
+  PG_VERSION="$PG_VERSION (`git describe --tags --long --dirty HEAD || echo unknown`)"
+fi
+AC_DEFINE_UNQUOTED(PG_VERSION, "$PG_VERSION", [PostgreSQL version as a string])
 [PG_MAJORVERSION=`expr $PACKAGE_VERSION : '\([0-9][0-9]*\.[0-9][0-9]*\)'`]
 AC_SUBST(PG_MAJORVERSION)
 AC_DEFINE_UNQUOTED(PG_MAJORVERSION, "$PG_MAJORVERSION", [PostgreSQL major version as a string])
-- 
1.8.4.2



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Simon Riggs
On 5 November 2013 09:57, Leonardo Francalanci m_li...@yahoo.it wrote:
 Simon Riggs wrote
 Minmax indexes seem to surprise many people, so broad generalisations
 aren't likely to be useful.

 I think the best thing to do is to publish some SQL requests that
 demonstrate in detail what you are trying to achieve and test them
 against minmax indexes. That way we can discuss what does work and
 what doesn't work well enough yet.

 While I do believe in testing (since "in theory there is no difference
 between theory and practice; in practice there is"), I would like to know
 the properties of the minmax index before trying it.
 What is it supposed to be good at? What are the pros/cons? We can't ask all
 the users to just try the index and see if it works for them.

No, but then all users aren't suggesting we need a new index type are they?

I think its reasonable for you to spend time checking whether what you
require will be met, and if not, what precise query doesn't it help,
so we can better design any future new-index.

 As I said, my understanding is that it is very efficient (both in insertion and
 in searching) when data is somehow ordered in the table. But maybe I got it
 wrong...

 Anyway, the sql scenario is simple: a table with 4 bigint indexes; data in
 the fields is a random bigint in the range 1-1000. Insertion is 5-10K
 rows/second. One search every 1-5 seconds, by one of the indexed fields
 (only equality, no range). There's also an index on a timestamp field, but
 that's not random so it doesn't give that many problems (it's actually the
 one where I wanted to try the minmax).

Without meaning to pick on you, imprecise analysis yields unhelpful
new features. The clearer we are about what we are trying to solve the
more likely we are to solve it well. 30 seconds analysis on what is
needed is not sufficient to justify an additional man year of
development, especially if a man year of work has already been done
and the testing of the latest feature is now at hand.

The requests from the indexes field, are they ordered? are they
limited? Or you really want ALL calls? What is the tolerance on those?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Simon Riggs wrote
 On 5 November 2013 09:57, Leonardo Francalanci <m_lists@...> wrote:
 While I do believe in testing (since "in theory there is no difference
 between theory and practice; in practice there is"), I would like to know
 the properties of the minmax index before trying it.
 What is it supposed to be good at? What are the pros/cons? We can't ask all
 the users to just try the index and see if it works for them.
 
 No, but then all users aren't suggesting we need a new index type are
 they?
 
 I think its reasonable for you to spend time checking whether what you
 require will be met, and if not, what precise query doesn't it help,
 so we can better design any future new-index.

I don't understand the parallel between "We can't ask all the users to just
try the index and see if it works for them" and "all users aren't suggesting
we need a new index type". Anyway:

I'm not suggesting we need a new index type. Please read my first post: I'm
asking for info, fearing that there's just a lot of marketing/hype in those
indexes.

What do you mean by "spend time checking whether what you require will be
met"? Met by what, minmax indexes?


Simon Riggs wrote
 As I said, my understanding is that it is very efficient (both in insertion
 and in searching) when data is somehow ordered in the table. But maybe I got
 it wrong...
 
 Anyway, the SQL scenario is simple: a table with 4 bigint indexes; data in
 the fields is a random bigint in the range 1-1000. Insertion is 5-10K
 rows/second. One search every 1-5 seconds, by one of the indexed fields
 (only equality, no range). There's also an index on a timestamp field, but
 that's not random so it doesn't give that many problems (it's actually the
 one where I wanted to try the minmax).
 
 Without meaning to pick on you, imprecise analysis yields unhelpful
 new features. The clearer we are about what we are trying to solve the
 more likely we are to solve it well. 30 seconds analysis on what is
 needed is not sufficient to justify an additional man year of
 development, especially if a man year of work has already been done
 and the testing of the latest feature is now at hand.

I've never said that my analysis justifies a man year of work. As I said,
I'm actually not confident at all that even if we had those cool indexes
they would work on my scenario (marketing aside, there's not that much
data/tests out there on those indexes). I just wanted to know if the matter
was discussed in the past / to get more info.

At the same time, I'm reluctant to try a new index hoping it will work in my
case just because it's new and a man year of work has already been done.
Again: what's this minmax index supposed to be good at?
If it's good at indexing data that is in mostly-sequential order, it won't
help me. From what I got (maybe I got it wrong?) the index stores min/max
values for sequences of pages. In my case I guess that those min/max values
would be close to 1 (min) / 1000 (max), because I insert data in random
order. So any query will scan the entire table anyway. Am I wrong?


Simon Riggs wrote
 The requests from the indexes field, are they ordered? 

Mmmh, I don't think I understand the question... an operator searches for
calls made by a user, so he searches in random order... 


Simon Riggs wrote
 are they
 limited? Or you really want ALL calls? What is the tolerance on those?

I want all the calls made by the user. But there won't be many calls for one
user.
And most users will never be queried (as I said, one "calls by person"
search every 1-5 seconds, so a very small percentage of calls will ever be
queried/retrieved).



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5776982.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Crash recovery

2013-11-05 Thread Soroosh Sardari
Hi

When PG crashes or the computer is turned off unexpectedly, the next time the
postmaster starts up, it performs crash recovery: actually redoing xlog
records, vacuum, etc.

What module is responsible for crash recovery?

Regards,
Soroosh Sardari


Re: [HACKERS] Crash recovery

2013-11-05 Thread Heikki Linnakangas

On 05.11.2013 13:21, Soroosh Sardari wrote:

When PG crashes or the computer is turned off unexpectedly, the next time the
postmaster starts up, it performs crash recovery: actually redoing xlog
records, vacuum, etc.

What module is responsible for crash recovery?


See src/backend/access/transam/xlog.c. The code to read the WAL is 
there. It calls redo routines in other modules, see e.g. heap_redo for 
replaying heap records, btree_redo for B-tree index records and so forth.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Heikki Linnakangas

On 05.11.2013 12:22, Oskari Saarenmaa wrote:

This makes it easy to see if the binaries were built from a real release
or if they were built from a custom git tree.


Hmm, that would mean that a build from git checkout REL9_3_1 produces 
a different binary than one built from postgresql-9.3.1.tar.gz tarball.


I can see some value in that kind of information, ie. knowing what 
patches a binary was built with, but this would only solve the problem 
for git checkouts. Even for a git checkout, the binaries won't be 
automatically updated unless you run configure again, which makes it 
pretty unreliable.


-1 from me.

PS, the git command in the patch doesn't work with my version of git:

$ git describe --tags --long --dirty HEAD
fatal: --dirty is incompatible with committishes
$ git --version
git version 1.8.4.rc3

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.

2013-11-05 Thread MauMau

From: Albe Laurenz laurenz.a...@wien.gv.at

If I understood the discussion correctly the use case is that
there are advantages to having a database encoding different
from UTF-8, but you'd still want sume UTF-8 columns.

Wouldn't it be a better design to allow specifying the encoding
per column?  That would give you more flexibility.


Yes, you are right.  In the previous discussion:

- That would be nice if available, but it is hard to implement multiple 
encodings in one database.
- Some people (I'm not sure how many) are using NCHAR/NVARCHAR in other DBMSs.
To invite them to PostgreSQL, it's important to support the national character
feature syntactically and document it in the manual.  This is the first
step.
- As the second step, we can implement multiple encodings in one database. 
According to the SQL standard, NCHAR(n) is equivalent to CHAR(n) 
CHARACTER SET cs, where cs is an implementation-defined character set.



Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.

2013-11-05 Thread Albe Laurenz
MauMau wrote:
 From: Albe Laurenz laurenz.a...@wien.gv.at
 If I understood the discussion correctly the use case is that
 there are advantages to having a database encoding different
 from UTF-8, but you'd still want sume UTF-8 columns.

 Wouldn't it be a better design to allow specifying the encoding
 per column?  That would give you more flexibility.
 
 Yes, you are right.  In the previous discussion:
 
 - That would be nice if available, but it is hard to implement multiple
 encodings in one database.

Granted.

 - Some people (I'm not sure how many) are using NCHAR/NVARCHAR in other DBMSs.
 To invite them to PostgreSQL, it's important to support the national character
 feature syntactically and document it in the manual.  This is the first
 step.

I looked into the Standard, and it does not have NVARCHAR.
The type is called NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING
or NCHAR VARYING.

I guess that the goal of this patch is to support Oracle syntax.
But anybody trying to port CREATE TABLE statements from Oracle
is already exposed to enough incompatibilities that the difference between
NVARCHAR and NCHAR VARYING will not be the reason to reject PostgreSQL.

In other words, I doubt that introducing the nonstandard NVARCHAR
will have more benefits than drawbacks (new reserved word).

Regarding the Standard compliant names of these data types, PostgreSQL
already supports those.  Maybe some documentation would help.
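For example, this already works today and simply maps to the ordinary
character types:

CREATE TABLE nctest (a NATIONAL CHARACTER VARYING(10), b NCHAR(5));
-- \d nctest shows a as character varying(10) and b as character(5)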

 - As the second step, we can implement multiple encodings in one database.
 According to the SQL standard, NCHAR(n) is equivalent to CHAR(n)
 CHARACTER SET cs, where cs is an implementation-defined character set.

That second step would definitely have benefits.

But I don't think that this requires the first step that your patch
implements, it is in fact orthogonal.

I don't think that there is any need to change NCHAR even if we
get per-column encoding, it is just syntactic sugar to support
SQL Feature F421.

Why not tackle the second step first?

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Claudio Freire
On Tue, Nov 5, 2013 at 6:57 AM, Leonardo Francalanci m_li...@yahoo.it wrote:
 Simon Riggs wrote
 Minmax indexes seem to surprise many people, so broad generalisations
 aren't likely to be useful.

 I think the best thing to do is to publish some SQL requests that
 demonstrate in detail what you are trying to achieve and test them
 against minmax indexes. That way we can discuss what does work and
 what doesn't work well enough yet.

 While I do believe in testing (since "in theory there is no difference
 between theory and practice; in practice there is"), I would like to know
 the properties of the minmax index before trying it.
 What is it supposed to be good at? What are the pros/cons? We can't ask all
 the users to just try the index and see if it works for them.
 As I said, my understanding is that it is very efficient (both in insertion and
 in searching) when data is somehow ordered in the table. But maybe I got it
 wrong...

Well, for one, random inserts (with random data) on a min-max index
have a roughly 1/N chance of requiring a write to disk, and (N-1)/N
chance of being completely free (or maybe a read to verify a write
isn't needed, but that'll probably hit shared buffers), where N is the
number of tuples per page. Per index page that is.

Of course, non-random workloads are a different matter.

Min-max indexes always require a sequential scan of the min-max index
itself when querying. That works when you intend to query enough
tuples to make up the cost (that is, more tuples than M * N *
random_cost / seq_cost), where M is the number of pages in the index.
Well, actually, since they result in better I/O patterns as well, the
tradeoff is probably a little bit more tricky than that, in favor of
min-max indexes.

Min-max indexes tend to be very compact, so M is usually low.
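Just to put illustrative numbers on that formula (made up, not measured):
with N = 100 tuples per page, M = 100 index pages and random_cost/seq_cost
= 4, the break-even point is roughly

SELECT 100 * 100 * 4 AS breakeven_tuples;  -- 40000

i.e. with those numbers a min-max scan only pays off for queries expected to
fetch on the order of 40000 tuples or more.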


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements: calls under-estimation propagation

2013-11-05 Thread Sameer Thakur
Hello,
Please find attached pg_stat_statements-identification-v9.patch.
I have tried to address the following review comments
1. Use version PGSS_TUP_V1_2
2. Fixed total time being zero
3. Remove 'session_start' from the view and use point release number
to generate queryid
4. Hide only queryid and query text and not all fields from unauthorized user
5. Removed introduced field from view and code as statistics session
concept is not being used
6. Removed struct Instrumentation usage
7. Updated sgml to reflect changes made. Removed all references to
statistics session, and introduced fields.

regards
Sameer


pg_stat_statements-identification-v9.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-security writer-side checks proposal

2013-11-05 Thread Robert Haas
On Mon, Nov 4, 2013 at 8:00 PM, Craig Ringer cr...@2ndquadrant.com wrote:
 On 11/04/2013 09:55 PM, Robert Haas wrote:
 I continue to think that this syntax is misguided.  For SELECT and
 DELETE there is only read-side security, and for INSERT there is only
 write-side security, so that's OK as far as it goes, but for UPDATE
 both read-side security and write-side security are possible, and
 there ought to be a way to get one without the other.  This syntax
 won't support that cleanly.

 That's what I was thinking earlier too - separate FOR READ and FOR
 WRITE instead.

 The reason I came back to insert/update/delete was that it's entirely
 reasonable to want to prohibit deletes but permit updates to the same
 tuple. Both are writes; both set xmax, it's just that one _replaces_ the
 tuple, the other doesn't.

 So really, there are four cases:

 READ
 WRITE INSERT
 WRITE UPDATE
 WRITE DELETE

Isn't READ similarly divisible into READ SELECT, READ UPDATE, and READ DELETE?

 I would generally expect that most people would want either read side
 security for all commands or read and write side security for all
 commands.  I think whatever syntax we come up with this feature ought
 to make each of those things straightforward to get.

 but sometimes with different predicates for read and write, i.e. you can
 see rows you can't modify or can insert rows / update rows that you
 can't see after the change.

Yes, that's possible.

 Similarly, saying you can update but not delete seems quite reasonable
 to me.

If you simply want to allow UPDATE but not DELETE, you can refrain
from granting the table-level privilege.  The situation in which you
need things separate is when you want to allow both UPDATE and DELETE
but with different RLS quals for each.
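That is, for the simple case no new syntax is needed at all (role and table
names here are only for illustration):

GRANT SELECT, UPDATE ON accounts TO clerk;  -- no DELETE privilege granted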

 On the other hand, we might choose to say if you want to do things with
 that granularity use your own triggers to enforce it and provide only
 READ and WRITE for RLS.

The funny thing about this whole feature is that it's just syntax
support for doing things that you can already do in other ways.  If
you want read-side security, create a security_barrier view and select
from that instead of hitting the table directly.  If you want
write-side security, enforce it using triggers.  So essentially what
this is, I think, is an attempt to invent nicer syntax around
something that we already have, and provide a one-stop-shopping
experience rather than a roll-your-own experience for people who want
row-level security.
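For instance, the read-side half is already expressible today with something
like (names invented for illustration):

CREATE VIEW my_documents WITH (security_barrier) AS
    SELECT * FROM documents WHERE owner = current_user;
GRANT SELECT ON my_documents TO doc_readers;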

Now maybe that's fine.  But given that, I think it's pretty important
that we get the syntax right.  Because if you're adding a feature
primarily to add a more convenient syntax, then the syntax had better
actually be convenient.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.4] row level security

2013-11-05 Thread Robert Haas
On Mon, Nov 4, 2013 at 8:46 PM, Craig Ringer cr...@2ndquadrant.com wrote:
 On 11/04/2013 11:17 PM, Robert Haas wrote:
  I'd still like to hear what's wrong with what I said here:

 http://www.postgresql.org/message-id/ca+tgmoyr1phw3x9vnvuwdcfxkzk2p_jhtwc0fv2q58negcx...@mail.gmail.com

 For me, just my understanding. I'm still too new to the planner and
 rewriter to grasp that properly as written.

 I was responding to Tom's objection, and he makes a good point about
 CASE and optimisation. We have to be free to re-order and pre-evaluate
 where LEAKPROOF flags make it safe and permissible, without ever
 otherwise doing so. That makes the SECURITY BARRIER subquery look
 better, since the limited pull-up / push-down is already implemented there.

 Robert, any suggesitons on how to approach what you suggest? I'm pretty
 new to the planner's guts, but I know there've been some complaints
 about the way the current RLS code fiddles with Vars when it changes a
 direct scan of a rel into a subquery scan.

 The code in:

 https://github.com/ringerc/postgres/blob/rls-9.4/src/backend/optimizer/prep/prepunion.c#L1647

 and

 https://github.com/ringerc/postgres/blob/rls-9.4/src/backend/optimizer/prep/prepunion.c#L1591

 seems to be the one folks were complaining about earlier.

I haven't studied this patch in detail, but I see why there's some
unhappiness about that code: it's an RLS-specific kluge.  Just
shooting from the hip here, maybe we should attack the problem of
making security-barrier views updatable first, as a separate patch.  I
would think that if we make that work, this will also work without,
hopefully, any special hackery.  And we'd get a separate,
independently useful feature out of it, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql: Remove internal uses of CTimeZone/HasCTZSet.

2013-11-05 Thread Robert Haas
On Mon, Nov 4, 2013 at 8:17 PM, Noah Misch n...@leadboat.com wrote:
 On Mon, Nov 04, 2013 at 02:34:02PM -0500, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  On Fri, Nov 01, 2013 at 04:51:34PM +, Tom Lane wrote:
  Remove internal uses of CTimeZone/HasCTZSet.

  This changed EncodeDateTime() output for USE_SQL_DATES and USE_GERMAN_DATES
  styles, because it inserts a space before tzn but does not insert a space
  before EncodeTimezone() output.  Example:

set datestyle = sql,mdy;
select '2013-01-01'::timestamptz;

  old output:

        timestamptz
 ------------------------
  01/01/2013 00:00:00+00

  new output:

        timestamptz
 -------------------------
  01/01/2013 00:00:00 +00

  I assume this was unintended.

 Yeah, I had seen some cases of that.  I don't find it of great concern.
 We'd have to insert some ugly special-case code that looks at the zone
 name to decide whether to insert a space or not, and I don't think it'd
 actually be an improvement to do so.  (Arguably, these formats are
 more consistent this way.)  Still, this change is probably a sufficient
 reason not to back-patch this part of the fix.

 I was prepared to suppose that no substantial clientele relies on the
 to_char() TZ format code expanding to blank, the other behavior that changed
 with this patch.  It's more of a stretch to figure applications won't stumble
 over this new whitespace.  I do see greater consistency in the new behavior,
 but changing type output is a big deal.  While preserving the old output does
 require ugly special-case code, such code was in place for years until removal
 by this commit and the commit following it.  Perhaps making the behavior
 change is best anyway, but we should not conclude that decision on the basis
 of its origin as a side effect of otherwise-desirable refactoring.

I have to admit I fear this will break a huge amount of application code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] exit_horribly vs exit_nicely in pg_dump

2013-11-05 Thread Pavel Golub
Hello.

Examining the pg_dump sources recently, I've found that different exit
procedures are used for the same situations.

A quick example from pg_dump.c:

if (dataOnly && schemaOnly)
    exit_horribly(NULL, "options -s/--schema-only and -a/--data-only cannot be used together\n");

if (dataOnly && outputClean)
    exit_horribly(NULL, "options -c/--clean and -a/--data-only cannot be used together\n");

if (dump_inserts && oids)
{
    write_msg(NULL, "options --inserts/--column-inserts and -o/--oids cannot be used together\n");
    write_msg(NULL, "(The INSERT command cannot set OIDs.)\n");
    exit_nicely(1);
}

I suppose this should be a call to exit_nicely() for all possible cases.

The only need for calling exit_horribly() is when we are deep down in
the multithreaded code, AFAIK.

-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-security writer-side checks proposal

2013-11-05 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 Now maybe that's fine.  But given that, I think it's pretty important
 that we get the syntax right.  Because if you're adding a feature
 primarily to add a more convenient syntax, then the syntax had better
 actually be convenient.

I agree that we want to get the syntax correct, but also very clear as
it's security related and we don't want anyone surprised by what happens
when they use it.  The idea, as has been discussed in the past, is to
then allow tying RLS in with SELinux and provide MAC.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Oskari Saarenmaa
On Tue, Nov 05, 2013 at 02:06:26PM +0200, Heikki Linnakangas wrote:
 On 05.11.2013 12:22, Oskari Saarenmaa wrote:
 This makes it easy to see if the binaries were built from a real release
 or if they were built from a custom git tree.
 
 Hmm, that would mean that a build from git checkout REL9_3_1
 produces a different binary than one built from
 postgresql-9.3.1.tar.gz tarball.

Good point - how about adding git describe information only if the checkout
doesn't match a tag exactly?  So when you build REL9_3_1 there would be no
extra information, but when you have any local changes on top of it we'd add
the git describe output.

 I can see some value in that kind of information, ie. knowing what
 patches a binary was built with, but this would only solve the
 problem for git checkouts. Even for a git checkout, the binaries
 won't be automatically updated unless you run configure again,
 which makes it pretty unreliable.
 
 -1 from me.

I don't think we can solve the problem of finding local changes for all the
things people may do, but I'd guess it's pretty common to build postgresql
from a clean local git checkout and with this change at least some portion
of users would get some extra information.  To solve the "rerun configure"
issue we could put the git version in a new header file and have a makefile
dependency on .git/index for regenerating that file when needed.

We could also let users override the extra version with a command line
argument for configure so distributions could put the distribution package
version there, for example 9.3.1-2.fc20 on my Fedora system.

 PS, the git command in the patch doesn't work with my version of git:
 
 $ git describe --tags --long --dirty HEAD
 fatal: --dirty is incompatible with committishes
 $ git --version
 git version 1.8.4.rc3

I initially used HEAD when looking at the git describe values, but then
added --dirty which obviously doesn't make sense when describing a ref.

Sorry about the broken patch, I was applying these changes manually from
configure.in to configure because I didn't have the proper autoconf version
installed (autoconf 2.63 doesn't seem to be available in many distributions
anymore, maybe the required version could be upgraded at some point..)

How about the patch below to fix the exact tag and --dirty issues?

/ Oskari

diff --git a/configure.in b/configure.in
index a4baeaf..d253286 100644
--- a/configure.in
+++ b/configure.in
@@ -29,7 +29,18 @@ AC_CONFIG_AUX_DIR(config)
 AC_PREFIX_DEFAULT(/usr/local/pgsql)
 AC_SUBST(configure_args, [$ac_configure_args])
 
-AC_DEFINE_UNQUOTED(PG_VERSION, "$PACKAGE_VERSION", [PostgreSQL version as a string])
+# Append git tag based version to PG_VERSION if we're building from a git
+# checkout that doesn't match a tag exactly.  Don't touch PACKAGE_VERSION
+# which is used to create other version variables (major version and numeric
+# version.)
+PG_VERSION="$PACKAGE_VERSION"
+if test -d .git; then
+  exact=`git describe --tags --exact-match --dirty 2>/dev/null || echo dirty`
+  if echo "$exact" | grep -q dirty; then
+    PG_VERSION="$PG_VERSION (`git describe --tags --long --dirty || echo unknown`)"
+  fi
+fi
+AC_DEFINE_UNQUOTED(PG_VERSION, "$PG_VERSION", [PostgreSQL version as a string])
 [PG_MAJORVERSION=`expr $PACKAGE_VERSION : '\([0-9][0-9]*\.[0-9][0-9]*\)'`]
 AC_SUBST(PG_MAJORVERSION)
 AC_DEFINE_UNQUOTED(PG_MAJORVERSION, "$PG_MAJORVERSION", [PostgreSQL major version as a string])
-- 
1.8.4.2



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Michael Paquier
On Tue, Nov 5, 2013 at 2:05 PM, Oskari Saarenmaa o...@ohmu.fi wrote:
 I can see some value in that kind of information, ie. knowing what
 patches a binary was built with, but this would only solve the
 problem for git checkouts. Even for a git checkout, the binaries
 won't be automatically updated unless you run configure again,
 which makes it pretty unreliable.

 -1 from me.

 I don't think we can solve the problem of finding local changes for all the
 things people may do, but I'd guess it's pretty common to build postgresql
 from a clean local git checkout and with this change at least some portion
 of users would get some extra information.  To solve the rerun configure
 thing we could put git version in a new header file and have a makefile
 dependency on .git/index for regenerating that file when needed.

 We could also let users override the extra version with a command line
 argument for configure so distributions could put the distribution package
 version there, for example 9.3.1-2.fc20 on my Fedora system.

 PS, the git command in the patch doesn't work with my version of git:

 $ git describe --tags --long --dirty HEAD
 fatal: --dirty is incompatible with committishes
 $ git --version
 git version 1.8.4.rc3

 I initially used HEAD when looking at the git describe values, but then
 added --dirty which obviously doesn't make sense when describing a ref.

 Sorry about the broken patch, I was applying these changes manually from
 configure.in to configure because I didn't have the proper autoconf version
 installed (autoconf 2.63 doesn't seem to be available in many distributions
 anymore, maybe the required version could be upgraded at some point..)

 How about the patch below to fix the exact tag and --dirty issues?
A similar thing was discussed a couple of months ago, and the
idea of adding any git-related information in configure was
rejected. You can have a look at the discussion here:
http://www.postgresql.org/message-id/3038.1374031...@sss.pgh.pa.us
As well as a potential solution here:
http://www.postgresql.org/message-id/c51433da5e804767724d60eea57f4178.squir...@webmail.xs4all.nl

Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column order and physical column order

2013-11-05 Thread Alvaro Herrera
David Rowley escribió:

 In this case how does PostgreSQL know that attnum 3 is the 2nd user column
 in that table? Unless I have misunderstood something, there must be
 some logic in there to skip dropped columns, and if so, could it not
 just grab the attphynum at that location and then just modify the 1-5 places
 listed above to sort on attlognum?

During parse analysis, those columns obtained from pg_attribute are
transformed to target list entries; they travel through the parser and
executor in that representation, and TupleDescs are constructed from
those lists.  Making that work correctly needs some more code than just
sorting on attlognum.  There are some other messy parts like handling
composite types when passed to functions, COPY, and some other things I
don't remember.  Did you look at the places my patch touches?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Claudio Freire wrote
 Min-max indexes always require a sequential scan of the min-max index
 itself when querying. 

I'm worried about the number of heap pages that will be scanned.
My understanding is that given the random input, the index will
not be selective enough, and will end up requiring a lot of page 
scanning to get the relevant rows.

That is: the selectivity of the min/max values will be very low, given
the high cardinality and randomness of the input; I'm afraid that, in
most page ranges, the min will be lower than the queried ones,
and the max higher, resulting in lots of heap page scans.

Quick test:
a table with 1M rows, with random values from 1 to 1000:
create table t as select generate_series(1, 100) as i, trunc(random() *
1000) as n;

suppose a page range contains 100 rows (but my understanding is that minmax
index will use a much bigger row count...), let's find how many page
ranges
should be scanned to find a series of 50 values (again, random in the
1-1000 range):

with cte as 
 (select min(n) as minn, max(n) as maxn, i/100 from t group by i/100),
 inp as (select generate_series(1, 50) iinp, trunc(random() * 1000) as
s)
 select count(*) from inp
   left outer join cte on inp.s between minn and maxn group by iinp


I get > 9000 pages for 49 values out of 50... which means scanning 90% of
the table.

Either my SQL is not correct (likely), or my understanding of the minmax
index is not correct (even more likely), or the minmax index is not usable
in a random-inputs scenario.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5777009.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Tom Lane
Oskari Saarenmaa o...@ohmu.fi writes:
 On Tue, Nov 05, 2013 at 02:06:26PM +0200, Heikki Linnakangas wrote:
 I can see some value in that kind of information, ie. knowing what
 patches a binary was built with, but this would only solve the
 problem for git checkouts. Even for a git checkout, the binaries
 won't be automatically updated unless you run configure again,
 which makes it pretty unreliable.
 
 -1 from me.

 I don't think we can solve the problem of finding local changes for all the
 things people may do, but I'd guess it's pretty common to build postgresql
 from a clean local git checkout and with this change at least some portion
 of users would get some extra information.

I agree with Heikki that this is basically useless.  Most of my builds are
from git + uncommitted changes, so telling me what the top commit was has
no notable value.  Even if I always committed before building, the hash
tags are only decipherable until I discard that branch.  So basically, this
would only be useful to people building production servers from random git
pulls from development or release-branch mainline.  How many people really
do that, and should we inconvenience everybody else to benefit them?
(Admittedly, the proposed patch is only marginally inconvenient as-is,
but anything that would force a header update after any commit would
definitely put me on the warpath.)

BTW, I don't think the proposed patch works at all in a VPATH build.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql: Remove internal uses of CTimeZone/HasCTZSet.

2013-11-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Nov 4, 2013 at 8:17 PM, Noah Misch n...@leadboat.com wrote:
 I was prepared to suppose that no substantial clientele relies on the
 to_char() TZ format code expanding to blank, the other behavior that 
 changed
 with this patch.  It's more of a stretch to figure applications won't stumble
 over this new whitespace.  I do see greater consistency in the new behavior,
 but changing type output is a big deal.  While preserving the old output does
 require ugly special-case code, such code was in place for years until 
 removal
 by this commit and the commit following it.  Perhaps making the behavior
 change is best anyway, but we should not conclude that decision on the basis
 of its origin as a side effect of otherwise-desirable refactoring.

 I have to admit I fear this will break a huge amount of application code.

I don't believe that.  You are talking about the intersection of

(1) people who use a brute force timezone (which we already know is
a small set, else the original bug would've been noticed long ago).
(2) people who use SQL or German datestyle, neither of which is
default.
(3) people whose apps will fail if there's whitespace at a specific place
in timestamp output, even though in many other cases there will be
whitespace at that place anyway, notably including the case where they
select any regular timezone.

It's possible that this intersection is nonempty, but huge amount?
Come on.  This is minor compared to the application compatibility
issues we come up with in every major release.  More, I would argue
that the number of failing applications is likely to be exceeded by
the number that no longer fail upon selection of a brute-force zone,
because they'd only ever been coded or tested with the normal-zone-name
formatting.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Claudio Freire
On Tue, Nov 5, 2013 at 11:28 AM, Leonardo Francalanci m_li...@yahoo.it wrote:
 I get  9000 pages for 49 values out of 50... which means scanning 90% of
 the table.

 Either my sql is not correct (likely), or my understanding of the minmax
 index is
 not correct (even more likely), or the minmax index is not usable in a
 random inputs
 scenario.


Yep, you're correct. That's the cost for querying random values.

But real data isn't truly random, and you haven't really
analyzed update cost, which is what we were talking about in that last
post.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Claudio Freire wrote
 real data isn't truly random

Well, let's try normal_rand???

create table t1 as select trunc(normal_rand(100, 50, 3)) as n,
generate_series(1, 100) as i;

with cte as 
 (select min(n) as minn, max(n) as maxn, i/100 from t1 group by i/100),
 inp as (select generate_series(1, 100) iinp, trunc(normal_rand(100,
50, 3)) as s)

 select count(*),iinp from inp
   left outer join cte on inp.s between minn and maxn group by iinp;

Not that much different in my run... 



Claudio Freire wrote
 you haven't really
 analyzed update cost, which is what we were talking about in that last
 post.

I don't care for a better update cost if the cost to query is a table scan.
Otherwise, I'll just claim that no index at all is even better than minmax:
0 update cost, pretty much same query time.

Maybe there's value in minmax indexes for sequential data, but not for
random data, which is the topic of this thread.

BTW I would like to see some performance tests on the minmax indexes
vs btree for the sequential inputs... is the gain worth it? I couldn't find
any mention of performance tests in the minmax threads.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5777020.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 I agree with Heikki that this is basically useless.  Most of my builds are
 from git + uncommitted changes, so telling me what the top commit was has
 no notable value.  

The focus of this change would really be, imv anyway, for more casual
PG developers, particularly those familiar with github and who work with
branches pushed there by other developers.

 Even if I always committed before building, the hash
 tags are only decipherable until I discard that branch.

Certainly true - but then, are you typically keeping binaries around
after you discard the branch?  Or distributing those binaries to others?
Seems unlikely.  However, if you're pulling in many branches from remote
locations and building lots of PG binaries, keeping it all straight and
being confident you didn't mix anything up can be a bit more challenging.

 So basically, this
 would only be useful to people building production servers from random git
 pulls from development or release-branch mainline.  How many people really
 do that, and should we inconvenience everybody else to benefit them?

Not many do it today because we actively discourage it by requiring
patches to be posted to the mailing list, and the number of people
writing PG patches is relatively small.  Even so, I can see that folks
like certain PG-on-cloud providers, who do testing, or even
deployments, with various patches to provide us feedback on them, and
therefore have to manage a bunch of different binaries, might find it
useful.

 (Admittedly, the proposed patch is only marginally inconvenient as-is,
 but anything that would force a header update after any commit would
 definitely put me on the warpath.)
 
 BTW, I don't think the proposed patch works at all in a VPATH build.

Clearly, that would need to be addressed.

All-in-all, I'm not super excited about this but I also wouldn't mind,
so while not really a '+1', I'd say '+0'.  Nice idea, if it isn't
painful to deal with and maintain.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 So basically, this
 would only be useful to people building production servers from random git
 pulls from development or release-branch mainline.  How many people really
 do that, and should we inconvenience everybody else to benefit them?

 Not many do it today because we actively discourage it by requiring
 patches to be posted to the mailing list and the number of people
 writing PG patches is relatively small.  Even so though, I can see folks
 like certain PG-on-cloud providers, who are doing testing, or even
 deployments, with various patches to provide us feedback on them, and
 therefore have to manage a bunch of different binaries, might find it
 useful.

I can see that there might be a use for tagging multiple binaries,
I just don't believe that this is a particularly helpful way to do it.
The last-commit tag is neither exactly the right data nor even a little
bit user-friendly.  What about, say, a configure option to add a
user-specified string to the version() result?
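Something that would, say, show up like this (entirely hypothetical; the
option name and output format are invented here, not an existing feature):

-- after building with, e.g., ./configure --with-extra-version='acme-patched-3'
SELECT version();
-- PostgreSQL 9.4devel (acme-patched-3) on x86_64-unknown-linux-gnu ...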

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Andrew Dunstan


On 11/05/2013 10:32 AM, Stephen Frost wrote:


All-in-all, I'm not super excited about this but I also wouldn't mind,
so while not really a '+1', I'd say '+0'.  Nice idea, if it isn't
painful to deal with and maintain.




I doubt it's buying us anything worth having. What's more, it might get 
in the road of some other gadget that would be worth having.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 What about, say, a configure option to add a
 user-specified string to the version() result?

I quite like that idea, personally.  Folks who care about it being a git
tag could then trivially get that also.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] better atomics

2013-11-05 Thread Andres Freund
Hi,

(Coming back to this now)

On 2013-10-28 21:55:22 +0100, Andres Freund wrote:
 The list I have previously suggested was:
 
 * pg_atomic_load_u32(uint32 *)
 * uint32 pg_atomic_store_u32(uint32 *)
 
 To be able to write code without spreading volatiles all over while
 making it very clear that that part of the code is worthy of attention.
 
 * uint32 pg_atomic_exchange_u32(uint32 *ptr, uint32 val)
 * bool pg_atomic_compare_exchange_u32(uint32 *ptr, uint32 *expected, uint32 
 newval)

So what I've previously proposed did use plain types for the
to-be-manipulated atomic integers. I am not sure about that anymore
though: C11 includes support for atomic operations, but it assumes that
the to-be-manipulated variable is of a special atomic type. While I am
not proposing that we try to emulate C11's API completely (basically
impossible, as it uses type-agnostic functions, and it also exposes too
much), it seems to make sense to allow PG's atomics to be implemented by
C11's.

The API is described at: http://en.cppreference.com/w/c/atomic

The fundamental difference to what I've suggested so far is that the
atomic variable isn't a plain integer/type but a distinct datatype that
can *only* be manipulated via the atomic operators. That has the nice
property that it cannot be accidentally manipulated via plain operators.

E.g. it wouldn't be
uint32 pg_atomic_fetch_add_u32(uint32 *ptr, uint32 add_);
but
uint32 pg_atomic_fetch_add_u32(pg_atomic_uint32 *ptr, uint32 add_);

where, for now, atomic_uint32 is something like:
typedef struct pg_atomic_uint32
{
volatile uint32 value;
} pg_atomic_uint32;
a future C11 implementation would just typedef C11's respective atomic
type to pg_atomic_uint32.

Comments?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Andrew Dunstan


On 11/05/2013 10:53 AM, Stephen Frost wrote:

* Tom Lane (t...@sss.pgh.pa.us) wrote:

What about, say, a configure option to add a
user-specified string to the version() result?

I quite like that idea, personally.  Folks who care about it being a git
tag could then trivially get that also.



+1.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-security writer-side checks proposal

2013-11-05 Thread Robert Haas
On Tue, Nov 5, 2013 at 9:01 AM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 Now maybe that's fine.  But given that, I think it's pretty important
 that we get the syntax right.  Because if you're adding a feature
 primarily to add a more convenient syntax, then the syntax had better
 actually be convenient.

 I agree that we want to get the syntax correct, but also very clear as
 it's security related and we don't want anyone surprised by what happens
 when they use it.  The idea, as has been discussed in the past, is to
 then allow tying RLS in with SELinux and provide MAC.

No argument.  I think convenient and unsurprising are closely-aligned goals.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] configure: allow adding a custom string to PG_VERSION

2013-11-05 Thread Oskari Saarenmaa
This can be used to tag custom built packages with an extra version string
such as the git describe id or distribution package release version.

Based on pgsql-hackers discussion:
http://www.postgresql.org/message-id/20131105102226.ga26...@saarenmaa.fi

Signed-off-by: Oskari Saarenmaa o...@ohmu.fi
---
 configure.in | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/configure.in b/configure.in
index a4baeaf..5459c71 100644
--- a/configure.in
+++ b/configure.in
@@ -29,11 +29,15 @@ AC_CONFIG_AUX_DIR(config)
 AC_PREFIX_DEFAULT(/usr/local/pgsql)
 AC_SUBST(configure_args, [$ac_configure_args])
 
-AC_DEFINE_UNQUOTED(PG_VERSION, $PACKAGE_VERSION, [PostgreSQL version as a 
string])
 [PG_MAJORVERSION=`expr $PACKAGE_VERSION : '\([0-9][0-9]*\.[0-9][0-9]*\)'`]
 AC_SUBST(PG_MAJORVERSION)
 AC_DEFINE_UNQUOTED(PG_MAJORVERSION, $PG_MAJORVERSION, [PostgreSQL major 
version as a string])
 
+# Allow adding a custom string to PG_VERSION
+PGAC_ARG_REQ(with, extra-version, [NAME], [extra version information],
+[PG_VERSION=$PACKAGE_VERSION ($withval)], [PG_VERSION=$PACKAGE_VERSION])
+AC_DEFINE_UNQUOTED(PG_VERSION, $PG_VERSION, [PostgreSQL version as a string])
+
 AC_CANONICAL_HOST
 
 template=
@@ -1920,7 +1924,7 @@ else
 fi
 
 AC_DEFINE_UNQUOTED(PG_VERSION_STR,
-   [PostgreSQL $PACKAGE_VERSION on $host, compiled by 
$cc_string, `expr $ac_cv_sizeof_void_p \* 8`-bit],
+   [PostgreSQL $PG_VERSION on $host, compiled by $cc_string, 
`expr $ac_cv_sizeof_void_p \* 8`-bit],
[A string containing the version number, platform, and C 
compiler])
 
 # Supply a numeric version string for use by 3rd party add-ons
-- 
1.8.4.2
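
For illustration, a build configured with the new option, say
--with-extra-version=myorg-3, would report roughly the following (the host
and compiler strings here are made up):

SELECT version();
-- PostgreSQL 9.4devel (myorg-3) on x86_64-unknown-linux-gnu,
-- compiled by gcc (GCC) 4.8.2, 64-bit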



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Disallow pullup of a subquery with a subquery in its targetlist?

2013-11-05 Thread Tom Lane
Back at
http://www.postgresql.org/message-id/520d221e.2060...@gmail.com
there was a complaint about strange behavior of a query that looks
basically like this:

SELECT ...
FROM
 (SELECT ... ,
 ( SELECT ...
 ORDER BY random()
 LIMIT 1
 ) AS color_id
  FROM ...
 ) s
 LEFT JOIN ...

The problem is that the planner decides it can pull up the subquery s,
or flatten it into the outer query.  This entails substituting the
subquery's targetlist expressions for outer-query Vars referencing s,
and there's more than one reference to s.color_id.  So we get multiple
copies of the inner subquery, and they will produce different results
at runtime due to the use of random().  This results in inconsistent
behavior.
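
For a self-contained illustration of the hazard (this is not the reporter's
query; it just uses VALUES and generate_series so anyone can run it):

SELECT s.color_id AS ref_one, s.color_id AS ref_two
FROM (SELECT i,
             (SELECT c
              FROM (VALUES ('red'), ('blue')) AS v(c)
              ORDER BY random()
              LIMIT 1) AS color_id
      FROM generate_series(1, 1) AS i) AS s;
-- once s is pulled up, each outer reference to s.color_id carries its own
-- copy of the inner SELECT, so ref_one and ref_two can disagree in one row.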

We decided long ago that we should forbid pullup of subqueries that
contain volatile functions in their targetlists, because of what's
basically the same hazard: you might get more evaluations of the
volatile functions than you expected, yielding inconsistent results
and/or unwanted side-effects.

I first wondered why the instance of random() didn't prevent pullup
in this example.  That's because contain_volatile_functions() does
not recurse into SubLinks, which maybe is the wrong thing; but
I'm hesitant to change it without detailed analysis of all the
(many) call sites.

However, I think that a good case could also be made for fixing this
by deciding that we shouldn't pull up if there are SubLinks in the
subquery targetlist, period.  Even without any volatile functions,
multiple copies of a subquery seem like a probable loser cost-wise.

Thoughts?  If we do change this, should we back-patch it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Jeff Janes
On Tue, Nov 5, 2013 at 12:25 AM, Leonardo Francalanci m_li...@yahoo.itwrote:

 Andres Freund-3 wrote
  On 2013-11-04 11:27:33 -0500, Robert Haas wrote:
  On Mon, Nov 4, 2013 at 11:24 AM, Claudio Freire <klaussfreire@> wrote:
   Such a thing would help COPY, so maybe it's worth a look
 
  I have little doubt that a deferred insertion buffer of some kind
  could help performance on some workloads, though I suspect the buffer
  would have to be pretty big to make it worthwhile on a big COPY that
  generates mostly-random insertions.
 
  Even for random data presorting the to-be-inserted data appropriately
  could result in much better io patterns.

 Mmh, I'm afraid that the buffer should be huge to get some real advantage.
 You have to buffer enough values to avoid touching entire pages, which is
 not that easy.


Some experiments I did a few years ago showed that applying sorts to the
data to be inserted could be helpful even when the sort batch size was as
small as one tuple per 5 pages of existing index.  Maybe even less.

Cheers,

Jeff


Re: Handle LIMIT/OFFSET before select clause (was: [HACKERS] Feature request: optimizer improvement)

2013-11-05 Thread Joe Love
I'm wondering what type of index would work for this as it is a volatile
function. Not knowing how PG's optimizer runs, I'm at a loss as to why this
wouldn't be possible or worth doing. It seems to me that all functions in
the select part of the statement could be calculated at the end of the
query after the results have been gathered, and even after the sorting had
been done as long as the column wasn't part of the order by (or perhaps
group by).

I have an entire set of functions that perform in this way. For example,
I'm selecting a list of all my products and the function does a complex
calculation based on inventory in the warehouse + expected deliveries from
the factory to determine how many of each item are available, and when they
first become available.  What's helpful is for the user's search criteria
to initially limit the search result, and then I want to paginate the
results and only show them a few at a time. In the verbose syntax I
mentioned originally, the query performs well; in the most straightforward
syntax, it does not. I'm not sure I even need to hint the optimizer to
perform this type of optimization, as it seems it would be beneficial (or
at least not detrimental) 100% of the time.


On Sat, Nov 2, 2013 at 10:06 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Atri Sharma atri.j...@gmail.com writes:
  I understand the reasons for executing SELECT before the sort. But,
  couldnt we get the planner to see the LIMIT part and push the sort
  node above the select node for this specific case?

 [ Shrug... ]  I don't see the point.  If the OP actually cares about the
 speed of this query, he's going to want to avoid the sort step too,
 which is what makes the index a good idea.

 More generally, this is not a transformation we could apply
 unconditionally --- at the very least it'd need to be avoided when
 volatile functions are involved, and I don't think it's invariably
 a win from a cost standpoint even if there aren't semantic blockers.
 But right now the planner has no real ability to reason about placement
 of SELECT-list evaluation: it's done in a fixed spot in any particular
 plan structure.  I don't think it's practical to add such considerations
 to the rat's nest that is grouping_planner and friends.  I have
 ambitions to replace all that with a Path-generation-and-comparison
 approach, and the Paths used for this would need to carry some
 indication of which expressions would be evaluated where.  So maybe
 once that's done we could think about whether this is worth doing.
 I remain dubious though.

 regards, tom lane




-- 
Joe's Computer Service
405-227-0951
Computer Running Slow? Call Joe!
$125, Your computer will run like new!
www.JoesComputerService.net


Re: [HACKERS] [PATCH] configure: add git describe output to PG_VERSION when building a git tree

2013-11-05 Thread Jeff Janes
On Tue, Nov 5, 2013 at 6:35 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Oskari Saarenmaa o...@ohmu.fi writes:
  On Tue, Nov 05, 2013 at 02:06:26PM +0200, Heikki Linnakangas wrote:
  I can see some value in that kind of information, ie. knowing what
  patches a binary was built with, but this would only solve the
  problem for git checkouts. Even for a git checkout, the binaries
  won't be automatically updated unless you run configure again,
  which makes it pretty unreliable.
 
  -1 from me.

  I don't think we can solve the problem of finding local changes for all the
  things people may do, but I'd guess it's pretty common to build postgresql
  from a clean local git checkout and with this change at least some portion
  of users would get some extra information.

 I agree with Heikki that this is basically useless.  Most of my builds are
 from git + uncommitted changes, so telling me what the top commit was has
 no notable value.  Even if I always committed before building, the hash
 tags are only decipherable until I discard that branch.


I nearly always remember to set config's prefix to some directory name
that describes the uncommitted changes which I am reviewing or testing.
 Also including into the directory name the git commit to which those
changes were applied is awkward and easy to forget to do--the kind of thing
best done by software.  (And if I've discarded the branch, that pretty much
tells me what I need to know about the binary built from it--obsolete.)

Cheers,

Jeff


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Claudio Freire
On Tue, Nov 5, 2013 at 12:22 PM, Leonardo Francalanci m_li...@yahoo.it wrote:
 Claudio Freire wrote
 you haven't really
 analyzed update cost, which is what we were talking about in that last
 post.

 I don't care for a better update cost if the cost to query is a table scan.
 Otherwise, I'll just claim that no index at all is even better than minmax:
 0 update cost, pretty much same query time.

 Maybe there's value in minmax indexes for sequential data, but not for
 random data, which is the topic of this thread.


Well, of course, they're not magic pixie dust.

But is your data really random? (or normal?)

That's the thing...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part 1: additional information

2013-11-05 Thread Heikki Linnakangas

On 04.11.2013 23:44, Alexander Korotkov wrote:

On Mon, Oct 21, 2013 at 11:12 PM, Alexander Korotkov
aekorot...@gmail.comwrote:


The attached version of the patch is debugged. I would like to note that the
number of bugs was low and it wasn't very hard to debug. I've rerun the tests
on it. You can see that the numbers have improved as a result of your refactoring.

  event | period
---+-
  index_build   | 00:01:45.416822
  index_build_recovery  | 00:00:06
  index_update  | 00:05:17.263606
  index_update_recovery | 00:01:22
  search_new| 00:24:07.706482
  search_updated| 00:26:25.364494
(6 rows)

  label  | blocks_mark
+-
  search_new |   847587636
  search_updated |   881210486
(2 rows)

  label |   size
---+---
  new   | 419299328
  after_updates | 715915264
(2 rows)

Beside simple bugs, there was a subtle bug in incremental item indexes
update. I've added some more comments including ASCII picture about how
item indexes are shifted.

Now, I'm trying to implement support of old page format. Then we can
decide which approach to use.



The attached version of the patch has support for the old page format. The patch
still needs more documentation and probably refactoring, but I believe the idea is
clear and it can be reviewed. In the patch I had to revert the change of null
category placement for compatibility with the old posting list format.


Thanks, just glanced at this quickly.

If I'm reading the patch correctly, old-style non-compressed posting 
tree leaf pages are not vacuumed at all; that's clearly not right.


I argued earlier that we don't need to handle the case that compressing 
a page makes it larger, so that the existing items don't fit on the page 
anymore. I'm having some second thoughts on that; I didn't take into 
account the fact that the mini-index in the new page format takes up 
some space. I think it's still highly unlikely that there isn't enough 
space on a 8k page, but it's not totally crazy with a non-standard small 
page size. So at least that situation needs to be checked for with an 
ereport(), rather than Assert.


To handle splitting a non-compressed page, it seems that you're relying 
on the fact that before splitting, we try to insert, which compresses 
the page. The problem with that is that it's not correctly WAL-logged. 
The split record that follows includes a full copy of both page halves, 
but if the split fails for some reason, e.g. you run out of disk space, 
there is no WAL record at all of the compression. I'd suggest doing 
the compression in the insert phase on a temporary copy of the page, 
leaving the original page untouched if there isn't enough space for the 
insertion to complete. (You could argue that this case can't happen 
because the compression must always create enough space to insert one 
more item; maybe so, but at least there should be an explicit check for 
that.)


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Claudio Freire wrote
 Well, of course, they're not magic pixie dust. 

Of course they aren't. I think they can make a difference in a sequential
input scenario. But I'm not the one who said that they are fit to
solve the problems that I and other people are talking about in this thread.


Claudio Freire wrote
 But is your data really random? (or normal?)
 That's the thing...

I still don't understand.

Even if the data was normal, it wouldn't work. You can try and
change the 3rd parameter in normal_rand and get a narrower
distribution, but at the same time the query values should be
narrower... so you'll get the same results. 

The problem is: if the range you get between min and max is 
too big in each page range, you'll end up scanning a lot of heap
pages.
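
To make that concrete, the sort of random test data in question can be
generated like this (assumes the tablefunc extension; the table and column
names are invented):

CREATE EXTENSION IF NOT EXISTS tablefunc;
CREATE TABLE calls (caller_id bigint);
INSERT INTO calls
SELECT round(x)::bigint
FROM normal_rand(1000000, 5000000, 50000) AS x;  -- (count, mean, stddev)
-- narrowing the 3rd parameter (stddev) narrows the min/max range of each
-- page range, but the values being queried come from that same narrow
-- range, so the fraction of heap pages scanned stays about the same.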

To me, the thing is: has anyone made performance tests of these
minmax indexes with an input that is not sequential?

(BTW I'd like to see tests for the sequential input scenario too...)





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5777055.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Leonardo Francalanci
Jeff Janes wrote
 Some experiments I did a few years ago showed that applying sorts to the
 data to be inserted could be helpful even when the sort batch size was as
 small as one tuple per 5 pages of existing index.  Maybe even less.

Cool!!! Do you have any idea/hint on how I could try and replicate that?
Do you remember how you did it?




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5777056.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Claudio Freire
On Tue, Nov 5, 2013 at 2:52 PM, Leonardo Francalanci m_li...@yahoo.it wrote:
 Jeff Janes wrote
 Some experiments I did a few years ago showed that applying sorts to the
 data to be inserted could be helpful even when the sort batch size was as
 small as one tuple per 5 pages of existing index.  Maybe even less.

 Cool!!! Do you have any idea/hint on how I could try and replicate that?
 Do you remember how you did it?


I do it regularly by sorting tuples before inserting/updating. It
helps quite significantly for batches of ~1000 tuples (well, in my
case).
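
A minimal sketch of that technique (the staging/target tables and key column
are invented names):

-- load the batch into an unindexed staging table, then insert in key order
INSERT INTO target_table
SELECT *
FROM staging_batch
ORDER BY indexed_key;
-- the btree descents then hit neighbouring leaf pages instead of jumping
-- to a random page for every row.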


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List of binary-compatible data types

2013-11-05 Thread Josh Berkus
Noah,

 Also, JSON -> Text seems to be missing from the possible binary
 conversions.  That's a TODO, I suppose.
 
 Only json -> text, not text -> json.  Note that you can add the cast
 manually if you have an immediate need.

Huh?  Why would text -> JSON require a physical rewrite?  We have to
validate it, sure, but we don't need to rewrite it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List of binary-compatible data types

2013-11-05 Thread Noah Misch
On Tue, Nov 05, 2013 at 10:00:15AM -0800, Josh Berkus wrote:
 Noah,
 
  Also, JSON -> Text seems to be missing from the possible binary
  conversions.  That's a TODO, I suppose.
  
  Only json -> text, not text -> json.  Note that you can add the cast
  manually if you have an immediate need.
 
 Huh?  Why would text -> JSON require a physical rewrite?  We have to
 validate it, sure, but we don't need to rewrite it.

That's all true, but the system has no concept like "this cast validates the
data, never changing it".  We would first need to add metadata supporting such
a concept.  On the other hand, "create cast (json as text) without function;"
leans only on concepts the system already knows.
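
A sketch of that workaround (superuser only; it covers just the json-to-text
direction, and the table/column names below are invented):

CREATE CAST (json AS text) WITHOUT FUNCTION;
-- with the binary cast in place, the USING expression is a no-op relabel,
-- so converting a json column to text no longer rewrites the table:
ALTER TABLE events ALTER COLUMN payload TYPE text USING payload::text;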

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Better error message for window-function spec bizarreness

2013-11-05 Thread Tom Lane
We've had a couple of complaints about the error message that's thrown
for the case where you try to copy-and-modify a window definition that
includes a frame clause:
http://www.postgresql.org/message-id/200911191711.najhbped009...@wwwmaster.postgresql.org
http://www.postgresql.org/message-id/CADyrUxP-5pNAqxjuFx9ToeTEhsxwo942PS3Bk_=jekdmvg0...@mail.gmail.com

I propose the attached patch, which changes the text of the message to
"cannot copy window \"%s\" because it has a frame clause", and then adds
a HINT "Omit the parentheses in this OVER clause." if the clause is just
OVER (windowname) and doesn't include any attempt to modify the window's
properties.
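
For reference, the case in question looks like this (window and column
names are arbitrary):

SELECT sum(x) OVER (w)   -- parenthesized reference copies w's definition
FROM generate_series(1, 3) AS t(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
-- this currently fails with the message above, while plain "OVER w" works;
-- the hint would point at dropping the parentheses.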

I think we should back-patch this into all versions that have window
functions (ie, all supported branches).

Any objections or better ideas?

regards, tom lane

diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index ea90e58..29183c1 100644
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
*** transformWindowDefinitions(ParseState *p
*** 1735,1745 
  		/*
  		 * Per spec, a windowdef that references a previous one copies the
  		 * previous partition clause (and mustn't specify its own).  It can
! 		 * specify its own ordering clause. but only if the previous one had
  		 * none.  It always specifies its own frame clause, and the previous
! 		 * one must not have a frame clause.  (Yeah, it's bizarre that each of
  		 * these cases works differently, but SQL:2008 says so; see 7.11
! 		 * window clause syntax rule 10 and general rule 1.)
  		 */
  		if (refwc)
  		{
--- 1735,1750 
  		/*
  		 * Per spec, a windowdef that references a previous one copies the
  		 * previous partition clause (and mustn't specify its own).  It can
! 		 * specify its own ordering clause, but only if the previous one had
  		 * none.  It always specifies its own frame clause, and the previous
! 		 * one must not have a frame clause.  Yeah, it's bizarre that each of
  		 * these cases works differently, but SQL:2008 says so; see 7.11
! 		 * window clause syntax rule 10 and general rule 1.  The frame
! 		 * clause rule is especially bizarre because it makes OVER foo
! 		 * different from OVER (foo), solely in that the latter is required
! 		 * to throw an error if foo has a nondefault frame clause.	Well, ours
! 		 * not to reason why, but we do go out of our way to throw a useful
! 		 * error message for such cases.
  		 */
  		if (refwc)
  		{
*** transformWindowDefinitions(ParseState *p
*** 1778,1788 
  			wc-copiedOrder = false;
  		}
  		if (refwc  refwc-frameOptions != FRAMEOPTION_DEFAULTS)
  			ereport(ERROR,
  	(errcode(ERRCODE_WINDOWING_ERROR),
! 	 errmsg(cannot override frame clause of window \%s\,
! 			windef-refname),
  	 parser_errposition(pstate, windef-location)));
  		wc-frameOptions = windef-frameOptions;
  		/* Process frame offset expressions */
  		wc-startOffset = transformFrameOffset(pstate, wc-frameOptions,
--- 1783,1809 
  			wc-copiedOrder = false;
  		}
  		if (refwc  refwc-frameOptions != FRAMEOPTION_DEFAULTS)
+ 		{
+ 			/*
+ 			 * Use this message if this is a WINDOW clause, or if it's an OVER
+ 			 * clause that includes ORDER BY or framing clauses.  (We already
+ 			 * rejected PARTITION BY above, so no need to check that.)
+ 			 */
+ 			if (windef-name ||
+ orderClause || windef-frameOptions != FRAMEOPTION_DEFAULTS)
+ ereport(ERROR,
+ 		(errcode(ERRCODE_WINDOWING_ERROR),
+ 		 errmsg(cannot copy window \%s\ because it has a frame clause,
+ windef-refname),
+ 		 parser_errposition(pstate, windef-location)));
+ 			/* Else this clause is just OVER (foo), so say this: */
  			ereport(ERROR,
  	(errcode(ERRCODE_WINDOWING_ERROR),
! 			errmsg(cannot copy window \%s\ because it has a frame clause,
!    windef-refname),
! 	 errhint(Omit the parentheses in this OVER clause.),
  	 parser_errposition(pstate, windef-location)));
+ 		}
  		wc-frameOptions = windef-frameOptions;
  		/* Process frame offset expressions */
  		wc-startOffset = transformFrameOffset(pstate, wc-frameOptions,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Handle LIMIT/OFFSET before select clause (was: [HACKERS] Feature request: optimizer improvement)

2013-11-05 Thread Tom Lane
Joe Love j...@primoweb.com writes:
 I'm wondering what type of index would work for this as it is a volatile
 function. Not knowing how PGs optimizer runs, I'm at a loss as to why this
 wouldn't be possible or worth doing. It seems to me that all functions in
 the select part of the statement could be calculated at the end of the
 query after the results have been gathered, and even after the sorting had
 been done as long as the column wasn't part of the order by (or perhaps
 group by).

The short answer is that doing so directly contradicts the computational
model defined by the SQL standard, and will break applications that rely
on the current behavior.  Since there's already a clear way to write the
query in a way that specifies evaluating the functions after the
sort/limit steps (ie, put the order by/limit in a sub-select), IMHO that's
what you should do, not lobby to make the optimizer reinterpret what you
wrote.
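
Spelled out with invented table and function names, that rewrite looks like:

SELECT p.*, availability(p.id)   -- expensive / volatile function
FROM (SELECT *
      FROM products
      WHERE name ILIKE '%widget%'
      ORDER BY name
      LIMIT 20 OFFSET 40) AS p;
-- the function is evaluated only for the 20 rows that survive the
-- sort/limit, instead of for every matching row.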

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Handle LIMIT/OFFSET before select clause (was: [HACKERS] Feature request: optimizer improvement)

2013-11-05 Thread Atri Sharma
On Tue, Nov 5, 2013 at 11:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Joe Love j...@primoweb.com writes:
 I'm wondering what type of index would work for this as it is a volatile
 function. Not knowing how PGs optimizer runs, I'm at a loss as to why this
 wouldn't be possible or worth doing. It seems to me that all functions in
 the select part of the statement could be calculated at the end of the
 query after the results have been gathered, and even after the sorting had
 been done as long as the column wasn't part of the order by (or perhaps
 group by).

 The short answer is that doing so directly contradicts the computational
 model defined by the SQL standard, and will break applications that rely
 on the current behavior.  Since there's already a clear way to write the
 query in a way that specifies evaluating the functions after the
 sort/limit steps (ie, put the order by/limit in a sub-select), IMHO that's
 what you should do, not lobby to make the optimizer reinterpret what you
 wrote.



+1.

I thought more about our earlier discussion on this, and I agree with
the point that making the planner push limit over select for this
specific case is not a good idea.



-- 
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List of binary-compatible data types

2013-11-05 Thread Josh Berkus
Noah,

 That's all true, but the system has no concept like "this cast validates the
 data, never changing it".  We would first need to add metadata supporting such
 a concept.  On the other hand, "create cast (json as text) without function;"
 leans only on concepts the system already knows.
 

Yeah, I'm thinking it might be worth coming up with a solution for that
specific case.  As users upgrade from 9.0 and 9.1 to 9.3, they're going
to want to convert their text columns containing JSON to columns of the
JSON type, and are going to be surprised how painful that is.
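
The conversion being talked about is, with illustrative names:

ALTER TABLE events ALTER COLUMN payload TYPE json USING payload::json;
-- today this validates every value and rewrites the whole table, which is
-- the pain point for large tables.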

Of course, if we get binary JSON in 9.4 (Oleg?), then a binary
conversion will be required, so maybe it's a moot point.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List of binary-compatible data types

2013-11-05 Thread Andres Freund
On 2013-11-05 11:15:29 -0800, Josh Berkus wrote:
 Noah,
 
  That's all true, but the system has no concept like "this cast validates the
  data, never changing it".  We would first need to add metadata supporting
  such a concept.  On the other hand, "create cast (json as text) without
  function;" leans only on concepts the system already knows.
  
 
 Yeah, I'm thinking it might be worth coming up with a solution for that
 specific case.  As users upgrade from 9.0 and 9.1 to 9.3, they're going
 to want to convert their text columns containing JSON to columns of the
 JSON type, and are going to be surprised how painful that is.

There's zap chance of doing anything for 9.3; this would require quite a
bit of code in tablecmds.c, and that surely isn't going to happen in the
backbranches.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] missing RelationCloseSmgr in FreeFakeRelcacheEntry?

2013-11-05 Thread Andres Freund
On 2013-11-04 13:48:32 +0100, Andres Freund wrote:
 What about just unowning the smgr entry with
 if (rel->rd_smgr != NULL)
     smgrsetowner(NULL, rel->rd_smgr)
 when closing the fake relcache entry?
 
 That shouldn't require any further changes than changing the comment in
 smgrsetowner() that this isn't something expected to frequently happen.

Attached is a patch doing it like that; it required slightly more
invasive changes than that. With the patch applied we survive replay of
a primary's make check run under valgrind without warnings.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 849559b9f1d0c20ea938d1dd13b37b53ebede856 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 5 Nov 2013 17:23:09 +0100
Subject: [PATCH] Un-own SMgrRelations in FreeFakeRelcacheEntry().

The previous coding left SMgrRelation->smgr_owner pointing into free'd
memory after FreeFakeRelcacheEntry() was executed, which then was
accessed and written to when commit messages including smgr
invalidations or dropped relations were replayed during crash
recovery or physical replication.
Due to the memory allocation patterns during recovery this usually
didn't cause big problems but was noticed using valgrind.
---
 src/backend/access/transam/xlogutils.c |  3 +++
 src/backend/storage/smgr/smgr.c| 41 ++
 src/include/storage/smgr.h |  1 +
 3 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 5429d5e..94ddd78 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -433,6 +433,9 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
 void
 FreeFakeRelcacheEntry(Relation fakerel)
 {
+	/* make sure the fakerel is not referenced anymore */
+	if (fakerel->rd_smgr != NULL)
+		smgrunown(fakerel->rd_smgr);
 	pfree(fakerel);
 }
 
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index f7f1437..c9a2a36 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -85,6 +85,7 @@ static SMgrRelation first_unowned_reln = NULL;
 /* local function prototypes */
 static void smgrshutdown(int code, Datum arg);
 static void remove_from_unowned_list(SMgrRelation reln);
+static void add_to_unowned_list(SMgrRelation reln);
 
 
 /*
@@ -174,9 +175,7 @@ smgropen(RelFileNode rnode, BackendId backend)
 		for (forknum = 0; forknum = MAX_FORKNUM; forknum++)
 			reln-md_fd[forknum] = NULL;
 
-		/* place it at head of unowned list (to make smgrsetowner cheap) */
-		reln-next_unowned_reln = first_unowned_reln;
-		first_unowned_reln = reln;
+		add_to_unowned_list(reln);
 	}
 
 	return reln;
@@ -191,7 +190,7 @@ smgropen(RelFileNode rnode, BackendId backend)
 void
 smgrsetowner(SMgrRelation *owner, SMgrRelation reln)
 {
-	/* We don't currently support disowning an SMgrRelation here */
+	/* We don't support disowning an SMgrRelation here */
 	Assert(owner != NULL);
 
 	/*
@@ -214,6 +213,25 @@ smgrsetowner(SMgrRelation *owner, SMgrRelation reln)
 }
 
 /*
+ * smgrunown() -- Remove long-lived reference to an SMgrRelation object if one
+ *exists
+ */
+void
+smgrunown(SMgrRelation reln)
+{
+	if (reln->smgr_owner == NULL)
+		return;
+
+	/* unset the owner's reference */
+	*(reln->smgr_owner) = NULL;
+
+	/* unset our reference to the owner */
+	reln->smgr_owner = NULL;
+
+	add_to_unowned_list(reln);
+}
+
+/*
  * remove_from_unowned_list -- unlink an SMgrRelation from the unowned list
  *
  * If the reln is not present in the list, nothing happens.  Typically this
@@ -246,6 +264,21 @@ remove_from_unowned_list(SMgrRelation reln)
 }
 
 /*
+ * add_to_unowned_list -- link an SMgrRelation onto the unowned list
+ *
+ * Check remove_from_unowned_list()s comments for performance
+ * considerations.
+ */
+static void
+add_to_unowned_list(SMgrRelation reln)
+{
+	/* place it at head of unowned list (to make smgrsetowner cheap) */
+	reln->next_unowned_reln = first_unowned_reln;
+	first_unowned_reln = reln;
+}
+
+
+/*
  *	smgrexists() -- Does the underlying file for a fork exist?
  */
 bool
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index 98b6f13..20c834d 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -80,6 +80,7 @@ extern void smgrinit(void);
 extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
 extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
 extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
+extern void smgrunown(SMgrRelation reln);
 extern void smgrclose(SMgrRelation reln);
 extern void smgrcloseall(void);
 extern void smgrclosenode(RelFileNodeBackend rnode);
-- 
1.8.4.21.g992c386.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

[HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections

2013-11-05 Thread Rajeev rastogi
This patch implements the following TODO item:

Split out pg_resetxlog output into pre- and post-sections


http://archives.postgresql.org/pgsql-hackers/2010-08/msg02040.php

On execution of pg_resetxlog using the option -n:
1. It will display values in two sections.
2. The first section will be called "Current pg_control values" 
or "Guess pg_control values".
3. In the first section, it will display all current (i.e. before 
change) values of the control file, or guessed values.
4. The second section will be called "Values to be used after 
reset".
5. In the second section, it will display the new values of the parameters 
to be reset, as per the user's request.

Please share your opinion of, and expectations for, this patch.

I will add it to the November commitfest.

Thanks and Regards,
Kumar Rajeev Rastogi


pg_resetxlogsection.patch
Description: pg_resetxlogsection.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] missing locking in at least INSERT INTO view WITH CHECK

2013-11-05 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-11-02 17:05:24 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:

 Also attached is 0004 which just adds a heap_lock() around a
 newly created temporary table in the matview code which
 shouldn't be required for correctness but gives warm and fuzzy
 feelings as well as less debugging noise.

 Will think about this.  I agree it is probably worth doing
 something to reduce the noise when looking for cases that
 actually matter.

 It's pretty much free, so I don't think there really is any
 reason to deviate from other parts of the code. Note how e.g.
 copy_heap_data(), DefineRelation() and ATRewriteTable() all lock
 the new relation, even if it just has been created and is (and in
 the latter two cases will always be) invisible.

None of those locations are using heap_open() as the first
parameter to heap_close().  That looks kinda iffy, and the fact
that it is not yet done anywhere in the code gives me pause.  You
probably had a reason for preferring that to a simple call to
LockRelationOid(), but I'm not seeing what that reason is.  I also
don't understand the use of the lockmode variable here.

I'm thinking of something like the attached instead.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Companydiff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index d5a10ad..d5e58ae 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -30,6 +30,7 @@
 #include miscadmin.h
 #include parser/parse_relation.h
 #include rewrite/rewriteHandler.h
+#include storage/lmgr.h
 #include storage/smgr.h
 #include tcop/tcopprot.h
 #include utils/builtins.h
@@ -243,6 +244,7 @@ ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
 	/* Create the transient table that will receive the regenerated data. */
 	OIDNewHeap = make_new_heap(matviewOid, tableSpace, concurrent,
 			   ExclusiveLock);
+	LockRelationOid(OIDNewHeap, AccessExclusiveLock);
 	dest = CreateTransientRelDestReceiver(OIDNewHeap);
 
 	/* Generate the data, if wanted. */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] missing locking in at least INSERT INTO view WITH CHECK

2013-11-05 Thread Andres Freund
On 2013-11-05 11:56:25 -0800, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  On 2013-11-02 17:05:24 -0700, Kevin Grittner wrote:
  Andres Freund and...@2ndquadrant.com wrote:
 
  Also attached is 0004 which just adds a heap_lock() around a
  newly created temporary table in the matview code which
  shouldn't be required for correctness but gives warm and fuzzy
  feelings as well as less debugging noise.
 
  Will think about this.  I agree it is probably worth doing
  something to reduce the noise when looking for cases that
  actually matter.
 
  It's pretty much free, so I don't think there really is any
  reason to deviate from other parts of the code. Note how e.g.
  copy_heap_data(), DefineRelation() and ATRewriteTable() all lock
  the new relation, even if it just has been created and is (and in
  the latter two cases will always be) invisible.
 
 None of those locations are using heap_open() as the first
 parameter to heap_close().

Oh! Sure, what I'd posted was just an absolutely crude hack that surely
isn't committable.

 I'm thinking of something like the attached instead.

Looks fine to me, maybe add a short comment?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] missing locking in at least INSERT INTO view WITH CHECK

2013-11-05 Thread Andres Freund
On 2013-11-05 12:21:23 -0800, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com
 
  Looks fine to me
 
 Any thoughts on whether this should be back-patched to 9.3?  I can
 see arguments both ways, and don't have a particularly strong
 feeling one way or the other.

Hehe. I was wondering myself. I'd tentatively say no - unless we also
backpatch the debugging patch there doesn't seem to be good reason to
since the likelihood of conflicts due to it doesn't seem very high.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Add cassert-only checks against unlocked use of relations

2013-11-05 Thread Andres Freund
Hi,

There frequently have been bugs where (heap|relation|index)_open(NoLock)
was used without a previous lock, which in some circumstances is an easy
mistake to make and which is hard to notice.

The attached patch adds --use-cassert only WARNINGs against doing so:

Add cassert-only checks against unlocked use of relations.

Specifically relation_open(), which also covers heap/index_open(), and
RelationIdGetRelation() WARN if the relation is opened without being
locked. index_open() now checks whether the heap relation the index is
covering is locked.

To make those checks possible add StrongestLocalRelationLock() which
returns the strongest locally held lock over a relation. It relies on
the also added StrongestLocalLock() which searches the local locktable
sequentially for matching locks.

After
1) 2a781d57dcd027df32d15ee2378b84d0c4d005d1
2) 
http://archives.postgresql.org/message-id/1383681385.79520.YahooMailNeo%40web162902.mail.bf1.yahoo.com
3) 
http://archives.postgresql.org/message-id/CAEZATCVCgboHKu_%2BK0nakrXW-AFEz_18icE0oWEQHsM-OepWhw%40mail.gmail.com

HEAD doesn't generate warnings anymore. I think Kevin will commit
something akin to 2), but 3) is an actual open bug, so that patch will
need to get applied beforehand.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From c02e2c00c0dc98ff7d5f63a1d30887b8bbfe0dc3 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 23 Oct 2013 02:34:51 +0200
Subject: [PATCH 1/4] Add cassert-only checks against unlocked use of
 relations.

Specifically relation_open(), which also covers heap/index_open(), and
RelationIdGetRelation() WARN if the relation is opened without being
locked. index_open() now checks whether the heap relation the index is
covering is locked.

To make those checks possible add StrongestLocalRelationLock() which
returns the strongest locally held lock over a relation. It relies on
the also added StrongestLocalLock() which searches the local locktable
sequentially for matching locks.
---
 src/backend/access/heap/heapam.c   | 18 ++
 src/backend/access/index/indexam.c | 23 +++
 src/backend/storage/lmgr/lmgr.c| 23 ++-
 src/backend/storage/lmgr/lock.c| 35 +++
 src/backend/utils/cache/relcache.c | 18 ++
 src/include/storage/lmgr.h |  2 ++
 src/include/storage/lock.h |  2 ++
 7 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a21f31b..3fbadd4 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1038,6 +1038,24 @@ relation_open(Oid relationId, LOCKMODE lockmode)
 
 	pgstat_initstats(r);
 
+#if USE_ASSERT_CHECKING
+	/*
+	 * Unless locked a relation's definition can change, thus relying
+	 * on it is dangerous in that case.
+	 * We cannot rely on correct locking during bootstrap, so don't
+	 * check for it there.
+	 */
+	if (assert_enabled  IsNormalProcessingMode()  lockmode == NoLock)
+	{
+		LOCKMODE	curmode;
+
+		curmode = StrongestLocalRelationLock(relationId);
+		if (curmode == NoLock)
+			elog(WARNING, "relation_open(%u, NoLock) of relation \"%s\" without previously held lock",
+			 relationId, RelationGetRelationName(r));
+	}
+#endif
+
 	return r;
 }
 
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index b878155..c75bee2 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -65,6 +65,8 @@
 
 #include postgres.h
 
+#include miscadmin.h
+
 #include access/relscan.h
 #include access/transam.h
 #include catalog/index.h
@@ -142,6 +144,8 @@ static IndexScanDesc index_beginscan_internal(Relation indexRelation,
  * 
  */
 
+#include catalog/catalog.h
+
 /* 
  *		index_open - open an index relation by relation OID
  *
@@ -169,6 +173,25 @@ index_open(Oid relationId, LOCKMODE lockmode)
  errmsg(\%s\ is not an index,
 		RelationGetRelationName(r;
 
+#if USE_ASSERT_CHECKING
+	/*
+	 * Using an index's definition is only safe if the underlying
+	 * relation is already locked.
+	 * We cannot rely on correct locking during bootstrap, so don't
+	 * check for it there.
+	 */
+	if (assert_enabled  IsNormalProcessingMode())
+	{
+		LOCKMODE	mode;
+
+		mode = StrongestLocalRelationLock(r->rd_index->indrelid);
+		if (mode == NoLock)
+			elog(WARNING, "index_open(%u) of index \"%s\" without lock on heap relation %u",
+			 relationId, RelationGetRelationName(r),
+			 r->rd_index->indrelid);
+	}
+#endif
+
 	return r;
 }
 
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index a978172..a6cf74a 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c

Re: [HACKERS] Window functions can be created with defaults, but they don't work

2013-11-05 Thread Tom Lane
I wrote:
 I noticed this while poking at the variadic-aggregates issue:
 regression=# create function nth_value_def(anyelement, integer = 1) returns 
 anyelement language internal window immutable strict as 'window_nth_value';
 CREATE FUNCTION
 regression=# SELECT nth_value_def(ten) OVER (PARTITION BY four), ten, four
 FROM (SELECT * FROM tenk1 WHERE unique2  10 ORDER BY four, ten)s;
 The connection to the server was lost. Attempting reset: Failed.

Attached is a proposed patch against HEAD that fixes this by supporting
default arguments properly for window functions.  In passing, it also
allows named-argument notation in window function calls, since that's
free once the other thing works (because the same subroutine fixes up
both things).

 The reason this crashes is that the planner doesn't apply
 default-insertion to WindowFunc nodes, only to FuncExprs.  We could make
 it do that, probably, but that seems to me like a feature addition.
 I think a more reasonable approach for back-patching is to fix function
 creation to disallow attaching defaults to window functions (or
 aggregates, for that matter, which would have the same issue if CREATE
 AGGREGATE had the syntax option to specify defaults).  ProcedureCreate
 seems like an appropriate place, since it already contains a lot of sanity
 checks of this sort.

Having now done the patch to fix it properly, I'm more inclined to think
that maybe we should just back-patch this rather than inserting an error
check.  It seems pretty low-risk.

Comments?

regards, tom lane

diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index e3dbc4b..16aade4 100644
*** a/doc/src/sgml/syntax.sgml
--- b/doc/src/sgml/syntax.sgml
*** SELECT concat_lower_or_upper('Hello', 'W
*** 2527,2534 
  
 note
  para
!  Named and mixed call notations can currently be used only with regular
!  functions, not with aggregate functions or window functions.
  /para
 /note
/sect2
--- 2527,2535 
  
 note
  para
!  Named and mixed call notations currently cannot be used when calling an
!  aggregate function (but they do work when an aggregate function is used
!  as a window function).
  /para
 /note
/sect2
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 76c032c..4fa73a9 100644
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*** eval_const_expressions_mutator(Node *nod
*** 2251,2256 
--- 2251,2306 
   */
  return (Node *) copyObject(param);
  			}
+ 		case T_WindowFunc:
+ 			{
+ WindowFunc *expr = (WindowFunc *) node;
+ Oid			funcid = expr-winfnoid;
+ List	   *args;
+ Expr	   *aggfilter;
+ HeapTuple	func_tuple;
+ WindowFunc *newexpr;
+ 
+ /*
+  * We can't really simplify a WindowFunc node, but we mustn't
+  * just fall through to the default processing, because we
+  * have to apply expand_function_arguments to its argument
+  * list.  That takes care of inserting default arguments and
+  * expanding named-argument notation.
+  */
+ func_tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+ if (!HeapTupleIsValid(func_tuple))
+ 	elog(ERROR, cache lookup failed for function %u, funcid);
+ 
+ args = expand_function_arguments(expr-args, expr-wintype,
+  func_tuple);
+ 
+ ReleaseSysCache(func_tuple);
+ 
+ /* Now, recursively simplify the args (which are a List) */
+ args = (List *)
+ 	expression_tree_mutator((Node *) args,
+ 			eval_const_expressions_mutator,
+ 			(void *) context);
+ /* ... and the filter expression, which isn't */
+ aggfilter = (Expr *)
+ 	eval_const_expressions_mutator((Node *) expr-aggfilter,
+    context);
+ 
+ /* And build the replacement WindowFunc node */
+ newexpr = makeNode(WindowFunc);
+ newexpr-winfnoid = expr-winfnoid;
+ newexpr-wintype = expr-wintype;
+ newexpr-wincollid = expr-wincollid;
+ newexpr-inputcollid = expr-inputcollid;
+ newexpr-args = args;
+ newexpr-aggfilter = aggfilter;
+ newexpr-winref = expr-winref;
+ newexpr-winstar = expr-winstar;
+ newexpr-winagg = expr-winagg;
+ newexpr-location = expr-location;
+ 
+ return (Node *) newexpr;
+ 			}
  		case T_FuncExpr:
  			{
  FuncExpr   *expr = (FuncExpr *) node;
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 2bd24c8..ede36d1 100644
*** a/src/backend/parser/parse_func.c
--- b/src/backend/parser/parse_func.c
*** ParseFuncOrColumn(ParseState *pstate, Li
*** 537,553 
  	 errmsg(window functions cannot return sets),
  	 parser_errposition(pstate, location)));
  
- 		/*
- 		 * We might want to support this later, but for now reject it because
- 		 * the planner and executor wouldn't cope with NamedArgExprs in a
- 		 * WindowFunc node.
- 	

Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.

2013-11-05 Thread MauMau

From: Albe Laurenz laurenz.a...@wien.gv.at

I looked into the Standard, and it does not have NVARCHAR.
The type is called NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING
or NCHAR VARYING.


Ouch, that's just a mistake in my mail.  You are correct.



 I guess that the goal of this patch is to support Oracle syntax.
But anybody trying to port CREATE TABLE statements from Oracle
is already exposed to enough incompatibilities that the difference between
NVARCHAR and NCHAR VARYING will not be the reason to reject PostgreSQL.
In other words, I doubt that introducing the nonstandard NVARCHAR
will have more benefits than drawbacks (new reserved word).


Agreed.  But I'm in favor of supporting other DBMSs' syntax if it doesn't 
complicate the spec or implementation too much, because it can help people migrate 
to PostgreSQL.  I understand PostgreSQL has already made such efforts, like PL/pgSQL 
(which is similar to PL/SQL), the text data type, AS in the SELECT statement, etc.




But I don't think that this requires the first step that your patch
implements, it is in fact orthogonal.


(It's not my patch.)



Regarding the Standard compliant names of these data types, PostgreSQL
already supports those.  Maybe some documentation would help.

I don't think that there is any need to change NCHAR even if we
get per-column encoding, it is just syntactic sugar to support
SQL Feature F421.


Maybe so.  I guess the distinct type for NCHAR is for future extension and 
user friendliness.  As one user, I expect to get "national character" 
instead of "char character set xxx" as the output of psql \d and pg_dump when I 
specified "national character" in the DDL.  In addition, that makes it easier to 
use the pg_dump output for importing data into other DBMSs for some reason.
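
For reference, the standard spelling is already accepted today, it is just
not preserved on output (table/column names are illustrative):

CREATE TABLE customers (name NATIONAL CHARACTER VARYING(20));
-- psql \d and pg_dump currently show this column as character varying(20);
-- the wish above is for the "national character" spelling to be kept.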



Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add cassert-only checks against unlocked use of relations

2013-11-05 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 There frequently have been bugs where (heap|relation|index)_open(NoLock)
 was used without a previous lock, which in some circumstances is an easy
 mistake to make and which is hard to notice.
 The attached patch adds --use-cassert only WARNINGs against doing so:

While I agree that there seems to be a problem here, I'm not convinced
that this is the solution.  The implication of a heap_open(NoLock) is that
the programmer believes that some previous action must have taken a lock
on the relation; if he's wrong, then the causal link that he thought
existed doesn't really.  But this patch is not checking for a causal link;
it'll be fooled just as easily as the programmer is by a happenstance
(that is, unrelated) previous lock on the relation.  What's more, it
entirely fails to check whether the previous lock is really strong enough
for what we're going to do.

I also find it unduly expensive to search the whole lock hashtable on
every relation open.  That's going to be an O(N^2) cost for a transaction
touching N relations, and that doesn't sound acceptable, not even for
assert-only code.

If we're sufficiently worried by this type of bug, ISTM we'd be better off
just disallowing heap_open(NoLock).  At the time we invented that, every
lock request went to shared memory; but now that we have the local lock
table, re-locking just requires a local hash lookup followed by
incrementing a local counter.  That's probably pretty cheap --- certainly
a lot cheaper than what you've got here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add cassert-only checks against unlocked use of relations

2013-11-05 Thread Andres Freund
On 2013-11-05 16:25:53 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  There frequently have been bugs where (heap|relation|index)_open(NoLock)
  was used without a previous lock, which in some circumstances is an easy
  mistake to make and which is hard to notice.
  The attached patch adds --use-cassert only WARNINGs against doing so:
 
 While I agree that there seems to be a problem here, I'm not convinced
 that this is the solution.  The implication of a heap_open(NoLock) is that
 the programmer believes that some previous action must have taken a lock
 on the relation; if he's wrong, then the causal link that he thought
 existed doesn't really.  But this patch is not checking for a causal link;
 it'll be fooled just as easily as the programmer is by a happenstance
 (that is, unrelated) previous lock on the relation.  What's more, it
 entirely fails to check whether the previous lock is really strong enough
 for what we're going to do.

Sure. But it already found several bugs as evidenced by the referenced
thread, so it seems to be helpful enough.

 I also find it unduly expensive to search the whole lock hashtable on
 every relation open.  That's going to be a O(N^2) cost for a transaction
 touching N relations, and that doesn't sound acceptable, not even for
 assert-only code.

We could relatively easily optimize that to a constant factor by just
iterating over the possible lockmodes. Should only take a couple more
lines.
I'd be really, really surprised if it even comes close to the overhead
of AtEOXact_Buffers() though. Which is not to say it's a bad idea to
make this check more effective ;)

 If we're sufficiently worried by this type of bug, ISTM we'd be better off
 just disallowing heap_open(NoLock).  At the time we invented that, every
 lock request went to shared memory; but now that we have the local lock
 table, re-locking just requires a local hash lookup followed by
 incrementing a local counter.  That's probably pretty cheap --- certainly
 a lot cheaper than what you've got here.

Hm. That only works though if we're using the same lockmode as before -
often enough the *_open(NoLock) checks would use a weaker locklevel than
the previous lock. So I think the cost of doing so would probably be
noticeable.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add cassert-only checks against unlocked use of relations

2013-11-05 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-11-05 16:25:53 -0500, Tom Lane wrote:
 If we're sufficiently worried by this type of bug, ISTM we'd be better off
 just disallowing heap_open(NoLock).  At the time we invented that, every
 lock request went to shared memory; but now that we have the local lock
 table, re-locking just requires a local hash lookup followed by
 incrementing a local counter.  That's probably pretty cheap --- certainly
 a lot cheaper than what you've got here.

 Hm. That only works though if we're using the same lockmode as before -
 often enough the *_open(NoLock) checks would use a weaker locklevel than
 the previous lock. So I think the cost of doing so would probably be
 noticeable.

If you're not using the same lockmode as before, it's probably a bug anyway.
As I said already, the entire NoLock coding technique is dependent on
having a very clear idea of which previous lock-taking you're riding
on the coattails of.  Why wouldn't you duplicate that lock level, 
if we say you can't use NoLock anymore?
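
Concretely, the pattern would just look like this (a sketch only --- relation
and lock level invented for illustration, not taken from any actual call site):

#include "postgres.h"
#include "access/heapam.h"
#include "storage/lmgr.h"
#include "utils/rel.h"

/*
 * Illustration only.  The first open takes the lock; the second open
 * repeats the lock level instead of passing NoLock.  Because the lock is
 * already held, the second LockAcquire() just finds the existing entry in
 * the backend-local lock table and bumps its reference count --- no trip
 * to shared memory.
 */
static void
touch_relation_twice(Oid relid)
{
	Relation	rel;

	rel = heap_open(relid, RowExclusiveLock);
	/* ... first use ... */
	heap_close(rel, NoLock);	/* keep the lock until end of transaction */

	rel = heap_open(relid, RowExclusiveLock);	/* instead of NoLock */
	/* ... second use ... */
	heap_close(rel, NoLock);
}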

regards, tom lane




Re: [HACKERS] missing locking in at least INSERT INTO view WITH CHECK

2013-11-05 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-11-05 12:21:23 -0800, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com

 Looks fine to me

 Any thoughts on whether this should be back-patched to 9.3?  I
 can see arguments both ways, and don't have a particularly
 strong feeling one way or the other.

 Hehe. I was wondering myself. I'd tentatively say no - unless we
 also backpatch the debugging patch there doesn't seem to be a good
 reason to, since the likelihood of conflicts due to it doesn't seem
 very high.

Works for me.  Done.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add cassert-only checks against unlocked use of relations

2013-11-05 Thread Andres Freund
On 2013-11-05 16:45:49 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-11-05 16:25:53 -0500, Tom Lane wrote:
  If we're sufficiently worried by this type of bug, ISTM we'd be better off
  just disallowing heap_open(NoLock).  At the time we invented that, every
  lock request went to shared memory; but now that we have the local lock
  table, re-locking just requires a local hash lookup followed by
  incrementing a local counter.  That's probably pretty cheap --- certainly
  a lot cheaper than what you've got here.
 
  Hm. That only works though if we're using the same lockmode as before -
  often enough the *_open(NoLock) checks would use a weaker locklevel than
  the previous lock. So I think the cost of doing so would probably be
  noticeable.
 
 If you're not using the same lockmode as before, it's probably a bug anyway.
 As I said already, the entire NoLock coding technique is dependent on
 having a very clear idea of which previous lock-taking you're riding
 on the coattails of.  Why wouldn't you duplicate that lock level, 
 if we say you can't use NoLock anymore?

Hm. That doesn't really seem to match my reading of the code. There's
quite some code around that relies on the fact that the parser has taken an
appropriately strong lock already. E.g. the parser chooses the lockmode
in addRangeTableEntry() - I don't think we want to add duplicate
isLockedRefname() calls everywhere.
Other users of that relation will likely just use AccessShareLock since
that's sufficient for their own use.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add cassert-only checks against unlocked use of relations

2013-11-05 Thread Andres Freund
On 2013-11-05 22:35:41 +0100, Andres Freund wrote:
 We could relatively easily optimize that to a constant factor by just
 iterating over the possible lockmodes. Should only take a couple more
 lines.

On that note, any chance you remember why you increased MAX_LOCKMODE by
2, to 10, back in 2001 although AccessExclusiveLock is 8? The relevant
commit is 4fe42dfbc3bafa0ea615239d716a6b37d67da253.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.

2013-11-05 Thread Peter Eisentraut
On 11/5/13, 1:04 AM, Arulappan, Arul Shaji wrote:
 Implements NCHAR/NVARCHAR as distinct data types, not as synonyms

If, per SQL standard, NCHAR(x) is equivalent to CHAR(x) CHARACTER SET
cs, then for some cs, NCHAR(x) must be the same as CHAR(x).
Therefore, an implementation as separate data types is wrong.




Re: [HACKERS] Add cassert-only checks against unlocked use of relations

2013-11-05 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On that note, any chance you remember why you increased MAX_LOCKMODE by
 2 to 10 back in 2001 although AccessExclusiveLock is 8? The relevant
 commit is 4fe42dfbc3bafa0ea615239d716a6b37d67da253 .

Probably because it seemed like a round number, which 9 wasn't ...
keep in mind that this data structure is nominally intended to support
other lock semantics besides what LockConflicts[] defines.  (BTW,
it's a conceptual error to imagine that the numerical values of the
lock mode codes define a strict strength ordering, which is another
reason I don't care for your patch.)

regards, tom lane




Re: [HACKERS] Window functions can be created with defaults, but they don't work

2013-11-05 Thread Tom Lane
I wrote:
 Attached is a proposed patch against HEAD that fixes this by supporting
 default arguments properly for window functions.  In passing, it also
 allows named-argument notation in window function calls, since that's
 free once the other thing works (because the same subroutine fixes up
 both things).

Hm, well, almost free --- turns out ruleutils.c was assuming WindowFuncs
couldn't contain named arguments.  Updated patch attached.

regards, tom lane

diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index e3dbc4b..16aade4 100644
*** a/doc/src/sgml/syntax.sgml
--- b/doc/src/sgml/syntax.sgml
*** SELECT concat_lower_or_upper('Hello', 'W
*** 2527,2534 
  
 note
  para
!  Named and mixed call notations can currently be used only with regular
!  functions, not with aggregate functions or window functions.
  /para
 /note
/sect2
--- 2527,2535 
  
 note
  para
!  Named and mixed call notations currently cannot be used when calling an
!  aggregate function (but they do work when an aggregate function is used
!  as a window function).
  /para
 /note
/sect2
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 76c032c..4fa73a9 100644
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*** eval_const_expressions_mutator(Node *nod
*** 2251,2256 
--- 2251,2306 
   */
  return (Node *) copyObject(param);
  			}
+ 		case T_WindowFunc:
+ 			{
+ WindowFunc *expr = (WindowFunc *) node;
+ Oid			funcid = expr->winfnoid;
+ List	   *args;
+ Expr	   *aggfilter;
+ HeapTuple	func_tuple;
+ WindowFunc *newexpr;
+ 
+ /*
+  * We can't really simplify a WindowFunc node, but we mustn't
+  * just fall through to the default processing, because we
+  * have to apply expand_function_arguments to its argument
+  * list.  That takes care of inserting default arguments and
+  * expanding named-argument notation.
+  */
+ func_tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+ if (!HeapTupleIsValid(func_tuple))
+ 	elog(ERROR, "cache lookup failed for function %u", funcid);
+ 
+ args = expand_function_arguments(expr->args, expr->wintype,
+  func_tuple);
+ 
+ ReleaseSysCache(func_tuple);
+ 
+ /* Now, recursively simplify the args (which are a List) */
+ args = (List *)
+ 	expression_tree_mutator((Node *) args,
+ 			eval_const_expressions_mutator,
+ 			(void *) context);
+ /* ... and the filter expression, which isn't */
+ aggfilter = (Expr *)
+ 	eval_const_expressions_mutator((Node *) expr->aggfilter,
+    context);
+ 
+ /* And build the replacement WindowFunc node */
+ newexpr = makeNode(WindowFunc);
+ newexpr->winfnoid = expr->winfnoid;
+ newexpr->wintype = expr->wintype;
+ newexpr->wincollid = expr->wincollid;
+ newexpr->inputcollid = expr->inputcollid;
+ newexpr->args = args;
+ newexpr->aggfilter = aggfilter;
+ newexpr->winref = expr->winref;
+ newexpr->winstar = expr->winstar;
+ newexpr->winagg = expr->winagg;
+ newexpr->location = expr->location;
+ 
+ return (Node *) newexpr;
+ 			}
  		case T_FuncExpr:
  			{
  FuncExpr   *expr = (FuncExpr *) node;
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 2bd24c8..ede36d1 100644
*** a/src/backend/parser/parse_func.c
--- b/src/backend/parser/parse_func.c
*** ParseFuncOrColumn(ParseState *pstate, Li
*** 537,553 
  	 errmsg("window functions cannot return sets"),
  	 parser_errposition(pstate, location)));
  
- 		/*
- 		 * We might want to support this later, but for now reject it because
- 		 * the planner and executor wouldn't cope with NamedArgExprs in a
- 		 * WindowFunc node.
- 		 */
- 		if (argnames != NIL)
- 			ereport(ERROR,
- 	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- 	 errmsg("window functions cannot use named arguments"),
- 	 parser_errposition(pstate, location)));
- 
  		/* parse_agg.c does additional window-func-specific processing */
  		transformWindowFuncCall(pstate, wfunc, over);
  
--- 537,542 
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 04b1c4f..915fb7a 100644
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
*** get_windowfunc_expr(WindowFunc *wfunc, d
*** 7461,7466 
--- 7461,7467 
  	StringInfo	buf = context->buf;
  	Oid			argtypes[FUNC_MAX_ARGS];
  	int			nargs;
+ 	List	   *argnames;
  	ListCell   *l;
  
  	if (list_length(wfunc->args) > FUNC_MAX_ARGS)
*** get_windowfunc_expr(WindowFunc *wfunc, d
*** 7468,7485 
  (errcode(ERRCODE_TOO_MANY_ARGUMENTS),
   errmsg("too many arguments")));
  	nargs = 0;
  	foreach(l, wfunc->args)
  	{
  		Node	   *arg = (Node *) lfirst(l);
  
! 		Assert(!IsA(arg, 

Re: [HACKERS] Add cassert-only checks against unlocked use of relations

2013-11-05 Thread Andres Freund
On 2013-11-05 17:19:21 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On that note, any chance you remember why you increased MAX_LOCKMODE by
  2 to 10 back in 2001 although AccessExclusiveLock is 8? The relevant
  commit is 4fe42dfbc3bafa0ea615239d716a6b37d67da253 .
 
 Probably because it seemed like a round number, which 9 wasn't ...
 keep in mind that this data structure is nominally intended to support
 other lock semantics besides what LockConflicts[] defines.

Hm. Given that there are Assert()s for MAX_LOCKMODE around, that adding
a new lock method isn't possible without editing lock.c, and that we use it
in shared memory structures, I am not sure I see the point of that slop.
Anyway, it was just a point of curiosity.

  (BTW,
 it's a conceptual error to imagine that the numerical values of the
 lock mode codes define a strict strength ordering, which is another
 reason I don't care for your patch.)

Well, while I don't think it has too much practical bearing, we could
just check for *any* lock already held instead; that's all we need for
the added checks. I thought it might be more useful to get the strongest
lock rather than any lock for other potential checks, but if that's a
contentious point...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] psql: small patch to correct filename formatting error in '\s FILE' output

2013-11-05 Thread Tom Lane
Ian Lawrence Barwick barw...@gmail.com writes:
 2013/9/10 Bruce Momjian br...@momjian.us:
 I still see that weird behavior in git head:
 
 pgdevel=# \s history.txt
 Wrote history to file ./history.txt.
 pgdevel=# \s /tmp/history.txt
 Wrote history to file .//tmp/history.txt.
 pgdevel=# \cd /tmp
 pgdevel=# \s /tmp/history.txt
 Wrote history to file /tmp//tmp/history.txt.
 
 Should I revert the suggested patch?

 IIRC the patch was never applied; the reversion candidate is the existing
 commit 0725065b.

I reverted 0725065b.  AFAICT there is no interest in making \s produce
a reliable full path name.  There was some interest in making \cd tell
you where it'd chdir'd to, but that would be a separate patch.

regards, tom lane




Re: [HACKERS] [PATCH] configure: allow adding a custom string to PG_VERSION

2013-11-05 Thread Michael Paquier
On Tue, Nov 5, 2013 at 4:29 PM, Oskari Saarenmaa o...@ohmu.fi wrote:
 This can be used to tag custom built packages with an extra version string
 such as the git describe id or distribution package release version.
Could you attach a proper patch to your email and register it in the
next commitfest? Typically here:
https://commitfest.postgresql.org/action/commitfest_view?id=20
This idea is more flexible than your previous proposal, so some
other people might be interested in reviewing it. At least I like this
idea.

Regards,
-- 
Michael




Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Greg Stark
On Tue, Nov 5, 2013 at 5:30 PM, Claudio Freire klaussfre...@gmail.com wrote:

  Maybe there's value in minmax indexes for sequential data, but not for
  random data, which is the topic of this thread.


 Well, of course, they're not magic pixie dust.

 But is your data really random? (or normal?)


I think minmax indexes are more akin to bitmap indexes. They will be very
effective for columns with low-cardinality, especially for columns that are
very clustered. In the extreme if all the values in some regions of the
table are the same then minmax indexes would be optimal. I wouldn't expect
them to be very effective for a highly selective column that isn't well
clustered.

It really sounds like you're describing a particular workload that btrees
could just be more optimized for. Buffering all inserts in memory and
merging them into the btree lazily is actually something Heikki has
proposed in the past. I'm not clear if that gets you all the benefits of
the indexes you described or not but it seems to target the particular
problem you're having.

-- 
greg


Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-05 Thread Alvaro Herrera
Greg Stark wrote:

 I think minmax indexes are more akin to bitmap indexes. They will be very
 effective for columns with low-cardinality, especially for columns that are
 very clustered. In the extreme if all the values in some regions of the
 table are the same then minmax indexes would be optimal. I wouldn't expect
 them to be very effective for a highly selective column that isn't well
 clustered.

I think clustering is more important than cardinality.  I mean you can
have a very useful minmax index on a float column, on which maybe there
are no two identical values.

I certainly don't think minmax indexes will solve all indexing problems.
In the end, they're just one more tool in your DBA toolbox.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] List of binary-compatible data types

2013-11-05 Thread Josh Berkus
Andres,

 There's zap chance of doing anything for 9.3, this would require quite a
 bit of code in tablecmds.c and that surely isn't going to happen in the
 backbranches.

Oh, sure, I was thinking of a workaround.

Actually, being able to separate "need to check contents" from "need to
rewrite values" could be useful for a lot of type conversions.

I'd also love some way of doing a no-rewrite conversion between
timestamp and timestamptz, based on the assumption that the original
values are UTC time.  That's one I encounter a lot.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] git diff --check whitespace checks, gitattributes

2013-11-05 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Attached is a patch that
 - Adds a .gitattributes file to configure appropriate whitespace checks
 for git diff --check.

 - Cleans up all whitespace errors found in this way in existing code.
 Most of that is in files not covered by pgindent, some in new code since
 the last pgindent.

 This makes the entire tree git diff --check clean.  After this, future
 patches can be inspected for whitespace errors with git diff --check,
 something that has been discussed on occasion.

I always (well, almost always) do git diff --check, so making it stronger
sounds good to me.  But it sounds like this still leaves it to the
committer to remember to run it.  Can we do anything about that?

 One open question is whether psql output pasted into documentation, in
 particular .sgml files, should preserve the trailing whitespace that
 psql produces.  This is currently done inconsistently.

 My preference is to trim the trailing whitespace, because otherwise it's
 impossible to check for trailing whitespace errors in other parts of
 those files.

Maybe we should think about fixing psql to not generate that whitespace.

regards, tom lane




Re: [HACKERS] git diff --check whitespace checks, gitattributes

2013-11-05 Thread Alvaro Herrera
Peter Eisentraut wrote:

 This makes the entire tree git diff --check clean.  After this, future
 patches can be inspected for whitespace errors with git diff --check,
 something that has been discussed on occasion.

+1 to this, and also +1 to Tom's suggestion of making it more strongly
enforced.

 One open question is whether psql output pasted into documentation, in
 particular .sgml files, should preserve the trailing whitespace that
 psql produces.  This is currently done inconsistently.

I think pasting psql output verbatim isn't such a great idea anyway.
Maybe it's okay for certain specific examples, but in most cases I think
it'd be better to produce native SGML tables instead of programoutput
stuff.  After all, it's the result set that's interesting, not the way
psql presents it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] ERROR during end-of-xact/FATAL

2013-11-05 Thread Amit Kapila
On Thu, Oct 31, 2013 at 8:22 PM, Noah Misch n...@leadboat.com wrote:
 CommitTransaction() and AbortTransaction() both do much work, and large
 portions of that work either should not or must not throw errors.  An error
 during either function will, as usual, siglongjmp() out.  Ordinarily,
 PostgresMain() will regain control and fire off a fresh AbortTransaction().
 The consequences thereof depend on the original function's progress:

 - Before the function updates CurrentTransactionState-state, an ERROR is
   fully acceptable.  CommitTransaction() specifically places failure-prone
   tasks accordingly; AbortTransaction() has no analogous tasks.

 - After the function updates CurrentTransactionState-state, an ERROR yields a
   user-unfriendly e.g. WARNING:  AbortTransaction while in COMMIT state.
   This is not itself harmful, but we've largely kept the things that can fail
   for pedestrian reasons ahead of that point.

 - After CommitTransaction() calls RecordTransactionCommit() for an xid-bearing
   transaction, an ERROR upgrades to e.g. PANIC:  cannot abort transaction
   805, it was already committed.

 - After AbortTransaction() calls ProcArrayEndTransaction() for an xid-bearing
   transaction, an ERROR will lead to this assertion failure:

   TRAP: FailedAssertion(!(((allPgXact[proc->pgprocno].xid) != 
 ((TransactionId) 0))), File: procarray.c, Line: 396)

 If the original AbortTransaction() pertained to a FATAL, the situation is
 worse.  errfinish() promotes the ERROR thrown from AbortTransaction() to
 another FATAL,

Isn't it errstart() that promotes ERROR to FATAL?
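
(For reference, the relevant bit of errstart() looks roughly like the
following --- paraphrased from memory rather than quoted exactly, so treat
it as a sketch:)

	if (elevel == ERROR)
	{
		/*
		 * If there is no handler to longjmp to, or we are already on the
		 * way out (ExitOnAnyError, proc_exit_inprogress), the ERROR is
		 * forced up to FATAL.
		 */
		if (PG_exception_stack == NULL ||
			ExitOnAnyError ||
			proc_exit_inprogress)
			elevel = FATAL;
	}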

 so we reenter proc_exit().  Thanks to the following logic in
 shmem_exit(),

The following comment is in proc_exit_prepare(), but the effect will be the
same as you described, due to the logic in shmem_exit().

 we will never return to AbortTransaction():

 /*
  * call all the registered callbacks.
  *
  * Note that since we decrement on_proc_exit_index each time, if a
  * callback calls ereport(ERROR) or ereport(FATAL) then it won't be
  * invoked again when control comes back here (nor will the
  * previously-completed callbacks).  So, an infinite loop should not 
 be
  * possible.
  */

 As a result, we miss any cleanups that had not yet happened in the original
 AbortTransaction().  In particular, this can leak heavyweight locks.  An
 asserts build subsequently fails this way:

 TRAP: FailedAssertion(!(SHMQueueEmpty((MyProc->myProcLocks[i]))), File: 
 proc.c, Line: 788)

 In a production build, the affected PGPROC slot just continues to hold the
 lock until the next backend claiming that slot calls LockReleaseAll().  Oops.
 Bottom line: most bets are off given an ERROR after RecordTransactionCommit()
 in CommitTransaction() or anywhere in AbortTransaction().

When I tried the above scenario, I hit an Assert at a different place:

shmem_exit() -> pgstat_beshutdown_hook() -> pgstat_report_stat()
pgstat_report_stat(bool force)
{
..
Assert(entry->trans == NULL);
}

The reason is that in AbortTransaction, an ERROR is thrown before the call
to AtEOXact_PgStat(false).

This means that when an ERROR occurs in AbortTransaction which is called
as a result of a FATAL error, there are many more possible assertion
failures.


 Now, while those assertion failures are worth preventing on general principle,
 the actual field importance depends on whether things actually do fail in the
 vulnerable end-of-xact work.  We've prevented the errors that would otherwise
 be relatively common, but there are several rare ways to get a late ERROR.
 Incomplete list:

 - If smgrDoPendingDeletes() finds files to delete, mdunlink() and its callee
   relpathbackend() call palloc(); this is true in all supported branches.  In
   9.3, due to commit 279628a0, smgrDoPendingDeletes() itself calls palloc().
   (In fact, it does so even when the pending list is empty -- this is the only
   palloc() during a trivial transaction commit.)  palloc() failure there
   yields a PANIC during commit.

 - ResourceOwnerRelease() calls FileClose() during abort, and FileClose()
   raises an ERROR when close() returns EIO.

 - AtEOXact_Inval() can lead to calls like RelationReloadIndexInfo(), which has
   many ways to throw errors.  This precedes releasing heavyweight locks, so an
   error here during an abort pertaining to a FATAL exit orphans locks as
   described above.  This relates into another recent thread:
 http://www.postgresql.org/message-id/20130805170931.ga369...@tornado.leadboat.com


 What should we do to mitigate these problems?  Certainly we can harden
 individual end-of-xact tasks to not throw errors, as we have in the past.
 What higher-level strategies should we consider?  What about for the unclean
 result of the FATAL-then-ERROR scenario in particular?

Regarding the unclean FATAL-then-ERROR scenario, one way to deal with it at a
high level could be to treat such a case as a backend crash, in which case
the postmaster reinitialises shared memory and 

Re: [HACKERS] [v9.4] row level security

2013-11-05 Thread Craig Ringer
On 11/05/2013 09:36 PM, Robert Haas wrote:
 I haven't studied this patch in detail, but I see why there's some
 unhappiness about that code: it's an RLS-specific kluge.  Just
 shooting from the hip here, maybe we should attack the problem of
 making security-barrier views updatable first, as a separate patch.

That's the approach I've been considering. There are a few wrinkles with
it, though:

(a) Updatable views are implemented in the rewriter, not the planner.
The rewriter is not re-run when plans are invalidated or when the
session authorization changes, etc. This means that we can't simply omit
the RLS predicate for superuser because the same rewritten parse tree
might get used for both superuser and non-superuser queries.

Options:

* Store the before-rewrite parse tree when RLS is discovered on one of
the rels in the tree. Re-run the rewriter when re-planning. Ensure a
change in current user always invalidates plans.

* Declare that it's not legal to run a query originally parsed and
rewritten as superuser as a non-superuser or vice versa. This would
cause a great deal of pain with PL/PgSQL.

* Always add the RLS predicate and solve the problem of reliably
short-circuiting the user-supplied predicate for RLS-exempt users.  We'd
need a way to allow direct (not query-based) COPY TO for tables with RLS
applied, preventing the rewriting of direct table access into subqueries
for COPY, but since we never save plans for COPY that may be fine.

* ... ?


(b) Inheritance is a problem when RLS is done in the rewriter. As I
understood it from Kohei KaiGai's description to me earlier, there was a
strong preference on -hackers to enforce RLS predicates for child and
parent tables completely independently. That's how RLS currently works,
but it might be hard to get the same effect when applying RLS in the
rewriter. We'd need to solve that, or redefine RLS's behaviour so that
the predicate on a parent table applies to any child tables too.
Personally I'd prefer the latter.

(c) RLS might interact differently with rules declared on tables if
implemented in the rewriter, so some investigation into that would be
needed.

 I
 would think that if we make that work, this will also work without,
 hopefully, any special hackery.  And we'd get a separate,
 independently useful feature out of it, too.

I tend to agree. I'm just a bit concerned about dealing with the issues
around RLS-exempt operations and users.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Row-security writer-side checks proposal

2013-11-05 Thread Craig Ringer
On 11/05/2013 09:30 PM, Robert Haas wrote:
 So really, there are four cases:

 READ
 WRITE INSERT
 WRITE UPDATE
 WRITE DELETE
 
 Isn't READ similarly divisible into READ SELECT, READ UPDATE, and READ DELETE?

Not in my opinion. No matter what the command, the read side is all
about having some way to obtain the contents of the tuple.

Separate READ DELETE etc would only be interesting if we wanted to let
someone DELETE rows they cannot SELECT. Since we have DELETE ...
RETURNING, and since users can write a predicate function for DELETE
that leaks the information even if we didn't, in practice if you give
the user any READ right you've given them all of them. So I don't think
we can support that (except maybe by column RLS down the track).

By contrast on the write side it seems routine to need different rules
for different operations. The traditional hierarchical mandatory access
model as implemented by Teradata Row Level Security, Oracle Label Based
Security, etc, can have different rules for each operation. Here's a
synopsis of the example described in the Teradata docs as a typical policy:

- SELECT: Current session security label must be >= the row label

- INSERT: Current session security label must be <= the new row label

- UPDATE: Session label must be >= row label to update the row.
  New row must be >= session security label.

- DELETE: Only permitted for rows with the lowest label set, ensuring
  the row is reviewed and declassified before deletion.

Except for the DELETE, this is actually just two policies, one for reads
(session label >= row label) and one for writes (session label <= new row
label). So this might be an acceptable constraint if necessary, but it'd
be really good to support per-command rules, and we certainly need
asymmetric read and write rules.
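
To make that concrete, a tiny sketch of the dominance checks (purely
illustrative: labels reduced to plain integers where higher means more
sensitive; not taken from Teradata or from any proposed patch):

#include <stdbool.h>

typedef int SecurityLabel;		/* illustrative: higher = more sensitive */

/* read rule: session label must dominate the row label */
static bool
may_read(SecurityLabel session, SecurityLabel row)
{
	return session >= row;
}

/* write rule: new row label must dominate the session label */
static bool
may_write(SecurityLabel session, SecurityLabel new_row)
{
	return session <= new_row;
}

/* UPDATE needs both: read the old row, write the new one */
static bool
may_update(SecurityLabel session, SecurityLabel old_row, SecurityLabel new_row)
{
	return may_read(session, old_row) && may_write(session, new_row);
}

/* DELETE: only rows already declassified to the lowest label */
static bool
may_delete(SecurityLabel row, SecurityLabel lowest)
{
	return row == lowest;
}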

I'm looking into use cases and existing examples to put this in a more
concrete context, as much of the RLS discussion has been a hypothetical
one that doesn't look at concrete user problems much.

 Similarly, saying you can update but not delete seems quite reasonable
 to me.
 
 If you simply want to allow UPDATE but not DELETE, you can refrain
 from granting the table-level privilege.  The situation in which you
 need things separate is when you want to allow both UPDATE and DELETE
 but with different RLS quals for each.

That's what I was getting at, yes. Like the example above; the set of
rows you can update might be different to the set of rows you can delete.

 On the other hand, we might choose to say if you want to do things with
 that granularity use your own triggers to enforce it and provide only
 READ and WRITE for RLS.
 
 The funny thing about this whole feature is that it's just syntax
 support for doing things that you can already do in other ways.  If
 you want read-side security, create a security_barrier view and select
 from that instead of hitting the table directly.  If you want
 write-side security, enforce it using triggers.

Right now you can't have both together, though; an UPDATE on the raw
table can observe rows that wouldn't be visible via the view and can
send them to the client via RAISE NOTICE or whatever.

Support for automatically updatable security barrier views would take
care of this issue, at which point I'd agree: RLS becomes mostly
cosmetic syntactical sugar over existing capabilities. FKs would ignore
RLS, much like they would if you use explicit SECURITY BARRIER views and
have FKs between the base tables.

One big difference still remains though: when you add an RLS policy on a
table, all procedures and views referring to that table automatically
use the transparent security barrier view over the table instead. That's
*not* the case when you use views manually; you have to re-create views
that point to the table so they instead point to a security barrier view
over the table. Again it's nothing you can't do with updatable security
barrier views, but it's automatic and transparent with RLS.

 Now maybe that's fine.  But given that, I think it's pretty important
 that we get the syntax right.  Because if you're adding a feature
 primarily to add a more convenient syntax, then the syntax had better
 actually be convenient.

I completely agree with that.  I don't have a strong opinion on the
current syntax.

I've looked at how some other vendors do it, and I can't say their
approaches are pretty.

Oracle VPD has you create a PL/SQL procedure to generate the SQL text of
the desired RLS predicate. It then wants you to create a POLICY (via a
PL/SQL call to a built-in package) that associates the predicate
generation function with a table and sets some options controlling what
statement types it applies to, when the predicate is re-generated, etc.
It also has policy groups, which bundle policies together and control
whether or not they're applied for a given session. The predicates of
different policies are ANDed together.

So to create a single RLS policy on a single table you have to write a

Re: [HACKERS] Row-security writer-side checks proposal

2013-11-05 Thread Craig Ringer

On 11/05/2013 10:01 PM, Stephen Frost wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 Now maybe that's fine.  But given that, I think it's pretty
 important that we get the syntax right.  Because if you're adding
 a feature primarily to add a more convenient syntax, then the
 syntax had better actually be convenient.
 
 I agree that we want to get the syntax correct, but also very clear
 as it's security related and we don't want anyone surprised by what
 happens when they use it.  The idea, as has been discussed in the
 past, is to then allow tying RLS in with SELinux and provide MAC.

That was my impression also.

To help get closer to that point, since you were involved in the work
on auto-updatable views: any hints on what might be needed to tackle
making security barrier views updatable?


There's a fun little wrinkle with MAC, by the way: functional indexes.
We can't allow the creation of a functional index, even by the table
owner, if it uses any non-LEAKPROOF operators and functions. Otherwise
the user can write a function to leak the rows, then create an index
using that function.

That's not a problem for the current phase of RLS because the table
owner is allowed to remove the RLS constraint directly. They can also
add triggers that might leak rows via CASCADEs, etc. When MAC comes
into the picture we'll need to impose limits on triggers and
functional indexes added to such tables.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

