Re: [HACKERS] Large C files

2011-10-16 Thread David Fetter
On Fri, Oct 14, 2011 at 07:36:32PM +0100, Peter Geoghegan wrote:
> This evening, David Fetter described a problem to me that he was
> having building the Twitter FDW. It transpired that it was down to its
> dependence on an #include that was recently judged to be
> redundant. This seems like another reason to avoid using pgrminclude -
> even if we can be certain that the #includes are redundant within
> Postgres, we cannot be sure that they are redundant in third party
> code.

Perhaps something that tested some third-party code
automatically...say, doesn't the new buildfarm stuff allow running
some arbitrary code?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_comments (was: Allow \dd to show constraint comments)

2011-10-16 Thread Robert Haas
On Fri, Oct 14, 2011 at 11:12 AM, Robert Haas  wrote:
> On Wed, Oct 12, 2011 at 10:20 PM, Josh Kupershmidt  wrote:
>>> On the third hand, Josh's previous batch of changes to clean up
>>> psql's behavior in this area are clearly a huge improvement: you can
>>> now display the comment for nearly anything by running the appropriate
>>> \d command for whatever the object type is.  So ... is this still
>>> a good idea, or should we just forget about it?
>>
>> I think this question is a part of a broader concern, namely do we
>> want to create and support system views for easier access to
>> information which is already available in different ways through psql
>> commands, or by manually digging around in the catalogs? I believe
>> there are at least several examples of existing views we maintain
>> which are very similar to pg_comments: pg_seclabel seems quite
>> similar, for instance.
>
> That one's a direct analogue, but I don't want to overbroaden the
> issue.  I guess it just seems to me that if no one's going to champion
> adding this, maybe we shouldn't.

Hearing no cries of "oh, yes, please", I'm marking this Returned with
Feedback for now.  We can always revisit it if we hear that more
people want it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Robert Haas
On Sun, Oct 16, 2011 at 8:59 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I previously floated the idea of using a new keyword, possibly LET,
>> for this, like this:
>
>> LET var = value [, ...] IN query
>
>> I'm not sure if anyone bought it, but I'll run it up the flagpole
>> again and see if anyone salutes.  I tend to agree with the idea that
>> SET LOCAL isn't always what you want; per-transaction is not the same
>> as per-query, and multi-command query strings have funny semantics,
>> and multiple server round-trips are frequently undesirable; and it
>> just seems cleaner, at least IMHO.
>
> Well, syntax aside, the real issue here is that GUC doesn't have
> any notion of a statement-lifespan setting, and adding one would require
> adding per-statement overhead; not to mention possibly adding
> considerable logical complexity, depending on exactly what you wanted to
> define as a "statement".  I don't think an adequate case has been
> made that SET LOCAL is insufficient.

Would it require adding per-statement overhead in every case, or just
when the feature gets used?  I suspect the latter, which is no
different from anything else we have.  We do already have at least one
other case that seems similar to me: you can apply a setting using
ALTER FUNCTION .. SET; the new value is applied when you enter the
function and restored on exit.  I'd imagine that this would have
similar semantics.  In terms of what qualifies as a statement, I would
rather imagine that it would only be worthwhile to apply this to
queries rather than fooling around with utility statements.  I mean,
it would be nice if it Just Worked Anywhere, but that's likely to be a
lot more work (and grammar conflicts) than we want to deal with.

Anyway, the judgement of whether or not SET LOCAL is sufficient is in
the end a value judgement, and I'm not going to pretend that my
opinion is superior to all others.  My personal experience, however,
is that I've never used or wanted SET LOCAL, but I've wanted a
single-statement equivalent a few times.  So for me personally, having
this in lieu of SET LOCAL would be an improvement from a usability
perspective.  YMMV, of course.
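
To make the comparison concrete, here is a sketch of the proposed form
next to the SET LOCAL workaround (the LET syntax is hypothetical;
nothing here is implemented):

    -- Proposed single-statement form (hypothetical syntax):
    LET work_mem = '256MB', enable_seqscan = off IN
        SELECT count(*) FROM big_table;

    -- Nearest current equivalent: needs an explicit transaction,
    -- extra round-trips, and the settings last for the whole
    -- transaction rather than for the single statement:
    BEGIN;
    SET LOCAL work_mem = '256MB';
    SET LOCAL enable_seqscan = off;
    SELECT count(*) FROM big_table;
    COMMIT;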

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Andrew Dunstan



On 10/16/2011 08:59 PM, Tom Lane wrote:

Robert Haas  writes:

I previously floated the idea of using a new keyword, possibly LET,
for this, like this:
LET var = value [, ...] IN query
I'm not sure if anyone bought it, but I'll run it up the flagpole
again and see if anyone salutes.  I tend to agree with the idea that
SET LOCAL isn't always what you want; per-transaction is not the same
as per-query, and multi-command query strings have funny semantics,
and multiple server round-trips are frequently undesirable; and it
just seems cleaner, at least IMHO.

Well, syntax aside, the real issue here is that GUC doesn't have
any notion of a statement-lifespan setting, and adding one would require
adding per-statement overhead; not to mention possibly adding
considerable logical complexity, depending on exactly what you wanted to
define as a "statement".  I don't think an adequate case has been
made that SET LOCAL is insufficient.




I agree. But if we are going to go there I vastly prefer Robert's 
suggestion of a separate syntactical structure. Mixing this up with WITH 
would just be an awful mess, and cause endless confusion.


cheers

andrew



Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Tom Lane
Robert Haas  writes:
> I previously floated the idea of using a new keyword, possibly LET,
> for this, like this:

> LET var = value [, ...] IN query

> I'm not sure if anyone bought it, but I'll run it up the flagpole
> again and see if anyone salutes.  I tend to agree with the idea that
> SET LOCAL isn't always what you want; per-transaction is not the same
> as per-query, and multi-command query strings have funny semantics,
> and multiple server round-trips are frequently undesirable; and it
> just seems cleaner, at least IMHO.

Well, syntax aside, the real issue here is that GUC doesn't have
any notion of a statement-lifespan setting, and adding one would require
adding per-statement overhead; not to mention possibly adding
considerable logical complexity, depending on exactly what you wanted to
define as a "statement".  I don't think an adequate case has been
made that SET LOCAL is insufficient.

regards, tom lane



Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Robert Haas
On Sun, Oct 16, 2011 at 4:58 PM, Tom Lane  wrote:
> Dimitri Fontaine  writes:
>> Now that you mention it, the following might actually already work:
>
>>  WITH settings AS (
>>    SELECT set_config('timezone', 'Europe/Amsterdam', t),
>>           set_config('work_mem', '1 GB', t)
>>  ),
>>       foo AS (
>>    SELECT …
>>  )
>>  INSERT INTO bar SELECT * FROM foo;
>
> Only for small values of "work" ... you won't be able to affect planner
> settings that way, nor can you assume that that WITH item is executed
> before all else.  See recent thread pointing out that setting values
> mid-query is unsafe.

I previously floated the idea of using a new keyword, possibly LET,
for this, like this:

LET var = value [, ...] IN query

I'm not sure if anyone bought it, but I'll run it up the flagpole
again and see if anyone salutes.  I tend to agree with the idea that
SET LOCAL isn't always what you want; per-transaction is not the same
as per-query, and multi-command query strings have funny semantics,
and multiple server round-trips are frequently undesirable; and it
just seems cleaner, at least IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Underspecified window queries in regression tests

2011-10-16 Thread Hitoshi Harada
2011/10/17 Greg Stark :
> On Fri, Oct 14, 2011 at 9:32 PM, Tom Lane  wrote:
>> We could hack around this by adding more columns to the result so that
>> an index-only scan doesn't work.  But I wonder whether it wouldn't be
>> smarter to add ORDER BY clauses to the window function calls.  I've been
>> known to argue against adding just-in-case ORDER BYs to the regression
>> tests in the past; but these cases bother me more because a plan change
>> will not just rearrange the result rows but change their contents,
>> making it really difficult to verify that nothing's seriously wrong.
>
> I'm not sure if it applies to this case but I recall I was recently
> running queries on Oracle that included window functions and it
> wouldn't even let me run them without ORDER BY clauses in the window
> definition. I don't know if it cleverly determines that the ORDER BY
> will change the results or if Oracle just requires ORDER BY on all
> window definitions or what.

AFAIK, the current standard doesn't clearly say whether all or some window
functions require an ORDER BY clause in the window specification. Some
window functions like rank and row_number are meaningless if it is
omitted, so some implementations don't allow it to be omitted. And I
believe Oracle implemented this before the standard, so that would be why
the details differ from the spec. We designed ours per spec, and omitting
the clause doesn't violate any part of the standard.
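
For illustration (my example, not part of the original mail), PostgreSQL
follows the permissive reading: rank() is accepted without ORDER BY, in
which case all rows are peers and the result is degenerate:

    SELECT v, rank() OVER () AS r
    FROM (VALUES (1), (2), (3)) AS d(v);
    -- every row gets rank 1: without ORDER BY all rows are peers

    SELECT v, rank() OVER (ORDER BY v) AS r
    FROM (VALUES (1), (2), (3)) AS d(v);
    -- ranks 1, 2, 3 as expected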

Regards,
-- 
Hitoshi Harada



Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-10-16 Thread Kerem Kat
CORRESPONDING clause take 2

After realizing that modifying prepunion.c to include a custom subquery
is not as easy as it sounds (it was incomprehensible to me), and after it
turned into a hassle following several uninformed changes, I decided to
go with modifying analyze.c.

The incomprehensible part is constructing a custom subquery as a
SubqueryScan.

Anyway I managed to implement the clause as a Subquery in analyze.c.

In the method transformSetOperationTree, if the node is a set operation
and contains a corresponding clause, i.e. CORRESPONDING or CORRESPONDING
BY (columns...), we determine the common column names. Column ordering in
the SELECT statements is not important to CORRESPONDING. With the common
column names in hand, we create a RangeSubselect node accordingly and
replace the original statement's op->larg with the new RangeSubselect.
The RangeSubselect in turn has the original op->larg as its FROM clause.
We do the same to op->rarg.
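
Sketched by hand (this shows the transformation as described above, not
literal patch output), a query such as

    SELECT a, b, c FROM t1
    UNION CORRESPONDING
    SELECT b, c, d FROM t2;

is treated roughly as if the user had written

    SELECT b, c FROM (SELECT a, b, c FROM t1) AS larg
    UNION
    SELECT b, c FROM (SELECT b, c, d FROM t2) AS rarg;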

No changes were made in prepunion.c.

There are documentation changes and one regression test in the patch.


Best Regards,

Kerem KAT
*** a/doc/src/sgml/queries.sgml
--- b/doc/src/sgml/queries.sgml
***
*** 1225,1230 
--- 1225,1233 
 EXCEPT


+CORRESPONDING
+   
+   
 set union


***
*** 1241,1249 
 The results of two queries can be combined using the set operations
 union, intersection, and difference.  The syntax is
  
! query1 UNION ALL query2
! query1 INTERSECT ALL query2
! query1 EXCEPT ALL query2
  
 query1 and
 query2 are queries that can use any of
--- 1244,1252 
 The results of two queries can be combined using the set operations
 union, intersection, and difference.  The syntax is
  
! query1 UNION ALL CORRESPONDING BY (select_list) query2
! query1 INTERSECT ALL CORRESPONDING BY (select_list) query2
! query1 EXCEPT ALL CORRESPONDING BY (select_list) query2
  
 query1 and
 query2 are queries that can use any of
***
*** 1283,1288 
--- 1286,1299 

  

+ CORRESPONDING returns all columns that are in both query1 and query2 with the same name.
+   
+ 
+   
+ CORRESPONDING BY returns all columns in the column list that are also in both query1 and query2 with the same name.
+   
+ 
+   
 In order to calculate the union, intersection, or difference of two
 queries, the two queries must be union compatible,
 which means that they return the same number of columns and
*** a/doc/src/sgml/sql.sgml
--- b/doc/src/sgml/sql.sgml
***
*** 859,865 
  [ WHERE condition ]
  [ GROUP BY expression [, ...] ]
  [ HAVING condition [, ...] ]
! [ { UNION | INTERSECT | EXCEPT } [ ALL ] select ]
  [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
  [ LIMIT { count | ALL } ]
  [ OFFSET start ]
--- 859,865 
  [ WHERE condition ]
  [ GROUP BY expression [, ...] ]
  [ HAVING condition [, ...] ]
! [ { UNION | INTERSECT | EXCEPT } [ ALL ] [ CORRESPONDING [ BY ( expression ) ] ] select ]
  [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
  [ LIMIT { count | ALL } ]
  [ OFFSET start ]
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***
*** 2507,2512 
--- 2507,2513 
  	COPY_NODE_FIELD(lockingClause);
  	COPY_SCALAR_FIELD(op);
  	COPY_SCALAR_FIELD(all);
+ 	COPY_NODE_FIELD(correspondingClause);
  	COPY_NODE_FIELD(larg);
  	COPY_NODE_FIELD(rarg);
  
***
*** 2522,2527 
--- 2523,2530 
  	COPY_SCALAR_FIELD(all);
  	COPY_NODE_FIELD(larg);
  	COPY_NODE_FIELD(rarg);
+ 	COPY_NODE_FIELD(correspondingColumns);
+ 	COPY_SCALAR_FIELD(hasCorrespondingBy);
  	COPY_NODE_FIELD(colTypes);
  	COPY_NODE_FIELD(colTypmods);
  	COPY_NODE_FIELD(colCollations);
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***
*** 982,987 
--- 982,988 
  	COMPARE_NODE_FIELD(lockingClause);
  	COMPARE_SCALAR_FIELD(op);
  	COMPARE_SCALAR_FIELD(all);
+ 	COMPARE_NODE_FIELD(correspondingClause);
  	COMPARE_NODE_FIELD(larg);
  	COMPARE_NODE_FIELD(rarg);
  
***
*** 995,1000 
--- 996,1003 
  	COMPARE_SCALAR_FIELD(all);
  	COMPARE_NODE_FIELD(larg);
  	COMPARE_NODE_FIELD(rarg);
+ 	COMPARE_NODE_FIELD(correspondingColumns);
+ 	COMPARE_SCALAR_FIELD(hasCorrespondingBy);
  	COMPARE_NODE_FIELD(colTypes);
  	COMPARE_NODE_FIELD(colTypmods);
  	COMPARE_NODE_FIELD(colCollations);
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
***
*** 2894,2899 
--- 2894,2901 
  	return true;
  if (walker(stmt->lockingClause, context))
  	return true;
+ if (walker(stmt->correspondingClause, context))
+ 	return true;
  if (walker(stmt->larg, context))
  	return true;
  if (walker(stmt->rarg, context))
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*

Re: [HACKERS] Underspecified window queries in regression tests

2011-10-16 Thread Hitoshi Harada
2011/10/17 Tom Lane :
> Hitoshi Harada  writes:
>> 2011/10/15 Tom Lane :
>>> I can't recall whether there was some good reason for underspecifying
>>> these test queries.  It looks like all the problematic ones were added in
>>> commit ec4be2ee6827b6bd85e0813c7a8993cfbb0e6fa7 "Extend the set of frame
>>> options supported for window functions", which means it was either me or
>>> Hitoshi-san who wrote them that way, but memory is not serving this
>>> afternoon.
>
>> I don't remember if I wrote that part or not, but I like to add
>> explicit ORDER BY to the test cases. It doesn't appear that some
>> reason stopped us to specify it. So +1 for adding the clauses.
>
> I looked at this more closely and realized that the reason for doing it
> like that was to test window frames defined using ROWS rather than
> RANGE.  If we fully specify the window function's input ordering then
> there's no very interesting distinction between the two, because no rows
> will have any peers.  So adding ORDER BY would in fact reduce the scope
> of the tests.
>
> At this point I'm inclined to leave it alone.  Maybe we could think of
> some other test cases (perhaps using some other function than SUM) which
> would both exercise the difference between RANGE and ROWS mode, and not
> be sensitive to the detailed input ordering.  But I doubt it's really
> worth the trouble.

Ah, you meant the ORDER BY in the window specification (OVER
clause). I thought it was the query's ORDER BY. Yes, it affects the
RANGE case, and we don't have rich frame support for RANGE (like n
PRECEDING ...), so the cases where ORDER BY affects the result are
limited. Agreed with leaving it alone.

Regards,
-- 
Hitoshi Harada



Re: [HACKERS] Underspecified window queries in regression tests

2011-10-16 Thread Tom Lane
Florian Pflug  writes:
> ... reading those parts again, I realize that it says "When ORDER BY is omitted
> the *default* frame consists ... ", and that the second quote is followed
> by a footnote which says

>   There are options to define the window frame in other ways, but this 
> tutorial
>   does not cover them. See Section 4.2.8 for details. [3.5, Window Functions]

> So it was just me being thick. Sorry for the noise.

Hmm.  Maybe the use of a <footnote> there is too subtle, and we should
instead have that text in-line (probably in parentheses)?  Or we could
use a <note>, but that's probably too much emphasis.

regards, tom lane



Re: [HACKERS] Underspecified window queries in regression tests

2011-10-16 Thread Florian Pflug
On Oct17, 2011, at 00:14 , Tom Lane wrote:
> Florian Pflug  writes:
>> But some frame clauses (row 1 preceding, for example) have an effect despite
>> there being no ORDER BY, like here:
> 
> Yeah, why did you expect differently?  Without ORDER BY, all rows are
> peers in the frame ordering, so there's no way for a RANGE spec to
> select less than the whole partition.  But with ROWS, you can select
> less than that.

I was confused by

  When an aggregate function is used as a window function, it aggregates over
  the rows within the current row's window frame. [3.5, Window Functions].

combined with the part I quoted before, which was

 By default, if ORDER BY is supplied then the frame consists of all rows
 from the start of the partition up through the current row, plus any
 following rows that are equal to the current row according to the ORDER BY
 clause. When ORDER BY is omitted the default frame consists of all rows
 in the partition. [9.19, Window Functions, Last Paragraph]

But, reading those parts again, I realize that it says "When ORDER BY is omitted
the *default* frame consists ... ", and that the second quote is followed
by a footnote which says

  There are options to define the window frame in other ways, but this tutorial
  does not cover them. See Section 4.2.8 for details. [3.5, Window Functions]

So it was just me being thick. Sorry for the noise.

best regards,
Florian Pflug




Re: [HACKERS] Underspecified window queries in regression tests

2011-10-16 Thread Tom Lane
Florian Pflug  writes:
> But some frame clauses (row 1 preceding, for example) have an effect despite
> there being no ORDER BY, like here:

Yeah, why did you expect differently?  Without ORDER BY, all rows are
peers in the frame ordering, so there's no way for a RANGE spec to
select less than the whole partition.  But with ROWS, you can select
less than that.

In general it's not that hard to create nondeterministic window-function
queries, since the SQL standard doesn't require you to specify a unique
ordering for the window function's input rows.  Even in RANGE mode there
are plenty of functions that are sensitive to the exact ordering, eg
first_value/last_value.  I guess the committee felt that locking this
down would restrict the feature too much.
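
A minimal illustration of the point about first_value (my example): with
a non-unique ORDER BY key, rows with equal keys are peers, and which
peer the function sees first depends on the input ordering the plan
happens to produce:

    SELECT v, txt, first_value(txt) OVER (ORDER BY v) AS fv
    FROM (VALUES (1, 'a'), (1, 'b'), (2, 'c')) AS d(v, txt);
    -- Whether fv is 'a' or 'b' for the v = 1 peer group is not
    -- determined by the query; it depends on the plan's row order.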

regards, tom lane



Re: [HACKERS] Underspecified window queries in regression tests

2011-10-16 Thread Tom Lane
Hitoshi Harada  writes:
> 2011/10/15 Tom Lane :
>> I can't recall whether there was some good reason for underspecifying
>> these test queries.  It looks like all the problematic ones were added in
>> commit ec4be2ee6827b6bd85e0813c7a8993cfbb0e6fa7 "Extend the set of frame
>> options supported for window functions", which means it was either me or
>> Hitoshi-san who wrote them that way, but memory is not serving this
>> afternoon.

> I don't remember if I wrote that part or not, but I like to add
> explicit ORDER BY to the test cases. It doesn't appear that some
> reason stopped us to specify it. So +1 for adding the clauses.

I looked at this more closely and realized that the reason for doing it
like that was to test window frames defined using ROWS rather than
RANGE.  If we fully specify the window function's input ordering then
there's no very interesting distinction between the two, because no rows
will have any peers.  So adding ORDER BY would in fact reduce the scope
of the tests.

At this point I'm inclined to leave it alone.  Maybe we could think of
some other test cases (perhaps using some other function than SUM) which
would both exercise the difference between RANGE and ROWS mode, and not
be sensitive to the detailed input ordering.  But I doubt it's really
worth the trouble.

regards, tom lane



Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-10-16 Thread Jeff Davis
On Fri, 2011-10-07 at 12:54 +0400, Alexander Korotkov wrote:

> The first thing caught my eye in existing GiST code is idea of
> subtype_float. float8 has limited precision and can't respresent, for
> example, varlena values good enough. Even if we have large int8 value
> we can loose lower bits, but data distribution can be so that these
> bits are valuable. Wouldn't it better to have function like
> subtype_diff_float which returns difference between two values of
> subtype as an float? Using of such function could make penalty more
> sensible to even small difference between values, and accordingly more
> relevant.
> 
I started implementing subtype_diff, and I noticed that it requires
defining an extra function for each range type. Previously, the numeric
types could just use a cast, which was convenient for user-defined range
types.

If you have any other ideas to make that cleaner, please let me know.
Otherwise I'll just finish implementing subtype_diff.
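
For concreteness, the extra per-type burden might look something like
this (a sketch only; the actual CREATE TYPE option names and function
signature are whatever the finished patch defines):

    -- The subtype_diff function a range-type author would now supply:
    CREATE FUNCTION float8_diff(float8, float8) RETURNS float8
        AS 'SELECT $1 - $2' LANGUAGE sql IMMUTABLE;

    CREATE TYPE floatrange AS RANGE (
        subtype = float8,
        subtype_diff = float8_diff
    );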

Regards,
Jeff Davis





Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Tom Lane
Dimitri Fontaine  writes:
> Now that you mention it, the following might actually already work:

>  WITH settings AS (
>SELECT set_config('timezone', 'Europe/Amsterdam', t),
>   set_config('work_mem', '1 GB', t)
>  ),
>   foo AS (
>SELECT …
>  )
>  INSERT INTO bar SELECT * FROM foo;

Only for small values of "work" ... you won't be able to affect planner
settings that way, nor can you assume that that WITH item is executed
before all else.  See recent thread pointing out that setting values
mid-query is unsafe.

regards, tom lane



Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Dimitri Fontaine
Tom Lane  writes:
> That looks pretty non-future-proof to me.  WITH is a SQL-standard
> syntax, it's not an extension that we control.

Now that you mention it, the following might actually already work:

 WITH settings AS (
   SELECT set_config('timezone', 'Europe/Amsterdam', t),
  set_config('work_mem', '1 GB', t)
 ),
  foo AS (
   SELECT …
 )
 INSERT INTO bar SELECT * FROM foo;

So maybe what we need is to only change the is_local parameter to the
function set_config() so that we can have the setting last for only the
current statement?
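
As it stands, set_config's third argument only distinguishes session
scope (false) from transaction-local scope (true, equivalent to SET
LOCAL); the suggestion would amount to a third, statement-local mode,
sketched here purely hypothetically:

    -- Existing behavior: true behaves like SET LOCAL, lasting
    -- until the end of the enclosing transaction:
    SELECT set_config('work_mem', '1GB', true);

    -- The suggested extension, shown as a hypothetical third scope
    -- (this argument form does not exist today):
    -- SELECT set_config('work_mem', '1GB', 'statement');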

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Tom Lane
Dimitri Fontaine  writes:
> I think it would fit quite well within our extending of the WITH syntax.

> WITH
>   work_mem = '1GB',
>   timezone = 'Europe/Amsterdam',
>   foo AS (
> SELECT something
> )
> SELECT toplevel FROM foo;

That looks pretty non-future-proof to me.  WITH is a SQL-standard
syntax, it's not an extension that we control.

regards, tom lane



Re: [HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-16 Thread Tom Lane
Jeff Davis  writes:
> On master, I see a minor test error (at least on my machine) as well as
> a diff. Patch attached.

Hmm, yeah, I forgot to fix this regression test when I added that DETAIL
line.  However, I don't see the need for fooling with the lc_time value?

regards, tom lane



Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Dimitri Fontaine
Jan Urbański  writes:
> this idea has cropped up last PGCon - the ability to set GUC variables
> for the duration of a single query. It would work by setting the GUCs
> for the duration of the query and setting them back to what they were
> after it has terminated. By "setting them back" I mean respecting the
> previously set values, regardless of their source (set in run-time,
> per-role settings, postgresql.conf settings).

+1 on the use case, allowing to do that in the statement itself would be
a nice convenience.

> It mostly falls into the realm of syntax sugar, but as more than one
> person felt it's a good idea, I thought I'd float it around here.
>
> I poked a little bit at the grammar to see where could it fit and didn't
> have much success of doing it without a new reserved keyword. Supposing
> the idea gets some traction, any suggestions for the syntax?

I think it would fit quite well within our extending of the WITH syntax.

WITH
  work_mem = '1GB',
  timezone = 'Europe/Amsterdam',
  foo AS (
SELECT something
)
SELECT toplevel FROM foo;

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Underspecified window queries in regression tests

2011-10-16 Thread Florian Pflug
On Oct16, 2011, at 20:04 , Greg Stark wrote:
> On Fri, Oct 14, 2011 at 9:32 PM, Tom Lane  wrote:
>> We could hack around this by adding more columns to the result so that
>> an index-only scan doesn't work.  But I wonder whether it wouldn't be
>> smarter to add ORDER BY clauses to the window function calls.  I've been
>> known to argue against adding just-in-case ORDER BYs to the regression
>> tests in the past; but these cases bother me more because a plan change
>> will not just rearrange the result rows but change their contents,
>> making it really difficult to verify that nothing's seriously wrong.
> 
> I'm not sure if it applies to this case but I recall I was recently
> running queries on Oracle that included window functions and it
> wouldn't even let me run them without ORDER BY clauses in the window
> definition. I don't know if it cleverly determines that the ORDER BY
> will change the results or if Oracle just requires ORDER BY on all
> window definitions or what.

I was about to point out that whether or not ORDER BY is required probably
ought to depend on whether the window function acts on the frame, or the
whole partition. And that oracle quite likely knows that for the individual
window functions while we don't.

But, actually, our documentation states in 3.5, Window Functions, that

  By default, if ORDER BY is supplied then the frame consists of all rows
  from the start of the partition up through the current row, plus any
  following rows that are equal to the current row according to the ORDER BY
  clause. When ORDER BY is omitted the default frame consists of all rows
  in the partition. 

Either I'm confused, or that doesn't match the observed regression test
failure.

I did a few experiments. Assume that table "d" contains three rows with
v=1, v=2 and v=3.

This case seems to work (the frame is always the whole partition, even
though the frame_clause says "between unbounded preceding and current row"):

postgres=# select array_agg(v) over (range between unbounded preceding
 and current row) from d;
 array_agg 
---
 {1,2,3}
 {1,2,3}
 {1,2,3}

Once one adds an ORDER BY, the frame_clause is taken into account, so
that works too:

postgres=# select array_agg(v) over (order by v range between unbounded 
preceding
 and current row) from d;
 array_agg 
---
 {1}
 {1,2}
 {1,2,3}

But some frame clauses (row 1 preceding, for example) have an effect despite
there being no ORDER BY, like here:

postgres=# select array_agg(v) over (rows 1 preceding) from d;
 array_agg 
---
 {1}
 {1,2}
 {2,3}

ISTM that we probably should ignore frame clauses, unless there's an ORDER BY.
Or maybe even throw an error?

best regards,
Florian Pflug




Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM

2011-10-16 Thread Tom Lane
Florian Pflug  writes:
> On Oct16, 2011, at 19:09 , Tom Lane wrote:
>> That doesn't seem like the same thing at all, because the indexed column
>> is on different sides of the expression in the two cases.  The situation
>> that I'm worried about is "indexedcolumn op ANY(arrayconstant)", and
>> what you're bringing up is "constant op ANY(indexedarraycolumn)".

> Couldn't we teach the main executor to push a ScalarArrayOpExpr down
> into the index AM if the operator belongs to the index's opclass, one
> side is indexed, and the other is constant?

Well, no, unless you're proposing to somehow implement the "constant op
ANY(indexedarraycolumn)" case in all the AMs.  I don't see any practical
way to do it in btree, for one.

regards, tom lane



Re: [HACKERS] patch for new feature: Buffer Cache Hibernation

2011-10-16 Thread Tom Lane
Greg Stark  writes:
> On Fri, Oct 14, 2011 at 4:29 PM, Tom Lane  wrote:
>> Right.  I think this one falls into my class #2, ie, we have no idea how
>> to implement it usefully.  Doesn't (necessarily) mean that the core
>> concept is without merit.

> Hm. given that we have an implementation I wouldn't say we have *no*
> clue.  But there are certainly some parts we don't have consensus yet
> on. But then working code sometimes trumps a lack of absolute
> consensus.

In this context "working" means "shows a significant performance
benefit", and IIRC we don't have a demonstration of that.  Anyway this
was all discussed back in May.

regards, tom lane



Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM

2011-10-16 Thread Florian Pflug
On Oct16, 2011, at 19:09 , Tom Lane wrote:
> Florian Pflug  writes:
>> On Oct15, 2011, at 20:58 , Tom Lane wrote:
>>> So, at least as far as btrees are concerned, it seems like I implemented
>>> the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed
>>> down into the index AM.  The above rules seem pretty btree-specific, so
>>> I don't think that we ought to have the main executor doing any of this.
>>> I envision doing this by marking btree as supporting ScalarArrayOpExpr
>>> scankeys directly, so that the executor does nothing special with them,
>>> and the planner treats them the same as regular scalar indexquals.
> 
>> Hm, would this make it possible to teach the array GIN ops to also handle
>> ScalarArrayOpExpr?
> 
> Hmm, maybe.  In principle the index AM can always do this at least as
> efficiently as the executor can, and maybe there's actual wins to be had
> in GIST and GIN.  So another route to getting rid of the executor-level
> support would be to implement ScalarArrayOpExpr in all the AMs.  I'm not
> personally volunteering to do that though.

Hm, that sounds like we ought to leave the existing infrastructure in
the main executor in place until we have GIN and GIST support.

>> I've recently had to put
>>  ARRAY[$1] <@ $2 AND $1 = ANY($2)
>> into an (inlineable) SQL function to make it use a (btree) index if
>> $1 is a scalar-values field (and $1 constant array) and a GIN index if $2 
>> is a GIN-indexed array-values field (and $2 a constant array). Which of
>> course sucks from an efficiency POV.
> 
> That doesn't seem like the same thing at all, because the indexed column
> is on different sides of the expression in the two cases.  The situation
> that I'm worried about is "indexedcolumn op ANY(arrayconstant)", and
> what you're bringing up is "constant op ANY(indexedarraycolumn)".

Hm, true

> To fit the latter into the existing opclass infrastructure, we'd have to
> somehow teach the planner that "constant op ANY(indexedarraycolumn)"
> is interchangeable with "indexedarraycolumn @> constant", for pairs of
> operators where that's indeed the case.  Seems like that'd be a lot
> messier than the use-case warrants.

That exactly was what convinced me previously that there's no easy way
to do this. I had hoped that with your patch only the index AMs, instead of
the planner, need to know about these equivalences. 

Couldn't we teach the main executor to push a ScalarArrayOpExpr down
into the index AM if the operator belongs to the index's opclass, one
side is indexed, and the other is constant?

best regards,
Florian Pflug





Re: [HACKERS] Underspecified window queries in regression tests

2011-10-16 Thread Greg Stark
On Fri, Oct 14, 2011 at 9:32 PM, Tom Lane  wrote:
> We could hack around this by adding more columns to the result so that
> an index-only scan doesn't work.  But I wonder whether it wouldn't be
> smarter to add ORDER BY clauses to the window function calls.  I've been
> known to argue against adding just-in-case ORDER BYs to the regression
> tests in the past; but these cases bother me more because a plan change
> will not just rearrange the result rows but change their contents,
> making it really difficult to verify that nothing's seriously wrong.

I'm not sure if it applies to this case but I recall I was recently
running queries on Oracle that included window functions and it
wouldn't even let me run them without ORDER BY clauses in the window
definition. I don't know if it cleverly determines that the ORDER BY
will change the results or if Oracle just requires ORDER BY on all
window definitions or what.


-- 
greg



Re: [HACKERS] patch for new feature: Buffer Cache Hibernation

2011-10-16 Thread Greg Stark
On Fri, Oct 14, 2011 at 4:29 PM, Tom Lane  wrote:
> Right.  I think this one falls into my class #2, ie, we have no idea how
> to implement it usefully.  Doesn't (necessarily) mean that the core
> concept is without merit.

Hm. given that we have an implementation I wouldn't say we have *no*
clue.  But there are certainly some parts we don't have consensus yet
on. But then working code sometimes trumps a lack of absolute
consensus.

But just for the sake of argument I'm not sure that the implementation
of dumping the current contents of the buffer cache is actually
optimal. It doesn't handle resizing the buffer cache after a restart
for example which I think would be a significant case. There could be
other buffer cache algorithm parameters users might change -- though I
don't think we really have any currently.

If we had --to take it to an extreme-- a record of every buffer
request prior to the shutdown then we could replay that log virtually
with the new buffer cache size and know what buffers the new buffer
cache size would have had in it.
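The replay idea can be sketched in a few lines of Python. This is a toy model under an LRU policy (PostgreSQL's actual buffer replacement is a clock sweep, so this is only an illustration of the principle, not of the real algorithm): replaying the same access log at a different cache size yields the warm set that size *would* have held.

```python
from collections import OrderedDict

def replay_lru(access_log, cache_size):
    """Replay a log of buffer requests through an LRU cache of the given
    size, returning the pages resident at the end (oldest first)."""
    cache = OrderedDict()
    for page in access_log:
        if page in cache:
            cache.move_to_end(page)          # touch: now most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)    # evict least recently used
            cache[page] = True
    return list(cache)

log = ["a", "b", "c", "a", "d", "b"]
print(replay_lru(log, 2))   # ['d', 'b']
print(replay_lru(log, 3))   # ['a', 'd', 'b']
```

The open problems mentioned above remain: gathering such a log cheaply, and bounding its size.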

I'm not sure if there's any way to gather that data efficiently, and
if we could if there's any way to bound the amount of data we would
have to retain to anything less than nigh-infinite volumes, and if we
could if there's any way to limit that has to be replayed on restart.
But my point is that there may be other more general options than
snapshotting the actual buffer cache of the system shutting down.

-- 
greg



[HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-16 Thread Jeff Davis
On master, I see a minor test error (at least on my machine) as well as
a diff. Patch attached.

Regards,
Jeff Davis
*** a/src/test/regress/expected/collate.linux.utf8.out
--- b/src/test/regress/expected/collate.linux.utf8.out
***
*** 395,401  SELECT relname FROM pg_class WHERE relname ~* '^abc';
  (0 rows)
  
  -- to_char
! SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
 to_char   
  -
--- 395,401 
  (0 rows)
  
  -- to_char
! SET lc_time TO 'tr_TR.UTF-8';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
 to_char   
  -
***
*** 967,972  CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
--- 967,973 
  ERROR:  parameter "lc_ctype" must be specified
  CREATE COLLATION testx (locale = 'nonsense'); -- fail
  ERROR:  could not create locale "nonsense": No such file or directory
+ DETAIL:  The operating system could not find any locale data for the locale name "nonsense".
  CREATE COLLATION test4 FROM nonsense;
  ERROR:  collation "nonsense" for encoding "UTF8" does not exist
  CREATE COLLATION test5 FROM test0;
*** a/src/test/regress/sql/collate.linux.utf8.sql
--- b/src/test/regress/sql/collate.linux.utf8.sql
***
*** 146,152  SELECT relname FROM pg_class WHERE relname ~* '^abc';
  
  -- to_char
  
! SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
  SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE "tr_TR");
  
--- 146,152 
  
  -- to_char
  
! SET lc_time TO 'tr_TR.UTF-8';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
  SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE "tr_TR");
  



Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM

2011-10-16 Thread Tom Lane
Florian Pflug  writes:
> On Oct15, 2011, at 20:58 , Tom Lane wrote:
>> So, at least as far as btrees are concerned, it seems like I implemented
>> the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed
>> down into the index AM.  The above rules seem pretty btree-specific, so
>> I don't think that we ought to have the main executor doing any of this.
>> I envision doing this by marking btree as supporting ScalarArrayOpExpr
>> scankeys directly, so that the executor does nothing special with them,
>> and the planner treats them the same as regular scalar indexquals.

> Hm, would this make it possible to teach the array GIN ops to also handle
> ScalarArrayOpExpr?

Hmm, maybe.  In principle the index AM can always do this at least as
efficiently as the executor can, and maybe there's actual wins to be had
in GIST and GIN.  So another route to getting rid of the executor-level
support would be to implement ScalarArrayOpExpr in all the AMs.  I'm not
personally volunteering to do that though.

> I've recently had to put
>   ARRAY[$1] <@ $2 AND $1 = ANY($2)
> into an (inlineable) SQL function to make it use a (btree) index if
> $1 is a scalar-values field (and $1 constant array) and a GIN index if $2 
> is a GIN-indexed array-values field (and $2 a constant array). Which of
> course sucks from an efficiency POV.

That doesn't seem like the same thing at all, because the indexed column
is on different sides of the expression in the two cases.  The situation
that I'm worried about is "indexedcolumn op ANY(arrayconstant)", and
what you're bringing up is "constant op ANY(indexedarraycolumn)".  To
fit the latter into the existing opclass infrastructure, we'd have to
somehow teach the planner that "constant op ANY(indexedarraycolumn)"
is interchangeable with "indexedarraycolumn @> constant", for pairs of
operators where that's indeed the case.  Seems like that'd be a lot
messier than the use-case warrants.

regards, tom lane



Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM

2011-10-16 Thread Florian Pflug
On Oct15, 2011, at 20:58 , Tom Lane wrote:
> So, at least as far as btrees are concerned, it seems like I implemented
> the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed
> down into the index AM.  The above rules seem pretty btree-specific, so
> I don't think that we ought to have the main executor doing any of this.
> I envision doing this by marking btree as supporting ScalarArrayOpExpr
> scankeys directly, so that the executor does nothing special with them,
> and the planner treats them the same as regular scalar indexquals.

Hm, would this make it possible to teach the array GIN ops to also handle
ScalarArrayOpExpr? I've recently had to put

  ARRAY[$1] <@ $2 AND $1 = ANY($2)

into an (inlineable) SQL function to make it use a (btree) index if
$1 is a scalar-values field (and $1 constant array) and a GIN index if $2 
is a GIN-indexed array-values field (and $2 a constant array). Which of
course sucks from an efficiency POV.

At the time I didn't see a way to easily teach GIN to support ANY, but
with your proposal it seems entirely doable. Unless I'm missing something,
that is.

> In principle somebody could be doing something like
>   WHERE pointcol <@ ANY (ARRAY[list of box values])
> and expecting that to generate a bitmap indexscan on a GIST index, but
> is it likely that anyone is doing that?  (As opposed to the equivalent
> formulation with "pointcol <@ box1 OR pointcol <@ box2 ...", which would
> still work to generate OR'd bitmap scans even if we took out the
> ScalarArrayOpExpr logic.)

Hm, if the number of box values isn't fixed, the ANY form plays much nicer
with parametrized statements than the OR form. So I think we shouldn't
take that away from people. 

OTOH it seems that, depending on the actual list of box values, a bitmap
indexscan isn't the smartest way to do this. If the boxes overlap (or are
close enough to each other), using the bounding box to search the index
and then filtering the remaining rows ought to be more efficient.
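The bounding-box-plus-recheck idea can be illustrated with a toy 2-D model (plain Python, hypothetical names; a real GiST scan would of course descend the tree rather than scan a list): search once with the union bounding box of all the target boxes, then recheck each candidate against the individual boxes.

```python
def bbox(boxes):
    """Bounding box of a list of boxes ((x1, y1), (x2, y2)), x1<=x2, y1<=y2."""
    return ((min(b[0][0] for b in boxes), min(b[0][1] for b in boxes)),
            (max(b[1][0] for b in boxes), max(b[1][1] for b in boxes)))

def contained(point, box):
    (x1, y1), (x2, y2) = box
    return x1 <= point[0] <= x2 and y1 <= point[1] <= y2

def scan_with_recheck(points, boxes):
    """One index probe with the union bounding box, then a recheck of
    each candidate against the individual boxes."""
    bb = bbox(boxes)
    return [p for p in points
            if contained(p, bb) and any(contained(p, b) for b in boxes)]

pts = [(0, 0), (1, 1), (4, 4), (10, 10)]
bxs = [((0, 0), (2, 2)), ((3, 3), (5, 5))]
print(scan_with_recheck(pts, bxs))   # [(0, 0), (1, 1), (4, 4)]
```

Whether this beats one bitmap scan per box depends on how much dead space the union bounding box contains, which is exactly the trade-off described above.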

So maybe we should remove the support for ScalarArrayOpExpr from the main
executor, but add support for it to GIST?

best regards,
Florian Pflug






Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Jan Urbański
On 16/10/11 17:49, Tom Lane wrote:
> Jan Urbański  writes:
>> this idea has cropped up last PGCon - the ability to set GUC variables
>> for the duration of a single query. It would work by setting the GUCs
>> for the duration of the query and setting them back to what they were
>> after it has terminated.
> 
> Doesn't SET LOCAL cover this use-case pretty well already?

It does to a certain degree. If you have a bunch of statements in a
transaction and want to execute one of them with a different timezone
setting, you have to do the SET/RESET dance. In theory you should also
first grab the current value to set it back afterwards, in case someone
else did SET LOCAL before you, but I'll admit that's far-fetched.
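The "safe" version of the dance (save the current value first, restore it afterwards) is essentially a context manager. Here is a toy Python model of that pattern (the `gucs` dict and `guc_set` helper are illustrative only; real GUCs live in the server):

```python
from contextlib import contextmanager

# Toy stand-in for the server's GUC state.
gucs = {"timezone": "UTC"}

@contextmanager
def guc_set(name, value):
    """Save the current value before setting, and restore it afterwards,
    so that a prior SET LOCAL by someone else is respected."""
    saved = gucs[name]
    gucs[name] = value
    try:
        yield
    finally:
        gucs[name] = saved

gucs["timezone"] = "Europe/Warsaw"      # imagine someone else's SET LOCAL
with guc_set("timezone", "US/Pacific"):
    print(gucs["timezone"])             # US/Pacific
print(gucs["timezone"])                 # Europe/Warsaw, not UTC
```

A per-query SET would give this save/restore behaviour for free, without the client having to track the previous value.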

The main use case would be apps running behind pgbouncer and using
statement pooling, and plain convenience.

I'd find it useful myself, but for now I'm making do with SET LOCAL and
it works fine. I'm bringing it up because it appears in the TODO created
at the PL Summit:

* Further discussion of per-statement config parameters for things
like timezone - Jan Urbanski

Tryin' to do my bit and all ;)

Jan



Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Thom Brown
On 16 October 2011 16:44, Jan Urbański  wrote:
> Hi,
>
> this idea has cropped up last PGCon - the ability to set GUC variables
> for the duration of a single query. It would work by setting the GUCs
> for the duration of the query and setting them back to what they were
> after it has terminated. By "setting them back" I mean respecting the
> previously set values, regardless of their source (set in run-time,
> per-role settings, postgresql.conf settings).
>
> An example of where this would be useful: an application maintains a
> persistent connection to the database and answers requests for data from
> a bunch of clients. Each connected client has a preferred timezone and
> would like to get results in that timezone. Currently the application
> has to either sprinkle each query with AT TIME ZONE or wrap the queries
> in "BEGIN; SET LOCAL TIMEZONE ..; <query>; COMMIT". It gets more complex
> when things like pgbouncer come into play.
>
> Another example is a one-off query that should use a different
> statement_timeout than the server has configured or a REINDEX command
> that would like to use more maintenance_work_mem.
>
> It mostly falls into the realm of syntax sugar, but as more than one
> person felt it's a good idea, I thought I'd float it around here.
>
> I poked a little bit at the grammar to see where could it fit and didn't
> have much success of doing it without a new reserved keyword. Supposing
> the idea gets some traction, any suggestions for the syntax?

What about SET LOCAL?

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] LIMITing number of results in a VIEW with global variables

2011-10-16 Thread Florian Pflug
On Oct15, 2011, at 14:52 , Thomas Girault wrote:
> Alternatively, we could also set the alpha value before the query :
> 
> SELECT set_alpha(0.1); SELECT age, young(age) FROM employees WHERE
> young(age);

That's certainly much safer.

> I would be very interested to know if there is smarter way to set global
> variables.

The closest thing to global variables are GUC settings. These can be
set globally in postgresql.conf, per user and/or per database via
ALTER ROLE/DATABASE ... [IN DATABASE ...], per session with SET, per
function via CREATE FUNCTION ... SET and finally per subtransaction 
with SET LOCAL. Modules can add their own GUC settings - you should be
able to find an example in one of the modules in contrib/

But your getter/setter-based solution isn't bad either - in fact, the
trigram module contains something very similar. Just don't try to
modify the value mid-query.

>>> I can sort the results in the view 'sorted_employees' according to
>>> value MU of a fuzzy predicate thanks to fuzzy2bool cast function.
>>> 
>>> CREATE OR REPLACE VIEW sorted_employees AS
>>>   SELECT *, get_mu() as mu
>>>   FROM employees
>>>   ORDER BY mu DESC;
>> 
>> Are you aware that an ORDER BY clause in a VIEW is only going to work
>> if you do "SELECT .. .FROM view".
> 
> It is really the first time I am using views, I didn't know that!

Hm, I think I didn't explain that too well, so to avoid giving you a false
impression here's another try.

A SELECT without an ORDER BY clause attached to the *outermost* level never
guarantees any particular ordering of the result, nor any particular
relationship between the ordering of the SELECT's data sources and the
ordering of the SELECT's result. The only exception is a SELECT of the
form "select ... from <view>" where <view> has an ORDER BY attached to
the outermost level. From that, it follows that an ORDER BY in views used
in SELECTs more complex than the above is usually useless.

Things are different for views that combine ORDER BY and LIMIT, of course.
Then, not only does the order of the view's results change, but also its
output set. Which of course affects every statement which uses the view.
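The ORDER BY + LIMIT point is easy to demonstrate outside SQL. In this small Python sketch, ORDER BY alone leaves the *set* of rows unchanged, while ORDER BY combined with LIMIT makes the set itself depend on the sort direction:

```python
rows = [5, 1, 4, 2, 3]

# ORDER BY alone: same set of rows, only the ordering differs (and an
# outer query is free to destroy that ordering again).
assert set(sorted(rows)) == set(rows)

# ORDER BY + LIMIT: which rows survive depends on the ordering.
top2_asc = sorted(rows)[:2]
top2_desc = sorted(rows, reverse=True)[:2]
print(top2_asc)    # [1, 2]
print(top2_desc)   # [5, 4]
```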

best regards,
Florian Pflug




Re: [HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Tom Lane
Jan Urbański  writes:
> this idea has cropped up last PGCon - the ability to set GUC variables
> for the duration of a single query. It would work by setting the GUCs
> for the duration of the query and setting them back to what they were
> after it has terminated.

Doesn't SET LOCAL cover this use-case pretty well already?

regards, tom lane



[HACKERS] proposal: set GUC variables for single query

2011-10-16 Thread Jan Urbański
Hi,

this idea has cropped up last PGCon - the ability to set GUC variables
for the duration of a single query. It would work by setting the GUCs
for the duration of the query and setting them back to what they were
after it has terminated. By "setting them back" I mean respecting the
previously set values, regardless of their source (set in run-time,
per-role settings, postgresql.conf settings).

An example of where this would be useful: an application maintains a
persistent connection to the database and answers requests for data from
a bunch of clients. Each connected client has a preferred timezone and
would like to get results in that timezone. Currently the application
has to either sprinkle each query with AT TIME ZONE or wrap the queries
in "BEGIN; SET LOCAL TIMEZONE ..; <query>; COMMIT". It gets more complex
when things like pgbouncer come into play.

Another example is a one-off query that should use a different
statement_timeout than the server has configured or a REINDEX command
that would like to use more maintenance_work_mem.

It mostly falls into the realm of syntax sugar, but as more than one
person felt it's a good idea, I thought I'd float it around here.

I poked a little bit at the grammar to see where could it fit and didn't
have much success of doing it without a new reserved keyword. Supposing
the idea gets some traction, any suggestions for the syntax?

Cheers,
Jan



Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM

2011-10-16 Thread Tom Lane
Noah Misch  writes:
> On Sat, Oct 15, 2011 at 02:58:45PM -0400, Tom Lane wrote:
>> [algorithm for a regular index scan satisfying "key IN (...)"]

> Sounds sensible.  The algorithm applies to more than ScalarArrayOpExpr; is it
> not the ability to handle an OR'ed list of ScanKey instead of an AND'ed one?

No, because it's restricted to all the elements having the same operator
and same index column; it's not a generic OR.
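A rough model of why the same-operator, same-column restriction makes the btree case tractable (plain Python with a sorted list standing in for the index; the names are mine, not the actual AM code): sort and de-duplicate the array once, then do one ordered probe per value, so the matches come back in index order.

```python
import bisect

def btree_scan_any(index_keys, array_values):
    """Model of 'indexedcolumn = ANY(arrayconstant)' handled inside a
    btree-like AM.  index_keys must already be in index (sorted) order.
    Works only because every element applies the SAME operator to the
    SAME index column -- this is not a generic OR of arbitrary quals."""
    matches = []
    for v in sorted(set(array_values)):      # sort + de-dup the array once
        lo = bisect.bisect_left(index_keys, v)
        hi = bisect.bisect_right(index_keys, v)
        matches.extend(index_keys[lo:hi])    # one ordered probe per value
    return matches

idx = [1, 3, 3, 5, 7, 9]                     # keys in btree order
print(btree_scan_any(idx, [9, 3, 3, 4]))     # [3, 3, 9]
```

An OR of quals over different columns or operators offers no such single ordered merge, which is why it stays in the executor/bitmap machinery.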

regards, tom lane



Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-10-16 Thread Kohei KaiGai
Hi Robert,

I'm a bit confused by this sentence.

> If you can make this work, I think it could be a pretty sweet plannner
> optimization even apart from the implications for security views.
> Consider a query of this form:
>
> A LEFT JOIN B LEFT JOIN C
>
> where B is a view defined as:
>
> B1 JOIN B2 JOIN B3 LEFT JOIN B4 LEFT JOIN B5
>
> Now let's suppose that from_collapse_limit/join_collapse_limit are set
> low enough that we decline to fold these subproblems together.  If
> there happens to be a qual B.x = 1, where B.x is really B1.x, then the
> generated plan sucks, because it will basically lose the ability to
> filter B1 early, very possibly on, say, a unique index.  Or at least a
> highly selective index.
>

I tried to reproduce the scenario with a small enough from/join_collapse_limit
(typically 1), but the planner still pushes qualifiers down into the lowest-level scan plan.

E.g)
mytest=# SET from_collapse_limit = 1;
mytest=# SET join_collapse_limit = 1;
mytest=# CREATE VIEW B AS SELECT B1.* FROM B1,B2,B3 WHERE B1.x = B2.x
AND B2.x = B3.x;
mytest=# EXPLAIN SELECT * FROM A,B,C WHERE A.x=B.x AND B.x=C.x AND f_leak(B.y);
 QUERY PLAN

 Merge Join  (cost=381.80..9597.97 rows=586624 width=108)
   Merge Cond: (a.x = b1.x)
   ->  Merge Join  (cost=170.85..290.46 rows=7564 width=72)
 Merge Cond: (a.x = c.x)
 ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
   Sort Key: a.x
   ->  Seq Scan on a  (cost=0.00..22.30 rows=1230 width=36)
 ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
   Sort Key: c.x
   ->  Seq Scan on c  (cost=0.00..22.30 rows=1230 width=36)
   ->  Materialize  (cost=210.95..528.56 rows=15510 width=44)
 ->  Merge Join  (cost=210.95..489.78 rows=15510 width=44)
   Merge Cond: (b1.x = b3.x)
   ->  Merge Join  (cost=125.52..165.40 rows=2522 width=40)
 Merge Cond: (b1.x = b2.x)
 ->  Sort  (cost=40.09..41.12 rows=410 width=36)
   Sort Key: b1.x
   ->  Seq Scan on b1  (cost=0.00..22.30
rows=410 width=36)
 Filter: f_leak(y)
 ->  Sort  (cost=85.43..88.50 rows=1230 width=4)
   Sort Key: b2.x
   ->  Seq Scan on b2  (cost=0.00..22.30
rows=1230 width=4)
   ->  Sort  (cost=85.43..88.50 rows=1230 width=4)
 Sort Key: b3.x
 ->  Seq Scan on b3  (cost=0.00..22.30 rows=1230 width=4)
(25 rows)

In this example, f_leak() takes an argument come from B1 table within B view,
and it was correctly distributed to SeqScan on B1.

From the perspective of the code, the *_collapse_limit settings affect the
contents of the joinlist returned from deconstruct_jointree(), i.e. whether
its sub-portions are flattened or not.
However, the qualifiers are distributed by distribute_restrictinfo_to_rels()
to RelOptInfos based on which relations their arguments reference. Thus, the
above f_leak() was distributed to B1, not B, because its arguments come from
B1 only.
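That distribution rule can be modelled in a few lines (a toy sketch, not the planner code; names are illustrative): each qual is attached to a single base relation when it references only that relation, and otherwise stays at the join level.

```python
def distribute_quals(quals, base_rels):
    """Toy model of distribute_restrictinfo_to_rels(): attach each qual
    at the lowest level that supplies all relations it references."""
    placement = {rel: [] for rel in base_rels}
    placement["join"] = []
    for name, refs in quals:
        if len(refs) == 1:
            placement[next(iter(refs))].append(name)   # pushed to the scan
        else:
            placement["join"].append(name)             # stays a join qual
    return placement

quals = [("f_leak(b1.y)", {"b1"}), ("a.x = b1.x", {"a", "b1"})]
p = distribute_quals(quals, ["a", "b1"])
print(p["b1"])     # ['f_leak(b1.y)']
print(p["join"])   # ['a.x = b1.x']
```

Which matches the EXPLAIN output above: f_leak(y) lands as a filter on the B1 seq scan even with the collapse limits set to 1.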


I agree 100% with the following approach to tackling this problem.
However, I'm unclear on how from/join_collapse_limit helps keep
sub-queries unflattened. It seems to me that is determined based on
the result of is_simple_subquery().

> 1. Let quals percolate down into subqueries.
> 2. Add the notion of a security view, which prevents flattening and
> disables the optimization of patch #1
> 3. Add the notion of a leakproof function, which can benefit from the
> optimization of #1 even when the view involved is a security view as
> introduced in #2
>

Thanks,

2011/10/11 Robert Haas :
> On Mon, Oct 10, 2011 at 4:28 PM, Kohei KaiGai  wrote:
>> I agreed. We have been on the standpoint that tries to prevent
>> leakable functions to reference a portion of join-tree being already
>> flatten, however, it has been a tough work.
>> It seems to me it is much simple approach that enables to push
>> down only non-leaky functions into inside of sub-queries.
>>
>> An idea is to add a hack on distribute_qual_to_rels() to relocate
>> a qualifier into inside of the sub-query, when it references only
>> a particular sub-query being come from a security view, and
>> when the sub-query satisfies is_simple_subquery(), for example.
>
> If you can make this work, I think it could be a pretty sweet plannner
> optimization even apart from the implications for security views.
> Consider a query of this form:
>
> A LEFT JOIN B LEFT JOIN C
>
> where B is a view defined as:
>
> B1 JOIN B2 JOIN B3 LEFT JOIN B4 LEFT JOIN B5
>
> Now let's suppose that from_collapse_limit/join_collapse_limit are set
> low enough that we decline to fold these subproblems together.  If
> there happens to be a qual B.x = 1, where B.x is really B1.x, then the
> generated plan suck