Re: [HACKERS] Functional dependencies and GROUP BY - for subqueries

2013-04-28 Thread Ashutosh Bapat
> Can you please elaborate, why would it be a disaster?
>
> Consider that we've done
>
> create table t1 (id int primary key, ... other stuff ...);
> create view v1 as select * from t1;
> create view v2 as select * from v1 group by id;
>
> Currently, v2 would be rejected but you would like to make it legal.
> Now consider
>
> alter table t1 drop primary key;
>
> This ALTER would have to be rejected, or else (with CASCADE) lead to
> dropping v2 but not v1.  That's pretty ugly action-at-a-distance
> if you ask me.  But worse, consider
>
> create or replace view v1 as select * from t2;
>
> where t2 exposes the same columns as t1 but lacks a primary-key
> constraint on id.  This likewise would need to invalidate v2.  We lack
> any dependency mechanism that could enforce that, and it seems seriously
> ugly that such a view redefinition could fail at all.  (Note for
> instance that there's no place to put a CASCADE/RESTRICT option in
> CREATE OR REPLACE VIEW.)
>
> So quite aside from the implementation difficulties of looking into
> views for such constraints, I don't think the behavior would be pleasant
> if we did do it.  Views are not supposed to expose properties of the
> underlying tables.
>
>
Thanks for the explanation.

Is there any reason why do we want to check the functional dependencies at
the time of parsing and not after rewrite? Obviously, by doing so, we will
allow creation of certain views which will start throwing errors after the
underlying table changes the primary key. Is it mandatory that we throw
"functional dependency" related errors at the time of creation of views?


> regards, tom lane
>



-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Graph datatype addition

2013-04-28 Thread Atri Sharma
> It's probably pretty easy to add this, but I think the question is
> what would make it better than storing the same representation in a
> text field.

I completely agree. The main point in making a new datatype would be
to add support for operations that are normally done with graphs.


>Obviously you get validation that the input is in the
> correct format, but you could do that with a CHECK constraint, too, or
> otherwise handle it in the application.  So I think the really
> interesting question is: what operations would you support on this
> data type?

I can think of the standard tasks, i.e. searching if two nodes are
connected or not,adding new nodes and edges, traversing the adjacency
lists of nodes.

If we add support for weighted graphs, we can probably add support for
some common graph algorithms, such as Djikstra's algorithm, Bellman
Ford algorithm, a MST making algorithm, network flow algorithms.

The main idea is to allow user to work with graphs pretty easily, and
allow the user to use the data present in his database to make graphs
and then process them.

> One of the problems you're likely to run into if you store the whole
> graph as a single object is that it may make many of the things you
> want to do with it not very efficient.

Yes, I agree. On further thought, I believe it would be more of a pain
if we stick to representing the whole thing as one.Rather,making
multiple components will be more flexible and modular, and allow us to
modify different components of the same graph without modifying or
interfering with other components of the graph.

I will think of a new design. I am still thinking of using HStore to
store adjacency lists. This should have good performance for access of
lists and similar tasks, IMO.

Regards,

Atri


-- 
Regards,

Atri
*l'apprenant*


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


After Startup Processing (was Re: [HACKERS] Remaining beta blockers)

2013-04-28 Thread Simon Riggs
On 29 April 2013 01:40, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sun, Apr 28, 2013 at 11:41 AM, Tom Lane  wrote:
>>> Well, it's fairly clear *how* to do it: add some more processing that
>>> occurs after we've completed crash replay.  We already have some of
>>> that, eg completion of partial splits in btrees, so it's not that much
>>> of a stretch; it's just a lot of code that's not been written yet.
>
>> As far as I can see, that would require starting a separate backend
>> process for every database, and keeping track of which of those
>> completed their post-recovery work, and disallowing connections to any
>> given database until the post-recovery work for that database had been
>> completed.  That seems to add quite a few failure modes that we don't
>> have today, which is why I haven't been much interested in going that
>> route.
>
> Well, I didn't say it would be easy or quick ;-).  But you're presuming
> quite a number of design elements that don't seem essential to me.
>
> What I'd be inclined to think about as prerequisite work is fixing
> things so that a process could reattach to a new database, after
> flushing all its caches.  That's a feature that's been requested plenty,
> and could have applications in autovacuum or other places besides this,
> and would certainly get lots of testing outside of crash recovery.
>
> Having done that, we could imagine that the startup process itself
> cycles through all the databases and does the fixup work, rather than
> complicating the postmaster logic as suggested above.  Potentially this
> could replace the filesystem-scan-driven fixup logic that exists in it
> now, if it turns out to be faster to search for unlogged tables via a
> catalog scan rather than directory scans.

I'd rather leave the Startup process alone, it does too much already.

It's easy enough to start a new process that can run SQL properly
after the Startup process has gone, maybe we can have more than one.
Hopefully now you can see why I'd like to change the meaning of
database so we can have one process talk to multiple databases at same
time. It helps for a variety of system tasks, like autovacuum,
autoreindex, logical replication etc..

The AfterStartup processing can take normal session locks on the
objects it needs to work on and keep them til its done.

The general background to this is exactly what we need to rebuild
damaged indexes, or allow unlogged indexes on logged tables which can
be rebuilt on crash.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken

2013-04-28 Thread Josh Berkus

> The fine manual notes that the target role has to already have CREATE
> privileges on the target schema --- maybe that's what's biting you in
> this case? 

Nope.  That was the first thing I thought of.  It really is that the
target role must *own* the schema.  So clearly a bug.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken

2013-04-28 Thread Tom Lane
Josh Berkus  writes:
> Actually, the problem is worse than I thought.  It looks like I can't
> set default privs for any role which is not the owner of the schema:

> analytics2=> ALTER DEFAULT PRIVILEGES IN SCHEMA web GRANT SELECT ON
> TABLES TO dbreader;
> ERROR:  permission denied for schema web

The fine manual notes that the target role has to already have CREATE
privileges on the target schema --- maybe that's what's biting you in
this case?  If so, I'd agree that this error message is insufficiently
specific, but I don't think the restriction is unreasonable.  Without
CREATE privs, there's no particular value in setting default privs for
to-be-created objects.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Graph datatype addition

2013-04-28 Thread Robert Haas
On Sun, Apr 28, 2013 at 1:06 AM, Atri Sharma  wrote:
> Inspired by the awesome work done by Oleg sir in HStore, I have been thinking 
> about making a graph data type as an extension in postgres.
>
> I have been reading about graph databases and how storing data in graphs can 
> lead to some really awesome functionalities such as social network analysis, 
> recommender systems, network management.
>
> Essentially, connected data can be represented effectively as a single data 
> item, which can be used for further analysis.
>
> This is in line with my agenda of adding more analytics functionalities in 
> postgres.
>
> I have been thinking about designing the data type as adjacency sets, 
> associating the adjacency list for each node as a value with node identifier 
> as the key.
>
> This should be able to build over HStore, using HStore to associate adjacency 
> list with its node identifier.
>
> The format could be:
>
>  =>   =>  
> '\0'
>
> So,
>
> "node1" => "node2/node3/node4","node2" => "node1/node5/node6","node3" => 
> "node1/node4/node5" '\0'
>
> This can be for unweighted graphs, we can add support for weighted graphs as 
> well.
>
> Thoughts/comments/advice please?

It's probably pretty easy to add this, but I think the question is
what would make it better than storing the same representation in a
text field.  Obviously you get validation that the input is in the
correct format, but you could do that with a CHECK constraint, too, or
otherwise handle it in the application.  So I think the really
interesting question is: what operations would you support on this
data type?

One of the problems you're likely to run into if you store the whole
graph as a single object is that it may make many of the things you
want to do with it not very efficient.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Tom Lane
Robert Haas  writes:
> On Sun, Apr 28, 2013 at 11:41 AM, Tom Lane  wrote:
>> Well, it's fairly clear *how* to do it: add some more processing that
>> occurs after we've completed crash replay.  We already have some of
>> that, eg completion of partial splits in btrees, so it's not that much
>> of a stretch; it's just a lot of code that's not been written yet.

> As far as I can see, that would require starting a separate backend
> process for every database, and keeping track of which of those
> completed their post-recovery work, and disallowing connections to any
> given database until the post-recovery work for that database had been
> completed.  That seems to add quite a few failure modes that we don't
> have today, which is why I haven't been much interested in going that
> route.

Well, I didn't say it would be easy or quick ;-).  But you're presuming
quite a number of design elements that don't seem essential to me.

What I'd be inclined to think about as prerequisite work is fixing
things so that a process could reattach to a new database, after
flushing all its caches.  That's a feature that's been requested plenty,
and could have applications in autovacuum or other places besides this,
and would certainly get lots of testing outside of crash recovery.

Having done that, we could imagine that the startup process itself
cycles through all the databases and does the fixup work, rather than
complicating the postmaster logic as suggested above.  Potentially this
could replace the filesystem-scan-driven fixup logic that exists in it
now, if it turns out to be faster to search for unlogged tables via a
catalog scan rather than directory scans.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken

2013-04-28 Thread Josh Berkus

> ... in fact, there is no combination of actions which will make "FOR
> ROLE" work.  Any invokation of "FOR ROLE" inevitably results in a
> "permission denied" message:
> 
> analytics2=> \c - webui
> You are now connected to database "analytics2" as user "webui".
> analytics2=> ALTER DEFAULT PRIVILEGES FOR ROLE webui IN SCHEMA web
> GRANT SELECT ON TABLES TO dbreader;
> ERROR:  permission denied for schema web

Actually, the problem is worse than I thought.  It looks like I can't
set default privs for any role which is not the owner of the schema:

[jberkus@pgx-test ~]$ psql -U webui analytics2
psql (9.2.4)
Type "help" for help.

analytics2=> ALTER DEFAULT PRIVILEGES IN SCHEMA web GRANT SELECT ON
TABLES TO dbreader;
ERROR:  permission denied for schema web

In other words, ALTER DEFAULT PRIVs only works if you are the role
you're trying to grant, and that role is the owner of the schema.  It
doesn't work for any other role or any schema you don't own.

This means that I have NO WAY to set default privs for the majority of
users on my system.  WTF?  How did we break this so badly?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Analyzing bug 8049

2013-04-28 Thread Tom Lane
I wrote:
> The only alternative I can see is to make a back-patch that just teaches
> get_eclass_for_sort_expr() to compute valid nullable_relids for the sort
> expression.  That's necessary code in any case, and it would fix the
> immediately complained-of bug.  The thing that scares me is that it is
> now clear that equivclass.c is capable of considering two expressions to
> be equivalent when they should not be; that is, I'm afraid there are
> variants of the problem that would not be cured by such a limited
> back-patch.  But maybe I should try to create such an example before
> proposing the more invasive approach.

In thinking about how to compute valid nullable_relids in
get_eclass_for_sort_expr, I realized that it is impractical to do it
in the current code structure, because computing nullable_relids
requires that the joininfo list has been built by deconstruct_jointree,
which hasn't been run yet at the time grouping_planner computes the
pathkeys for the ORDER BY and related clauses.  So I concluded that
we need to postpone computation of those pathkeys lists until after
deconstruct_jointree.

Working through this, I realized that we could, in fact, simply delay
computation of all five of group_pathkeys, window_pathkeys,
distinct_pathkeys, sort_pathkeys, and query_pathkeys until the point
where we currently do canonicalize_all_pathkeys.  There isn't anything
that looks at those lists in-between.  The only reason they're set up by
grouping_planner is separation of concerns --- query_planner doesn't
really have any business dealing with group_pathkeys for instance.
Furthermore, if we delay computation of those lists, then we can get rid
of the notion of non-canonical pathkeys altogether: there is noplace
that needs to generate a PathKey at all before this point.  So that
results in a nice conceptual simplification as well as saving a few
cycles.

Once I'd finished moving the pathkeys calculations, I was a tad
surprised to discover that bug 8049 was gone!  The reason for this is
that now, if an ORDER BY item matches some expression that has another
reason to be in an EquivalenceClass, the ORDER BY item will get matched
to a pre-existing EC item and so there will be no opportunity for
get_eclass_for_sort_expr() to do the wrong thing.  If an ORDER BY item
doesn't match anything, then it will be created as a new single-item
EquivalenceClass, containing arguably-incorrect NULL em_nullable_relids,
but there is nothing that will notice that.

I remain suspicious that there are or will someday be cases where we
actually need to generate valid em_nullable_relids for an ORDER BY item,
but in the absence of a demonstrated bug it doesn't seem real prudent to
add a bunch of new code here, either in 9.2 or 9.3.  Therefore, I think
what we ought to do is apply something like the attached patch to 9.2
and 9.3 and call it a day for now.

The difficulty with back-patching this patch is that it involves API
changes for query_planner() and make_pathkeys_for_sortclauses(), as well
as outright removal of canonicalize_pathkeys().  So there's a risk of
breaking third-party code that calls any of those functions.  I don't
have the slightest hesitation about changing these APIs in 9.3, but
back-patching them into 9.2 is a bit more nervous-making.

It would be trivial to make make_pathkeys_for_sortclauses() ignore
its "canonicalize" parameter in 9.2, or maybe better add an Assert
(not a runtime test) that the parameter is "true".  And we could
turn canonicalize_pathkeys() into a no-op function in 9.2.  However,
I'm not seeing a better alternative to adding the callback parameter
to query_planner().  The difficulty is that in order to compute those
pathkey lists, we need access to the "tlist" and "activeWindows" values,
which heretofore have just been local in grouping_planner().  We could
add fields to PlannerInfo to carry them, and then grouping_planner()
could just call some new function in planner.c rather than needing an
explicit callback argument.  But new fields in PlannerInfo are an ABI
break that in my judgment would be more risky in 9.2 than changing
query_planner()'s signature --- I think it's somewhat unlikely that
any plug-ins are calling query_planner() directly.

Thoughts?  Anybody know of a counterexample to the idea that no plug-ins
call query_planner()?

regards, tom lane

diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 7245fa32a0f0d7d7bb924eb8461d68a2a0d702ed..3f96595e8eb041bcebe38639e075db7eb8a77fd2 100644
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
*** _equalFromExpr(const FromExpr *a, const 
*** 728,749 
  static bool
  _equalPathKey(const PathKey *a, const PathKey *b)
  {
! 	/*
! 	 * This is normally used on non-canonicalized PathKeys, so must chase up
! 	 * to the topmost merged EquivalenceClass and see if those are the same
! 	 * (by pointer equality).
! 	 */
! 	EquivalenceClass *a_eclass;
! 	Equiva

[HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken

2013-04-28 Thread Josh Berkus
Folks,

The "FOR ROLE" syntax is completely broken, as of 9.2.4.  Not sure when
exactly this got broken; I remember it working sometime in the past:

[jberkus@pgx-test ~]$ psql -U postgres analytics2
psql (9.2.4)
Type "help" for help.

analytics2=# ALTER DEFAULT PRIVILEGES FOR ROLE webui IN SCHEMA web
GRANT SELECT ON TABLES TO dbreader;
ERROR:  permission denied for schema web

... in fact, there is no combination of actions which will make "FOR
ROLE" work.  Any invokation of "FOR ROLE" inevitably results in a
"permission denied" message:

analytics2=> \c - webui
You are now connected to database "analytics2" as user "webui".
analytics2=> ALTER DEFAULT PRIVILEGES FOR ROLE webui IN SCHEMA web
GRANT SELECT ON TABLES TO dbreader;
ERROR:  permission denied for schema web

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Robert Haas
On Sun, Apr 28, 2013 at 11:41 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sat, Apr 27, 2013 at 3:51 PM, Tom Lane  wrote:
>>> I cannot say that I find that idea attractive; the biggest problem with
>>> it being that updating such a state flag will be nontransactional,
>>> unless we go to a lot of effort to support rollbacks.  ISTM that the
>>> scannability property is a perfectly normal relation property and as
>>> such *ought* to be in the pg_class row, or at worst some other catalog
>>> entry.  Why do you think differently?
>
>> Mostly because of the issue with unlogged tables, I suppose.  If
>> you've got a reasonable  idea how to do catalog updates on restart,
>> though, I could probably be convinced to yield to that.
>
> Well, it's fairly clear *how* to do it: add some more processing that
> occurs after we've completed crash replay.  We already have some of
> that, eg completion of partial splits in btrees, so it's not that much
> of a stretch; it's just a lot of code that's not been written yet.

As far as I can see, that would require starting a separate backend
process for every database, and keeping track of which of those
completed their post-recovery work, and disallowing connections to any
given database until the post-recovery work for that database had been
completed.  That seems to add quite a few failure modes that we don't
have today, which is why I haven't been much interested in going that
route.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Simon Riggs
On 28 April 2013 21:06, Tom Lane  wrote:
> Simon Riggs  writes:
>> On 28 April 2013 16:55, Tom Lane  wrote:
>>> The bottom line here is that we have substantial disagreement on how
>>> unlogged matviews should be implemented, and there's no longer enough
>>> time for coming to a resolution that will satisfy everybody.  I think
>>> that means we have to pull the feature from 9.3.  If it had not yet
>>> been committed it would certainly not be getting in now over multiple
>>> objections.
>
>> I've not said much good about Mat Views, that is true, but that was
>> aimed at not running with it as a headline feature without
>> qualification. I don't take that as far as thinking the feature should
>> be pulled completely; there is some good worth having in most things.
>> Is this issue worth pulling the whole feature on?
>
> I think you misread that.  I'm only proposing that we remove *unlogged*
> matviews, and perhaps scannability tracking for matviews.

Happily so.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]

2013-04-28 Thread David Fetter
On Tue, Feb 26, 2013 at 01:09:30PM -0800, David Fetter wrote:
> On Wed, Feb 13, 2013 at 06:45:31AM -0800, David Fetter wrote:
> > On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote:
> > > Folks,
> > > 
> > > Per suggestions and lots of help from Andrew Gierth, please find
> > > attached a patch to clean up the call sites for FuncCall nodes, which
> > > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > > I only redid 37) places where it's called.  The remaining one is in
> > > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > > changes in the that struct anyhow.
> > > 
> > > The immediate purpose is two-fold: to reduce some redundancies, which
> > > I believe is worth doing in and of itself, and to prepare for adding
> > > FILTER on aggregates from the spec, and possibly other things in
> > > the  part of the spec.
> > > 
> > > Cheers,
> > > David.
> > 
> > Folks,
> > 
> > Please find attached two versions of a patch which provides optional
> > FILTER clause for aggregates (T612, "Advanced OLAP operations").
> > 
> > The first is intended to be applied on top of the previous patch, the
> > second without it.  The first is, I believe, clearer in what it's
> > doing.  Rather than simply mechanically visiting every place a
> > function call might be constructed, it visits a central one to change
> > the default, then goes only to the places where it's relevant.
> > 
> > The patches are both early WIP as they contain no docs or regression
> > tests yet.
> 
> Docs and regression tests added, makeFuncArgs approached dropped for
> now, will re-visit later.

Regression tests added to reflect bug fixes in COLLATE.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
*** a/doc/src/sgml/ref/select.sgml
--- b/doc/src/sgml/ref/select.sgml
***
*** 595,604  GROUP BY expression [, ...]
 
  
 
! Aggregate functions, if any are used, are computed across all rows
  making up each group, producing a separate value for each group
  (whereas without GROUP BY, an aggregate
  produces a single value computed across all the selected rows).
  When GROUP BY is present, it is not valid for
  the SELECT list expressions to refer to
  ungrouped columns except within aggregate functions or if the
--- 595,607 
 
  
 
! In the absence of a FILTER clause,
! aggregate functions, if any are used, are computed across all rows
  making up each group, producing a separate value for each group
  (whereas without GROUP BY, an aggregate
  produces a single value computed across all the selected rows).
+ When a FILTER clause is present, only those
+ rows matching the FILTER clause are included.
  When GROUP BY is present, it is not valid for
  the SELECT list expressions to refer to
  ungrouped columns except within aggregate functions or if the
*** a/doc/src/sgml/syntax.sgml
--- b/doc/src/sgml/syntax.sgml
***
*** 1562,1585  sqrt(2)
  syntax of an aggregate expression is one of the following:
  
  
! aggregate_name 
(expression [ , ... ] [ 
order_by_clause ] )
! aggregate_name (ALL 
expression [ , ... ] [ 
order_by_clause ] )
! aggregate_name (DISTINCT 
expression [ , ... ] [ 
order_by_clause ] )
! aggregate_name ( * )
  
  
  where aggregate_name is a previously
  defined aggregate (possibly qualified with a schema name),
! expression is
! any value expression that does not itself contain an aggregate
! expression or a window function call, and
! order_by_clause is a optional
! ORDER BY clause as described below.
 
  
 
! The first form of aggregate expression invokes the aggregate
! once for each input row.
  The second form is the same as the first, since
  ALL is the default.
  The third form invokes the aggregate once for each distinct value
--- 1562,1587 
  syntax of an aggregate expression is one of the following:
  
  
! aggregate_name 
(expression [ , ... ] [ 
order_by_clause ] ) [ FILTER ( WHERE 
filter_clause ) ]
! aggregate_name (ALL 
expression [ , ... ] [ 
order_by_clause ] ) [ FILTER ( WHERE 
filter_clause ) ]
! aggregate_name (DISTINCT 
expression [ , ... ] [ 
order_by_clause ] ) [ FILTER ( WHERE 
filter_clause ) ]
! aggregate_name ( * ) [ FILTER ( WHERE 
filter_clause ) ]
  
  
  where aggregate_name is a previously
  defined aggregate (possibly qualified with a schema name),
! expression is any value expression that
! does not itself contain an aggregate expression or a window
! function call, order_by_clause is a
! optional ORDER BY clause as described below.  The
! aggregate_name can also be suffixed
!

Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Tom Lane
Simon Riggs  writes:
> On 28 April 2013 16:55, Tom Lane  wrote:
>> The bottom line here is that we have substantial disagreement on how
>> unlogged matviews should be implemented, and there's no longer enough
>> time for coming to a resolution that will satisfy everybody.  I think
>> that means we have to pull the feature from 9.3.  If it had not yet
>> been committed it would certainly not be getting in now over multiple
>> objections.

> I've not said much good about Mat Views, that is true, but that was
> aimed at not running with it as a headline feature without
> qualification. I don't take that as far as thinking the feature should
> be pulled completely; there is some good worth having in most things.
> Is this issue worth pulling the whole feature on?

I think you misread that.  I'm only proposing that we remove *unlogged*
matviews, and perhaps scannability tracking for matviews.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Simon Riggs
On 28 April 2013 16:55, Tom Lane  wrote:
> Simon Riggs  writes:
>> On other patches, one committer objecting to something is seen as
>> enough of a blocker to require change. That should work in every
>> direction.
>
> The bottom line here is that we have substantial disagreement on how
> unlogged matviews should be implemented, and there's no longer enough
> time for coming to a resolution that will satisfy everybody.  I think
> that means we have to pull the feature from 9.3.  If it had not yet
> been committed it would certainly not be getting in now over multiple
> objections.

I've not said much good about Mat Views, that is true, but that was
aimed at not running with it as a headline feature without
qualification. I don't take that as far as thinking the feature should
be pulled completely; there is some good worth having in most things.
Is this issue worth pulling the whole feature on?

> Given Robert's concerns, it may be that the same should be said for
> scannability tracking.  I think it's definitely the case that if we
> don't have unlogged matviews then the need for system-level tracking
> of scannability is greatly decreased.  Kevin's already said that he
> plans to work on a much more flexible notion of scannability for 9.4,
> and I remain concerned that something we do in haste now might not
> prove to be a good upward-compatible basis for that redesign.

Given that unlogged tables are somewhat volatile, unlogged matviews
wouldn't be missed much AFAICS. We can add that thought as a later
optimisation.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PG 9.3beta1 postponed a week

2013-04-28 Thread Tom Lane
After a bit of discussion, the core committee has decided that we're not
really ready to wrap a credible beta1 candidate tomorrow.  There are
several unresolved issues such as what to do about checksums and
matviews; and there seems no good reason to force resolution of
these important issues on a tight deadline.

So we'll delay the beta a week.  New plan is wrap Monday May 6,
announce Thursday May 9.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Tom Lane
Simon Riggs  writes:
> On other patches, one committer objecting to something is seen as
> enough of a blocker to require change. That should work in every
> direction.

The bottom line here is that we have substantial disagreement on how
unlogged matviews should be implemented, and there's no longer enough
time for coming to a resolution that will satisfy everybody.  I think
that means we have to pull the feature from 9.3.  If it had not yet
been committed it would certainly not be getting in now over multiple
objections.

Given Robert's concerns, it may be that the same should be said for
scannability tracking.  I think it's definitely the case that if we
don't have unlogged matviews then the need for system-level tracking
of scannability is greatly decreased.  Kevin's already said that he
plans to work on a much more flexible notion of scannability for 9.4,
and I remain concerned that something we do in haste now might not
prove to be a good upward-compatible basis for that redesign.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Tom Lane
Robert Haas  writes:
> On Sat, Apr 27, 2013 at 3:51 PM, Tom Lane  wrote:
>> I cannot say that I find that idea attractive; the biggest problem with
>> it being that updating such a state flag will be nontransactional,
>> unless we go to a lot of effort to support rollbacks.  ISTM that the
>> scannability property is a perfectly normal relation property and as
>> such *ought* to be in the pg_class row, or at worst some other catalog
>> entry.  Why do you think differently?

> Mostly because of the issue with unlogged tables, I suppose.  If
> you've got a reasonable  idea how to do catalog updates on restart,
> though, I could probably be convinced to yield to that.

Well, it's fairly clear *how* to do it: add some more processing that
occurs after we've completed crash replay.  We already have some of
that, eg completion of partial splits in btrees, so it's not that much
of a stretch; it's just a lot of code that's not been written yet.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add --single-row to psql

2013-04-28 Thread Craig Ringer
On 04/25/2013 10:55 PM, Andrew Dunstan wrote:
>
> It's an Amazon product based on release 8.0, but with many many
> features removed (e.g. Indexes!) 
More specifically, it's a hacked-up column-store-ized Pg for OLAP and
analytics work. As I understand it Amazon didn't develop it themselves;
they bought/licensed Paraccel ( http://www.paraccel.com/) and customised
to integrated into Amazon's IAM authentication/authorization, usage
accounting, dynamic provisioning, etc.

http://docs.aws.amazon.com/redshift/latest/dg/c_redshift-and-postgres-sql.html
http://en.wikipedia.org/wiki/ParAccel

I find it frustrating that I've never seen an @paraccel email address
here and that few of the other vendors of highly customised Pg offshoots
are contributing back. It's almost enough to make me like the GPL.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-28 Thread Ants Aasma
On Fri, Apr 26, 2013 at 10:57 PM, Jeff Davis  wrote:
> On Fri, Apr 26, 2013 at 7:09 AM, Simon Riggs  wrote:
>> I'm expecting to spend some time on this over the weekend, once I've
>> re-read the thread and patches to see if there is something to commit.
>>
>> That's my last time window, so this looks like the last chance to make
>> changes before beta.
>
> I updated the patch and split it into two parts (attached).
>
> The first patch is the checksum algorithm itself. I have done
> some documentation updates and moved it into the C file (rather
> than the README), but further explanation of the "shift right 3"
> modification will need to be by Ants or Florian.
>
> The second patch adds the configure-time check for the right
> compilation flags, and uses them when compiling checksum.c. I
> called the new variable CFLAGS_EXTRA, for lack of a better idea,
> so feel free to come up with a new name. It doesn't check for, or
> use, -msse4.1, but that can be specified by the user by
> configuring with CFLAGS_EXTRA="-msse4.1".
>
> I don't know of any more required changes, aside from
> documentation improvements.

I have updated the base patch. This is supposed to go under the
cflags-vector patch that Jeff posted yesterday.

I had the opportunity to run a few more tests of the hash. Based on
the tests I switched the shift-right operation from 3 to 17bits (the
original value was chosen by gut feel). Avalanche tests showed that
this value removed bias the quickest. You can see the difference in
the attached image, colors are still black 0% bias, blue 5%, green
33%, yellow 75%, red 100%. The final box in the diagram is covered by
the final mixing iteration. The take away from these diagrams is 17
mixes better than 3. 17 still has some residual bias for the final
iteration on the page. The effective information content values in
checksum for 16 high order bits on final 32 32bit words on the page
are: 16.0 15.1 14.1 13.3 12.6 12.1 11.8 11.7 11.5 11.5 11.1 11.0 10.9
10.6 10.6 10.4. Error detection capability for the highest bit is
therefore 1:1351. Based on this I also switched to using two
iterations of zeroes at the end, this way the lowest information
content is 15.976bits or 1:64473.

Documentation changes:
* reworded the algorithm description so the order of steps is more apparent.
* added a link to the FNV reference page.
* fixed note about FNV being 4 bytes at a time. Official variant is 1
byte at a time.
* added a segment of why the algorithm was chosen and its error
detection capabilities.
* added a segment about how the code affects vectorization.

Issue to decide before commiting:
* Whether to use 32 or 64 parallel checksums. The tradeoff for 64 is a
small performance hit (10-20%) on todays CPUs for a small performance
gain on Haswell processors coming to market this year and up to a
theoretical 2x performance gain on future CPUs. Changing this is just
a matter of changing N_SUMS and updating documentation to match.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
<><>

fnv-ants-20130428.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Robert Haas
On Sat, Apr 27, 2013 at 3:51 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sat, Apr 27, 2013 at 3:33 PM, Tom Lane  wrote:
>>> Um, wait, it's *not* in pg_class now, and what I was about to do was
>>> go put it there.  Is there a typo in the above para, or are you saying
>>> you don't like either approach?  If the latter, what concept have you
>>> got for an eventual implementation?
>
>> If we're going to have it at all, I'd like to make it a flag in the
>> page header on page 0, or maybe have a dedicated metapage that stores
>> that detail, and perhaps other things.
>
> I cannot say that I find that idea attractive; the biggest problem with
> it being that updating such a state flag will be nontransactional,
> unless we go to a lot of effort to support rollbacks.  ISTM that the
> scannability property is a perfectly normal relation property and as
> such *ought* to be in the pg_class row, or at worst some other catalog
> entry.  Why do you think differently?

Mostly because of the issue with unlogged tables, I suppose.  If
you've got a reasonable  idea how to do catalog updates on restart,
though, I could probably be convinced to yield to that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] high io BUT huge amount of free memory

2013-04-28 Thread Craig Ringer
On 04/24/2013 09:39 PM, Shaun Thomas wrote:
> On 04/24/2013 08:24 AM, Robert Haas wrote:
>
>> Are you referring to the fact that vm.zone_reclaim_mode = 1 is an
>> idiotic default?
> Servers are getting shafted in a lot of cases, and it's actually
> starting to make me angry.
>

A significant part of that problem is that desktop users and people
developing for desktops *test* kernels before or shortly after their
release.

Most server operators won't let a new kernel anywhere near their
machines, so reports of problems on big real-world servers lag severely
behind Linux kernel development.

By the time last years' issues are fixed, there's a whole new crop of
issues that make the new kernel problematic in other ways.

There *are* teams testing new kernels on big hardware, but this takes
money and resources not everyone has. They're also limited in what tests
the have available to them. One of the big things that you can do to
help is *produce automated test cases* that demonstrate performance
problems so they can be incorporated into future kernel testing and
benchmarking processes. You can also help test newer kernels to see if
your issues are fixed.

I know full well how frustrating it can be when you feel your use cases
and problems are ignored or dismissed (I've worked on Java EE) ... but
the only way I've ever found to get genuine progress is to put that
aside and help.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Simon Riggs
On 27 April 2013 19:06, Tom Lane  wrote:
> Noah Misch  writes:
>> On Sat, Apr 27, 2013 at 10:59:32AM -0400, Tom Lane wrote:
>>> As far as #1 goes, I think we have little choice at this point but to
>>> remove the unlogged-matviews feature for 9.3.
>
>> This perspective is all wrong.  I hate to be blunt, but that thread ended 
>> with
>> your technical objections to the committed implementation breaking apart and
>> sinking.  There was no consensus to change it on policy/UI grounds, either.
>
> [ shrug... ]  You and Kevin essentially repeated your claims that the
> current implementation is OK; nobody else weighed in.

On other patches, one committer objecting to something is seen as
enough of a blocker to require change. That should work in every
direction.

In any case, we shouldn't all need to line up and vote on stuff, its
so timewasting. Of course, we need to sometimes, but only when the
case is put clearly enough that it can be done sensibly, otherwise we
just end up voting ad hominem.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-28 Thread Simon Riggs
On 27 April 2013 20:23, Robert Haas  wrote:
> On Sat, Apr 27, 2013 at 10:59 AM, Tom Lane  wrote:

>> 2. The checksum algorithm business.  Again, we don't get to tinker with
>> that anymore once we're in beta.
>
> I think it's pretty darn clear that we should change the algorithm,
> and I think we've got a patch to do that.  So we should be able to
> resolve this relatively quickly.

I'll be working on this today.

> But +1 for adding a checksum
> algorithm ID to pg_control anyway.

Yes, seems best.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers