Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 11:40 PM, Greg Smith  wrote:
> To be frank, that makes for a materalized view implementation of little
> value over what you can currently do as far as I'm concerned.  It might be
> interesting as a prototype, but that's not necessarily going to look like
> what's needed to do this for real at all.  I'm not a big fan of dumping work
> into projects when you can see exactly how it's going to fail before you
> even get started.  As I see if, if you know where it's going to fall down,
> you don't need to build a prototype as an exercise to show you how to build
> it--you should work on that part first instead.

Hopefully, you're already aware that I have enormous respect for your
opinions on a wide variety of topics; if not, let me publicly say that
I absolutely do.

Having said that, I disagree with your conclusions in this instance.
I see nothing but upside from this work.  It is vastly easier to write
a patch that builds on existing functionality than it is to write
something new from scratch.  If there's any value in having manually
refreshed materialized views, then having the simplest possible
implementation of what those can look like committed will make it far
easier to plan out next steps.  While the proposed implementation may
not solve a huge number of real-world problems, I think there's a good
argument that some people will get good use of it.  Not everyone has
1TB tables with continuous access patterns.  And, provided that it
doesn't conflict with anything we want to do in the future, being
useful to some people is a good enough reason to put it in.

I also think that you're underestimating the number of problems that
will have to be solved to get this done.  It's going to take some
significant work - both design work and coding work - to figure out
how this should integrate into the rest of the system.  (What should
be the value of pg_class.relkind?  Where should the node
representation of the snapshot query be stored?  And did we handle all
of those OID dependencies correctly?)

Where I can see this possibly falling down (other than being just too
much work for a relative PostgreSQL novice to get it done in one
summer) is if there are concerns about it being incompatible with
incrementally-updated views.  I imagine that we're going to want to
eventually support both, so we need to make sure that this
implementation doesn't box us into a corner.  But as far as snapshot
views go, complaining that the proposed locking is too strong doesn't
seem quite fair.  Fixing that, AFAICS, is a very hard project,
possibly involving significant planner support and an implementation
of MERGE, and I would much rather try to land a fundamentals patch
like this first and then deal with the gyrations that will be involved
in making this work than try to land the whole thing all at once.

Of course, if I'm missing something, and there's a SIMPLE way to get
materialized views that can be refreshed without a full-table lock,
that's another story altogether - maybe you have an idea?

Finally, even if we decided NOT to merge this patch because of the
limitations you mention (and right now that doesn't seem to be the
consensus), having this part of it completed as a starting point for
future work might be reason enough by itself.

In short: I think you may be letting the perfect be the enemy of the good.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Greg Smith

Robert Haas wrote:

It's not obvious to me
that a brief full-table lock wouldn't be acceptable for an initial
implementation.  Obviously it wouldn't be suitable for every use case
but since we're talking about manually refreshed views that was bound
to be true anyway.
  


There already is an initial implementation of sorts.  There are a couple 
of ways you can build these right now, so any new development has to 
look like it will end with good odds of being an improvement over what's 
already available before it's worth putting development resources into.


As a rough idea of what people want these for in the field, based on 
what I've seen requests for, imagine that someone has a 1TB table 
they're materializing a view on in order to get at least a 10:1, and 
hopefully close to a 100:1, speedup on viewing summary data.  Now, 
picture what happens if you have someone doing a sequential scan on the 
MV, which is still quite big, the updater process lines up to grab an 
exclusive lock when it's done, and now a second user wanting to read a 
single row quickly comes along behind it.  Given a full-table lock 
implementation, that scenario is unlikely to play out with the second 
user getting a fast response.  They'll likely sit in a lock queue for 
some potentially long period of time instead, waiting for the active seq 
scan to finish then the update to happen.  You have to build it that way 
or a steady stream of people reading could block out updates forever.


To be frank, that makes for a materalized view implementation of little 
value over what you can currently do as far as I'm concerned.  It might 
be interesting as a prototype, but that's not necessarily going to look 
like what's needed to do this for real at all.  I'm not a big fan of 
dumping work into projects when you can see exactly how it's going to 
fail before you even get started.  As I see if, if you know where it's 
going to fall down, you don't need to build a prototype as an exercise 
to show you how to build it--you should work on that part first instead.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Greg Smith

Heikki Linnakangas wrote:

Your proposal basically describes
doing 1, in a limited fashion where the view is not updated
automatically, but only when the DBA runs a command to refresh it. I'm
not sure if that's useful enough on its own, writing "CREATE
MATERIALIZED VIEW ... SELECT ..." doesn't seem any easier than just
writing "CREATE TABLA AS ...". But if you can do something about 2, or
even a very limited part of 1, keeping the view up-to-date
automatically, it becomes much more useful.
  


You've hit upon the core issue here.  You can build materialized views 
right now using "CREATE TABLE AS".  You can even update them by creating 
a new table the same way, with a new name, and doing the 
LOCK/RENAME/DROP shuffle--what people used to do for rewriting bloated 
tables before there was CLUSTER.  The first step in the proposal here is 
essentially syntax to give an easier UI for that.  That's an interesting 
step, but recognize that it doesn't actually provide anything you can't 
do already.


If you then note that doing any sort of incremental update to the view 
is a hard problem, and that a lot of the useful cases for materialized 
views involve tables where it's impractical to recreate the whole thing 
anyway, you'll inevitably find yourself deeply lost in the minutia of 
how to handle the updates.  It's really the core problem in building 
what people expect from a materialized view implementation in a serious 
database.  Chipping away at the other pieces around it doesn't move the 
feature that far forward, even if you get every single one of them 
except incremental updates finished, because everything else combined is 
still not that much work in comparison to the issues around updates.


There certainly are a fair number of subproblems you can break out of 
here.  I just think it's important to recognize that the path that leads 
to a useful GSoC project and the one that gives a production quality 
materialized view implementation may not have that much in common, and 
to manage expectations on both sides accordingly.  If Pavel thinks he's 
going to end up being able to say "I added materialized views to 
PostgreSQL" at the end of the summer, that's going to end in 
disappointment.  And if people think this project plan will lead to 
being able to claim PostgreSQL now has this feature, that's also not 
going to go well.  If the scope is "add initial grammar and rewriting 
moving toward a future materialized view feature", which the underlying 
implementation noted as a stub prototype, that might work out OK.  This 
is why I likened it to the work on "Syntax for partitioning", which has 
a similarly focused subgoal structure.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Virtual Private Database

2010-04-10 Thread Joseph Adams
On Sat, Apr 10, 2010 at 10:00 AM, Jean-Gérard Pailloncy  wrote:
> Hello,
>
> 1) VPD: Virtual Private Database
> I would appreciate to have a new feature in PostgreSQL.
> This is an oracle-like feature that implement "Row Level Security".
> This feature may be emulated by using VIEW/RULE but this is very time
> consuming and error prone.
>
> I would appreciated to have an estimated of the faisability and the cost
> to implement it.
>
> 2) Description
> The feature may be implemented with a simple expression associated to the
> table.
>
> ALTER TABLE table_name ADD FILTER filter_name CHECK(expression);
> ALTER TABLE table_name DROP FILTER filter_name;
>
> Usage/example:
> ALTER TABLE filtered_table ADD FILTER tf_username
> CHECK(filtered_table.creator=user)
> SELECT * FROM filtered_table;
> will really do
> SELECT * FROM filtered_table WHERE filtered_table.creator=user;
>
> Same thing for INSERT, UDPATE, and DELETE
>
> UPDATE filtered_table SET b_column=1 WHERE a_column='a';
> wille really do
> UPDATE filtered_table SET b_column=1 WHERE a_column='a' and
> filtered_table.creator=user;
>
> In practice, the devs will create few function: my_login, my_logout,
> my_filter
> and the simple "filtered_table.creator=user" will be replace by ACL
> encapsulated in the function my_filter and add a triger to check data on
> INSERT, UDPATE.
> We could use veil to build a very efficient filter.
>
> 3) Question
> - Is it doable ?
> - Is it the sound way of doing it ?
> - Is it possible to have it in core ?
> - Is there a pgsql dev interested to implemented it ?
> - Is there other people interested in such feature ?
> - How much this will cost ?
> - With which delay ?
>
>
> Cordialement,
> Jean-Gérard Pailloncy
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

This is quite similar to an idea I posted about not long ago called
access control jails (
http://archives.postgresql.org/pgsql-hackers/2010-03/msg00832.php ).
I encountered this very problem writing a simple web application
involving access control.  There are a variety of ways to make
implementing access control easier, and I think copying off of Oracle
would be among the best ;-)

Disclaimer:  I am not a PostgreSQL hacker, but a newbie with some
experience in other communities, absorbing what he can.

Access control jailing, as I conceived it, would not simply filter per
table, but would make it so all queries would be filtered.  If used
correctly, it would even be safe to execute untrusted SQL (though it
might not be advisable).

I looked at the Veil demo application a tiny bit, and the only thing I
drew from it was the impression that it makes things more complicated,
not less :(  Then again, it may just be the example demonstrating a
lot of different features at once.

One problem that ought to be addressed for any of these ideas is how
to do connection-local variables.  For instance:

> UPDATE filtered_table SET b_column=1 WHERE a_column='a';
> wille really do
> UPDATE filtered_table SET b_column=1 WHERE a_column='a' and
> filtered_table.creator=user;

Here, what is "=user" referring to?  I suppose it is a variable that
is set not long after the session starts and only applies to that
session?  PostgreSQL has temporary tables and such, but you can't
reference them until they're already created.  Hence, I don't think
PostgreSQL elegantly supports free variables that are bound
temporarily per connection.  There are GUCs and such, but using them
for this purpose is far from elegant, if I understand correctly.

Another problem is that session-local context doesn't go well with
connection pooling, so you might need some workaround like passing
context IDs back and forth.

That's my own summary of the discussion about access control jails linked above.

By the way, here's a hack to bind a free variable to a session:

CREATE FUNCTION get_user_id() RETURNS INT AS $$
DECLARE
ret INT;
BEGIN
SELECT INTO ret id FROM user_id_tbl;
RETURN ret;
END
$$ LANGUAGE 'plpgsql';

Then, per-session:

CREATE TEMPORARY TABLE user_id_tbl (id INT);
INSERT INTO user_id_tbl VALUES (5);
SELECT get_user_id();

It relies on plpgsql not complaining about user_id_tbl not existing at
creation time.  What this trick allows one to do is set the user ID
once (e.g. after connecting), then views and such that call
get_user_id() will have the appropriate user ID without needing to
specify it per-query.

I'm curious:  is this trick a good idea?  Does connection pooling play
well with temporary tables (and thus this trick)?  Could it result in
substantial slowdowns (I don't see why it should, since get_user_id()
needs to be called once per query that uses it)?  I guess creating a
temporary table every connection has the potential to be slow.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To 

Re: [HACKERS] Virtual Private Database

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 10:00 AM, Jean-Gérard Pailloncy  wrote:
> 1) VPD: Virtual Private Database
> I would appreciate to have a new feature in PostgreSQL.
> This is an oracle-like feature that implement "Row Level Security".
> This feature may be emulated by using VIEW/RULE but this is very time
> consuming and error prone.
>
> I would appreciated to have an estimated of the faisability and the cost
> to implement it.
>
> 2) Description
> The feature may be implemented with a simple expression associated to the
> table.
>
> ALTER TABLE table_name ADD FILTER filter_name CHECK(expression);
> ALTER TABLE table_name DROP FILTER filter_name;
>
> Usage/example:
> ALTER TABLE filtered_table ADD FILTER tf_username
> CHECK(filtered_table.creator=user)
> SELECT * FROM filtered_table;
> will really do
> SELECT * FROM filtered_table WHERE filtered_table.creator=user;
>
> Same thing for INSERT, UDPATE, and DELETE
>
> UPDATE filtered_table SET b_column=1 WHERE a_column='a';
> wille really do
> UPDATE filtered_table SET b_column=1 WHERE a_column='a' and
> filtered_table.creator=user;
>
> In practice, the devs will create few function: my_login, my_logout,
> my_filter
> and the simple "filtered_table.creator=user" will be replace by ACL
> encapsulated in the function my_filter and add a triger to check data on
> INSERT, UDPATE.
> We could use veil to build a very efficient filter.
>
> 3) Question
> - Is it doable ?
> - Is it the sound way of doing it ?
> - Is it possible to have it in core ?
> - Is there a pgsql dev interested to implemented it ?
> - Is there other people interested in such feature ?
> - How much this will cost ?
> - With which delay ?

This is very similar to the design I've been thinking about for
row-level security.

Here is a pointer to a previous email thread on the topic of row-level security.

http://archives.postgresql.org/pgsql-hackers/2009-12/msg01095.php

Before row-level security can be implemented, we'd need to fix the
problem described here:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01346.php

With respect to sponsoring development of new features, it can
certainly be done.  Any such feature could not at this point be added
any sooner than PostgreSQL 9.1, and I'd recommend that if you want to
see it in 9.1 you should try to get a contract with someone in place
in the next few months.  To get a price, you'd need to contact a
PostgreSQL support/development company or an individual developer.
The following web page might give you some ideas where to start
looking.

http://www.postgresql.org/support/professional_support

There's sort of an understanding that we don't talk about contracts or
pricing on this list, so that the content remains technical rather
than commercial.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Martijn van Oosterhout
On Sat, Apr 10, 2010 at 02:36:41PM +0200, Boszormenyi Zoltan wrote:
> >> The above is quite reproducable, "pg_ctl stop -m immediate"
> >> "usually" inflated my serial sequence, but I had two occasions
> >> when not. The 69 -> 70 was one. The inflated increase is always 33:
> >
> > AFAIKS sequences are pre-logged with 32 values to WAL to avoid
> > overhead. I suspect this is why you are seeing those gaps.
> 
> Then it should happen all the time, even with "-m fast" or "-m smart", no?

Nope, because on a normal shutdown it writes out the actual value. When
you say "immediate" you mean "right now, don't bother with anything not
important", like for example gaps in sequences. You're essentially
crashing the DB.

Have a ncie day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first. 
>   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 2:30 PM, Jeff Davis  wrote:
> On Sat, 2010-04-10 at 20:25 +0200, Yeb Havinga wrote:
>> I was thinking of a case for instance for ranges a,b,c in relations
>> A,B,C respectively, where  a && b and b && c, but not a && c. Would the
>> planner consider a join path of table A and C first, then that result
>> with B. After looking in doxygen, it looks like having && defined
>> without MERGES is what prevents this unwanted behaviour, since that
>> prevents a,b and c to become members of the same equivalence class.
>
> Interesting, I would have to make sure that didn't happen. Most likely
> there would be a new property like "RANGEMERGES", it wouldn't reuse the
> existing MERGES property.
>
>> Sorry for the spam on the list.
>
> Not at all, it's an interesting point.

Yeah... I agree.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Robert Haas
On Fri, Apr 9, 2010 at 5:49 PM, Jeff Davis  wrote:
> On Thu, 2010-04-08 at 22:29 -0400, Robert Haas wrote:
>> 1. knngist wants to use index scans to speed up queries of the form
>> SELECT ... ORDER BY(as opposed to the
>> existing machinery which only knows how to use an index for SELECT ...
>> ORDER BY ).
>> 2. Window functions want to define windows over a range of values
>> defined by the underlying data type.  To do this, we need to define
>> what addition and subtraction mean for a particular data type.
>> 3. Jeff Davis is interested in implementing range types.  When the
>> underlying base type is discrete, e.g. integers, you can say that
>> [1,3] = [1,4), but only if you know that 3 and 4 are consecutive (in
>> that order).
>
> To give some context, I started a thread a while ago:
>
> http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php
>
> Tom provided some interesting suggestions in that thread, but I'm not
> sure they would work for #1 or #2.



The "map && to <<" case is interesting.  It doesn't seem like it's
really a good candidate for type interfaces, because you you're not
really looking for "the" strictly-left-of operator; you're looking for
the strictly-left-of operator associated with the overlaps operator
actually specified.  And ideally there might be an index strategy
number available for <<, too, so that you could consider doing an
index scan instead of a sort, but not necessarily.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Jeff Davis
On Sat, 2010-04-10 at 20:25 +0200, Yeb Havinga wrote:
> I was thinking of a case for instance for ranges a,b,c in relations 
> A,B,C respectively, where  a && b and b && c, but not a && c. Would the 
> planner consider a join path of table A and C first, then that result 
> with B. After looking in doxygen, it looks like having && defined 
> without MERGES is what prevents this unwanted behaviour, since that 
> prevents a,b and c to become members of the same equivalence class. 

Interesting, I would have to make sure that didn't happen. Most likely
there would be a new property like "RANGEMERGES", it wouldn't reuse the
existing MERGES property.

> Sorry for the spam on the list.

Not at all, it's an interesting point.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Yeb Havinga

Robert Haas wrote:

On Sat, Apr 10, 2010 at 12:05 PM, Yeb Havinga  wrote:
  

Jeff Davis wrote:


To give some context, I started a thread a while ago:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php

  

Interesting, a join type for overlaps, which makes me think a bit of the
staircase join for pre-post coordinates. However, does a join operator type
need certain kinds of properties of the operator involved, e.g. being
commutative, transitive etc? Else the join reordering fails. The latter
fails for the overlap operator.



I don't think I follow this.  As far as I know, the join order
constraints don't depend on the choice of operator.
  
I was thinking of a case for instance for ranges a,b,c in relations 
A,B,C respectively, where  a && b and b && c, but not a && c. Would the 
planner consider a join path of table A and C first, then that result 
with B. After looking in doxygen, it looks like having && defined 
without MERGES is what prevents this unwanted behaviour, since that 
prevents a,b and c to become members of the same equivalence class. 
Sorry for the spam on the list.


regards,
Yeb Havinga



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Kevin Grittner
Greg Smith  wrote:
 
> And work on MERGE support is itself blocked behind the fact that
> PostgreSQL doesn't have a good way to lock access to a key value
> that doesn't exist yet--what other databases call key range
> locking.
 
The bulk of the serializable implementation WIP is work to implement
just this sort of locking.  There are already a couple possible
spin-off uses on the horizon based on the ability of these locks to
survive their initiating transactions and detect conflicting writes.
Both spinoffs involve somehow flagging a transaction as being one for
which the locks should be kept until further notice, and issuing a
notification when a conflicting write occurs.  That seems consistent
with the needs of materialized views, too.
 
It probably won't be solid in time to be useful for GSoC, but if
someone's looking to map out a plan for materialized views, I thought
this information might be germane.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 12:05 PM, Yeb Havinga  wrote:
> Jeff Davis wrote:
>>
>> To give some context, I started a thread a while ago:
>>
>> http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php
>>
>
> Interesting, a join type for overlaps, which makes me think a bit of the
> staircase join for pre-post coordinates. However, does a join operator type
> need certain kinds of properties of the operator involved, e.g. being
> commutative, transitive etc? Else the join reordering fails. The latter
> fails for the overlap operator.

I don't think I follow this.  As far as I know, the join order
constraints don't depend on the choice of operator.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Yeb Havinga

Jeff Davis wrote:

To give some context, I started a thread a while ago:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php
  
Interesting, a join type for overlaps, which makes me think a bit of the 
staircase join for pre-post coordinates. However, does a join operator 
type need certain kinds of properties of the operator involved, e.g. 
being commutative, transitive etc? Else the join reordering fails. The 
latter fails for the overlap operator.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Virtual Private Database

2010-04-10 Thread Jean-Gérard Pailloncy
Hello,

1) VPD: Virtual Private Database
I would appreciate to have a new feature in PostgreSQL.
This is an oracle-like feature that implement "Row Level Security".
This feature may be emulated by using VIEW/RULE but this is very time
consuming and error prone.

I would appreciated to have an estimated of the faisability and the cost
to implement it.

2) Description
The feature may be implemented with a simple expression associated to the
table.

ALTER TABLE table_name ADD FILTER filter_name CHECK(expression);
ALTER TABLE table_name DROP FILTER filter_name;

Usage/example:
ALTER TABLE filtered_table ADD FILTER tf_username
CHECK(filtered_table.creator=user)
SELECT * FROM filtered_table;
will really do
SELECT * FROM filtered_table WHERE filtered_table.creator=user;

Same thing for INSERT, UDPATE, and DELETE

UPDATE filtered_table SET b_column=1 WHERE a_column='a';
wille really do
UPDATE filtered_table SET b_column=1 WHERE a_column='a' and
filtered_table.creator=user;

In practice, the devs will create few function: my_login, my_logout,
my_filter
and the simple "filtered_table.creator=user" will be replace by ACL
encapsulated in the function my_filter and add a triger to check data on
INSERT, UDPATE.
We could use veil to build a very efficient filter.

3) Question
- Is it doable ?
- Is it the sound way of doing it ?
- Is it possible to have it in core ?
- Is there a pgsql dev interested to implemented it ?
- Is there other people interested in such feature ?
- How much this will cost ?
- With which delay ?


Cordialement,
Jean-Gérard Pailloncy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Robert Haas
2010/4/10 Andrew Dunstan :
> Heikki Linnakangas wrote:
>>
>> 1. Keep the materialized view up-to-date when the base tables change.
>> This can be further divided into many steps, you can begin by supporting
>> automatic updates only on very simple views with e.g a single table and
>> a where clause. Then extend that to support joins, aggregates,
>> subqueries etc. Keeping it really limited, you could even require the
>> user to write the required triggers himself.
>
> That last bit doesn't strike me as much of an advance. Isn't the whole point
> of this to automate it? Creating greedy materialized views is usually not
> terribly difficult now, but you do have to write the triggers.

Yeah, I agree.

> The other thing that could be interesting about this would be some scheme
> for lazy refresh that didn't involve re-extracting the whole data set.

One way to do this would be to infer a primary key for the result set
based on the input query.  But I think we don't really have the
infrastructure to do this right now, so not really a project for a
beginner.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Andrew Dunstan



Heikki Linnakangas wrote:


1. Keep the materialized view up-to-date when the base tables change.
This can be further divided into many steps, you can begin by supporting
automatic updates only on very simple views with e.g a single table and
a where clause. Then extend that to support joins, aggregates,
subqueries etc. Keeping it really limited, you could even require the
user to write the required triggers himself.

  


That last bit doesn't strike me as much of an advance. Isn't the whole 
point of this to automate it? Creating greedy materialized views is 
usually not terribly difficult now, but you do have to write the triggers.


The other thing that could be interesting about this would be some 
scheme for lazy refresh that didn't involve re-extracting the whole data 
set.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] master in standby mode croaks

2010-04-10 Thread Robert Haas
On Fri, Apr 2, 2010 at 5:36 AM, Simon Riggs  wrote:
> I can't duplicate this error based upon what you have said.

I fooled around with this some more and I think I know what's going
on.  The error message I received was:

recovery connections cannot start because the recovery_connections
parameter is disabled on the WAL source server

This is generated when !checkPoint.XLogStandbyInfoMode.  That, in
turn, is set on the master to the results of XLogStandbyInfoActive(),
which is defined as XLogRequestRecoveryConnections && XLogIsNeeded().
XLogIsNeeded() is defined as XLogArchivingActive() || (max_wal_senders
> 0), and XLogArchivingActive() is defined as XLogArchiveMode.  So
when you expand it all out, this error message gets triggered when the
following condition does not hold on the master:

XLogRequestRecoveryConnections && (XLogArchiveMode || (max_wal_senders > 0))

So this can fail in either of two ways: (1)
XLogRequestRecoveryConnections (aka recovery_connections) might be
false, which is the situation described in the error message, or (2)
XLogArchiveMode (archive_mode) might be false and at the same time
max_wal_senders might be zero.  As it happens, the default
configuration of the system is recovery_connections = true,
archive_mode = false, max_wal_senders = 0, so with an out-of-the-box
config it fails for the reason that isn't the one described in the
error message.

One possible approach here is to improve the error message, but it
seems to me that having the ability of Hot Standby to run on the slave
partially controlled by three different GUCs is awfully complicated.
I think the root of the problem here is that recovery_connections
controls one behavior on the primary (whether or not we WAL-log
certain information needed for HS) and a completely unrelated behavior
on the standby (whether or not we try to allow read-only backends into
the system).  In 8.4 and prior, it was always the job of archive_mode
to decide whether WAL-logging was needed.  Maybe we should go back to
that and make it an enum:

wal_mode = {standby | archive | off}

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Boszormenyi Zoltan
Bernd Helmle írta:
>
>
> --On 10. April 2010 09:26:41 +0200 Boszormenyi Zoltan 
> wrote:
>
>> The above is quite reproducable, "pg_ctl stop -m immediate"
>> "usually" inflated my serial sequence, but I had two occasions
>> when not. The 69 -> 70 was one. The inflated increase is always 33:
>
> AFAIKS sequences are pre-logged with 32 values to WAL to avoid
> overhead. I suspect this is why you are seeing those gaps.

Then it should happen all the time, even with "-m fast" or "-m smart", no?

It seemed like my sequences have a CACHE 32 setting, which would
apply to every client that connects, runs nextval() once and disconnects.

But it didn't happen all the time, so it's not deterministic.

-- 
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] a faster compression algorithm for pg_dump

2010-04-10 Thread Joachim Wieland
On Fri, Apr 9, 2010 at 5:51 AM, Greg Stark  wrote:
> Linking against as an option isn't nearly as bad since the user
> compiling it can choose whether to include the restricted feature or
> not. That's what we do with readline. However it's not nearly as
> attractive when it restricts what file formats Postgres supports -- it
> means someone might generate backup dump files that they later
> discover they don't have a legal right to read and restore :(

If we only linked against it, we'd leave it up to the user to weigh
the risk as long as we are not aware of any such violation.

Our top priority is to make sure that the project would not be harmed
if one day such a patent showed up. If I understood you correctly,
this is not an issue, even if we included lzf and less again if we
only link against it. The rest is about user education and using lzf
only in pg_dump and not for toasting, we could show a message in
pg_dump if lzf is chosen to make the user aware of the possible
issues.

If we still cannot do this, then what I am asking is: What does the
project need to be able to at least link against such a compression
algorithm? Is it a list of 10, 20, 50 or more other projects using it
or is it a lawyer saying: "There is no patent."? But then, how can we
be sure that the lawyer is right? Or couldn't we include it even if we
had both, because again, we couldn't be sure... ?


Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Bernd Helmle



--On 10. April 2010 09:26:41 +0200 Boszormenyi Zoltan  
wrote:



The above is quite reproducable, "pg_ctl stop -m immediate"
"usually" inflated my serial sequence, but I had two occasions
when not. The 69 -> 70 was one. The inflated increase is always 33:


AFAIKS sequences are pre-logged with 32 values to WAL to avoid overhead. I 
suspect this is why you are seeing those gaps.


--
Thanks

Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Boszormenyi Zoltan
Boszormenyi Zoltan írta:
> Hi,
>
> I wanted to test HS/SR and have setup two instances, one primary
> and one secondary, the secondary is obviously a copy of the primary
> while pg_start_backup() was in effect.
>
> I started up the secondary server after "SELECT pg_stop_backup()" on
> the primary. I stopped and started the primary with "-m fast" and
> "-m immediate" and I noticed that the sequence that was created for
> my serial field was inflated if I used "-m immediate".
>
> Here's the scenario:
>
> - primary and secondary are running, then:
>
> zozo=# create table t1 (id serial primary key, t text);
> NOTICE:  CREATE TABLE will create implicit sequence "t1_id_seq" for
> serial column "t1.id"
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey"
> for table "t1"
> CREATE TABLE
> zozo=# insert into t1 (t) values ('a');
> INSERT 0 1
> zozo=#
>
> - stop the primary with "-m fast" (the connection was still alive to it)
>   and start it again, then:
>
> zozo=# \q
> [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
> psql (9.0devel)
> Type "help" for help.
>
> zozo=# select * from t1;
>  id | t
> +---
>   1 | a
> (1 row)
>
> zozo=# insert into t1 (t) values ('b');
> INSERT 0 1
> zozo=# select * from t1;
>  id | t
> +---
>   1 | a
>   2 | b
> (2 rows)
>
> - stop the primary with "-m immediate" (connection was alive on it)
>   and start it again, then:
>
> zozo=# \q
> [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
> psql (9.0devel)
> Type "help" for help.
>
> zozo=# select * from t1;
>  id | t
> +---
>   1 | a
>   2 | b
> (2 rows)
>
> zozo=# insert into t1 (t) values ('b');
> INSERT 0 1
> zozo=# select * from t1;
>  id | t
> +---
>   1 | a
>   2 | b
>  35 | b
> (3 rows)
>
> The above is quite reproducable, "pg_ctl stop -m immediate"
> "usually" inflated my serial sequence, but I had two occasions
> when not. The 69 -> 70 was one. The inflated increase is always 33:
>
> [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
> psql (9.0devel)
> Type "help" for help.
>
> zozo=# insert into t1 (t) values ('f');
> INSERT 0 1
> zozo=# select * from t1;
>  id  | t
> -+---
>1 | a
>2 | b
>   35 | b
>   36 | c
>   69 | d
>   70 | e
>  103 | f
> (7 rows)
>
> Let's try with a plain sequence:
>
> zozo=# create sequence s1;
> CREATE SEQUENCE
> zozo=# select nextval('s1');
>  nextval
> -
>1
> (1 row)
>
> I stopped the primary at this point with "-m immediate",
> and from this first result I thought that a plain sequence is
> not bothered by this:
>
> zozo=# \q
> [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
> psql (9.0devel)
> Type "help" for help.
>
> zozo=# select nextval('s1');
>  nextval
> -
>2
> (1 row)
>
> zozo=# insert into t1 (t) values ('g');
> INSERT 0 1
> zozo=# select * from t1;
>  id  | t
> -+---
>1 | a
>2 | b
>   35 | b
>   36 | c
>   69 | d
>   70 | e
>  103 | f
>  136 | g
> (8 rows)
>
> But another restart and:
>
> zozo=# \q
> [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
> psql (9.0devel)
> Type "help" for help.
>
> zozo=# select nextval('s1');
>  nextval
> -
>   35
> (1 row)
>
> zozo=# select * from t1;
>  id  | t
> -+---
>1 | a
>2 | b
>   35 | b
>   36 | c
>   69 | d
>   70 | e
>  103 | f
>  136 | g
> (8 rows)
>
> zozo=# insert into t1 (t) values ('h');
> INSERT 0 1
> zozo=# select * from t1;
>  id  | t
> -+---
>1 | a
>2 | b
>   35 | b
>   36 | c
>   69 | d
>   70 | e
>  103 | f
>  136 | g
>  169 | h
> (9 rows)
>
> It happened with a CVS version of about 2 weeks ago and the
> yesterday's version, as well. I think it's not intentional, it must be
> a race somewhere, as it doesn't happen all the time.
>
> Best regards,
> Zoltán Böszörményi
>   

And to show that it doesn't happen with "-m fast", I tried it three times.
Before quitting from psql, I stopped the primary with "-m fast" each time:

[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type "help" for help.

zozo=# insert into t1 (t) values ('i');
INSERT 0 1
zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type "help" for help.

zozo=# insert into t1 (t) values ('i');
INSERT 0 1
zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type "help" for help.

zozo=# insert into t1 (t) values ('i');
INSERT 0 1
zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
 136 | g
 169 | h
 170 | i
 171 | i
 172 | i
(12 rows)

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
T

[HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Boszormenyi Zoltan
Hi,

I wanted to test HS/SR and have setup two instances, one primary
and one secondary, the secondary is obviously a copy of the primary
while pg_start_backup() was in effect.

I started up the secondary server after "SELECT pg_stop_backup()" on
the primary. I stopped and started the primary with "-m fast" and
"-m immediate" and I noticed that the sequence that was created for
my serial field was inflated if I used "-m immediate".

Here's the scenario:

- primary and secondary are running, then:

zozo=# create table t1 (id serial primary key, t text);
NOTICE:  CREATE TABLE will create implicit sequence "t1_id_seq" for
serial column "t1.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey"
for table "t1"
CREATE TABLE
zozo=# insert into t1 (t) values ('a');
INSERT 0 1
zozo=#

- stop the primary with "-m fast" (the connection was still alive to it)
  and start it again, then:

zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type "help" for help.

zozo=# select * from t1;
 id | t
+---
  1 | a
(1 row)

zozo=# insert into t1 (t) values ('b');
INSERT 0 1
zozo=# select * from t1;
 id | t
+---
  1 | a
  2 | b
(2 rows)

- stop the primary with "-m immediate" (connection was alive on it)
  and start it again, then:

zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type "help" for help.

zozo=# select * from t1;
 id | t
+---
  1 | a
  2 | b
(2 rows)

zozo=# insert into t1 (t) values ('b');
INSERT 0 1
zozo=# select * from t1;
 id | t
+---
  1 | a
  2 | b
 35 | b
(3 rows)

The above is quite reproducable, "pg_ctl stop -m immediate"
"usually" inflated my serial sequence, but I had two occasions
when not. The 69 -> 70 was one. The inflated increase is always 33:

[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type "help" for help.

zozo=# insert into t1 (t) values ('f');
INSERT 0 1
zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
(7 rows)

Let's try with a plain sequence:

zozo=# create sequence s1;
CREATE SEQUENCE
zozo=# select nextval('s1');
 nextval
-
   1
(1 row)

I stopped the primary at this point with "-m immediate",
and from this first result I thought that a plain sequence is
not bothered by this:

zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type "help" for help.

zozo=# select nextval('s1');
 nextval
-
   2
(1 row)

zozo=# insert into t1 (t) values ('g');
INSERT 0 1
zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
 136 | g
(8 rows)

But another restart and:

zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type "help" for help.

zozo=# select nextval('s1');
 nextval
-
  35
(1 row)

zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
 136 | g
(8 rows)

zozo=# insert into t1 (t) values ('h');
INSERT 0 1
zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
 136 | g
 169 | h
(9 rows)

It happened with a CVS version of about 2 weeks ago and the
yesterday's version, as well. I think it's not intentional, it must be
a race somewhere, as it doesn't happen all the time.

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set LC_COLLATE to de_DE_phoneb

2010-04-10 Thread Martijn van Oosterhout
On Thu, Apr 08, 2010 at 06:59:44PM +0200, Frank Jagusch wrote:
> I thought PostgreSQL is using OS specific functions for sorting (means
> Win32 functions?). This sounds not good for my request. So I ask the
> whole audience: Are others out there asking for a support for the
> alternate sort orders? Is it worth to discuss further in this direction?
> 
> Here an other Idea: Is there a way to define a custom collation for a
> database?

Over the years there have been various options suggested and various
patches posted but they never make it, for various reasons.

The only concerted effort I know of is the PostgreSQL ICU patch which
is in FreeBSD ports, to work around the fact that it's C library
doesn't handle UTF-8 collation at all. 

http://people.freebsd.org/~girgen/postgresql-icu/README.html

I doubt it's been tested for Windows, but if it works it will give you
alternate sort orders and even custom sort orders.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first. 
>   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Heikki Linnakangas
Greg Smith wrote:
> The main hidden complexity in this particular project relates to
> handling view refreshes.  The non-obvious problem is that when the view
> updates, you need something like a SQL MERGE to really handle that in a
> robust way that doesn't conflict with concurrent access to queries
> against the materialized view.  And work on MERGE support is itself
> blocked behind the fact that PostgreSQL doesn't have a good way to lock
> access to a key value that doesn't exist yet--what other databases call
> key range locking.  See the notes for "Add SQL-standard
> MERGE/REPLACE/UPSERT command" at http://wiki.postgresql.org/wiki/Todo
> for more information.
> 
> You can work around that to build a prototype by grabbing a full table
> lock on the materialized view when updating it, but that's not a
> production quality solution.

It would still be useful for many applications. And it would provide a
basis to extend later. You don't need to solve all problems at once, as
long as what you implement is a useful subset.

> Now, with all that said, that doesn't mean there's not a useful project
> for you buried in this mess.  The first two steps in your plan:
> 
> 1) create materialized view
> 2) change rewriter
> 
> Include building a prototype grammer, doing an initial executor
> implementation, and getting some sort of rewriter working.  That is
> potentially good groundwork to lay here.  I would suggest that you
> completely drop your step 3:
> 
> 3) create command that takes snapshot (refresh MV)
> 
> Because you cannot built that in a way that will be useful (and by that
> I mean committable quality) until there's a better way to handle updates
> than writing a whole new table and grabbing a full relation lock to
> switch to it.  To do a good job just on the first two steps should take
> at least a whole summer anyway--there's a whole stack of background
> research needed I haven't seen anyone do yet, and that isn't on your
> plan yet.  There is a precedent for taking this approach.  After getting
> stalled trying to add the entirety of easy partitioning to PostgreSQL,
> the current scope has been scaled back to just trying to get the syntax
> and on-disk structure right, then finish off the implementation.  See
> http://wiki.postgresql.org/wiki/Table_partitioning to get an idea how
> that's been broken into those two major chunks.

The good thing about this subject for GSoC is that it can be divided
into many small steps. There's two largely independent main parts:

1. Keep the materialized view up-to-date when the base tables change.
This can be further divided into many steps, you can begin by supporting
automatic updates only on very simple views with e.g a single table and
a where clause. Then extend that to support joins, aggregates,
subqueries etc. Keeping it really limited, you could even require the
user to write the required triggers himself.

2. Teach the planner to use materialized views automatically when a
query references the base tables. So if you issue the query "SELECT *
FROM table WHERE foo > 10 AND bar = 10", and there's a materialized view
on "SELECT * FROM table WHERE bar = 10", the planner can transform the
original query into "SELECT * FROM materializedview WHERE foo > 10".
This largely depends on 1, although some DBMSs offer the option to use
manually refreshed materialized views too, knowing that they might not
be completely up-to-date.

There's a lot room to choose which problems you want to tackle, which is
good for a summer-of-code project. Your proposal basically describes
doing 1, in a limited fashion where the view is not updated
automatically, but only when the DBA runs a command to refresh it. I'm
not sure if that's useful enough on its own, writing "CREATE
MATERIALIZED VIEW ... SELECT ..." doesn't seem any easier than just
writing "CREATE TABLA AS ...". But if you can do something about 2, or
even a very limited part of 1, keeping the view up-to-date
automatically, it becomes much more useful.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers