Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Yeb Havinga

Robert Haas wrote:

The advantage of specifying a + and a - in the type interface is that
the unit definition can then be specified as part of the type
declaration itself.  So you can do:

CREATE TYPE ts_sec AS RANGE OVER timestamp (UNIT = '1s');
CREATE TYPE ts_min AS RANGE OVER timestamp (UNIT = '1m');
  
The problem with mixing units with ranges is that units are properties 
of some underlying datatype but not all types on which ranges can be 
defined.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Heikki Linnakangas
Greg Smith wrote:
 The main hidden complexity in this particular project relates to
 handling view refreshes.  The non-obvious problem is that when the view
 updates, you need something like a SQL MERGE to really handle that in a
 robust way that doesn't conflict with concurrent access to queries
 against the materialized view.  And work on MERGE support is itself
 blocked behind the fact that PostgreSQL doesn't have a good way to lock
 access to a key value that doesn't exist yet--what other databases call
 key range locking.  See the notes for Add SQL-standard
 MERGE/REPLACE/UPSERT command at http://wiki.postgresql.org/wiki/Todo
 for more information.
 
 You can work around that to build a prototype by grabbing a full table
 lock on the materialized view when updating it, but that's not a
 production quality solution.

It would still be useful for many applications. And it would provide a
basis to extend later. You don't need to solve all problems at once, as
long as what you implement is a useful subset.

 Now, with all that said, that doesn't mean there's not a useful project
 for you buried in this mess.  The first two steps in your plan:
 
 1) create materialized view
 2) change rewriter
 
 Include building a prototype grammer, doing an initial executor
 implementation, and getting some sort of rewriter working.  That is
 potentially good groundwork to lay here.  I would suggest that you
 completely drop your step 3:
 
 3) create command that takes snapshot (refresh MV)
 
 Because you cannot built that in a way that will be useful (and by that
 I mean committable quality) until there's a better way to handle updates
 than writing a whole new table and grabbing a full relation lock to
 switch to it.  To do a good job just on the first two steps should take
 at least a whole summer anyway--there's a whole stack of background
 research needed I haven't seen anyone do yet, and that isn't on your
 plan yet.  There is a precedent for taking this approach.  After getting
 stalled trying to add the entirety of easy partitioning to PostgreSQL,
 the current scope has been scaled back to just trying to get the syntax
 and on-disk structure right, then finish off the implementation.  See
 http://wiki.postgresql.org/wiki/Table_partitioning to get an idea how
 that's been broken into those two major chunks.

The good thing about this subject for GSoC is that it can be divided
into many small steps. There's two largely independent main parts:

1. Keep the materialized view up-to-date when the base tables change.
This can be further divided into many steps, you can begin by supporting
automatic updates only on very simple views with e.g a single table and
a where clause. Then extend that to support joins, aggregates,
subqueries etc. Keeping it really limited, you could even require the
user to write the required triggers himself.

2. Teach the planner to use materialized views automatically when a
query references the base tables. So if you issue the query SELECT *
FROM table WHERE foo  10 AND bar = 10, and there's a materialized view
on SELECT * FROM table WHERE bar = 10, the planner can transform the
original query into SELECT * FROM materializedview WHERE foo  10.
This largely depends on 1, although some DBMSs offer the option to use
manually refreshed materialized views too, knowing that they might not
be completely up-to-date.

There's a lot room to choose which problems you want to tackle, which is
good for a summer-of-code project. Your proposal basically describes
doing 1, in a limited fashion where the view is not updated
automatically, but only when the DBA runs a command to refresh it. I'm
not sure if that's useful enough on its own, writing CREATE
MATERIALIZED VIEW ... SELECT ... doesn't seem any easier than just
writing CREATE TABLA AS  But if you can do something about 2, or
even a very limited part of 1, keeping the view up-to-date
automatically, it becomes much more useful.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set LC_COLLATE to de_DE_phoneb

2010-04-10 Thread Martijn van Oosterhout
On Thu, Apr 08, 2010 at 06:59:44PM +0200, Frank Jagusch wrote:
 I thought PostgreSQL is using OS specific functions for sorting (means
 Win32 functions?). This sounds not good for my request. So I ask the
 whole audience: Are others out there asking for a support for the
 alternate sort orders? Is it worth to discuss further in this direction?
 
 Here an other Idea: Is there a way to define a custom collation for a
 database?

Over the years there have been various options suggested and various
patches posted but they never make it, for various reasons.

The only concerted effort I know of is the PostgreSQL ICU patch which
is in FreeBSD ports, to work around the fact that it's C library
doesn't handle UTF-8 collation at all. 

http://people.freebsd.org/~girgen/postgresql-icu/README.html

I doubt it's been tested for Windows, but if it works it will give you
alternate sort orders and even custom sort orders.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: Digital signature


[HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Boszormenyi Zoltan
Hi,

I wanted to test HS/SR and have setup two instances, one primary
and one secondary, the secondary is obviously a copy of the primary
while pg_start_backup() was in effect.

I started up the secondary server after SELECT pg_stop_backup() on
the primary. I stopped and started the primary with -m fast and
-m immediate and I noticed that the sequence that was created for
my serial field was inflated if I used -m immediate.

Here's the scenario:

- primary and secondary are running, then:

zozo=# create table t1 (id serial primary key, t text);
NOTICE:  CREATE TABLE will create implicit sequence t1_id_seq for
serial column t1.id
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index t1_pkey
for table t1
CREATE TABLE
zozo=# insert into t1 (t) values ('a');
INSERT 0 1
zozo=#

- stop the primary with -m fast (the connection was still alive to it)
  and start it again, then:

zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type help for help.

zozo=# select * from t1;
 id | t
+---
  1 | a
(1 row)

zozo=# insert into t1 (t) values ('b');
INSERT 0 1
zozo=# select * from t1;
 id | t
+---
  1 | a
  2 | b
(2 rows)

- stop the primary with -m immediate (connection was alive on it)
  and start it again, then:

zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type help for help.

zozo=# select * from t1;
 id | t
+---
  1 | a
  2 | b
(2 rows)

zozo=# insert into t1 (t) values ('b');
INSERT 0 1
zozo=# select * from t1;
 id | t
+---
  1 | a
  2 | b
 35 | b
(3 rows)

The above is quite reproducable, pg_ctl stop -m immediate
usually inflated my serial sequence, but I had two occasions
when not. The 69 - 70 was one. The inflated increase is always 33:

[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type help for help.

zozo=# insert into t1 (t) values ('f');
INSERT 0 1
zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
(7 rows)

Let's try with a plain sequence:

zozo=# create sequence s1;
CREATE SEQUENCE
zozo=# select nextval('s1');
 nextval
-
   1
(1 row)

I stopped the primary at this point with -m immediate,
and from this first result I thought that a plain sequence is
not bothered by this:

zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type help for help.

zozo=# select nextval('s1');
 nextval
-
   2
(1 row)

zozo=# insert into t1 (t) values ('g');
INSERT 0 1
zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
 136 | g
(8 rows)

But another restart and:

zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type help for help.

zozo=# select nextval('s1');
 nextval
-
  35
(1 row)

zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
 136 | g
(8 rows)

zozo=# insert into t1 (t) values ('h');
INSERT 0 1
zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
 136 | g
 169 | h
(9 rows)

It happened with a CVS version of about 2 weeks ago and the
yesterday's version, as well. I think it's not intentional, it must be
a race somewhere, as it doesn't happen all the time.

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil. (Matthew 5:37) - basics of digital technology.
May your kingdom come - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Boszormenyi Zoltan
Boszormenyi Zoltan írta:
 Hi,

 I wanted to test HS/SR and have setup two instances, one primary
 and one secondary, the secondary is obviously a copy of the primary
 while pg_start_backup() was in effect.

 I started up the secondary server after SELECT pg_stop_backup() on
 the primary. I stopped and started the primary with -m fast and
 -m immediate and I noticed that the sequence that was created for
 my serial field was inflated if I used -m immediate.

 Here's the scenario:

 - primary and secondary are running, then:

 zozo=# create table t1 (id serial primary key, t text);
 NOTICE:  CREATE TABLE will create implicit sequence t1_id_seq for
 serial column t1.id
 NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index t1_pkey
 for table t1
 CREATE TABLE
 zozo=# insert into t1 (t) values ('a');
 INSERT 0 1
 zozo=#

 - stop the primary with -m fast (the connection was still alive to it)
   and start it again, then:

 zozo=# \q
 [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
 psql (9.0devel)
 Type help for help.

 zozo=# select * from t1;
  id | t
 +---
   1 | a
 (1 row)

 zozo=# insert into t1 (t) values ('b');
 INSERT 0 1
 zozo=# select * from t1;
  id | t
 +---
   1 | a
   2 | b
 (2 rows)

 - stop the primary with -m immediate (connection was alive on it)
   and start it again, then:

 zozo=# \q
 [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
 psql (9.0devel)
 Type help for help.

 zozo=# select * from t1;
  id | t
 +---
   1 | a
   2 | b
 (2 rows)

 zozo=# insert into t1 (t) values ('b');
 INSERT 0 1
 zozo=# select * from t1;
  id | t
 +---
   1 | a
   2 | b
  35 | b
 (3 rows)

 The above is quite reproducable, pg_ctl stop -m immediate
 usually inflated my serial sequence, but I had two occasions
 when not. The 69 - 70 was one. The inflated increase is always 33:

 [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
 psql (9.0devel)
 Type help for help.

 zozo=# insert into t1 (t) values ('f');
 INSERT 0 1
 zozo=# select * from t1;
  id  | t
 -+---
1 | a
2 | b
   35 | b
   36 | c
   69 | d
   70 | e
  103 | f
 (7 rows)

 Let's try with a plain sequence:

 zozo=# create sequence s1;
 CREATE SEQUENCE
 zozo=# select nextval('s1');
  nextval
 -
1
 (1 row)

 I stopped the primary at this point with -m immediate,
 and from this first result I thought that a plain sequence is
 not bothered by this:

 zozo=# \q
 [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
 psql (9.0devel)
 Type help for help.

 zozo=# select nextval('s1');
  nextval
 -
2
 (1 row)

 zozo=# insert into t1 (t) values ('g');
 INSERT 0 1
 zozo=# select * from t1;
  id  | t
 -+---
1 | a
2 | b
   35 | b
   36 | c
   69 | d
   70 | e
  103 | f
  136 | g
 (8 rows)

 But another restart and:

 zozo=# \q
 [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
 psql (9.0devel)
 Type help for help.

 zozo=# select nextval('s1');
  nextval
 -
   35
 (1 row)

 zozo=# select * from t1;
  id  | t
 -+---
1 | a
2 | b
   35 | b
   36 | c
   69 | d
   70 | e
  103 | f
  136 | g
 (8 rows)

 zozo=# insert into t1 (t) values ('h');
 INSERT 0 1
 zozo=# select * from t1;
  id  | t
 -+---
1 | a
2 | b
   35 | b
   36 | c
   69 | d
   70 | e
  103 | f
  136 | g
  169 | h
 (9 rows)

 It happened with a CVS version of about 2 weeks ago and the
 yesterday's version, as well. I think it's not intentional, it must be
 a race somewhere, as it doesn't happen all the time.

 Best regards,
 Zoltán Böszörményi
   

And to show that it doesn't happen with -m fast, I tried it three times.
Before quitting from psql, I stopped the primary with -m fast each time:

[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type help for help.

zozo=# insert into t1 (t) values ('i');
INSERT 0 1
zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type help for help.

zozo=# insert into t1 (t) values ('i');
INSERT 0 1
zozo=# \q
[z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo
psql (9.0devel)
Type help for help.

zozo=# insert into t1 (t) values ('i');
INSERT 0 1
zozo=# select * from t1;
 id  | t
-+---
   1 | a
   2 | b
  35 | b
  36 | c
  69 | d
  70 | e
 103 | f
 136 | g
 169 | h
 170 | i
 171 | i
 172 | i
(12 rows)

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil. (Matthew 5:37) - basics of digital technology.
May your kingdom come - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Bernd Helmle



--On 10. April 2010 09:26:41 +0200 Boszormenyi Zoltan z...@cybertec.at 
wrote:



The above is quite reproducable, pg_ctl stop -m immediate
usually inflated my serial sequence, but I had two occasions
when not. The 69 - 70 was one. The inflated increase is always 33:


AFAIKS sequences are pre-logged with 32 values to WAL to avoid overhead. I 
suspect this is why you are seeing those gaps.


--
Thanks

Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] a faster compression algorithm for pg_dump

2010-04-10 Thread Joachim Wieland
On Fri, Apr 9, 2010 at 5:51 AM, Greg Stark gsst...@mit.edu wrote:
 Linking against as an option isn't nearly as bad since the user
 compiling it can choose whether to include the restricted feature or
 not. That's what we do with readline. However it's not nearly as
 attractive when it restricts what file formats Postgres supports -- it
 means someone might generate backup dump files that they later
 discover they don't have a legal right to read and restore :(

If we only linked against it, we'd leave it up to the user to weigh
the risk as long as we are not aware of any such violation.

Our top priority is to make sure that the project would not be harmed
if one day such a patent showed up. If I understood you correctly,
this is not an issue, even if we included lzf and less again if we
only link against it. The rest is about user education and using lzf
only in pg_dump and not for toasting, we could show a message in
pg_dump if lzf is chosen to make the user aware of the possible
issues.

If we still cannot do this, then what I am asking is: What does the
project need to be able to at least link against such a compression
algorithm? Is it a list of 10, 20, 50 or more other projects using it
or is it a lawyer saying: There is no patent.? But then, how can we
be sure that the lawyer is right? Or couldn't we include it even if we
had both, because again, we couldn't be sure... ?


Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Boszormenyi Zoltan
Bernd Helmle írta:


 --On 10. April 2010 09:26:41 +0200 Boszormenyi Zoltan z...@cybertec.at
 wrote:

 The above is quite reproducable, pg_ctl stop -m immediate
 usually inflated my serial sequence, but I had two occasions
 when not. The 69 - 70 was one. The inflated increase is always 33:

 AFAIKS sequences are pre-logged with 32 values to WAL to avoid
 overhead. I suspect this is why you are seeing those gaps.

Then it should happen all the time, even with -m fast or -m smart, no?

It seemed like my sequences have a CACHE 32 setting, which would
apply to every client that connects, runs nextval() once and disconnects.

But it didn't happen all the time, so it's not deterministic.

-- 
Bible has answers for everything. Proof:
But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil. (Matthew 5:37) - basics of digital technology.
May your kingdom come - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] master in standby mode croaks

2010-04-10 Thread Robert Haas
On Fri, Apr 2, 2010 at 5:36 AM, Simon Riggs si...@2ndquadrant.com wrote:
 I can't duplicate this error based upon what you have said.

I fooled around with this some more and I think I know what's going
on.  The error message I received was:

recovery connections cannot start because the recovery_connections
parameter is disabled on the WAL source server

This is generated when !checkPoint.XLogStandbyInfoMode.  That, in
turn, is set on the master to the results of XLogStandbyInfoActive(),
which is defined as XLogRequestRecoveryConnections  XLogIsNeeded().
XLogIsNeeded() is defined as XLogArchivingActive() || (max_wal_senders
 0), and XLogArchivingActive() is defined as XLogArchiveMode.  So
when you expand it all out, this error message gets triggered when the
following condition does not hold on the master:

XLogRequestRecoveryConnections  (XLogArchiveMode || (max_wal_senders  0))

So this can fail in either of two ways: (1)
XLogRequestRecoveryConnections (aka recovery_connections) might be
false, which is the situation described in the error message, or (2)
XLogArchiveMode (archive_mode) might be false and at the same time
max_wal_senders might be zero.  As it happens, the default
configuration of the system is recovery_connections = true,
archive_mode = false, max_wal_senders = 0, so with an out-of-the-box
config it fails for the reason that isn't the one described in the
error message.

One possible approach here is to improve the error message, but it
seems to me that having the ability of Hot Standby to run on the slave
partially controlled by three different GUCs is awfully complicated.
I think the root of the problem here is that recovery_connections
controls one behavior on the primary (whether or not we WAL-log
certain information needed for HS) and a completely unrelated behavior
on the standby (whether or not we try to allow read-only backends into
the system).  In 8.4 and prior, it was always the job of archive_mode
to decide whether WAL-logging was needed.  Maybe we should go back to
that and make it an enum:

wal_mode = {standby | archive | off}

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Andrew Dunstan



Heikki Linnakangas wrote:


1. Keep the materialized view up-to-date when the base tables change.
This can be further divided into many steps, you can begin by supporting
automatic updates only on very simple views with e.g a single table and
a where clause. Then extend that to support joins, aggregates,
subqueries etc. Keeping it really limited, you could even require the
user to write the required triggers himself.

  


That last bit doesn't strike me as much of an advance. Isn't the whole 
point of this to automate it? Creating greedy materialized views is 
usually not terribly difficult now, but you do have to write the triggers.


The other thing that could be interesting about this would be some 
scheme for lazy refresh that didn't involve re-extracting the whole data 
set.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Robert Haas
2010/4/10 Andrew Dunstan and...@dunslane.net:
 Heikki Linnakangas wrote:

 1. Keep the materialized view up-to-date when the base tables change.
 This can be further divided into many steps, you can begin by supporting
 automatic updates only on very simple views with e.g a single table and
 a where clause. Then extend that to support joins, aggregates,
 subqueries etc. Keeping it really limited, you could even require the
 user to write the required triggers himself.

 That last bit doesn't strike me as much of an advance. Isn't the whole point
 of this to automate it? Creating greedy materialized views is usually not
 terribly difficult now, but you do have to write the triggers.

Yeah, I agree.

 The other thing that could be interesting about this would be some scheme
 for lazy refresh that didn't involve re-extracting the whole data set.

One way to do this would be to infer a primary key for the result set
based on the input query.  But I think we don't really have the
infrastructure to do this right now, so not really a project for a
beginner.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Virtual Private Database

2010-04-10 Thread Jean-Gérard Pailloncy
Hello,

1) VPD: Virtual Private Database
I would appreciate to have a new feature in PostgreSQL.
This is an oracle-like feature that implement Row Level Security.
This feature may be emulated by using VIEW/RULE but this is very time
consuming and error prone.

I would appreciated to have an estimated of the faisability and the cost
to implement it.

2) Description
The feature may be implemented with a simple expression associated to the
table.

ALTER TABLE table_name ADD FILTER filter_name CHECK(expression);
ALTER TABLE table_name DROP FILTER filter_name;

Usage/example:
ALTER TABLE filtered_table ADD FILTER tf_username
CHECK(filtered_table.creator=user)
SELECT * FROM filtered_table;
will really do
SELECT * FROM filtered_table WHERE filtered_table.creator=user;

Same thing for INSERT, UDPATE, and DELETE

UPDATE filtered_table SET b_column=1 WHERE a_column='a';
wille really do
UPDATE filtered_table SET b_column=1 WHERE a_column='a' and
filtered_table.creator=user;

In practice, the devs will create few function: my_login, my_logout,
my_filter
and the simple filtered_table.creator=user will be replace by ACL
encapsulated in the function my_filter and add a triger to check data on
INSERT, UDPATE.
We could use veil to build a very efficient filter.

3) Question
- Is it doable ?
- Is it the sound way of doing it ?
- Is it possible to have it in core ?
- Is there a pgsql dev interested to implemented it ?
- Is there other people interested in such feature ?
- How much this will cost ?
- With which delay ?


Cordialement,
Jean-Gérard Pailloncy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Yeb Havinga

Jeff Davis wrote:

To give some context, I started a thread a while ago:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php
  
Interesting, a join type for overlaps, which makes me think a bit of the 
staircase join for pre-post coordinates. However, does a join operator 
type need certain kinds of properties of the operator involved, e.g. 
being commutative, transitive etc? Else the join reordering fails. The 
latter fails for the overlap operator.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 12:05 PM, Yeb Havinga yebhavi...@gmail.com wrote:
 Jeff Davis wrote:

 To give some context, I started a thread a while ago:

 http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php


 Interesting, a join type for overlaps, which makes me think a bit of the
 staircase join for pre-post coordinates. However, does a join operator type
 need certain kinds of properties of the operator involved, e.g. being
 commutative, transitive etc? Else the join reordering fails. The latter
 fails for the overlap operator.

I don't think I follow this.  As far as I know, the join order
constraints don't depend on the choice of operator.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Kevin Grittner
Greg Smith  wrote:
 
 And work on MERGE support is itself blocked behind the fact that
 PostgreSQL doesn't have a good way to lock access to a key value
 that doesn't exist yet--what other databases call key range
 locking.
 
The bulk of the serializable implementation WIP is work to implement
just this sort of locking.  There are already a couple possible
spin-off uses on the horizon based on the ability of these locks to
survive their initiating transactions and detect conflicting writes.
Both spinoffs involve somehow flagging a transaction as being one for
which the locks should be kept until further notice, and issuing a
notification when a conflicting write occurs.  That seems consistent
with the needs of materialized views, too.
 
It probably won't be solid in time to be useful for GSoC, but if
someone's looking to map out a plan for materialized views, I thought
this information might be germane.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Yeb Havinga

Robert Haas wrote:

On Sat, Apr 10, 2010 at 12:05 PM, Yeb Havinga yebhavi...@gmail.com wrote:
  

Jeff Davis wrote:


To give some context, I started a thread a while ago:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php

  

Interesting, a join type for overlaps, which makes me think a bit of the
staircase join for pre-post coordinates. However, does a join operator type
need certain kinds of properties of the operator involved, e.g. being
commutative, transitive etc? Else the join reordering fails. The latter
fails for the overlap operator.



I don't think I follow this.  As far as I know, the join order
constraints don't depend on the choice of operator.
  
I was thinking of a case for instance for ranges a,b,c in relations 
A,B,C respectively, where  a  b and b  c, but not a  c. Would the 
planner consider a join path of table A and C first, then that result 
with B. After looking in doxygen, it looks like having  defined 
without MERGES is what prevents this unwanted behaviour, since that 
prevents a,b and c to become members of the same equivalence class. 
Sorry for the spam on the list.


regards,
Yeb Havinga



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Jeff Davis
On Sat, 2010-04-10 at 20:25 +0200, Yeb Havinga wrote:
 I was thinking of a case for instance for ranges a,b,c in relations 
 A,B,C respectively, where  a  b and b  c, but not a  c. Would the 
 planner consider a join path of table A and C first, then that result 
 with B. After looking in doxygen, it looks like having  defined 
 without MERGES is what prevents this unwanted behaviour, since that 
 prevents a,b and c to become members of the same equivalence class. 

Interesting, I would have to make sure that didn't happen. Most likely
there would be a new property like RANGEMERGES, it wouldn't reuse the
existing MERGES property.

 Sorry for the spam on the list.

Not at all, it's an interesting point.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Robert Haas
On Fri, Apr 9, 2010 at 5:49 PM, Jeff Davis pg...@j-davis.com wrote:
 On Thu, 2010-04-08 at 22:29 -0400, Robert Haas wrote:
 1. knngist wants to use index scans to speed up queries of the form
 SELECT ... ORDER BY column op constant (as opposed to the
 existing machinery which only knows how to use an index for SELECT ...
 ORDER BY column).
 2. Window functions want to define windows over a range of values
 defined by the underlying data type.  To do this, we need to define
 what addition and subtraction mean for a particular data type.
 3. Jeff Davis is interested in implementing range types.  When the
 underlying base type is discrete, e.g. integers, you can say that
 [1,3] = [1,4), but only if you know that 3 and 4 are consecutive (in
 that order).

 To give some context, I started a thread a while ago:

 http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php

 Tom provided some interesting suggestions in that thread, but I'm not
 sure they would work for #1 or #2.

rereads thread

The map  to  case is interesting.  It doesn't seem like it's
really a good candidate for type interfaces, because you you're not
really looking for the strictly-left-of operator; you're looking for
the strictly-left-of operator associated with the overlaps operator
actually specified.  And ideally there might be an index strategy
number available for , too, so that you could consider doing an
index scan instead of a sort, but not necessarily.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 2:30 PM, Jeff Davis pg...@j-davis.com wrote:
 On Sat, 2010-04-10 at 20:25 +0200, Yeb Havinga wrote:
 I was thinking of a case for instance for ranges a,b,c in relations
 A,B,C respectively, where  a  b and b  c, but not a  c. Would the
 planner consider a join path of table A and C first, then that result
 with B. After looking in doxygen, it looks like having  defined
 without MERGES is what prevents this unwanted behaviour, since that
 prevents a,b and c to become members of the same equivalence class.

 Interesting, I would have to make sure that didn't happen. Most likely
 there would be a new property like RANGEMERGES, it wouldn't reuse the
 existing MERGES property.

 Sorry for the spam on the list.

 Not at all, it's an interesting point.

Yeah... I agree.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences

2010-04-10 Thread Martijn van Oosterhout
On Sat, Apr 10, 2010 at 02:36:41PM +0200, Boszormenyi Zoltan wrote:
  The above is quite reproducable, pg_ctl stop -m immediate
  usually inflated my serial sequence, but I had two occasions
  when not. The 69 - 70 was one. The inflated increase is always 33:
 
  AFAIKS sequences are pre-logged with 32 values to WAL to avoid
  overhead. I suspect this is why you are seeing those gaps.
 
 Then it should happen all the time, even with -m fast or -m smart, no?

Nope, because on a normal shutdown it writes out the actual value. When
you say immediate you mean right now, don't bother with anything not
important, like for example gaps in sequences. You're essentially
crashing the DB.

Have a ncie day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] Virtual Private Database

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 10:00 AM, Jean-Gérard Pailloncy j...@rilk.com wrote:
 1) VPD: Virtual Private Database
 I would appreciate to have a new feature in PostgreSQL.
 This is an oracle-like feature that implement Row Level Security.
 This feature may be emulated by using VIEW/RULE but this is very time
 consuming and error prone.

 I would appreciated to have an estimated of the faisability and the cost
 to implement it.

 2) Description
 The feature may be implemented with a simple expression associated to the
 table.

 ALTER TABLE table_name ADD FILTER filter_name CHECK(expression);
 ALTER TABLE table_name DROP FILTER filter_name;

 Usage/example:
 ALTER TABLE filtered_table ADD FILTER tf_username
 CHECK(filtered_table.creator=user)
 SELECT * FROM filtered_table;
 will really do
 SELECT * FROM filtered_table WHERE filtered_table.creator=user;

 Same thing for INSERT, UDPATE, and DELETE

 UPDATE filtered_table SET b_column=1 WHERE a_column='a';
 wille really do
 UPDATE filtered_table SET b_column=1 WHERE a_column='a' and
 filtered_table.creator=user;

 In practice, the devs will create few function: my_login, my_logout,
 my_filter
 and the simple filtered_table.creator=user will be replace by ACL
 encapsulated in the function my_filter and add a triger to check data on
 INSERT, UDPATE.
 We could use veil to build a very efficient filter.

 3) Question
 - Is it doable ?
 - Is it the sound way of doing it ?
 - Is it possible to have it in core ?
 - Is there a pgsql dev interested to implemented it ?
 - Is there other people interested in such feature ?
 - How much this will cost ?
 - With which delay ?

This is very similar to the design I've been thinking about for
row-level security.

Here is a pointer to a previous email thread on the topic of row-level security.

http://archives.postgresql.org/pgsql-hackers/2009-12/msg01095.php

Before row-level security can be implemented, we'd need to fix the
problem described here:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01346.php

With respect to sponsoring development of new features, it can
certainly be done.  Any such feature could not at this point be added
any sooner than PostgreSQL 9.1, and I'd recommend that if you want to
see it in 9.1 you should try to get a contract with someone in place
in the next few months.  To get a price, you'd need to contact a
PostgreSQL support/development company or an individual developer.
The following web page might give you some ideas where to start
looking.

http://www.postgresql.org/support/professional_support

There's sort of an understanding that we don't talk about contracts or
pricing on this list, so that the content remains technical rather
than commercial.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Virtual Private Database

2010-04-10 Thread Joseph Adams
On Sat, Apr 10, 2010 at 10:00 AM, Jean-Gérard Pailloncy j...@rilk.com wrote:
 Hello,

 1) VPD: Virtual Private Database
 I would appreciate to have a new feature in PostgreSQL.
 This is an oracle-like feature that implement Row Level Security.
 This feature may be emulated by using VIEW/RULE but this is very time
 consuming and error prone.

 I would appreciated to have an estimated of the faisability and the cost
 to implement it.

 2) Description
 The feature may be implemented with a simple expression associated to the
 table.

 ALTER TABLE table_name ADD FILTER filter_name CHECK(expression);
 ALTER TABLE table_name DROP FILTER filter_name;

 Usage/example:
 ALTER TABLE filtered_table ADD FILTER tf_username
 CHECK(filtered_table.creator=user)
 SELECT * FROM filtered_table;
 will really do
 SELECT * FROM filtered_table WHERE filtered_table.creator=user;

 Same thing for INSERT, UDPATE, and DELETE

 UPDATE filtered_table SET b_column=1 WHERE a_column='a';
 wille really do
 UPDATE filtered_table SET b_column=1 WHERE a_column='a' and
 filtered_table.creator=user;

 In practice, the devs will create few function: my_login, my_logout,
 my_filter
 and the simple filtered_table.creator=user will be replace by ACL
 encapsulated in the function my_filter and add a triger to check data on
 INSERT, UDPATE.
 We could use veil to build a very efficient filter.

 3) Question
 - Is it doable ?
 - Is it the sound way of doing it ?
 - Is it possible to have it in core ?
 - Is there a pgsql dev interested to implemented it ?
 - Is there other people interested in such feature ?
 - How much this will cost ?
 - With which delay ?


 Cordialement,
 Jean-Gérard Pailloncy


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


This is quite similar to an idea I posted about not long ago called
access control jails (
http://archives.postgresql.org/pgsql-hackers/2010-03/msg00832.php ).
I encountered this very problem writing a simple web application
involving access control.  There are a variety of ways to make
implementing access control easier, and I think copying off of Oracle
would be among the best ;-)

Disclaimer:  I am not a PostgreSQL hacker, but a newbie with some
experience in other communities, absorbing what he can.

Access control jailing, as I conceived it, would not simply filter per
table, but would make it so all queries would be filtered.  If used
correctly, it would even be safe to execute untrusted SQL (though it
might not be advisable).

I looked at the Veil demo application a tiny bit, and the only thing I
drew from it was the impression that it makes things more complicated,
not less :(  Then again, it may just be the example demonstrating a
lot of different features at once.

One problem that ought to be addressed for any of these ideas is how
to do connection-local variables.  For instance:

 UPDATE filtered_table SET b_column=1 WHERE a_column='a';
 wille really do
 UPDATE filtered_table SET b_column=1 WHERE a_column='a' and
 filtered_table.creator=user;

Here, what is =user referring to?  I suppose it is a variable that
is set not long after the session starts and only applies to that
session?  PostgreSQL has temporary tables and such, but you can't
reference them until they're already created.  Hence, I don't think
PostgreSQL elegantly supports free variables that are bound
temporarily per connection.  There are GUCs and such, but using them
for this purpose is far from elegant, if I understand correctly.

Another problem is that session-local context doesn't go well with
connection pooling, so you might need some workaround like passing
context IDs back and forth.

That's my own summary of the discussion about access control jails linked above.

By the way, here's a hack to bind a free variable to a session:

CREATE FUNCTION get_user_id() RETURNS INT AS $$
DECLARE
ret INT;
BEGIN
SELECT INTO ret id FROM user_id_tbl;
RETURN ret;
END
$$ LANGUAGE 'plpgsql';

Then, per-session:

CREATE TEMPORARY TABLE user_id_tbl (id INT);
INSERT INTO user_id_tbl VALUES (5);
SELECT get_user_id();

It relies on plpgsql not complaining about user_id_tbl not existing at
creation time.  What this trick allows one to do is set the user ID
once (e.g. after connecting), then views and such that call
get_user_id() will have the appropriate user ID without needing to
specify it per-query.

I'm curious:  is this trick a good idea?  Does connection pooling play
well with temporary tables (and thus this trick)?  Could it result in
substantial slowdowns (I don't see why it should, since get_user_id()
needs to be called once per query that uses it)?  I guess creating a
temporary table every connection has the potential to be slow.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Greg Smith

Heikki Linnakangas wrote:

Your proposal basically describes
doing 1, in a limited fashion where the view is not updated
automatically, but only when the DBA runs a command to refresh it. I'm
not sure if that's useful enough on its own, writing CREATE
MATERIALIZED VIEW ... SELECT ... doesn't seem any easier than just
writing CREATE TABLA AS  But if you can do something about 2, or
even a very limited part of 1, keeping the view up-to-date
automatically, it becomes much more useful.
  


You've hit upon the core issue here.  You can build materialized views 
right now using CREATE TABLE AS.  You can even update them by creating 
a new table the same way, with a new name, and doing the 
LOCK/RENAME/DROP shuffle--what people used to do for rewriting bloated 
tables before there was CLUSTER.  The first step in the proposal here is 
essentially syntax to give an easier UI for that.  That's an interesting 
step, but recognize that it doesn't actually provide anything you can't 
do already.


If you then note that doing any sort of incremental update to the view 
is a hard problem, and that a lot of the useful cases for materialized 
views involve tables where it's impractical to recreate the whole thing 
anyway, you'll inevitably find yourself deeply lost in the minutia of 
how to handle the updates.  It's really the core problem in building 
what people expect from a materialized view implementation in a serious 
database.  Chipping away at the other pieces around it doesn't move the 
feature that far forward, even if you get every single one of them 
except incremental updates finished, because everything else combined is 
still not that much work in comparison to the issues around updates.


There certainly are a fair number of subproblems you can break out of 
here.  I just think it's important to recognize that the path that leads 
to a useful GSoC project and the one that gives a production quality 
materialized view implementation may not have that much in common, and 
to manage expectations on both sides accordingly.  If Pavel thinks he's 
going to end up being able to say I added materialized views to 
PostgreSQL at the end of the summer, that's going to end in 
disappointment.  And if people think this project plan will lead to 
being able to claim PostgreSQL now has this feature, that's also not 
going to go well.  If the scope is add initial grammar and rewriting 
moving toward a future materialized view feature, which the underlying 
implementation noted as a stub prototype, that might work out OK.  This 
is why I likened it to the work on Syntax for partitioning, which has 
a similarly focused subgoal structure.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Greg Smith

Robert Haas wrote:

It's not obvious to me
that a brief full-table lock wouldn't be acceptable for an initial
implementation.  Obviously it wouldn't be suitable for every use case
but since we're talking about manually refreshed views that was bound
to be true anyway.
  


There already is an initial implementation of sorts.  There are a couple 
of ways you can build these right now, so any new development has to 
look like it will end with good odds of being an improvement over what's 
already available before it's worth putting development resources into.


As a rough idea of what people want these for in the field, based on 
what I've seen requests for, imagine that someone has a 1TB table 
they're materializing a view on in order to get at least a 10:1, and 
hopefully close to a 100:1, speedup on viewing summary data.  Now, 
picture what happens if you have someone doing a sequential scan on the 
MV, which is still quite big, the updater process lines up to grab an 
exclusive lock when it's done, and now a second user wanting to read a 
single row quickly comes along behind it.  Given a full-table lock 
implementation, that scenario is unlikely to play out with the second 
user getting a fast response.  They'll likely sit in a lock queue for 
some potentially long period of time instead, waiting for the active seq 
scan to finish then the update to happen.  You have to build it that way 
or a steady stream of people reading could block out updates forever.


To be frank, that makes for a materalized view implementation of little 
value over what you can currently do as far as I'm concerned.  It might 
be interesting as a prototype, but that's not necessarily going to look 
like what's needed to do this for real at all.  I'm not a big fan of 
dumping work into projects when you can see exactly how it's going to 
fail before you even get started.  As I see if, if you know where it's 
going to fall down, you don't need to build a prototype as an exercise 
to show you how to build it--you should work on that part first instead.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 11:40 PM, Greg Smith g...@2ndquadrant.com wrote:
 To be frank, that makes for a materalized view implementation of little
 value over what you can currently do as far as I'm concerned.  It might be
 interesting as a prototype, but that's not necessarily going to look like
 what's needed to do this for real at all.  I'm not a big fan of dumping work
 into projects when you can see exactly how it's going to fail before you
 even get started.  As I see if, if you know where it's going to fall down,
 you don't need to build a prototype as an exercise to show you how to build
 it--you should work on that part first instead.

Hopefully, you're already aware that I have enormous respect for your
opinions on a wide variety of topics; if not, let me publicly say that
I absolutely do.

Having said that, I disagree with your conclusions in this instance.
I see nothing but upside from this work.  It is vastly easier to write
a patch that builds on existing functionality than it is to write
something new from scratch.  If there's any value in having manually
refreshed materialized views, then having the simplest possible
implementation of what those can look like committed will make it far
easier to plan out next steps.  While the proposed implementation may
not solve a huge number of real-world problems, I think there's a good
argument that some people will get good use of it.  Not everyone has
1TB tables with continuous access patterns.  And, provided that it
doesn't conflict with anything we want to do in the future, being
useful to some people is a good enough reason to put it in.

I also think that you're underestimating the number of problems that
will have to be solved to get this done.  It's going to take some
significant work - both design work and coding work - to figure out
how this should integrate into the rest of the system.  (What should
be the value of pg_class.relkind?  Where should the node
representation of the snapshot query be stored?  And did we handle all
of those OID dependencies correctly?)

Where I can see this possibly falling down (other than being just too
much work for a relative PostgreSQL novice to get it done in one
summer) is if there are concerns about it being incompatible with
incrementally-updated views.  I imagine that we're going to want to
eventually support both, so we need to make sure that this
implementation doesn't box us into a corner.  But as far as snapshot
views go, complaining that the proposed locking is too strong doesn't
seem quite fair.  Fixing that, AFAICS, is a very hard project,
possibly involving significant planner support and an implementation
of MERGE, and I would much rather try to land a fundamentals patch
like this first and then deal with the gyrations that will be involved
in making this work than try to land the whole thing all at once.

Of course, if I'm missing something, and there's a SIMPLE way to get
materialized views that can be refreshed without a full-table lock,
that's another story altogether - maybe you have an idea?

Finally, even if we decided NOT to merge this patch because of the
limitations you mention (and right now that doesn't seem to be the
consensus), having this part of it completed as a starting point for
future work might be reason enough by itself.

In short: I think you may be letting the perfect be the enemy of the good.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers