Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Fujii Masao
On Mon, May 24, 2010 at 10:20 PM, Fujii Masao  wrote:
> In this first design phase, I'd like to clarify which synch levels
> should be supported in 9.1 and how they should be specified by users.

There is another question about synch level:

When should the master wait for replication?

In my current design, the backend waits for replication only at
the end of the transaction commit. Is this enough? Are there other
points where we should wait?

For example, should a smart or fast shutdown on the master wait
for the shutdown checkpoint record to be replicated to the standby
(in 9.0, shutdown waits only for the checkpoint record to be *sent*)?
Should pg_switch_xlog() wait for all of the original WAL file to
be replicated?

I'm not sure whether the above two "waits-for-replication" have use
cases, so I suspect they are not worth implementing, but..

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ROLLBACK TO SAVEPOINT

2010-05-24 Thread Sam Vilain
The note at the end of

http://www.postgresql.org/docs/8.4/static/sql-savepoint.html

led us to believe that if you roll back to the same savepoint name
twice in a row, you might start walking back through the
savepoints.  I guess I missed the note on ROLLBACK TO SAVEPOINT
explaining that that is not how it works.

Here is the section:

SQL requires a savepoint to be destroyed automatically when another
savepoint with the same name is established. In PostgreSQL, the old
savepoint is kept, though only the more recent one will be used when
rolling back or releasing. (Releasing the newer savepoint will cause the
older one to again become accessible to ROLLBACK TO SAVEPOINT and
RELEASE SAVEPOINT.) Otherwise, SAVEPOINT is fully SQL conforming.

I think it could be improved by also communicating:

Rolling back to a savepoint never releases it; you can safely repeat
ROLLBACK TO SAVEPOINT statements without unwinding the transaction, even
if you are re-using savepoint names.
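For illustration, a minimal session showing the behavior described above (table and savepoint names are made up):

```sql
BEGIN;
CREATE TABLE t (i int);
SAVEPOINT sp;                -- first savepoint named "sp"
INSERT INTO t VALUES (1);
SAVEPOINT sp;                -- second "sp"; the first is kept but shadowed
INSERT INTO t VALUES (2);
ROLLBACK TO SAVEPOINT sp;    -- undoes the insert of 2; "sp" is NOT released
ROLLBACK TO SAVEPOINT sp;    -- safe to repeat: still the newer "sp"
RELEASE SAVEPOINT sp;        -- now the older "sp" becomes accessible again
COMMIT;
```

Repeating the ROLLBACK never unwinds past the newer "sp"; only RELEASE exposes the older one.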

Well, maybe no-one else will ever have the misconception I did, but
there it is.

Sam.



Re: [HACKERS] unnailing shared relations (was Re: global temporary tables)

2010-05-24 Thread Tom Lane
Robert Haas  writes:
> On Mon, May 24, 2010 at 5:37 PM, Alvaro Herrera wrote:
>> Excerpts from Robert Haas's message of lun may 24 17:18:21 -0400 2010:
>>> Well, I might be missing something here, but pg_class already IS
>>> database-specific.  If you change anything very significant about a
>>> shared rel in one copy of pg_class today, you're toast, IIUC.  This
>>> proposal doesn't make that any better, but I don't think it makes it
>>> any worse either.
>> 
>> I thought the whole point of this exercise was precisely to avoid this
>> sort of problem.

> Short answer: Nope.

In practice, it's very difficult to change anything about a system
catalog anyway, because so many of its properties are baked into the
behavior of the C code.  Whether there's a single copy of the catalog
rows is the least of your worries there.

> Long answer: It would be nice to do that, but in order to accomplish
> that we would need to create pg_shared_ for all relevant pg_
> and teach the backend code to check both tables in every case.  That
> seemed hard, so I suggested just duplicating the entries, thereby
> giving processes like the autovacuum launcher the ability to look at
> any shared relation without it needing to be nailed, but not actually
> solving the whole problem.

I hadn't been paying that much attention to this thread, but it's
sounding to me like it's based on a false premise.  Having a shared copy
of the catalog entries for a shared catalog would accomplish little or
nothing in terms of eliminating nailed relcache entries.  You might be
able to de-nail pg_database and friends, but only at the cost of instead
nailing up entries for "pg_shared_class" and friends.  Which seems to me
like a net step backwards in terms of the maintenance overhead for
relcache entries.

If we really cared about not nailing these (and I'm not aware of any
evidence that we should care), it would probably be more useful to try
to institute an intermediate level of nailing ("stapling"?
"scotch-taping"?) that locked the entry into memory only until we'd
finished bootstrapping the backend.

regards, tom lane



Re: [HACKERS] unnailing shared relations (was Re: global temporary tables)

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 5:37 PM, Alvaro Herrera  wrote:
> Excerpts from Robert Haas's message of lun may 24 17:18:21 -0400 2010:
>> On Mon, May 24, 2010 at 4:23 PM, Alvaro Herrera wrote:
>> > Excerpts from Robert Haas's message of vie may 21 10:20:38 -0400 2010:
>
>> > Uh, how does this work when you change the entries for shared relations
>> > in a database-specific pg_class?  Keeping everything in sync seems hard,
>> > if not impossible.
>>
>> Well, I might be missing something here, but pg_class already IS
>> database-specific.  If you change anything very significant about a
>> shared rel in one copy of pg_class today, you're toast, IIUC.  This
>> proposal doesn't make that any better, but I don't think it makes it
>> any worse either.
>
> I thought the whole point of this exercise was precisely to avoid this
> sort of problem.

Short answer: Nope.

Long answer: It would be nice to do that, but in order to accomplish
that we would need to create pg_shared_ for all relevant pg_
and teach the backend code to check both tables in every case.  That
seemed hard, so I suggested just duplicating the entries, thereby
giving processes like the autovacuum launcher the ability to look at
any shared relation without it needing to be nailed, but not actually
solving the whole problem.

It may be a bad idea.  It was just a thought.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Fujii Masao
On Tue, May 25, 2010 at 10:29 AM, Josh Berkus  wrote:
> I agree that #4 should be done last, but it will be needed, not least
> by your employer ;-).  I don't see any obvious way to make #4
> compatible with any significant query load on the slave, but in general
> I'd think that users of #4 are far more concerned with 0% data loss than
> they are with getting the slave to run read queries.

Since #2 and #3 are enough for 0% data loss, I think that such users
would be more concerned about what results are visible on the standby.
No?

> What we should do is specify it per-standby, and then have a USERSET GUC
> on the master which specifies which transactions will be synched, and
> those will be synched only on the slaves which are set up to support
> synch.  That is, if you have:
>
> Master
> Slave #1: synch
> Slave #2: not synch
> Slave #3: not synch
>
> And you have:
> Session #1: synch
> Session #2: not synch
>
> Session #1 will be synched on Slave #1 before commit.  Nothing will be
> synched on Slaves 2 and 3, and session #2 will not wait for synch on any
> slave.
>
> I think this model delivers the maximum HA flexibility to users while
> still making intuitive logical sense.

This makes sense.

Since such a boolean GUC flag is relatively easy and simple to implement
compared with "per-transaction" levels (which have four valid values: #1,
#2, #3 and #4), I'll do that.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] (9.1) btree_gist support for searching on "not equals"

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 11:25 PM, Jeff Davis  wrote:
> I think the original case (same values only) is potentially useful
> enough that we should support it.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Regression testing for psql

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 10:51 PM, Stephen Frost  wrote:
> * Stephen Frost (sfr...@snowman.net) wrote:
>>     Add regression testing for psql backslash commands
>>
>>     This patch adds rather extensive regression testing
>>     of the psql backslash commands.  Hopefully this will
>>     minimize issues such as the one which cropped up
>>     recently with \h segfaulting.  Note that we don't
>>     currently explicitly check all the \h options, and that
>>     pretty much any catalog change will mean that this
>>     also needs to be updated.  Still, it's a start; we can
>>     reduce the set of tests if that makes sense or they
>>     become a problem.
>
> And..  it's way too big to send to the list.  The patch is available
> here:
>
> http://snowman.net/~sfrost/psql-regress-help.patch
>
> Of course, if people want to suggest tests that just shouldn't be
> included, I can go through and strip things out.

Well...  I'm a little reluctant to believe that we should have 3.3M of
tests for the entire backend and 5M of tests just for psql.  Then,
too, there's the fact that many of these tests fail on my machine
because my username is not sfrost, and/or because of row-ordering
differences on backslash commands without enough ORDER BY to fully
determine the output order.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Bruce Momjian
Robert Haas wrote:
> On Mon, May 24, 2010 at 5:50 PM, Bruce Momjian  wrote:
> > What is your point?
> 
> My point is that I think Stefan has a good point when he says this:
> 
> >> >> >> hmm, that seems better, thanks. However, I just noticed that we
> >> >> >> don't have a "general limitations" section. The way the docs are
> >> >> >> now done suggests that there are no limitations at all (except for
> >> >> >> the two warnings in the migration guide). Is pg_upgrade really up
> >> >> >> to the point where it can fully replace pg_dump & pg_restore,
> >> >> >> independent of the loaded (contrib) or even third-party modules
> >> >> >> (like PostGIS or custom datatypes, etc.)?
> 
> I think he is quite right to be concerned about these issues, and if
> the limitations in this area are not well documented so that users can
> easily be aware of them, then IMHO that is something we should
> correct.

Have you read the docs?  It does mention the issue with /contrib and
stuff.  How do I document a limitation I don't know about?  This is all
very vague.  Please suggest some wording.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] (9.1) btree_gist support for searching on "not equals"

2010-05-24 Thread Jeff Davis
On Sat, 2010-05-22 at 01:02 +0300, Marko Tiikkaja wrote:
> On 5/21/10 11:47 PM +0300, Jeff Davis wrote:
> > It also allows you to enforce the constraint that only one tuple exists
> > in a table by doing something like:
> >
> >create table a
> >(
> >  i int,
> >  exclude using gist (i with<>),
> >  unique (i)
> >);
> 
> FWIW, this is achievable a lot more easily:
> CREATE UNIQUE INDEX "a_single_row" ON a ((1));
> 

Yes, you're right. Also, neither of us accounted for NULLs, so I suppose
a NOT NULL is necessary as well.
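Putting the two observations together, a sketch of what the "at most one row" table could look like. Note this assumes the proposed btree_gist support for <> discussed in this thread (it is not available in released versions), and the CREATE EXTENSION syntax is the 9.1-era form:

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;  -- assumed to provide <> support

-- At most one row: any two rows would have either equal i (blocked by
-- UNIQUE) or unequal i (blocked by the <> exclusion constraint).
CREATE TABLE a (
    i int NOT NULL,                 -- closes the multiple-NULLs loophole
    EXCLUDE USING gist (i WITH <>),
    UNIQUE (i)
);

-- The simpler alternative from the thread: a unique index on a constant
-- expression, which is never NULL, so it needs no NOT NULL at all.
CREATE UNIQUE INDEX a_single_row ON a ((1));
```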

I think the original case (same values only) is potentially useful
enough that we should support it.

Regards,
Jeff Davis




Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread Tom Lane
Stephen Frost  writes:
> ... It makes me wonder if COPY shouldn't have been implemented using
> the Executor instead, but that's, again, a completely separate topic.
> It wasn't, but it wants to play like it operates in the same kind of way
> as INSERT, so it needs to pick up the slack.

FWIW, we've shifted COPY more towards using executor support over the
years.  I'm pretty sure that it didn't originally use the executor's
index-entry-insertion infrastructure, for instance.

Building an RT entry seems like a perfectly sane thing to do in order
to make it use the executor's permissions infrastructure.

regards, tom lane



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 5:50 PM, Bruce Momjian  wrote:
> What is your point?

My point is that I think Stefan has a good point when he says this:

>> >> >> hmm, that seems better, thanks. However, I just noticed that we don't
>> >> >> have a "general limitations" section. The way the docs are now done
>> >> >> suggests that there are no limitations at all (except for the two
>> >> >> warnings in the migration guide). Is pg_upgrade really up to the point
>> >> >> where it can fully replace pg_dump & pg_restore, independent of the
>> >> >> loaded (contrib) or even third-party modules (like PostGIS or custom
>> >> >> datatypes, etc.)?

I think he is quite right to be concerned about these issues, and if
the limitations in this area are not well documented so that users can
easily be aware of them, then IMHO that is something we should
correct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 9:27 PM, Stephen Frost  wrote:
> * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
>> We have two options: if the checker function takes the list of RangeTblEntry,
>> it will be convenient for ExecCheckRTPerms(), but not for DoCopy(). Conversely,
>> if the checker function takes the arguments in my patch, it will be convenient
>> for DoCopy(), but not for ExecCheckRTPerms().
>>
>> In my patch it takes 6 arguments, but we can reference all of them from
>> the given RangeTblEntry. On the other hand, if DoCopy() has to set up
>> a pseudo RangeTblEntry to call the checker function, it will need to set
>> up a similar, or slightly larger, number of variables.
>
> I don't know that it's really all that difficult to set up an RT in
> DoCopy or RI_Initial_Check().  In my opinion, those are the strange or
> corner cases, not the Executor code, through which all 'regular' DML is
> done.  It makes me wonder if COPY shouldn't have been implemented using
> the Executor instead, but that's, again, a completely separate topic.
> It wasn't, but it wants to play like it operates in the same kind of way
> as INSERT, so it needs to pick up the slack.

I think this approach is definitely worth investigating.  KaiGai, can
you please work up what the patch would look like if we do it this
way?

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread Robert Haas
2010/5/24 KaiGai Kohei :
> I think we need a new SPI_*() interface which allows the given query to be
> run without any permission checks; these queries are purely internal, so we
> can trust that they are harmless.
[...]
> I'm afraid that src/backend/catalog/aclchk.c will become overcrowded
> in the future. If it is ugly to deploy checker functions in separate
> dirs/files, one idea is to put them in execMain.c, instead of
> ExecCheckRTEPerms().

Both of these are bad ideas, for reasons Stephen Frost has articulated well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread KaiGai Kohei
(2010/05/25 10:27), Stephen Frost wrote:
> * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
>> We have two options: if the checker function takes the list of RangeTblEntry,
>> it will be convenient for ExecCheckRTPerms(), but not for DoCopy(). Conversely,
>> if the checker function takes the arguments in my patch, it will be convenient
>> for DoCopy(), but not for ExecCheckRTPerms().
>>
>> In my patch it takes 6 arguments, but we can reference all of them from
>> the given RangeTblEntry. On the other hand, if DoCopy() has to set up
>> a pseudo RangeTblEntry to call the checker function, it will need to set
>> up a similar, or slightly larger, number of variables.
> 
> I don't know that it's really all that difficult to set up an RT in
> DoCopy or RI_Initial_Check().  In my opinion, those are the strange or
> corner cases, not the Executor code, through which all 'regular' DML is
> done.  It makes me wonder if COPY shouldn't have been implemented using
> the Executor instead, but that's, again, a completely separate topic.
> It wasn't, but it wants to play like it operates in the same kind of way
> as INSERT, so it needs to pick up the slack.
> 

Yes, it is not difficult to set up.
The reason I prefer a checker function that takes 6 arguments is that
DoCopy() / RI_Initial_Check() would otherwise have to set up a RangeTblEntry
in addition to the bitmap sets; we don't have any other significant reason.

OK, let's add a hook in ExecCheckRTPerms().

>>>> * RI_Initial_Check()
>>
>> It seems to me the permission checks in RI_Initial_Check() are a bit ad hoc.
> 
> I agree with this; my proposal would address it in a way which is
> guaranteed to be consistent: by using the same code path to do both
> checks.  I'm still not thrilled with how RI_Initial_Check() works, but
> rewriting that isn't part of this.

I agree to ignore the problem for now.
It implicitly assumes the owner has SELECT privilege on the FK/PK tables,
so the minimal SELinux module will also implicitly assume the client has
similar permissions on them.

>> In this case, we should execute the secondary query without permission
>> checks, because the permissions for ALTER TABLE have already been checked,
>> and we can trust that the given query is harmless.
> 
> I dislike the idea of providing a new SPI interface (on the face of
> it), and also dislike the idea of having a "skip all permissions
> checking" option for anything which resembles SPI.  I would rather ask
> whether it really makes sense to use SPI to check FKs as
> they're being added, but we're not going to solve that issue here.

Apart from the topic of this thread, I guess using SPI allows us to reuse
query optimization and cascaded triggers to implement FK constraints
with minimal code size.

Thanks
-- 
KaiGai Kohei 



Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread KaiGai Kohei
(2010/05/25 10:13), Stephen Frost wrote:
> KaiGai,
> 
> * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
>>postgres=# ALTER TABLE pk_tbl OWNER TO ymj;
>>ALTER TABLE
>>postgres=# ALTER TABLE fk_tbl OWNER TO ymj;
>>ALTER TABLE
>>postgres=# REVOKE ALL ON pk_tbl, fk_tbl FROM ymj;
>>REVOKE
>>postgres=# GRANT REFERENCES ON pk_tbl, fk_tbl TO ymj;
>>GRANT
>>
>> At that time, the 'ymj' has ownership and REFERENCES permissions on
>> both of pk_tbl and fk_tbl. In this case, RI_Initial_Check() shall return
>> and the fallback-seqscan will run. But,
> 
> ymj may be considered an 'owner' on that table, but in this case, it
> doesn't have SELECT rights on it.  Now, you might argue that we should
> assume that the owner has SELECT rights (since they're granted by
> default), even if they've been revoked, but that's a whole separate
> issue.

Yes, it is an entirely separate issue. I don't intend to argue whether
we can assume that the default PG permissions allow the owner to SELECT
from the table, or not.

>>postgres=>  ALTER TABLE fk_tbl ADD FOREIGN KEY (x) REFERENCES pk_tbl (a);
>>ERROR:  permission denied for relation pk_tbl
>>CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."pk_tbl" x WHERE "a" 
>> OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"
> 
> I think you've got another issue here that's not related.  Perhaps
> something wrong with a patch you've applied?  Otherwise, what version of
> PG is this?  Using 8.2, 8.3, 8.4 and a recent git checkout, I get:
> 
> postgres=# CREATE USER ymj;
> CREATE ROLE
> postgres=# CREATE TABLE pk_tbl (a int primary key, b text);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "pk_tbl_pkey" 
> for table "pk_tbl"
> CREATE TABLE
> postgres=# CREATE TABLE fk_tbl (x int, y text);
> CREATE TABLE
> postgres=# ALTER TABLE pk_tbl OWNER TO ymj;
> ALTER TABLE
> postgres=# ALTER TABLE fk_tbl OWNER TO ymj;
> ALTER TABLE
> postgres=# REVOKE ALL ON pk_tbl, fk_tbl FROM ymj;
> REVOKE
> postgres=# GRANT REFERENCES ON pk_tbl, fk_tbl TO ymj;
> GRANT
> postgres=# SET ROLE ymj;
> SET
> postgres=>  ALTER TABLE fk_tbl ADD FOREIGN KEY (x) REFERENCES pk_tbl (a);
> ALTER TABLE
> postgres=>

Sorry, I forgot to copy & paste the INSERT statements just after CREATE TABLE.

The secondary RI_FKey_check_ins() is invoked during the while() loop over
heap_getnext(), so it is not called for an empty table.

For completeness, here is the full reproduction:

  postgres=# CREATE USER ymj;
  CREATE ROLE
  postgres=# CREATE TABLE pk_tbl (a int primary key, b text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "pk_tbl_pkey" 
for table "pk_tbl"
  CREATE TABLE
  postgres=# CREATE TABLE fk_tbl (x int, y text);
  CREATE TABLE
| postgres=# INSERT INTO pk_tbl VALUES (1,'aaa'), (2,'bbb'), (3,'ccc');
| INSERT 0 3
| postgres=# INSERT INTO fk_tbl VALUES (1,'xxx'), (2,'yyy'), (3,'zzz');
| INSERT 0 3
  postgres=# ALTER TABLE pk_tbl OWNER TO ymj;
  ALTER TABLE
  postgres=# ALTER TABLE fk_tbl OWNER TO ymj;
  ALTER TABLE
  postgres=# REVOKE ALL ON pk_tbl, fk_tbl FROM ymj;
  REVOKE
  postgres=# GRANT REFERENCES ON pk_tbl, fk_tbl TO ymj;
  GRANT
  postgres=# SET ROLE ymj;
  SET
  postgres=> ALTER TABLE fk_tbl ADD FOREIGN KEY (x) REFERENCES pk_tbl (a);
  ERROR:  permission denied for relation pk_tbl
  CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."pk_tbl" x WHERE "a" 
OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"

I could reproduce it on 8.4.4, but didn't try prior releases.

Thanks,
-- 
KaiGai Kohei 



Re: [HACKERS] Regression testing for psql

2010-05-24 Thread Stephen Frost
* Stephen Frost (sfr...@snowman.net) wrote:
> Add regression testing for psql backslash commands
> 
> This patch adds rather extensive regression testing
> of the psql backslash commands.  Hopefully this will
> minimize issues such as the one which cropped up
> recently with \h segfaulting.  Note that we don't
> currently explicitly check all the \h options, and that
> pretty much any catalog change will mean that this
> also needs to be updated.  Still, it's a start; we can
> reduce the set of tests if that makes sense or they
> become a problem.

And..  it's way too big to send to the list.  The patch is available
here:

http://snowman.net/~sfrost/psql-regress-help.patch

Of course, if people want to suggest tests that just shouldn't be
included, I can go through and strip things out.

Thanks,

Stephen




Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Fujii Masao
On Tue, May 25, 2010 at 1:18 AM, Heikki Linnakangas wrote:
> I see a lot of value in #4; it makes it possible to distribute read-only
> load to the standby using something like pgbouncer, completely transparently
> to the application.

Agreed.

> In the lesser modes, the application can see slightly
> stale results.

Yes.

BTW, even if we have #4, we need to be careful: the transaction commit
might become visible on the standby before the master returns "success"
to the client. I think we will never get completely transaction-consistent
results from the standby until we have implemented the "snapshot cloning"
feature.
http://wiki.postgresql.org/wiki/ClusterFeatures#Export_snapshots_to_other_sessions

> But whatever we can easily implement, really. Pick one that you think is the
> easiest and start with that, but keep the other modes in mind in the design
> and in the user interface so that you don't paint yourself in the corner.

Yep, the design and implementation of #2 and #3 should be easily
extensible to #4. I'll keep that in mind.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



[HACKERS] Clearing psql's input buffer after auto-reconnect

2010-05-24 Thread Tom Lane
We determined that $SUBJECT would be a good idea in this thread:
http://archives.postgresql.org/pgsql-bugs/2010-05/msg00159.php

I looked a bit at what it would take to make this happen.  The
difficulty is that the input buffer is a local variable in MainLoop(),
and so are a bunch of other subsidiary variables that would need to be
reset along with it.  The place where auto-reconnect presently happens
is CheckConnection(), which is in a different file and is also several
levels of subroutine call away from MainLoop.  AFAICS there are three
ways that we might attack this:

1. Massive restructuring of the code in common.c so that the fact of
a connection reset having happened can be returned back to MainLoop.

2. Export much of MainLoop's internal state as globals, so that
CheckConnection can hack on it directly.

3. Have CheckConnection do longjmp(sigint_interrupt_jmp) after resetting
the connection, to force control to go back to MainLoop directly.
MainLoop is already coded to clear its local state after catching a
longjmp.

Now #1 might be the best long-term solution but I have no particular
appetite to tackle it, and #2 is just too ugly to contemplate.  That
leaves #3, which is a bit ugly in its own right but seems like the best
fix we're likely to get.

Comments, better ideas?

regards, tom lane



Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Josh Berkus

> #4 is useful for some cases, but it might often make transaction
> commits on the master get stuck, since read-only queries can easily
> block recovery through lock conflicts. So #4 seems not worth working
> on until that HS problem has been addressed. Thought?

I agree that #4 should be done last, but it will be needed, not least
by your employer ;-).  I don't see any obvious way to make #4
compatible with any significant query load on the slave, but in general
I'd think that users of #4 are far more concerned with 0% data loss than
they are with getting the slave to run read queries.

> Second, we need to discuss how to specify the synch
> level. There are three approaches:
> 
> * Per standby
> 
> * Per transaction

Ach, I'm torn.  I can see strong use cases for both of the above.
Really, I think:

> * Mix
>   Allow users to specify the level per standby and
>   transaction at the same time, and then calculate the real
>   level from them by using some algorithm.

What we should do is specify it per-standby, and then have a USERSET GUC
on the master which specifies which transactions will be synched, and
those will be synched only on the slaves which are set up to support
synch.  That is, if you have:

Master
Slave #1: synch
Slave #2: not synch
Slave #3: not synch

And you have:
Session #1: synch
Session #2: not synch

Session #1 will be synched on Slave #1 before commit.  Nothing will be
synched on Slaves 2 and 3, and session #2 will not wait for synch on any
slave.

I think this model delivers the maximum HA flexibility to users while
still making intuitive logical sense.
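As a rough sketch of what this model might look like to an administrator (all parameter names below are hypothetical placeholders, not settled syntax):

```sql
-- Hypothetical per-standby setting (e.g. in each standby's recovery.conf):
--   Slave #1:  synchronous = on
--   Slave #2:  synchronous = off
--   Slave #3:  synchronous = off

-- Hypothetical USERSET GUC on the master:
SET synchronous_replication = on;   -- session #1: COMMIT waits for Slave #1
-- ... and in another session:
SET synchronous_replication = off;  -- session #2: COMMIT waits for no slave
```

Under this sketch a commit waits only where both sides opt in: the session requests sync and the standby is configured for it.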

> Which should we adopt for 9.1? I'd like to implement the
> "per-standby" approach at first since it's simple and seems
> to cover more use cases. Thought?

If people agree that the above is our roadmap, implementing
"per-standby" first makes sense, and then we can implement "per-session"
GUC later.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread Stephen Frost
* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
> We have two options: if the checker function takes the list of RangeTblEntry,
> it will be convenient for ExecCheckRTPerms(), but not for DoCopy(). Conversely,
> if the checker function takes the arguments in my patch, it will be convenient
> for DoCopy(), but not for ExecCheckRTPerms().
> 
> In my patch it takes 6 arguments, but we can reference all of them from
> the given RangeTblEntry. On the other hand, if DoCopy() has to set up
> a pseudo RangeTblEntry to call the checker function, it will need to set
> up a similar, or slightly larger, number of variables.

I don't know that it's really all that difficult to set up an RT in
DoCopy or RI_Initial_Check().  In my opinion, those are the strange or
corner cases, not the Executor code, through which all 'regular' DML is
done.  It makes me wonder if COPY shouldn't have been implemented using
the Executor instead, but that's, again, a completely separate topic.
It wasn't, but it wants to play like it operates in the same kind of way
as INSERT, so it needs to pick up the slack.

> As I replied in the earlier message, it may be an idea to rename and change
> the definition of ExecCheckRTEPerms() without moving it anywhere.

And, again, I don't see that as a good idea at all.

> >> * RI_Initial_Check()
> 
> It seems to me the permission checks in RI_Initial_Check() are a bit ad hoc.

I agree with this; my proposal would address it in a way which is
guaranteed to be consistent: by using the same code path to do both
checks.  I'm still not thrilled with how RI_Initial_Check() works, but
rewriting that isn't part of this.

> In this case, we should execute the secondary query without permission checks,
> because the permissions for ALTER TABLE have already been checked, and we can
> trust that the given query is harmless.

I dislike the idea of providing a new SPI interfance (on the face of
it), and also dislike the idea of having a "skip all permissions
checking" option for anything which resembles SPI.  I would rather ask
the question of whether it really makes sense to use SPI to check FKs as
they're being added, but we're not going to solve that issue here.

Thanks,

Stephen




Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Tom Lane
Florian Pflug  writes:
> The subtle point here is whether you consider the view from the "outside" (in 
> the sense of what a read-only transaction started at an arbitrary time can or 
> cannot observe), or from the "inside" (what updating transactions can observe 
> and might base their updates on).

> The former case is completely determined by the commit ordering of the 
> transactions, while the latter is not - otherwise serializability wouldn't be 
> such a hard problem.

BTW, doesn't all this logic fall in a heap as soon as you consider
read-committed transactions?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread Stephen Frost
KaiGai,

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
>   postgres=# ALTER TABLE pk_tbl OWNER TO ymj;
>   ALTER TABLE
>   postgres=# ALTER TABLE fk_tbl OWNER TO ymj;
>   ALTER TABLE
>   postgres=# REVOKE ALL ON pk_tbl, fk_tbl FROM ymj;
>   REVOKE
>   postgres=# GRANT REFERENCES ON pk_tbl, fk_tbl TO ymj;
>   GRANT
> 
> At that time, the 'ymj' has ownership and REFERENCES permissions on
> both of pk_tbl and fk_tbl. In this case, RI_Initial_Check() shall return false
> and the fallback seqscan will run. But,

ymj may be considered an 'owner' on that table, but in this case, it
doesn't have SELECT rights on it.  Now, you might argue that we should
assume that the owner has SELECT rights (since they're granted by
default), even if they've been revoked, but that's a whole separate
issue.

>   postgres=> ALTER TABLE fk_tbl ADD FOREIGN KEY (x) REFERENCES pk_tbl (a);
>   ERROR:  permission denied for relation pk_tbl
>   CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."pk_tbl" x WHERE "a" 
> OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"

I think you've got another issue here that's not related.  Perhaps
something wrong with a patch you've applied?  Otherwise, what version of
PG is this?  Using 8.2, 8.3, 8.4 and a recent git checkout, I get:

postgres=# CREATE USER ymj;
CREATE ROLE
postgres=# CREATE TABLE pk_tbl (a int primary key, b text);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "pk_tbl_pkey" 
for table "pk_tbl"
CREATE TABLE
postgres=# CREATE TABLE fk_tbl (x int, y text);
CREATE TABLE
postgres=# ALTER TABLE pk_tbl OWNER TO ymj;
ALTER TABLE
postgres=# ALTER TABLE fk_tbl OWNER TO ymj;
ALTER TABLE
postgres=# REVOKE ALL ON pk_tbl, fk_tbl FROM ymj;
REVOKE
postgres=# GRANT REFERENCES ON pk_tbl, fk_tbl TO ymj;
GRANT
postgres=# SET ROLE ymj;
SET
postgres=> ALTER TABLE fk_tbl ADD FOREIGN KEY (x) REFERENCES pk_tbl (a);
ALTER TABLE
postgres=> 

> I think we need a new SPI_*() interface which allows running the given query
> without any permission checks, because these queries are purely internal
> stuff, so we can trust the query is harmless.
> Is there any other idea?

Yeah, I don't see that going anywhere...

> I'm afraid that src/backend/catalog/aclchk.c will become overcrowded
> in the future. If it is ugly to deploy checker functions in a separate
> dir/files, I think it is an idea to put it in execMain.c, instead of
> ExecCheckRTEPerms().

No, this is not a service of the executor, putting it in execMain.c does
not make any sense.

> It also suggests where the checker functions should be deployed in the
> upcoming DDL reworks. In a similar way, will we deploy them in
> src/backend/command/pg_database, for example?

We'll worry about DDL when we get there.  It won't be any time soon.  I
would strongly recommend that you concentrate on building an SELinux
module using the hook function that Robert wrote or none of this is
going to end up going anywhere.  If and when we find other places which
handle DML and need adjustment, we can identify how to handle them at
that time.

Hopefully by the time we're comfortable with DML, some of the DDL
permissions checking rework will have been done and how to move forward
with allowing external security modules to handle that will be clear.

Thanks,

Stephen




Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread KaiGai Kohei
(2010/05/25 4:11), Stephen Frost wrote:
> * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
>> I'd like to point out two more points are necessary to be considered
>> for DML permission checks in addition to ExecCheckRTPerms().
>>
>> * DoCopy()
>>
>> Although DoCopy() is called from standard_ProcessUtility(), it performs
>> as a DML statement, rather than DDL. It checks ACL_SELECT or ACL_INSERT on
>> the copied table or attributes, similar to what ExecCheckRTEPerms() doing.
> 
> Rather than construct a complicated API for this DML activity, why don't
> we just make ExecCheckRTPerms available for DoCopy to use?  It seems
> like we could move ExecCheckRTPerms() to acl.c without too much trouble.
> acl.h already includes parsenodes.h and has knowledge of RangeVar's.
> Once DoCopy is using that, this issue resolves itself with the hook that
> Robert already wrote up.

We have two options: if the checker function takes a list of RangeTblEntry,
it will be convenient for ExecCheckRTPerms(), but not for DoCopy(). Inversely,
if the checker function takes the arguments in my patch, it will be convenient
for DoCopy(), but not for ExecCheckRTPerms().

In my patch, it takes 6 arguments, but we can reference all of them from
the given RangeTblEntry. On the other hand, if DoCopy() has to set up
a pseudo RangeTblEntry to call the checker function, it needs to set up
a similar or somewhat larger number of variables.

As I replied in the earlier message, it may be an idea to rename and change
the definition of ExecCheckRTEPerms() without moving it anywhere.

>> * RI_Initial_Check()
>>
>> RI_Initial_Check() is a function called on ALTER TABLE command to add FK
>> constraints between two relations. The permission to execute this ALTER TABLE
>> command itself is checked on ATPrepCmd() and ATAddForeignKeyConstraint(),
>> so it does not affect anything on the DML permission reworks.
> 
> I'm not really thrilled with how RI_Initial_Check() does it's own
> permissions checking and then calls SPI expecting things to 'just work'.
> Not sure if there's some way we could handle failure from SPI, or, if it
> was changed to call ExecCheckRTPerms() instead, how it would handle
> failure cases from there.  One possible solution would be to have an
> additional option to ExecCheckRTPerms() which asks it to just check and
> return false if there's a problem, rather than unconditionally calling
> aclcheck_error() whenever it finds a problem.
> 
> Using the same function for both the initial check in RI_Initial_Check()
> and then from SPI would eliminate issues where those two checks disagree
> for some reason, which would be good in the general case.

Sorry, I missed that the fallback path also needs SELECT permissions, because
validateForeignKeyConstraint() calls RI_FKey_check_ins(), which tries to
execute a SELECT statement using the SPI_*() interface.
But that is a separate issue from the DML permission reworks.

It seems to me the permission checks in RI_Initial_Check() are a bit ad hoc.
What we really want to do here is validate the new FK constraint.
So, validateForeignKeyConstraint() intends to provide fall-back code
when table-level permission is denied, doesn't it?

In this case, we should execute the secondary query without permission checks,
because the permissions for ALTER TABLE are already checked, and we can trust
the given query is harmless.

>> BTW, I guess the reason why permissions on attributes are not checked here is
>> that we missed it at v8.4 development.
> 
> Indeed, but at the same time, this looks to be a 'fail-safe' situation.
> Basically, this is checking table-level permissions, which, if you have,
> gives you sufficient rights to SELECT against the table (any column).
> What this isn't doing is allowing the option of column-level permissions
> to be sufficient for this requirement.  That's certainly an oversight in
> the column-level permissions handling (sorry about that), but it's not
> horrible- there's a workaround if RI_Initial_Check returns false already
> anyway.

Yes, it is harmless except for performance in a corner case.
If the user has table-level permissions, we do not need to check
column-level permissions, even if they are implemented.

Thanks,
-- 
KaiGai Kohei 



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Florian Pflug
On May 25, 2010, at 0:42 , Dan Ports wrote:
> On Mon, May 24, 2010 at 10:24:07AM -0500, Kevin Grittner wrote:
>> Jan Wieck wrote:
>> 
>>> In some systems (data warehousing, replication), the order of
>>> commits is important, since that is the order in which changes
>>> have become visible.
>> 
>> This issue intersects with the serializable work I've been doing.
>> While in database transactions using S2PL the above is true, in
>> snapshot isolation and the SSI implementation of serializable
>> transactions, it's not. In particular, the snapshot anomalies which
>> can cause non-serializable behavior happen precisely because the
>> apparent order of execution doesn't match anything so linear as
>> order of commit.
> 
> All true, but this doesn't pose a problem in snapshot isolation. Maybe
> this is obvious to everyone else, but just to be clear: a transaction's
> snapshot is determined entirely by which transactions committed before
> it snapshotted (and hence are visible to it). Thus, replaying update
> transactions in the same order on a slave makes the same sequence of
> states visible to it.

The subtle point here is whether you consider the view from the "outside" (in 
the sense of what a read-only transaction started at an arbitrary time can or 
cannot observe), or from the "inside" (what updating transactions can observe 
and might base their updates on).

The former case is completely determined by the commit ordering of the 
transactions, while the latter is not - otherwise serializability wouldn't be 
such a hard problem.

For some problems, like replication, the former ("outside") view is what 
matters - if the slave synthesizes transactions that insert/update/delete the very 
same tuples as the original transaction did, and commits them in the same 
order, no read-only transaction can observe the difference. But that is *not* a 
serial schedule of the original transactions, since the transactions are *not* 
the same - they merely touch the same tuples. In fact, if you try replaying the 
original SQL, you *will* get different results on the slave, and not only 
because of now() and the like.

best regards,
Florian Pflug






Re: [HACKERS] (9.1) btree_gist support for searching on "not equals"

2010-05-24 Thread Takahiro Itagaki

Marko Tiikkaja  wrote:

> On 5/21/10 11:47 PM +0300, Jeff Davis wrote:
> > It also allows you to enforce the constraint that only one tuple exists
> > in a table by doing something like:
> >
> >create table a
> >(
> >  i int,
> >  exclude using gist (i with<>),
> >  unique (i)
> >);

+1.  I've not read the code, but it might be worth considering that we can
abort index scans once we find the first index entry for "i". While we must
scan all candidates for "WHERE i <> ?", we can abort for the constraint
case because we know all existing values are the same.

> FWIW, this is achievable a lot more easily:
> CREATE UNIQUE INDEX "a_single_row" ON a ((1));

The former exclusion constraint means "one same value for all rows",
but your alternative means "a_single_row", right?
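The difference can be made concrete with a toy check (Python, purely illustrative — the two functions below just mimic the admission rules of each constraint, not any PostgreSQL code):

```python
# Toy enforcement of the two constraints being compared:
#   EXCLUDE USING gist (i WITH <>)        -> all rows must share one value of i
#   CREATE UNIQUE INDEX a_single_row
#       ON a ((1))                        -> at most one row, full stop

def exclude_neq_ok(rows, new_i):
    """The <> exclusion rejects only if some existing row has a *different* i."""
    return all(i == new_i for i in rows)

def unique_const_ok(rows, new_i):
    """The constant-expression unique index rejects any second row at all."""
    return len(rows) == 0  # new_i is irrelevant: every row indexes as (1)

rows = [42]
print(exclude_neq_ok(rows, 42))   # True:  same value, any number of rows
print(exclude_neq_ok(rows, 7))    # False: a different value conflicts
print(unique_const_ok(rows, 42))  # False: only a single row is allowed
```

So the exclusion constraint enforces "one shared value across all rows", while the unique index on a constant enforces "one row".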

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center





Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread KaiGai Kohei
(2010/05/24 22:18), Robert Haas wrote:
> 2010/5/24 KaiGai Kohei:
>> BTW, I guess the reason why permissions on attributes are not checked here is
>> that we missed it at v8.4 development.
> 
> That's a little worrying.  Can you construct and post a test case
> where this results in a user-visible failure in CVS HEAD?

Sorry, after more detailed consideration, it seems to me the permission
checks in RI_Initial_Check() and its fallback mechanism are nonsense.

See the following commands.

  postgres=# CREATE USER ymj;
  CREATE ROLE
  postgres=# CREATE TABLE pk_tbl (a int primary key, b text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "pk_tbl_pkey" 
for table "pk_tbl"
  CREATE TABLE
  postgres=# CREATE TABLE fk_tbl (x int, y text);
  CREATE TABLE
  postgres=# ALTER TABLE pk_tbl OWNER TO ymj;
  ALTER TABLE
  postgres=# ALTER TABLE fk_tbl OWNER TO ymj;
  ALTER TABLE
  postgres=# REVOKE ALL ON pk_tbl, fk_tbl FROM ymj;
  REVOKE
  postgres=# GRANT REFERENCES ON pk_tbl, fk_tbl TO ymj;
  GRANT

At that time, the 'ymj' has ownership and REFERENCES permissions on
both of pk_tbl and fk_tbl. In this case, RI_Initial_Check() shall return false
and the fallback seqscan will run. But,

  postgres=> ALTER TABLE fk_tbl ADD FOREIGN KEY (x) REFERENCES pk_tbl (a);
  ERROR:  permission denied for relation pk_tbl
  CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."pk_tbl" x WHERE "a" 
OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"

From more careful observation of the code, the validateForeignKeyConstraint()
also calls RI_FKey_check_ins() for each scanned tuples, but it also execute
SELECT statement using SPI_*() interface internally.

In other words, both execution paths require SELECT permission
to validate the new FK constraint.


I think we need a new SPI_*() interface which allows running the given query
without any permission checks, because these queries are purely internal stuff,
so we can trust the query is harmless.
Is there any other idea?

>> The attached patch provides a common checker function of DML, and modifies
>> ExecCheckRTPerms(), CopyTo() and RI_Initial_Check() to call the checker
>> function instead of individual ACL checks.
> 
> This looks pretty sane to me, although I have not done a full review.
> I am disinclined to create a whole new directory for it.   I think the
> new function should go in src/backend/catalog/aclchk.c and be declared
> in src/include/utils/acl.h.  If that sounds reasonable to you, please
> revise and post an updated patch.
> 

I'm afraid that src/backend/catalog/aclchk.c will become overcrowded
in the future. If it is ugly to deploy checker functions in a separate
dir/files, I think it is an idea to put it in execMain.c, instead of
ExecCheckRTEPerms().

It also suggests where the checker functions should be deployed in the
upcoming DDL reworks. In a similar way, will we deploy them in
src/backend/command/pg_database, for example?

Thanks,
-- 
KaiGai Kohei 



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Dan Ports
On Mon, May 24, 2010 at 10:24:07AM -0500, Kevin Grittner wrote:
> Jan Wieck wrote:
>  
> > In some systems (data warehousing, replication), the order of
> > commits is important, since that is the order in which changes
> > have become visible.
>  
> This issue intersects with the serializable work I've been doing.
> While in database transactions using S2PL the above is true, in
> snapshot isolation and the SSI implementation of serializable
> transactions, it's not. In particular, the snapshot anomalies which
> can cause non-serializable behavior happen precisely because the
> apparent order of execution doesn't match anything so linear as
> order of commit.

All true, but this doesn't pose a problem in snapshot isolation. Maybe
this is obvious to everyone else, but just to be clear: a transaction's
snapshot is determined entirely by which transactions committed before
it snapshotted (and hence are visible to it). Thus, replaying update
transactions in the same order on a slave makes the same sequence of
states visible to it.
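This can be made concrete with a toy simulation (Python, purely illustrative): model each transaction as a pure function of the database state; replaying the committed transactions in commit order on a replica yields the identical sequence of visible states, while a different order diverges.

```python
# Toy model: a transaction is a pure function of the database state.
# If the replica applies committed transactions in the master's commit
# order, every state an observer could see is identical on both sides.

def t1(db): return {**db, "x": 1}
def t2(db): return {**db, "y": db.get("x", 0) + 1}
def t3(db): return {**db, "x": db.get("y", 0) * 10}

commit_order = [t1, t2, t3]

def replay(txns, initial):
    states, db = [], dict(initial)
    for txn in txns:
        db = txn(db)
        states.append(dict(db))  # the state visible after each commit
    return states

master_states = replay(commit_order, {})
replica_states = replay(commit_order, {})
print(master_states == replica_states)          # True: same visible sequence
print(replay([t2, t1, t3], {}) == master_states)  # False: order matters
```

Note t2 and t3 read values written by earlier transactions, which is exactly why applying them in a different order produces a different history.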

Of course (as in your example) some of these states could expose
snapshot isolation anomalies. But that's true on a single-replica
system too.


Now, stepping into the SSI world...

> Replicating or recreating the whole predicate locking and conflict
> detection on slaves is not feasible for performance reasons. (I
> won't elaborate unless someone feels that's not intuitively
> obvious.) The only sane way I can see to have a slave database allow
> serializable behavior is to WAL-log the acquisition of a snapshot by
> a serializable transaction, and the rollback or commit, on the
> master, and to have the serializable snapshot build on a slave
> exclude any serializable transactions for which there are still
> concurrent serializable transactions. Yes, that does mean WAL-
> logging the snapshot acquisition even if the transaction doesn't yet
> have an xid, and WAL-logging the commit or rollback even if it never
> acquires an xid.

One important observation is that any anomaly that occurs on the slave
can be resolved by aborting a local read-only transaction. This is a
good thing, because the alternatives are too horrible to consider.

You could possibly cut the costs of predicate locking by having the
master ship with each transaction the list of predicate locks it
acquired. But you'd still have to track locks for read-only
transactions, so maybe that's not a significant cost improvement. On
the other hand, if you're willing to pay the price of serializability
on the master, why not the slaves too?

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 5:38 PM, Dave Page  wrote:
> On Mon, May 24, 2010 at 5:20 PM, Robert Haas  wrote:
>
 It works OK for me.  The link to /contrib/pg_upgrade within the nav
 section at the top righthand corner of the page seems to work just
 fine.

 http://wiki.postgresql.org/wiki/Todo#.2Fcontrib.2Fpg_upgrade
>>>
>>> The problem is that the "Contents" menu on the top right of the page
>>> doesn't allow a clickable link to that section, and many others.
>>
>> It does for me...
>
> Doesn't here. FYI, neither do others such as 2.6, 2.7, 6.1 & 6.11

Oh, interesting.  2.6 and 2.7 don't work for me, but 6.1 and 6.11 do.
That is really odd.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Jan Wieck

On 5/24/2010 3:10 PM, Dan Ports wrote:

I'm not clear on why the total rowcount is useful, but perhaps I'm
missing something obvious.


It is a glimpse into the future. Several years of pain doing replication 
work has taught me that knowing approximately how much work the next 
chunk will be "before you select it all" is a really useful thing.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Bruce Momjian
Robert Haas wrote:
> On Mon, May 24, 2010 at 3:41 PM, Bruce Momjian  wrote:
> > Robert Haas wrote:
> >> On Mon, May 24, 2010 at 3:11 PM, Bruce Momjian  wrote:
> >> > Stefan Kaltenbrunner wrote:
> >> >> > I have updated the paragraph to be:
> >> >> >
> >> >> >     Upgrading from PostgreSQL 8.3 has additional restrictions not
> >> >> >     present when upgrading from later PostgreSQL releases.  For example,
> >> >> >     pg_upgrade will not work for a migration from 8.3 if a user column
> >> >> >     is defined as:
> >> >> >
> >> >> > Can you suggest other wording?
> >> >>
> >> >> hmm that seems better thanks, however I just noticed that we don't have
> >> >> a "general limitations" section. The way the docs are now done suggests
> >> >> that there are not limitations at all (except for the two warnings in
> >> >> the migration guide). Is pg_upgrade really up to the point where it can
> >> >> fully replace pg_dump & pg_restore independent of the loaded (contrib)
> >> >> or even third party modules(like postgis or custom datatypes etc)?
> >> >
> >> > Yea, that's about right.  I can add limitations if you want.  ;-)
> >>
> >> I don't believe this.  For one thing, I am pretty sure that if there
> >> are ABI differences between loadable modules between the old and new
> >> cluster, hilarity will ensue.
> >
> > Well, the point is that our existing code doesn't have any
> > incompatibilities that I know of.  We could certainly add some in 9.1.
> 
> Yes, or third-party vendors could add some for us.  We can't guarantee
> this in general.

What is your point?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Dave Page
On Mon, May 24, 2010 at 5:20 PM, Robert Haas  wrote:

>>> It works OK for me.  The link to /contrib/pg_upgrade within the nav
>>> section at the top righthand corner of the page seems to work just
>>> fine.
>>>
>>> http://wiki.postgresql.org/wiki/Todo#.2Fcontrib.2Fpg_upgrade
>>
>> The problem is that the "Contents" menu on the top right of the page
>> doesn't allow a clickable link to that section, and many others.
>
> It does for me...

Doesn't here. FYI, neither do others such as 2.6, 2.7, 6.1 & 6.11


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] unnailing shared relations (was Re: global temporary tables)

2010-05-24 Thread Alvaro Herrera
Excerpts from Robert Haas's message of Mon May 24 17:18:21 -0400 2010:
> On Mon, May 24, 2010 at 4:23 PM, Alvaro Herrera  
> wrote:
> > Excerpts from Robert Haas's message of Fri May 21 10:20:38 -0400 2010:

> > Uh, how does this work when you change the entries for shared relations
> > in a database-specific pg_class?  Keeping everything in sync seems hard,
> > if not impossible.
> 
> Well, I might be missing something here, but pg_class already IS
> database-specific.  If you change anything very significant about a
> shared rel in one copy of pg_class today, you're toast, IIUC.  This
> proposal doesn't make that any better, but I don't think it makes it
> any worse either.

I thought the whole point of this exercise was precisely to avoid this
sort of problem.

-- 
Álvaro Herrera 



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Jan Wieck

On 5/24/2010 12:51 PM, Kevin Grittner wrote:

Robert Haas  wrote:
 

I think you're confusing two subtly different things.
 
The only thing I'm confused about is what benefit anyone expects to
get from looking at data between commits in some way other than our
current snapshot mechanism.  Can someone explain a use case where
what Jan is proposing is better than snapshot isolation?  It doesn't
provide any additional integrity guarantees that I can see.
 

But the commit order is still the order the effects of those
transactions have become visible - if we inserted a new read-only
transaction into the stream at some arbitrary point in time, it
would see all the transactions which committed before it and none
of those that committed afterward.
 
Isn't that what a snapshot does already?


It does, and the proposal is merely an alternative serving the same purpose.

Have you ever looked at one of those queries that Londiste or Slony 
issue against the provider DB in order to get all the log data that has 
been committed between two snapshots? Is that really the best you can 
think of?



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 4:03 PM, Kevin Grittner
 wrote:
> Robert Haas  wrote:
>
>> It's a tool for replication solutions to use.
>
> I was thrown by the original post referencing "data warehousing".
> For replication I definitely see that it would be good to provide
> some facility to grab a coherent snapshot out of the transaction
> stream, but I'm still not clear on a use case where other solutions
> aren't better.  If you want a *particular* past snapshot, something
> akin to the transactional caching that Dan Ports mentioned seems
> best.  If you just want a coherent snapshot like snapshot isolation,
> the current mechanisms seem to work (unless I'm missing something?).
> If you want solid data integrity querying the most recent replicated
> data, the proposal I posted earlier in the thread is the best I can
> see, so far.

Well, AIUI, what you're really trying to do is derive the delta
between an old snapshot and a newer snapshot.

>> Can you give an example and explain how your proposal would solve
>> it?
>
> I gave an example (without rigorous proof accompanying it, granted)
> earlier in the thread.  In that example, if you allow a selection
> against a snapshot which includes the earlier commit (the update of
> the control table) and before the later commits (the receipts which
> used the old deposit date) you have exactly the kind of
> serialization anomaly which the work in progress prevents on the
> source (master) database -- the receipts *appear* to run in earlier
> transactions because the see the pre-update deposit date, but they
> show up out of order.

Yep, I see it now.

>  As far as I'm concerned this is only a
> problem if the user *requested* serializable behavior for all
> transactions involved.

Agreed.

> If we send the information I suggested in the WAL stream, then any
> slave using the WAL stream could build a snapshot for a serializable
> transaction which excluded serializable transactions from the source
> which overlap with still-pending serializable transactions on the
> source.  In this example, the update of the control table would not
> be visible to a serializable transaction on the slave until any
> overlapping serializable transactions (which would include any
> receipts using the old date) had also committed, so you could never
> see the writes out of order.
>
> I don't think that passing detailed predicate locking information
> would be feasible from a performance perspective, but since the
> slaves are read-only, I think it is fine to pass just the minimal
> transaction-level information I described.

I suspect that's still going to be sort of hard on performance, but
maybe we should get serializable working and committed on one node
first and then worry about how to distribute it.  I think there might
be other approaches to this problem (global transaction coordinator?
standby requests snapshot from primary?).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 3:41 PM, Bruce Momjian  wrote:
> Robert Haas wrote:
>> On Mon, May 24, 2010 at 3:11 PM, Bruce Momjian  wrote:
>> > Stefan Kaltenbrunner wrote:
>> >> > I have updated the paragraph to be:
>> >> >
>> >> >     Upgrading from PostgreSQL 8.3 has additional restrictions not
>> >> >     present when upgrading from later PostgreSQL releases.  For example,
>> >> >     pg_upgrade will not work for a migration from 8.3 if a user column
>> >> >     is defined as:
>> >> >
>> >> > Can you suggest other wording?
>> >>
>> >> hmm that seems better thanks, however I just noticed that we don't have
>> >> a "general limitations" section. The way the docs are now done suggests
>> >> that there are not limitations at all (except for the two warnings in
>> >> the migration guide). Is pg_upgrade really up to the point where it can
>> >> fully replace pg_dump & pg_restore independent of the loaded (contrib)
>> >> or even third party modules(like postgis or custom datatypes etc)?
>> >
>> > Yea, that's about right.  I can add limitations if you want.  ;-)
>>
>> I don't believe this.  For one thing, I am pretty sure that if there
>> are ABI differences between loadable modules between the old and new
>> cluster, hilarity will ensue.
>
> Well, the point is that our existing code doesn't have any
> incompatibilities that I know of.  We could certainly add some in 9.1.

Yes, or third-party vendors could add some for us.  We can't guarantee
this in general.

>> > The only open pg_upgrade items are the ones on our TODO list:
>> >
>> >        http://wiki.postgresql.org/wiki/Todo
>> >
>> > (I can't give you a URL hash-reference to the section because it doesn't
>> > work on Firefox and no one seems to be able to fix it.)
>>
>> It works OK for me.  The link to /contrib/pg_upgrade within the nav
>> section at the top righthand corner of the page seems to work just
>> fine.
>>
>> http://wiki.postgresql.org/wiki/Todo#.2Fcontrib.2Fpg_upgrade
>
> The problem is that the "Contents" menu on the top right of the page
> doesn't allow a clickable link to that section, and many others.

It does for me...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] unnailing shared relations (was Re: global temporary tables)

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 4:23 PM, Alvaro Herrera  wrote:
> Excerpts from Robert Haas's message of Fri May 21 10:20:38 -0400 2010:
>
>> Actually, there's another way we could do this.   Instead of creating
>> pg_shared_class and pg_shared_attribute and moving all of the catalog
>> entries for the shared relations into those tables, we could consider
>> leaving the catalog entries in the unshared copies of pg_class,
>> pg_attribute, etc. and DUPLICATING them in a shared catalog which
>> would only be used prior to selecting a database.  Once we selected a
>> database we'd switch to using the database-specific pg_class et al.
>> Obviously that's a little grotty but it might (?) be easier, and
>> possibly a step along the way.
>
> Uh, how does this work when you change the entries for shared relations
> in a database-specific pg_class?  Keeping everything in sync seems hard,
> if not impossible.

Well, I might be missing something here, but pg_class already IS
database-specific.  If you change anything very significant about a
shared rel in one copy of pg_class today, you're toast, IIUC.  This
proposal doesn't make that any better, but I don't think it makes it
any worse either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] unnailing shared relations (was Re: global temporary tables)

2010-05-24 Thread Alvaro Herrera
Excerpts from Robert Haas's message of vie may 21 10:20:38 -0400 2010:

> Actually, there's another way we could do this.   Instead of creating
> pg_shared_class and pg_shared_attribute and moving all of the catalog
> entries for the shared relations into those tables, we could consider
> leaving the catalog entries in the unshared copies of pg_class,
> pg_attribute, etc. and DUPLICATING them in a shared catalog which
> would only be used prior to selecting a database.  Once we selected a
> database we'd switch to using the database-specific pg_class et al.
> Obviously that's a little grotty but it might (?) be easier, and
> possibly a step along the way.

Uh, how does this work when you change the entries for shared relations
in a database-specific pg_class?  Keeping everything in sync seems hard,
if not impossible.

-- 
Álvaro Herrera 



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Kevin Grittner
Robert Haas  wrote:
 
> It's a tool for replication solutions to use.
 
I was thrown by the original post referencing "data warehousing".
For replication I definitely see that it would be good to provide
some facility to grab a coherent snapshot out of the transaction
stream, but I'm still not clear on a use case where other solutions
aren't better.  If you want a *particular* past snapshot, something
akin to the transactional caching that Dan Ports mentioned seems
best.  If you just want a coherent snapshot like snapshot isolation,
the current mechanisms seem to work (unless I'm missing something?).
If you want solid data integrity querying the most recent replicated
data, the proposal I posted earlier in the thread is the best I can
see, so far.
 
> if the reconstructed transaction order inferred by SSI doesn't
> match the actual commit order, can we get a serialization anomaly
> on the standby by replaying transactions there in commit order?
 
Yes.  If we don't do *something* to address it, the replicas
(slaves) will operate as read-only snapshot isolation, not true
serializable.
 
> Can you give an example and explain how your proposal would solve
> it?
 
I gave an example (without rigorous proof accompanying it, granted)
earlier in the thread.  In that example, if you allow a selection
against a snapshot which includes the earlier commit (the update of
the control table) and before the later commits (the receipts which
used the old deposit date) you have exactly the kind of
serialization anomaly which the work in progress prevents on the
source (master) database -- the receipts *appear* to run in earlier
transactions because the see the pre-update deposit date, but they
show up out of order.  As far as I'm concerned this is only a
problem if the user *requested* serializable behavior for all
transactions involved.
 
If we send the information I suggested in the WAL stream, then any
slave using the WAL stream could build a snapshot for a serializable
transaction which excluded serializable transactions from the source
which overlap with still-pending serializable transactions on the
source.  In this example, the update of the control table would not
be visible to a serializable transaction on the slave until any
overlapping serializable transactions (which would include any
receipts using the old date) had also committed, so you could never
see the writes out of order.
 
I don't think that passing detailed predicate locking information
would be feasible from a performance perspective, but since the
slaves are read-only, I think it is fine to pass just the minimal
transaction-level information I described.
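
To make that concrete, here is a toy Python sketch -- all names and the
data structure are invented for illustration, not proposed code -- of how
a slave could withhold a committed serializable transaction from its
snapshots until every serializable transaction that overlapped it on the
master has also committed:

```python
# Toy model: each commit records which serializable xacts still overlapped it.
# A commit becomes visible on the slave only once its overlap set has committed.

class CommitLog:
    def __init__(self):
        self.events = []      # (xid, xids overlapping it at commit), commit order
        self.running = set()  # serializable xids currently in progress

    def begin(self, xid):
        self.running.add(xid)

    def commit(self, xid):
        self.running.discard(xid)
        self.events.append((xid, frozenset(self.running)))

    def visible_on_slave(self):
        """Xids a serializable slave snapshot may see."""
        committed = {xid for xid, _ in self.events}
        return {xid for xid, overlap in self.events if overlap <= committed}

log = CommitLog()
log.begin("update_control_table")
log.begin("receipt_1")               # overlaps the control-table update
log.commit("update_control_table")   # commits first, but receipt_1 still runs
hidden_phase = log.visible_on_slave()  # empty: update hidden for now
log.commit("receipt_1")
final_phase = log.visible_on_slave()   # both become visible together
print(hidden_phase, final_phase)
```

So the writes can never appear out of order on the slave, at the cost of
serializable snapshots there lagging slightly behind the newest commits.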
 
-Kevin





Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-24 Thread Alvaro Herrera
Excerpts from Josh Berkus's message of vie may 21 17:57:35 -0400 2010:

> Problem: currently, if your database has a large amount of "cold" data, 
> such as 350GB of 3-year-old sales transactions, in 8.4 vacuum no longer 
> needs to touch it thanks to the visibility map.  However, every 
> freeze_age transactions, very old pages need to be sucked into memory 
> and rewritten just in order to freeze those pages.  This can have a huge 
> impact on system performance, and seems unjustified because the pages 
> are not actually being used.

I think this is nonsense.  If you have 3-year-old sales transactions,
and your database has any interesting churn, the tuples on those pages have
been frozen for a very long time *already*.  The problem is vacuum
reading them in so that it can verify there's nothing to do.  If we want
to avoid *reading* those pages, this solution is useless:

> Suggested resolution: we would add a 4-byte field to the *page* header 
> which would track the XID wraparound count.

because you still have to read the page.

I think what you're looking for is for this Xid wraparound count to be
stored elsewhere, not inside the page.  That way vacuum can read it and
skip the page without reading it altogether.  I think a "freeze map" has
been mentioned downthread.

I remember mentioning some time ago that we could declare some tables as
frozen, i.e. "not needing vacuum".  This strikes me as similar, except
at the page level rather than table level.
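
A toy sketch of that idea (purely illustrative Python, not backend code):
keep an all-frozen flag per page outside the pages themselves, so an
anti-wraparound vacuum only pays I/O for the pages that still need
freezing:

```python
# Hypothetical "freeze map": one bit per page, consulted before fetching
# the page, so all-frozen cold data costs no reads at all.

class Table:
    def __init__(self, npages):
        self.pages_read = 0
        self.frozen = [False] * npages   # the hypothetical freeze map

    def read_page(self, pageno):
        self.pages_read += 1             # stands in for the I/O cost
        return pageno

    def freeze_vacuum(self):
        for pageno, is_frozen in enumerate(self.frozen):
            if is_frozen:
                continue                 # map says skip; page never fetched
            self.read_page(pageno)       # only non-frozen pages cost I/O
            self.frozen[pageno] = True

t = Table(npages=1000)
t.frozen[:900] = [True] * 900            # cold data, already frozen
t.freeze_vacuum()
print(t.pages_read)                      # 100, not 1000
```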

-- 
Álvaro Herrera 



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Bruce Momjian
Robert Haas wrote:
> On Mon, May 24, 2010 at 3:11 PM, Bruce Momjian  wrote:
> > Stefan Kaltenbrunner wrote:
> >> > I have updated the paragraph to be:
> >> >
> >> >     Upgrading from PostgreSQL 8.3 has additional restrictions not present
> >> >     when upgrading from later PostgreSQL releases.  For example,
> >> >     pg_upgrade will not work for a migration from 8.3 if a user column
> >> >     is defined as:
> >> >
> >> > Can you suggest other wording?
> >>
> >> hmm that seems better thanks, however I just noticed that we don't have
> >> a "general limitations" section. The way the docs are now done suggests
> >> that there are not limitations at all (except for the two warnings in
> >> the migration guide). Is pg_upgrade really up to the point where it can
> >> fully replace pg_dump & pg_restore independent of the loaded (contrib)
> >> or even third party modules(like postgis or custom datatypes etc)?
> >
> > Yea, that's about right.  I can add limitations if you want.  ;-)
> 
> I don't believe this.  For one thing, I am pretty sure that if there
> are ABI differences between loadable modules between the old and new
> cluster, hilarity will ensue.

Well, the point is that our existing code doesn't have any
incompatibilities that I know of.  We could certainly add some in 9.1.

> > The only open pg_upgrade items are the ones on our TODO list:
> >
> >        http://wiki.postgresql.org/wiki/Todo
> >
> > (I can't give you a URL hash-reference to the section because it doesn't
> > work on Firefox and no one seems to be able to fix it.)
> 
> It works OK for me.  The link to /contrib/pg_upgrade within the nav
> section at the top righthand corner of the page seems to work just
> fine.
> 
> http://wiki.postgresql.org/wiki/Todo#.2Fcontrib.2Fpg_upgrade

The problem is that the "Contents" menu on the top right of the page
doesn't allow a clickable link to that section, and many others.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] [PATCH] Move 'long long' check to c.h

2010-05-24 Thread Stephen Frost
* Michael Meskes (mes...@postgresql.org) wrote:
> > I think the current coding is extremely fragile (if it indeed works at
> > all) because of its assumption that  has been included
> 
> Well, this is the case in the code so far. 

Right, the existing code is after limits.h is included, my suggestion to
put it in c.h would have lost limits.h and broken things.  Sorry about
that.  I didn't realize the dependency and make check didn't complain
(not that I'm sure there's even a way we could have a regression test
for this..).  I didn't intend to imply the currently-committed code
didn't work (I figured it was probably fine :), was just trying to tidy
a bit.

Thanks!

Stephen




Re: [HACKERS] [PATCH] Move 'long long' check to c.h

2010-05-24 Thread Michael Meskes
> I think the current coding is extremely fragile (if it indeed works at
> all) because of its assumption that  has been included

Well, this is the case in the code so far. 

> already.  In any case, we have configure tests that exist only for the
> benefit of contrib modules, so it's hard to argue that we shouldn't have
> one that exists only for ecpg.
> 
> I think we should fix this (properly) for 9.0.

Ok, I don't mind fixing it properly for 9.0. Will do so as soon as I find time.

Michael

-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber mes...@jabber.org
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 3:11 PM, Bruce Momjian  wrote:
> Stefan Kaltenbrunner wrote:
>> > I have updated the paragraph to be:
>> >
>> >     Upgrading from PostgreSQL 8.3 has additional restrictions not present
>> >     when upgrading from later PostgreSQL releases.  For example,
>> >     pg_upgrade will not work for a migration from 8.3 if a user column
>> >     is defined as:
>> >
>> > Can you suggest other wording?
>>
>> hmm that seems better thanks, however I just noticed that we don't have
>> a "general limitations" section. The way the docs are now done suggests
>> that there are not limitations at all (except for the two warnings in
>> the migration guide). Is pg_upgrade really up to the point where it can
>> fully replace pg_dump & pg_restore independent of the loaded (contrib)
>> or even third party modules(like postgis or custom datatypes etc)?
>
> Yea, that's about right.  I can add limitations if you want.  ;-)

I don't believe this.  For one thing, I am pretty sure that if there
are ABI differences between loadable modules between the old and new
cluster, hilarity will ensue.

> The only open pg_upgrade items are the ones on our TODO list:
>
>        http://wiki.postgresql.org/wiki/Todo
>
> (I can't give you a URL hash-reference to the section because it doesn't
> work on Firefox and no one seems to be able to fix it.)

It works OK for me.  The link to /contrib/pg_upgrade within the nav
section at the top righthand corner of the page seems to work just
fine.

http://wiki.postgresql.org/wiki/Todo#.2Fcontrib.2Fpg_upgrade

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Dan Ports
On Sun, May 23, 2010 at 04:21:58PM -0400, Jan Wieck wrote:
> In some systems (data warehousing, replication), the order of commits is
> important, since that is the order in which changes have become visible.
> This information could theoretically be extracted from the WAL, but
> scanning the entire WAL just to extract this tidbit of information would
> be excruciatingly painful.

This is very interesting to me as I've been doing some (research --
nowhere near production-level) work on building a transactional
application-level (i.e. memcached-like) cache atop Postgres. One of the
features I needed to support it was basically what you describe.

Without getting too far into the details of what I'm doing, I needed to
make it clear to a higher layer which commits were visible to a given
query. That is, I wanted to know both the order of commits and where
particular snapshots fit into this ordering. (A SnapshotData struct
obviously contains the visibility information, but a representation in
terms of the commit ordering is both more succinct and allows for easy
ordering comparisons).

Something you might want to consider, then, is adding an interface to
find out the timestamp of the current transaction's snapshot, i.e. the
timestamp of the most recent committed transaction visible to it. I
wouldn't expect this to be difficult to implement as transaction
completion/visibility is already synchronized via ProcArrayLock.

> Each record of the Transaction Commit Info consists of
> 
>   txid  xci_transaction_id
>   timestamptz   xci_begin_timestamp
>   timestamptz   xci_commit_timestamp
>   int64 xci_total_rowcount

Another piece of information that seems useful to provide here would be
the logical timestamp of the transaction, i.e. a counter that's
incremented by one for each transaction. But maybe that's implicit in
the log ordering?

I'm not clear on why the total rowcount is useful, but perhaps I'm
missing something obvious.
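
For what it's worth, here is a rough Python model of what I mean -- the
xci_* fields come from the proposal, but the log class and its API are
entirely invented:

```python
# A commit-ordered log where each record also carries a logical commit
# timestamp, so a snapshot can be placed within the commit ordering.

import itertools

class CommitInfoLog:
    def __init__(self):
        self.records = []                  # kept in commit order
        self.counter = itertools.count(1)  # logical commit timestamp

    def record_commit(self, xid, begin_ts, commit_ts, rowcount):
        seq = next(self.counter)
        self.records.append(dict(seq=seq, xid=xid, begin=begin_ts,
                                 commit=commit_ts, rows=rowcount))
        return seq

    def snapshot_position(self, visible_xids):
        """Logical timestamp of the latest committed xact visible to a snapshot."""
        return max((r["seq"] for r in self.records if r["xid"] in visible_xids),
                   default=0)

log = CommitInfoLog()
log.record_commit(100, "t0", "t1", 10)
log.record_commit(101, "t2", "t3", 5)
log.record_commit(102, "t4", "t5", 1)
# A snapshot taken before xid 102 committed sits at position 2 in the ordering:
print(log.snapshot_position({100, 101}))
```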


I've actually implemented some semblance of this on Postgres 8.2, but it
sounds like what you're interested in is more sophisticated. In
particular, I wasn't at all concerned with durability or WAL stuff, and
I had some specific requirements about when it was OK to purge the
data. Because of this (and very limited development time), I just
threw something together with a simple shared buffer.

I don't think I have any useful code to offer, but let me know if
there's some way I can help out.

Dan

-- 
Dan R. K. Ports    MIT CSAIL    http://drkp.net/



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 3:07 PM, Alvaro Herrera  wrote:
> It'd be nice to have as a side effect, but if not, IMHO this proposal
> could simply use a fixed buffer pool like all other slru.c callers until
> someone gets around to fixing that.  Adding more GUC switches for this
> strikes me as overkill.

I agree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Bruce Momjian
Stefan Kaltenbrunner wrote:
> > I have updated the paragraph to be:
> >
> > Upgrading from PostgreSQL 8.3 has additional restrictions not present
> > when upgrading from later PostgreSQL releases.  For example,
> > pg_upgrade will not work for a migration from 8.3 if a user column
> > is defined as:
> >
> > Can you suggest other wording?
> 
> hmm that seems better thanks, however I just noticed that we don't have 
> a "general limitations" section. The way the docs are now done suggests 
> that there are not limitations at all (except for the two warnings in 
> the migration guide). Is pg_upgrade really up to the point where it can 
> fully replace pg_dump & pg_restore independent of the loaded (contrib) 
> or even third party modules(like postgis or custom datatypes etc)?

Yea, that's about right.  I can add limitations if you want.  ;-)

The only open pg_upgrade items are the ones on our TODO list:

http://wiki.postgresql.org/wiki/Todo

(I can't give you a URL hash-reference to the section because it doesn't
work on Firefox and no one seems to be able to fix it.)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread Stephen Frost
* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
> I'd like to point out two more points are necessary to be considered
> for DML permission checks in addition to ExecCheckRTPerms().
> 
> * DoCopy()
> 
> Although DoCopy() is called from standard_ProcessUtility(), it performs
> as DML statement, rather than DDL. It check ACL_SELECT or ACL_INSERT on
> the copied table or attributes, similar to what ExecCheckRTEPerms() doing.

Rather than construct a complicated API for this DML activity, why don't
we just make ExecCheckRTPerms available for DoCopy to use?  It seems
like we could move ExecCheckRTPerms() to acl.c without too much trouble.
acl.h already includes parsenodes.h and has knowledge of RangeVar's.
Once DoCopy is using that, this issue resolves itself with the hook that
Robert already wrote up.

> * RI_Initial_Check()
> 
> RI_Initial_Check() is a function called on ALTER TABLE command to add FK
> constraints between two relations. The permission to execute this ALTER TABLE
> command itself is checked on ATPrepCmd() and ATAddForeignKeyConstraint(),
> so it does not affect anything on the DML permission reworks.

I'm not really thrilled with how RI_Initial_Check() does it's own
permissions checking and then calls SPI expecting things to 'just work'.
Not sure if there's some way we could handle failure from SPI, or, if it
was changed to call ExecCheckRTPerms() instead, how it would handle
failure cases from there.  One possible solution would be to have an
additional option to ExecCheckRTPerms() which asks it to just check and
return false if there's a problem, rather than unconditionally calling
aclcheck_error() whenever it finds a problem.

Using the same function for both the initial check in RI_Initial_Check()
and then from SPI would eliminate issues where those two checks disagree
for some reason, which would be good in the general case.
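
To sketch what that shared check could look like -- in illustrative
Python rather than C, with a toy ACL table, and none of these names are
PostgreSQL's actual ones:

```python
# One permission-check routine serving two callers: the executor path raises
# on violation, while an RI_Initial_Check-style caller just asks yes/no.

class PermissionDenied(Exception):
    pass

ACLS = {("orders", "alice"): {"SELECT"}}   # toy ACL table: alice can only read

def check_rt_perms(rangetable, user, ereport_on_violation=True):
    """Check each (relation, required-privileges) entry of the range table."""
    for relation, needed in rangetable:
        if needed - ACLS.get((relation, user), set()):
            if ereport_on_violation:
                raise PermissionDenied(f"permission denied for {relation}")
            return False                   # quiet mode: report, don't abort
    return True

# Executor-style caller: success, or an error is raised.
ok = check_rt_perms([("orders", {"SELECT"})], "alice")

# RI_Initial_Check-style caller: probe quietly, fall back if the fast
# path isn't permitted.
fast_path_ok = check_rt_perms([("orders", {"SELECT", "UPDATE"})], "alice",
                              ereport_on_violation=False)
print(ok, fast_path_ok)   # the probe fails without raising
```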

> BTW, I guess the reason why permissions on attributes are not checked here is
> that we missed it at v8.4 development.

Indeed, but at the same time, this looks to be a 'fail-safe' situation.
Basically, this is checking table-level permissions, which, if you have,
gives you sufficient rights to SELECT against the table (any column).
What this isn't doing is allowing the option of column-level permissions
to be sufficient for this requirement.  That's certainly an oversight in
the column-level permissions handling (sorry about that), but it's not
horrible- there's a workaround if RI_Initial_Check returns false already
anyway.

Basically, if you are using column-level privs, and you have necessary
rights to do this w/ those permissions (but don't have table-level
rights), it's not going to be as fast as it could be.

> The most part of the checker function is cut & paste from ExecCheckRTEPerms(),
> but its arguments are modified for easy invocation from other functions.

As mentioned above, it seems like this would be better the other way-
have the callers build RangeTbl's and then call ExecCheckRTPerms().  It
feels like that approach might be more 'future-proof' as well.

Thanks,

Stephen




Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Alvaro Herrera
Excerpts from Robert Haas's message of dom may 23 20:38:14 -0400 2010:
> On Sun, May 23, 2010 at 4:21 PM, Jan Wieck  wrote:
> > The system will have postgresql.conf options for enabling/disabling the
> > whole shebang, how many shared buffers to allocate for managing access
> > to the data and to define the retention period of the data based on data
> > volume and/or age of the commit records.
> 
> It would be nice if this could just be managed out of shared_buffers
> rather than needing to configure a separate pool just for this
> feature.

FWIW we've talked about this for years -- see old discussions about how
pg_subtrans becomes a bottleneck in certain cases and you want to
enlarge the number of buffers allocated to it (probably easy to find by
searching posts from Jignesh).  I'm guessing the new notify code would
benefit from this as well.

It'd be nice to have as a side effect, but if not, IMHO this proposal
could simply use a fixed buffer pool like all other slru.c callers until
someone gets around to fixing that.  Adding more GUC switches for this
strikes me as overkill.

-- 
Álvaro Herrera 



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 12:51 PM, Kevin Grittner
 wrote:
> Robert Haas  wrote:
>> I think you're confusing two subtly different things.
>
> The only thing I'm confused about is what benefit anyone expects to
> get from looking at data between commits in some way other than our
> current snapshot mechanism.  Can someone explain a use case where
> what Jan is proposing is better than snapshot isolation?  It doesn't
> provide any additional integrity guarantees that I can see.

It's a tool for replication solutions to use.

>> But the commit order is still the order the effects of those
>> transactions have become visible - if we inserted a new read-only
>> transaction into the stream at some arbitrary point in time, it
>> would see all the transactions which committed before it and none
>> of those that committed afterward.
>
> Isn't that what a snapshot does already?

Yes, for a particular transaction.  But this is to allow transactions
to be replayed (in order) on another node.

>> your proposed fix sounds like it would be prohibitively expensive
>> for many users. But can this actually happen?
>
> How so?  The transaction start/end logging, or looking at that data
> when building a snapshot?

I guess what I'm asking is - if the reconstructed transaction order
inferred by SSI doesn't match the actual commit order, can we get a
serialization anomaly on the standby by replaying transactions there
in commit order?  Can you give an example and explain how your
proposal would solve it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Stefan Kaltenbrunner

On 05/24/2010 07:43 PM, Bruce Momjian wrote:

Stefan Kaltenbrunner wrote:

On 05/19/2010 05:16 PM, Bruce Momjian wrote:

Andres Freund wrote:

On Wednesday 19 May 2010 22:39:32 Bruce Momjian wrote:

There are some limitations when migrating from 8.3 to 8.4, but not when
migrating from 8.3 to 9.0, because we added a feature to 9.0.  Can you
give a specific example?

Didn't the 'name' alignment change?


Uh, the heading above that item is:

   Limitations in migrating from PostgreSQL 8.3

What is unclear there?  It covers going to 8.4 and 9.0.


well the wording makes it kinda unclear on what happens if you go FROM
8.4 to 9.0. If there are no known limits we might want to add a small
note saying so. If there are some we might want to restructure the
paragraph a bit...


Sorry for the delay in replying.  The section you list is titled:

F.31.4. Limitations in migrating from PostgreSQL 8.3

http://developer.postgresql.org/pgdocs/postgres/pgupgrade.html

and the first sentence is:

pg_upgrade will not work for a migration from 8.3 if a user column

I have updated the paragraph to be:

Upgrading from PostgreSQL 8.3 has additional restrictions not present
when upgrading from later PostgreSQL releases.  For example,
pg_upgrade will not work for a migration from 8.3 if a user column
is defined as:

Can you suggest other wording?


hmm that seems better thanks, however I just noticed that we don't have 
a "general limitations" section. The way the docs are now done suggests 
that there are not limitations at all (except for the two warnings in 
the migration guide). Is pg_upgrade really up to the point where it can 
fully replace pg_dump & pg_restore independent of the loaded (contrib) 
or even third party modules(like postgis or custom datatypes etc)?




Stefan



Re: [HACKERS] beta testing - pg_upgrade bug fix - double free

2010-05-24 Thread Bruce Momjian
Pavel Stehule wrote:
> Hello
> 
> it fixes bug
> 
> pg_upgrade(13359) malloc: *** error for object 0x801600:
> non-page-aligned, non-allocated pointer being freed
> *** set a breakpoint in malloc_error_break to debug
> 
> 
> arget 03:31 /usr/local/src/postgresql/contrib/pg_upgrade git diff .
> diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
> index 31f12fb..f989229 100644
> --- a/contrib/pg_upgrade/check.c
> +++ b/contrib/pg_upgrade/check.c
> @@ -154,7 +154,6 @@ issue_warnings(migratorContext *ctx, char
> *sequence_script_file_name)
>  ctx->new.bindir,
> ctx->new.port, sequence_script_file_name,
>  ctx->logfile);
>unlink(sequence_script_file_name);
> -   pg_free(sequence_script_file_name);
>check_ok(ctx);
>}

Thanks for the report.  Tom has applied your fix.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] pg_upgrade docs

2010-05-24 Thread Bruce Momjian
Stefan Kaltenbrunner wrote:
> On 05/19/2010 05:16 PM, Bruce Momjian wrote:
> > Andres Freund wrote:
> >> On Wednesday 19 May 2010 22:39:32 Bruce Momjian wrote:
> >>> There are some limitations when migrating from 8.3 to 8.4, but not when
> >>> migrating from 8.3 to 9.0, because we added a feature to 9.0.  Can you
> >>> give a specific example?
> >> Didn't the 'name' alignment change?
> > 
> > Uh, the heading above that item is:
> > 
> >   Limitations in migrating from PostgreSQL
> >   8.3
> > 
> > What is unclear there?  It covers going to 8.4 and 9.0.
> 
> well the wording makes it kinda unclear on what happens if you go FROM
> 8.4 to 9.0. If there are no known limits we might want to add a small
> note saying so. If there are some we might want to restructure the
> paragraph a bit...

Sorry for the delay in replying.  The section you list is titled:

F.31.4. Limitations in migrating from PostgreSQL 8.3

http://developer.postgresql.org/pgdocs/postgres/pgupgrade.html

and the first sentence is:

   pg_upgrade will not work for a migration from 8.3 if a user column

I have updated the paragraph to be:

   Upgrading from PostgreSQL 8.3 has additional restrictions not present
   when upgrading from later PostgreSQL releases.  For example,
   pg_upgrade will not work for a migration from 8.3 if a user column
   is defined as:

Can you suggest other wording?

FYI, the items that affect only 8.3 to 8.4 migrations are no longer in
the 9.0 pg_upgrade docs because we don't support 8.4 as a target
anymore;  specifically:

Limitations In Migrating _to_ PostgreSQL 8.4
--
pg_migrator will not work if a user column is defined as:

o  a user-defined composite data type
o  a user-defined array data type
o  a user-defined enum data type

You must drop any such columns and migrate them manually.

You can see the full documentation here:


http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pg-migrator/pg_migrator/README?rev=1.78&content-type=text/x-cvsweb-markup

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com



Re: [HACKERS] [PATCH] Move 'long long' check to c.h

2010-05-24 Thread Tom Lane
Michael Meskes  writes:
> On Sat, May 22, 2010 at 11:20:50PM -0400, Stephen Frost wrote:
>> git diff -p), I noted that c.h is already included by both extern.h
>> and ecpg.header through postgres_fe.h.  Given this and that we're
>> already doing alot of similar #define's there (unlike in those other
>> files), I felt c.h was a more appropriate place.  Putting it in c.h
>> also means we don't have to duplicate that code.

> But do other parts of PG also need it? Keep in mind that this works for ecpg
> because it needs LLONG_MIN or LONGLONG_MIN anyway. I'm not sure if there are
> compilers that have long long without those defines, but I'd guess there
> aren't.

I think the current coding is extremely fragile (if it indeed works at
all) because of its assumption that  has been included
already.  In any case, we have configure tests that exist only for the
benefit of contrib modules, so it's hard to argue that we shouldn't have
one that exists only for ecpg.

I think we should fix this (properly) for 9.0.

regards, tom lane



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Heikki Linnakangas

On 24/05/10 19:51, Kevin Grittner wrote:

The only thing I'm confused about is what benefit anyone expects to
get from looking at data between commits in some way other than our
current snapshot mechanism.  Can someone explain a use case where
what Jan is proposing is better than snapshot isolation?  It doesn't
provide any additional integrity guarantees that I can see.


Right, it doesn't. What it provides is a way to reconstruct a snapshot 
at any point in time, after the fact. For example, after transactions A, 
C, D and B have committed in that order, it allows you to reconstruct a 
snapshot just like you would've gotten immediately after the commit of 
A, C, D and B respectively. That's useful for replication tools like Slony 
that need to commit the changes of those transactions on the slave in 
the same order as they were committed on the master.
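
A trivial illustration of that point (invented Python, nothing to do with
Slony's actual implementation): replaying in commit order makes the slave
pass through exactly the committed-set states a reader on the master could
have seen after each commit.

```python
# Given the commit order A, C, D, B (regardless of xact start order),
# a slave applying changes in that order reproduces each master snapshot.

master_commit_order = ["A", "C", "D", "B"]

def snapshots_seen(commit_order):
    """The sets of committed xacts visible immediately after each commit."""
    seen, out = set(), []
    for xact in commit_order:
        seen.add(xact)
        out.append(frozenset(seen))
    return out

slave_states = snapshots_seen(master_commit_order)
print(slave_states[0])   # the snapshot right after A committed
```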


I don't know enough of Slony et al. to understand why that'd be better 
than the current heartbeat mechanism they use, taking a snapshot every 
few seconds, batching commits.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Kevin Grittner
Robert Haas  wrote:
 
> I think you're confusing two subtly different things.
 
The only thing I'm confused about is what benefit anyone expects to
get from looking at data between commits in some way other than our
current snapshot mechanism.  Can someone explain a use case where
what Jan is proposing is better than snapshot isolation?  It doesn't
provide any additional integrity guarantees that I can see.
 
> But the commit order is still the order the effects of those
> transactions have become visible - if we inserted a new read-only
> transaction into the stream at some arbitrary point in time, it
> would see all the transactions which committed before it and none
> of those that committed afterward.
 
Isn't that what a snapshot does already?
 
> your proposed fix sounds like it would be prohibitively expensive
> for many users. But can this actually happen?
 
How so?  The transaction start/end logging, or looking at that data
when building a snapshot?
 
-Kevin



Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-24 Thread Josh Berkus



I'm not sure it's cheap.  What you suggest would result in a substantial
increase in clog accesses, which means (1) more I/O and (2) more
contention.  Certainly it's worth experimenting with, but it's no
guaranteed win.


It seems like there's a number of issues we could fix by making the CLOG 
more efficient somehow -- from the elimination of hint bits to the 
ability to freeze pages without writing them.


Not, of course, that I have any idea how to do that.

--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



[HACKERS] libpq, PQexecPrepared, data size sent to FE vs. FETCH_COUNT

2010-05-24 Thread Alex Goncharov
I have some libpq-using application code, in which fetching the data
follows this logic (after a statement has been prepared):



  pg_result = PQexecPrepared(pg_conn, pg_statement_name, input_param_cnt,
                             param_values, param_lengths, param_formats,
                             result_format);

  rows_in_result = PQntuples(pg_result);

  /* The application provides storage so that I can pass a certain number of
   * rows (rows_to_pass_up) to the caller, and I repeat the following loop
   * until the rows_to_pass_up batches cover all the rows_in_result
   * (pg_row_num_base keeps track of where I am in the process). */

  for (int row_idx = 0; row_idx < rows_to_pass_up; ++row_idx) {
      const int pg_row_number = row_idx + pg_row_num_base;

      for (int pg_column_number = 0; pg_column_number < result_column_cnt;
           ++pg_column_number) {
          char *value = PQgetvalue(pg_result, pg_row_number, pg_column_number);
          int length = PQgetlength(pg_result, pg_row_number, pg_column_number);
      }
  }



My question is: am I doing the right thing from the "data size being
passed from BE to FE" perspective?

The code in `bin/psql' relies on the value of the FETCH_COUNT
parameter to build an appropriate

fetch forward FETCH_COUNT from _psql_cursor

command.

No equivalent of FETCH_COUNT is available at the libpq level, so I
assume that the interface I am using is smart enough not to send
gigabytes of data to FE.

Is that right? Is the logic I am using safe and good?

Where does the result set (GBs of data) reside after I call
PQexecPrepared?  On BE, I hope?

Thanks,

-- Alex -- alex-goncha...@comcast.net --



Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Heikki Linnakangas

On 24/05/10 16:20, Fujii Masao wrote:

The log-shipping replication has some synch levels as follows.

The transaction commit on the master
#1 doesn't wait for replication (already supported in 9.0)
#2 waits for WAL to be received by the standby
#3 waits for WAL to be received and flushed by the standby
#4 waits for WAL to be received, flushed and replayed by
   the standby
..etc?

Which should we include in 9.1? I'd like to add #2 and #3.
They are enough for high-availability use case (i.e., to
prevent failover from losing any transactions committed).
AFAIR, MySQL semi-synchronous replication supports #2 level.

#4 is useful for some cases, but might often make the
transaction commit on the master get stuck, since a read-only
query can easily block recovery through a lock conflict. So
#4 seems not worth working on until that HS problem
has been addressed. Thoughts?


I see a lot of value in #4; it makes it possible to distribute read-only 
load to the standby using something like pgbouncer, completely 
transparently to the application. In the lesser modes, the application 
can see slightly stale results.


But whatever we can easily implement, really. Pick one that you think is 
the easiest and start with that, but keep the other modes in mind in the 
design and in the user interface so that you don't paint yourself in the 
corner.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCH] Move 'long long' check to c.h

2010-05-24 Thread Michael Meskes
On Sat, May 22, 2010 at 11:20:50PM -0400, Stephen Frost wrote:
>   git diff -p), I noted that c.h is already included by both extern.h
>   and ecpg.header through postgres_fe.h.  Given this and that we're
>   already doing alot of similar #define's there (unlike in those other
>   files), I felt c.h was a more appropriate place.  Putting it in c.h
>   also means we don't have to duplicate that code.

But do other parts of PG also need it? Keep in mind that this works for ecpg
because it needs LLONG_MIN or LONGLONG_MIN anyway. I'm not sure if there are
compilers that have long long without those defines, but I'd guess there
aren't.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber mes...@jabber.org
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [HACKERS] [PATCH] Move 'long long' check to c.h

2010-05-24 Thread Michael Meskes
On Sun, May 23, 2010 at 11:50:00AM -0400, Stephen Frost wrote:
> > If we need this we should do it properly with autoconf.

I absolutely agree and planned to do that *after* the release if it makes sense
for the rest of PG, but wouldn't want to mess with it in the current
situation. On the other hand, I didn't want to release with that bug in there.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber mes...@jabber.org
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 11:24 AM, Kevin Grittner
 wrote:
> Jan Wieck wrote:
>
>> In some systems (data warehousing, replication), the order of
>> commits is important, since that is the order in which changes
>> have become visible.
>
> This issue intersects with the serializable work I've been doing.
> While in database transactions using S2PL the above is true, in
> snapshot isolation and the SSI implementation of serializable
> transactions, it's not.

I think you're confusing two subtly different things.  The way to
prove that a set of transactions running under some implementation of
serializability is actually serializable is to construct a serial
order of execution consistent with the view of the database that each
transaction saw.  This may or may not match the commit order, as you
say.  But the commit order is still the order the effects of those
transactions have become visible - if we inserted a new read-only
transaction into the stream at some arbitrary point in time, it would
see all the transactions which committed before it and none of those
that committed afterward.  So I think Jan's statement is correct.

Having said that, I think your concerns about how things will look
from a slave's point of view are possibly valid.  A transaction
running on a slave is essentially a read-only transaction that the
master doesn't know about.  It's not clear to me whether adding such a
transaction to the timeline could result in either (a) that
transaction being rolled back or (b) some impact on which other
transactions got rolled back.  If it did, that would obviously be a
problem for serializability on slaves, though your proposed fix sounds
like it would be prohibitively expensive for many users.  But can this
actually happen?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Kevin Grittner
Jan Wieck wrote:
 
> In some systems (data warehousing, replication), the order of
> commits is important, since that is the order in which changes
> have become visible.
 
This issue intersects with the serializable work I've been doing.
While in database transactions using S2PL the above is true, in
snapshot isolation and the SSI implementation of serializable
transactions, it's not. In particular, the snapshot anomalies which
can cause non-serializable behavior happen precisely because the
apparent order of execution doesn't match anything so linear as
order of commit.
 
I'll raise that receipting example again. You have transactions
which grab the current deposit data and insert it into receipts, as
payments are received. At some point in the afternoon, the deposit
date in a control table is changed to the next day, so that the
receipts up to that point can be deposited during banking hours with
the current date as their deposit date. A report is printed (and
likely a transfer transaction recorded to move "cash in drawer" to
"cash in checking", but I'll ignore that aspect for this example).
Some receipts may not be committed when the update to the date in
the control table is committed.
 
This is "eventually consistent" -- once all the receipts with the
old date commit or roll back the database is OK, but until then you
might be able to select the new date in the control table and the
set of receipts matching the old date without the database telling
you that you're missing data. The new serializable implementation
fixes this, but there are open R&D items (due to the need to discuss
the issues) on the related Wiki page concerning hot standby and
other replication. Will we be able to support transactional
integrity on slave machines?
 
What if the update to the control table and the insert of receipts
all happen on the master, but someone decides to move the (now
happily working correctly with serializable transactions) reporting
to a slave machine? (And by the way, don't get too hung up on this
particular example, I could generate dozens more on demand -- the
point is that order of commit doesn't always correspond to apparent
order of execution; in this case the receipts *appear* to have
executed first, because they are using a value "later" updated to
something else by a different transaction, even though that other
transaction *committed* first.)
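 
The inversion described here can be shown with a toy multi-version
simulation (illustrative Python only; this is not how PostgreSQL
stores row versions):
 
```python
# Toy MVCC simulation (illustrative only): each transaction reads from the
# snapshot taken at its start, so the receipt "appears" to run before the
# control-table update even though it commits after it.

# Versions of the control table's deposit date: (value, commit_seq).
deposit_date_versions = [("2010-05-24", 0)]

def read_at(versions, snapshot_seq):
    """Latest version committed at or before the given snapshot."""
    visible = [v for v in versions if v[1] <= snapshot_seq]
    return max(visible, key=lambda v: v[1])[0]

# T1, a receipt transaction, starts first; its snapshot is commit_seq 0.
t1_snapshot = 0

# T2 updates the deposit date and commits as commit_seq 1.
deposit_date_versions.append(("2010-05-25", 1))

# T1 now records its receipt using its own (older) snapshot and commits
# as commit_seq 2 -- still carrying the pre-update date.
receipt_date = read_at(deposit_date_versions, t1_snapshot)

commit_order   = ["T2", "T1"]  # order of commit
apparent_order = ["T1", "T2"]  # apparent order of execution
print(receipt_date, commit_order != apparent_order)  # 2010-05-24 True
```
 
So replaying by commit order alone applies T2's update first, even
though T1's receipt logically executed earlier.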
 
Replicating or recreating the whole predicate locking and conflict
detection on slaves is not feasible for performance reasons. (I
won't elaborate unless someone feels that's not intuitively
obvious.) The only sane way I can see to have a slave database allow
serializable behavior is to WAL-log the acquisition of a snapshot by
a serializable transaction, and the rollback or commit, on the
master, and to have the serializable snapshot build on a slave
exclude any serializable transactions for which there are still
concurrent serializable transactions. Yes, that does mean WAL-
logging the snapshot acquisition even if the transaction doesn't yet
have an xid, and WAL-logging the commit or rollback even if it never
acquires an xid.
 
I think this solves the issue Jan raises as long as serializable
transactions are used; if they aren't there are no guarantees of
transactional integrity no matter how you track commit sequence,
unless it can be based on S2PL-type blocking locks.  I'll have to
leave that to someone else to sort out.
 
-Kevin




Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)

2010-05-24 Thread Tom Lane
Fujii Masao  writes:
> Okay, how about the attached patch? It uses the postmaster-local flag
> "ReachedEndOfRecovery" (better name?) instead of XLogCtl one.

ReachedNormalRunning, perhaps?

regards, tom lane



Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 10:35 AM, Fujii Masao  wrote:
> On Mon, May 24, 2010 at 10:26 PM, Robert Haas  wrote:
>> This looks pretty reasonable to me, but I guess I feel like it would
>> be better to drive the CancelBackup() decision off of whether we've
>> ever reached PM_RUN rather than consulting XLogCtl.  It just feels
>> cleaner to me to drive all of the postmaster decisions off of the same
>> signalling mechanism rather than having a separate one (that only
>> works because it's used very late in shutdown when we theoretically
>> don't need a lock) just for this one case.
>
> Okay, how about the attached patch? It uses the postmaster-local flag
> "ReachedEndOfRecovery" (better name?) instead of XLogCtl one.

Looks good to me.  I will test and apply.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)

2010-05-24 Thread Fujii Masao
On Mon, May 24, 2010 at 10:26 PM, Robert Haas  wrote:
> This looks pretty reasonable to me, but I guess I feel like it would
> be better to drive the CancelBackup() decision off of whether we've
> ever reached PM_RUN rather than consulting XLogCtl.  It just feels
> cleaner to me to drive all of the postmaster decisions off of the same
> signalling mechanism rather than having a separate one (that only
> works because it's used very late in shutdown when we theoretically
> don't need a lock) just for this one case.

Okay, how about the attached patch? It uses the postmaster-local flag
"ReachedEndOfRecovery" (better name?) instead of XLogCtl one.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


fix_smart_shutdown_in_recovery_v5_fujii.patch
Description: Binary data



Re: [HACKERS] Specification for Trusted PLs?

2010-05-24 Thread Greg Sabino Mullane



>> Well, the best way to define what a trusted language can do is to
>> define a *whitelist* of what it can do, not a blacklist of what it
>> can't do. That's the only way to get a complete definition. It's then
>> up to the implementation step to figure out how to represent that in
>> the form of tests.

> Yes, PL/Perl is following this approach. For a whitelist see
> plperl_opmask.h (generated by plperl_opmask.pl at build phase).

Ah, okay, I can mostly agree with that. My objection was with trying 
to build a cross-language generic whitelist. But it looks like the 
ship has already sailed upthread and we've more or less got a working 
definition. David, I think you started this thread, I assume you have 
some concrete reason for asking about this (new trusted language?). 
May have been stated, but I missed it.

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005241025
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8





Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)

2010-05-24 Thread Tom Lane
Robert Haas  writes:
> On Mon, May 24, 2010 at 9:28 AM, Simon Riggs  wrote:
>> On Mon, 2010-05-24 at 09:26 -0400, Robert Haas wrote:
>>> This looks pretty reasonable to me, but I guess I feel like it would
>>> be better to drive the CancelBackup() decision off of whether we've
>>> ever reached PM_RUN rather than consulting XLogCtl.
>> 
>> That is exactly what XLogCtl tells us and why it is suggested for use.

> Sure.  My only point is that the postmaster doesn't (and can't) use
> that method of getting the information at any other time when it is
> needed, so I don't know why we'd want to use it in just this one case.
>  Maybe there's a reason, but it's not obvious to me.

I'm with Robert on this.  The postmaster is designed to be driven by an
internal state machine.  Making it rely on the contents of shared memory
is a fundamentally dangerous idea.  It might coincidentally be safe in
this one case, but I can easily imagine that property failing as a result
of subsequent changes.

The postmaster should not look at shared memory if there is any
reasonable alternative, and we clearly have a reasonable alternative.

regards, tom lane



Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-24 Thread Tom Lane
Heikki Linnakangas  writes:
> (As also discussed in the Royal Oak) I think we should simply not dirty 
> a page when a hint bit is updated. Reading a page from disk is 
> expensive, setting hint bits on the access is generally cheap compared 
> to that. But that is orthogonal to the idea of a per-page XID epoch.

I'm not sure it's cheap.  What you suggest would result in a substantial
increase in clog accesses, which means (1) more I/O and (2) more
contention.  Certainly it's worth experimenting with, but it's no
guaranteed win.

regards, tom lane



Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 9:28 AM, Simon Riggs  wrote:
> On Mon, 2010-05-24 at 09:26 -0400, Robert Haas wrote:
>> On Mon, May 24, 2010 at 1:27 AM, Fujii Masao  wrote:
>> > On Wed, May 19, 2010 at 2:47 PM, Fujii Masao  wrote:
>>  Oh, right. How about allowing the postmaster only in PM_STARTUP,
>>  PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
>>  walreceiver? We can keep walreceiver alive until all read only
>>  backends have gone, and prevent unexpected startup of walreceiver.
>> >>>
>> >>> Yes, that seems like something we should be checking, if we aren't 
>> >>> already.
>> >>
>> >> I'll do that.
>> >
>> > Here is the updated version. I added the above-mentioned check
>> > into the patch.
>>
>> This looks pretty reasonable to me, but I guess I feel like it would
>> be better to drive the CancelBackup() decision off of whether we've
>> ever reached PM_RUN rather than consulting XLogCtl.
>
> That is exactly what XLogCtl tells us and why it is suggested for use.

Sure.  My only point is that the postmaster doesn't (and can't) use
that method of getting the information at any other time when it is
needed, so I don't know why we'd want to use it in just this one case.
 Maybe there's a reason, but it's not obvious to me.

>>  It just feels
>> cleaner to me to drive all of the postmaster decisions off of the same
>> signalling mechanism rather than having a separate one (that only
>> works because it's used very late in shutdown when we theoretically
>> don't need a lock) just for this one case.
>>
>> I could be all wet, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Robert Haas
On Sun, May 23, 2010 at 9:44 PM, Jan Wieck  wrote:
> I'm not sure the retention policies of the shared buffer cache, the WAL
> buffers, CLOG buffers and every other thing we try to cache are that easy to
> fold into one single set of logic. But I'm all ears.

I'm not sure either, although it seems like LRU ought to be good
enough for most things.  I'm more worried about things like whether
the BufferDesc abstraction is going to get in the way.

>>> CommitTransaction() inside of xact.c will call a function, that inserts
>>> a new record into this array. The operation will for most of the time be
>>> nothing than taking a spinlock and adding the record to shared memory.
>>> All the data for the record is readily available, does not require
>>> further locking and can be collected locally before taking the spinlock.
>>
>> What happens when you need to switch pages?
>
> Then the code will have to grab another free buffer or evict one.

Hopefully not while holding a spin lock.  :-)

>>> The function will return the "sequence" number which CommitTransaction()
>>> in turn will record in the WAL commit record together with the
>>> begin_timestamp. While both, the begin as well as the commit timestamp
>>> are crucial to determine what data a particular transaction should have
>>> seen, the row count is not and will not be recorded in WAL.
>>
>> It would certainly be better if we didn't have to bloat the commit xlog
>> records to do this.  Is there any way to avoid that?
>
> If you can tell me how a crash recovering system can figure out what the
> exact "sequence" number of the WAL commit record at hand should be, let's
> rip it.

Hmm...  could we get away with WAL-logging the next sequence number
just once per checkpoint?  When you replay the checkpoint record, you
update the control file with the sequence number.  Then all the
commits up through the next checkpoint just use consecutive numbers
starting at that value.
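
The idea might look like this (illustrative Python; the record layout
and names are invented here, not the actual WAL format):

```python
# Illustrative sketch: log the next commit sequence number only in the
# checkpoint record; commits between checkpoints are numbered
# consecutively, so crash recovery can reconstruct every commit's number
# without it appearing in each commit record.

def replay(wal_records):
    """Recover each commit's sequence number from a minimal WAL stream."""
    next_seq = None
    commit_seqs = {}
    for kind, payload in wal_records:
        if kind == "checkpoint":
            next_seq = payload                # base, logged once per checkpoint
        elif kind == "commit":
            commit_seqs[payload] = next_seq   # payload is the xid
            next_seq += 1
    return commit_seqs

wal = [("checkpoint", 100), ("commit", "xid1"), ("commit", "xid2"),
       ("checkpoint", 102), ("commit", "xid3")]
print(replay(wal))  # {'xid1': 100, 'xid2': 101, 'xid3': 102}
```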

> It is an option. "Keep it until I tell you" is a perfectly valid
> configuration option. One you probably don't want to forget about, but valid
> nonetheless.

As Tom is fond of saying, if it breaks, you get to keep both pieces.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)

2010-05-24 Thread Simon Riggs
On Mon, 2010-05-24 at 09:26 -0400, Robert Haas wrote:
> On Mon, May 24, 2010 at 1:27 AM, Fujii Masao  wrote:
> > On Wed, May 19, 2010 at 2:47 PM, Fujii Masao  wrote:
>  Oh, right. How about allowing the postmaster only in PM_STARTUP,
>  PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
>  walreceiver? We can keep walreceiver alive until all read only
>  backends have gone, and prevent unexpected startup of walreceiver.
> >>>
> >>> Yes, that seems like something we should be checking, if we aren't 
> >>> already.
> >>
> >> I'll do that.
> >
> > Here is the updated version. I added the above-mentioned check
> > into the patch.
> 
> This looks pretty reasonable to me, but I guess I feel like it would
> be better to drive the CancelBackup() decision off of whether we've
> ever reached PM_RUN rather than consulting XLogCtl. 

That is exactly what XLogCtl tells us and why it is suggested for use.

>  It just feels
> cleaner to me to drive all of the postmaster decisions off of the same
> signalling mechanism rather than having a separate one (that only
> works because it's used very late in shutdown when we theoretically
> don't need a lock) just for this one case.
> 
> I could be all wet, though.
> 
-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-24 Thread Heikki Linnakangas

On 22/05/10 16:35, Tom Lane wrote:

Josh Berkus  writes:

   From a discussion at dinner at pgcon, I wanted to send this to the list
for people to poke holes in it:


Somebody (I think Joe or Heikki) poked a big hole in this last night at
the Royal Oak.


Me.


 Although the scheme would get rid of the need to replace
old XIDs with FrozenXid, it does not get rid of the need to set hint
bits before you can truncate CLOG.


Hmm, we don't rely on setting hint bits to truncate CLOG anymore 
(http://archives.postgresql.org/pgsql-committers/2006-11/msg00026.php). 
It's the replacement of xids with FrozenXid that matters, the hint bits 
are really just hints.


Doesn't change the conclusion, though: you still need to replace XIDs 
with FrozenXids to truncate the clog. Conceivably we could keep around 
more than 2^32 transactions in clog with this scheme, but then you need 
a lot more space for the clog. But perhaps it would be better to do that 
than to launch anti-wraparound vacuums, or to refuse more updates in the 
extreme cases.



 So in your example of an insert-only
table that's probably never read again, there's still a minimum of one
update visit required on every old page.  Now that's still better than
two update visits ... but we could manage that already, just by tweaking
vacuum's heuristics about when to freeze vs when to set hint bits.


(As also discussed in the Royal Oak) I think we should simply not dirty 
a page when a hint bit is updated. Reading a page from disk is 
expensive, setting hint bits on the access is generally cheap compared 
to that. But that is orthogonal to the idea of a per-page XID epoch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Exposing the Xact commit order to the user

2010-05-24 Thread Greg Sabino Mullane



> In light of the proposed purging scheme, how would it be able to distinguish 
> between those two cases (nothing there yet vs. was there but purged)?

> There is a difference between an empty result set and an exception.

No, I meant how will the *function* know, if a superuser and/or some 
background process can purge records at any time?

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005240928
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8





Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)

2010-05-24 Thread Robert Haas
On Mon, May 24, 2010 at 1:27 AM, Fujii Masao  wrote:
> On Wed, May 19, 2010 at 2:47 PM, Fujii Masao  wrote:
 Oh, right. How about allowing the postmaster only in PM_STARTUP,
 PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
 walreceiver? We can keep walreceiver alive until all read only
 backends have gone, and prevent unexpected startup of walreceiver.
>>>
>>> Yes, that seems like something we should be checking, if we aren't already.
>>
>> I'll do that.
>
> Here is the updated version. I added the above-mentioned check
> into the patch.

This looks pretty reasonable to me, but I guess I feel like it would
be better to drive the CancelBackup() decision off of whether we've
ever reached PM_RUN rather than consulting XLogCtl.  It just feels
cleaner to me to drive all of the postmaster decisions off of the same
signalling mechanism rather than having a separate one (that only
works because it's used very late in shutdown when we theoretically
don't need a lock) just for this one case.

I could be all wet, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] Synchronization levels in SR

2010-05-24 Thread Fujii Masao
Hi,

I'm now designing the "synchronous" replication feature based on
SR for 9.1, while discussing that at another thread.
http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php

At the first design phase, I'd like to clarify which synch levels
should be supported in 9.1 and how it should be specified by users.

The log-shipping replication has some synch levels as follows.

   The transaction commit on the master
   #1 doesn't wait for replication (already supported in 9.0)
   #2 waits for WAL to be received by the standby
   #3 waits for WAL to be received and flushed by the standby
   #4 waits for WAL to be received, flushed and replayed by
  the standby
   ..etc?

Which should we include in 9.1? I'd like to add #2 and #3.
They are enough for high-availability use case (i.e., to
prevent failover from losing any transactions committed).
AFAIR, MySQL semi-synchronous replication supports #2 level.

#4 is useful for some cases, but might often make the
transaction commit on the master get stuck, since a read-only
query can easily block recovery through a lock conflict. So
#4 seems not worth working on until that HS problem
has been addressed. Thoughts?

Second, we need to discuss how to specify the synch
level. There are three approaches:

* Per standby
  Since the purpose, location and H/W resources often differ
  from one standby to another, specifying the level per standby
  (i.e., we set the level in recovery.conf) is a
  straightforward approach, I think. For example, we can
  choose #3 for a high-availability standby near the master,
  and choose #1 (async) for a remote disaster-recovery
  standby.

* Per transaction
  Define a PGC_USERSET option specifying the level, and
  set it on the master according to the purpose of the
  transaction. In this approach, for example, we can choose
  #4 for a transaction which should be visible on the
  standby as soon as a "success" of the commit has been
  returned to a client. We can also choose #1 for a
  time-critical but not mission-critical transaction.

* Mix
  Allow users to specify the level per standby and
  transaction at the same time, and then calculate the real
  level from them by using some algorithm.
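For concreteness, the per-standby and per-transaction approaches could look something like the sketch below. Every parameter name here is hypothetical, invented purely for illustration; nothing in this sketch is committed syntax.

```sql
-- Hypothetical illustration only: no such GUC exists yet.
-- Per-standby: the level would live in each standby's recovery.conf, e.g.
--     replication_mode = 'fsync'    -- level #3 for the nearby HA standby
--     replication_mode = 'async'    -- level #1 for the remote DR standby
-- Per-transaction: a PGC_USERSET option set on the master, e.g.
BEGIN;
SET LOCAL replication_mode = 'replay';  -- level #4: visible on standby at commit
-- ... critical writes ...
COMMIT;
```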

Which should we adopt for 9.1? I'd like to implement the
"per-standby" approach first since it's simple and seems
to cover more use cases. Thoughts?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ExecutorCheckPerms() hook

2010-05-24 Thread Robert Haas
2010/5/24 KaiGai Kohei:
> BTW, I guess the reason why permissions on attributes are not checked here is
> that we missed it at v8.4 development.

That's a little worrying.  Can you construct and post a test case
where this results in a user-visible failure in CVS HEAD?
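One shape such a test case could take, purely as a sketch (whether these statements actually misbehave in CVS HEAD is exactly the open question being asked here; the role name is invented):

```sql
-- A role holding only column-level privileges; if CopyTo() or
-- RI_Initial_Check() consult only table-level ACLs, the COPY below
-- would behave differently from the plain SELECT.
CREATE TABLE t (a int, b int);
CREATE USER colonly;
GRANT SELECT (a) ON t TO colonly;
SET ROLE colonly;
SELECT a FROM t;        -- permitted by the column-level grant
COPY t (a) TO STDOUT;   -- goes through CopyTo(); does it honor the same grant?
```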

> The attached patch provides a common checker function of DML, and modifies
> ExecCheckRTPerms(), CopyTo() and RI_Initial_Check() to call the checker
> function instead of individual ACL checks.

This looks pretty sane to me, although I have not done a full review.
I am disinclined to create a whole new directory for it.  I think the
new function should go in src/backend/catalog/aclchk.c and be declared
in src/include/utils/acl.h.  If that sounds reasonable to you, please
revise and post an updated patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Specification for Trusted PLs?

2010-05-24 Thread Jan Wieck

On 5/23/2010 11:19 PM, Andrew Dunstan wrote:
> Jan Wieck wrote:
>>> ISTM we are in danger of confusing several different things. A user
>>> that doesn't want data to be shared should not stash it in global
>>> objects. But to me, trusting a language is not about making data
>>> private, but about not allowing the user to do things that are
>>> dangerous, such as referencing memory, or the file system, or the
>>> operating system, or network connections, or loading code which might
>>> do any of those things.
>>
>> How is "loading code which might do any of those things" different
>> from writing a stored procedure, that accesses data, a careless
>> "superuser" left in a global variable? Remember, the code of a PL
>> function is "open" source - like in "everyone can select from
>> pg_proc". You really don't expect anyone to scan for your global
>> variables just because they can write functions in the same language?
>
> Well, that threat arises from the unsafe actions of the careless
> superuser. And we could at least ameliorate it by providing a per role
> data stash, at very little cost, as I mentioned. It's not like we don't
> know about such threats, and I'm certainly not pretending they don't
> exist. The 9.0 PL/Perl docs say:
>
>     The %_SHARED variable and other global state within the language is
>     public data, available to all PL/Perl functions within a session.
>     Use with care, especially in situations that involve use of multiple
>     roles or SECURITY DEFINER functions.
>
> But the threats I was referring to arise if the language allows them to,
> without any requirement for unsafe actions by another user. Protecting
> against those is the essence of trustedness in my mind at least.


I can agree with that.
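The %_SHARED behavior quoted from the PL/Perl docs above is easy to demonstrate. This is a sketch only; %_SHARED is the documented PL/Perl session-global hash, while the function and role names are invented:

```sql
-- %_SHARED is one session-global hash, shared across functions and roles
CREATE FUNCTION stash(text) RETURNS void LANGUAGE plperl AS $$
    $_SHARED{secret} = $_[0];   # stored by whoever calls first
$$;
CREATE FUNCTION peek() RETURNS text LANGUAGE plperl AS $$
    return $_SHARED{secret};    # readable by any later caller in the session
$$;
SELECT stash('s3kr1t');   -- e.g. run inside a SECURITY DEFINER function
SET ROLE other_user;      -- hypothetical second role in the same session
SELECT peek();            -- still sees the stashed value
```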


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
