Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Tom Lane
"Greg Stark"  writes:
> On Sun, Dec 14, 2008 at 1:42 AM, Robert Haas  wrote:
>>> What if relabeling support were to spread some more?
>> 
>> The only example I can think of besides XML is JSON.  There might be a
>> few more.  Basically, relabelling is a handy shortcut when you are
>> serializing data and want to avoid specifying a list of columns and an
>> (almost) identical list of labels.

> The whole relabeling thing seems like a seriously silly idea.

I wouldn't say that it's silly.  What I do say is that it makes no sense
to imagine that it would be used at the same time as named parameters.
The entire point of something like XMLELEMENT is that it takes a list of
undifferentiated parameters, which therefore do not need to have names
so far as the function is concerned.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
> The point here is that synchronous replication, at least to some
> people, is going to imply that the user-visible states of the two
> copies are consistent.  To other people, it is going to imply that
> committed transactions will never be lost even in the event of a
> catastrophic loss of the primary 1 picosecond after the commit is
> acknowledged.  We need to choose some word that implies that we are
> guaranteeing the latter of these two things but not the former.
> Otherwise, we will have confused users, and terminological confusion
> when and if we ever implement the former as well.

With apologies for replying to my own post:

It's also important to understand that these two invariants are
completely separate and it is possible to guarantee either without the
other.  If you want (1), the standby needs to apply the WAL before
sending an acknowledgment to the primary but does not necessarily need
to write it to disk (of course, it will have to be written to disk
before the modified buffers are written to disk, but that's a separate
issue).  If you want (2), the standby needs to write the WAL to disk
before sending the acknowledgment but does not necessarily need to
apply it.  If you want both, then, you need to wait for both (and it's
worth noting that your performance will probably be nothing to write
home about).
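
As a rough illustration of why the two guarantees are independent, here is a minimal Python sketch of a standby that acknowledges a WAL record under different policies. The names (Standby, AckPolicy, ON_APPLY, ON_FLUSH, ON_BOTH) are hypothetical and exist only for this example; nothing here reflects the actual patch.

```python
from dataclasses import dataclass, field
from enum import Enum


class AckPolicy(Enum):
    """Hypothetical names for what the standby does before it acknowledges."""
    ON_APPLY = "apply"   # guarantee (1): visible on the standby
    ON_FLUSH = "flush"   # guarantee (2): durable on the standby
    ON_BOTH = "both"     # both guarantees, at the cost of waiting for both


@dataclass
class Standby:
    policy: AckPolicy
    flushed: list = field(default_factory=list)  # WAL written to disk
    applied: list = field(default_factory=list)  # WAL replayed, visible to readers

    def receive(self, record):
        """Do only the work the policy requires, then acknowledge."""
        if self.policy in (AckPolicy.ON_FLUSH, AckPolicy.ON_BOTH):
            self.flushed.append(record)
        if self.policy in (AckPolicy.ON_APPLY, AckPolicy.ON_BOTH):
            self.applied.append(record)
        return "ack"

    def visible(self, record):
        return record in self.applied

    def durable(self, record):
        return record in self.flushed
```

With ON_APPLY the record is visible to readers on the standby at acknowledgment time but not yet durable there; with ON_FLUSH it is the other way around.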

I also did some research on terminology that has been used in the
literature.  As Jim Gray describes it:

1-safe replication.  Transaction is committed when it has been locally
WAL-logged to durable storage.
Group-safe replication.  Transaction is committed when WAL has been
received by all remote servers, but not necessarily written to durable
storage.
Group-safe & 1-safe replication.  Transaction is committed when it has
been locally WAL-logged to durable storage and WAL has been received
by all remote servers.
2-safe replication.  Transaction is committed when it has been written
to durable storage on both local and remote servers.
Very safe replication.  As 2-safe, but fails any read-write
transaction if the secondary is down.
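
The list above can be read as a set of conditions on when a commit may be reported to the client. The following Python sketch encodes that reading; the function name commit_allowed and its boolean parameters are assumptions made for illustration, and what 2-safe does while the secondary is down is not spelled out in the list, so it is not modeled.

```python
def commit_allowed(level, local_durable, remote_received, remote_durable,
                   remote_up=True):
    """Return True if a transaction may be reported committed at this level."""
    if level == "1-safe":
        return local_durable
    if level == "group-safe":
        # Received by the remote servers, not necessarily on durable storage.
        return remote_received
    if level == "group-safe & 1-safe":
        return local_durable and remote_received
    if level == "2-safe":
        return local_durable and remote_durable
    if level == "very safe":
        # As 2-safe, but a read-write transaction fails if the secondary is down.
        return remote_up and local_durable and remote_durable
    raise ValueError(f"unknown safety level: {level}")
```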

(Actually, it appears that "Transaction Processing" (Jim Gray and
Andreas Reuter, 1993) uses 2-safe to refer to either 2-safe or
group-safe; the distinction between the two is a subsequent
development.  See e.g. Advances in Database Technology - EDBT 2004
by Elisa Bertino.)

The term of art for making sure that transactions committed on the
primary are visible on the secondary seems to be "one-copy
serializability" (see, for example, a Google Books search on that
term).

...Robert



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Tatsuo Ishii
> The point here is that synchronous replication, at least to some
> people, is going to imply that the user-visible states of the two
> copies are consistent.  To other people, it is going to imply that
> committed transactions will never be lost even in the event of a
> catastrophic loss of the primary 1 picosecond after the commit is
> acknowledged.  We need to choose some word that implies that we are
> guaranteeing the latter of these two things but not the former.
> Otherwise, we will have confused users, and terminological confusion
> when and if we ever implement the former as well.

Right. Before following this thread, I had thought that log-shipping
sync replication behaved like the former (and I had said so to people
in Japan who are interested in 8.4 development. Of course this is my
fault, though).

Now I understand that log-shipping sync replication does not behave
the same as other "sync replications" such as pgpool and PGCluster
(there may be more, but I don't know).
--
Tatsuo Ishii
SRA OSS, Inc. Japan



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Robert Haas
> The whole relabeling thing seems like a seriously silly idea. Why is
> it at all a shortcut to use "AS" instead of "," ?

Because a lot of times you don't want to relabel, so you omit the "AS
label" part altogether, and the label is deduced from the expression
itself.  For example, I don't need to write:

SELECT json(r.foo AS foo, r.bar AS bar, r.baz AS baz, r.bletch AS
quux) FROM rel r;

I can just write:

SELECT json(r.foo, r.bar, r.baz, r.bletch AS quux) FROM rel r;

...which is a lot more compact when the number of arguments is large.
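
To make the shortcut concrete, here is a small Python sketch of label deduction. It is only a model: expand_labels and deduce_label are hypothetical helpers, and the rule "use the part of the expression after the last dot" is an assumption about how a label might be deduced, not a description of any existing implementation.

```python
def deduce_label(expr):
    """Assumed rule: the label is the column name after the last dot."""
    return expr.rsplit(".", 1)[-1]


def expand_labels(args):
    """Turn a mixed argument list into explicit (expression, label) pairs.

    Each argument is either a bare expression string, or an
    (expression, label) tuple standing in for "expr AS label".
    """
    pairs = []
    for arg in args:
        if isinstance(arg, tuple):
            expr, label = arg
        else:
            expr, label = arg, deduce_label(arg)
        pairs.append((expr, label))
    return pairs
```

Under these assumptions, expand_labels(["r.foo", "r.bar", "r.baz", ("r.bletch", "quux")]) yields the same pairs as spelling out every label by hand.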

...Robert



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Jeff Davis
On Sat, 2008-12-13 at 22:23 -0500, Robert Haas wrote:
> > If it's guaranteed to be visible on the standby after it's committed on
> > the master, and you don't have any way to make it actually simultaneous,
> > then that implies that it's visible on the slave for some brief period
> > of time before it's committed on the master.
> >
> > That situation is still asymmetric, so why is that a better use of the
> > term "synchronous"?
> 
> Because that happens anyway.  If I request a commit on a single,
> unreplicated server, the server makes the commit visible to new
> transactions and then sends me a message informing me that the commit
> has completed.  Since the message takes some finite time to reach me,
> there is a window of time after the commit has completed and before I
> know that the commit has been completed.
> 

Oh, I see the distinction now.

Thanks for the detailed reply.

Regards,
Jeff Davis




Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
> Might it not be true that anybody unfamiliar would be confused and that this
> is a bit of a straw man?
[...]
> If my application assumes that it can commit to one server, and then read
> back the commit from another server, and my application breaks as a result,
> it's because I didn't understand the problem. Even if PostgreSQL didn't use
> the word "synchronous replication", I could still be confused. I need to
> understand the problem no matter what words are used.

That is certainly true.  But there is value in choosing words which
elucidate the situation as much as possible.

...Robert



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
> If it's guaranteed to be visible on the standby after it's committed on
> the master, and you don't have any way to make it actually simultaneous,
> then that implies that it's visible on the slave for some brief period
> of time before it's committed on the master.
>
> That situation is still asymmetric, so why is that a better use of the
> term "synchronous"?

Because that happens anyway.  If I request a commit on a single,
unreplicated server, the server makes the commit visible to new
transactions and then sends me a message informing me that the commit
has completed.  Since the message takes some finite time to reach me,
there is a window of time after the commit has completed and before I
know that the commit has been completed.

Suppose for the sake of argument that the single, unreplicated server
did these two tasks in the opposite order - namely, first, it sent a
message to the process requesting the commit stating that the commit
had completed, and only then made the transaction visible.  This would
create a race condition: the process requesting the commit might
receive the commit and begin a new transaction before the previous
transaction had been made visible, and would therefore not be able to
see the results of its own previous actions.  I think it's fair to say
that this behavior would be judged totally intolerable.
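
The ordering argument can be sketched in a few lines of Python. This is a deterministic model of the worst-case interleaving, not real concurrency: it assumes the client begins its next transaction the instant it receives the acknowledgment. The function name read_after_commit is hypothetical.

```python
def read_after_commit(ack_before_visible):
    """Return what the committing client reads in its next transaction."""
    visible = {}  # rows visible to newly started transactions

    def make_visible():
        visible["row"] = "new value"

    def client_reads_after_ack():
        # None means the client cannot see its own committed write.
        return visible.get("row")

    if ack_before_visible:
        seen = client_reads_after_ack()  # ack arrives first, client reads at once
        make_visible()                   # visibility comes too late
    else:
        make_visible()
        seen = client_reads_after_ack()
    return seen
```

With the normal order the client reads back its own write; with the reversed order it gets None, which is the intolerable case described above.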

Therefore, there can't possibly be any applications out there which
are depending on the fact that commits don't become visible until they
are acknowledged, but there very well could be some applications which
depend on the fact that once commits are acknowledged, they are
visible.  If replication is synchronous in this sense, then I can open
a connection to the master, write some data, close the connection,
open a new connection to the master or the slave (not caring which),
and read back the data that I just wrote (assuming no one else has
modified it in the meantime).  If it isn't, then I can't.  Some
people will not care about this, but some will.

The point here is that synchronous replication, at least to some
people, is going to imply that the user-visible states of the two
copies are consistent.  To other people, it is going to imply that
committed transactions will never be lost even in the event of a
catastrophic loss of the primary 1 picosecond after the commit is
acknowledged.  We need to choose some word that implies that we are
guaranteeing the latter of these two things but not the former.
Otherwise, we will have confused users, and terminological confusion
when and if we ever implement the former as well.

...Robert



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Jeff Davis
On Sat, 2008-12-13 at 21:35 -0500, Robert Haas wrote:
> On Sat, Dec 13, 2008 at 1:29 PM, Tom Lane  wrote:
> > "Robert Haas"  writes:
> >> I think we need to reserve the term "synchronous replication" for a
> >> system where transactions that begin at the same time on the primary
> >> and standby see the same tuples.  Clearly that is "more" synchronous
> >
> > We won't call it anything, because we never will or can implement that.
> > See the theory of relativity: the notion of exactly simultaneous events
> 
> OK, fine.  I'll be more precise.  I think we need to reserve the term
> "synchronous replication" for a system where transactions that begin
> on the standby after the transaction has committed on the master see
> the effects of the committed transaction.
> 

If it's guaranteed to be visible on the standby after it's committed on
the master, and you don't have any way to make it actually simultaneous,
then that implies that it's visible on the slave for some brief period
of time before it's committed on the master.

That situation is still asymmetric, so why is that a better use of the
term "synchronous"?

Regards,
Jeff Davis






Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
On Sat, Dec 13, 2008 at 1:29 PM, Tom Lane  wrote:
> "Robert Haas"  writes:
>> I think we need to reserve the term "synchronous replication" for a
>> system where transactions that begin at the same time on the primary
>> and standby see the same tuples.  Clearly that is "more" synchronous
>
> We won't call it anything, because we never will or can implement that.
> See the theory of relativity: the notion of exactly simultaneous events

OK, fine.  I'll be more precise.  I think we need to reserve the term
"synchronous replication" for a system where transactions that begin
on the standby after the transaction has committed on the master see
the effects of the committed transaction.

> at distinct locations isn't even well-defined, because observers at yet
> other locations will disagree about what is "simultaneous".  And I'm
> not just making a joke here --- speed-of-light delays in a WAN are
> meaningful compared to current computer speeds.  In practice, the
> slave and the master will never commit at exactly the same time.
>
> I agree with the point made upthread that we should use the term
> "synchronous replication" the way it's commonly used in the industry.
> Inventing our own terminology might be fun but it's not really going
> to result in less confusion.

I just googled "synchronous replication" and read through the first
page of hits.  Most of them do not address the question of whether
synchronous replication can be said to have completed when WAL has
been received by the standby but not yet applied.  One of the ones
that does is:

http://code.google.com/p/google-mysql-tools/wiki/SemiSyncReplicationDesign

...which refers to what we're proposing to call "Synchronous
Replication" as "Semi-Synchronous Replication" (or 2-safe replication)
specifically to distinguish it.  The other is:

http://www.cnds.jhu.edu/pub/papers/cnds-2002-4.pdf

...which doesn't specifically examine the issue but seems to take the
opposite position, namely that the server on which the transaction is
executed needs to wait only for one server to apply the changes to the
database (the others need only to know that they need to commit it;
they don't actually need to have done it).  However, that same paper
refers to two-phase commit as a synchronous replication algorithm, and
Wikipedia's discussion of two-phase commit:

http://en.wikipedia.org/wiki/Two-phase_commit_protocol

...clearly implies that the transaction must be applied everywhere
before it can be said to have committed.

The second page of Google results is mostly a further discussion of
the MySQL solution, which is mostly described as "semi-synchronous
replication".

Simon Riggs said upthread that Oracle called this "synchronous redo
transport".  That is obviously much closer to what we are doing than
"synchronous replication".

...Robert



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Greg Stark
On Sun, Dec 14, 2008 at 1:42 AM, Robert Haas  wrote:
>> What if relabeling support were to spread some more?
>
> The only example I can think of besides XML is JSON.  There might be a
> few more.  Basically, relabelling is a handy shortcut when you are
> serializing data and want to avoid specifying a list of columns and an
> (almost) identical list of labels.

The whole relabeling thing seems like a seriously silly idea. Why is
it at all a shortcut to use "AS" instead of "," ? The relabeling adds
zero actual expressiveness, it just makes a fancy way to pass an
argument.





-- 
greg



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Robert Haas
> What if relabeling support were to spread some more?

The only example I can think of besides XML is JSON.  There might be a
few more.  Basically, relabelling is a handy shortcut when you are
serializing data and want to avoid specifying a list of columns and an
(almost) identical list of labels.  Otherwise, it's not good for much.
I think we should eventually aim to support user-defined functions
that work like this, because people will forever be inventing new ways
to serialize things and it'd be nice not to have to recompile to add
support for a new one.

I suppose you might want to do something like this:

html_input(foo) returns 
html_input(foo AS bar) returns 
html_input(foo, type: hidden) returns 

...Robert



Re: [HACKERS] Future request: BgBouncer && "cache lookup failed for function": Auto recache function.

2008-12-13 Thread Joshua D. Drake
On Sun, 2008-12-14 at 03:28 +0300, Oleg Serov wrote:
> Hello!. I'm using PgBouncer with permanent connection, So, when i
> deleting(or editing?) some functions i have an error
> ERROR:  cache lookup failed for function ..;
> Can you make recaching of invalidate functions?

I believe it already does that if you are running 8.3.

Joshua D. Drake



> 
-- 
PostgreSQL
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997




Re: [HACKERS] Stats target increase vs compute_tsvector_stats()

2008-12-13 Thread Greg Stark
I don't quite know how this data is used, but any constant factor
seems like it would be arbitrary. It sounds like a more principled
algorithm would be to use stats_target^2. But that has the same
problem. Even stats_target^1.5 would be too big for stats_target 10,000.


I think just using 10 is probably the right thing.

--
Greg


On 13 Dec 2008, at 13:02, Tom Lane  wrote:


I started making the changes to increase the default and maximum stats
targets 10X, as I believe was agreed to in this thread:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00386.php

I came across this bit in ts_typanalyze.c:

   /* We want statistic_target * 100 lexemes in the MCELEM array */
   num_mcelem = stats->attr->attstattarget * 100;

I wonder whether the multiplier here should be changed?  This code is
new for 8.4, so we have zero field experience about what desirable
lexeme counts are; but the prospect of up to a million lexemes in
a pg_statistic entry doesn't seem quite right.  I'm tempted to cut the
multiplier to 10 so that the effective range of MCELEM sizes remains
the same as what Jan had in mind when he wrote the code.
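
The sizes under discussion are easy to check with a few lines of Python. This is only arithmetic on the numbers mentioned in this thread; the old maximum stats target of 1000 is an assumption implied by the 10X increase, and the function names are hypothetical.

```python
import math

MULTIPLIER_CURRENT = 100   # multiplier in the ts_typanalyze.c snippet above
MULTIPLIER_PROPOSED = 10   # the reduced multiplier being considered
OLD_MAX_TARGET = 1000      # assumption: maximum stats target before the 10X increase
NEW_MAX_TARGET = 10000     # maximum stats target after the 10X increase


def mcelem_linear(stats_target, multiplier):
    """MCELEM size with a constant multiplier."""
    return stats_target * multiplier


def mcelem_power_1_5(stats_target):
    """stats_target^1.5, exact only when stats_target is a perfect square."""
    return stats_target * math.isqrt(stats_target)


def mcelem_squared(stats_target):
    """stats_target^2."""
    return stats_target ** 2
```

At the new maximum, the current multiplier gives a million lexemes, the proposed multiplier of 10 gives 100,000 (the same as the old maximum with the multiplier of 100), stats_target^1.5 also gives a million, and stats_target^2 gives a hundred million.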

   regards, tom lane





Re: [HACKERS] array_to_string(anyarray, text) that was working in 8.1 is not working in 8.3

2008-12-13 Thread Tom Lane
Greg Stark  writes:
> Huh, I didn't realize that ever worked in the past. I thought the way
> to do what the OP describes was to cast it to text[] or whatever
> datatype you know from out-of-band knowledge to expect.

We don't seem to allow that either ...

regression=# select array_to_string(histogram_bounds::text[],  '-') from 
pg_stats where attname = 'unique2' and tablename = 'tenk1';
ERROR:  cannot cast type anyarray to text[]
LINE 1: select array_to_string(histogram_bounds::text[],  '-') from ...
   ^


regards, tom lane



Re: [HACKERS] array_to_string(anyarray, text) that was working in 8.1 is not working in 8.3

2008-12-13 Thread Greg Stark
Huh, I didn't realize that ever worked in the past. I thought the way
to do what the OP describes was to cast it to text[] or whatever
datatype you know from out-of-band knowledge to expect.


--
Greg


On 13 Dec 2008, at 19:38, Tom Lane  wrote:


Corey Horton  writes:

I'm trying to use array_to_string on the pg_stats column
histogram_bounds...
test83=# select array_to_string(histogram_bounds::anyarray, '-') from
pg_stats where attname = 'id' and tablename = 'widgets';
ERROR:  argument declared "anyarray" is not an array but type anyarray

In 8.1, it worked fine...


Hmm.  This seems to have been broken in this patch:
http://archives.postgresql.org/pgsql-committers/2008-01/msg00173.php
which was in response to this complaint:
http://archives.postgresql.org/pgsql-bugs/2008-01/msg00029.php
and was attempting to prevent that same failure message in a different
context :-(.  I guess we forgot that pg_statistic makes it possible that
the *actual* datatype passed to a function could be anyarray.

While we could probably revert just enough of the changes to
enforce_generic_type_consistency to allow this case again, I wonder
just how safe that'd really be.  It would amount to expecting that
functions that take anyarray but don't take or return anyelement to
not only work on any array type, but to be always prepared for the
input element type to change on-the-fly (since that's exactly what
would happen when scanning pg_statistic).  Quite a lot of the built-in
anyarray functions are prepared to do that, but I'm not sure they all
are.

Are we prepared to re-open what could be at least a risk of crashing
bugs, in order to support this type of usage?  I have to admit that
it's nice to be able to process the pg_statistic columns like this
--- I've done it myself.  And we'd not heard any reports of problems
with it before 8.3.

   regards, tom lane

--
Sent via pgsql-sql mailing list (pgsql-...@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql




Re: [HACKERS] Future request: BgBouncer && "cache lookup failed for function": Auto recache function.

2008-12-13 Thread Tom Lane
"Oleg Serov"  writes:
> Hello!. I'm using PgBouncer with permanent connection, So, when i
> deleting(or editing?) some functions i have an error
> ERROR:  cache lookup failed for function ..;

You're going to need to explain exactly what you're doing if you
want help with that.  However, if the answer is that you're doing
DROP/CREATE of existing functions, then the fix is to use CREATE OR
REPLACE FUNCTION instead.
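
Why the two approaches differ can be sketched with a toy catalog in Python. This is a model under stated assumptions, not PostgreSQL code: it assumes a long-lived session has remembered a function's OID, that DROP followed by CREATE assigns a new OID, and that CREATE OR REPLACE keeps the existing one. The class FunctionCatalog and its methods are hypothetical.

```python
class FunctionCatalog:
    """Toy catalog: a function is resolved by name once, then called by OID."""

    def __init__(self):
        self._next_oid = 16384
        self._oid_by_name = {}
        self._body_by_oid = {}

    def create(self, name, body):
        oid = self._next_oid
        self._next_oid += 1
        self._oid_by_name[name] = oid
        self._body_by_oid[oid] = body
        return oid

    def drop(self, name):
        del self._body_by_oid[self._oid_by_name.pop(name)]

    def create_or_replace(self, name, body):
        if name in self._oid_by_name:
            oid = self._oid_by_name[name]  # same OID, new body
            self._body_by_oid[oid] = body
            return oid
        return self.create(name, body)

    def call(self, oid):
        if oid not in self._body_by_oid:
            raise LookupError(f"cache lookup failed for function {oid}")
        return self._body_by_oid[oid]()
```

A session holding the old OID keeps working after create_or_replace, but fails with the lookup error after drop plus create, because the recreated function lives under a different OID.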

regards, tom lane



Re: [HACKERS] [SQL] array_to_string(anyarray, text) that was working in 8.1 is not working in 8.3

2008-12-13 Thread Tom Lane
Corey Horton  writes:
> I'm trying to use array_to_string on the pg_stats column 
> histogram_bounds...
> test83=# select array_to_string(histogram_bounds::anyarray, '-') from 
> pg_stats where attname = 'id' and tablename = 'widgets';
> ERROR:  argument declared "anyarray" is not an array but type anyarray
> In 8.1, it worked fine...  

Hmm.  This seems to have been broken in this patch:
http://archives.postgresql.org/pgsql-committers/2008-01/msg00173.php
which was in response to this complaint:
http://archives.postgresql.org/pgsql-bugs/2008-01/msg00029.php
and was attempting to prevent that same failure message in a different
context :-(.  I guess we forgot that pg_statistic makes it possible that
the *actual* datatype passed to a function could be anyarray.

While we could probably revert just enough of the changes to
enforce_generic_type_consistency to allow this case again, I wonder
just how safe that'd really be.  It would amount to expecting that
functions that take anyarray but don't take or return anyelement to
not only work on any array type, but to be always prepared for the
input element type to change on-the-fly (since that's exactly what
would happen when scanning pg_statistic).  Quite a lot of the built-in
anyarray functions are prepared to do that, but I'm not sure they all
are.

Are we prepared to re-open what could be at least a risk of crashing
bugs, in order to support this type of usage?  I have to admit that
it's nice to be able to process the pg_statistic columns like this
--- I've done it myself.  And we'd not heard any reports of problems
with it before 8.3.

regards, tom lane



[HACKERS] Future request: BgBouncer && "cache lookup failed for function": Auto recache function.

2008-12-13 Thread Oleg Serov
Hello! I'm using PgBouncer with permanent connections, so when I
delete (or edit?) some functions I get an error:
ERROR:  cache lookup failed for function ..;
Could you make it re-cache invalidated functions?



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Mark Mielke

Markus Wanner wrote:

>> I don't think synchronous replication guarantees that it will be
>> immediately visible. Even if it did push the change to the other
>> machine, and the other machine had committed it, that doesn't guarantee
>> that any reader sees it any more than if I commit to the same machine
>> (no replication), I am guaranteed to see the change from another
>> session.



> AFAIK every snapshot taken after a transaction has acknowledged its
> commit is guaranteed to see changes from that transaction. Isn't that a
> pretty frequent and obvious user expectation?
  


Yes - but that's only really true while the session continues. From 
another session? I've never assumed that I could reconnect and be 
guaranteed to get the latest snapshot that includes absolutely 
everything that has been committed.


Any system that guaranteed this even when involving multiple machines 
would be guaranteed to be inefficient and difficult to scale in my 
opinion. How could any system promise to have reasonable commit times 
while also guaranteeing that once a commit completes, any session to any 
other server will be able to see the commit? I think this forces some 
sort of serialization between multiple machines and defeats the purpose 
of having multiple machines. Where before it was indeterminate to know
when the commit would take effect at each replica, it's now
indeterminate when my commit will succeed. That is, my commit cannot
succeed until every single server acknowledges that it has fully
received and committed my transaction. What happens if there are network
problems, or what happens if I am replicating over a slower link? What 
if I am committing to 100 servers? Is it reasonable to expect 100 server 
negotiations to complete in full before my own commit will return?



>> Synchronous replication only means that I can be assured that
>> my change has been saved permanently by the time my commit completes. It
>> doesn't mean anybody else can see my change or is guaranteed to see my
>> change if they query from another session.


> So you wouldn't be surprised if a transaction from two hours ago isn't
> visible on another node, just because that node happens to be rather
> busy with lots of other readers and maintenance tasks?
  


Any system that is two hours behind should fall out of the pool used to
satisfy reads. So, if there was a surprise, it would be this. I
don't believe ACID requires that a commit on one server is immediately
visible on another server. Any work I do on the "behind" server would
still be safe from a transaction and referential integrity perspective.
However, if I executed 'commit' on this "behind" server, I would expect
the commit to wait until it catches up, or in the case of a server 2 hours
behind, I would expect the commit to fail. Look at the alternative - all
commits to any server in the pool would be locked up waiting for this
one machine to catch up on 2 hours of transactions. This emphasizes that
the problem is that a server two hours out of date is still in the pool,
rather than the problem being keeping things up-to-date.




>> If my application assumes that it can commit to one server, and then
>> read back the commit from another server, and my application breaks as a
>> result, it's because I didn't understand the problem.


> Well, yeah, depends on user expectations. I'm surprised to hear that you
> have that understanding of synchronous replication.
  


I've seen people face it in the past. Most recently we had a 
presentation from the developer of digg.com, and he described how he had 
this problem with MySQL and that he had to work around it.


On a smaller scale and slightly unrelated, I had this problem frequently 
between memcache and PostgreSQL. That is, memcache would always be 
latest, but PostgreSQL might not be latest, because the commit had not 
occurred.


It seems like a standard enough problem to me. I don't expect Postgres-R 
to do the impossible. As with my previous paragraph, I don't expect 
Postgres-R to wait 2 hours to commit just because one server is falling
behind.



>> Even if PostgreSQL
>> didn't use the word "synchronous replication", I could still be
>> confused. I need to understand the problem no matter what words are used.



> As said, it depends on what the common understanding of "synchronous
> replication" is. I've so far been under the impression, that these
> potential lags are unexpected and confusing. Several people pointed me
> at that problem and I've thus "relabeled" Postgres-R as not being
> synchronous. I'm at least surprised to suddenly get pushed into the
> other direction. :-)
>
> However, I absolutely agree that it's not that important how we name it.
> What is important, is that users and developers understand the difference


I agree they are unexpected and confusing. I don't agree that they are 
unexpected or confusing to those knowledgeable in the domain. So, the 
question becomes - whose expectation is wrong? Should the user learn 
more? Or should we push for a c

[HACKERS] visibility map and reltuples

2008-12-13 Thread Ned T. Crigler
It appears that the visibility map patch is causing pg_class.reltuples to be
set improperly after a vacuum. For example, it is set to 0 if the map
indicated that no pages in the heap needed to be scanned.

Perhaps reltuples should not be updated unless every page was scanned during
the vacuum?
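
The suggestion can be sketched as a small Python function. This is a model of the proposed rule only, not the vacuum code: the function name updated_reltuples and its parameters are hypothetical, and it assumes the tuple count from a partial scan covers only the pages that were actually scanned.

```python
def updated_reltuples(old_reltuples, total_pages, scanned_pages, tuples_counted):
    """Return the value to store in reltuples after a vacuum.

    Only trust the new count when every heap page was scanned; otherwise
    the count covers just part of the table, so keep the old estimate.
    """
    if scanned_pages == total_pages:
        return tuples_counted
    return old_reltuples
```

In the case described above, where the map says no page needs scanning, scanned_pages is 0 and the old estimate is kept instead of being overwritten with 0.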

-- 
Ned T. Crigler



Re: [HACKERS] lifetime of TubleTableSlot* returned by ExecProcNode

2008-12-13 Thread Bramandia Ramadhana
I see. Thanks for the advice. I will look into how to use a tuplestore
object.

Regards,

Bramandia R.

On Sat, Dec 13, 2008 at 10:36 AM, Tom Lane  wrote:

> "Bramandia Ramadhana"  writes:
> > Hmm how if an upper level node needs to store (for future use) the
> > TupleTableSlot* returned by lower level node, e.g. I create a specialized
> > Sort Node which needs to read all tuples from lower level nodes. In this
> > case, would it be necessary and sufficient to make a copy of the
> > TupleTableSlot
>
> It would be a pretty crummy way to approach it, because a Slot is not
> intended to be a compact representation.  You probably want to use a
> tuplestore or tuplesort object instead.
>
>regards, tom lane
>


Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi,

Mark Mielke wrote:
> Might it not be true that anybody unfamiliar would be confused and that
> this is a bit of a straw man?

Might be. I've neglected the issue myself for a while.

> I don't think synchronous replication guarantees that it will be
> immediately visible. Even if it did push the change to the other
> machine, and the other machine had committed it, that doesn't guarantee
> that any reader sees it any more than if I commit to the same machine
> (no replication), I am guaranteed to see the change from another
> session.

AFAIK every snapshot taken after a transaction has acknowledged its
commit is guaranteed to see changes from that transaction. Isn't that a
pretty frequent and obvious user expectation?

> Synchronous replication only means that I can be assured that
> my change has been saved permanently by the time my commit completes. It
> doesn't mean anybody else can see my change or is guaranteed to see my
> change if they query from another session.

So you wouldn't be surprised if a transaction from two hours ago isn't
visible on another node, just because that node happens to be rather
busy with lots of other readers and maintenance tasks?

> If my application assumes that it can commit to one server, and then
> read back the commit from another server, and my application breaks as a
> result, it's because I didn't understand the problem.

Well, yeah, depends on user expectations. I'm surprised to hear that you
have that understanding of synchronous replication.

> Even if PostgreSQL
> didn't use the word "synchronous replication", I could still be
> confused. I need to understand the problem no matter what words are used.

As said, it depends on what the common understanding of "synchronous
replication" is. I've so far been under the impression, that these
potential lags are unexpected and confusing. Several people pointed me
at that problem and I've thus "relabeled" Postgres-R as not being
synchronous. I'm at least surprised to suddenly get pushed into the
other direction. :-)

However, I absolutely agree that it's not that important how we name it.
What is important, is that users and developers understand the difference.

Regards

Markus Wanner




Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi,

Aidan Van Dyk wrote:
> Well, I think the PG MVCC (which wal-streaming just ships across
> somewhere else) will save that.  So with hot-standby another client
> could see the result *after* the COMMIT has been
> requested, but *before* the COMMIT returns...  But we have this
> situation in a single current PG instance anyways, so it's nothing
> new

AFAIU the proposed algorithm only waits until WAL is written on the
slave before acknowledging COMMIT. Application of the changes may be
deferred, so it's not necessarily immediately visible on the slave.

> But with hot-standby, I could also see that it could be done such that
> the wal-stream is fsynced to disk (i.e. xlog) and acknowledged, but
> because of a current running query, application of it is delayed...  But
> this is hot-standby's problem of describing itself, not sync-rep.

I'm thinking of the overall system and don't care much if it's
hot-standby's or sync-rep's problem. But it's certainly the master which
needs to await certain acknowledgments of the slaves. That has so far
been discussed within this sync-rep thread.

> IMHO, sync-rep is about getting the change "durably to a slave" before
> acknowledging the COMMIT.  That slave could be any number of things:
> - A "WAL archive" type system having the ability to be used for
>   recovery
> - A PG with special "recovery mode" that reads the stream and applies it
> - A full hot-standby recovery
> 
> I could see any and all of those (and probably others) being useful and
> used.

I certainly agree with that.

Regards

Markus Wanner



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Mark Mielke

Markus Wanner wrote:
> Tom Lane wrote:
>> We won't call it anything, because we never will or can implement that.
>> See the theory of relativity: the notion of exactly simultaneous events
>> at distinct locations isn't even well-defined
>
> That has never been the point of the discussion. It's rather about the
> question if changes from transactions are guaranteed to be visible on
> remote nodes immediately after commit acknowledgment. Whether or not
> this is guaranteed, in both cases the term "synchronous replication" is
> commonly used, which is causing confusion.


Might it not be true that anybody unfamiliar would be confused and that 
this is a bit of a straw man?


I don't think synchronous replication guarantees that it will be 
immediately visible. Even if it did push the change to the other 
machine, and the other machine had committed it, that doesn't guarantee 
that any reader sees it any more than if I commit to the same machine 
(no replication), I am guaranteed to see the change from another 
session. Synchronous replication only means that I can be assured that 
my change has been saved permanently by the time my commit completes. It 
doesn't mean anybody else can see my change or is guaranteed to see my 
change if they query from another session.


If my application assumes that it can commit to one server, and then 
read back the commit from another server, and my application breaks as a 
result, it's because I didn't understand the problem. Even if PostgreSQL 
didn't use the word "synchronous replication", I could still be 
confused. I need to understand the problem no matter what words are used.


Cheers,
mark

--
Mark Mielke 



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Aidan Van Dyk
* Markus Wanner  [081213 16:33]:
> Hi,
> 
> Hannu Krosing wrote:
> > You can have a variant of sync rep + hot standby where the master does
> > not return committed before the slave has both synced the data and
> > replayed the transaction so that it is visible on slave, but in that case
> > you may have a use case where it is actually visible on slave _before_
> > it is visible on master.
> 
> As long as it's not visible *before* the client requests a COMMIT, that
> certainly doesn't matter (because the application cannot check that).

Well, I think the PG MVCC (which wal-streaming just ships across
somewhere else) will save that.  So with hot-standby another client
could see the result *after* the COMMIT has been
requested, but *before* the COMMIT returns...  But we have this
situation in a single current PG instance anyways, so it's nothing
new

But with hot-standby, I could also see that it could be done such that
the wal-stream is fsynced to disk (i.e. xlog) and acknowledged, but
because of a current running query, application of it is delayed...  But
this is hot-standby's problem of describing itself, not sync-rep.

IMHO, sync-rep is about getting the change "durably to a slave" before
acknowledging the COMMIT.  That slave could be any number of things:
- A "WAL archive" type system having the ability to be used for
  recovery
- A PG with special "recovery mode" that reads the stream and applies it
- A full hot-standby recovery

I could see any and all of those (and probably others) being useful and
used.

But the current patch focuses on the streaming (sending) and on a
receiver "recovery" mode that can accept/apply the stream, again
without worrying about actually running queries (yet) ...

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Dimitri Fontaine


On 13 Dec 08, at 22:32, Tom Lane wrote:
> Spread to what?  AFAICS the way that XMLELEMENT uses AS is a
> single-purpose wart

OK, now that explains it.
The Common Lisp-inspired syntax is only nice if we're to avoid using
AS, which I thought was the situation. Sorry for some more confusion
here.


Regards,
--
dim







Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Tom Lane
Dimitri Fontaine  writes:
> What if relabeling support were to spread some more?

Spread to what?  AFAICS the way that XMLELEMENT uses AS is a
single-purpose wart (much like a lot of other stuff the SQL committee
invents :-().  I do not see a need to reserve AS in function argument
lists for that purpose.  In any case, the proposed meaning here is only
relevant to functions that expose names for their parameters; so in
principle you could still do something like what XMLELEMENT does for any
function that does not create names for its parameters.

regards, tom lane



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi,

Hannu Krosing wrote:
> You can have a variant of sync rep + hot standby where the master does
> not return committed before the slave has both synced the data and
> replayed the transaction so that it is visible on slave, but in that case
> you may have a use case where it is actually visible on slave _before_
> it is visible on master.

As long as it's not visible *before* the client requests a COMMIT, that
certainly doesn't matter (because the application cannot check that).

What matters is, that an application might expect a node to show the
changes of a transaction which has previously (seen from the application
itself) been committed and acknowledged by another node.

AFAICT the common understanding of synchronous replication is, that all
nodes confirm to have committed the changes of a transaction *before*
acknowledging COMMIT to the application (and obviously only *after* the
application requested to COMMIT the transaction, so the guarantee is
that all nodes commit *sometime* within that time frame, which is
certainly possible to guarantee, see 2PC approaches).

This guarantee is not provided by the Postgres-R algorithm, nor by the
approach presented. Both only guarantee, that the transaction *will* get
committed (and thus get visible) on all nodes *sometime* *after* the
application requested to commit it (even in case of various failures,
that is) [1]. As cited before, that has been enough of a reason for Jan
Wieck to call Postgres-R asynchronous, and I certainly see his point.

Note that the amount of time that passes between the commit
acknowledgment and the actual commit on remote nodes may theoretically
be infinitely long. And in practice certainly long enough for an
application to notice the difference. However, it still is a practical
optimization, because most applications should cope with it just fine.
But not all...
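
To make the gap between commit acknowledgment and remote visibility concrete, here is a minimal sketch. It assumes, purely for illustration, that WAL positions are plain byte offsets and that a standby tracks how far it has written and how far it has applied; StandbyState and both functions are hypothetical names, not part of the patch under discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical standby state: WAL positions as plain byte offsets. */
typedef struct StandbyState
{
    uint64_t written_lsn;   /* WAL durably written on the standby */
    uint64_t applied_lsn;   /* WAL replayed, i.e. visible to readers there */
} StandbyState;

/* The commit cannot be lost once its WAL record is written on the standby. */
bool
commit_is_durable(const StandbyState *s, uint64_t commit_lsn)
{
    return s->written_lsn >= commit_lsn;
}

/* The commit is visible to standby readers only once it has been applied. */
bool
commit_is_visible(const StandbyState *s, uint64_t commit_lsn)
{
    return s->applied_lsn >= commit_lsn;
}
```

An application that commits on one node and immediately reads from another can observe the window in which the first function returns true and the second still returns false.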

Do you consider the proposed log shipping approach to be synchronous?
How about the Postgres-R algorithm?

Regards

Markus Wanner

[1]: of course these approaches also guarantee that the transaction is
committed on the local node *before* acknowledging commit, so that
subsequent (seen from the application) queries are guaranteed to see the
changes. But that guarantee only holds true for the local node.



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi,

Simon Riggs wrote:
>> Hot Standby (although the latter
>> seems to have stalled a bit...)
> 
> It's just being worked on asynchronously. ;-)

LOL, thanks for bringing humor into this discussion :-)

Regards

Markus Wanner



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Dimitri Fontaine


On 13 Dec 08, at 17:05, Tom Lane wrote:
> I personally agree that AS seems more SQL-ish, but that's in the eye
> of the beholder.

So do I, but I fear it's already taken for another meaning.

> The argument about ambiguity in XMLELEMENT is bogus because XMLELEMENT
> doesn't (and won't) have named parameters.

My concern is the other way around. This function provides support for
argument relabeling, but reading some other threads here I think we
don't yet support this feature for user-defined functions. Or maybe
only for C-language user-defined functions.

What if relabeling support were to spread some more?
My point is that we couldn't offer a generalization of an existing
feature if we reuse AS for default parameter values. Or the user would
have to choose between having more than one argument with a default
value and relabeling support. That would be awkward.

Now, it could very well be that the point does not exist, but someone
would have to explain why to me, because I'm sure not getting it by
myself...


Regards,
--
dim






Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi,

Tom Lane wrote:
> We won't call it anything, because we never will or can implement that.
> See the theory of relativity: the notion of exactly simultaneous events
> at distinct locations isn't even well-defined

That has never been the point of the discussion. It's rather about the
question if changes from transactions are guaranteed to be visible on
remote nodes immediately after commit acknowledgment. Whether or not
this is guaranteed, in both cases the term "synchronous replication" is
commonly used, which is causing confusion.

Regards

Markus Wanner




Re: [HACKERS] contrib/pg_stat_statements 1212

2008-12-13 Thread Euler Taveira de Oliveira
ITAGAKI Takahiro escreveu:

> - A new GUC variable 'explain_analyze_format' is added.
I'm afraid that this variable name doesn't tell what it means. What about
'explain_analyze_stats_format' or even 'explain_stats_format'?


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Hannu Krosing
On Sat, 2008-12-13 at 21:35 +0200, Hannu Krosing wrote:

> We still could call Sync Rep as a feature "synchronous replication" on
> basis that "WAL Streaming - Synchronous Write" is the highest security
> level achievable using the feature.
> 
> And maybe have Sync Hot Standby as a feature on top of that which
> provides "WAL Streaming - Synchronous Apply"

Or maybe better call it Serializable Hot Standby, as the actual
guarantee that can be achieved is that when one client does something on
master and after committing on master starts another transaction on
slave, then the effects of query on master are visible on slave.


-- 
--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training




Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Hannu Krosing
On Sat, 2008-12-13 at 13:05 -0500, Robert Haas wrote:
> > I certainly agree to using such terms. Unfortunately, in my experience,
> > synchronous replication is commonly used to mean that transactions are
> > guaranteed to be immediately visible on remote nodes after the client
> > got commit acknowledgment. That's the cause for confusion I'm envisioning.
> 
> I think that's a very important point.  It's very possible that 8.4
> may support both this feature and Hot Standby (although the latter
> seems to have stalled a bit...).  That makes me think "oh, great, I
> can offload any subset of my read-only queries to the standby".  Not
> so fast.
> 
> I think we need to reserve the term "synchronous replication" for a
> system where transactions that begin at the same time on the primary
> and standby see the same tuples.

Define "same time". 

You can have a variant of sync rep + hot standby where the master does
not return committed before the slave has both synced the data and
replayed the transaction so that it is visible on slave, but in that case
you may have a use case where it is actually visible on slave _before_
it is visible on master.

Actually, you can't have that "same time" guarantee even on a single
system. That is, if you start two transactions on two connections "at the
same time", you still can't be sure that no third transaction has
committed between those two, making the visible data differ between
them.


>  Clearly that is "more" synchronous
> than what is being proposed here; if we call this "synchronous
> replication", what will we call that?  "Really Synchronous, Honest, No
> Kidding"?   Admittedly, we may never implement that feature, but that
> seems irrelevant.
> 
> It would be useful to have names for all the different possibilities.
>  Random ideas:
> 
> Log Shipping.  After each log switch, the previous WAL log is copied
> to the standby in its entirety.
> 
> WAL Streaming - Asynchronous.  The WAL log is streamed from master to
> standby as it is written, but transactions on the master never wait.
> 
> WAL Streaming - Synchronous Receive.  The WAL log is streamed from
> master to standby as it is written, and transactions on the master
> wait until the standby acknowledges receipt of the WAL.
> 
> WAL Streaming - Synchronous Write.  The WAL log is streamed from
> master to standby as it is written, and transactions on the master
> wait until the standby acknowledges that the WAL has been written to
> disk.
> 
> WAL Streaming - Synchronous Apply.  The WAL log is streamed from
> master to standby as it is written, and transactions on the master
> wait until the standby acknowledges that WAL has been written to disk
> and applied.

We still could call Sync Rep as a feature "synchronous replication" on
basis that "WAL Streaming - Synchronous Write" is the highest security
level achievable using the feature.

And maybe have Sync Hot Standby as a feature on top of that which
provides "WAL Streaming - Synchronous Apply"



--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training




Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Simon Riggs

On Sat, 2008-12-13 at 13:05 -0500, Robert Haas wrote:

> Hot Standby (although the latter
> seems to have stalled a bit...)

It's just being worked on asynchronously. ;-)

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Aidan Van Dyk
Synchronous replication, "sync rep", is *not* interested in the "slave's
visibility of the commit", because PostgreSQL doesn't "serve" requests
when in recovery (WAL-receiving) mode *now*.

This sync rep patch/proposal/discussion is *strictly* (at this point;
hot standby may eventually, or hopefully soon, change that) the means to
get the data "safely in 2 separate places" before the COMMIT returns,
by means of WAL streaming.  That "safely in 2 places" can have various
implementation options (like received, on disk, or applied), and
Fujii-san explained some of the options as to what to consider "safe",
and their trade-offs, in his presentation last year.

Once both sync-rep (WAL streaming to get changes in two places) and
hot-standby (running queries while WAL is being applied) are available in
PostgreSQL, at that point we might need to start worrying about "other
client visibility", but even then, we still don't need to worry about
multi-master options...

a.


* Markus Wanner  [081213 12:17]:
> Hi,
> 
> Simon Riggs wrote:
> > On Sat, 2008-12-13 at 14:07 +0100, Markus Wanner wrote:
> >> Speaking of a "synchronous commit"
> >> is utterly misleading, because the commit itself is exactly the thing
> >> that's *not* synchronous.
> > 
> > Not really sure where you're going here.
> 
> I'm pointing to a potential misunderstanding, trying to help to prevent
> you from running into the same issues and discussions as I did.
> 
> I've learned the hard way, that the Postgres-R algorithm is not fully
> synchronous (in the strict sense). This caused confusion for people who
> take the word "synchronous" by its original meaning. The algorithm
> proposed here seems similar enough to potentially cause the same confusion.
> 
> As I see it now, I think it's well worth pointing out the difference,
> from both the technical and the marketing perspective. The
> former for better understanding, the latter to prevent users from
> thinking it must be slow by definition. Arguing that your approach is
> not fully synchronous definitely helps address that concern.
> 
> However, I'm just now realizing, that the difference is only relevant as
> soon as you begin to allow read-only access on the slave. AFAIK that's
> among the goals of this effort, no?
> 
> > "synchronous replication" is
> > used exactly as described in the Wikipedia entry here:
> > http://en.wikipedia.org/wiki/Database_replication
> 
> That article describes pretty much all variants of replication, what
> exactly are you referring to?
> 
> Under "Database Replication > Multi-Master replication" it describes
> eager vs lazy variants, which is IMO a more appropriate and useful
> distinction than sync vs async. (But that's admittedly a sentence I've
> contributed myself, IIRC).
> 
> Under "Storage Replication > Synchronous Replication" one can read:
> "Write is not considered complete until acknowledgement by both local
> and remote storage." For the proposed approach this might hold true for
> WAL writing. However, the user certainly doesn't care how synchronously
> the log is shipped or written, as long as she doesn't see the
> changes on the slave.
> 
> That's the difference between fully synchronous and eager (or virtually
> or approximately synchronous) algorithms. You seem to refer to both as
> "synchronous". Phrases like "synchronous commit" or "synchronous data
> transfer" do not help me to understand what exactly you are talking about.
> 
> Explaining that the slave commits (and therefore makes the transactions
> visible) asynchronously would help. And it would prevent disappointment
> for users who expect changes to be immediately visible on the slave.
> 
> > No two word phrase is going to accurately sum up the complexity and
> > potential for data loss in these situations. DRBD saw that too and just
> > called them A, B and C and then describe them more accurately.
> 
> Agreed. I've chosen lazy, eager and sync, so far. I'm open for better
> terms, and I leave it up to you to call your variants whatever you like.
> But to understand what you are talking about, I'd prefer to get to know
> these distinctions crisp and clear.
> 
> > But I don't think we should say "PostgreSQL just implemented algorithm
> > B" which is just unhelpful. I don't think its "marketing" to refer to it
> > by the phrase most commonly used for the technology we are building.
> 
> I certainly agree to using such terms. Unfortunately, in my experience,
> synchronous replication is commonly used to mean that transactions are
> guaranteed to be immediately visible on remote nodes after the client
> got commit acknowledgment. That's the cause for confusion I'm envisioning.
> 
> 
> I'm hoping to be somewhat helpful to this effort of getting a log
> shipping replication variant into Postgres. It can only be beneficial
> for Postgres-R in that we gain field experience with ..uhm.. this
> special kind of replication, however we name it.
> 
> I'm already on xmas vacation, so I won't bother you any fu

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Tom Lane
"Robert Haas"  writes:
> I think we need to reserve the term "synchronous replication" for a
> system where transactions that begin at the same time on the primary
> and standby see the same tuples.  Clearly that is "more" synchronous
> than what is being proposed here; if we call this "synchronous
> replication", what will we call that?  "Really Synchronous, Honest, No
> Kidding"?   Admittedly, we may never implement that feature, but that
> seems irrelevant.

We won't call it anything, because we never will or can implement that.
See the theory of relativity: the notion of exactly simultaneous events
at distinct locations isn't even well-defined, because observers at yet
other locations will disagree about what is "simultaneous".  And I'm
not just making a joke here --- speed-of-light delays in a WAN are
meaningful compared to current computer speeds.  In practice, the
slave and the master will never commit at exactly the same time.
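
As a rough back-of-the-envelope check of that claim, the sketch below computes a one-way propagation delay and the number of CPU cycles that pass in the meantime. The 1000 km distance and the 3 GHz clock used in the usage note are assumptions picked for illustration, and real networks are slower than the vacuum speed of light used here.

```c
/* One-way propagation delay in seconds for a signal travelling at the
 * vacuum speed of light; a lower bound, since real links are slower. */
double
light_delay_seconds(double distance_m)
{
    return distance_m / 299792458.0;
}

/* Number of CPU clock cycles that elapse during the given delay. */
double
cycles_during(double seconds, double clock_hz)
{
    return seconds * clock_hz;
}
```

For 1000 km (1.0e6 m) the delay is about 3.3 ms one way, which at an assumed 3 GHz clock corresponds to roughly ten million cycles.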

I agree with the point made upthread that we should use the term
"synchronous replication" the way it's commonly used in the industry.
Inventing our own terminology might be fun but it's not really going
to result in less confusion.

regards, tom lane



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
> I certainly agree to using such terms. Unfortunately, in my experience,
> synchronous replication is commonly used to mean that transactions are
> guaranteed to be immediately visible on remote nodes after the client
> got commit acknowledgment. That's the cause for confusion I'm envisioning.

I think that's a very important point.  It's very possible that 8.4
may support both this feature and Hot Standby (although the latter
seems to have stalled a bit...).  That makes me think "oh, great, I
can offload any subset of my read-only queries to the standby".  Not
so fast.

I think we need to reserve the term "synchronous replication" for a
system where transactions that begin at the same time on the primary
and standby see the same tuples.  Clearly that is "more" synchronous
than what is being proposed here; if we call this "synchronous
replication", what will we call that?  "Really Synchronous, Honest, No
Kidding"?   Admittedly, we may never implement that feature, but that
seems irrelevant.

It would be useful to have names for all the different possibilities.
 Random ideas:

Log Shipping.  After each log switch, the previous WAL log is copied
to the standby in its entirety.

WAL Streaming - Asynchronous.  The WAL log is streamed from master to
standby as it is written, but transactions on the master never wait.

WAL Streaming - Synchronous Receive.  The WAL log is streamed from
master to standby as it is written, and transactions on the master
wait until the standby acknowledges receipt of the WAL.

WAL Streaming - Synchronous Write.  The WAL log is streamed from
master to standby as it is written, and transactions on the master
wait until the standby acknowledges that the WAL has been written to
disk.

WAL Streaming - Synchronous Apply.  The WAL log is streamed from
master to standby as it is written, and transactions on the master
wait until the standby acknowledges that WAL has been written to disk
and applied.
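
The five names above differ only in what the master waits for before acknowledging COMMIT. A minimal sketch of that rule follows; the enum, the struct, and master_may_ack_commit() are hypothetical names made up for this illustration and do not correspond to anything in the patch.

```c
#include <stdbool.h>

/* The replication variants listed above (identifiers are illustrative). */
typedef enum ReplicationMode
{
    LOG_SHIPPING,
    WAL_STREAM_ASYNC,
    WAL_STREAM_SYNC_RECEIVE,
    WAL_STREAM_SYNC_WRITE,
    WAL_STREAM_SYNC_APPLY
} ReplicationMode;

/* What the standby has acknowledged so far for a given commit record. */
typedef struct StandbyProgress
{
    bool received;  /* WAL arrived at the standby */
    bool written;   /* WAL written to disk on the standby */
    bool applied;   /* WAL replayed on the standby */
} StandbyProgress;

/* May the master acknowledge COMMIT to the client yet? */
bool
master_may_ack_commit(ReplicationMode mode, const StandbyProgress *p)
{
    switch (mode)
    {
        case LOG_SHIPPING:
        case WAL_STREAM_ASYNC:
            return true;                    /* the master never waits */
        case WAL_STREAM_SYNC_RECEIVE:
            return p->received;
        case WAL_STREAM_SYNC_WRITE:
            return p->written;
        case WAL_STREAM_SYNC_APPLY:
            return p->written && p->applied;
    }
    return false;                           /* not reached for valid modes */
}
```

Only the last mode ties the acknowledgment to the change having been applied on the standby.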

...Robert



[HACKERS] Stats target increase vs compute_tsvector_stats()

2008-12-13 Thread Tom Lane
I started making the changes to increase the default and maximum stats
targets 10X, as I believe was agreed to in this thread:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00386.php

I came across this bit in ts_typanalyze.c:

/* We want statistic_target * 100 lexemes in the MCELEM array */
num_mcelem = stats->attr->attstattarget * 100;

I wonder whether the multiplier here should be changed?  This code is
new for 8.4, so we have zero field experience about what desirable
lexeme counts are; but the prospect of up to a million lexemes in
a pg_statistic entry doesn't seem quite right.  I'm tempted to cut the
multiplier to 10 so that the effective range of MCELEM sizes remains
the same as what Jan had in mind when he wrote the code.
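
The arithmetic behind that suggestion can be checked with a small sketch. The maximum targets of 1000 (old) and 10000 (new) are assumptions inferred from the "10X" increase and the "up to a million lexemes" figure above; mcelem_size() is a hypothetical helper, not a function in ts_typanalyze.c.

```c
/* Assumed maximum statistics targets before and after the 10X increase. */
enum
{
    OLD_MAX_STATS_TARGET = 1000,
    NEW_MAX_STATS_TARGET = 10000
};

/* Size of the MCELEM array for a given statistics target and multiplier. */
long
mcelem_size(int stats_target, int multiplier)
{
    return (long) stats_target * multiplier;
}
```

Under these assumptions, mcelem_size(NEW_MAX_STATS_TARGET, 100) gives 1000000, while mcelem_size(NEW_MAX_STATS_TARGET, 10) equals mcelem_size(OLD_MAX_STATS_TARGET, 100), so a multiplier of 10 keeps the previous upper bound of 100000 lexemes.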

regards, tom lane



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Robert Haas
> I personally agree that AS seems more SQL-ish, but that's in the eye
> of the beholder.
>
> The argument about ambiguity in XMLELEMENT is bogus because XMLELEMENT
> doesn't (and won't) have named parameters.  But it is true that
> XMLELEMENT is doing something subtly different with the AS clause than
> what a named parameter would do; so you could argue that there's a
> potential for user confusion there.

It's not ambiguous unless for some reason you wanted to support doing
both of those things at the same time, but I'm having a hard time
coming up with a realistic use case for that.  Still, I think we
probably do want to at least leave the door open to do both things at
different times.  For the XMLELEMENT-type case, "value AS label" seems
far superior to "label: value", so if you're going to pick one syntax
for both things, it should be that one.

Alternatively, using "label: value" for identifying which parameter is
intended to get the value and "value AS label" for relabelling seems
OK too, though your argument about standards-compliance is a valid
one.

...Robert



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi,

Simon Riggs wrote:
> On Sat, 2008-12-13 at 14:07 +0100, Markus Wanner wrote:
>> Speaking of a "synchronous commit"
>> is utterly misleading, because the commit itself is exactly the thing
>> that's *not* synchronous.
> 
> Not really sure where you're going here.

I'm pointing to a potential misunderstanding, trying to help to prevent
you from running into the same issues and discussions as I did.

I've learned the hard way that the Postgres-R algorithm is not fully
synchronous (in the strict sense). This caused confusion for people who
take the word "synchronous" by its original meaning. The algorithm
proposed here seems similar enough to potentially cause the same confusion.

As I see it now, it's well worth pointing out the difference, from both
the technical and the marketing perspective. The former for better
understanding, the latter to prevent users from thinking it must be slow
by definition. Arguing that your approach is not fully synchronous
definitely helps to address that concern.

However, I'm just now realizing that the difference only becomes relevant
once you begin to allow read-only access on the slave. AFAIK that's
among the goals of this effort, no?

> "synchronous replication" is
> used exactly as described in the Wikipedia entry here:
> http://en.wikipedia.org/wiki/Database_replication

That article describes pretty much all variants of replication. What
exactly are you referring to?

Under "Database Replication > Multi-Master replication" it describes
eager vs lazy variants, which is IMO a more appropriate and useful
distinction than sync vs async. (But that's admittedly a sentence I've
contributed myself, IIRC).

Under "Storage Replication > Synchronous Replication" one can read:
"Write is not considered complete until acknowledgement by both local
and remote storage." For the proposed approach this might hold true for
WAL writing. However, the user certainly doesn't care how synchronously
the log is shipped or written, as long as she doesn't see the
changes on the slave.

That's the difference between fully synchronous and eager (or virtually
or approximately synchronous) algorithms. You seem to refer to both as
"synchronous". Phrases like "synchronous commit" or "synchronous data
transfer" do not help me to understand what exactly you are talking about.

Explaining that the slave commits (and therefore makes the transactions
visible) asynchronously would help. And it would prevent disappointment
for users who expect changes to be immediately visible on the slave.

> No two-word phrase is going to accurately sum up the complexity and
> potential for data loss in these situations. DRBD saw that too and just
> called them A, B and C and then described them more accurately.

Agreed. I've chosen lazy, eager and sync, so far. I'm open to better
terms, and I leave it up to you to call your variants whatever you like.
But to understand what you are talking about, I'd prefer to have
these distinctions crisp and clear.

> But I don't think we should say "PostgreSQL just implemented algorithm
> B" which is just unhelpful. I don't think it's "marketing" to refer to it
> by the phrase most commonly used for the technology we are building.

I certainly agree with using such terms. Unfortunately, in my experience,
synchronous replication is commonly used to mean that transactions are
guaranteed to be immediately visible on remote nodes after the client
got commit acknowledgment. That's the cause for confusion I'm envisioning.


I'm hoping to be somewhat helpful to this effort of getting a log
shipping replication variant into Postgres. It can only be beneficial
for Postgres-R in that we gain field experience with ..uhm.. this
special kind of replication, however we name it.

I'm already on xmas vacation, so I won't bother you any further on this
issue. Have fun coding and make sure to enjoy this time of the year.

All the best.

Markus Wanner




Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Grzegorz Jaskiewicz


On 2008-12-13, at 16:19, Tom Lane wrote:

> I'm sure it's technically possible, but I see no redeeming social value
> in it ... we should pick one and quit repainting the bike shed.

+1000




Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Bruce Momjian
David E. Wheeler wrote:
> On Dec 13, 2008, at 5:19 PM, Tom Lane wrote:
> 
> > I'm sure it's technically possible, but I see no redeeming social value
> > in it ... we should pick one and quit repainting the bike shed.
> 
> Well, as I've said, I'm okay with AS, though it's not my favorite. I  
> can see the argument that it's more likely to eventually make it into  
> the SQL standard. I don't suppose that the position of the label and  
> the value on either side of "AS" could be reversible, could it?
> 
>     SELECT foo( bar AS 'ick', 6 AS baz );
> 
> Probably not, I'm thinking?

Yea, probably not.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Tom Lane
"David E. Wheeler"  writes:
> I don't suppose that the position of the label and  
> the value on either side of "AS" could be reversible, could it?

No.  Consider

SELECT foo(bar AS baz) FROM ...

If the from-clause provides columns named both bar and baz, it would
be impossible to decide what is meant.
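
A small sketch of that ambiguity, written in Python purely for
illustration (it is not how the parser works). The helper name
`interpretations` and the column sets are hypothetical; it only
enumerates the readings of `x AS y` that would be possible if the label
and the value were allowed on either side of AS.

```python
def interpretations(left, right, columns, reversible):
    """Return the possible (value, label) readings of `left AS right`.

    `columns` is the set of column names the FROM clause provides.
    In this toy model a name can act as a value only if it is a column.
    """
    readings = []
    if left in columns:
        readings.append((left, right))   # usual order: value AS label
    if reversible and right in columns:
        readings.append((right, left))   # reversed order: label AS value
    return readings


both = {"bar", "baz"}
# Fixed order: a single reading, even though both names are columns.
print(interpretations("bar", "baz", both, reversible=False))
# Reversible order: two readings, so nothing can decide which is meant.
print(interpretations("bar", "baz", both, reversible=True))
```

With a fixed order the second name is always the label, so the presence
of a column called baz does not matter.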

regards, tom lane



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread David E. Wheeler

On Dec 13, 2008, at 5:19 PM, Tom Lane wrote:

> I'm sure it's technically possible, but I see no redeeming social value
> in it ... we should pick one and quit repainting the bike shed.


Well, as I've said, I'm okay with AS, though it's not my favorite. I  
can see the argument that it's more likely to eventually make it into  
the SQL standard. I don't suppose that the position of the label and  
the value on either side of "AS" could be reversible, could it?


  SELECT foo( bar AS 'ick', 6 AS baz );

Probably not, I'm thinking…

Best,

David


Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Tom Lane
"David E. Wheeler"  writes:
> On Dec 13, 2008, at 5:05 PM, Tom Lane wrote:
>> However, after looking at the precedent of XMLELEMENT, it's hard to  
>> deny that if the SQL committee ever chose to standardize named parameters,
>> AS is what they would use.  The chances that ":" would become the
>> standard are negligible --- that's not the sort of syntax they like
>> to standardize.

> Any chance that both "AS" and ":" could be supported, so that it's at  
> the discretion of the user?

I'm sure it's technically possible, but I see no redeeming social value
in it ... we should pick one and quit repainting the bike shed.

regards, tom lane



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread David E. Wheeler

On Dec 13, 2008, at 5:05 PM, Tom Lane wrote:

> However, after looking at the precedent of XMLELEMENT, it's hard to deny
> that if the SQL committee ever chose to standardize named parameters,
> AS is what they would use.  The chances that ":" would become the
> standard are negligible --- that's not the sort of syntax they like
> to standardize.


Any chance that both "AS" and ":" could be supported, so that it's at  
the discretion of the user?


Best,

David



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Bruce Momjian
Dimitri Fontaine wrote:
> Hi,
> 
> On 13 Dec 08 at 11:39, Peter Eisentraut wrote:
> > On Friday 12 December 2008 20:05:57 Tom Lane wrote:
> >> Excellent.  I checked that psql's colon-variable feature behaves the
> >> same.  So it looks like the proposed "name: value" syntax would indeed
> >> not break any existing features.  Barring better ideas I think we should
> >> go with that one.
> >
> > I personally thought that AS was a better idea.
> 
> It seems some people want to be able to overload some default  
> parameters (but not others) and at the same time alias them to some  
> new label. I'm not sure I understand it all, but it seems an example  
> of it would be like:
>     SELECT xml_function(a, b: 'foo' AS bar);
> 
> If this is what some people want when all the spare parts are bound  
> together, we don't have the option to use AS for both the meanings.

I agree "AS" is better.  And why would the "AS" above be inside the
parentheses?  I assume it would be:

SELECT xml_function(a, b: 'foo') AS bar;

Giving labels to parameters passed into functions makes no sense.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Tom Lane
Dimitri Fontaine  writes:
> On 13 Dec 08 at 11:39, Peter Eisentraut wrote:
>> I personally thought that AS was a better idea.

> It seems some people want to be able to overload some default  
> parameters (but not others) and at the same time alias them to some  
> new label. I'm not sure I understand it all, but it seems an example  
> of it would be like:
>     SELECT xml_function(a, b: 'foo' AS bar);

> If this is what some people want when all the spare parts are bound  
> together, we don't have the option to use AS for both the meanings.

I personally agree that AS seems more SQL-ish, but that's in the eye
of the beholder.

The argument about ambiguity in XMLELEMENT is bogus because XMLELEMENT
doesn't (and won't) have named parameters.  But it is true that
XMLELEMENT is doing something subtly different with the AS clause than
what a named parameter would do; so you could argue that there's a
potential for user confusion there.

However, after looking at the precedent of XMLELEMENT, it's hard to deny
that if the SQL committee ever chose to standardize named parameters,
AS is what they would use.  The chances that ":" would become the
standard are negligible --- that's not the sort of syntax they like
to standardize.

regards, tom lane



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Simon Riggs

On Sat, 2008-12-13 at 14:07 +0100, Markus Wanner wrote:

> Speaking of a "synchronous commit"
> is utterly misleading, because the commit itself is exactly the thing
> that's *not* synchronous.

Not really sure where you're going here. "synchronous replication" is
used exactly as described in the Wikipedia entry here:
http://en.wikipedia.org/wiki/Database_replication

No two-word phrase is going to accurately sum up the complexity and
potential for data loss in these situations. DRBD saw that too and just
called them A, B and C and then described them more accurately.

But I don't think we should say "PostgreSQL just implemented algorithm
B" which is just unhelpful. I don't think it's "marketing" to refer to it
by the phrase most commonly used for the technology we are building.
Nobody suggested we call it "wizrep" or suchlike...

The docs can contain the exact description of data loss and timing
windows.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Grzegorz Jaskiewicz


On 2008-12-13, at 13:07, Markus Wanner wrote:

> However, that is a marketing decision [1], which should not be mixed
> with the technical discussion here. Speaking of a "synchronous commit"
> is utterly misleading, because the commit itself is exactly the thing
> that's *not* synchronous.
>
> [1]: Some people like the term "virtually synchronous" for marketing
> purposes. That's at least halfway technically correct.


Marketing people are virtually trustworthy, from my life experience.
If you ask me, this is just preposterous.





Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi,

Simon Riggs wrote:
> You're right that neither the data transfer nor data availability is
> entirely synchronous, but data transfer is synchronous at time of
> *commit*: it is recorded on multiple nodes at the same time.

I'm unsure what you mean by a "data transfer being synchronous". To what
other process or state should the data transfer be synchronous?

> The term "synchronous replication" is already well used in the industry
> to mean synchronous commit, so I don't think we should change the name
> now. The project here is also known to everybody as "synch rep".

I understand very well that you don't want to change the name. I've
been hesitant to "relabel" Postgres-R from synchronous to asynchronous
to eager.

However, that is a marketing decision [1], which should not be mixed
with the technical discussion here. Speaking of a "synchronous commit"
is utterly misleading, because the commit itself is exactly the thing
that's *not* synchronous.

It *is* an optimization to fully synchronous replication to defer commit
on the "slave" and only make sure that the transaction *can* be applied
at some time in the future.

However, this *does* have the drawback of transactions not being
immediately visible on the slave. Often enough, this is acceptable. But
it certainly matters to some application developers.

> What is confusing is that "replication" itself is a much abused term and
> is used to describe technologies for HA, DR and data movement.

I absolutely agree with that. I therefore recommend being at least
consistent and honest with the term "synchronous": point out that WAL
writing is synchronous for the log shipping approach here (AFAIK), but
that the commit on the slave is asynchronous for performance reasons. In
other words: this approach is certainly (and hopefully, for performance
reasons) different from a fully synchronous approach. Even for marketing
reasons, it might make sense to point out that difference ("no, we are
faster than fully sync rep").
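
To make the distinction concrete, here is a toy model in Python. It is
only a sketch of the behaviour described above, not the proposed patch:
the class names `Primary` and `Standby` and their methods are invented
for illustration. The commit waits for the standby to acknowledge the
WAL record, while the standby applies it (and thus makes it visible) at
some later point.

```python
class Standby:
    def __init__(self):
        self.wal = []        # WAL records received and written
        self.applied = 0     # number of WAL records already replayed
        self.visible = {}    # what a read-only query on the standby sees

    def receive(self, record):
        self.wal.append(record)
        return True          # acknowledgement sent back to the primary

    def replay(self):
        # Happens later, independently of any commit on the primary.
        for key, value in self.wal[self.applied:]:
            self.visible[key] = value
        self.applied = len(self.wal)


class Primary:
    def __init__(self, standby):
        self.standby = standby
        self.visible = {}

    def commit(self, key, value):
        # Wait for the standby to acknowledge the WAL record ...
        acked = self.standby.receive((key, value))
        self.visible[key] = value
        # ... but do not wait for the standby to apply it.
        return acked


standby = Standby()
primary = Primary(standby)
primary.commit("x", 1)
print(primary.visible, standby.visible)   # standby has the WAL, not the change
standby.replay()
print(primary.visible, standby.visible)   # now both show the change
```

Between `commit` and `replay` the transaction is durable on both nodes
but visible on only one, which is the window the terminology argument
is about.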

Regards

Markus Wanner

[1]: Some people like the term "virtually synchronous" for marketing
purposes. That's at least halfway technically correct.





Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Dimitri Fontaine


Hi,

On 13 Dec 08 at 11:39, Peter Eisentraut wrote:
> On Friday 12 December 2008 20:05:57 Tom Lane wrote:
>> Excellent.  I checked that psql's colon-variable feature behaves the
>> same.  So it looks like the proposed "name: value" syntax would indeed
>> not break any existing features.  Barring better ideas I think we should
>> go with that one.
>
> I personally thought that AS was a better idea.


It seems some people want to be able to overload some default  
parameters (but not others) and at the same time alias them to some  
new label. I'm not sure I understand it all, but it seems an example  
of it would be like:

  SELECT xml_function(a, b: 'foo' AS bar);

If this is what some people want when all the spare parts are bound  
together, we don't have the option to use AS for both the meanings.


Regards,
- --
dim






Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Albert Cervera i Areny
On Saturday 13 December 2008, Peter Eisentraut wrote:
> On Friday 12 December 2008 20:05:57 Tom Lane wrote:
> > Excellent.  I checked that psql's colon-variable feature behaves the
> > same.  So it looks like the proposed "name: value" syntax would indeed
> > not break any existing features.  Barring better ideas I think we should
> > go with that one.
>
> I personally thought that AS was a better idea.

+1

-- 
Albert Cervera i Areny
http://www.NaN-tic.com



Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Peter Eisentraut
On Friday 12 December 2008 20:05:57 Tom Lane wrote:
> Excellent.  I checked that psql's colon-variable feature behaves the
> same.  So it looks like the proposed "name: value" syntax would indeed
> not break any existing features.  Barring better ideas I think we should
> go with that one.

I personally thought that AS was a better idea.



Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1268)

2008-12-13 Thread Peter Eisentraut
On Friday 12 December 2008 19:31:11 Robert Haas wrote:
> Not really.  I'm not an SELinux expert.  But typically the two do
> exist alongside one another.  For example, installing SELinux (MAC)
> on your system does not make "chmod g+w file" (DAC) stop working;
> it merely performs an ADDITIONAL security check before allowing access
> to the file.  You have to satisfy BOTH SELinux AND the ordinary
> filesystem permissions system in order to perform an operation on a
> file.

The MAC permissions are usually set up globally (in some cryptic file) and 
apply mandatorily (= M).  So a rule might say, a file named topsecret.pdf can 
only be stored in a certain place, can only be read by certain people, can 
only be opened by a special viewer, cannot be copied and pasted out of, etc.  
And there is nothing you can do about it, even if you own the file (short of 
changing the global policy).

The DAC permissions are set up by the object owner at their discretion (= D).  
So if you write a draft.odt and want your group to look at it, you do a chmod 
g+r or whatever, as you want.  It would be silly in this case to have to 
request a global MAC policy change for every such step.
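
A toy illustration of how the two layers combine, in Python. This is a
sketch under invented assumptions, not SELinux or PostgreSQL code: the
`FileObject` class, the `MAC_POLICY` table and the user names are all
hypothetical. The only point is that access requires both checks to
pass, and that the owner controls one of them but not the other.

```python
from dataclasses import dataclass


@dataclass
class FileObject:
    name: str
    owner: str
    group: str
    group_readable: bool = False   # DAC bit the owner may flip at will


# Global MAC policy (hypothetical): file name -> users cleared to read it.
MAC_POLICY = {"topsecret.pdf": {"alice"}}


def dac_allows_read(user, user_groups, f):
    # Discretionary: decided by ownership and the bits the owner has set.
    return user == f.owner or (f.group_readable and f.group in user_groups)


def mac_allows_read(user, f):
    # Mandatory: decided by the global policy, whatever the owner wants.
    cleared = MAC_POLICY.get(f.name)
    return cleared is None or user in cleared


def can_read(user, user_groups, f):
    # Both layers must agree; neither can override the other.
    return dac_allows_read(user, user_groups, f) and mac_allows_read(user, f)


draft = FileObject("draft.odt", owner="bob", group="staff")
draft.group_readable = True        # bob's own decision, no policy change
print(can_read("carol", {"staff"}, draft))

secret = FileObject("topsecret.pdf", owner="bob", group="staff",
                    group_readable=True)
print(can_read("bob", {"staff"}, secret))   # the owner is still refused
```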

> The contention of the author of this patch is that row-level access is
> somehow different - that even though we have two sets of checks for
> files, tables, and (assuming Stephen Frost's patch is accepted)
> columns, we will only have one set of checks for rows, and you can
> pick which one you want.

Yes, that is the part that is puzzling me as well.



Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1268)

2008-12-13 Thread Peter Eisentraut
On Friday 12 December 2008 19:09:26 Alvaro Herrera wrote:
> I don't understand -- why wouldn't we just have two columns, one for
> plain row-level security and another for whatever security system the
> platform happens to offer?  If we were to follow that route, we could
> have row-level security first, extracting the feature from the current
> patch; and the rest of PGACE could be a much smaller patch implementing
> the rest of the stuff, with SELinux support for now with an eye to
> implementing Solaris TX or whatever.

Exactly.



Re: [HACKERS] Mostly Harmless: Welcoming our C++ friends

2008-12-13 Thread Kurt Harriman

Tom Lane wrote:
> Kurt Harriman  writes:
>> However, probably an easier alternative would be to have
>> just one buildfarm machine do a nightly build configured
>> with the --enable-cplusplus option.
>
> There is no such option, and won't be.


Yours is the first comment anyone has posted to the list
regarding my proposed c++configure patch, and it sounds
alarmingly definite.

May I ask you to elaborate?  Have you more to say on the
subject?


>> This would build one file - main.c - as C++ (necessary
>> because on some platforms the main() function needs to be
>> C++ to ensure the C++ runtime library is initialized).
>
> Useless, since main.c doesn't include any large number of headers,


main.c directly or indirectly includes these 71 headers:

    access/attnum.h           access/genam.h        access/heapam.h
    access/htup.h             access/rmgr.h         access/sdir.h
    access/skey.h             access/tupdesc.h      access/tupmacs.h
    access/xlog.h             access/xlogdefs.h     bootstrap/bootstrap.h
    c.h                       catalog/genbki.h      catalog/pg_am.h
    catalog/pg_attribute.h    catalog/pg_class.h    catalog/pg_index.h
    executor/execdesc.h       executor/tuptable.h   fmgr.h
    lib/stringinfo.h          nodes/bitmapset.h     nodes/execnodes.h
    nodes/nodes.h             nodes/params.h        nodes/parsenodes.h
    nodes/pg_list.h           nodes/plannodes.h     nodes/primnodes.h
    nodes/tidbitmap.h         nodes/value.h         pg_config.h
    pg_config_manual.h        pg_config_os.h        pgtime.h
    port.h                    postgres.h            postgres_ext.h
    postmaster/postmaster.h   rewrite/prs2lock.h    storage/backendid.h
    storage/block.h           storage/buf.h         storage/bufpage.h
    storage/item.h            storage/itemid.h      storage/itemptr.h
    storage/lock.h            storage/lwlock.h      storage/off.h
    storage/relfilenode.h     storage/shmem.h       tcop/dest.h
    tcop/tcopprot.h           utils/array.h         utils/elog.h
    utils/errcodes.h          utils/guc.h           utils/help_config.h
    utils/hsearch.h           utils/int8.h          utils/palloc.h
    utils/pg_crc.h            utils/pg_locale.h     utils/ps_status.h
    utils/rel.h               utils/relcache.h      utils/snapshot.h
    utils/timestamp.h         utils/tuplestore.h


> and in particular there is no reason for it to include the headers
> that are critical to function libraries.


Extra #includes could be added to main.c just for the purpose of
getting them C++-syntax-checked.  Or, a few more .c files could be
chosen to expand the set of C++-syntax-checked headers.  For
instance, xml.c pulls in spi.h and 96 other headers.  66 of them
overlap with main.c; but these 31 are new:

    access/xact.h             catalog/namespace.h   catalog/pg_language.h
    catalog/pg_proc.h         catalog/pg_type.h     commands/dbcommands.h
    executor/execdefs.h       executor/executor.h   executor/spi.h
    lib/dllist.h              libpq/pqformat.h      mb/pg_wchar.h
    miscadmin.h               nodes/memnodes.h      nodes/nodeFuncs.h
    nodes/relation.h          tcop/pquery.h         tcop/utility.h
    utils/builtins.h          utils/catcache.h      utils/date.h
    utils/datetime.h          utils/datum.h         utils/lsyscache.h
    utils/memutils.h          utils/plancache.h     utils/portal.h
    utils/resowner.h          utils/syscache.h      utils/tzparser.h
    utils/xml.h

funcapi.h is still missing.  One file that includes it is
pl_exec.c, which pulls in 8 more headers not already listed:

    access/transam.h          commands/trigger.h    executor/spi_priv.h
    funcapi.h                 parser/scansup.h      plpgsql.h
    utils/snapmgr.h           utils/typcache.h

So C++-compiling just a few source files is sufficient to syntax
check a useful subset of header files including those which are
most important for add-on development.
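
The bookkeeping behind those counts is just repeated set difference.
A minimal Python sketch follows; the function name
`incremental_coverage` is made up, and the `includes` table is a tiny
illustrative excerpt rather than the real include closure of each file
(the real sets have 71, 97 and more entries and were not re-derived
here).

```python
def incremental_coverage(includes):
    """For each source file, in order, list the headers it adds
    beyond those already covered by the files before it."""
    covered = set()
    report = {}
    for source, headers in includes.items():
        report[source] = sorted(set(headers) - covered)
        covered |= set(headers)
    return report, covered


# Illustrative excerpt only, assumed for the example.
includes = {
    "main.c": {"postgres.h", "fmgr.h", "utils/elog.h"},
    "xml.c": {"postgres.h", "fmgr.h", "executor/spi.h", "utils/xml.h"},
    "pl_exec.c": {"postgres.h", "executor/spi.h", "funcapi.h", "plpgsql.h"},
}

report, covered = incremental_coverage(includes)
for source, new_headers in report.items():
    print(source, "adds", new_headers)
print(len(covered), "headers syntax-checked in total")
```

Fed with the real include lists, the same loop would reproduce the
"31 new" and "8 more" figures quoted above.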

However, the above approach has a couple of obvious caveats:

:( It doesn't give C++ users a precise specification of exactly
   which header files they may rely upon from release to release.

:( From time to time, C++ programmers will probably come along
   asking for even more header files to be sanitized.

The alternative which you have suggested, using pgcompinclude,
could solve these caveats by enforcing C++ safety upon every
PostgreSQL header file.  And it would not require any more .c
files beyond main.c to be kept C++-clean.

http://archives.postgresql.org/pgsql-patches/2007-07/msg00056.php

I'll get started on the pgcompinclude thing.

Regards,
... kurt



Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Simon Riggs

On Sat, 2008-12-13 at 00:00 +0100, Markus Wanner wrote:
> Hi,
> 
> Fujii Masao wrote:
> > I'd like to define the meaning of "synch rep" again. "synch rep" means:
> > 
> > (1) Transaction commit waits for WAL records to be replicated to the standby
> >   before the command returns a "success" indication to the client.
> > 
> > (2) The standby has (can read) all WAL files indispensable for recovery.
> 
> Let me point out that - very much like the original Postgres-R algorithm
> - this guarantees committed transactions to be durable and consistent
> (no late aborts of conflicting transactions), but it does not guarantee
> that a transaction committed on one node is immediately visible on the
> other node. In that sense, it is not synchronous as commonly understood,
> because it does not "operate with all their parts in synchrony" [1], as
> implied by the term "synchronous". This might (and often has in the
> past) lead to confusion.

You're right that neither the data transfer nor data availability is
entirely synchronous, but data transfer is synchronous at time of
*commit*: it is recorded on multiple nodes at the same time.

The term "synchronous replication" is already well used in the industry
to mean synchronous commit, so I don't think we should change the name
now. The project here is also known to everybody as "synch rep".

* Oracle Data Guard calls it "synchronous redo transport"
* MS Exchange calls it "synchronous replication"
* MS SQL Server has "Database Mirroring", "Log Shipping" and
"Replication". "Database Mirroring" provides a synchronous mechanism, with
"Replication" meaning data transfer to other databases,
publish&subscribe.
* DB2 HADR provides "synchronous replication"
* MySQL calls it "synchronous replication"

What is confusing is that "replication" itself is a much abused term and
is used to describe technologies for HA, DR and data movement.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] WIP: default values for function parameters

2008-12-13 Thread Asko Oja
On Fri, Dec 12, 2008 at 8:05 PM, Tom Lane  wrote:

> Michael Meskes  writes:
> > On Fri, Dec 12, 2008 at 10:06:30AM -0500, Tom Lane wrote:
> >> Hmm ... actually, ecpg might be a problem here anyway.  I know it has
> >> special meaning for :name, but does it allow space between the colon
> >> and the name?  If it does then the colon syntax loses.  If it doesn't
>
> > No. Here's the lexer rule:
> > :{identifier}((("->"|\.){identifier})|(\[{array}\]))*
> > No space possible between ":"  and {identifier}.
>
> Excellent.  I checked that psql's colon-variable feature behaves the
> same.  So it looks like the proposed "name: value" syntax would indeed
> not break any existing features.  Barring better ideas I think we should
> go with that one.

+1
"name: value" should be good enough
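
For what it's worth, the lexer rule quoted above can be approximated
with a Python regular expression to see why a space after the colon
keeps the two syntaxes apart. This is only a sketch: `IDENT` and
`ARRAY` are simplified stand-ins for ecpg's {identifier} and {array}
definitions, and `host_variables` is a made-up helper, not anything in
ecpg.

```python
import re

# Simplified stand-ins for ecpg's {identifier} and {array} (assumptions).
IDENT = r"[A-Za-z_][A-Za-z0-9_$]*"
ARRAY = r"[^\]]+"

# Mirrors  :{identifier}((("->"|\.){identifier})|(\[{array}\]))*
HOST_VAR = re.compile(rf":{IDENT}(?:(?:(?:->|\.){IDENT})|(?:\[{ARRAY}\]))*")


def host_variables(sql):
    """Return every substring the approximated rule would lex as a host variable."""
    return HOST_VAR.findall(sql)


# "a: 1" and "b: ..." have a space after the colon, so only :hostvar matches.
print(host_variables("SELECT foo(a: 1, b: :hostvar)"))
# Member access and array subscripts stay part of the host variable.
print(host_variables("SELECT :rec.field, :arr[3], :p->q"))
```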

>
>regards, tom lane