Re: [HACKERS] Better auth errors from libpq

2008-09-12 Thread Heikki Linnakangas

Joshua D. Drake wrote:

David Fetter wrote:

On Thu, Sep 11, 2008 at 11:28:36PM -0400, Tom Lane wrote:

Joshua Drake [EMAIL PROTECTED] writes:

I think something like:
psql: FATAL:  Ident authentication failed for user root
HINT: 
http://www.postgresql.org/docs/8.3/static/client-authentication.html

Would be nice.

Do you really think that's helpful in the typical case where someone
fat-fingered their password?

I'm not averse to hint messages that are actually helpful, ie,
reasonably connected to the *specific* failure situation.  The problem
I've got with David's proposal is that it provides a one-size-fits-all
hint for every possible auth failure.  One size does not fit all here.


Here are a few different sizes: one for each auth method.


The only thing I would say here is that you point the URL to current, which 
will be wrong in one release. Perhaps something that pulls in the 
pgversion macro?


We don't put URLs in error messages. The hint needs to be a real sentence.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Move src/tools/backend/ to wiki

2008-09-12 Thread Peter Eisentraut

Alvaro Herrera wrote:

However I wonder how much value there really is in the developer's FAQ,
considering that some answers seem rather poor.  For example the
answer on ereport() was wrong, and nobody ever pointed it out.  The
answer on palloc/pfree is very incomplete too.


I think the developer's FAQ has essentially been unmaintained for many 
years.  I think we should gradually migrate the content to other wiki 
pages and eventually drop the FAQ.




Re: [HACKERS] Better auth errors from libpq

2008-09-12 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Joshua Drake [EMAIL PROTECTED] writes:
 I think something like:

 psql: FATAL:  Ident authentication failed for user root
 HINT: http://www.postgresql.org/docs/8.3/static/client-authentication.html

 Would be nice.
...

 Or to put it even more baldly: this is not an area in which you can
 improve matters significantly with five minutes' thought and a one-line
 patch.  It would take some actual work.

Actually I think there is a problem with the original message that could be
improved. The problem is that Ident is a Postgres-specific term that a
newbie DBA is unlikely to understand. What's worse, it's an ambiguous term that
is easily misunderstood to refer to the RFC 1413 ident protocol, which Postgres
may or may not be using.

I would suggest instead describing it using more generic terminology, though
offhand I'm not sure what that would be. A detail line could include the
Postgres-specific authentication method which failed.

I do think it's true that the pg_hba setup is far more complex than it has to
be and that that's a bigger problem than a simple error message too.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Better auth errors from libpq

2008-09-12 Thread Markus Wanner

Hi,

David Fetter wrote:

I'm all for something, and that's a much better something.  What we
have now--nothing--actively distresses newbies for no good reason.

I don't know how many people we've lost right at that point, but the
number has to be high, as most people don't just hop into IRC with
their problem.


Maybe something much more specific, i.e. triggering only if one tried to 
connect via localhost or unix sockets, and only if one tried to 
authenticate as 'root' without a password.


The hint should IMO say something like: "The default superuser is 
postgres, not root." Something that's useful for this specific case and 
doesn't disturb in others. And something that's public knowledge, which 
any reasonably serious attacker already knows anyway.


Maybe also point out that the unix user is chosen by default, assuming 
that most of these users didn't explicitly type 'root' and are wondering 
where that 'root' user came from.
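
A minimal sketch of what such a targeted hint could look like on the server
side. Everything here is illustrative, not committed code: the authmethod
variable and the exact placement in the authentication-failure path are
assumptions.

    /*
     * Hypothetical sketch: emit a targeted hint only when a local
     * connection fails to authenticate as "root", per the suggestion
     * above.  "authmethod" is assumed to hold the method name.
     */
    if (strcmp(port->user_name, "root") == 0 &&
        IS_AF_UNIX(port->raddr.addr.ss_family))
        ereport(FATAL,
                (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
                 errmsg("%s authentication failed for user \"root\"",
                        authmethod),
                 errhint("The default superuser is \"postgres\", not \"root\"; "
                         "by default the client connects as your operating "
                         "system user name.")));
    else
        ereport(FATAL,
                (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
                 errmsg("%s authentication failed for user \"%s\"",
                        authmethod, port->user_name)));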


Regards

Markus Wanner



Re: [HACKERS] [Review] pgbench duration option

2008-09-12 Thread Magnus Hagander
Tom Lane wrote:
 I wrote:
 Are people satisfied that the Windows part of the patch is okay?
 
 I went ahead and applied this patch after some trivial stylistic fixes.
 The buildfarm will soon tell us if the WIN32 part of the patch compiles,
 but not whether it works --- would someone double-check the functioning
 of the -T switch on Windows?

Confirmed, it works fine for me. Taking a quick look at the win32-specific
code committed, I see nothing else that would be off in it either (two
handles that are never closed, but there's no point in caring in
pgbench, since it will never happen in a loop).

//Magnus




Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Simon Riggs

On Thu, 2008-09-11 at 17:58 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  So part of the handshake between
  primary and standby must be "what is your recentxmin?". The primary will
  then use the lower/earliest of the two.
 
 Even then, the master might already have vacuumed away tuples that are 
 visible to an already running transaction in the slave, before the slave 
 connects. Presumably the master doesn't wait for the slave to connect 
 before starting to accept new connections.

Yep, OK.

  As you mentioned, the options there are to defer applying WAL, or cancel 
  queries. I think both options need the same ability to detect when 
  you're about to remove a tuple that's still visible to some snapshot, 
  just the action is different. We should probably provide a GUC to 
  control which you want.
  
  I don't see any practical way of telling whether a tuple removal will
  affect a snapshot or not. Each removed row would need to be checked
  against each standby snapshot. Even if those were available, it would be
  too costly. 
 
 How about using the same method as we use in HeapTupleSatisfiesVacuum? 
 Before replaying a vacuum record, look at the xmax of the tuple 
 (assuming it committed). If it's < the slave's OldestXmin, it can be 
 removed. Otherwise not. Like HeapTupleSatisfiesVacuum, it's 
 conservative, but doesn't require any extra bookkeeping.
 
 And vice versa: if we implement the more precise book-keeping, with all 
 snapshots in shared memory or something, we might as well use it in 
 HeapTupleSatisfiesVacuum. That has been discussed before, but it's a 
 separate project.

Tuple removals earlier than the slave's OldestXmin are easy, that's true.
I'm not sure what you had in mind for "Otherwise not"?

Maybe you mean stop applying WAL until the slave's OldestXmin is > the tuple
removal xid. Not sure; reading other subthreads of this post.

I think it's possible to defer removal actions on specific blocks only,
but that is an optimisation that's best left for a while.

BTW, tuple removals would need a cleanup lock on a block, just as they
do on the master server. So WAL apply can be delayed momentarily by
pin holders anyway, whatever we do.

  It was also suggested we might take the removed rows and put them in a
  side table, but that makes me think of the earlier ideas for HOT and so
  I've steered clear of that.
 
 Yeah, that's non-trivial. Basically a whole new, different 
 implementation of MVCC, but without changing any on-disk formats.
 
 BTW, we haven't talked about how to acquire a snapshot in the slave. 
 You'll somehow need to know which transactions have not yet committed, 
 but will in the future. In the master, we keep track of in-progress 
 transaction in the ProcArray, so I suppose we'll need to do the same in 
 the slave. Very similar to prepared transactions, actually. I believe 
 the Abort records, which are not actually needed for normal operation, 
 become critical here. The slave will need to put an entry to ProcArray 
 for any new XLogRecord.xl_xid it sees in the WAL, and remove the entry 
 at a Commit and Abort record. And clear them all at a shutdown record.

I wouldn't do it like that.

I was going to maintain a current snapshot in shared memory, away from
the PROCARRAY. Each time we see a TransactionId we check whether it's
already been seen; if not, we insert it. When a transaction commits or
aborts we remove the stated xid. If we see a shutdown checkpoint we
clear the array completely. When query backends want a snapshot they
just read the array.  It doesn't matter whether queries commit or abort,
since those changes can't be seen anyway by queries until commit.

The reason for doing it this way is that the PROCARRAY may be full of query
backends, so having dummy backends in there as well sounds confusing.
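
A minimal sketch of that shared-memory structure, assuming a simple
spinlock-protected linear array and the usual spinlock/transam headers;
all the names here (and the fixed array size) are hypothetical:

    /* Hypothetical shared-memory array of xids seen running in WAL. */
    #define MAX_STANDBY_XIDS 4096       /* assumed sizing */

    typedef struct StandbyRunningXids
    {
        slock_t         mutex;          /* protects fields below */
        int             numXids;
        TransactionId   xids[MAX_STANDBY_XIDS];
    } StandbyRunningXids;

    /* WAL apply: note that xid is running, if not already seen */
    static void
    StandbyXidSeen(StandbyRunningXids *sx, TransactionId xid)
    {
        int     i;

        SpinLockAcquire(&sx->mutex);
        for (i = 0; i < sx->numXids; i++)
            if (TransactionIdEquals(sx->xids[i], xid))
                break;
        if (i == sx->numXids && sx->numXids < MAX_STANDBY_XIDS)
            sx->xids[sx->numXids++] = xid;
        SpinLockRelease(&sx->mutex);
    }

    /* WAL apply: a commit or abort record removes the xid; a shutdown
     * checkpoint record would simply set numXids = 0 */
    static void
    StandbyXidFinished(StandbyRunningXids *sx, TransactionId xid)
    {
        int     i;

        SpinLockAcquire(&sx->mutex);
        for (i = 0; i < sx->numXids; i++)
            if (TransactionIdEquals(sx->xids[i], xid))
            {
                sx->xids[i] = sx->xids[--sx->numXids];
                break;
            }
        SpinLockRelease(&sx->mutex);
    }

Query backends would take the same spinlock, copy the array out, and build
their snapshot from the copy.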

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Simon Riggs

On Thu, 2008-09-11 at 11:38 +0300, Hannu Krosing wrote:
 On Thu, 2008-09-11 at 09:24 +0300, Heikki Linnakangas wrote:
 
  I like the idea of acquiring snapshots locally in the slave much more. 
  As you mentioned, the options there are to defer applying WAL, or cancel 
  queries. 
 
 More exotic ways to defer applying WAL include using smart
 filesystems to get per-backend data snapshots, using either
 copy-on-write overlay filesystems or filesystem- or disk-level
 snapshots.

That's certainly possible.

That would mean we maintain a single consistent viewpoint of the whole
database for a period of time. The frozen viewpoint could move forwards
by operator command, or we might keep multiple frozen views.

We can have a LookasideHash table in memory that keeps track of which
blocks have had rows removed from them since the frozen view was
taken.

If you request a block, we check to see whether there is a lookaside
copy of it prior to the tuple removals. We then redirect the block
request to a viewpoint relation's block. Each viewpoint gets a separate
relfilenode. We do the switcheroo while holding cleanup lock on block.
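
A rough sketch of the lookup that implies, using a dynahash table keyed by
block; all the names here are hypothetical:

    /* Hypothetical key and entry for the LookasideHash described above */
    typedef struct LookasideKey
    {
        RelFileNode node;           /* relation the block belongs to */
        BlockNumber blkno;          /* block that had rows removed */
    } LookasideKey;

    typedef struct LookasideEntry
    {
        LookasideKey key;
        RelFileNode viewpointNode;  /* relfilenode of the viewpoint relation */
        BlockNumber viewpointBlkno; /* where the pre-removal copy lives */
    } LookasideEntry;

    /* Redirect a read to the frozen copy, if one exists. */
    static bool
    LookasideRedirect(HTAB *lookasideHash, LookasideKey *key,
                      RelFileNode *node, BlockNumber *blkno)
    {
        LookasideEntry *entry;

        entry = (LookasideEntry *) hash_search(lookasideHash, key,
                                               HASH_FIND, NULL);
        if (entry == NULL)
            return false;           /* no copy; read the live block */

        *node = entry->viewpointNode;
        *blkno = entry->viewpointBlkno;
        return true;
    }

Insertion into the table would happen under the same cleanup lock that
performs the switcheroo.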

So effectively we would be adding frozen snapshot technology to
Postgres, although we would want to do copy-on-clean rather than
copy-on-write, which could mean significantly better performance.

That might be encumbered by patent in some way.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] What is d2mdir?

2008-09-12 Thread Peter Eisentraut

Tom Lane wrote:

Abhijit Menon-Sen [EMAIL PROTECTED] writes:

At 2008-09-02 15:10:23 +0300, [EMAIL PROTECTED] wrote:

[EMAIL PROTECTED] sgml]$ make man



As Alvaro noted recently, you need to use "make man D2MDIR=/some/path".


I see it's been like that for quite some time, but still it seems pretty
bogus.  Why isn't configure handling this?  If there's some good reason
not to automate it, why isn't it documented in the "Building The
Documentation" appendix?


I have updated the documentation accordingly.



Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Simon Riggs

On Thu, 2008-09-11 at 15:42 +0300, Heikki Linnakangas wrote:
 Gregory Stark wrote:
  b) vacuum on the server which cleans up a tuple the slave has in scope
     has to block WAL replay on the slave (which I suppose defeats the
     purpose of having a live standby for users concerned more with
     fail-over latency).
 
 One problem with this, BTW, is that if there's a continuous stream of 
  medium-length transactions in the slave, each new snapshot taken will 
 prevent progress in the WAL replay, so the WAL replay will advance in 
 baby steps, and can fall behind indefinitely. As soon as there's a 
 moment that there's no active snapshot, it can catch up, but if the 
 slave is seriously busy, that might never happen.

It should be possible to do mixed mode.

Stall WAL apply for up to X seconds, then cancel queries. Some people
may want X=0 or low, others might find X = very high acceptable (Merlin
et al).

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] pg_regress inputdir

2008-09-12 Thread Peter Eisentraut

Alvaro Herrera wrote:

Jorgen Austvik - Sun Norway wrote:

The attached patch makes pg_regress write converted files to
inputdir/sql and inputdir/expected, which is one way to make it read
and write to the same directory. Tested on Solaris x86 with pgsql make
check and standalone.


Okay, so this patch does change it in a way that it still works, but
what else do you need to be able to run the test from another directory?
I tried to run the test from another directory with this patch
installed, and found that it didn't work because it's replacing
@abs_builddir@ in the input files improperly (to the current path; it
should be using the output dir path, I think).

So maybe this is a step in the right direction, but ISTM you need a
slightly larger patch for it to be actually useful.

If I am not making sense, then maybe I am not understanding what you
mean by running it standalone.  In that case, please explain.



I think the appropriate interface would be adding another option to 
pg_regress called --workdir or --tempdir, which defaults to PWD, writes 
the converted sql files there, and then looks for the sql files to 
execute in workdir/sql and in inputdir/sql.  In some way, this copies 
the vpath search mechanism.




Re: [HACKERS] Proposal of SE-PostgreSQL patches (for CommitFest:Sep)

2008-09-12 Thread KaiGai Kohei

Hello,

The latest SE-PostgreSQL patches are updated here:

[1/4] 
http://sepgsql.googlecode.com/files/sepostgresql-sepgsql-8.4devel-3-r1005.patch
[2/4] 
http://sepgsql.googlecode.com/files/sepostgresql-pg_dump-8.4devel-3-r1005.patch
[3/4] 
http://sepgsql.googlecode.com/files/sepostgresql-policy-8.4devel-3-r1005.patch
[4/4] 
http://sepgsql.googlecode.com/files/sepostgresql-docs-8.4devel-3-r1005.patch

They contain a rebase to CVS HEAD, because the previous ones no longer
apply correctly due to progress in the base version.
The rest of the changes are:
 - A new PGACE hook: pgaceIsAllowExecutorRunHook().
   It enables control over overriding ExecutorRun_hook, because several important
   hooks are invoked from standard_ExecutorRun().
 - T_SEvalItem-related equal() functions are added to nodes/equalfuncs.c.
   # I had previously left these unimplemented.
 - Fixed a typo in src/include/security/pgace.h


BTW, I thought I should show an overview of the patch to help reviewers.
The main patch ([1/4]) is especially big and contains new concepts.

The following explanation covers several key concepts of SE-PostgreSQL.
I'm happy if it can reduce the load on reviewers.

Needless to say, please feel free to ask me anything. :-)

Thanks,


Security hooks
--
We call it the PostgreSQL Access Control Extension (PGACE).
The newly added src/include/security/pgace.h declares these hooks as
inline functions. If HAVE_SELINUX is available at build time, they have
a valid implementation that invokes functions to make access control decisions.
When the SE-PostgreSQL feature is disabled at build time or run time,
they do not change any existing behavior.

These hooks have a prefix of pgace, like pgaceHeapTupleInsert().
That hook is invoked just before a new tuple is inserted into a relation,
so the SE-PostgreSQL subsystem can make its decision.

Its arguments provide the information needed to make a decision;
pgaceHeapTupleInsert() takes four arguments, such as the target Relation
object and the newly inserted HeapTuple.

The specification of each hook is described in
src/include/security/pgace.h.
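
To illustrate the pattern, here is a simplified sketch (not the exact code
from the patch; the sepgsql function name and the last two argument names
are assumptions):

    /*
     * Simplified sketch of the PGACE inline-hook pattern described above.
     * When HAVE_SELINUX is not defined, the hook compiles away to nothing,
     * leaving existing behavior unchanged.
     */
    static inline void
    pgaceHeapTupleInsert(Relation rel, HeapTuple tuple,
                         bool is_internal, bool with_returning)
    {
    #ifdef HAVE_SELINUX
        if (sepgsqlIsEnabled())
            sepgsqlHeapTupleInsert(rel, tuple, is_internal, with_returning);
    #endif
    }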


Security attribute management
-
We need a security attribute on each tuple to use as the basis of access control
decisions. SELinux calls it a security context, and most security-aware
operating systems have a similar idea, called a label.
It is represented in a text format like system_u:object_r:etc_t:s0, and has
the characteristic that many objects tend to share a single security context.

We store the text-represented security attribute in the pg_security system
catalog and put an alternate key (the oid of the pg_security row) on each tuple,
because putting raw string data on individual tuples would be an unacceptable
approach.

The alternate key is stored in the tail of the HeapTupleHeader, as oid is.
This field is valid when t_infomask has the HEAP_HASSECURITY bit set.

     HeapTupleHeader
    +-------------------+
    |         :         |
    +-------------------+
    |    t_infomask     |
    +-------------------+
    |      t_hoff       |
    +-------------------+      pg_security system catalog
    |         :         |     +-------+----------------------------------------------+
    |         :         |     |  oid  |                   seclabel                    |
    +-------------------+     +-------+----------------------------------------------+
    | *Oid security_id* | --> | 16389 | system_u:object_r:sepgsql_table_t:s0          |
    +-------------------+     | 16401 | system_u:object_r:sepgsql_proc_t:s0           |
    |  Oid object_id    |     | 16402 | system_u:object_r:sepgsql_secret_table_t:s0   |
    +-------------------+     |   :   |   :                                           |
    |    Data field     |     +-------+----------------------------------------------+
    |         :         |

The alternate key is just an internal representation, so we have to translate
it to/from the text format when communicating with in-kernel SELinux, or when
exporting/importing them.

Note that the security attribute is also assigned to tuples within system
catalogs. A security attribute of a tuple within pg_attribute means the column's
security attribute, and is used for column-level access control, for example.

src/backend/security/pgaceCommon.c has functions to translate them:

  char *pgaceLookupSecurityLabel(Oid security_id);
  Oid   pgaceLookupSecurityId(char *security_label);

When a new security label is given that is not yet found in pg_security,
pgaceLookupSecurityId() inserts a new tuple into pg_security and
returns its object id as the alternate key.

Two more similar functions are also declared:
  char *pgaceSidToSecurityLabel(Oid security_id)
  Oid   pgaceSecurityLabelToSid(char *security_label)
These translate between a cosmetic text format and an internal
security identifier.
An example of the cosmetic format is:
  unconfined_u:unconfined_r:unconfined_t:SystemLow-SystemHigh
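
For example, a hypothetical caller could round-trip a label through the
alternate key like this (sketch only):

    /* Sketch: round-tripping a label through the pg_security alternate
     * key, using the functions declared above. */
    static void
    label_roundtrip_example(void)
    {
        Oid     sid;
        char   *label;

        /* inserts a row into pg_security if the label is not there yet */
        sid = pgaceLookupSecurityId("system_u:object_r:sepgsql_table_t:s0");

        /* and back again */
        label = pgaceLookupSecurityLabel(sid);

        elog(DEBUG1, "security id %u carries label %s", sid, label);
    }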
 
We have a case when `pg_security` system catalog is not available because

Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Hannu Krosing
On Fri, 2008-09-12 at 09:45 +0100, Simon Riggs wrote:
 On Thu, 2008-09-11 at 15:42 +0300, Heikki Linnakangas wrote:
  Gregory Stark wrote:
   b) vacuum on the server which cleans up a tuple the slave has in scope
      has to block WAL replay on the slave (which I suppose defeats the
      purpose of having a live standby for users concerned more with
      fail-over latency).
  
  One problem with this, BTW, is that if there's a continuous stream of 
  medium-length transactions in the slave, each new snapshot taken will 
  prevent progress in the WAL replay, so the WAL replay will advance in 
  baby steps, and can fall behind indefinitely. As soon as there's a 
  moment that there's no active snapshot, it can catch up, but if the 
  slave is seriously busy, that might never happen.
 
 It should be possible to do mixed mode.
 
 Stall WAL apply for up to X seconds, then cancel queries. Some people
 may want X=0 or low, others might find X = very high acceptable (Merlin
 et al).

Or even milder version.

* Stall WAL apply for up to X seconds, 
* then stall new queries, let old ones run to completion (with optional
fallback to canceling after Y sec), 
* apply WAL. 
* Repeat.

-
Hannu





Re: [HACKERS] New FSM patch

2008-09-12 Thread Heikki Linnakangas

Heikki Linnakangas wrote:
I've also been working on a low level benchmark using a C user-defined 
function that exercises just the FSM, showing the very raw CPU 
performance vs. current implementation. More on that later, but ATM it 
looks like the new implementation can be faster or slower than the 
current one, depending on the table size.


Let me describe this test case first:
- The test program calls RecordAndGetPageWithFreeSpace in a tight loop, 
with random values (see the sketch after this list). There's no activity to 
the heap. In normal usage, the time spent in RecordAndGetPageWithFreeSpace is 
minuscule compared to the heap and index updates that cause it to be called.
- WAL was placed on a RAM drive. This is of course not how people set up 
their database servers, but the point of this test was to measure CPU 
speed and scalability. The impact of writing extra WAL is significant 
and needs to be taken into account, but that's a separate test and 
discussion, and needs to be considered in comparison to the WAL written 
by heap and index updates.
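
For reference, a stress function along those lines might look like this.
This is a sketch against the new (8.4-era) FSM API; fsm_stress and its
parameters are hypothetical, not the actual test code used above:

    #include "postgres.h"
    #include "fmgr.h"
    #include "access/heapam.h"
    #include "storage/freespace.h"
    #include "storage/lock.h"

    PG_MODULE_MAGIC;

    PG_FUNCTION_INFO_V1(fsm_stress);

    /*
     * fsm_stress(relid, nblocks, loops): hammer one relation's FSM with
     * random RecordAndGetPageWithFreeSpace() calls and nothing else.
     */
    Datum
    fsm_stress(PG_FUNCTION_ARGS)
    {
        Oid         relid = PG_GETARG_OID(0);
        int32       nblocks = PG_GETARG_INT32(1);
        int32       loops = PG_GETARG_INT32(2);
        Relation    rel = heap_open(relid, AccessShareLock);
        int32       i;

        for (i = 0; i < loops; i++)
        {
            BlockNumber oldPage = random() % nblocks;
            Size        oldAvail = random() % BLCKSZ;
            Size        needed = random() % BLCKSZ;

            (void) RecordAndGetPageWithFreeSpace(rel, oldPage,
                                                 oldAvail, needed);
        }

        heap_close(rel, AccessShareLock);
        PG_RETURN_INT32(loops);
    }

Something like that could then be driven from a pgbench custom script, e.g.
SELECT fsm_stress('accounts'::regclass, 4096, 1000);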


That said, the test results are pretty interesting.

I ran the test using a custom script with pgbench. I ran it with 
different table sizes, and with 1 or 2 clients, on CVS HEAD and a 
patched version. The unit is thousands of RecordAndGetPageWithFreeSpace 
calls per second:


Table size      Patched         CVS HEAD
                1 clnt  2 clnts 1 clnt  2 clnts
8 kB            4.59    3.45    62.83   26.85
336 kB          13.85   6.43    41.8    16.55
3336 kB         14.96   6.3     22.45   10.55
33336 kB        14.85   6.56    5.44    4.08
333336 kB       14.48   11.04   0.79    0.74
3333336 kB      12.68   11.5    0.07    0.07
33333336 kB     7.67    5.37    0.05    0.05

The big surprise to me was that performance on CVS HEAD tanks as the 
table size increases. One possible explanation is that searches for X 
bytes of free space, for a very high X, will not find any matches, and 
the current FSM implementation ends up scanning through the whole FSM 
list for that relation.


Another surprise was how badly both implementations scale. On CVS HEAD, 
I expected the performance to be roughly the same with 1 and 2 clients, 
because all access to the FSM is serialized on the FreeSpaceLock. But 
adding the 2nd client not only didn't help, but it actually made the 
performance much worse than with a single client. Context switching or 
cache line contention, perhaps? The new FSM implementation shows the 
same effect, which was an even bigger surprise. At table sizes > 32 MB, 
the FSM no longer fits on a single FSM page, so I expected almost a 
linear speed up with bigger table sizes from using multiple clients. 
That's not happening, and I don't know why. Going from 33 MB to 
333 MB, the performance with 2 clients almost doubles, but it still 
doesn't exceed that with 1 client.


Going from 3 GB to 33 GB, the performance of the new implementation 
drops. I don't know why; I think I'll run some more tests with big table 
sizes to investigate that a bit more. The performance in the old 
implementation stays almost the same at that point; I believe that's 
because max_fsm_pages is exceeded at that point.


All in all, this isn't a very realistic test case, but it's interesting 
nevertheless. I'm happy with the performance of the new FSM on this 
test, as it's in the same ballpark as the old one, even though it's not 
quite what I expected.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Simon Riggs

On Thu, 2008-09-11 at 17:04 +0300, Heikki Linnakangas wrote:
 Csaba Nagy wrote:
  and that means in fact that if you have
  continuously overlapping small transactions, the blocking horizon
  could be even blocked forever, as there'll always be a query running,
  and the new queries will always have the snapshot of the currently
  running ones because WAL recovery is stalled... 
 
 Hmm, no I don't think the WAL recovery can become completely stalled. To 
 completely stop progressing, we'd need to take a new snapshot that 
 includes transaction X, and at the same time be blocked on a vacuum 
 record that vacuums a tuple that's visible to transaction X. I don't 
 think that can happen, because for such a scenario to arise, in the 
 corresponding point in time in the master, there would've been a 
 scenario where the vacuum would've removed a tuple that would have been 
   visible to a newly starting transaction. Which can't happen. I think..

ISTM Csaba is correct. If WAL apply blocks, the xids don't change, so the
snapshots never change, and so WAL apply stays blocked.

The only way out of that is to store up removals for particular blocks,
but that's complicated.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Simon Riggs

On Thu, 2008-09-11 at 01:07 +0100, Simon Riggs wrote:
 Transaction snapshots is probably the most difficult problem for Hot
 Standby to resolve. 

In summary of the thread so far:

When queries on standby run for significantly longer than longest
queries on primary, some problems can occur. Various people have argued
for these responses to the problems:

1. Master uses Standby's OldestXmin
Effects: 
* Long running queries on standby...
   Can delay row removal on primary
   Do not delay apply of WAL records on standby
* Queries on standby give consistent answers in all cases.

2. Master ignores Standby's OldestXmin
Effects:
* Long running queries on standby...
   Have no effect on primary
   Can delay apply of WAL records on standby
* Queries on standby give consistent answers in all cases.

3. Ignore problem
Effects:
* Long running queries on standby...
   Have no effect on primary
   Do not delay apply of WAL records on standby
* Queries on standby give inconsistent answers in some cases, though this
doesn't generate any message to show that an inconsistency occurred. Acceptable
only for read-only and insert-only tables.

Hot Standby should provide all 3 responses as options.

(1) would be implemented by sending the Standby's OldestXmin to the primary.
Snapshots would not be sent from the primary; they would be derived locally
from transactions currently being applied.

(2) would be implemented by setting a timer. When the Startup process has
waited for more than redo_max_delay/max_lag_delay (SIGHUP) we cancel
queries. If the timeout is 0 we cancel queries aggressively, without
waiting.

(3) would be implemented using read_consistency = on (default) | off, a
USERSET parameter. When read_consistency = off we ignore the backend's
xmin when deciding whether to wait before applying WAL or not.

Summary OK for everyone?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Commitfest patches mostly assigned ... status

2008-09-12 Thread KaiGai Kohei

Josh Berkus wrote:

Hackers,

At this point, almost all patches have been assigned to reviewers.  If 
you submitted a patch and don't get feedback by Saturday, take a look at 
who's in the reviewers column and send them a query.  Since I have no 
way to track when patches are assigned to reviewers, I have no idea if 
some of them are sitting on their hands.

   :
Note that patches need not have only one reviewer!  If you have time, 
please take on testing some of the more complex patches, especially:

Column-level Permissions
Common Table Expressions
SE-PostgreSQL patches


I updated my patches:
http://archives.postgresql.org/pgsql-hackers/2008-09/msg00859.php

It contains a rebase to the latest CVS HEAD, a new hook to prevent
bypassing access controls, and a small fix.

In addition, I attached several short explanations of key points needed
to understand the big patch. I have written documentation for SE-PostgreSQL,
but it is not a description of *the patch* itself.
If anything is unclear to reviewers, please feel free to ask me.

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]



Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Simon Riggs

On Thu, 2008-09-11 at 15:17 +0200, Jochem van Dieten wrote:
 On Thu, Sep 11, 2008 at 2:07 AM, Simon Riggs wrote:
  Transaction snapshots is probably the most difficult problem for Hot
  Standby to resolve.
 
   * remotely from primary node
   * locally on the standby node
 
  If we derive a snapshot locally, then we will end up with a situation
  where the xmin of the local snapshot precedes the xmin of the primary
  node. When this occurs it will then be possible for WAL records to
  arrive on the standby that request removal of rows that a transaction
  might wish to see. Preventing that situation can be done by either
  deferring WAL apply or by cancelling queries.
 
 Which operations can request row removal? Isn't that just specific
 operations that have their own 'is this safe to remove' calculations
 anyway (i.e. vacuum and HOT prune)?
 
 What I am thinking about is a design where the primary node were to
 regularly push an OldestXMin into the WAL, the WAL apply process on
 the standby nodes pushes it into shared memory and the backends keep
 an OldestMasterXMin in shared memory. The standby nodes then regularly
 pushes back the oldest OldestMasterXMin from all backends to the
 master. Vacuum and HOT prune could then base their calculations on an
 OldestXMin that is not the OldestXMin of the master itself, but of the
 master and the standby nodes. That way removal of records that are
 still visible on one of the standby nodes is prevented on the master
 instead of worked around on the standby nodes.
 
 The obvious downside would be bloat on the master (which could get out
 of hand if a slave is a few days behind due to a long report), but I
 think in terms of visibility and consistency this would work. Or am I
 completely misunderstanding the problem?

Yes, just sending the xmin from standby to primary is the best way of
having primary and standby work together. I will include this, thanks.
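
For the WAL direction, the record could be tiny. A sketch with made-up
names, using the XLogInsert interface of that era (XLOG_OLDEST_XMIN is not
a real record type, and the includes from access/xlog.h and access/rmgr.h
are assumed):

    /* Hypothetical WAL record: the primary periodically logs its
     * OldestXmin so standbys can derive snapshots locally. */
    typedef struct xl_oldest_xmin
    {
        TransactionId oldestXmin;   /* primary's OldestXmin at log time */
    } xl_oldest_xmin;

    static void
    LogOldestXmin(TransactionId oldestXmin)
    {
        xl_oldest_xmin  xlrec;
        XLogRecData     rdata;

        xlrec.oldestXmin = oldestXmin;

        rdata.data = (char *) &xlrec;
        rdata.len = sizeof(xlrec);
        rdata.buffer = InvalidBuffer;
        rdata.buffer_std = false;
        rdata.next = NULL;

        (void) XLogInsert(RM_XLOG_ID, XLOG_OLDEST_XMIN, &rdata);
    }

The feedback in the other direction (standby's OldestXmin back to the
master) would travel over the replication connection rather than through
WAL.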

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Hannu Krosing
On Fri, 2008-09-12 at 11:21 +0200, Csaba Nagy wrote:
 On Fri, 2008-09-12 at 09:38 +0100, Simon Riggs wrote:
  If you request a block, we check to see whether there is a lookaside
  copy of it prior to the tuple removals. We then redirect the block
  request to a viewpoint relation's block. Each viewpoint gets a separate
  relfilenode. We do the switcheroo while holding cleanup lock on block.
 
 Wouldn't it make sense to also have a hint bit on the pages which are
  copied away?

There is no need for a hint bit (and one bit would not be enough anyway,
as we may need to keep multiple versions).

An LSN of the last row-removal (VACUUM or HOT-pruning) would serve as a hint
to start digging around in hash tables.

It seems, though, that you may have to look into several disk snapshots
to find the page you need.

 Then instead of looking up a hash table, you first would
 look up that bit, and if not set you won't look up the hash table at
 all.  Then when you clean up the lookaside copies you clear those bits
 too...
 
 That would probably perform somewhat better for reading than always
 looking up a potentially big hash table, and the cost of setting the
 hint is probably a lot less than copying away the page in the first
 place. Resetting the hint bit might be a bit more expensive.
 
 Cheers,
 Csaba.
 
 




Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Csaba Nagy
On Fri, 2008-09-12 at 09:38 +0100, Simon Riggs wrote:
 If you request a block, we check to see whether there is a lookaside
 copy of it prior to the tuple removals. We then redirect the block
 request to a viewpoint relation's block. Each viewpoint gets a separate
 relfilenode. We do the switcheroo while holding cleanup lock on block.

Wouldn't it make sense to also have a hint bit on the pages which are
copied away? Then instead of looking up a hash table, you first would
look up that bit, and if not set you won't look up the hash table at
all.  Then when you clean up the lookaside copies you clear those bits
too...

That would probably perform somewhat better for reading than always
looking up a potentially big hash table, and the cost of setting the
hint is probably a lot less than copying away the page in the first
place. Resetting the hint bit might be a bit more expensive.

Cheers,
Csaba.





Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-12 Thread Magnus Hagander
Brendan Jurd wrote:
 Hi Magnus,
 
 Josh has assigned your patch to me for an initial review.
 
 First up I'd like to say that this is a really nice upgrade.
 Shielding a running server from reloading a bogus conf file makes a
 whole lot of sense.

Hi!

Thanks for your review.

 The patch doesn't include any updates to documentation.  I humbly
 suggest that the change to the ident map (making sameuser the
 default) should be mentioned both in the Manual itself, and in the
 sample conf file comments.  Here are a few sites that may be in want
 of an update:

Yes, absolutely. Thanks for the couple of pointers, that'll help me not
to miss them :-)

 The new error messages are good, but I wonder if they could be
 improved by using DETAIL segments.  The guidelines on error message
 style say that the primary message should be terse and make a
 reasonable effort to fit on one line.  For example, this:
 
 LOG:  invalid IP address a.b.c.d in file
 /home/direvus/src/postgres/inst/data/pg_hba.conf line 77: Name or
 service not known
 
 Could be rewritten as something like this:
 
 LOG:  invalid IP address a.b.c.d in auth config: Name or service not 
 known
 DETAIL: In file /home/direvus/src/postgres/inst/data/pg_hba.conf, line 
 77

Makes a lot of sense, I'll go about making that change.
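
For illustration, the reworked report might be produced along these lines
(a sketch; token, ret and line_num are assumed local variables, while
HbaFileName is the existing global):

    /* Sketch: keep the primary message terse, move the location to DETAIL */
    ereport(LOG,
            (errmsg("invalid IP address \"%s\" in auth config: %s",
                    token, gai_strerror(ret)),
             errdetail("In file \"%s\", line %d.",
                       HbaFileName, line_num)));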

//Magnus



Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Simon Riggs

On Thu, 2008-09-11 at 18:17 +0300, Heikki Linnakangas wrote:
 Fujii Masao wrote:
  I think that this case would often happen. So, we should establish a certain
  solution or procedure for the case where the TLI of the master doesn't match
  the TLI of the slave. If we only allow the case where the TLI of both servers is the
  same, the configuration after failover always needs to get the base backup
  on the new master. It's unacceptable for many users. But, I think that it's
  the role of admin or external tools to copy history files to the slave from
  the master.
 
 Hmm. There are more problems than the TLI with that. For the original 
 master to catch up by replaying WAL from the new slave, without 
 restoring from a full backup, the original master must not write to disk 
 *any* WAL that hasn't made it to the slave yet. That is certainly not 
 true for asynchronous replication, but it also throws off the idea of 
 flushing the WAL concurrently to the local disk and to the slave in 
 synchronous mode.
 
 I agree that having to get a new base backup to get the old master catch 
 up with the new master sucks, so I hope someone sees a way around that.

If we were going to recover from a failed-over standby back to the original
master just via WAL logs, we would need all of the WAL files from the
point of failover. So you'd need to be storing all WAL files just in case
the old master recovers. I can't believe doing that would be the common
case, because it's so impractical and most people would run out of disk
space and need to delete WAL files.

It should be clear that to make this work you must run with a base
backup that was derived correctly on the current master. You can do that
by re-copying everything, or you can do that by just shipping changed
blocks (rsync etc). So I don't see a problem in the first place.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Simon Riggs

On Thu, 2008-09-11 at 12:18 +0100, Gregory Stark wrote:

 Is there any middle ground or brilliant ways to get the best of both worlds?

Possibly. Nobody has commented yet on the other ideas on the post
itself.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Hannu Krosing
On Fri, 2008-09-12 at 11:19 +0100, Simon Riggs wrote:
 On Thu, 2008-09-11 at 01:07 +0100, Simon Riggs wrote:
  Transaction snapshots is probably the most difficult problem for Hot
  Standby to resolve. 
 
 In summary of thread so far:
 
 When queries on standby run for significantly longer than longest
 queries on primary, some problems can occur. Various people have argued
 for these responses to the problems:
 
 1. Master uses Standby's OldestXmin
...
 2. Master ignores Standby's OldestXmin
...
 3. Ignore problem
...

 Summary OK for everyone?

Maybe we should at least mention option 4.

4. Slave keeps copies of removed pages or rows when WAL apply removes
old versions.

Possible ways to do this

 * inside Slave - have some backup store tied to OldestXmin intervals

  * variant 1 - have one global store, accessed through shared mem
  * variant 2 - present removed pages to interested backends and 
let them (decide to) keep them 

 * outside Slave - having file system keep old snapshots as long as
needed, still must tie to OldestXmin intervals, but most of work done by
storage layer (SAN or overlay file system).

-
Hannu





Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Simon Riggs

On Fri, 2008-09-12 at 00:03 +0900, Fujii Masao wrote:

 In my procedure, old WAL files are copied by the admin using scp, rsync,
 or another external tool. So I don't think that my procedure makes the
 problem more difficult. Since there are many setup cases, we should
 not leave all procedures to postgres, I think.

So the procedure is:

1. Startup WALReceiver to begin receiving WAL
2. Do some manual stuff
3. Initiate recovery

So either

* WALReceiver is not started by postmaster. 
I don't think it's acceptable that WALReceiver is not under the
postmaster. You haven't reduced the number of failure modes by doing
that, you've just swept the problem under the carpet and pretended it's
not Postgres' problem.

* Postgres startup requires some form of manual process, as an
**intermediate** stage.

I don't think either of those is acceptable. It must just work.

Why not:
1. Same procedure as Warm Standby now
a) WAL archiving to standby starts
b) base backup

2. Startup standby, with additional option to stream WAL. WALReceiver
starts, connects to Primary. Primary issues log switch. Archiver turns
itself off after sending that last file. WALSender starts streaming
current WAL immediately after log switch.

3. Startup process on standby begins reading WAL from point mentioned by
backup_label. When it gets to last logfile shipped by primary's
archiver, it switches to reading WAL files written by WALReceiver.

So all automatic. Uses existing code. Synchronous replication starts
immediately. Also has the advantage that we do not get WAL bloat on
primary. Configuration is almost identical to current Warm Standby, so
little change for existing Postgres sysadmins.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] NDirectFileRead and Write

2008-09-12 Thread ITAGAKI Takahiro

Tom Lane [EMAIL PROTECTED] wrote:

 * Let's rename them BufFileReadCount and BufFileWriteCount to reflect
 their actual purpose.

It looks good. No one uses the NDirectFileRead/Write variables now,
so we can rename them freely.

BTW, we have the word 'Direct' in ShowBufferUsage().
Can I keep it as-is?

ShowBufferUsage()
!   Shared blocks: ...
!   Local  blocks: ...
!   Direct blocks: R read, W written


 * In any case I agree that the current arrangement
 with execdebug.h declaring variables defined in bufmgr.c is just weird.

Ok.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Hannu Krosing
On Fri, 2008-09-12 at 13:53 +0300, Hannu Krosing wrote:

 4. Slave keeps copies of removed pages or rows when WAL apply removes
 old versions.
 
 Possible ways to do this
 
  * inside Slave - have some backup store tied to OldestXmin intervals
 
   * variant 1 - have one global store, accessed through shared mem
   * variant 2 - present removed pages to interested backends and 
 let them (decide to) keep them 
 
  * outside Slave - having file system keep old snapshots as long as
 needed, still must tie to OldestXmin intervals, but most of work done by
 storage layer (SAN or overlay file system).

Possible options for outside-Slave filesystem snapshotting -

ZFS ( http://en.wikipedia.org/wiki/ZFS ) is very likely usable 

Linux LVM + XFS may be usable -
http://arstechnica.com/articles/columns/linux/linux-20041013.ars

Possibly also http://en.wikipedia.org/wiki/Btrfs .


Hannu





Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Richard Huxton
Gregory Stark wrote:
 In that case the problem is dealing with different usage patterns on different
 tables. There might be a way to solve just that use case such as deferring WAL
 records for those tables. That doesn't guarantee inter-table data consistency
 if there were other queries which read from those tables and updated other
 tables based on that data though. Perhaps there's a solution for that too
 though.

There was a suggestion (Simon - from you?) of a transaction voluntarily
restricting itself to a set of tables. That would obviously reduce the
impact of all the options where the accessed tables weren't being
updated (where update = vacuum + HOT if I've got this straight).

-- 
  Richard Huxton
  Archonet Ltd



Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes:

 3. Ignore problem
 Effects:
 * Long running queries on standby...
Have no effect on primary
Do not delay apply of WAL records on standby
 * Queries on standby give inconsistent answers in some cases, though this
 doesn't generate any message to show that an inconsistency occurred. Acceptable
 only for read-only and insert-only tables.

This seems like a non-starter. 

Your comment about read-only and insert-only tables only seems to make sense
if you assume there are other tables being updated simultaneously. Otherwise
of course there would be no WAL records for tuple removals.

In that case the problem is dealing with different usage patterns on different
tables. There might be a way to solve just that use case such as deferring WAL
records for those tables. That doesn't guarantee inter-table data consistency
if there were other queries which read from those tables and updated other
tables based on that data though. Perhaps there's a solution for that too
though.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Hannu Krosing
On Fri, 2008-09-12 at 12:31 +0300, Hannu Krosing wrote:
 On Fri, 2008-09-12 at 09:45 +0100, Simon Riggs wrote:
  On Thu, 2008-09-11 at 15:42 +0300, Heikki Linnakangas wrote:
   Gregory Stark wrote:
    b) vacuum on the server which cleans up a tuple the slave has in scope
       has to block WAL replay on the slave (which I suppose defeats the
       purpose of having a live standby for users concerned more with
       fail-over latency).
   
   One problem with this, BTW, is that if there's a continuous stream of 
   medium-length transactions in the slave, each new snapshot taken will 
   prevent progress in the WAL replay, so the WAL replay will advance in 
   baby steps, and can fall behind indefinitely. As soon as there's a 
   moment that there's no active snapshot, it can catch up, but if the 
   slave is seriously busy, that might never happen.
  
  It should be possible to do mixed mode.
  
  Stall WAL apply for up to X seconds, then cancel queries. Some people
  may want X=0 or low, others might find X = very high acceptable (Merlin
  et al).
 
 Or even milder version.
 
 * Stall WAL apply for up to X seconds, 
 * then stall new queries, let old ones run to completion (with optional
 fallback to canceling after Y sec), 
 * apply WAL. 
 * Repeat.

Now that I have thought a little more about delegating keeping old
versions to filesystem level (ZFS , XFS+LVM) snapshots I'd like to
propose the following:

 0. run queries and apply WAL freely until WAL application would 
remove old rows.

 1. stall applying WAL for up to N seconds

 2. stall starting new queries for up to M seconds

 3.  if some backends are still running long queries, then 

   3.1. make filesystem level snapshot (FS snapshot), 
   3.2. mount the FS snapshot somewhere (maybe as data.at.OldestXmin 
in parallel to $PGDATA) and 
   3.3 hand this mounted FS snapshot over to those backends

 4. apply WAL

 5. GoTo 0.

Of course we need to do the filesystem level snapshots in 3. only if the
long-running queries don't already have one given to them. Or maybe also,
if they are running in READ COMMITTED mode and have acquired a new PG
snapshot since they got their FS snapshot, they need a new one.

Also, snapshots need to be reference counted, so we can unmount and
destroy them once all their users have finished.

I think that enabling long-running queries this way is both low-hanging
fruit (or at least medium-height-hanging ;) ) and also consistent with the
PostgreSQL philosophy of not replicating effort. As an example, we trust the
OS's file system cache and don't try to write our own.


Hannu












Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Csaba Nagy
On Fri, 2008-09-12 at 12:31 +0100, Richard Huxton wrote:
 There was a suggestion (Simon - from you?) of a transaction voluntarily
 restricting itself to a set of tables.

While thinking about how easy it would be for the DBA to specify the set
of tables a single query is accessing, first I thought that it should be
straightforward enough to look at the query itself for that. Then I thought:
what about views, rules, triggers, user functions etc.? All those have
the potential to access more than you see in the query itself. And then
the actually interesting question: what will the slave do with views,
rules, triggers? I guess triggers are out of the question to be
executed; what about rules? They probably must also be ignored... user
functions will probably get errors if they try to update something...
Views should probably function correctly.

So in any case the functionality available for querying slaves would be
less than for the primary. This is probably good enough for most
purposes...

Cheers,
Csaba.





Re: [HACKERS] NDirectFileRead and Write

2008-09-12 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
 BTW, we have the word 'Direct' in ShowBufferUsage().
 Can I keep it as-is?

 ShowBufferUsage()
 !   Shared blocks: ...
 !   Local  blocks: ...
 !   Direct blocks: R read, W written

Good point.  Seems like it should be changed, but I am not sure to what.
Ideas anyone?

regards, tom lane



Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Csaba Nagy
 I think that enabling long-running queries this way is both low-hanging
 fruit (or at least medium-height-hanging ;) ) and also consistent with the
 PostgreSQL philosophy of not replicating effort. As an example we trust the
 OS's file system cache and don't try to write our own.

Again I have questions (unfortunately I usually only have questions):

* how will the buffers keep 2 different versions of the same page?
* how will you handle the creation of snapshots? I guess there's no portable 
and universal API for that (just guessing), or is there some POSIX thing which 
is supported or not by the specific FS? So if the FS does not support it, you 
skip the snapshot step? And if there's no universal API, will it be handled by 
plugins providing a specified API for snapshotting the FS?

I hope my continuous questioning is not too annoying...

Cheers,
Csaba.





Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Hannu Krosing
On Fri, 2008-09-12 at 13:54 +0200, Csaba Nagy wrote:
  I think that enabling long-running queries this way is both
  low-hanging
  fruit (or at least medium-height-hanging ;) ) and also consistent to
  PostgreSQL philosophy of not replication effort. As an example we trust
  OS's file system cache and don't try to write our own.
 
 I have again questions (unfortunately I only have questions usually):
 
 * how will the buffers keep 2 different versions of the same page ?

As the FS snapshot is mounted as a different directory, it will have
its own buffer pages.

To conserve RAM, one could go to the FS snapshot files only when a main
page's LSN is too new to be trusted.

 * how will you handle the creation of snapshots?

probably an external command, possibly a shell script,
similar to the current archive_command for WAL copying;

maybe 'create_fs_snapshot_command' and 'destroy_fs_snapshot_command'

 I guess there's no portable and universal API for that (just guessing),
 or is there some POSIX thing which is supported or not by the specific FS?
 So if the FS does not support it, you skip the snapshot step?

Yes, if FS snapshots are not supported, we fall back to either
inconsistent reads or killing long-running queries.

 And if there's no universal API, will it be handled by plugins providing 
 a specified API for snapshotting the FS ?

Yes, the simplest one being an external command. As FS snapshots are
supposed to happen not too often, at least not every second, just having
external commands may be enough.
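
As a sketch of how thin that interface could be, modeled loosely on how
archive_command is executed (everything here, including the GUC variable
and its %s placeholder, is hypothetical):

    #include "postgres.h"
    #include "lib/stringinfo.h"

    /* hypothetical GUC, e.g. "zfs snapshot tank/pgdata@%s" */
    extern char *create_fs_snapshot_command;

    static bool
    CreateFsSnapshot(const char *snapname)
    {
        StringInfoData  cmd;
        const char     *p;
        int             rc;

        initStringInfo(&cmd);

        /* substitute the snapshot name for %s, as archive_command
         * substitutes %p and %f */
        for (p = create_fs_snapshot_command; *p; p++)
        {
            if (p[0] == '%' && p[1] == 's')
            {
                appendStringInfoString(&cmd, snapname);
                p++;
            }
            else
                appendStringInfoChar(&cmd, *p);
        }

        rc = system(cmd.data);
        pfree(cmd.data);

        return rc == 0;
    }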

 I hope my continuous questioning is not too annoying...

On the contrary, much appreciated. :)


Hannu





Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Csaba Nagy
On Fri, 2008-09-12 at 15:08 +0300, Hannu Krosing wrote:
  * how will the buffers keep 2 different versions of the same page?
 
 As the FS snapshot is mounted as a different directory, it will have
 it's own buffer pages.

Lack of knowledge about this shows my ignorance about the implementation
of the page buffers...

  * how will you handle the creation of snapshots?
 
 probably an external command, possibly shell script. 
 similar to current archive_command for wal copying
 
 maybe 'create_fs_snapshot_command' and 'destroy_fs_snapshot_command'
[snip]
 Yes, the simplest one being external command. As FS snapshots are
 supposed to happen not-too often, at least not every second, just having
 external commands may be enough.

You could restrict the creation of snapshots to some minimum amount of
time between them, and maybe also restrict the maximum number of
concurrent snapshots possible. Then if the time limit (as calculated
from the last open snapshot) is currently not met, any new query could
reuse that last snapshot. The time intervals do not need to be evenly
distributed, BTW; it could be a function of the already opened snapshots,
like increasing the minimum interval exponentially with the number of
already opened snapshots. That would help to confine more long-running
queries to just a few snapshots.
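
In code form the rule could be as small as this (a sketch; the names and
the exact growth function are placeholders, and a real implementation
would cap the shift):

    #include <time.h>

    /* Grow the minimum interval exponentially with the number of
     * snapshots already open: base, 2*base, 4*base, ... */
    static bool
    fs_snapshot_due(time_t last_snapshot, int n_open_snapshots,
                    int base_interval_secs)
    {
        time_t  min_interval = (time_t) base_interval_secs << n_open_snapshots;

        return (time(NULL) - last_snapshot) >= min_interval;
    }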

  I hope my continuous questioning is not too annoying...
 
 On the contrary, much appreciated. :)

Ok, then I'll continue :-) I would like to see this feature succeed, but
there's a slim chance I'll ever code well in C...

Cheers,
Csaba.





Re: [HACKERS] Better auth errors from libpq

2008-09-12 Thread David Fetter
On Fri, Sep 12, 2008 at 10:08:56AM +0200, Markus Schiltknecht wrote:
 Hi,

 David Fetter wrote:
 I'm all for something, and that's a much better something.  What we
 have now--nothing--actively distresses newbies for no good reason.

 I don't know how many people we've lost right at that point, but
 the number has to be high, as most people don't just hop into IRC
 with their problem.

 Maybe something much more specific, i.e. triggering only if one
 tried to connect via localhost or unix sockets, and only if one
 tried to authenticate as 'root' without a password.

It's not the root part that confuses people, but the entire message.

 The hint should IMO say something like: "The default superuser is
 postgres, not root." Something that's useful for this specific case
 and doesn't disturb in others. And something that's public
 knowledge, which any reasonably serious attacker already knows
 anyway.

I, too, disagree with the security-by-obscurity approach to auth
error messages.  A system cracker will not be deterred by any such
thing, but a new user can easily be.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_regress inputdir

2008-09-12 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Alvaro Herrera wrote:
 I tried to run the test from another directory with this patch
 installed, and found that it didn't work because it's replacing
 @abs_builddir@ in the input files improperly (to the current path; it
 should be using the output dir path, I think)

 I think the appropriate interface would be adding another option to 
 pg_regress called --workdir or --tempdir, which defaults to PWD, and 
 write the converted sql files there, and then look for the sql files to 
 execute in workdir/sql and in inputdir/sql.  In some way, this copies 
 the vpath search mechanism.

That would be required to make pg_regress run as far as its own
facilities are concerned.  But I think Alvaro is worried about something
at a higher level: the regression test process as a whole has some
directory layout assumptions built into it, particularly in regards
to where to find .so's.  If we don't have a workable solution for that
it's not really going to help to change pg_regress like this.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better auth errors from libpq

2008-09-12 Thread David Fetter
On Fri, Sep 12, 2008 at 08:53:39AM +0100, Gregory Stark wrote:
 
 Tom Lane [EMAIL PROTECTED] writes:
 
  Joshua Drake [EMAIL PROTECTED] writes:
  I think something like:
 
  psql: FATAL:  Ident authentication failed for user root
  HINT: http://www.postgresql.org/docs/8.3/static/client-authentication.html
 
  Would be nice.
 ...
 
  Or to put it even more baldly: this is not an area in which you
  can improve matters significantly with five minutes' thought and a
  one-line patch.  It would take some actual work.
 
 Actually I think there is a problem with the original message that
 could be improved. The problem is that Ident is a
 Postgres-specific term that a newbie DBA is unlikely to understand.
 What's worse it's an ambiguous term that is easily misunderstood to
 refer to the rfc1413 ident protocol which Postgres might or might
 not be using.

For Ident auth, we need to split that functionality into two
separate pieces: socket and network, and add descriptive error
messages for each.

 I would suggest instead describing it using more generic terminology
 though offhand I'm not sure what that would be.  A detail line could
 include the Postgres-specific authentication method which failed.

Excellent idea :)

 I do think it's true that the pg_hba setup is far more complex than
 it has to be and that that's a bigger problem than a simple error
 message too.

Agreed.  Any ideas as to how we might address this?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New FSM patch

2008-09-12 Thread Zdenek Kotala

Heikki Linnakangas napsal(a):

Heikki Linnakangas wrote:
I've also been working on a low level benchmark using a C user-defined 
function that exercises just the FSM, showing the very raw CPU 
performance vs. current implementation. More on that later, but ATM it 
looks like the new implementation can be faster or slower than the 
current one, depending on the table size.


Let me describe this test case first:
- The test program calls RecordAndGetPageWithFreeSpace in a tight loop, 
with random values. There's no activity on the heap. In normal usage, 
the time spent in RecordAndGetPageWithFreeSpace is minuscule compared to 
the heap and index updates that cause it to be called.
- WAL was placed on a RAM drive. This is of course not how people set up 
their database servers, but the point of this test was to measure CPU 
speed and scalability. The impact of writing extra WAL is significant 
and needs to be taken into account, but that's a separate test and 
discussion, and needs to be considered in comparison to the WAL written 
by heap and index updates.


That said, the test results are pretty interesting.

I ran the test using custom scripts with pgbench. I ran it with 
different table sizes, and with 1 or 2 clients, on CVS HEAD and a 
patched version. The unit is thousands of RecordAndGetPageWithFreeSpace 
calls per second:


Table size      Patched             CVS HEAD
                1 clnt    2 clnts   1 clnt    2 clients
8 kB              4.59      3.45     62.83     26.85
336 kB           13.85      6.43     41.8      16.55
3336 kB          14.96      6.3      22.45     10.55
33336 kB         14.85      6.56      5.44      4.08
333336 kB        14.48     11.04      0.79      0.74
3333336 kB       12.68     11.5       0.07      0.07
33333336 kB       7.67      5.37      0.05      0.05

The big surprise to me was that performance on CVS HEAD tanks as the 
table size increases. One possible explanation is that searches for X 
bytes of free space, for a very high X, will not find any matches, and 
the current FSM implementation ends up scanning through the whole FSM 
list for that relation.


Another surprise was how badly both implementations scale. On CVS HEAD, 
I expected the performance to be roughly the same with 1 and 2 clients, 
because all access to the FSM is serialized on the FreeSpaceLock. But 
adding the 2nd client not only didn't help, but it actually made the 
performance much worse than with a single client. Context switching or 
cache line contention, perhaps? 


The new FSM implementation shows the 
same effect, which was an even bigger surprise. At table sizes > 32 MB, 
the FSM no longer fits on a single FSM page, so I expected an almost 
linear speedup with bigger table sizes from using multiple clients. 
That's not happening, and I don't know why. Although, going from 33 MB to 
333 MB, the performance with 2 clients almost doubles, but it still 
doesn't exceed that with 1 client.


It looks like there are a lot of lock issues on FSM pages. When the number of 
FSM pages is increased, the number of collisions is lower. That is probably why 2 
clients speed up significantly between 33 MB and 333 MB. I think it is time to 
bring out DTrace ;-).
Do you have any machine with DTrace support? If not, send me your test suite and I 
will try to run it on my machine.


Zdenek





--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Simon Riggs

On Fri, 2008-09-12 at 12:25 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  3. Ignore problem
  Effects:
  * Long running queries on standby...
 Have no effect on primary
 Do not delay apply of WAL records on standby
  * Queries on standby give inconsistent answers in some cases, though
  doesn't generate any messages to show inconsistency occurred. Acceptable
  for read-only and insert only tables only.
 
 This seems like a non-starter. 

It works, and is proposed as a non-default option since a number of
people have independently said to me that this would be
acceptable/preferred.

 Your comment about read-only and insert-only tuples only seems to make sense
 if you assume there are other tables being updated simultaneously. Otherwise
 of course there would be no WAL records for tuple removals.

Yeh, you got it.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New FSM patch

2008-09-12 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Let me describe this test case first:
 - The test program calls RecordAndGetPageWithFreeSpace in a tight loop, 
 with random values.

What's the distribution of the random values, exactly?  In particular,
how do the request sizes compare to available free space per-page?

The design intent for FSM was that we'd not bother to record pages that
have less free space than the average request size, so as to (usually)
avoid the problem of uselessly searching a lot of entries.  I can't tell
whether your test case models that behavior at all.  If it does then
there may be something else that needs fixing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Heikki Linnakangas

Simon Riggs wrote:

If we were going to recover from a failed-over standby back to the original
master just via WAL logs we would need all of the WAL files from the
point of failover. So you'd need to be storing all WAL files just in case
the old master recovers. I can't believe doing that would be the common
case, because it's so impractical and most people would run out of disk
space and need to delete WAL files.


Depends on the transaction volume and database size of course. It's 
actually not any different from the scenario where the slave goes 
offline for some reason. You have the same decision there of how 
long to keep the WAL files in the master, in case the slave wakes up.


I think we'll need an option to specify a maximum for the number of WAL 
files to keep around. The DBA should set that to the size of the WAL 
drive, minus some safety factor.



It should be clear that to make this work you must run with a base
backup that was derived correctly on the current master. You can do that
by re-copying everything, or you can do that by just shipping changed
blocks (rsync etc). So I don't see a problem in the first place.


Hmm, built-in rsync capability would be cool. Probably not in the first 
phase, though..


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Hannu Krosing
On Fri, 2008-09-12 at 17:08 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
 
  It should be clear that to make this work you must run with a base
  backup that was derived correctly on the current master. You can do that
  by re-copying everything, or you can do that by just shipping changed
  blocks (rsync etc). So I don't see a problem in the first place.
 
 Hmm, built-in rsync capability would be cool. Probably not in the first 
 phase, though..

We have it for WAL shipping, in the form of the GUC archive_command  :)

Why not add a full_backup_command?

--
Hannu



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-12 Thread D'Arcy J.M. Cain
On Fri, 12 Sep 2008 06:53:55 +1000
Brendan Jurd [EMAIL PROTECTED] wrote:
 Josh has assigned your patch to me for an initial review.

And me.

 First up I'd like to say that this is a really nice upgrade.
 Shielding a running server from reloading a bogus conf file makes a
 whole lot of sense.

Yes.

 The patch applied cleanly to HEAD, compiled fine on amd64 gentoo and

I had a small problem compiling.  I'm not sure why it would be
different for me.  I run NetBSD -current.  Here is the error:

../../../src/include/libpq/hba.h:51: error: field 'addr' has incomplete type

I was able to fix this by adding the following line to hba.h:

#include "libpq/pqcomm.h"   /* needed for struct sockaddr_storage */

-- 
D'Arcy J.M. Cain [EMAIL PROTECTED] |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_regress inputdir

2008-09-12 Thread Alvaro Herrera
Tom Lane wrote:
 Peter Eisentraut [EMAIL PROTECTED] writes:

  I think the appropriate interface would be adding another option to 
  pg_regress called --workdir or --tempdir, which defaults to PWD, and 
  write the converted sql files there, and then look for the sql files to 
  execute in workdir/sql and in inputdir/sql.  In some way, this copies 
  the vpath search mechanism.
 
 That would be required to make pg_regress run as far as its own
 facilities are concerned.  But I think Alvaro is worried about something
 at a higher level: the regression test process as a whole has some
 directory layout assumptions built into it, particularly in regards
 to where to find .so's.  If we don't have a workable solution for that
 it's not really going to help to change pg_regress like this.

Maybe the same work dir can be used as a place to store the shared
objects.  I think all it'd require is to change @abs_builddir@ to point
to workdir.

That should work fine as long as nobody attempts to put the workdir in
some mount point that's marked noexec (which is somewhat common with
/tmp).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-12 Thread Simon Riggs

On Fri, 2008-09-12 at 15:08 +0300, Hannu Krosing wrote:
 On Fri, 2008-09-12 at 13:54 +0200, Csaba Nagy wrote:
  I think that enabling long-running queries this way is both low-hanging
  fruit (or at least medium-height-hanging ;) ) and also consistent with
  PostgreSQL's philosophy of not replicating effort. As an example we trust
  the OS's file system cache and don't try to write our own.
  
  I have again questions (unfortunately I only have questions usually):
  
  * how will the buffers keep 2 different versions of the same page ?
 
 As the FS snapshot is mounted as a different directory, it will have
 its own buffer pages.

RelFileNode has a spcNode which can be redirected to a temporary
filesystem snapshot. So it's relatively easy to imagine redirecting
access to a table from its normal tablespace to the snapshot one.

 To conserve RAM, one could go to FS snapshot files only in case main
 pages have LSN too big to be trusted.

That would mean you'd need to do two I/Os, one to get the newly changed
page to get its LSN and another to get the old COW copy. We might waste
buffer space with that technique also. Since we'd be trying to avoid
caching bigger tables anyway (since 8.3) it seems easier to just go
straight to the COW copy.

So I think its fairly straightforward to support temporary snapshots in
Postgres, with creation/destruction handled in the way you say.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Simon Riggs

On Fri, 2008-09-12 at 17:08 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:

 I think we'll need an option to specify a maximum for the number of WAL 
 files to keep around. The DBA should set that to the size of the WAL 
 drive, minus some safety factor.
 
  It should be clear that to make this work you must run with a base
  backup that was derived correctly on the current master. You can do that
  by re-copying everything, or you can do that by just shipping changed
  blocks (rsync etc). So I don't see a problem in the first place.
 
 Hmm, built-in rsync capability would be cool. Probably not in the first 
 phase, though..

Built-in? Why? I mean make base backup using rsync. That way only
changed data blocks need be migrated, so much faster.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Csaba Nagy
On Fri, 2008-09-12 at 17:24 +0300, Hannu Krosing wrote:
 On Fri, 2008-09-12 at 17:08 +0300, Heikki Linnakangas wrote:
  Hmm, built-in rsync capability would be cool. Probably not in the first 
  phase, though..
 
 We have it for WAL shipping, in the form of the GUC archive_command  :)
 
 Why not add a full_backup_command?

I see the current design is all master-push centered, i.e. the master is
in control of everything WAL related. That makes it hard to create a
slave which is simply pointed at the server and takes all its data from
there...

Why not have a design where the slave is in control of its own data?
I mean the slave could ask for the base files (possibly through a
special function deployed on the master), then ask for the WAL stream
and so on. That would easily let a slave cascade too, as it could relay
the WAL stream and serve the base backup too... or have special WAL
repository software with the same interface as a normal master, but
with a choice of base backups and WAL streams. Plus, a slave-in-control
approach would also allow multiple slaves at the same time for a
given master...

The way it would work would be something like:

* configure the slave with a postgres connection to the master;
* the slave will connect and set up some meta data on the master
identifying itself and telling the master to keep the WAL needed by this
slave, and also get some meta data about the master's details if needed;
* the slave will call a special function on the master and ask for the
base backup to be streamed (potentially compressed with special
knowledge of postgres internals);
* once the base backup is streamed, or possibly in parallel,  ask for
streaming the WAL files;
* when the base backup is finished, start applying the WAL stream, which
has been cached in the meantime and whose streaming continues;
* keep the master updated about the state of the slave, so the master
can know if it needs to keep the WAL files which were not yet streamed;
* in case of network error, the slave connects again and starts to
stream the WAL from where it was left;
* in case of extended network outage, the master could decide to
unsubscribe the slave when a certain time-out happened;
* when the slave finds itself unsubscribed after a longer disconnection,
it could ask for a new base backup based on differences only... some
kind of built-in rsync thingy;

The only downside of this approach is that the slave machine needs a
full postgres superuser connection to the master. That could be a
security problem in certain scenarios. The master-centric scenario needs
a connection in the other direction, which might be seen as more secure;
I don't know for sure...

Cheers,
Csaba.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New FSM patch

2008-09-12 Thread Heikki Linnakangas

Zdenek Kotala wrote:
It looks like there are a lot of lock issues on FSM pages. When the 
number of FSM pages is increased, the number of collisions is lower. That 
is probably why 2 clients speed up significantly between 33 MB and 333 MB. 


Yes, that's what I thought as well. With table sizes under 33 MB, the FSM 
consists of just one (bottom-level) FSM page, so all access is serialized 
on that single page.



I think it is time to bring out DTrace ;-).
Do you have any machine with DTrace support? 


No.

If not, send me your test 
suite and I will try to run it on my machine.


Sure, here you are. tests.sh is the main script to run. You'll need to 
adjust the paths there for your environment.


As it is, the tests will take many hours to run, so you'll probably want 
to modify tests.sh and pgbenchtests.sh to reduce the number of 
iterations. At least on my server, the variance in the numbers was very 
small, so repeating the tests 4 times in tests.sh is probably overkill.


Thanks!

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


fsmtest.tar.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New FSM patch

2008-09-12 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:

Let me describe this test case first:
- The test program calls RecordAndGetPageWithFreeSpace in a tight loop, 
with random values.


What's the distribution of the random values, exactly?  In particular,
how do the request sizes compare to available free space per-page?


The request and old-avail sizes are in the range of 0-8100 
(random() % 8100).



The design intent for FSM was that we'd not bother to record pages that
have less free space than the average request size, so as to (usually)
avoid the problem of uselessly searching a lot of entries.  I can't tell
whether your test case models that behavior at all.  If it does then
there may be something else that needs fixing.


Probably not. The test case starts with a table that's practically 
empty, so all pages are put into the FSM.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move src/tools/backend/ to wiki

2008-09-12 Thread David Fetter
On Fri, Sep 12, 2008 at 10:29:54AM +0300, Peter Eisentraut wrote:
 Alvaro Herrera wrote:
 However I wonder how much value there really is in the developer's
 FAQ, considering that some answers seem rather poor.  For example
 the answer on ereport() was wrong, and nobody ever pointed it out.
 The answer on palloc/pfree is very incomplete too.

 I think the developer's FAQ has essentially been unmaintained for
 many  years.  I think we should gradually migrate the content to
 other wiki  pages and eventually drop the FAQ.

+1 :)

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Heikki Linnakangas

Simon Riggs wrote:

Built-in? Why? I mean make base backup using rsync. That way only
changed data blocks need be migrated, so much faster.


Yes, what I meant is that it would be cool to have that functionality 
built-in, so that you wouldn't need to configure extra rsync scripts and 
authentication etc.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move src/tools/backend/ to wiki

2008-09-12 Thread Alvaro Herrera
Peter Eisentraut wrote:
 Alvaro Herrera wrote:
 However I wonder how much value there really is in the developer's FAQ,
 considering that some answers seem rather poor.  For example the
 answer on ereport() was wrong, and nobody ever pointed it out.  The
 answer on palloc/pfree is very incomplete too.

 I think the developer's FAQ has essentially been unmaintained for many  
 years.  I think we should gradually migrate the content to other wiki  
 pages and eventually drop the FAQ.

In a way, this has already started; some answers are now just
pointers to other wiki pages or to the docs.

I think getting rid of the FAQ completely is not necessarily a good
idea; it seems useful as a collection of interesting questions.  Moving
the contents to new pages is probably OK.  Also, as the answers mature
on the Wiki, perhaps it'd be possible to move them to the SGML docs (and
reduce the Wiki answer to just a pointer).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Andrew Dunstan



Heikki Linnakangas wrote:

Simon Riggs wrote:

Built-in? Why? I mean make base backup using rsync. That way only
changed data blocks need be migrated, so much faster.


Yes, what I meant is that it would be cool to have that functionality 
built-in, so that you wouldn't need to configure extra rsync scripts 
and authentication etc.




If this were a nice pluggable library I'd agree, but AFAIK it's not, and 
I don't see great value in reinventing the wheel.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Heikki Linnakangas

Csaba Nagy wrote:

Why not have a design where the slave is in control of its own data?
I mean the slave could ask for the base files (possibly through a
special function deployed on the master), then ask for the WAL stream
and so on. That would easily let a slave cascade too, as it could relay
the WAL stream and serve the base backup too... or have special WAL
repository software with the same interface as a normal master, but
with a choice of base backups and WAL streams. Plus, a slave-in-control
approach would also allow multiple slaves at the same time for a
given master...


I totally agree with that.


The only downside of this approach is that the slave machine needs a
full postgres superuser connection to the master. That could be a
security problem in certain scenarios. 


I think the master-slave protocol needs to be separate from the normal 
FE/BE protocol, with commands like "send a new base backup" or 
"subscribe to new WAL that's generated". A master-slave connection isn't 
associated with any individual database, for example. We can keep the 
permissions required for establishing a master-slave connection 
different from super-userness. In particular, while the slave will be 
able to read all data from the whole cluster, by receiving it in the WAL 
and base backups, it doesn't need to be able to modify anything in the 
master.



The master-centric scenario needs
a connection in the other direction, which might be seen as more secure,
I don't know for sure...


Which one initiates the connection, the master or slave, is a different 
question. I believe we've all assumed that it's the slave that connects 
to the master, and I think that makes the most sense.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Markus Wanner

Hi,

Andrew Dunstan wrote:
If this were a nice pluggable library I'd agree, but AFAIK it's not, and 
I don't see great value in reinventing the wheel.


I certainly agree.

However, I thought of it more like the archive_command, as proposed by 
Hannu. That way we don't need to reinvent any wheels, and the standby 
could still trigger the base data synchronization itself.


Regards

Markus Wanner


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better auth errors from libpq

2008-09-12 Thread Joshua D. Drake

Heikki Linnakangas wrote:

The only thing I would say here is that you point the URL to current 
which will be wrong in one release. Perhaps something that pulls the 
pgversion macro?


We don't put URLs in error messages. The hint needs to be a real sentence.



Which is exactly why it's the first thing we do when supporting people: 
point them to the URL in the docs.


Perhaps the middle man isn't needed.

Joshua D. Drake


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better auth errors from libpq

2008-09-12 Thread Alvaro Herrera
Joshua D. Drake wrote:
 Heikki Linnakangas wrote:

 The only thing I would say here is that you point the URL to current  
 which will be wrong in one release. Perhaps something that pulls the  
 pgversion macro?

 We don't put URLs in error messages. The hint needs to be a real sentence.

 Which is exactly why it's the first thing we do when supporting people.  
 Point them to the URL in the docs.

Let's add a new field in error reports: errurl()

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better auth errors from libpq

2008-09-12 Thread David Fetter
On Fri, Sep 12, 2008 at 12:30:12PM -0400, Alvaro Herrera wrote:
 Joshua D. Drake wrote:
  Heikki Linnakangas wrote:
 
  The only thing I would say here is that you point the URL to
  current  which will be wrong in one release. Perhaps something
  that pulls the  pgversion macro?
 
  We don't put URLs in error messages. The hint needs to be a real
  sentence.
 
  Which is exactly why it's the first thing we do when supporting
  people.  Point them to the URL in the docs.
 
 Let's add a new field in error reports: errurl()

Excellent idea :)

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Simon Riggs

On Fri, 2008-09-12 at 17:11 +0200, Csaba Nagy wrote:

 Why not have a design where the slave is in control of its own data?
 I mean the slave...

The slave only exists because it is a copy of the master. If you try to
start up a slave without first having taken a copy, how would you
bootstrap the slave? With what? To what? It sounds cool, but it's not
practical.

I posted a workable suggestion today on another subthread.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better auth errors from libpq

2008-09-12 Thread Andrew Chernow

David Fetter wrote:

On Fri, Sep 12, 2008 at 12:30:12PM -0400, Alvaro Herrera wrote:

Joshua D. Drake wrote:

Heikki Linnakangas wrote:


The only thing I would say here is that you point the URL to
current  which will be wrong in one release. Perhaps something
that pulls the  pgversion macro?

We don't put URLs in error messages. The hint needs to be a real
sentence.

Which is exactly why it's the first thing we do when supporting
people.  Point them to the URL in the docs.

Let's add a new field in error reports: errurl()


Excellent idea :)

Cheers,
David.


If you start using URLs in error messages, it might be helpful to point 
everything at a single URL and only change the query string, maybe


http://www.postgresql.org/docs/?err_id

Not sure if err_id is the correct value for the query string, but I'm 
sure something like the above would simplify/centralize management.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread David E. Wheeler

On Sep 5, 2008, at 11:33, David E. Wheeler wrote:


On Sep 5, 2008, at 11:30, Tom Lane wrote:


Thanks for reviewing.  I've committed this with your suggestions and
one additional non-cosmetic change: schema-qualify names in the
bodies of the SQL functions so that they are not search_path  
dependent.


Thanks, I'll check that out.


Finally got to this; sorry for the delay.

Two things I noticed:

1. Did I neglect to include the documentation patch? I've attached it  
here. It's necessary because of the addition of the new functions.


2. Many thanks for switching to using the network_show function  
instead of the SQL-based casting I had. Can you tell me how to go  
about finding such functions? Because for my 8.3 version of citext, I  
have a whole bunch of functions that do casting like this:


CREATE OR REPLACE FUNCTION int8(citext)
RETURNS int8
AS 'SELECT int8( $1::text )'
LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION citext(int8)
RETURNS citext
AS 'SELECT text( $1 )::citext'
LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION int4(citext)
RETURNS int4
AS 'SELECT int4( $1::text )'
LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION citext(int4)
RETURNS citext
AS 'SELECT text( $1 )::citext'
LANGUAGE SQL IMMUTABLE STRICT;

...and so on. I'd love to be able to replace these (and many others)  
with internal C functions, if  only I could figure out what those  
functions were. A pointer to making that determination (if they even  
exist in 8.3) would be greatly appreciated.


Thanks,

David



citext_doc.patch
Description: Binary data



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] fix dblink security hole

2008-09-12 Thread Alvaro Herrera
Marko Kreen escribió:
 Currently dblink allows regular users to initiate libpq connection
 to user-provided connection string.  This breaks the default
 policy that normal users should not be allowed to freely interact
 with outside environment.

Since people are now working on implementing the SQL/MED stuff to manage
connections, should we bounce this patch?  With luck, the CREATE
CONNECTION (?) stuff will be done for the next commitfest and we can
just switch dblink to use that instead.

http://archives.postgresql.org/message-id/[EMAIL PROTECTED]

Thoughts?  Can we really expect SQL/MED connection mgmt to be done for
the next fest?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] fix dblink security hole

2008-09-12 Thread David Fetter
On Fri, Sep 12, 2008 at 01:14:36PM -0400, Alvaro Herrera wrote:
 Marko Kreen escribió:
  Currently dblink allows regular users to initiate libpq connection
  to user-provided connection string.  This breaks the default
  policy that normal users should not be allowed to freely interact
  with outside environment.
 
 Since people are now working on implementing the SQL/MED stuff to
 manage connections,

I don't see any code for this.  Is there some?

 should we bounce this patch?  With luck, the CREATE CONNECTION (?)
 stuff will be done for the next commitfest and we can just switch
 dblink to use that instead.

That would be great :)

 http://archives.postgresql.org/message-id/[EMAIL PROTECTED]
 
 Thoughts?  Can we really expect SQL/MED connection mgmt to be done
 for the next fest?

Connection management would be awesome.  The whole SQL/MED spec is
gigantic, tho.  Should we see about an implementation roadmap for the
parts we care about?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread Tom Lane
David E. Wheeler [EMAIL PROTECTED] writes:
 1. Did I neglect to include the documentation patch? I've attached it  
 here. It's necessary because of the addition of the new functions.

Maybe it got left out of the later patch iterations?  Anyway,
will take care of it.

 2. Many thanks for switching to using the network_show function  
 instead of the SQL-based casting I had. Can you tell me how to go  
 about finding such functions?

Er, look into pg_cast and then pg_proc?  For instance

select oid::regprocedure, prosrc from pg_proc
where oid in (select castfunc from pg_cast);
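
For example, to see which function implements one particular cast (here
inet to text, which is where network_show turns up; this assumes the 8.3
catalogs):

select c.castfunc::regprocedure, p.prosrc
from pg_cast c
join pg_proc p on p.oid = c.castfunc
where c.castsource = 'inet'::regtype
  and c.casttarget = 'text'::regtype;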

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread David E. Wheeler

On Sep 12, 2008, at 10:58, Tom Lane wrote:


1. Did I neglect to include the documentation patch? I've attached it
here. It's necessary because of the addition of the new functions.


Maybe it got left out of the later patch iterations?  Anyway,
will take care of it.


Great, thank you.


2. Many thanks for switching to using the network_show function
instead of the SQL-based casting I had. Can you tell me how to go
about finding such functions?


Er, look into pg_cast and then pg_proc?  For instance

select oid::regprocedure, prosrc from pg_proc
where oid in (select castfunc from pg_cast);


That looks like *exactly* what I need. Thanks!

Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread David E. Wheeler


On Sep 12, 2008, at 11:06, David E. Wheeler wrote:


Er, look into pg_cast and then pg_proc?  For instance

select oid::regprocedure, prosrc from pg_proc
where oid in (select castfunc from pg_cast);


That looks like *exactly* what I need. Thanks!


Pity. Looks like there were only a few I wasn't using, text_char,  
char_text, text_name, and texttoxml. Do I really need to keep all my  
other casts like these in 8.3?


CREATE OR REPLACE FUNCTION int8(citext)
RETURNS int8
AS 'SELECT int8( $1::text )'
LANGUAGE SQL IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION citext(int8)
RETURNS citext
AS 'SELECT text( $1 )::citext'
LANGUAGE SQL IMMUTABLE STRICT;

Thanks,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread David E. Wheeler

On Sep 12, 2008, at 11:14, David E. Wheeler wrote:

Pity. Looks like there were only a few I wasn't using, text_char,  
char_text, text_name, and texttoxml.


Oh, and text_name seems to give me this error:

 ERROR:  compressed data is corrupt

That's when I have this cast:

 CREATE OR REPLACE FUNCTION citext(name)
 RETURNS citext
 AS 'text_name'
 LANGUAGE internal IMMUTABLE STRICT;

This version does not give me an error:

 CREATE OR REPLACE FUNCTION citext(name)
 RETURNS citext
 AS 'SELECT text( $1 )::citext'
 LANGUAGE SQL IMMUTABLE STRICT;

Maybe I did something wrong?

Thanks,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread Tom Lane
David E. Wheeler [EMAIL PROTECTED] writes:
 Oh, and text_name seems to give me this error:

   ERROR:  compressed data is corrupt

 That's when I have this cast:

   CREATE OR REPLACE FUNCTION citext(name)
   RETURNS citext
   AS 'text_name'
   LANGUAGE internal IMMUTABLE STRICT;

I think you've got the direction backwards.
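
That is, for name-to-citext you want the name-to-text conversion
function, which in pg_proc is name_text (if I'm reading it right), along
the lines of:

CREATE OR REPLACE FUNCTION citext(name)
RETURNS citext
AS 'name_text'
LANGUAGE internal IMMUTABLE STRICT;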


BTW, I removed the Limitations entry about I/O casting not working
with citext; we fixed that, no?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread Tom Lane
David E. Wheeler [EMAIL PROTECTED] writes:
 Pity. Looks like there were only a few I wasn't using, text_char,  
 char_text, text_name, and texttoxml. Do I really need to keep all my  
 other casts like these in 8.3?

 CREATE OR REPLACE FUNCTION int8(citext)
 RETURNS int8
 AS 'SELECT int8( $1::text )'
 LANGUAGE SQL IMMUTABLE STRICT;

Yeah, those are all replaced by the CoerceViaIO mechanism.
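
So with the SQL wrappers dropped, an explicit cast still goes through
the types' I/O functions; a quick sanity check along these lines should
behave the same as before:

SELECT '42'::citext::int8;   -- parsed by int8's input function
SELECT 42::int8::citext;     -- rendered via int8's output function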

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread David E. Wheeler

On Sep 12, 2008, at 11:31, Tom Lane wrote:


David E. Wheeler [EMAIL PROTECTED] writes:

Oh, and text_name seems to give me this error:



 ERROR:  compressed data is corrupt



That's when I have this cast:



 CREATE OR REPLACE FUNCTION citext(name)
 RETURNS citext
 AS 'text_name'
 LANGUAGE internal IMMUTABLE STRICT;


I think you've got the direction backwards.


Oh. Duh.


BTW, I removed the Limitations entry about I/O casting not working
with citext; we fixed that, no?


Yes, we did. Thanks for the catch.

I've got another patch I'm working on adding support for char (and  
tests for "char"). Just to fill out a gap I saw in the casting coverage.  
I'm trying to get it done now. With that, AFAIK, citext will work just  
like text.


Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread David E. Wheeler

On Sep 12, 2008, at 11:34, Tom Lane wrote:


CREATE OR REPLACE FUNCTION int8(citext)
RETURNS int8
AS 'SELECT int8( $1::text )'
LANGUAGE SQL IMMUTABLE STRICT;


Yeah, those are all replaced by the CoerceViaIO mechanism


Okay, thanks for the sanity check. The SQL versions are fine for me in  
8.3.


Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread David E. Wheeler

On Sep 12, 2008, at 11:35, David E. Wheeler wrote:

I've got another patch I'm working on adding support for char (and  
tests for "char"). Just to fill out a gap I saw in the casting  
coverage. I'm trying to get it done now. With that, AFAIK, citext  
will work just like text.


Looks like the IO conversions handle char and "char", so the attached  
patch just updates the regression test.


Best,

David


char_casts.patch
Description: Binary data



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-12 Thread Ron Mayer

Tom Lane wrote:

Ron Mayer [EMAIL PROTECTED] writes:

... ISO 8601 intervals ...


On the output side, seems like a GUC variable
is the standard precedent here.  I'd still vote against overloading
DateStyle --- it does too much already --- but a separate variable for
interval style wouldn't bother me.  In fact, given that we are now
somewhat SQL-compliant on interval input, a GUC that selected
PG traditional, SQL-standard, or ISO 8601 interval output format seems
like it could be a good idea.


Is it OK that this wouldn't be backward compatible
with the current interval_out, which looks to me to be using
the DateStyle GUC?

I suppose it could be made backward compatible if the new
IntervalStyle GUC defaulted to a value of guess_from_datestyle,
but I fear an option like that might add rather than remove
confusion.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Review] Tests citext casts by David Wheeler.

2008-09-12 Thread Alvaro Herrera
David E. Wheeler escribió:
 On Sep 12, 2008, at 11:35, David E. Wheeler wrote:

 I've got another patch I'm working on adding support for char (and  
 tests for char). Just to fill out a gap I saw in the casting coverage. 
 I'm trying to get it done now. With that, AFAIK, citext will work just 
 like text.

 Looks like the IO conversions handle char and char, so the attached  
 patch just updates the regression test.

There are unresolved conflicts in the patch ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Hannu Krosing
On Fri, 2008-09-12 at 17:45 +0100, Simon Riggs wrote:
 On Fri, 2008-09-12 at 17:11 +0200, Csaba Nagy wrote:
 
  Why not have a design where the slave is in control of its own data?
  I mean the slave...
 
 The slave only exists because it is a copy of the master. If you try to
  start up a slave without first having taken a copy, how would you
 bootstrap the slave? With what? To what? 

As I understand it, Csaba meant that the slave would bootstrap itself by
connecting to the master in some early phase of startup, requesting a
physical filesystem-level copy of the data, then commencing the startup in
Hot Standby mode.

If done that way, all the slave needs is a superuser-level connection to
the master database.

Of course this can also be done using a little hot-standby startup script
on the slave, if shell access to the master is provided.
 
--
Hannu



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-12 Thread Alvaro Herrera
Hannu Krosing escribió:
 On Fri, 2008-09-12 at 17:45 +0100, Simon Riggs wrote:
  On Fri, 2008-09-12 at 17:11 +0200, Csaba Nagy wrote:
  
   Why not have a design where the slave is in control of its own data?
   I mean the slave...
  
  The slave only exists because it is a copy of the master. If you try to
  start up a slave without first having taken a copy, how would you
  bootstrap the slave? With what? To what? 
 
 As I understand it, Csaba meant that the slave would bootstrap itself by
 connecting to the master in some early phase of startup, requesting a
 physical filesystem-level copy of the data, then commencing the startup in
 Hot Standby mode.

Interesting ...

This doesn't seem all that difficult -- all you need is to start one
connection to get the WAL stream and save it somewhere; meanwhile a
second connection uses a combination of pg_file_read on the master +
pg_file_write on the slave to copy the data files over.  When this step is
complete, recovery of the stored WAL commences.
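
Hand-waving over binary-safety for the moment, the copy step could look
roughly like this (pg_file_read/pg_file_write as in contrib/adminpack;
the relation path is invented for the example):

-- on a connection to the master: read one 8 kB chunk of a data file
SELECT pg_file_read('base/16384/16385', 0, 8192);

-- on the slave: append that chunk to the local copy of the file
-- (the third argument, true, means append)
SELECT pg_file_write('base/16384/16385', '...chunk from master...', true);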

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-12 Thread Ron Mayer

Tom Lane wrote:

somewhat SQL-compliant on interval input, a GUC that selected
PG traditional, SQL-standard, or ISO 8601 interval output format seems
like it could be a good idea.


Trying to do the SQL-standard output now, and I have a question
about what to do in SQL-standard mode when trying to output
an interval that has both a YEAR and a DAY component.

AFAICT the SQL standard doesn't let you have both, so the
SQL-standard output actually won't be standard in that case.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-12 Thread Tom Lane
Ron Mayer [EMAIL PROTECTED] writes:
 Trying to do the SQL-standard output now, and have a question
 of what to do in the SQL-standard mode when trying to output
 an interval that as both a YEAR and a DAY component.

 AFAICT the SQL standard doesn't let you have both, so the
 SQL-standard output actually won't be.

The reason it's not SQL-standard is the data value isn't.
So not a problem.  Someone conforming to the spec limits on
what he puts in will see spec-compliant output.  I think all
you need is 'yyy-mm dd hh:mm:ss' where you omit yyy-mm if
zeroes, omit dd if zero, omit hh:mm:ss if zeroes (but maybe
only if dd is also 0?  otherwise your output is just dd which
is uncomfortably ambiguous).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-12 Thread Ron Mayer

Tom Lane wrote:

The reason it's not SQL-standard is the data value isn't.
So not a problem.  Someone conforming to the spec limits on
what he puts in will see spec-compliant output.  I think all
you need is 'yyy-mm dd hh:mm:ss' where you omit yyy-mm if
zeroes, omit dd if zero, omit hh:mm:ss if zeroes (but maybe
only if dd is also 0?  otherwise your output is just dd which
is uncomfortably ambiguous).


Great.  That's what I'll do.

Any convention or preference on the naming of the GUC?
I assume intervalstyle is reasonable?

Or thoughts regarding the current EncodeInterval() that's
already using the datestyle GUC?


pg82=# select interval '1';
 interval
----------
 00:00:01
(1 row)

pg82=# set datestyle='sql';
SET

pg82=# select interval '1';
 interval
----------
 @ 1 sec
(1 row)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-12 Thread Ron Mayer

Ron Mayer wrote:

Tom Lane wrote:

you need is 'yyy-mm dd hh:mm:ss' where you omit yyy-mm if
zeroes, omit dd if zero, omit hh:mm:ss if zeroes (but maybe
only if dd is also 0?  otherwise your output is just dd which
is uncomfortably ambiguous).


Oh, and if both parts are 0, I guess we desire
the (more comfortable than the alternatives) '0'?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Potential Join Performance Issue

2008-09-12 Thread Lawrence, Ramon
 From: Tom Lane [mailto:[EMAIL PROTECTED]
 I was intending to do it the other way, actually.  An extra field in
 HashPath hardly costs anything.  The other reason for it is that there
 are other possible uses for knowing whether a hash will be multi-batch.
 (For example, if we were prepared to tell the executor that it *must*
 keep the hash to one batch, we could assume that the sort order of the
 left input is preserved.  I haven't looked into the risks/benefits of
 that too much, but it's been in the back of the mind for a long time.)

Having the number of batches in HashPath could be potentially useful for
a variety of reasons.  For our research, we have added an nbatch
variable in both HashPath and HashJoin.  Having it in HashJoin is useful
as we modified EXPLAIN to output the number of batches.  There are costs
in putting an nbatch variable in HashPath as the system may set this
variable potentially hundreds/thousands of times during costing and does
not (currently) use it until you convert the chosen HashPath to a plan.

 I'd be more inclined to deal with the issue by trying to establish a
 safety margin in the estimate of whether the hash will go multi-batch.
 IOW we should disuse_physical_tlist if the hash is estimated to be close
 to but still within one batch.

Our experiments with large TPC-H 1GB joins show that it is almost always
better to not use physical_tlists if the number of batches is > 1.
There is a noticeable (approximately 5-15%) improvement when using
physical_tlists for in-memory joins.  For batches of size 2, it
sometimes can go either way depending on how many attributes are projected
out of the outer relation.  Using physical_tlists may be better even for
batches of size 2 if most of the attributes of the outer relation are
kept.  For a larger number of batches, the extra I/O cost significantly
dominates over the physical_tlist optimization.  Performance of
multi-batch joins may improve 50% or more by disabling the optimization.

It is possible to create a safety margin by having
ExecChooseHashTableSize() return the value
inner_rel_bytes/hash_table_bytes, which represents the fraction of the
memory available that the inner relation is expected to consume.  You
can then make decisions based on that.  However, this is only as good
as the inner relation size estimate, and especially for large queries
the estimate may be quite inaccurate.  A more robust solution could
examine the width of the path and the width of the relation combined
with the number of batches to see if projecting early would be worth it.
It may be best to keep it simple and just use number of batches > 1 as a
criterion and instead focus on examining issues with inaccurate join
size estimates.

--
Dr. Ramon Lawrence
Assistant Professor, Department of Computer Science, University of
British Columbia Okanagan
E-mail: [EMAIL PROTECTED]

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-12 Thread Ron Mayer

Tom Lane wrote:

I think all
you need is 'yyy-mm dd hh:mm:ss' where you omit yyy-mm if
zeroes, omit dd if zero, omit hh:mm:ss if zeroes (but maybe
only if dd is also 0?  otherwise your output is just dd which
is uncomfortably ambiguous).


Cool.  I think I have it pretty much working with a new
GUC intervalstyle that can take values of

sql_standard that I think will output SQL standard
   interval literals when given a SQL
   standard interval,

iso_8601 that will output ISO 8601 Time Intervals in
   the format with time-unit designators, and

backward_compatible that will output the same thing
   that postgres currently does, which depends
   on the value of the DateStyle GUC.

I'll add the documentation and regression tests and
can submit a patch early next week.  Oh, one more
question: under ecpg there seems to be a fair
amount of near-duplicated code (EncodeDateTime,
EncodeInterval) for turning dates, times, and
intervals into strings.

Should those ECPG functions be made identical to
the ones in the backend?
Could those somehow share code with the backend for
some of their work?


Anyway - here's a quick test of the
SQL Standard and ISO interval output as it stands
right now...



regression=# drop table test_intervals;
DROP TABLE
regression=# create temporary table test_intervals (i interval);
CREATE TABLE
regression=# insert into test_intervals values
regression-#   ('0 years'),
regression-#   ('1 year 1 month'),
regression-#   ('1 day 2 hours 3 minutes 4 seconds'),
regression-#   ('1 year 1 minute');
INSERT 0 4
regression=#
regression=# insert into test_intervals values
regression-#   ('1-1'),
regression-#   ('1'),
regression-#   (interval '1' year),
regression-#   ('1:00:00'),
regression-#   ('1 1:02:03');
INSERT 0 5
regression=#
regression=# insert into test_intervals values
regression-#   ('P1Y1M'),
regression-#   ('P1DT1H1M1S'),
regression-#   ('PT1S');
INSERT 0 3
regression=#
regression=# set intervalstyle to sql_standard;
SET
regression=# select * from test_intervals;
      i
-------------
 0
 1-1
 1 2:3:4
 1-0 0 0:1:0
 1-1
 0:0:1
 1-0
 1:0:0
 1 1:2:3
 1-1
 1 1:1:1
 0:0:1
(12 rows)


regression=#
regression=# set intervalstyle to iso_8601;
SET
regression=# select * from test_intervals;
     i
------------
 PT0S
 P1Y1M
 P1DT2H3M4S
 P1YT1M
 P1Y1M
 PT1S
 P1Y
 PT1H
 P1DT1H2M3S
 P1Y1M
 P1DT1H1M1S
 PT1S
(12 rows)

regression=#
regression=# set intervalstyle to backward_compatible;
SET
regression=# set datestyle to sql;
SET
regression=# select * from test_intervals;
               i
-------------------------------
 @ 0
 @ 1 year 1 mon
 @ 1 day 2 hours 3 mins 4 secs
 @ 1 year 1 min
 @ 1 year 1 mon
 @ 1 sec
 @ 1 year
 @ 1 hour
 @ 1 day 1 hour 2 mins 3 secs
 @ 1 year 1 mon
 @ 1 day 1 hour 1 min 1 sec
 @ 1 sec
(12 rows)

regression=# set datestyle to iso;
SET
regression=# select * from test_intervals;
        i
-----------------
 00:00:00
 1 year 1 mon
 1 day 02:03:04
 1 year 00:01:00
 1 year 1 mon
 00:00:01
 1 year
 01:00:00
 1 day 01:02:03
 1 year 1 mon
 1 day 01:01:01
 00:00:01
(12 rows)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Upcoming back-branch update releases

2008-09-12 Thread Tom Lane
It's been three months since our last release set, and in view of the
recent discovery of a potential-data-loss issue in 8.3, it seems that
we should get moving on another one.  Accordingly, core has decided that
we will freeze a set of update releases next Thursday (9/18) for public
announcement Monday 9/22.  Get those last minute bug fixes in ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-12 Thread Tom Lane
Ron Mayer [EMAIL PROTECTED] writes:
 Cool.  I think I have it pretty much working with a new
 GUC intervalstyle that can take values of

 sql_standard, which I think will output SQL-standard
 interval literals when given a SQL-standard
 interval;

 iso_8601, which will output ISO 8601 time intervals
 in the format with time-unit designators; and

 backward_compatible, which will output the same thing
 that postgres currently does, depending on the
 value of the DateStyle GUC.

Actually, we have never considered that new releases need to preserve
the behavior of postgresql.conf settings.  So the above seems
unnecessarily baroque.  How about decoupling interval_out's behavior
from DateStyle altogether, and instead providing values of IntervalStyle
that match all the previous behaviors?

 Should those ECPG functions be made identical to
 the ones in the backend?

The ECPG situation is a mess :-(.  That code was forked off from the
backend some time ago, and has not been well maintained at all.  If you
are brave enough to tackle that mess, more power to you; but I strongly
suggest doing it as an independent patch.

 Could those somehow share code with the backend for
 some of their work?

The palloc and elog dependencies seem to be the hard part.
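One common way to deal with that -- offered purely as a sketch here,
not as how ECPG actually does it -- is to hide the backend-only
services behind compile-time macros, so the same file can build both
in the backend and in a frontend library:

#ifdef FRONTEND
#include <stdio.h>
#include <stdlib.h>
/* Frontend build: plain libc replacements (illustrative). */
#define SHARED_ALLOC(sz)   malloc(sz)
#define SHARED_ERROR(msg) \
    do { fprintf(stderr, "error: %s\n", (msg)); exit(1); } while (0)
#else
#include "postgres.h"
/* Backend build: the real memory-context and error machinery. */
#define SHARED_ALLOC(sz)   palloc(sz)
#define SHARED_ERROR(msg)  elog(ERROR, "%s", (msg))
#endif

/* Shared code then uses only the macros: */
static char *
shared_format_buffer(size_t len)
{
    if (len == 0)
        SHARED_ERROR("zero-length buffer requested");
    return SHARED_ALLOC(len);
}

Code written against SHARED_ALLOC/SHARED_ERROR would compile unchanged
in both environments, which sidesteps the palloc and elog problem at
the cost of a thin abstraction layer.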

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers