Re: [HACKERS] Hot standby, slot ids and stuff

2009-01-09 Thread Simon Riggs

On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
  When a backend dies with FATAL, it writes an abort record before exiting.
  
  (I was under the impression it doesn't until few minutes ago myself, 
  when I actually read the shutdown code :-))
 
  Not in all cases; keep reading :-)
 
 If it doesn't, that's a bug.  A FATAL exit is not supposed to leave the
 shared state corrupted, it's only supposed to be a forcible session
 termination.  Any open transaction should be rolled back.

Please look back at the earlier discussion we had on this exact point:
http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php

Heikki, perhaps now you understand my continued opposition to your ideas
for simplification: I had already thought of them and was forced to rule
them out, not through my own choice. Tom, note that I listen to what you
say and try to write code that meets those requirements.

From here, I don't mind which way we go. I want code that is as simple
as possible to do the job, whatever we agree that to be.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Including kerberos realm

2009-01-09 Thread Magnus Hagander
Magnus Hagander wrote:
 Tom Lane wrote:
 Magnus Hagander mag...@hagander.net writes:
 Alvaro Herrera wrote:
 Not that this affects me in any way, but should there be a GUC variable
 to set the default behavior system-wide?
 I thought about that, but I don't want to add extra gucs without a good
 reason. You'd typically not have very many different lines in pg_hba for
 this, and just duplicating the parameter there would be ok I think.
 I'd rather move more of the krb parameters to be *just* in pg_hba.conf,
 but for now I left those in postgresql.conf as fallbacks..
 If you think those parameters would make more sense in pg_hba.conf,
 let's just move them and be done with it.  There has never been any
 intention that administrator-only GUCs would be promised compatible
 across versions.  And the GUC mechanism is really rather a lot of
 overhead compared to options on a pg_hba line ...
 
 Well, it does make sense to have defaults in postgresql.conf - but I
 don't think it's worth the overhead.
 
 I'll commit the stuff I have for now and put it on my TODO to remove
 them completely from postgresql.conf later. I'll see if I have time to
 get it done for 8.4.

Ok, I've applied a patch for this for the parameter krb_realm and
krb_server_hostname, which are the ones that currently supported both.

Should we also consider moving the remaining ones there?
(krb_server_keyfile, krb_srvname, krb_caseinsens_users)

They do make sense to set on a per-server basis, on the other hand they
are the only remaining authentication-method-specific parameters left...

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, slot ids and stuff

2009-01-09 Thread Heikki Linnakangas

Simon Riggs wrote:

On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:

Simon Riggs si...@2ndquadrant.com writes:

On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:

When a backend dies with FATAL, it writes an abort record before exiting.

(I was under the impression it doesn't until few minutes ago myself, 
when I actually read the shutdown code :-))

Not in all cases; keep reading :-)

If it doesn't, that's a bug.  A FATAL exit is not supposed to leave the
shared state corrupted, it's only supposed to be a forcible session
termination.  Any open transaction should be rolled back.


Please look back at the earlier discussion we had on this exact point:
http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php


I think the running-xacts list we dump to WAL at every checkpoint is 
enough to handle that. Just treat the dead transaction as in-progress 
until the next running-xacts record. It's presumably extremely rare to 
have a process die with FATAL, and not write an abort record.


A related issue is that currently the recovery PANICs if it runs out of 
recovery procs. I think that's not acceptable, regardless of whether we 
use slotids or some other method to avoid it in normal operation, 
because it means you can't recover at all if you set max_connections too 
low in the standby (or in the primary, and you have to recover from 
crash), or you run out of recovery procs because of an abort failed in 
the primary like discussed on that thread. The standby should just 
fast-forward to the next running-xacts record in that case.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Open item: kerberos warning message

2009-01-09 Thread Greg Stark
For what it's worth this always bothered me. I often - but nit always  
- - have kerberos tickets gsst...@... lying around but my unix id is  
stark.


I never set up kerberos authentication for postgres but whrn the  
tickets happen to be there it fails to authenticate. I think I  
complained about this in the past but I don't recall - it would have  
been a long time ago.


--
Greg


On 8 Jan 2009, at 11:22, Stephen Frost sfr...@snowman.net wrote:


Magnus, et al,

* Magnus Hagander (mag...@hagander.net) wrote:
Looking at the open item about the new error message shown when  
Kerberos

is compiled in, and not used:
assword:
FATAL:  password authentication failed for user mha
psql: pg_krb5_init: krb5_cc_get_principal: No credentials cache found
FATAL:  password authentication failed for user mha


That is annoying, I can understand that.

The reason this is happening is that we are initializing Kerberos  
even

if we're not going to use it. The reason for doing *this*, is that if
kerberos is compiled in, we use it to find out if we should try a
different username than the one logged in to the local system - we  
look

at the kerberos login.


This made sense before we had mappings support because the only user  
you

could possibly be in PG is the one you authenticated as.


We don't do this for any other login, including kerberos over GSSAPI.
AFAIK, we've heard no complaints.


Well, I havn't moved all my systems to GSSAPI yet.. :)


Thoughts?


Now that we have support for mappings, I expect it will be more common
for a user to authenticate with princ 'A' and then connect using their
Unix id 'B' to a PG user 'B'.  As such, I'm alright with dropping
support for this.  Users can always use -U (or equiv) if necessary.

   Thanks,

   Stephen


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, slot ids and stuff

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
  Simon Riggs si...@2ndquadrant.com writes:
  On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
  When a backend dies with FATAL, it writes an abort record before exiting.
 
  (I was under the impression it doesn't until few minutes ago myself, 
  when I actually read the shutdown code :-))
  Not in all cases; keep reading :-)
  If it doesn't, that's a bug.  A FATAL exit is not supposed to leave the
  shared state corrupted, it's only supposed to be a forcible session
  termination.  Any open transaction should be rolled back.
  
  Please look back at the earlier discussion we had on this exact point:
  http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php
 
 I think the running-xacts list we dump to WAL at every checkpoint is 
 enough to handle that. Just treat the dead transaction as in-progress 
 until the next running-xacts record. It's presumably extremely rare to 
 have a process die with FATAL, and not write an abort record.

I agree, but I'll wait for Tom to speak further.

 A related issue is that currently the recovery PANICs if it runs out of 
 recovery procs. I think that's not acceptable, regardless of whether we 
 use slotids or some other method to avoid it in normal operation, 
 because it means you can't recover at all if you set max_connections too 
 low in the standby (or in the primary, and you have to recover from 
 crash), or you run out of recovery procs because of an abort failed in 
 the primary like discussed on that thread. 

 The standby should just 
 fast-forward to the next running-xacts record in that case.

What do you mean by fast forward?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, slot ids and stuff

2009-01-09 Thread Heikki Linnakangas

Simon Riggs wrote:

On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
A related issue is that currently the recovery PANICs if it runs out of 
recovery procs. I think that's not acceptable, regardless of whether we 
use slotids or some other method to avoid it in normal operation, 
because it means you can't recover at all if you set max_connections too 
low in the standby (or in the primary, and you have to recover from 
crash), or you run out of recovery procs because of an abort failed in 
the primary like discussed on that thread. 


The standby should just 
fast-forward to the next running-xacts record in that case.


What do you mean by fast forward?


I mean the standby should stop trying to track the in progress 
transactions in recovery procs, and apply the WAL records like it does 
before the consistent state is reached.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Zeugswetter Andreas OSB sIT
 
 You don't want to just 
 modify pg_standby to accept small files, because then you've made it 
 harder to make absolutely sure when the file is ready to be 
 processed if a non-atomic copy is being done.

It is hard, but I think it is the right way forward.
Anyway I think the size is not robust at all because some (most ? e.g. win32) 
non-atomic copy
implementations will also show the final size right from the beginning.

Could we use stricter file locking when opening WAL for recovery ?

Or implement a wait loop when the crc check (+ a basic validity check) for the 
next
record fails (and the next record is on a 512 byte boundary ?).
I think standby and restore recovery can be treated differently to startup 
recovery because
a copied wal file, even if the copy is not atomic, will not have trailing valid 
WAL records
from a recycled WAL. (A solution that recycles WAL files for restore/standby 
would need to make
sure it renames the files *after* restoring the content.)

Btw how do we detect end of WAL when restoring a backup and WAL after PANIC ?

 1) Provide the length as part of the archive command

+1

Andreas
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Visibility map and freezing

2009-01-09 Thread Heikki Linnakangas

Jeff Davis wrote:

On Wed, 2009-01-07 at 09:34 +0200, Heikki Linnakangas wrote:

autovacuum_freeze_max_age - autovacuum_freeze_scan_age
vacuum_freeze_max_age   - vacuum_freeze_scan_age
vacuum_freeze_min_age   - vacuum_freeze_tuple_age

*_scan_age settings control when the table is fully scanned to freeze 
tuples and advance relfrozenxid, and vacuum_freeze_tuple_age controls 
how old a tuple needs to be to be frozen. One objection is that you can 
read freeze_scan to mean that a scan is frozen, like a tuple is 
frozen. Any better ideas?


I see what you mean about the possible misinterpretation, but I think
it's a big improvement, and I don't have a better suggestion.


Thinking about this some more, I'm not too happy with those names 
either. vacuum_freeze_scan_age and autovacuum_freeze_scan_age don't mean 
quite the same thing, like vacuum_cost_delay and 
autovacuum_vacuum_cost_delay do, for example.


I'm now leaning towards:

autovacuum_freeze_max_age
vacuum_freeze_table_age
vacuum_freeze_min_age

where autovacuum_freeze_max_age and vacuum_freeze_min_age are unchanged, 
and vacuum_freeze_table_age is the new setting that controls when VACUUM 
or autovacuum should perform a full scan of the table to advance 
relfrozenxid.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Automatic view update rules

2009-01-09 Thread Bernd Helmle
--On Sonntag, Dezember 28, 2008 15:29:58 +0100 Bernd Helmle 
be...@oopsware.de wrote:



maybe the better solution is to not allow such a view to be updatable



Yes, it seems we have to check for target lists having negative attnums in
checkTree(). Another solution would be to simply ignore those columns
(extract them from the target list and include all updatable columns
only).


While looking at this it turned out that the problem of updatability is 
more complex than
i originally thought: Consider a relation tree of the following 
views/relations:


View1 - View2 - View3 - Table1

That means, View1 consists of View2 and so on. What happens now if someone 
is going to change View3, so that it's not updatable anymore? What the 
patch actually does is, scanning all relations/views involved in a current 
view (and cascading updates) und reject update rules as soon as it finds 
more than one relation within a view definition. Unfortunately this seems 
not to be enough, we had really check all involved views for updatability 
recursively. The infrastructure for this is already there, but i wonder if 
it could be made easier when we are going to maintain a separate 
is_updatable flag somewhere in the catalog, which would make checking the 
relation tree for updatability more easier.


Comments?

--
 Thanks

   Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, slot ids and stuff

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 13:23 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
  A related issue is that currently the recovery PANICs if it runs out of 
  recovery procs. I think that's not acceptable, regardless of whether we 
  use slotids or some other method to avoid it in normal operation, 
  because it means you can't recover at all if you set max_connections too 
  low in the standby (or in the primary, and you have to recover from 
  crash), or you run out of recovery procs because of an abort failed in 
  the primary like discussed on that thread. 
  
  The standby should just 
  fast-forward to the next running-xacts record in that case.
  
  What do you mean by fast forward?
 
 I mean the standby should stop trying to track the in progress 
 transactions in recovery procs, and apply the WAL records like it does 
 before the consistent state is reached.

If you say something is not acceptable you need to say what behaviour
you do find acceptable in its place. It's good to come up with new
ideas, but it's not good to ignore the problems the new ideas have. This
is a general point that applies both here and to your proposals with
synch rep. It's much harder to say how it should work in a way that
covers all the requirements and points others have made, otherwise its
just handwaving.

So, if we don't PANIC, how should we behave?

Without full information on running-xacts we would be unable to take a
snapshot, so should:
* backends be forcibly disconnected?
* backends hang waiting for snapshot info to be re-available again in X
minutes worth of WAL time?
* backends throw an ERROR:  unable to provide snapshot at this time,
DETAIL: retry your statement later. 
...other alternatives

and possibly prevent new connections.

If max_connections is higher on primary then the standby will *never* be
available for querying. Should we have multiple ERRORs depending upon
whether the situation is hopefully-temporary or looks-permanent?

Don't assume I want the PANIC. That clearly needs to be revisited if we
change slotids. I just want to make a balanced judgement based upon a
full consideration of the options.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, slot ids and stuff

2009-01-09 Thread Heikki Linnakangas

Simon Riggs wrote:

On Fri, 2009-01-09 at 13:23 +0200, Heikki Linnakangas wrote:
I mean the standby should stop trying to track the in progress 
transactions in recovery procs, and apply the WAL records like it does 
before the consistent state is reached.


...

So, if we don't PANIC, how should we behave?

Without full information on running-xacts we would be unable to take a
snapshot, so should:
* backends be forcibly disconnected?
* backends hang waiting for snapshot info to be re-available again in X
minutes worth of WAL time?
* backends throw an ERROR:  unable to provide snapshot at this time,
DETAIL: retry your statement later. 
...other alternatives


and possibly prevent new connections.


All of those seem reasonable to me. The 2nd option seems nicest, X 
minutes should probably be controlled by max_standby_delay, after which 
you can throw an error.


If we care enough, we could also keep tracking the transactions in 
backend-private memory of the startup process, until there's enough room 
in proc array. That would make the outage shorter, because you wouldn't 
have to wait until the next running-xacts record, but only until enough 
transactions have finished that they all fit in proc array again.


But whatever is the simplest, really.


If max_connections is higher on primary then the standby will *never* be
available for querying. Should we have multiple ERRORs depending upon
whether the situation is hopefully-temporary or looks-permanent?

Don't assume I want the PANIC. That clearly needs to be revisited if we
change slotids. 


It needs to be revisited whether we change slotids or not, IMHO.

Note that with slotids, you have a problem as soon as any of the slots 
that don't exist on standby are used, regardless of how many concurrent 
transactions there actually is. Without slots you only have a problem if 
 you really have more than standby's max_connections concurrent 
transactions. That makes a big difference in practice.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solve a problem of LC_TIME of windows.

2009-01-09 Thread Magnus Hagander
ITAGAKI Takahiro wrote:
 Hiroshi Inoue in...@tpf.co.jp wrote:
 
 Seems LC_CTYPE and LC_TIME should be convertible even though we use
 wcsftime (which internally calls strftime?).
 
 Ok, wcsftime() requries both LC_TIME and LC_CTYPE are the same setting
 (at least encoding) on Windows.
 
 The attached patch is an updated version to fix cache_locale_time().
 Now it sets LC_TIME and LC_CTYPE to the specified locale and restore
 them at end of the function. I tested the patch on Windows XP Japanese
 Edition (SJIS) with UTF-8 and EUCJP databases, and worked expectedly.
 
 #ifdef WIN32 codes seems to be ugly in the patch,
 but I have no other idea...

I have applied this version of the patch (with only a minor further
addition to the comment).

Thank you all for your work and patience in getting this fixed! Let's
hope it stays fixed :-)

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, slot ids and stuff

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 14:38 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  On Fri, 2009-01-09 at 13:23 +0200, Heikki Linnakangas wrote:
  I mean the standby should stop trying to track the in progress 
  transactions in recovery procs, and apply the WAL records like it does 
  before the consistent state is reached.
  
  ...
  
  So, if we don't PANIC, how should we behave?
  
  Without full information on running-xacts we would be unable to take a
  snapshot, so should:
  * backends be forcibly disconnected?
  * backends hang waiting for snapshot info to be re-available again in X
  minutes worth of WAL time?
  * backends throw an ERROR:  unable to provide snapshot at this time,
  DETAIL: retry your statement later. 
  ...other alternatives
  
  and possibly prevent new connections.
 
 All of those seem reasonable to me. The 2nd option seems nicest, X 
 minutes should probably be controlled by max_standby_delay, after which 
 you can throw an error.

Hmm, we use the recovery procs to track transactions that have
TransactionIds assigned. That means we will overflow only if we have
approach 100% write transactions at any time, or if we have more write
transactions in progress than we have max_connections on standby.

So it sounds like the overflow situation would probably be both rare
and, if it did occur, may not occur for long periods.

 If we care enough, we could also keep tracking the transactions in 
 backend-private memory of the startup process, until there's enough room 
 in proc array. That would make the outage shorter, because you wouldn't 
 have to wait until the next running-xacts record, but only until enough 
 transactions have finished that they all fit in proc array again.
 
 But whatever is the simplest, really.

The above does sound best since it would allow us to have the snapshot
hang for a short period. But at this stage of the game, more complex.

For now though, since it looks like it would happen fairly rarely, I'd
opt for the simplest: throw an ERROR.

  If max_connections is higher on primary then the standby will *never* be
  available for querying. Should we have multiple ERRORs depending upon
  whether the situation is hopefully-temporary or looks-permanent?
  
  Don't assume I want the PANIC. That clearly needs to be revisited if we
  change slotids. 
 
 It needs to be revisited whether we change slotids or not, IMHO.
 
 Note that with slotids, you have a problem as soon as any of the slots 
 that don't exist on standby are used, regardless of how many concurrent 
 transactions there actually is. Without slots you only have a problem if 
   you really have more than standby's max_connections concurrent 
 transactions. That makes a big difference in practice.

Sometimes, but mostly people set max_connections higher because they
intend to use those extra connections. So no real advantage there
against the slotid approach :-)

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer pool statistics in Explain Analyze

2009-01-09 Thread Tom Lane
ITAGAKI Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 I think there two independent items here:
   [1] Should we add those statistics to pg_stat_statements or not?
   [2] Should we add those statistics to EXPLAIN ANALYZE or not?

 I wanted to have [1] and proposed it, but it is rejected from 8.4.
 However, the reason is not because we have little demand for it,
 but [1] and [2] are mixed in the patch and they are bad designed.

No, I think you misunderstood me entirely.  The reason that I rejected
those parts of the patch is that I think the statistics that are
available are wrong/useless.  If the bufmgr.c counters were really good
for something they'd have been exposed long since (and we'd probably
never have built a lot of the other stats collection infrastructure).

The EXPLAIN ANALYZE code you submitted is actually kinda cute, and
I'd have had no problem with it if I thought it were displaying
numbers that were useful and unlikely to be obsoleted in future
releases.  The work that needs to be done is on collecting the
numbers more than displaying them.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] about truncate

2009-01-09 Thread Peter Eisentraut

David Fetter wrote:

On Thu, Jan 08, 2009 at 02:39:52PM +0200, Peter Eisentraut wrote:

David Fetter wrote:

+1 for adding recursion to GRANT/REVOKE :)
This area is under SQL standard control, so we can't really invent our  
own behavior.


Consider the following:

CREATE TABLE persons (name, email);
CREATE TABLE employees (grade, salary) INHERITS (persons);

GRANT SELECT ON persons TO allstaff;  -- ???
GRANT SELECT ON employees TO managers;

What you want in practice is that allstaff can read only those columns  
of employees that come from the persons table.  Both recursive and  
nonrecursive GRANT do the wrong thing here.


What *would* do the right thing here, or would anything?


I think we don't need GRANT to be recursive, but instead the permission 
checks at runtime should allow


SELECT * FROM persons;

to succeed even if there are no permissions on employees.

But only on the columns of persons and only if actually queried 
through persons.


Needs a more detailed analysis, but that is how I imagine it ought to work.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] about truncate

2009-01-09 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 This area is under SQL standard control, so we can't really invent our  
 own behavior.

 What *would* do the right thing here, or would anything?

 I think we don't need GRANT to be recursive, but instead the permission 
 checks at runtime should allow
 SELECT * FROM persons;
 to succeed even if there are no permissions on employees.

Hmm, if we are supposing that the spec should control this, then
surely we can find chapter and verse spelling out what should happen.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] foreign_data test fails with non-C locale

2009-01-09 Thread Heikki Linnakangas
The foreign_data test case is failing when I run make installcheck 
against a server that's been initialized with a locale other than C 
(en_GB.UTF-8).


The reason is the different ordering of upper and lower case characters, 
per attached diff file. We can simply add an alternative expected output 
file, but I'd prefer not to if we can modify the test case instead. We 
could rename some of the object so that they sort the same in all 
locales, but that seems a bit awkward in this case.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** /home/hlinnaka/git-sandbox/pgsql/src/test/regress/expected/foreign_data.out 
Fri Jan  9 13:11:06 2009
--- /home/hlinnaka/git-sandbox/pgsql/src/test/regress/results/foreign_data.out  
Fri Jan  9 15:47:27 2009
***
*** 658,667 
  SELECT * FROM information_schema.foreign_servers ORDER BY 1, 2;
   foreign_server_catalog | foreign_server_name | foreign_data_wrapper_catalog 
| foreign_data_wrapper_name | foreign_server_type | foreign_server_version | 
authorization_identifier 
  
+-+--+---+-++--
-  regression | S6  | regression   
| foo   | || 
foreign_data_user
   regression | s4  | regression   
| foo   | oracle  || 
foreign_data_user
   regression | s5  | regression   
| foo   | | 15.0   | 
regress_test_role
   regression | s6  | regression   
| foo   | | 16.0   | 
regress_test_indirect
   regression | s8  | regression   
| postgresql| || 
foreign_data_user
   regression | st1 | regression   
| foo   | || 
regress_test_indirect
   regression | st2 | regression   
| foo   | || 
regress_test_role
--- 658,667 
  SELECT * FROM information_schema.foreign_servers ORDER BY 1, 2;
   foreign_server_catalog | foreign_server_name | foreign_data_wrapper_catalog 
| foreign_data_wrapper_name | foreign_server_type | foreign_server_version | 
authorization_identifier 
  
+-+--+---+-++--
   regression | s4  | regression   
| foo   | oracle  || 
foreign_data_user
   regression | s5  | regression   
| foo   | | 15.0   | 
regress_test_role
   regression | s6  | regression   
| foo   | | 16.0   | 
regress_test_indirect
+  regression | S6  | regression   
| foo   | || 
foreign_data_user
   regression | s8  | regression   
| postgresql| || 
foreign_data_user
   regression | st1 | regression   
| foo   | || 
regress_test_indirect
   regression | st2 | regression   
| foo   | || 
regress_test_role
***
*** 670,680 
  SELECT * FROM information_schema.foreign_server_options ORDER BY 1, 2, 3;
   foreign_server_catalog | foreign_server_name |   option_name| 
option_value 
  
+-+--+--
-  regression | S6  | mixed_case_names | true
   regression | s4  | dbname   | b
   regression | s4  | host | a
   regression | s6  | dbname   | b
   regression | s6  | host | a
   regression | s8  | connect_timeout  | 30
   regression | s8  | dbname   | db1
  (7 rows)
--- 670,680 
  SELECT * FROM 

Re: [HACKERS] SET TRANSACTION and SQL Standard

2009-01-09 Thread Peter Eisentraut

Simon Riggs wrote:

I notice that we allow commands such as

SET TRANSACTION read only read write read only;

BEGIN TRANSACTION read only read only read only;

Unsurprisingly, these violate the SQL Standard:
* p.977 section 19.1 syntax (1)
* p.957 section 17.3 syntax (2)


Well, we allow a lot of things.  Violations of the SQL standard happen 
when a command that appears in the standard doesn't do what the standard 
says.  Allowing commands that are not in the standard is not a violation.


While there is no huge use case for these particular cases, tolerating 
redundant options is sometimes useful.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign_data test fails with non-C locale

2009-01-09 Thread Andrew Dunstan



Heikki Linnakangas wrote:
The foreign_data test case is failing when I run make installcheck 
against a server that's been initialized with a locale other than C 
(en_GB.UTF-8).


The reason is the different ordering of upper and lower case 
characters, per attached diff file. We can simply add an alternative 
expected output file, but I'd prefer not to if we can modify the test 
case instead. We could rename some of the object so that they sort the 
same in all locales, but that seems a bit awkward in this case.


Regression tests have always failed on non-C locales AFAIK. The 
buildfarm goes out of its way to avoid that.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign_data test fails with non-C locale

2009-01-09 Thread Peter Eisentraut

Andrew Dunstan wrote:
Regression tests have always failed on non-C locales AFAIK. The 
buildfarm goes out of its way to avoid that.


The regression tests should work just fine in non-C locales.  If the 
buildfarm goes out of its way to avoid non-C locales, then it loses some 
significant code coverage, considering that there are several variant 
code paths for locales, and considering the amount of users that use them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: new border setting in psql

2009-01-09 Thread D'Arcy J.M. Cain
On Thu, 8 Jan 2009 18:45:43 -0500 (EST)
Greg Smith gsm...@gregsmith.com wrote:
  A. Einstein was a really smart dude.
  Which character in the above example would you escape.
 
 . is on the long list of characters to be escaped I sent out earlier. 
 The parser looks for all sorts of enumeration syntaxes--A., I), (IV)--but 
 they all require some punctuation which makes those characters the ones to 
 focus on.

I tried escaping the '.' but it didn't change the behaviour.

  I would suggest that if we want actual ReST-safe output we should create 
  a border = 4 setting.  The code changes would be minimal.  All we need 
  to do is check for a value of 4 in addition to checking whether escaping 
  is necessary.
 
 This seems like a reasonable spec to me.  If you just want that general 
 format, you can get that and may very well end up with something that's 
 useful ReST anyway with the border=3 mode your existing patch implements. 
 As you demonstrated, there are several situations where a character you 
 think might do something special turns out to be ignored, with \ being 
 the most likely to cause trouble.

Enter the following into my test rig http://www.druid.darcy/rest.py:

+++
| id |  name  |
+++
|  8 | T'est  |
+++
|  9 | T*est  |
+++
| 10 | T\est  |
+++
| 11 | T\\est |
+++
| 12 | T_est  |
+++
| 13 | T`est  |
+++

Notice that only the backslash needs to be escaped.  However, if you
just add the backslash it won't work because the table will be
malformed.  You need to widen every other field as well.

As Tom has pointed out, ReST has problems beyond our implementation.
People who use it are aware of these warts.  Given that I really think
that the cleanest solution is to just give them border 3, don't
mention ReST in the docs and have it happily work for 95% of the cases
in a ReST processor.

How about my other proposal under the Output filter for psql
subject?  That would let people create parsers as safe as they need
them.  I think that this proposal is still useful for those that need
something quick and dirty though.

-- 
D'Arcy J.M. Cain da...@druid.net |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Including kerberos realm

2009-01-09 Thread Bruce Momjian
Magnus Hagander wrote:
 Magnus Hagander wrote:
  Tom Lane wrote:
  Magnus Hagander mag...@hagander.net writes:
  Alvaro Herrera wrote:
  Not that this affects me in any way, but should there be a GUC variable
  to set the default behavior system-wide?
  I thought about that, but I don't want to add extra gucs without a good
  reason. You'd typically not have very many different lines in pg_hba for
  this, and just duplicating the parameter there would be ok I think.
  I'd rather move more of the krb parameters to be *just* in pg_hba.conf,
  but for now I left those in postgresql.conf as fallbacks..
  If you think those parameters would make more sense in pg_hba.conf,
  let's just move them and be done with it.  There has never been any
  intention that administrator-only GUCs would be promised compatible
  across versions.  And the GUC mechanism is really rather a lot of
  overhead compared to options on a pg_hba line ...
  
  Well, it does make sense to have defaults in postgresql.conf - but I
  don't think it's worth the overhead.
  
  I'll commit the stuff I have for now and put it on my TODO to remove
  them completely from postgresql.conf later. I'll see if I have time to
  get it done for 8.4.
 
 Ok, I've applied a patch for this for the parameter krb_realm and
 krb_server_hostname, which are the ones that currently supported both.
 
 Should we also consider moving the remaining ones there?
 (krb_server_keyfile, krb_srvname, krb_caseinsens_users)
 
 They do make sense to set on a per-server basis, on the other hand they
 are the only remaining authentication-method-specific parameters left...

If they make more sense in postgresql.conf, I would just leave them
there.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  Isn't this redundant given the existence of pglesslog?
 
  It does the same as pglesslog, but is simpler to use because it is
  automatic.
 
 Which also means that everyone pays the performance penalty whether
 they get any benefit or not.  The point of the external solution
 is to do the work only in installations that get some benefit.
 We've been over this ground before...

If there is a performance penalty, you are right, but if the zeroing is
done as part of the archiving, it seems near zero cost enough to do it
all the time, no?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] parallel restore

2009-01-09 Thread Andrew Dunstan

Attached is the latest version.

Changes:

. some tidy up as variously requested.
. some common code is factored out
. some descriptive comments added
. platform specific stuff (e.g. spawn, reap) is factored out
. --truncate_before_load is gone, and we now do this during parallel 
restore if we created the table. For a non-parallel restore the 
equivalent would be to run the whole restore in a single transaction.


One of the hardest parts of getting this to work was handling 
dependencies right. I will work on adding some more comments regarding that.


Simon asked about a way to adjust the number of worker children as we go 
along. That's way out of scope at this stage. In testing it appears that 
the sweet spot is roughly m=number_of_processors, which makes some 
sense, but more experience will clarify this.


cheers

andrew


parallel_restore_14.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SET TRANSACTION and SQL Standard

2009-01-09 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Simon Riggs wrote:
 I notice that we allow commands such as
 SET TRANSACTION read only read write read only;
 BEGIN TRANSACTION read only read only read only;

 Well, we allow a lot of things.  Violations of the SQL standard happen 
 when a command that appears in the standard doesn't do what the standard 
 says.  Allowing commands that are not in the standard is not a violation.

I agree that spec violation is not a good argument for rejecting
these.  However, self-consistency with our own common practice should
be considered.  In practically every utility command we have that takes
a list of options, we throw conflicting or redundant options errors
in similar cases.

My own feeling is that the second example is okay but the first should
be rejected, since (a) it's quite unclear what the user wants, and (b)
the ensuing behavior would be determined by implementation artifacts
like which order we processed the options in.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SET TRANSACTION and SQL Standard

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 16:14 +0200, Peter Eisentraut wrote:
 Simon Riggs wrote:
  I notice that we allow commands such as
  
  SET TRANSACTION read only read write read only;
  
  BEGIN TRANSACTION read only read only read only;
  
  Unsurprisingly, these violate the SQL Standard:
  * p.977 section 19.1 syntax (1)
  * p.957 section 17.3 syntax (2)
 
 Well, we allow a lot of things.  Violations of the SQL standard happen 
 when a command that appears in the standard doesn't do what the standard 
 says.  Allowing commands that are not in the standard is not a violation.

Except when the standard explicitly forbids it, as with the above.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign_data test fails with non-C locale

2009-01-09 Thread Peter Eisentraut

Heikki Linnakangas wrote:
The foreign_data test case is failing when I run make installcheck 
against a server that's been initialized with a locale other than C 
(en_GB.UTF-8).


I have removed one of the differences but can't reproduce the other 
right now (although it looks consequential).  I'll check that on a 
different machine.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pgindent does a pretty awful job with function-pointer typedefs

2009-01-09 Thread Tom Lane
I wonder if this can be improved:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/optimizer/planner.h.diff?r1=1.41;r2=1.42

Similar examples can be found elsewhere ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign_data test fails with non-C locale

2009-01-09 Thread Heikki Linnakangas

Andrew Dunstan wrote:

Heikki Linnakangas wrote:
The foreign_data test case is failing when I run make installcheck 
against a server that's been initialized with a locale other than C 
(en_GB.UTF-8).


The reason is the different ordering of upper and lower case 
characters, per attached diff file. We can simply add an alternative
expected output file, but I'd prefer not to if we can modify the test 
case instead. We could rename some of the object so that they sort the 
same in all locales, but that seems a bit awkward in this case.


Regression tests have always failed on non-C locales AFAIK. The 
buildfarm goes out of its way to avoid that.


No, that's the only test case that's failing.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SET TRANSACTION and SQL Standard

2009-01-09 Thread Heikki Linnakangas

Simon Riggs wrote:

On Fri, 2009-01-09 at 16:14 +0200, Peter Eisentraut wrote:

Simon Riggs wrote:

I notice that we allow commands such as

SET TRANSACTION read only read write read only;

BEGIN TRANSACTION read only read only read only;

Unsurprisingly, these violate the SQL Standard:
* p.977 section 19.1 syntax (1)
* p.957 section 17.3 syntax (2)
Well, we allow a lot of things.  Violations of the SQL standard happen 
when a command that appears in the standard doesn't do what the standard 
says.  Allowing commands that are not in the standard is not a violation.


Except when the standard explicitly forbids it, as with the above.


No, it just means that the statement SET TRANSACTION read only read 
write read only; doesn't conform to the standard, and it's therefore 
implementation-dependent what it does. See the meaning of shall in 
Syntax Rules, section 6.3.3.2 Terms denoting rule requirements.


I agree with Tom that the 2nd form is harmless, but we should throw an 
error for the first.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign_data test fails with non-C locale

2009-01-09 Thread Andrew Dunstan



Peter Eisentraut wrote:

Andrew Dunstan wrote:
Regression tests have always failed on non-C locales AFAIK. The 
buildfarm goes out of its way to avoid that.


The regression tests should work just fine in non-C locales.  If the 
buildfarm goes out of its way to avoid non-C locales, then it loses 
some significant code coverage, considering that there are several 
variant code paths for locales, and considering the amount of users 
that use them.


It was discussed here at the time, IIRC, and we put in the check 
precisely because other locales broke the buildfarm. Originally 
buildfarm just inherited the locale from its environment.


If it is no longer true that other locales break the tests, then I'm 
happy to examine alternatives.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solve a problem of LC_TIME of windows.

2009-01-09 Thread Hiroshi Saito

Hi.

Thanks all.

I tried CVS-HEAD now.


HIROSHI=# select to_char(now(),'TMDay');
to_char
--
Saturday
(1 行)

HIROSHI=# set LC_MESSAGES=Ja;
SET
HIROSHI=# select to_char(now(),'TMDay');
to_char
--
Saturday
(1 行)


Umm, It does not look at a comfortable result.:-(
I will check it on tomorrow night. sorry busy now..

Regards,
Hiroshi Saito

- Original Message - 
From: Magnus Hagander mag...@hagander.net




ITAGAKI Takahiro wrote:

Hiroshi Inoue in...@tpf.co.jp wrote:


Seems LC_CTYPE and LC_TIME should be convertible even though we use
wcsftime (which internally calls strftime?).


Ok, wcsftime() requries both LC_TIME and LC_CTYPE are the same setting
(at least encoding) on Windows.

The attached patch is an updated version to fix cache_locale_time().
Now it sets LC_TIME and LC_CTYPE to the specified locale and restore
them at end of the function. I tested the patch on Windows XP Japanese
Edition (SJIS) with UTF-8 and EUCJP databases, and worked expectedly.

#ifdef WIN32 codes seems to be ugly in the patch,
but I have no other idea...


I have applied this version of the patch (with only a minor further
addition to the comment).

Thank you all for your work and patience in getting this fixed! Let's
hope it stays fixed :-)

//Magnus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers 



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Kevin Grittner
 Greg Smith gsm...@gregsmith.com wrote: 
 I thought at one point that the direction this was going toward was
to 
 provide the size of the WAL file as a parameter you can use in the 
 archive_command:  %p provides the path, %f the file name, and now %l
the 
 length.  That makes an example archive command something like:
 
 head -c %l %p | gzip  /mnt/server/archivedir/%f
 
Hard to beat for performance.  I thought there was some technical
snag.
 
 Expanding it back to always be 16MB on the other side might require
some 
 trivial script, can't think of a standard UNIX tool suitable for that
but 
 it's easy enough to write.
 
Untested, but it seems like something close to this would work:
 
cat $p $( dd if=/dev/null blocks=1 ibs=$(( (16 * 1024 * 1024) - $(stat
-c%s $p) )) )
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 Which also means that everyone pays the performance penalty whether
 they get any benefit or not.  The point of the external solution
 is to do the work only in installations that get some benefit.
 We've been over this ground before...

 If there is a performance penalty, you are right, but if the zeroing is
 done as part of the archiving, it seems near zero cost enough to do it
 all the time, no?

It's the same cost no matter which process does it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SET TRANSACTION and SQL Standard

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 17:11 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  On Fri, 2009-01-09 at 16:14 +0200, Peter Eisentraut wrote:
  Simon Riggs wrote:
  I notice that we allow commands such as
 
  SET TRANSACTION read only read write read only;
 
  BEGIN TRANSACTION read only read only read only;
 
  Unsurprisingly, these violate the SQL Standard:
  * p.977 section 19.1 syntax (1)
  * p.957 section 17.3 syntax (2)
  Well, we allow a lot of things.  Violations of the SQL standard happen 
  when a command that appears in the standard doesn't do what the standard 
  says.  Allowing commands that are not in the standard is not a violation.
  
  Except when the standard explicitly forbids it, as with the above.
 
 No, it just means that the statement SET TRANSACTION read only read 
 write read only; doesn't conform to the standard, and it's therefore 
 implementation-dependent what it does. See the meaning of shall in 
 Syntax Rules, section 6.3.3.2 Terms denoting rule requirements.

which says

If any condition required by Syntax Rules is not satisfied when the
evaluation of Access or General Rules is attempted and the
implementation is neither processing non-conforming SQL language nor
processing conforming SQL language in a non-conforming manner, then an
exception condition is raised: syntax error or access rule violation.

If we *choose* to be an SQL implementation that conforms to the SQL
standard, then it should throw an error. 

Of course, we can *choose* not to conform to the standard in this or any
case, but exactly why would we? I thought we had made a choice to
conform to the SQL Standard, unless we have specific reason not to.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Greg Smith gsm...@gregsmith.com wrote: 
 I thought at one point that the direction this was going toward was to 
 provide the size of the WAL file as a parameter you can use in the 
 archive_command:
 
 Hard to beat for performance.  I thought there was some technical
 snag.
 
Yeah: the archiver process doesn't have that information available.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: new border setting in psql

2009-01-09 Thread Cédric Villemain
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

D'Arcy J.M. Cain a écrit :
 
 As Tom has pointed out, ReST has problems beyond our implementation.
 People who use it are aware of these warts.  Given that I really think
 that the cleanest solution is to just give them border 3, don't
 mention ReST in the docs and have it happily work for 95% of the cases
 in a ReST processor.

+1

 
 How about my other proposal under the Output filter for psql
 subject?  That would let people create parsers as safe as they need
 them.  I think that this proposal is still useful for those that need
 something quick and dirty though.
 

Can be interesting, but for my own usage border=3 will be enough.


- --
Cédric Villemain
Administrateur de Base de Données
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAklneBwACgkQo/dppWjpEvxucQCeIuTatyfoEw/TkqAVLK/hI0wq
WkIAn3mt4tnMz3BAjXIJqtmMlj9d4r4l
=Ykl+
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Aidan Van Dyk
All that is useless until we get a %l in archive_command...

*I* didn't see an easy way to get at the written size later on in the
chain (i.e. in the actual archiving), so I took the path of least
resitance.

The reason *I* shy way from pg_lesslog and pg_clearxlogtail, is that
they seem to possibly be frail... I'm just scared of somethign changing
in PG some time, and my pg_clearxlogtail not nowing, me forgetting to
upgrade, and me not doing enough test of my actually restoring backups...

Sure, it's all me being neglgent, but the simpler, the better...

If I wrapped this zeroing in GUC, people could choose wether to pay the
penalty or not, would that satisfy anyone?

Again, *I* think that the force_switch case is going to happen when the
admin's quite happy to pay that penalty...  But obviously not
everyone...

a.

* Kevin Grittner kevin.gritt...@wicourts.gov [090109 11:01]:
  Greg Smith gsm...@gregsmith.com wrote: 
  I thought at one point that the direction this was going toward was
 to 
  provide the size of the WAL file as a parameter you can use in the 
  archive_command:  %p provides the path, %f the file name, and now %l
 the 
  length.  That makes an example archive command something like:
  
  head -c %l %p | gzip  /mnt/server/archivedir/%f
  
 Hard to beat for performance.  I thought there was some technical
 snag.
  
  Expanding it back to always be 16MB on the other side might require
 some 
  trivial script, can't think of a standard UNIX tool suitable for that
 but 
  it's easy enough to write.
  
 Untested, but it seems like something close to this would work:
  
 cat $p $( dd if=/dev/null blocks=1 ibs=$(( (16 * 1024 * 1024) - $(stat
 -c%s $p) )) )
  
 -Kevin
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] SET TRANSACTION and SQL Standard

2009-01-09 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 If any condition required by Syntax Rules is not satisfied when the
 evaluation of Access or General Rules is attempted and the
 implementation is neither processing non-conforming SQL language nor
 processing conforming SQL language in a non-conforming manner, then an
 exception condition is raised: syntax error or access rule violation.

 If we *choose* to be an SQL implementation that conforms to the SQL
 standard, then it should throw an error. 

That reading would forbid any nonstandard syntax whatsoever...

What this is actually describing is the standards conformance checking
mode that the standard says you ought to provide, but we never have
(nor have most other vendors AFAIK).  In SQL92 this was described as
a SQL Flagger and it was optional.  Not sure what the latest spec
says about that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solve a problem of LC_TIME of windows.

2009-01-09 Thread Tom Lane
Hiroshi Saito z-sa...@guitar.ocn.ne.jp writes:
 HIROSHI=# set LC_MESSAGES=Ja;
 SET
 HIROSHI=# select to_char(now(),'TMDay');
  to_char
 --
  Saturday
 (1 行)

I thought this was supposed to be driven by LC_TIME now, not
LC_MESSAGES.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign_data test fails with non-C locale

2009-01-09 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Peter Eisentraut wrote:
 The regression tests should work just fine in non-C locales.

 It was discussed here at the time, IIRC, and we put in the check 
 precisely because other locales broke the buildfarm. Originally 
 buildfarm just inherited the locale from its environment.

I don't think we are prepared to buy into a general policy that the
regression tests should pass in *any* locale; maintaining a large
number of variant expected-files isn't very practical.  However, the
de facto policy is that we try to keep them passing in locales that
are used by any of the regular developers.  I think it would be useful
to have buildfarm members testing in a few common locales.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 09:31 -0500, Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   Tom Lane wrote:
   Isn't this redundant given the existence of pglesslog?
  
   It does the same as pglesslog, but is simpler to use because it is
   automatic.
  
  Which also means that everyone pays the performance penalty whether
  they get any benefit or not.  The point of the external solution
  is to do the work only in installations that get some benefit.
  We've been over this ground before...
 
 If there is a performance penalty, you are right, but if the zeroing is
 done as part of the archiving, it seems near zero cost enough to do it
 all the time, no?

It can already be done as part of the archiving, using an external tool
as Tom notes.

Yes, we could make the archiver do this, but I see no big advantage over
having it done externally. It's not faster, safer, easier. Not easier
because we would want a parameter to turn it off when not wanted.

The patch as stands is IMHO not acceptable because the work to zero the
file is performed by the unlucky backend that hits EOF on the current
WAL file, which is bad enough, but it is also performed while holding
WALWriteLock. 

I like Greg Smith's analysis of this and his conclusion that we could
provide a %l option, but even that would require work to have that info
passed to the archiver. Perhaps inside the notification file which is
already written and read by the write processes. But whether that can or
should be done for this release is a different debate.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Hot standby, RestoreBkpBlocks and cleanup locks

2009-01-09 Thread Heikki Linnakangas
The hot standby patch has some hacks to decide which full-page-images 
can be restored holding an exclusive lock and which ones need a 
vacuum-strength lock. It's not very pretty as is, as mentioned in 
comments too.


How about we refactor things so that redo-functions are responsible for 
calling RestoreBkpBlocks? The redo function can then pass an argument 
indicating what kind of lock is required. We should also change 
XLogReadBufferExtended so that it doesn't lock the page; the caller 
knows better what kind of a lock it needs. That makes it more analogous 
with ReadBufferExtended too, although I think we should keep 
XLogReadBuffer() unchanged for now.


See attached patch. One shortfall of this patch is that you can pass 
only one argument to RestoreBkpBlocks, but there can multiple backup 
blocks in one WAL record. That's OK in current usage, though.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** src/backend/access/gin/ginxlog.c
--- src/backend/access/gin/ginxlog.c
***
*** 438,443  gin_redo(XLogRecPtr lsn, XLogRecord *record)
--- 438,445 
  {
  	uint8		info = record-xl_info  ~XLR_INFO_MASK;
  
+ 	RestoreBkpBlocks(lsn, record, false);
+ 
  	topCtx = MemoryContextSwitchTo(opCtx);
  	switch (info)
  	{
*** src/backend/access/gist/gistxlog.c
--- src/backend/access/gist/gistxlog.c
***
*** 395,400  gist_redo(XLogRecPtr lsn, XLogRecord *record)
--- 395,402 
  {
  	uint8		info = record-xl_info  ~XLR_INFO_MASK;
  
+ 	RestoreBkpBlocks(lsn, record, false);
+ 
  	MemoryContext oldCxt;
  
  	oldCxt = MemoryContextSwitchTo(opCtx);
*** src/backend/access/heap/heapam.c
--- src/backend/access/heap/heapam.c
***
*** 4777,4782  heap_redo(XLogRecPtr lsn, XLogRecord *record)
--- 4777,4784 
  {
  	uint8		info = record-xl_info  ~XLR_INFO_MASK;
  
+ 	RestoreBkpBlocks(lsn, record, false);
+ 
  	switch (info  XLOG_HEAP_OPMASK)
  	{
  		case XLOG_HEAP_INSERT:
***
*** 4816,4827  heap2_redo(XLogRecPtr lsn, XLogRecord *record)
--- 4818,4832 
  	switch (info  XLOG_HEAP_OPMASK)
  	{
  		case XLOG_HEAP2_FREEZE:
+ 			RestoreBkpBlocks(lsn, record, false);
  			heap_xlog_freeze(lsn, record);
  			break;
  		case XLOG_HEAP2_CLEAN:
+ 			RestoreBkpBlocks(lsn, record, true);
  			heap_xlog_clean(lsn, record, false);
  			break;
  		case XLOG_HEAP2_CLEAN_MOVE:
+ 			RestoreBkpBlocks(lsn, record, true);
  			heap_xlog_clean(lsn, record, true);
  			break;
  		default:
*** src/backend/access/nbtree/nbtxlog.c
--- src/backend/access/nbtree/nbtxlog.c
***
*** 714,719  btree_redo(XLogRecPtr lsn, XLogRecord *record)
--- 714,721 
  {
  	uint8		info = record-xl_info  ~XLR_INFO_MASK;
  
+ 	RestoreBkpBlocks(lsn, record, false);
+ 
  	switch (info)
  	{
  		case XLOG_BTREE_INSERT_LEAF:
*** src/backend/access/transam/xlog.c
--- src/backend/access/transam/xlog.c
***
*** 2934,2941  CleanupBackupHistory(void)
   * modifications of the page that appear in XLOG, rather than possibly
   * ignoring them as already applied, but that's not a huge drawback.
   */
! static void
! RestoreBkpBlocks(XLogRecord *record, XLogRecPtr lsn)
  {
  	Buffer		buffer;
  	Page		page;
--- 2934,2941 
   * modifications of the page that appear in XLOG, rather than possibly
   * ignoring them as already applied, but that's not a huge drawback.
   */
! void
! RestoreBkpBlocks(XLogRecPtr lsn, XLogRecord *record, bool cleanup)
  {
  	Buffer		buffer;
  	Page		page;
***
*** 2943,2948  RestoreBkpBlocks(XLogRecord *record, XLogRecPtr lsn)
--- 2943,2951 
  	char	   *blk;
  	int			i;
  
+ 	if (!(record-xl_info  XLR_BKP_BLOCK_MASK))
+ 		return;
+ 
  	blk = (char *) XLogRecGetData(record) + record-xl_len;
  	for (i = 0; i  XLR_MAX_BKP_BLOCKS; i++)
  	{
***
*** 2955,2960  RestoreBkpBlocks(XLogRecord *record, XLogRecPtr lsn)
--- 2958,2968 
  		buffer = XLogReadBufferExtended(bkpb.node, bkpb.fork, bkpb.block,
  		RBM_ZERO);
  		Assert(BufferIsValid(buffer));
+ 		if (cleanup)
+ 			LockBufferForCleanup(buffer);
+ 		else
+ 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+ 
  		page = (Page) BufferGetPage(buffer);
  
  		if (bkpb.hole_length == 0)
***
*** 5210,5218  StartupXLOG(void)
  	TransactionIdAdvance(ShmemVariableCache-nextXid);
  }
  
- if (record-xl_info  XLR_BKP_BLOCK_MASK)
- 	RestoreBkpBlocks(record, EndRecPtr);
- 
  RmgrTable[record-xl_rmid].rm_redo(EndRecPtr, record);
  
  /* Pop the error context stack */
--- 5218,5223 
*** src/backend/access/transam/xlogutils.c
--- src/backend/access/transam/xlogutils.c
***
*** 217,224  XLogCheckInvalidPages(void)
  
  /*
   * XLogReadBuffer
!  *		A shorthand of XLogReadBufferExtended(), for reading from the main
!  *		fork.
   *
   * For historical reasons, instead of a ReadBufferMode argument, this only
   * supports RBM_ZERO (init == true) and RBM_NORMAL (init == false) modes.
--- 217,234 
  
  

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Aidan Van Dyk
* Simon Riggs si...@2ndquadrant.com [090109 11:33]:
 
 
 The patch as stands is IMHO not acceptable because the work to zero the
 file is performed by the unlucky backend that hits EOF on the current
 WAL file, which is bad enough, but it is also performed while holding
 WALWriteLock. 

Agreed, but noting that that extra zero work is contitional on the
force_swich, meaning that commits backup behind that WALWriteLock only
during forced xlog switches (like archive_timeout and pg_backup).  I
actually did look through verify that when I made the patch, although I
claim that verification to be something anybody else should beleive ;-)
But my given output when I showd the stats/lines/etc did demonstrate
that.

 I like Greg Smith's analysis of this and his conclusion that we could
 provide a %l option, but even that would require work to have that info
 passed to the archiver. Perhaps inside the notification file which is
 already written and read by the write processes. But whether that can or
 should be done for this release is a different debate.

It's certainly not already in this commitfest, just like this patch.  I
thought the initial reaction after I posted it made it pretty clear it
wasn't something people (other than a few of us) were willing to
allow...

a.
-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] Solve a problem of LC_TIME of windows.

2009-01-09 Thread Hiroshi Saito

I thought this was supposed to be driven by LC_TIME now, not
LC_MESSAGES.


Uga, yes yes!

HIROSHI=# set LC_TIME=Ja;
SET
HIROSHI=# select to_char(now(),'TMDay');
to_char
-
土曜日
(1 行)

Thanks:-)



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SET TRANSACTION and SQL Standard

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 11:20 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  If any condition required by Syntax Rules is not satisfied when the
  evaluation of Access or General Rules is attempted and the
  implementation is neither processing non-conforming SQL language nor
  processing conforming SQL language in a non-conforming manner, then an
  exception condition is raised: syntax error or access rule violation.
 
  If we *choose* to be an SQL implementation that conforms to the SQL
  standard, then it should throw an error. 
 
 That reading would forbid any nonstandard syntax whatsoever...

No, it does allow you to choose on a case by case basis. But yes, I had
thought our (not just my) default position was to conform to the
standard. 

 What this is actually describing is the standards conformance checking
 mode that the standard says you ought to provide, but we never have
 (nor have most other vendors AFAIK).  In SQL92 this was described as
 a SQL Flagger and it was optional.  Not sure what the latest spec
 says about that.

I've been thinking about that as something for next release.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign_data test fails with non-C locale

2009-01-09 Thread Guillaume Smet
On Fri, Jan 9, 2009 at 5:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 However, the
 de facto policy is that we try to keep them passing in locales that
 are used by any of the regular developers.  I think it would be useful
 to have buildfarm members testing in a few common locales.

If you define common locales, I can set up as many new animals as
needed to cover the locales needed for any branch we'd like to test.

Perhaps we should add a parameter to the buildfarm config file so that
the buildfarm script can check the locale is accepted and set it
directly. Considering that we won't have the locale information in the
animal description, it's a good way to have it in the report.

Just let me know.

-- 
Guillaume

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, RestoreBkpBlocks and cleanup locks

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 18:30 +0200, Heikki Linnakangas wrote:

 The hot standby patch has some hacks to decide which full-page-images 
 can be restored holding an exclusive lock and which ones need a 
 vacuum-strength lock. It's not very pretty as is, as mentioned in 
 comments too.

Agreed!

 How about we refactor things so that redo-functions are responsible for 
 calling RestoreBkpBlocks? The redo function can then pass an argument 
 indicating what kind of lock is required. We should also change 
 XLogReadBufferExtended so that it doesn't lock the page; the caller 
 knows better what kind of a lock it needs. That makes it more analogous 
 with ReadBufferExtended too, although I think we should keep 
 XLogReadBuffer() unchanged for now.

Much better idea, thanks. I felt a new rmgr function was overkill but
couldn't think of how to do this.

 See attached patch. One shortfall of this patch is that you can pass 
 only one argument to RestoreBkpBlocks, but there can multiple backup 
 blocks in one WAL record. That's OK in current usage, though.

If we're doing this because of cleanup locks, then I'd say we don't
currently need a cleanup lock with any WAL record that uses multiple
backup blocks. So we can just document that so anybody adding such a
record in the future will be careful.

So all seems good.

Would you want to push ResolveRedoVisibilityConflicts() down into the
rmgrs as well and make reachedSafeStartPoint a global? That is only
called for cleanup records.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Automatic view update rules

2009-01-09 Thread Bernd Helmle
--On Freitag, Januar 09, 2009 13:20:57 +0100 Bernd Helmle 
maili...@oopsware.de wrote:



That means, View1 consists of View2 and so on. What happens now if
someone is going to change View3, so that it's not updatable anymore?
What the patch actually does is, scanning all relations/views involved in
a current view (and cascading updates) und reject update rules as soon as
it finds more than one relation within a view definition. Unfortunately
this seems not to be enough, we had really check all involved views for
updatability recursively. The infrastructure for this is already there,
but i wonder if it could be made easier when we are going to maintain a
separate is_updatable flag somewhere in the catalog, which would make
checking the relation tree for updatability more easier.


I've decided to check updatability of all involved views during view 
creation. Please find attached a new version with all other open issues 
adressed.


--
 Thanks

   Bernd

view_update.patch.bz2
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Kevin Grittner
 Aidan Van Dyk ai...@highrise.ca 01/09/09 10:22 AM  
 The reason *I* shy way from pg_lesslog and pg_clearxlogtail, is that
 they seem to possibly be frail... I'm just scared of somethign
 changing in PG some time, and my pg_clearxlogtail not nowing, me
 forgetting to upgrade, and me not doing enough test of my actually
 restoring backups...
 
A fair concern.  I can't speak about pglesslog, but pg_clearxlogtail
goes out of its way to minimize this risk.  Changes to log records
themselves can't break it; it is only dependent on the partitioning. 
It bails with a message to stderr and a non-zero return code if it
finds anything obviously wrong.  It also checks the WAL format for
which it was compiled against the WAL format on which it was invoked,
and issues a warning if they don't match.  We ran into this once on a
machine running multiple releases of PostgreSQL where the archive
script invoked the wrong executable.  It worked correctly in spite of
the warning, but the warning was enough to alert us to the mismatch
and change the path in the archive script.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, RestoreBkpBlocks and cleanup locks

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 18:30 +0200, Heikki Linnakangas wrote:
 How about we refactor things?

Was that a patch against HEAD or a patch on patch?

It would be useful to nibble away at the patch, committing all the
necessary refactorings to make the patch work. That will reduce size of
eventual commit.

Do you want me to start using the GIT repo to make it easier to extract
parts? You'd need to show me the setup you use first.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New patch for Column-level privileges

2009-01-09 Thread Stephen Frost
Jamie, et al,

* Jaime Casanova (jcasa...@systemguards.com.ec) wrote:
  para
 ! Currently, productnamePostgreSQL/productname does not recognize
 ! column-level SELECT privileges when a JOIN is involved.
   One possible workaround is to create a view having just the desired
   columns and then grant privileges to that view.
  /para

Remove this from the documentation, and:

  - Other minor cleanups (thanks KaiGai)
  - Added pg_dump support
  - Added support for 'ALL(col1)' grant/revokes (helped with pg_dump)
  - Added more regression tests
  - Updated documentation accordingly

Please test, comment, etc.

Thanks,

Stephen


colprivs_wip.2009010901.diff.gz
Description: Binary data


signature.asc
Description: Digital signature


Re: [HACKERS] foreign_data test fails with non-C locale

2009-01-09 Thread Andrew Dunstan



Guillaume Smet wrote:

On Fri, Jan 9, 2009 at 5:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  

However, the
de facto policy is that we try to keep them passing in locales that
are used by any of the regular developers.  I think it would be useful
to have buildfarm members testing in a few common locales.



If you define common locales, I can set up as many new animals as
needed to cover the locales needed for any branch we'd like to test.

Perhaps we should add a parameter to the buildfarm config file so that
the buildfarm script can check the locale is accepted and set it
directly. Considering that we won't have the locale information in the
animal description, it's a good way to have it in the report.


  


Sure, we can easily have buildfarm's initdb step set any locale (and 
encoding, for that matter) we like. That's a simple change.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Richard Huxton
Tom Lane wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Greg Smith gsm...@gregsmith.com wrote: 
 I thought at one point that the direction this was going toward was to 
 provide the size of the WAL file as a parameter you can use in the 
 archive_command:
  
 Hard to beat for performance.  I thought there was some technical
 snag.
  
 Yeah: the archiver process doesn't have that information available.

Am I being really dim here - why isn't the first record in the WAL file
a fixed-length record containing e.g. txid_start, time_start, txid_end,
time_end, length? Write it once when you start using the file and once
when it's finished.

-- 
  Richard Huxton
  Archonet Ltd

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Aidan Van Dyk
* Richard Huxton d...@archonet.com [090109 12:22]:

  Yeah: the archiver process doesn't have that information available.

 Am I being really dim here - why isn't the first record in the WAL file
 a fixed-length record containing e.g. txid_start, time_start, txid_end,
 time_end, length? Write it once when you start using the file and once
 when it's finished.

It would break the WAL write-block/sync-block forward only progress of
the xlog, which avoids the whole torn-page problem that the heap has.

a.
-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] Hot standby, RestoreBkpBlocks and cleanup locks

2009-01-09 Thread Heikki Linnakangas

Simon Riggs wrote:

On Fri, 2009-01-09 at 18:30 +0200, Heikki Linnakangas wrote:

How about we refactor things?


Was that a patch against HEAD or a patch on patch?


Against HEAD.


It would be useful to nibble away at the patch, committing all the
necessary refactorings to make the patch work. That will reduce size of
eventual commit.


Agreed. This in particular is a change I feel pretty confident to commit 
beforehand.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Maintaining patchset with GIT (was Re: Hot standby, RestoreBkpBlocks and cleanup locks)

2009-01-09 Thread Heikki Linnakangas

Simon Riggs wrote:
 Do you want me to start using the GIT repo to make it easier to 
extract parts?


It would be useful. Even more so for your own sanity, I think :-). I 
find maintaining multiple interdependent patches much easier with GIT, 
though it's still easy to get confused.


Feel free to continue with CVS, of course.

 You'd need to show me the setup you use first.

Well, the first thing to do is to clone the repository, see wiki for 
that. And get an account at git.postgresql.org so that you can publish 
your stuff as a git repository. (I should get one myself..)


For reviewing hot standby, I created one hotstandbyv6a branch, and 
applied and committed all the patches in right order. But if you want to 
keep the patches separate, you should create a separate branch for each.


If patch B depends on patch A, create branch A first, and then branch 
branch B from that. That's what I did for the relation forks and FSM 
work. Just remember to always commit to the right branch. Whenever you 
commit changes to branch A, also merge those changes to branch B with 
git checkout B; git merge A. To sync with PostgreSQL CVS HEAD: git 
merge origin/master


To generate diffs, you can do for example git diff A..B to create a 
diff between A and B, or git diff origin/master..A to create a diff 
between PostgreSQL CVS HEAD and A. git log is also very helpful.


There's a learning curve, but don't hesitate to ask.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: pgindent does a pretty awful job with function-pointer typedefs

2009-01-09 Thread Bruce Momjian
Tom Lane wrote:
 I wonder if this can be improved:
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/optimizer/planner.h.diff?r1=1.41;r2=1.42
 
 Similar examples can be found elsewhere ...

Agreed, pgindent does a poor job aligning function pointers.  I rerand
planner.h here --- 'planner_hook_type' is defined as a typedef to BSD
indent, but the alignment is clearly off, but I can't figure out how to
improve it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Bruce Momjian
Tom Lane wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
  Greg Smith gsm...@gregsmith.com wrote: 
  I thought at one point that the direction this was going toward was to 
  provide the size of the WAL file as a parameter you can use in the 
  archive_command:
  
  Hard to beat for performance.  I thought there was some technical
  snag.
  
 Yeah: the archiver process doesn't have that information available.

OK, thanks, I understand now.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Richard Huxton
Aidan Van Dyk wrote:
 * Richard Huxton d...@archonet.com [090109 12:22]:
 
 Yeah: the archiver process doesn't have that information available.
 
 Am I being really dim here - why isn't the first record in the WAL file
 a fixed-length record containing e.g. txid_start, time_start, txid_end,
 time_end, length? Write it once when you start using the file and once
 when it's finished.
 
 It would break the WAL write-block/sync-block forward only progress of
 the xlog, which avoids the whole torn-page problem that the heap has.

I thought that only applied when the filesystem page-size was less than
the data we were writing?

-- 
  Richard Huxton
  Archonet Ltd

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Yes, we could make the archiver do this, but I see no big advantage over
 having it done externally. It's not faster, safer, easier. Not easier
 because we would want a parameter to turn it off when not wanted.

And the other question to ask is how much effort and code should we be
putting into the concept anyway.  AFAICS, file-at-a-time WAL shipping
is a stopgap implementation that will be dead as a doornail once the
current efforts towards realtime replication are finished.  There will
still be some use for forced log switches in connection with backups,
but that's not going to occur often enough to be important to optimize.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Visibility map and freezing

2009-01-09 Thread Jeff Davis
On Fri, 2009-01-09 at 13:49 +0200, Heikki Linnakangas wrote:
 Thinking about this some more, I'm not too happy with those names 
 either. vacuum_freeze_scan_age and autovacuum_freeze_scan_age don't mean 
 quite the same thing, like vacuum_cost_delay and 
 autovacuum_vacuum_cost_delay do, for example.

If the distinction you're making is that autovacuum_freeze_max_age
affects the launching of a vacuum rather than the behavior of a vacuum,
maybe we could incorporate the word launch like:

autovacuum_launch_freeze_threshold

 I'm now leaning towards:
 
 autovacuum_freeze_max_age
 vacuum_freeze_table_age
 vacuum_freeze_min_age
 
 where autovacuum_freeze_max_age and vacuum_freeze_min_age are unchanged, 
 and vacuum_freeze_table_age is the new setting that controls when VACUUM 
 or autovacuum should perform a full scan of the table to advance 
 relfrozenxid.

I'm still bothered by the fact that max and min really mean the same
thing here.

I don't think we can perfectly capture the meaning of these GUCs in the
name. I think our goal should be to avoid confusion between them.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solve a problem of LC_TIME of windows.

2009-01-09 Thread Magnus Hagander
Hiroshi Saito wrote:
 I thought this was supposed to be driven by LC_TIME now, not
 LC_MESSAGES.
 
 Uga, yes yes!
 
 HIROSHI=# set LC_TIME=Ja;
 SET
 HIROSHI=# select to_char(now(),'TMDay');
 to_char
 -
 土曜日
 (1 行)
 
 Thanks:-)

Great! Thanks for testing!

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Kevin Grittner
 Tom Lane t...@sss.pgh.pa.us wrote: 
 AFAICS, file-at-a-time WAL shipping
 is a stopgap implementation that will be dead as a doornail once the
 current efforts towards realtime replication are finished.
 
As long as there is a way to rsync log data to multiple targets not
running replicas, with compression because of low-speed WAN
connections, I'm happy.  Doesn't matter whether that is using existing
techniques or the new realtime techniques.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 13:22 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  Yes, we could make the archiver do this, but I see no big advantage over
  having it done externally. It's not faster, safer, easier. Not easier
  because we would want a parameter to turn it off when not wanted.
 
 And the other question to ask is how much effort and code should we be
 putting into the concept anyway.  AFAICS, file-at-a-time WAL shipping
 is a stopgap implementation that will be dead as a doornail once the
 current efforts towards realtime replication are finished.  There will
 still be some use for forced log switches in connection with backups,
 but that's not going to occur often enough to be important to optimize.

Agreed.

Half-filled WAL files were necessary to honour archive_timeout. With
continuous streaming all WAL files will be 100% full before we switch,
for most purposes.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, slot ids and stuff

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 11:20 +, Simon Riggs wrote:
 On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
  Simon Riggs wrote:
   On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
   Simon Riggs si...@2ndquadrant.com writes:
   On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
   When a backend dies with FATAL, it writes an abort record before 
   exiting.
  
   (I was under the impression it doesn't until few minutes ago myself, 
   when I actually read the shutdown code :-))
   Not in all cases; keep reading :-)
   If it doesn't, that's a bug.  A FATAL exit is not supposed to leave the
   shared state corrupted, it's only supposed to be a forcible session
   termination.  Any open transaction should be rolled back.
   
   Please look back at the earlier discussion we had on this exact point:
   http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php
  
  I think the running-xacts list we dump to WAL at every checkpoint is 
  enough to handle that. Just treat the dead transaction as in-progress 
  until the next running-xacts record. It's presumably extremely rare to 
  have a process die with FATAL, and not write an abort record.
 
 I agree, but I'll wait for Tom to speak further.

OK, will proceed without this. Time is pressing.

Heikki and I both agree that FATAL errors that fail to write abort
records are rare and an acceptable problem in real usage. If they do
occur, their nuisance factor is short-lived because of measures taken
within the patch.

Hot Standby does not *rely* upon there always an abort record for FATAL
errors, so we cannot reasonably say the current design would be
unacceptably fragile as I had once thought.

So based upon that, out comes the slotid concept and luckily much of the
cruftier aspects of the patch. Less code, probably fewer bugs. Good
thing.

I will produce a new v7 with those changes and merge the changes for v6b
also, so we can begin review again from there. 

Hi ho, hi ho...

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Maintaining patchset with GIT (was Re: Hot standby, RestoreBkpBlocks and cleanup locks)

2009-01-09 Thread Simon Riggs

On Fri, 2009-01-09 at 19:38 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
   Do you want me to start using the GIT repo to make it easier to 
 extract parts?
 
 It would be useful. 

Thanks, this is all good.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-09 Thread Bruce Momjian

Hiroshi, is this patch still needed?

---

Hiroshi Inoue wrote:
 Magnus Hagander wrote:
  On 25 nov 2008, at 05.00, Tom Lane t...@sss.pgh.pa.us wrote:
  
  Hiroshi Inoue in...@tpf.co.jp writes:
  Tom Lane wrote:
  If that's true then this code is presently broken for *every* locale
  under Windows, not only Japanese.
 
  Maybe there are a few languages/countires where 2 encodings are
  widely used.
 
  UTF8 vs Latin-N?
  
  We already special-cases utf8...
  
  I think the thing us that as long as the encodings are compatible 
  (latin1 with different names for example) it worked  fine.
  
   In any case I think the problem is that gettext is
  looking at a setting that is not what we are looking at.  Particularly
  with the 8.4 changes to allow per-database locale settings, this has
  got to be fixed in a bulletproof way.
 
 Attached is a new patch to apply bind_textdomain_codeset() to most
 server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
 and PG_EUC_JIS_2004. EUC-JP may be OK for EUC_JIS_2004.
 
 Unfortunately it's hard for Saito-san and me to check encodings
 other than EUC-JP.
 
 regards,
 Hiroshi Inoue

 *** mbutils.c.origSun Nov 23 08:42:57 2008
 --- mbutils.c Wed Nov 26 12:17:12 2008
 ***
 *** 822,830 
 --- 822,870 
   return clen;
   }
   
 + #ifdef  WIN32
 + static const struct codeset_map {
 + int encoding;
 + const char *codeset;
 + } codeset_map_array[] = {
 + {PG_UTF8, UTF-8},
 + {PG_LATIN1, LATIN1}, 
 + {PG_LATIN2, LATIN2},
 + {PG_LATIN3, LATIN3},
 + {PG_LATIN4, LATIN4},
 + {PG_ISO_8859_5, ISO-8859-5},
 + {PG_ISO_8859_6, ISO_8859-6},
 + {PG_ISO_8859_7, ISO-8859-7},
 + {PG_ISO_8859_8, ISO-8859-8},
 + {PG_LATIN5, LATIN5},
 + {PG_LATIN6, LATIN6},
 + {PG_LATIN7, LATIN7},
 + {PG_LATIN8, LATIN8},
 + {PG_LATIN9, LATIN-9},
 + {PG_LATIN10, LATIN10},
 + {PG_KOI8R, KOI8-R},
 + {PG_WIN1250, CP1250},
 + {PG_WIN1251, CP1251},
 + {PG_WIN1252, CP1252},
 + {PG_WIN1253, CP1253},
 + {PG_WIN1254, CP1254},
 + {PG_WIN1255, CP1255},
 + {PG_WIN1256, CP1256},
 + {PG_WIN1257, CP1257},
 + {PG_WIN1258, CP1258},
 + {PG_WIN866, CP866},
 + {PG_WIN874, CP874},
 + {PG_EUC_CN, EUC-CN},
 + {PG_EUC_JP, EUC-JP},
 + {PG_EUC_KR, EUC-KR},
 + {PG_EUC_TW, EUC-TW}};
 + #endif /* WIN32 */
 + 
   void
   SetDatabaseEncoding(int encoding)
   {
 + const char *target_codeset = NULL;
 + 
   if (!PG_VALID_BE_ENCODING(encoding))
   elog(ERROR, invalid database encoding: %d, encoding);
   
 ***
 *** 846,852 
*/
   #ifdef ENABLE_NLS
   if (encoding == PG_UTF8)
 ! if (bind_textdomain_codeset(postgres, UTF-8) == NULL)
   elog(LOG, bind_textdomain_codeset failed);
   #endif
   }
 --- 886,907 
*/
   #ifdef ENABLE_NLS
   if (encoding == PG_UTF8)
 ! target_codeset = UTF-8;
 ! #ifdef  WIN32
 ! else
 ! {
 ! int i;
 ! 
 ! for (i = 0; i  sizeof(codeset_map_array) / sizeof(struct 
 codeset_map); i++)
 ! if (codeset_map_array[i].encoding == encoding)
 ! {
 ! target_codeset = codeset_map_array[i].codeset;
 ! break;
 ! }
 ! }
 ! #endif /* WIN32 */
 ! if (target_codeset != NULL)
 ! if (bind_textdomain_codeset(postgres, target_codeset) == NULL)
   elog(LOG, bind_textdomain_codeset failed);
   #endif
   }

 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1

2009-01-09 Thread Jeff Davis
On Mon, 2008-12-22 at 13:47 -0600, Kenneth Marshall wrote: 
 Dear PostgreSQL developers,
 
 I am re-sending this to keep this last change to the
 internal hash function on the radar.
 

Hi Ken,

A few comments:

1. New patch with very minor changes attached.

2. I reverted the change you made to indices.sgml. We still don't use
WAL for hash indexes, and in my opinion we should continue to discourage
their use until we do use WAL. We can add back in the comment that hash
indexes are suitable for large keys if we have some results to show
that.

3. There was a regression test failure in union.sql because the ordering
of the results was different. I updated the regression test.

4. Hash functions affect a lot more than hash indexes, so I ran through
a variety of tests that use a HashAggregate plan. Test setup and results
are attached. These results show no difference between the old and the
new code (about 0.1% better).

5. The hash index build time shows some improvement. The new code won in
every instance in which a there were a lot of duplicates in the table
(100 distinct values, 50K of each) by around 5%.

The new code appeared to be the same or slightly worse in the case of
hash index builds with few duplicates (100 distinct values, 5 of
each). The difference was about 1% worse, which is probably just noise.

Note: I'm no expert on hash functions. Take all of my tests with a grain
of salt.

I would feel a little better if I saw at least one test that showed
better performance of the new code on a reasonable-looking distribution
of data. The hash index build that you showed only took a second or two
-- it would be nice to see a test that lasted at least a minute.

Regards,
Jeff Davis




test_results.tar.gz
Description: application/compressed-tar
diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c
index 96d5643..8a236b5 100644
--- a/src/backend/access/hash/hashfunc.c
+++ b/src/backend/access/hash/hashfunc.c
@@ -200,39 +200,94 @@ hashvarlena(PG_FUNCTION_ARGS)
  * hash function, see http://burtleburtle.net/bob/hash/doobs.html,
  * or Bob's article in Dr. Dobb's Journal, Sept. 1997.
  *
- * In the current code, we have adopted an idea from Bob's 2006 update
- * of his hash function, which is to fetch the data a word at a time when
- * it is suitably aligned.  This makes for a useful speedup, at the cost
- * of having to maintain four code paths (aligned vs unaligned, and
- * little-endian vs big-endian).  Note that we have NOT adopted his newer
- * mix() function, which is faster but may sacrifice some randomness.
+ * In the current code, we have adopted Bob's 2006 update of his hash
+ * which fetches the data a word at a time when it is suitably aligned.
+ * This makes for a useful speedup, at the cost of having to maintain
+ * four code paths (aligned vs unaligned, and little-endian vs big-endian).
+ * It also two separate mixing functions mix() and final(), instead
+ * of a slower multi-purpose function.
  */
 
 /* Get a bit mask of the bits set in non-uint32 aligned addresses */
 #define UINT32_ALIGN_MASK (sizeof(uint32) - 1)
+#define rot(x,k) (((x)(k)) | ((x)(32-(k
 
 /*--
  * mix -- mix 3 32-bit values reversibly.
- * For every delta with one or two bits set, and the deltas of all three
- * high bits or all three low bits, whether the original value of a,b,c
- * is almost all zero or is uniformly distributed,
- * - If mix() is run forward or backward, at least 32 bits in a,b,c
- *	 have at least 1/4 probability of changing.
- * - If mix() is run forward, every bit of c will change between 1/3 and
- *	 2/3 of the time.  (Well, 22/100 and 78/100 for some 2-bit deltas.)
+ *
+ * This is reversible, so any information in (a,b,c) before mix() is
+ * still in (a,b,c) after mix().
+ *
+ * If four pairs of (a,b,c) inputs are run through mix(), or through
+ * mix() in reverse, there are at least 32 bits of the output that
+ * are sometimes the same for one pair and different for another pair.
+ * This was tested for:
+ * * pairs that differed by one bit, by two bits, in any combination
+ *   of top bits of (a,b,c), or in any combination of bottom bits of
+ *   (a,b,c).
+ * * differ is defined as +, -, ^, or ~^.  For + and -, I transformed
+ *   the output delta to a Gray code (a^(a1)) so a string of 1's (as
+ *   is commonly produced by subtraction) look like a single 1-bit
+ *   difference.
+ * * the base values were pseudorandom, all zero but one bit set, or
+ *   all zero plus a counter that starts at zero.
+ * 
+ * This does not achieve avalanche.  There are input bits of (a,b,c)
+ * that fail to affect some output bits of (a,b,c), especially of a.  The
+ * most thoroughly mixed value is c, but it doesn't really even achieve
+ * avalanche in c. 
+ * 
+ * This allows some parallelism.  Read-after-writes are good at doubling
+ * the number of bits affected, so the goal of mixing pulls in the opposite
+ * direction as the goal of parallelism.  I did what 

Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1

2009-01-09 Thread Kenneth Marshall
On Fri, Jan 09, 2009 at 12:04:15PM -0800, Jeff Davis wrote:
 On Mon, 2008-12-22 at 13:47 -0600, Kenneth Marshall wrote: 
  Dear PostgreSQL developers,
  
  I am re-sending this to keep this last change to the
  internal hash function on the radar.
  
 
 Hi Ken,
 
 A few comments:
 
 1. New patch with very minor changes attached.
 
 2. I reverted the change you made to indices.sgml. We still don't use
 WAL for hash indexes, and in my opinion we should continue to discourage
 their use until we do use WAL. We can add back in the comment that hash
 indexes are suitable for large keys if we have some results to show
 that.
 
 3. There was a regression test failure in union.sql because the ordering
 of the results was different. I updated the regression test.
 
 4. Hash functions affect a lot more than hash indexes, so I ran through
 a variety of tests that use a HashAggregate plan. Test setup and results
 are attached. These results show no difference between the old and the
 new code (about 0.1% better).
 
 5. The hash index build time shows some improvement. The new code won in
 every instance in which a there were a lot of duplicates in the table
 (100 distinct values, 50K of each) by around 5%.
 
 The new code appeared to be the same or slightly worse in the case of
 hash index builds with few duplicates (100 distinct values, 5 of
 each). The difference was about 1% worse, which is probably just noise.
 
 Note: I'm no expert on hash functions. Take all of my tests with a grain
 of salt.
 
 I would feel a little better if I saw at least one test that showed
 better performance of the new code on a reasonable-looking distribution
 of data. The hash index build that you showed only took a second or two
 -- it would be nice to see a test that lasted at least a minute.
 
 Regards,
   Jeff Davis
 
 

Jeff,

Thanks for the review. I would not really expect any differences in hash
index build times other than normal noise variances. The most definitive
benchmark that I have seen was done with my original patch submission
in March posted by Luke of Greenplum:

  We just applied this and saw a 5 percent speedup on a hash aggregation
   query with four columns in a 'group by' clause run against a single
   TPC-H table.

I wonder if they would be willing to re-run their test? Thanks again.

Regards,
Ken


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Greg Smith

On Fri, 9 Jan 2009, Simon Riggs wrote:


Half-filled WAL files were necessary to honour archive_timeout. With
continuous streaming all WAL files will be 100% full before we switch,
for most purposes.


The main use case I'm concerned about losing support for is:

1) Two systems connected by a WAN with significant transmit latency
2) The secondary system runs a warm standby aimed at disaster recovery
3) Business requirements want the standby to never be more than (say) 5 
minutes behind the primary, presuming the WAN is up

4) WAN traffic is expensive (money==bandwidth, one of the two is scarce)

This seems a pretty common scenario in my experience.  Right now, this 
case is served quite well like this:


-archive_timeout='5 minutes'
-[pglesslog|pg_clearxlogtail] | gzip | rsync

The main concern I have with switching to a more synchronous scheme is 
that network efficiency drops as the payload breaks into smaller pieces. 
I haven't had enough time to keep up with all the sync rep advances 
recently to know for sure if there's a configuration there that's suitable 
for this case.  If that can be configured to send only in relatively large 
chunks, while still never letting things lag too far behind, then I'd 
agree completely that the case for any of these WAL cleaner utilities is 
dead--presuming said support makes it into the next release.


If that's not available, say because the only useful option sends in very 
small pieces, there may still be a need for some utility to fill in for 
this particular requirement.  Luckily there are many to choose from if it 
comes to that.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New patch for Column-level privileges

2009-01-09 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 Please test, comment, etc.

A few random comments based on a very fast scan through the patch:

The RTE fields really ought to be bitmaps not integer lists.  The
list representation is expensive to store, copy, etc.  You could use
the same approach the HOT patch used, ie offset the indexes by
FirstLowInvalidHeapAttributeNumber (cf pull_varattnos).

Patch is desperately lacking comments surrounding the altered meaning
of ATTRIBUTE_TUPLE_SIZE, the fact that TupleDescs no longer contain
complete copies of pg_attribute rows, etc.  It might be a good idea
to rename ATTRIBUTE_TUPLE_SIZE to ATTRIBUTE_TUPLE_FIXED_PART_SIZE
or something like that.

Don't like exporting allocacl() from acl.c nor having aclchk.c be so
intimate with the internal representation of ACLs.  Seems like the
operations actually needed could be represented a bit more abstractly,
ie copy an ACL or merge two ACLs.

applyColumnPrivs is misnamed as it's not applying any privileges nor
indeed doing much of anything directly to do with privileges.  It should
probably be named something more like findReferencedColumns.  And what's
with the special exception for SortGroupClause!?

Actually, though, you probably shouldn't have applyColumnPrivsWalker at all.
It requires an additional traversal of the parse tree, and an additional RTE
search for each var, for no good reason.  Can't we just mark the column
as referenced in make_var() and maybe a couple other places that already have
their hands on the RTE?

I don't see anything in transformDeleteStatement to handle the
fact that DELETE ... WHERE foo = 42 requires select on foo.

In the gram.y changes, don't treat CREATE differently from the other
cases.  You need a test and error in the statement execution code anyway
for the case of a privilege type inappropriately applied to columns, and
it's better to throw that very specific error message than the generic
syntax error that bison is going to throw for CREATE (column list).

The comment added to InsertPgAttributeTuple seems not to belong to it
(copy/paste error?)

What's the point of disallowing column privileges on a sequence?  (aclcheck.c
line 800 or so) It's not nonsensical to want to restrict what someone can do
in SELECT * FROM sequence.

pg_attribute_aclmask seems to need a rethink.  I don't like the amount
of policy copied-and-pasted from pg_class_aclmask, nor the fact that
it will fail for system attributes (attnum  0), nor the fact that it
insists on looping over the attributes even if the relation privileges
would provide what's needed.  (Indeed, you shouldn't need that merge
ACLs operation at all -- you could be ORing a couple of bitmasks
instead, no?)

I don't find what you've done with no_priv_msg[] to be particularly
comfortable: if anyone ever tried to print an ACL_KIND_COLUMN error
with the regular code path (hardly unlikely) you'd get a core dump
due to the format wanting two %s arguments with only one supplied.
I think the best thing is for no_priv_msg[] to include just
+   gettext_noop(permission denied for column %s),
and then make aclcheck_error_col() use its own error text rather than
pulling from the array.

Don't like naming of Priv node type, it's way too nonspecific.
Also, you need outfuncs.c support for it, see comment at head of
that file.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-09 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Hiroshi, is this patch still needed?

This patch is for a problem that's entirely separate from the LC_TIME
issue, if that's what you were wondering.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Kevin Grittner
 Greg Smith gsm...@gregsmith.com wrote: 
 The main use case I'm concerned about losing support for is:
 
 1) Two systems connected by a WAN with significant transmit latency
 2) The secondary system runs a warm standby aimed at disaster
recovery
 3) Business requirements want the standby to never be more than (say)
5 
 minutes behind the primary, presuming the WAN is up
 4) WAN traffic is expensive (money==bandwidth, one of the two is
scarce)
 
 This seems a pretty common scenario in my experience.  Right now,
this 
 case is served quite well like this:
 
 -archive_timeout='5 minutes'
 -[pglesslog|pg_clearxlogtail] | gzip | rsync
 
You've come pretty close to describing our environment, other than
having 72 primaries each using rsync to push the WAL files to another
server at the same site while a server at the central site uses rsync
to pull them back.  We don't run warm standby on the backup server at
the site of origin, and don't want to have to do so.
 
It is critically important that the flow of xlog data never hold up
the primary databases, and that failure to copy xlog to either of the
targets not interfere with copying to the other.  (We have WAN
failures surprising often, sometimes for days at a time, and the
backup server on-site is in the same rack of the same cabinet as the
database server.)
 
Compression of xlog data is important not only for WAN transmission,
but for storage space.  We keep two weeks of WAL files to allow
recovery from either of the last two weekly backups, and we archive
the first weekly backup of each month, with the WAL files needed for
recovery, for one year.
 
So it appears we care about somewhat similar issues.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-09 Thread Hiroshi Inoue

Bruce Momjian wrote:

Hiroshi, is this patch still needed?


Yes though it should be slightly changed now.
*set lc_messages does not work* issue isn't directly related to this
 issue.

regards,
Hiroshi Inoue

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1

2009-01-09 Thread Jeff Davis
On Fri, 2009-01-09 at 14:29 -0600, Kenneth Marshall wrote:
 Jeff,
 
 Thanks for the review. I would not really expect any differences in hash
 index build times other than normal noise variances. The most definitive
 benchmark that I have seen was done with my original patch submission
 in March posted by Luke of Greenplum:
 
   We just applied this and saw a 5 percent speedup on a hash aggregation
query with four columns in a 'group by' clause run against a single
TPC-H table.
 
 I wonder if they would be willing to re-run their test? Thanks again.

Separating mix() and final() should have some performance benefit,
right?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer pool statistics in Explain Analyze

2009-01-09 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 No, I think you misunderstood me entirely.  The reason that I rejected
 those parts of the patch is that I think the statistics that are
 available are wrong/useless.  If the bufmgr.c counters were really good
 for something they'd have been exposed long since (and we'd probably
 never have built a lot of the other stats collection infrastructure).

The collective stats across the whole cluster and the individual stats for a
specific query broken down by plan node are complementary. Depending on the
circumstance people sometimes need each.

I actually also wrote a patch exposing this same data. I think the bufmgr
counters are flawed but still useful. Just as an example think of how often
you have to explain why a sequential scan of a small table can be faster than
an index scan. Seeing the index scan actually require more logical buffer
fetches than a sequential scan would go a long way to clearing up that
confusion. Better yet, users would be in a position to see whether the planner
is actually estimating i/o costs accurately.

 The EXPLAIN ANALYZE code you submitted is actually kinda cute, and
 I'd have had no problem with it if I thought it were displaying
 numbers that were useful and unlikely to be obsoleted in future
 releases.  The work that needs to be done is on collecting the
 numbers more than displaying them.

I agree that we need more data -- my favourite direction is to use a
programmatic interface to dtrace to find out how many buffer reads are
satisfied from filesystem cache and how many from physical reads. But when we
do that doesn't obviate the need for these stats, it would enhance them. You
would get a clear view of how many buffer fetches were satisfied from shared
buffers, filesystem cache, and physical reads.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 2 small patches that fix 8.3.5 compile issues on Vista+MingW+Msys

2009-01-09 Thread Bruce Momjian

Uh, do we still need this patch?

---

Charlie Savage wrote:
 Charlie Savage wrote:
  A couple of months ago I noted that 8.3.4 doesn't compile on Vista using 
  MingW+msys under certain conditions:
  
  http://archives.postgresql.org/pgsql-hackers/2008-09/msg01496.php
  
  8.3.5 has the same problem.
  
  Attached are two one line patches that fix it.
 
 Sorry, I attached incorrect patches in the original email.  Here are the 
 correct ones.
 
 Thanks,
 
 Charlie
 -- 
 Charlie Savage
 http://cfis.savagexi.com

 *** libpq-be.h.oldWed Nov  5 23:32:50 2008
 --- libpq-be.hWed Nov  5 23:35:52 2008
 ***
 *** 47,52 
 --- 47,53 
   
   #ifdef ENABLE_SSPI
   #define SECURITY_WIN32
 + #include ntsecapi.h
   #include security.h
   #undef SECURITY_WIN32
   
 
 *** libpq-int.h.old   Wed Nov  5 23:37:48 2008
 --- libpq-int.h   Wed Nov  5 23:38:14 2008
 ***
 *** 54,59 
 --- 54,60 
   
   #ifdef ENABLE_SSPI
   #define SECURITY_WIN32
 + #include ntsecapi.h
   #include security.h
   #undef SECURITY_WIN32
   
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #4516: FOUND variable does not work after RETURN QUERY

2009-01-09 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Uh, is this ready to be applied?

I don't think any consensus has been reached on changing this behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] per-database locale: createdb switches

2009-01-09 Thread Bruce Momjian
Heikki Linnakangas wrote:
 Alvaro Herrera wrote:
  Tom Lane wrote:
  Alvaro Herrera alvhe...@commandprompt.com writes:
  Alvaro Herrera wrote:
  I like Teodor's proposal; I'll see about implementing that.
  Attached.
  You missed updating the sgml docs, and personally I'd be inclined to
  list -l before the individual --lc switches; otherwise it looks fine.
  
  Thanks, committed that way.  I noticed that --lc-ctype and --lc-collate
  were forgotten in SGML docs, so I added them too.
 
 Should we have a shorthand CREATE DATABASE option like that as well?

createdb is really about convenience;  not sure it is warranted for
CREATE DATABASE.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Greg Smith

On Fri, 9 Jan 2009, Aidan Van Dyk wrote:


*I* didn't see an easy way to get at the written size later on in the
chain (i.e. in the actual archiving), so I took the path of least
resitance.


I was hoping it might fall out of the other work being done in that area, 
given how much that code is still being poked at right now.  As Hannu 
pointed out, from a conceptual level you just need to carry along the same 
information that pg_current_xlog_location() returns to the archiver on all 
the paths where a segment might end early.



If I wrapped this zeroing in GUC, people could choose wether to pay the
penalty or not, would that satisfy anyone?


Rather than creating a whole new GUC, it might it be possible to turn 
archive_mode into an enum setting:  off, on, and cleaned as the modes 
perhaps.  That would avoid making a new setting, with the downside that a 
bunch of critical code would look less clear than it does with a boolean.



Again, *I* think that the force_switch case is going to happen when the
admin's quite happy to pay that penalty...  But obviously not
everyone...


I understand the case you've made for why it doesn't matter, and for 
almost every case you're right.  The situation it may be vulnerable to is 
where a burst of transactions come in just as the archive timeout expires 
after minimal WAL activity.  There I think you can end up with a bunch of 
clients waiting behind an almost full zero fill operation, which pushes up 
the worst-case latency.  I've been able to measure the impact of the 
similar case where zero-filling a brand new segment can impact things; 
this would be much less like to happen because the timing would have to 
line up just wrong, but I think it's still possible.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1389)

2009-01-09 Thread Alvaro Herrera
Tom Lane wrote:

 I guess I'm still wondering which part of this actually needs to be
 hand-coded so that it can be flexible.  I'm envisioning the whole
 loop replaced by something like
 
   FillRelOptions((void *) rdopts, options, constanttable);
 
 where the constant table contains entries like
 
   { fillfactor, RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor) }

I attach a patch that does things this way (it includes the btree test
code because I'm too lazy right now to strip it out).

I'm not really sure about removing the other macros completely, because
they would be useful whenever one wanted to create something
nonstandard.


 BTW, are we just assuming that there's never a possibility of no match?
 It seems like there ought to be an elog complaint if you get to the
 bottom of the loop; which again is something I don't see the point of
 writing out each time.

We need to be quiet about it when not validating, I think.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Index: src/backend/access/common/reloptions.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/access/common/reloptions.c,v
retrieving revision 1.17
diff -c -p -r1.17 reloptions.c
*** src/backend/access/common/reloptions.c	8 Jan 2009 19:34:41 -	1.17
--- src/backend/access/common/reloptions.c	9 Jan 2009 23:40:36 -
***
*** 34,43 
   * To add an option:
   *
   * (i) decide on a class (integer, real, bool, string), name, default value,
!  * upper and lower bounds (if applicable).
!  * (ii) add a record below.
!  * (iii) add it to StdRdOptions if appropriate
!  * (iv) add a block to the appropriate handling routine (probably
   * default_reloptions)
   * (v) don't forget to document the option
   *
--- 34,44 
   * To add an option:
   *
   * (i) decide on a class (integer, real, bool, string), name, default value,
!  * upper and lower bounds (if applicable); for string types, consider a 
!  * validation routine.
!  * (ii) add a record below (or use add_type_reloption).
!  * (iii) add it to the appropriate options struct (perhaps StdRdOptions)
!  * (iv) add it to the appropriate handling routine (perhaps
   * default_reloptions)
   * (v) don't forget to document the option
   *
*** parse_one_reloption(relopt_value *option
*** 762,767 
--- 763,854 
  		pfree(value);
  }
  
+ void *
+ allocateReloptStruct(Size base, relopt_value *options, int numoptions)
+ {
+ 	Size	size = base;
+ 	int		i;
+ 
+ 	for (i = 0; i  numoptions; i++)
+ 		if (options[i].gen-type == RELOPT_TYPE_STRING)
+ 			size += GET_STRING_RELOPTION_LEN(options[i]) + 1;
+ 
+ 	return palloc0(size);
+ }
+ 
+ void
+ fillRelOptions(void *rdopts, Size basesize, relopt_value *options,
+ 			   int numoptions, bool validate, reloptParseElem *elems,
+ 			   int numelems)
+ {
+ 	int		i;
+ 	int		offset = basesize;
+ 
+ 	for (i = 0; i  numoptions; i++)
+ 	{
+ 		int		j;
+ 		bool	found = false;
+ 
+ 		for (j = 0; j  numelems; j++)
+ 		{
+ 			if (pg_strcasecmp(options[i].gen-name, elems[j].optname) == 0)
+ 			{
+ relopt_string *optstring;
+ char   *itempos = ((char *) rdopts) + elems[j].offset;
+ char   *string_val;
+ 
+ switch (options[i].gen-type)
+ {
+ 	case RELOPT_TYPE_BOOL:
+ 		*(bool *) itempos = options[i].isset ?
+ 			options[i].values.bool_val :
+ 			((relopt_bool *) options[i].gen)-default_val;
+ 		break;
+ 	case RELOPT_TYPE_INT:
+ 		*(int *) itempos = options[i].isset ?
+ 			options[i].values.int_val :
+ 			((relopt_int *) options[i].gen)-default_val;
+ 		break;
+ 	case RELOPT_TYPE_REAL:
+ 		*(double *) itempos = options[i].isset ?
+ 			options[i].values.real_val :
+ 			((relopt_real *) options[i].gen)-default_val;
+ 		break;
+ 	case RELOPT_TYPE_STRING:
+ 		optstring = (relopt_string *) options[i].gen;
+ 		if (options[i].isset)
+ 			string_val = options[i].values.string_val;
+ 		else if (!optstring-default_isnull)
+ 			string_val = optstring-default_val;
+ 		else
+ 			string_val = NULL;
+ 
+ 		if (string_val == NULL)
+ 			*(int *) itempos = 0;
+ 		else
+ 		{
+ 			strcpy((char *) rdopts + offset, string_val);
+ 			*(int *) itempos = offset;
+ 			offset += strlen(string_val) + 1;
+ 		}
+ 		break;
+ 	default:
+ 		elog(ERROR, unrecognized reloption type %c,
+ 			 options[i].gen-type);
+ 		break;
+ }
+ found = true;
+ break;
+ 			}
+ 		}
+ 		if (validate  !found)
+ 			elog(ERROR, storate parameter \%s\ not found in parse table,
+  options[i].gen-name);
+ 	}
+ 	SET_VARSIZE(rdopts, offset);
+ }
+ 
+ 
  /*
   * Option parser for anything that uses StdRdOptions (i.e. fillfactor only)
   */
*** default_reloptions(Datum reloptions, boo
*** 770,779 
  {
  	relopt_value   *options;
  	StdRdOptions   

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1389)

2009-01-09 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Tom Lane wrote:
 
  I guess I'm still wondering which part of this actually needs to be
  hand-coded so that it can be flexible.  I'm envisioning the whole
  loop replaced by something like
  
  FillRelOptions((void *) rdopts, options, constanttable);
  
  where the constant table contains entries like
  
  { fillfactor, RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor) }
 
 I attach a patch that does things this way (it includes the btree test
 code because I'm too lazy right now to strip it out).

The irony of doing things this way is that we've come full-circle from
the original coding of these routines (the main difference being that
the default values and checks no longer need to be written as code, but
rather as table entries).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Aidan Van Dyk
* Greg Smith gsm...@gregsmith.com [090109 18:39]:

 I was hoping it might fall out of the other work being done in that area, 
 given how much that code is still being poked at right now.  As Hannu  
 pointed out, from a conceptual level you just need to carry along the 
 same information that pg_current_xlog_location() returns to the archiver 
 on all the paths where a segment might end early.

I was(am) also hoping that somethig falls out of sync-rep that gives me
better PITR backups (better than a small archive_timeout)... That hope
is what made me abandon this patch after the initial feedback.

 Rather than creating a whole new GUC, it might it be possible to turn  
 archive_mode into an enum setting:  off, on, and cleaned as the modes  
 perhaps.  That would avoid making a new setting, with the downside that a 
 bunch of critical code would look less clear than it does with a boolean.

I'm content to wait and see what falls out of sync-rep stuff...

... for now ... 

 I understand the case you've made for why it doesn't matter, and for  
 almost every case you're right.  The situation it may be vulnerable to is 
 where a burst of transactions come in just as the archive timeout expires 
 after minimal WAL activity.  There I think you can end up with a bunch of 
 clients waiting behind an almost full zero fill operation, which pushes 
 up the worst-case latency.  I've been able to measure the impact of the  
 similar case where zero-filling a brand new segment can impact things;  
 this would be much less like to happen because the timing would have to  
 line up just wrong, but I think it's still possible.

Ya, and it's one of just many of the times PG hits these worst-latency
spikes ;-)  GEnerally, it's *very* good... and once in a while, when all
the stars line up correctly, you get these spikes

But even with these spikes, it's plenty fast enough for the stuff I've
done...

a.
-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] New patch for Column-level privileges

2009-01-09 Thread Stephen Frost
Tom, et al,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 A few random comments based on a very fast scan through the patch:

Thanks for the feedback!

 The RTE fields really ought to be bitmaps not integer lists.  The
 list representation is expensive to store, copy, etc.  You could use
 the same approach the HOT patch used, ie offset the indexes by
 FirstLowInvalidHeapAttributeNumber (cf pull_varattnos).

Done (was actually easier than I expected it to be).

 Patch is desperately lacking comments surrounding the altered meaning
 of ATTRIBUTE_TUPLE_SIZE, the fact that TupleDescs no longer contain
 complete copies of pg_attribute rows, etc.  It might be a good idea
 to rename ATTRIBUTE_TUPLE_SIZE to ATTRIBUTE_TUPLE_FIXED_PART_SIZE
 or something like that.

Done.

 Don't like exporting allocacl() from acl.c nor having aclchk.c be so
 intimate with the internal representation of ACLs.  Seems like the
 operations actually needed could be represented a bit more abstractly,
 ie copy an ACL or merge two ACLs.

Done.

 applyColumnPrivs is misnamed as it's not applying any privileges nor
 indeed doing much of anything directly to do with privileges.  It should
 probably be named something more like findReferencedColumns.  And what's
 with the special exception for SortGroupClause!?

I'm not sure what the story with SortGroupClause is..  Might just have
been a bit more difficult to do.  KaiGai?

 Actually, though, you probably shouldn't have applyColumnPrivsWalker at all.
 It requires an additional traversal of the parse tree, and an additional RTE
 search for each var, for no good reason.  Can't we just mark the column
 as referenced in make_var() and maybe a couple other places that already have
 their hands on the RTE?

I certainly like this idea and I'll look into it, but it might take me a
bit longer to find the appropriate places beyond make_var().

 I don't see anything in transformDeleteStatement to handle the
 fact that DELETE ... WHERE foo = 42 requires select on foo.

I've fixed this (I believe).

 In the gram.y changes, don't treat CREATE differently from the other
 cases.  You need a test and error in the statement execution code anyway
 for the case of a privilege type inappropriately applied to columns, and
 it's better to throw that very specific error message than the generic
 syntax error that bison is going to throw for CREATE (column list).

Done.

 The comment added to InsertPgAttributeTuple seems not to belong to it
 (copy/paste error?)

Fixed.

 What's the point of disallowing column privileges on a sequence?  (aclcheck.c
 line 800 or so) It's not nonsensical to want to restrict what someone can do
 in SELECT * FROM sequence.

I've removed the offending code but to be honest I'm a bit nervous that
there are other parts of the code that aren't expecting a sequence and
may have to be changed.

 pg_attribute_aclmask seems to need a rethink.  I don't like the amount
 of policy copied-and-pasted from pg_class_aclmask, nor the fact that
 it will fail for system attributes (attnum  0), nor the fact that it
 insists on looping over the attributes even if the relation privileges
 would provide what's needed.  (Indeed, you shouldn't need that merge
 ACLs operation at all -- you could be ORing a couple of bitmasks
 instead, no?)

I'll have to think of the entry points for pg_attribute_aclmask.  In
general, we shouldn't ever get to it if the relation privileges are
sufficient but I think it's exposed to the user at some level, where
we would be wrong to say a user doesn't have rights on the column
when they do have the appropriate table-level rights.  Unfortunately,
aclmask() uses information beyond the bitmasks (who granted them,
specifically) so I don't think we can just OR the bitmasks.

With regard to looping through the attributes, I could split it up into
two functions, would that be better?  A function that handles
all-attribute cases (either 'AND'd or 'OR'd together depending on what
the caller wants) could be added pretty easily and then
pg_attribute_aclmask could be reverted to just handling a single
attribute at a time (which would fix it for system attributes as
well..).

 I don't find what you've done with no_priv_msg[] to be particularly
 comfortable: if anyone ever tried to print an ACL_KIND_COLUMN error
 with the regular code path (hardly unlikely) you'd get a core dump
 due to the format wanting two %s arguments with only one supplied.
 I think the best thing is for no_priv_msg[] to include just
 + gettext_noop(permission denied for column %s),
 and then make aclcheck_error_col() use its own error text rather than
 pulling from the array.

Done.

 Don't like naming of Priv node type, it's way too nonspecific.
 Also, you need outfuncs.c support for it, see comment at head of
 that file.

Done.

Updated patch attached.

Thanks!

Stephen


colprivs_2009010902.diff.gz
Description: Binary data


signature.asc
Description: Digital signature