Re: [HACKERS] Move src/tools/backend/ to wiki

2008-09-15 Thread Peter Eisentraut

Alvaro Herrera wrote:

I think getting rid of the FAQ completely is not necessarily a good
idea; it seems useful as a collection of interesting questions.  Moving
the contents to new pages is probably OK.  Also, as the answers mature
on the Wiki, perhaps it'd be possible to move them to the SGML docs (and
reduce the Wiki answer to just a pointer).


Yes, that's about the right plan.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Simon Riggs

On Sat, 2008-09-13 at 10:59 +0300, Heikki Linnakangas wrote:

 The importance of the WAL will only increase as more people start to
 use it for PITR, replication etc.

Agreed.

 The 2nd use case, however, I find pretty unconvincing. I can't think of 
 a good example of that. Anything that needs to define its own resource 
 manager is very low-level stuff, and probably needs to go into the core 
 anyway.

New indexes are a big one, but I listed others also.

Indexes have always been able to be added dynamically. Now they can be
recovered correctly as well.

Other data structures can be maintained by trigger code that writes new
types of WAL. That was always possible before, now they can be
recoverable too.

If we have extensible functions, triggers, indexes, why not WAL? What is
the problem with making WAL extensible? It carries no penalty at all for
standard WAL records, since the internal design for WAL already caters
for exactly this.

 So, let's focus on the 1st use case. 

No, lets look at both...you can't just wave away half the use cases. If
you look at all of the use cases the argument for doing it externally
quickly falls apart since it severely limits what can be achieved.

 I think a better approach for that 
 is to implement the filters as external programs, like pglesslog. It 
 allows more flexibility, although it also means that you can't rely on 
 existing backend functions to manipulate the WAL. I'd love to see a WAL 
 toolkit on pgfoundry, with tools like the filter to only restore a 
 single database, pglesslog, a WAL record viewer etc. A while ago, you 
 also talked about replacing the Slony triggers in the master with a tool 
 that reads the WAL; another good example of an external tool that needs 
 to read WAL. The toolkit could provide some sort of a framework and 
 common user interface to read and write WAL files for all those tools.

This patch provides exactly the toolkit you describe, just internally. 

As you point out, doing it other ways means you can't access internal
functions easily and can't maintain internal data structures correctly
either. So doing it externally is *not* a substitute and this is not a
simple discussion of include/exclude from core.

I'm lost as to why suggesting we limit the functionality is going to be
a good thing? If external tools really are so good, then we can do that
*as well*.

But this is only a plugin API, so the tools will be developed externally
anyway.

 A while ago, you 
 also talked about replacing the Slony triggers in the master with a tool 
 that reads the WAL

Writes the WAL you mean? Slony triggers could write data to WAL rather
than log tables and the slon daemon can be implemented as an rmgr
plugin. Or many other options.

 Another tool like that is pglesslog, although that 
 one couldn't actually be implemented with these hooks. 

Sounds like we'll want to integrate that into synch replication some
how, so suggestions as to how to do that welcome - if you're not already
doing it via some other plugin in synch rep code?

 I'm sure there's 
 more tricks like that people would find useful, if the tools existed. 

Agreed. So lets make them exist.


If there's an argument against doing this, I've not heard it made
clearly by anybody. When we discussed it first on hackers there was no
objection, so I wrote the patch. If people want to see this blocked now,
we need some good reasons. 

I've got nothing riding on the acceptance of this patch, I just think
its a good thing. That's why I deprioritised it. If there's some hidden
threat to national security or whatever, tell me off list.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Simon Riggs

On Sat, 2008-09-13 at 10:48 +0100, Florian G. Pflug wrote:

 The main idea was to invert the meaning of the xid array in the snapshot
 struct - instead of storing all the xid's between xmin and xmax that are
 to be considering in-progress, the array contained all the xid's 
 xmin that are to be considered completed.

 The downside is that the size of the read-only snapshot is theoretically
 unbounded, which poses a bit of a problem if it's supposed to live
 inside shared memory...

Why do it inverted? That clearly has problems.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Autovacuum and Autoanalyze

2008-09-15 Thread Simon Riggs
Disabling autovacuum can have catastrophic effects, since it disables
the ANALYZing of tables.

Can we have a mode where we disable autoVACUUM yet enable autoANALYZE?

ANALYZE times are fairly bounded because of the way we do sampling.
VACUUM times are not bounded at all, and typically  O(n). So it makes
sense to switch off the VACUUM at certain times, but never good to
switch off ANALYZE.

While we're there, it would be useful if CREATE TABLE AS SELECT was
followed by an automatic ANALYZE. Especially important for temp tables.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes:

 On Sat, 2008-09-13 at 10:59 +0300, Heikki Linnakangas wrote:

 The 2nd use case, however, I find pretty unconvincing. I can't think of 
 a good example of that. Anything that needs to define its own resource 
 manager is very low-level stuff, and probably needs to go into the core 
 anyway.

 New indexes are a big one, but I listed others also.

 Indexes have always been able to be added dynamically. Now they can be
 recovered correctly as well.

Hm, so currently if you want to add a new indexam you can't just insert into
pg_am and make them recoverable. You basically have to build in your new index
access method into Postgres with the new rmgr. That is annoying and a problem
worth tackling.

But I'm a bit worried about having this be an external plugin. There's no way
to looking at a WAL file to know whether it will be recoverable with the
plugins available. Worse, there's a risk you could have a plugin but not the
*right* plugin. Perhaps this could be tackled simply by having startup insert
a record listing all the rmgr's in use with identifying information and their
version numbers.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-15 Thread Tatsuo Ishii
  * Single Evaluation:
  
with
  foo(i) as (select random() as i)
select * from foo union all select * from foo;
 i
---
 0.233165248762816
  0.62126633618027
(2 rows)
  
The standard specifies that non-recursive WITH should be evaluated
once.
 
 What shall we do? I don't think there's an easy way to fix this as Tom
 suggested. Maybe we should not allow WITH clause without RECURISVE for
 8.4?

This is a still remaing issue...

  * Binary recursion and subselect strangeness:
  
with recursive foo(i) as
  (values (1)
  union all
  select * from
(select i+1 from foo where i  10
union all
select i+1 from foo where i  X) t)
select * from foo;
  
Produces 10 rows of output regardless of what X is. This should be
  fixed for 8.4.
Also, this is non-linear recursion, which the standard seems to
  disallow.
 
 I will try to fix this. However detecting the query being not a
 non-linear one is not so easy.

I have implemented rejection of non-linear recursion and now this type
of query will not be executed anyway.

  * Multiple recursive references:
  
with recursive foo(i) as
  (values (1)
  union all
  select i+1 from foo where i  10
  union all
  select i+1 from foo where i  20)
select * from foo;
ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
  recursive query
  
If we're going to allow non-linear recursion (which the standard
does not), this seems like it should be a valid case.
 
 I will try to disallow this.

Non-linear recursion is not allowed now.

  * Strange result with except:
  
with recursive foo(i) as
  (values (1)
  union all
  select * from
  (select i+1 from foo where i  10
  except
  select i+1 from foo where i  5) t)
select * from foo;
ERROR:  table foo has 0 columns available but 1 columns specified
  
This query works if you replace except with union. This should be
  fixed for 8.4.
 
 I will try to fix this.

This is a non-linear recursion too and will not be executed anyway.

  * Aggregates allowed:
  
with recursive foo(i) as
  (values(1)
  union all
  select max(i)+1 from foo where i  10)
select * from foo;
  
Aggregates should be blocked according to the standard.
Also, causes an infinite loop. This should be fixed for 8.4.
 
 I will try to fix this.

Fixed.

  * DISTINCT should supress duplicates:
  
with recursive foo(i) as
  (select distinct * from (values(1),(2)) t
  union all
  select distinct i+1 from foo where i  10)
select * from foo;
  
This outputs a lot of duplicates, but they should be supressed
  according to the standard. This query is essentially the same as
  supporting UNION for recursive queries, so we should either fix both for
  8.4 or block both for consistency.
 
 I'm not sure if it's possible to fix this. Will look into.

Ok, now this type of DISTINCT is not allowed.

Included is the latest patches against CVS HEAD.
--
Tatsuo Ishii
SRA OSS, Inc. Japan


recursive_query.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Mechanism to transmit multiple event notifications with one signal

2008-09-15 Thread Fujii Masao
Hi,

On Fri, Sep 12, 2008 at 12:41 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Markus Wanner [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Sooner or later we shall have to bite the bullet and set up a
 multiplexing system to transmit multiple event types to backends with
 just one signal.  We already did it for signals to the postmaster.

 Agreed. However, it's non-trivial if you want reliable queues (i.e. no
 message skipped, as with signals) for varying message sizes.

 No, that's not what I had in mind at all, just the ability to deliver
 one of a specified set of event notifications --- ie, get around the
 fact that Unix only gives us two user-definable signal types.

 For signals sent from other backends, it'd be sufficient to put a
 bitmask field into PGPROC entries, which the sender could OR bits into
 before sending the one real signal event (either SIGUSR1 or SIGUSR2).

 I'm not sure what to do if we need signals sent from processes that
 aren't connected to shared memory; but maybe we need not cross that
 bridge here.

 (Also, I gather that the Windows implementation could already support
 a bunch more signal types without much trouble.)

There was discussion of event notification to backends from other process
in the thread, Synchronous Log Shipping Replication
(http://archives.postgresql.org/pgsql-hackers/2008-09/msg00802.php).

The problem is that it's difficult to define new event notification using signal
to backends since they already use SIGUSR1 and SIGUSR2. Attached is
a WIP patch for the mechanism to transmit multiple event notifications with
one signal. This mechanism would be useful for various uses, at least
I would use this for implementing synchronous replication.

In this patch,
* I put a bitmask field into PGPROC for the communication of event type
   as advised by Tom in the above thread.
* I replaced CatchupInterruptHandler() by new signal handler for SIGUSR1,
   adjusted the codes around catchup interrupt to this mechanism.
* remaining work is adjusting the comments around catchup interrupt.

Comments welcome.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


multiplexer_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.3.3 compiler warnings with gcc 4.3

2008-09-15 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= [EMAIL PROTECTED] writes:
 I'm getting these on Fedora-9:
 tqual.c:115: warning: inlining failed in call to ‘SetHintBits’: call is

 They're just cosmetic.  We don't generally worry about fixing cosmetic
 warnings in back branches.

Are they? It seems like these were marked inline for a reason. SetHintBits has
at least one conditional in it which can often be optimized out (it's either
InvalidTransactionId or else has been tested already).

In the tuplestore case there's not much point in having that function if it's
not going to be inlined. I'm imagining that it was put in because someone did
verify that the non-inlined version was consuming significant time. -- Perhaps
a bit assumption though.

Also, is 8.3 really a back branch? It's the current release and anyone
downloading Postgres from source on the web site will be getting these
warnings -- and there are a lot of them. Enough that it looks like something's
gone wrong with the build.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Simon Riggs

On Mon, 2008-09-15 at 10:47 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  On Sat, 2008-09-13 at 10:59 +0300, Heikki Linnakangas wrote:
 
  The 2nd use case, however, I find pretty unconvincing. I can't think of 
  a good example of that. Anything that needs to define its own resource 
  manager is very low-level stuff, and probably needs to go into the core 
  anyway.
 
  New indexes are a big one, but I listed others also.
 
  Indexes have always been able to be added dynamically. Now they can be
  recovered correctly as well.
 
 Hm, so currently if you want to add a new indexam you can't just insert into
 pg_am and make them recoverable. You basically have to build in your new index
 access method into Postgres with the new rmgr. That is annoying and a problem
 worth tackling.

Agreed.

 But I'm a bit worried about having this be an external plugin. There's no way
 to looking at a WAL file to know whether it will be recoverable with the
 plugins available. Worse, there's a risk you could have a plugin but not the
 *right* plugin. 

That risk was discussed and is handled in the plugin. You are limited to
only insert data into WAL that has a current plugin that says it will
handle redo for that type.

 Perhaps this could be tackled simply by having startup insert
 a record listing all the rmgr's in use with identifying information and their
 version numbers.

Non-standard plugins in use are listed when in use, so we can all see
what's going on. Plugins can issue their own startup messages if they
choose, with version numbers and other details.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] New shapshot RPMs (Sep 15 2008) are ready for testing

2008-09-15 Thread Devrim GÜNDÜZ
Hi,

I just released new RPM sets, which is based on Sep 15 10:00 AM EEST CVS
snapshot.

These packages *do* require a dump/reload, even from previous 8.4
packages, because of a catversion update.

Over the last week, we have 70 more testers (now 220!) . 

As usual, please find detailed info from:

http://yum.pgsqlrpms.org

A mini howto about 8.4devel release + RPMs is here:

http://yum.pgsqlrpms.org/news-8.4devel-ready-for-testing.php

The tarball I used is here:

http://yum.pgsqlrpms.org/srpms/8.4

(I will remove tarballs after a few weeks...)

Please report any packaging related errors to me. If you find any
PostgreSQL 8.4 bugs, please post them to [EMAIL PROTECTED] or
fill out this form: 

http://www.postgresql.org/support/submitbug 

Regards,
-- 
Devrim GÜNDÜZ, RHCE
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
   http://www.gunduz.org


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] pg_regress inputdir

2008-09-15 Thread Peter Eisentraut

Tom Lane wrote:

But I think Alvaro is worried about something
at a higher level: the regression test process as a whole has some
directory layout assumptions built into it, particularly in regards
to where to find .so's.


The only information about the location of the .so's is in the test 
files themselves, which seems reasonable, because they are created and 
installed at the same time as the .so's that they are presumably 
supposed to test.  So I see no problem here.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] no XLOG during COPY?

2008-09-15 Thread Simon Riggs

On Thu, 2008-09-11 at 15:25 -0400, Andrew Dunstan wrote:

 Great, thanks (and also to Guillaume).
 
 It looks to me like the simple way around this issue would be to provide 
 an option to have pg_restore emit:
 begin; truncate foo; copy foo ... commit;
 
 The truncate will be trivial as there won't be any data or indexes at 
 that stage anyway.

Not sure which stage you're talking about. If this is a parallel restore
and you are running a create in one session and a load in another, then
ISTM you have no way of knowing that for certain.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Simon Riggs

On Sat, 2008-09-13 at 10:48 +0100, Florian G. Pflug wrote:

 The current read-only snapshot (which current meaning the
 corresponding state on the master at the time the last replayed wal
 record was generated) was maintained in shared memory. It' xmin field
 was continually updated with the (newly added) XLogRecord.xl_xmin
 field, which contained the xid of the oldest running query on the
 master, with a pruning step after each ReadOnlySnapshot.xmin update to
 remove all entries  xmin from the xid array. If a commit was seen for
 an xid, that xid was added to the ReadOnlySnapshot.xid array.
 
 The advantage of this concept is that it handles snapshotting on the
 slave without too much additional work for the master (The only change
 is the addition of the xl_xmin field to XLogRecord). It especially
 removes that need to track ShmemVariableCache-nextXid.

Snapshots only need to know which transactions are currently running
during WAL apply. The standby can't remove any tuples itself, so it
doesn't need to know what the master's OldestXmin is. 

So passing xl_xmin from master to standby seems not necessary to me. The
standby's OldestXmin needs to be passed through to the master, not the
other way around so that WAL records for tuple removal are not
generated.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review Report: propose to include 3 new functions into intarray and intagg

2008-09-15 Thread Heikki Linnakangas

Markus Wanner wrote:

Dmitry Koterov wrote:
But, what about intarray patch? Does somebody plan to review it? I'd 
prefer to include it too. If you approve, I'll correct the code style 
in intarray contrib patch too.


I've already volunteered for reviewing it as well. I just felt like 
splitting things up...


Looking at the intarray patch:

I find it a bit unfriendly to have a function that depends on sorted 
input, but doesn't check it. But that's probably not a good enough 
reason to reject an otherwise simple and useful function. Also, we 
already have uniq, which doesn't strictly speaking require sorted input, 
but it does if you want to eliminate all duplicates from the array.


_int_group_count_sort seems a bit special purpose. Why does it bother to 
sort the output? That's wasted time if you don't need sorted output, or 
if you want the array sorted by the integer value instead of frequency. 
If you want sorted output, you can just sort it afterwards.


Also, it's requiring sorted input for a small performance gain, but 
there's a lot more precedence in the existing intarray functions to not 
require sorted input, but to sort the input instead (union, intersect, 
same, overlap).


I realize that the current implementation is faster for the use case 
where the input is sorted, and output needs to be sorted, but if we go 
down that path we'll soon have dozens of different variants of various 
functions, with different ordering requirements of inputs and outputs.


So, I'd suggest changing _int_group_count_sort so that it doesn't 
require sorted input, and doesn't sort the output. The binary search 
function looks good to me (I think I'd prefer naming it bsearch(), 
though, though I can see that it was named bidx in reference to the 
existing idx function). Also, as Markus pointed out, the SGML docs need 
to be updated.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] no XLOG during COPY?

2008-09-15 Thread Andrew Dunstan



Simon Riggs wrote:

On Thu, 2008-09-11 at 15:25 -0400, Andrew Dunstan wrote:

  

Great, thanks (and also to Guillaume).

It looks to me like the simple way around this issue would be to provide 
an option to have pg_restore emit:

begin; truncate foo; copy foo ... commit;

The truncate will be trivial as there won't be any data or indexes at 
that stage anyway.



Not sure which stage you're talking about. If this is a parallel restore
and you are running a create in one session and a load in another, then
ISTM you have no way of knowing that for certain.

  


Er, who doesn't know what for certain, exactly? pg_restore will 
certainly know that it has created the table in another session and can 
thus safely truncate the table in the same transaction as the data load.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Gregory Stark

Simon Riggs [EMAIL PROTECTED] writes:

 So passing xl_xmin from master to standby seems not necessary to me. The
 standby's OldestXmin needs to be passed through to the master, not the
 other way around so that WAL records for tuple removal are not
 generated.

I think most people were pretty leery of doing it that way because it means
activity on the standby database can cause the master to bloat. The consensus
seemed to be headed towards having WAL replay on the standby stall if it meets
a tuple removal record for a tuple that's visible to a query running on it.
Probably with a mechanism to configure a maximum amount of time it can be
stalled before shooting those queries.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review Report: propose to include 3 new functions into intarray and intagg

2008-09-15 Thread Markus Wanner

Hi,

sorry for not having completed this review, yet. As you are obviously 
looking at the patch as well, I'll try to quickly write down my points 
so far.


Trying to compile the intarray module, I now receive an error:

error: ‘INT4OID’ undeclared (first use in this function)

That can be solved by including catalog/pg_type.h from 
contrib/intarr/_int_op.c.



The PG_FUNCTION_INFO_V1 and prototype definition certainly belong to the 
top of the file, where all others are.


Some lines are longer than 80 columns and again comments are a bit 
sparse or even useless (no additional things, please).



Heikki Linnakangas wrote:
I find it a bit unfriendly to have a function that depends on sorted 
input, but doesn't check it. But that's probably not a good enough 
reason to reject an otherwise simple and useful function. Also, we 
already have uniq, which doesn't strictly speaking require sorted input, 
but it does if you want to eliminate all duplicates from the array.


I think it's a performance optimization which is absolutely required in 
some cases. Some time ago I've also had to rip out the sorting step from 
certain intarray module functions to save processing time.


One option already mentioned somewhere would be saving a 'sorted' 
property for the array. Then again, I think such a thing would certainly 
have to be done globally, for all kinds of arrays.


_int_group_count_sort seems a bit special purpose. Why does it bother to 
sort the output? That's wasted time if you don't need sorted output, or 
if you want the array sorted by the integer value instead of frequency. 
If you want sorted output, you can just sort it afterwards.


Agreed. IMO the normal GROUP BY and ORDER BY stuff of the database 
itself should be used for such a thing. However, that means turning an 
array into a set of rows...


Also, it's requiring sorted input for a small performance gain, but 
there's a lot more precedence in the existing intarray functions to not 
require sorted input, but to sort the input instead (union, intersect, 
same, overlap).


..and exactly these are the functions I had to wrap again to strip the 
sorting step, due to poor performance for known-sorted arrays.


I realize that the current implementation is faster for the use case 
where the input is sorted, and output needs to be sorted, but if we go 
down that path we'll soon have dozens of different variants of various 
functions, with different ordering requirements of inputs and outputs.


Agreed.

However, given the OP is using that in production, there seems to be a 
use case for the optimization, where we have none for the same function 
without it.


So, I'd suggest changing _int_group_count_sort so that it doesn't 
require sorted input, and doesn't sort the output. The binary search 
function looks good to me (I think I'd prefer naming it bsearch(), 
though, though I can see that it was named bidx in reference to the 
existing idx function). Also, as Markus pointed out, the SGML docs need 
to be updated.


As is, it should probably also carry the '_int' prefix, because it's not 
a general purpose array function. So propose to name it '_int_bsearch'.


Overall I think these functions are overly specialized and should be 
replaced be more general counterparts in core. However, until we have 
that, it's hard to refuse such a thing for contrib.


By that reasoning, however, the intarray would have to provide methods 
for sorted as well as asorted input arrays as well.


I'm closing my part of reviewing of this patch now. Dmitry, how do you 
want to proceed with these patches? Do you have time for some cleaning 
up and writing documentation?


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Simon Riggs

On Mon, 2008-09-15 at 13:13 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  So passing xl_xmin from master to standby seems not necessary to me. The
  standby's OldestXmin needs to be passed through to the master, not the
  other way around so that WAL records for tuple removal are not
  generated.
 
 I think most people were pretty leery of doing it that way because it means
 activity on the standby database can cause the master to bloat. The consensus
 seemed to be headed towards having WAL replay on the standby stall if it meets
 a tuple removal record for a tuple that's visible to a query running on it.
 Probably with a mechanism to configure a maximum amount of time it can be
 stalled before shooting those queries.

Well, my impression from all inputs is there is no single preferred
route. Any one of the approaches seems to have a possible objection,
depending upon the needs of the user. So I'm going to give options.

In any case it's not just a two horse race. There are other options,
favoured by some people, that you've not personally commented on in any
of your summaries (thats up to you, of course).

And Standby causing master to bloat is not such a big problem. It's no
different to running queries directly on the master. But please don't
take it that I don't see the problem or think it the best solution in
all cases.

Certainly, halting WAL replay for any length of time might mean it can
never catch up again, so that won't be acceptable for many cases.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-15 Thread Fujii Masao
On Fri, Sep 12, 2008 at 7:41 PM, Simon Riggs [EMAIL PROTECTED] wrote:

 On Thu, 2008-09-11 at 18:17 +0300, Heikki Linnakangas wrote:
 Fujii Masao wrote:
  I think that this case would often happen. So, we should establish a 
  certain
  solution or procedure to the case where TLI of the master doesn't match
  TLI of the slave. If we only allow the case where TLI of both servers is 
  the
  same, the configuration after failover always needs to get the base backup
  on the new master. It's unacceptable for many users. But, I think that it's
  the role of admin or external tools to copy history files to the slave from
  the master.

 Hmm. There's more problems than the TLI with that. For the original
 master to catch up by replaying WAL from the new slave, without
 restoring from a full backup, the original master must not write to disk
 *any* WAL that hasn't made it to the slave yet. That is certainly not
 true for asynchronous replication, but it also throws off the idea of
 flushing the WAL concurrently to the local disk and to the slave in
 synchronous mode.

 I agree that having to get a new base backup to get the old master catch
 up with the new master sucks, so I hope someone sees a way around that.

 If we were going to recover from failed-over standby back to original
 master just via WAL logs we would need all of the WAL files from the
 point of failover. So you'd need to be storing all WAL file just in case
 the old master recovers. I can't believe doing that would be the common
 case, because its so impractical and most people would run out of disk
 space and need to delete WAL files.

No. The original master doesn't need all WAL files. It needs WAL file which
its pg_control points as latest checkpoint location and subsequent files.

 It should be clear that to make this work you must run with a base
 backup that was derived correctly on the current master. You can do that
 by re-copying everything, or you can do that by just shipping changed
 blocks (rsync etc). So I don't see a problem in the first place.

PITR doesn't always need a base backup. We can do PITR from the data
files just after crash if they aren't corrupted (i.e. not media crash).

As the situation demands, most users would like to choose the setup
procedure that bad influence on the cluster is smaller. They would choose
the procedure without a base backup if there are few WAL files to be
replayed. Meanwhile, they would use a base backup if the indispensable
WAL files have already been deleted. But, in that case, they might not take
new base backup and use old one (e.g. taken 2 days before).

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Simon Riggs [EMAIL PROTECTED] writes:
 Indexes have always been able to be added dynamically. Now they can be
 recovered correctly as well.

 Hm, so currently if you want to add a new indexam you can't just insert into
 pg_am and make them recoverable. You basically have to build in your new index
 access method into Postgres with the new rmgr. That is annoying and a problem
 worth tackling.

I concur with Heikki that that's not exactly a compelling use-case.
I've never heard of anyone building a non-core index AM at all; much
less trying to use it in a production context.  Given the obvious
potential for version-mismatch-type problems, it's hard to believe
that anyone ever would try.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-15 Thread Magnus Hagander
D'Arcy J.M. Cain wrote:
 On Fri, 12 Sep 2008 06:53:55 +1000
 Brendan Jurd [EMAIL PROTECTED] wrote:
 Josh has assigned your patch to me for an initial review.
 
 And me.

Thanks for the reviews, guys. I've adjusted the patch per your comments
(I think), and applied it.

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-15 Thread Fujii Masao
On Fri, Sep 12, 2008 at 12:17 AM, Heikki Linnakangas
[EMAIL PROTECTED] wrote:
 Fujii Masao wrote:

 I think that this case would often happen. So, we should establish a
 certain
 solution or procedure to the case where TLI of the master doesn't match
 TLI of the slave. If we only allow the case where TLI of both servers is
 the
 same, the configuration after failover always needs to get the base backup
 on the new master. It's unacceptable for many users. But, I think that
 it's
 the role of admin or external tools to copy history files to the slave
 from
 the master.

 Hmm. There's more problems than the TLI with that. For the original master
 to catch up by replaying WAL from the new slave, without restoring from a
 full backup, the original master must not write to disk *any* WAL that
 hasn't made it to the slave yet. That is certainly not true for asynchronous
 replication, but it also throws off the idea of flushing the WAL
 concurrently to the local disk and to the slave in synchronous mode.

Yes.

If the master fails after writing WAL to disk and before sending it to
the slave,
at least latest WAL file would be inconsistent between both servers. So,
regardless of using a base backup, in a setup procedure, we need to delete
those inconsistent WAL files or overwrite them.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Merlin Moncure
On Mon, Sep 15, 2008 at 8:40 AM, Simon Riggs [EMAIL PROTECTED] wrote:

 On Mon, 2008-09-15 at 13:13 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:

  So passing xl_xmin from master to standby seems not necessary to me. The
  standby's OldestXmin needs to be passed through to the master, not the
  other way around so that WAL records for tuple removal are not
  generated.

 I think most people were pretty leery of doing it that way because it means
 activity on the standby database can cause the master to bloat. The consensus
 seemed to be headed towards having WAL replay on the standby stall if it 
 meets
 a tuple removal record for a tuple that's visible to a query running on it.
 Probably with a mechanism to configure a maximum amount of time it can be
 stalled before shooting those queries.

 Well, my impression from all inputs is there is no single preferred
 route. Any one of the approaches seems to have a possible objection,
 depending upon the needs of the user. So I'm going to give options.

 In any case it's not just a two horse race. There are other options,
 favoured by some people, that you've not personally commented on in any
 of your summaries (thats up to you, of course).

 And Standby causing master to bloat is not such a big problem. It's no
 different to running queries directly on the master. But please don't
 take it that I don't see the problem or think it the best solution in
 all cases.

It's not a problem, but a relative disadvantage.  What makes warm
standby really attractive relative to other data transfer solutions is
that there are no side effects on the master outside of setting the
archive command...communication is one way.  Any 'master bloat' style
approach seems to be increasingly fragile if you want to increase the
number of standby servers, if it's even possible to do that.

I'm not saying it should be the only way to do it (and it may not even
be possible)...but it would be very attractive to be able to run a hot
standby much the same as a warm standby is running today...I could,
for example easily script a second standby but keep it a week behind
for example.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-15 Thread Alvaro Herrera
Simon Riggs escribió:

 On Fri, 2008-09-12 at 17:08 +0300, Heikki Linnakangas wrote:

   It should be clear that to make this work you must run with a base
   backup that was derived correctly on the current master. You can do that
   by re-copying everything, or you can do that by just shipping changed
   blocks (rsync etc). So I don't see a problem in the first place.
  
  Hmm, built-in rsync capability would be cool. Probably not in the first 
  phase, though..
 
 Built-in? Why? I mean make base backup using rsync. That way only
 changed data blocks need be migrated, so much faster.

Why rsync?  Just compare the LSNs ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-15 Thread Brendan Jurd
On Mon, Sep 15, 2008 at 10:53 PM, Magnus Hagander [EMAIL PROTECTED] wrote:

 Thanks for the reviews, guys. I've adjusted the patch per your comments
 (I think), and applied it.


Cool.  I had a look at the commitdiff and I think you missed one of
the error messages (it's still all on one line).  It's at
src/backend/libpq/hba.c line 841.  I've included a patch to split it
up below.

Cheers,
BJ

--- src/backend/libpq/hba.c
+++ src/backend/libpq/hba.c
@@ -840,9 +840,10 @@ hba_syntax:
if (line_item)
ereport(LOG,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
-errmsg(invalid entry in file \%s\ at line 
%d, token \%s\,
-   HbaFileName, line_num,
-   (char *) lfirst(line_item;
+errmsg(invalid entry for token \%s\,
+   (char *) lfirst(line_item)),
+errdetail(In file \%s\, line %d,
+  HbaFileName, line_num)));
else
ereport(LOG,
(errcode(ERRCODE_CONFIG_FILE_ERROR),

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-15 Thread Heikki Linnakangas

Alvaro Herrera wrote:

Simon Riggs escribió:

On Fri, 2008-09-12 at 17:08 +0300, Heikki Linnakangas wrote:

It should be clear that to make this work you must run with a base
backup that was derived correctly on the current master. You can do that
by re-copying everything, or you can do that by just shipping changed
blocks (rsync etc). So I don't see a problem in the first place.
Hmm, built-in rsync capability would be cool. Probably not in the first 
phase, though..

Built-in? Why? I mean make base backup using rsync. That way only
changed data blocks need be migrated, so much faster.


Why rsync?  Just compare the LSNs ...


True, that's much better. Only works for data files, though, so we'll 
still need something else for clog etc. But the volume of the other 
stuff is much smaller, so I support we don't need to bother delta 
compressing them.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Heikki Linnakangas

Gregory Stark wrote:

Simon Riggs [EMAIL PROTECTED] writes:


So passing xl_xmin from master to standby seems not necessary to me. The
standby's OldestXmin needs to be passed through to the master, not the
other way around so that WAL records for tuple removal are not
generated.


I think most people were pretty leery of doing it that way because it means
activity on the standby database can cause the master to bloat. The consensus
seemed to be headed towards having WAL replay on the standby stall if it meets
a tuple removal record for a tuple that's visible to a query running on it.
Probably with a mechanism to configure a maximum amount of time it can be
stalled before shooting those queries.


Yes, but we can quite easily provide an option on top of that to 
advertise the slave xmin back to the master, for those who prefer some 
bloat on master over stalling replay or killing queries in the slave. In 
fact, I think a lot of people will choose some compromise, where the 
slave xmin is advertised back to the master, but the master will obey it 
only up to some limit, after which the slave will need to stall or kill 
queries again.


It's not something that needs to be done in the first phase, but should 
be straightforward to add after the basics are working. In any case, 
we'll need the capability in the slave to notice when it's about to 
remove a tuple that's still visible to a snapshot in the slave.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Simon Riggs [EMAIL PROTECTED] writes:
 Indexes have always been able to be added dynamically. Now they can be
 recovered correctly as well.

 Hm, so currently if you want to add a new indexam you can't just insert into
 pg_am and make them recoverable. You basically have to build in your new 
 index
 access method into Postgres with the new rmgr. That is annoying and a problem
 worth tackling.

 I concur with Heikki that that's not exactly a compelling use-case.
 I've never heard of anyone building a non-core index AM at all; much
 less trying to use it in a production context.  Given the obvious
 potential for version-mismatch-type problems, it's hard to believe
 that anyone ever would try.

Well wasn't GIST such an instance until we put it in core? IIRC it lived in
contrib for a long time. It happens that the route they took was to implement
it without recoverability until it was in core then add logging. I suspect we
would lean on any new method to have logging before it was merged in though.

I think the version-mismatch problems are fairly important though which is why
I was suggesting providing checks for that in postgres. Simon's right though
that the plugin could check for it itself.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-15 Thread Tom Lane
Brendan Jurd [EMAIL PROTECTED] writes:
 Cool.  I had a look at the commitdiff and I think you missed one of
 the error messages (it's still all on one line).  It's at
 src/backend/libpq/hba.c line 841.  I've included a patch to split it
 up below.

Hm, that doesn't seem like a great improvement --- in particular, it
violates the style guideline that detail messages should be complete
sentences.

I think the error text is all right as-is, really.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-15 Thread Magnus Hagander
Tom Lane wrote:
 Brendan Jurd [EMAIL PROTECTED] writes:
 Cool.  I had a look at the commitdiff and I think you missed one of
 the error messages (it's still all on one line).  It's at
 src/backend/libpq/hba.c line 841.  I've included a patch to split it
 up below.
 
 Hm, that doesn't seem like a great improvement --- in particular, it
 violates the style guideline that detail messages should be complete
 sentences.
 
 I think the error text is all right as-is, really.

I think I need to hit myself over the head with those guidelines
repeatedly, because I think the *changed* messages are in violation of
this, and need to be changed again...

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Simon Riggs

On Mon, 2008-09-15 at 16:26 +0300, Heikki Linnakangas wrote:
 In any case, we'll need the capability in the slave to notice when
 it's about to remove a tuple that's still visible to a snapshot in the
 slave.

Agreed.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Simon Riggs

On Mon, 2008-09-15 at 08:52 -0400, Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
  Simon Riggs [EMAIL PROTECTED] writes:
  Indexes have always been able to be added dynamically. Now they can be
  recovered correctly as well.
 
  Hm, so currently if you want to add a new indexam you can't just insert into
  pg_am and make them recoverable. You basically have to build in your new 
  index
  access method into Postgres with the new rmgr. That is annoying and a 
  problem
  worth tackling.
 
 I concur with Heikki that that's not exactly a compelling use-case.
 I've never heard of anyone building a non-core index AM at all; much
 less trying to use it in a production context.  Given the obvious
 potential for version-mismatch-type problems, it's hard to believe
 that anyone ever would try.

The lack of a chicken is not an argument against the use case for an
egg.

But in any case, Bizgres was exactly this case, so they already did. We
just forced the authors to produce a code fork to do it, confusing
people rather than attracting people to Postgres.

We have plugin APIs with possible version mismatches in other contexts,
but I don't see us doing anything about that. In the context of WAL, the
basic WAL format has not changed in about 6 releases, so its been one of
the most stable file formats. Certain message types have changed, but
messages are all independent across rmgrs, so insulated from change.

The version mismatch idea presumes that a code author would structure
their code in two pieces: one to generate the WAL and one to read it.
Seems much more likely to me that authors would have one module
containing both as a way of avoiding the problem altogether. So I'm not
sure what to check, and against what?

When people do write useful plugins in the future they will be
potentially usable with any server at 8.4 or above. If we had had this
feature a few releases ago, we could have made GIN available to earlier
releases, for example.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Simon Riggs

On Mon, 2008-09-15 at 09:07 -0400, Merlin Moncure wrote:
  Well, my impression from all inputs is there is no single preferred
  route. Any one of the approaches seems to have a possible objection,
  depending upon the needs of the user. So I'm going to give options.
 
  In any case it's not just a two horse race. There are other options,
  favoured by some people, that you've not personally commented on in
 any
  of your summaries (thats up to you, of course).
 
  And Standby causing master to bloat is not such a big problem. It's no
  different to running queries directly on the master. But please don't
  take it that I don't see the problem or think it the best solution in
  all cases.
 
 It's not a problem, but a relative disadvantage.  

In some circumstances, yes, in others, not. But its not the only
consideration and different people attach different weightings in their
decision making. 

 What makes warm
 standby really attractive relative to other data transfer solutions is
 that there are no side effects on the master outside of setting the
 archive command...communication is one way.  Any 'master bloat' style
 approach seems to be increasingly fragile if you want to increase the
 number of standby servers, if it's even possible to do that.
 
 I'm not saying it should be the only way to do it (and it may not even
 be possible)...but it would be very attractive to be able to run a hot
 standby much the same as a warm standby is running today...I could,
 for example easily script a second standby but keep it a week behind
 for example.

I hope to provide options for all users, not just a single approach. If
you choose never to use the option to link standby and master, I respect
that and understand. I've listened to your requirements and believe I'm
including things to allow it to work for you the way you've said.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes:

 The version mismatch idea presumes that a code author would structure
 their code in two pieces: one to generate the WAL and one to read it.
 Seems much more likely to me that authors would have one module
 containing both as a way of avoiding the problem altogether. So I'm not
 sure what to check, and against what?

No, the danger is that someone generates a backup with one version of the
plugin and then restores with a different version of the plugin.

That would be frightfully easy to do when doing a minor upgrade, for example.
Or on a standby database if you've installed a new version of the plugin since
the standby was built.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Heikki Linnakangas

Simon Riggs wrote:

On Mon, 2008-09-15 at 08:52 -0400, Tom Lane wrote:

I've never heard of anyone building a non-core index AM at all; much
less trying to use it in a production context.  Given the obvious
potential for version-mismatch-type problems, it's hard to believe
that anyone ever would try.


The lack of a chicken is not an argument against the use case for an
egg.

But in any case, Bizgres was exactly this case, so they already did. We
just forced the authors to produce a code fork to do it, confusing
people rather than attracting people to Postgres.


Are you referring to the bitmap index patch? IIRC, there was some 
non-trivial changes to indexam API in there, as well as issues with 
VACUUM. If anything, that patch supports the assumption that anything 
that needs WAL-logging is working at such a low-level that it needs to 
be in core anyway.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 We have plugin APIs with possible version mismatches in other contexts,
 but I don't see us doing anything about that. In the context of WAL, the
 basic WAL format has not changed in about 6 releases, so its been one of
 the most stable file formats.

Er, that's simply false.  Read the revision history for xlog_internal.h.

 The version mismatch idea presumes that a code author would structure
 their code in two pieces: one to generate the WAL and one to read it.

No, the version mismatch problem is that you might try to read the WAL
with a different version of the plugin than you wrote it with.  Or maybe
with a completely unrelated plugin that was unfortunate enough to choose
the same rmgr ID.  We can't afford to insert complete versioning
information into each WAL record, so it's going to be pretty difficult
to avoid this risk.

 When people do write useful plugins in the future they will be
 potentially usable with any server at 8.4 or above. If we had had this
 feature a few releases ago, we could have made GIN available to earlier
 releases, for example.

Well, the initial commit for GIN looked like this:

2006-05-02 07:28  teodor

* contrib/tsearch2/Makefile, contrib/tsearch2/ginidx.c,
contrib/tsearch2/tsearch.sql.in,
contrib/tsearch2/expected/tsearch2.out,
contrib/tsearch2/sql/tsearch2.sql, src/backend/access/Makefile,
src/backend/access/gin/Makefile, src/backend/access/gin/README,
src/backend/access/gin/ginarrayproc.c,
src/backend/access/gin/ginbtree.c,
src/backend/access/gin/ginbulk.c,
src/backend/access/gin/gindatapage.c,
src/backend/access/gin/ginentrypage.c,
src/backend/access/gin/ginget.c,
src/backend/access/gin/gininsert.c,
src/backend/access/gin/ginscan.c, src/backend/access/gin/ginutil.c,
src/backend/access/gin/ginvacuum.c,
src/backend/access/gin/ginxlog.c,
src/backend/access/transam/rmgr.c, src/backend/commands/cluster.c,
src/backend/commands/opclasscmds.c, src/backend/commands/vacuum.c,
src/backend/utils/adt/selfuncs.c, src/backend/utils/init/globals.c,
src/backend/utils/misc/guc.c, src/include/access/gin.h,
src/include/access/rmgr.h, src/include/catalog/catversion.h,
src/include/catalog/pg_am.h, src/include/catalog/pg_amop.h,
src/include/catalog/pg_amproc.h, src/include/catalog/pg_opclass.h,
src/include/catalog/pg_operator.h, src/include/catalog/pg_proc.h,
src/include/utils/selfuncs.h, src/test/regress/data/array.data,
src/test/regress/expected/arrays.out,
src/test/regress/expected/create_index.out,
src/test/regress/expected/create_table.out,
src/test/regress/expected/opr_sanity.out,
src/test/regress/expected/sanity_check.out,
src/test/regress/input/copy.source,
src/test/regress/output/copy.source,
src/test/regress/output/misc.source,
src/test/regress/sql/arrays.sql,
src/test/regress/sql/create_index.sql,
src/test/regress/sql/create_table.sql,
src/test/regress/sql/opr_sanity.sql: GIN: Generalized Inverted
iNdex.  text[], int4[], Tsearch2 support for GIN.

Had the only core source file touched been rmgr.c, then maybe this
argument would be valid ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-15 Thread Brendan Jurd
On Mon, Sep 15, 2008 at 11:46 PM, Magnus Hagander [EMAIL PROTECTED] wrote:
 Tom Lane wrote:

 Hm, that doesn't seem like a great improvement --- in particular, it
 violates the style guideline that detail messages should be complete
 sentences.

 I think the error text is all right as-is, really.

 I think I need to hit myself over the head with those guidelines
 repeatedly, because I think the *changed* messages are in violation of
 this, and need to be changed again...


Right, I was just applying the same DETAIL structure as was used
elsewhere in the patch.  Tom's got a point about the complete
sentences though.

I still feel that the primary message is too long once the arguments
have been substituted in.  How about just tweaking the DETAIL portion
so that it is a complete sentence?

Or perhaps the DETAIL portion would be more appropriate as a CONTEXT?

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-15 Thread Tom Lane
Brendan Jurd [EMAIL PROTECTED] writes:
 Or perhaps the DETAIL portion would be more appropriate as a CONTEXT?

+1

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-15 Thread Heikki Linnakangas

Fujii Masao wrote:

On Fri, Sep 12, 2008 at 12:17 AM, Heikki Linnakangas
[EMAIL PROTECTED] wrote:

Hmm. There's more problems than the TLI with that. For the original master
to catch up by replaying WAL from the new slave, without restoring from a
full backup, the original master must not write to disk *any* WAL that
hasn't made it to the slave yet. That is certainly not true for asynchronous
replication, but it also throws off the idea of flushing the WAL
concurrently to the local disk and to the slave in synchronous mode.


Yes.

If the master fails after writing WAL to disk and before sending it to
the slave,
at least latest WAL file would be inconsistent between both servers. So,
regardless of using a base backup, in a setup procedure, we need to delete
those inconsistent WAL files or overwrite them.


And if you're unlucky, the changes in the latest WAL file might already 
have been flushed to data files as well.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Integrating hash index tupledesc hack a bit better

2008-09-15 Thread Tom Lane
A few days ago I complained that the lossy-hash-indexes patch needed a
better solution for letting catalog/index.c know that hash indexes will
really store only hashcodes and not the underlying column type.  The
current version of the patch puts some hard-wired knowledge into
index.c:

/* HACK: make hash always use int4 as storage (really it's uint32) */
if (opclassTup-opcmethod == HASH_AM_OID)
keyType = INT4OID;
else
keyType = opclassTup-opckeytype;

which might not exactly be a stop-ship issue but it still seems pretty
distasteful.

We could avoid the problem if all hash opclasses were marked as having
storage type int4, but that seems to just move the problem over to
opclass creation --- I doubt we want to insist on user-defined opclasses
changing to accommodate this, so we'd have to force this marking
internally in CREATE OPERATOR CLASS.

The idea that seems most attractive to me at the moment is to add a
column to pg_am, probably defined like pg_opclass.opckeytype but
applying to all opclasses of the index type.

Another possibility is the idea I floated before of adding an AM entry
point to let it manipulate the index tupledesc in a more general fashion
during CREATE INDEX.  However it's unclear how much flexibility that
would really buy us given the assumptions that are wired into various
places, so I'm thinking it's probably just useless complication.

Comments, better ideas?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Simon Riggs

On Mon, 2008-09-15 at 16:26 +0300, Heikki Linnakangas wrote:
 In any case, 
 we'll need the capability in the slave to notice when it's about to 
 remove a tuple that's still visible to a snapshot in the slave.

Looks like what I'll do is this:

Alter functions in pruneheap.c so that we bubble up the latest xid that
is being removed as part of block cleaning. We then add that xid into
the WAL record for cleaning.

If latest xid of clean is ahead of oldestxmin of running queries on
standby then Startup process needs to take action, of some kind.

Re-examining the tuples in WAL apply seems bad plan, since we'd have to
touch stuff in the block twice and juggle the locks.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes:

 Bottom line is that any backup of Postgres needs to include plugin
 directories, otherwise parts of the application could stop working
 following restore. This patch doesn't change that.

No, backups of executables are normally not the same backups as the data and
in many cases -- minor upgrades for example -- cannot be.

 * add the rmgr bms to the long header of each WAL file

 * change !RmgrIdIsValid() so that it causes FATAL by default. This then
 allows people to correct a mistake and retry. We provide an option to
 treat such errors as corrupt data and assume we have reached the end of
 WAL.

I'm not sure but I think this just begs the question. The problem is to ensure
that the rmgrid means the same thing on the restoring database as it does on
the original database, or at least a compatible version. I think this would
mean having a long text description and version number to compare.

And as Tom points out startup isn't often enough. Would WAL headers even be
often enough? We would have to ensure there was never two versions of the
plugin in the same WAL file.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql is not translate-aware

2008-09-15 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Actually this is wrong -- since the library is going to run with
  postgres text domain, we need to add the files to the backend's
  nls.mk:
 
 Can't we give it its own text domain?  It seems fundamentally wrong
 for a plug-in language to require core support for its messages.
 (Now that I think about it, that may have been the reason we don't have
 localization for it already.)  I suppose this must be possible,
 since e.g. glibc manages to have its own messages separate from
 whatever app it's linked with.

What glibc does is use dgettext() instead of plain gettext(), so they
are able to pass the domain along the message.  See here, at the bottom:
http://github.com/bneumeier/glibc/tree/master/include/libintl.h
where _libc_intl_domainname is defined as libc elsewhere.

I guess one way to go about this is to add some way to pass the domain
from the caller, i.e. 

ereport(ERROR,
(errdomain(plpgsql),
 errmsg( ... )))

but it seems a bit fragile, if only because it's easy to forget to add
it to new code, and easy to overlook a message that should have it.

Another idea would be to redefine errmsg() and friends within the
plpgsql sources, but that seems worse because every other library will
need to do the same.

Refinement of the previous idea: maybe we can add a compiler flag, to be
set in Makefiles, that defines the domain and passes it down to
EVALUATE_MESSAGE.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Simon Riggs

On Mon, 2008-09-15 at 16:43 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  Bottom line is that any backup of Postgres needs to include plugin
  directories, otherwise parts of the application could stop working
  following restore. This patch doesn't change that.
 
 No, backups of executables are normally not the same backups as the data and
 in many cases -- minor upgrades for example -- cannot be.

So you advise your clients to do backup in two halves. Then complain
that this is a bad way to do a backup because it may cause
insurmountable problems on restore. And then seek to reject a patch
because of that, even though similar problems already exist in other
parts of the system. I'm sorry, but that is circular, then faulted
logic.

If you do a minor upgrade that changes the on-disk format of *any* part
of the system then you have just introduced a limitation into what
software can be used for restore. That could be a new version of a
custom datatype just as easily as it could be an rmgr plugin. Shall we
ban custom datatypes? Should we add a version number into every varlen
header just in case somebody switched release levels, then forgot?

  * add the rmgr bms to the long header of each WAL file
 
  * change !RmgrIdIsValid() so that it causes FATAL by default. This then
  allows people to correct a mistake and retry. We provide an option to
  treat such errors as corrupt data and assume we have reached the end of
  WAL.
 
 I'm not sure but I think this just begs the question. The problem is to ensure
 that the rmgrid means the same thing on the restoring database as it does on
 the original database, or at least a compatible version. I think this would
 mean having a long text description and version number to compare.

Why is that any different to using functions or other plugins? If you
restore data into a database where the functions have the same names,
yet do different things then you are in trouble. Period.

If you don't use an rmgr plugin at all then you have nothing different
to do, nor will you see any different messages. If you use *any*
external server software, expect to read the instructions or have
problems.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Robert Treat
On Friday 12 September 2008 07:44:36 Csaba Nagy wrote:
 And then
 the actually interesting question: what will the slave do with views,
 rules, triggers ? I guess triggers are out of the question to be
 executed, what about rules ? Probably must be also ignored... user
 functions will probably get errors if they try to update something...
 Views should probably function correctly.


If we dont have rules, we dont get views, so those need to be. Really we 
should allow anything that would work in the context of a read only 
transaction. (ie. functions that dont change anything should be fine to call)

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rmgr hooks and contrib/rmgr_hook

2008-09-15 Thread Simon Riggs

On Mon, 2008-09-15 at 10:04 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:

  The version mismatch idea presumes that a code author would structure
  their code in two pieces: one to generate the WAL and one to read it.
 
 No, the version mismatch problem is that you might try to read the WAL
 with a different version of the plugin than you wrote it with.  Or maybe
 with a completely unrelated plugin that was unfortunate enough to choose
 the same rmgr ID.  We can't afford to insert complete versioning
 information into each WAL record, so it's going to be pretty difficult
 to avoid this risk.

I'm happy to include additional things into the patch, but I don't see
anything to force rejection of the patch entirely, from what has been
said.

Bottom line is that any backup of Postgres needs to include plugin
directories, otherwise parts of the application could stop working
following restore. This patch doesn't change that.

I proposed a registration scheme to avoid one of those problems.

If a plugin changed its file format, it would clearly need to include a
version test within it. It wouldn't be the fault of the plugin API if
the plugin author didn't handle that. But if they can work out how to
build an index AM and write WAL, I'm sure they can handle version
management between software components and informative error messages if
problems occur. And if they can't, they'll get a bad rep and nobody will
use the plugin.

Few ideas:

* add the rmgr bms to the long header of each WAL file

* change !RmgrIdIsValid() so that it causes FATAL by default. This then
allows people to correct a mistake and retry. We provide an option to
treat such errors as corrupt data and assume we have reached the end of
WAL.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] EncodeDateTime performance

2008-09-15 Thread George McCollister
I'm trying to optimize postgres performance on a headless solid state
hardware platform (no fans or disks). I have the database stored on a
USB 2.0 flash drive (hdparm benchmarks reads at 10 MB/s). Performance is
limited by the 533Mhz CPU.

Hardware:
IXP425 XScale (big endian) 533Mhz 64MB RAM
USB 2.0 Flash Drive
 
Software:
Linux 2.6.21.4
postgres 8.2.5

I created a fresh database using initdb, then added one table.

Here is the create table:
CREATE TABLE archivetbl
(
  DateTime timestamp without time zone,
  StationNum smallint,
  DeviceDateTime timestamp without time zone,
  DeviceNum smallint,
  Tagname character(64),
  Value double precision,
  Online boolean
)
WITH (OIDS=FALSE);
ALTER TABLE archivetbl OWNER TO novatech;

I populated the table with 38098 rows.

I'm doing this simple query:
select * from archivetbl;
 
It takes 79 seconds to complete the query (when postgres is compiled
with -O2). I'm running the query from pgadmin3 over TCP/IP.

oprofile is showing that memset (via dopr) is using about 60% of the CPU. I 
traced back further and noticed most of the usage was coming from 
EncodeDateTime.

I'm not quite sure why oprofile is showing that memset is hogging so much CPU. 
Regardless, I found way to eliminate most of the sprintf calls that were taking 
place in my situation.

I made some modifications to EncodeDateTime and have attached them as a patch. 
These changes alone reduced the query time of the select *  from archivetbl; 
from 79 seconds to just 35 seconds.

This patch is against 8.2.5. Since I'm cross compiling changing versions is a 
bit of a pain, but if someone thinks the performance has changed much in this 
area I could probably get the latest version cross compiling.

Regards,
George McCollister


diff -Naur postgresql-8.2.5/src/backend/utils/adt/datetime.c postgresql-8.2.5.new/src/backend/utils/adt/datetime.c
--- postgresql-8.2.5/src/backend/utils/adt/datetime.c	2007-06-12 10:58:39.0 -0500
+++ postgresql-8.2.5.new/src/backend/utils/adt/datetime.c	2008-09-15 12:16:32.0 -0500
@@ -3287,6 +3287,53 @@
 	return TRUE;
 }	/* EncodeTimeOnly() */
 
+void ymdhm(char * buf, int year, int mon, int day, int hour, int min)
+{
+	buf[0] = (char)((year / 1000) % 10) + '0';
+	buf[1] = (char)((year / 100) % 10) + '0';
+	buf[2] = (char)((year / 10) % 10) + '0';
+	buf[3] = (char)(year % 10) + '0';
+	buf[4] = '-';
+	buf[5] = (char)((mon / 10) % 10) + '0';
+	buf[6] = (char)(mon % 10) + '0';
+	buf[7] = '-';
+	buf[8] = (char)((day / 10) % 10) + '0';
+	buf[9] = (char)(day % 10) + '0';
+	buf[10] = ' ';
+	buf[11] = (char)((hour / 10) % 10) + '0';
+	buf[12] = (char)(hour % 10) + '0';
+	buf[13] = ':';
+	buf[14] = (char)((min / 10) % 10) + '0';
+	buf[15] = (char)(min % 10) + '0';
+	buf[16] = '\0';
+}
+
+void append_seconds(char * buf, int sec)
+{
+	buf[0] = ':';
+	buf[1] = (char)((sec / 10) % 10) + '0';
+	buf[2] = (char)(sec % 10) + '0';
+	buf[3] = '\0';
+}
+
+#ifdef HAVE_INT64_TIMESTAMP
+void append_seconds_and_fsecs(char * buf, int sec, fsec_t fsec)
+{
+	buf[0] = ':';
+	buf[1] = (char)((sec / 10) % 10) + '0';
+	buf[2] = (char)(sec % 10) + '0';
+	buf[3] = '.';
+	buf[4] = (char)((fsec / 10) % 10) + '0';
+	buf[5] = (char)((fsec / 1) % 10) + '0';
+	buf[6] = (char)((fsec / 1000) % 10) + '0';
+	buf[7] = (char)((fsec / 100) % 10) + '0';
+	buf[8] = (char)((fsec / 10) % 10) + '0';
+	buf[9] = (char)(fsec % 10) + '0';
+	buf[10] = '\0';
+}
+#endif
+
+
 
 /* EncodeDateTime()
  * Encode date and time interpreted as local time.
@@ -3315,9 +3362,8 @@
 		case USE_ISO_DATES:
 			/* Compatible with ISO-8601 date formats */
 
-			sprintf(str, %04d-%02d-%02d %02d:%02d,
-	(tm-tm_year  0) ? tm-tm_year : -(tm-tm_year - 1),
-	tm-tm_mon, tm-tm_mday, tm-tm_hour, tm-tm_min);
+			ymdhm(str, (tm-tm_year  0) ? tm-tm_year : -(tm-tm_year - 1),
+tm-tm_mon, tm-tm_mday, tm-tm_hour, tm-tm_min);
 
 			/*
 			 * Print fractional seconds if any.  The field widths here should
@@ -3329,7 +3375,7 @@
 #ifdef HAVE_INT64_TIMESTAMP
 			if (fsec != 0)
 			{
-sprintf(str + strlen(str), :%02d.%06d, tm-tm_sec, fsec);
+append_seconds_and_fsecs(str + strlen(str), tm-tm_sec, fsec);
 TrimTrailingZeros(str);
 			}
 #else
@@ -3340,7 +3386,7 @@
 			}
 #endif
 			else
-sprintf(str + strlen(str), :%02d, tm-tm_sec);
+append_seconds(str + strlen(str), tm-tm_sec);
 
 			/*
 			 * tzp == NULL indicates that we don't want *any* time zone info

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-15 Thread Magnus Hagander
Tom Lane wrote:
 Brendan Jurd [EMAIL PROTECTED] writes:
 Or perhaps the DETAIL portion would be more appropriate as a CONTEXT?
 
 +1

Is this the proper syntax for errcontext() messags?

//Magnus

Index: hba.c
===
RCS file: /cvsroot/pgsql/src/backend/libpq/hba.c,v
retrieving revision 1.167
diff -c -r1.167 hba.c
*** hba.c	15 Sep 2008 12:32:56 -	1.167
--- hba.c	15 Sep 2008 17:38:01 -
***
*** 660,666 
  	(errcode(ERRCODE_CONFIG_FILE_ERROR),
  	 errmsg(invalid IP address \%s\: %s,
  			token, gai_strerror(ret)),
! 	 errdetail(In file \%s\, line %d,
  			   HbaFileName, line_num)));
  			if (cidr_slash)
  *cidr_slash = '/';
--- 660,666 
  	(errcode(ERRCODE_CONFIG_FILE_ERROR),
  	 errmsg(invalid IP address \%s\: %s,
  			token, gai_strerror(ret)),
! 	 errcontext(File \%s\, line %d,
  			   HbaFileName, line_num)));
  			if (cidr_slash)
  *cidr_slash = '/';
***
*** 697,703 
  		(errcode(ERRCODE_CONFIG_FILE_ERROR),
  		 errmsg(invalid IP mask \%s\: %s,
  token, gai_strerror(ret)),
! 		 errdetail(In file \%s\, line %d,
  			   HbaFileName, line_num)));
  if (gai_result)
  	pg_freeaddrinfo_all(hints.ai_family, gai_result);
--- 697,703 
  		(errcode(ERRCODE_CONFIG_FILE_ERROR),
  		 errmsg(invalid IP mask \%s\: %s,
  token, gai_strerror(ret)),
! 		 errcontext(File \%s\, line %d,
  			   HbaFileName, line_num)));
  if (gai_result)
  	pg_freeaddrinfo_all(hints.ai_family, gai_result);
***
*** 773,779 
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
   errmsg(invalid authentication method \%s\,
  		token),
!  errdetail(In file \%s\ line %d,
  		HbaFileName, line_num)));
  		goto hba_other_error;
  	}
--- 773,779 
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
   errmsg(invalid authentication method \%s\,
  		token),
!  errcontext(File \%s\ line %d,
  		HbaFileName, line_num)));
  		goto hba_other_error;
  	}
***
*** 784,790 
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
   errmsg(invalid authentication method \%s\: not supported on this platform,
  		token),
!  errdetail(In file \%s\ line %d,
  		HbaFileName, line_num)));
  		goto hba_other_error;
  	}
--- 784,790 
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
   errmsg(invalid authentication method \%s\: not supported on this platform,
  		token),
!  errcontext(File \%s\ line %d,
  		HbaFileName, line_num)));
  		goto hba_other_error;
  	}
***
*** 796,802 
  		ereport(LOG,
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
   errmsg(krb5 authentication is not supported on local sockets),
!  errdetail(In file \%s\ line %d,
  		HbaFileName, line_num)));
  		goto hba_other_error;
  	}
--- 796,802 
  		ereport(LOG,
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
   errmsg(krb5 authentication is not supported on local sockets),
!  errcontext(File \%s\ line %d,
  		HbaFileName, line_num)));
  		goto hba_other_error;
  	}
***
*** 840,852 
  	if (line_item)
  		ereport(LOG,
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
!  errmsg(invalid entry in file \%s\ at line %d, token \%s\,
! 		HbaFileName, line_num,
! 		(char *) lfirst(line_item;
  	else
  		ereport(LOG,
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
!  errmsg(missing field in file \%s\ at end of line %d,
  		HbaFileName, line_num)));
  
  	/* Come here if suitable message already logged */
--- 840,855 
  	if (line_item)
  		ereport(LOG,
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
!  errmsg(invalid token \%s\,
! 		(char *) lfirst(line_item)),
! 	errcontext(File %s, line %d,
! 		HbaFileName, line_num)));
  	else
  		ereport(LOG,
  (errcode(ERRCODE_CONFIG_FILE_ERROR),
!  errmsg(missing field at end of line %d,
! 		line_num),
!  errcontext(File %s, line %d,
  		HbaFileName, line_num)));
  
  	/* Come here if suitable message already logged */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-15 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Or perhaps the DETAIL portion would be more appropriate as a CONTEXT?
 
 +1

 Is this the proper syntax for errcontext() messags?

Hmm, I think I'd suggest following the wording already in use in
ts_locale.c:

errcontext(line %d of configuration file \%s\,
   stp-lineno,
   stp-filename);

if only to save translation effort.

Also, in the last example, don't include the line number in the
errmsg.  Otherwise it looks ok to me.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Florian G. Pflug

Simon Riggs wrote:

On Sat, 2008-09-13 at 10:48 +0100, Florian G. Pflug wrote:


The main idea was to invert the meaning of the xid array in the snapshot
struct - instead of storing all the xid's between xmin and xmax that are
to be considering in-progress, the array contained all the xid's 
xmin that are to be considered completed.



The downside is that the size of the read-only snapshot is theoretically
unbounded, which poses a bit of a problem if it's supposed to live
inside shared memory...


Why do it inverted? That clearly has problems.


Because it solves the problem of sponteaously apprearing XIDs in the 
WAL. At least prior to 8.3 with virtual xids, a transaction might have 
allocated it's xid long before actually writing anything to disk, and 
therefore long before this XID ever shows up in the WAL. And with a 
non-inverted snapshot such an XID would be considered to be completed 
by transactions on the slave... So, one either needs to periodically log 
a snapshot on the master or log XID allocations which both seem to cause 
considerable additional load on the master. With an inverted snapshot, 
it's sufficient to log the current RecentXmin - a values that is readily 
available on the master, and therefore the cost amounts to just one 
additional 4-byte field per xlog entry.


regards, Florian Pflug



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Florian G. Pflug

Simon Riggs wrote:

On Sat, 2008-09-13 at 10:48 +0100, Florian G. Pflug wrote:


The current read-only snapshot (which current meaning the
corresponding state on the master at the time the last replayed wal
record was generated) was maintained in shared memory. It' xmin field
was continually updated with the (newly added) XLogRecord.xl_xmin
field, which contained the xid of the oldest running query on the
master, with a pruning step after each ReadOnlySnapshot.xmin update to
remove all entries  xmin from the xid array. If a commit was seen for
an xid, that xid was added to the ReadOnlySnapshot.xid array.

The advantage of this concept is that it handles snapshotting on the
slave without too much additional work for the master (The only change
is the addition of the xl_xmin field to XLogRecord). It especially
removes that need to track ShmemVariableCache-nextXid.


Snapshots only need to know which transactions are currently running
during WAL apply. The standby can't remove any tuples itself, so it
doesn't need to know what the master's OldestXmin is. 


I used the logged xmin value to track the shared snapshot's xmin, which 
in turn allowed me to prune the xid array, eliminating all xids  that xmin.


regards, Florian Pflug



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-15 Thread Simon Riggs

On Mon, 2008-09-15 at 19:20 +0100, Florian G. Pflug wrote:
 Simon Riggs wrote:
  On Sat, 2008-09-13 at 10:48 +0100, Florian G. Pflug wrote:
  
  The main idea was to invert the meaning of the xid array in the snapshot
  struct - instead of storing all the xid's between xmin and xmax that are
  to be considering in-progress, the array contained all the xid's 
  xmin that are to be considered completed.
  
  The downside is that the size of the read-only snapshot is theoretically
  unbounded, which poses a bit of a problem if it's supposed to live
  inside shared memory...
  
  Why do it inverted? That clearly has problems.
 
 Because it solves the problem of sponteaously apprearing XIDs in the 
 WAL. At least prior to 8.3 with virtual xids, a transaction might have 
 allocated it's xid long before actually writing anything to disk, and 
 therefore long before this XID ever shows up in the WAL. And with a 
 non-inverted snapshot such an XID would be considered to be completed 
 by transactions on the slave... So, one either needs to periodically log 
 a snapshot on the master or log XID allocations which both seem to cause 
 considerable additional load on the master. With an inverted snapshot, 
 it's sufficient to log the current RecentXmin - a values that is readily 
 available on the master, and therefore the cost amounts to just one 
 additional 4-byte field per xlog entry.

I think I understand what you're saying now, though I think it
mostly/only applies before your brilliant idea in 8.3.

If we have a transaction history that looks like this:

ReadA, WriteB, WriteA (both commit in either order)

then pre-8.3 we would have xidA  xidB, whereas at 8.3 and above we see
that xidA is actually higher than xidB. Now, TransactionIds are assigned
in the order of their first page write and *not* in the order of
transaction start as was previously the case, which isn't very obvious.

So when we replay WAL, we know that WAL is only written with locks held,
so that WriteA and WriteB must be independent of each other. So that
means the locks held by Read1 can be ignored and we can assume that the
above history is the same as

WriteB, WriteA

So if we took a snapshot in the middle of WriteB we would be safe to say
that only transaction B was in progress and that transaction A was not
yet started. So the snapshot we derive on the standby is different to
the one we would have derived on the client, yet the serializable order
is the same. In general, this means that all reads on a transaction
prior to the first write can be reordered later so that they can be
assumed to occur exactly prior to the first write of a transaction.
(Please shoot me down, if incorrect).

So when we see the first WAL record of a transaction we know that there
are no in progress transactions with a *lower* xid that we have not yet
seen. So we cannot be confused about whether a transaction is
in-progress, or not.

Almost. Now having written all of that I see there is an obvious race
condition between assignment of an xid and actions that result in the
acquisition of WALInsertLock. So even though the above seems mostly
correct, there is still a gap to plug, but a much smaller one.

So when writing the first WAL record for xid=48, it is possible that
xid=47 has just been assigned and is also just about to write a WAL
record. Thus the absence of a WAL record for xid=47 is not evidence that
xid=47 is complete because it was read-only.

We might handle this with the inverted technique you describe, but there
should be an easier way to track dense packing of xid sequence.

We expect xids to be written to WAL in the order assigned, so we might
check whether a newly assigned xid is 1 higher than the last highest
value to have inserted into WAL. If it is not, then we can write a short
WAL record to inform readers of WAL that the missing xids in sequence
are in progress also. So readers of WAL will see xids in the correct
sequence and are thus able to construct valid snapshots direct from WAL.

I think I should measure how often that occurs to see what problem or
overhead this might cause, if any.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of pg_hba.conf and authentication inconsistencies

2008-09-15 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 Or perhaps the DETAIL portion would be more appropriate as a CONTEXT?
 +1
 
 Is this the proper syntax for errcontext() messags?
 
 Hmm, I think I'd suggest following the wording already in use in
 ts_locale.c:
 
   errcontext(line %d of configuration file \%s\,
  stp-lineno,
  stp-filename);
 
 if only to save translation effort.
 
 Also, in the last example, don't include the line number in the
 errmsg.  Otherwise it looks ok to me.

Updated and applied.

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 8.3 vs HEAD difference in Interval output?

2008-09-15 Thread Ron Mayer


Unless I'm compiling stuff wrong, it seems HEAD is giving me
slightly different output on Intervals than 8.3 in the roundoff
of seconds.   8.3 was rounding to the nearest fraction of a second,
HEAD seems to be truncating.

In the psql output below it shows 8.3.1 outputting 6.70 secs
while the similar output for head is showing 6.69 secs.

At first glance it seems this might be because HEAD defaults
to USE_INTEGER_DATETIMES, which leads to HAVE_INT64_TIMESTAMP
which leads to
  sprintf(cp, %s%d.%02d secs, is_nonzero ?   : ,
  tm-tm_sec, ((int) sec) / 1);
in EncodeInterval in datetime.c which doesn't seem to be
doing any rounding.

Am I interpreting this right?  If so, shall I submit a patch
that rounds it to hundredths of a second (hundredths seems
hardcoded in the sprintf), or perhaps just silently add that
fix to the EncodeInterval patch I'm doing any for SQL Standard
and ISO intervals?




psql (8.4devel)
Type help for help.
regression=# set datestyle to sql;
SET
regression=#  select '1 year 2 mons 3 days 04:05:06.69'::interval;
interval
-
 @ 1 year 2 mons 3 days 4 hours 5 mins 6.69 secs
(1 row)


Welcome to psql 8.3.1, the PostgreSQL interactive terminal.
...
pg83=# set datestyle to sql;
SET
pg83=#  select '1 year 2 mons 3 days 04:05:06.69'::interval;
interval
-
 @ 1 year 2 mons 3 days 4 hours 5 mins 6.70 secs
(1 row)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.3 vs HEAD difference in Interval output?

2008-09-15 Thread Tom Lane
Ron Mayer [EMAIL PROTECTED] writes:
 Unless I'm compiling stuff wrong, it seems HEAD is giving me
 slightly different output on Intervals than 8.3 in the roundoff
 of seconds.   8.3 was rounding to the nearest fraction of a second,
 HEAD seems to be truncating.

Yeah, that's surely because of the change to integer timestamps.

 Am I interpreting this right?  If so, shall I submit a patch
 that rounds it to hundredths of a second (hundredths seems
 hardcoded in the sprintf), or perhaps just silently add that
 fix to the EncodeInterval patch I'm doing any for SQL Standard
 and ISO intervals?

This is not the only place where the float-timestamps code has rounding
behavior that doesn't appear in the integer-timestamps code.  I don't
know if we want to buy into making them act the same ... after all,
if they acted exactly the same, we'd not have bothered with writing the
integer code.  In this example, rounding to hundredths doesn't seem like
a particularly good idea; it seems to me it should give you the exact
down-to-the-microsecond value.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.3 vs HEAD difference in Interval output?

2008-09-15 Thread Kevin Grittner
 On Mon, Sep 15, 2008 at  4:58 PM, in message
[EMAIL PROTECTED],
Tom Lane [EMAIL PROTECTED] wrote: 
 Ron Mayer [EMAIL PROTECTED] writes:
 Unless I'm compiling stuff wrong, it seems HEAD is giving me
 slightly different output on Intervals than 8.3 in the roundoff
 of seconds.   8.3 was rounding to the nearest fraction of a second,
 HEAD seems to be truncating.
 
 Yeah, that's surely because of the change to integer timestamps.
 
 Am I interpreting this right?  If so, shall I submit a patch
 that rounds it to hundredths of a second (hundredths seems
 hardcoded in the sprintf), or perhaps just silently add that
 fix to the EncodeInterval patch I'm doing any for SQL Standard
 and ISO intervals?
 
 This is not the only place where the float-timestamps code has
rounding
 behavior that doesn't appear in the integer-timestamps code.  I
don't
 know if we want to buy into making them act the same ... after all,
 if they acted exactly the same, we'd not have bothered with writing
the
 integer code.  In this example, rounding to hundredths doesn't seem
like
 a particularly good idea; it seems to me it should give you the
exact
 down-to-the-microsecond value.
 
I find the results on 8.3.3 with integer timestamps surprising:
 
ccdev=# select version();
 version
--
 PostgreSQL 8.3.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20070115 (prerelease) (SUSE Linux)
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.69'::interval;
   interval
--
 1 year 2 mons 3 days 04:05:06.69
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.69'::interval(1);
 interval
--
 1 year 2 mons 3 days 04:05:06.70
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.69'::interval(2);
 interval
--
 1 year 2 mons 3 days 04:05:06.70
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.69'::interval(3);
 interval
--
 1 year 2 mons 3 days 04:05:06.70
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.69'::interval(4);
 interval
--
 1 year 2 mons 3 days 04:05:06.70
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.69'::interval(5);
 interval
--
 1 year 2 mons 3 days 04:05:06.70
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.69'::interval(6);
   interval
--
 1 year 2 mons 3 days 04:05:06.69
(1 row)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.3 vs HEAD difference in Interval output?

2008-09-15 Thread Kevin Grittner
 Kevin Grittner [EMAIL PROTECTED] wrote: 
 
 I find the results on 8.3.3 with integer timestamps surprising:
 
Even more surprising is the behavior for interval(1) here:
 
ccdev=# select '1 year 2 mons 3 days 04:05:06.64321'::interval;
  interval
-
 1 year 2 mons 3 days 04:05:06.64321
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.64321'::interval(1);
 interval
--
 1 year 2 mons 3 days 04:05:06.60
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.64321'::interval(2);
 interval
--
 1 year 2 mons 3 days 04:05:06.64
(1 row)

ccdev=# select '1 year 2 mons 3 days 04:05:06.64321'::interval(3);
 interval
---
 1 year 2 mons 3 days 04:05:06.643
(1 row)
 
etc.
 
That trailing zero should be considered a bug.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.3 vs HEAD difference in Interval output?

2008-09-15 Thread Ron Mayer

Kevin Grittner wrote:

...not the only place where the float-timestamps code has
rounding behavior that doesn't appear in the integer-timestamps 
code. ...  
I find the results on 8.3.3 with integer timestamps surprising:


Agreed it's surprising and agreed there are more places.

Sounds like I should keep that separate and perhaps later
submit a separate patch to identify and/or remove surprising
rounding behavior.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.3 vs HEAD difference in Interval output?

2008-09-15 Thread Tom Lane
Ron Mayer [EMAIL PROTECTED] writes:
 Sounds like I should keep that separate and perhaps later
 submit a separate patch to identify and/or remove surprising
 rounding behavior.

Agreed.  It's easier to review and get consensus on patches if you
keep separate issues separate.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 8.3 vs HEAD difference in Interval output?

2008-09-15 Thread R Mayer

Unless I'm compiling stuff wrong, it seems HEAD is giving me
slightly different output on Intervals than 8.3 in the roundoff
of seconds.   8.3 was rounding to the nearest fraction of a second,
HEAD seems to be truncating.

In the psql output below it shows 8.3.1 outputting 6.70 secs
while the similar output for head is showing 6.69 secs.

At first glance it seems this might be because HEAD defaults
to USE_INTEGER_DATETIMES, which leads to HAVE_INT64_TIMESTAMP
which leads to
  sprintf(cp, %s%d.%02d secs, is_nonzero ?   : ,
  tm-tm_sec, ((int) sec) / 1);
in EncodeInterval in datetime.c which doesn't seem to be
doing any rounding.

Am I interpreting this right?  If so, shall I submit a patch
that rounds it to hundredths of a second (hundredths seems
hardcoded in the sprintf), or perhaps just silently add that
fix to the EncodeInterval patch I'm doing any for SQL Standard
and ISO intervals?




psql (8.4devel)
Type help for help.
regression=# set datestyle to sql;
SET
regression=#  select '1 year 2 mons 3 days 04:05:06.69'::interval;
interval
-
 @ 1 year 2 mons 3 days 4 hours 5 mins 6.69 secs
(1 row)


Welcome to psql 8.3.1, the PostgreSQL interactive terminal.
...
pg83=# set datestyle to sql;
SET
pg83=#  select '1 year 2 mons 3 days 04:05:06.69'::interval;
interval
-
 @ 1 year 2 mons 3 days 4 hours 5 mins 6.70 secs
(1 row)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.3 vs HEAD difference in Interval output?

2008-09-15 Thread Ron Mayer

Tom Lane wrote:


This is not the only place where the float-timestamps code has rounding
behavior that doesn't appear in the integer-timestamps code. 


Yeah... For that matter, I find this surprising as well:

regression=# select '0.7 secs'::interval, ('7 secs'::interval/10);
interval |  ?column?
-+-
 00:00:00.69 | 00:00:00.70
(1 row)

I'll start making a list of them for a future patch down the road.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Coping with nLocks overflow

2008-09-15 Thread Tom Lane
We have recently seen one definite and one probable report of overflow
of the nLocks field of a backend's local lock table:
http://archives.postgresql.org/pgsql-bugs/2008-09/msg00021.php

While it's still unclear exactly why 8.3 seems more prone to this than
earlier releases, the general shape of the problem seems clear enough.
We have many code paths that intentionally take a lock on some object
and leave it locked until end of transaction.  Repeat such a command on
the same object enough times within one transaction, and voila,
overflow.  What's news, perhaps, is that we've reached a performance
level where this can actually happen within transactions of lengths that
people might try to run.

I'm considering that a simple solution to this might be to widen nLocks
from int to int64.  This would definitely fix it on machines that have
working int64 arithmetic, and if there are any left that do not, they're
probably not fast enough to encounter the overflow in real-world usage
anyway.  For machines that aren't native 64-bit it would add a couple
of cycles to lock acquisition/release, but that's hardly likely to be
measurable compared to all the other work done in
LockAcquire/LockRelease.

Comments, objections?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-15 Thread Ron Mayer

Tom Lane wrote:

support for SQL-spec interval literals.  I decided to go look at exactly
how unfinished it was, and it turns out that it's actually pretty close.
Hence the attached proposed patch ;-)


Is this code handling negative interval literals right?
I think I quote the relevant spec part at the bottom.


If I'm reading the spec right, I find this surprising.

regression=# select interval '-1-1';
 interval
---
 -1 days -01:00:00

regression=# select interval '1-1';
   interval
--
 1 year 1 mon


Also if I read the spec right, ISTM the sign should
apply to the entire interval literal.  But if I compiled
the patch right, the negative sign applies only to the
first part it encounters?

regression=# select interval '-1 2:3:4';
 interval
---
 -1 days +02:03:04
(1 row)


If I understand right, this'll add confusion to
SQL standard output for mixed year-month and
day-time intervals that have both negative and
positive components too. :(


This looks to me like the relevant part of SQL 200N(8?),
which seems to me to allow a sign inside the quotes:

interval literal ::=
   INTERVAL [ sign ] interval string interval qualifier
interval string ::=
   quote unquoted interval string quote
unquoted interval string ::=
   [ sign ] { year-month literal | day-time literal }




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-15 Thread Tom Lane
Ron Mayer [EMAIL PROTECTED] writes:
 Is this code handling negative interval literals right?
 I think I quote the relevant spec part at the bottom.

We support independent signs for the different components of the
interval.  I'm not sure how well the spec copes with that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposed patch: make SQL interval-literal syntax work per spec

2008-09-15 Thread Ron Mayer

Tom Lane wrote:

Ron Mayer [EMAIL PROTECTED] writes:

Is this code handling negative interval literals right?
I think I quote the relevant spec part at the bottom.


We support independent signs for the different components of the


Even so it surprises me that:
'-1-1'::interval gives me a day-hour interval while
'1-1'::interval gives me a year-month interval.



I'm not sure how well the spec copes with that.


If I'm read the spec right, SQL 2008  expects -1 12:34:56 to
be what we'd see as -1 -12:34:56, and doesn't handle with
different signs for different components at all.

SQL 92 seems to have been simpler, apparently requiring
the negative sign to stay outside the quotes.
==SQL 92===
interval literal ::=
INTERVAL [ sign ] interval string interval qualifier
interval string ::=
   quote { year-month literal | day-time literal } quote
===

SQL 200N seems to allow a sign inside the string:
==SQL 200N=
interval literal ::=
   INTERVAL [ sign ] interval string interval qualifier
interval string ::=
   quote unquoted interval string quote
unquoted interval string ::=
   [ sign ] { year-month literal | day-time literal }
===


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal of SE-PostgreSQL patches (for CommitFest:Sep)

2008-09-15 Thread KaiGai Kohei

Peter, Abhijit,

Could you tell me the current status of reviewing process in the CommitFest:Sep?
I can understand the patches contain several new concept and a bit large,
and it is a tough work to review them comprehensively.

I'm not unconcern anything even if reviwing comments are partial one.
However, I cannot improve anything without any comments. :(

Did the previous message which I posted help you to understand the patches?
If not, please feel free to ask me anything.

Thanks,

KaiGai Kohei wrote:

Hello,

The latest SE-PostgreSQL patches are updated here:

[1/4] 
http://sepgsql.googlecode.com/files/sepostgresql-sepgsql-8.4devel-3-r1005.patch
[2/4] http://sepgsql.googlecode.com/files/sepostgresql-pg_dump-8.4devel-3-r1005.patch 
[3/4] http://sepgsql.googlecode.com/files/sepostgresql-policy-8.4devel-3-r1005.patch 
[4/4] http://sepgsql.googlecode.com/files/sepostgresql-docs-8.4devel-3-r1005.patch 

They contains rebasing to the CVS HEAD, because we cannot apply the 
previous ones

correctly due to progress in the base version.
Rest of changes are here:
 - A new PGACE hook: pgaceIsAllowExecutorRunHook().
   It enables to control overriding ExecutorRun_hook, because several 
important

   hooks are invoked from standard_ExecutorRun().
 - T_SEvalItem related equal() functions are added to 
nodes/equalfuncs.c.

   # I've left for implement them.
 - Fix typo in src/include/security/pgace.h


BTW, I thought I have to show the overview of the patch to help reviwers.
The main patch ([1/4]) is especially big and contains new concepts.

The following explanation shows several key concept of SE-PostgreSQL.
I'm happy if it can reduce the load of reviwers.

No need to say, please feel free to ask me anything. :-)

Thanks,


--
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NDirectFileRead and Write

2008-09-15 Thread ITAGAKI Takahiro
Here is a revised patch to renewal NDirectFileRead/Write.

 Tom Lane [EMAIL PROTECTED] wrote:
  * Let's rename them BufFileReadCount and BufFileWriteCount to reflect
  their actual purpose.
  * In any case I agree that the current arrangement
  with execdebug.h declaring variables defined in bufmgr.c is just weird.

- NDirectFile{Read|Write} are renamed to BufFile{Read|Write}Count.
- They are still declared in execdebug.h and buffile.c includes it.


I did not touch messages in ShowBufferUsage() in the patch.
I think it still has meaning because BufFile counters are
kinds of direct block access.

 ShowBufferUsage()
 !   Shared blocks: ...
 !   Local  blocks: ...
 !   Direct blocks: R read, W written

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



BufFileCount.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql is not translate-aware

2008-09-15 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Can't we give it its own text domain?  It seems fundamentally wrong
 for a plug-in language to require core support for its messages.

 Another idea would be to redefine errmsg() and friends within the
 plpgsql sources, but that seems worse because every other library will
 need to do the same.

 Refinement of the previous idea: maybe we can add a compiler flag, to be
 set in Makefiles, that defines the domain and passes it down to
 EVALUATE_MESSAGE.

It seems like something like that could work.  I'd be inclined to change
the ereport() macro itself, not errmsg/errdetail/etc.  The domain name
could be passed in at ereport and then used when evaluating the
messages.

A possible problem is that some places use _() (ie, gettext) directly
instead of leaving it to the elog.c code to apply translation.  But I
suppose we might be able to redefine _() too.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Add default_val to pg_settings

2008-09-15 Thread Greg Smith
Attached is an updated and separate version of my patch exposing the 
internal GUC boot_val as default_val, which failed to attach itself to the 
already committed changes to the pg_settings view.


Brief reminder of what it does:

postgres=# select name,setting,default_val from pg_settings where 
name='shared_buffers';
  name  | setting | default_val
+-+-
 shared_buffers | 4096| 1024

General justification for this change with a longer example is at 
http://archives.postgresql.org/pgsql-hackers/2008-09/msg00233.php


Based on feedback the first time around, I updated the documentation for 
this column to read Default value assumed at server startup if the 
parameter is not otherwise set.


Would only take a quick search/replace of the patch to change 
default_val to something else if there are still any objections there. 
boot_val is another candidate name, I feel that would make the purpose 
of the column less obvious to users of pg_settings even if it is more 
correct. I'm really more concerned about the feature than exactly what 
it's named though. I didn't bother to expose the reset_val since I can't 
think of any obvious use cases for wanting to know it.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MDIndex: doc/src/sgml/catalogs.sgml
===
RCS file: /home/gsmith/cvsrepo/pgsql/doc/src/sgml/catalogs.sgml,v
retrieving revision 2.174
diff -c -r2.174 catalogs.sgml
*** doc/src/sgml/catalogs.sgml  15 Sep 2008 18:43:41 -  2.174
--- doc/src/sgml/catalogs.sgml  16 Sep 2008 03:18:42 -
***
*** 6422,6427 
--- 6422,6433 
values)/entry
   /row
   row
+   entrystructfielddefault_val/structfield/entry
+   entrytypetext/type/entry
+   entryDefault value assumed at server startup if the parameter is not 
+   otherwise set/entry
+  /row
+  row
entrystructfieldsourcefile/structfield/entry
entrytypetext/type/entry
entryInput file the current value was set from (NULL for values set in
Index: src/backend/utils/misc/guc.c
===
RCS file: /home/gsmith/cvsrepo/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.472
diff -c -r1.472 guc.c
*** src/backend/utils/misc/guc.c10 Sep 2008 19:16:22 -  1.472
--- src/backend/utils/misc/guc.c16 Sep 2008 03:44:09 -
***
*** 6087,6092 
--- 6087,6094 
{
case PGC_BOOL:
{
+   struct config_bool *lconf = (struct config_bool 
*) conf;
+ 
/* min_val */
values[9] = NULL;
  
***
*** 6095,6100 
--- 6097,6106 
  
/* enumvals */
values[11] = NULL;
+ 
+   /* default_val */
+   snprintf(buffer, sizeof(buffer), %s, 
lconf-boot_val ? on : off);
+   values[12] = pstrdup(buffer);
}
break;
  
***
*** 6112,6117 
--- 6118,6127 
  
/* enumvals */
values[11] = NULL;
+ 
+   /* default_val */
+   snprintf(buffer, sizeof(buffer), %d, 
lconf-boot_val);
+   values[12] = pstrdup(buffer);
}
break;
  
***
*** 6129,6139 
--- 6139,6155 
  
/* enumvals */
values[11] = NULL;
+ 
+   /* default_val */
+   snprintf(buffer, sizeof(buffer), %g, 
lconf-boot_val);
+   values[12] = pstrdup(buffer);
}
break;
  
case PGC_STRING:
{
+   struct config_string *lconf = (struct 
config_string *) conf;
+ 
/* min_val */
values[9] = NULL;
  
***
*** 6142,6152 
--- 6158,6180 
  
/* enumvals */
values[11] = NULL;
+ 
+   /* default_val */
+   if (lconf-boot_val == NULL)
+   {
+   values[12] = NULL;
+   } else
+   {
+   snprintf(buffer, sizeof(buffer), %s, 
lconf-boot_val);
+   values[12] = pstrdup(buffer);
+   }
}