Re: [HACKERS] [0/4] Proposal of SE-PostgreSQL patches

2008-05-30 Thread KaiGai Kohei
Tom Lane wrote:
 The idea of input functions that alter system tables scares me.

An example:
  SELECT 'system_u:object_r:sepgsql_table_t:SystemHigh'::security_label;

can insert a new tuple into pg_security, which is not desirable behavior.

To fix this, I'll remove the security_label type and define the security_context
system column as a text column, translating its contents into an internal
identifier representation just before a tuple is inserted or updated.

This avoids inserting a temporarily used (unnecessary) security context,
and allows various kinds of text functions to be used.
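
To illustrate the intended flow (just a sketch; security_context_to_sid()
and HeapTupleSetSecLabel() are made-up names, not functions from the patch):

/*
 * Sketch only: the text form of a security context is validated and
 * mapped to an internal identifier right before the tuple is stored,
 * so merely evaluating an expression never adds a row to pg_security.
 */
static void
assign_tuple_security_context(HeapTuple tuple, const char *context_text)
{
    Oid     sid;

    /* look up the context in pg_security, inserting it on demand */
    sid = security_context_to_sid(context_text);

    /* stamp the tuple with the internal identifier */
    HeapTupleSetSecLabel(tuple, sid);
}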

Please comment if you have any thoughts.

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei [EMAIL PROTECTED]

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Gurjeet Singh
On Fri, May 30, 2008 at 10:40 AM, Tom Lane [EMAIL PROTECTED] wrote:

 But since you mention it: one of the plausible answers for fixing the
 vacuum problem for read-only slaves is to have the slaves push an xmin
 back upstream to the master to prevent premature vacuuming.  The current
 design of pg_standby is utterly incapable of handling that requirement.
 So there might be an implementation dependency there, depending on how
 we want to solve that problem.


I think it would be best to not make the slave interfere with the master's
operations; that's only going to increase the operational complexity of such
a solution.

There could be multiple slaves following a master, some serving
data-warehousing queries, some for load-balancing reads, some others just
for disaster recovery, and then some just to mitigate human errors by
re-applying the logs with a delay.

I don't think any one installation would see all of the above mentioned
scenarios, but we need to take care of multiple slaves operating off of a
single master; something similar to cascaded Slony-I.

My two cents.

Best regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-30 Thread Pavan Deolasee
On Fri, May 30, 2008 at 2:41 PM, Simon Riggs [EMAIL PROTECTED] wrote:

  What I still
 don't accept is that an unconstrained wait is justifiable. You've just
 said its a minor detail, but that's not the way I see it. It might be a
 second, but it might be an hour or more.


I am suggesting a timed wait, maybe somewhere between 60 and 300 seconds.
That's the maximum VACUUM would get delayed. If existing transactions
don't finish within that time, VACUUM just works as it does today, so
it certainly can't be much worse than what we have now.
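
Roughly the kind of bounded wait I have in mind (only a sketch, not
working code; the timeout handling is simplified):

/*
 * Sketch only: wait at most timeout_secs for transactions older than
 * limit_xid to complete before starting the heap scan.  If they don't
 * finish in time, VACUUM proceeds exactly as it does today.
 */
static void
wait_for_old_transactions(TransactionId limit_xid, int timeout_secs)
{
    int     waited = 0;

    while (waited < timeout_secs)
    {
        if (TransactionIdFollowsOrEquals(GetOldestXmin(false, true), limit_xid))
            break;              /* every older transaction has finished */

        pg_usleep(1000000L);    /* sleep one second and re-check */
        waited++;
    }
}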

 A non-waiting solution seems like the only way to proceed.


Yeah, but we don't yet have a simple solution that would work in all
cases without being too complicated.

 Is this a non-issue anyway, with DSM?


I thought about that. DSM would certainly reduce the cost of heap
scans, but the second pass would still be required, and it would
re-dirty all the pages again.

Thanks,
Pavan


-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal - Collation at database level

2008-05-30 Thread Radek Strnad

Zdenek Kotala wrote:

Radek Strnad wrote:

snip



I'm thinking of dividing the problem into two parts - at the beginning,
pg_collation will contain two functions. One will have hard-coded rules
for the basic collations (SQL_CHARACTER, GRAPHIC_IRV, LATIN1, ISO8BIT,
UCS_BASIC). It will compare each string character bitwise and guarantee
that the implementation meets the SQL standard as implemented in
PostgreSQL.
The second one will allow the user to use installed system locales. The set
of these collations will obviously vary between systems. The catalogs will
contain the encoding and collation used for calling the system locale
function. This will allow us to use collations such as en_US.utf8,
cs_CZ.iso88592 etc., if they are available.

We will also need to change the way strings are compared. Depending on
the database's collation setting, the right comparison function will be used:
http://doxygen.postgresql.org/varlena_8c.html#4c7af81f110f9be0bd8eb2bd99525675



This design will make a switch to ICU or any other implementation
quite simple and will not require any major rewriting of what I'm coding
right now.



The collation function is the main point here. As you mentioned, one will be
only a wrapper around strcmp and the second one around strcoll (maybe you
need four - char/wchar variants). Which function is used is defined in the
pg_collation catalog by the CREATE COLLATION command. But you need to
specify the locale name for system locales, which means you need an
attribute for storing the locale name.


You're right. I've extended pg_collation with system locale columns. In
the first stage we actually don't need any other catalogs, such as an
encoding catalog, etc.; we can build this functionality on the following
pg_collation catalog alone. The collation function used (system or
built-in) will be decided based on the existing collation name.


CATALOG(pg_collations, ###)
{
    NameData    colname;                /* collation name */
    Oid         colschema;              /* collation schema */
    NameData    colcharset;             /* character set specification */
    Oid         colexistingcollation;   /* existing collation */
    bool        colpadattribute;        /* pad attribute */
    bool        colcasesensitive;       /* case sensitive */
    bool        colaccent;              /* accent sensitive */
    NameData    colsyslccollate;        /* lc_collate */
    NameData    colsyslcctype;          /* lc_ctype */
    regproc     colfunc;                /* used collation function */
} FormData_pg_collations;
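
Just to illustrate the dispatch (a sketch only, not part of the patch;
real code would cache the setlocale() switch rather than doing it per
comparison):

#include <locale.h>
#include <string.h>

static int
collation_compare(const char *a, const char *b, const char *syslc_collate)
{
    /* built-in collations (SQL_CHARACTER, UCS_BASIC, ...): bitwise */
    if (syslc_collate == NULL || syslc_collate[0] == '\0')
        return strcmp(a, b);

    /* system-locale collations: strcoll() under colsyslccollate */
    setlocale(LC_COLLATE, syslc_collate);
    return strcoll(a, b);
}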



It would be good to send a list of new and modified SQL commands (like
CREATE COLLATION) for wider discussion.


CREATE COLLATION <collation name> FOR <character set specification>
    FROM <existing collation name> [ <pad characteristic> ] [ <case sensitive> ]
    [ <accent sensitive> ] [ LC_COLLATE <lc_collate> ] [ LC_CTYPE <lc_ctype> ]


<pad characteristic> := NO PAD | PAD SPACE
<case sensitive> := CASE SENSITIVE | CASE INSENSITIVE
<accent sensitive> := ACCENT SENSITIVE | ACCENT INSENSITIVE

Since you can specify ORDER BY in the SELECT clause, there's no need to
add ascending and descending types of collation. They will always be
ascending.


DROP COLLATION <collation name>

CREATE DATABASE ... [ COLLATE <collation name> ] ...

ALTER DATABASE ... [ COLLATE <collation name> ] ...

   Any thoughts?

 Radek



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-30 Thread Pavan Deolasee
On Fri, May 30, 2008 at 1:56 AM, Simon Riggs [EMAIL PROTECTED] wrote:



 Been thinking some more about this. You're right that the second scan
 could re-dirty many pages and is probably something to avoid.

Right. IMHO it would help us a lot.

 The main
 issue I see is that you don't really know how much work will happen in
 the first phase and how much would happen in the second.

With HOT, I see very little work left for the second pass. The dead
space is already collected in the first pass; the second pass only
cleans up the DEAD line pointers. Also, if we can update the FSM early (at
the end of the first pass), we can avoid further irreversible bloat of
the heap.

 no problem at all. I'd rather keep it as it is than have sometimes
 better, sometimes worse behaviour.


For large tables, two heap scans along with several additional page
writes don't seem to be a cost we can afford, especially in IO-bound
applications. IMHO a timed wait is not such a bad thing. Note that this is
all about VACUUM, which is a background maintenance activity, and it
won't hurt to delay it by a few seconds or even minutes. Also, as I said
earlier, the waiting is a minor detail; maybe there is a better way to
do things.

Unless there are some strong objections, I would like to give it a
shot and see if there are any real benefits. We can then fix any
regression cases. Let me know if somebody thinks there are
show-stoppers, or that the benefit of avoiding a second scan of a large
table is not worth it. I personally have a strong feeling that it's worth
the effort.


Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Andrew Sullivan
On Thu, May 29, 2008 at 01:58:34PM -0700, David Fetter wrote:
 
 If people on core had come to the idea that we needed to build in
 replication *before* 8.3 came out, they certainly didn't announce it.
 
 Now is a great time to mention this because it gives everybody time to:
 
 1.  Come to a consensus on what the out-of-the-box replication should
 be, and 
 
 2.  Build, test and debug whatever the consensus out-of-the-box
 replication turns out to be.

None of that is an argument for why this has to go in 8.4.

I argued in Ottawa that the idea that you have to plan a feature for
_the next release_ is getting less tenable with each release.  This is
because major new features for Postgres are now often big and
complicated.  The days of big gains from single victories are mostly
over (though there are exceptions, like HOT).  Postgres is already
mature.  As with a middle-aged person with a mortgage, longer-term
planning is simply a necessary part of life now.

There are two possibilities here.  One is to have huge releases on
much longer timetables.  I think this is unsustainable in a free
project, because people will get bored and go away if they don't get
to use the results of their work in a reasonably short time frame.
The other is to accept that sometimes, planning and development for
new features will have to start a long time before actual release --
maybe planning and some coding for 2 releases out.  That allows large
features like the one we're discussing to be developed responsibly
without making everything else wait for it.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Marko Kreen
On 5/30/08, Gurjeet Singh [EMAIL PROTECTED] wrote:
 On Fri, May 30, 2008 at 10:40 AM, Tom Lane [EMAIL PROTECTED] wrote:
  But since you mention it: one of the plausible answers for fixing the
  vacuum problem for read-only slaves is to have the slaves push an xmin
  back upstream to the master to prevent premature vacuuming.  The current
  design of pg_standby is utterly incapable of handling that requirement.
  So there might be an implementation dependency there, depending on how
  we want to solve that problem.

 I think it would be best to not make the slave interfere with the master's
 operations; that's only going to increase the operational complexity of such
 a solution.

I disagree - it's better to consider synchronized WAL-slaves
as equal to the master, so having queries there affect the master is OK.

You need to remember this solution is not trying to replace 100-node Slony-I
setups.  You can run sanity checks on slaves or use them to load-balance
read-only OLTP queries, but not random stuff.

 There could be multiple slaves following a master, some serving
 data-warehousing queries, some for load-balancing reads, some others just
 for disaster recovery, and then some just to mitigate human errors by
 re-applying the logs with a delay.

To run warehousing queries you had better use Slony-I / Londiste.  For
warehousing you want different / more indexes on the tables anyway,
so I think it's quite OK to say don't run complex queries
on WAL-slaves.

 I don't think any one installation would see all of the above mentioned
 scenarios, but we need to take care of multiple slaves operating off of a
 single master; something similar to cascaded Slony-I.

Again, synchronized WAL replication is not a generic solution.
Use Slony/Londiste if you want to get totally independent slaves.

Thankfully -core has set concrete and limited goals,
which means it is possible to see working code in a reasonable time.
I think that should apply to read-only slaves too.

If we try to make it handle any load, it will never be finished.

Now if we limit the scope I've seen 2 variants thus far:

1) Keep slave max in sync, let the load there affect master (xmin).
  - Slave can be used to load-balance OLTP load
  - Slave should not be used for complex queries.

2) If long query is running, let slave lag (avoid applying WAL data).
  - Slave cannot be used to load-balance OLTP load
  - Slave can be used for complex queries (although no new indexes
or temp tables can be created).

I think 1) is the more important (and more easily implementable) case.

For 2) we already have solutions (Slony/Londiste/Bucardo, etc.),
so there is no point making an effort to solve it here.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Sugestion: xpath

2008-05-30 Thread marcoduarte
I think it would be a good idea to make a function returning a set of
records, like xpath_table, but accepting XML data in a text parameter.
Today, if you have a function with an xml parameter, you have to save it in
a table to use xpath.

Thanks,

Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-30 Thread Pavan Deolasee
On Fri, May 30, 2008 at 3:31 PM, Simon Riggs [EMAIL PROTECTED] wrote:


 Perhaps we can start the first scan and check the xid after we scan every
 few blocks. Once we find the xid is older, we know the size of the
 second scan can be limited to only those blocks already scanned. So the
 two endpoints of behaviour are that we skip the scan completely or we do
 the whole scan, but at least there is a saving in many cases, without
 waiting.


Hmm. Interesting. I was about to suggest that we use some heuristic,
such as the size of the table, to decide whether or not to try the
optimization. But what you just suggested makes more sense. So instead
of waiting, we start the first scan anyway. If we are lucky, after some
time all the old transactions will go away, and then we can start
marking the DEAD line pointers as DEAD_RECLAIMED. The second pass then just
needs to rescan the initial blocks to remove the DEAD line pointers.
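
Something along these lines, purely as a sketch (cutoff_xid, nblocks and
the per-block helper are placeholders, not real code):

/*
 * During the first heap pass, periodically re-check the oldest running
 * xid; once it has passed our cutoff, DEAD line pointers found from here
 * on can be reclaimed immediately, so the second pass only needs to
 * revisit blocks scanned before that point.
 */
BlockNumber blkno;
BlockNumber second_pass_limit = nblocks;    /* worst case: rescan everything */
bool        can_reclaim = false;

for (blkno = 0; blkno < nblocks; blkno++)
{
    if (!can_reclaim && (blkno % 32) == 0 &&
        TransactionIdFollowsOrEquals(GetOldestXmin(false, true), cutoff_xid))
    {
        can_reclaim = true;
        second_pass_limit = blkno;  /* only blocks [0, blkno) need pass two */
    }

    scan_one_block(blkno, can_reclaim);     /* placeholder per-block work */
}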

Thanks,
Pavan



-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hint Bits and Write I/O

2008-05-30 Thread Florian G. Pflug

Kevin Grittner wrote:

On Wed, May 28, 2008 at 6:26 PM, in message [EMAIL PROTECTED],
Florian G. Pflug [EMAIL PROTECTED] wrote:
 

I think we should put some randomness into the decision,
to spread the I/O caused by hint-bit updates after a batch load.
 
Currently we have a policy of doing a VACUUM FREEZE ANALYZE on a table
after a bulk load, or on the entire database after loading a pg_dump
of a database.  We do this before putting the table or database into
production.  This avoids surprising clusters of writes at
unpredictable times.  Please don't defeat that.  (I'm not sure whether
your current suggestion would.)


No, VACUUM (and therefore VACUUM FREEZE) dirties all buffers it sets hint 
bits on anyway, since it also updates the xmin values. But a more 
IO-friendly approach to setting hint bits might make that VACUUM FREEZE 
step unnecessary ;-)
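
For what it's worth, the randomness idea above could be as simple as the
following sketch (hint_bit_write_chance is a made-up knob, not an existing
setting):

/*
 * When nothing but hint bits changed, mark the buffer dirty only with
 * some probability, spreading the write I/O that follows a bulk load.
 * SetBufferCommitInfoNeedsSave() is the existing call for hint-bit-only
 * changes.
 */
if (hint_bits_changed)
{
    if (random() % 100 < hint_bit_write_chance)
        SetBufferCommitInfoNeedsSave(buffer);
}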


regards, Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Andrew Dunstan



Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Fri, 2008-05-30 at 12:31 +0530, Gurjeet Singh wrote:
   On Fri, May 30, 2008 at 10:40 AM, Tom Lane [EMAIL PROTECTED] wrote:
    But since you mention it: one of the plausible answers for fixing the
    vacuum problem for read-only slaves is to have the slaves push an xmin
    back upstream to the master to prevent premature vacuuming.
   I think it would be best to not make the slave interfere with the
   master's operations; that's only going to increase the operational
   complexity of such a solution.
  We ruled that out as the-only-solution a while back. It does have the
  beauty of simplicity, so it may exist as an option or possibly the only
  way, for 8.4.
 Yeah.  The point is that it's fairly clear that we could make that work.
 A solution that doesn't impact the master at all would be nicer, but
 it's not at all clear to me that one is possible, unless we abandon
 WAL-shipping as the base technology.

Quite. Before we start ruling things out let's know what we think we can 
actually do.


I hope that NTT will release their code ASAP so we will have a better 
idea of what we have and what we need.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sugestion: xpath

2008-05-30 Thread Tom Lane
[EMAIL PROTECTED] writes:
 I think it is good idea make a function returning set of records, 
 like xpath_table, but accepting xml data in a text paramiter. 
 Today if you have a function with a xml parameter, you have to save it in 
 a table to use xpath.

I believe you're mistaken.  Why would you need to put the value into
a table?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change lock requirements for adding a trigger

2008-05-30 Thread Simon Riggs

On Thu, 2008-05-29 at 19:18 -0500, Decibel! wrote:

 Is there a reason that we can't add a trigger to a table while a  
 select is running? This is a serious pain when trying to setup  
 londiste or slony.

This is constrained by locking.

There is a subset of DDL commands that might be able to be performed
with just an ExclusiveLock or ShareLock rather than an
AccessExclusiveLock. Nobody has studied which sub-statements this might
apply to, but it's do-able, since CREATE INDEX already does this.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Fri, 2008-05-30 at 12:31 +0530, Gurjeet Singh wrote:
 On Fri, May 30, 2008 at 10:40 AM, Tom Lane [EMAIL PROTECTED] wrote:
 But since you mention it: one of the plausible answers for fixing the
 vacuum problem for read-only slaves is to have the slaves push an xmin
 back upstream to the master to prevent premature vacuuming.

 I think it would be best to not make the slave interfere with the
 master's operations; that's only going to increase the operational
 complexity of such a solution.

 We ruled that out as the-only-solution a while back. It does have the
 beauty of simplicity, so it may exist as an option or possibly the only
 way, for 8.4.

Yeah.  The point is that it's fairly clear that we could make that work.
A solution that doesn't impact the master at all would be nicer, but
it's not at all clear to me that one is possible, unless we abandon
WAL-shipping as the base technology.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Robert Hodges
Hi Tom,

Thanks for the reasoned reply.  As you saw from point #2 in my comments, I
think you should do this feature.  I hope this answers Josh Berkus' concern
about my comments.

You make a very interesting comment which seems to go to the heart of this
design approach:

 About the only thing that would make me want to consider row-based
 replication in core would be if we determine that read-only slave
 queries are impractical atop a WAL-log-shipping implementation.

It's possible I'm misunderstanding some of the implementation issues, but it
is striking that the detailed responses to your proposal list a number of
low-level dependencies between master and slave states when replicating WAL
records.  It appears that you are designing a replication mechanism that
works effectively between a master and a relatively small number of nearby
slaves.  This is clearly an important use case but it also seems clear that
the WAL approach is not a general-purpose approach to replication.  In other
words, you'll incrementally get to that limited end point I describe.  This
will still leave a lot to be desired on read scaling, not to mention many
other cases.

Hence my original comments.  However, rather than harp on that further I
will open up a separate thread to describe a relatively small set of
extensions to PostgreSQL that would be enabling for a wide range of
replication applications.  Contrary to popular opinion these extensions are
actually well understood at the theory level and have been implemented as
prototypes as well as in commercial patches multiple times in different
databases.  Those of us who are deeply involved in replication deserve just
condemnation for not stepping up and getting our thoughts out on the table.

Meanwhile, I would be interested in your reaction to these thoughts on the
scope of the real-time WAL approach.  There's obviously tremendous interest
in this feature.  A general description that goes beyond the NTT slides
would be most helpful for further discussions.

Cheers, Robert

P.s., The NTT slides were really great.  Takahiro and Masao deserve
congratulations on an absolutely first-rate presentation.

On 5/29/08 9:09 PM, Tom Lane [EMAIL PROTECTED] wrote:

 Andrew Sullivan [EMAIL PROTECTED] writes:
 On Thu, May 29, 2008 at 12:05:18PM -0700, Robert Hodges wrote:
 people are starting to get religion on this issue I would strongly
 advocate a parallel effort to put in a change-set extraction API
 that would allow construction of comprehensive master/slave
 replication.

 You know, I gave a talk in Ottawa just last week about how the last
 effort to develop a comprehensive API for replication failed.

 Indeed, core's change of heart on this issue was largely driven by
 Andrew's talk and subsequent discussion.  We had more or less been
 waiting for the various external replication projects to tell us
 what they wanted in this line, and it was only the realization that
 no such thing was likely to happen that forced us to think seriously
 about what could be done within the core project.

 As I said originally, we have no expectation that the proposed features
 will displace the existing replication projects for high end
 replication problems ... and I'd characterize all of Robert's concerns
 as high end problems.  We are happy to let those be solved outside
 the core project.

 About the only thing that would make me want to consider row-based
 replication in core would be if we determine that read-only slave
 queries are impractical atop a WAL-log-shipping implementation.
 Which could happen; in fact I think that's the main risk of the
 proposed development plan.  But I also think that the near-term
 steps of the plan are worth doing anyway, for various other reasons,
 and so we won't be out too much effort if the plan fails.

 regards, tom lane



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Merlin Moncure
On Fri, May 30, 2008 at 9:31 AM, Marko Kreen [EMAIL PROTECTED] wrote:
 On 5/30/08, Gurjeet Singh [EMAIL PROTECTED] wrote:

 I think it would be best to not make the slave interfere with the master's
 operations; that's only going to increase the operational complexity of such
 a solution.

 I disagree - it's better to consider syncronized WAL-slaves
 as equal to master, so having queries there affect master is ok.

 You need to remeber this solution tries not to replace 100-node Slony-I
 setups.  You can run sanity checks on slaves or use them to load-balance
 read-only OLTP queries, but not random stuff.

 There could be multiple slaves following a master, some serving
 data-warehousing queries, some for load-balancing reads, some others just
 for disaster recovery, and then some just to mitigate human errors by
 re-applying the logs with a delay.

 To run warehousing queries you better use Slony-I / Londiste.  For
 warehousring you want different / more indexes on tables anyway,
 so I think it's quite ok to say don't do it for complex queries
 on WAL-slaves.

 I don't think any one installation would see all of the above mentioned
 scenarios, but we need to take care of multiple slaves operating off of a
 single master; something similar to cascaded Slony-I.

 Again, the synchronized WAL replication is not generic solution.
 Use Slony/Londiste if you want to get totally independent slaves.

I strongly agree with Gurjeet.  The warm standby replication mechanism
is pretty simple and is wonderfully flexible with the one big
requirement that your clusters have to be mirrors of each other.

Synchronous WAL replication obviously needs some communication channel
from the slave back to the master. Hopefully, it will be possible to
avoid this for asynchronous shipping.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Simon Riggs

On Fri, 2008-05-30 at 11:30 -0400, Andrew Dunstan wrote:
 
 Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:

  On Fri, 2008-05-30 at 12:31 +0530, Gurjeet Singh wrote:
  
  On Fri, May 30, 2008 at 10:40 AM, Tom Lane [EMAIL PROTECTED] wrote:

  But since you mention it: one of the plausible answers for fixing the
  vacuum problem for read-only slaves is to have the slaves push an xmin
  back upstream to the master to prevent premature vacuuming.
  
  I think it would be best to not make the slave interfere with the
  master's operations; that's only going to increase the operational
  complexity of such a solution.

  We ruled that out as the-only-solution a while back. It does have the
  beauty of simplicity, so it may exist as an option or possibly the only
  way, for 8.4.
  
  Yeah.  The point is that it's fairly clear that we could make that work.
  A solution that doesn't impact the master at all would be nicer, but
  it's not at all clear to me that one is possible, unless we abandon
  WAL-shipping as the base technology.
 
 Quite. Before we start ruling things out let's know what we think we can 
 actually do.

Let me re-phrase: I'm aware of that possibility and believe we can and
could do it for 8.4. My assessment is that people won't find it
sufficient and I am looking at other alternatives also. There may be a
better one possible for 8.4, there may not. Hence I've said something
in 8.4, something better later. There is no need to decide that is the
only way forward, yet.

I hope and expect to put some of these ideas into a more concrete form,
but this has not yet happened. Nothing has slipped, not having any
trouble getting on with it, just that my plans were to not start it yet.
I think having a detailed design ready for review by September commit
fest is credible.

 I hope that NTT will release their code ASAP so we will have a better 
 idea of what we have and what we need.

That has very little to do with Hot Standby, though there could be patch
conflicts, which is why I'm aiming to get WAL streaming done first.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Simon Riggs

On Fri, 2008-05-30 at 12:31 +0530, Gurjeet Singh wrote:
 On Fri, May 30, 2008 at 10:40 AM, Tom Lane [EMAIL PROTECTED] wrote:
 But since you mention it: one of the plausible answers for fixing the
 vacuum problem for read-only slaves is to have the slaves push an xmin
 back upstream to the master to prevent premature vacuuming.  The current
 design of pg_standby is utterly incapable of handling that requirement.
 So there might be an implementation dependency there, depending on how
 we want to solve that problem.
 
 I think it would be best to not make the slave interfere with the
 master's operations; that's only going to increase the operational
 complexity of such a solution.
 
 There could be multiple slaves following a master, some serving
 data-warehousing queries, some for load-balancing reads, some others
 just for disaster recovery, and then some just to mitigate human
 errors by re-applying the logs with a delay.

Agreed.

We ruled that out as the-only-solution a while back. It does have the
beauty of simplicity, so it may exist as an option or possibly the only
way, for 8.4.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Andreas 'ads' Scherbaum
On Thu, 29 May 2008 09:22:26 -0700 Steve Atkins wrote:
 On May 29, 2008, at 9:12 AM, David Fetter wrote:
 
  Either one of these would be great, but something that involves
  machines that stay useless most of the time is just not going to work.
 
 I have customers who are thinking about warm standby functionality, and
 the only thing stopping them deploying it is complexity and maintenance,
 not the cost of the HA hardware. If trivial-to-deploy replication that  
 didn't offer read-only access of the slaves were available today I'd bet
 that most of them would be using it.

Sure, I have a similar customer. They are right now using a set of
Perl scripts which ship the logfiles to the slave, take care of the
status, apply the logfiles, validate checksums, etc. The whole thing
works very well in combination with the RedHat cluster software, but it
took several weeks to implement the current solution.

Not everyone wants to spend the time and the manpower to implement
simple replication.


Kind regards

-- 
Andreas 'ads' Scherbaum
German PostgreSQL User Group

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Andreas 'ads' Scherbaum
On Thu, 29 May 2008 18:29:01 -0400 Tom Lane wrote:

 Dimitri Fontaine [EMAIL PROTECTED] writes:
  While at it, would it be possible for the simple part of the core  
  team statement to include automatic failover?
 
 No, I think it would be a useless expenditure of energy.  Failover
 includes a lot of things that are not within our purview: switching
 IP addresses to point to the new server, some kind of STONITH solution
 to keep the original master from coming back to life, etc.  Moreover
 there are already projects/products concerned with those issues.

True words. Failover is not and should not be part of PostgreSQL.

But PG can help the failover solution; as an example, an easy-to-use
interface for checking the current slave status comes to mind. Other
ideas might also be possible.


 It might be useful to document where to find solutions to that problem,
 but we can't take it on as part of core Postgres.

Ack


Kind regards

-- 
Andreas 'ads' Scherbaum
German PostgreSQL User Group

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replication hooks

2008-05-30 Thread Robert Hodges
Hi Marko,

Replication requirements vary widely of course, but DDL support is shared by 
such a wide range of use cases that it is very difficult to see how any real 
solution could fail to include it.  This extends to change extraction APIs, 
however defined.  The question of what DDL to replicate is also quite 
clear - all of it, with as few exceptions as possible.

For instance, it is almost impossible to set up and manage replicated systems 
easily if you cannot propagate schema changes in serialized order along with 
other updates from applications.  The inconvenience of using alternative 
mechanisms like the SLONY 'execute script' is considerable and breaks most 
commonly used database management tools.

That said, SLONY at least serializes the changes.  Non-serialized approaches 
lead to serious outages and can get you into distributed consensus problems, 
such as when is it 'safe' to change schema across different instances.  These 
are very hard to solve practically and tend to run into known impossibility 
results like Brewer's Conjecture, which holds that it is impossible to keep 
distributed databases consistent while also remaining open for updates and 
handling network partitions.

I'll post back later on the question of the API.  The key is to do something 
simple that avoids the problems discussed by Andrew and ties it accurately to 
use cases.  However, this requires a more prepared response than my hastily 
written post from last night.

Cheers, Robert

On 5/29/08 9:05 PM, Marko Kreen [EMAIL PROTECTED] wrote:

On 5/29/08, Andrew Sullivan [EMAIL PROTECTED] wrote:
 On Thu, May 29, 2008 at 12:05:18PM -0700, Robert Hodges wrote:
   people are starting to get religion on this issue I would strongly
   advocate a parallel effort to put in a change-set extraction API
   that would allow construction of comprehensive master/slave
   replication.

 You know, I gave a talk in Ottawa just last week about how the last
  effort to develop a comprehensive API for replication failed.  I had
  some ideas about why, the main one of which is something like this:
  Big features with a roadmap have not historically worked, so unless
  we're willing to change the way we work, we won't get that.

  I don't think an API is what's needed.  It's clear proposals for
  particlar features that can be delivered in small pieces.  That's what
  the current proposal offers.  I think any kind of row-based approach
  such as what you're proposing would need that kind of proposal too.

  That isn't to say that I think an API is impossible or undesirable.
  It is to say that the last few times we tried, it went nowhere; and
  that I don't think the circumstances have changed.

I think the issue is simpler - API for synchronous replication is
undesirable - it would be too complex and hinder future development
(as I explained above).

And the API for asynchronous replication is already there - triggers,
txid functions for queueing.

There is this tiny matter of replicating schema changes asynchronously,
but I suspect nobody actually cares.  A few random points about that:

- The task cannot even be clearly defined (on technical level - how
  the events should be represented).
- Any schema changes need to be carefully prepared anyway.  Whether
  to apply them to one or more servers does not make much difference.
- Major plus of async replica is ability to actually have different
  schema on slaves.
- People _do_ care about the exact schema in a single place - failover servers.
- But for failover server we want also synchronous replication.

So if we have synchronous WAL based replication for failover servers,
the interest in hooks to log schema changes will decrease even more.

--
marko

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



--
Robert Hodges, CTO, Continuent, Inc.
Email:  [EMAIL PROTECTED]
Mobile:  +1-510-501-3728  Skype:  hodgesrm


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Andreas 'ads' Scherbaum
On Thu, 29 May 2008 23:02:56 -0400 Andrew Dunstan wrote:

 Well, yes, but you do know about archive_timeout, right? No need to wait 
 2 hours.

Then you ship 16 MB of binary data every 30 seconds or every minute, but
you only have a few kilobytes of real data in the logfile. This must be
taken into account, especially if you ship the logfile over the internet
(meaning: no high-speed connection, maybe even pay-per-traffic) to the
slave.


Kind regards

-- 
Andreas 'ads' Scherbaum
German PostgreSQL User Group

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Initial max_connections for initdb on FreeBSD.

2008-05-30 Thread Greg Sabino Mullane



 I don't think so. This is a particular case. So my advice is to hack
 test_config_settings() and add your custom values to trial_conns[] and
 trial_bufs[].

Thanks. I dropped those to their bare minimums (1 and 16), and still did
not have enough memory (!) It also appears that users cannot manipulate their
shmmax setting inside of a FreeBSD jail, so I'll just give up this part of the
quest.
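
(For reference, this is the kind of edit meant above - the real arrays in
test_config_settings() contain a descending series of candidate values;
the ones shown here are just the minimums I ended up trying:)

/* in src/bin/initdb/initdb.c, test_config_settings() */
static const int trial_conns[] = {
    1                   /* bare-minimum max_connections to try */
};
static const int trial_bufs[] = {
    16                  /* bare-minimum shared_buffers to try */
};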

(Shane, the PGOPTIONS stuff does not apply to this case)

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200805301018
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Greg Smith

On Fri, 30 May 2008, Andreas 'ads' Scherbaum wrote:


Then you ship 16 MB binary stuff every 30 second or every minute but
you only have some kbyte real data in the logfile.


Not if you use pg_clearxlogtail ( 
http://www.2ndquadrant.com/replication.htm ), which got lost in the giant 
March commitfest queue but should probably wander into contrib as part of 
8.4.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Gurjeet Singh
On Thu, May 29, 2008 at 7:42 PM, Tom Lane [EMAIL PROTECTED] wrote:

 The big problem
 is that long-running slave-side queries might still need tuples that are
 vacuumable on the master, and so replication of vacuuming actions would
 cause the slave's queries to deliver wrong answers.


Another issue with read-only slaves just popped up in my head.

How do we block the readers on the slave while it is replaying an ALTER
TABLE or a similar command that requires an exclusive lock and potentially
alters the table's structure? Or does WAL replay already take an x-lock on
such a table?

Best regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Simon Riggs

On Fri, 2008-05-30 at 11:12 -0700, Robert Hodges wrote:
 This is clearly an important use case but it also seems clear that
 the WAL approach is not a general-purpose approach to replication.

I think we cannot make such a statement yet, if ever.

I would note that log-based replication is now the mainstay of
commercial database replication techniques for loosely-coupled groups of
servers. It would seem strange to assume that it should not be good for
us too, simply because we know it to be difficult.

IMHO the project has a pretty good track record of delivering
functionality that looked hard at first glance.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Shouldn't Natural JOINs Follow FK Constraints?

2008-05-30 Thread David E. Wheeler

Howdy,

I just saw this in the docs:

Finally, NATURAL is a shorthand form of USING: it forms a USING list  
consisting of exactly those column names that appear in both input  
tables. As with USING, these columns appear only once in the output  
table.


That sounds useful if I happen to have named my columns exactly the  
same between the two tables, but couldn't a NATURAL JOIN follow the FK  
constraints, instead? That would be so much more useful and much less  
magical, I should think.


Thanks,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Robert Treat
On Thursday 29 May 2008 22:59:21 Merlin Moncure wrote:
 On Thu, May 29, 2008 at 9:26 PM, Josh Berkus [EMAIL PROTECTED] wrote:
  I fully accept that it may be the case that it doesn't make technical
  sense to tackle them in any order besides sync-read-only slaves because
  of dependencies in the implementation between the two.  If that's the
  case, it would be nice to explicitly spell out what that was to deflect
  criticism of the planned prioritization.
 
  There's a very simple reason to prioritize the synchronous log shipping
  first; NTT may open source their solution and we'll get it a lot sooner
  than the other components.

 That's a good argument.  I just read the NTT document and the stuff
 looks fantastic.  You've convinced me...

It would be a better argument if the NTT guys hadn't said that they estimated 
6 months' time before the code would be released, which puts us beyond 8.4. 
Now it is possible that the time frame could be sooner, but unless someone 
already has the patch, this reminds me a little too much of the arguments for 
including Windows support in a single release because we already had a 
working port/patch set to go from. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Andrew Dunstan



Andreas 'ads' Scherbaum wrote:
 On Thu, 29 May 2008 23:02:56 -0400 Andrew Dunstan wrote:
  Well, yes, but you do know about archive_timeout, right? No need to wait
  2 hours.
 Then you ship 16 MB binary stuff every 30 second or every minute but
 you only have some kbyte real data in the logfile. This must be taken
 into account, especially if you ship the logfile over the internet
 (means: no high-speed connection, maybe even pay-per-traffic) to the
 slave.


Sure there's a price to pay. But that doesn't mean the facility doesn't 
exist. And I rather suspect that most of Josh's customers aren't too 
concerned about traffic charges or affected by such bandwidth 
restrictions. Certainly, none of my clients are, and they aren't in the 
giant class. Shipping a 16MB file, particularly if compressed, every 
minute or so, is not such a huge problem for a great many commercial 
users, and even many domestic users.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Robert Treat
On Thursday 29 May 2008 20:31:31 Greg Smith wrote:
 On Thu, 29 May 2008, Tom Lane wrote:
  There's no point in having read-only slave queries if you don't have a
  trustworthy method of getting the data to them.

 This is a key statement that highlights the difference in how you're
 thinking about this compared to some other people here.  As far as some
 are concerned, the already working log shipping *is* a trustworthy method
 of getting data to the read-only slaves.  There are plenty of applications
 (web oriented ones in particular) where if you could direct read-only
 queries against a slave, the resulting combination would be a giant
 improvement over the status quo even if that slave was as much as
 archive_timeout behind the master.  That quantity of lag is perfectly fine
 for a lot of the same apps that have read scalability issues.

 If you're someone who falls into that camp, the idea of putting the sync
 replication job before the read-only slave one seems really backwards.


Just looking at it from an overall market perspective, synchronous log 
shipping pretty much only addresses failover needs, whereas read-only slaves 
address both failover and scaling issues. (Note I say address, not solve.) 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Robert Treat
On Friday 30 May 2008 01:10:20 Tom Lane wrote:
 Greg Smith [EMAIL PROTECTED] writes:
  I fully accept that it may be the case that it doesn't make technical
  sense to tackle them in any order besides sync-read-only slaves because
  of dependencies in the implementation between the two.

 Well, it's certainly not been my intention to suggest that no one should
 start work on read-only-slaves before we finish the other part.  The
 point is that I expect the log shipping issues will be done first
 because they're easier, and it would be pointless to not release that
 feature if we had it.

 But since you mention it: one of the plausible answers for fixing the
 vacuum problem for read-only slaves is to have the slaves push an xmin
 back upstream to the master to prevent premature vacuuming.  The current
 design of pg_standby is utterly incapable of handling that requirement.
 So there might be an implementation dependency there, depending on how
 we want to solve that problem.


Sure, but who's to say that after synchronous WAL shipping is finished it 
won't need a serious re-write due to new needs from the hot standby feature? 
I think going either way carries some risk. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] move hash_any from hash to utils

2008-05-30 Thread Zdenek Kotala
I'm trying to clean up header dependencies, and it seems the hash_any and
hash_uint32 functions are candidates to move into a separate C file in the
backend/utils/misc directory. It would significantly reduce the amount of
unnecessary includes in files like network.c
( http://doxygen.postgresql.org/network_8c.html ).
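
Something like this, as a sketch (the header name is only a suggestion;
the prototypes are the existing ones from access/hash.h):

/*
 * Proposed slim header, e.g. src/include/utils/hashfn.h, exposing only
 * the generic hash functions so callers such as network.c no longer need
 * to pull in all of access/hash.h.  (postgres.h is assumed to be included
 * first, as usual for backend code.)
 */
#ifndef HASHFN_H
#define HASHFN_H

extern Datum hash_any(register const unsigned char *k, register int keylen);
extern Datum hash_uint32(uint32 k);

#endif   /* HASHFN_H */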


Any comments?


Zdenek

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Shouldn't Natural JOINs Follow FK Constraints?

2008-05-30 Thread Jeff Davis
On Fri, 2008-05-30 at 13:19 -0700, David E. Wheeler wrote:
 Howdy,
 
 I just saw this in the docs:
 
  Finally, NATURAL is a shorthand form of USING: it forms a USING list  
  consisting of exactly those column names that appear in both input  
  tables. As with USING, these columns appear only once in the output  
  table.
 
 That sounds useful if I happen to have named my columns exactly the  
 same between the two tables, but couldn't a NATURAL JOIN follow the FK  
 constraints, instead? That would be so much more useful and much less  
 magical, I should think.

A good point, but I'm sure the SQL standard has something to say about
this, so I don't think we have much of a choice.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Gurjeet Singh
On Sat, May 31, 2008 at 1:52 AM, Greg Smith [EMAIL PROTECTED] wrote:

 On Fri, 30 May 2008, Andreas 'ads' Scherbaum wrote:

  Then you ship 16 MB binary stuff every 30 second or every minute but
 you only have some kbyte real data in the logfile.


 Not if you use pg_clearxlogtail (
 http://www.2ndquadrant.com/replication.htm ), which got lost in the giant
 March commitfest queue but should probably wander into contrib as part of
 8.4.


This means we need to modify pg_standby to not check for filesize when
reading XLogs.

Best regards,

-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Tatsuo Ishii
 Andreas 'ads' Scherbaum wrote:
  On Thu, 29 May 2008 23:02:56 -0400 Andrew Dunstan wrote:
 

  Well, yes, but you do know about archive_timeout, right? No need to wait 
  2 hours.
  
 
  Then you ship 16 MB binary stuff every 30 second or every minute but
  you only have some kbyte real data in the logfile. This must be taken
  into account, especially if you ship the logfile over the internet
  (means: no high-speed connection, maybe even pay-per-traffic) to the
  slave.
 
 
 

 
 Sure there's a price to pay. But that doesn't mean the facility doesn't 
 exist. And I rather suspect that most of Josh's customers aren't too 
 concerned about traffic charges or affected by such bandwidth 
 restrictions. Certainly, none of my clients are, and they aren't in the 
 giant class. Shipping a 16Mb file, particularly if compressed, every 
 minute or so, is not such a huge problem for a great many commercial 
 users, and even many domestic users.

Sumitomo Electric Co., Ltd., a company in Japan with 20 billion dollars in
sales (the parent company of Sumitomo Electric Information Systems Co.,
Ltd., which is one of the companies supporting the recursive SQL
development), uses 100 PostgreSQL servers. They are doing backups by using
log shipping to another data center and have problems with the amount of
log data being transferred. They said this is one of the big problems
they have with PostgreSQL and hope it will be solved in the near
future.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Overhauling GUCS

2008-05-30 Thread Josh Berkus
Magnus and I had the fortune to share a plane flight when leaving pgCon. 
 This gave us a chance to discuss my ideas around reforming PostgreSQL 
configuration at some length.  Here's what we came up with (Magnus will 
presumably correct the parts I got wrong):


Currently, postgresql.conf and our set of GUC configurations suffer from 
4 large problems:


1. Most people have no idea how to set these.
2. The current postgresql.conf file is a huge mess of 194 options, the 
vast majority of which most users will never touch.
3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf, 
and the settings.sgml), which are only synched with each other manually.

4. We don't seem to be getting any closer to autotuning.

We think all of these are solvable for 8.4.   We also think that it's 
vitally important that any overhaul of the GUCS be completed in one 
version, to minimize the pain involved.


Here's a list of the things we want to change.  It's all a package and 
should make sense if you take all the changes as a whole.


1) Add several pieces of extra information to guc.c in the form of extra 
gettext commands: default value, subcategory, long description, 
recommendations, enum lists (a rough sketch of what this could look like 
follows the list below).

2) Incorporate this data into pg_settings
3) Delete postgresql.conf.sample
4) Add a script called pg_generate_conf to generate a postgresql.conf 
based on guc.c and command-line switches (rather than 
postgresql.conf.sample), which would be called automatically by initdb.
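
As a rough sketch of point (1), a guc.c entry could carry the extra
metadata roughly like this (the added gettext fields for subcategory and
recommendation are invented for illustration; they do not exist in guc.c
today):

{
    {"shared_buffers", PGC_POSTMASTER, RESOURCES_MEM,
        gettext_noop("Sets the number of shared memory buffers used by the server."),
        gettext_noop("Subcategory: Memory"),
        gettext_noop("Recommendation: start at roughly 25% of RAM on a dedicated host.")
    },
    &NBuffers,
    1024, 16, INT_MAX / 2, NULL, NULL
},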


For (4), pg_generate_conf would take the following switches:
-b , --basic = short conf file, listing only the 15-18 most commonly 
changed options

-a , --advanced = conf file listing all 196+ options
-t, --terse = conf file lists only category headings and actual 
settings, no comments
-n, --normal = conf file has category and subcategory settings, with 
short desc comments
-v, --verbose = conf file lists full descriptions and recommendations in 
comments with each option

-c option = value set specific option to specific value in the file
-f filename  = take options and values from file filename

The default would be -b, -n with specific settings for shared_buffers 
and wal_sync_method.   The current postgresql.conf is a lot more like an 
-a, -v file.


This would help us in the following ways:

A. there would now only be 2 places to maintain GUCS lists, the docs and 
guc.c.
B. by having a generated postgresql.conf and an easy way to generate it, 
writing autoconfiguration scripts (as well as shortcuts like SET 
PERSISTENT) becomes vastly easier.
C. Most users would deal only with a limited set of 15-20 configuration 
variables.


There's obviously some refinements needed, but what do people think of 
the above general idea?


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Joshua D. Drake


On Sat, 2008-05-31 at 02:48 +0530, Gurjeet Singh wrote:

 
 Not if you use pg_clearxlogtail
 ( http://www.2ndquadrant.com/replication.htm ), which got lost
 in the giant March commitfest queue but should probably wander
 into contrib as part of 8.4.
 
 This means we need to modify pg_standby to not check for filesize when
 reading XLogs.
 
 Best regards,
 

It does.

Joshua D. Drake

 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Greg Smith

On Sat, 31 May 2008, Gurjeet Singh wrote:


Not if you use pg_clearxlogtail


This means we need to modify pg_standby to not check for filesize when
reading XLogs.


No, the idea is that you run the segments through pg_clearxlogtail | gzip, 
which then compresses lightly used segments massively, because all the 
unused bytes are 0.  The file comes out the same size on the other side, 
but you didn't ship a full 16MB if only a few KB were used.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Simon Riggs

On Fri, 2008-05-30 at 01:10 -0400, Tom Lane wrote:
 Greg Smith [EMAIL PROTECTED] writes:
  I fully accept that it may be the case that it doesn't make technical 
  sense to tackle them in any order besides sync-read-only slaves because 
  of dependencies in the implementation between the two.
 
 Well, it's certainly not been my intention to suggest that no one should
 start work on read-only-slaves before we finish the other part.  The
 point is that I expect the log shipping issues will be done first
 because they're easier, and it would be pointless to not release that
 feature if we had it.

Agreed.

I'm arriving late to a thread that seems to have grown out of all
proportion.

AFAICS streaming WAL and hot standby are completely orthogonal features.
Streaming WAL is easier and if NTT can release their code to open source
we may get this in the Sept commit fest. Hot Standby is harder and it
was my viewpoint at PGCon that we may not have a perfect working version
of this by the end of 8.4. We are very likely to have something working,
but maybe not the whole feature set as we might wish to have. I expect
to be actively working on this soon. I definitely do want to see WAL
streaming going in as early as possible and before end of 8.4, otherwise
code conflicts and other difficulties are likely to push out the 8.4
date and/or Hot Standby.

So as I see it, Tom has only passed on my comments on this, not added or
removed anything. The main part of the announcement was really about
bringing the WAL streaming into core and effectively favouring it over a
range of other projects.

Can we all back off a little on this for now? Various concerns have been
validly expressed, but it will all come good AFAICS.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-30 Thread Simon Riggs

On Fri, 2008-05-30 at 14:50 +0530, Pavan Deolasee wrote:
 On Fri, May 30, 2008 at 2:41 PM, Simon Riggs [EMAIL PROTECTED] wrote:
 
   What I still
  don't accept is that an unconstrained wait is justifiable. You've just
  said its a minor detail, but that's not the way I see it. It might be a
  second, but it might be an hour or more.
 
 
 I am suggesting a timed wait. May be say between 60-300 seconds.
 That's the maximum VACUUM would get delayed. If exiting transactions
 don't finish within that time, VACUUM just works as it does today. So
 it can't certainly be much worse than what it is today.
 
  A non-waiting solution seems like the only way to proceed.
 

I understand what you're saying and agree that in (some) cases a small
wait is not important. I'm just saying some != all, and the gap covers
important cases:

If we have a database with 100 tables (very common) and we add 5 minutes
waiting time to each vacuum, then we'll make a complete database VACUUM
take ~7 hours longer than it did before. 1000 tables would cause rioting
in the streets.

Waiting for 5 minutes for a 0.5 second vacuum isn't sensible either,
whatever the gain. It's clear Amdahl's Law would not advise us to
optimise that (in this way).

So if it's a large table and we submitted it with a non-zero vacuum wait,
then maybe a wait is an acceptable optimisation.

Perhaps we can start the first scan and check the xid after every few
blocks. Once we find the xid is older, we know the size of the
second scan can be limited to only those blocks already scanned. So the
two endpoints of behaviour are that we skip the second scan completely or
we do the whole scan, but at least there is a saving in many cases without
waiting.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Dimitri Fontaine
Le vendredi 30 mai 2008, Tom Lane a écrit :
 No, I think it would be a useless expenditure of energy.  Failover
 includes a lot of things that are not within our purview: switching
 IP addresses to point to the new server, some kind of STONITH solution
 to keep the original master from coming back to life, etc.  Moreover
 there are already projects/products concerned with those issues.

Well, I forgot that there's in fact no active plan to put pgbouncer features 
into core at the moment (I'm sure I've read something about this on the lists 
though). If that were the case, the slave could proxy queries to the master, and 
stop proxying and serve them itself if the master tells it it's dying.
This way, there's no need to switch IP addresses; the clients just connect as usual 
and get results back, and do not have to know whether the host they're querying 
against is a slave or a master. That level of smartness would then be in -core.

The STONITH part in case of a known (fatal) failure does not seem that hard 
either, as the master at fatal time could write down somewhere that it's now a 
slave and use this at the next startup (recovery.conf?). If it can't even do that, 
well, we're back to a crash situation with no automatic failover provided. 
Unhandled failure cases will obviously continue to exist.

I'm not asking for all cases to be managed in -core, just for some level 
of effort on the topic. Of course, I'm just the one asking questions and 
trying to raise ideas, so I'm perfectly fine with your current answer 
(useless expenditure of energy) even if I somewhat disagree with the useless 
part of it :)

As for the integrated pgbouncer daemon part, I'm thinking this would allow the 
infamous part 3 of the proposal (read-only slave) to become pretty simple to 
set up when ready: the slave knows who its master is, and as soon as an XID is 
needed the transaction queries are forwarded/proxied to it. Thanks again 
Florian !

 It might be useful to document where to find solutions to that problem,
 but we can't take it on as part of core Postgres.

Even the part where it makes sense (provided it does and I'm not completely off 
track here)?

Regards,
-- 
dim




Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Andreas 'ads' Scherbaum
On Fri, 30 May 2008 16:22:41 -0400 (EDT) Greg Smith wrote:

 On Fri, 30 May 2008, Andreas 'ads' Scherbaum wrote:
 
  Then you ship 16 MB binary stuff every 30 second or every minute but
  you only have some kbyte real data in the logfile.
 
 Not if you use pg_clearxlogtail ( 
 http://www.2ndquadrant.com/replication.htm ), which got lost in the giant 
 March commitfest queue but should probably wander into contrib as part of 
 8.4.

Yes, this topic was discussed several times in the past, but to
solve it we need a patch/solution that is integrated into PG
itself, not contrib.


Kind regards

-- 
Andreas 'ads' Scherbaum
German PostgreSQL User Group

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] move hash_any from hash to utils

2008-05-30 Thread Alvaro Herrera
Zdenek Kotala wrote:
 I'm trying to clean up header dependencies, and it seems the hash_any and  
 hash_uint32 functions are candidates to move into a separate C file in the 
 backend/utils/misc directory. It would significantly reduce the amount of 
 unnecessary includes from files like network.c ( 
 http://doxygen.postgresql.org/network_8c.html)

   Any comments?

I considered that too, so +1.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Simon Riggs
On Thu, 2008-05-29 at 10:12 -0400, Tom Lane wrote:
 The Postgres core team met at PGCon to discuss a few issues, the largest
 of which is the need for simple, built-in replication for PostgreSQL.
 Historically the project policy has been to avoid putting replication
 into core PostgreSQL, so as to leave room for development of competing
 solutions, recognizing that there is no one size fits all replication
 solution.  However, it is becoming clear that this policy is hindering
 acceptance of PostgreSQL to too great an extent, compared to the benefit
 it offers to the add-on replication projects.  Users who might consider
 PostgreSQL are choosing other database systems because our existing
 replication options are too complex to install and use for simple cases.
 In practice, simple asynchronous single-master-multiple-slave
 replication covers a respectable fraction of use cases, so we have
 concluded that we should allow such a feature to be included in the core
 project.  We emphasize that this is not meant to prevent continued
 development of add-on replication projects that cover more complex use
 cases.
 
 We believe that the most appropriate base technology for this is
 probably real-time WAL log shipping, as was demoed by NTT OSS at PGCon.
 We hope that such a feature can be completed for 8.4.  Ideally this
 would be coupled with the ability to execute read-only queries on the
 slave servers, but we see technical difficulties that might prevent that
 from being completed before 8.5 or even further out.  (The big problem
 is that long-running slave-side queries might still need tuples that are
 vacuumable on the master, and so replication of vacuuming actions would
 cause the slave's queries to deliver wrong answers.)
 
 Again, this will not replace Slony, pgPool, Continuent, Londiste, or
 other systems for many users, as it will be not be highly scalable nor
 support long-distance replication nor replicating less than an entire
 installation.  But it is time to include a simple, reliable basic
 replication feature in the core system.

I'm in full support of this and commend the work of the NTT team.

The goals and timescales are realistic, and setting a timetable in this
way will help planning for many users.

I'm expecting to lead the charge on the Hot Standby project. The problem
mentioned is just one of the issues, though overall I'm now optimistic
about our eventual success in that area. I'm discussing this now with a
couple of sponsors and would welcome serious financial commitments to
this goal. Please contact me off-list if you agree also.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-30 Thread Simon Riggs

On Thu, 2008-05-29 at 09:57 +0530, Pavan Deolasee wrote:
 On Thu, May 29, 2008 at 2:02 AM, Simon Riggs [EMAIL PROTECTED] wrote:
 
 
  I'm not happy that the VACUUM waits. It might wait a very long time and
  cause worse overall performance than the impact of the second scan.
 
 
  Let's not get too paranoid about the wait. It's a minor detail in the
  whole theory. I would suggest that the benefit of avoiding the second scan
  would be huge. Remember, it's not just a scan, it also dirties those
  blocks again, forcing them to be written to disk. Also, if you really have a
  situation where vacuum needs to wait for very long, then you are
  already in trouble. The long running transactions would prevent
  vacuuming many tuples.

Been thinking some more about this. You're right that the second scan
could re-dirty many pages and is probably something to avoid. The main
issue I see is that you don't really know how much work will happen in
the first phase and how much would happen in the second. Avoiding the
second scan might be worth waiting for; it might not. You really have no
idea how many tuples the LRT might affect. There is no typical here,
it can vary from one extreme to another. Waiting could be very bad, or
no problem at all. I'd rather keep it as it is than have sometimes
better, sometimes worse behaviour.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Josh Berkus
Andrew,

 Sure there's a price to pay. But that doesn't mean the facility doesn't
 exist. And I rather suspect that most of Josh's customers aren't too
 concerned about traffic charges or affected by such bandwidth
 restrictions. Certainly, none of my clients are, and they aren't in the
 giant class. Shipping a 16Mb file, particularly if compressed, every
 minute or so, is not such a huge problem for a great many commercial
 users, and even many domestic users.

The issue is that when you're talking about telecommunications companies 
(and similar), once a minute isn't adequate.  Those folks want shipping at 
least every second or, even better, synchronous replication.

Anyway, this is a pretty pointless discussion given that we want both 
capabilities, and stuff will get implemented in the order it makes 
technical sense.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding second heap scan in VACUUM

2008-05-30 Thread Simon Riggs

On Fri, 2008-05-30 at 14:22 +0530, Pavan Deolasee wrote:

 For large tables, two heap scans along with several additional page
 writes don't seem to be a cost we can afford, especially in an IO-bound
 application. IMHO a timed wait is not such a bad thing. Note that it's
 all about VACUUM, which is a background maintenance activity, and it
 won't hurt to delay it by a few seconds or even minutes. Also, as I said
 earlier, waiting is a minor detail; maybe there is a better way to
 do things.
 
 Unless there are some strong objections, I would like to give it a
 shot and see if there are any real benefits. We can then fix any
 regression cases. Let me know if somebody thinks there are certain
 show stoppers or that the benefits of avoiding a second scan on a large
 table are not worth it. I personally have a strong feeling that it's worth
 the effort.

I'm not really clear what you intend to do now.

We both agreed that avoiding a second pass is a good thing. What I still
don't accept is that an unconstrained wait is justifiable. You've just
said it's a minor detail, but that's not the way I see it. It might be a
second, but it might be an hour or more. If I run a VACUUM at 0700,
thinking it will finish by 0900 before my database gets busy, it is a
very bad thing to find that it only started at 0900 and is now
interfering with my business workload.

A non-waiting solution seems like the only way to proceed.

Is this a non-issue anyway, with DSM?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] intercepting WAL writes

2008-05-30 Thread Simon Riggs

On Wed, 2008-05-28 at 21:47 -0400, Mike wrote:
 On Wed, May 28, 2008 at 8:30 PM, Mike [EMAIL PROTECTED] wrote:
  When you say a bit of decoding, is that because the data written to the
 logs
  is after the query parser/planner? Or because it's written in several
  chunks? Or?
 
 Because that's the actual recovery record.  There is no SQL text, just
 the WAL record type (XLOG_HEAP_INSERT, XLOG_HEAP_UPDATE,
 XLOG_XACT_COMMIT, ...) and the data as it relates to that operation.
 
 Oh- right- that makes sense.

Hmm, yes and no.

WAL doesn't carry enough information to reconstruct updates or deletes
external to the database in question. That's the barrier you need to
cross, not just piping the results somewhere.

I envision being able to do this in the future and exposing an API to
allow it to happen, but we aren't there yet.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Fwd: Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Csaba Nagy
[Looks like this mail missed the hackers list on reply-to-all, I wonder
how that could happen... so I'm forwarding it]

On Thu, 2008-05-29 at 17:00 +0100, Dave Page wrote:
 Yes, we're talking real-time streaming (synchronous) log shipping.

Is there already a design for how this would be implemented?

Some time ago I was interested in this subject and thought about having
a normal postgres connection where the slave would issue a query to a
special view which would simply stream WAL records as bytea from a
requested point, without ever finishing. This would have the advantage
(against a custom socket streaming solution) that it can reuse the
complete infrastructure of connection making/managing/security (think
SSL for no sniffing) of the postgres server. It would also be a public
interface, which could be used by other possible tools too (think PITR
management application/WAL stream repository). Another advantage would
be that a PITR solution could be serving as a source for the WAL stream
too, so the slave could either get the real time stream from the master,
or rebuild a PITR state from a WAL repository server, using the same
interface...
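
Just to illustrate the shape of that idea (everything below is hypothetical; 
the function name and columns are invented for this sketch):

  -- A set-returning function streaming WAL as bytea from a requested
  -- location over an ordinary (possibly SSL) libpq connection; the
  -- result set simply never ends while the master keeps producing WAL.
  SELECT wal_location, wal_data
    FROM pg_stream_wal('0/1A2B3C40');

The point is only that the slave would use nothing but the normal protocol 
and authentication machinery to pull the stream.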

Probably some kind of WAL subscription management should also be
implemented, so that the slave can signal the master which WAL records
it has already applied and which can be recycled on the master, and it
would be nice if there could be multiple subscribers at the same time.
Some subscriber time-out could also be implemented, marking the
subscription as timed out, so that the slave knows it has to
rebuild itself...
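
Again purely as an invented sketch of that subscription idea (no such 
functions exist), the acknowledgement could ride on the same SQL interface:

  -- Hypothetical: the slave reports the last WAL location it has safely
  -- applied, so the master knows what it may recycle for this subscriber.
  SELECT pg_wal_subscription_ack('slave1', '0/1A2B3C40');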

Cheers,
Csaba.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Andreas 'ads' Scherbaum
On Fri, 30 May 2008 17:05:57 -0400 Andrew Dunstan wrote:
 Andreas 'ads' Scherbaum wrote:
  On Thu, 29 May 2008 23:02:56 -0400 Andrew Dunstan wrote:
 
  Well, yes, but you do know about archive_timeout, right? No need to wait 
  2 hours.
 
  Then you ship 16 MB binary stuff every 30 second or every minute but
  you only have some kbyte real data in the logfile. This must be taken
  into account, especially if you ship the logfile over the internet
  (means: no high-speed connection, maybe even pay-per-traffic) to the
  slave.
 
 Sure there's a price to pay. But that doesn't mean the facility doesn't 
 exist. And I rather suspect that most of Josh's customers aren't too 
 concerned about traffic charges or affected by such bandwidth 
 restrictions. Certainly, none of my clients are, and they aren't in the 
 giant class. Shipping a 16Mb file, particularly if compressed, every 
 minute or so, is not such a huge problem for a great many commercial 
 users, and even many domestic users.

The real problem is not the 16 MB; the problem is that you can't compress
this file. If the logfile is rotated it still contains all the
old binary data, which is not a good starting point for compression.

So you may have only a few kB of changes in the WAL logfile every minute, but
you still copy 16 MB of data. Sure, it's not so much - but if you rotate a
logfile every minute this still transfers 16 MB * 60 * 24 = ~23 GB a day.


Kind regards

-- 
Andreas 'ads' Scherbaum
German PostgreSQL User Group

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Andrew Dunstan



Tatsuo Ishii wrote:

 Andreas 'ads' Scherbaum wrote:
  On Thu, 29 May 2008 23:02:56 -0400 Andrew Dunstan wrote:

   Well, yes, but you do know about archive_timeout, right? No need to wait 
   2 hours.

  Then you ship 16 MB binary stuff every 30 second or every minute but
  you only have some kbyte real data in the logfile. This must be taken
  into account, especially if you ship the logfile over the internet
  (means: no high-speed connection, maybe even pay-per-traffic) to the
  slave.

 Sure there's a price to pay. But that doesn't mean the facility doesn't 
 exist. And I rather suspect that most of Josh's customers aren't too 
 concerned about traffic charges or affected by such bandwidth 
 restrictions. Certainly, none of my clients are, and they aren't in the 
 giant class. Shipping a 16Mb file, particularly if compressed, every 
 minute or so, is not such a huge problem for a great many commercial 
 users, and even many domestic users.

Sumitomo Electric Co., Ltd., a company in Japan with 20 billion dollars in
sales (parent company of Sumitomo Electric Information Systems Co., Ltd.,
which is one of the companies supporting the Recursive SQL development),
uses 100 PostgreSQL servers. They are doing backups by using log
shipping to another data center and have problems with the amount of
transferred log data. They said this is one of the big problems
they have with PostgreSQL and hope it will be solved in the near
future.

Excellent data point. Now, what I'd like to know is whether they are 
getting into trouble simply because of the volume of log data generated 
or because they have a short archive_timeout set. If it's the former 
(which seems more likely) then none of the ideas I have seen so far in 
this discussion seem likely to help, and that would indeed be a major 
issue we should look at. Another question is this: are they being 
overwhelmed by the amount of network traffic generated, or by the difficulty 
of getting postgres producers and/or consumers to keep up? If it's network 
traffic, then perhaps compression would help us.


Maybe we need to set some goals for the level of log volumes we expect 
to be able to create/send/consume.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-30 Thread Gurjeet Singh
On Sat, May 31, 2008 at 3:41 AM, Greg Smith [EMAIL PROTECTED] wrote:

 On Sat, 31 May 2008, Gurjeet Singh wrote:

  Not if you use pg_clearxlogtail


 This means we need to modify pg_standby to not check for filesize when
 reading XLogs.


 No, the idea is that you run the segments through pg_clearxlogtail | gzip,
 which then compresses lightly used segments massively because all the unused
 bytes are 0.  File comes out the same size at the other side, but you didn't
 ship a full 16MB if there was only a few KB used.


Got it. I remember reading about pg_clearxlogtail in these mailing lists;
but somehow forgot how it actually worked!

-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] Overhauling GUCS

2008-05-30 Thread Greg Smith

On Fri, 30 May 2008, Josh Berkus wrote:

1) Add several pieces of extra information to guc.c in the form of extra 
gettext commands:  default value, subcategory, long description, 
recommendations, enum lists.

2) Incorporate this data into pg_settings


When you last brought this up in February (I started on a long reply to 
http://archives.postgresql.org/pgsql-hackers/2008-02/msg00759.php that I 
never quite finished) the thing I got stuck on was how to deal with the 
way people tend to comment in these files as they change things.


One problem I didn't really see addressed by the improvements you're 
suggesting is how to handle migrating customized settings to a new version 
(I'm talking about 8.4 -> 9.0 after this is in place; 8.3 -> 8.4 is a whole 
different problem).  It would be nice to preserve the history of what people 
did, like in your examples (which look exactly like what I find myself 
running into in the field).  Now, that will get a lot easier just by 
virtue of having a smaller config file, but I think that adding something 
into pg_settings that allows saving user-added commentary would be a nice 
step toward some useful standardization on that side of things.  It would 
make future automated tools aimed at parsing and generating new files, as 
part of things like version upgrades, a lot easier if there was a standard 
way such comments were handled in addition to the raw data itself.


The other thing I'd like to see make its way into pg_settings, so that 
tools can operate on it just by querying the database, is noting what file 
the setting came from so that you can track things like include file 
usage.  I think with those two additions (comments and source file 
tracking) it would even be conceivable to clone a working facsimile of 
even a complicated postgresql.conf file set remotely just by reading 
pg_settings.
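
For example, something along these lines (the sourcefile and user_comment 
columns are invented here; neither exists in today's pg_settings):

  -- Hypothetical extra columns: enough information to regenerate a
  -- customised postgresql.conf, comments included, over a remote connection.
  SELECT name, setting, sourcefile, user_comment
    FROM pg_settings
   WHERE source = 'configuration file'
   ORDER BY sourcefile, name;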


While a bit outside of the part you're specifically aiming to improve 
here, if you could slip these two additions in I think it would be a boon 
to future writers of multi-server management tools as well.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers