Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-24 Thread Markus Schaber
Hi, Simon,

Simon Riggs wrote:

 1. Provide a filter that can be easily used by archive_command to remove
 full page writes from WAL files. This would require us to disable the
 file size test when we begin recovery on a new WAL files, plus would
 need to redesign initial location of the checkpoint record since we
 could no longer rely on the XLogRecPtr being a byte offset within the
 file.

pg_WAL_filter could correct the XLogRecPtr and file sizes during the
filter run.

HTH,
Markus

-- 
Markus Schaber | Logical TrackingTracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-22 Thread Simon Riggs
On Sat, 2006-10-21 at 19:24 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  But they can be re-created anew with the same name each time? Or I guess
  not, but you redefine a view every 30 minutes to point to the latest
  one?
 
  If so, then I have a patch that will speed up COPY when in the same
  transaction as the table that created it. I've finally fixed a bug in my
  earlier prototypes that seems to make that work now, in all cases.
 
 Can you make the patch cover the case of
 
   begin;
   truncate foo;
   copy foo from ...
   commit;
 
 It might be infeasible to detect this case, but if it's not ...

I think it is possible to detect this case without making catalog
entries, so I'll give this a try. Methinks that the truncate *must* be
the immediately preceding command, otherwise we might have a trigger
executing to put rows back into the table before we COPY.

I should also be able to get that to work with Insert Select without
much bother too.

We might also add this capability to COPY itself by providing a WITH
TRUNCATE option, which would be even cleaner code-wise. We can add that
to pg_dump so it will work with/without --single-transaction mode. That
will change the permissions reqd slightly, but seems OK.

I'll go for all of the above changes unless there are objections.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-22 Thread Alvaro Herrera
Simon Riggs wrote:
 On Sat, 2006-10-21 at 19:24 -0400, Tom Lane wrote:

  Can you make the patch cover the case of
  
  begin;
  truncate foo;
  copy foo from ...
  commit;
  
  It might be infeasible to detect this case, but if it's not ...
 
 I think it is possible to detect this case without making catalog
 entries, so I'll give this a try. Methinks that the truncate *must* be
 the immediately preceding command, otherwise we might have a trigger
 executing to put rows back into the table before we COPY.

Hmm, is it possible to save the is empty info somewhere in local
memory, perhaps the relcache (not necessarily propagated), and have
heap_insert turn it off?

That would help when you do things like

begin;
truncate foo, bar;
copy foo from ...
copy bar from ...
commit;

On the other hand, what happens if you do

begin;
truncate foo;

-- another session
copy foo from ...

-- original session
copy foo from ...
commit;


How do you detect that the table is no longer empty?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-22 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Simon Riggs wrote:
 I think it is possible to detect this case without making catalog
 entries, so I'll give this a try. Methinks that the truncate *must* be
 the immediately preceding command, otherwise we might have a trigger
 executing to put rows back into the table before we COPY.

 Hmm, is it possible to save the is empty info somewhere in local
 memory, perhaps the relcache (not necessarily propagated), and have
 heap_insert turn it off?

The relcache isn't a very safe place to store state --- it's a cache,
not stable storage.

However, I don't understand why Simon is on about empty.  ISTM the
important state is new relfilenode assigned in this transaction.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-22 Thread Simon Riggs
On Sun, 2006-10-22 at 12:12 -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Simon Riggs wrote:
  I think it is possible to detect this case without making catalog
  entries, so I'll give this a try. Methinks that the truncate *must* be
  the immediately preceding command, otherwise we might have a trigger
  executing to put rows back into the table before we COPY.
 
  Hmm, is it possible to save the is empty info somewhere in local
  memory, perhaps the relcache (not necessarily propagated), and have
  heap_insert turn it off?
 
 The relcache isn't a very safe place to store state --- it's a cache,
 not stable storage.

I was imagining adding this onto the Relation struct, just as we do with
rd_createSubid and rd_targblock. That isn't a safe place to store that
state so we can't do this across multiple backends. We wouldn't want
that anyway since otherwise various statements would need to access
shared state before they can act, which is not good.

 However, I don't understand why Simon is on about empty.  ISTM the
 important state is new relfilenode assigned in this transaction.

Thank you for supplying clarity of thought; empty would be important if
we were taking full table locks, which we don't want to do (Christmas
Past...). So we need not have the DML immediately following a truncate,
only that a truncate has previously occurred within the top level
transaction or an unaborted subtransaction.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-21 Thread Simon Riggs
On Fri, 2006-10-20 at 17:04 -0400, Theo Schlossnagle wrote:
 On Oct 20, 2006, at 4:24 PM, Simon Riggs wrote:

  Is it possible to create tables in fashion that will not write info
  to the WAL log -- knowingly and intentionally making them
  unrecoverable?  This is very desirable for us.  We snapshot tables
  from a production environment.  If the database goes down and we
  recover, the old snapshots are out of date anyway and serve no useful
  purpose.  The periodic snapshot procedure would re-snap them in short
  order anyway.  I'd like to do:
 
  INSERT INTO TABLE tblfoo_snap1 AS SELECT * from table on remote
  database NO LOGGING;
 
  (NO LOGGING being the only part we're currently missing) Is something
  like this possible?

  Do you want this because of:
  1) performance?
 
 performance in that a substantial portion of my time is spent writing  
 to pg_xlog
 
  2) to reduce the WAL volume of PITR backups?
 
 Yes.  Main concern.
 
 
  e.g. archive_command = 'pg_WAL_filter -f | ... '
  e.g. archive_command = 'pg_WAL_filter -x 35456 | ... '
 
  There are some other ideas for generally reducing WAL volume also.
 
 I'd like to see them not written to the xlogs at all (if possible).   
 Seems rather unnecessary unless I'm missing something.

You aware you can turn archive_command off until after the load? That
way you'll have nothing to back up at all until its done. In 8.1, when
archive_command is off, CREATE TABLE AS SELECT is optimised to produce
no WAL.

Turning off WAL is a difficult topic. Without it you have no crash
recovery, which IMHO everybody says they don't care about until they
crash, then they realise. It's hard to be selective about writing WAL
for specific operations also.

However, there may be two cases not discussed before:
1. A newly created database into which a full load and/or pg_dump
restore is being run. In that case we could have a mode where we turn
off WAL completely during initdb via pg_control and then turn it back on
again permanently (i.e. a one way switch) once the server is fully
loaded. That covers the restore database use case.

2. Turn off WAL for one or more tablespaces, though never the main data
directory. This allows for data which is externally recoverable to be
isolated from things like the catalog and other more normal data. That
would be a tablespace level option that would propagate to each object.
That covers the regular snapshot load use case you describe.


-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-21 Thread Martijn van Oosterhout
On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
 Turning off WAL is a difficult topic. Without it you have no crash
 recovery, which IMHO everybody says they don't care about until they
 crash, then they realise. It's hard to be selective about writing WAL
 for specific operations also.

It's been discussed before. One idea is to declare tables without
logging. The idea being that during recovery those tables and related
indexes are simply truncated. No foreign keys allowed. Obviously they
will not be saved via PITR either.

Put another way, the table structure is saved in WAL, but the data
isn't.

Have a ncie day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-21 Thread Theo Schlossnagle


On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:


On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:

Turning off WAL is a difficult topic. Without it you have no crash
recovery, which IMHO everybody says they don't care about until they
crash, then they realise. It's hard to be selective about writing WAL
for specific operations also.


It's been discussed before. One idea is to declare tables without
logging. The idea being that during recovery those tables and related
indexes are simply truncated. No foreign keys allowed. Obviously they
will not be saved via PITR either.

Put another way, the table structure is saved in WAL, but the data
isn't.


This is exactly what I'd like.  Simon suggested turning off WAL  
during the loads as a possible hack solution.  The reason this won't  
work is that we snap all the time, lots of tables.  We have between  
2000 and 4000 snapshot operations per day (throughout).  At the same  
time we have reporting queries running (that create and/or populate  
other tables) that last from 5 minutes to 18 hours.  It is important  
that we run everything but the snapshots with WAL on (as we must have  
PITR -- sans snapshots)


// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-21 Thread Simon Riggs
On Sat, 2006-10-21 at 09:00 -0400, Theo Schlossnagle wrote:
 On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:
 
  On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
  Turning off WAL is a difficult topic. Without it you have no crash
  recovery, which IMHO everybody says they don't care about until they
  crash, then they realise. It's hard to be selective about writing WAL
  for specific operations also.
 
  It's been discussed before. One idea is to declare tables without
  logging. The idea being that during recovery those tables and related
  indexes are simply truncated. No foreign keys allowed. Obviously they
  will not be saved via PITR either.
 
  Put another way, the table structure is saved in WAL, but the data
  isn't.
 
 This is exactly what I'd like.  Simon suggested turning off WAL  
 during the loads as a possible hack solution.  The reason this won't  
 work is that we snap all the time, lots of tables.  We have between  
 2000 and 4000 snapshot operations per day (throughout).  At the same  
 time we have reporting queries running (that create and/or populate  
 other tables) that last from 5 minutes to 18 hours.  It is important  
 that we run everything but the snapshots with WAL on (as we must have  
 PITR -- sans snapshots)

These tables are loaded once then read-only, yes?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-21 Thread Theo Schlossnagle


On Oct 21, 2006, at 3:12 PM, Simon Riggs wrote:


On Sat, 2006-10-21 at 09:00 -0400, Theo Schlossnagle wrote:

On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:


On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:

Turning off WAL is a difficult topic. Without it you have no crash
recovery, which IMHO everybody says they don't care about until  
they
crash, then they realise. It's hard to be selective about  
writing WAL

for specific operations also.


It's been discussed before. One idea is to declare tables without
logging. The idea being that during recovery those tables and  
related
indexes are simply truncated. No foreign keys allowed. Obviously  
they

will not be saved via PITR either.

Put another way, the table structure is saved in WAL, but the data
isn't.


This is exactly what I'd like.  Simon suggested turning off WAL
during the loads as a possible hack solution.  The reason this won't
work is that we snap all the time, lots of tables.  We have between
2000 and 4000 snapshot operations per day (throughout).  At the same
time we have reporting queries running (that create and/or populate
other tables) that last from 5 minutes to 18 hours.  It is important
that we run everything but the snapshots with WAL on (as we must have
PITR -- sans snapshots)


These tables are loaded once then read-only, yes?


No, they are loaded, and then reloaded, and then reloaded. Queries  
that use them will get the most recently loaded version of them.  It  
meets a business rule like: table foo on the warehouse should be  
representative of version of table foo on OLTP no older than 30 minutes.


// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-21 Thread Simon Riggs
On Sat, 2006-10-21 at 15:17 -0400, Theo Schlossnagle wrote:
 On Oct 21, 2006, at 3:12 PM, Simon Riggs wrote:
 
  On Sat, 2006-10-21 at 09:00 -0400, Theo Schlossnagle wrote:
  On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:
 
  On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
  Turning off WAL is a difficult topic. Without it you have no crash
  recovery, which IMHO everybody says they don't care about until  
  they
  crash, then they realise. It's hard to be selective about  
  writing WAL
  for specific operations also.
 
  It's been discussed before. One idea is to declare tables without
  logging. The idea being that during recovery those tables and  
  related
  indexes are simply truncated. No foreign keys allowed. Obviously  
  they
  will not be saved via PITR either.
 
  Put another way, the table structure is saved in WAL, but the data
  isn't.
 
  This is exactly what I'd like.  Simon suggested turning off WAL
  during the loads as a possible hack solution.  The reason this won't
  work is that we snap all the time, lots of tables.  We have between
  2000 and 4000 snapshot operations per day (throughout).  At the same
  time we have reporting queries running (that create and/or populate
  other tables) that last from 5 minutes to 18 hours.  It is important
  that we run everything but the snapshots with WAL on (as we must have
  PITR -- sans snapshots)
 
  These tables are loaded once then read-only, yes?
 
 No, they are loaded, and then reloaded, and then reloaded. Queries  
 that use them will get the most recently loaded version of them.  It  
 meets a business rule like: table foo on the warehouse should be  
 representative of version of table foo on OLTP no older than 30 minutes.

But they can be re-created anew with the same name each time? Or I guess
not, but you redefine a view every 30 minutes to point to the latest
one?

If so, then I have a patch that will speed up COPY when in the same
transaction as the table that created it. I've finally fixed a bug in my
earlier prototypes that seems to make that work now, in all cases.

I was being slightly slow before; I thought this was a new requirement
but its just the old one slightly restated.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-21 Thread Theo Schlossnagle


On Oct 21, 2006, at 4:40 PM, Simon Riggs wrote:


On Sat, 2006-10-21 at 15:17 -0400, Theo Schlossnagle wrote:

On Oct 21, 2006, at 3:12 PM, Simon Riggs wrote:


On Sat, 2006-10-21 at 09:00 -0400, Theo Schlossnagle wrote:

On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:


On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
Turning off WAL is a difficult topic. Without it you have no  
crash

recovery, which IMHO everybody says they don't care about until
they
crash, then they realise. It's hard to be selective about
writing WAL
for specific operations also.


It's been discussed before. One idea is to declare tables without
logging. The idea being that during recovery those tables and
related
indexes are simply truncated. No foreign keys allowed. Obviously
they
will not be saved via PITR either.

Put another way, the table structure is saved in WAL, but the data
isn't.


This is exactly what I'd like.  Simon suggested turning off WAL
during the loads as a possible hack solution.  The reason this  
won't

work is that we snap all the time, lots of tables.  We have between
2000 and 4000 snapshot operations per day (throughout).  At the  
same

time we have reporting queries running (that create and/or populate
other tables) that last from 5 minutes to 18 hours.  It is  
important
that we run everything but the snapshots with WAL on (as we must  
have

PITR -- sans snapshots)


These tables are loaded once then read-only, yes?


No, they are loaded, and then reloaded, and then reloaded. Queries
that use them will get the most recently loaded version of them.  It
meets a business rule like: table foo on the warehouse should be
representative of version of table foo on OLTP no older than 30  
minutes.


But they can be re-created anew with the same name each time? Or I  
guess

not, but you redefine a view every 30 minutes to point to the latest
one?


closest to the latter.  A view is redefined when the new snapshot is  
complete.



If so, then I have a patch that will speed up COPY when in the same
transaction as the table that created it. I've finally fixed a bug  
in my

earlier prototypes that seems to make that work now, in all cases.

I was being slightly slow before; I thought this was a new requirement
but its just the old one slightly restated.


We don't use COPY.  We directly INSERT INTO target_snap SELECT * from  
remote_select(...) t(cast);


remote_select is part of dbi-link.

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-21 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 But they can be re-created anew with the same name each time? Or I guess
 not, but you redefine a view every 30 minutes to point to the latest
 one?

 If so, then I have a patch that will speed up COPY when in the same
 transaction as the table that created it. I've finally fixed a bug in my
 earlier prototypes that seems to make that work now, in all cases.

Can you make the patch cover the case of

begin;
truncate foo;
copy foo from ...
commit;

It might be infeasible to detect this case, but if it's not ...

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] xlogdump fixups and WAL log question.

2006-10-20 Thread Theo Schlossnagle
Not sure who cares, so xzilla indicated I should drop a note here.  I  
just made the xlogdump stuff work for 8.1 (trivial) and fixed a few  
other small issues that caused it to not work right both generally  
and in our environment.


http://pgfoundry.org/tracker/index.php? 
func=detailaid=1000760group_id=1000202atid=772


We're using it to track down what's causing some wal log ruckus.   
We're generating about a quarter terabyte of WAL logs a day (on bad  
days) which is posing some PITR backup pains.  That amount isn't a  
severe challenge to backup, but our previous install was on Oracle  
and it generated substantially less archive redo logs (10-20 gigs per  
day).


Is it possible to create tables in fashion that will not write info  
to the WAL log -- knowingly and intentionally making them  
unrecoverable?  This is very desirable for us.  We snapshot tables  
from a production environment.  If the database goes down and we  
recover, the old snapshots are out of date anyway and serve no useful  
purpose.  The periodic snapshot procedure would re-snap them in short  
order anyway.  I'd like to do:


INSERT INTO TABLE tblfoo_snap1 AS SELECT * from table on remote  
database NO LOGGING;


(NO LOGGING being the only part we're currently missing) Is something  
like this possible?


Cheers ;-)
Theo

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-20 Thread Tom Lane
Theo Schlossnagle [EMAIL PROTECTED] writes:
 Is it possible to create tables in fashion that will not write info  
 to the WAL log -- knowingly and intentionally making them  
 unrecoverable?

Use temp tables?

Also, it's likely that much of the WAL volume is full-page images.
While you can't safely turn those off in 8.1, you can dial down the
frequency of occurrence by increasing checkpoint_segments and
checkpoint_timeout as much as you can stand.  (The tradeoffs are
amount of space occupied by pg_xlog/ and time to recover from a crash.)

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-20 Thread Theo Schlossnagle


On Oct 20, 2006, at 1:58 PM, Tom Lane wrote:


Theo Schlossnagle [EMAIL PROTECTED] writes:

Is it possible to create tables in fashion that will not write info
to the WAL log -- knowingly and intentionally making them
unrecoverable?


Use temp tables?


temp tables won't work too well -- unless I can make a whole  
tablespace temp and multiple backends can see it.  They work fine  
for small tables we snapshot (couple hundred or even a few thousand  
rows), but many of the tables are a few hundred thousand rows and  
several processes on the system all need them.



Also, it's likely that much of the WAL volume is full-page images.
While you can't safely turn those off in 8.1, you can dial down the
frequency of occurrence by increasing checkpoint_segments and
checkpoint_timeout as much as you can stand.  (The tradeoffs are
amount of space occupied by pg_xlog/ and time to recover from a  
crash.)


Our pg_xlog is currently at 9.6GB.  Not sure I can reasonably tune it  
up much higher.


// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-20 Thread Simon Riggs
On Fri, 2006-10-20 at 13:18 -0400, Theo Schlossnagle wrote:
 Not sure who cares, so xzilla indicated I should drop a note here.  I  
 just made the xlogdump stuff work for 8.1 (trivial) and fixed a few  
 other small issues that caused it to not work right both generally  
 and in our environment.
 
 http://pgfoundry.org/tracker/index.php? 
 func=detailaid=1000760group_id=1000202atid=772

Diogo Biazus was working on that; I care also.

 We're using it to track down what's causing some wal log ruckus.   
 We're generating about a quarter terabyte of WAL logs a day (on bad  
 days) which is posing some PITR backup pains.  That amount isn't a  
 severe challenge to backup, but our previous install was on Oracle  
 and it generated substantially less archive redo logs (10-20 gigs per  
 day).

As Tom says, definitely because of full_page_writes=on

 Is it possible to create tables in fashion that will not write info  
 to the WAL log -- knowingly and intentionally making them  
 unrecoverable?  This is very desirable for us.  We snapshot tables  
 from a production environment.  If the database goes down and we  
 recover, the old snapshots are out of date anyway and serve no useful  
 purpose.  The periodic snapshot procedure would re-snap them in short  
 order anyway.  I'd like to do:
 
 INSERT INTO TABLE tblfoo_snap1 AS SELECT * from table on remote  
 database NO LOGGING;
 
 (NO LOGGING being the only part we're currently missing) Is something  
 like this possible?

Do you want this because of:
1) performance?
2) to reduce the WAL volume of PITR backups?

If you're thinking (1), then I guess I'd ask whether you've considered
what will happen when the reporting environment includes data from other
sources as it inevitably will. At that point, data loss would be much
more annoying. My experience is that the success of your current
implementation will lead quickly to a greatly increased user
requirement.

I've been looking at ways of reducing the WAL volume for PITR backups.
Here's a few ideas:

1. Provide a filter that can be easily used by archive_command to remove
full page writes from WAL files. This would require us to disable the
file size test when we begin recovery on a new WAL files, plus would
need to redesign initial location of the checkpoint record since we
could no longer rely on the XLogRecPtr being a byte offset within the
file.

e.g. archive_command = 'pg_WAL_filter -f | ... '

2. Include tablespaceid within the header of xlog records. This would
allow us to filter out WAL from one or more tablespaces, similarly to
(1), plus it would also allow single tablespace recovery.

e.g. archive_command = 'pg_WAL_filter -x 35456 | ... '

There are some other ideas for generally reducing WAL volume also.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-20 Thread Theo Schlossnagle


On Oct 20, 2006, at 4:24 PM, Simon Riggs wrote:


On Fri, 2006-10-20 at 13:18 -0400, Theo Schlossnagle wrote:

Not sure who cares, so xzilla indicated I should drop a note here.  I
just made the xlogdump stuff work for 8.1 (trivial) and fixed a few
other small issues that caused it to not work right both generally
and in our environment.

http://pgfoundry.org/tracker/index.php?
func=detailaid=1000760group_id=1000202atid=772


Diogo Biazus was working on that; I care also.


Cool.  Patch is short.


We're using it to track down what's causing some wal log ruckus.
We're generating about a quarter terabyte of WAL logs a day (on bad
days) which is posing some PITR backup pains.  That amount isn't a
severe challenge to backup, but our previous install was on Oracle
and it generated substantially less archive redo logs (10-20 gigs per
day).


As Tom says, definitely because of full_page_writes=on


Can I turn that off in 8.1?


Is it possible to create tables in fashion that will not write info
to the WAL log -- knowingly and intentionally making them
unrecoverable?  This is very desirable for us.  We snapshot tables
from a production environment.  If the database goes down and we
recover, the old snapshots are out of date anyway and serve no useful
purpose.  The periodic snapshot procedure would re-snap them in short
order anyway.  I'd like to do:

INSERT INTO TABLE tblfoo_snap1 AS SELECT * from table on remote
database NO LOGGING;

(NO LOGGING being the only part we're currently missing) Is something
like this possible?


Do you want this because of:
1) performance?


performance in that a substantial portion of my time is spent writing  
to pg_xlog



2) to reduce the WAL volume of PITR backups?


Yes.  Main concern.



e.g. archive_command = 'pg_WAL_filter -f | ... '
e.g. archive_command = 'pg_WAL_filter -x 35456 | ... '

There are some other ideas for generally reducing WAL volume also.


I'd like to see them not written to the xlogs at all (if possible).   
Seems rather unnecessary unless I'm missing something.


// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org